X

MySQL and MySQL Community information

  • December 2, 2020

Breakthrough Enhancements in MySQL Database Service with Analytics Engine

Nipun Agarwal
Vice President, Research & Advanced Development

 

Introduction

MySQL is the most popular open-source database and many organizations choose MySQL to store their valuable enterprise data. MySQL is optimized for OLTP, but it is not designed for analytic processing. As a result, organizations which need to efficiently run analytics on data stored in MySQL database move their data to another database to run analytic applications.

This approach of moving data to another database introduces complexity and additional cost for customers in multiple ways:

  1. Applications need to define complex logic for extracting relevant data from MySQL.
  2. The extracted data needs to be transported to another database across networks in a secure way, which takes times and creates security vulnerability.
  3. Data in the other database needs to be manually kept in sync with the MySQL database and as a result the data on which analytics is performed can become stale.
  4. Managing multiple databases and running OLTP and analytics applications on different databases incurs additional overhead and cost.

MySQL Analytics Service is designed to enable customers to run analytics on data which is stored in MySQL database without the need for ETL (Figure 1)

Figure 1. Both OLTP and OLAP applications can be run with MySQL Database Service with Analytics Engine

This service is built on an innovative in-memory analytics engine which is architected for scalability and performance and is optimized for Oracle Cloud Infrastructure (OCI). Various aspects of this service leverage machine learning driven automation, reducing database administrative costs. This results in a very performant solution for SQL analytics at a fraction of the cost compared to other industry solutions.

Customers who use MySQL Analytics Service will benefit from significantly better performance, a single database for their OLTP and analytic needs, not requiring ETL, support for real-time analytics and significantly reduced cost.

Deployment Scenarios

With the native MySQL Analytics Engine, customers who store data in MySQL database service can seamlessly run analytics.

A MySQL Analytics instance is a cluster composed of a MySQL database service VM and one or more analytics nodes, as shown in red in Figure 2. When analytics is enabled, an analytics plugin is installed on the MySQL database node which is responsible for cluster management, loading data into the memory of the analytics nodes, query scheduling, query execution, and returning query results to the MySQL database. Applications continue to interact with the MySQL database without requiring to be aware of the MySQL Analytics Engine. 

An Analytics node is a VM instance that stores data in memory for analytics processing and executes analytics queries. The number of nodes needed for a workload depends on the amount of data which needs to be processed, the compression factor which is achieved on the dataset, and the query characteristics. On an average, users can expect to load around 400GB of data per analytics node.  The exact number of nodes needed for a schema can be predicted by the auto provisioning advisor which is a part of the service.

Figure 2. MySQL Analytics exclusively available in Oracle Cloud

MySQL Analytics currently supports up to 24 analytics nodes in one cluster which translates to processing capacity of approximately 10 TB of analytics data. 10TB is the amount of data which can be populated in the memory of the analytics nodes at a given moment. There is no limit to the amount of data which can be stored in the MySQL database and customers can choose which tables or columns from MySQL database schema to load into the memory of analytics nodes. If the tables are no longer needed by queries, user can remove the tables from the memory to make room for other data.

MySQL Database Service with Analytics Engine is a great solution for customers who need to run hybrid transactional and analytical workloads. Transactional queries are processed in the MySQL database node and data updated in MySQL InnoDB is transparently propagated to the analytics cluster in real-time. This enables customers to run both OLTP and real-time analytics simultaneously within a single database.

On premise customers who cannot move their MySQL deployment to a Cloud because of compliance or regulatory requirements, can still leverage MySQL Analytics by using the hybrid deployment model as shown in Figure 3. In a hybrid deployment, customers can leverage MySQL replication to replicate on-premise MySQL data to MySQL Analytics without the need for ETL.

Figure 3. Hybrid deployment for enabling analytics on data stored on premise

Native Real-Time Analytics for MySQL

Integration of MySQL Analytics Engine with MySQL Database Service provides a single data management platform for all OLTP and analytics needs of an enterprise. MySQL Analytics is designed as a native engine, and completely shields all the low-level implementation details at the storage level from the end users. As a result, users can manage both MySQL Analytics Engine and MySQL Database Service with the same management tools including OCI console, REST API, and command line interface.

Since MySQL Analytics is an in-memory processing engine, data is only persisted in MySQL InnoDB storage engine. This allows users to manage analytics data the same way they manage transactional data in MySQL.

Users connect to MySQL database node through standard tools and standard-based ODBC/JDBC connectors. Queries which can be accelerated by the analytics engine are transparently pushed by the MySQL database node to the analytics engine. This enables existing applications to take advantage of the MySQL Analytics Engine without any changes to their application, allowing easy and quick integration.

Once users submit a query, the MySQL query optimizer transparently decides if the query should be offloaded to the analytics engine for accelerated execution (Figure 4). This decision is based on whether all operators and functions referenced in the query are supported by the MySQL Analytics Engine and if the estimated time to process the query with the analytics engine is less than with MySQL. If both conditions are met, the query is pushed to analytics nodes for processing. Once processed, the results are sent back to the MySQL database node and returned to the application.

Figure 4. MySQL Database enhanced to natively integrate analytic processing

Since the MySQL optimizer has been enhanced to transparently push the queries to the analytics engine, there is no change required in queries for them to be accelerated by the MySQL analytics engine. As a result, all existing MySQL compatible tools and applications work seamlessly with the MySQL Analytics Engine (Fig 5).

Figure 5. All MySQL compatible tools and application run with the MySQL Analytics Engine

Data for the MySQL Analytics Engine is persisted in MySQL InnoDB. Any updates to the tables are automatically propagated to the memory of the analytics nodes, as a result subsequent queries always see the updated data. This is done behind the scene by a light-weight change propagation algorithm that can keep up with MySQL data update rates. Changes made by an OLTP application are visible in real-time to analytic queries.

Please stayed tuned for the next blog where we will provide insights into the innovations of the MySQL Analytics Engine.

Join the discussion

Comments ( 1 )
  • Donovan Thursday, December 10, 2020
    This looks wonderful. Does the Analytic Cluster fully conform the isolation levels, e.g. repeatable read?
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.