MySQL HeatWave is the only fully managed MySQL database service that combines transactions, analytics, machine learning, and GenAI services, without ETL duplication. MySQL HeatWave also includes MySQL HeatWave Lakehouse, allowing users to query data stored in object storage, MySQL databases, or a combination of both. Users can deploy MySQL HeatWave–powered apps on a choice of public clouds: Oracle Cloud Infrastructure (OCI), Amazon Web Services (AWS), and Microsoft Azure.

Background on MySQL Autopilot

MySQL Autopilot automates the most important and often challenging aspects of achieving high query performance at scale for MySQL HeatWave — including provisioning, loading, querying, and failure handling. It employs machine-learning-based automation to facilitate and enhance the three main stages of data management: system setup, data load, and query execution. It also facilitates the process of handling failures.

In this blog post we talk about the capabilities of Autopilot Indexing, which is a new addition to the MySQL Autopilot.

MySQL Autopilot Indexing

Creating an optimal set of indexes has always been a challenging task for DBAs. On the one hand, creating too few indexes results in low query performance, especially for SELECTs. On the other hand, creating too many indexes leads to excessive index maintenance during DMLs, in turn again hampering performance.  Furthermore, the existence of too many indexes takes up storage and can also cause the query compilation to slow down as the optimizer tries to evaluate the best index candidate for a given table/query. For DBAs to recommend indexes, not only do they need to have a deep understanding of how indexes impact the various query constructs like WHERE predicates, JOINs, ORDER BY etc., they need to come up with an optimal and minimal set of indexes for a given workload, that maintains a balance between performance and storage.  In addition, if the user workload changes, the DBA needs to revisit the index choices all over again, making their task quite daunting.

With MySQL Autopilot Indexing, we aim to ease the task of index recommendations. MySQL Autopilot Indexing is an ML-based feature designed to help optimize database systems for better cost and performance. With Autopilot Indexing, database administrators no longer need to manually identify which secondary indexes are most beneficial for their workload. As the base tables in InnoDB are already organized as index organized tables based on their primary key, Autopilot Indexing only focuses on secondary indexes. It automatically generates secondary index recommendations for creating and dropping indexes based on the current workload. The feature considers both the query performance and the cost of maintaining the indexes. It provides index creation time, performance, and storage estimations, as well as explanations for the recommendations it generates. There is also a simple and intuitive console UI that allows you to view and analyze the performance and storage impact of recommended index suggestions. This makes it easy to see the impact of changes to your database systems before applying the suggestions.

Figure 1. High-level diagram showing a sample user workload and indexes before and after using MySQL Autopilot Indexing

 

                                                 Figure 1. High-level diagram showing a sample user workload and indexes before and after using MySQL Autopilot Indexing.

Autopilot Indexing has a comprehensive set of features that allow the user to tune the performance of database systems, including:

  1. Considers both query and DML performance (index maintenance cost)
  2.  Recommends CREATE and DROP of indexes
  3. Generates DDLs for index creation/drop
  4. Provides performance prediction (per query and total workload)
  5. Provides storage prediction for the recommended indexes
  6. Provides creation time prediction for the recommended indexes
  7. Provides explanation for the recommendations
  8. Console integration to improve user experience

 

Invoking Autopilot Indexing

To benefit from Autopilot Indexing recommendations, the users need to run their workload prior to invoking the advisor. The workload needs to have a minimum set of five queries that are executed against the tables in the user schema. Autopilot Indexing evaluates these queries, determines if their execution is sub-optimal, and if they can benefit from an index. If yes, then index recommendations are provided along with storage and performance impact, as well as explanations for the recommendations. If the workload dataset is very small, or if the current set of indexes are sufficient, Autopilot Indexing may not recommend any additional indexes. Autopilot Indexing can be invoked either via a console for HeatWave on AWS, or via command line on MySQL HeatWave on OCI.

Autopilot Indexing User Interface

On MySQL HeatWave on AWS console, the users find the Autopilot Index Advisor on the Workspaces tab. Figure 2 depicts the landing page of the Autopilot Index Advisor. It shows an example of the outcome from a sample workload that resulted in four different secondary index related actions. Figure 2  label (1) shows the recommendations that include CREATE and DROP indexes. Figure 2  label (2) shows the exact reason why a certain index recommendation was made. In this case, the reason for CREATE is missing index. In case of DROP index recommendations, the reason is unused index, and duplicate index. It also presents the indexed columns and performance improvement labels, shown as HIGH or LOW (for DROP suggestions, no performance impact is presented given that such indexes were unused).

A screenshot of a computerDescription automatically generated

                                                     Figure 2.  MySQL Autopilot Indexing console screenshot showing (i) CREATE index suggestions, (ii) explanations

Figure 3 shows an additional window pane of MySQL Autopilot Index Advisor that lists the affected query statements based on the selected index recommendation(s). Each affected SQL statement is fully printed at the bottom screen with additional execution metadata such as the number of times that a particular statement was executed in the analyzed workload. Figure 3 label (3) shows per query performance improvement estimation when related index suggestion is applied. Similarly, Figure 3 label (4) shows the total storage impact when selected index suggestions are applied. By doing so, users can understand both performance and storage impact of applying the indexing recommendations. To apply the suggestions, users can simply click the “Apply Recommendations” button which automatically creates and drops indexes based on the listed suggestions.

 

A screenshot of a computerDescription automatically generated

                                                Figure 3 MySQL Autopilot Indexing console screenshot of (iii) estimated performance impact, and (iv) estimated storage impact

 A demo of the Autopilot Indexing user interface can be seen below.

 

 

Autopilot Indexing SQL interface

For the users who want to access the feature via SQL clients and/or programmatic way, Autopilot Indexing exposes sys.autopilot_index_advisor() interface.

For now, this is also the only way for users on OCI to access the feature. 

Invoking the interface requires calling the advisor, which is installed under ‘sys’ schema.

mysql> CALL sys.autopilot_index_advisor(JSON_OBJECT('target_schema', JSON_ARRAY('bb_seats_sf7')));

 

The output in Figure 4 shows the “Index Suggestions” section of the SQL interface for a sample schema (bb_seats_sf7), the information is presented per suggestion action. Each action contains performance impact, storage impact, and index creation time estimations along with reason and indexed column info. At the bottom of the window a summary of the suggestions and total estimations are provided.                                                              

Figure 4

                                                                                                 Figure 4. SQL interface listing index recommendations with estimated performance and storage footprint.

Figure 5 shows the advisor output that provides programmatic access to the advisor via SQL. By following the instructions in these outputs, the users can further access the generated DDL script and the explanations.

output from MySQL autopilot

                                    Figure 5. The users have access to generated DDLs as well as detailed explanations.

Figure 6 shows how the user can retrieve the explanations for the index recommendations.

detailed explanations

                                      Figure 6. Detailed explanations for the recommendations.

Finally, we show an example of generated DDLs. The users are given the following SQL code to access the DDL statements that will apply the suggested indexes.

mysql> SELECT log->>"$.sql" AS "SQL Script" FROM sys.autopilot_index_advisor_report WHERE type = "sql" ORDER BY id;
    

Figure 7 shows the output of the SQL command. The DDL output is generated based on the suggestions generated. By observing the generated DDLs, the users get visibility into exact changes that will be applied to their database. As the user workloads evolve over time, the users are encouraged to invoke the advisor via console or SQL interface to keep their database indexes efficient.

Generated DDL

                                        Figure 7. Generated DDL for creating two indexes.

Customers with workloads that can potentially benefit from Autopilot indexing are sent an email notification. The email notification includes information on the average performance improvement that the customers can expect on their workload if they create the indexes recommended by Autopilot indexing.

Summary

Autopilot Indexing is a powerful tool for database systems performance optimization, using machine learning and tight engineering to automate the creation, drop, and maintenance of indexes. It provides performance prediction, storage prediction, and explanation for recommendations, and integrates with the console of the database system to improve the user experience.

More Information