Last year, in MySQL 8.0.31, we announced how HeatWave uses the new MySQL hypergraph optimizer to get better query plans. Since then, we have continued to work on the optimizer, and starting with MySQL 9.0, it has been enabled as an opt-in feature in HeatWave MySQL. This new feature is only available in HeatWave MySQL and will benefit OLTP workloads. It’s not available in other vendors’ cloud offerings. Read on for a quick introduction to how to try it and what it means for your queries.
How to Enable the Hypergraph Optimizer
Since a new optimizer is a huge deal, we are introducing it as an opt-in feature first, which mean that you have to turn it on explicitly. The hypergraph optimizer is enabled by setting the optimizer switch hypergraph_optimizer:
SET SESSION optimizer_switch='hypergraph_optimizer=on';
That’s all you need to do! Your next query will use the hypergraph optimizer. Of course, since the optimizer only decides how to execute the query, not the semantics of the query, the results will still be the same as you got with the old optimizer, so you won’t notice much at first. But there are some small user visible changes. The most visible one is in the EXPLAIN output. MySQL currently supports three formats for the output of EXPLAIN: TRADITIONAL, JSON and TREE. With the hypergraph optimizer, the TRADITIONAL format which has been the default for many years is no longer an option. The default for the hypergraph optimizer is the TREE format.
EXPLAIN FORMAT=JSON was made configurable in MySQL 8.3. By setting the system variable explain_json_format_version to 2, you will get the new JSON format which basically is a machine readable version of the TREE format. In fact, the TREE output is implemented in the server as a more human readable projection of the JSON output. With the hypergraph optimizer, you can’t use JSON format version 1 anymore. The two available options are TREE and JSON format version 2.
The reason for this is a technical one: The old optimizer and the hypergraph optimizer use different internal formats to represent the query plan, and the TRADITIONAL and JSON format version 1 are dumps of data structures that don’t exist when using the hypergraph optimizer. For quite some time now, when using the old optimizer, MySQL has been silently converting the old plan format to the new plan format that is used natively by the hypergraph optimizer, which is why the TREE and JSON format version 2 are also available for the old optimizer.
What Does the Hypergraph Optimizer Mean for Me?
The new EXPLAIN output is certainly something we all have to get used to, but that is just how things are displayed. The main difference is in how the hypergraph optimizer chooses to plan queries.
The point of the optimizer is to select the best possible query plan for your queries. The hypergraph optimizer generally does a much better job of that than the old optimizer. E.g., if you have struggled with making the optimizer make the best choice betweeen using an index to read data in sorted order or to do an explicit sort operation, the hypergraph optimizer should solve this for you. Setting the prefer_ordering_index optimizer switch should become a thing of the past.
Join orderings should also improve, if they aren’t optimal already. The hypergraph optimizer is also allowed to generate bushy join plans, which is sometimes a good idea, especially for more complicated analytical queries. The hypergraph optimizer is also better at making the correct choice between hash joins and nested loop joins, which should speed up many queries.
In general, the hypergraph optimizer is much better at selecting the best plan based on cost calculations. The old optimizer relies much more on heuristics, which may work well for some queries, but there are many queries where the old rules of thumb don’t hold, and that’s where the hypergraph optimizer is able to make true cost based decisions that result in better query plans.
Does that mean that we can throw out all hints and just rely on the optimizer? Not quite. Cost formulas and statistics are certainly more important with the hypergraph optimizer, and you will notice that generating better statistics (e.g., creating more histograms) will help the optimizer make better choices. But cost formulas and statistics are still simplifications of the real cost of a query plan, and, theoretically, it will always be possible to construct a data set and queries that go against the embedded assumptions and produce a suboptimal plan.
Even when we try hard to make the cost model work well for real queries and restrict the anomalies to strange, unrealistic queries, there may be some real life cases where hints are still necessary to get a good plan. For this reason, we’ve tried to maintain as much as we can of existing optimizer hint behavior. Old hints should, as long as it makes sense in the context of the hypergraph optimizer, continue to force plan choices in the same way that they did before. But our best recommendation is to try without hints first, and then resort to hints only when necessary. Give the optimizer a chance to get it right by itself!
We are still working on improving the cost model, so if a query doesn’t perform well today, it might tomorrow. Without hints.
The MySQL hypergraph optimizer is available in HeatWave MySQL on OCI, AWS and Azure. You won’t find it in other vendors’ cloud offerings. Give the MySQL hypergraph optimizer a try today!
