InnoDB Persistent Statistics at last
By Calvin Sun on Apr 11, 2011
Note: this article was originally published on http://blogs.innodb.com on April 11, 2011 by Vasil Dimov.
InnoDB gathers statistics for the data in user tables, which are used by the MySQL optimizer to choose the best query plan. For a long time the imprecision and instability of these statistics have been creating problems for users.
The problem is that these statistics are recalculated at any of the following events:
* When the table is opened
* When the table has changed a lot (1/16th of the table has been updated/deleted or inserted)
* When ANALYZE TABLE is run
* When SHOW TABLE STATUS or SHOW INDEX FROM is run
* When InnoDB Monitor is turned ON
and so their recalculation must be quick an unnoticeable. Thus the quick algorithm just picks 8 random pages and could give a wildly varying results.
To solve this problem MySQL 5.6.2 introduces the InnoDB Persistent Statistics feature, which:
* Uses a more precise sampling algorithm (which is also slower) in an attempt to better inform the MySQL optimizer so it can choose the best possible query plan.
* Does not recalculate the statistics automatically, only when ANALYZE TABLE is run. This means that the same query plan will always be chosen even when MySQL is restarted (see below). A query plan using the global statistics can only change after ANALYZE TABLE has been run (manually). This also means that the user and/or DBA is responsible for running ANALYZE TABLE regularly if the data in the table has been changed a lot.
* Saves the statistics on disk in normal user-visible and updatable SQL tables.
How to turn it ON?
To turn ON the persistent statistics for a given table you must first create the persistent statistics storage by executing the SQL script storage/innobase/scripts/persistent_storage.sql which is shipped in the source distribution (in future versions it may be executed automatically by mysql_install_db). This script will create a new database named “innodb” with two tables in it – “table_stats” and “index_stats”. InnoDB will read/write the statistics from/to those tables. Second you must set the new parameter innodb_analyze_is_persistent to ON (it is OFF by default). This will instruct InnoDB to use the better (and slower!) sampling algorithm during ANALYZE TABLE and to save the results to the stats tables. To change the number of leaf pages that are being sampled during such a “persistent-results” ANALYZE TABLE set the parameter innodb_stats_persistent_sample_pages to a higher value.
Once the stats “meta” tables are present and stats for a given table exist in them, then those stats will be used whenever stats are needed – including open table/show table status/show index from and others. To stop using them you must DELETE the corresponding rows from innodb.table_stats and innodb.index_stats.
Beware of bugs
Some bugs in persistent stats code which may cause InnoDB to crash have been fixed, but are not included in 5.6.2. See: