More Early Access Features in the MySQL 5.6.3 Development Milestone!
By Rob Young on Oct 03, 2011
For those with an interest in MySQL, this week at Oracle OpenWorld has gotten off to a great start. Demonstrating how Oracle drives MySQL innovation Tomas' "State of the Dolphin" keynote on Monday gave a great overview of the new MySQL products that have recently been delivered:
- MySQL Installer for Windows
- New MySQL Security and Scalability features
- New MySQL High Availability features
- New early access features in the MySQL 5.6.3 Development Milestone Release ("DMR")
- New early access features in the MySQL Cluster 7.2 DMR
- New MySQL Enterprise Oracle certifications
Of these announcements I am most excited about the new early access features ready for community evaluation in the MySQL 5.6.3 DMR. Since 5.5 was released last December we have gotten community and customer feedback that it is the best, highest quality release to date. 5.6 builds on 5.5 by improving:
The MySQL Optimizer for better query performance, scale, diagnostics
Oracle owns the MySQL optimizer and is making a huge investment in engineering and re-factoring to ensure the optimizer evolves to meet the most demanding and complex query requirements. To this end, the 5.6.3 optimizer now includes:
File sort optimizations for queries with SELECT *, ORDER by non-indexed column, with small LIMIT values
Currently such queries that ORDER by a non-indexed column perform a full table scan, create sort operations for entire table, potentially create temp tables, toss out the unneeded rows and merge the result set. With the new default behavior the whole table is scanned, but only the needed rows are sorted and returned in the result set. Early internal tests show 3x improvement in execution times when using the default sort buffer (of course your mileage may vary.)
Index Condition Pushdown ("ICP")
By default, the optimizer now pushes indexed columns in WHERE clause down to the storage engine for evaluation. For example, for this table and related query on secondary key of postalcode+age:
CREATE TABLE person (
personid INTEGER PRIMARY KEY,
KEY k1 (postalcode,age)
SELECT lastname, firstname FROM person
WHERE postalcode BETWEEN 5000 AND 5500 AND age BETWEEN 21 AND 25;
In this case the optimizer will use a range scan on postalcode from 5000-5500 (on index) and push the evaluation of age to InnoDB which will evaluate the value of age "BETWEEN 21 and 25". For each match, InnoDB will return a row back to the server, which will continue requesting/receiving rows until all rows have been evaluated and returned. Early non-scientific testing shows:
With 5 million rows in table, random values, running on commodity, desktop-grade hardware
ICP disabled: Disk bound (default buffer pool 128 Mb) the query will take 15 seconds
ICP disabled: All data in memory (buffer pool 1.5 Gb) the query will take 1.4 seconds
ICP enabled: execution time reduced to 90 ms for both
Batched Key Access ("BKA") and Multi-Range Read ("MRR")
Multi-Range Read was introduced in the 5.6.2 DMR. MRR improves the performance of queries that use secondary indexes, by scanning one or more index ranges used in a query, sorting the associated disk blocks for the row data, then reading those disk blocks using larger sequential I/O requests. The speedup benefits operations such as
range index scans and equi-joins on indexed columns.
BKA builds on MRR by improving on the performance of disk-bound join queries. With BKA the server fills up the join buffer with "join keys". The server then sends batches of join keys to the storage engine layer (InnoDB or MyISAM, and soon NDB) for processing. The storage engine then uses MRR to sort indexes according to the data sequence so the data pages can be accessed in sequence for efficiency.
Early testing of DBT3 Q3: Customer Distribution Query on InnoDB, buffer pool = 50 MB, 4GB InnoDB data file, and disk/memory ratio (~2% in memory) has shown execution time drops from 2000 sec to 10 sec with a sufficiently large join buffer.
EXPLAIN for INSERT/UPDATE/DELETE
EXPLAIN for SELECT has been available since MySQL 5.0. This new feature is a long standing request from our community and customer base. Best to get straight to an example of how it is used:
CREATE TABLE t1(c1 INT, c2 INT, KEY(c2), KEY(c2, c1));
EXPLAIN UPDATE t1 SET c1 = 10 WHERE c2 = 1;
mysql> EXPLAIN UPDATE t1 SET c1 = 10 WHERE c2 = 1;
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | | Extra |
| 1 | SIMPLE | t1 | range | c2,c2_2 | c2 | 5 | NULL | 1 | Using where |
1 row in set (0.00 sec)
In this example the optimizer will use a type=range (scan) on key=c2 to identify and UPDATE the affected row(s).
This feature allows DBAs, developers to trace the decision-making steps of the optimizer. The TRACE output shows how the EXPLAIN plan was generated and the decision points resolved along the way. An example would be:
SELECT (SELECT 1 FROM t6 WHERE d = c) AS RESULT FROM t5 ;
select * from information_schema.OPTIMIZER_TRACE;
With output snippet:
Optimizer tracing works for SELECT, EXPLAIN SELECT, INSERT or REPLACE ( with VALUES or SELECT), UPDATE/DELETE, SET, DO, DECLARE/CASE/IF/RETURN.
Other Optimizer related improvements in 5.6.3 include:
Late materialization of views/subqueries in FROM clause
This improvement enables fast EXPLAINs for views/subqueries and defers/avoids materialization when possible, enabling for quicker bail out if query is abandoned or canceled. Internal tests show a 240x improvement in execution time (drops from 8 min to 2 sec)
Persistent Optimizer Statistics for InnoDB
This allows for more stable, accurate InnoDB statistics and minimizes the overhead of recalculating statistics on start up or crash recovery. Statistics can be manually updated when needed.
The 5.6.3 DMR also includes new features and improvements to the following:
Performance Schema now includes instrumentation for:
- Statements/execution stages - What are my most resource intensive queries?
- Table and Index I/O - Which app tables/indexes cause the most load?
- Table Locks - Which tables cause the most contention?
- Users/Hosts/Accounts level resource consumption - Who are the most resource intensive users, hosts, accounts?
- Network I/O - Is it the network or my app? How long do sessions idle?
- Aggregated summaries by thread, user, host, account, object
- New INFORMATION_SCHEMA tables (metrics, system and buffer pool information) for better diagnostics
- Dump and restore buffer pool for quicker startup and recovery
- Reduce contention during file extension for better concurrency
- Crash-safe slaves and binlog, replication checksums for better data integrity
- Multi-threaded slaves for better performance
- time-delayed replication for better testing, data integrity
- Informational log events for easier troubleshooting
You can learn more about the 5.6.3 DMR and other early access features by visiting the MySQL Developers Zone. Here you will find technical articles, the MySQL Database 5.6.3 and MySQL Cluster 7.2 DMR downloads, supporting documentation and all related PlanetMySQL blogs.
If you are attending Oracle Open World this week please plan to attend as many of the 47 MySQL sessions as you can, including my sessions on the "MySQL Roadmap" (Tuesday at 5:00 pm) and "Using MySQL with Other Oracle Products" (Wednesday at 11:45 am PT). Both are in the Marriott Marquis (just up the block from the Moscone Center), Golden Gate C2. Also, be sure to join us at the MySQL Community Reception . There will be many familiar faces and friends there, and it is open to all, even if you are not attending OOW.
As always, thanks for your continued support of MySQL!