Saturday Sep 29, 2012

Online ALTER TABLE in MySQL 5.6

This is the low-level view of data dictionary language (DDL) operations in the InnoDB storage engine in MySQL 5.6. John Russell gave a more high-level view in his blog post April 2012 Labs Release – Online DDL Improvements.

MySQL before the InnoDB Plugin

Traditionally, the MySQL storage engine interface has taken a minimalistic approach to data definition language. The only natively supported operations were CREATE TABLE, DROP TABLE and RENAME TABLE. Consider the following example:

INSERT INTO t VALUES (1),(2),(3);

The CREATE INDEX statement would be executed roughly as follows:


You could imagine that the database could crash when copying all rows from the original table to the new one. For example, it could run out of file space. Then, on restart, InnoDB would roll back the huge INSERT transaction. To fix things a little, a hack was added to ha_innobase::write_row for committing the transaction every 10,000 rows.

Still, it was frustrating that even a simple DROP INDEX would make the table unavailable for modifications for a long time.

Fast Index Creation in the InnoDB Plugin of MySQL 5.1

MySQL 5.1 introduced a new interface for CREATE INDEX and DROP INDEX. The old table-copying approach can still be forced by SET old_alter_table=0.

This interface is used in MySQL 5.5 and in the InnoDB Plugin for MySQL 5.1. Apart from the ability to do a quick DROP INDEX, the main advantage is that InnoDB will execute a merge-sort algorithm before inserting the index records into each index that is being created. This should speed up the insert into the secondary index B-trees and potentially result in a better B-tree fill factor.

The 5.1 ALTER TABLE interface was not perfect. For example, DROP FOREIGN KEY still invoked the table copy. Renaming columns could conflict with InnoDB foreign key constraints. Combining ADD KEY and DROP KEY in ALTER TABLE was problematic and not atomic inside the storage engine.

The ALTER TABLE interface in MySQL 5.6

The ALTER TABLE storage engine interface was completely rewritten in MySQL 5.6. Instead of introducing a method call for every conceivable operation, MySQL 5.6 introduced a handful of methods, and data structures that keep track of the requested changes.

In MySQL 5.6, online ALTER TABLE operation can be requested by specifying LOCK=NONE. Also LOCK=SHARED and LOCK=EXCLUSIVE are available. The old-style table copying can be requested by ALGORITHM=COPY. That one will require at least LOCK=SHARED. From the InnoDB point of view, anything that is possible with LOCK=EXCLUSIVE is also possible with LOCK=SHARED.

Most ALGORITHM=INPLACE operations inside InnoDB can be executed online (LOCK=NONE). InnoDB will always require an exclusive table lock in two phases of the operation. The execution phases are tied to a number of methods:

Checks if the storage engine can perform all requested operations, and if so, what kind of locking is needed.
InnoDB uses this method to set up the data dictionary cache for upcoming CREATE INDEX operation. We need stubs for the new indexes, so that we can keep track of changes to the table during online index creation. Also, crash recovery would drop any indexes that were incomplete at the time of the crash.
In InnoDB, this method is used for creating secondary indexes or for rebuilding the table. This is the ‘main’ phase that can be executed online (with concurrent writes to the table).
This is where the operation is committed or rolled back. Here, InnoDB would drop any indexes, rename any columns, drop or add foreign keys, and finalize a table rebuild or index creation. It would also discard any logs that were set up for online index creation or table rebuild.

The prepare and commit phases require an exclusive lock, blocking all access to the table. If MySQL times out while upgrading the table meta-data lock for the commit phase, it will roll back the ALTER TABLE operation.

In MySQL 5.6, data definition language operations are still not fully atomic, because the data dictionary is split. Part of it is inside InnoDB data dictionary tables. Part of the information is only available in the *.frm file, which is not covered by any crash recovery log. But, there is a single commit phase inside the storage engine.

Online Secondary Index Creation

It may occur that an index needs to be created on a new column to speed up queries. But, it may be unacceptable to block modifications on the table while creating the index.

It turns out that it is conceptually not so hard to support online index creation. All we need is some more execution phases:

  1. Set up a stub for the index, for logging changes.
  2. Scan the table for index records.
  3. Sort the index records.
  4. Bulk load the index records.
  5. Apply the logged changes.
  6. Replace the stub with the actual index.

Threads that modify the table will log the operations to the logs of each index that is being created. Errors, such as log overflow or uniqueness violations, will only be flagged by the ALTER TABLE thread. The log is conceptually similar to the InnoDB change buffer.

The bulk load of index records will bypass record locking. We still generate redo log for writing the index pages. It would suffice to log page allocations only, and to flush the index pages from the buffer pool to the file system upon completion.


Starting with MySQL 5.6, InnoDB supports most ALTER TABLE operations natively. The notable exceptions are changes to the column type, ADD FOREIGN KEY except when foreign_key_checks=0, and changes to tables that contain FULLTEXT indexes.

The keyword ALGORITHM=INPLACE is somewhat misleading, because certain operations cannot be performed in-place. For example, changing the ROW_FORMAT of a table requires a rebuild.

Online operation (LOCK=NONE) is not allowed in the following cases:

  • when adding an AUTO_INCREMENT column,
  • when the table contains FULLTEXT indexes or a hidden FTS_DOC_ID column, or
  • when there are FOREIGN KEY constraints referring to the table, with ON…CASCADE or ON…SET NULL option.

The FOREIGN KEY limitations are needed, because MySQL does not acquire meta-data locks on the child or parent tables when executing SQL statements.

Theoretically, InnoDB could support operations like ADD COLUMN and DROP COLUMN in-place, by lazily converting the table to a newer format. This would require that the data dictionary keep multiple versions of the table definition. For simplicity, we will copy the entire table, even for DROP COLUMN.

The bulk copying of the table will bypass record locking and undo logging. For facilitating online operation, a temporary log will be associated with the clustered index of table. Threads that modify the table will also write the changes to the log.

When altering the table, we skip all records that have been marked for deletion. In this way, we can simply discard any undo log records that were not yet purged from the original table.

Off-page columns, or BLOBs, are an important consideration. We suspend the purge of delete-marked records if it would free any off-page columns from the old table. This is because the BLOBs can be needed when applying changes from the log. We have special logging for handling the ROLLBACK of an INSERT that inserted new off-page columns. This is because the columns will be freed at rollback.

New Enhancements for InnoDB Memcached

In MySQL 5.6, we continued our development on InnoDB Memcached and completed a few widely desirable features that make InnoDB Memcached a competitive feature in more scenario. Notablely, they are

1) Support multiple table mapping

2) Added background thread to auto-commit long running transactions

3) Enhancement in binlog performance

 Let’s go over each of these features one by one. And in the last section, we will go over a couple of internally performed performance tests.

Support multiple table mapping

In our earlier release, all InnoDB Memcached operations are mapped to a single InnoDB table. In the real life, user might want to use this InnoDB Memcached features on different tables. Thus being able to support access to different table at run time, and having different mapping for different connections becomes a very desirable feature. And in this GA release, we allow user just be able to do both. We will discuss the key concepts and key steps in using this feature.

1) "mapping name" in the "get" and "set" command

In order to allow InnoDB Memcached map to a new table, the user (DBA) would still require to "pre-register" table(s) in InnoDB Memcached “containers” table (there is security consideration for this requirement). If you would like to know about “containers” table, please refer to my earlier blogs in Once registered, the InnoDB Memcached will then be able to look for such table when they are referred.

Each of such registered table will have a unique "registration name" (or mapping_name) corresponding to the “name” field in the “containers” table.. To access these tables, user will include such "registration name" in their get or set commands, in the form of "get @@new_mapping_name.key", prefix "@@" is required for signaling a mapped table change. The key and the "mapping name" are separated by a configurable delimiter, by default, it is ".". So the syntax is:

get [@@mapping_name.]key_name

set [@@mapping_name.]key_name


 get @@mapping_name

set @@mapping_name

Here is an example:

Let's set up three tables in the "containers" table:

The first is a map to InnoDB table "test/demo_test" table with mapping name "setup_1"

INSERT INTO containers VALUES ("setup_1", "test", "demo_test",

"c1", "c2", "c3", "c4", "c5", "PRIMARY");

 Similarly, we set up table mappings for table "test/new_demo" with name "setup_2" and that to table "mydatabase/my_demo" with name "setup_3":

INSERT INTO containers VALUES ("setup_2", "test", "new_demo", "c1", "c2",

"c3", "c4", "c5", "secondary_index_x");

INSERT INTO containers VALUES ("setup_3", "my_database", "my_demo",

"c1", "c2", "c3", "c4", "c5", "idx");

To switch to table "my_database/my_demo", and get the value corresponding to “key_a”, user will do:

get @@setup_3.key_a

(this will also output the value that corresponding to key "key_a"

or simply

get @@setup_3

Once this is done, this connection will switch to "my_database/my_demo" table until another table mapping switch is requested. so it can continue issue regular command like:

get key_b

 set key_c 0 0 7

These DMLs will all be directed to "my_database/my_demo" table.

And this also implies that different connections can have different bindings (to different table).

2) Delimiter:

For the delimiter "." that separates the "mapping name" and key value, we also added a configure option in the "config_options" system table with name of "table_map_delimiter":

INSERT INTO config_options VALUES("table_map_delimiter", ".");

So if user wants to change to a different delimiter, they can change it in the config_option table.

3) Default mapping:

Once we have multiple table mapping, there should be always a "default" map setting. For this, we decided if there exists a mapping name of "default", then this will be chosen as default mapping. Otherwise, the first row of the containers table will chosen as default setting.

Please note, user tables can be repeated in the "containers" table (for example, user wants to access different columns of the table in different settings), as long as they are using different mapping/configure names in the first column, which is enforced by a unique index.

4) bind command

In addition, we also extend the protocol and added a bind command, its usage is fairly straightforward. To switch to "setup_3" mapping above, you simply issue:

bind setup_3

This will switch this connection's InnoDB table to "my_database/my_demo"

In summary, with this feature, you now can direct access to difference tables with difference session. And even a single connection, you can query into difference tables.

Background thread to auto-commit long running transactions

This is a feature related to the “batch” concept we discussed in earlier blogs. This “batch” feature allows us batch the read and write operations, and commit them only after certain calls. The “batch” size is controlled by the configure parameter “daemon_memcached_w_batch_size” and “daemon_memcached_r_batch_size”. This could significantly boost performance.

However, it also comes with some disadvantages, for example, you will not be able to view “uncommitted” operations from SQL end unless you set transaction isolation level to read_uncommitted, and in addition, this will held certain row locks for extend period of time that might reduce the concurrency.

To deal with this, we introduce a background thread that “auto-commits” the transaction if they are idle for certain amount of time (default is 5 seconds). The background thread will wake up every second and loop through every “connections” opened by Memcached, and check for idle transactions. And if such transaction is idle longer than certain limit and not being used, it will commit such transactions. This limit is configurable by change “innodb_api_bk_commit_interval”. Its default value is 5 seconds, and minimum is 1 second, and maximum is 1073741824 seconds.

With the help of such background thread, you will not need to worry about long running uncommitted transactions when set daemon_memcached_w_batch_size and daemon_memcached_r_batch_size to a large number. This also reduces the number of locks that could be held due to long running transactions, and thus further increase the concurrency.

Enhancement in binlog performance

As you might all know, binlog operation is not done by InnoDB storage engine, rather it is handled in the MySQL layer. In order to support binlog operation through InnoDB Memcached, we would have to artificially create some MySQL constructs in order to access binlog handler APIs. In previous lab release, for simplicity consideration, we open and destroy these MySQL constructs (such as THD) for each operations. This required us to set the “batch” size always to 1 when binlog is on, no matter what “daemon_memcached_w_batch_size” and “daemon_memcached_r_batch_size” are configured to. This put a big restriction on our capability to scale, and also there are quite a bit overhead in creating destroying such constructs that bogs the performance down.

With this release, we made necessary change that would keep MySQL constructs as long as they are valid for a particular connection. So there will not be repeated and redundant open and close (table) calls. And now even with binlog option is enabled (with innodb_api_enable_binlog,), we still can batch the transactions with daemon_memcached_w_batch_size and daemon_memcached_r_batch_size, thus scale the write/read performance.

Although there are still overheads that makes InnoDB Memcached cannot perform as fast as when binlog is turned off. It is much better off comparing to previous release. And we are continuing optimize the solution is this area to improve the performance as much as possible.

Performance Study:

Amerandra of our System QA team have conducted some performance studies on queries through our InnoDB Memcached connection and plain SQL end. And it shows some interesting results.

The test is conducted on a “Linux 2.6.32-300.7.1.el6uek.x86_64 ix86 (64)” machine with 16 GB Memory, Intel Xeon 2.0 GHz CPU X86_64 2 CPUs- 4 Core Each, 2 RAID DISKS (1027 GB,733.9GB). Results are described in following tables:

Table 1: Performance comparison on Set operations


5.6.7-RC-Memcached-plugin ( TPS / Qps) with memcached-threads=8***


X faster

Set (QPS)


















* mysql-5.6.7-rc-linux2.6-x86_64

** The “set” operation when implemented in InnoDB Memcached involves a couple of DMLs: it first query the table to see whether the “key” exists, if it does not, the new key/value pair will be inserted. If it does exist, the “value” field of matching row (by key) will be updated. So when used in above query, it is a precompiled store procedure, and query will just execute such procedures.

*** added “–daemon_memcached_option=-t8” (default is 4 threads)

So we can see with this “set” query, InnoDB Memcached can run 4.5 to 9 time faster than MySQL server.

Table 2: Performance comparison on Get operations


5.6.7-RC-Memcached-plugin ( TPS / Qps) with memcached-threads=8


X faster

Get (QPS)


















With the “get” query (or the select query), memcached performs 1.5 to 2 times faster than normal SQL.


In summary, we added several much-desired features to InnoDB Memcached in this release, allowing user to operate on different tables with this Memcached interface. We also now provide a background commit thread to commit long running idle transactions, thus allow user to configure large batch write/read without worrying about large number of rows held or not being able to see (uncommit) data. We also greatly enhanced the performance when Binlog is enabled. We will continue making efforts in both performance enhancement and functionality areas to make InnoDB Memcached a good demo case for our InnoDB APIs.

Jimmy Yang, September 29, 2012

Helping to Reduce Page Compression Failures Rate

When InnoDB compresses a page it needs the result to fit into its predetermined compressed page size (specified with KEY_BLOCK_SIZE). When the result does not fit we call that a compression failure. In this case InnoDB needs to split up the page and try to compress again. That said, compression failures are bad for performance and should be minimized.

Whether the result of the compression will fit largely depends on the data being compressed and some tables and/or indexes may contain more compressible data than others. And so it would be nice if the compression failure rate, along with other compression stats, could be monitored on a per table or even on a per index basis, wouldn't it?

This is where the new INFORMATION_SCHEMA table in MySQL 5.6 kicks in. INFORMATION_SCHEMA.INNODB_CMP_PER_INDEX provides exactly this helpful information. It contains the following fields:

| Field           | Type         | Null |
| database_name   | varchar(192) | NO   |
| table_name      | varchar(192) | NO   |
| index_name      | varchar(192) | NO   |
| compress_ops    | int(11)      | NO   |
| compress_ops_ok | int(11)      | NO   |
| compress_time   | int(11)      | NO   |
| uncompress_ops  | int(11)      | NO   |
| uncompress_time | int(11)      | NO   |

similarly to INFORMATION_SCHEMA.INNODB_CMP, but this time the data is grouped by "database_name,table_name,index_name" instead of by "page_size".

So a query like

compress_ops - compress_ops_ok AS failures
FROM information_schema.innodb_cmp_per_index
ORDER BY failures DESC;
would reveal the most problematic tables and indexes that have the highest compression failure rate.

From there on the way to improving performance would be to try to increase the compressed page size or change the structure of the table/indexes or the data being stored and see if it will have a positive impact on performance.

InnoDB Compression Improvements in MySQL 5.6

MySQL 5.6 comes with significant improvements for the compression support inside InnoDB. The enhancements that we'll talk about in this piece are also a good example of community contributions. The work on these was conceived, implemented and contributed by the engineers at Facebook. Before we plunge into the details let us familiarize ourselves with some of the key concepts surrounding InnoDB compression.

  • In InnoDB compressed pages are fixed size. Supported sizes are 1, 2, 4, 8 and 16K. The compressed page size is specified at table creation time.
  • InnoDB uses zlib for compression.
  • InnoDB buffer pool will attempt to cache compressed pages like normal pages. However, whenever a page is actively used by a transaction, we'll always have the uncompressed version of the page as well i.e.: we can have a page in the buffer pool in compressed only form or in a state where we have both the compressed page and uncompressed version but we'll never have a page in uncompressed only form. On-disk we'll always only have the compressed page.
  • When both compressed and uncompressed images are present in the buffer pool they are always kept in sync i.e.: changes are applied to both atomically.
  • Recompression happens when changes are made to the compressed data. In order to minimize recompressions InnoDB maintains a modification log within a compressed page. This is the extra space available in the page after compression and it is used to log modifications to the compressed data thus avoiding recompressions.
  • DELETE (and ROLLBACK of DELETE) and purge can be performed without recompressing the page. This is because the delete-mark bit and the system fields DB_TRX_ID and DB_ROLL_PTR are stored in uncompressed format on the compressed page. A record can be purged by shuffling entries in the compressed page directory. This can also be useful for updates of indexed columns, because UPDATE of a key is mapped to INSERT+DELETE+purge.
  • A compression failure happens when we attempt to recompress a page and it does not fit in the fixed size. In such case, we first try to reorganize the page and attempt to recompress and if that fails as well then we split the page into two and recompress both pages.

Now lets talk about the three major improvements that we made in MySQL 5.6.

Logging of Compressed Page Images:
InnoDB used to log entire compressed data on the page to the redo logs when recompression happens. This was an extra safety measure to guard against the rare case where an attempt is made to do recovery using a different zlib version from the one that was used before the crash. Because recovery is a page level operation in InnoDB we have to be sure that all recompress attempts must succeed without causing a btree page split. However, writing entire compressed data images to the redo log files not only makes the operation heavy duty but can also adversely affect flushing activity. This happens because redo space is used in a circular fashion and when we generate much more than normal redo we fill up the space much more quickly and in order to reuse the redo space we have to flush the corresponding dirty pages from the buffer pool.
Starting with MySQL 5.6 a new global configuration parameter innodb_log_compressed_pages. The default value is true which is same as the current behavior. If you are sure that you are not going to attempt to recover from a crash using a different version of zlib then you should set this parameter to false. This is a dynamic parameter.

Compression Level:
You can now set the compression level that zlib should choose to compress the data. The global parameter is innodb_compression_level - the default value is 6 (the zlib default) and allowed values are 1 to 9. Again the parameter is dynamic i.e.: you can change it on the fly.

Dynamic Padding to Reduce Compression Failures:
Compression failures are expensive in terms of CPU. We go through the hoops of recompress, failure, reorganize, recompress, failure and finally page split. At the same time, how often we encounter compression failure depends largely on the compressibility of the data. In MySQL 5.6, courtesy of Facebook engineers, we have an adaptive algorithm based on per-index statistics that we gather about compression operations. The idea is that if a certain index/table is experiencing too many compression failures then we should try to pack the 16K uncompressed version of the page less densely i.e.: we let some space in the 16K page go unused in an attempt that the recompression won't end up in a failure. In other words, we dynamically keep adding 'pad' to the 16K page till we get compression failures within an agreeable range. It works the other way as well, that is we'll keep removing the pad if failure rate is fairly low. To tune the padding effort two configuration variables are exposed.

  • innodb_compression_failure_threshold_pct: default 5, range 0 - 100,dynamic, implies the percentage of compress ops to fail before we start using to padding. Value 0 has a special meaning of disabling the padding.
  • innodb_compression_pad_pct_max: default 50, range 0 - 75, dynamic, the  maximum percentage of uncompressed data page that can be reserved as pad.

Performance Enhancement in Full-Text Search Query

Ever since its first release, we are continuing consolidating and developing InnoDB Full-Text Search feature. There is one recent improvement that worth blogging about. It is an effort with MySQL Optimizer team that simplifies some common queries Query Plans and dramatically shorted the query time. I will describe the issue, our solution and the end result by some performance numbers to demonstrate our efforts in continuing enhancement the Full-Text Search capability.

The Issue:

As we had discussed in previous Blogs, InnoDB implements Full-Text index as reversed auxiliary tables. The query once parsed will be reinterpreted into several queries into related auxiliary tables and then results are merged and consolidated to come up with the final result. So at the end of the query, well have all matching records on hand, sorted by their ranking or by their Doc IDs.

Unfortunately, MySQLs optimizer and query processing had been initially designed for MyISAM Full-Text index, and sometimes did not fully utilize the complete result package from InnoDB.

Here are a couple examples:

Case 1: Query result ordered by Rank with only top N results:

mysql> SELECT FTS_DOC_ID, MATCH (title, body) AGAINST ('database')


In this query, user tries to retrieve a single record with highest ranking. It should have a quick answer once we have all the matching documents on hand, especially if there are ranked. However, before this change, MySQL would almost retrieve rankings for almost every row in the table, sort them and them come with the top rank result. This whole retrieve and sort is quite unnecessary given the InnoDB already have the answer.

In a real life case, user could have millions of rows, so in the old scheme, it would retrieve millions of rows' ranking and sort them, even if our FTS already found there are two 3 matched rows. Apparently, the million ranking retrieve is done in vain. In above case, it should just ask for 3 matched rows' ranking, all other rows' ranking are 0. If it want the top ranking, then it can just get the first record from our already sorted result.

Case 2: Select Count(*) on matching records:

mysql> SELECT COUNT(*) FROM articles WHERE MATCH (title,body)


In this case, InnoDB search can find matching rows quickly and will have all matching rows. However, before our change, in the old scheme, every row in the table was requested by MySQL one by one, just to check whether its ranking is larger than 0, and later comes up a count.

In fact, there is no need for MySQL to fetch all rows, instead InnoDB already had all the matching records. The only thing need is to call an InnoDB API to retrieve the count

The difference can be huge. Following query output shows how big the difference can be:

mysql> select count(*) from searchindex_inno where match(si_title, si_text) against ('people') 

| count(*) |
| 666877 |
1 row in set (16 min 17.37 sec)

So the query took almost 16 minutes.

Lets see how long the InnoDB can come up the result. In InnoDB, you can obtain extra diagnostic printout by turning on innodb_ft_enable_diag_print, this will print out extra query info:

Error log:

keynr=2, 'people'
NL search
Total docs: 10954826 Total words: 0
UNION: Searching: 'people'
Processing time: 2 secs: row(s) 666877: error: 10
keynr=2, 'people'
NL search
Total docs: 10954826 Total words: 0
UNION: Searching: 'people'
Processing time: 3 secs: row(s) 666877: error: 10

Output shows it only took InnoDB only 3 seconds to get the result, while the whole query took 16 minutes to finish. So large amount of time has been wasted on the un-needed row fetching.

The Solution:

The solution is obvious. MySQL can skip some of its steps, optimize its plan and obtain useful information directly from InnoDB. Some of savings from doing this include:

1) Avoid redundant sorting. Since InnoDB already sorted the result according to ranking. MySQL Query Processing layer does not need to sort to get top matching results.

2) Avoid row by row fetching to get the matching count. InnoDB provides all the matching records. All those not in the result list should all have ranking of 0, and no need to be retrieved. And InnoDB has a count of total matching records on hand. No need to recount.

3) Covered index scan. InnoDB results always contains the matching records' Document ID and their ranking. So if only the Document ID and ranking is needed, there is no need to go to user table to fetch the record itself.

4) Narrow the search result early, reduce the user table access. If the user wants to get top N matching records, we do not need to fetch all matching records from user table. We should be able to first select TOP N matching DOC IDs, and then only fetch corresponding records with these Doc IDs.

Performance Results and comparison with MyISAM

The result by this change is very obvious. I includes six testing result performed by Alexander Rubin just to demonstrate how fast the InnoDB query now becomes when comparing MyISAM Full-Text Search.

These tests are base on the English Wikipedia data of 5.4 Million rows and approximately 16G table. The test was performed on a machine with 1 CPU Dual Core, SSD drive, 8G of RAM and InnoDB_buffer_pool is set to 8 GB.


mysql> SELECT si_title, match(si_title, si_text) against('family') as rel FROM si WHERE match(si_title, si_text) against('family') ORDER BY rel desc LIMIT 10;



Times Faster

Time for the query

1.63 sec

3 min 26.31 sec


You can see for this particular query (retrieve top 10 records), InnoDB Full-Text Search is now approximately 127 times faster than MyISAM.


mysql>select count(*) from si where match(si_title, si_text) against('family‘);

| count(*) |
| 293955 |



Times Faster

Time for the query

1.35 sec

28 min 59.59 sec


In this particular case, where there are 293k matching results, InnoDB took only 1.35 second to get all of them, while take MyISAM almost half an hour, that is about 1289 times faster!.

Table 3: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms

mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10;


InnoDB (time to execute)

MyISAM(time to execute)

Times Faster


0.5 sec

5.05 sec


family film

0.95 sec

25.39 sec


Pizza restaurant orange county California

0.93 sec

32.03 sec


President united states of America

2.5 sec

36.98 sec


Table 4: SELECT title and text with ORDER BY and LIMIT CLAUSE for selected terms

mysql> SELECT <ID>, si_title, si_text, ... as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10;


InnoDB (time to execute)

MyISAM(time to execute)

Times Faster


0.61 sec

41.65 sec


family film

1.15 sec

47.17 sec


Pizza restaurant orange county california

1.03 sec

48.2 sec


President united states of america

2.49 sec

44.61 sec


Table 5: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms

mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel  FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10;


InnoDB (time to execute)

MyISAM(time to execute)

Times Faster


0.5 sec

5.05 sec


family film

0.95 sec

25.39 sec


Pizza restaurant orange county califormia

0.93 sec

32.03 sec


President united states of america

2.5 sec

36.98 sec


Table 6: SELECT COUNT(*)

mysql> SELECT count(*) FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) LIMIT 10;


InnoDB (time to execute)

MyISAM(time to execute)

Times Faster


0.47 sec

82 sec


family film

0.83 sec

131 sec


Pizza restaurant orange county califormia

0.74 sec

106 sec


President united states of america

1.96 sec

220 sec


 Again, table 3 to table 6 all showing InnoDB consistently outperform MyISAM in these queries by a large margin. It becomes obvious the InnoDB has great advantage over MyISAM in handling large data search.


These results demonstrate the great performance we could achieve by making MySQL optimizer and InnoDB Full-Text Search more tightly coupled. I think there are still many cases that InnoDB’s result info have not been fully taken advantage of, which means we still have great room to improve. And we will continuously explore the area, and get more dramatic results for InnoDB full-text searches.

Jimmy Yang, September 29, 2012


This is the InnoDB team blog.


« September 2012 »