By Marko Mäkelä on Sep 29, 2012
MySQL before the InnoDB Plugin
Traditionally, the MySQL storage engine interface has taken a
minimalistic approach to data definition language. The only natively
supported operations were
RENAME TABLE. Consider the following
CREATE TABLE t(a INT); INSERT INTO t VALUES (1),(2),(3); CREATE INDEX a ON t(a); DROP TABLE t;
CREATE INDEX statement would be executed roughly
CREATE TABLE temp(a INT, INDEX(a)); INSERT INTO temp SELECT * FROM t; RENAME TABLE t TO temp2; RENAME TABLE temp TO t; DROP TABLE temp2;
You could imagine that the database could crash when copying all
rows from the original table to the new one. For example, it could run
out of file space. Then, on restart, InnoDB would roll back the
INSERT transaction. To fix things a little, a hack
was added to
ha_innobase::write_row for committing the
transaction every 10,000 rows.
Still, it was frustrating that even a simple
INDEX would make the table unavailable for modifications for a
Fast Index Creation in the InnoDB Plugin of MySQL 5.1
MySQL 5.1 introduced a new interface for
DROP INDEX. The old table-copying approach can still
be forced by
This interface is used in MySQL 5.5 and in the InnoDB Plugin for
MySQL 5.1. Apart from the ability to do a quick
INDEX, the main advantage is that InnoDB will execute a
merge-sort algorithm before inserting the index records into each
index that is being created. This should speed up the insert into the
secondary index B-trees and potentially result in a better B-tree fill
ALTER TABLE interface was not perfect. For
DROP FOREIGN KEY still invoked the table
copy. Renaming columns could conflict with InnoDB foreign key
ADD KEY and
ALTER TABLE was problematic and not atomic inside the
ALTER TABLE interface in MySQL 5.6
ALTER TABLE storage engine interface was
completely rewritten in MySQL 5.6. Instead of introducing a method
call for every conceivable operation, MySQL 5.6 introduced a handful
of methods, and data structures that keep track of the requested
In MySQL 5.6, online
ALTER TABLE operation can be
LOCK=EXCLUSIVE are available. The old-style table
copying can be requested by
ALGORITHM=COPY. That one will
require at least
LOCK=SHARED. From the InnoDB point of
view, anything that is possible with
also possible with
ALGORITHM=INPLACE operations inside InnoDB can be
executed online (
LOCK=NONE). InnoDB will always require
an exclusive table lock in two phases of the operation. The execution
phases are tied to a number of methods:
- Checks if the storage engine can perform all requested operations, and if so, what kind of locking is needed.
- InnoDB uses this method to set up the data dictionary cache for
CREATE INDEXoperation. We need stubs for the new indexes, so that we can keep track of changes to the table during online index creation. Also, crash recovery would drop any indexes that were incomplete at the time of the crash.
- In InnoDB, this method is used for creating secondary indexes or for rebuilding the table. This is the ‘main’ phase that can be executed online (with concurrent writes to the table).
- This is where the operation is committed or rolled back. Here, InnoDB would drop any indexes, rename any columns, drop or add foreign keys, and finalize a table rebuild or index creation. It would also discard any logs that were set up for online index creation or table rebuild.
commit phases require an
exclusive lock, blocking all access to the table. If MySQL times out
while upgrading the table meta-data lock for the
phase, it will roll back the
ALTER TABLE operation.
In MySQL 5.6, data definition language operations are still not
fully atomic, because the data dictionary is split. Part of it is
inside InnoDB data dictionary tables. Part of the information is only
available in the
*.frm file, which is not covered by any
crash recovery log. But, there is a single commit phase inside the
Online Secondary Index Creation
It may occur that an index needs to be created on a new column to speed up queries. But, it may be unacceptable to block modifications on the table while creating the index.
It turns out that it is conceptually not so hard to support online index creation. All we need is some more execution phases:
- Set up a stub for the index, for logging changes.
- Scan the table for index records.
- Sort the index records.
- Bulk load the index records.
- Apply the logged changes.
- Replace the stub with the actual index.
Threads that modify the table will log the operations to the logs
of each index that is being created. Errors, such as log overflow or
uniqueness violations, will only be flagged by the
TABLE thread. The log is conceptually similar to
the InnoDB change
The bulk load of index records will bypass record locking. We still generate redo log for writing the index pages. It would suffice to log page allocations only, and to flush the index pages from the buffer pool to the file system upon completion.
Native ALTER TABLE
Starting with MySQL 5.6, InnoDB supports most
TABLE operations natively. The notable exceptions are changes
to the column type,
ADD FOREIGN KEY except
foreign_key_checks=0, and changes to tables that
ALGORITHM=INPLACE is somewhat misleading,
because certain operations cannot be performed in-place. For example,
ROW_FORMAT of a table requires a
Online operation (
LOCK=NONE) is not allowed in the
- when adding an
- when the table contains
FULLTEXTindexes or a hidden
- when there are
FOREIGN KEYconstraints referring to the table, with
FOREIGN KEY limitations are needed, because MySQL
does not acquire meta-data locks on the child or parent tables when
executing SQL statements.
Theoretically, InnoDB could support operations like
DROP COLUMN in-place, by lazily
converting the table to a newer format. This would require that the
data dictionary keep multiple versions of the table definition. For
simplicity, we will copy the entire table, even for
The bulk copying of the table will bypass record locking and undo logging. For facilitating online operation, a temporary log will be associated with the clustered index of table. Threads that modify the table will also write the changes to the log.
When altering the table, we skip all records that have been marked for deletion. In this way, we can simply discard any undo log records that were not yet purged from the original table.
Off-page columns, or BLOBs, are an important consideration. We
suspend the purge of delete-marked records if it would free any
off-page columns from the old table. This is because the BLOBs can be
needed when applying changes from the log. We have special logging for
ROLLBACK of an
inserted new off-page columns. This is because the columns will be
freed at rollback.