Tips and Tricks for Faster DDL
By Calvin Sun on Apr 07, 2011
Note: this article was originally published on http://blogs.innodb.com on April 6, 2011 by Calvin Sun.
Data Dictionary Language (DDL) operations have traditionally been slow in MySQL. Any change to the table definition would be implemented by creating a copy of the table and index layout with the requested changes, copying the table contents row by row, and finally renaming tables and dropping the original table.
The InnoDB Plugin for MySQL 5.1 implements a more efficient interface for creating and dropping indexes. Indexes can be created or dropped without rebuilding the entire table.
Speeding up Bulk
Normally, InnoDB would update all indexes of a table when rows are inserted or deleted. If you update an indexed column, InnoDB would have to delete the old value and insert the new value in the corresponding index. If you update a primary key column, the row would be deleted and inserted in every index of the table. This can be slow even despite the change buffering in MySQL 5.5.
Bulk data imports, bulk deletes and bulk updates could be executed faster if they did not have to update too many secondary indexes. Provided that there are no operations running that would benefit from the existence of the indexes while the bulk operation is in progress, it could be useful to drop the affected indexes before starting a bulk operation and to create the indexes after the operation.
Consider this artificial example:
CREATE TABLE friends(
id INT PRIMARY KEY AUTO_INCREMENT,
name CHAR(50) NOT NULL,
Yes, it is a bad idea to have
age in the table, because
it depends on the current time. A birthday would be more accurate and
would not need to be updated. When you update
age, you will be updating every non-
row. It probably pays off to drop the affected index for the duration
of the operation. Other indexes can be left alone, because
age is not part of them or the
ALTER TABLE friends DROP INDEX age;
UPDATE friends SET age=age+1;
ALTER TABLE friends ADD INDEX(age);
INSERT operations can be sped up by creating the
secondary indexes afterwards. In this way, the records will be merge
sorted and then inserted into the InnoDB index B-trees in order, using
sequentially allocated pages. This will create a more optimal layout
especially for indexes whose columns are updated rarely. To continue our
ALTER TABLE friends DROP INDEX name, DROP INDEX age;
LOAD DATA INFILE 'friends.txt' INTO TABLE friends;
ALTER TABLE friends ADD INDEX (name), ADD INDEX (age);
At some point, we may get suspicious about grey-haired people who are younger than 30 years:
DELETE FROM friends WHERE age<30 AND hair_color='grey';
Depending on the number of affected rows, it might pay off to drop and re-create all indexes, or to optimize the whole table.
OPTIMIZE TABLE statement in MySQL makes a copy of
the table, copying rows one by one to every index. Because the secondary
indexes are not likely to be in the order of the
PRIMARY KEY (for example, the
(age,id) tuples in the secondary index B-tree
age might not be
(50,1),(42,2),(65,5),(45,8)) this will lead to the secondary indexes being populated out of order.
If you want to create every index in order, you might want to mimic
OPTIMIZE TABLE like this:
CREATE TABLE optimized_friends LIKE friends;
ALTER TABLE optimized_friends DROP INDEX age, DROP INDEX name;
INSERT INTO optimized_friends SELECT * FROM FRIENDS;
ALTER TABLE optimized_friends ADD INDEX(age), ADD INDEX(name);
RENAME TABLE friends TO unoptimized_friends;
RENAME TABLE optimized_friends TO friends;
DROP TABLE unoptimized_friends;
INSERT…SELECT will copy the data in the
PRIMARY KEY order, which should allocate and populate the clustered index B-tree in sequential order. Then, the
will create the secondary indexes. As the index entries will be merge
sorted by InnoDB before inserting, the secondary indexes will be in
order as well.
ALTER TABLE … ADD INDEX statement instead of multiple
has the benefit that the data rows will be scanned only once to create
the temporary merge sort files for all new indexes. If you are short on
temporary file space, you might not want to create all indexes of a
large table in one go.
If you suspect that only some secondary indexes are fragmented, you can
ADD only those indexes.
While the fill factor of the index B-trees cannot be controlled directly in InnoDB, it may happen that sequentially inserted records result in a tighter packed tree than records that were inserted in random order. This can be either a good or bad thing, depending on the number and nature of future modifications.
If you want the old table-copying behavior back, you can