Tips and Tricks for Faster DDL

Note: this article was originally published on http://blogs.innodb.com on April 6, 2011 by Calvin Sun.

Data Dictionary Language (DDL) operations have traditionally been slow in MySQL. Any change to the table definition would be implemented by creating a copy of the table and index layout with the requested changes, copying the table contents row by row, and finally renaming tables and dropping the original table.

The InnoDB Plugin for MySQL 5.1 implements a more efficient interface for creating and dropping indexes. Indexes can be created or dropped without rebuilding the entire table.

Speeding up Bulk INSERT, UPDATE and DELETE Operations

Normally, InnoDB would update all indexes of a table when rows are inserted or deleted. If you update an indexed column, InnoDB would have to delete the old value and insert the new value in the corresponding index. If you update a primary key column, the row would be deleted and inserted in every index of the table. This can be slow even despite the change buffering in MySQL 5.5.

Bulk data imports, bulk deletes and bulk updates could be executed faster if they did not have to update too many secondary indexes. Provided that there are no operations running that would benefit from the existence of the indexes while the bulk operation is in progress, it could be useful to drop the affected indexes before starting a bulk operation and to create the indexes after the operation.

Consider this artificial example:

CREATE TABLE friends(
id INT PRIMARY KEY AUTO_INCREMENT,
name CHAR(50) NOT NULL,
age INT,
hair_color ENUM('red','black','blond','grey','artificial','other'),
INDEX(name), INDEX(age)
) ENGINE=InnoDB;

Yes, it is a bad idea to have age in the table, because it depends on the current time. A birthday would be more accurate and would not need to be updated. When you update age, you will be updating every non-NULL row. It probably pays off to drop the affected index for the duration of the operation. Other indexes can be left alone, because age is not part of them or the PRIMARY KEY:

ALTER TABLE friends DROP INDEX age;
UPDATE friends SET age=age+1;
ALTER TABLE friends ADD INDEX(age);

Bulk INSERT operations can be sped up by creating the secondary indexes afterwards. In this way, the records will be merge sorted and then inserted into the InnoDB index B-trees in order, using sequentially allocated pages. This will create a more optimal layout especially for indexes whose columns are updated rarely. To continue our example:

ALTER TABLE friends DROP INDEX name, DROP INDEX age;
LOAD DATA INFILE 'friends.txt' INTO TABLE friends;
ALTER TABLE friends ADD INDEX (name), ADD INDEX (age);

At some point, we may get suspicious about grey-haired people who are younger than 30 years:

DELETE FROM friends WHERE age<30 AND hair_color='grey';

Depending on the number of affected rows, it might pay off to drop and re-create all indexes, or to optimize the whole table.

Outdoing OPTIMIZE TABLE

The OPTIMIZE TABLE statement in MySQL makes a copy of the table, copying rows one by one to every index. Because the secondary indexes are not likely to be in the order of the PRIMARY KEY (for example, the (age,id) tuples in the secondary index B-tree age might not be (42,1),(45,2),(50,4),(65,7) but (50,1),(42,2),(65,5),(45,8)) this will lead to the secondary indexes being populated out of order.

If you want to create every index in order, you might want to mimic OPTIMIZE TABLE like this:

CREATE TABLE optimized_friends LIKE friends;
ALTER TABLE optimized_friends DROP INDEX age, DROP INDEX name;
INSERT INTO optimized_friends SELECT * FROM FRIENDS;
ALTER TABLE optimized_friends ADD INDEX(age), ADD INDEX(name);
RENAME TABLE friends TO unoptimized_friends;
RENAME TABLE optimized_friends TO friends;
DROP TABLE unoptimized_friends;

The INSERT…SELECT will copy the data in the PRIMARY KEY order, which should allocate and populate the clustered index B-tree in sequential order. Then, the ALTER TABLE will create the secondary indexes. As the index entries will be merge sorted by InnoDB before inserting, the secondary indexes will be in order as well.

Issuing one ALTER TABLE … ADD INDEX statement instead of multiple CREATE INDEX has the benefit that the data rows will be scanned only once to create the temporary merge sort files for all new indexes. If you are short on temporary file space, you might not want to create all indexes of a large table in one go.

If you suspect that only some secondary indexes are fragmented, you can DROP and ADD only those indexes.

Considerations

While the fill factor of the index B-trees cannot be controlled directly in InnoDB, it may happen that sequentially inserted records result in a tighter packed tree than records that were inserted in random order. This can be either a good or bad thing, depending on the number and nature of future modifications.

If you want the old table-copying behavior back, you can SET old_alter_table=1.

Comments:

Post a Comment:
Comments are closed for this entry.
About

This is the InnoDB team blog.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today