InnoDB Compression Improvements in MySQL 5.6
By Inaam Rana on Sep 29, 2012
MySQL 5.6 comes with significant improvements for the compression support inside InnoDB. The enhancements that we'll talk about in this piece are also a good example of community contributions. The work on these was conceived, implemented and contributed by the engineers at Facebook. Before we plunge into the details let us familiarize ourselves with some of the key concepts surrounding InnoDB compression.
- In InnoDB compressed pages are fixed size. Supported sizes are 1, 2, 4, 8 and 16K. The compressed page size is specified at table creation time.
- InnoDB uses zlib for compression.
- InnoDB buffer pool will attempt to cache compressed pages like normal pages. However, whenever a page is actively used by a transaction, we'll always have the uncompressed version of the page as well i.e.: we can have a page in the buffer pool in compressed only form or in a state where we have both the compressed page and uncompressed version but we'll never have a page in uncompressed only form. On-disk we'll always only have the compressed page.
- When both compressed and uncompressed images are present in the buffer pool they are always kept in sync i.e.: changes are applied to both atomically.
- Recompression happens when changes are made to the compressed data. In order to minimize recompressions InnoDB maintains a modification log within a compressed page. This is the extra space available in the page after compression and it is used to log modifications to the compressed data thus avoiding recompressions.
- DELETE (and ROLLBACK of DELETE) and purge can be performed without recompressing the page. This is because the delete-mark bit and the system fields DB_TRX_ID and DB_ROLL_PTR are stored in uncompressed format on the compressed page. A record can be purged by shuffling entries in the compressed page directory. This can also be useful for updates of indexed columns, because UPDATE of a key is mapped to INSERT+DELETE+purge.
- A compression failure happens when we attempt to recompress a page and it does not fit in the fixed size. In such case, we first try to reorganize the page and attempt to recompress and if that fails as well then we split the page into two and recompress both pages.
Now lets talk about the three major improvements that we made in MySQL 5.6.
Logging of Compressed Page Images:
InnoDB used to log entire compressed data on the page to the redo logs when recompression happens. This was an extra safety measure to guard against the rare case where an attempt is made to do recovery using a different zlib version from the one that was used before the crash. Because recovery is a page level operation in InnoDB we have to be sure that all recompress attempts must succeed without causing a btree page split. However, writing entire compressed data images to the redo log files not only makes the operation heavy duty but can also adversely affect flushing activity. This happens because redo space is used in a circular fashion and when we generate much more than normal redo we fill up the space much more quickly and in order to reuse the redo space we have to flush the corresponding dirty pages from the buffer pool.
Starting with MySQL 5.6 a new global configuration parameter innodb_log_compressed_pages. The default value is true which is same as the current behavior. If you are sure that you are not going to attempt to recover from a crash using a different version of zlib then you should set this parameter to false. This is a dynamic parameter.
You can now set the compression level that zlib should choose to compress the data. The global parameter is innodb_compression_level - the default value is 6 (the zlib default) and allowed values are 1 to 9. Again the parameter is dynamic i.e.: you can change it on the fly.
Dynamic Padding to Reduce Compression Failures:
Compression failures are expensive in terms of CPU. We go through the hoops of recompress, failure, reorganize, recompress, failure and finally page split. At the same time, how often we encounter compression failure depends largely on the compressibility of the data. In MySQL 5.6, courtesy of Facebook engineers, we have an adaptive algorithm based on per-index statistics that we gather about compression operations. The idea is that if a certain index/table is experiencing too many compression failures then we should try to pack the 16K uncompressed version of the page less densely i.e.: we let some space in the 16K page go unused in an attempt that the recompression won't end up in a failure. In other words, we dynamically keep adding 'pad' to the 16K page till we get compression failures within an agreeable range. It works the other way as well, that is we'll keep removing the pad if failure rate is fairly low. To tune the padding effort two configuration variables are exposed.
- innodb_compression_failure_threshold_pct: default 5, range 0 - 100,dynamic, implies the percentage of compress ops to fail before we start using to padding. Value 0 has a special meaning of disabling the padding.
- innodb_compression_pad_pct_max: default 50, range 0 - 75, dynamic, the maximum percentage of uncompressed data page that can be reserved as pad.