Allow UNDO logs to reside in their own tablespace
By Calvin Sun on Jul 25, 2011
Note: this article was originally published on http://blogs.innodb.com on July 25, 2011 by Sunny Bains.
The InnoDB UNDO entries reside in a special system table called the UNDO log. This log is made up of several segments, called rollback segments. A segment in InnoDB is similar to what a file would be in a file system, e.g., user tables and indexes are also stored as separate segments within the same tablespace; only their format is different. In that sense there is nothing special about InnoDB UNDO logs. This feature allows the UNDO log to be stored across several tablespaces.
UNDO logs contain the before image of modified records. There are two types of UNDO records, one for inserts and another for updates. The insert UNDO records are needed only for transaction rollback and can be discarded as soon as the transaction commits. The update records are used for rollback, for MVCC and by purge. It is because of purge that we can’t just remove the UNDO log records once they are no longer referenced by any running transaction’s snapshot view (for MVCC). When a transaction is started it is assigned a rollback segment in a round robin scheme. Multiple transactions can be assigned the same rollback segment to write their changes. Up to a maximum of 1024 transactions can use the same rollback segment. If you have more rollback segments, there is less contention around the rollback segment mutex.
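The round robin assignment described above can be sketched as follows. This is a minimal illustration, not InnoDB's actual code: the class and method names are hypothetical, and skipping a full segment to try the next one is an assumption made for the sketch. Only the round robin order and the limit of 1024 transactions per segment come from the text.

```python
MAX_TRX_PER_RSEG = 1024  # per-segment limit stated in the article


class RollbackSegment:
    """Hypothetical stand-in for a rollback segment."""

    def __init__(self, rseg_id):
        self.rseg_id = rseg_id
        self.active_trx = set()  # ids of transactions currently using it


class RsegAssigner:
    """Hands out rollback segments to new transactions in round robin order."""

    def __init__(self, n_rsegs):
        self.rsegs = [RollbackSegment(i) for i in range(n_rsegs)]
        self._next = 0  # segment to try first for the next transaction

    def assign(self, trx_id):
        # Try each segment at most once, starting at the round robin cursor.
        # Skipping full segments is an assumption of this sketch.
        for _ in range(len(self.rsegs)):
            rseg = self.rsegs[self._next]
            self._next = (self._next + 1) % len(self.rsegs)
            if len(rseg.active_trx) < MAX_TRX_PER_RSEG:
                rseg.active_trx.add(trx_id)
                return rseg.rseg_id
        raise RuntimeError("all rollback segments are full")

    def release(self, trx_id, rseg_id):
        # Called when the transaction ends and frees its slot.
        self.rsegs[rseg_id].active_trx.discard(trx_id)


assigner = RsegAssigner(n_rsegs=4)
assigned = [assigner.assign(trx_id) for trx_id in range(6)]
print(assigned)  # [0, 1, 2, 3, 0, 1]
```

With more segments, consecutive transactions land on different segments, which is why adding segments spreads out contention on any one segment's mutex.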
The purge thread(s) run in the background and read the UNDO log records from the oldest to the latest, up to but not including the oldest active snapshot view in the system. Purge parses the UNDO log records and, for entries that represent delete operations, uses the stored index key to search for the records in both the secondary and primary indexes and purges the entries, modifying the index tree structure if required. Normal DML operations simply delete-mark the records but don’t physically purge them, unless it is an insert that is being rolled back. This is to avoid expensive tree-modifying operations in DML code. Once purge is finished with the UNDO entries, it truncates the UNDO log up to where it has processed the entries. For MVCC, user transactions need to follow the DATA_ROLL_PTR pointer to the UNDO log to build a previous version of the row.
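A much simplified sketch of the purge loop described above: process UNDO records from oldest to newest, stop at the oldest active view, physically remove delete-marked entries, then truncate the log. The `UndoRecord` fields, the `purge` function and the dict that stands in for the primary and secondary indexes are all hypothetical; the real implementation works on index trees and read views.

```python
from dataclasses import dataclass


@dataclass
class UndoRecord:
    trx_no: int       # commit order of the transaction that wrote the record
    is_delete: bool   # True if the record represents a delete operation
    index_key: str    # key used to locate the delete-marked index entry


def purge(undo_log, oldest_view_trx_no, index):
    """Purge records older than the oldest active view, then truncate.

    undo_log is a list ordered oldest first. index maps key -> state and
    is a toy stand-in for the primary and secondary indexes.
    """
    processed = 0
    for rec in undo_log:
        if rec.trx_no >= oldest_view_trx_no:
            break  # may still be needed by an active snapshot view
        if rec.is_delete and index.get(rec.index_key) == "delete-marked":
            del index[rec.index_key]  # physical removal happens only here
        processed += 1
    # Truncate the log up to where the entries have been processed.
    del undo_log[:processed]
    return processed


index = {"a": "delete-marked", "b": "live", "c": "delete-marked"}
log = [
    UndoRecord(1, True, "a"),
    UndoRecord(2, False, "b"),
    UndoRecord(5, True, "c"),
]
n = purge(log, oldest_view_trx_no=5, index=index)
print(n, sorted(index), len(log))  # 2 ['b', 'c'] 1
```

Entry "c" stays delete-marked because its UNDO record is not older than the oldest active view, so a running transaction might still need it.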
Why did we make this change
All this activity results in lots of random IO. To reduce the overhead of that random IO, this feature allows the user to place the UNDO logs anywhere in the file system. This flexibility can be used to place them on an SSD, for example.
New configuration variables
- innodb_undo_dir – This should be set to the directory where all the configured UNDO tablespaces will be created. Can be an absolute path too. Default is “.”.
- innodb_undo_tablespaces – This should be set to the number of UNDO tablespaces that you want to create. Note: This must be set at the time of database creation. Once the database has been created, it cannot be changed.
- innodb_undo_logs (renamed from innodb_rollback_segments) – This should be set to the minimum number of rollback segments that you want. More can be added later but they cannot be removed. However, setting it to a smaller value at runtime will ensure that only the number set will be used when assigning to transactions.
There are a few gotchas here that you should be aware of. The innodb_undo_tablespaces variable can be set only at the time of database creation, and the value must stay the same for the lifetime of the database. More UNDO tablespaces cannot be added and existing tablespaces cannot be dropped. We may add better UNDO tablespace management in the future.
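As an illustration of how the three variables fit together, here is a small sketch that renders them as an option-file section. The helper name `render_undo_config`, the `[mysqld]` section used for the output, the directory `/ssd/innodb_undo` and the values are assumptions chosen for the example; only the three variable names come from the text.

```python
import configparser
import io


def render_undo_config(undo_dir, undo_tablespaces, undo_logs):
    """Render the UNDO settings as an option-file section (hypothetical helper)."""
    cfg = configparser.ConfigParser()
    cfg["mysqld"] = {
        "innodb_undo_dir": undo_dir,
        # Must be chosen before the database is created; fixed afterwards.
        "innodb_undo_tablespaces": str(undo_tablespaces),
        # Can be raised later, but segments are never removed.
        "innodb_undo_logs": str(undo_logs),
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Hypothetical SSD mount point and illustrative values.
print(render_undo_config("/ssd/innodb_undo", undo_tablespaces=4, undo_logs=1))
```

Because innodb_undo_tablespaces is fixed at creation time, it is worth deciding on its value up front and treating innodb_undo_logs as the knob that can still be adjusted afterwards.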
Case 1. 128 rollback segments spread across 128 tablespaces.
- innodb_undo_logs = 128
- innodb_undo_tablespaces = 127 (we always create one log in tablespace 0, this is to avoid breaking the system tablespace layout)
Case 2. Start with one rollback segment but may want to create more later and spread them across say 4 UNDO tablespaces
- innodb_undo_tablespaces = 4
- innodb_undo_logs = 1
This will create the UNDO tablespaces, but they will be empty; the single rollback segment that is created in tablespace 0 will do the work. Later, if you want to add more rollback segments, you will need to shut down the server and set the innodb_undo_logs variable to the number that you want. Let’s say we add 2 more rollback segments. We would restart the server with the following parameters:
- innodb_undo_tablespaces = 4
- innodb_undo_logs = 3 - This will create two new rollback segments in the UNDO tablespace 1 and UNDO tablespace 2. Also this will disable the use of the rollback segment in tablespace 0.
If there were some UNDO records in rollback segment 1, they will be processed as normal. The only change is that no new ones will be written to rollback segment 1. It will be skipped when assigning rollback segments to new transactions.
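The placement in the two cases can be sketched as below. The mapping is inferred from the examples in this article, not taken from the InnoDB source. It assumes that segments are numbered from 1, that "rollback segment 1" is the segment in tablespace 0, that the remaining segments are spread round robin over the UNDO tablespaces, and that with no UNDO tablespaces every segment stays in tablespace 0. The function names are hypothetical.

```python
def rseg_layout(undo_logs, undo_tablespaces):
    """Map rollback segment number (1-based) -> tablespace id.

    Segment 1 always lives in tablespace 0 (the system tablespace). The
    rest are spread round robin over UNDO tablespaces 1..undo_tablespaces.
    This mapping is an assumption inferred from the article's two cases.
    """
    layout = {1: 0}
    for rseg in range(2, undo_logs + 1):
        if undo_tablespaces == 0:
            layout[rseg] = 0  # assumed: everything stays in tablespace 0
        else:
            layout[rseg] = (rseg - 2) % undo_tablespaces + 1
    return layout


def usable_rsegs(layout):
    """Segments given to new transactions: the one in tablespace 0 is
    skipped once any segment exists in a dedicated UNDO tablespace."""
    dedicated = sorted(r for r, space in layout.items() if space != 0)
    return dedicated if dedicated else sorted(layout)


case1 = rseg_layout(undo_logs=128, undo_tablespaces=127)
print(len(set(case1.values())))  # 128

before = rseg_layout(undo_logs=1, undo_tablespaces=4)
after = rseg_layout(undo_logs=3, undo_tablespaces=4)
print(before, usable_rsegs(before))  # {1: 0} [1]
print(after, usable_rsegs(after))    # {1: 0, 2: 1, 3: 2} [2, 3]
```

In Case 2 the restart with innodb_undo_logs = 3 leaves segment 1 in place for purge and rollback of existing UNDO records, while new transactions only receive segments 2 and 3.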
Mapping of tablespace id to tablespace name is something that we would like to add so that users can change the path of individual tablespaces instead of all being lumped together in innodb_undo_dir. Managing these extra tablespaces is also something that we think is important, especially the adding and dropping of UNDO tablespaces including better management of the UNDO segments.
As we have seen, writing to the UNDO logs and accessing them can result in a lot of random IO. One way to reduce that overhead is to place the UNDO tablespaces on SSD. This feature allows the placement of the UNDO logs anywhere on the file system, in a directory that is controlled by the configuration variable innodb_undo_dir. Note: this change is not backward compatible; databases created with this feature cannot be opened by older versions of InnoDB. There are two main problems: first, the old code doesn’t open these new system tablespaces before applying REDO, and second, there are some checks that assume the space id can only be 0. The file format is not the problem as such.