X

MySQL and MySQL Community information

  • September 4, 2014

Optimistic Backup

Introduction 

MySQL Enterprise Backup (MEB) is a highly efficient tool for taking backups of your MySQL databases. In the 3.11.0 release, we are taking that one step further by introducing a new concept called "optimistic" backup. Optimistic backup leverages the patterns we saw frequently especially as related to very large databases.

For backups the goals are:

1 - Quality and Consistency - the backup and more importantly the restore just "works".
2 - Size, time, and overhead - like in the game of golf - low score wins - for backups and for
     restores.
3 - Flexibility – It’s not always one size fits all - whether how the backup is run, where it goes,
     how it is recovered.

With optimistic backup - we look at mostly the read aspects of your database to enable us to create a backup that is smaller, faster to backup, faster to restore, and has less overhead when running, but still remains consistent. As we said earlier “optimistic” is not for all databases. Those that benefit the most are databases where modifications(insert/updates/deletes) are focused on a minority of the tables, and store a great deal of data will benefit the most. In our tests (details below) mean time to restore reduced 18%, resulting from a 60% reduction in the size of the redo log that is applied on recovery. What follows is the deals on how this works. We also want to thank the DBAs who provided great feedback and details on their database characteristics during our development of this new feature.

Optimistic Backup

Whole idea is summarized in following two steps.

1.  Divide the InnoDB tables into two groups. In the first group are those tables that are unlikely to be modified during the backup, and in the second group are all the other tables. MEB is optimistic about the first set of tables as they are unlikely to be modified during backup operation hence these tables are termed as optimistic tables and backup operation is named after them as "optimistic backup". Tables in the second group are likely to be modified during backup operation hence are termed as optimistic busy tables.

2.  Split full backup operation into two phases. In the first phase of the backup, tables from the first group are copied to the backup without locking the MySQL server instance. In the second phase tables from second group are copied similar to normal backup i.e.  All InnoDB tables from second group and tables from the first group that have modified since the backup started are copied to backup, then MySQL server instance is locked to copy remaining files. First phase always excludes the undo/redo logs, system table space files and other meta files because these are most likely to be modified. Since optimistic tables are copied in the first phase hence it is called optimistic phase and second phase is called normal phase. 

How to trigger Optimistic backup

Following two new options have been introduced to trigger optimistic backup. Provide either of these two or both options to enable optimistic backup.

optimistic-time

This option specifies the cut-off date-time to identify tables to be skipped in first phase. In other words, tables modified before the specified date-time (i.e. optimistic tables), are copied in first phase and tables modified after specified date-time value (i.e. busy tables), are copied in second phase.  Following date-time formats are supported by this option- 

• <Number><unit> - A valid integer number according to its unit. Supported units are years, days, months, hours and minutes. 

For example: ‘--optimistic-time=1years’ means tables modified one year before the backup starts to be copied in first phase. In other words tables modified in last one year from the backup starts to be copied in second phase.

• Date-time - It can be specified in different formats as:  YYMMDD, YYYYMMDD, YYMMDDHHMMSS, YYYYMMDDHHMMSS, YY-MM-DD, YYYY-MM-DD, YY-MM-DD HH.MM.SS,   and YYYYMMDDTHHMMSS where T is a character. 

For example - 16th May 2014 can be specified as 140516 or 20140516. Therefore ‘--optimistic-time=20140516’ means tables modified before this time will be copied in first phase.

• now - This is the default value of this option. If no value is specified against this option then MEB will assign “now” as default value. Internally MEB keeps backup start time as optimistic time and use the same throughout the backup operation. With this option all InnoDB tables are be copied in the 1st step.

For example:   --optimistic-time=now   OR --optimistic-time

optimistic-busy-tables

This option specifies the InnoDB tables (i.e. busy tables) to be skipped in the first phase. It requires regular expression as specified in the POSIX 1003.2 standard, similar to – include/exclude-tables=REGEX option. This option expects fully qualified names of database objects, i.e. database_name[.table_name] in regular expression. Tables that match the REGEX criteria will be skipped in the first phase and copied in the second phase. This option can co-exist with 'optimistic-time', 'include-tables' and 'exclude-tables' options. These options will get precedence in ascending order as following 

Include/exclude-tables > optimistic-busy-tables > optimistic-time

Include/exclude-tables options gets higher precedence over optimistic-busy-tables option.

For instance - if same table is specified in the –exclude-tables and –optimistic-busy-tables then that table is not copied at all. 

optimistic-busy-tables gets higher precedence over optimistic-time option. 
For instance - if a table ‘user’ has modified time of ‎ “04/15/‎2014 ‏‎2:19:32”. If --optimistic-busy-tables=^test.user --optimistic-time=04142014 values are specified then file test/user.ibd is not be copied in first step because optimistic-busy-tables value matches the REGEX so test.user.ibd is not evaluated by optimistic-time option.

Using Optimistic backup

Following are samples of command line options need to be specified to perform different types of optimistic backups. 

1. Normal Backup with optimistic backup options

  >mysqlbackup.exe --backup-dir=D:\backup --optimistic-busy-tables="^db2\.t1+" backup

  >mysqlbackup.exe --backup-dir=D:\backup --optimistic-time backup

2. Compress Backup with optimistic backup options

 >mysqlbackup.exe --backup-dir=D:\backup --compress-method=lzma
   --optimistic-busy-tables="^db2\.t1+" backup 

 >mysqlbackup.exe --backup-dir=D:\backup --compress-method=lzma --optimistic-time=now
   backup 

3. Compress Image Backup with optimistic backup options

 >mysqlbackup.exe --backup-dir=D:\backup --compress-method=lzma
  --backup-image=D:\backup.img  --optimistic-busy-tables="^db2\.t1 +" backup-to-image

 >mysqlbackup.exe --backup-dir=D:\backup --compress-method=lzma
  --backup-image=D:\backup.img  --optimistic-time="2014-08-27 13.25.00" backup-to-image

4. Compress and encrypted Backup with optimistic backup options

 >mysqlbackup.exe --backup-dir=D:\backup --compress-method=lzma
   --backup-image=D:\backup.img --encrypt --key=xxxxxxxxx
   --optimistic-busy-tables="^db2\.t1 +" backup-to-image

 >mysqlbackup.exe    --backup-dir=D:\backup --compress-method=lzma
   --backup-image=D:\backup.img --encrypt --key=xxxxxxxxx
   --optimistic-time=20140827T132500 backup-to-image

5. Partial Backup with optimistic backup options

 >mysqlbackup.exe --backup-dir=D:\backup --optimistic-time=25days
   --include-tables="^db2+"  backup

 >mysqlbackup.exe --backup-dir=D:\backup --optimistic-busy-tables="^db2\.t1+"
   --include-tables="^db2+"  backup 

 >mysqlbackup.exe --backup-dir=D:\backup --optimistic-busy-tables="^db2\.t1+"
   --optimistic-time=10Months --include-tables="^db2+"  backup

Troubleshooting optimistic backup

If optimistic backup is triggered then MEB prints INFO messages in log in following order to know when each phase of backup started and finished. 

First phase of optimistic backup started 

…….............

…….............

First phase of optimistic backup completed successfully

Second phase of optimistic backup started

……….........

……….........

Second phase of optimistic backup completed successfully

Turn on --trace option for more diagnostic information in logs. Internally MEB converts optimistic-time value to an epoch time stamp and compares the same with the InnoDB files’ modification time. 

With increased trace levels we see optimistic-time, backup start time and InnoDB table modification time in form of the epoch time stamp. We can easily convert these epoch time stamps in human readable time either using SQL query or through some external tools, for instance epoch converter.

Choose values of Optimistic backup options wisely

It's imperative to choose the suitable values of above options in order to maximize the gain from optimistic backup. Optimistic backup targets large databases in which only a few tables are modified frequently. DBA can find out such tables or time since when most of those tables have modified, either by profiling database or directly from file system. If such tables are identified then those tables can be specified in optimistic-busy-tables option. If duration, since when only small percentage of tables are modified, is identified then that date-time can be specified in optimistic-time option.  

Imagine the value of optimistic-busy table option is not chosen carefully then chances are that some optimistic tables might be copied in second phase which ideally should have been copied in first phase only, as a result redo log size will increase and hence overall backup time.
Similarly if the value of the optimistic-time option is not chosen carefully then chances are that some busy tables might be copied in first phase which ideally should have been copied in second phase only. In second phase those stale tables are copied again, as a result overall backup time will increase. In such cases for image backup, busy tables will be copied in both phases hence disk space and backup time both will increase. 

If only a few tables less in size (e.g. ~2% of the total size of the the database) will be modified during backup time then default value of optimistic-time 'now' should serve the purpose. In this case only those a few tables will be copied twice. 

Performance tests

In our internal tests we created a 1.2 TB of DB which had 25 tables initially. We created 3 tables and updated them continuously until backup operation completed successfully. We observed ~60% reduction in redo log file (ibbackup_logfile) size and ~18%  reduction in overall backup (backup + apply log) time compared to normal backup operation. Of course these numbers highly depends on the hardware, CPU load and types of DDL being executed. Therefore a user might experience better or poor performance than the numbers shared here. Purpose to share the number is to show that if options are specified correctly then optimistic backup performs better compared to normal backup.

Conclusion

The optimistic backup isn't for all cases, but for many it will have significant advantages and benefits over a typical full backup. We hope you'll give it a try and provide us feedback on how it works for your environment. We shall also follow-up with additional queries and scripts to help you easily determine how values to set for the optimistic backups, although we suspect that many DBAs likely already know their data so well they won't need these. Never forget, good backups, and more importantly successful and timely recovery is the top priority for a DBA. For more details and usage samples, please refer MEB Manual.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.