Sunday, March 5, 2017

Optimistic Incremental Backup

The MySQL Enterprise Backup team is pleased to announce major improvements in incremental backup performance, starting with release 4.1.

Introduction

The current incremental backup algorithm scans all the tables to gather changed pages, even if very few tables have been modified since the previous backup, and thus results in a 'full-scan' incremental backup. As a result, an incremental backup may require the same amount of time as a full backup. The new algorithm aims to eliminate this extra time.

The new algorithm scans only those tables that have been modified since the previous backup. It relies on modification time, much like an earlier improvement made for full backup. That full backup algorithm is known as optimistic full backup, hence the new improvement is named ‘Optimistic Incremental Backup’. For comparison, we use optimistic full backup to refer to the performance improvements for full backup and optimistic incremental backup to refer to the new improvements to incremental backup.

In the new optimistic incremental backup algorithm, we refer to tables that have not been modified since the previous backup as 'unchanged tables' and to tables that have been modified after the previous backup as 'busy tables'. The new optimistic incremental backup therefore scans only busy tables for changed pages.

However, there is one difference between optimistic full backup and the new optimistic incremental backup. For optimistic full backup, the user has to specify either the --optimistic-time or the --optimistic-busy-tables option in order to identify the busy tables. For optimistic incremental backup, no additional parameters are necessary, because busy tables are clearly defined and the algorithm identifies them on its own.

How Optimistic Incremental Backup Works

The first and foremost goal for MySQL Enterprise Backup (MEB) is quality and consistency. To achieve a consistent backup during optimistic incremental backup, MEB identifies a point in time against which the modification time of tables can be compared. MEB acquires a read lock on the tables for a very short span of time to copy the non-InnoDB files. Since non-InnoDB tables cannot be modified during the lock period, MEB records a timestamp during that period, which we call the consistency time: the time at which the tables are known to be consistent.

When the optimistic incremental backup starts, it compares the modification time of each table against the consistency time. If the modification time of a table is greater than the consistency time, that table has been modified after the consistency time was recorded, and optimistic incremental backup needs to scan it for changed pages.


The above diagram depicts a sample execution of an optimistic incremental backup, as follows.

  • There are 6 tables to be scanned (and possibly copied) during the previous backup operation.
  • MEB notes the consistency time when the tables are locked.
  • Table7 is created after the tables are unlocked, while the backup is still copying the meta files.
  • After the backup finishes, Table1 is updated and Table8 is created.
  • When the optimistic incremental backup starts, it looks up the consistency time from the previous backup.
  • MEB compares the consistency time with the modification time of all the tables present in the datadir. It finds that only three tables (Table1, Table7, and Table8) have been modified after the consistency time, so only these three tables are scanned for changed pages. The remaining tables are unchanged tables and are ignored. If any unchanged table is modified during the optimistic incremental backup, the changes are recorded in the redo log file and are applied during the apply-log (restore) phase.
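The sample execution above can be sketched in a few lines of Python. This is only an illustration of the comparison described in this post, not MEB's actual implementation: the function name classify_tables, the names Table2 through Table6, and the numeric timestamps (arbitrary time units) are all made up for the sketch.

```python
from typing import Dict, List, Tuple


def classify_tables(
    modification_times: Dict[str, int], consistency_time: int
) -> Tuple[List[str], List[str]]:
    """Split tables into (busy, unchanged) relative to the consistency time.

    A table is 'busy' when its modification time is greater than the
    consistency time recorded by the previous backup.
    """
    busy: List[str] = []
    unchanged: List[str] = []
    for table, mtime in sorted(modification_times.items()):
        if mtime > consistency_time:
            busy.append(table)
        else:
            unchanged.append(table)
    return busy, unchanged


# Hypothetical timeline mirroring the sample execution.
CONSISTENCY_TIME = 100  # noted by the previous backup while tables were locked

modification_times = {
    "Table1": 150,  # updated after the previous backup finished
    "Table2": 40,
    "Table3": 55,
    "Table4": 60,
    "Table5": 72,
    "Table6": 90,
    "Table7": 110,  # created after unlock, while meta files were being copied
    "Table8": 160,  # created after the previous backup finished
}

busy, unchanged = classify_tables(modification_times, CONSISTENCY_TIME)
print("busy tables (scanned for changed pages):", busy)
print("unchanged tables (ignored):", unchanged)
```

Note that Table7 counts as busy even though it was created while the previous backup was still running, because its modification time is later than the consistency time.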

Notes

The consistency time is stored in a column named consistency_time_utc in the backup_history table, as well as in a field with the same name in the meta file backup_variables.txt.

If the --no-locking or --no-connection options are used during backup, the backup start time is recorded as the consistency time.

During an optimistic incremental backup, if MEB is unable to discover the consistency time from the previous backup, it falls back to the older incremental backup algorithm.

In some special cases, optimistic incremental backup may perform the same as the older incremental backup algorithm.
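The first three notes amount to two small decisions, which can be sketched as follows. The function names and the example timestamps are hypothetical and serve only to illustrate the rules stated above; this is not how MEB is implemented internally.

```python
from typing import Optional


def recorded_consistency_time(
    start_time: float,
    lock_time: float,
    no_locking: bool = False,
    no_connection: bool = False,
) -> float:
    """Timestamp a backup records as its consistency time (illustrative).

    With --no-locking or --no-connection there is no lock period to take
    the timestamp from, so the backup start time is recorded instead.
    """
    if no_locking or no_connection:
        return start_time
    return lock_time


def incremental_algorithm(previous_consistency_time: Optional[float]) -> str:
    """Algorithm actually used when an optimistic incremental backup is requested.

    If the consistency time of the previous backup cannot be discovered
    (modelled here as None), the older full-scan algorithm is used.
    """
    if previous_consistency_time is None:
        return "full-scan"
    return "optimistic"


# Hypothetical timestamps in seconds: backup started at 1000, tables locked at 1030.
print(recorded_consistency_time(1000.0, 1030.0))
print(recorded_consistency_time(1000.0, 1030.0, no_locking=True))
print(incremental_algorithm(1030.0))
print(incremental_algorithm(None))
```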

How to Trigger an Optimistic Incremental Backup

The existing --incremental option is extended to accept the following values.

  • optimistic: the optimistic incremental backup algorithm is triggered if --incremental=optimistic is specified while taking the incremental backup.
  • full-scan: the older incremental backup algorithm is triggered if --incremental=full-scan or plain --incremental is specified while taking the incremental backup. It scans all the tables that qualify for the backup operation. This is also the default algorithm when no argument is given for the option.

The following are some examples.

Optimistic incremental image backup using dir:directory_path:

>mysqlbackup.exe --backup-image=<image file name> --backup-dir=<temporary directory name> \
    --incremental-base=dir:<previous backup directory> \
    --incremental=optimistic backup-to-image

Optimistic incremental image backup using history:last_backup:

>mysqlbackup.exe --backup-image=<image file name> --backup-dir=<temporary directory name> \
    --incremental-base=history:last_backup \
    --incremental=optimistic backup-to-image
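For scripted backups, the same command lines can be assembled programmatically. The helper below is a minimal sketch that uses only the options shown in the examples above; the function name, the file paths, and the choice to omit connection options are assumptions made for illustration. It only builds the argument list and does not run mysqlbackup.

```python
from typing import List, Optional


def incremental_backup_args(
    image_file: str,
    backup_dir: str,
    incremental_base: str,
    mode: Optional[str] = None,
) -> List[str]:
    """Build a mysqlbackup argument list for an incremental image backup.

    mode is None (plain --incremental, which selects full-scan),
    "optimistic", or "full-scan".
    """
    if mode not in (None, "optimistic", "full-scan"):
        raise ValueError(f"unsupported --incremental value: {mode!r}")
    incremental = "--incremental" if mode is None else f"--incremental={mode}"
    return [
        "mysqlbackup",
        f"--backup-image={image_file}",
        f"--backup-dir={backup_dir}",
        f"--incremental-base={incremental_base}",
        incremental,
        "backup-to-image",
    ]


# Hypothetical paths, for illustration only.
args = incremental_backup_args(
    "/backups/inc1.mbi", "/backups/tmp", "history:last_backup", mode="optimistic"
)
print(" ".join(args))
```

The resulting list could be passed to subprocess.run once any connection options required by your environment have been added.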

Performance Tests

In our internal tests, we created a 1 TB database containing 10 tables, each 100 GB in size. We executed three iterations and observed the results. In the first iteration (A2), we modified only 1 table. In the second iteration (A3), we modified 2 tables. In the third iteration (A4), we modified 3 tables. We compared those times with the times for a full backup and for an incremental backup using the old algorithm.

For brevity, we use FB for full backup, IB for the old incremental backup algorithm, and OIB for the new optimistic incremental backup algorithm. We found that the theoretical gains (A2: 90%, A3: 80%, A4: 70%) almost match the practical gains (A2: 89%, A3: 79%, A4: 68%). This means that OIB performance moves closer to that of the default IB as the number of modified tables increases, which is expected.

| Tag | Description | Time taken (hh:mm:ss) | % of improvement [(Bi - Ai) / Bi * 100] |
|-----|-------------|-----------------------|------------------------------------------|
| B1 | FB (10 tables; 100 GB per table) | 2:13:07 | - |
| B2 | IB (one table changed) | 2:12:18 | - |
| B3 | IB (two tables changed) | 2:12:12 | - |
| B4 | IB (three tables changed) | 2:12:11 | - |
| A1 | FB (10 tables; 100 GB per table) | 2:13:07 | - |
| A2 | OIB (one table changed) | 0:13:55 | 89 |
| A3 | OIB (two tables changed) | 0:27:43 | 79 |
| A4 | OIB (three tables changed) | 0:42:13 | 68 |
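The improvement column can be reproduced from the timings with the formula (Bi - Ai) / Bi * 100. The short Python sketch below does that and also prints the theoretical gain, taken here to be the share of the 10 equally sized tables that OIB does not need to scan. That reading of "theoretical gain" is our interpretation of the figures quoted above, and the helper names are made up for the example.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def improvement_percent(before: str, after: str) -> float:
    """(Bi - Ai) / Bi * 100, with both times given as h:mm:ss."""
    b, a = to_seconds(before), to_seconds(after)
    return (b - a) / b * 100


TOTAL_TABLES = 10

# (tag, tables changed, IB time from rows B2-B4, OIB time from rows A2-A4)
runs = [
    ("A2", 1, "2:12:18", "0:13:55"),
    ("A3", 2, "2:12:12", "0:27:43"),
    ("A4", 3, "2:12:11", "0:42:13"),
]

for tag, changed, ib_time, oib_time in runs:
    theoretical = 100 * (TOTAL_TABLES - changed) / TOTAL_TABLES
    measured = improvement_percent(ib_time, oib_time)
    print(f"{tag}: theoretical {theoretical:.0f}%, measured {measured:.1f}%")
```

Rounded to whole percentages, the measured values come out as the 89, 79, and 68 reported in the table.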

Conclusion

Optimistic incremental backup offers significant advantages over the older incremental backup algorithm, especially in cases where changes are limited to a small set of tables. We hope you'll give it a try and provide us feedback on how it works for your data. For more details and usage samples, please refer to the MySQL Enterprise Backup 4.1 User’s Guide.
