By SanjayM on Feb 08, 2013
How do you make sure you can tune the level of parallelism for varying input and output devices and varying levels of load?
These were some of the questions that we needed to answer when we were trying to implement multi-threading capability for MySQL Enterprise Backup (MEB).
The trivial way to achieve parallelism is to have multiple threads each pick up a different file (in a file-per-table scenario). But this did not seem adequate because:
a) The sizes of these files (corresponding to the tables) could differ, and one large file would then limit the level of parallelism, since it would be processed by a single thread.
b) If you have to stream the backup, how do you reconcile these multiple files being streamed by separate threads? Large backups are streamed directly to tape, so it is better to output a single file rather than multiple files.
c) If you buffer each file, wait for it to be completely processed, and then push it to tape, it is not true streaming, because you are using intermediate disk space to hold the incomplete portions of all the files.
The answer that we found was to implement the parallel algorithm using a horizontal strategy instead of a vertical strategy.
In the vertical strategy, each thread acts on a separate file. This limits streaming since the file sizes can vary.
In the horizontal strategy, each file is broken into sections (denoted by multiple colors). A separate thread is assigned to operate on a single section.
Parallel operations are then possible for reading, processing and writing these file sections, because no two threads will be operating on the same section of the file.
This setup is especially useful when using compression since there can be multiple threads performing compression while the read and write continues in parallel.
There may be some additional overhead in ensuring that the buffers are written out in the correct order, but since most of the buffers are the same size and have similar operations performed on them, the overhead is minimal.
You get truly serialized output that is streamed to tape as it gets processed. If you are streaming to a remote host or to tape, there is almost no additional space required on your main server. We call this new mechanism parallel backup because the parallelism is what makes the backup faster. Indeed, with parallel backup you may see up to 10 times the speed of a normal backup in certain scenarios.
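The horizontal strategy can be sketched in a few lines of Python. This is only an illustration of the idea, not how MEB is implemented: the function names, the tiny section size and the use of zlib as the "processing" step are all assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

SECTION_SIZE = 4  # tiny, for illustration only; MEB buffers are 16 MB


def split_into_sections(data: bytes, size: int = SECTION_SIZE) -> list[bytes]:
    """Cut the input into fixed-size sections (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_backup(data: bytes, process_threads: int = 6) -> list[bytes]:
    """Compress the sections concurrently and return them in original order."""
    sections = split_into_sections(data)
    with ThreadPoolExecutor(max_workers=process_threads) as pool:
        # Executor.map yields results in input order, even when the
        # workers finish out of order, so the output stays serialized.
        return list(pool.map(zlib.compress, sections))


def restore(blocks: list[bytes]) -> bytes:
    """Decompress each block and join them back into the original stream."""
    return b"".join(zlib.decompress(block) for block in blocks)
```

Because every section is processed independently, one large file no longer pins the work to a single thread, and the ordered result can be written out as one stream.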
The graph below shows the backup time for MEB 3.7.1 vs. MEB 3.8 using a varying number of threads.
Note: This is a 16 GB, 2 x 2000 MHz machine with 2 RAID disks (1027 GB, 733.9 GB) running Oracle Linux.
As you can see above, MEB 3.8 provides options to configure the number of threads used for reading, processing and writing. Let's denote the number of read, process and write threads as RT, PT and WT respectively. The default values for MEB 3.8 are RT=3, PT=3, WT=3, which change in MEB 3.8.1 to RT=1, PT=6, WT=1.
The 1,6,1 configuration is close to the fastest backup in the graph above. We did not choose RT=1, PT=12, WT=1 (which is the fastest) because the CPU is very highly utilized in the 1,12,1 configuration.
Remember, read and write throughput depends on your input and output devices. It is possible that multiple threads will not give you better read or write performance than a single thread.
There are also options available to have a configurable number of buffers used by these threads.
Each buffer is 16 MB in size. You should have at least RT + PT + WT + MAX(RT, PT, WT) buffers so that you get optimal parallelism.
For example, if RT=1, PT=6, WT=1, then you should configure 1+6+1+6 = 14 buffers (the default in MEB 3.8.1).
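The buffer rule is easy to check with a small helper. The function name here is made up for illustration; only the formula comes from the text above.

```python
def recommended_buffers(rt: int, pt: int, wt: int) -> int:
    """Minimum buffer count from the rule RT + PT + WT + MAX(RT, PT, WT)."""
    return rt + pt + wt + max(rt, pt, wt)


# MEB 3.8.1 thread defaults: RT=1, PT=6, WT=1
print(recommended_buffers(1, 6, 1))
# MEB 3.8 thread defaults: RT=3, PT=3, WT=3
print(recommended_buffers(3, 3, 3))
```

The first call gives the 14 buffers mentioned above; the second gives 12 for the 3,3,3 thread configuration.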
If, for example, you configure multiple threads but only 1 buffer, then your backup is not taking advantage of parallelism at all. The read thread reads into the single buffer; the buffer is then processed, written and freed. Meanwhile the read thread waits for the buffer to be free before it can read into it again, so the backup behaves like a serial process.
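The effect of the buffer count can be seen in a toy read/process/write pipeline. This is a rough sketch under simplifying assumptions, not MEB's code: there is one thread per stage, a plain token stands in for each 16 MB buffer, and upper-casing a string stands in for processing. The function returns the output together with the peak number of buffers that were in use at once.

```python
import queue
import threading


def run_pipeline(chunks, num_buffers):
    """Run a toy pipeline; return (output, peak buffers in use at once)."""
    free = queue.Queue()
    for _ in range(num_buffers):
        free.put(object())  # token standing in for one buffer
    to_process, to_write = queue.Queue(), queue.Queue()
    output, lock = [], threading.Lock()
    stats = {"in_use": 0, "peak": 0}

    def reader():
        for chunk in chunks:
            token = free.get()  # blocks until a buffer has been freed
            with lock:
                stats["in_use"] += 1
                stats["peak"] = max(stats["peak"], stats["in_use"])
            to_process.put((token, chunk))
        to_process.put(None)  # end-of-input marker

    def processor():
        while (item := to_process.get()) is not None:
            token, chunk = item
            to_write.put((token, chunk.upper()))
        to_write.put(None)

    def writer():
        while (item := to_write.get()) is not None:
            token, chunk = item
            output.append(chunk)
            with lock:
                stats["in_use"] -= 1
            free.put(token)  # hand the buffer back to the reader

    threads = [threading.Thread(target=f) for f in (reader, processor, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output, stats["peak"]
```

With `num_buffers=1` the peak is always 1: the reader cannot start on the next chunk until the writer has returned the only buffer, so the stages run one after another. With more buffers the stages are free to overlap.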
One more thing to note is that the number of buffers is limited by the memory limit configured for backup (default 300 MB). Please ensure that you configure enough memory to distribute among the buffers you have configured. If the configured memory limit is less than what is required for the configured number of buffers, MEB will automatically decrease the number of buffers to fit into the memory limit. Based on the default values (16 MB per buffer and a 300 MB limit), if you are configuring more than 18 buffers you will need to increase the memory limit.
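The figure of 18 buffers follows from the defaults: 18 x 16 MB = 288 MB fits within 300 MB, while 19 x 16 MB = 304 MB does not. A small sketch of that arithmetic is shown here, assuming the whole memory limit is available for buffers; the function names are invented for the example, and the exact accounting inside MEB may differ.

```python
BUFFER_MB = 16          # buffer size stated in this post
DEFAULT_LIMIT_MB = 300  # default memory limit stated in this post


def buffers_that_fit(limit_mb: int = DEFAULT_LIMIT_MB,
                     buffer_mb: int = BUFFER_MB) -> int:
    """Largest number of whole buffers that fit in the memory limit."""
    return limit_mb // buffer_mb


def effective_buffers(configured: int, limit_mb: int = DEFAULT_LIMIT_MB) -> int:
    """Buffer count after reducing it to fit the memory limit."""
    return min(configured, buffers_that_fit(limit_mb))


def required_limit_mb(configured: int, buffer_mb: int = BUFFER_MB) -> int:
    """Memory limit needed to keep every configured buffer."""
    return configured * buffer_mb
```

For example, asking for 24 buffers under the default limit would leave you with 18 under this model, and keeping all 24 would need a limit of at least 384 MB.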
Please look at the previous 3.8 blog for detailed configuration examples :
or into our documentation of this feature at
and remember the wise DBA advice:
If you don't verify your backups periodically, it is like not having backups at all.