Partial Binary Log Recovery

I came across a situation recently where I was asked if it was possible to edit a binary log to remove a part of it to restore onto a slave server. Now the choice of doing something like a hexedit did not seem appealing, and the more experienced might suggest that it is simply a matter of using  mysqlbinlog with the --start-position and/or --stop-position options. However, the problem had arisen that required the binary log to played through the replication process onto the slave based on specific options in MySQL cluster, so using an SQL dump from the binary log was of no use.

Initially this may seem like a daunting task where you will have to find some specialist tool or delve into the deep recesses of the binary log format, but a much simpler solution was found. The replication process allows the slave to be started up to a specific point in the log files. The command is the START SLAVE UNTIL... statement as seen in the manual at:

http://dev.mysql.com/doc/refman/5.1/en/start-slave.html

This offers the ability to start up the slave or the io thread to some particular part of a binary log that is specified. Whether you use the io thread to get the specific details from the master binary log, or you use the SQL thread to limit the SQL statements that are executed on the slave.

All seems well with this approach until you find out that you need a section removed from the binary log not the end of the log. The START SLAVE UNTIL statement only allows for an ending point, there is no equivalent START SLAVE FROM... statement to specify a starting point. So how can we utilise the START SLAVE UNTIL... command in the replication process to allow us to restore a binary log, leaving out a window of the log statements?

The solution is to combine the START SLAVE UNTIL statement with the CHANGE MASTER TO statement to allow us to define a starting and stopping point in the replication process. Here is a simple outline of how it would work, assuming the master is still replicating and the slave is currently stopped, or using --skip-slave-start:

Start the slave using START SLAVE UNTIL MASTER_LOG_FILE='binary.00035',MASTER_LOG_POS=2500;

This will give us the initial replication of statements that we want up to the start of the window to be removed from the log.

The replication then stops at this position, we can then RESET SLAVE; CHANGE MASTER TO MASTER_LOG_FILE='binary.00035',MASTER_LOG_POS=3500;

This defines the starting point for our next section to replicate, or in another term it is the end of the window being removed from the log file.

We then start the replication process again, either with START SLAVE; if there is no other window, or we can continue the process all over again by using the START SLAVE UNTIL.. again.

This simple process will allow us to restore sections of the binary log without having to resort to any specialist tools or hexeditors of any sort. Again, this is not usually required when you can simply use the mysqlbinlog command with the --start-position and --stop-position options, but if you ever find yourself in the situation where you have to use replication and restore only part of the binary log, this can be an option.



Comments:

Cool tip, Jonathon!
Simple and neat!
Thanks

Giuseppe

Posted by Giuseppe Maxia on June 23, 2009 at 04:40 PM EST #

Savvy, Smart and Simple! ... definitely handy in that situation.

Cheers,
Darren

Posted by Darren Cassar on June 23, 2009 at 04:45 PM EST #

Good trick! It wasn't that long ago I discovered the UNTIL syntax - and I love it.

If it was multiple coordinates, I'm curious as to if it would have been better to mysqlbinlog the relay log files, remove the erroneous events, and then replayed them that way?

This would give you the ability to proof the changes.

Posted by Morgan Tocker on June 23, 2009 at 05:28 PM EST #

I seem to remember something about this issue. :)

Posted by Joshua Gordon on June 24, 2009 at 12:58 AM EST #

Morgan: Based on my understanding of how the replication worked with this cluster option, it would not have worked. It needed to be changed in how the handler wrote the record into the database, where as the relay log is very simply a copy of the master's binary log on the slave. Certainly in other situations, this would be a possibility though.

Posted by Jonathon Coombes on June 24, 2009 at 07:35 AM EST #

Hi,

Good post.
In place of RESET SLAVE, which also forgets about master host, user & password, perhaps just STOP SLAVE; CHANGE MASTER TO ...; START SLAVE; ?

Posted by Shlomi Noach on June 24, 2009 at 03:33 PM EST #

Shlomi: Yes, I realise this is possible, but it was planned for a user. I thought a RESET SLAVE would be cleaner to make sure the relay logs were definitely cleared out of older data.

Posted by Jonathon Coombes on June 28, 2009 at 01:54 AM EST #

Hi,

CHANGE MASTER TO MASTER_LOG_FILE=..., MASTER_LOG_POS=...
will clear the relay logs of old data. There's no need to RESET SLAVE for that.

Cheers

Posted by Shlomi Noach on June 28, 2009 at 06:23 AM EST #

Post a Comment:
Comments are closed for this entry.
About

Jonathon Coombes

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today