Watch out for hostname changes when using replication!

For one reason or another, many times we find ourselves changing the hostname of a machine. It's been repurposed or moved - or perhaps the original installer didn't know what name it should have. To achieve this on most modern Linux distributions there are 2 key files you need to update.


  1. /etc/hostname needs to be updated with correct hostname to be set on boot, and

  2. /etc/hosts needs to be updated for DNS lookups of the local hostname. This is more important than you might think and will break many applications if not updated.

Some people also take the third step of updating the hostname on the fly with the 'hostname' tool, which if you do that means the gotcha I'm about to describe take you completely unaware in some weeks or months.

If you are using MySQL replication, there are two key options which depend on the hostname. These are the 'log-bin' and 'relay-log' for the binary log and binary log replication log paths respectively. The problem is not only do the logs themselves depend on the hostname, so does the index which tells you where to find them. So if you restart the server, it will look for a new index file and won't find it - causing errors such as:

090825 17:17:15 [ERROR] Failed to open the relay log './mellie-relay-bin.000002' (relay_log_pos 339)
090825 17:17:15 [ERROR] Could not find target log during relay log initialization

There are several possible solutions to this, one involves combining the old and new files (which you can find documented here - but that's a more pro-active approach. The second is to completely restart the replication process - which in my opinion, is cleaner. So I will detail that approach.

First we need to stop the slave process, just to make sure.

mysql> STOP SLAVE;

Then we can get the slave status, to see what position the master is currently. You can see what that looks like here. The important values to note are the following two:

mysql> SHOW SLAVE STATUS\\G
\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\* 1. row \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
Master_Host: localhost
Master_User: root
Master_Port: 3306
Relay_Master_Log_File: gbichot-bin.005
Exec_Master_Log_Pos: 79

These values tell us what the current position in the master's binary logs the slave has executed up to as well as the basic master details. The reason this is important is we are going to tell the slave to completely forget about it's current replication information and fetch the data fresh.

While normally you could just change the master log file and position, since it can't open the relay log at all - the slave replication does not start and we must completely reset and specify all of the details again. The above information contains everything you need except the password for the replication user. You can find that by reading either the 'master.info' file or by prior knowledge.

ERROR 1201 (HY000): Could not initialize master info structure; more error messages can be found in the MySQL error log

To reset that, we run RESET SLAVE like so:

mysql> RESET SLAVE;

Then we run we need to construct a CHANGE MASTER statement with the above information.

In our case the statement is

mysql> RESET SLAVE;
mysql> CHANGE MASTER TO MASTER_HOST="localhost", MASTER_USER="root", MASTER_PASSWORD="test", MASTER_LOG_FILE="gbichot-bin.005", MASTER_LOG_POS=79;

... but be sure to use your own values from the SLAVE STATUS and make sure the log and position are from Relay_Master_Log_File and Exec_Master_Log_Pos - there are other values that look similar so don't confuse them. Once this is done we can start the slave, and check the status to make sure it is replicating correctly.

mysql> START SLAVE;
\*wait a few moments\*
mysql> SHOW SLAVE STATUS\\G

When the slave status is displayed, make sure that both the IO and SQL threads are running and there are no errors.


mysql> SHOW SLAVE STATUS\\G
\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\* 1. row \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 8

Also keep an eye on the Seconds_Behind_Master value and make sure that it is reducing to 0.

That's this weeks situation. If you do change your hostname, it pays to reboot your machine to make sure everything has worked and that issues will not pop up down the track when your machine is unexpectedly rebooted by a crash or some other circumstance. You don't need unexpected changes causing problems!

This applies to any situation where you might change a configuration file, but making it's effects current are done separately. This is very common in MySQL, where you might want to change a dynamic variable and also edit the my.cnf file. If you make a syntax error, you won't know until your server reboots. So it helps to be very careful and preferably do the actual MySQL or server reboot.

Comments:

I would add that it is best to avoid this altogether and always name your logs directly:

Master:
log-bin=/var/lib/mysql/db1-bin-log
log-bin-index=/var/lib/mysql/db1-bin-log.index

Slave:
relay-log=/var/lib/mysql/db2-relay-log
relay-log-index=/var/lib/mysql/db2-relay-log.index

That way if you change the hostname, it doesn't change the files. It also protects replication if an update to MySQL changes the default log-file name and paths since you are defining both. You can also opt to put the logs in another location (such as on another partition).

Finally, making sure that IPs and not hostnames are being used in MySQL itself prevents anything from breaking. Of course that doesn't fix needing to make IP changes but those seem to be easier to remedy than having to muck around with binary and relay log filenames and paths.

Posted by Tim Soderstrom on August 25, 2009 at 03:42 PM WST #

Tim,

You are 100% correct about changing it not to use those. That is a recommended setup and I had forgot to mention that fact. Thanks for pointing it out!

Thanks,
Trent

Posted by Trent Lloyd on August 25, 2009 at 03:57 PM WST #

This can also happen without changing the hostname. The default log file naming has changed at least twice in the life of 5.0. I ran into this when upgrading from 5.0.27 to 5.0.72sp1 because the default log file name no longer has a hostname in it.

Posted by Eric Bergen on August 26, 2009 at 12:17 AM WST #

"Sun’s Trent Lloyd cautions Watch out for hostname changes when using replication!, for there is a gotcha there."

Log Buffer #159: http://www.pythian.com/news/3724/log-buffer-159-a-carnival-of-the-vanities-for-dbas

Posted by Log Buffer on August 28, 2009 at 08:40 PM WST #

Post a Comment:
  • HTML Syntax: NOT allowed
About

This blog is a set of tips, tricks and articles from Trent Lloyd - MySQL Support Engineer @ Sun from Perth, Australia working in the APAC team.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today