百分百源码网-让建站变得如此简单! 登录 注册 签到领金币!

主页 | 如何升级VIP | TAG标签

当前位置: 主页>网站教程>数据库> MysqlMaster切换日志分析示例
分享文章到:

MysqlMaster切换日志分析示例

发布时间:01/15 来源: 浏览: 关键词:
MHA监控复制架构的主服务器,一旦检测到主服务器故障,就会自动进行故障转移。即使有些从服务器没有收到最新的relay log,MHA自动从最新的从服务器上识别差异的relay log并把这些日志应用到其他从服务器上,因此所有的从服务器保持一致性了,下面我们来看一个MysqlMaster切换日志分析,希望对各位有帮助。

下面是MHA从启动,到failover的完整manager.log,以及集群中所有mysql的general.log.我们可以通过这些日志,更多的了解和分析MHA的运行原理.对用好MHA有很多帮助。
架构如下:
DB1 MASTER
DB2 SLAVE,candidate_master
DB3 SLAVE

DB1:

[root@localhost mysql]# cat /var/lib/mysql/localhost.log

140928 10:17:04     5 Connect   mha@192.168.153.150 on
                    5 Query     set autocommit=1
                    5 Query     SELECT CONNECTION_ID() AS Value
                    6 Connect   mha@192.168.153.150 on
                    6 Query     set autocommit=1
                    6 Query     SELECT CONNECTION_ID() AS Value
                    6 Query     SET wait_timeout=86400
                    6 Query     SELECT @@global.server_id As Value
                    6 Query     SELECT VERSION() AS Value
                    6 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                    6 Query     SHOW MASTER STATUS
                    6 Query     SELECT @@global.relay_log_info_repository AS Value
                    6 Query     SELECT @@global.datadir AS Value
                    6 Query     SELECT @@global.relay_log_info_file AS Value
                    6 Query     SELECT @@global.datadir AS Value
                    6 Query     SHOW SLAVE STATUS
                    6 Query     SELECT @@global.read_only As Value
                    6 Query     SELECT @@global.relay_log_purge As Value
140928 10:17:11     7 Connect   mha@192.168.153.150 on
                    7 Query     select @@version_comment limit 1
                    7 Query     select count(*) as c from mysql.user
                    6 Quit
                    7 Quit
                    8 Connect   mha@192.168.153.150 on
                    8 Query     set autocommit=1
                    8 Query     SET wait_timeout=6
                    8 Query     SELECT GET_LOCK('MHA_Master_High_Availability_Monitor', '12') AS Value
                    8 Query     SELECT 1 As Value
140928 10:17:14     8 Query     SELECT 1 As Value
140928 10:17:17     8 Query     SELECT 1 As Value
140928 10:17:20     8 Query     SELECT 1 As Value
140928 10:17:23     8 Query     SELECT 1 As Value
140928 10:17:26     8 Query     SELECT 1 As Value
140928 10:17:29     8 Query     SELECT 1 As Value
140928 10:17:32     8 Query     SELECT 1 As Value
140928 10:17:35     8 Query     SELECT 1 As Value
140928 10:17:38     8 Query     SELECT 1 As Value
DB2:

[root@localhost mysql]# cat /var/lib/mysql/localhost.log

140928 10:17:04   214 Connect   mha@192.168.153.150 on
                  214 Query     set autocommit=1
                  214 Query     SELECT CONNECTION_ID() AS Value
                  215 Connect   mha@192.168.153.150 on
                  215 Query     set autocommit=1
                  215 Query     SELECT CONNECTION_ID() AS Value
                  215 Query     SET wait_timeout=86400
                  215 Query     SELECT @@global.server_id As Value
                  215 Query     SELECT VERSION() AS Value
                  215 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  215 Query     SHOW MASTER STATUS
                  215 Query     SELECT @@global.relay_log_info_repository AS Value
                  215 Query     SELECT @@global.datadir AS Value
                  215 Query     SELECT @@global.relay_log_info_file AS Value
                  215 Query     SELECT @@global.datadir AS Value
                  215 Query     SHOW SLAVE STATUS
                  215 Query     SELECT @@global.read_only As Value
                  215 Query     SELECT @@global.relay_log_purge As Value
                  215 Query     SHOW SLAVE STATUS
                  215 Query     SELECT Repl_slave_priv AS Value FROM mysql.user WHERE user = 'slave'
140928 10:17:09   216 Connect   mha@192.168.153.151 on
                  216 Query     select @@version_comment limit 1
                  216 Query     set sql_log_bin=0
                  216 Query     create table if not exists mysql.apply_diff_relay_logs_test(id int)
                  216 Query     insert into mysql.apply_diff_relay_logs_test values(1)
                  216 Query     update mysql.apply_diff_relay_logs_test set id=id+1 where id=1
                  216 Query     delete from mysql.apply_diff_relay_logs_test
                  216 Query     drop table mysql.apply_diff_relay_logs_test
140928 10:17:10   216 Quit
140928 10:17:11   215 Quit
140928 10:17:50   217 Connect   mha@192.168.153.150 on
                  217 Query     set autocommit=1
                  217 Query     SELECT CONNECTION_ID() AS Value
                  218 Connect   mha@192.168.153.150 on
                  218 Query     set autocommit=1
                  218 Query     SELECT CONNECTION_ID() AS Value
                  218 Query     SET wait_timeout=86400
                  218 Query     SELECT @@global.server_id As Value
                  218 Query     SELECT VERSION() AS Value
                  218 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  218 Query     SHOW MASTER STATUS
                  218 Query     SELECT @@global.relay_log_info_repository AS Value
                  218 Query     SELECT @@global.datadir AS Value
                  218 Query     SELECT @@global.relay_log_info_file AS Value
                  218 Query     SELECT @@global.datadir AS Value
                  218 Query     SHOW SLAVE STATUS
                  218 Query     SELECT @@global.read_only As Value
                  218 Query     SELECT @@global.relay_log_purge As Value
                  218 Query     SHOW SLAVE STATUS
                  219 Connect   mha@192.168.153.150 on
                  219 Query     set autocommit=1
                  219 Query     SELECT CONNECTION_ID() AS Value
                  220 Connect   mha@192.168.153.150 on
                  220 Query     set autocommit=1
                  220 Query     SELECT CONNECTION_ID() AS Value
                  220 Query     SET wait_timeout=86400
                  220 Query     SELECT @@global.server_id As Value
                  220 Query     SELECT VERSION() AS Value
                  220 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  220 Query     SHOW MASTER STATUS
                  220 Query     SELECT @@global.relay_log_info_repository AS Value
                  220 Query     SELECT @@global.datadir AS Value
                  220 Query     SELECT @@global.relay_log_info_file AS Value
                  220 Query     SELECT @@global.datadir AS Value
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SELECT @@global.read_only As Value
                  220 Query     SELECT @@global.relay_log_purge As Value
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SELECT GET_LOCK('MHA_Master_High_Availability_Failover', '1') AS Value
                  220 Query     SHOW SLAVE STATUS
                  220 Query     STOP SLAVE IO_THREAD
                  220 Query     SHOW SLAVE STATUS
140928 10:17:51   220 Query     SHOW SLAVE STATUS
                  220 Query     SHOW SLAVE STATUS
                  220 Query     STOP SLAVE SQL_THREAD
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SHOW MASTER STATUS
                  220 Query     SELECT @@global.read_only As Value
                  221 Connect   slave@192.168.153.152 on
                  221 Query     SELECT UNIX_TIMESTAMP()
                  221 Query     SHOW VARIABLES LIKE 'SERVER_ID'
                  221 Query     SET @master_heartbeat_period= 1799999979520
                  221 Query     SET @master_binlog_checksum= @@global.binlog_checksum
                  221 Query     SELECT @master_binlog_checksum
                  221 Query     SELECT @@GLOBAL.GTID_MODE
                  221 Query     SHOW VARIABLES LIKE 'SERVER_UUID'
                  221 Query     SET @slave_uuid= 'dda2f003-4244-11e4-b851-000c29da163f'
                  221 Binlog Dump       Log: 'mysql-bin.000016'  Pos: 688
                  220 Query     STOP SLAVE
                  220 Query     SHOW SLAVE STATUS
                  220 Query     RESET SLAVE /*!50516 ALL */
                  220 Query     SHOW SLAVE STATUS
                  220 Query     SELECT RELEASE_LOCK('MHA_Master_High_Availability_Failover') As Value
                  220 Quit
DB3:

[root@localhost ~]# cat /var/lib/mysql/localhost.log
140928 10:17:04   223 Connect   mha@192.168.153.150 on
                  223 Query     set autocommit=1
                  223 Query     SELECT CONNECTION_ID() AS Value
                  224 Connect   mha@192.168.153.150 on
                  224 Query     set autocommit=1
                  224 Query     SELECT CONNECTION_ID() AS Value
                  224 Query     SET wait_timeout=86400
                  224 Query     SELECT @@global.server_id As Value
                  224 Query     SELECT VERSION() AS Value
                  224 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  224 Query     SHOW MASTER STATUS
                  224 Query     SELECT @@global.relay_log_info_repository AS Value
                  224 Query     SELECT @@global.datadir AS Value
                  224 Query     SELECT @@global.relay_log_info_file AS Value
                  224 Query     SELECT @@global.datadir AS Value
                  224 Query     SHOW SLAVE STATUS
                  224 Query     SELECT @@global.read_only As Value
                  224 Query     SELECT @@global.relay_log_purge As Value
                  224 Query     SHOW SLAVE STATUS
140928 10:17:10   225 Connect   mha@192.168.153.152 on
                  225 Query     select @@version_comment limit 1
                  225 Query     set sql_log_bin=0
                  225 Query     create table if not exists mysql.apply_diff_relay_logs_test(id int)
                  225 Query     insert into mysql.apply_diff_relay_logs_test values(1)
                  225 Query     update mysql.apply_diff_relay_logs_test set id=id+1 where id=1
                  225 Query     delete from mysql.apply_diff_relay_logs_test
                  225 Query     drop table mysql.apply_diff_relay_logs_test
140928 10:17:11   225 Quit
                  224 Quit
#开始failover流程的监测
140928 10:17:50   226 Connect   mha@192.168.153.150 on
                  226 Query     set autocommit=1
                  226 Query     SELECT CONNECTION_ID() AS Value
                  227 Connect   mha@192.168.153.150 on
                  227 Query     set autocommit=1
                  227 Query     SELECT CONNECTION_ID() AS Value
                  227 Query     SET wait_timeout=86400
                  227 Query     SELECT @@global.server_id As Value
                  227 Query     SELECT VERSION() AS Value
                  227 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  227 Query     SHOW MASTER STATUS
                  227 Query     SELECT @@global.relay_log_info_repository AS Value
                  227 Query     SELECT @@global.datadir AS Value
                  227 Query     SELECT @@global.relay_log_info_file AS Value
                  227 Query     SELECT @@global.datadir AS Value
                  227 Query     SHOW SLAVE STATUS
                  227 Query     SELECT @@global.read_only As Value
                  227 Query     SELECT @@global.relay_log_purge As Value
                  227 Query     SHOW SLAVE STATUS
#正式开始failover的监测
                  228 Connect   mha@192.168.153.150 on
                  228 Query     set autocommit=1
                  228 Query     SELECT CONNECTION_ID() AS Value
                  229 Connect   mha@192.168.153.150 on
                  229 Query     set autocommit=1
                  229 Query     SELECT CONNECTION_ID() AS Value
                  229 Query     SET wait_timeout=86400
                  229 Query     SELECT @@global.server_id As Value
                  229 Query     SELECT VERSION() AS Value
                  229 Query     SHOW GLOBAL VARIABLES LIKE 'log_bin'
                  229 Query     SHOW MASTER STATUS
                  229 Query     SELECT @@global.relay_log_info_repository AS Value
                  229 Query     SELECT @@global.datadir AS Value
                  229 Query     SELECT @@global.relay_log_info_file AS Value
                  229 Query     SELECT @@global.datadir AS Value
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SELECT @@global.read_only As Value
                  229 Query     SELECT @@global.relay_log_purge As Value
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SELECT GET_LOCK('MHA_Master_High_Availability_Failover', '1') AS Value
                  229 Query     SHOW SLAVE STATUS
                  229 Query     STOP SLAVE IO_THREAD
                  229 Query     SHOW SLAVE STATUS
140928 10:17:51   229 Query     SHOW SLAVE STATUS
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SHOW SLAVE STATUS
                  229 Query     STOP SLAVE SQL_THREAD
                  229 Query     SHOW SLAVE STATUS
                  229 Query     STOP SLAVE
                  229 Query     SHOW SLAVE STATUS
                  229 Query     RESET SLAVE
                  229 Query     CHANGE MASTER TO MASTER_HOST = '192.168.153.151' MASTER_USER = 'slave' MASTER_PASSWORD =  MASTER_PORT = 3306 MASTER_LOG_FILE = 'mysql-bin.000016' MASTER_LOG_POS = 688
                  229 Query     START SLAVE
                  230 Connect Out       slave@192.168.153.151:3306
                  229 Query     SHOW SLAVE STATUS
                  229 Query     SELECT RELEASE_LOCK('MHA_Master_High_Availability_Failover') As Value
                  229 Quit
140928 10:18:07   236 Connect   root@localhost on
                  236 Query     select @@version_comment limit 1
                  236 Query     select USER()
140928 10:18:10   236 Quit
MHA Manage.log:

[root@localhost ~]# cat /var/log/masterha/app1/manager.log
#启动前的准备工作

#检查服务器状态,获取相关参数设置
Sun Sep 28 10:17:03 2014 - [info] MHA::MasterMonitor version 0.55.
Sun Sep 28 10:17:04 2014 - [info] Dead Servers:
Sun Sep 28 10:17:04 2014 - [info] Alive Servers:
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.152(192.168.153.152:3306)
Sun Sep 28 10:17:04 2014 - [info] Alive Slaves:
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:04 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:04 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:04 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:04 2014 - [info] Current Alive Master: 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:04 2014 - [info] Checking slave configurations..
Sun Sep 28 10:17:04 2014 - [info]  read_only=1 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:04 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:04 2014 - [info]  read_only=1 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:04 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:04 2014 - [info] Checking replication filtering settings..
Sun Sep 28 10:17:04 2014 - [info]  binlog_do_db= , binlog_ignore_db=
Sun Sep 28 10:17:04 2014 - [info]  Replication filtering check ok.
#测试ssh连接是否成功
Sun Sep 28 10:17:04 2014 - [info] Starting SSH connection tests..
Sun Sep 28 10:17:07 2014 - [info] All SSH connection tests passed successfully.
#测试MHA node是否可用
Sun Sep 28 10:17:07 2014 - [info] Checking MHA Node version..
Sun Sep 28 10:17:08 2014 - [info]  Version check ok.
Sun Sep 28 10:17:08 2014 - [info] Checking SSH publickey authentication settings on the current master..
Sun Sep 28 10:17:08 2014 - [info] HealthCheck: SSH to 192.168.153.150 is reachable.
Sun Sep 28 10:17:09 2014 - [info] Master MHA Node version is 0.54.
Sun Sep 28 10:17:09 2014 - [info] Checking recovery script configurations on the current master..
Sun Sep 28 10:17:09 2014 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000026
#创建MHA日志目录
Sun Sep 28 10:17:09 2014 - [info]   Connecting to root@192.168.153.150(192.168.153.150)..
  Creating /var/log/masterha/app1 if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000026
Sun Sep 28 10:17:09 2014 - [info] Master setting check done.
#master的检查到此完毕

#开始检查slave的差异日志应用权限
Sun Sep 28 10:17:09 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Sep 28 10:17:09 2014 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.153.151 --slave_ip=192.168.153.151 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.6.20-68.0-log --manager_version=0.55 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Sun Sep 28 10:17:09 2014 - [info]   Connecting to root@192.168.153.151(192.168.153.151:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay.000003
    Temporary relay log file is /var/lib/mysql/mysql-relay.000003
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun Sep 28 10:17:10 2014 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.153.152 --slave_ip=192.168.153.152 --slave_port=3306 --workdir=/var/log/masterha/app1 --target_version=5.6.20-68.0-log --manager_version=0.55 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Sun Sep 28 10:17:10 2014 - [info]   Connecting to root@192.168.153.152(192.168.153.152:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay.000003
    Temporary relay log file is /var/lib/mysql/mysql-relay.000003
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun Sep 28 10:17:11 2014 - [info] Slaves settings check done.
Sun Sep 28 10:17:11 2014 - [info]
192.168.153.150 (current master)
 +--192.168.153.151
 +--192.168.153.152

#调试master_ip_failover_script
Sun Sep 28 10:17:11 2014 - [warning] master_ip_failover_script is not defined.
#调试shutdown_script
Sun Sep 28 10:17:11 2014 - [info] Checking shutdown script status:
Sun Sep 28 10:17:11 2014 - [info]   /opt/master_ip_failover.sh --command=status --ssh_user=root --host=192.168.153.150 --ip=192.168.153.150
Sun Sep 28 10:17:11 2014 - [info]  OK.
Sun Sep 28 10:17:11 2014 - [info] Set master ping interval 3 seconds.
#设置二次检查的主机
Sun Sep 28 10:17:11 2014 - [info] Set secondary check script: masterha_secondary_check -s 192.168.153.151 -s 192.168.153.152
Sun Sep 28 10:17:11 2014 - [info] Starting ping health check on 192.168.153.150(192.168.153.150:3306)..

#MHA启动完毕,进入监测状态
Sun Sep 28 10:17:11 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

#监测DB1服务器挂了
Sun Sep 28 10:17:41 2014 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)

#通过定义的二次监测,确认master是否挂了
Sun Sep 28 10:17:41 2014 - [info] Executing seconary network check script: masterha_secondary_check -s 192.168.153.151 -s 192.168.153.152  --user=root  --master_host=192.168.153.150  --master_ip=192.168.153.150  --master_port=3306
Sun Sep 28 10:17:41 2014 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/log/masterha/app1/save_binary_logs_test --manager_version=0.55 --binlog_prefix=mysql-bin
Monitoring server 192.168.153.151 is reachable, Master is not reachable from 192.168.153.151. OK.
Sun Sep 28 10:17:42 2014 - [info] HealthCheck: SSH to 192.168.153.150 is reachable.
Monitoring server 192.168.153.152 is reachable, Master is not reachable from 192.168.153.152. OK.

#确认master挂了,开始进入failover流程
#再试尝试连接master和master的ssh
Sun Sep 28 10:17:42 2014 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Sun Sep 28 10:17:44 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Sep 28 10:17:44 2014 - [warning] Connection failed 1 time(s)..
Sun Sep 28 10:17:47 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Sep 28 10:17:47 2014 - [warning] Connection failed 2 time(s)..
Sun Sep 28 10:17:50 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sun Sep 28 10:17:50 2014 - [warning] Connection failed 3 time(s)..
Sun Sep 28 10:17:50 2014 - [warning] Master is not reachable from health checker!
Sun Sep 28 10:17:50 2014 - [warning] Master 192.168.153.150(192.168.153.150:3306) is not reachable!
Sun Sep 28 10:17:50 2014 - [warning] SSH is reachable.
#通过配置文件,监测其他slave的状态
Sun Sep 28 10:17:50 2014 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/app1.cnf again, and trying to connect to all servers to check server status..
Sun Sep 28 10:17:50 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Sep 28 10:17:50 2014 - [info] Reading application default configurations from /etc/app1.cnf..
Sun Sep 28 10:17:50 2014 - [info] Reading server configurations from /etc/app1.cnf..
Sun Sep 28 10:17:50 2014 - [info] Dead Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info] Alive Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)
Sun Sep 28 10:17:50 2014 - [info] Alive Slaves:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Not candidate for the new Master (no_master is set)
#再次监测slave的配置是否有变化,是否符合failover条件
Sun Sep 28 10:17:50 2014 - [info] Checking slave configurations..
Sun Sep 28 10:17:50 2014 - [info]  read_only=1 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:50 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.151(192.168.153.151:3306).
Sun Sep 28 10:17:50 2014 - [info]  read_only=1 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:50 2014 - [warning]  relay_log_purge=0 is not set on slave 192.168.153.152(192.168.153.152:3306).
Sun Sep 28 10:17:50 2014 - [info] Checking replication filtering settings..
Sun Sep 28 10:17:50 2014 - [info]  Replication filtering check ok.
Sun Sep 28 10:17:50 2014 - [info] Master is down!
Sun Sep 28 10:17:50 2014 - [info] Terminating monitoring script.
Sun Sep 28 10:17:50 2014 - [info] Got exit code 20 (Master dead).
Sun Sep 28 10:17:50 2014 - [info] MHA::MasterFailover version 0.55.
#正式开始failover
Sun Sep 28 10:17:50 2014 - [info] Starting master failover.
Sun Sep 28 10:17:50 2014 - [info]

#再次对slave配置做检查
Sun Sep 28 10:17:50 2014 - [info] * Phase 1: Configuration Check Phase..
Sun Sep 28 10:17:50 2014 - [info]
Sun Sep 28 10:17:50 2014 - [info] Dead Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info] Checking master reachability via mysql(double check)..
Sun Sep 28 10:17:50 2014 - [info]  ok.
Sun Sep 28 10:17:50 2014 - [info] Alive Servers:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)
Sun Sep 28 10:17:50 2014 - [info] Alive Slaves:
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:50 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:50 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:50 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:50 2014 - [info] ** Phase 1: Configuration Check Phase completed.
Sun Sep 28 10:17:50 2014 - [info]

#对原Master做master_ip_failover_script和shutdown_script的操作
Sun Sep 28 10:17:50 2014 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun Sep 28 10:17:50 2014 - [info]
Sun Sep 28 10:17:50 2014 - [info] Forcing shutdown so that applications never connect to the current master..
Sun Sep 28 10:17:50 2014 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master ip address.
Sun Sep 28 10:17:50 2014 - [info] Executing SHUTDOWN script:
Sun Sep 28 10:17:50 2014 - [info]   /opt/master_ip_failover.sh --command=stopssh --ssh_user=root  --host=192.168.153.150  --ip=192.168.153.150  --port=3306
Sun Sep 28 10:17:51 2014 - [info]  Power off done.

Sun Sep 28 10:17:51 2014 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sun Sep 28 10:17:51 2014 - [info]

#开始差异日志的恢复:获取slave最后得到的binlog位置
Sun Sep 28 10:17:51 2014 - [info] * Phase 3: Master Recovery Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] The latest binary log file/position on all slaves is mysql-bin.000026:120
Sun Sep 28 10:17:51 2014 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:51 2014 - [info] The oldest binary log file/position on all slaves is mysql-bin.000026:120
Sun Sep 28 10:17:51 2014 - [info] Oldest slaves:
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:51 2014 - [info]
#获取原master的binlog日志
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [warning] Dead Master is not SSH reachable. Could not save it's binlogs. Transactions that were not sent to the latest slave (Read_Master_Log_Pos to the tail of the dead master's binlog) were lost.
Sun Sep 28 10:17:51 2014 - [info]

#确定新的master
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.3: Determining New Master Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Sun Sep 28 10:17:51 2014 - [info] All slaves received relay logs to the same position. No need to resync each other.
Sun Sep 28 10:17:51 2014 - [info] Searching new master from slaves..
Sun Sep 28 10:17:51 2014 - [info]  Candidate masters from the configuration file:
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.151(192.168.153.151:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Sep 28 10:17:51 2014 - [info]  Non-candidate masters:
Sun Sep 28 10:17:51 2014 - [info]   192.168.153.152(192.168.153.152:3306)  Version=5.6.20-68.0-log (oldest major version between slaves) log-bin:enabled
Sun Sep 28 10:17:51 2014 - [info]     Replicating from 192.168.153.150(192.168.153.150:3306)
Sun Sep 28 10:17:51 2014 - [info]     Not candidate for the new Master (no_master is set)
Sun Sep 28 10:17:51 2014 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sun Sep 28 10:17:51 2014 - [info] New master is 192.168.153.151(192.168.153.151:3306)
Sun Sep 28 10:17:51 2014 - [info] Starting master failover..
Sun Sep 28 10:17:51 2014 - [info]
From:
192.168.153.150 (current master)
 +--192.168.153.151
 +--192.168.153.152

To:
192.168.153.151 (new master)
 +--192.168.153.152
Sun Sep 28 10:17:51 2014 - [info]
#在new master上应用差异的binlog日志
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 3.4: Master Log Apply Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Sun Sep 28 10:17:51 2014 - [info] Starting recovery on 192.168.153.151(192.168.153.151:3306)..
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. Waiting all logs to be applied..
Sun Sep 28 10:17:51 2014 - [info]   done.
Sun Sep 28 10:17:51 2014 - [info]  All relay logs were successfully applied.
#获取new master的binlog位置。
Sun Sep 28 10:17:51 2014 - [info] Getting new master's binlog name and position..
Sun Sep 28 10:17:51 2014 - [info]  mysql-bin.000016:688
Sun Sep 28 10:17:51 2014 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.153.151', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000016', MASTER_LOG_POS=688, MASTER_USER='slave', MASTER_PASSWORD='xxx';
#如果有master_ip_failover_script,那么给new master设置VIP
Sun Sep 28 10:17:51 2014 - [warning] master_ip_failover_script is not set. Skipping taking over new master ip address.
Sun Sep 28 10:17:51 2014 - [info] ** Finished master recovery successfully.
Sun Sep 28 10:17:51 2014 - [info] * Phase 3: Master Recovery Phase completed.
Sun Sep 28 10:17:51 2014 - [info]

#开始恢复其他slave,也是从原master的binlog对比来做恢复
Sun Sep 28 10:17:51 2014 - [info] * Phase 4: Slaves Recovery Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] -- Slave diff file generation on host 192.168.153.152(192.168.153.152:3306) started, pid: 8387. Check tmp log /var/log/masterha/app1/192.168.153.152_3306_20140928101750.log if it takes time..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Log messages from 192.168.153.152 ...
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun Sep 28 10:17:51 2014 - [info] End of log messages from 192.168.153.152.
Sun Sep 28 10:17:51 2014 - [info] -- 192.168.153.152(192.168.153.152:3306) has the latest relay log events.
Sun Sep 28 10:17:51 2014 - [info] Generating relay diff files from the latest slave succeeded.
Sun Sep 28 10:17:51 2014 - [info]

Sun Sep 28 10:17:51 2014 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] -- Slave recovery on host 192.168.153.152(192.168.153.152:3306) started, pid: 8389. Check tmp log /var/log/masterha/app1/192.168.153.152_3306_20140928101750.log if it takes time..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Log messages from 192.168.153.152 ...
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Starting recovery on 192.168.153.152(192.168.153.152:3306)..
Sun Sep 28 10:17:51 2014 - [info]  This server has all relay logs. Waiting all logs to be applied..
Sun Sep 28 10:17:51 2014 - [info]   done.
Sun Sep 28 10:17:51 2014 - [info]  All relay logs were successfully applied.
#差异日志应用完成以后,切换所有slave到new master。
Sun Sep 28 10:17:51 2014 - [info]  Resetting slave 192.168.153.152(192.168.153.152:3306) and starting replication from the new master 192.168.153.151(192.168.153.151:3306)..
Sun Sep 28 10:17:51 2014 - [info]  Executed CHANGE MASTER.
Sun Sep 28 10:17:51 2014 - [info]  Slave started.
Sun Sep 28 10:17:51 2014 - [info] End of log messages from 192.168.153.152.
Sun Sep 28 10:17:51 2014 - [info] -- Slave recovery on host 192.168.153.152(192.168.153.152:3306) succeeded.
Sun Sep 28 10:17:51 2014 - [info] All new slave servers recovered successfully.
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] * Phase 5: New master cleanup phase..
Sun Sep 28 10:17:51 2014 - [info]
Sun Sep 28 10:17:51 2014 - [info] Resetting slave info on the new master..
Sun Sep 28 10:17:51 2014 - [info]  192.168.153.151: Resetting slave info succeeded.
Sun Sep 28 10:17:51 2014 - [info] Master failover to 192.168.153.151(192.168.153.151:3306) completed successfully.
Sun Sep 28 10:17:51 2014 - [info]

#failover操作完成,生成failover报告
----- Failover Report -----

app1: MySQL Master failover 192.168.153.150 to 192.168.153.151 succeeded

Master 192.168.153.150 is down!

Check MHA Manager logs at localhost.localdomain:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Power off 192.168.153.150.
The latest slave 192.168.153.151(192.168.153.151:3306) has all relay logs for recovery.
Selected 192.168.153.151 as a new master.
192.168.153.151: OK: Applying all logs succeeded.
192.168.153.152: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.153.152: OK: Applying all logs succeeded. Slave started, replicating from 192.168.153.151.
192.168.153.151: Resetting slave info succeeded.
Master failover to 192.168.153.151(192.168.153.151:3306) completed

打赏

打赏

取消

感谢您的支持,我会继续努力的!

扫码支持
扫码打赏,你说多少就多少

打开支付宝扫一扫,即可进行扫码打赏哦

百分百源码网 建议打赏1~10元,土豪随意,感谢您的阅读!

共有3人阅读,期待你的评论!发表评论
昵称: 网址: 验证码: 点击我更换图片
最新评论

本文标签

广告赞助

能出一分力是一分吧!

订阅获得更多模板

本文标签

广告赞助

订阅获得更多模板