Mysql master-slave, dual master and master-slave synchronization 1236 error

Original link: https://chegva.com/5767.html

Recently, when the computer room was moved, the Mysql cluster changed to a new environment, and the master-slave inconsistency occurred. Specifically, the 1236 error was reported. The cluster architecture adopts dual-master + keepalived high availability. Several recovery methods have been tested and recorded.

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘Slave has more GTIDs than the master has, using the master’s SERVER_UUID. This may indicate that the end of the binary log was truncated or that the last binary log file was lost, eg, after a power or disk failure when sync_binlog != 1. The master may or may not have rolled back transactions that were already replica’

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘could not find next log; the first event ” at 4, the last event read from ‘./slave-bin.000135’ at 234, the last byte read from ‘./slave-bin.000135′ at 234.’

First recall the Mysql master-slave synchronization:

主从复制原理图.jpg

Master-slave replication is divided into “master server ( master )” and “slave server (slave )” , “the master server is responsible for writing, and the slave server is responsible for reading” , the process of MySQL ‘s master-slave replication is an “asynchronous process” . This process of separating read and write can greatly improve the overall service performance. Even if the write operation takes a long time, it will not affect the read operation.

There are three main threads in MySQL’s master-slave replication: master (binlog dump thread ), slave (I/O thread , SQL thread ) , one thread in Master and two threads in Slave.

The general steps of master-slave replication are as follows:

Execute change master to on the slave library, save the information of the master library to the master.info file in the slave library, then execute start slave , open I/O Thread , SQL Thread , and master-slave replication starts

1. I/O Thread connects to the main library through the connection information of the main library in the master.info file. After the connection is successful, the main library will open Dump Thread

2. Dump Thread reads the newly generated binary log of the main library, and then delivers it to the I/O Thread

3. The I/O Thread receives the new binary log delivered by the Dump Thread, writes the log to the relay log , and the I/O Thread waits for the main library Dump Thread to take the initiative to deliver the newly generated binary log

4. SQL Thread will restore the newly generated log of the relay log to the database and write it to the disk

Mysql < 5.7 dump_thread reads binlog and main library writes binlog is a serial operation, 5.7+ dump_thread reads and writes binlog in parallel, high master-slave delay can configure parallel replication and enhance semi-synchronous replication

MySQL semi-synchronous replication is between asynchronous and full synchronous. The main library only needs to wait for at least one slave node to receive and flush the binlog to the relay log file. The main library does not need to wait for all the slave libraries to give feedback to the main library. The main library only needs to receive feedback from any slave library, and it is not the feedback that the slave library has completed and submitted, but the slave library only needs to complete the io_thread content to give feedback. There is no need to wait until the execution of the sql_thread is complete.

半同步复制.png

Image Source

Common commands:

start slave; #Open io_thread and sql_thread

start slave io_thread;

start slave sql_thread;

stop slave; #close io_thread and sql_thread

stop slave io_thread;

stop slave sql_thread;

reset slave; #Delete master.info, relay-log.info data; delete all relay logs; set delay option master_delay to 0

reset master; #delete all binary log files

show master status; #View the status of the main library

show global variables like ‘%gtid%’;

show binary logs;

show processlist;

show binlog events;

Detailed explanation of slave state:

show slave status \G; #View slave status

 mysql> SHOW SLAVE STATUS\G 
  
*************************** 1. row ********************* ***** 
  
Slave_IO_State: Waiting for master to send event 
  
Master_Host: 172.16.11.22 #Main library IP 
  
Master_User: repl #Main library user Master_Port: 3306 #Main library port number Connect_Retry: 60 
  
Master_Log_File: master-bin.000002 #The current slave library io_thread is reading the binary log file of the master library Read_Master_Log_Pos: 775058446 #The current slave library io_thread is reading the position of the master library binary file Relay_Log_File: relay.000004 #The current slave library sql_thread is reading Get the relay log file from the library Relay_Log_Pos: 407 #The current sql_thread is reading the position of the relay log file from the library Relay_Master_Log_File: master-bin.000002 #The current SQL statement read from the relay log by the library sql_thread, In which binlog is the sql statement corresponding to the main library Slave_IO_Running: Yes 
  
Slave_SQL_Running: Yes 
  
Replicate_Do_DB:  
  
Replicate_Ignore_DB:  
  
Replicate_Do_Table:  
  
Replicate_Ignore_Table:  
  
Replicate_Wild_Do_Table:  
  
Replicate_Wild_Ignore_Table:  
  
Last_Errno: 0 
  
Last_Error:  
  
Skip_Counter: 0 
  
Exec_Master_Log_Pos: 775058446 #The event currently executed by the slave library sql_thread corresponds to the position in the binlog of the master library 
  
Relay_Log_Space: 691 
  
Until_Condition: None 
  
Until_Log_File:  
  
Until_Log_Pos: 0 
  
Master_SSL_Allowed: No 
  
Master_SSL_CA_File:  
  
Master_SSL_CA_Path:  
  
Master_SSL_Cert:  
  
Master_SSL_Cipher:  
  
Master_SSL_Key:  
  
Seconds_Behind_Master: 0 #Master-slave replication delay time (s), if it is 0, it means no master-slave delay Master_SSL_Verify_Server_Cert: No 
  
Last_IO_Errno: 0 #The last IO thread error message Last_IO_Error:  
  
Last_SQL_Errno: 0 #The last SQL thread error message Last_SQL_Error:  
  
Replicate_Ignore_Server_Ids:  
  
Master_Server_Id: 2 
  
Master_UUID: a67d2e10-b10c-11ed-948f-fa163e521e8a 
  
Master_Info_File: mysql.slave_master_info 
  
SQL_Delay: 0 #Delay replication SQL_Remaining_Delay: NULL 
  
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates 
  
Master_Retry_Count: 86400 
  
Master_Bind:  
  
Last_IO_Error_Timestamp:  
  
Last_SQL_Error_Timestamp:  
  
Master_SSL_Crl:  
  
Master_SSL_Crlpath:  
  
Retrieved_Gtid_Set: a67d2e10-b10c-11ed-948f-fa163e521e8a:6-11 #The binlog that the current io_thread has received 
  
Executed_Gtid_Set: a66803a8-b10c-11ed-be14-fa163e8dddf7:1-2410271, #The binary log position currently executed from the library sql_thread a67d2e10-b10c-11ed-948f-fa163e521e8a:6-11 
  
Auto_Position: 0 
  
Replicate_Rewrite_DB:  
  
Channel_Name:  
  
Master_TLS_Version:  
  
1 row in set (0.00 sec)

For master-slave construction details, see:


Dual master construction:

ms.png

Image Source

1. For dual-master Master ⇌ Standby, the server-id setting in the my.cnf configuration file should be different, and log_slave_updates should be configured

2. The difference between auto_increment_offset and auto_increment_increment prevents primary key conflicts when two nodes double write.

  • auto_increment_offset The starting value of the auto-increment field, set to 2 means that the auto-increment field starts from 2

  • auto_increment_increment The value of the auto-increment field is incremented step by step. Setting it to 2 means incrementing by 2 each time. According to the setting of the master node, the value of the auto-increment field may be like this: 1,3,5,7…

  • log_slave_updates writes replication events to the binlog. This option must be enabled when a server is both the master and the slave. Here, it is enabled on Master and Standby

If you enable the log-bin parameter from the library, if you write data directly to the slave library, you can record the log-bin log, but the slave library reads the binary log file of the main library through the I0 thread, and then writes the data through the SQL thread. The binlog log will be recorded. That is to say, the data copied from the main library by the slave library is not written into the binlog log of the slave library. Therefore, when the slave library is used as the master library of other slave libraries, the log-slave-updates parameter needs to be added to the configuration file.

Dual master configuration reference:

1236 error repair test results:

 1. Standby executes reset slave, master does not execute reset slave 
  
The master-slave synchronization does not report an error, and there is no problem in master-slave synchronization to standby. Sometimes the standby writes will not be synchronized to the master, and the data on both sides may cause inconsistency and stop slave; 
  
CHANGE MASTER TO MASTER_LOG_FILE='slave-bin.000001', MASTER_LOG_POS=34978, master_auto_position=0; 
  
start slave; 
  
 
  
2. Both standby and master execute reset slave 
  
Both master and standby can be synchronized normally, and there is no inconsistency in writing data on both sides. stop slave; 
  
reset slave; 
  
CHANGE MASTER TO MASTER_LOG_FILE='slave-bin.000001', MASTER_LOG_POS=126974, master_auto_position=0; 
  
start slave; 
  
 
  
3. Standby executes reset slave & reset master (in this case, the binlog of the slave library is invalid, so execute this command to clear the binlog and execute it from the library), and the master executes reset slave 
  
The writes on both sides of the master and standby can be synchronized normally, and there is no inconsistency in the data written on both sides. 4. Both the standby and the master execute reset slave & reset master, and then change the master (dangerous operation, use with caution online may lose data) 
  
The writes on both sides of the master and standby can be synchronized normally, and there is no inconsistency in the data written on both sides.

Production suggestion:

1. It is best to dynamically set read-only for the standby node (set global read_only = 0|1, read_only=0 for the primary node, and read_only=1 for the standby node), and set Keepalived to non-preemptive mode. When switching, the standby node must wait for the synchronization of the primary node to complete .

2. The best way to stop services during migration is to stop the Standby first, then the Master; start the Master first, and then start the Standby.

Test Data:

 Master executes: 
  
create database mytest; 
  
use mytest; 
  
CREATE TABLE `test1` ( 
  
`id` int(10) NOT NULL AUTO_INCREMENT PRIMARY KEY, 
  
`name` varchar(125) NOT NULL, 
  
`age` int(3) DEFAULT NULL 
  
); 
  
 
  
insert into test1 values ​​(0, 'a', 16); 
  
insert into test1 values ​​(0, 'b', 24); 
  
insert into test1 values ​​(0, 'c', 18); 
  
 
  
mysql> SHOW TABLES; 
  
+------------------+ 
  
|Tables_in_mytest| 
  
+------------------+ 
  
|test1| 
  
+------------------+ 
  
1 row in set (0.00 sec) 
  
 
  
mysql> insert into test1 values ​​(0, 'a', 16); 
  
Query OK, 1 row affected (0.02 sec) 
  
 
  
mysql> insert into test1 values ​​(0, 'b', 24); 
  
Query OK, 1 row affected (0.01 sec) 
  
 
  
mysql> insert into test1 values ​​(0, 'c', 18); 
  
Query OK, 1 row affected (0.01 sec) 
  
 
  
mysql> select * from test1; 
  
+----+------+------+ 
  
| id | name | age | 
  
+----+------+------+ 
  
| 1 | a | 16 | 
  
| 3 | b | 24 | 
  
| 5 | c | 18 | 
  
+----+------+------+ 
  
3 rows in set (0.00 sec) 
  
 
  
Standby execution: 
  
insert into test1 values ​​(0, 'e', ​​16); 
  
insert into test1 values ​​(0, 'f', 24); 
  
insert into test1 values ​​(0, 'g', 18); 
  
 
  
mysql> select * from test1; 
  
+----+------+------+ 
  
| id | name | age | 
  
+----+------+------+ 
  
| 1 | a | 16 | 
  
| 3 | b | 24 | 
  
| 5 | c | 18 | 
  
+----+------+------+ 
  
3 rows in set (0.01 sec) 
  
 
  
 
  
mysql> insert into test1 values ​​(0, 'e', ​​16); 
  
Query OK, 1 row affected (0.01 sec) 
  
 
  
mysql> insert into test1 values ​​(0, 'f', 24); 
  
Query OK, 1 row affected (0.02 sec) 
  
 
  
mysql> insert into test1 values ​​(0, 'g', 18); 
  
Query OK, 1 row affected (0.01 sec) 
  
 
  
mysql> select * from test1; 
  
+----+------+------+ 
  
| id | name | age | 
  
+----+------+------+ 
  
| 1 | a | 16 | 
  
| 3 | b | 24 | 
  
| 5 | c | 18 | 
  
| 6 | e | 16 | 
  
| 8 | f | 24 | 
  
| 10 | g | 18 | 
  
+----+------+------+ 
  
6 rows in set (0.00 sec) 

refer to:

This article is transferred from: https://chegva.com/5767.html
This site is only for collection, and the copyright belongs to the original author.