binlog2sql: Binlog to Raw SQL Conversion and Point In Time Recovery

binlog2sql

binlog2sqlIn this blog, I will look at the tool binlog2sql, designed by Cao Danfeng. The tool will help to decode the MySQL binary logs and extract the raw SQL, which also helps to generate the ROLLBACK statements for point in time recovery (PITR) using the “flashback” feature. This feature is also implemented in MariaDB’s version of “mysqlbinlog” (Check out Flashback: Another Take on Point-In-Time Recovery in MySQL/MariaDB/Percona Server by Przemys┼éaw Malkowski, which describes this feature).

With the native mysqlbinlog tool, we can convert the binary logs from binary to text format. We can also generate the exact SQL statements using the mysqlbinlog utility. It depends on the “binlog_format & binlog_rows_query_log_events” parameters. Anyway, the tool really doesn’t have the feature to generate the ROLLBACK statements. 

  • With “binlog_format =  STATEMENT” format we do get SQL using mysqlbinlog. 
  • With the parameter binlog_rows_query_log_events = ON we do get both the SQL and binlog events using mysqlbinlog. 

Mysqlbinlog vs Binlog2sql

Before jumping into the binlog2sql tool, I wanted to compare the outputs from Mysqlbinlog and bilog2sql. In this section, I am going to show the output format from each tool on various options. For reference,  I have used the below query to generate the binlog events.

mysql> insert into binlog2sqlVSmysqlbinlog (name) values ('hercules7sakthi');
Query OK, 1 row affected (0.00 sec)

 

Mysqlbinlog:

Mysqlbinlog with ( binlog_format=ROW, binlog_rows_query_log_events=OFF ),

[root@hercules7sakthi2 ~]# mysqlbinlog --base64-output=decode-rows -vv --start-position 313 --stop-position 449 /var/lib/mysql/percona-bin.000003  
... 
### INSERT INTO `percona`.`binlog2sqlVSmysqlbinlog` 
### SET 
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */ 
###   @2='hercules7sakthi' /* VARSTRING(64) meta=64 nullable=1 is_null=0 */ 
...
  • Output: Text format, not real SQL.
  • Rollback: The tool doesn’t have the feature to generate rollback statements.

Mysqlbinlog with ( binlog_format = STATEMENT ),

[root@hercules7sakthi2 ~]# mysqlbinlog --base64-output=decode-rows -vv --start-position 323 --stop-position 507 /var/lib/mysql/percona-bin.000004 
...
/*!80011 SET @@session.default_collation_for_utf8mb4=255*//*!*/;
insert into binlog2sqlVSmysqlbinlog (name) values ('hercules7sakthi')
/*!*/;
...
  • Output: SQL format.
  • Rollback: The tool doesn’t have the feature to generate rollback statements.

Mysqlbinlog with ( binlog_format=ROW and binlog_rows_query_log_events=ON ),

[root@hercules7sakthi2 ~]# mysqlbinlog --base64-output=decode-rows -vv /var/lib/mysql/percona-bin.000006 --start-position 313
...
# insert into binlog2sqlVSmysqlbinlog (name) values ('hercules7sakthi')
...
### INSERT INTO `percona`.`binlog2sqlVSmysqlbinlog`
### SET
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2='hercules7sakthi' /* VARSTRING(64) meta=64 nullable=1 is_null=0 */
...
  • Output: SQL and Text format.
  • Rollback: The tool doesn’t have the feature to generate rollback statements.

Binlog2sql:

[root@hercules7sakthi2 ~]# ./binlog2sql/binlog2sql/binlog2sql.py --user percona -p --start-file percona-bin.000007
Password: 
INSERT INTO `percona`.`binlog2sqlVSmysqlbinlog`(`id`, `name`) VALUES (1, 'hercules7sakthi'); #start 4 end 542 time 2020-06-21 22:36:16
  • Output: SQL format
  • Rollback: Possible with binlog2sql

What is binlog2sql?

  • It is an open source tool to parse the binary logs.
  • It has the feature to extract the raw SQL statements from binary logs.
  • It has the feature to generate the ROLLBACK SQL from binary logs for point-in-time recovery.

MySQL Requirements

#vi my.cnf

[mysqld]
#binlog
server-id = #id
log-bin
binlog_format = row
binlog_row_image = full

MySQL server must be active for two reasons:

  • binlog2sql is based on BINLOG_DUMP protocol to obtain binlog content.
  • binlog2sql is necessary to read the INFORMATION_SCHEMA.COLUMNS table from the server, to obtain metadata of the table. (A server with lots of tables can have significant overhead to query INFORMATION_SCHEMA.COLUMNS table.)

Installation

git clone https://github.com/danfengcao/binlog2sql.git && cd binlog2sql
pip install -r requirements.txt

Tested Environment

  • Python 2.7, 3.4+
  • MySQL 5.6,5.7,8+

Note: Make sure you need to have “pymysql – 0.9.3 ” for MySQL 8+ support. 

Binlog2sql Options

optional arguments:
  --stop-never          Continuously parse binlog. default: stop at the latest
                        event when you start.
  --help                help information
  -K, --no-primary-key  Generate insert sql without primary key if exists
  -B, --flashback       Flashback data to start_position of start_file
  --back-interval BACK_INTERVAL
                        Sleep time between chunks of 1000 rollback sql. set it
                        to 0 if do not need sleep
connect setting:
  -h HOST, --host HOST  Host the MySQL database server located
  -u USER, --user USER  MySQL Username to log in as
  -p [PASSWORD [PASSWORD ...]], --password [PASSWORD [PASSWORD ...]]
                        MySQL Password to use
  -P PORT, --port PORT  MySQL port to use
interval filter:
  --start-file START_FILE
                        Start binlog file to be parsed
  --start-position START_POS, --start-pos START_POS
                        Start position of the --start-file
  --stop-file END_FILE, --end-file END_FILE
                        Stop binlog file to be parsed. default: '--start-file'
  --stop-position END_POS, --end-pos END_POS
                        Stop position. default: latest position of '--stop-
                        file'
  --start-datetime START_TIME
                        Start time. format %Y-%m-%d %H:%M:%S
  --stop-datetime STOP_TIME
                        Stop Time. format %Y-%m-%d %H:%M:%S;
schema filter:
  -d [DATABASES [DATABASES ...]], --databases [DATABASES [DATABASES ...]]
                        dbs you want to process
  -t [TABLES [TABLES ...]], --tables [TABLES [TABLES ...]]
                        tables you want to process
type filter:
  --only-dml            only print dml, ignore ddl
  --sql-type [SQL_TYPE [SQL_TYPE ...]]
                        Sql type you want to process, support INSERT, UPDATE,
                        DELETE.

How to Extract the Raw SQL From Binary Logs

For testing purposes, I have installed a MySQL server and created the following data:

mysql> use percona
Database changed

mysql> create table binlog2sql(id int primary key auto_increment, name varchar(16), status enum('A','NA'), up_date datetime default current_timestamp);
Query OK, 0 rows affected (0.16 sec)

mysql> insert into binlog2sql (name,status) values ('kani','A'),('ram','A'),('durai','A'),('asha','A'),('sakthi','A');
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

Decoding binary log Using binlog2sql:

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -p --start-file percona-bin.000002 --start-position 321 --stop-position 510  | cut -f1 -d"#" 
Password:  
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 1, 'kani');  
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 2, 'ram');
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 3, 'durai');  
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 4, 'asha');  
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 5, 'sakthi');

How Does binlog2sql Support PITR?

  • Binlog2sql tool has the option “–flashback”, which will help to generate the ROLLBACK statements.
  • We can recover the data from DELETE and UPDATE statements. 
  • It will not support DDL ( DROP, TRUNCATE ) as the actual row events will not be available in the binary logs.

Case 1 – PITR from DELETE:

To test this case, I have deleted two records “id in (4,5)” from my test table “binlog2sql”.

mysql> delete from binlog2sql where id in (4,5);
Query OK, 2 rows affected (0.03 sec)

mysql> select * from binlog2sql;
+----+-------+--------+---------------------+
| id | name  | status | up_date             |
+----+-------+--------+---------------------+
|  1 | kani  | A      | 2020-06-20 14:54:51 |
|  2 | ram   | A      | 2020-06-20 14:54:51 |
|  3 | durai | A      | 2020-06-20 14:54:51 |
+----+-------+--------+---------------------+
3 rows in set (0.00 sec)

For the reference, I am generating the raw SQL using binlog2sql.

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -pxxxxxxx --start-file percona-bin.000002 --start-position 1172 --stop-position 1344  | cut -f1 -d'#'
DELETE FROM `percona`.`binlog2sql` WHERE `status`='A' AND `up_date`='2020-06-20 14:54:51' AND `id`=4 AND `name`='asha' LIMIT 1; 
DELETE FROM `percona`.`binlog2sql` WHERE `status`='A' AND `up_date`='2020-06-20 14:54:51' AND `id`=5 AND `name`='sakthi' LIMIT 1;

The above DELETE’s are the exact statements, which executed to delete the records. Now, I am going to generate the ROLLBACK statements using the option “–flashback” as shown below,

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -pxxxxxx --start-file percona-bin.000002 --start-position 1172 --stop-position 1344 --flashback  | cut -f1 -d'#'
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 5, 'sakthi'); 
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 4, 'asha');

You can see those DELETE statements are now converted to INSERT statements. Now, I can load the rollback statements into the server,

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -pxxxxxx --start-file percona-bin.000002 --start-position 1172 --stop-position 1344 --flashback  | cut -f1 -d'#' | mysql -vvv 
--------------
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 5, 'sakthi')
--------------
Query OK, 1 row affected (0.03 sec)

--------------
INSERT INTO `percona`.`binlog2sql`(`status`, `up_date`, `id`, `name`) VALUES ('A', '2020-06-20 14:54:51', 4, 'asha')
--------------
Query OK, 1 row affected (0.01 sec)
Bye

[root@hercules7sakthi2 binlog2sql]# mysql -e "select * from percona.binlog2sql"
+----+--------+--------+---------------------+
| id | name   | status | up_date             |
+----+--------+--------+---------------------+
|  1 | kani   | A      | 2020-06-20 14:54:51 |
|  2 | ram    | A      | 2020-06-20 14:54:51 |
|  3 | durai  | A      | 2020-06-20 14:54:51 |
|  4 | asha   | A      | 2020-06-20 14:54:51 |
|  5 | sakthi | A      | 2020-06-20 14:54:51 |
+----+--------+--------+---------------------+

Recovered the data from DELETE.

Case 2 – PITR from UPDATE:

In this case, I am going to update two rows. I changed the status from “A” to “NA” for “id in (4,5)”.

mysql> update binlog2sql set status='NA' where id in (4,5);
Query OK, 2 rows affected (0.01 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> select * from binlog2sql;
+----+--------+--------+---------------------+
| id | name   | status | up_date             |
+----+--------+--------+---------------------+
|  1 | kani   | A      | 2020-06-20 14:54:51 |
|  2 | ram    | A      | 2020-06-20 14:54:51 |
|  3 | durai  | A      | 2020-06-20 14:54:51 |
|  4 | asha   | NA     | 2020-06-20 14:54:51 |
|  5 | sakthi | NA     | 2020-06-20 14:54:51 |
+----+--------+--------+---------------------+
5 rows in set (0.00 sec)

For the reference, I am generating the executed UPDATEs using binlog2sql.

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -pxxxxxxx --start-file percona-bin.000002 --start-position 2136 --stop-position 2343  | cut -f1 -d'#'
UPDATE `percona`.`binlog2sql` SET `status`='NA', `up_date`='2020-06-20 14:54:51', `id`=4, `name`='asha' WHERE `status`='A' AND `up_date`='2020-06-20 14:54:51' AND `id`=4 AND `name`='asha' LIMIT 1; 
UPDATE `percona`.`binlog2sql` SET `status`='NA', `up_date`='2020-06-20 14:54:51', `id`=5, `name`='sakthi' WHERE `status`='A' AND `up_date`='2020-06-20 14:54:51' AND `id`=5 AND `name`='sakthi' LIMIT 1;

Generating the ROLLBACK statements.

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -pxxxxxx --start-file percona-bin.000002 --start-position 2136 --stop-position 2343 --flashback  | cut -f1 -d'#'
UPDATE `percona`.`binlog2sql` SET `status`='A', `up_date`='2020-06-20 14:54:51', `id`=5, `name`='sakthi' WHERE `status`='NA' AND `up_date`='2020-06-20 14:54:51' AND `id`=5 AND `name`='sakthi' LIMIT 1; 
UPDATE `percona`.`binlog2sql` SET `status`='A', `up_date`='2020-06-20 14:54:51', `id`=4, `name`='asha' WHERE `status`='NA' AND `up_date`='2020-06-20 14:54:51' AND `id`=4 AND `name`='asha' LIMIT 1;

Loading the rollback statements into the server.

[root@hercules7sakthi2 binlog2sql]# ./binlog2sql.py -upercona -pxxxxxxx --start-file percona-bin.000002 --start-position 2136 --stop-position 2343 --flashback  | cut -f1 -d'#' | mysql -vvv
--------------
UPDATE `percona`.`binlog2sql` SET `status`='A', `up_date`='2020-06-20 14:54:51', `id`=5, `name`='sakthi' WHERE `status`='NA' AND `up_date`='2020-06-20 14:54:51' AND `id`=5 AND `name`='sakthi' LIMIT 1
--------------
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

--------------
UPDATE `percona`.`binlog2sql` SET `status`='A', `up_date`='2020-06-20 14:54:51', `id`=4, `name`='asha' WHERE `status`='NA' AND `up_date`='2020-06-20 14:54:51' AND `id`=4 AND `name`='asha' LIMIT 1
--------------
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
Bye

[root@hercules7sakthi2 binlog2sql]# mysql -e "select * from percona.binlog2sql"
+----+--------+--------+---------------------+
| id | name   | status | up_date             |
+----+--------+--------+---------------------+
|  1 | kani   | A      | 2020-06-20 14:54:51 |
|  2 | ram    | A      | 2020-06-20 14:54:51 |
|  3 | durai  | A      | 2020-06-20 14:54:51 |
|  4 | asha   | A      | 2020-06-20 14:54:51 |
|  5 | sakthi | A      | 2020-06-20 14:54:51 |
+----+--------+--------+---------------------+

The changes have been rolled back.

Conclusion

  • As I mentioned earlier, the tool will support rollback only for DELETE/UPDATE.
  • The tool will not support DDL because the DDL statements do not log any actual data in the binary logs. 
  • The tool will not work with encrypted/compressed binary logs.
  • The tool has been tested on MySQL 5.6 and MySQL 5.7 environments. It has the support for MySQL 8 with “pymysql – 0.9.3”.

by Sri Sakthivel via Percona Database Performance Blog

Comments

Popular posts from this blog