How to migrate from Galera Cluster to MySQL Group Replication

lefred
December 17, 2016
5.7, CentOS, Galera, General, Group Replication, HA, MySQL, ProxySQL
8 Comments

In this article, I will show you how it’s possible to perform an online migration from a 3 members Galera cluster setup (in this case I’m using PXC 5.7.14) to a 3 members MySQL Group Replication cluster setup (MySQL Community 5.7.17).

Don’t forget that before adopting Group Replication as database backend, you should validate that your application do match GR requirements and limitations. When this is validated, you can start !

So first, let’s have a look at the current situation:

We have an application (sysbench 0.5), reading and writing to a Galera Cluster (Percona XtraDB Cluster 5.7.14) via ProxySQL. We write on all the nodes (Multi-Master) and we will do the same on our MySQL Group Replication Cluster, we will use a Multi-Primary Group.

This is the command used to simulate our application:

while true; do sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
     --mysql-host=127.0.0.1 --mysql-port=6033 --mysql-password=fred \
     --mysql-table-engine=innodb --mysql-user=fred --max-requests=0 \
     --tx-rate=20 --num-threads=2  --report-interval=1 run ; done;

And this an overview of the machines used:

Hostname	OS	Software	IP	Server_id
mysql1	CentOS 7	PXC 5.7.14	192.168.90.10	1
mysql2	CentOS 7	PXC 5.7.14	192.168.90.11	2
mysql3	CentOS 7	PXC 5.7.14	192.168.90.12	3
app	CentOS 7	ProxySQL sysbench 0.5	192.168.90.13	n/a

So the goal will be to replace all those PXC nodes by MySQL 5.7.17 with Group Replication one by one and avoid downtime.

For those familiar with ProxySQL this is how we see the Galera nodes in the proxy:

ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+--------+--------------+
| hostname      | status | hostgroup_id |
+---------------+--------+--------------+
| 192.168.90.12 | ONLINE | 1            |
| 192.168.90.10 | ONLINE | 1            |
| 192.168.90.11 | ONLINE | 1            |
+---------------+--------+--------------+

For the other ones, you can find more info in the previous post: HA with MySQL Group Replication and ProxySQL.

To be able to proceed as we planned, we need to have binary logs enabled on every PXC nodes and also use MySQL GTIDs.
So in my.cnf you must have:

enforce_gtid_consistency = on
gtid_mode  = on
log_bin 
log_slave_updates

First Step: remove one node and migrate it to MySQL 5.7.17

Our first step in this section will be to stop mysqld and remove the PXC packages on mysql3:

[root@mysql3]# systemctl stop mysql

ProxySQL Admin> select hostname, status from runtime_mysql_servers;
+---------------+---------+
| hostname      | status  |
+---------------+---------+
| 192.168.90.11 | ONLINE  |
| 192.168.90.12 | SHUNNED |
| 192.168.90.10 | ONLINE  |
+---------------+---------+

Our application is of course still running (it might of course be disconnected), so in this case sysbench runs in a loop.

As all our nodes are running CentOS 7, we will use the mysql57 community repo for el7.

[root@mysql3 ~]# yum install http://dev.mysql.com/get/mysql57-community-release-el7-9.noarch.rpm

Now we can change the packages:

[root@mysql3 ~]# yum -y swap Percona-XtraDB-Cluster* mysql-community-server mysql-community-libs-compat
...
=========================================================================================================
 Package                                   Arch     Version              Repository                 Size
=========================================================================================================
Installing:
 mysql-community-libs-compat               x86_64   5.7.17-1.el7         mysql57-community         2.0 M
 mysql-community-server                    x86_64   5.7.17-1.el7         mysql57-community         162 M
Removing:
 Percona-XtraDB-Cluster-57                 x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64   0.0  
 Percona-XtraDB-Cluster-client-57          x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64    37 M
 Percona-XtraDB-Cluster-server-57          x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64   227 M
 Percona-XtraDB-Cluster-shared-57          x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64   3.7 M
 Percona-XtraDB-Cluster-shared-compat-57   x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64   6.7 M
Installing for dependencies:
 mysql-community-client                    x86_64   5.7.17-1.el7         mysql57-community          24 M
 mysql-community-common                    x86_64   5.7.17-1.el7         mysql57-community         271 k
 mysql-community-libs                      x86_64   5.7.17-1.el7         mysql57-community         2.1 M

Transaction Summary
=========================================================================================================
Install  2 Packages (+3 Dependent packages)
Remove   5 Packages

After that step, it’s time to modify my.cnf and comment all wsrep and pxc related variables and add some extra that are mandatory:

binlog_checksum = none
master_info_repository = TABLE
relay_log_info_repository = TABLE
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name="afb80f36-2bff-11e6-84e0-0800277dd3bf"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address= "192.168.90.12:3406"
loose-group_replication_group_seeds= "192.168.90.10:3406,192.168.90.11:3406"
loose-group_replication_bootstrap_group= off
loose-group_replication_single_primary_mode= off

Then we move that server to another hostgroup in ProxySQL:

ProxySQL Admin> update mysql_servers set hostgroup_id =2 where hostname ="192.168.90.12";
ProxySQL Admin> load mysql servers to runtime;
ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+---------+--------------+
| hostname      | status  | hostgroup_id |
+---------------+---------+--------------+
| 192.168.90.11 | ONLINE  | 1            |
| 192.168.90.10 | ONLINE  | 1            |
| 192.168.90.12 | SHUNNED | 2            |
+---------------+---------+--------------+

It’s time now to start mysqld:

[root@mysql3 ~]# systemctl start mysqld
[root@mysql3 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.17-log MySQL Community Server (GPL)
...

Step 2: create Group Replication Cluster of 1 node

Now we need to bootstrap our group.

This is a very easy step:

mysql3> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
mysql3> SET GLOBAL group_replication_bootstrap_group=ON;
mysql3> START GROUP_REPLICATION;
mysql3> SET GLOBAL group_replication_bootstrap_group=OFF;

Now the Group is started:

mysql3> select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 9e8416d7-b1c6-11e6-bc10-08002718d305
 MEMBER_HOST: mysql3.localdomain
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE

It’s not needed now, but it’s the best time to also setup the credentials for the recovery process. You might forget it.

mysql1> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password' 
             FOR CHANNEL 'group_replication_recovery';

This user is not yet created but we will created directly during the next step (and it will be replicated to all nodes).

Step 3: make this MySQL 5.7.17 replicate from PXC

So now we need to create a replication user on the Galera cluster that we will use for this new MySQL 5.7 server (and later for the Group Replication’s recovery process):

mysql1> CREATE USER 'repl'@'192.168.90.%' IDENTIFIED BY 'password';
mysql1> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.90.%';

And we can start asynchronous replication from the Galera Cluster to this new MySQL server:

mysql3> CHANGE MASTER TO MASTER_HOST="192.168.90.10", MASTER_USER="repl", 
        MASTER_PASSWORD='password', MASTER_AUTO_POSITION=1;
mysql3> START SLAVE;

Now we have the following environment:

Step 4: migrate an extra node to the Group

Now we will almost do the same with mysql2:

stop mysql
install mysql community repository
swap the packages
modify my.cnf
put mysql2 in hostgroup_id 2 in ProxySQL
start mysqld
join the group

Let’s skip points 1 to 4.

Unlike Galera, it’s mandatory with MySQL Group Replication that all the nodes have a unique server_id. We must then be careful, in this case we will set it to 2.

Don’t forget to also swap the addresses for mysql3 and mysql2 between group_replication_local_address and group_replication_group_seeds:

loose-group_replication_local_address= "192.168.90.11:3406"
loose-group_replication_group_seeds= "192.168.90.10:3406,192.168.90.12:3406"

Put mysql2 in hostgroup_id 2 in ProxySQL, so you have:

ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+---------+--------------+
| hostname      | status  | hostgroup_id |
+---------------+---------+--------------+
| 192.168.90.10 | ONLINE  | 1            |
| 192.168.90.12 | ONLINE  | 2            |
| 192.168.90.11 | SHUNNED | 2            |
+---------------+---------+--------------+

Start mysqld and it’s time to join the group (point 7) !

mysql2> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
mysql2> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password' 
        FOR CHANNEL 'group_replication_recovery';
mysql2> START GROUP_REPLICATION;

So this is the new situation:

mysql3> select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 5221ffcf-c1e0-11e6-b1f5-08002718d305
 MEMBER_HOST: pxc2
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 5a2d38db-c1e0-11e6-8bf6-08002718d305
 MEMBER_HOST: pxc3
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE

Step 5: move the application to our new Cluster

It’s time now to let the application connect to our new MySQL Group Replication Cluster. In ProxySQL, we change the hostgroup_id for mysql2 and mysql3 to 1 and to 2 for mysql1, then we load it runtime and we stop mysql on mysql1:

ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+--------+--------------+
| hostname      | status | hostgroup_id |
+---------------+--------+--------------+
| 192.168.90.10 | ONLINE | 1            |
| 192.168.90.12 | ONLINE | 2            |
| 192.168.90.11 | ONLINE | 2            |
+---------------+--------+--------------+
ProxySQL Admin> update mysql_servers set hostgroup_id =2 where hostname ="192.168.90.10";
ProxySQL Admin> update mysql_servers set hostgroup_id =1 where hostname ="192.168.90.11";
ProxySQL Admin> update mysql_servers set hostgroup_id =1 where hostname ="192.168.90.12";

ProxySQL Admin> load mysql servers to runtime;

ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+--------+--------------+
| hostname      | status | hostgroup_id |
+---------------+--------+--------------+
| 192.168.90.12 | ONLINE | 1            |
| 192.168.90.11 | ONLINE | 1            |
| 192.168.90.10 | ONLINE | 2            |
+---------------+--------+--------------+

In this case as we are using ProxySQL, as soon as mysql1 (192.168.90.10) changes group, all the connection to it are killed and they will reconnect to the new nodes that are now part of the MySQL Group Replication.

[root@mysql1 ~]# systemctl stop mysql

To finish, we have two options, or we configure the remaining PXC node to be slave for a while, so we can still decide to rollback the migration (I would then consider to add an extra node to the current MySQL Group Replication Cluster, as with 2 nodes, the cluster is not tolerant to any failure). Or we can directly migrate the last Galera node to Group Replication.

Conclusion

As you could see, migrate your current Galera environment to MySQL Group Replication is not complicated and can be done with really minimal impact.

Don’t hesitate to leave your comments or questions like usual.

Note:

It’s also possible to swap Setp 2 and Step3, this means that the asynchronous replication is started before the bootstrap of the Group Replication. In that case, it might happen that the asynchronous replication fails while starting Group Replication, as the replication recovery is started and therefore no transaction can be executed.

You can see the following in SHOW SLAVE STATUS:

Last_SQL_Error: Error in Xid_log_event: Commit could not be completed, 
'Error on observer while running replication hook 'before_commit'.'

The error log also gives you information about it:

[Note] Plugin group_replication reported: 'Starting group replication recovery 
       with view_id 14817109352506883:1'
[Note] Plugin group_replication reported: 'Only one server alive. 
       Declaring this server as online within the replication group'
[ERROR] Plugin group_replication reported: 'Transaction cannot be executed while Group Replication 
        is recovering. Try again when the server is ONLINE.'
[ERROR] Run function 'before_commit' in plugin 'group_replication' failed
[ERROR] Slave SQL for channel '': Error in Xid_log_event: Commit could not be completed, 
        'Error on observer while running replication hook 'before_commit'.', Error_code: 3100
[Warning] Slave: Error on observer while running replication hook 'before_commit'. Error_code: 3100
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart 
        the slave SQL thread with "SLAVE START". We stopped at log 'pxc1-bin.000009' position 76100764
[Note] Plugin group_replication reported: 'This server was declared online within the replication group'

Restarting again replication solves the problem:

mysql3> STOP SLAVE;
mysql3> START SLAVE;

8 Comments

Kapil Singla

December 20, 2016 / 14:39 Reply

Nice Article ! Logically makes sense..
Vasiliy Lyk'yanchikov

December 23, 2016 / 10:31 Reply

Hi, Frederic!
Good guide, thanks for that. I want to translate it to russian language. Please let me know if you have any objection.
- lefred
  
  December 24, 2016 / 11:17 Reply
  
  Hi Vasily,
  No problem, just add a link to this original post and I’m perfectly fine. I would even thank you for that. Send me back the url of the Russian version too 😉
  Cheers,
  - Vasiliy Lyk'yanchikov
    
    December 27, 2016 / 13:28 Reply
    
    Hi, Frederic,
    Yes of course. I always add link back to the original post.
    Here is the translation of your article on russian language – http://sqlinfo.ru/articles/info/32.html
    You write interesting articles. I’ll translate some more your articles later.
    Happy new year.
Janet

February 24, 2018 / 08:29 Reply

Hi frederic,
The article is great. But I have concern at the value of loose-group_replication_group_seeds. When you set the node 3(192.168.90.12, first node to MGR), 192.168.90.10 and 192.168.90.11 was still PXC. I thought the loose-group_replication_group_seeds should be “192.168.90.12:3406”.
loose-group_replication_local_address= “192.168.90.12:3406”
loose-group_replication_group_seeds= “192.168.90.10:3406,192.168.90.11:3406”
Looking forward to your feedback on this.
- lefred
  
  February 25, 2018 / 09:59 Reply
  
  Hi Janet,
  It doesn’t matter, at that time the setting is not really used as we bootstrap the Group from that node. And in case it reboots, it will try to connect to the other nodes in that list, none will be available and a bootstrap will be necessary anyway. The setting it’s used at boot to join a Group.
  Cheers,
boshi fan

September 27, 2018 / 04:53 Reply

Hi, Frederic
I’m now migrating my pxc to mgr , but one problem is that mgr can’t catch up with master(pxc).
my pxc version is 5.7.14. I found a [blog](https://www.percona.com/blog/2017/04/19/how-we-made-percona-xtradb-cluster-scale/) it seems like that pxc5.7.14 is not support group commit very well. is there any way to optimize replication latency? here is my my.cnf

###BASIC SETTINGS
server-id = 0040263307
user = mysql
port = 3307
basedir = /usr/local/mysql
datadir = /data/mysql_3307/data
socket = /data/mysql_3307/tmp/mysql.sock
pid-file = /data/mysql_3307/tmp/node004026.pid
character-set-server = utf8mb4
transaction_isolation = READ-COMMITTED
explicit_defaults_for_timestamp = 1
max_allowed_packet = 64M
open_files_limit = 65535
sql_mode = NO_ENGINE_SUBSTITUTION

###CONNECTION SETTINGS
interactive_timeout = 30
wait_timeout = 30
lock_wait_timeout = 1800
skip_name_resolve = 1
max_connections = 2560
max_user_connections=2048
max_connect_errors = 1000000
back_log = 1024

###TABLE CACHE PERFORMANCE SETTINGS
table_open_cache = 2048
table_definition_cache = 1024
table_open_cache_instances = 64

###SESSION MEMORY SETTINGS
read_buffer_size = 16M
read_rnd_buffer_size = 32M
sort_buffer_size = 32M
join_buffer_size = 32M
tmp_table_size = 64M
max_heap_table_size = 64M
thread_cache_size = 64
query_cache_size = 0
query_cache_type = 0
key_buffer_size = 16M
max_length_for_sort_data = 8096
bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1

###LOG SETTINGS
log_error = /data/mysql_3307/logs/node004026.err
general_log_file = /data/mysql_3307/logs/node004026.log
slow_query_log = 1
slow_query_log_file = /data/mysql_3307/logs/slow-queries.log
long_query_time = 0.2
log_queries_not_using_indexes = 1
log_slow_admin_statements = 1
log_slow_slave_statements = 1
log_throttle_queries_not_using_indexes = 60
min_examined_row_limit = 100
expire_logs_days = 3
#log_timestamps = system

###INNODB SETTINGS
innodb_buffer_pool_size = 50G
innodb_buffer_pool_instances = 8
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_lru_scan_depth = 4096
innodb_lock_wait_timeout = 20
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_flush_method = O_DIRECT
#innodb_undo_logs = 128
#innodb_undo_tablespaces = 3
#innodb_max_undo_log_size = 4G
#innodb_undo_log_truncate = 1
innodb_flush_neighbors = 0
innodb_log_file_size = 1G
innodb_log_files_in_group = 5
innodb_log_buffer_size = 32M
innodb_large_prefix = 1
innodb_thread_concurrency = 64
innodb_print_all_deadlocks = 1
innodb_strict_mode = 1
innodb_purge_threads = 4
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_page_cleaners = 8
innodb_sort_buffer_size = 32M
innodb_file_per_table = 1
innodb_stats_persistent_sample_pages = 64
innodb_autoinc_lock_mode = 2
innodb_online_alter_log_max_size = 4G
innodb_open_files = 65535
#innodb_data_file_path = ibdata1:1G:autoextend
innodb_flush_log_at_trx_commit = 1
innodb_checksums = 1
innodb_checksum_algorithm = crc32
innodb_rollback_on_timeout = 1
internal_tmp_disk_storage_engine = InnoDB
innodb_status_file = 1
innodb_status_output_locks = 1

###REPLICATION SETTINGS
sync_binlog = 1
master_info_repository = TABLE
relay_log_info_repository = TABLE
gtid_mode = on
enforce_gtid_consistency = 1
log_slave_updates
binlog_format = row
binlog_row_image = full
binlog_cache_size = 4M
binlog_gtid_simple_recovery = 1
max_binlog_cache_size = 2G
max_binlog_size = 1G
relay_log_recovery = 1
relay_log = /data/mysql_3307/logs/0040263307-relay-bin
relay_log_purge = 1
log_bin = /data/mysql_3307/logs/0040263307-mysql-bin
#binlog_transaction_dependency_history_size = 25000
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 8
slave_preserve_commit_order = on

###PERFORMANCE_SCHEMA SETTINGS
performance_schema = 1
performance_schema_instrument = ‘%=on’
performance_schema_digests_size = 40000
performance_schema_max_table_instances = 40000
performance_schema_max_sql_text_length = 4096
performance_schema_max_digest_length = 4096
performance-schema-instrument=’stage/%=ON’
performance-schema-consumer-events-stages-current=ON
performance-schema-consumer-events-stages-history=ON
performance-schema-consumer-events-stages-history-long=ON
performance-schema-consumer-events-transactions-history-long=ON

###INNODB MONITOR SETTINGS
innodb_monitor_enable=module_innodb,module_dml,module_ddl,module_trx,module_os,module_purge,module_log,module_lock,module_buffer,module_index,module_ibuf_system,module_buffer_page,module_adaptive_hash

###ADDITIONAL SETTINGS
# BEGIN ANSIBLE MANAGED BLOCK FOR MGR
###MGR SETTINGS
read_only=1
super_read_only=1
binlog_transaction_dependency_tracking = WRITESET
transaction-write-set-extraction = XXHASH64
binlog_transaction_dependency_history_size = 25000
# report_host = 127.0.0.1 # optional for group replication
binlog_checksum = NONE # only for group replication
loose-group_replication_group_name = ’38f8425e-9182-5934-b32a-7e4317fe4a04′
loose-group_replication_start_on_boot = off
loose-group_replication_local_address = 192.168.4.26:23307
loose-group_replication_group_seeds = “192.168.4.26:23307,192.168.4.31:23307,192.168.4.41:23307”
loose-group_replication_bootstrap_group = off
loose_group_replication_single_primary_mode = 0
loose_group_replication_enforce_update_everywhere_checks = 1
loose_group_replication_unreachable_majority_timeout = 120
loose_group_replication_start_on_boot = 0
# END ANSIBLE MANAGED BLOCK FOR MGR
From PXC/Galera to MySQL InnoDB Cluster in 10 steps – lefred blog: tribulations of a MySQL Evangelist

October 20, 2021 / 16:25 Reply

[…] I already wrote some time ago an article on this process: how to migrate from Galera to MySQL Group Replication. […]

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.