In this article, I will show you how to perform an online migration from a 3-node Galera Cluster setup (in this case PXC 5.7.14) to a 3-node MySQL Group Replication setup (MySQL Community 5.7.17).
Don’t forget that before adopting Group Replication as database backend, you should validate that your application matches GR requirements and limitations. Once this is validated, you can start!
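For example, two of the most common blockers are tables without a primary key and tables that don’t use InnoDB. You could run something like the following queries on the current cluster to spot them before starting:

mysql1> SELECT table_schema, table_name, engine
        FROM information_schema.tables
        WHERE engine <> 'InnoDB'
          AND table_schema NOT IN ('mysql','sys','information_schema','performance_schema');

mysql1> SELECT t.table_schema, t.table_name
        FROM information_schema.tables t
        LEFT JOIN information_schema.table_constraints c
               ON c.table_schema = t.table_schema
              AND c.table_name = t.table_name
              AND c.constraint_type = 'PRIMARY KEY'
        WHERE c.constraint_name IS NULL
          AND t.table_type = 'BASE TABLE'
          AND t.table_schema NOT IN ('mysql','sys','information_schema','performance_schema');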
So first, let’s have a look at the current situation:
We have an application (sysbench 0.5) reading and writing to a Galera Cluster (Percona XtraDB Cluster 5.7.14) via ProxySQL. We write on all the nodes (Multi-Master), and we will do the same on our MySQL Group Replication Cluster: we will use a Multi-Primary Group.
This is the command used to simulate our application:
while true; do sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua \
   --mysql-host=127.0.0.1 --mysql-port=6033 --mysql-password=fred \
   --mysql-table-engine=innodb --mysql-user=fred --max-requests=0 \
   --tx-rate=20 --num-threads=2 --report-interval=1 run ; done;
And this is an overview of the machines used:
Hostname | OS | Software | IP | Server_id
---|---|---|---|---
mysql1 | CentOS 7 | PXC 5.7.14 | 192.168.90.10 | 1
mysql2 | CentOS 7 | PXC 5.7.14 | 192.168.90.11 | 2
mysql3 | CentOS 7 | PXC 5.7.14 | 192.168.90.12 | 3
app | CentOS 7 | ProxySQL, sysbench 0.5 | 192.168.90.13 | n/a
So the goal is to replace all those PXC nodes, one by one, with MySQL 5.7.17 running Group Replication, while avoiding downtime.
For those familiar with ProxySQL, this is how we see the Galera nodes in the proxy:
ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+--------+--------------+
| hostname      | status | hostgroup_id |
+---------------+--------+--------------+
| 192.168.90.12 | ONLINE | 1            |
| 192.168.90.10 | ONLINE | 1            |
| 192.168.90.11 | ONLINE | 1            |
+---------------+--------+--------------+
For the other ones, you can find more info in the previous post: HA with MySQL Group Replication and ProxySQL.
To be able to proceed as we planned, we need to have binary logs enabled on every PXC node and also use MySQL GTIDs.
So in my.cnf you must have:
enforce_gtid_consistency = on
gtid_mode = on
log_bin
log_slave_updates
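A quick way to double-check that these settings are active on each node is, for example:

mysql1> SELECT @@gtid_mode, @@enforce_gtid_consistency, @@log_bin, @@log_slave_updates;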
Step 1: remove one node and migrate it to MySQL 5.7.17
Our first step in this section will be to stop mysqld and remove the PXC packages on mysql3:
[root@mysql3]# systemctl stop mysql

ProxySQL Admin> select hostname, status from runtime_mysql_servers;
+---------------+---------+
| hostname      | status  |
+---------------+---------+
| 192.168.90.11 | ONLINE  |
| 192.168.90.12 | SHUNNED |
| 192.168.90.10 | ONLINE  |
+---------------+---------+
Our application is of course still running (though its connections may get killed when a node disappears), which is why sysbench runs in a while loop in this case.
As all our nodes are running CentOS 7, we will use the mysql57 community repo for el7.
[root@mysql3 ~]# yum install http://dev.mysql.com/get/mysql57-community-release-el7-9.noarch.rpm
Now we can change the packages:
[root@mysql3 ~]# yum -y swap Percona-XtraDB-Cluster* mysql-community-server mysql-community-libs-compat
...
=========================================================================================================
 Package                                   Arch     Version              Repository                Size
=========================================================================================================
Installing:
 mysql-community-libs-compat               x86_64   5.7.17-1.el7         mysql57-community        2.0 M
 mysql-community-server                    x86_64   5.7.17-1.el7         mysql57-community        162 M
Removing:
 Percona-XtraDB-Cluster-57                 x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64  0.0
 Percona-XtraDB-Cluster-client-57          x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64  37 M
 Percona-XtraDB-Cluster-server-57          x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64  227 M
 Percona-XtraDB-Cluster-shared-57          x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64  3.7 M
 Percona-XtraDB-Cluster-shared-compat-57   x86_64   5.7.14-26.17.1.el7   @percona-release-x86_64  6.7 M
Installing for dependencies:
 mysql-community-client                    x86_64   5.7.17-1.el7         mysql57-community        24 M
 mysql-community-common                    x86_64   5.7.17-1.el7         mysql57-community        271 k
 mysql-community-libs                      x86_64   5.7.17-1.el7         mysql57-community        2.1 M

Transaction Summary
=========================================================================================================
Install  2 Packages (+3 Dependent packages)
Remove   5 Packages
After that step, it’s time to modify my.cnf, comment all the wsrep- and pxc-related variables, and add some extra ones that are mandatory:
binlog_checksum = none
master_info_repository = TABLE
relay_log_info_repository = TABLE
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = "afb80f36-2bff-11e6-84e0-0800277dd3bf"
loose-group_replication_start_on_boot = off
loose-group_replication_local_address = "192.168.90.12:3406"
loose-group_replication_group_seeds = "192.168.90.10:3406,192.168.90.11:3406"
loose-group_replication_bootstrap_group = off
loose-group_replication_single_primary_mode = off
Then we move that server to another hostgroup in ProxySQL:
ProxySQL Admin> update mysql_servers set hostgroup_id = 2 where hostname = "192.168.90.12";
ProxySQL Admin> load mysql servers to runtime;
ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+---------+--------------+
| hostname      | status  | hostgroup_id |
+---------------+---------+--------------+
| 192.168.90.11 | ONLINE  | 1            |
| 192.168.90.10 | ONLINE  | 1            |
| 192.168.90.12 | SHUNNED | 2            |
+---------------+---------+--------------+
It’s time now to start mysqld:
[root@mysql3 ~]# systemctl start mysqld
[root@mysql3 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.17-log MySQL Community Server (GPL)
...
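Optionally, you can verify that the Group Replication prerequisites were picked up from the new my.cnf, for example with:

mysql3> SELECT @@server_id, @@binlog_checksum, @@transaction_write_set_extraction;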
Step 2: create Group Replication Cluster of 1 node
Now we need to bootstrap our group.
This is a very easy step:
mysql3> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
mysql3> SET GLOBAL group_replication_bootstrap_group=ON;
mysql3> START GROUP_REPLICATION;
mysql3> SET GLOBAL group_replication_bootstrap_group=OFF;
Now the Group is started:
mysql3> select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 9e8416d7-b1c6-11e6-bc10-08002718d305
 MEMBER_HOST: mysql3.localdomain
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
It’s not needed right now, but this is the best time to also set up the credentials for the recovery process, before you forget it:
mysql3> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
This user is not yet created, but we will create it directly during the next step (and it will be replicated to all nodes).
Step 3: make this MySQL 5.7.17 replicate from PXC
So now we need to create a replication user on the Galera cluster that we will use for this new MySQL 5.7 server (and later for the Group Replication’s recovery process):
mysql1> CREATE USER 'repl'@'192.168.90.%' IDENTIFIED BY 'password';
mysql1> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.90.%';
And we can start asynchronous replication from the Galera Cluster to this new MySQL server:
mysql3> CHANGE MASTER TO MASTER_HOST="192.168.90.10", MASTER_USER="repl",
        MASTER_PASSWORD='password', MASTER_AUTO_POSITION=1;
mysql3> START SLAVE;
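It’s a good idea to verify that the asynchronous link is healthy and catching up before going further:

mysql3> SHOW SLAVE STATUS\G

Slave_IO_Running and Slave_SQL_Running should both be Yes, and Seconds_Behind_Master should converge towards 0.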
Now we have the following environment: mysql3 runs MySQL 5.7.17 as a single-member Group Replication cluster and replicates asynchronously from the Galera Cluster, while the application keeps writing to mysql1 and mysql2 through ProxySQL.
Step 4: migrate an extra node to the Group
Now we will almost do the same with mysql2:
1. stop mysql
2. install the mysql community repository
3. swap the packages
4. modify my.cnf
5. put mysql2 in hostgroup_id 2 in ProxySQL
6. start mysqld
7. join the group
Let’s skip the details of points 1 to 4 and just highlight what differs from mysql3.
Unlike Galera, MySQL Group Replication requires that all the nodes have a unique server_id. We must be careful about this; in this case we will set it to 2.
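Concretely, that just means having the following in my.cnf on mysql2:

# server_id must be unique for every member of the Group
server_id = 2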
Don’t forget to also swap the addresses of mysql3 and mysql2 between group_replication_local_address and group_replication_group_seeds:
loose-group_replication_local_address = "192.168.90.11:3406"
loose-group_replication_group_seeds = "192.168.90.10:3406,192.168.90.12:3406"
Put mysql2 in hostgroup_id 2 in ProxySQL, so you have:
ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+---------+--------------+
| hostname      | status  | hostgroup_id |
+---------------+---------+--------------+
| 192.168.90.10 | ONLINE  | 1            |
| 192.168.90.12 | ONLINE  | 2            |
| 192.168.90.11 | SHUNNED | 2            |
+---------------+---------+--------------+
Start mysqld and it’s time to join the group (point 7)!
mysql2> INSTALL PLUGIN group_replication SONAME 'group_replication.so';
mysql2> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='password'
        FOR CHANNEL 'group_replication_recovery';
mysql2> START GROUP_REPLICATION;
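While mysql2 performs its distributed recovery, you can monitor its state: it will be reported as RECOVERING until it has caught up with the Group, and then switch to ONLINE. For example:

mysql2> SELECT MEMBER_HOST, MEMBER_STATE FROM performance_schema.replication_group_members;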
So this is the new situation:
mysql3> select * from performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 5221ffcf-c1e0-11e6-b1f5-08002718d305
 MEMBER_HOST: pxc2
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 5a2d38db-c1e0-11e6-8bf6-08002718d305
 MEMBER_HOST: pxc3
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
Step 5: move the application to our new Cluster
It’s time now to let the application connect to our new MySQL Group Replication Cluster. In ProxySQL, we change the hostgroup_id to 1 for mysql2 and mysql3 and to 2 for mysql1, then we load the configuration to runtime and we stop mysql on mysql1:
ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+--------+--------------+
| hostname      | status | hostgroup_id |
+---------------+--------+--------------+
| 192.168.90.10 | ONLINE | 1            |
| 192.168.90.12 | ONLINE | 2            |
| 192.168.90.11 | ONLINE | 2            |
+---------------+--------+--------------+

ProxySQL Admin> update mysql_servers set hostgroup_id = 2 where hostname = "192.168.90.10";
ProxySQL Admin> update mysql_servers set hostgroup_id = 1 where hostname = "192.168.90.11";
ProxySQL Admin> update mysql_servers set hostgroup_id = 1 where hostname = "192.168.90.12";
ProxySQL Admin> load mysql servers to runtime;
ProxySQL Admin> select hostname, status, hostgroup_id from runtime_mysql_servers;
+---------------+--------+--------------+
| hostname      | status | hostgroup_id |
+---------------+--------+--------------+
| 192.168.90.12 | ONLINE | 1            |
| 192.168.90.11 | ONLINE | 1            |
| 192.168.90.10 | ONLINE | 2            |
+---------------+--------+--------------+
In this case, as we are using ProxySQL, as soon as mysql1 (192.168.90.10) changes hostgroup, all the connections to it are killed and the application reconnects to the nodes that are now part of the MySQL Group Replication cluster.
[root@mysql1 ~]# systemctl stop mysql
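If you want to confirm that the application traffic really moved to the Group members, you can for example look at ProxySQL’s connection pool statistics; the Queries counter should now only keep increasing for the servers in hostgroup 1:

ProxySQL Admin> SELECT hostgroup, srv_host, status, Queries FROM stats_mysql_connection_pool;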
To finish, we have two options: either we configure the remaining PXC node to be a slave for a while, so we can still decide to roll back the migration (I would then consider adding an extra node to the current MySQL Group Replication Cluster, as with only 2 nodes the cluster is not tolerant to any failure), or we directly migrate the last Galera node to Group Replication.
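For the first option, assuming mysql1 is started again, it could simply be made an asynchronous slave of one of the Group members, reusing the repl account we created earlier:

mysql1> CHANGE MASTER TO MASTER_HOST="192.168.90.12", MASTER_USER="repl",
        MASTER_PASSWORD='password', MASTER_AUTO_POSITION=1;
mysql1> START SLAVE;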
Conclusion
As you can see, migrating your current Galera environment to MySQL Group Replication is not complicated and can be done with really minimal impact.
Don’t hesitate to leave your comments or questions as usual.
Note:
It’s also possible to swap Step 2 and Step 3, meaning that asynchronous replication is started before the Group Replication is bootstrapped. In that case, the asynchronous replication might fail while Group Replication starts, because the replication recovery process is running and therefore no transaction can be executed.
You can see the following in SHOW SLAVE STATUS:
Last_SQL_Error: Error in Xid_log_event: Commit could not be completed, 'Error on observer while running replication hook 'before_commit'.'
The error log also gives you information about it:
[Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 14817109352506883:1'
[Note] Plugin group_replication reported: 'Only one server alive. Declaring this server as online within the replication group'
[ERROR] Plugin group_replication reported: 'Transaction cannot be executed while Group Replication is recovering. Try again when the server is ONLINE.'
[ERROR] Run function 'before_commit' in plugin 'group_replication' failed
[ERROR] Slave SQL for channel '': Error in Xid_log_event: Commit could not be completed, 'Error on observer while running replication hook 'before_commit'.', Error_code: 3100
[Warning] Slave: Error on observer while running replication hook 'before_commit'. Error_code: 3100
[ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'pxc1-bin.000009' position 76100764
[Note] Plugin group_replication reported: 'This server was declared online within the replication group'
Restarting replication solves the problem:
mysql3> STOP SLAVE;
mysql3> START SLAVE;
Comments

Nice article! Logically makes sense.
Hi, Frederic!
Good guide, thanks for that. I want to translate it into Russian. Please let me know if you have any objection.
Hi Vasily,
No problem, just add a link to this original post and I’m perfectly fine. I would even thank you for that. Send me back the url of the Russian version too 😉
Cheers,
Hi, Frederic,
Yes, of course. I always add a link back to the original post.
Here is the translation of your article into Russian: http://sqlinfo.ru/articles/info/32.html
You write interesting articles. I’ll translate some more of your articles later.
Happy new year.
Hi frederic,
The article is great, but I have a concern about the value of loose-group_replication_group_seeds. When you set up node 3 (192.168.90.12, the first node moved to MGR), 192.168.90.10 and 192.168.90.11 were still PXC. I thought loose-group_replication_group_seeds should be "192.168.90.12:3406".
loose-group_replication_local_address = "192.168.90.12:3406"
loose-group_replication_group_seeds = "192.168.90.10:3406,192.168.90.11:3406"
Looking forward to your feedback on this.
Hi Janet,
It doesn’t matter; at that time the setting is not really used, as we bootstrap the Group from that node. And in case it reboots, it will try to connect to the other nodes in that list, none will be available, and a bootstrap will be necessary anyway. The setting is used at boot to join a Group.
Cheers,
Hi, Frederic
I’m now migrating my PXC to MGR, but one problem is that MGR can’t catch up with the master (PXC).
My PXC version is 5.7.14. I found a [blog](https://www.percona.com/blog/2017/04/19/how-we-made-percona-xtradb-cluster-scale/) and it seems that PXC 5.7.14 does not support group commit very well. Is there any way to optimize replication latency? Here is my my.cnf:
###BASIC SETTINGS
server-id = 0040263307
user = mysql
port = 3307
basedir = /usr/local/mysql
datadir = /data/mysql_3307/data
socket = /data/mysql_3307/tmp/mysql.sock
pid-file = /data/mysql_3307/tmp/node004026.pid
character-set-server = utf8mb4
transaction_isolation = READ-COMMITTED
explicit_defaults_for_timestamp = 1
max_allowed_packet = 64M
open_files_limit = 65535
sql_mode = NO_ENGINE_SUBSTITUTION
###CONNECTION SETTINGS
interactive_timeout = 30
wait_timeout = 30
lock_wait_timeout = 1800
skip_name_resolve = 1
max_connections = 2560
max_user_connections=2048
max_connect_errors = 1000000
back_log = 1024
###TABLE CACHE PERFORMANCE SETTINGS
table_open_cache = 2048
table_definition_cache = 1024
table_open_cache_instances = 64
###SESSION MEMORY SETTINGS
read_buffer_size = 16M
read_rnd_buffer_size = 32M
sort_buffer_size = 32M
join_buffer_size = 32M
tmp_table_size = 64M
max_heap_table_size = 64M
thread_cache_size = 64
query_cache_size = 0
query_cache_type = 0
key_buffer_size = 16M
max_length_for_sort_data = 8096
bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 1
###LOG SETTINGS
log_error = /data/mysql_3307/logs/node004026.err
general_log_file = /data/mysql_3307/logs/node004026.log
slow_query_log = 1
slow_query_log_file = /data/mysql_3307/logs/slow-queries.log
long_query_time = 0.2
log_queries_not_using_indexes = 1
log_slow_admin_statements = 1
log_slow_slave_statements = 1
log_throttle_queries_not_using_indexes = 60
min_examined_row_limit = 100
expire_logs_days = 3
#log_timestamps = system
###INNODB SETTINGS
innodb_buffer_pool_size = 50G
innodb_buffer_pool_instances = 8
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_lru_scan_depth = 4096
innodb_lock_wait_timeout = 20
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000
innodb_flush_method = O_DIRECT
#innodb_undo_logs = 128
#innodb_undo_tablespaces = 3
#innodb_max_undo_log_size = 4G
#innodb_undo_log_truncate = 1
innodb_flush_neighbors = 0
innodb_log_file_size = 1G
innodb_log_files_in_group = 5
innodb_log_buffer_size = 32M
innodb_large_prefix = 1
innodb_thread_concurrency = 64
innodb_print_all_deadlocks = 1
innodb_strict_mode = 1
innodb_purge_threads = 4
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_page_cleaners = 8
innodb_sort_buffer_size = 32M
innodb_file_per_table = 1
innodb_stats_persistent_sample_pages = 64
innodb_autoinc_lock_mode = 2
innodb_online_alter_log_max_size = 4G
innodb_open_files = 65535
#innodb_data_file_path = ibdata1:1G:autoextend
innodb_flush_log_at_trx_commit = 1
innodb_checksums = 1
innodb_checksum_algorithm = crc32
innodb_rollback_on_timeout = 1
internal_tmp_disk_storage_engine = InnoDB
innodb_status_file = 1
innodb_status_output_locks = 1
###REPLICATION SETTINGS
sync_binlog = 1
master_info_repository = TABLE
relay_log_info_repository = TABLE
gtid_mode = on
enforce_gtid_consistency = 1
log_slave_updates
binlog_format = row
binlog_row_image = full
binlog_cache_size = 4M
binlog_gtid_simple_recovery = 1
max_binlog_cache_size = 2G
max_binlog_size = 1G
relay_log_recovery = 1
relay_log = /data/mysql_3307/logs/0040263307-relay-bin
relay_log_purge = 1
log_bin = /data/mysql_3307/logs/0040263307-mysql-bin
#binlog_transaction_dependency_history_size = 25000
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 8
slave_preserve_commit_order = on
###PERFORMANCE_SCHEMA SETTINGS
performance_schema = 1
performance_schema_instrument = '%=on'
performance_schema_digests_size = 40000
performance_schema_max_table_instances = 40000
performance_schema_max_sql_text_length = 4096
performance_schema_max_digest_length = 4096
performance-schema-instrument='stage/%=ON'
performance-schema-consumer-events-stages-current=ON
performance-schema-consumer-events-stages-history=ON
performance-schema-consumer-events-stages-history-long=ON
performance-schema-consumer-events-transactions-history-long=ON
###INNODB MONITOR SETTINGS
innodb_monitor_enable=module_innodb,module_dml,module_ddl,module_trx,module_os,module_purge,module_log,module_lock,module_buffer,module_index,module_ibuf_system,module_buffer_page,module_adaptive_hash
###ADDITIONAL SETTINGS
# BEGIN ANSIBLE MANAGED BLOCK FOR MGR
###MGR SETTINGS
read_only=1
super_read_only=1
binlog_transaction_dependency_tracking = WRITESET
transaction-write-set-extraction = XXHASH64
binlog_transaction_dependency_history_size = 25000
# report_host = 127.0.0.1 # optional for group replication
binlog_checksum = NONE # only for group replication
loose-group_replication_group_name = '38f8425e-9182-5934-b32a-7e4317fe4a04'
loose-group_replication_start_on_boot = off
loose-group_replication_local_address = 192.168.4.26:23307
loose-group_replication_group_seeds = "192.168.4.26:23307,192.168.4.31:23307,192.168.4.41:23307"
loose-group_replication_bootstrap_group = off
loose_group_replication_single_primary_mode = 0
loose_group_replication_enforce_update_everywhere_checks = 1
loose_group_replication_unreachable_majority_timeout = 120
loose_group_replication_start_on_boot = 0
# END ANSIBLE MANAGED BLOCK FOR MGR