MySQL Group Replication… synchronous or asynchronous replication ?

After some feedback we received from early adopters or discussions during events like FOSDEM, I realized that there is some misconception about the type of replication that MySQL Group Replication is using. And even experts can be confused as Vadim’s blog post illustrated it.

So, is MySQL Group Replication asynchronous or synchronous ??

… in fact it depends !

The short answer is that GR is asynchronous. The confusion here can be explained by the comparison with Galera that claims to be synchronous or virtually synchronous depending where and who claims it (Synchronous multi-master replication library, synchronous replication, scalable synchronous replication solution, enables applications requiring synchronous replication of data, …) . But GR and Galera are not more synchronous than the other.

The more detailed answer is that it depends what do you call “replication”. In fact for years, in the MySQL world, replication defined the process of writing (or changing or deleting) data to a master and the appearance of that data on the slave. The full process is what we called replication. The fact of writing data on a master, adding that change in the binary log, sending it on the relay log of a slave and the slave applying that change… So “replication” is in fact 5 different steps:

locally applying
generating a binlog event
sending the binlog event to the slave(s)
adding the binlog event on the relay log
applying the binlog event from the relay log

And indeed, in MySQL Group Replication and in Galera (even if binlog and relay log files are mostly replace by the galera cache), only the step #3 is synchronous… and in fact this step is the streaming of the binlog event (write set) to the slave(s)… the replication of the data to the other nodes.

So yes the process of sending (replicating, streaming) the data to the other nodes is synchronous. But the applying of these changes is still completely asynchronous.

For example if you create a large transaction (which is not recommended neither in InnoDB, Galera and Group Replication) that modifies a huge amount of records, when the transaction is committed, a huge binlog event is created and streamed everywhere. As soon as the other nodes of the cluster/group acknowledge the reception of the binlog event, the node where the transaction was created returns “success” to the client and the data on that particular node is ready. Meanwhile all the other nodes need to process the huge binlog and make all the necessary data modification…. and this can take a lot of time. So yes, if you try to read the data that is part of that huge transaction on another node than the one where the write was done… the data won’t be there immediately. Bigger is the transaction longer you will have to wait for your data to be applied on the slave(s).

Let’s check with some pictures to try to make this more clear, considering the vertical axis is Time, :

We have a MySQL Group Replication cluster of 3 nodes and we start a transaction on **node1**

we add some statements to our transaction…

we commit the transaction and binary log events are generated

those binlog events are **streamed**/delivered synchronously to the other nodes and as soon as everybody (*) ack the reception of the binlog events, each node starts certifying them as soon as they can… but independently

certification can start as son as the transaction is received

when certification is done, on the writer, there is no need to wait for anything else from the other nodes and the commit result is sent back to the client

every other nodes consume from the apply queue the changes and start to apply them locally. This is again an asynchronous process like it was for certification

you can see that the transaction is committed on every node at different time

If you perform a lot of large transactions and you want to avoid inconsistent reads, with MySQL Group Replication, you need to wait by yourself and check if there is still some transaction to apply in the queue or verify the last GTID executed to know if the data you modified is present or not where you try to read it. By default this is the same with Galera. However, Galera implemented sync_wait that force the client to wait (until a timeout) for all the transaction in the apply queue to be executed before the current one.

The only synchronous replication solution for the moment is still MySQL Cluster, aka NDB.

(*) on Group Replication majority is enough.

18 Comments

Pavel Katiushyn

February 17, 2017 / 16:40 Reply

Hi,
If I understand correctly, GR is asynchronous in applying transactions, but synchronous in bin log events transfer.
Isn’t “semi-sync” better description for GR then? Of course it should not be confused with semi-sync replication plugin.
BTW, here is described the differences between replication types: async, semi-sinc and “semi-sync” group replication: https://dev.mysql.com/doc/refman/5.7/en/group-replication-replication-technologies.html
- lefred
  
  February 17, 2017 / 17:14 Reply
  
  Hi Pavel, thank you for comment. GR can be seen like that, but don’t forget there is many other aspects to take in consideration. This is a high level simplification to try to explain easily why it’s not “synchronous replication”. Certification, flow control, quorum, etc… all that is also part of the solution which is inexistant in semi-sync.
Peter Zaitsev

February 17, 2017 / 19:12 Reply

Fred,

PXC and other Galera Replications have a mode where selects on the nodes wait for writes to be propagated to ensure there are no stale reads. Does Group Replication have the same option ?
- lefred
  
  February 20, 2017 / 10:17 Reply
  
  Hi Peter,
  
  Thank you for your comment. Unfortunately GR doesn’t have this feature yet but we understand the need. Our development team is working on a solution.
  - Peter Zaitsev
    
    February 21, 2017 / 00:03 Reply
    
    Thank you Fred,
    
    My point is this is a significant difference at this point between GR and PXC/Galera – you can’t configure it in a way to forbid stale reads.
VadimTk

February 17, 2017 / 19:21 Reply

Fred,

I am not confused, my point is that without sync_wait or other mechanism to ensure that you are not reading stale data – it is impractical to use Group Replication.
How do you propose for the client that reads from Node2 to be sure that it is using most recent data?
- lefred
  
  February 20, 2017 / 10:22 Reply
  
  Hi Vadim,
  
  I understand your need and it was not my goal to say you are confused 😉 And to be honest I was surprised about the bug you entered instead of a feature request 😀
  
  So we are working on some solution to have this in the server, currently you have two options:
  – with the connector, as in the OK packet you receive when you write a transaction, you also get the GTID of that trx. So if you really need to read back that data, you can check the last gtid executed. But this requires an application change.
  – with the application, you can check the apply queue lenght but this is not really precise
  
  Thank you for evaluating GR
  - Hany Widodo
    
    October 28, 2021 / 07:20 Reply
    
    Hi Fred,
    
    We’re consulting service and have requirements from our clients, that need setup the mysql such as oracle RAC and Dataguard as DC-DR solution which means different network segment on DC and DR.
    And read the documentation both from support.oracle.com and google for mysql enterprise 8.x, we haven’t found the solution for this. Need your recommendation 🙏🏻
    
    Thanks and Regards,
    Hany
    - lefred
      
      October 28, 2021 / 07:25 Reply
      
      Take a look at https://dev.mysql.com/doc/mysql-shell/8.0/en/innodb-clusterset.html
Hugo

February 17, 2017 / 19:29 Reply

Hi,
When Node 2 is applying the big transaction(1), it is possible to write another transaction (2) on the node 2 immediately or the transaction (2) will wait that the first big transaction(1) finish ?
- VadimTk
  
  February 18, 2017 / 06:47 Reply
  
  Hugo,
  
  right now because of https://bugs.mysql.com/bug.php?id=84901 you can’t write any transaction on the node 2.
Brad

February 17, 2017 / 21:32 Reply

Is this true for both single primary and multi-master?
Brad

February 18, 2017 / 16:19 Reply

Say we have a 3 node cluster in single primary node. Node 1 is the master. Then we have the following:

Node 1 commits a transaction
Node 3 certifies it.
Node 2 has not certified it.
Node 1 finailizes the commit
Node 1 fails
Failover to Node 2.

At this point node 2 has not certified the transaction from node 1, but node 1 is gone. Will node 2 receive the data and proceed with certification from node 3?

If so, what is the proper way to to monitor node 2 to ensure it has received that transaction?
- lefred
  
  February 20, 2017 / 10:25 Reply
  
  Hi Brad,
  When the commit is finalized, this mean all members got the transaction.
  https://dev.mysql.com/doc/refman/5.7/en/images/gr-replication-diagram.png
  - VadimTk
    
    February 20, 2017 / 21:50 Reply
    
    Fred,
    
    What about if we have 4 nodes?
    As I understand your implementation requires only majority.
    Does not it mean that to finalize commit you need only to have confirmation from 3 nodes out of 4?
MySQL Group Replication: about ack from majority – lefred's blog: tribulations of a MySQL Evangelist

February 23, 2017 / 01:32 Reply

[…] This means that as soon as the majority of nodes member of the group ack the writeset reception, certification can start. So, as a picture is worth a 1000 words, this is what it looks like if we take the illustrations from my previous post: […]
MySQL Group Replication: about ack from majority – Cloud Data Architect

February 24, 2017 / 12:58 Reply

[…] This means that as soon as the majority of nodes member of the group ack the writeset reception, certification can start. So, as a picture is worth a 1000 words, this is what it looks like if we take the illustrations from my previous post: […]
futuristic architecture

July 3, 2017 / 09:38 Reply

I don’t know whether it’s just me or if everybody else experiencing problems with your blog.
It looks like some of the written text on your content are running
off the screen. Can somebody else please comment and let
me know if this is happening to them as well? This could be a
problem with my browser because I’ve had this
happen previously. Thanks

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

MySQL Group Replication… synchronous or asynchronous replication ?

So, is MySQL Group Replication asynchronous or synchronous ??

… in fact it depends !

follow me

Subscribe to Blog via Email

18 Comments

Leave a Reply to lefredCancel Reply