MySQL Group Replication: about ack from majority

The documentation states that “For a transaction to commit, the majority of the group have to agree on the order of a given transaction in the global sequence of transactions.

This means that as soon as the majority of nodes member of the group ack the writeset reception, certification can start. So, as a picture is worth a 1000 words, this is what it looks like if we take the illustrations from my previous post:

a group of 3 members

zoom in transaction deliver
the writer also acks
majority is reached, the system agreed on the order
ack of the remaining node will come too but the order has been already decided

 

certification can start
the process then continues as usual

So theoretically, having 2 nodes in one DC and 1 node in another DC shouldn’t be affected by the latency between both sites if writes are happening on the DC with the majority of nodes. But this is not the case.

As you can see on the video above, every 3rd write to the system is affected by the latency. That’s because the system has to wait for the noop (single skip message) from the “distant” node. As Alfranio explained it in his blog post about our homegrown paxos based consensus, XCom is a multi-leader or more precisely a multi-proposer solution. In this protocol, every member has an associated unique number and a reserved slot in the stream of totally ordered messages. There is no leader election and each member is a leader of its own slots in the stream of messages. Members can propose messages for their slots without having to wait for other members. But if they don’t have anything to say, they need to tell it too and this is where we are affected by the latency.

So in conclusion, having a distant node (or with higher latency) as member of a Group slows down the full workload. Not constantly but at least in correlation with its ratio on the cluster. 1/3 for a 3 nodes cluster for example.

At least for now, if you need to have a distant node and using it only for reads, I would advice to use asynchronous replication between your two sites or being prepare to pay the cost of the latency.

 

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

5 Comments

  1. Hello,
    Was your test case with Single-Primary mode or Multi-Primary mode? Or does it matter?
    Thanks,
    Bill

    • Hi Bill,
      Thank you for your question. In fact currently it doesn’t matter. This is indeed somewhere we could implement some optimization.

  2. Hello
    As you said : “”So theoretically, having 2 nodes in one DC and 1 node in another DC shouldn’t be affected by the latency between both sites if writes are happening on the DC with the majority of nodes. But this is not the case.””
    Any 2021 update concerning this behavior and “latency-price-to-pay”, with your example of having 2 local nodes (majority) + 1 distant node ?

    • It´s still the same as Group Replication needs consistency to protect your data. It would be a very bad idea to just not care about the other site.
      Of course there have been improvements for low latency networks but if the node is away with a very low latency compare to the other nodes at a certain point, even if you can delay you might be affected if write load is faster then the ack .

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

As MySQL Community Manager, I am an employee of Oracle and the views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

You can find articles I wrote on Oracle’s blog.