Recently, my friend Marco Tusa (MySQL Daddy or the Grinch) wrote up his first impressions of MySQL Group Replication (part of InnoDB Cluster), and his conclusion was not that positive. But when I analyzed his setup, I realized that some of his assumptions were not quite right.
Let’s try to explain what the issues were and why his test wasn’t correct.
Before commenting on Marco’s tests, I would like to clarify how flow-control is implemented in Group Replication:
We designed the flow-control feature in Group Replication as a safety measure for delaying writer nodes when they consistently exceed the write capacity of the Group, so that a large backlog would not make it hard to switch over from one member to another.
Flow-control is a coarse-grained measure, and the default threshold for that safety measure was set to about one second of throughput with small update transactions on modern machines (hence the default of 25000). When using larger transactions, the threshold can be reduced without harm, but for performance it is better to keep it above that one second of throughput. Note, however, that in a well-balanced system flow-control is not expected to limit throughput, because slaves are able to keep executing transactions as fast as the server delivers them.
I know that GR flow-control can be a complicated concept, which is why I have already blogged about it. You can expect more posts on this topic soon, especially covering the changes in 8.0.2.
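To make the threshold idea concrete, here is a minimal sketch in Python. It is a toy model and not the actual Group Replication code: the function names and the simple comparison are assumptions made for illustration, and only the 25000 default comes from the text above.

```python
# Toy model of a flow-control threshold (illustrative only, not the GR implementation).
DEFAULT_THRESHOLD = 25000  # default mentioned above: roughly one second of small updates


def should_throttle(queued_transactions, threshold=DEFAULT_THRESHOLD):
    """Return True when a member's backlog exceeds the threshold (hypothetical rule)."""
    return queued_transactions > threshold


def backlog_seconds(queued_transactions, apply_rate_per_second):
    """Rough time a member needs to drain its backlog at a given apply rate."""
    return queued_transactions / apply_rate_per_second
```

With an assumed apply rate of 25000 small transactions per second, `backlog_seconds(25000, 25000)` gives one second, which is the "one second of throughput" the default is meant to represent.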
Let’s go back to Marco’s test. My first remark is related to the sizing of the nodes. Are the nodes all equal, as recommended? We don’t know… From what I can see, nodes 3 and 4 are slow at applying the workload, and this is not related to the wrong flow-control value (see the next point below). Also, Marco created a cluster of 4 nodes with two in one location and two in another location (10 ms apart). As a majority is required for consensus, the price of these 10 ms is always paid: with four members a majority is three, so at least one member in the other location must always answer.
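The majority argument can be sketched with a few lines of Python. This is a simplification of a consensus round, not how Group Replication measures anything: the helper names are hypothetical, and I assume the 10 ms is the delay needed to get an answer from a member in the other location.

```python
# Simplified model: the writer needs answers from a majority of members,
# so the round is as slow as the slowest member inside the fastest majority.

def majority(group_size):
    """Smallest number of members that forms a majority."""
    return group_size // 2 + 1


def quorum_latency_ms(latencies_ms):
    """Latency to reach a majority, given the delay to each member (writer itself = 0)."""
    needed = majority(len(latencies_ms))
    return sorted(latencies_ms)[needed - 1]


# Marco's layout seen from a writer: itself and one local node, two remote nodes.
two_plus_two = [0, 0, 10, 10]
```

Here `majority(4)` is 3 and `quorum_latency_ms(two_plus_two)` is 10: two local members are not enough, so every round waits for a remote one. With three members in the writer's location and one remote, the same function returns 0.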
The second point is related to the value used for flow-control: 25? In contrast with Galera, the queues checked for flow-control also include the transactions going through the applier pipeline. This means that very low thresholds will trigger flow-control even when replication is functioning perfectly well and with low latency. In fact, you won’t gain anything from that.
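A back-of-the-envelope estimate shows why such a low threshold misfires. The sketch below uses Little's law (items in flight = arrival rate × time in the system) as a rough model. The workload figures in the usage note are made-up assumptions, not measurements from Marco's test.

```python
# Rough estimate of how many transactions sit in the applier pipeline
# of a perfectly healthy member (Little's law, illustrative numbers only).

def in_flight_transactions(throughput_per_second, pipeline_latency_ms):
    """Average number of transactions inside the pipeline at any moment."""
    return throughput_per_second * pipeline_latency_ms / 1000


def triggers_flow_control(throughput_per_second, pipeline_latency_ms, threshold):
    """Hypothetical check: does normal in-flight work alone exceed the threshold?"""
    return in_flight_transactions(throughput_per_second, pipeline_latency_ms) > threshold
```

For example, assuming 5000 transactions per second and 10 ms spent in the pipeline, about 50 transactions are in flight at any time. That is already above a threshold of 25, even though nothing is lagging, while it is nowhere near the default of 25000.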
Another misconception is related to the measurement. Marco suggested that he could safely say that the incoming GTID (last_received_transaction_set from replication_connection_status) is for sure the last transaction applied on the master that a node can know about.
In Group Replication, that doesn’t hold, because before applying a transaction all group members (or at least a majority, with those 10 ms) certify it. In fact, we can only know that this “master” (writer) will apply the transaction soon; it may not have been applied yet. Certification happens independently at each member once a majority of the members agree on the transaction delivery. This means that a transaction listed in last_received_transaction_set is certified and queued to be applied on this member. On the other members it may already be certified or applied, or it soon will be.
In addition, Marco complains (just a little) that MySQL Group Replication is using binlogs and relay logs… this is completely true! Our new replication is based on proven technologies, mastered by a lot of people, with the Group Communication layer on top. Galera invented the gcache, we reused the relay log 😉
And finally, I completely agree with Marco when he says that Group Replication is based on asynchronous replication… but this is exactly the same for Galera (and PXC), where flow-control totally blocks the cluster, which is not the case in Group Replication (along with other differences, some good, some bad).
So yes, if you change the default or if you don’t follow the recommendations, the experience can hurt (and it’s the same with all HA solutions when playing with data). But the current defaults are good and should provide you a stable environment.
I’m very happy to see that Marco is taking a look at our HA solution, and I also suggest that everyone interested in MySQL Group Replication check out the improvements we made in 8.0.2. I am looking forward to Marco’s next post on MySQL Group Replication and grazie per averlo provato (thank you for having tried it)!