This is why MySQL now supports (since 8.0.14) a new consistency model to avoid such situations when needed.
Nuno Carvalho and Aníbal Pinto already posted a blog series I highly encourage you to read:
- Group Replication – Consistency Levels
- Group Replication: Preventing stale reads on primary fail-over! (you can also check this post)
- Group Replication – Consistent Reads
- Group Replication – Consistent Reads Deep Dive
After those great articles, let’s check how this works with some examples.
This is how the environment is set up:
- 3 members:
- the cluster runs in Single-Primary mode
- mysql1 is the Primary Master
- some extra sys views are installed
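The layout described above can be checked from any member with a query such as the one below. This is only a minimal sketch relying on the performance_schema.replication_group_members table; the host names it returns depend on your own setup and are expected to match the mysql1 and mysql3 names used in this post.

```sql
-- List the members of the group with their state and their role
-- (PRIMARY or SECONDARY). Can be run on any member of the group.
SELECT member_host, member_port, member_state, member_role
FROM performance_schema.replication_group_members
ORDER BY member_host;
```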
Example 1 – EVENTUAL
This is the default behavior (group_replication_consistency='EVENTUAL'). The scenario is the following:
- we display the default value of the session variable controlling the Group Replication Consistency on the Primary and on one Secondary
- we lock a table on a Secondary master (mysql3) to block the apply of the transaction coming from the Primary
- we demonstrate that even if we commit a new transaction on mysql1, we can read the table on mysql3 and the new record is missing (the write could not happen due to the lock)
- once unlocked, the transaction is applied and the record is visible on the Secondary master
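The steps above could be reproduced with statements along these lines. This is a sketch under assumptions: the table test.t1 and its note column are hypothetical, and using a READ lock to block the applier is my assumption about how the table is locked.

```sql
-- On mysql1 (Primary) and on mysql3 (Secondary): display the current level
SELECT @@session.group_replication_consistency;

-- On mysql3: hold a lock on t1 so that the applier cannot write to it
-- (the READ lock type is an assumption)
LOCK TABLES test.t1 READ;

-- On mysql1: commit a new transaction; with EVENTUAL it returns right away
-- (test.t1 and its note column are hypothetical)
INSERT INTO test.t1 (note) VALUES ('written while mysql3 is locked');

-- On mysql3, same session as the lock: the read returns immediately,
-- but the new row is not there yet
SELECT * FROM test.t1;

-- On mysql3: release the lock; once the pending transaction is applied,
-- the new row becomes visible
UNLOCK TABLES;
SELECT * FROM test.t1;
```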
Example 2 – BEFORE
In this example, we will illustrate how we can avoid inconsistent reads on a Secondary master:
As you can see, once the session variable controlling the consistency is set, read operations on the table (the server is READ-ONLY) wait for the apply queue to be empty before returning the result set.
We can also see that the timeout for this read operation is very long (8 hours by default) and can be shortened:
SET wait_timeout=10 sets it to 10 seconds.
When the timeout is reached, the following error is returned:
ERROR: 3797: Error while waiting for group transactions commit on group_replication_consistency= 'BEFORE'
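A minimal sketch of this scenario, assuming the same hypothetical test.t1 table, a READ lock held by one session on mysql3, and a second session on mysql3 doing the read:

```sql
-- On mysql3, session A: block the applier (assumed READ lock)
LOCK TABLES test.t1 READ;

-- On mysql1: commit a new row (the note column is hypothetical)
INSERT INTO test.t1 (note) VALUES ('must be visible to consistent reads');

-- On mysql3, session B: ask for consistent reads and shorten the timeout
SET @@session.group_replication_consistency = 'BEFORE';
SET @@session.wait_timeout = 10;

-- This read waits for the apply queue to be empty. If session A still
-- holds its lock after 10 seconds, it fails with error 3797.
SELECT * FROM test.t1;

-- On mysql3, session A: release the lock so the queue can drain
UNLOCK TABLES;
```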
Example 3 – AFTER
It’s also possible to have the commit return on the writer only once all members have applied the change. Let’s see this in action:
This can be considered a synchronous write, as the return from commit happens only when all members have applied it. However, you may also notice that at this consistency level,
wait_timeout has no effect on the write. In fact
wait_timeout only has an effect on read operations when the consistency level is different than
This can lead to several issues if you lock a table for any reason. If the DBA needs to perform maintenance operations that require locking a table for a long time, it’s mandatory not to run queries in BEFORE_AND_AFTER during such maintenance.
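The behavior described in this example could be sketched as follows. The table test.t1, its note column and the READ lock on the Secondary are assumptions used for illustration only.

```sql
-- On mysql3 (Secondary): block the applier (assumed READ lock)
LOCK TABLES test.t1 READ;

-- On mysql1 (Primary): only return from commit once every member
-- has applied the change
SET @@session.group_replication_consistency = 'AFTER';

-- This statement does not return while mysql3 keeps its lock,
-- and wait_timeout does not interrupt it
INSERT INTO test.t1 (note) VALUES ('synchronous write');

-- On mysql3: release the lock; the INSERT on mysql1 then returns
UNLOCK TABLES;
```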
Example 4 – Scope
In the following video, I just want to show you the “scope” of these “waits” for transactions that are in the applying queue.
We will lock t1 again, but this time, on a Secondary master, we will perform a SELECT from table t2: the first time we will keep the default value (EVENTUAL) and the second time we will change the consistency level.
We can see that as soon as there are transactions in the apply queue, if you change the consistency level to BEFORE (or to a level that includes it), the read needs to wait for the previous transactions in the queue to be applied, whether or not those events relate to the same table(s) or record(s). It doesn’t matter.
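A sketch of this scope test, assuming hypothetical tables test.t1 and test.t2 and two sessions on the Secondary (a session holding LOCK TABLES on t1 cannot read t2 itself, so the reads are done from another session):

```sql
-- On mysql3, session A: block the applier on t1 (assumed READ lock)
LOCK TABLES test.t1 READ;

-- On mysql1: write to t1 only; t2 is never touched
-- (the note column is hypothetical)
INSERT INTO test.t1 (note) VALUES ('stuck in the apply queue of mysql3');

-- On mysql3, session B, default level (EVENTUAL): returns immediately
SELECT COUNT(*) FROM test.t2;

-- On mysql3, session B: same read with BEFORE. It now waits for the
-- queued transaction on t1, although the query only reads t2.
SET @@session.group_replication_consistency = 'BEFORE';
SELECT COUNT(*) FROM test.t2;

-- On mysql3, session A: release the lock; the pending read then returns
UNLOCK TABLES;
```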
Example 5 – Observability
Of course it’s possible to check what’s going on and if queries are waiting for something.
When group_replication_consistency is set to BEFORE (or includes it), a transaction that is waiting for the apply queue to be applied can be tracked by running the following query:
SELECT * FROM information_schema.processlist
WHERE state='Executing hook on transaction begin.';
When group_replication_consistency is set to AFTER (or includes it), a transaction that is waiting to be committed on the other members too can be tracked by running the following query:
SELECT * FROM information_schema.processlist
WHERE state='waiting for handler commit';
It’s also possible to get even more information by joining the processlist and InnoDB Trx tables:
SELECT *, TIME_TO_SEC(TIMEDIFF(now(),trx_started)) lock_time_sec
FROM information_schema.innodb_trx JOIN information_schema.processlist
ON processlist.id = innodb_trx.trx_mysql_thread_id
WHERE state='waiting for handler commit' ORDER BY trx_started\G
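To complement the queries above, the size of the apply queue itself can be watched per member. The following is a minimal sketch based on the performance_schema Group Replication tables; it is one possible way to look at the queue, not necessarily the view used elsewhere in this post.

```sql
-- Number of transactions received from the group and still waiting
-- to be applied, for each member
SELECT m.member_host,
       m.member_role,
       s.count_transactions_remote_in_applier_queue AS apply_queue
FROM performance_schema.replication_group_member_stats s
JOIN performance_schema.replication_group_members m USING (member_id)
ORDER BY m.member_host;
```

A value that stays above zero on a Secondary while reads in BEFORE are waiting is a hint that something (a lock, for example) is blocking the applier.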
These consistency levels are a wonderful feature, but they could become dangerous if abused without full control of your environment.
I would avoid setting AFTER (or a level that includes it) globally if you don’t completely control your environment. Table locks, DDLs, logical backups and snapshots could all delay the commits, and transactions could start piling up on the Primary Master. But if you control your environment, you now have complete freedom to choose the consistency you need on your MySQL InnoDB Cluster.
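One way to follow this advice is to leave the global value alone and raise the level only in the session that really needs it. A minimal sketch, again with a hypothetical test.t1 table and note column:

```sql
-- Raise the consistency level for this session only;
-- the global setting stays unchanged
SET @@session.group_replication_consistency = 'AFTER';

-- The write that must be applied everywhere before the commit returns
INSERT INTO test.t1 (note) VALUES ('critical write');

-- Go back to the global value for the rest of the session
SET @@session.group_replication_consistency = DEFAULT;
```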