MySQL InnoDB Cluster – consistency levels

Consistency during reads has been a small concern for adopters of MySQL InnoDB Cluster (see this post and this one).

This is why MySQL now supports (since 8.0.14) a new consistency model to avoid such situations when needed.

Nuno Carvalho and Aníbal Pinto already posted a blog series I highly encourage you to read:

After those great articles, let’s see how it works with some examples.

The environment

This is how the environment is set up:

  • 3 members: mysql1, mysql2 & mysql3
  • the cluster runs in Single-Primary mode
  • mysql1 is the Primary Master
  • some extra sys views are installed

Example 1 – EVENTUAL

This is the default behavior (group_replication_consistency='EVENTUAL'). The scenario is the following:

  • we display the default value of the session variable controlling the Group Replication Consistency on the Primary and on one Secondary
  • we lock a table on a Secondary master (mysql3) to block the applying of the transaction coming from the Primary
  • we demonstrate that even after we commit a new transaction on mysql1, we can read the table on mysql3 and the new record is missing (the write could not happen due to the lock)
  • once the table is unlocked, the transaction is applied and the record is visible on the Secondary master (mysql3) too.
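
The scenario above can be sketched as the following statements. This is only a sketch under assumptions: the table t1 with a single id column and the inserted value are hypothetical, and a read lock taken with LOCK TABLES is assumed as the way the applier is blocked.

```sql
-- On mysql1 (Primary) and on mysql3 (Secondary): show the session value
SELECT @@session.group_replication_consistency;

-- On mysql3: hold a read lock so the applier cannot write to t1
LOCK TABLES t1 READ;

-- On mysql1: commit a new row (hypothetical table t1 with an id column)
INSERT INTO t1 (id) VALUES (100);

-- On mysql3: with EVENTUAL the read returns at once, without the new row
SELECT * FROM t1;

-- On mysql3: release the lock, the queued transaction is then applied
UNLOCK TABLES;
SELECT * FROM t1;
```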

Example 2 – BEFORE

In this example, we will illustrate how we can avoid inconsistent reads on a Secondary master:

As you may have noticed, once the session variable controlling the consistency is set, operations on the table (the server is READ-ONLY) wait for the apply queue to be empty before returning the result set.

We can also see that the wait time (timeout) for this read operation is very long (8 hours by default) and can be changed to a shorter period:

We used SET wait_timeout=10 to set it to 10 seconds.

When the timeout is reached, the following error is returned:

ERROR: 3797: Error while waiting for group transactions commit on group_replication_consistency= 'BEFORE'
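
A minimal sketch of this example, assuming the same hypothetical table t1 is still locked in another session on mysql3 while a transaction from the Primary sits in the apply queue:

```sql
-- On mysql3, in a second session (the first one holds LOCK TABLES t1 READ)
SET SESSION group_replication_consistency = 'BEFORE';

-- Shorten the wait from the default 28800 seconds (8 hours) to 10 seconds
SET SESSION wait_timeout = 10;

-- Waits for the apply queue to drain; if the lock is not released
-- within 10 seconds, the statement fails with error 3797
SELECT * FROM t1;
```

Setting both variables at session level keeps the change limited to the connection that needs the consistent read.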

Example 3 – AFTER

It’s also possible to return from commit on the writer only once all members have applied the change. Let’s see this in action:

This can be considered synchronous writes, as the commit returns only when all members have applied the transaction. However, you may also notice that with this consistency level, wait_timeout has no effect on the write. In fact, wait_timeout only affects read operations when the consistency level is other than EVENTUAL.

This means it can lead to several issues if you lock a table for any reason. If the DBA needs to perform maintenance operations that require locking a table for a long time, it’s mandatory not to run queries in AFTER or BEFORE_AND_AFTER during that maintenance.
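
The AFTER behavior can be sketched as follows. As before, the table t1, its id column and the inserted value are hypothetical, and the lock on the Secondary is assumed to be a plain read lock.

```sql
-- Session on mysql3 (Secondary): hold a read lock so the applier is blocked
LOCK TABLES t1 READ;

-- Session on mysql1 (Primary): ask for AFTER consistency, then write
SET SESSION group_replication_consistency = 'AFTER';
SET SESSION wait_timeout = 10;       -- no effect on the write below
INSERT INTO t1 (id) VALUES (101);    -- does not return while mysql3 holds the lock

-- Session on mysql3: release the lock, the INSERT on mysql1 can now return
UNLOCK TABLES;
```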

Example 4 – Scope

In the following video, I just want to show you the “scope” of these “waits” for transactions that are in the applying queue.

We will lock t1 again on a Secondary master, then perform a SELECT from table t2. The first time we keep the default value of group_replication_consistency (EVENTUAL), and the second time we change the consistency level to BEFORE:

We can see that as soon as there are transactions in the apply queue, a session whose consistency level is BEFORE has to wait for the previous transactions in the queue to be applied, whether or not those events relate to the same table(s) or record(s). It doesn’t matter.
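
A sketch of this scope test, assuming hypothetical tables t1 and t2. Two sessions are used on mysql3 because a session that holds LOCK TABLES can only access the tables it has locked.

```sql
-- mysql3, session A: block the applier on t1
LOCK TABLES t1 READ;

-- mysql1 (Primary): queue a change that touches t1 only
INSERT INTO t1 (id) VALUES (102);

-- mysql3, session B: default consistency, returns immediately
SET SESSION group_replication_consistency = 'EVENTUAL';
SELECT * FROM t2;

-- mysql3, session B: BEFORE waits for the whole apply queue,
-- even though the pending transaction does not touch t2
SET SESSION group_replication_consistency = 'BEFORE';
SELECT * FROM t2;

-- mysql3, session A: release the lock to let the applier catch up
UNLOCK TABLES;
```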

Example 5 – Observability

Of course it’s possible to check what’s going on and if queries are waiting for something.

BEFORE

When group_replication_consistency is set to BEFORE (or includes it), while a transaction is waiting for the applying queue to be committed, it’s possible to track those waiting transactions by running the following query:

SELECT * FROM information_schema.processlist 
WHERE state='Executing hook on transaction begin.';

AFTER

When group_replication_consistency is set to AFTER (or includes it), while a transaction is waiting for the transaction to be committed on the other members too, it’s possible to track those waiting transactions by running the following query:

SELECT * FROM information_schema.processlist 
WHERE state='waiting for handler commit';

It’s also possible to get even more information by joining the processlist and InnoDB Trx tables:

SELECT *, TIME_TO_SEC(TIMEDIFF(now(),trx_started)) lock_time_sec 
FROM information_schema.innodb_trx JOIN information_schema.processlist
ON processlist.ID=innodb_trx.trx_mysql_thread_id
WHERE state='waiting for handler commit' ORDER BY trx_started\G
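
On more recent servers the same check can be written against performance_schema.processlist, which is available from MySQL 8.0.22 (so not on the 8.0.14 release that introduced the consistency levels). This is a sketch that covers both wait states in one query:

```sql
-- Sessions waiting because of BEFORE or AFTER consistency
SELECT id, user, host, db, command, time, state, info
FROM performance_schema.processlist
WHERE state IN ('Executing hook on transaction begin.',
                'waiting for handler commit');
```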

Conclusion

This consistency level is a wonderful feature but it could become dangerous if abused without full control of your environment.

I would avoid setting anything to AFTER globally if you don’t fully control your environment. Table locks, DDLs, logical backups and snapshots could all delay the commits, and transactions could start piling up on the Primary Master. But if you do control your environment, you now have complete freedom to choose the consistency you need on your MySQL InnoDB Cluster.
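
One way to follow this advice, sketched here with a hypothetical table t1, is to leave the global value alone and raise the level only in the session that needs it:

```sql
-- The global value stays at its default
SELECT @@global.group_replication_consistency;

-- Raise the level for this session only
SET SESSION group_replication_consistency = 'AFTER';
INSERT INTO t1 (id) VALUES (103);   -- hypothetical write that must be applied everywhere

-- Go back to the global value for the rest of the session
SET SESSION group_replication_consistency = DEFAULT;
```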
