MySQL InnoDB Cluster – how to manage a split-brain situation

Everywhere I go to present MySQL InnoDB Cluster, during the demo of creating a cluster, many people don't understand why, when I have 2 members, my cluster is not yet tolerant to any failure.

Indeed when you create a MySQL InnoDB Cluster, as soon as you have added your second instance, you can see in the status:

    "status": "OK_NO_TOLERANCE",      
"statusText": "Cluster is NOT tolerant to any failures.",

Quorum

Why is that? It's because, to be part of the primary partition (the partition that holds the service, the one having a Primary-Master in Single Primary Mode, the default mode), a partition must hold a majority of the nodes (quorum). In MySQL InnoDB Cluster (and many other cluster solutions), to achieve quorum, the number of members in a partition must be greater than 50% of the total.

So when we have 2 nodes, if there is a network issue between the two servers, the cluster will be split into 2 partitions. And each of them will have 50% of the total number of members (1 of 2). Is 50% > 50%? No! That's why neither partition will reach quorum and, in the case of MySQL InnoDB Cluster, neither will allow queries.
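To make the majority rule concrete, here is a small Python sketch of the check (illustrative only; Group Replication performs this computation internally):

```python
# Quorum rule from the text: a partition holds quorum only when it contains
# strictly more than 50% of the group's total membership.

def has_quorum(partition_size: int, total_members: int) -> bool:
    """True when the partition holds a strict majority of the group."""
    return partition_size > total_members / 2

# Two-node cluster split in half: 1/2 = 50%, and 50% is not > 50%.
print(has_quorum(1, 2))  # False -- neither half may serve writes
# Three-node cluster with one node isolated: 2/3 = 66% > 50%.
print(has_quorum(2, 3))  # True -- the remaining pair keeps the service running
```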

Indeed, the first machine will see that it cannot reach the second machine anymore... but why? Did the second machine die? Does the first one have network interface issues? We don't know, so we cannot decide.

Let’s take a look at this cluster of 3 members (3/3 = 100%):


If we take a look at the cluster.status() output, we can see that with 3 nodes we can tolerate one failure:

    "status": "OK",      
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
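That tolerance figure follows directly from the majority rule: with n members, the smallest possible majority is n // 2 + 1, so the group survives losing n minus that many members. A quick sketch (illustrative only, not part of the adminAPI):

```python
def tolerated_failures(total_members: int) -> int:
    """Failures a group can absorb while a strict majority still survives."""
    smallest_majority = total_members // 2 + 1
    return total_members - smallest_majority

for n in (2, 3, 5, 7):
    print(f"{n} members tolerate {tolerated_failures(n)} failure(s)")
# 2 members tolerate 0 failures, 3 tolerate 1, 5 tolerate 2, 7 tolerate 3.
```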

Now let’s imagine we have a network issue that will isolate one of the members:

We can see in the cluster.status() output that the node is missing:

Our cluster will still be able to serve transactions as one partition still has quorum (2/3 = 66%, which is bigger than 50%).

        "mysql6:3306": {
"address": "mysql6:3306",
"mode": "n/a",
"readReplicas": {},
"role": "HA",
"status": "(MISSING)"
}

There is a very important concept I want to cover, as it is not always obvious: the cluster is seen differently by InnoDB Cluster and by Group Replication. Indeed, InnoDB Cluster relies on metadata created by the DBA using the MySQL Shell. That metadata describes how the cluster has been set up. Group Replication sees the cluster differently: it sees it as it was the last time it checked and as it is right now... and updates that view. This is commonly called the view of the world.

So in the example above, InnoDB Cluster sees 3 nodes: 2 online and 1 missing. For Group Replication, for a short moment, the partitioned node was UNREACHABLE and, a few seconds later, after being ejected from the Group by the majority (so only if there is still a majority), the node is no longer part of the cluster. The Group size is now 2 of 2 (2/2, not 2/3). This information is exposed via performance_schema.replication_group_members.

If our network issue had been more serious and had split our cluster in 3 like the picture below, the cluster would be “offline”, as none of the 3 partitions would have reached quorum (majority): 1/3 = 33% (<50%):

In this case the MySQL service won’t work properly until a human fixes the situation.

Fixing the situation

When there is no longer a primary partition in the cluster (as in the example above), the DBA needs to restore the service. And as usual, there is already some useful information in the MySQL error log:

2019-04-10T13:34:09.051391Z 0 [Warning] [MY-011493] [Repl] Plugin group_replication 
reported: 'Member with address mysql4:3306 has become unreachable.'
2019-04-10T13:34:09.065598Z 0 [Warning] [MY-011493] [Repl] Plugin group_replication
reported: 'Member with address mysql5:3306 has become unreachable.'
2019-04-10T13:34:09.065615Z 0 [ERROR] [MY-011495] [Repl] Plugin group_replication
reported: 'This server is not able to reach a majority of members in
the group. This server will now block all updates. The server will
remain blocked until contact with the majority is restored. It is
possible to use group_replication_force_members to force a new group
membership.'

From the message, we can see that this is exactly the situation we are describing here. We can see in the cluster.status() output that the cluster is “blocked”:

    "status": "NO_QUORUM",      
"statusText": "Cluster has no quorum as visible from 'mysql4:3306'
and cannot process write transactions.
2 members are not active",

We have two solutions to fix the problem:

  1. using SQL and Group Replication variables
  2. using the MySQL Shell’s adminAPI

Fixing using SQL and Group Replication variables

This process is explained in the manual (Group Replication: Network Partitioning).

On the node the DBA wants to use to restore the service, if there is only one node left, we can set the global variable group_replication_force_members to the GCS address of the server, which you can find in group_replication_local_address (if there are multiple servers online but not reaching the majority, all of them should be added to this variable):

SQL set global group_replication_force_members=@@group_replication_local_address;

Be careful: the best practice is to shut down the other nodes to avoid any kind of conflict if they reappear during the process of forcing quorum.

And the cluster will be again available. We can see in the error log that the situation has been resolved:

2019-04-10T14:41:15.232078Z 0 [Warning] [MY-011498] [Repl] Plugin group_replication 
reported:
'The member has resumed contact with a majority of the members in the group.
Regular operation is restored and transactions are unblocked.'

Don’t forget to remove the value of group_replication_force_members when you are back online:

SQL set global group_replication_force_members='';

When the network issues are resolved, the nodes will try to reconnect, but as we forced the membership, those nodes will be rejected. You will need to rejoin them to the Group by:

  • restarting mysqld
  • or restarting Group Replication again (stop group_replication; start group_replication)
  • or using MySQL Shell (cluster.rejoinInstance())

Using the MySQL Shell’s adminAPI

The other option is to use the adminAPI from the MySQL Shell. This is of course the preferred option! With the adminAPI you don't even need to know the port used for GCS to restore the quorum.

In the example below, we will use the server called mysql4 to re-activate our cluster:

JS cluster.forceQuorumUsingPartitionOf('clusteradmin@mysql4') 

And when the network issues are resolved, the Shell can also be used to rejoin the other instances (in this case mysql6):

JS cluster.rejoinInstance('clusteradmin@mysql6')

Conclusion

When for any reason you have lost quorum on your MySQL InnoDB Cluster, don't panic. Choose the node (or the list of nodes that can still communicate with each other) you want to keep and, if possible, shut down or stop mysqld on the other ones. Then MySQL Shell is again your friend: use the adminAPI to force the quorum and reactivate your cluster in one single command!

Bonus

If you want to know if your MySQL server is part of the primary partition (the one having the majority), you can run this command:

mysql> SELECT IF( MEMBER_STATE='ONLINE' AND ((
 SELECT COUNT(*) FROM performance_schema.replication_group_members 
 WHERE MEMBER_STATE NOT IN ('ONLINE', 'RECOVERING')) >= 
 ((SELECT COUNT(*) 
   FROM performance_schema.replication_group_members)/2) = 0), 'YES', 'NO' ) 
`in primary partition`
 FROM performance_schema.replication_group_members 
 JOIN performance_schema.replication_group_member_stats 
 USING(member_id) where member_id=@@global.server_uuid;
+----------------------+
| in primary partition |
+----------------------+
| NO                   |
+----------------------+
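The logic of the query above can be paraphrased as: a member is in the primary partition when its own state is ONLINE and fewer than half of the members it knows about are outside the ONLINE/RECOVERING states. A Python sketch of the same check (illustrative only, not part of the original query):

```python
def in_primary_partition(my_state: str, member_states: list) -> bool:
    """Mirror of the SQL check: ONLINE member, and the unhealthy members
    (not ONLINE/RECOVERING) are fewer than half of the known membership."""
    unhealthy = sum(s not in ('ONLINE', 'RECOVERING') for s in member_states)
    return my_state == 'ONLINE' and unhealthy < len(member_states) / 2

# The blocked node above still lists 3 members but sees 2 as UNREACHABLE,
# so the check answers NO:
print(in_primary_partition('ONLINE', ['ONLINE', 'UNREACHABLE', 'UNREACHABLE']))  # False
# A node in the majority partition sees only one peer missing:
print(in_primary_partition('ONLINE', ['ONLINE', 'ONLINE', 'UNREACHABLE']))  # True
```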

Or by using this addition to sys schema: addition_to_sys_GR.sql

SQL select gr_member_in_primary_partition();
+----------------------------------+
| gr_member_in_primary_partition() |
+----------------------------------+
| YES |
+----------------------------------+
1 row in set (0.0288 sec)


18 Comments

  1. […] We saw yesterday that when an issue (like network splitting), it’s possible to remain with a partitioned cluster where none of the partition have quorum (majority of members). For more info read how to manage a split-brain situation. […]


  3. Hello Lefred

    In which cases does a split-brain situation occur?

    • Hi Ali,

      Most of the time a split-brain situation happens when a network problem occurs and none of the nodes can see a majority.

      This is why running with a 2-node cluster is problematic. If N1 cannot see N2, neither of them reaches the majority and they cannot decide which “partition” can keep the leadership and run the cluster. In that case, each node will ask “Hey, I cannot see my buddy… Do I have a problem or does he??”

      Cheers,

      • Hello Lefred

        I am having a strange error, I can’t run the
        cluster.forceQuorumPartitionOf

        It is failing with error run time error 1231

        • What’s the error message ? Which version of MySQL are you using ?

          • It’s 5.7.31 Version,

            Error while running
            cluster.forceQuorumPartitionOf(“admin@db01:3306”)

            Restoring failed with error 1231 Runtime Error.
            =========

            Mysql log file contain below information.

            2021-01-26T12:45:11.499896Z 4202400 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after setting group_replication_force_members value 'sihoprdochdb01:33061' into group communication interfaces'
            2021-01-26T12:49:19.353997Z 4203575 [Note] Unknown error 1156
            2021-01-26T12:49:58.768086Z 4203729 [Note] Plugin group_replication reported: '[GCS] Checking members to force a new configuration.'
            2021-01-26T12:49:58.768127Z 4203729 [Note] Plugin group_replication reported: '[GCS] ::modify_configuration():: Node[0]=sihoprdochdb01:33061 16106355403399440'
            2021-01-26T12:49:58.768333Z 4203729 [Note] Plugin group_replication reported: 'The group_replication_force_members value 'sihoprdochdb01:33061' was set in the group communication interfaces'
            2021-01-26T12:50:03.552140Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
            2021-01-26T12:50:03.552367Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
            2021-01-26T12:50:03.555441Z 4191854 [Note]
            2021-01-26T12:50:03.558999Z 4191860 [Note]
            2021-01-26T12:50:58.768438Z 4203729 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after setting group_replication_force_members value 'sihoprdochdb01:33061' into group communication interfaces'

  4. We’ve 3 Node IDC

    2 nodes were showing “UNREACHABLE”
    1 node was showing “R/O”

    I stopped the 2 nodes and ran the command “forceQuorumPartitionOf”, which failed.

    • Hi Ali,
      The cluster object needs to be retrieved from the remaining node:
      n1 – unreachable
      n2 – unreachable
      n3 – r/o
      1) stop gr on n1 and n2
      2) connect to n3
      3) cluster = dba.getCluster()
      4) cluster.forceQuorum…(n3)

  5. Thank you ladies

    Please send me the steps involve here.

    • Ladies ?
      The steps are clear, you just need to connect to the remaining node to force quorum.

      • My mistake sorry for that.

        I mean stop group replication on N1 N2
        Force quorum on N3
        Join back N1 N2

        • if n1 and n2 are unreachable, you won’t be able to stop GR but if you can still connect to them, just run ‘stop group_replication’. on N3, do: cluster=dba.getCluster() and cluster.forceQuorumUsingPartitionOf('clusteradmin@N3')

          • [mysql@sihoprdochdb01 SQL-LOG]$ mysqlsh cladmin@sihoprdochdb01:3306 -- cluster status
            WARNING: Cluster has no quorum and cannot process write transactions: Group has no quorum
            {
              "clusterName": "prodCluster",
              "defaultReplicaSet": {
                "name": "default",
                "primary": "sihoprdochdb01:3306",
                "ssl": "DISABLED",
                "status": "NO_QUORUM",
                "statusText": "Cluster has no quorum as visible from 'sihoprdochdb01:3306' and cannot process write transactions. 2 members are not active",
                "topology": {
                  "sihoprdochdb01:3306": {
                    "address": "sihoprdochdb01:3306",
                    "mode": "R/O",
                    "readReplicas": {},
                    "role": "HA",
                    "status": "ONLINE"
                  },
                  "sihoprdochdb02:3306": {
                    "address": "sihoprdochdb02:3306",
                    "mode": "R/O",
                    "readReplicas": {},
                    "role": "HA",
                    "status": "UNREACHABLE"
                  },
                  "sihoprdochdb03:3306": {
                    "address": "sihoprdochdb03:3306",
                    "mode": "R/O",
                    "readReplicas": {},
                    "role": "HA",
                    "status": "UNREACHABLE"
                  }
                },
                "topologyMode": "Single-Primary"
              },
              "groupInformationSourceMember": "sihoprdochdb01:3306"
            }
            [mysql@sihoprdochdb01 SQL-LOG]$

          • and if you do cluster.forceQuorumUsingPartitionOf('cladmin@sihoprdochdb01')

  6. Hi Lefred

    The issue was resolved after restarting the MySQL daemon on the 3 nodes
    and starting the cluster from the major outage.

    Thanks for your quick reply,

    Need to understand why this issue happened, as we had killed some sessions in the database using a Router connection from Workbench.

  7. Hi Lefred,
    What happens with an application that connects via Router to the primary node when the primary node fails? Does, for example, node 2 become primary, and does the Router automatically transfer the connection to another server despite the connection string?

    • In case of failover, new connections will be routed to the new Primary. Existing connections will be closed and the application will have to reconnect. The running trx or sessions won’t be replayed; the application needs to handle that and, in case of failure, reconnect and replay the full trx that was not yet committed.
