Setup DR for your MySQL InnoDB Cluster

From MySQL 8.0.27, the only way to create DR for a MySQL InnoDB Cluster is to use MySQL InnoDB ClusterSet!
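
For reference, here is a minimal MySQL Shell (8.0.27+) sketch of that approach, using hypothetical hostnames dc1-node1 (a member of the existing cluster) and dc2-node1 (a spare instance on the DR site):

\connect clusteradmin@dc1-node1:3306
var cluster = dba.getCluster();
// promote the existing cluster to a ClusterSet, then add a replica cluster on the DR site
var cs = cluster.createClusterSet('myClusterSet');
var dr = cs.createReplicaCluster('dc2-node1:3306', 'drCluster', {recoveryMethod: 'clone'});
cs.status({extended: 1});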

18 Comments

  1. Thanks, this was a lot of help.

    If anyone finds this and is a WINDOWS user who has an InnoDB Cluster set up in one location and would like a disaster recovery copy at another location/site, then here is what I did.

    First, you need to choose one node in the live cluster to connect to.
    I chose node B5. You need to forward the port in your router to allow TCP traffic.
    I set up external port 9500 to :3306

    The reason for this is the Master_Log_file changes name from node to node. On one node it is called SQL-Cluster-B5-bin.00032 but on another node it is called SQL-Cluster-B1-bin.00032.
    That just doesn’t work if you point the DR node at port 6446, hoping the cluster’s Router will send the DR node to the next node (B1) if the one you are connected to (B5) goes down.
    You will get the wrong BIN file.
    So this DR is based on the availability of a NODE in your Cluster, not the entire cluster. But I will work on this, so check if there are other posts below.

    Anyway, here we go….

    In your LIVE cluster, log into B5 and run this in Workbench:

    CREATE USER 'home123'@'%' IDENTIFIED BY 'password123' REQUIRE SSL;
    GRANT REPLICATION SLAVE, BACKUP_ADMIN, CLONE_ADMIN ON *.* TO 'home123'@'%';
    GRANT SELECT ON performance_schema.* TO 'home123'@'%';
    INSTALL PLUGIN clone SONAME 'mysql_clone.dll';
    flush privileges;

    On the DR node, run Workbench and update this setting (I found it timed out otherwise):
    Edit → Preferences → SQL Editor → DBMS connection read timeout (in seconds): 600
    I changed the value to 6000.

    then run this in Workbench:
    SET PERSIST_ONLY gtid_mode=on;
    SET PERSIST_ONLY enforce_gtid_consistency=true;
    INSTALL PLUGIN clone SONAME 'mysql_clone.dll';
    CREATE USER 'home123'@'%' IDENTIFIED BY 'password123' REQUIRE SSL;
    GRANT REPLICATION SLAVE, BACKUP_ADMIN, CLONE_ADMIN ON *.* TO 'home123'@'%';
    GRANT SELECT ON performance_schema.* TO 'home123'@'%';
    RESTART;

    then run this (still on the DR node)
    (Note : : is just the WAN IP of the LIVE node you are using EG. 111.222.333.444:9500 )

    flush privileges;
    SET GLOBAL clone_valid_donor_list=':';
    CLONE INSTANCE FROM home123@: IDENTIFIED BY 'password123';

    The clone will now start. You can check how it is going by looking at the size, in the Properties window, of your "C:\ProgramData\MySQL\MySQL Server 8.0\Data" folder.
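
    If you prefer checking from SQL, the clone plugin also reports its progress in performance_schema; for example, from another session on the DR node:

    SELECT STAGE, STATE, ESTIMATE, DATA FROM performance_schema.clone_progress;
    SELECT STATE, ERROR_NO, ERROR_MESSAGE FROM performance_schema.clone_status;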

    once done run this (still on the DR node)
    CHANGE REPLICATION SOURCE TO source_host='',
    source_port=, source_user='home123', source_password='password123',
    source_auto_position=1, source_ssl=1, source_retry_count=10000,
    source_connect_retry=10, source_connection_auto_failover=1;
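
    One more step that you will need (assuming the default replication channel here): start the replica.
    START REPLICA;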

    That is it.
    Run a SELECT COUNT(*) on a table to check the DR is running:
    select count(*) from table1

    Also, SHOW REPLICA STATUS will give you all the details.

    If you shut down the live node and then restart it, you will see the Cluster recover the node and then the DR copy get updated.

    Anyway thanks
    Damon

    • Just noticed that some details have been removed by the formatting.
      Please replace the last bit with this:


      then run this (still on the DR node)
      (Note: You need the WAN IP of the LIVE server and the port you are forwarding from,
      e.g. 201.222.33.99:9500, so in the below change it to your WAN IP – pretty sure you could use a hostname like http://www.acme.com.au as well.)

      flush privileges;
      SET GLOBAL clone_valid_donor_list='201.222.33.99:9500';
      CLONE INSTANCE FROM 'home123'@'201.222.33.99':9500 IDENTIFIED BY 'password123';

      The clone will now start. You can check how it is going by looking at the size, in the Properties window, of your "C:\ProgramData\MySQL\MySQL Server 8.0\Data" folder.

      once done run this (still on the DR node)
      CHANGE REPLICATION SOURCE TO source_host='201.222.33.99',
      source_port=9500, source_user='home123', source_password='password123',
      source_auto_position=1, source_ssl=1, source_retry_count=10000,
      source_connect_retry=10, source_connection_auto_failover=1;

    • Hi Damon,

      When using GTID (and AUTO_POSITION), the fact that the binary logs change name or position from node to node is not a problem at all. Also, with the feature presented in this post, doing replication over MySQL Router is not needed anymore.
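
      For example, you can register the cluster members as potential sources and let the replica fail over between them automatically. A minimal sketch, with hypothetical hostnames nodeB5 and nodeB1, run on the DR node while the channel is stopped:

      SELECT asynchronous_connection_failover_add_source('', 'nodeB5', 3306, '', 80);
      SELECT asynchronous_connection_failover_add_source('', 'nodeB1', 3306, '', 60);
      CHANGE REPLICATION SOURCE TO source_auto_position=1, source_connection_auto_failover=1;
      START REPLICA;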

      Thank you for reading my blog.

      Cheers,
      lefred

  2. Hey lefred,

    If we want to promote the DR to production in case of a DC crash where the InnoDB Cluster is totally down, does the application still need to change its connection string to point to the MySQL DR? You mention it is easily upgradable to InnoDB Cluster.
    My questions are:

    1) Can we promote it to master like in a traditional master-slave procedure? Once the InnoDB Cluster DC is available again, we need to backup and restore the DR to the InnoDB Cluster, right? And reset the replication channel.

    2) If it can be upgraded to an InnoDB Cluster, do we need to install MySQL Router on the DR? Or will the existing Router do?

    Thanks,
    Daniel

    • Looks like I found a way to perform DR recovery back to the HQ cluster:

      a) After activity is done at DR, convert it to InnoDB Cluster and create a tempCluster
      b) Bring up one DB at HQ after troubleshooting, dissolve the cluster.
      c) At DR, join the HQ DB instance (clone)
      d) After sync, promote the HQ DB as primary, remove the DR instance
      e) Dissolve the cluster (check the innodb_cluster_metadata tables (instances and clusters) and delete the unnecessary old cluster_id)
      f) Create a new cluster, add Node2 and Node3 by cloning
      g) Re-bootstrap MySQL Router (a rough sketch of these steps follows below)
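
      A rough MySQL Shell sketch of steps a) and f), assuming hypothetical hosts dr-node, hq-node1, hq-node2 and hq-node3:

      // step a) on the standalone DR instance, after the HQ outage
      \connect clusteradmin@dr-node:3306
      var tempCluster = dba.createCluster('tempCluster');

      // step f) once HQ is cleaned up, recreate the HQ cluster and clone the other nodes into it
      \connect clusteradmin@hq-node1:3306
      var hq = dba.createCluster('hqCluster');
      hq.addInstance('hq-node2:3306', {recoveryMethod: 'clone'});
      hq.addInstance('hq-node3:3306', {recoveryMethod: 'clone'});
      // step g) is then done from the OS shell, e.g. mysqlrouter --bootstrap clusteradmin@hq-node1:3306 --force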

      any thoughts?

  3. Hello:

    I have the same question as Daniel. If there is a total failure in the main data center and I am going to use my single instance in the alternate data center, we operate without problem… but then, when the main data center is OK again, how do I “bounce back”, considering that the proxy has already been receiving data… and it will be inconsistent data if I only point my applications back to the InnoDB Cluster in the main data center.

  4. Hi lefred,

    You have mentioned that replicating from one InnoDB Cluster to another is not supported and will not work. But we are already in the process of testing this type of DR solution (async master-master replication between two InnoDB Clusters), and it is actually working for me as of now.

    So, can you please explain what is not going to work here, or do I need to test some specific scenario to trigger an issue in this DR solution? It would be great if you could tell me that scenario. Meanwhile, I am sharing the details of our setup below.

    1. Set up two InnoDB Clusters, one in DC1 and one in DC2 (with three nodes in each DC).
    2. Set up async replication from DC1 > DC2 and DC2 > DC1 (async master-master replication).
    3. There is also HAProxy on top of both DCs, so when DC1 goes down, HAProxy will route the traffic to DC2 (yes, the app will point to one DC at a time).

    What did we follow to set up the async replication?

    – We started the InnoDB Cluster on DC1.
    – Copied data from DC1 to DC2 (onto a single MySQL node on DC2; the InnoDB Cluster is not set up on DC2 at this point) using mysqlbackup (with skip-gtid-purged=off, to set gtid_purged), and started async replication (DC1 > DC2, GTID auto-position set to ON).
    – After we verified successful execution of the DC1 > DC2 replication, we stopped it and performed RESET SLAVE to remove the replication setup (because the next step is to create the InnoDB Cluster, and var c=dba.createCluster() will generate an error if it finds a replication filter on the underlying MySQL node).
    – We created the DC2 InnoDB Cluster and added the other two nodes with the clone recovery option.
    – After successful InnoDB Cluster creation on DC2, we started the DC1 > DC2 async replication again on the primary node using auto-position=1, and it is working fine.
    – Lastly, we set up DC2 > DC1, also using auto-position=1, and it is also working fine.
    – The async replication threads connect via Router for both DC1 > DC2 and DC2 > DC1.

    It would be really great if you could point us in the right direction.
    Thanks.

    • Hi Shubham,

      Master-Master replication is not something I would ever recommend with Asynchronous Replication in between.

      Also, what happens when you change the Primary in DC2, for example?

      The metadata will be a problem. You are lucky that, for the moment, you are not relying on the metadata for routing either, but with MySQL Router you would, of course.

  5. Hi Lefred

    How do we set up a high availability solution using InnoDB Cluster? Can you point me to any of your blog posts about it?

    Thank you
    Charandeep

