Deploying High Availability and Disaster Recovery MySQL on OCI like a devops

We all know MySQL InnoDB ClusterSet, the solution that asynchronously links multiple InnoDB Clusters and Read Replicas so you can easily build complex MySQL architectures and manage them without burdensome commands, all thanks to MySQL Shell's AdminAPI.

This is an example of MySQL InnoDB ClusterSet using two data centers:

Let’s explore how we can automate the deployment of a similar architecture on OCI across two different regions.

We will connect us-ashburn to eu-frankfurt:

Our Primary Data Center will be Ashburn, and the Disaster Recovery (Secondary) one will be Frankfurt.

To automate deployments in OCI, we will use an infrastructure provisioning tool and a configuration management system.

This is an overview of the final deployment, the resources in OCI, and the roles of the servers after configuration:

Terraform

To deploy all the required resources in OCI, we will use Terraform (or OpenTofu, if you prefer).

Terraform will provision our infrastructure; this is the first step of our deployment.

The code is published on this GitHub repository: https://github.com/lefred/oci-mysql-architectures

Puppet

Once all the resources are created, we need to configure the compute instances. This means installing MySQL and MySQL Shell, and then using the AdminAPI to configure both InnoDB Clusters, the InnoDB ClusterSet, the Read Replica, and the MySQL Routers.

All these operations are automated using Puppet.

The configuration code is also available on GitHub: https://github.com/lefred/oci-puppet-demo

Deployment

In our deployment, we will use DNS names for the servers when configuring the different solutions with the AdminAPI, since we don't know in advance which IP addresses will be assigned to our machines, but we do control their names.
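In OCI, each compute instance gets a predictable private FQDN built from its hostname and the DNS labels of its subnet and VCN. These are the names we will feed to the AdminAPI, for example:

<hostname>.<subnet-dns-label>.<vcn-dns-label>.oraclevcn.com
ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com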

Therefore, to work around some limitations related to the DNS between different regions, we will need to apply the architecture in multiple steps.

Step 1

As the first step, after creating the terraform.tfvars file by copying terraform.tfvars.template (please edit the new file with your information), we deploy all the resources related to networking (VCNs, subnets, security lists, …) and the Puppet server:
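The preparation itself is only a couple of commands (a sketch, assuming your OCI provider credentials are already set up; the apply itself is staged, as explained below):

cp terraform.tfvars.template terraform.tfvars
vi terraform.tfvars      # fill in tenancy, user, compartment, key paths, ...
terraform init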

To deploy only the Puppet server, we modify the top of the variables.tf file to look like this:

variable "puppet_server_count" {
  description = "Number of Puppet Server instances to create"
  type        = number
  default     = 1
}

variable "region_instance_map" {
  type = map(number)
  description = "Map of region to number of compute instances"
  default = {
    source = 0,
    replica = 0
  }
}

variable "region_router_map" {
  type = map(number)
  description = "Map of region to number of compute router instances"
  default = {
    source = 0,
    replica = 0
  }
}

variable "region_map" {
  type = map(string)
  default = {
    source = "us-ashburn-1"
    replica = "eu-frankfurt-1"
  }
}

As you can see, the number of instances and routers in each region is set to 0.

We need to run terraform apply in three rounds because the DNS forwarders cannot be created immediately; once they exist, another apply is required to add the forwarding rules (we need a rule so that each region can resolve names from the other).

These are the errors you will see:

During the first terraform apply:

╷
│ Error: Missing required argument
│ 
│   with oci_dns_resolver.mysqlvcn1,
│   on dns.tf line 40, in resource "oci_dns_resolver" "mysqlvcn1":
│   40:  resolver_id = data.oci_core_vcn_dns_resolver_association.mysqlvcn1.dns_resolver_id
│ 
│ The argument "resolver_id" is required, but no definition was found.
╵
╷
│ Error: Missing required argument
│ 
│   with oci_dns_resolver.mysqlvcn2,
│   on dns.tf line 134, in resource "oci_dns_resolver" "mysqlvcn2":
│  134:  resolver_id = data.oci_core_vcn_dns_resolver_association.mysqlvcn2.dns_resolver_id
│ 
│ The argument "resolver_id" is required, but no definition was found.

And if you try to create the resolvers with the rules defined, you will see:

╷
│ Error: 400-InvalidParameter, Source endpoint name does not match any isForwarding resolver endpoint name
│ Suggestion: Please update the parameter(s) in the Terraform config as per error message Source endpoint name does not match any isForwarding resolver endpoint name
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/dns_resolver 
│ API Reference: https://docs.oracle.com/iaas/api/#/en/dns/20180115/Resolver/UpdateResolver 
│ Request Target: PUT https://dns.us-ashburn-1.oci.oraclecloud.com/20180115/resolvers/ocid1.dnsresolver.oc1.iad.xxxx?scope=PRIVATE 
│ Provider version: 7.8.0, released on 2025-07-07.  
│ Service: Dns Resolver 
│ Operation Name: UpdateResolver 
│ OPC request ID: xxx 
│ 
│ 
│   with oci_dns_resolver.mysqlvcn1,
│   on dns.tf line 30, in resource "oci_dns_resolver" "mysqlvcn1":
│   30: resource "oci_dns_resolver" "mysqlvcn1" {
│ 
╵
╷
│ Error: 400-InvalidParameter, Source endpoint name does not match any isForwarding resolver endpoint name
│ Suggestion: Please update the parameter(s) in the Terraform config as per error message Source endpoint name does not match any isForwarding resolver endpoint name
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/dns_resolver 
│ API Reference: https://docs.oracle.com/iaas/api/#/en/dns/20180115/Resolver/UpdateResolver 
│ Request Target: PUT https://dns.eu-frankfurt-1.oci.oraclecloud.com/20180115/resolvers/ocid1.dnsresolver.oc1.eu-frankfurt-1.xxxx?scope=PRIVATE 
│ Provider version: 7.8.0, released on 2025-07-07.  
│ Service: Dns Resolver 
│ Operation Name: UpdateResolver 
│ OPC request ID: xxxx
│ 
│ 
│   with oci_dns_resolver.mysqlvcn2,
│   on dns.tf line 125, in resource "oci_dns_resolver" "mysqlvcn2":
│  125: resource "oci_dns_resolver" "mysqlvcn2" {
│ 
╵
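By hand, the staged deployment could look roughly like the sketch below. Except for the two resolver resources named in the errors above, the -target addresses are assumptions:

# Round 1: networking and the Puppet server only; the private DNS resolver
# associations do not exist yet, so the resolver resources are skipped
# (hypothetical target names).
terraform apply -target=oci_core_vcn.mysqlvcn1 -target=oci_core_vcn.mysqlvcn2 \
                -target=oci_core_instance.puppet_server

# Round 2: the resolver associations now exist, so the resolvers and their
# forwarding/listening endpoints can be created, still without cross-region rules.
terraform apply -target=oci_dns_resolver.mysqlvcn1 -target=oci_dns_resolver.mysqlvcn2

# Round 3: a full apply finally attaches the forwarding rules between the regions.
terraform apply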

Fortunately, I’ve automated this process in the deploy.sh file.

When the deployment is done, we get the following output with only the public IP of our Puppet server:

In the OCI console, we can also see that our new compute instance is deployed:

Please note that you will need to manually delete the forwarding rule in each forwarder before running terraform destroy:

Step 2

Now we can deploy multiple compute instances that will act as members of the different MySQL InnoDB Clusters and as MySQL Routers.

Since OCI deploys compute instances quickly and in parallel, it’s safer to define which instance is the Primary (the seed). Otherwise, several instances could each try to create their own cluster.

In Puppet, in the Hiera file of each member of the Primary Cluster (in our case, the instances in Ashburn), we add this line:

innodbcluster::magic::seed_node: ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com



We also specify in our variables.tf file the number of compute instances we want to deploy:

variable "region_instance_map" {
  type = map(number)
  description = "Map of region to number of compute instances"
  default = {
    source = 3,
    replica = 4
  }
}

variable "region_router_map" {
  type = map(number)
  description = "Map of region to number of compute router instances"
  default = {
    source = 2,
    replica = 1
  }
}

That makes 10 new compute instances to deploy, matching the illustrated architecture.

We then execute terraform apply, and after that, we let Puppet do its magic for a couple of minutes.
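To follow what Puppet is doing, you can also trigger an agent run by hand on one of the new nodes instead of waiting for the scheduled one (a sketch, assuming the standard Puppet agent packages):

# on ashburn-mysql-1, for example
sudo /opt/puppetlabs/bin/puppet agent -t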

While Puppet is performing its tasks, let’s have a look at our deployed infrastructure:

We see that all our instances have been successfully deployed.

Step 3

Now, let’s connect via SSH to the Puppet server to see how our InnoDB ClusterSet is deployed.

We could connect to one of the routers directly using MySQL Shell from our laptop, but we wouldn’t be able to manage the architecture, as the members’ hostnames aren’t publicly resolvable.
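So, from the Puppet server, a quick health check of the ClusterSet looks like this (output omitted; clusteradmin is the administrative account used by the deployment):

[opc@puppet ~]$ mysqlsh clusteradmin@ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com
Py> cs = dba.get_clusterset()
Py> cs.status(extended=1)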

Let’s now check the routers:
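For instance, the routers registered in the ClusterSet metadata can be listed with the AdminAPI (output not shown here):

Py> cs.list_routers()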

Great! Everything has been successfully configured.

A router might need to be bootstrapped again (with the --force option) when you see the following message:

WARNING: Router must be bootstrapped again for the ClusterSet to be recognized.

This can happen if the router is created while the ClusterSet is not yet fully deployed. Ideally, this re-bootstrap would be handled automatically by MySQL Router.
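For reference, re-bootstrapping the system-wide router would look roughly like this (a sketch; the target member and options are assumptions based on a default system bootstrap):

# on the affected router host
sudo mysqlrouter --bootstrap clusteradmin@ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306 \
     --user mysqlrouter --force
sudo systemctl restart mysqlrouter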

This is the overview of the MySQL ClusterSet we have deployed:

Puppet Details

The configuration is done in Puppet via Hiera YAML files. These files are easy to understand, and one is created for each instance. The common.yaml file is also parsed for every node.

├── common.yaml
└── nodes
    ├── ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com.yaml
    ├── ashburn-mysql-2.mysqlpriv1.mysqlvcn1.oraclevcn.com.yaml
    ├── ashburn-mysql-3.mysqlpriv1.mysqlvcn1.oraclevcn.com.yaml
    ├── ashburn-router-1.mysqlpub1.mysqlvcn1.oraclevcn.com.yaml
    ├── ashburn-router-2.mysqlpub1.mysqlvcn1.oraclevcn.com.yaml
    ├── frankfurt-mysql-1.mysqlpriv2.mysqlvcn2.oraclevcn.com.yaml
    ├── frankfurt-mysql-2.mysqlpriv2.mysqlvcn2.oraclevcn.com.yaml
    ├── frankfurt-mysql-3.mysqlpriv2.mysqlvcn2.oraclevcn.com.yaml
    ├── frankfurt-mysql-4.mysqlpriv2.mysqlvcn2.oraclevcn.com.yaml
    ├── frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com.yaml
    └── puppet.mysqlpub1.mysqlvcn1.oraclevcn.com.yaml

Classes and settings are defined in those files. Those are the files you should edit to match your infrastructure.

Let’s take a look at one of the servers, the Primary Instance of the Primary Cluster, ashburn-mysql-1:

---
classes:
    - innodbcluster
    - innodbcluster::server
    - innodbcluster::clusterset

innodbcluster::cluster_name: cluster-ashburn
innodbcluster::members:
  - ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com
  - ashburn-mysql-2.mysqlpriv1.mysqlvcn1.oraclevcn.com
  - ashburn-mysql-3.mysqlpriv1.mysqlvcn1.oraclevcn.com

innodbcluster::magic::seed_node: ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com

We can see that we define three classes: innodbcluster, innodbcluster::server, and innodbcluster::clusterset.

We also define the name of the cluster to which this node belongs (cluster-ashburn) and the list of the cluster’s members.

And finally, we define the seed node (this is optional).

The Puppet module for InnoDB Cluster also includes custom functions that run on the Puppet server each time it compiles the catalog for an agent. That’s why the Puppet server needs MySQL Shell and the credentials to access each MySQL instance.
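As an illustration only (this is not the module’s actual code), such a server-side function could be a thin Ruby wrapper around MySQL Shell, something like:

# lib/puppet/functions/innodbcluster/cluster_exists.rb -- hypothetical sketch
require 'open3'

Puppet::Functions.create_function(:'innodbcluster::cluster_exists') do
  dispatch :cluster_exists do
    param 'String', :host
    return_type 'Boolean'
  end

  def cluster_exists(host)
    # Runs MySQL Shell on the Puppet server; assumes the clusteradmin
    # credentials are available there (e.g. in the Shell's secret store).
    cmd = ['mysqlsh', "clusteradmin@#{host}", '--js', '-e',
           'try { dba.getCluster(); print("OK") } catch (e) { print("KO") }']
    stdout, = Open3.capture3(*cmd)
    stdout.include?('OK')
  end
end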

This is an overview of the module:

└── innodbcluster
    ├── lib
    │   ├── facter
    │   └── puppet
    │       └── functions
    │           └── innodbcluster
    │               ├── bootstrap_candidate.rb
    │               ├── cluster_exists.rb
    │               ├── cluster_node.rb
    │               ├── clusterset_exists.rb
    │               ├── clusterset_is_part.rb
    │               ├── clusterset_node.rb
    │               ├── readreplica_is_part.rb
    │               ├── seed_clusterset_node.rb
    │               └── seed_node.rb
    ├── manifests
    │   ├── clusterset.pp
    │   ├── config.pp
    │   ├── init.pp
    │   ├── magic.pp
    │   ├── packages.pp
    │   ├── passwords.pp
    │   ├── readreplica.pp
    │   ├── router.pp
    │   ├── serverpackages.pp
    │   ├── server.pp
    │   └── service.pp
    └── templates
        └── my.cnf.erb

Test

To test, we create a new dedicated user (I use root locally on ashburn-mysql-1 to create the user and grant the privileges).
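The user creation itself is plain SQL; a minimal version could be (user name, password, and privilege scope are just examples):

-- executed as root on ashburn-mysql-1; the changes are replicated everywhere
CREATE DATABASE IF NOT EXISTS fred;
CREATE USER 'fred'@'%' IDENTIFIED BY 'MyS3cr3tPassw0rd!';
GRANT ALL PRIVILEGES ON fred.* TO 'fred'@'%';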

Then let’s create the following table in the fred database:

CREATE TABLE `t1` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `host` varchar(40) DEFAULT NULL,
  `created` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
);

Now, using one of the public IPs of the routers, let’s connect with the new user and insert records into that table (we use the read/write splitting port, 6450):

SQL> insert into t1 (host) values (@@hostname);
Query OK, 1 row affected (0.1004 sec)
Statement ID: 265649
GTIDs: 265332aa-613e-11f0-afb3-02001713e486:208
SQL> insert into t1 (host) values (@@hostname);
Query OK, 1 row affected (0.1016 sec)
Statement ID: 265847
GTIDs: 265332aa-613e-11f0-afb3-02001713e486:209

And then let’s read those records and display from where we are reading them:

SQL> select *, @@hostname read_from from t1;
+----+-----------------+---------------------+-----------------+
| id | host            | created             | read_from       |
+----+-----------------+---------------------+-----------------+
|  1 | ashburn-mysql-1 | 2025-07-15 07:06:59 | ashburn-mysql-2 |
|  2 | ashburn-mysql-1 | 2025-07-15 07:07:02 | ashburn-mysql-2 |
+----+-----------------+---------------------+-----------------+
2 rows in set (0.1169 sec)
Statement ID: 23768

By default, even if we connect to the router (using the R/W or the RO port) in Frankfurt (Secondary Cluster), reads and writes will happen on the Primary Cluster in Ashburn:

SQL> select *, @@hostname read_from from t1;
+----+-----------------+---------------------+-----------------+
| id | host            | created             | read_from       |
+----+-----------------+---------------------+-----------------+
|  1 | ashburn-mysql-1 | 2025-07-15 07:06:59 | ashburn-mysql-3 |
|  2 | ashburn-mysql-1 | 2025-07-15 07:07:02 | ashburn-mysql-3 |
+----+-----------------+---------------------+-----------------+
2 rows in set (0.3323 sec)
Statement ID: 91906

Routing Guidelines

Let’s imagine we would like to connect to any of the Read Replicas of the Secondary Cluster when connecting to a specific router (frankfurt-router-1, in our example).

If we have not yet created any routing guidelines, we start by creating one. So we connect to one of the instances using the clusteradmin account:

[opc@puppet ~]$ mysqlsh clusteradmin@10.0.1.136
Py> cs = dba.get_clusterset()
Py> rg = cs.create_routing_guideline("oci_guideline")
Creating Default Routing Guideline...

* Adding default destinations...
** Destination 'Primary' successfully added.
** Destination 'PrimaryClusterSecondary' successfully added.
** Destination 'PrimaryClusterReadReplica' successfully added.

* Adding default routes...
** Route 'rw' successfully added.
** Route 'ro' successfully added.

NOTE: Routing guideline 'oci_guideline' won't be made active by default. To activate this guideline, please use .set_routing_option() with the option 'guideline'.

Routing Guideline 'oci_guideline' successfully created.

With rg.show(), we can see that none of our servers in Frankfurt are referenced:

Py> rg.show()
Routing Guideline: 'oci_guideline'
ClusterSet: 'oci-clusterset'

Routes
------
  - rw
    + Match: "$.session.targetPort = $.router.port.rw"
    + Destinations:
      * ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306 (Primary)

  - ro
    + Match: "$.session.targetPort = $.router.port.ro"
    + Destinations:
      * ashburn-mysql-2.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306, ashburn-mysql-3.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306 (PrimaryClusterSecondary)
      * ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306 (Primary)

Destination Classes
-------------------
  - Primary:
    + Match: "$.server.memberRole = PRIMARY AND ($.server.clusterRole = PRIMARY OR $.server.clusterRole = UNDEFINED)"
    + Instances:
      * ashburn-mysql-1.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306

  - PrimaryClusterSecondary:
    + Match: "$.server.memberRole = SECONDARY AND ($.server.clusterRole = PRIMARY OR $.server.clusterRole = UNDEFINED)"
    + Instances:
      * ashburn-mysql-2.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306
      * ashburn-mysql-3.mysqlpriv1.mysqlvcn1.oraclevcn.com:3306

Unreferenced servers
--------------------
  - frankfurt-mysql-3.mysqlpriv2.mysqlvcn2.oraclevcn.com:3306
  - frankfurt-mysql-2.mysqlpriv2.mysqlvcn2.oraclevcn.com:3306
  - frankfurt-mysql-1.mysqlpriv2.mysqlvcn2.oraclevcn.com:3306
  - frankfurt-mysql-4.mysqlpriv2.mysqlvcn2.oraclevcn.com:3306

We need to create a new destination to reach the read replicas on the secondary cluster (Frankfurt):

Py> rg.add_destination(
      "ReplicaClusterReadReplica",
      "$.server.memberRole = READ_REPLICA AND $.server.clusterRole = REPLICA AND $.server.clusterName = 'cluster-frankfurt'"
    )
Destination 'ReplicaClusterReadReplica' successfully added.

And now, we can create a route to reach that new destination:

Py> rg.add_route(
   "frankfurt_route",
   "$.session.targetPort = $.router.port.rw_split",
   [ "round-robin(ReplicaClusterReadReplica)" ],
   {"connectionSharingAllowed": True, "enabled": True}
)   
Route 'frankfurt_route' successfully added.

If we run rg.show() again, we can see that frankfurt-mysql-4, the Read Replica, is now referenced.

We don’t want this routing guideline to apply to all the routers in the ClusterSet, so we specify the router on which it should be applied:

Py> cs.set_routing_option(
    "frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com::system",
    "guideline", "oci_guideline"
    )
Routing option 'guideline' successfully updated in router
'frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com::system'

If we list the routers, we can see that the guideline is applied:

Py> cs.list_routers(
   "frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com::system")
{
    "frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com::system": {
        "currentRoutingGuideline": "oci_guideline", 
        "hostname": "frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com", 
        "lastCheckIn": "2025-07-15 06:46:41", 
        "localCluster": "cluster-frankfurt", 
        "roPort": "6447", 
        "roXPort": "6449", 
        "rwPort": "6446", 
        "rwSplitPort": "6450", 
        "rwXPort": "6448", 
        "supportedRoutingGuidelinesVersion": "1.1", 
        "targetCluster": null, 
        "version": "9.3.0"
    }
}

And now, if we connect to the public IP of our router in Frankfurt (still on port 6450), we can see that reads happen locally on our Read Replica instance (frankfurt-mysql-4):

SQL> select *, @@hostname read_from from fred.t1;
+----+-----------------+---------------------+-------------------+
| id | host            | created             | read_from         |
+----+-----------------+---------------------+-------------------+
|  1 | ashburn-mysql-1 | 2025-07-15 07:06:59 | frankfurt-mysql-4 |
|  2 | ashburn-mysql-1 | 2025-07-15 07:07:02 | frankfurt-mysql-4 |
+----+-----------------+---------------------+-------------------+
2 rows in set (0.0235 sec)
Statement ID: 68212

To remove the guideline from the router, we set the routing option back to None:

Py> cs.set_routing_option(
    "frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com::system",
    "guideline", None)
Routing option 'guideline' successfully updated in router
'frankfurt-router-1.mysqlpub2.mysqlvcn2.oraclevcn.com::system'.

Conclusion

In this article, we saw how to deploy and configure a complete MySQL architecture that includes High Availability, Disaster Recovery, and Read Replicas to spread the load.

We used Terraform to automate the deployment of the resources in OCI, and Puppet to configure the instances using MySQL Shell’s AdminAPI.

Finally, we began to explore the Routing Guidelines that can enhance the flexibility of accessing your MySQL servers.

For more information about the Routing Guidelines, check these two articles from Miguel:

Happy automated deployments in OCI!

