How to Migrate from MariaDB to MySQL HeatWave

We see an increase in the demand for database migration from MariaDB to MySQL. This is likely due to recent news about MariaDB. [1][2][3][4][5][6][7][8]

In this article, we will show you how to migrate from MariaDB to MySQL HeatWave in Oracle Cloud Infrastructure (OCI).

If you plan to migrate from MariaDB to MySQL on-premise, please check this blog post.

MariaDB has significantly diverged, and is no longer drop-in compatible with MySQL. To migrate from MariaDB to MySQL, an in-place upgrade is not possible. A logical dump and load will be necessary. Such logical dump is also always necessary when the plan is to migrate to a managed database system like MySQL HeatWave.

In this post I explain the process of such migration and the possible problems (not an exhaustive list) you may encounter during the upgrade.

The process consists of the following 5 operations:

  1. search for eventual incompatibilities
  2. dump the data (logical)
  3. deploy your MySQL HeatWave instance
  4. load the data
  5. (optional) setup replication from on-prem

Why Migrate from MariaDB to MySQL HeatWave ?

World’s #1 and #2 most popular databases are developed at Oracle:

Here are the top 10 reasons to migrate to MySQL HeatWave on OCI:

  1. Fully managed cloud database service (dbaas): no need to worry about backups, hardware, upgrades, …
  2. Operated by the MySQL Team: your sever is operated by those that made it ! Tight feedback loop between Operations and Development. Your use of MySQL HeatWave directly helps improve the product.
  3. Oracle’s support: as soon as you deploy a MySQL HeatWave instance you get access to MySQL Support.
  4. Security & Compliance: network isolation, encryption at rest and in transit, secure user authentication (GDPR & HIPAA)
  5. Scalability (shapes and read replicas)
  6. Best High Availability based on Group Replication (RPO=0)
  7. Performance with HeatWave engine for OLTP and/or OLAP
  8. Machine Learning capabilities
  9. LakeHouse: Data Lakes and Data Warehouses
  10. Best Price/Performance compared to other MySQL Cloud Services.

MySQL HeatWave is a single service that provides OLTP, OLAP, ML and Lakehouse. Eliminate complexity of multiple services, ETL, etc

Search for eventual incompatibilities

This part is maybe the most challenging, and more you wait for the migration, more this process will become complicated.

We need to find if the MariaDB 10.6 server hosting our data is also using some features that are different, removed or not available in MySQL 8.0.

High Availability

In MariaDB, HA is provided by Galera, a plugin developed by another company, Codership. MySQL includes native, built-in, HA and DR using Group Replication, Asynchronous Connection Failover and Asynchronous Replication (in-bound and out-bond channels in MySQL HeatWave). This article does not cover the migration from Galera. Once migrated to MySQL HeatWave, the system will handle HA automatically if enabled:

Additionally, MySQL HeatWave provides point-in-time recovery and Read Replicas.

Point-in-Time Recovery can be enabled in the backup section:

Read Replicas are very easy to create:

Storage Engines

MariaDB Community Edition, contains various storage engines, in alpha or beta stages. These engines are not included in MariaDB Enterprise Edition. InnoDB is the main engine used for transaction processing. Before migration, you will need to convert data in other storage engines to InnoDB.

MySQL primary storage engine is InnoDB.

This is a query that you can use to verify the Storage Engines actually used on your database:

SELECT COUNT(*) as '# TABLES', 
  CONCAT(ROUND(sum(data_length) / ( 1024 * 1024 * 1024 ), 2), 'G') DATA,
  CONCAT(ROUND(sum(index_length) / ( 1024 * 1024 * 1024 ), 2), 'G') INDEXES,
  CONCAT(sum(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2)), 'G')
      'TOTAL SIZE', ENGINE 
FROM information_schema.TABLES
WHERE TABLE_SCHEMA 
  NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys') 
  GROUP BY engine;
+----------+-------+---------+------------+--------+
| # TABLES | DATA  | INDEXES | TOTAL SIZE | ENGINE |
+----------+-------+---------+------------+--------+
|        1 | 0.00G | 0.00G   | 0.00G      | Aria   |
|        5 | 0.00G | 0.00G   | 0.00G      | InnoDB |
|        1 | 0.00G | 0.00G   | 0.00G      | MyISAM |
+----------+-------+---------+------------+--------+
3 rows in set (0.002 sec)

In the example above, we can see that there is one table using a Storage Engine not supported in MySQL 8.0. Let’s find the table:

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE 
  FROM information_schema.TABLES  
 WHERE TABLE_SCHEMA NOT 
    IN ('mysql', 'information_schema', 'performance_schema', 'sys')
   AND engine NOT IN ('MyISAM','InnoDB');
+--------------+------------+--------+
| TABLE_SCHEMA | TABLE_NAME | ENGINE |
+--------------+------------+--------+
| mydatabase   | t4         | Aria   |
+--------------+------------+--------+
1 row in set (0.001 sec)

To fix it, you need to convert it to InnoDB:

ALTER TABLE mydatabase.t4 ENGINE=InnoDB;

Note that there may be limitations related to the size of the rows, it also depends on the character set used.

Functions

MariaDB has some functions that are not present in MySQL, like JSON_DETAILED, which is called JSON_PRETTY in MySQL 8.0.

The list of those functions is present in MariaDB’s documentation, however check it twice as some information regarding MySQL 8.0 is sometimes outdated, especially on this page, (like invisible columns, virtual columns, wait, intersect, except, …).

This is not a blocking factor for migration, unless these functions are present in the default values of columns. But of course, if your application uses some of these functions, it may be necessary to modify it to use the appropriate one in MySQL 8.0.

To illustrate this, let’s use the ADD_MONTHS function.

First let’s see if we have this function as default for some columns:

SELECT TABLE_NAME, COLUMN_NAME
  FROM information_schema.COLUMNS
 WHERE COLUMN_DEFAULT LIKE '%add_months%';
Empty set (0.055 sec)

Great !…. mmm but I’m sure I’ve created a table with that specific function as default. This is what I did:

ALTER TABLE t6 ADD COLUMN future DATETIME DEFAULT (ADD_MONTHS(NOW(), 2));

In fact, several functions are acting like aliases. If we check the output of SHOW CREATE TABLE statement, we can see that the function is translated:

SHOW CREATE TABLE t6\G
*************************** 1. row ***************************
       Table: t6
Create Table: CREATE TABLE `t6` (
  `id` int(11) NOT NULL DEFAULT nextval(`mydatabase`.`s3`),
  `b` int(11) DEFAULT NULL,
  `future` datetime DEFAULT (current_timestamp() + interval 2 month),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
1 row in set (0.000 sec)

We can see that ADD_MONTHS() was translated by + interval 2 month.

This means that this function is not a problem when it has been used as default value when creating a table.

But of course, if the application uses similar functions in its queries, they must be rewritten as MySQL HeatWave doesn’t provide the Query Rewrite Plugin that you can use on-prem in MySQL 8.0.

Enabling general log and parsing it is the best way to look for such queries. But many DBAs don’t like to enable the general log, because it can create a significant overhead with heavy workload.

Another option is to use Performance_Schema. This is how we can use it:

SELECT DIGEST_TEXT 
 FROM performance_schema.events_statements_summary_by_digest 
WHERE DIGEST_TEXT LIKE '%add_months%';
+------------------------------------------------------------------------------------------+
| DIGEST_TEXT                                                                              |
+------------------------------------------------------------------------------------------+
| ALTER TABLE `t6` ADD COLUMN `future` DATETIME DEFAULT ( ADD_MONTHS ( NOW ( ) , ? ) )     |
| SELECT ID , `b` , ADD_MONTHS ( `future` , ? ) `present` FROM `t6`                        |
| SELECT ID , `b` , ADD_MONTHS ( `future` , ? ) `present` FROM `t6` ORDER BY `b`           |
| SELECT ADD_MONTHS ( `future` , ? ) `present` , COUNT ( * ) FROM `t6` GROUP BY `present`  |
+------------------------------------------------------------------------------------------+
4 rows in set (0.000 sec)

These 3 SELECT must be rewritten in the application.

Data Types

MariaDB also supports INET6 as a data type and this data type is not present in MySQL 8.0. In MySQL 8.0, IPv6 values are stored into VARBINARY(16).

To find a list of data type used in our database, we can execute the following query:

SELECT  DATA_TYPE , count(*) TOT  
  FROM information_schema.COLUMNS  
 WHERE TABLE_SCHEMA NOT 
   IN ('mysql', 'sys', 'information_schema', 'performance_schema') 
GROUP BY 1;
+-----------+-----+
| DATA_TYPE | TOT |
+-----------+-----+
| bigint    |  14 |
| datetime  |   1 |
| inet6     |   1 |
| int       |  10 |
| longtext  |   3 |
| tinyint   |   2 |
+-----------+-----+
6 rows in set (0.001 sec)

We can see that we have one column using inet6.

Did you notice that even if I’ve created JSON columns, I don’t see any..?!?

In fact, in MariaDB, JSON columns are an alias to a LONGTEXT with as check using JSON_VALID():

`doc` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin 
DEFAULT NULL CHECK (json_valid(`doc`))

In MySQL 8.0, JSON is a native datatype allowing multiple functions and enhancements related to performance and replication.

If you use unknown data types, the logical dump will fail with an error message like this:

Util.dumpInstance: Unknown data_type: inet6 and column_type: inet6 (LogicError)

So if this is the case, we need to modify the column back to VARBINARY(16) before the dump:

ALTER TABLE t5 MODIFY address VARBINARY(16);
Query OK, 4 rows affected (0.019 sec)              
Records: 4  Duplicates: 0  Warnings: 0

Data Dump

Once you have verified all the points above, it’s time to dump the data.

We use MySQL Shell with the dump and load utility to perform this logical dump.

It’s possible to use it from the command line or interactively. As the plan is to migrate to OCI, the best is to export the dump to Object Storage.

In case we have MariaDB Sequences (see FAQ), we can exclude such tables. When migrating to MySQL HeatWave, it’s also recommended to use --ocimds=true and the following compatibility options:

  • strip_definers
  • force_innodb
  • create_invisible_pks
$ mysqlsh root@127.0.0.1:10612 -- util dumpInstance "fromMariaDB" \ 
--osBucketName="migration" --users=false --osNamespace=ixxxxxxxj \
--threads=8 --ocimds=true --compatibility="strip_definers,force_innodb,create_invisible_pks" \
--excludeTables="mydatabase.s1,mydatabase.s3"
NOTE: Backup lock is not supported in MySQL 5.6 and DDL changes will not be blocked.
The dump may fail with an error if schema changes are made while dumping.
Acquiring global read lock
Global read lock acquired
Initializing - done 
WARNING: Failed to fetch value of @@GLOBAL.GTID_EXECUTED.
3 out of 7 schemas will be dumped and within them 7 tables, 0 views.
Gathering information - done 
All transactions have been started
Global read lock has been released
Checking for compatibility with MySQL Database Service 8.0.33
NOTE: MySQL Server 5.6 detected, please consider upgrading to 8.0 first.
NOTE: Table `mydatabase`.`t` does not have a Primary Key, this will be fixed 
when the dump is loaded
NOTE: Table `mydatabase`.`t3` had unsupported engine MyISAM changed to InnoDB
NOTE: One or more tables without Primary Keys were found.
      Missing Primary Keys will be created automatically when this dump is loaded.
      This will make it possible to enable High Availability in MySQL Database Service
      instance without application impact.
      However, Inbound Replication into an MDS HA instance (at the time of 
      the release of MySQL Shell 8.0.24) will still not be possible.
Compatibility issues with MySQL Database Service 8.0.33 were found and repaired. 
Please review the changes made before loading them.
Validating MDS compatibility - done        
Writing global DDL files
Running data dump using 8 threads.
NOTE: Progress information uses estimated values and may not be accurate.
Writing schema metadata - done
NOTE: Table statistics not available for `mydatabase`.`t2`, chunking operation 
may be not optimal. Please consider running 'ANALYZE TABLE `mydatabase`.`t2`;' first.
Writing DDL - done
Writing table metadata - done
Starting data dump
115% (22 rows / ~19 rows), 28.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compressed
Dump duration: 00:00:01s
Total duration: 00:00:02s
Schemas dumped: 3
Tables dumped: 7
Uncompressed data size: 405 bytes
Compressed data size: 349 bytes
Compression ratio: 1.2
Rows written: 22
Bytes written: 349 bytes
Average uncompressed throughput: 253.80 B/s
Average compressed throughput: 218.71 B/s

This is the same example in MySQL Shell’s interactive mode:

util.dumpInstance('fromMariaDB_interactive', 
            {osBucketName: "migration", users: false, osNamespace:'xxxxxxxx',
            threads:8,ocimds: true,excludeTables: ["mydatabase.s1","mydatabase.s3"],
            compatibility: ["force_innodb","strip_definers","create_invisible_pks"]})

The dump is a simple operation, we just have to remember to use the {users: false} option as the users and privileges are not compatible between MariaDB and MySQL.

As the plan is to migrate to MySQL HeatWave on OCI, you can also create a PAR (Pre-Authenticated Request) manifest, using {ociParManifest: true}.

If you created a PAR manifest, you can create the PAR URL from the @.manifest.json file:

And you can create the PAR URL from the @.manifest.json file:

Logical Data Load to MySQL HeatWave in OCI

The dump in Object Storage can be loaded at the creation of the MySQL HeatWave instance:

Or manually, using MySQL Shell on a compute instance and the PAR URL:

util.loadDump("https://o..com/p/F..W/n/i..j/b/migration/o/fromMariaDB_par/@.manifest.json",
     {progressFile: "progress.json", ignoreVersion: true})

Users and Authentication

Remember that to be able to create the logical dump, we had to skip the users using {users: false} option.

But if you have users that you also want to migrate, you can try using the user.getUsersGrants() plugin:

user.getUsersGrants("fred")
-- User `fred`@`%`
CREATE USER IF NOT EXISTS `fred`@`%` IDENTIFIED 
WITH 'mysql_native_password' AS '*6C69D17939B2C1D04E17A96F9B29B284832979B7';
GRANT ALL PRIVILEGES ON *.* TO `fred`@`%`;
GRANT SELECT, UPDATE, DELETE ON `mydatabase`.* TO `fred`@`%`;

And then replay the statements on the new MySQL HeatWave Instance. Be aware that the authentication plugin used is the old one (mysql_native_password) which is not default in MySQL 8.0. In MySQL 8.0 a more secure authentication method is used: caching_sha2_password.

Or we can use the user.copy() method of the same plugin:

Be aware that in MySQL HeatWave Database Service, some grants are not allowed for the user accounts.

You can use the administrator role to have an account with the same privileges of the admin user created during the deployment of the MySQL HeatWave DB Instance:

GRANT 'administrator' TO 'dev1';

Live Migration – Replication

If you don’t use any specific features related to MariaDB, it’s also possible to use standard MySQL Asynchronous Replication between both systems:

  • using binlog position based replication
  • GTIDs are not compatible

For Asynchronous Replication, the position to use is located in the dump directory (or Object Storage bucket) in the @.json file:

We use that information to create an Inbound Replication Channel:

We need to use the public IP of the MariaDB instance or its VPN address:

We use the previous saved binary log position information:

Conclusion

The migration from MariaDB to MySQL HeatWave is easy with MySQL Shell Dump & Load Utility.

Recently a detailed list of schema differences has been published by Skeema.io. I advise you to take a look at it if you experience some issues.

FAQ

Some infrequent cases that may come up:

How to migrate MariaDB sequences ?

MariaDB supports sequences which are not available in MySQL 8.0. Currently they are rarely used. Let’s check if our source server is using sequences because we will have to manage these tables for the migration:

SELECT COUNT(*), TABLE_TYPE FROM information_schema.TABLES GROUP BY table_type;
+----------+------------------+
| COUNT(*) | TABLE_TYPE       |
+----------+------------------+
|      117 | BASE TABLE       |
|        2 | SEQUENCE         |
|        1 | SYSTEM VERSIONED |
|       79 | SYSTEM VIEW      |
|      101 | VIEW             |
+----------+------------------+
5 rows in set (0.0250 sec)

We can see that we have 2 sequences. Usually sequences are used as default values in columns, if this is the case we can try to find out which table they are related to:

WITH seqlist (a) AS (
  SELECT CONCAT('%`',TABLE_SCHEMA,'`.`', TABLE_NAME,'`%') a
    FROM information_schema.TABLES 
   WHERE table_type="SEQUENCE")  
SELECT TABLE_NAME, COLUMN_NAME 
  FROM information_schema.COLUMNS
  JOIN seqlist WHERE COLUMN_DEFAULT LIKE seqlist.a;
+------------+-------------+
| TABLE_NAME | COLUMN_NAME |
+------------+-------------+
| t5         | a           |
| t6         | id          |
+------------+-------------+
2 rows in set (0.023 sec)

We need to replace the field by an auto_increment:

ALTER TABLE `t5` modify
  `a` int(11) NOT NULL auto_increment;

How to migrate MariaDB system-versioned tables ?

From the query above (question in FAQ), we can see that there is also 1 SYSTEM VERSIONED table. These type of tables are not supported in MySQL 8.0. If you don’t change them, they will just be ignored during the Dump process.

If you want to migrate the latest data from such table, you need to drop the versioning.

To get the list of the System Versioned tables you can run the following query:

SELECT TABLE_SCHEMA, TABLE_NAME 
  FROM information_schema.TABLES 
 WHERE TABLE_TYPE='system versioned';
+--------------+------------+
| TABLE_SCHEMA | TABLE_NAME |
+--------------+------------+
| mydatabase   | t          |
+--------------+------------+
1 row in set (0.0090 sec)

To remove the versioning information, you can use the following statement:

ALTER TABLE mydatabase.t DROP SYSTEM VERSIONING;
Query OK, 0 rows affected (0.0269 sec)

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

One comment

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

As MySQL Community Manager, I am an employee of Oracle and the views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

You can find articles I wrote on Oracle’s blog.