How to Migrate from MariaDB to MySQL 8.0

We are seeing increasing requests for migrations from MariaDB to MySQL.

MariaDB has significantly diverged, and is no longer drop-in compatible with MySQL. To migrate from MariaDB to MySQL, an in-place upgrade is not possible. A logical dump and load will be necessary.

In this post I explain the process of such migration and the possible problems (not an exhaustive list) you may encounter during the upgrade.

The process consists of the following 4 operations:

  1. search for eventual incompatibilities
  2. dump the data (logical)
  3. install & start MySQL 8.0
  4. load the data

To illustrate the process of such migration, I use MariaDB 10.6, which is actually the most used version of MariaDB.

Search for eventual incompatibilities

This part is maybe the most chalenging, and more you wait for the migration, more this process will become complicated.

We need to find if the MariaDB 10.6 server hosting our data is also using some features that are different, removed or not available in MySQL 8.0.

High Availability

In MariaDB, HA is provided by Galera, a plugin developed by another company, Codership. MySQL includes native, built-in, HA and DR using Group Replication, InnoDB Cluster, ClusterSet and ReplicaSet. This article does not cover the migration from Galera to InnoDB Cluster.

Storage Engines

MariaDB Community Edition, contains various storage engines, in alpha or beta stages. These engines are not included in MariaDB Enterprise Edition. InnoDB is the main engine used for transaction processing. Before migration, you will need to convert data in other storage engines to InnoDB.

MySQL primary storage engine is InnoDB.

This is a query that you can use to verify the Storage Engines actually used on your database:

SELECT COUNT(*) as '# TABLES', 
  CONCAT(ROUND(sum(data_length) / ( 1024 * 1024 * 1024 ), 2), 'G') DATA,
  CONCAT(ROUND(sum(index_length) / ( 1024 * 1024 * 1024 ), 2), 'G') INDEXES,
  CONCAT(sum(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2)), 'G') 'TOTAL SIZE', ENGINE 
FROM information_schema.TABLES
WHERE TABLE_SCHEMA 
  NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys') 
  GROUP BY engine;
+----------+-------+---------+------------+--------+
| # TABLES | DATA  | INDEXES | TOTAL SIZE | ENGINE |
+----------+-------+---------+------------+--------+
|        1 | 0.00G | 0.00G   | 0.00G      | Aria   |
|        5 | 0.00G | 0.00G   | 0.00G      | InnoDB |
|        1 | 0.00G | 0.00G   | 0.00G      | MyISAM |
+----------+-------+---------+------------+--------+
3 rows in set (0.002 sec)

In the example above, we can see that there is one table using a Storage Engine not supported in MySQL 8.0. Let’s find the table:

SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE 
  FROM information_schema.TABLES  
 WHERE TABLE_SCHEMA NOT 
    IN ('mysql', 'information_schema', 'performance_schema', 'sys')
   AND engine NOT IN ('MyISAM','InnoDB');
+--------------+------------+--------+
| TABLE_SCHEMA | TABLE_NAME | ENGINE |
+--------------+------------+--------+
| mydatabase   | t4         | Aria   |
+--------------+------------+--------+
1 row in set (0.001 sec)

There are two ways to fix this, either we modify the engine directly on the MariaDB server (recommended), or we modify the engine when loading the data on MySQL 8.0.

Recommended way:

ALTER TABLE mydatabase.t4 ENGINE=InnoDB;

I will illustrate the dump in the next part without first changing the storage engine to InnoDB, but be aware that sometimes there may be limitations related to the size of the rows, it also depends on the character set used.

Functions

MariaDB has some functions that are not present in MySQL, like JSON_DETAILED, which is called JSON_PRETTY in MySQL 8.0.

The list of those functions is present in MariaDB’s documentation, however check it twice as some information regarding MySQL 8.0 is sometimes outdated, especially on this page, (like invisible columns, virtual columns, wait, intersect, except, …).

This is not a blocking factor for migration, unless these functions are present in the default values of columns. But of course, if your application uses some of these functions, it may be necessary to modify it to use the appropriate one in MySQL 8.0.

To illustrate this, let’s use the ADD_MONTHS function.

First let’s see if we have this function as default for some columns:

SELECT TABLE_NAME, COLUMN_NAME
  FROM information_schema.COLUMNS
 WHERE COLUMN_DEFAULT LIKE '%add_months%';
Empty set (0.055 sec)

Great !…. mmm but I’m sure I’ve created a table with that specific function as default. This is what I did:

ALTER TABLE t6 ADD COLUMN future DATETIME DEFAULT (ADD_MONTHS(NOW(), 2));

In fact, several functions are acting like aliases. If we check the output of SHOW CREATE TABLE statement, we can see that the function is translated:

SHOW CREATE TABLE t6\G
*************************** 1. row ***************************
       Table: t6
Create Table: CREATE TABLE `t6` (
  `id` int(11) NOT NULL DEFAULT nextval(`mydatabase`.`s3`),
  `b` int(11) DEFAULT NULL,
  `future` datetime DEFAULT (current_timestamp() + interval 2 month),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
1 row in set (0.000 sec)

We can see that ADD_MONTHS() was translated by + interval 2 month.

This means that this function is not a problem when it has been used as default value when creating a table.

But of course, if the application uses such a function in its queries, they must be rewritten. Enabling general log and parsing it is the best way to look for such queries. But many DBAs don’t like to enable the general log, because it can create a significant overhead with heavy workload.

Another option is to use Performance_Schema. This is how we can use it:

SELECT DIGEST_TEXT 
 FROM performance_schema.events_statements_summary_by_digest 
WHERE DIGEST_TEXT LIKE '%add_months%';
+------------------------------------------------------------------------------------------+
| DIGEST_TEXT                                                                              |
+------------------------------------------------------------------------------------------+
| ALTER TABLE `t6` ADD COLUMN `future` DATETIME DEFAULT ( ADD_MONTHS ( NOW ( ) , ? ) )     |
| SELECT ID , `b` , ADD_MONTHS ( `future` , ? ) `present` FROM `t6`                        |
| SELECT ID , `b` , ADD_MONTHS ( `future` , ? ) `present` FROM `t6` ORDER BY `b`           |
| SELECT ADD_MONTHS ( `future` , ? ) `present` , COUNT ( * ) FROM `t6` GROUP BY `present`  |
+------------------------------------------------------------------------------------------+
4 rows in set (0.000 sec)

These 4 queries should be rewritten.

Data Types

MariaDB also supports INET6 as a data type and this data type is not present in MySQL 8.0. In MySQL 8.0, IPv6 values are stored into VARBINARY(16).

To find a list of data type used in our database, we can execute the following query:

SELECT  DATA_TYPE , count(*) TOT  
  FROM information_schema.COLUMNS  
 WHERE TABLE_SCHEMA NOT 
   IN ('mysql', 'sys', 'information_schema', 'performance_schema') 
GROUP BY 1;
+-----------+-----+
| DATA_TYPE | TOT |
+-----------+-----+
| bigint    |  14 |
| datetime  |   1 |
| inet6     |   1 |
| int       |  10 |
| longtext  |   3 |
| tinyint   |   2 |
+-----------+-----+
6 rows in set (0.001 sec)

We can see that we have one column using inet6.

Did you notice that even if I’ve created JSON columns, I don’t see any..?!?

In fact, in MariaDB, JSON columns are an alias to a LONGTEXT with as check using JSON_VALID():

`doc` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin 
DEFAULT NULL CHECK (json_valid(`doc`))

In MySQL 8.0, JSON is a native datatype allowing multiple functions and enhancements related to performance and replication.

If you use unknown data types, the logical dump will fail with an error message like this:

Util.dumpInstance: Unknown data_type: inet6 and column_type: inet6 (LogicError)

So if this is the case, we need to modify the column back to VARBINARY(16) before the dump:

ALTER TABLE t5 MODIFY address VARBINARY(16);
Query OK, 4 rows affected (0.019 sec)              
Records: 4  Duplicates: 0  Warnings: 0

Data Dump

Once you have verified all the points above, it’s time to dump the data.

We use MySQL Shell with the dump and load utility to perform this logical dump.

It’s possible to use it from the command line or interactively:

$ mysqlsh root@127.0.0.1:10612 -- util dumpInstance "/tmp/dump_mariadb_10_6" \
   --users=false
NOTE: Backup lock is not supported in MySQL 5.6 and DDL changes will not be blocked. The dump may fail with an error if schema changes are made while dumping.
Acquiring global read lock
Global read lock acquired
Initializing - done 
WARNING: Failed to fetch value of @@GLOBAL.GTID_EXECUTED.
2 out of 6 schemas will be dumped and within them 5 tables, 0 views.
Gathering information - done 
All transactions have been started
Global read lock has been released
Writing global DDL files
Running data dump using 4 threads.
NOTE: Progress information uses estimated values and may not be accurate.
NOTE: Table statistics not available for `mydatabase`.`t2`, chunking operation may be not optimal. 
Please consider running 'ANALYZE TABLE `mydatabase`.`t2`;' first.
Writing schema metadata - done       
Writing DDL - done       
Writing table metadata - done       
Starting data dump
122% (11 rows / ~9 rows), 0.00 rows/s, 0.00 B/s uncompressed, 0.00 B/s compressed
Dump duration: 00:00:00s                                                         
Total duration: 00:00:00s                                                        
Schemas dumped: 2                                                                
Tables dumped: 5                                                                 
Uncompressed data size: 287 bytes                                                
Compressed data size: 224 bytes                                                  
Compression ratio: 1.3                                                           
Rows written: 11                                                                 
Bytes written: 224 bytes                                                         
Average uncompressed throughput: 287.00 B/s                                      
Average compressed throughput: 224.00 B/s

Or interactively in MySQL Shell:

The dump is a simple operation, we just have to remember to use the {users: false} option.

If you have Storage Engines that are not supported, you can force them to be InnoDB using the following command:

util.dumpInstance("/tmp/dump_mariadb_10_6_force", 
            {users: false, compatibility: ["force_innodb"]}

During the dump process you will see output like this:

NOTE: Table `mydatabase`.`t4` had unsupported engine Aria changed to InnoDB
NOTE: Table `mydatabase`.`t3` had unsupported engine MyISAM changed to InnoDB

However, you will still have to manually change some syntaxes. For example, the Aria engine will be replaced by InnoDB, but the PAGE_CHECKSUM=1 supplement will remain and will have to be removed.

Data Load

On a freshly installed MySQL 8.0 we can now load the dump. Once again we use MySQL Shell for this operation as well.

The load may fail and require some manual changes due to extra syntax related to some exotic storage engines or other features (see FAQ).

When everything is ready, you can load the dump in MySQL 8.0 like this:

Since I tried this before and it failed, I deleted the previous data and reloaded it with the resetProgress: true option because MySQL Shell can be interrupted and resumed.

Users and Authentication

Remember that to be able to create the logical dump, we had to skip the users using {users: false} option.

But if you have users that you also want to migrate, you can try using the user.getUsersGrants() plugin:

 JS > user.getUsersGrants("fred")
-- User `fred`@`%`
CREATE USER IF NOT EXISTS `fred`@`%` IDENTIFIED 
WITH 'mysql_native_password' AS '*6C69D17939B2C1D04E17A96F9B29B284832979B7';
GRANT ALL PRIVILEGES ON *.* TO `fred`@`%`;

And then replay the statements on the new MySQL 8.0. Be aware that the authentication plugin used is the old one (mysql_native_password) which is not default in MySQL 8.0. In MySQL 8.0 a more secure authentication method is used: caching_sha2_password.

Live Migration

If you don’t have any specific features on the MariaDB server, it is also possible to use asynchronous replication between both systems. This is something you need to test of course but be aware that it’s also a possibility for a live migration with minimal downtime.

Conclusion

If you are not using MariaDB specific features, the migration from MariaDB to MySQL 8.0 is easy with MySQL Shell Dump & Load Utility. But don’t forget that more you are waiting, more this process might become complicated and be locked in MariaDB.

So if you want to benefit from all new features available in MySQL 8.0, including all the security features, it’s the right time to migrate back to the official MySQL.

FAQ

Some infrequent cases that may come up:

How to migrate MariaDB sequences ?

MariaDB supports sequences which are not available in MySQL 8.0. We do not see many MariaDB users using this features. Let’s check if our source server is using sequences because we will have to manage these tables for the migration:

SELECT COUNT(*), TABLE_TYPE FROM information_schema.TABLES GROUP BY table_type;
+----------+------------------+
| COUNT(*) | TABLE_TYPE       |
+----------+------------------+
|      117 | BASE TABLE       |
|        2 | SEQUENCE         |
|        1 | SYSTEM VERSIONED |
|       79 | SYSTEM VIEW      |
|      101 | VIEW             |
+----------+------------------+
5 rows in set (0.0250 sec)

We can see that we have 2 sequences. Usually sequences are used as default values in columns, if this is the case we can try to find out which table they are related to:

WITH seqlist (a) AS (
  SELECT CONCAT('%`',TABLE_SCHEMA,'`.`', TABLE_NAME,'`%') a
    FROM information_schema.TABLES 
   WHERE table_type="SEQUENCE")  
SELECT TABLE_NAME, COLUMN_NAME 
  FROM information_schema.COLUMNS
  JOIN seqlist WHERE COLUMN_DEFAULT LIKE seqlist.a;
+------------+-------------+
| TABLE_NAME | COLUMN_NAME |
+------------+-------------+
| t5         | a           |
| t6         | id          |
+------------+-------------+
2 rows in set (0.023 sec)

We also have to fix these tables manually, if not, the load will fail. For sequences, the error message will be like this:

ERROR: [Worker003] Error processing table `mydatabase`.`t6`: MySQL Error 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'nextval(`mydatabase`.`s3`),
  `b` int(11) DEFAULT NULL,
  `future` datetime DEFA' at line 2: CREATE TABLE IF NOT EXISTS `t6` (
  `id` int(11) NOT NULL DEFAULT nextval(`mydatabase`.`s3`),
  `b` int(11) DEFAULT NULL,
  `future` datetime DEFAULT (current_timestamp() + interval 2 month),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_swedish_ci
ERROR: [Worker003] While executing DDL script for `mydatabase`.`t6`: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'nextval(`mydatabase`.`s3`),
  `b` int(11) DEFAULT NULL,
  `future` datetime DEFA' at line 2
ERROR: Aborting load...

To fix the file, we need to replace in the sql file directly from the dump and replace the unknown syntax.

For example in the file mydatabase@t5.sql, this:

CREATE TABLE IF NOT EXISTS `t5` (
  `a` int(11) NOT NULL DEFAULT nextval(`mydatabase`.`s1`),

becomes:

CREATE TABLE IF NOT EXISTS `t5` (
  `a` int(11) NOT NULL auto_increment,

How to migrate MariaDB system-versioned tables ?

From the query above (question in FAQ), we can see that there is also 1 SYSTEM VERSIONED table. These type of tables are not supported in MySQL 8.0. If you don’t change them, they will just be ignored during the Dump process.

If you want to migrate the latest data from such table, you need to drop the versioning.

To get the list of the System Versioned tables you can run the following query:

SELECT TABLE_SCHEMA, TABLE_NAME 
  FROM information_schema.TABLES 
 WHERE TABLE_TYPE='system versioned';
+--------------+------------+
| TABLE_SCHEMA | TABLE_NAME |
+--------------+------------+
| mydatabase   | t          |
+--------------+------------+
1 row in set (0.0090 sec)

To remove the versioning information, you can use the following statement:

ALTER TABLE mydatabase.t DROP SYSTEM VERSIONING;
Query OK, 0 rows affected (0.0269 sec)

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

One comment

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

As MySQL Community Manager, I am an employee of Oracle and the views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

You can find articles I wrote on Oracle’s blog.