MySQL Invisible Column: part III

We recently saw how the new Invisible Column feature works in MySQL since 8.0.23 and how we can use it as a Primary Key for InnoDB tables when no Primary Key was defined.

As I wrote earlier, a good Primary Key is important for InnoDB (storage, IOPS, secondary indexes, memory…) but there is another important domain where a Primary Key is important in MySQL: replication !

Asynchronous Replication

When using “traditional” replication, if you modify a record (UPDATE and DELETE), the record(s) to modify on the replica are identified using indexes, and of course the Primary Key if any. The hidden global 6-bytes auto generated by InnoDB primary key is never used as never exposed and as it’s global, there is no warranty to have the same on the source and the replica. You shouldn’t consider it at all.

If the algorithm was not able to find a suitable index, or was only able to find an index that was non-unique or contained nulls, a hash table is used to assist in identifying the table records. The algorithm creates a hash table containing the rows in the UPDATE or DELETE operation, with the key as the full before-image of the row. The algorithm then iterates over all the records in the target table, using the selected index if it found one, or else performing a full table scan (see the manual).

So if your application doesn’t support an extra key to be used as a Primary Key, using an hidden column as primary key is a option to speed up the operations by the replication applier.

Here is an example of statement where an additional column as Primary Key would break the application:

SQL  create table t1 (name varchar(20), age int);

SQL  insert into t1 values ('mysql',25),('kenny', 35),('lefred', 44);

Now let’s add a auto_increment column as Primary Key:

SQL  alter table t1 add id int auto_increment primary key first;

And now let’s add a record using the INSERT statement as specified in the application:

SQL  insert into t1 values ('python',20);
ERROR: 1136: Column count doesn't match value count at row 1

The best is of course to modify the application to change that INSERT statement, but is it possible?

Also how many application are still using SELECT * and then reference the column like col[2] ?

If this is your case, you have two options:

parse all queries and use the rewrite query plugin
use an invisible column

In this particular case, the choice is easy (at least for somebody as lazy as I am 😉 ):

SQL  alter table t1 modify id int auto_increment invisible;
SQL  insert into t1 values ('python',20);
Query OK, 1 row affected (0.0887 sec)

Easy, isn’t it ?

Group Replication

MySQL InnoDB Cluster uses another type of replication: Group Replication.

One of the requirements to use Group Replication is to have a Primary Key (this is why you can use sql_require_primary_key)

Let’s recreate the table without the Primary Key as on the example above and check if the instance can be used in an InnoDB Cluster:

The message is clear, modifications on that table won’t be spread in the cluster.

So let’s add the invisible primary key and check again:

All good !

This means that if you have an application using tables without Primary Key that didn’t allow you to migrate to a full HA solution like MySQL InnoDB Cluster, now thanks to the invisible column, this is possible.

This would help solving Hadoop Hive support for MySQL InnoDB Cluster for example (see HIVE-17306).

One comment

Apache Superset with MySQL HeatWave – lefred blog: tribulations of a MySQL Evangelist

June 15, 2023 / 13:31 Reply

[…] Having a Primary Key defined in MySQL InnoDB is mandatory when using Group Replication. In fact it’s always recommended to have a primary key ! (see [1], [2], [3] and [4]) […]

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Asynchronous Replication

Group Replication

follow me

Subscribe to Blog via Email

One comment

Leave a ReplyCancel Reply