MySQL 8.0.27 introduced a new variable to control the maximum of parallel threads InnoDB can use for creating (sorting and building) secondary indexes: innodb_ddl_threads
.
This new variable is coupled with another new variable: innodb_ddl_buffer_size
.
If you have fast storage and multiple CPU cores, tuning these variables can speed up secondary index creation.
For this example, I used the airportdb database, and I added a secondary index to the largest table, booking
.
Let’s start with the default settings:
SQL alter table booking add index idx_2(flight_id, seat, passenger_id); Query OK, 0 rows affected (9 min 0.6838 sec)
The default settings in MySQL 8.0.27 are:
innodb_ddl_threads = 4 innodb_ddl_buffer_size = 1048576 innodb_parallel_read_threads = 4
The innodb_ddl_buffer_size
is shared between all innodb_ddl_threads
defined. If you increase the amount of threads, I recommend that you also increase the buffer size.
To find the best values for these variables, let’s have a look at the amount of CPU cores:
SQL select count from information_schema.INNODB_METRICS where name = 'cpu_n'; +-------+ | count | +-------+ | 16 | +-------+
We have then 16 cores to share. As my machine as plenty of memory, I will allocate 1GB for the InnoDB DDL buffer.
SQL SET innodb_ddl_threads = 8; SQL SET innodb_parallel_read_threads = 8; SQL SET innodb_ddl_buffer_size = 1048576000; SQL alter table booking add index idx_2(flight_id, seat, passenger_id); Query OK, 0 rows affected (3 min 9.1862 sec)
Wow ! This is a nice improvement !
This is a table summary to see the impact of these variables:
ddl buffer size | ddl threads | parallel read theads | execution time |
1048576 | 4 | 4 | 9 min 0.6838 sec |
104857600 | 8 | 8 | 4 min 8.3601 sec |
1048576000 | 8 | 8 | 3 min 9.1862 sec |
1048576000 | 16 | 16 | 3 min 7.4079 sec |
1048576000 | 16 | 8 | 3 min 4.1161 sec |
1048576000 | 12 | 4 | 3 min 8.7854 sec |
1048576000 | 4 | 12 | 3 min 5.9497 sec |
1048576000 | 4 | 4 | 3 min 12.2435 sec |
2097152000 | 4 | 4 | 2 min 43.6280 sec |
2097152000 | 8 | 8 | 2 min 44.6516 sec |
We can see that with our system and our data, the best performance is achieved by increasing the buffer size used for the InnoDB DDL threads.
Does that mean only the buffer size matter ? Not really. Of course it depends of your data, with only one thread but with a buffer of 2G, the execution time is 3 min 22.9617 sec.
Also keeping increasing the buffer size won’t necessary improve the performance 😉
Please note that this feature also provides the possibility to build multiple indexes for a table in the same alter statement in parallel:
SQL alter table booking add index idx_2(flight_id, seat, passenger_id), add index idx_3(price, passenger_id), add index idx_4(seat,price); Query OK, 0 rows affected (10 min 17.8435 sec)
With the default settings it takes 28 min 46.9949 sec.
But the same query on a MySQL Database Service instance on OCI with 32 cores is executed in 2 min 11.6700 sec with a buffer size of 2G !
This new way of dealing with online DDL in InnoDB is a nice improvement, congrats InnoDB team !
And as usual, enjoy MySQL !
Thanks for the post LeFred, good to see MySQL getting better at using many cores for what was single-threaded operations in the past.
From what I understand, index creation was single threaded in the past, so could only use one CPU. With the default of innodb_ddl_threads being 4 threads, an index creation could use 100% of 4 CPU. With more and more people running MySQLin the Cloud and in Containers, many MySQL instances in production have 4 CPU or less. On such systems, the default value of innodb_ddl_threads to 4 might be too big as an index creation could starve CPU for other operations. Can you share some rationale behind the default being 4 (I would have put 2 or even 1) ?
And I know the default for innodb_page_cleaners is 4, but the page-cleaner threads are not as CPU intensive as a thread doing sorting. A default of 4 for innodb_page_cleaners might be good even on a “small” MySQL instance, while a default of 4 for innodb_ddl_threads might be catastrophic, especially in a Kubernetes / Container infrastructure where we might have CPU Throttling.
Thanks in advance for the details, Jean-François Gagné
Hi Jean-François,
Thank you for your comment.
innodb_ddl_threads defines the MAXIMUM threads that can be used. So this doesn’t mean 4 threads will be used. I need to check the code to see what kind of control is made.