In this new article about how to find the info when using MySQL Database Service on Oracle Cloud Infrastructure, we will learn about the query accelerator: HeatWave.
With HeatWave, you can boost the performance of your MySQL queries, providing your applications with faster, more reliable, and cost-effective access to data.
HeatWave is a high-performance in-memory query accelerator for MySQL Database Service on Oracle Cloud Infrastructure. It is designed to accelerate analytics workloads (OLAP) and increase the performance of your MySQL databases by orders of magnitude. This is achieved through the use of in-memory processing, advanced algorithms, and machine learning techniques to optimize query performance. If identified by the optimizer, OLTP requests can also be accelerated using HeatWave.
Today we will try to answer the following questions:
- Can I use HeatWave ?
- Is HeatWave enabled ?
- Is my data ready to benefit from HeatWave ?
- Is my query accelerated ?
- Why is my query not accelerated ?
- Could Machine Learning improve how my data is loaded into HeatWave ?
The above questions are what a MySQL DBA using MDS in OCI must answer regularly.
Can I use HeatWave ?
To be able to use HeatWave for your MySQL Database Service in OCI, the MySQL Shape must be compatible with HeatWave.
When you create a new DB System, you have the possibility to choose for a Standalone, High Availability or HeatWave system:
If you choose HeatWave, you will have the choice of all HeatWave compatible shapes available in your tenancy:
But even if you select a HeatWave compatible shape, this doesn’t mean you already have a HeatWave Cluster enabled.
For example, if we check the DB System we used for the previous articles, we can see that even if it’s a HeatWave compatible shape, the HeatWave cluster is not yet enabled:
So yes, HeatWave can be used on this system, but only once the HeatWave Cluster will be created.
Is HeatWave enabled ?
We saw that HeatWave is not enabled by default even if we use a HeatWave compatible shape.
We can click Edit next to HeatWave cluster: Disabled on the picture above or select HeatWave on the menu on the left:
Before the creation of the HeatWave Cluster, we can also check with the SQL interface if the HeatWave service is ready:
show global status like 'rapid_service_status';
+----------------------+---------+
| Variable_name | Value |
+----------------------+---------+
| rapid_service_status | OFFLINE |
+----------------------+---------+
1 row in set (0.0011 sec)
OFFLINE
means it’s not ready. The Storage Engine’s name for HeatWave is RAPID.
Once the cluster is created we can see that HeatWave is enabled:
And in SQL:
select * from performance_schema.global_status
where variable_name in ('rapid_resize_status',
'rapid_service_status','rapid_cluster_ready_number');
+----------------------------+--------------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+----------------------------+--------------------+
| rapid_cluster_ready_number | 1 |
| rapid_resize_status | RESIZE_UNSUPPORTED |
| rapid_service_status | ONLINE |
+----------------------------+--------------------+
3 rows in set (0.0009 sec)
Is my data ready to benefit from HeatWave ?
To benefit from HeatWave, the data needs to be loaded into the HeatWave Cluster.
The best way to perform this operation is to use Estimate node operation when enabling the HeatWave Cluster:
You select the database you want to load:
So if you want to load all the tables from that schema (airportdb
) you can call that procedure.
If you are using MySQL Shell for Visual Studio Code, you can also easily load a schema to HeatWave:
You can also verify which tables are loaded into HeatWave using the following query:
select name, load_progress, load_status, query_count
from performance_schema.rpd_tables
join performance_schema.rpd_table_id using(id);
+-----------------------------+---------------+---------------------+-------------+
| name | load_progress | load_status | query_count |
+-----------------------------+---------------+---------------------+-------------+
| airportdb.flight_log | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.airport_geo | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.flight | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.passengerdetails | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.passenger | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.airplane | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.weatherdata | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.flightschedule | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.booking | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.employee | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.airplane_type | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.seat_sold | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.airport | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.airline | 100 | AVAIL_RPDGSTABSTATE | 0 |
| airportdb.airport_reachable | 100 | AVAIL_RPDGSTABSTATE | 0 |
+-----------------------------+---------------+---------------------+-------------+
15 rows in set (0.0008 sec)
When a table is successfully loaded into HeatWave its status is AVAIL_RPDGSTABSTATE.
OCI MDS Web Console also has some Metrics available for HeatWave. This is an example for the load:
Is my query accelerated ?
The query explain plan (QEP) provides adequate information to determine whether a query is being off-loaded to HeatWave.
The QEP is generated using the EXPLAIN
keyword:
If we see in secondary engine RAPID it means that the query is indeed using HeatWave to be accelerated.
There is also a status variable that is incremented when a query is accelerated using HeatWave:
show status like 'rapid_query_offload_count';
+---------------------------+-------+
| Variable_name | Value |
+---------------------------+-------+
| rapid_query_offload_count | 2 |
+---------------------------+-------+
1 row in set (0.0011 sec)
And we can also check again in the performance_schema
tables we used earlier:
select name, load_progress, load_status, query_count
from performance_schema.rpd_tables
join performance_schema.rpd_table_id using(id) where query_count > 0;
+--------------------+---------------+---------------------+-------------+
| name | load_progress | load_status | query_count |
+--------------------+---------------+---------------------+-------------+
| airportdb.flight | 100 | AVAIL_RPDGSTABSTATE | 2 |
| airportdb.airplane | 100 | AVAIL_RPDGSTABSTATE | 2 |
| airportdb.booking | 100 | AVAIL_RPDGSTABSTATE | 2 |
| airportdb.airline | 100 | AVAIL_RPDGSTABSTATE | 2 |
+--------------------+---------------+---------------------+-------------+
4 rows in set (0.0008 sec)
There is also a Metric collecting the number of statements processed by the HeatWave cluster:
If you don’t see the sentence in secondary engine RAPID in the Query Execution Plan for a query, this means the query won’t be off-loaded to HeatWave:
There is also a nice status variables that tracks the amount of data scanned by queries using HeatWave. The value is in megabytes:
show global status like 'hw_data%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| hw_data_scanned | 444 |
+-----------------+-------+
1 row in set (0.0009 sec)
Why is my query not accelerated ?
There may be some reasons why a query does not use HeatWave, see the limitations (please check regularly, as every releases in OCI remove some limitations).
To find out why a query is not off-loaded to the HeatWave cluster, we use the optimizer_trace
:
For this particular query, we can see that HeatWave is not used because the query cost is under the threshold where a query is off-loaded to HeatWave.
Could Machine Learning improve how my data is loaded into HeatWave ?
The short answer is Yes ! MySQL HeatWave offers Machine Learning Advisors that can provide recommendations based on the workload using machine learning models, data analysis and HeatWave query history.
So after having used HeatWave for a while or after having modified a lot of data, like a new import, it’s recommended to uses these ML Advisors that will create an autopilot report.
Auto Encoding
Auto Encoding recommends how the string column must be encoded in HeatWave to reduce the amount of required memory and improve performance:
CALL sys.heatwave_advisor(JSON_OBJECT("auto_enc",JSON_OBJECT("mode","recommend")));
The output is a report listing the suggestions:
+-------------------------------+
| INITIALIZING HEATWAVE ADVISOR |
+-------------------------------+
| Version: 1.44 |
| |
| Output Mode: normal |
| Excluded Queries: 0 |
| Target Schemas: All |
| |
+-------------------------------+
6 rows in set (0.0110 sec)
+---------------------------------------------------------+
| ANALYZING LOADED DATA |
+---------------------------------------------------------+
| Total 15 tables loaded in HeatWave for 1 schemas |
| Tables excluded by user: 0 (within target schemas) |
| |
| SCHEMA TABLES COLUMNS |
| NAME LOADED LOADED |
| ------ ------ ------ |
| `airportdb` 15 107 |
| |
+---------------------------------------------------------+
8 rows in set (0.0110 sec)
+------------------------------------------------------------------------------------------------------+
| ENCODING SUGGESTIONS |
+------------------------------------------------------------------------------------------------------+
| Total Auto Encoding suggestions produced for 22 columns |
| Queries executed: 9 |
| Total query execution time: 621.48 ms |
| Most recent query executed on: Wednesday 22nd March 2023 19:48:25 |
| Oldest query executed on: Wednesday 22nd March 2023 19:47:07 |
| |
| CURRENT SUGGESTED |
| COLUMN COLUMN COLUMN |
| NAME ENCODING ENCODING |
| ------ -------- --------- |
| `airportdb`.`airline`.`airlinename` VARLEN DICTIONARY |
| `airportdb`.`airplane_type`.`description` VARLEN DICTIONARY |
| `airportdb`.`airplane_type`.`identifier` VARLEN DICTIONARY |
| `airportdb`.`airport`.`name` VARLEN DICTIONARY |
| `airportdb`.`airport_geo`.`city` VARLEN DICTIONARY |
| `airportdb`.`airport_geo`.`country` VARLEN DICTIONARY |
| `airportdb`.`airport_geo`.`name` VARLEN DICTIONARY |
| `airportdb`.`employee`.`city` VARLEN DICTIONARY |
| `airportdb`.`employee`.`country` VARLEN DICTIONARY |
| `airportdb`.`employee`.`emailaddress` VARLEN DICTIONARY |
| `airportdb`.`employee`.`lastname` VARLEN DICTIONARY |
| `airportdb`.`employee`.`password` VARLEN DICTIONARY |
| `airportdb`.`employee`.`street` VARLEN DICTIONARY |
| `airportdb`.`employee`.`telephoneno` VARLEN DICTIONARY |
| `airportdb`.`employee`.`username` VARLEN DICTIONARY |
| `airportdb`.`passenger`.`lastname` VARLEN DICTIONARY |
| `airportdb`.`passenger`.`passportno` VARLEN DICTIONARY |
| `airportdb`.`passengerdetails`.`city` VARLEN DICTIONARY |
| `airportdb`.`passengerdetails`.`country` VARLEN DICTIONARY |
| `airportdb`.`passengerdetails`.`emailaddress` VARLEN DICTIONARY |
| `airportdb`.`passengerdetails`.`street` VARLEN DICTIONARY |
| `airportdb`.`passengerdetails`.`telephoneno` VARLEN DICTIONARY |
| |
| Applying the suggested encodings might improve cluster memory usage. Performance gains not expected. |
| Estimated HeatWave cluster memory savings: 0 bytes |
| |
+------------------------------------------------------------------------------------------------------+
36 rows in set (0.0110 sec)
+----------------------------------------------------------------------------------------------------------------+
| SCRIPT GENERATION |
+----------------------------------------------------------------------------------------------------------------+
| Script generated for applying suggestions for 7 loaded tables |
| |
| Applying changes will take approximately 5.00 s |
| |
| Retrieve script containing 57 generated DDL commands using the query below: |
| Deprecation Notice: "heatwave_advisor_report" will be deprecated, please switch to "heatwave_autopilot_report" |
| SELECT log->>"$.sql" AS "SQL Script" FROM sys.heatwave_autopilot_report WHERE type = "sql" ORDER BY id; |
| |
| Caution: Executing the generated script will alter the column comment and secondary engine flags in the schema |
| |
+----------------------------------------------------------------------------------------------------------------+
You can generate a single string to cut & paste using:
SET SESSION group_concat_max_len = 1000000;
SELECT GROUP_CONCAT(log->>"$.sql" SEPARATOR ' ')
FROM sys.heatwave_autopilot_report
WHERE type = "sql" ORDER BY id;
After having performed all the recommended DDLs, if we run again the advisor, we can see that there are no more encoding suggestions:
CALL sys.heatwave_advisor(JSON_OBJECT("auto_enc",JSON_OBJECT("mode","recommend")));
+-------------------------------+
| INITIALIZING HEATWAVE ADVISOR |
+-------------------------------+
| Version: 1.44 |
| |
| Output Mode: normal |
| Excluded Queries: 0 |
| Target Schemas: All |
| |
+-------------------------------+
6 rows in set (0.0087 sec)
+---------------------------------------------------------+
| ANALYZING LOADED DATA |
+---------------------------------------------------------+
| Total 14 tables loaded in HeatWave for 1 schemas |
| Tables excluded by user: 0 (within target schemas) |
| |
| SCHEMA TABLES COLUMNS |
| NAME LOADED LOADED |
| ------ ------ ------ |
| `airportdb` 14 92 |
| |
+---------------------------------------------------------+
8 rows in set (0.0087 sec)
+------------------------------------------+
| ENCODING SUGGESTIONS |
+------------------------------------------+
| No encoding suggestions can be generated |
| Current encodings found to be the best |
+------------------------------------------+
2 rows in set (0.0087 sec)
Query OK, 0 rows affected (0.0087 sec)
Auto Data Placement
This advisor generates recommendations about data placement keys that are used to partition table data among the different HeatWave nodes:
CALL sys.heatwave_advisor(JSON_OBJECT("target_schema",JSON_ARRAY("airportdb")));
Of course, you need a MySQL HeatWave Cluster of at least 2 nodes to use this advisor.
Query Insights
This last advisor, doesn’t really provide recommendations, but returns runtime data for successfully executed queries, runtime estimates for EXPLAIN queries, cancelled queries (ctrl+c) and failed queries due to an out-of-memory error.
This is how to call the Query Insights Advisor:
CALL sys.heatwave_advisor(JSON_OBJECT("query_insights", TRUE));
This is an output example:
+-------------------------------+
| INITIALIZING HEATWAVE ADVISOR |
+-------------------------------+
| Version: 1.44 |
| |
| Output Mode: normal |
| Excluded Queries: 0 |
| Target Schemas: All |
| |
+-------------------------------+
6 rows in set (0.0086 sec)
+---------------------------------------------------------+
| ANALYZING LOADED DATA |
+---------------------------------------------------------+
| Total 14 tables loaded in HeatWave for 1 schemas |
| Tables excluded by user: 0 (within target schemas) |
| |
| SCHEMA TABLES COLUMNS |
| NAME LOADED LOADED |
| ------ ------ ------ |
| `airportdb` 14 92 |
| |
+---------------------------------------------------------+
8 rows in set (0.0086 sec)
+--------------------------------------------------------------------------------------------------------------------+
| QUERY INSIGHTS |
+--------------------------------------------------------------------------------------------------------------------+
| Queries executed on Heatwave: 9 |
| Session IDs (as filter): None |
| |
| QUERY-ID SESSION-ID QUERY-STRING EXEC-RUNTIME (s) COMMENT |
| -------- ---------- ------------ ---------------- ------- |
| 1 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.447 |
| 2 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.024 (est.) Explain. |
| 3 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.024 (est.) Explain. |
| 4 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.024 (est.) Explain. |
| 5 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.175 |
| 6 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.019 (est.) Explain. |
| 7 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.019 (est.) Explain. |
| 8 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.019 (est.) Explain. |
| 9 950 SELECT airlinename 'Airline Name', SUM(sold_seat)/SU... 0.019 (est.) Explain. |
| |
| TOTAL ESTIMATED: 7 EXEC-RUNTIME: 0.150 sec |
| TOTAL EXECUTED: 2 EXEC-RUNTIME: 0.621 sec |
| |
| |
| Retrieve detailed query statistics using the query below: |
| SELECT log FROM sys.heatwave_autopilot_report WHERE stage = "QUERY_INSIGHTS" AND type = "info"; |
| |
+--------------------------------------------------------------------------------------------------------------------+
22 rows in set (0.0086 sec)
Conclusion
MySQL HeatWave is very powerful and now you know how to control it and verify that your queries benefit from it.
You have learn how to monitor the usage of HeatWave, and how to use to Machine Learning Autopilot Advisors to improve your HeatWave experience !
For even more information on how to monitor HeatWave, please check the manual.
[…] For more information about how to manage HeatWave engine, check this article. […]