Migrating to MySQL 8.0 for WordPress – episode 3: query optimization

Now that MySQL 8.0.3 RC1 is installed and that we saw how to verify the workload, it’s time to see if we can optimize some of the queries. As explained before, rewriting queries when using a product like WordPress is complicated but maybe we can do something for the indexes ?

So, do you remember how to check the query that was consuming most resources ? Let’s find it out again:

mysql> select t1.*, QUERY_SAMPLE_TEXT from statement_analysis as t1 
                join performance_schema.events_statements_summary_by_digest as t2 
                on t2.digest=t1.digest and t2.SCHEMA_NAME = t1.db where db = 'wp_lefred' 
                limit 1\G
*************************** 1. row ***************************
            query: SELECT `option_name` , `option ... options` WHERE `autoload` = ? 
               db: wp_lefred
        full_scan: *
       exec_count: 103
        err_count: 0
       warn_count: 0
    total_latency: 2.97 s
      max_latency: 600.21 ms
      avg_latency: 28.82 ms
     lock_latency: 39.46 ms
        rows_sent: 28825
    rows_sent_avg: 280
    rows_examined: 208849
rows_examined_avg: 2028
    rows_affected: 0
rows_affected_avg: 0
       tmp_tables: 0
  tmp_disk_tables: 0
      rows_sorted: 0
sort_merge_passes: 0
           digest: 52292f0ae858595a6dfe100f30207c9f
       first_seen: 2017-10-24 14:59:50.492922
        last_seen: 2017-10-24 23:45:26.192839
QUERY_SAMPLE_TEXT: SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'

Nothing special… let’s check the Query Execution Plan (QEP):

mysql> explain SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wp_options
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1551
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.02 sec)

Hummm… so here we can definitely see that this is a full table scan (type: ALL) and it scans 1551 rows.

Let’s now verify the table’s structure:

mysql> show create table wp_options\G
*************************** 1. row ***************************
       Table: wp_options
Create Table: CREATE TABLE `wp_options` (
  `option_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `option_name` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `option_value` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
  `autoload` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'yes',
  PRIMARY KEY (`option_id`),
  UNIQUE KEY `option_name` (`option_name`)
) ENGINE=InnoDB AUTO_INCREMENT=980655 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
1 row in set (0.08 sec)

OK so it’s clear that autoload is not indexed… and that the type doesn’t seems super adequate…

Let’s verify:

mysql> select distinct autoload FROM wp_options;
+----------+
| autoload |
+----------+
| yes      |
| no       |
+----------+
2 rows in set (0.00 sec)

mysql> select  autoload, count(*) FROM wp_options group by autoload;
+----------+----------+
| autoload | count(*) |
+----------+----------+
| yes      |      280 |
| no       |     1309 |
+----------+----------+
2 rows in set (0.00 sec)

Now let’s run the query and check if MySQL performs really a full table scan:

mysql> flush status;
Query OK, 0 rows affected (0.07 sec)

mysql> SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
...
mysql> show status like 'ha%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 2     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 1     |
| Handler_read_key           | 1     |
| Handler_read_last          | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 1590  |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.06 sec)

Handler_read_rnd_next is incremented when the server performs a full table scan and this is a
counter you don’t really want to see with a high value. So indeed in our case we perform a full table scan.

The QEP can also be more detailed when using the JSON format:

mysql> explain format=JSON SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
*************************** 1. row ***************************
EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "202.85"
    },
    "table": {
      "table_name": "wp_options",
      "access_type": "ALL",
      "rows_examined_per_scan": 1546,
      "rows_produced_per_join": 154,
      "filtered": "10.00",
      "cost_info": {
        "read_cost": "187.39",
        "eval_cost": "15.46",
        "prefix_cost": "202.85",
        "data_read_per_join": "131K"
      },
      "used_columns": [
        "option_name",
        "option_value",
        "autoload"
      ],
      "attached_condition": "(`wp_lefred`.`wp_options`.`autoload` = 'yes')"
    }
  }
}
1 row in set, 1 warning (0.03 sec)

This is already enough information for this query, but we could have even more details enabling the OPTIMIZER TRACE:

mysql> SET OPTIMIZER_TRACE = "enabled=on";
Query OK, 0 rows affected (0.01 sec)

mysql> explain SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wp_options
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1546
     filtered: 10.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)


mysql> select * from INFORMATION_SCHEMA.OPTIMIZER_TRACE\G
*************************** 1. row ***************************
                            QUERY: explain SELECT option_name, option_value 
                                   FROM wp_options WHERE autoload = 'yes'
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `wp_options`.`option_name` AS `option_name`,
            `wp_options`.`option_value` AS `option_value` from `wp_options` 
             where (`wp_options`.`autoload` = 'yes')"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(`wp_options`.`autoload` = 'yes')",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(`wp_options`.`autoload` = 'yes')"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(`wp_options`.`autoload` = 'yes')"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(`wp_options`.`autoload` = 'yes')"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {
            }
          },
          {
            "table_dependencies": [
              {
                "table": "`wp_options`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ]
              }
            ]
          },
          {
            "ref_optimizer_key_uses": [
            ]
          },
          {
            "rows_estimation": [
              {
                "table": "`wp_options`",
                "table_scan": {
                  "rows": 1546,
                  "cost": 48.25
                }
              }
            ]
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`wp_options`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 1546,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 0.1,
                      "access_type": "scan",
                      "resulting_rows": 154.6,
                      "cost": 202.85,
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 154.6,
                "cost_for_plan": 202.85,
                "chosen": true
              }
            ]
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(`wp_options`.`autoload` = 'yes')",
              "attached_conditions_computation": [
              ],
              "attached_conditions_summary": [
                {
                  "table": "`wp_options`",
                  "attached": "(`wp_options`.`autoload` = 'yes')"
                }
              ]
            }
          },
          {
            "refine_plan": [
              {
                "table": "`wp_options`"
              }
            ]
          }
        ]
      }
    },
    {
      "join_explain": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
          INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.01 sec)

For this particular easy query, this is not important but that can be useful. Particularly when you know that you can influence on the cost model.

Optimizing the query

It’s then clear that we could benefit from an index here (and maybe reduce the size of the field, but I won’t modify table structures for now).

Let’s create an index on autoload:

mysql> alter table wp_options add index autoload_idx(autoload);

And we can verify the QEP with the new index:

mysql> explain SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wp_options
   partitions: NULL
         type: ref
possible_keys: autoload_idx
          key: autoload_idx
      key_len: 82
          ref: const
         rows: 280
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

Don’t forget that with MySQL 8.0 it’s also possible to set some indexes invisible. Good candidates are the indexes returned by these queries:

mysql> select * from schema_unused_indexes where object_schema='wp_lefred';
+---------------+-------------+--------------+
| object_schema | object_name | index_name   |
+---------------+-------------+--------------+
| wp_lefred     | wp_links    | link_visible |
| wp_lefred     | wp_termmeta | term_id      |
| wp_lefred     | wp_termmeta | meta_key     |
| wp_lefred     | wp_users    | user_email   |
+---------------+-------------+--------------+

mysql> select * from schema_redundant_indexes where table_schema='wp_lefred';
Empty set (0.20 sec)

This post now concludes the migration to MySQL 8.0 for WordPress series and I hope this article will help you finding which queries need to be optimized !

 

Subscribe to Blog via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

7 Comments

  1. Instead of using Handler status variables, I suggest checking the table_io_waits_summary_by_index_usage table in Performance Schema. Then you will get the number of accesses per table/index. Particularly useful for queries that access multiple tables since handler status variables will only show total numbers for query. Another advantage is that you do not need to remember what those cryptic variables names mean.

  2. Given that the performance_schema capacity is limited, I still prefer the slow query log when searching for “expensive” statements.
    But with the standard tool “mysqldumpslow”, it can’t be really evaluated as bug#79828 (reported in Jan, 2016!) is still not fixed, even though I provided a patch in my (duplicate) bug#83777 report. I still wonder why such low hanging fruit are not picked …
    So the alternative is to use Percona’s “pt-query-digest”, and it gives the bonus of its overall statistics, so the user can easily see the statement efforts in perspective.

    • Hi Jörg,

      Thank you for the comment.

      PFS can be extended and you can even consune it and add it into a table or (what I like to do) send it into ES for example.

      But of course I really like pt-query-digest but it works very well when you enable the extra information available in PS in the slow query log.

      About the bug, I’m checking in out internal system and I see it has been marked as duplicate of a fixed bug in 8.0.1 and 5.7.18 already but I need to double check.

Leave a Reply

Your email address will not be published. Required fields are marked *

As MySQL Community Manager, I am an employee of Oracle and the views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

You can find articles I wrote on Oracle’s blog.