Optimizing MySQL Indexes

on

This article is written for and published on SitePoint.

Introduction

MySQL is one of the most used databases in conjunction with PHP. Making sure that your MySQL databases are running at their best is one of the most important aspects you have to consider whenever your web application grows.

In this series of 3 standalone articles, we will have a look at how we can optimize our MySQL installation. We will take a look at which optimizations we can perform on our database, on our MySQL configuration and how we can find potential problems when MySQL is not performing well. We will be using mostly tools from the Percona Toolkit, to work with our database. This article will focus on improving our indexing.

An introduction to indexes

If you haven’t worked with MySQL indexes by now, you probably have a very small or a badly performing MySQL database. You can compare MySQL indexes with the index in a book. Within the index of a book, you can easily find the correct page that contains the subject you were looking for. If there weren’t any indexes, you had to go through the whole book, searching for pages that contain the subject.

As you can imagine, it’s way faster to search by an index than having to go through each page. Therefore, adding indexes to your database is in general speeding up your select queries. However, the index also has to be created and stored. So the update and insert queries will be slower and it will cost you a bit more disk space. In general, you won’t notice the difference with updating and inserting if you have indexed your table correctly and therefore it’s advisable to add indexes at the right locations.

Tables which only contain a few rows don’t really benefit from indexing. You can imagine that searching through 5 pages is not much slower then first going to the index, getting the page number and then opening that particular that page.

For more information on implementing indexes, please see our previous articles on the matter. Within this article we will continue to see how we can actually improve our current indexes.

Finding duplicate indexes

Duplicate indexes will not necessarily slow down your select queries. However, they can slow down your insert and update queries and can cost you more disk space. In general, it’s better to avoid having duplicate keys.

In the Percona Toolkit, we can find a tool called pt-duplicate-key-checker. This tool will analyze your database and return which tables contain duplicate keys. Imagine we have the following table:

CREATE TABLE `user` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `firstname` varchar(255) NOT NULL,
 `lastname` varchar(255) NOT NULL,
 PRIMARY KEY (`id`),
 KEY `id_firstname` (`id`,`firstname`),
 KEY `firstname` (`firstname`),
 KEY `firstname_2` (`firstname`),
 KEY `firstname_lastname_id` (`firstname`,`lastname`,`id`)
) ENGINE=InnoDB

Within this table, we can find 3 duplicate keys. Let’s run pt-duplicate-key-checker to see which keys are actually duplicate.

# ########################################################################
# user                                                        
# ########################################################################

# firstname_2 is a left-prefix of firstname_lastname_id
# Key definitions:
#   KEY `firstname_2` (`firstname`),
#   KEY `firstname_lastname_id` (`firstname`,`lastname`,`id`)
# Column types:
#      `firstname` varchar(255) not null
#     `lastname` varchar(255) not null
#     `id` int(11) not null auto_increment
# To remove this duplicate index, execute:
ALTER TABLE `user` DROP INDEX `firstname_2`;

# firstname is a left-prefix of firstname_lastname_id
# Key definitions:
#   KEY `firstname` (`firstname`),
#   KEY `firstname_lastname_id` (`firstname`,`lastname`,`id`)
# Column types:
#     `firstname` varchar(255) not null
#     `lastname` varchar(255) not null
#     `id` int(11) not null auto_increment
# To remove this duplicate index, execute:
ALTER TABLE `user` DROP INDEX `firstname`;

# Key firstname_lastname_id ends with a prefix of the clustered index
# Key definitions:
#   KEY `firstname_lastname_id` (`firstname`,`lastname`,`id`)
#   PRIMARY KEY (`id`),
# Column types:
#     `firstname` varchar(255) not null
#     `lastname` varchar(255) not null
#     `id` int(11) not null auto_increment
# To shorten this duplicate clustered index, execute:
ALTER TABLE `user` DROP INDEX `firstname_lastname_id`, ADD INDEX `firstname_lastname_id` (`firstname`,`lastname`);

# ########################################################################
# Summary of indexes                                                      
# ########################################################################

# Size Duplicate Indexes   1032
# Total Duplicate Indexes  3
# Total Indexes            5

Although the index firstname is a duplicate of firstname_2, they are both a duplicate of the firstname_lastname_id key. Why is that? It’s because firstname is a so called leftmost prefix of firstname_lastname_id. When you generate an index on 3 columns (A, B, C) you basically have 3 indexes covered:

  • A
  • A, B
  • A, B, C

So since firstname is the first column in the firstname_lastname_id index, it has no effect to create a separate index.

The last found index might be confusing. According to the tool, the id can be removed. In this case, it is because we are using the InnoDB engine. InnoDB uses a clustered index where the data for the row is stored. In general InnoDB will use the primary key as the clustered index. Therefore, InnoDB will already be using the index on that column and there is no need to add it to this index.

For more information:
Multiple column indexes
InnoDB index types

Finding unused indexes

Of course you want to avoid duplicate keys as much as possible, but how about unused keys? For that, Percona Toolkit contains a tool called pt-index-usage. Based on a query log, this tool will run an explain against all found queries. For that, you need to make sure you are logging queries in your slow log or general log for example.

Let’s run pt-index-usage /var/log/mysql/mysql-slow.log and see what the output is.

ALTER TABLE `user` DROP KEY `lastname_index`; -- type:non-unique

According to the explain that has been run, this index is not used – we could just remove the index. However, it might be that your application has one or more queries which are using the key, and the query using it simply hasn’t been logged yet. So I wouldn’t suggest to remove the keys without doing a manual check.

Also note that this tool can be quite resource hungry, because it’s running explains against each query. It is therefore not advisable to run this tool in a production environment.

Conclusion

Indexes can be very important for an optimal MySQL performance. Making sure that your indexes are in good shape is as important as adding them.

Leave a Reply

Your email address will not be published. Required fields are marked *