Optimizing MySQL Bottlenecks


This article was written for and published on SitePoint.

Introduction

MySQL is one of the most widely used databases in conjunction with PHP. Making sure your MySQL databases run at their best is one of the most important things to consider as your web application grows.

In this series of three articles, we will look at how to optimize MySQL. We will cover which optimizations we can perform on our database, which on our MySQL configuration, and how to find potential problems when MySQL is not performing well. We will mostly use tools from the Percona Toolkit to work with our database. This article focuses on discovering the causes of problems.

Activating the slow log

First of all, we need to activate the slow log. In the slow log, MySQL reports any query which takes longer than a configured amount of time. It can also report any query not using an index.

Activating the slow log is easy. All you have to do is edit the MySQL config file (in most cases located at /etc/mysql/my.cnf) and add the following lines:

log_slow_queries    = /var/log/mysql/mysql-slow.log
long_query_time = 1
log-queries-not-using-indexes

Any query which takes longer than a second, or which does not use an index, is added to the log.
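Note that in more recent MySQL versions (5.6 and later), the log_slow_queries option has been removed in favour of slow_query_log. If the configuration above is not accepted, the equivalent settings look roughly like this (a sketch; check the manual for your exact version):

slow_query_log                = 1
slow_query_log_file           = /var/log/mysql/mysql-slow.log
long_query_time               = 1
log_queries_not_using_indexes = 1

You can also enable the slow log at runtime with SET GLOBAL slow_query_log = 'ON'; if you prefer not to restart the server.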

Analyzing your slow log

You can analyze your slow log by hand, going through each and every query. You might, however, prefer to use an automated tool like pt-query-digest. With this tool, you can analyze your slow log, and it will automatically rank your queries by running time.
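On a busy server the slow log grows quickly, so it can help to restrict what gets analyzed. pt-query-digest has options such as --since and --limit for this; the invocation below is only an illustrative sketch, so verify the flags against your Percona Toolkit version:

pt-query-digest --since '2014-02-13 00:00:00' --limit 10 /var/log/mysql/mysql-slow.log

This would report only events from that date onwards and cut the profile off at the ten worst queries.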

Let’s run pt-query-digest /var/log/mysql/mysql-slow.log.

# 360ms user time, 20ms system time, 24.66M rss, 92.02M vsz
# Current date: Thu Feb 13 22:39:29 2014
# Hostname: *
# Files: mysql-slow.log
# Overall: 8 total, 6 unique, 1.14 QPS, 0.00x concurrency ________________
# Time range: 2014-02-13 22:23:52 to 22:23:59
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Exec time            3ms   267us   406us   343us   403us    39us   348us
# Lock time          827us    88us   125us   103us   119us    12us    98us
# Rows sent             36       1      15    4.50   14.52    4.18    3.89
# Rows examine          87       4      30   10.88   28.75    7.37    7.70
# Query size         2.15k     153     296  245.11  284.79   48.90  258.32

Note that I had 8 slow queries, 6 of them unique. Next we see a table showing how long various operations took. Pay particular attention to the 95% column, since it shows the 95th percentile, which is a more reliable indicator than the average, which can be skewed by a few outliers.

# Profile
# Rank Query ID           Response time Calls R/Call V/M   Item
# ==== ================== ============= ===== ====== ===== ===============
#    1 0x728E539F7617C14D  0.0011 41.0%     3 0.0004  0.00 SELECT blog_article
#    2 0x1290EEE0B201F3FF  0.0003 12.8%     1 0.0003  0.00 SELECT portfolio_item
#    3 0x31DE4535BDBFA465  0.0003 12.6%     1 0.0003  0.00 SELECT portfolio_item
#    4 0xF14E15D0F47A5742  0.0003 12.1%     1 0.0003  0.00 SELECT portfolio_category
#    5 0x8F848005A09C9588  0.0003 11.8%     1 0.0003  0.00 SELECT blog_category
#    6 0x55F49C753CA2ED64  0.0003  9.7%     1 0.0003  0.00 SELECT blog_article

Now it gets interesting: these are the queries which were reported as slow. Note that in my example the queries are not really that slow; they were reported because no index was being used. Let's look at the first query in more detail.

# Query 1: 0 QPS, 0x concurrency, ID 0x728E539F7617C14D at byte 736 ______
# Scores: V/M = 0.00
# Time range: all events occurred at 2014-02-13 22:23:52
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count         37       3
# Exec time     40     1ms   352us   406us   375us   403us    22us   366us
# Lock time     42   351us   103us   125us   117us   119us     9us   119us
# Rows sent     25       9       1       4       3    3.89    1.37    3.89
# Rows examine  24      21       5       8       7    7.70    1.29    7.70
# Query size    47   1.02k     261     262  261.25  258.32       0  258.32
# String:
# Hosts        localhost
# Users        *
# Query_time distribution
#   1us
#  10us
# 100us  ################################################################
#   1ms
#  10ms
# 100ms
#    1s
#  10s+
# Tables
#    SHOW TABLE STATUS LIKE 'blog_article'\G
#    SHOW CREATE TABLE `blog_article`\G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT b0_.id AS id0, b0_.slug AS slug1, b0_.title AS title2, b0_.excerpt AS excerpt3, b0_.external_link AS external_link4, b0_.description AS description5, b0_.created AS created6, b0_.updated AS updated7 FROM blog_article b0_ ORDER BY b0_.created DESC LIMIT 10\G

This is probably the most important part of the output. Here we see the first ranked query. Once again, check the 95% column. You can see how long the query took and how much data was sent. You can also see in which time bucket this query falls; in this case, between 100us and 1ms. Lastly, we see the actual query that was run.

For every ranked query, you will get feedback like this. You can use this feedback to analyze which queries lack an index or are genuinely slow.
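As a hypothetical follow-up on the first ranked query above, you could run EXPLAIN to confirm that the ORDER BY on blog_article.created forces a full scan with a filesort, and then add an index on that column. The statements below are a sketch based on the column names visible in the digest output, not on the actual schema used in this article:

EXPLAIN SELECT id, slug, title, created
FROM blog_article
ORDER BY created DESC
LIMIT 10\G

-- If the plan shows "Using filesort" and no key, an index on `created` usually helps:
ALTER TABLE blog_article ADD INDEX idx_created (created);

After adding the index, a fresh run of pt-query-digest on a new slow log should show the query dropping out of the report.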

Analyzing MySQL when problems occur

Another interesting Percona tool is pt-stalk. With this tool, you can get an overview of what MySQL is doing at a given moment. Let's say that today we want to check the state of MySQL four times. We can do this by running the following command (the sleep value of 21600 seconds equals six hours, so four iterations cover a full day):

pt-stalk --sleep=21600 --threshold=0 --iterations=4
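Recent versions of pt-stalk also provide a --no-stalk option, which collects data right away instead of waiting for a trigger condition. Assuming such a version, the run above could also be written as the following sketch (verify the option against your toolkit version):

pt-stalk --no-stalk --sleep=21600 --iterations=4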

After every iteration, pt-stalk saves all kinds of data to the /var/lib/pt-stalk/ directory. In this directory, you will find a list of files:

-rw-r--r--  1 root root  11220 Feb 22 14:52 2014_02_22_14_51_35-df
-rw-r--r--  1 root root    121 Feb 22 14:52 2014_02_22_14_51_35-disk-space
-rw-r--r--  1 root root  42870 Feb 22 14:52 2014_02_22_14_51_35-diskstats
-rw-r--r--  1 root root      9 Feb 22 14:52 2014_02_22_14_51_35-hostname
-rw-r--r--  1 root root   3968 Feb 22 14:51 2014_02_22_14_51_35-innodbstatus1
-rw-r--r--  1 root root   3969 Feb 22 14:52 2014_02_22_14_51_35-innodbstatus2
-rw-r--r--  1 root root  49980 Feb 22 14:52 2014_02_22_14_51_35-interrupts
-rw-r--r--  1 root root   4146 Feb 22 14:51 2014_02_22_14_51_35-log_error
-rw-r--r--  1 root root  69763 Feb 22 14:51 2014_02_22_14_51_35-lsof
-rw-r--r--  1 root root  36420 Feb 22 14:52 2014_02_22_14_51_35-meminfo
-rw-r--r--  1 root root     82 Feb 22 14:51 2014_02_22_14_51_35-mutex-status1
-rw-r--r--  1 root root     82 Feb 22 14:52 2014_02_22_14_51_35-mutex-status2
-rw-r--r--  1 root root 559349 Feb 22 14:52 2014_02_22_14_51_35-mysqladmin
-rw-r--r--  1 root root 139723 Feb 22 14:52 2014_02_22_14_51_35-netstat
-rw-r--r--  1 root root 104400 Feb 22 14:52 2014_02_22_14_51_35-netstat_s
-rw-r--r--  1 root root  12542 Feb 22 14:51 2014_02_22_14_51_35-opentables1
-rw-r--r--  1 root root  12542 Feb 22 14:52 2014_02_22_14_51_35-opentables2
-rw-r--r--  1 root root    810 Feb 22 14:52 2014_02_22_14_51_35-output
-rw-r--r--  1 root root   9380 Feb 22 14:51 2014_02_22_14_51_35-pmap
-rw-r--r--  1 root root  34134 Feb 22 14:52 2014_02_22_14_51_35-processlist
-rw-r--r--  1 root root  43504 Feb 22 14:52 2014_02_22_14_51_35-procstat
-rw-r--r--  1 root root  61620 Feb 22 14:52 2014_02_22_14_51_35-procvmstat
-rw-r--r--  1 root root  11379 Feb 22 14:51 2014_02_22_14_51_35-ps
-rw-r--r--  1 root root 335970 Feb 22 14:52 2014_02_22_14_51_35-slabinfo
-rw-r--r--  1 root root  26524 Feb 22 14:51 2014_02_22_14_51_35-sysctl
-rw-r--r--  1 root root  11468 Feb 22 14:51 2014_02_22_14_51_35-top
-rw-r--r--  1 root root    379 Feb 22 14:51 2014_02_22_14_51_35-trigger
-rw-r--r--  1 root root   8181 Feb 22 14:51 2014_02_22_14_51_35-variables
-rw-r--r--  1 root root   2652 Feb 22 14:52 2014_02_22_14_51_35-vmstat
-rw-r--r--  1 root root    312 Feb 22 14:52 2014_02_22_14_51_35-vmstat-overall

The file names already tell you what to expect inside: data like disk space, the process list, top output, and so on. pt-stalk saves a set of files like this for every iteration.

Well, this is all interesting, but it would be more useful if we could collect this data at the moment something actually goes wrong. Luckily, we can configure pt-stalk to watch for certain thresholds.

Let's say we want to check what is going on right at the moment we reach 100 connections (in this example, counted as processlist entries coming from localhost).

pt-stalk --function processlist --variable Host --match=localhost --threshold 100 --daemonize

By adding --daemonize, the tool will run in the background until you tell it to quit. The other parameters make sure that logging starts as soon as the processlist contains 100 matching connections. The files you get from this are exactly the same as shown above.

Of course, you can check for other conditions. If you know what the problem is, you can configure pt-stalk to start logging as soon as it occurs, and then analyze the collected files to see whether something strange was going on.
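For example, to trigger on the server-wide connection count rather than on connections from a single host, pt-stalk can watch a SHOW GLOBAL STATUS counter such as Threads_connected. The command below is a sketch along those lines; double-check the option names against your pt-stalk version:

pt-stalk --function status --variable Threads_connected --threshold 100 --daemonize

As soon as Threads_connected exceeds 100, the same set of diagnostic files is collected.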

Conclusion

This concludes our Percona Toolkit overview. We have seen a number of tools from the Percona Toolkit which we can use to optimize or check our MySQL configuration and database. If you are interested in other Percona tools we haven't covered yet, have a look at the Percona Toolkit documentation to get an idea of what else can be done.
