You are here
MySQL Query Cache and InnoDB Considerations
In this article we will be going over a higher level view of MySQL optimization - InnoDB versus MyISAM, whether or not you should use the Query Cache, and the new InnoDB Barracuda file format. We discuss what to use and when, which is important for tuning your own my.cnf configuration.
MyISAM versus InnoDB
While InnoDB is, in general, the superior database engine, MyISAM still possesses an advantage in certain limited situations.
- MyISAM's overhead is smaller, meaning that tables with fixed-width data will be smaller.
- MyISAM doesn't require a shared tablespace to do tracking, which additionally reduces required disk I/O.
- While for variable-length tables, InnoDB's compression algorithm can make tables smaller than MyISAM... the superior solution is to compress the data on input.
- Setting concurrent_insert=2 means that MyISAM will process multiple inserts at a time without issue, no matter what rows are deleted.
This makes MyISAM superior for a limited set of activities. Logging, in particular, is a fairly attractive use for MyISAM - not terribly critical, and you are merely appending a lot of data in sequential order. Something which MyISAM is well suited for.
That is about all, however. If your table allows UPDATE statements (editing, pm status updates, and so on), then these will force a table lock. Enough of these occur at once, and your site is effectively down at its most active moments.
Never a good situation.
If the table in question is doing anything besides getting INSERT and DELETE statements, e.g. log tables, then you should be working with InnoDB at this point.
The Query Cache
The easiest way to check to see if the Query Cache is doing you any good is to enable it, give it enough space to work with, then check the status of it, e.g.
mysql> show status like 'Qcache%';
+-------------------------+------------+
| Variable_name | Value |
+-------------------------+------------+
| Qcache_free_blocks | 1839 |
| Qcache_free_memory | 94658784 |
| Qcache_hits | 1123738642 |
| Qcache_inserts | 584027197 |
| Qcache_lowmem_prunes | 1775317 |
| Qcache_not_cached | 144840211 |
| Qcache_queries_in_cache | 105508 |
| Qcache_total_blocks | 213789 |
+-------------------------+------------+
8 rows in set (0.00 sec)If you are getting lowmem_prunes, you may need to first assign more space to the query cache. In addition, the space you have might be too fragmented for MySQL to work with, necessitating that you regularly run the MySQL command
FLUSH QUERY CACHE;Every so often. I run it twice an hour through cron.
Finally, if you are running Sphinx, the reindexing might be blowing through your Query cache. The solution here is to set query_cache_limit to a lower value - I use 256k:
# Query cache
query_cache_type = 1
query_cache_limit = 256K
query_cache_size = 256M
query_cache_min_res_unit = 4KAfter you've done this, and after you have assigned 'enough' RAM to the Query Cache - more than 256 megabytes is currently regarded as a bad idea (though 512MB might be okay in the future) - check the status after your server has been running for awhile as above.
Qcache_hits / Qcache_inserts = Your Query Cache's effectiveness.
A number less than or equal to one basically means that you aren't actually getting much out of the query cache - on average, the entry gets pruned before it gets used. A number greater than 1 means that the query cache is worthwhile to you - for now.
There is genuine talk of improving the Query Cache's design in future versions of MySQL - or removing it entirely. What is here should be taken as advice for MySQL versions up to 5.5, and possibly further if nothing improves.
The New InnoDB File Format and Compression
The new file format is available by default as of the version of InnoDB in 5.4.2. In MySQL 5.1, you will have to load the plugin instead:
# InnoDB
ignore_builtin_innodb
plugin-load=innodb=ha_innodb_plugin.so;innodb_trx=ha_innodb_plugin.so;innodb_locks=ha_innodb_plugin.so;innodb_lock_waits=ha_innodb_plugin.so;innodb_cmp=ha_innodb_plugin.so;innodb_cmp_reset=ha_innodb_plugin.so;innodb_cmpmem=ha_innodb_plugin.so;innodb_cmpmem_reset=ha_innodb_plugin.so
default-storage-engine = InnoDB
innodb_file_per_table
innodb_file_format = barracudaAgain, the plugin is not needed as of MySQL 5.4.2 in order to enable the new file format or take advantage of other new features in InnoDB. The new Barracuda format requires the use of the innodb_file_per_table option, as the shared tablespace is stored in the original Antelope format.
This format enables two additional row formats for you to store your data in.
Dynamic rows stores long columns (BLOB, TEXT, VARCHAR) that don't fit into a single page as a twenty-byte pointer to the overflow page. When the row size exceeds page size, the longest columns get replaced by the overflow page pointer, until the column fits. This allows InnoDB tables to be more efficient when not actually analyzing these columns, as they do not need to be loaded into the buffer pool. This differs from the previous COMPACT format, which would store the first 768 bytes of these columns in-table, with the 20-byte pointer following, regardless of whether or not it made a good fit.
You can change a table's row format to DYNAMIC by running:
ALTER TABLE table_name ROW_FORMAT=DYNAMIC;
<code>
<b>Compressed</b> tables require a bit more discussion. They are basically Dynamic tables, but InnoDB will attempt to compress the contents of a row in a table and see if the results fit in the KEY_BLOCK_SIZE. If they don't, they're stored in the overflow page as with Dynamic, although the overflow page is also compressed. Specifying too large a value of KEY_BLOCK_SIZE will waste some space, but specifying too small a value can cause InnoDB to split nodes needlessly, and a lot of data will be forced into the overflow page. This can dramatically reduce write performance, I have discovered.
<code>
mysql> select * from information_schema.INNODB_CMP;
+-----------+--------------+-----------------+---------------+----------------+-----------------+
| page_size | compress_ops | compress_ops_ok | compress_time | uncompress_ops | uncompress_time |
+-----------+--------------+-----------------+---------------+----------------+-----------------+
| 1024 | 0 | 0 | 0 | 0 | 0 |
| 2048 | 0 | 0 | 0 | 0 | 0 |
| 4096 | 3357554 | 1512129 | 1889 | 219238221 | 26632 |
| 8192 | 1039670 | 1031859 | 464 | 87310383 | 15538 |
| 16384 | 0 | 0 | 0 | 0 | 0 |
+-----------+--------------+-----------------+---------------+----------------+-----------------+
5 rows in set (0.00 sec)Information on using the compression information schema tables can be found here.
In a nutshell, however, you can see that, according to MySQL's documentation, my situation here isn't particularly ideal. I have three main compressed tables here:
1) My ajax_chat_messages, with a KEY_BLOCK_SIZE of 4k. These are getting compressed at a somewhat better ratio than raw posts at 4k, but they are still getting compressed - converting them back to DYNAMIC doubled the size of the table on-disk. Converting to 2k increased the size of the table by about 15% from 4k, while decreasing performance massively. Converting them to 8k saw a slight increase in size again, at a massive benefit to performance.
2) My private_messages table, with a KEY_BLOCK_SIZE of 8k. According to MySQL's documentation, compression is working perfectly. Only about half the messages are getting compressed, but those that don't fit into their pages fine.
3) My smf_messages table (forum posts), with a KEY_BLOCK_SIZE of 4k. Unlike the above, about three quarters of my member's posts are getting compressed, but of these, only half end up successfully doing so.
Converting the posts table to a key_block_size of 8k means that my posts table is 3,087,007,744 bytes rather than 2,889,875,456 bytes, but the number of successful compressions is far superior:
mysql> select * from information_schema.INNODB_CMP WHERE page_size=8192;
+-----------+--------------+-----------------+---------------+----------------+-----------------+
| page_size | compress_ops | compress_ops_ok | compress_time | uncompress_ops | uncompress_time |
+-----------+--------------+-----------------+---------------+----------------+-----------------+
| 8192 | 2365184 | 2351065 | 973 | 95024340 | 16636 |
+-----------+--------------+-----------------+---------------+----------------+-----------------+
5 rows in set (0.00 sec)Meaning, about 200,000 fewer posts actually got compressed (about 1.3 million, leaving out the million compressed private messages), but the overflow page is now far smaller - only about 8,000 private messages and 6,000 posts. This means the overflow pages will be called less, less RAM will be used to store compressed and uncompressed versions of pages, and may mean slightly faster thread views. For the most part, the difference is not too terribly great.
Compression saves me a couple gigabytes over MyISAM, personally - but your mileage will vary. MySQL's documentation on this matter should be taken to heart - overcompression is really, really bad for MySQL. Sustained write rates were an order of magnitude lower than for when compress_ops_ok is ~99% of compress_ops. Whether you see a performance benefit will thus largely depend on how well InnoDB is actually handling the compression. I use less disk space at a key_block_size of 4k on most tables, but it is also dramatically slower.
All things considered, I'm not exactly a fan. This sort of thing should be done in the forum script - not handled by the database. All tables should be stored as dynamic or compact, with their variable-length data gzip-compressed appropriately. This is a matter best addressed with developers, however, and in the mean time, it is a good stopgap measure to improve your disk i/o.
In our final article in this series, I will present an annotated version of my my.cnf file, going over more specific optimizations and what they mean.
- Vekseid's blog
- 9268 reads
Comments
Regarding the query cache
Regarding the query cache catching Sphinx queries, wouldn't you just set SQL_NO_CACHE in the Sphinx configuration in the first place so it never goes through the query cache?
Possibly, I'm not sure when
Possibly, I'm not sure when that option got in and I'm moving Sphinx to my backup server anyway.
The issue is still that large queries are very rarely repeatable - especially with communities. They wipe out large parts of your query cache and are never used again.
Add new comment