MySQL InnoDB performance tuning on T2000

innodb.html

MySQL is one of the most popular opensource database.  It supports several pluggable storage engines, MyISAM manages non-transactional tables and it is good performer especially on MySQL 5.1.  The InnoDB and BDB storage engines provide transaction-safe tables. They were acquired by Oracle not so long ago.  The MEMORY storage engine which formerly known as heap storage engine provides in-memory tables.

The InnoDB supports both file system and raw disks on Solaris. It does row level locking.
 
MySQL InnoDB allocates a few types of memory buffers out of the heap.  The main buffer that is sized by  innodb_buffer_pool_size is used to cache data and indexes of the tables. Each buffer block is 16k bytes.  There is a dedicated thread that does pread to bring in the data to the buffer from the disk. There are also other types of memory buffers like sort buffer, log buffer, query_cache buffer.  This is how to monitor MySQL statistics to get idea of whether need to increase those buffer size.


mysql> show innodb status 

mysqladmin extended-status

It is important to understand how MySQL executes select queries by explain. That can help optimize the select performance.

MySQL is single process multi-threaded. For each new user connection, there is 1 threaded created from MySQL. We
can control the number of concurrent threads that can run simultaneous by limiting  innodb_thread_concurrency. Normally on T2000 we set it equal to the number of cpus.
 
There is no dedicated log write threads.  There is one io thread that wakes up frequently to do group writes but in our evaluation I rarely see it fires.  Each user thread could issue pwrite when the transaction is commited. And it is serialized! The default innodb_flush_log_at_trx_commit is 1, meaning that after each transaction is committed, there is a pwrite first, then followed by fsflush.  If set innodb_flush_log_at_trx_commit to 0, the performance would improve quite dramatically, pwrite and fsflush is only done every one second instead of after each transaction commit. But the risk is that customer might lose one second of data in case of power loss. innodb_flush_log_at_trx_commit  = 2 would issue pwrite after each commit but fsflush every 1 second.  If log disk response time is not fast enough, it could become a huge performance bottleneck.  It is very important to size enough IOPs for log disks.  We have seen performance being doubled or tripled right away by using a few more disks for the log files.  For performance purpose,  if you use filesystems to store datafiles and log files, you should use forcedirectio.

This is an example of how to use Solaris Volume Manager(svm) to create a soft partition and then set up the logfile links to this location

metadb -d -f c1t0d0s7 c1t1d0s7
metadb -a -f c1t0d0s7 c1t1d0s7

Then create a concatenation

metainit d10 1 4 c1t0d0s0 c1t1d0s0 c1t2d0s0 c1t3d0s0

You can stop here and do newfs on /dev/md/rdsk/d10 or continue to create a soft partition(for example 10G) on d10

metainit d101 -p d10 10G  

After that

newfs /dev/md/rdsk/d101
mount -o forcedirectio /dev/md/dsk/d101  /logs


There is innodb_flush_method.  The default is performing very well on Solaris which uses fsync() to flush both the data
and log files.
 

When MySQL starts, there are 10 threads created. Thread 1 handles network connections and create new threads for new user connections. Then there are 4 io threads. There is 1 log write thread doing group commits once a while. There is one insert thread, InnoDB stores data physically by its primary key order and insertion would not cause random reads on the disk, but for the non unique secondary index insert, it could cause a lot of random reads. So to avoid that, InnoDB checks whether the secondary index page is in the buffer pool. If it is, InnoDB does the insertion directly to  the index page. If the index page is not found in the buffer pool, InnoDB inserts the record to a special insert buffer structure. The insert buffer is kept so small that it fits entirely in the buffer pool, and insertions can be done very fast. Periodically, the insert buffer is merged into the secondary index trees in the database. Often it is possible to merge several insertions to the same page of the index tree, saving disk I/O operations. There is another thread that I am not sure its purpose.  Thread 6 is handling rollback. Thread 7 and 8 are monitoring thread. Thread 9 is the master thread. Thread 10 is the signal handler thread

Here are a few steps of how to tune MySQL InnoDB performance on T2000 or other Solaris platforms
  • Choose at least MySQL version 5.0.22.  In my experience MySQL 4.1  has more user locks inside hence it doesn't scale well.   MySQL 5.1 beta  is even better than 5.0.

  • If your database size is more than 4GB, using MySQL 64bit is better because you can use more than 4GB memory for innoDB database buffers.

  • MySQL 5.0.22 64bit from MySQL website is a pretty good performer.  But in case if you want to compile it on your own on Solaris, You can refer a good  whitepaper from Jenny Chen from Sun Microsystem.  Compile time can be reduced to a few minutes using parallel build "dmake" in sun studio 11 

  • You can use libumem or libmtmalloc to get scalable memory allocation performance on multi-threaded MySQL.  The way to do this is before mysql is started,  setenv LD_PRELOAD_64 /usr/lib/sparcv9/libumem.so.  This example is for 64bit MySQL
  • Always keep an eye on MySQL internal statistics. Normally you can look at reads versus hits and tell if you need to increase certain buffer size
    • mysql> show innodb status;
    • mysql> show full processlist;
    • mysqladmin extended-status
  • Making sure there is no io bottleneck. Collect  "iostat -xtcnz 5" on the system and look at that what is the disk response time (column asvc_t in miliseconds) and what is the average outstanding ios(column actv). Making sure that you have enough IOPs for the logfile disks and datafile disks.
  • On solaris systems, you can use "prstat -Lmc" to monitor  the active processes on the systems.
  • MySQL startup options are very important. Following is an example of /etc/my.cnf I use to evaluate MySQL sysbench OLTP performance on T2000
#
#MySQL configure
#
# The MySQL server
[mysqld]
port=3306
socket=/tmp/mysql.sock
user=root
datadir=/data
basedir=/usr/local/mysql
max_connections=1600
max_connect_errors=10
table_cache=2048
max_allowed_packet=1048576
binlog_cache_size=1048576
key_buffer_size=16777216
max_heap_table_size=67108864
sort_buffer_size=65536
join_buffer_size=1048576
query_cache_size=26214400
thread_cache=16
thread_concurrency=32
thread_stack=64K
ft_min_word_len=4
default_table_type=MYISAM
transaction_isolation=REPEATABLE-READ
tmp_table_size=64M
skip-locking
skip-concurrent-insert
server-id=1
#######below is for innodb############
innodb_status_file=0
innodb_data_file_path=ibdata1:100M:autoextend
innodb_buffer_pool_size=1800M
innodb_additional_mem_pool_size=20M
innodb_log_file_size=200M
innodb_log_files_in_group=2
innodb_log_buffer_size=80M
innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=300
innodb_max_dirty_pages_pct=90
innodb_thread_concurrency=32


Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
About

yufei

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today
Bookmarks