Improve InnoDB thread scheduling

Introduction

Note: this article was originally published on http://blogs.innodb.com on July 25, 2011 by Sunny Bains.

InnoDB has had thread concurrency management code for some years now. Most will be familiar with the three configuration variables associated with this feature:
  • innodb_thread_concurrency
  • innodb_thread_sleep_delay
  • innodb_concurrency_tickets

The problem with the existing code is that the queueing overhead becomes too high and hurts performance, especially as the number of user threads goes up. The queueing code uses the os_event_t library to manage the queuing and dequeuing of user threads in an explicit wait queue. This wait queue is strict FIFO, and the FIFO constraint ensures that there is no thread starvation.

To overcome this overhead, one experimental feature that we are trying is busy polling for free slots to enter InnoDB, using sleep delays between attempts. Unlike the existing concurrency management scheme, the new scheme is not starvation free. Before people start complaining that event driven is better than polling, rest assured I know the theory. If there is a way to reduce the overhead and still keep it event driven, I’m very interested to see your code and experimental data :-). I’ve experimented with futexes too, but no joy there either. It is reasonable to assume that I could have overlooked some better technique, so feedback is important.

In theory, if the sleep delay could be tuned to exactly the time required for a slot to become free, scheduling would be perfect. This, as we know, is never going to be the case. However, we can try to get a good enough approximation that varies with the load. This will reduce the optimal or peak TPS, because threads will sometimes be sleeping while a slot is free. However, applications that have lots of threads, say more than 256, will be able to maintain a higher TPS at the higher thread counts because the scheme doesn’t suffer from the queuing bottleneck.

The results have been very encouraging in our internal experiments on high-end hardware, meaning hosts with >= 16 cores. There are some issues that need ironing out, in particular with Sysbench OLTP read-only tests on 8-core hardware. This feature uses atomics to manage the thread concurrency slots.
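
To make the contrast with a FIFO wait queue concrete, here is a minimal Python sketch of polling for a free slot with a sleep between attempts. It is an illustration only, not InnoDB's code: the names SlotGate, try_enter, enter, leave and run_workers are made up for this example, and because Python has no atomic compare-and-swap, a lock stands in for the atomics mentioned above. Note that nothing orders the waiters, which is exactly why a scheme like this is not starvation free.

```python
import threading
import time


class SlotGate:
    """Admit at most `limit` threads at once by polling, with no wait queue."""

    def __init__(self, limit, sleep_delay_us=100):
        self.limit = limit
        self.sleep_delay_us = sleep_delay_us
        self._active = 0
        # Assumption: a lock emulates the atomic counter update.
        self._lock = threading.Lock()

    def try_enter(self):
        """Claim a slot if one is free; never waits for a slot."""
        with self._lock:
            if self._active < self.limit:
                self._active += 1
                return True
            return False

    def enter(self):
        """Poll until a slot is free; return the number of sleeps taken."""
        sleeps = 0
        while not self.try_enter():
            time.sleep(self.sleep_delay_us / 1_000_000)
            sleeps += 1
        return sleeps

    def leave(self):
        """Release a previously claimed slot."""
        with self._lock:
            self._active -= 1


def run_workers(gate, n_threads, work_seconds=0.001):
    """Push n_threads workers through the gate; return peak concurrency seen."""
    peak = 0
    inside = 0
    stats_lock = threading.Lock()

    def worker():
        nonlocal peak, inside
        gate.enter()
        try:
            with stats_lock:
                inside += 1
                peak = max(peak, inside)
            time.sleep(work_seconds)  # stand-in for real work inside the gate
            with stats_lock:
                inside -= 1
        finally:
            gate.leave()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

Calling run_workers(SlotGate(limit=4), 16) pushes 16 threads through a 4-slot gate; the returned peak never exceeds the limit, but the order in which the threads get in is left to chance.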

New configuration variables

  1. innodb_adaptive_sleep_delay - A boolean variable that enables or disables the adaptive sleep delay feature.
  2. innodb_adaptive_max_sleep_delay - The upper bound, in microseconds, on the sleep delay when adaptive sleep delay is enabled.

How adaptive sleep delay works

It is a very simple algorithm, in fact too simple, and that is deliberate. I tried various feedback mechanisms, but none matched the read-write (RW) results of this simple algorithm. At one stage I even toyed with the idea of going down the subsumption architecture path :-).
The basic idea is that we try to reduce the sleep delay as fast as we can and increase it as slowly as we can. The reasoning is that this makes it adapt more quickly to a dynamic load. The code is rather simple and is all contained in the function srv_conc_enter_innodb_with_atomics(), see srv0conc.c.
  • If a thread has waited for more than one iteration, increment the sleep delay by one.
  • If a free slot is found and no waiting threads are sleeping, halve the sleep delay.
This works surprisingly well on hardware with >= 16 cores when innodb_thread_concurrency is set to the number of cores. For read-write loads it works well on 8-core hardware too, but it has issues on 8-core hardware for Sysbench read-only tests. This last issue is under investigation.
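
The two rules above can be sketched in a few lines of Python. This is only a reading of the description in this post, not a copy of srv_conc_enter_innodb_with_atomics(): the class and method names are hypothetical, the delay is treated as an integer number of microseconds, and the cap is an assumption that stands in for innodb_adaptive_max_sleep_delay with an arbitrary value.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveDelay:
    """Sleep delay in microseconds: grows by one, shrinks by halving."""

    delay_us: int = 0
    # Assumption: arbitrary cap, standing in for innodb_adaptive_max_sleep_delay.
    max_delay_us: int = 10_000

    def on_still_waiting(self, iterations_waited):
        """After a failed poll: grow slowly once past the first iteration."""
        if iterations_waited > 1:
            self.delay_us = min(self.delay_us + 1, self.max_delay_us)

    def on_slot_found(self, sleeping_waiters):
        """After claiming a slot: shrink fast if no other waiter is sleeping."""
        if sleeping_waiters == 0:
            self.delay_us //= 2


d = AdaptiveDelay(delay_us=100)
d.on_still_waiting(iterations_waited=2)  # delay_us is now 101
d.on_slot_found(sleeping_waiters=3)      # others still sleeping: stays 101
d.on_slot_found(sleeping_waiters=0)      # nobody sleeping: halved to 50
```

The asymmetry is the point of the design: a run of slow waits nudges the delay up one microsecond at a time, while a single uncontended entry cuts it in half.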

Below you can find charts for results obtained on two boxes with 8 and 24 cores. These results highlight the difference between two cases: innodb_thread_concurrency=0, and innodb_thread_concurrency=<N_CPU> together with innodb_adaptive_sleep_delay=1.

box1: 8 cores (2.33GHz) / 16GB / RAID10
box2: 24 cores (2.66GHz) / 32GB / RAID10
MySQL 5.6.3 - lab-release
sysbench-0.4 (https://launchpad.net/sysbench/0.4)
(You can find settings/test run details at the end of the post)


Conclusion

To use this new feature, simply set innodb_thread_concurrency to the number of available cores, enable innodb_adaptive_sleep_delay, and run your favourite benchmark software against it. As mentioned earlier, there is an issue with RO tests on 8-core hardware; this issue is being looked at. If there is an easy (and fast) way to make it starvation free, then that will be added. However, it performs quite well in RW tests. Your feedback is important for fine-tuning this feature, so please do share it with us.

MySQL/sysbench settings
[mysqld]
innodb_status_file=0
innodb_data_file_path=ibdata1:100M:autoextend
innodb_buffer_pool_size=4G
innodb_log_file_size=100M
innodb_log_files_in_group=2
innodb_log_buffer_size=64M
innodb_flush_log_at_trx_commit=2
innodb_thread_concurrency=0
innodb_flush_method=O_DIRECT
user=root
port=3306
max_connections=2000
table_cache=2048
max_heap_table_size=64M
sort_buffer_size=64K
join_buffer_size=1M
tmp_table_size=64M
thread_cache=16
query_cache_type=0
query_cache_size=0
max_prepared_stmt_count=100000

cmd line for prepare:
sysbench --test=oltp --oltp-table-size=1000000 --oltp-dist-type=uniform \
--oltp-table-name=sbtest --mysql-user=root --mysql-db=sbtest \
--mysql-host=127.0.0.1 --mysql-port=3306 --mysql-table-engine=INNODB prepare

cmd line for warmup:
mysql -uroot -h127.0.0.1 -P3306 -e'check table sbtest'

cmd line for run:
sysbench --num-threads=<16..1536> --test=oltp --oltp-table-size=1000000 \
--oltp-dist-type=uniform --oltp-table-name=sbtest --report-interval=1 --forced-shutdown=1 \
--max-requests=0 --max-time=120 --mysql-host=127.0.0.1 \
--mysql-user=root --mysql-port=3306 --mysql-db=sbtest \
--mysql-table-engine=INNODB <test options(see below)> run

parameters for OLTP_RO test:
--oltp-read-only=on
parameters for OLTP_RW test:
--oltp-read-only=off
parameters for UPDATE_KEY test:
--oltp-test-mode=nontrx --oltp-nontrx-mode=update_key run
parameters for POINT_SELECT test:
--oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
--oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on --oltp-read-only=on
parameters for SELECT_SIMPLE_RANGES test:
--oltp-point-selects=0 --oltp-simple-ranges=1 --oltp-sum-ranges=0 \
--oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on --oltp-read-only=on
parameters for SELECT_SUM_RANGES test:
--oltp-point-selects=0 --oltp-simple-ranges=0 --oltp-sum-ranges=1 \
--oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on --oltp-read-only=on

This post was co-authored with Alexey Stroganov (a.k.a Ranger).