I have recently been investigating a new feature of MySQL 6.0 - the
"Pool-of-Threads" scheduler. This feature is a fairly significant
change to the way MySQL completes tasks given to it by database clients.
To begin with, be advised that the MySQL database is implemented as a
single multi-threaded process. The conventional threading model is
that there are a number of "internal" threads doing administrative work
(including accepting connections from clients wanting to connect to
the database), then one thread for each database connection. That
thread is responsible for all communication with that database client
connection, and performs the bulk of database operations on behalf of
that client.
This architecture exists in other RDBMS implementations. Another
common implementation is a collection of processes all cooperating via
a region of shared memory, usually with semaphores or other
synchronization objects located in that shared memory.
The creation and management of threads can be said to be cheap, in a
relative sense - it is usually significantly cheaper to create or
destroy a thread than a process. However, these overheads do not
come for free. Also, the operations involved in scheduling a thread
as opposed to a process are not significantly different. A single
operating system instance scheduling several thousand threads
on and off the CPUs is not much less work than one scheduling several
thousand processes doing the same work.
The theory behind the Pool-of-Threads scheduler is to provide an
operating mode which supports a large number of clients that will be
maintaining their connections to the database, but will not be sending
a constant stream of requests to the database. To support this, the
database will maintain a (relatively) small pool of worker threads
that take a single request from a client, complete the request, return
the results, then return to the pool and wait for another request,
which can come from any client. The database's internal threads still
exist and operate in the same manner.
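The worker-pool model described above can be sketched in a few lines. This is an illustrative sketch, not MySQL's actual implementation: a small, fixed pool of worker threads pulls requests from a shared queue, so many clients can be served by far fewer threads than connections.

```python
import queue
import threading

NUM_WORKERS = 4  # small, fixed pool -- far fewer than the number of clients

requests = queue.Queue()          # requests from any client land here
results = []
results_lock = threading.Lock()

def worker():
    # Each worker takes one request, completes it, records the result,
    # then returns to the queue for another request -- which may belong
    # to a completely different client connection.
    while True:
        client_id, payload = requests.get()
        if client_id is None:     # shutdown sentinel
            requests.task_done()
            return
        with results_lock:
            results.append((client_id, payload * 2))  # stand-in for real work
        requests.task_done()

pool = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in pool:
    t.start()

# 100 "clients" each submit one request; only 4 threads service them all.
for cid in range(100):
    requests.put((cid, cid))
for _ in pool:
    requests.put((None, None))
for t in pool:
    t.join()

print(len(results))  # 100
```

The key contrast with thread-per-connection is that the thread count here is fixed regardless of how many clients are connected.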
In theory, this should mean less work for the operating system to
schedule threads that want CPU. On the other hand, it should mean
some more overhead for the database, as each worker thread needs to
restore the context of a database connection prior to working on each
request.
A smaller pool of threads should also consume less memory, as each
thread requires a minimum amount of memory for a thread stack, before
we add what is needed to store things like a connection context, or
working space to process a request.
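As a rough back-of-envelope illustration (the stack size here is an assumption for the sake of arithmetic, not a measured MySQL figure): if each connection thread reserves, say, 256 KB of stack, the difference between thousands of threads and a small pool is easy to quantify.

```python
STACK_KB = 256  # assumed per-thread stack reservation (illustrative only)

def stack_memory_mb(num_threads, stack_kb=STACK_KB):
    """Total stack reservation, in MB, for a given thread count."""
    return num_threads * stack_kb / 1024

# Thread-per-connection with 5,000 idle clients vs. a 64-thread pool:
print(stack_memory_mb(5000))  # 1250.0 MB
print(stack_memory_mb(64))    # 16.0 MB
```

Even before counting connection contexts and working space, the pool's fixed thread count caps this component of memory use.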
You can read more about
the different threading models in the MySQL 6.0 Reference Manual.
Testing the Theory
Mark Callaghan of Google has recently had a look at whether this
theory holds true. He has published his results under
"No new global mutexes! (and how to make the thread/connection pool
work)". Mark has identified (via
this bug he logged)
that the overhead for using Pool-of-Threads seems quite large.
So, my first task is to see if I get the same results. I will note here
that I am using Solaris, whereas Mark was no doubt using a Linux
distro. We probably have different hardware as well (although both
are Intel x86).
Here is what I found when running sysbench read-only (with the
sysbench clients on the same host). The "conventional" scheduler
inside MySQL is known as the "Thread-per-Connection" scheduler, by the
way.
This is in contrast to Mark's results - I am only seeing a loss in
throughput of up to 30%.
What about the bigger picture?
These results do show there is a definite reduction in maximum
throughput if you use the pool-of-threads scheduler.
I believe it is worth looking at the bigger picture, however. To do
this, I am going to add in two more test cases:
- sysbench read-only, with the sysbench client and MySQL database
on separate hosts, via a 1 Gb network
- sysbench read-write, via a 1 Gb network
What I want to see is what sort of impact the pool-of-threads
scheduler has for a workload that I expect is still the more common
one - where our database server is on a dedicated host, accessed via a
network.
As you can see, the impact on throughput is far less significant when
the client and server are separated by a network. This is because we
have introduced network latency as a component of each transaction and
increased the amount of work the server and client need to do - they
now need to perform Ethernet driver, IP, and TCP processing.
This reduces the relative overhead - in CPU consumed and latency -
introduced by pool-of-threads.
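To see why added latency shrinks the relative cost, consider a toy model (all numbers here are assumptions, chosen only for illustration): if pool-of-threads adds a fixed per-request overhead, that overhead's share of total request time falls as network time is added.

```python
def throughput_loss(base_ms, overhead_ms, network_ms=0.0):
    """Fractional throughput loss caused by a fixed per-request overhead."""
    without_overhead = base_ms + network_ms
    with_overhead = base_ms + overhead_ms + network_ms
    return 1 - without_overhead / with_overhead

# Assumed numbers: 1 ms of server work, 0.3 ms of scheduler overhead.
print(round(throughput_loss(1.0, 0.3), 3))       # local client: ~0.231
print(round(throughput_loss(1.0, 0.3, 2.0), 3))  # over a network: ~0.091
```

The same fixed overhead costs proportionally less once each transaction also pays for a network round trip.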
This is a reminder that if you are conducting performance tests on a
system prior to implementing or modifying your architecture, you would
do well to choose a test architecture and workload that is as close as
possible to the one you intend to deploy. The same is true if you
are trying to extrapolate performance testing someone else has
done to your own architecture.
The Converse is Also True
On the other hand, if you are a developer or performance engineer
conducting testing in order to test a specific feature or code change,
a micro-benchmark or simplified test is more likely to be what you
need. Indeed, Mark's use of the "blackhole" storage engine is a good
idea to eliminate that processing from each transaction.
In this scenario, if you fail to make the portion of the software you
have modified a significant part of the work being done, you run the
risk of seeing performance results that are not significantly
different, which may lead you to assume your change has a negligible
effect.
In my next posting, I will compare the two schedulers in more detail.