Elements of Response Time
By user12610824 on Apr 09, 2009
As Tim showed in the first blog, with sysbench (the load driver) running on the same system as MySQL, throughput with the pool-of-threads scheduler was lower than with the default thread-per-connection scheduler. However, with sysbench running on a remote system and accessing MySQL over the network (arguably a more realistic case), throughput with the two schedulers was quite similar. While reviewing response time data, it was noted that the response time curve (i.e., how response time ramps up as load increases) for pool-of-threads was very similar to that for thread-per-connection when communicating over the network. The question was, why?
As Sherlock Holmes would say, it's elementary! (pun intended)
In a closed queuing network, the lower bound on average response time, once you reach Nsat (the number of consumers at which some queuing is guaranteed to happen), is (N\*Dmax)-Z, where N is the number of consumers, Dmax is the service demand at the bottleneck device, and Z is think time (the delay between requests from the same consumer).
If the response time curve is the same for both implementations, it suggests that Dmax is also the same and Nsat is comparable.
If you look at just CPU, network, and disk, you will have the following elements of response time:
The service demand at each device:
- Dcpu = (%usr+%sys)/TPS
- Ddisk = %busy/TPS
- Dnet = (network utilization)/TPS
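- Note: TPS is Transactions Per Second as reported by the application. If each utilization is expressed as a fraction (for example, 40% busy is 0.40), each service demand comes out in seconds per transaction.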
The total service demand and the service demand at the bottleneck device:
- D = Dcpu + Ddisk + Dnet
- Dmax = max(Dcpu, Ddisk, Dnet)
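As a rough illustration of these hand calculations, here is a minimal Python sketch. The function name `service_demands` and all of the measurements are hypothetical, made up for this example rather than taken from the benchmark runs, and utilizations are assumed to be fractions between 0 and 1.

```python
def service_demands(cpu_util, disk_util, net_util, tps):
    """Apply D = U / X to each device.

    Utilizations are fractions (0.40 means 40% busy) and tps is the
    application-reported throughput, so each result is in seconds
    per transaction.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return {
        "cpu": cpu_util / tps,
        "disk": disk_util / tps,
        "net": net_util / tps,
    }


# Hypothetical measurements, for illustration only.
demands = service_demands(cpu_util=0.40, disk_util=0.25, net_util=0.80, tps=400.0)

total_demand = sum(demands.values())         # D = Dcpu + Ddisk + Dnet
bottleneck = max(demands, key=demands.get)   # device with the largest demand
d_max = demands[bottleneck]                  # Dmax

print(f"D = {total_demand:.6f} s, Dmax = {d_max:.6f} s at {bottleneck}")
```

With these made-up numbers the network has the largest service demand, so it would be the device to address first.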
The number of consumers at which some queuing is guaranteed to occur:
- Nsat = (D + Z)/Dmax
- Note: Nsat marks the approximate knee in the throughput and response time curves. It is often written as N\*, but this can be confusing when written in plain text equations where "\*" indicates multiplication. Nsat may also be referred to as Nopt, meaning the optimal number of consumers in the system, because it marks the approximate point where throughput levels out and response time starts to climb.
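The Nsat formula is easy to check by hand, but a small sketch may help. The function name `n_sat`, the service demands, and the think time below are assumptions chosen for illustration, not measured values.

```python
def n_sat(demands, think_time):
    """Return Nsat = (D + Z) / Dmax.

    demands is a list of per-device service demands in seconds and
    think_time is Z in seconds.
    """
    total = sum(demands)   # D
    d_max = max(demands)   # Dmax, the bottleneck service demand
    if d_max <= 0:
        raise ValueError("at least one service demand must be positive")
    return (total + think_time) / d_max


# Hypothetical demands (cpu, disk, net) in seconds and a 100 ms think time.
example_demands = [0.001, 0.000625, 0.002]
knee = n_sat(example_demands, think_time=0.1)
print(f"Nsat is roughly {knee:.1f} consumers")
```

Since Nsat only marks the approximate knee, the fractional result is best read as "somewhere around this many consumers" rather than as an exact threshold.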
The lower bound on average response time is then:
- for N < Nsat: D
- for N >= Nsat: (N\*Dmax)-Z
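The two cases above can be folded into a single expression, because (N\*Dmax)-Z is smaller than D below Nsat and equal to D exactly at Nsat. The following is a minimal sketch under that observation; the function name and the sample numbers are hypothetical.

```python
def response_time_lower_bound(n, demands, think_time):
    """Lower bound on average response time for n consumers.

    Equivalent to the piecewise form: D for n < Nsat, and
    n * Dmax - Z for n >= Nsat.
    """
    total = sum(demands)   # D
    d_max = max(demands)   # Dmax
    return max(total, n * d_max - think_time)


# Hypothetical demands (cpu, disk, net) in seconds and a 100 ms think time.
example_demands = [0.001, 0.000625, 0.002]
light_load = response_time_lower_bound(10, example_demands, think_time=0.1)
heavy_load = response_time_lower_bound(100, example_demands, think_time=0.1)
print(f"N=10: {light_load:.6f} s, N=100: {heavy_load:.6f} s")
```

Note that nothing other than Dmax and Z appears in the heavy-load branch, which is why two implementations with the same bottleneck service demand end up with the same response time curve past the knee.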
When sysbench was running locally with MySQL, Dmax was probably the CPU service demand, which differed between implementations. With sysbench remote, Dmax is probably now either Dnet or Ddisk. Some simple hand calculations will tell which, and that component will need to be addressed to reduce the average response time and increase the throughput.
I hope to see you at my MySQL Camp Session, at 2pm on Thursday, April 23rd, where we will discuss other uses of simple queuing models to answer questions about performance. MySQL Camp is free and you do not need to be registered for the main conference to attend, so drop by!