Olio Performance on T5220

A colleague recently asked for some assistance with analyzing and tuning the performance of Olio on a 1.4GHz T5220 CMT server. The main symptoms were that its cpu consumption varied considerably over the course of a run, and that response time targets weren't being met, even at only moderate load.

Olio is a web2.0 toolkit designed to help evaluate the suitability, functionality and performance of various web technologies. The deployment I was looking at involved several components:

  • Faban driver on a separate server, generating web requests on average every 5 seconds per simulated user
  • Apache 2.2.9 from CoolStack 1.3.1
  • PHP 5.2.6 also from CoolStack
  • MySQL, initially 5.1.25 from CoolStack, later switched to 5.1.30 that I compiled myself

Apache/PHP and MySQL were both running on the same T5220, and a processor set had been created for MySQL.

Since Apache was involved, one of the first things I did was to edit httpd.conf and create a CustomLog format for its access log.

    LogFormat "%h %l %u %t \"%r\" %>s %b %{pid}P %X %D" rscommon
    CustomLog logs/access_log rscommon

This added the process ID (pid), KeepAlive status, and response time (in microseconds) to each log record. Running an experiment with a very small number of users then showed which web queries were most common and which were inherently much slower than the rest. For this purpose I temporarily set certain KeepAlive parameters to make it easier to follow the sequence of requests from a particular user, because they'd be serviced by the same process (when using mpm_worker and configuring sufficient processes).

KeepAlive on
KeepAliveTimeout 60
MaxKeepAliveRequests 0

For 10 users over 360 seconds, Faban reported 700 Operations, or 1.944 ops/sec. However the access log showed 5579 web requests, so clearly one or more Operations involved many web requests. The access_log confirmed that there were several inherently slow request types which were also quite common.

 210 GET /events.php?socialEventID=...
 286 GET /taggedEvents.php?tag=...
 286 POST /taggedEvents.php
 316 GET /index.php
3396 GET /fileService.php?cache=false&file=...
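Counts and mean response times like these can be pulled straight out of the access_log. A minimal awk sketch, assuming the rscommon format above (with the default %t layout, the request method lands in field 6, the URL in field 7, and %D in field 13):

```shell
# Summarise access_log: request count and mean response time per method+URL,
# stripping query strings so variants of the same page group together.
awk '{ gsub(/"/, "", $6); split($7, u, "?"); key = $6 " " u[1]
       n[key]++; t[key] += $13 }
     END { for (k in n) printf "%5d %-40s avg %9.0f us\n", n[k], k, t[k] / n[k] }' \
    access_log | sort -rn
```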

I probably wouldn't have chosen a quasi-open queuing system workload myself for performance tuning: a closed queuing system with zero think time makes it easier to identify bottlenecks. Still, having a list of candidate web queries meant that I could quickly create a simpler workload that exercised just Apache or Apache+PHP. For that I used the Apache utility ab, and created two simple pages, one static and one dynamic via phpinfo().

The motivation for this is to verify what throughput can be achieved for relatively lightweight requests. As will be seen below, there is a trap that involves a nasty interaction between tcp_time_wait_interval, KeepAlive, and the characteristics of this workload.

If KeepAlive is not used then each request creates a new connection, which gets closed at the end of the request. TCP requires a delay, specified on Solaris by tcp_time_wait_interval, before a port number can be reused by another connection. There are only 64k port numbers, so once they've all been used, a new connection has to wait for a port to be released by the expiry of tcp_time_wait_interval on a previous connection. The default on Solaris is 60000ms, which implies a sustainable connection rate of about 1000/sec. In a LAN environment the MSL (Maximum Segment Lifetime) should be well under a second, in which case the default can safely be reduced. This is necessary for very high connection rates.
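The arithmetic behind that sustainable rate is worth making explicit. Using all 64k port numbers as an upper bound (the actual ephemeral range is smaller, so real limits are lower):

```shell
# Upper bound on sustained connection rate to one address:
# total port numbers divided by the TIME_WAIT interval.
for wait_ms in 60000 2000; do
    echo "tcp_time_wait_interval=${wait_ms}ms -> ~$(( 65536 * 1000 / wait_ms )) connections/sec"
done
```

With the default 60000ms this gives roughly 1092 connections/sec, matching the "about 1000/sec" figure; dropping the interval to 2000ms raises the ceiling to 32768.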

# ndd /dev/tcp tcp_time_wait_interval
# ndd -set /dev/tcp tcp_time_wait_interval 2000

Alternatively if KeepAlive is used, then each httpd process can potentially sit around idle for KeepAliveTimeout seconds before becoming available to service requests from another user. With a large user population, this means having a lot of apache processes, each with its own connection to mysqld at the backend. Careful attention needs to be paid to whether there is sufficient real memory for so many httpd processes. Experience also suggests that large numbers of active mysql connections can reduce efficiency.
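A quick way to sanity-check the memory side is to total the resident set sizes of the httpd processes, e.g. with standard ps/pgrep (a rough sketch; note RSS counts shared pages such as libraries once per process, so this overstates the true total):

```shell
# Sum resident set size (KB) across all httpd processes.
ps -o rss= -p "$(pgrep -d, httpd)" |
    awk '{ total += $1 } END { print total " KB aggregate RSS" }'
```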

For 1000 users, each combination failed to deliver the required performance (throughput and/or response time targets).

 KeepAlive  KeepAliveTimeout  MaxClients  Result
 on         60                2048        fail: response times > 5 sec
 on         1                 400         fail: 90th percentile > target
 off        -                 400         fail: unstable response times

The instability with KeepAlive off is particularly interesting: for the majority of the run, response times and throughput were looking good, then suddenly things deteriorated badly. This is reflected in Faban's graphs of cpu utilisation and response time.



Before pursuing this though, I decided to investigate the inherently slow transactions more deeply. One way of doing this is to select a transaction type and run it repeatedly in a single thread. Using a single thread makes it more practical to instrument things with DTrace's pid provider, as only a single apache process needs to be started. Mysqld itself is a single process, but multithreaded. Each apache process establishes its own connection to mysqld. I wrote a DTrace script commandx.d to capture SQL text, response time and cpu time from mysqld, which I found very useful. [Caveat: it needs to be run using the same data model as the process it's observing, either -32 or -64.] DTrace scripts tend to grow organically, as you hack, modify, adapt, enhance, and expand them according to need. This one needs further work to capture stmt ids returned from PREPARE commands.

There are a few other techniques I have found useful for MySQL. One is to capture microstate accounting and other resource information from the /proc pseudo-filesystem at the start and end of an experiment and output the difference via a small utility I wrote, pidusage. Mysqld maintains a set of global variables about its internal activity that can be queried. To compute the differences between two samples I use the nawk script mysqldiff. Putting it all together, the resulting microbenchmark script looks something like this:

mpid=`pgrep -x mysqld`
mysql -B -e 'show global status;' >sample1
pidusage -s -p$mpid
ab -c 1 -n 100 "http://localhost:8080/events.php?socialEventID=..."
pidusage -g -p$mpid
mysql -B -e 'show global status;' >sample2
mysqldiff sample1 sample2
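mysqldiff itself isn't reproduced here, but the core of what it does can be sketched in a few lines of awk (a simplification of the real script): join the two 'show global status' samples on variable name and print only the counters that changed.

```shell
# First pass reads sample1 into an array; second pass prints deltas.
awk 'NR == FNR { before[$1] = $2; next }
     ($1 in before) && $2 != before[$1] { printf "%-45s %12d\n", $1, $2 - before[$1] }' \
    sample1 sample2
```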

For DTracing mysqld:

commandx.d -64 -p $mpid -o commandx.log 

And for starting a standalone single apache process:

httpd -k start -X

Combining all this generates a wealth of information, but I'll quote selectively here. 100 queries took 15.908s, and according to ab the mean was 159.077ms, so that's consistent. From microstate accounting, mysqld used 13.084s cpu time, so the workload is cpu-bound (the database fits in memory). Statistics from mysqld give a clue about where the cpu time goes: 100 web queries resulted in about 500 select statements, which between them read 439800 rows.

Com_admin_commands                              100
Com_begin                                       100
Com_commit                                      100
Com_select                                      501
Com_update                                      100
Handler_read_first                              100
Handler_read_key                             418800
Handler_read_next                             22600
Handler_read_rnd_next                        417170
Innodb_buffer_pool_read_requests             934231
Innodb_rows_read                             439800
Questions                                       803

And this is what commandx captured at mysqld level for a typical events.php query (formatted to make the SQL text more readable). Note that all times are in microseconds (with Time referring to elapsed time since the script started logging).

      Time  Tid   RespTM    CpuTM Type          Len SQL
  10070946 6487       51       36 14:PING         0
  10071118 6487   123550   123500 03:QUERY      194 select tag, refcount as count
                                                    from   SOCIALEVENTTAG
                                                    where  socialeventtagid in
                                                         ( select socialeventtagid
                                                           from   SOCIALEVENTTAG_SOCIALEVENT
                                                           where  socialeventid = '2086')
                                                    and    refcount > 0
                                                    order  by tag ASC
  10195907 6487     1005      974 03:QUERY       84 select count(username) as count from PERSON_SOCIALEVENT where socialeventid = '2086'
  10197179 6487      747      706 03:QUERY      275 select title,description,submitterUserName,imagethumburl,literatureurl,telephone,timezone,
                                                    from   SOCIALEVENT as se, ADDRESS as a
                                                    where  se.socialeventid = '2086' and se.ADDRESS_a...
  10199070 6487      746      724 03:QUERY       68 select username from PERSON_SOCIALEVENT where socialeventid = '2086'
  10209447 6487       97       73 03:QUERY       17 start transaction
  10209761 6487      594      568 03:QUERY      110 update COMMENTS_RATING set comments='', ratings=''
                                                    where  username='' and socialeventid='2086' and commentid=''
  10210475 6487     1356     1329 03:QUERY      113 select commentid,username,comments,ratings,created_at,updated_at
                                                    from   COMMENTS_RATING where socialeventid = '2086'
  10221075 6487       81       65 03:QUERY        6 commit 

So a single query dominates. Prior to MySQL 6.0.x, there is a known issue around subqueries not being processed optimally. In this case there is a workaround, which is to use a join instead of a subquery:

select  tag, refcount as count
from    SOCIALEVENTTAG a, SOCIALEVENTTAG_SOCIALEVENT b
where   a.socialeventtagid = b.socialeventtagid
and     b.socialeventid = '2086'
and     a.refcount > 0
order   by tag ASC

I chose to use super-smack to test the two versions in isolation, before making changes to the PHP source code. Super-smack provides a convenient facility for plugging keys selected randomly from a dictionary into SQL commands. This makes for a more realistic workload. Since I had already turned off mysqld's Query Cache, this wasn't strictly necessary, but I chose to do it anyway. [Aside: I made a few changes to the super-smack source that I downloaded to clean up output formatting, ensured it used an MT-hot malloc, and replaced calls to random() with a thread-safe random number generator. This last change required passing an additional parameter for per-thread state around the callstack.]

$ super-smack -D. soc002.smack 1 100
Query Barrel Report for client smacker
connect: max=0ms  min=-1ms avg= 42ms from 1 clients 
Query_type   num_queries    max_time    min_time     q_per_s
soc002               100         124         124        8.05

$ super-smack -D. soc002a.smack 1 100
Query Barrel Report for client smacker
connect: max=0ms  min=-1ms avg= 41ms from 1 clients 
Query_type   num_queries    max_time    min_time     q_per_s
soc002a              100           0           0      681.73

The next problem query came from index.php, and took about 99ms.

select count(*) as count From SOCIALEVENT where eventtimestamp >= CURRENT_TIMESTAMP

The problem here is that almost all 18000 records in SOCIALEVENT have an eventtimestamp in the future. This part of the application looks a bit dodgy, since it appears to be involved in pagination, in which links are provided for the first 10 pages. My suggestion here is to rely on count(*) of all records being a special case, and subtract the few records older than CURRENT_TIMESTAMP. It's still expensive relative to most queries though.

select count(*) as count From SOCIALEVENT
select count(*) as count From SOCIALEVENT where eventtimestamp < CURRENT_TIMESTAMP

$ super-smack -D. cnt001.smack 1 100
Query Barrel Report for client smacker
connect: max=0ms  min=-1ms avg= 47ms from 1 clients
Query_type   num_queries    max_time    min_time     q_per_s
cnt001               100         100         100        9.95

$ super-smack -D. cnt002.smack 1 100
Query Barrel Report for client smacker
connect: max=0ms  min=-1ms avg= 47ms from 1 clients
Query_type   num_queries    max_time    min_time     q_per_s
cnt002a              100          39          39       24.03
cnt002b              100           0           0       24.03

The third change is a bit subtle, but here's the original query:

select socialeventid,title,summary,imagethumburl,createdtimestamp,eventdate,submitterusername
from   SOCIALEVENT
where  eventtimestamp >= CURRENT_TIMESTAMP
order  by eventdate asc
limit  0,10

The issue here is that the order clause sorts by eventdate, not eventtimestamp. For this application, eventdate can actually be derived from eventtimestamp (via DATE function), and eventtimestamp has an index that facilitates getting records in the right order. It makes more sense to order by eventtimestamp.
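The rewritten query then looks like this (same columns and predicate, with only the ordering column changed so the eventtimestamp index can supply rows in order):

```sql
select socialeventid,title,summary,imagethumburl,createdtimestamp,eventdate,submitterusername
from   SOCIALEVENT
where  eventtimestamp >= CURRENT_TIMESTAMP
order  by eventtimestamp asc
limit  0,10
```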

Ideally I would have rerun the original experiment at this point, but my changes had been applied to a version of the application just prior to transaction start/commit being inserted. So temporarily I just used the modified older version, noting that it doesn't have the explicit transaction logic. This should be a relatively minor thing, as by default each SQL statement is a transaction when using the InnoDB engine in MySQL. Before simulating 500+ users, I ran a smaller test to verify from the apache access_log that several of the common requests were now inherently faster. Making the mysql component more efficient also shifts the relative cpu consumption between mysql and apache. Rather than manually adjusting the number of cpus in a processor set, I deleted the processor sets and let the OS work it out. Another decision was to run with KeepAlive off, as performance was less stable with it enabled. [Maybe in the future it would be worth hacking the apache source to allow subsecond KeepAliveTimeout ?]

Cranking up the number of users, it's now passing all performance criteria at 1800 users. There are a few signs of approaching instability, with the run queue occasionally jumping. At its busiest the system overall is approaching 90% busy, so if throughput is to be increased much further, further attacks will need to be made on the components consuming the majority of the cpu time. Some of the requests are still quite slow (~1 second) due to image processing, but I'll leave that to another blog entry.

UIDriver metric: 344.169 ops/sec
Driver start: Thu Nov 06 03:35:52 PST 2008
Driver end: Thu Nov 06 03:43:03 PST 2008
Total number of operations: 123901
Pass/Fail: PASSED

Response Times (seconds)

Type          Avg    Max    90th%  Reqd. 90th%  Pass/Fail
HomePage      0.317  4.511  0.500  1.000        PASSED
Login         0.292  2.429  0.400  1.000        PASSED
TagSearch     0.222  5.393  0.500  2.000        PASSED
EventDetail   0.108  4.104  0.200  2.000        PASSED
PersonDetail  0.169  3.640  0.380  2.000        PASSED
AddPerson     1.390  6.814  2.040  3.000        PASSED
AddEvent      1.468  7.117  2.140  4.000        PASSED

