A Patent for Workload Management Based on Service Level Objectives
By jsavit on Nov 21, 2011
Many operating systems have a resource manager which lets you control machine resources. For example, Solaris provides controls for CPU with several options:
- shares for proportional CPU allocation. If you have twice as many shares as me, and we are competing for CPU, you'll get about twice as many CPU cycles),
- dedicated CPU allocation in which a number of CPUs are exclusively dedicated to an application's use. You can say that a zone or project "owns" 8 CPUs on a 32 CPU machine, for example. And,
- capped CPU in which you specify the upper bound, or cap, of how much CPU an application gets. For example, you can throttle an application to 0.125 of a CPU.
Useful as that is (and tragic that some other operating systems have little resource management and isolation, and frighten people into running only 1 app per OS instance - and wastefully size every server for the peak workload it might experience) that's not really workload management. With resource management one controls the resources, and hope that's enough to meet application service objectives. In fact, we hold resource distribution constant, see if that was good enough, and adjust resource distribution if that didn't meet service level objectives.
Here's an example of what happens today: Let's try 30% dedicated CPU. Not enough? Let's try 80% Oh, that's too much, and we're achieving much better response time than the objective, but other workloads are starving. Let's back that off and try again. It's not the process I object to - it's that we to often do this manually. Worse, we sometimes identify and adjust the wrong resource and fiddle with that to no useful result.
Back in my days as a customer managing large systems, one of my users would call me up to beg for a "CPU boost":
Me: "it won't make any difference - there's plenty of spare CPU to be had, and your application is
completely I/O bound."
User: "Please do it anyway."
Me: "oh, all right, but it won't do you any good." (I did, because he was a friend, but it didn't help.)
Prior artThere are some operating environments that take a stab about workload management (rather than resource management) but I find them lacking. I know of one that uses synthetic "service units" composed of the sum of CPU, I/O and memory allocations multiplied by weighting factors. A workload is set to make a target rate of service units consumed per second. But this seems to be missing a key point: what is the relationship between artificial 'service units' and actually meeting a throughput or response time objective? What if I get plenty of one of the components (so am getting enough service units), but not enough of the resource whose needed to remove the bottleneck?
Actual workload management
That's not really the answer either. What is needed is to specify a workload's service levels in terms of externally visible metrics that are meaningful to a business, such as response times or transactions per second, and have the workload manager figure out which resources are not being adequately provided, and then adjust it as needed. If an application is not meeting its service level objectives and the reason is that it's not getting enough CPU cycles, adjust its CPU resource accordingly.
If the reason is that the application isn't getting enough RAM to keep its working set in memory, then adjust its RAM assignment appropriately so it stops swapping. Simple idea, but that's a task we keep dumping on system administrators.
In other words - don't hold the number of CPU shares constant and watch the achievement of service level vary. Instead, hold the service level constant, and dynamically adjust the number of CPU shares (or amount of other resources like RAM or I/O bandwidth) in order to meet the objective.
Instrumenting non-instrumented applicationsThere's one little problem here: how do I measure application performance in a way relating to a service level. I don't want to do it based on internal resources like number of CPU seconds it received per minute - We need to make resource decisions based on externally visible and meaningful measures of performance, not synthetic items or internal resource counters.
If I have a way of marking the beginning and end of a transaction, I can then measure whether or not the application is meeting an objective based on it. If I can observe the delay factors for an application, I can see which resource shortages are slowing an application enough to keep it from meeting its objectives. I can then adjust resource allocations to relieve those shortages.
Fortunately, Solaris provides facilities for both marking application progress and determining what factors cause
application latency. The Solaris DTrace facility let's me introspect on application behavior: in particular I can see events like "receive a web hit" and "respond to that web hit" so I can get transaction rate and response time. DTrace (and tools like
prstat) let me see where latency is being added to an application, so I know which resource to adjust.