As the second post in workload management for your Oracle data warehouse I was going to focus on statement queuing some more. Before you read this post it is important that you read this older post for context on concurrency. Also realize that what I'm describing here is 220.127.116.11 functionality (some is 18.104.22.168 but major enhancements are in 22.214.171.124 so let's just go with that release).
And for those who were at Openworld this year (and can listen to the recordings / download those slides) you can find the session I did, which covers the workload management posts (past and future!). Session information: S316965 - Best practices on managing parallel execution in concurrent environments.
Like I said, the statement queuing is really something introduced in 126.96.36.199, but I will refer to it with its enhancements in 188.8.131.52. To enable the new set of parallel features you must modify an init.ora parameter. Once you enable the new features (and again refer to the older post for background) you get a few more parameters that can be useful to work with if the default values proof they need some adjustments. These parameters form a hierarchy of sorts:
In this post we really only look at case #3 => Auto. When the database is set up for auto all new functionality is enabled. So you will get full Automatic DOP, full statement queuing and In-memory parallel execution.
For this post we will assume that the 10 second threshold set via parallel_min_time_threshold is fine.
One way of using parallel_degree_limit is to cap the DOP for every statement (inserts, updates and selects) to a set number. However it is much more flexible to leverage Database Resource Manager for that. Consequently it would be better to use parallel_degree_limit as your safety net. In other words, it is your safeguard to an extreme DOP on the system.
The above is an extreme case where we have automatic DOPs generated showing 56 and 64 across a number of statements. Here we set parallel_degree_limit to 16, which is more akin to using it as a concurrency vehicle. It does however show the effect in Enterprise Manager and it does show there is no visual way of telling a parameter is kicking in!
For 184.108.40.206 it is best to think of parallel_degree_limit as the safety net and use the groups in Database Resource Manager to enable caps on the DOPs. More on that a little later in the post.
First of all, you should decide whether or not you want queuing and how aggressive you want to apply it in your system / your group.
That is because queuing comes with both upsides and downsides. The upside is that each statement will run with a computed or set DOP, no downgrades. That means that you run a statement faster than if you run it with a lower DOP (provided the DOP calculation is reasonably well done). It also means that you do not thrash the system by using too much resources.
The downside is an element of unpredictability. E.g. if the statement runs in 20 seconds today, it may run in 30 seconds tomorrow due to 10 seconds in the queue. An individual user may notice these differences. Obviously, if you do not queue the users statement may simply get hammered and downgraded, making life much worse.
The theory is that the upside outweighs the downside and that - overall - a workload on your system runs better and at faster aggregate times.
To make the parameter settings a bit more tangible it pays to think of queuing as the point in time when you have enough statements running concurrently to accept wait times for new statements. Or differently put: find the optimal queuing point based on desired concurrency on your system. I call that minimal concurrency.
To visualize minimal concurrency consider two scenarios as shown in the graph above, one with a maximum DOP of 16, one with a maximum DOP of 32. Again, I used parallel_degree_limit, but you should do this by group and you will get the same calculation. By using the parallel_degree_limit it is just simpler to explain as there is only one value for the entire system.
In the case of a DOP of 16, with parallel_servers_target at 128, you can run at minimum 4 statements concurrently. When the cap is set to 32, you can run 2 statements concurrently.
There are two interesting things to note here.
One is that the above is a very conservative calculation in that we assume each statement runs in with that maximum allowed DOP. That is probably not the case, however if this is the case it may be worth elevating the cap a bit so statements can run faster with higher DOPs.
Secondly, the match does not quite add up...
The reason for my math being off is that the DOP is not the same as number of processes used to run this statement in parallel (or even better, it may not be the same - see partition wise joins for more on that). As Oracle uses producers and consumers the number of processes is often double the DOP. So for a DOP of 16, you often need 32 processes. Parallel_servers_target is set in processes (not in DOP), so on the conservative side we should assume 2 * DOP as the number of processes (hence the diagonal bar in the little graph blocks). And then the math all of a sudden adds up again...
Since we are doing the math, this is the formula to use:
You will have to think about the minimal concurrency per group in Database Resource Manager. Next up therefore, a little bit on DBRM and how you can set up groups in plans and work with the concurrency, queuing and DOP caps per group.
Before you get excited, this is only going to touch on a small aspect of DBRM, which helps with the parallel execution functionality mentioned above. But to place that in some context I will quickly put down the basics of DBRM in a picture so we all get on board on terminology and high level concepts.
As a simple example, the above shows an 8-node RAC cluster divided into three resource groups. The cool thing about this is that as long as no one in Group 2 and 3 is working, group 1 can have the entire machine's resources. As soon as the others start to submit work, group 1 gets throttled back to where each other group get its rightful resources.
In DBRM the main entity is the resource plan. A resource plan consists of groups resource allocations and policies (known as directives and thresholds). The above and below picture shows a simple plan with three (3) groups. The groups are called Static Reports, Tactical Queries and Ad-hoc Workload. Any request that comes into the database gets assigned to a specific group when this plan in active. Assignments can happen based on application context, based on Database user etc. Those are relatively static. The assignment can also happen based on estimated execution time, which is very dynamic.
In 220.127.116.11 each of the resource groups can have a separate parallel statement queue (as shown above).
Each group can have properties set to downgrade a statement to another group, or to reject the query from running in the first place if its estimated execution time is too long. You can also determine that a query gets terminated or cancelled if runs to long, this of course being evaluated on actual elapsed time. You'd want to catch the error in your application to make sure it is handled well:
ERROR at line 1:
ORA-12801: error signaled in parallel query server P000
ORA-00040: active time limit exceeded - call aborted
The Enterprise Manager screen below shows some of these settings.
Here we will focus on the directives for parallel execution.
Parallel Execution directives focus on both limiting the maximum DOP and on managing the statement queue as is shown here:
This is what we should be using to cap the DOP on a system. Rather than using parallel_degree_limit which is system wide, here we specify the cap on a per resource group basis. Dividing your workload into groups allows you to specifically dictate what max DOP a certain group can run with, without interfering with other queries. Just to clarify, the numbers assigned do not need to add up to any given total (like max_parallel_servers).
Slightly out of order, but it is much simpler than the next topic... This setting drives when a statement is ejected from the queue because it has been waiting too long. UNLIMITED means it will wait an unlimited number of seconds (also knows as eternity). The statement will "error out" after reaching the time limit.
This is a per group threshold to queuing. As the name implies, on a per group basis you say how much of the space before we start queuing is given to this group. If we have a setting of parallel_servers_target = 128 the the above allocates 25% of that 128 to the LONG_SQL_GROUP. Translated, this means that once 32 processes are in use by sessions running within LONG_SQL_GROUP next sessions within this group will be queued.
The total does not have to add up to 100%. In fact, UNLIMITED above equals to setting it to 100% allowing this group to only queue processes hit 128 (using the above numbers).
This over allocation has some interesting effects. The first is that it gives each group a certain leeway allowing it to queue a little later when the overall system is not busy. E.g. when nothing within MEDIUM and SHORT runs above, queuing will start no earlier than with 128 (e.g. 100%) processes in use for SHORT. When more processes run in other groups these groups use up their quota and SHORT will start queuing a bit sooner. That is because of the second phenomenon of never going over the limit set in parallel_servers_target for the system.
The graph above (not always to scale!) shows this over allocation and capping in an extreme form. The blue group (group A) is allowed to use max percentage parallel servers target = unlimited, yet because group B (the green guys) are already using their maximum quota group A will start queuing sooner than "expected".
When using more groups it therefore pays to consider raising the bar and increasing the overall head room for queuing.
A great many systems today leverage services to divide a set of resources. Services works nicely with both DBRM and with statement queuing giving you another level of working with resource management.
Consider two services created that divide the 8-node RAC system into 2 services called Gold and Silver. You can now layer the groups on top of the services. This allows you to separate the workloads in a physical environment (e.g. log onto service gold or silver) and then manage various consumer groups within that service.
By using server pools you can choose to change the service "size" and add or revoke a node from a service at specific times (for example to run a large ETL load on service Gold you may want to add a node to it).
Statement queuing gets still set within the group. But you must keep in mind that the service has only a limited number of nodes to its disposal and therefore a reduced number of processes in parallel_servers_target for the service. Each node has its number for parallel_servers_target and the aggregate for the whole system is Node count * parameter value. For the service it is node count within service * parameter value.
As you can see from the above, there are a lot of ways to create a workload management system. The crucial piece is to understand your workload and then set the system up to handle such a workload. Different workloads require different solutions!
For example the set up for a system that divides a data warehouse workload into short running, medium running and long running queries is different from a system that has OLTP style and DW style workloads.
The first example would use dynamic assignment to consumer groups of queries based on estimated execution time. The second would probably leverage a user/application driven mapping to consumer groups.
The latter may use a very wide range of parallel settings. OLTP would have very low DOP caps, almost never queue and get the high priority when on-line orders are being placed - e.g. maximize concurrency. The former would potentially focus on overall throughput and allow queuing to make sure few statements are suffering from too much capping.
I'll work with a small example in the next post to show some of the parameters in a more concise way and give some context for all of them... stay tuned.