As promised, the next entry in our 11gR2 explorations is In-Memory Parallel Execution. If you are going to Oracle OpenWorld next month, make sure you check out the following session:
Tuesday, October 13 2009 5:30PM, Moscone South Room 308
Extreme Performance with Oracle Database 11g and In-Memory Parallel Execution.
In this session you will get more details and insight from the folks who actually built this functionality! A must-see if this is of any interest to you, so book that ticket now and register!
Down to business, what is "In-Memory Parallel Execution"?
Let's begin by taking a quick trip down memory lane, back to Oracle Database 7 when Parallel Execution (PX) was first introduced. The goal of PX, then and now, is to reduce the time it takes to complete a complex SQL statement by using multiple processes to go after the necessary data instead of just one. Until now these parallel server processes typically bypassed the buffer cache and read the necessary data directly from disk. The main reasoning for this was that the objects accessed by PX were large and would not fit into the buffer cache; any attempt to read these large objects into the cache would have resulted in trashing the cache content.
However, as hardware systems have evolved, the memory capacity on a typical database server has become extremely large. Take for example the 2-CPU-socket Sun server used in the new Sun Oracle Database Machine: it has an impressive 72GB of memory, giving a full Database Machine (8 database nodes) over half a terabyte of memory. Suddenly, using the buffer cache to hold large objects doesn't seem so impossible any more.
In-Memory Parallel Execution (In-Memory PX) takes advantage of these larger buffer caches, but it also ensures we don't trash the cache.
In-Memory PX begins by determining whether the working set (the group of database blocks) necessary for a query fits into the aggregated buffer cache of the system. If the working set does not fit, the objects will be accessed via direct path IO just as they were before. If the working set does fit into the aggregated buffer cache, the blocks will be distributed among the nodes, and each block will be affinitized, or associated, with a single node.
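As a rough, back-of-the-envelope illustration of that sizing decision (not Oracle's internal heuristic), you can compare an object's size with the buffer cache available across all instances; the SALES.ORDERS table below is just an assumed example:

  -- Size of the object the parallel query will scan (example table)
  SELECT bytes/1024/1024 AS object_mb
  FROM   dba_segments
  WHERE  owner = 'SALES'
  AND    segment_name = 'ORDERS';

  -- Buffer cache on each instance; the sum across the cluster is the
  -- aggregated buffer cache In-Memory PX works against
  SELECT inst_id, bytes/1024/1024 AS buffer_cache_mb
  FROM   gv$sgainfo
  WHERE  name = 'Buffer Cache Size';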
In previous releases, if the parallel execution of one statement read part of an object into the buffer cache, subsequent SQL statements on other nodes in the cluster would access that data via Cache Fusion. This behavior could eventually result in a full copy of that table in every buffer cache in the cluster. In-Memory PX is notably different because Cache Fusion will not be used to copy the data from its original node to another node, even if a parallel SQL statement that requires this data is issued from another node. Instead, Oracle uses the parallel server processes on the node where the data resides to access it, and returns only the result to the node where the statement was issued.
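If you want to watch this in action while a parallel statement is running, a simple query against GV$PX_SESSION (standard PX monitoring, nothing specific to In-Memory PX) shows on which instances the parallel server processes are doing the work versus the instance that issued the statement:

  -- Count PX server sessions per instance, grouped by the instance that
  -- hosts the query coordinator (run while the parallel statement executes)
  SELECT inst_id, qcinst_id, COUNT(*) AS px_servers
  FROM   gv$px_session
  WHERE  server_group IS NOT NULL   -- exclude the query coordinator itself
  GROUP  BY inst_id, qcinst_id;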
The decision to use the aggregated buffer cache is based on an advanced set of heuristics that include the size of the object, the frequency at which the object changes and is accessed, and the size of the aggregated buffer cache. If the object meets these criteria it will be fragmented, or broken up into pieces, and each fragment will be mapped to a specific node. If the object is hash partitioned, each partition becomes a fragment; otherwise the mapping is based on the FileNumber and ExtentNumber.
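For a non-partitioned object, you can look at its extent layout to see the file number and extent number information such a mapping would be derived from (again using the assumed SALES.ORDERS table; this only shows the raw layout, not how Oracle actually assigns fragments to nodes):

  -- Extent layout of a non-partitioned table: file number and extent number
  SELECT file_id, extent_id, blocks
  FROM   dba_extents
  WHERE  owner = 'SALES'
  AND    segment_name = 'ORDERS'
  ORDER  BY file_id, extent_id;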
To leverage In-Memory PX you must set the initialization parameter PARALLEL_DEGREE_POLICY to AUTO (the default is MANUAL). Once this is set, the database controls which objects are eligible to be read into the buffer cache and which objects will reside there at any point in time. It is not possible to manually control this behavior for specific statements.
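For example, to switch the whole system over, or to try it out in a single session first:

  -- Enable automatic DOP, parallel statement queuing and In-Memory PX system-wide
  ALTER SYSTEM SET parallel_degree_policy = AUTO;

  -- Or test it in one session before changing the system default
  ALTER SESSION SET parallel_degree_policy = AUTO;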
Obviously this post is more of a teaser; for an in-depth discussion, come to OpenWorld and/or keep an eye out for a new white paper called Parallel Execution Fundamentals in Oracle Database 11gR2 that will be coming soon to oracle.com. The paper covers not only In-Memory PX but also Auto-DOP and parallel statement queuing.
Stay tuned for more on 11gR2 coming soon...