Monday Jan 11, 2010

Integrating Hadoop Data with Oracle Parallel Processing

[Read More]

Thursday Dec 03, 2009

Increase Performance while Reducing Cost

[Read More]

Monday Nov 30, 2009

Lock up Your Data Warehouse - Part 3

[Read More]

Wednesday Nov 04, 2009

Chunking of Updates

[Read More]

Wednesday Oct 21, 2009

Edition Based Redefinition

[Read More]

Thursday Oct 01, 2009

In-Database MapReduce (Map-Reduce)

[Read More]

Sunday Sep 27, 2009

In-Memory Parallel Execution in Oracle Database 11gR2

As promised, the next entry in our 11gR2 explorations is In-Memory Parallel Execution. If you are going to Oracle OpenWorld next month make sure you check out the following session:

Tuesday, October 13 2009 5:30PM, Moscone South Room 308
Session S311420
Extreme Performance with Oracle Database 11g and In-Memory Parallel Execution.

In this session you will get more details and insight from the folks who actually built this functionality! A must see if this is of any interest, so book that ticket now and register!

Down to business, what is "In-Memory Parallel Execution"?

Let's begin by having a quick trip down memory-lane back to Oracle Database 7 when Parallel Execution (PX) was first introduced. The goal of PX then and now is to reduce the time it takes to complete a complex SQL statement by using multiple processes to go after the necessary data instead of just one process. Up until now these parallel server processes, typically by-passed the buffer cache and read the necessary data directly from disk. The main reasoning for this was that the objects accessed by PX were large and would not fit into the buffer cache. Any attempt made to read these large objects into the cache would have resulted in trashing the cache content.

However, as hardware systems have evolved; the memory capacity on a typical database server have become extremely large. Take for example the 2 CPU socket Sun server being used in new the Sun Oracle Database Machine. It has an impressive 72GB of memory, giving a full Database Machine (8 database nodes) over ½ a TB of memory. Suddenly using the buffer cache to hold large object doesn't seem so impossible any more.

In-Memory Parallel Execution (In-Memory PX) takes advantage of these larger buffer caches but it also ensures we don't trash the cache.

In-Memory PX begins by determining if the working set (group of database blocks) necessary for a query fits into the aggregated buffer cache of the system. If the working set does not fit then the objects will be accessed via direct path IO just as they were before. If the working set fits into the aggregated buffer cache then the blocks will be distributed among the nodes and the blocks will be affinitzed or associated with that node.

In previous releases, if the Parallel Execution of one statement read part of an object into the buffer cache, then subsequent SQL statement on other nodes in the cluster would access that data via Cache Fusion. This behavior could eventually result in a full copy of that table in every buffer cache in the cluster. In-Memory PX is notably different because Cache Fusion will not be used to copy the data from its original node to another node, even if a parallel SQL statement that requires this data is issued from another node. Instead Oracle uses the parallel server process on the same node (that the data resides on) to access the data and will return only the result to the node where the statement was issued.

The decision to use the aggregated buffer cache is based on an advanced set of heuristics that include; the size of the object, the frequency at which the object changes and is accessed, and the size of the aggregated buffer cache. If the object meets these criteria it will be fragmented or broken up into pieces and each fragment will be mapped to a specific node. If the object is hash partitioned then each partition becomes a fragment, otherwise the mapping is based on the FileNumber and ExtentNumber.




To leverage In-Memory PX you must set the initialization parameter PARALLEL_DEGREE_POLICY to AUTO (default MANUAL). Once this is set, the database controls which objects are eligible to be read into the buffer cache and which object will reside there at any point in time. It is not possible to manual control the behavior for specific statements.

Obviously this post is more of a teaser, for in-depth discussions on this, go to Openworld and/or keep an eye out for a new white paper called Parallel Execution Fundemental in Oracle Database 11gR2 that will be coming soon to This paper not only covers In-Memory PX but Auto-DOP and parallel statement queuing.

Stay tuned for more on 11gR2 coming soon...

Monday Sep 14, 2009

Announcing something very cool...

[Read More]

Sunday Sep 06, 2009

11gR2 - What is in it for me?

[Read More]

Wednesday Aug 19, 2009

Partitioning or Backup Tables...?

[Read More]

Tuesday Aug 18, 2009

Partitioning and/or/with Exadata?

Got a little busy there with all the comments on the Netezza posting, but now we're back into some of the outstanding topics. This post is the next one on discussing some of the ODTUG session questions (see this post). One of the questions was about the use of partitioning and whether it is made obsolete by Exadata off-loading... In other words, should you look at one, the other or both? The answer is that you will want both, and there are a variety of reasons for that. First of all, on the query side you will hopefully be using partitioning (often range partitioning) for partition elimination. From an I/O perspective that looks roughly like this*:
In essence, partition pruning allows you to reduce a 5TB I/O operation to a much smaller I/O operation and therefore much faster return of the information. Compression is something that may or may not be used. In this example we are compressing the data and further reducing the I/O numbers. So far there is nothing new here, with Exadata however you will see a further reduction. After applying a smart scan, both the rows returned (remember Exadata is smart storage and actually knows rows and columns should be returned!) and the columns returned are further reduced. This is on top of partitioning. You will get something like this:
The conclusion from a query and I/O perspective is therefore that you will benefit from both.[Read More]

Wednesday Jul 29, 2009

Class: Optimizer - Good things get even better (Aug 5)

[Read More]

Sunday Jul 26, 2009

Lock up Your Data Warehouse – Part 2

[Read More]

Thursday Jul 02, 2009

Lock up your data warehouse - Part 1

[Read More]

Tuesday Jun 09, 2009

Unloading data using External Table and Data Pump

[Read More]

The data warehouse insider is written by the Oracle product management team and sheds lights on all thing data warehousing and big data.


« November 2015