Up until now our blog posts have focused on getting you started with Oracle Database In-Memory on a single instance Oracle Database. So, we thought it would be a good idea to broaden the scope and look what happens with In-Memory in a RAC environment.
When running Oracle Database In-Memory in a RAC environment it is important to consider that each instance will have its own In-Memory column store (IM column store). By default objects populated into memory will be distributed across all of the IM column stores in the cluster. Conceptually it helps to think of Oracle Database In-Memory in a RAC environment as a shared-nothing architecture for queries (although it is much more flexible than a true shared-nothing databases). This means that parallel server processes need to be employed to execute queries that access in-memory objects, as we do not ship IMCUs between the RAC instances. This post explains how parallelization should be used when running Oracle Database In-Memory in a RAC environment.
But let’s start with some basic background information on how the IM column stores are populated on RAC and then discuss how to manage parallelization for Oracle Database In-Memory in a RAC environment.
In a RAC environment, how an object is populated into the IM column store is controlled by the DISTRIBUTE sub-clause. By default the DISTRIBUTE sub-clause is set to AUTO. This means that Oracle will choose the best way to distribute the object across the available IM column stores using one of the following options unless the object is so small that it only consists of 1 IMCU, in which case it will reside on just one RAC instance:
The distribution is performed by w0* background processes as a part of the segment population task. The goal of the data distribution is to put an equal amount of data from an object on each RAC instance. If your partitioned strategy results in a large data skew (one partition is much larger than the others), we recommend you override the default distribuiton (BY PARTITION) by manually specifying DISTRIBUTE BY ROWID RANGE.
On Oracle Engineered Systems the DUPLICATE sub-clause of the INMEMORY attribute can also be used to mirror the IMCUs for any objects specified with the DUPLICATE sub-clause.
When data is populated in-memory in a RAC environment it is affinitized to a specific RAC instance. This means that we need to access the IMCUs in the IM column stores in which they reside. We don’t ship IMCUs across the interconnect to the RAC instance on which the query is being executed, this is why all in-memory queries must be parallelized with a degree equal to at least the number of instances that were involved in the distributed population. Each IM column store must be visited in order to have access to all of the IMCUs for the object(s).
Automatic Degree of Parallelism, or Auto DOP, allows Oracle Database to automatically decide the degree of parallelism for all SQL statements. Setting the init.ora parameter PARALLEL_DEGREE_POLICY to AUTO enables Auto DOP and allows the Oracle Database to determine whether the statement should run in parallel based on the cost of running the statement. There are two other parameters that affect Auto DOP, PARALLEL_MIN_TIME_THRESHOLD and PARALLEL_DEGREE_LIMIT. These parameters ensure that trivial statements are not parallelized and that database resources are not overused. When Oracle Database In-Memory is used, the PARALLEL_MIN_TIME_THRESHOLD parameter is automatically set for you.
In a RAC environment, Auto DOP must always be used in order to guarantee that the degree of parallelism (DOP) chosen will result in at least one parallel server process being allocated for each active instance. This is what is meant by parallel scans being affinitized for Oracle Database In-Memory. The parallel query coordinator is aware of the home locations of the IMCUs and the only way to insure that the cost-based DOP calculation will be greater than or equal to the number of IM column store instances is to have Auto DOP set. If Auto DOP is not set then the parallel query coordinator will not make use of the home locations of the IMCUs (i.e. instance location awareness) and there is no guarantee that the IM column store will be used to access all of the data needed for the query.
If you think about this it makes sense. There has to be enough parallel server processes available to run on each instance where IMCUs for the object(s) involved reside. Otherwise, some of the data will have to come from the buffer cache or disk as there's no other mechanism in the Oracle Database to allow a single query to access multiple RAC instances.
The following two examples show the same query accessing a scaled up version of the scott.emp table that has been range partitioned by salary (0-2499), (2500-3999) and (4000-6000) with the three resulting partitions distributed across a 3 node RAC cluster, with each RAC instance hosting one partition in memory.
The following steps were used to run the query were:
The first example is run with Auto DOP set to AUTO (enabled). Note that the query went parallel and Auto DOP was used. You will also notice that there are almost no physical reads and the number of consistent gets is reasonably low:
The second example is run with Auto DOP set to MANUAL (disabled) so no parallel execution is used. The number consistent gets is much higher and the number of physical reads almost match the consistent gets. Even though the execution plan shows "TABLE ACCESS IMMEMORY FULL" the majority of the execution went to the buffer cache.
If you are already familiar with Auto DOP and are using it in production today, you are probably using it in conjunction with Oracle Resource Manager (Resource Manager). If you specify a parallel limit for specific consumer groups via Resource Manager, make sure you set the parallel limit to be a minimum of the total number of RAC instances or a multiple of the number of RAC instances, so we get an even distribution of parallel server processes across the cluster.
There is currently a known issue with Auto DOP when running Oracle
Database In-Memory in a RAC environment. In some cases, even with Auto
DOP set correctly, the optimizer may calculate the cost of a serial
access to be less than the cost of a parallel access. This has been
identified as bug 18960760, and will usually only happen when very small
tables in the IM column store are involved in the query. The current
workaround is to enable a fix control. You can do so using the following
ALTER SYSTEM SET "_fix_control"="18960760:on";
If that still doesn't help then you can specify the /*+ parallel */ hint for the SQL statement.
The current In-Memory bug numbers associated with parallel execution are:
In the second part of this series on Oracle Database In-Memory on RAC we will describe how In-Memory works with RAC services.