Enabling Real-Time Analytics With Database In-Memory

Oracle Database In-Memory on RAC - Part 2

Andy Rivenes
Product Manager

In our previous post we described how data is automatically distributed across all of the In-Memory column stores (IM column stores) in a RAC cluster. We also discussed why Auto DOP should be used when accessing data within the IM column stores. In this second part of our In-Memory on RAC series we’ll explore how RAC services can be used to control where data is populated, and the subsequent access of that data.

First let's give some background on services.

In RAC, services can be used to control node affinity for scheduler jobs as well as enabling application partitioning. They can also be used as a form of connection management. This can allow groups of connections or applications to be directed to a subset of nodes in the cluster.

It’s this ability to direct workloads to a subset of nodes that we're interested in for our discussion, because we're going to talk about how services can enable In-Memory to be run on just a subset of nodes in a RAC cluster.

Recall that in part 1 we discussed how data is populated in-memory on RAC, as specified by the DISTRIBUTE sub-clause. The options for the DISTRIBUTE sub-clause are:

  • AUTO
  • BY ROWID RANGE
  • BY PARTITION
  • BY SUBPARTITION

The default option is AUTO, which means that Oracle will decide the best way to populate the object across the available IM column stores with the primary goal being to distribute an equal amount of data from the object into each IM column store.
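As an illustration (a sketch only; t_target is the table used in the examples later in this post, and the full syntax is covered in part 1), the DISTRIBUTE sub-clause is specified as part of a table's INMEMORY attribute:

```sql
-- Let Oracle decide how to spread the data across the IM column stores
-- (this is the default behavior):
ALTER TABLE t_target INMEMORY DISTRIBUTE AUTO;

-- Distribute by partition, e.g. when the table is partitioned and the
-- partitions are roughly equal in size:
ALTER TABLE t_target INMEMORY DISTRIBUTE BY PARTITION;
```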

So how do services impact which IM column stores will have data distributed across them, and which will be accessed for a given query?

As we said earlier, services allow an IM column store to be allocated on a subset of the nodes in a RAC cluster rather than on all of them. Let’s assume we have a four node RAC cluster but only want to allocate an IM column store on three nodes. We would need to do two things:

  1. Define a service to allow access to just the three nodes that we want to have an IM column store.
  2. Ensure that any parallel queries will only run on these three nodes. This can be done with the init.ora parameter PARALLEL_INSTANCE_GROUP.

Remember, both the population and access of IM column stores in RAC need to happen in parallel so that all IM column stores are accessible from any individual instance. We touched on this in part 1 of our series.

Below are the steps needed to set up In-Memory for a subset of nodes in a RAC cluster. There are actually two slightly different procedures depending on the PRIORITY setting for the objects being populated into the IM column stores. For objects with a PRIORITY of NONE (default) the population will only happen when the object is first accessed, which means the population will be initiated on the node that the query is run from. This is why service connections need to be restricted to just the nodes that have an IM column store. This is done by creating a service that only runs on IM column store nodes and then setting the init.ora parameter PARALLEL_INSTANCE_GROUP on the participating nodes to restrict parallel operations to just that service.

For any other value of PRIORITY the population will occur at instance startup and therefore could be initiated from any instance in the cluster. This requires that all instances, not just the instances running IM column stores, have the init.ora parameter PARALLEL_INSTANCE_GROUP set to the name of our In-Memory service. This has to be done to ensure that all parallelization tasks only run on the nodes hosting IM column stores. In either case, all client connections still have to be restricted to connecting to just the In-Memory service.

If you are using the PRIORITY clause set to NONE then the following steps are needed to have In-Memory work with RAC services:

  1. Create a service running on the instances that you want to use for the IM column store with a command of the following format:

    srvctl add service -db <database name> -s <service name> -preferred "<instance names>"

    In our example we might use a command like:

    srvctl add service -db dbmm -s dbmperf -preferred "dbm1, dbm2, dbm3"

    We will also need to start the service and set up TNS aliases to connect to the service:

    srvctl start service -db dbmm -service "dbmperf"
    DBMPERF =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = <host name>)(PORT = <listener port>))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = DBMPERF)
        )
      )
  2. Set the parameter PARALLEL_INSTANCE_GROUP to the service created in step 1, on just those instances that will host an IM column store:

    alter system set parallel_instance_group=<service name> scope=both sid=<sid name>;

    In our example we might use three commands like:

    alter system set parallel_instance_group=dbmperf scope=both sid='dbm1';
    alter system set parallel_instance_group=dbmperf scope=both sid='dbm2';
    alter system set parallel_instance_group=dbmperf scope=both sid='dbm3';
  3. Alter the table as INMEMORY with the PRIORITY clause set to NONE. You can use the DUPLICATE or DUPLICATE ALL sub-clause if running on an engineered system:

    alter table t_target INMEMORY <DUPLICATE | DUPLICATE ALL | NO DUPLICATE>
      priority NONE;

    In our example we would use a command like:

    alter table t_target inmemory duplicate priority none;

    Update (12/14/2014): I've changed my note about not using the DUPLICATE ALL sub-clause. In this case there is no problem using it. I also noted that the DUPLICATE sub-clause requires an engineered system and I eliminated the other PRIORITY choices since the only one we are considering in this example is NONE.

  4. Connect to the service created in the first step and run a query on the table, or use the DBMS_INMEMORY.POPULATE procedure, to start the population of the IM column stores.
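Once population has been initiated, one way to confirm that data was populated only on the intended instances is to query GV$IM_SEGMENTS. This is just a sketch (run from any instance as a suitably privileged user; T_TARGET is the example table from step 3):

```sql
-- Which instances hold In-Memory data for T_TARGET, and how complete
-- is the population on each?
SELECT inst_id,
       segment_name,
       populate_status,
       inmemory_size,
       bytes_not_populated
FROM   gv$im_segments
WHERE  segment_name = 'T_TARGET'
ORDER  BY inst_id;
```

With the dbmperf service defined above, only instances dbm1, dbm2 and dbm3 should appear in the output.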

Be aware that setting PARALLEL_INSTANCE_GROUP to a service on some nodes limits all parallel execution initiated on those nodes to the nodes in that service, regardless of whether the statements use the IM column store or not.
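A quick way to see which instances actually have the parameter set (a hypothetical check, not part of the steps above) is to query GV$PARAMETER:

```sql
-- Per-instance values of PARALLEL_INSTANCE_GROUP across the cluster
SELECT inst_id, value
FROM   gv$parameter
WHERE  name = 'parallel_instance_group'
ORDER  BY inst_id;
```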

If you have set the PRIORITY clause to any value other than NONE then the following steps are needed to have In-Memory work with RAC services:

  1. Create a service running on the instances that you want to use for the IM column store with a command of the following format:

    srvctl add service -db <database name> -s <service name> -preferred "<instance names>"

    In our example we might use a command like:

    srvctl add service -db dbmm -s dbmperf -preferred "dbm1, dbm2, dbm3"

    We will also need to start the service and set up TNS aliases to connect to the service:

    srvctl start service -db dbmm -service "dbmperf"
    DBMPERF =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = <host name>)(PORT = <listener port>))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = DBMPERF)
        )
      )
  2. Set the parameter PARALLEL_INSTANCE_GROUP to the service created in the previous step:

    alter system set parallel_instance_group=<service name> scope=both sid=<sid list>

    In our example we might use a command like:

    alter system set parallel_instance_group=dbmperf scope=both sid='*';
  3. Alter the table as INMEMORY with the PRIORITY clause. You can use the DUPLICATE or DUPLICATE ALL sub-clause if running on an engineered system:

    alter table t_target INMEMORY <DUPLICATE | DUPLICATE ALL | NO DUPLICATE>
      priority <CRITICAL | HIGH | MEDIUM | LOW>;

    In our example we might use a command like:

    alter table t_target inmemory duplicate priority critical;

    Update (12/14/2014): I've noted that the DUPLICATE sub-clause requires an engineered system.
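The In-Memory attributes just assigned can be verified in the data dictionary. As a sketch (the INMEMORY_* columns shown here were added to the *_TABLES views in 12.1.0.2):

```sql
-- Confirm the In-Memory attributes of the example table
SELECT table_name,
       inmemory,
       inmemory_priority,
       inmemory_distribute,
       inmemory_duplicate
FROM   user_tables
WHERE  table_name = 'T_TARGET';
```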

Be aware that setting PARALLEL_INSTANCE_GROUP to a service on all nodes limits all parallel execution to the nodes in that service only.

After these steps, users connecting to this service will be able to query the table from the In-Memory column store.

To summarize, we described how RAC services can be used to control where data is populated, and the subsequent access of that data. One of the key takeaways that we discussed in part 1 of our series is that we don't ship IMCUs across the interconnect in RAC, and that means that each IM column store must be visited in order to have access to all of the IMCUs for an In-Memory query. In a RAC environment, services enable the use of a subset of nodes while still ensuring that all In-Memory queries will be able to access all IM column stores.
