X

Enabling Real-Time Analytics With Database In-Memory

  • September 4, 2018

Questions You Asked: What happens if you access a column that is not populated in the IM column store?

Andy Rivenes
Product Manager

I was speaking with a customer recently and they asked the question, "What happens if you access a column that is not populated in the IM column store?" I thought that would make a good blog post and sure enough, it did back in October of 2015. You can read it here. To be honest, I had forgotten that I had written it until I went back and looked for it. Since it was so long ago I thought I would repeat it and add a few more details.

When you populate an object in the IM column store it is possible to exclude columns. This can be useful if you have LOB or other columns in an object and your application does not access them from your analytic queries. If you don't populate a column then it won't take up space in the IM column store. But what if a query does access one or more of those excluded columns? Well it is an all or nothing proposition. If you access columns that are not populated then all columns will be accessed from the row store.

Let's see an example. This time around we'll use the CUSTOMER table from our SSB schema. Note that the table is fully populated in the IM column store:

Something that I didn't mention in the previous post is the ability to verify that a table's columns are eligible to be populated. We can see this if we query the view V$IM_COLUMN_LEVEL. In this case all of the columns are eligible and use the default compression level. We'll refer back to this view shortly.

 
Now we'll run a query to verify that we can access the CUSTOMER table in the IM column store:


We can see from the execution plan and the statistics that we accessed the CUSTOMER table from the IM column store. I have highlighted the key line in the execution plan and a couple of the key statistics.

Now let's exclude a column from the CUSTOMER table. I'm going to ALTER the table NO INMEMORY and then ALTER it INMEMORY with the C_NATION column excluded (i.e. NO INMEMORY). This will re-populate the CUSTOMER table without the C_NATION column. It is possible to just issue the ALTER command with the NO INMEMORY column exclusion, but then you don't get the benefit of memory savings until all of the IMCUs have been repopulated based on changes, and that could take a very long time, if ever. It will depend on how much of the table is modified, and in the meantime Database In-Memory will treat the column as if it was not populated so it just makes sense to remove it from the IM column store first.

 
Once we've verified that the CUSTOMER table has been fully populated let's run the same query that we ran above:


And we see that since Oracle Database knows that the C_NATION column has not been populated the Optimizer does not even consider an in-memory access. Instead it chooses TABLE ACCESS FULL and we see from the statistics that we did not access the IM column store.

We can also see that the column has been excluded in the V$IM_COLUMN_LEVEL view:

 
And finally, let's run a query that doesn't access the C_NATION column just to be sure that we can still access the rest of the table in the IM column store:

 
You can see that we are back to using TABLE ACCESS INMEMORY FULL and that the statistics show us that we did access the data in the IM column store.

If you are curious what query I'm using to display the execution plan and session statistics I've listed it below:

And now you have a little more detail about how Database In-Memory works when columns are excluded from the IM column store.

 

Join the discussion

Comments ( 2 )
  • Jonathan Lewis Wednesday, September 5, 2018
    Andy,

    This does introduce the question - for sufficiently sparse data could you exclude columns but use the in-memory store as a sort of bitmap index to access row data from the table after finding rowids from the in-memory data, something like:

    select c2.mktsegment, count(*)
    from customer c2
    where c2.rowid in (
    select /*+ no_merge */ c1.rowid
    from customer c1
    where c1.c_nation = 'JAPAN'
    )
    group by c2.mktsegment;


    I'd be hoping that the c1 scan could be done in memory to identify a "small enough" number of rows that could be read from disc by rowid access on c2.
  • Andy Rivenes Monday, September 10, 2018
    Hi Jonathan,

    I think it's an interesting idea. Database In-Memory was designed based on the assumption that it would be costlier to access the data in the column store and then access any missing column data from the row store than it would be to just access the data from the row store. But it is entirely possible that, as you say, if the data is sufficiently sparse that it might be faster to find the qualifying rowids from the column store and then access those rows from the row store. I think it's an interesting idea, and of course it would have to be tested in each situation to see if there would really be any benefit.
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.