Enabling Real-Time Analytics With Database In-Memory

  • September 4, 2018

Questions You Asked: What happens if you access a column that is not populated in the IM column store?

Andy Rivenes
Product Manager

I was speaking with a customer recently and they asked the question, "What happens if you access a column that is not populated in the IM column store?" I thought that would make a good blog post and sure enough, it did back in October of 2015. You can read it here. To be honest, I had forgotten that I had written it until I went back and looked for it. Since it was so long ago I thought I would repeat it and add a few more details.

When you populate an object in the IM column store it is possible to exclude columns. This can be useful if you have LOB or other columns in an object and your application does not access them from your analytic queries. If you don't populate a column then it won't take up space in the IM column store. But what if a query does access one or more of those excluded columns? Well it is an all or nothing proposition. If you access columns that are not populated then all columns will be accessed from the row store.

Let's see an example. This time around we'll use the CUSTOMER table from our SSB schema. Note that the table is fully populated in the IM column store:

Something that I didn't mention in the previous post is the ability to verify that a table's columns are eligible to be populated. We can see this if we query the view V$IM_COLUMN_LEVEL. In this case all of the columns are eligible and use the default compression level. We'll refer back to this view shortly.

Now we'll run a query to verify that we can access the CUSTOMER table in the IM column store:

We can see from the execution plan and the statistics that we accessed the CUSTOMER table from the IM column store. I have highlighted the key line in the execution plan and a couple of the key statistics.

Now let's exclude a column from the CUSTOMER table. I'm going to ALTER the table NO INMEMORY and then ALTER it INMEMORY with the C_NATION column excluded (i.e. NO INMEMORY). This will re-populate the CUSTOMER table without the C_NATION column. It is possible to just issue the ALTER command with the NO INMEMORY column exclusion, but then you don't get the benefit of memory savings until all of the IMCUs have been repopulated based on changes, and that could take a very long time, if ever. It will depend on how much of the table is modified, and in the meantime Database In-Memory will treat the column as if it was not populated so it just makes sense to remove it from the IM column store first.

Once we've verified that the CUSTOMER table has been fully populated let's run the same query that we ran above:

And we see that since Oracle Database knows that the C_NATION column has not been populated the Optimizer does not even consider an in-memory access. Instead it chooses TABLE ACCESS FULL and we see from the statistics that we did not access the IM column store.

We can also see that the column has been excluded in the V$IM_COLUMN_LEVEL view:

And finally, let's run a query that doesn't access the C_NATION column just to be sure that we can still access the rest of the table in the IM column store:

You can see that we are back to using TABLE ACCESS INMEMORY FULL and that the statistics show us that we did access the data in the IM column store.

If you are curious what query I'm using to display the execution plan and session statistics I've listed it below:

And now you have a little more detail about how Database In-Memory works when columns are excluded from the IM column store.


Join the discussion

Comments ( 4 )
  • Jonathan Lewis Wednesday, September 5, 2018

    This does introduce the question - for sufficiently sparse data could you exclude columns but use the in-memory store as a sort of bitmap index to access row data from the table after finding rowids from the in-memory data, something like:

    select c2.mktsegment, count(*)
    from customer c2
    where c2.rowid in (
    select /*+ no_merge */ c1.rowid
    from customer c1
    where c1.c_nation = 'JAPAN'
    group by c2.mktsegment;

    I'd be hoping that the c1 scan could be done in memory to identify a "small enough" number of rows that could be read from disc by rowid access on c2.
  • Andy Rivenes Monday, September 10, 2018
    Hi Jonathan,

    I think it's an interesting idea. Database In-Memory was designed based on the assumption that it would be costlier to access the data in the column store and then access any missing column data from the row store than it would be to just access the data from the row store. But it is entirely possible that, as you say, if the data is sufficiently sparse that it might be faster to find the qualifying rowids from the column store and then access those rows from the row store. I think it's an interesting idea, and of course it would have to be tested in each situation to see if there would really be any benefit.
  • RobK Tuesday, November 12, 2019
    I've been investigating something similar.
    My question was what happens if a table does not fit into the IM column store 100%-ly.

    --------------- ---------- ---------- ---------------- ------------- --------
    IMTEST_HIGH 14.69 35.52 35.52 OUT OF MEMORY HIGH

    So I had a segment where v$im_segments.BYTES_NOT_POPULATED>0

    I was surprised to see that the execution plan (DBMS_XPLAN.DISPLAY_CURSOR) showed in-memory plan:

    select max(s1) from IMTEST_HIGH

    Plan hash value: 3948097222

    | Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
    | 0 | SELECT STATEMENT | | 1 | | | 1235 (100)| | 1 |00:00:02.27 | 4548 | 4546 |
    | 1 | SORT AGGREGATE | | 1 | 1 | 51 | | | 1 |00:00:02.27 | 4548 | 4546 |
    | 2 | TABLE ACCESS INMEMORY FULL| IMTEST_HIGH | 1 | 200K| 9960K| 1235 (1)| 00:00:01 | 200K|00:00:02.26 | 4548 | 4546 |

    So the operation was TABLE ACCESS INMEMORY FULL, but we can see some blocks were read from disk.

    I was wondering what percentage of the data was read from IM and what percentage was read from disk (buffer cache was not used at all because of the full table scan)

    As it turned out IM column store was not used at all:
    Relevant and non-zero session stats:
    IM populate segments requested 1
    IM scan CUs invalid or missing revert to on disk extent 52
    table scan blocks gotten 4579
    table scan disk IMC fallback 200000
    table scan disk non-IMC rows gotten 1800
    table scan rows gotten 201800
    table scans (IM) 1
    table scans (direct read) 1
    table scans (long tables) 1

    My interpretation is the following:
    - All data was read from disk
    - IM column store was not used

    Questions and ideas:
    - Why is IM not used at all? Some data is in the IM column store. I did not find any relevant documentation.
    - I think the DBMS_XPLAN.DISPLAY_CURSOR could/should show some stats regarding IM usage (specially in the case when STATISTICS_LEVEL parameter is set to ALL). Would it be possible to add some new columns to it?
    - Maybe if the segment is not fully populated and IM column store will not be used at all, the execution plan should reflect it. Cannot the optimizer be aware of the fact that something is missing from the IM column store (and v$im_segment.populate_status is "OUT OF MEMORY")?
    - Also "table scans (IM)" statistic is misleading in this case

    Your feedback would be valuable.

  • Andy Rivenes Thursday, May 21, 2020
    Hi RobK, acutally the statistic tables scan rows gotten shows that data was read from the IM column store. If a table is not fully populated then the data that is populated will be read from the IM column store and the rest from the row store.
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.