Oracle Database 21c In-Memory Hybrid Scans

March 18, 2022 | 9 minute read
Andy Rivenes
Product Manager
Text Size 100%:

Another new feature in Oracle Database 21c is In-Memory Hybrid Scans. With In-Memory Hybrid Scans if you exclude columns for a table from being populated then the query can still access that table in the In-Memory (IM) column store. As long as the columns that have been excluded are "projection" columns (i.e. specified only in the SELECT list) then the query can run in-memory and the column values that have been excluded from population can be retrieved from the row store.

You might ask why you would want to do this? Many customers exclude columns from being populated to save space. After all, if one or more columns of a table are never accessed by analytic queries then there is no point in using the space in the IM column store to populate them. In previous releases the problem with excluding columns was that you had to make sure that none of the excluded columns were accessed by your analytic queries, otherwise performance for those queries suffered because the object had to be accessed from the row store.

With In-Memory Hybrid Scans in 21c even if the columns of a table have been excluded and are then part of some analytic queries there usually won't be a huge drop off in performance. Yes, the query will be slower than if all the columns were populated, but in most cases it will still be much faster than if the entire table has to be accessed from the row store. This is affected by how many values are being retrieved though. If for example, you are returning a very large number of values, relative to the total, then In-Memory Hybrid Scans may not help very much. The cost of running the query in the IM column store and then retrieving a large number of column values from the row store may not be any better than just accessing the data in the row store. However, if you are only returning a relatively small number of values then the query can run almost as fast as if all the columns were populated. As usual, there is a tradeoff but the space savings can be quite dramatic.

How do you tell if your query took advantage of In-Memory Hybrid Scans?

I ran a query using the SSB schema's LINEORDER table. I excluded the LO_REVENUE column, along with some other columns, in the LINEORDER table from population and ran the following query:

select sum(lo_revenue)
from lineorder
where LO_ORDERDATE = '19960102'
and lo_quantity > 40
and lo_shipmode = 'AIR';

As you can see from the execution plan below, the LINEORDER table was accessed with a new "HYBRID" attribute to the TABLE ACCESS INMEMORY FULL access method.

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |           |       |       | 68412 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE                          |           |     1 |    26 |            |          |        |      |            |
|   2 |   PX COORDINATOR                         |           |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)                   | :TQ10000  |     1 |    26 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                       |           |     1 |    26 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR                   |           |   749 | 19474 | 68412   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS INMEMORY FULL (HYBRID)| LINEORDER |   749 | 19474 | 68412   (1)| 00:00:03 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------------------

However, if in your zest to save space you eliminate columns that are used in the WHERE clause watch out. In our example SQL, if you also exclude one or more of the WHERE clause columns (i.e. LO_ORDERDATE, LO_QUANTITY or LO_SHIPMODE) then you will get the pre-21c behavior and probably an unpleasant performance surprise.

In this next example I also excluded the LO_ORDERDATE column and now you can see that the query did not access the LINEORDER table in-memory, but instead accessed the table in the row store with TABLE ACCESS FULL.

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |       |       | 68412 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE        |           |     1 |    26 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |           |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000  |     1 |    26 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |           |     1 |    26 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |           |   749 | 19474 | 68412   (1)| 00:00:03 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS FULL| LINEORDER |   749 | 19474 | 68412   (1)| 00:00:03 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------

What about those space savings I mentioned? Well in my environment when I populated the full LINEORDER table is looked like this:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
In-Memory            Bytes
OWNER           NAME                 STATUS               Disk Size             Size    Not Populated
--------------- -------------------- ------------- ---------------- ---------------- ----------------
SSB             LINEORDER            COMPLETED        3,693,232,128    3,245,342,720                0

Roughly 3 GB of space with the default compression level of MEMCOMPRESS FOR QUERY LOW. I then proceeded to repopulate the table with the following command, excluding a bunch of columns that the query didn't need to access:

alter table lineorder inmemory priority high
no inmemory (LO_ORDERPRIORITY, LO_SHIPPRIORITY, LO_EXTENDEDPRICE, LO_ORDTOTALPRICE, LO_DISCOUNT, LO_REVENUE, LO_SUPPLYCOST, LO_TAX, LO_COMMITDATE);

Now the amount of space required by the LINEORDER table is much less:

Copied to Clipboard
Error: Could not Copy
Copied to Clipboard
Error: Could not Copy
In-Memory            Bytes
OWNER           NAME                 STATUS               Disk Size             Size    Not Populated
--------------- -------------------- ------------- ---------------- ---------------- ----------------
SSB             LINEORDER            COMPLETED        3,693,232,128    1,509,949,440                0

About half the space. Now this is a contrived example, but the point is that the In-Memory Hybrid Scans feature can provide the ability to significantly reduce space in the IM column store while still preserving analytic query performance.

If you want to see more details about how In-Memory Hybrid Scans works you can check out this recent Database In-Memory Ask TOM Office Hours session. In this session we describe how In-Memory Hybrid Scans work and then we demonstrated these different examples in a 21c database.

 

Andy Rivenes

Product Manager

Andy Rivenes is a Senior Principal Product Manager at Oracle Corporation and is a Product Manager for Database In-Memory. Andy has been working with Oracle products since 1992 and spent a large part of his career managing Oracle databases for both custom and E-Business Suite applications. Andy now spends his time helping customers implement Database In-Memory and providing information about Database In-Memory through blogs, conferences, and customer visits.

Show more
Oracle Chatbot
Disconnected