Load and Buffer Fast Enough for IoT

Ingesting large volumes of data from IoT devices? Learn how to make inserts and queries blazing fast in Oracle Database 19c with Memoptimized Rowstore.

By Arup Nanda

August 5, 2019

When Jane, the principal architect of Acme Bank, walked into the office of Chief Technology Officer Mark this morning, she could almost smell the frustration in the air. One of Mark’s key initiatives is to continuously get health data from all the externally located assets of the bank, such as ATMs, point-of-sale (POS) terminals, and even the laptops of the financial advisors into a central database. The objective is to analyze the incoming health data and provide an early warning about any failed component, be it a device such as an ATM or a segment of code, and take appropriate action. The board of directors of the bank approved the initiative after much deliberation, and the network team invested a lot of money and time to establish the infrastructure.

The project increased the amount of data coming into the database several thousandfold, and consequently the database was unable to handle the volume of concurrent inserts. The devices returned errors when the inserts failed, throwing the entire IoT network into chaos. Similarly, the health check processes looked up the collected data at several hundred thousand times the normal volume. For example, a process would read the previously recorded value of an ATM property, compare it against the data just collected, and make a decision, a million times a day, and it couldn't make those decisions unless the results came back quickly. These lookups were not as fast as expected, causing massive logjams in the health check process and rendering the whole program essentially ineffective.

These issues cast a large cloud over the entire IoT program, the brainchild of Mark, who desperately needs a solution immediately. Some people at Acme have suggested a different approach: using an eventually consistent NoSQL database that can supposedly scale better by deferring the data validation to the end. Although that may work, Mark is not enthusiastic about it. First, it defers the validation. Second, it’s yet another technology his team has to support, adding to the already complex footprint. And finally, it still doesn’t solve the need for super-quick lookup. Mark needs a solution within the same Oracle Database instance Acme uses today, and that’s why he has summoned Jane and others to this meeting in his office, hoping she can offer a way out of this debacle.

Yes, she can. Jane smiles and starts to explain the solution in detail. There are two issues, she begins: reading the data and writing the data.

Fast Lookups

First Jane explains how data access from a database works. When a user or an application issues a SELECT statement such as this

select col2 from tableA where col1 = :val

Oracle Database works to locate the specific row. Let’s assume that the tableA table has a primary key on the COL1 column, so locating a specific row with that column is the quickest way to access the data. Here is the general order of steps, Jane explains:

  1. Oracle Database traverses the primary key index to find the leaf block where the index entry for the selected row may reside.

  2. The database reads that leaf block of the index, which contains the pointer (ROWID) to the data block of the table. Jane reminds everyone that at this point the database still does not know whether a row for the query even exists; it has merely identified an index entry pointing to a data block in which the row may reside.

  3. Finally, the database session reads the data block of the table into the buffer cache of the System Global Area (SGA). The block goes into an empty slot in the buffer cache, called a buffer. If no buffer is available, the session must wait until one is freed.

Pointing to this sequence of events, Jane explains that there are two aspects of the process where optimization can improve performance.

  • The data block containing the row returned by the SELECT statement occupies one buffer in the buffer cache. That buffer competes with all the others and may be aged out of the cache if it hasn't been accessed in a while, forcing the database session to read the block from disk again. In a busy database, this eviction is very likely; a block is never guaranteed to stay in the buffer cache.

  • The SELECT activity above accesses three blocks in memory (more precisely, in the buffer cache); these accesses are known as consistent gets. Even when buffers are read from memory, bypassing disk access, the database must do work to locate the specific buffer containing the data. That work consumes CPU cycles and requires buffer cache latches, which ensure that two processes do not access the same buffer at the same time and corrupt it. Latches, in effect, serialize access to the buffers, especially buffers in the same group. Therefore, the fewer the consistent gets, the better the performance.

Jane stresses that these features of Oracle Database are designed to protect the data integrity above all and work in almost all cases. However, in special cases, such as the explosion in lookups by IoT devices, performance is more important, and this is where typical NoSQL databases have started to gain popularity. Mark doesn’t want to explore yet another technology and asks if this type of performance can be achieved in Oracle Database as well.

Yes, it can, responds Jane, via a feature of Oracle Database, introduced in 18c and greatly enhanced in 19c, called Memoptimized Rowstore. When a table is marked as optimized for reads, the table's blocks are placed in a special cache, separate from the buffer cache, eliminating the competition for buffers. This new memory area is called the memoptimize pool. Tables must be placed in the memoptimize pool explicitly; unlike with the buffer cache, they don't get there by default. In addition, the memoptimize pool is designed for very fast access, so typical serialization mechanisms such as latches are not present, making access faster.

The audience remains skeptical. Betty, a DBA, is the first to speak: Well, she opines, the same effect can be achieved with a separate buffer cache, such as the KEEP pool, where tables need to be marked specially to be placed there and not placed automatically, thereby reducing competition for buffers and increasing performance. This feature has been available for ages in Oracle Database, and she is unconvinced that the new Memoptimized Rowstore feature will produce a dramatic improvement.

True, Jane concedes, the KEEP pool reduces competition for buffers, but there’s still the problem of latches, which this new feature addresses. She asks Betty to hold on for more information on the KEEP pool until after the demonstration of the Memoptimized Rowstore fast lookup feature.

Dave, another DBA, is also unconvinced. This new memoptimize pool will have hundreds of thousands of buffers. The database still needs to figure out which specific buffer has what data. Won’t all that lookup slow down performance, he wonders. Good question, Jane agrees, and explains the second part of the new feature: the hash index, a structure in memory that maps primary keys to the buffers in the memoptimize pool. When users query the specially marked read-optimized table on the primary key, the database queries this hash index to locate the specific buffers in the memoptimize pool, bypassing the traditional buffer access methods.

The audience is interested now and wants to see the new feature in action. Jane turns to her demo and first creates the memoptimize pool by specifying its size:

alter system set memoptimize_pool_size = 100m scope=spfile;

Because this is not a dynamic parameter, Jane sets it in the SPFILE as shown and then restarts the database for it to take effect. This creates a new pool, completely different from the buffer cache of the database instance. To demonstrate, she checks the currently allocated memory areas in SGA, using this query:

select component, current_size from v$sga_dynamic_components;

COMPONENT                             CURRENT_SIZE
------------------------------------  ------------
shared pool                              352321536
large pool                                16777216
java pool                                        0
streams pool                                     0
unified pga pool                                 0
memoptimize buffer cache                 117440512
DEFAULT buffer cache                    1023410176
KEEP buffer cache                                0
RECYCLE buffer cache                             0
DEFAULT 2K buffer cache                          0
DEFAULT 4K buffer cache                          0
DEFAULT 8K buffer cache                          0
DEFAULT 16K buffer cache                         0
DEFAULT 32K buffer cache                         0
Shared IO Pool                            83886080
Data Transfer Cache                              0
In-Memory Area                                   0
In Memory RW Extension Area                      0
In Memory RO Extension Area                      0
ASM Buffer Cache                                 0

Pointing to the memoptimize buffer cache line in the output, she shows how much memory the database instance reserved for the new pool. This is separate from the DEFAULT buffer cache, shown on the next line; therefore, when the DBAs set SGA_TARGET to let Oracle Database size the buffer cache automatically, the memoptimize pool is not affected. Of the allocated memory, 75% is reserved for the memoptimize pool buffers and 25% for the hash index used to look up specific data elements in the pool.

Table Placement

With the memoptimize pool created, Jane now creates the tables. She picks an existing table being used at Acme, ATM_PROP, that stores the properties of each ATM, such as ambient temperature, cash balance, and so on. To compare the lookup performance, she first creates a table in the traditional manner:

create table atm_prop
(
        atm_id  number,
        prop_id number,
        prop_val varchar2(100),
        constraint pk_atm_prop primary key (atm_id, prop_id)
);

Then she inserts 100,000 records into the table:

begin
     for i in 1..1000 loop
              for j in 1..100 loop
                     insert into atm_prop
                     values
                     (i,j,
                     dbms_random.string('A',dbms_random.value(1,100))
                     );
             end loop;
      end loop;
end;
/

Next she creates the same table for the new Memoptimized Rowstore fast lookup feature:

create table atm_prop_ro
(
        atm_id  number,
        prop_id number,
        prop_val varchar2(100),
        constraint pk_atm_prop_ro primary key (atm_id, prop_id)
)
segment creation immediate
memoptimize for read
/

Jane draws everyone’s attention to the last two lines. The memoptimize for read clause makes the table exactly that—optimized for very fast reads—and enables it to be placed into the memoptimize pool. Jane reminds everyone that, starting with Oracle Database 11g, when a table is created, a segment—the physical representation of the table in data files—is not created immediately, by default. However, a very important prerequisite of read-optimized tables is that they must actually exist as segments; hence Jane includes the segment creation immediate clause.

What about existing tables, asks Betty. Can they be converted into "memoptimize for read" tables without being dropped and re-created?

Sure, assures Jane. She uses the following SQL to alter an existing table named MYTABLE:

alter table mytable memoptimize for read;

Can any table be converted, asks Betty? No, replies Jane. The table must meet these conditions:

  • It must be a heap table (not an index-organized table, clustered table, external table, and so on).

  • It must have a primary key.

  • The primary key must not be an identity column.

  • It must not be compressed.

  • It can be partitioned but not reference-partitioned.

  • It must have a physical segment.

In addition, Jane continues, the following must be true for the parameters of the database or the database session:

  • SQL trace must not be enabled.

  • STATISTICS_LEVEL must not be set to ALL.
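
The change is reversible, Jane adds. If a table should no longer be read-optimized, the attribute can be removed with the corresponding NO clause, shown here against the same example table:

alter table mytable no memoptimize for read;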

With the Memoptimized Rowstore table created, she inserts the 100,000 records as she did for the traditional table.

begin
      for i in 1..1000 loop
             for j in 1..100 loop
                     insert into atm_prop_ro
                     values
                     (i,j,
                     dbms_random.string('A',dbms_random.value(1,100))
                     );
             end loop;
      end loop;
end;
/

Then she collects the statistics on both of the tables:

SQL> exec dbms_stats.gather_table_stats(ownname=>'ACME', 
                                        tabname=>'ATM_PROP')
SQL> exec dbms_stats.gather_table_stats(ownname=>'ACME', 
                                        tabname=>'ATM_PROP_RO')

With the setup in place, she does a test by running a simple SQL statement to select the value of a property (PROP_ID=42) for a specific ATM (ATM_ID=367). She uses the set autotrace on explain stat command to show the execution plan followed by the query as well as the database statistics such as consistent gets after the query executes. First she queries the standard database table, ATM_PROP:

SQL> col PROP_VAL format a30
SQL> set autotrace on explain stat
SQL> select prop_val from atm_prop where atm_id = 367 and prop_id = 42;

PROP_VAL
------------------------------
UlSJKKkMrKeQMjgMZtQtYkPu


Execution Plan
----------------------------------------------------------
Plan hash value: 2816524897

------------------------------------------------------------------------------------
| Id | Operation		    |Name        |Rows |Bytes |Cost(%CPU)|Time     |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT	            |            |   1 |   58 |   2   (0)|00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID |ATM_PROP    |   1 |   58 |   2   (0)|00:00:01 |
|* 2 |   INDEX UNIQUE SCAN	    |PK_ATM_PROP |   1 |      |   1   (0)|00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ATM_ID"=367 AND "PROP_ID"=42)

Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  3  consistent gets
	  0  physical reads
	  0  redo size
	592  bytes sent via SQL*Net to client
	412  bytes received via SQL*Net from client
	  1  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

Now she issues the same query on the memoptimized table:

SQL> select prop_val from atm_prop_ro where atm_id = 367 and prop_id = 42;

PROP_VAL
------------------------------
UlSJKKkMrKeQMjgMZtQtYkPu


Execution Plan
----------------------------------------------------------
Plan hash value: 494331909

-----------------------------------------------------------------------------------------------
| Id |Operation                              |Name          |Rows |Bytes |Cost (%CPU)|Time    |
-----------------------------------------------------------------------------------------------
|  0 |SELECT STATEMENT                       |              |   1 |   58 |    2   (0)|00:00:01|
|  1 | TABLE ACCESS BY INDEX ROWID READ OPTIM|ATM_PROP_RO   |   1 |   58 |    2   (0)|00:00:01|
|* 2 |  INDEX UNIQUE SCAN READ OPTIM         |PK_ATM_PROP_RO|   1 |      |    1   (0)|00:00:01|
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ATM_ID"=367 AND "PROP_ID"=42)


Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  3  consistent gets
	  0  physical reads
	  0  redo size
	592  bytes sent via SQL*Net to client
	412  bytes received via SQL*Net from client
	  1  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

Betty examines the output carefully. She observes the difference in the execution plan in the second case with the following steps, not visible in the first case:

TABLE ACCESS BY INDEX ROWID READ OPTIM
INDEX UNIQUE SCAN READ OPTIM

These are the execution plan steps for the read-optimized table, Jane confirms. Instead of the TABLE ACCESS BY INDEX ROWID step for the original table, the TABLE ACCESS BY INDEX ROWID READ OPTIM step—a truncated version of TABLE ACCESS BY INDEX ROWID READ OPTIMIZED—shows how the optimizer chose a memoptimized method.

Betty points to one statistic in the output: the consistent gets, which are the same—three—in both cases. Earlier Jane mentioned that the consistent gets were the problem; the more there are, the worse the performance. So, Betty demands to know, how is this supposed to be better?

Quite right, this particular run is no better, smiles Jane. But there are new statistics that Oracle Database provides for inspecting memoptimize-related activity. She picks three of them from the V$SYSSTAT dynamic performance view.

select n.name, s.value
from v$sysstat s, v$statname n
where n.statistic# = s.statistic#
and n.name in ( 
    'memopt r lookups',
    'memopt r misses',
    'memopt r populate tasks accepted'
 );

NAME                                      VALUE
------------------------------------ ----------
memopt r lookups                              3
memopt r misses                               3
memopt r populate tasks accepted              1

Referring to the output, Jane points to the first statistic: memopt r lookups. It shows how many times the database has looked up the memoptimize pool; here, three times. But how many times did it actually find the data there? The next statistic, memopt r misses, provides the answer: the database missed the data in the pool all three times, which means it had to fall back to a conventional index lookup. That lookup costs one consistent get for the index root block, one for the index leaf block holding the pointer to the table, and one for the table block itself where the row is stored, for a total of three consistent gets.

Why did Oracle Database use the conventional index, asks a frustrated Betty. There is a simple answer, Jane explains; it’s because the table blocks are not yet populated in the memoptimize pool. The task is accepted, as shown in the memopt r populate tasks accepted statistic, but it’s not yet complete. However, Jane assures Betty that she can populate—immediately—the table in the memoptimize pool. She issues the following:

SQL> exec dbms_memoptimize.populate(schema_name=>'ACME', 
                                    table_name=>'ATM_PROP_RO');

PL/SQL procedure successfully completed.
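
Incidentally, Jane notes, the DBMS_MEMOPTIMIZE package also provides a companion procedure to evict a table's cached data from the memoptimize pool. The call below mirrors her populate call and is shown only for completeness; she does not run it now, because the next query depends on the populated pool:

SQL> exec dbms_memoptimize.drop_object(schema_name=>'ACME', 
                                       table_name=>'ATM_PROP_RO');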

Again, she issues the SELECT statement she issued earlier:

SQL> select prop_val from atm_prop_ro where atm_id = 367 and prop_id = 42;

PROP_VAL
------------------------------
UlSJKKkMrKeQMjgMZtQtYkPu


Execution Plan
----------------------------------------------------------
Plan hash value: 494331909

------------------------------------------------------------------------------------------------
| Id |Operation                              |Name          |Rows |Bytes |Cost (%CPU)|Time     |
------------------------------------------------------------------------------------------------
|  0 |SELECT STATEMENT                       |              |   1 |   58 |    2   (0)|00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID READ OPTIM|ATM_PROP_RO   |   1 |   58 |    2   (0)|00:00:01 |
|* 2 |  INDEX UNIQUE SCAN READ OPTIM         |PK_ATM_PROP_RO|   1 |      |    1   (0)|00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ATM_ID"=367 AND "PROP_ID"=42)


Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  0  consistent gets
	  0  physical reads
	  0  redo size
	732  bytes sent via SQL*Net to client
	423  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

Referring to the output, Jane draws her colleagues' attention to how the consistent gets value has dropped to 0. She then checks the same three memoptimize-related statistics again:

select n.name, s.value
from v$sysstat s, v$statname n
where n.statistic# = s.statistic#
and n.name in ( 
    'memopt r lookups',
    'memopt r misses',
    'memopt r populate tasks accepted'
 );

NAME                                      VALUE
-----------------------------------   ---------
memopt r lookups                             44
memopt r misses                              43
memopt r populate tasks accepted              2

Jane notes that the cumulative number of lookups is now 44, whereas the cumulative number of misses is 43. The last lookup was satisfied from the memoptimize pool, not from the buffer cache, making the consistent gets value 0, which is the desired outcome. When millions of IoT devices issue these seemingly small SELECT statements, the savings in consistent gets add up to enormous performance gains, she concludes. The audience, especially Mark, is ecstatic.

However, Jane cautions, for the query to take advantage of this new feature, it must meet certain conditions:

  • The query must be in the format SELECT … FROM … WHERE <primary key column> = ….

  • If the table has a multicolumn primary key (PK), all PK columns must be present in the WHERE clause.

  • No other columns (non-PK) are allowed in the WHERE clause.

  • The query can’t include the GATHER_PLAN_STATISTICS hint.

If the query doesn’t follow these rules, the session won’t complain; it will simply switch back to the conventional buffer cache silently, Jane warns. The audience duly notes that.
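
To make the rules concrete, here are a few illustrative queries against the demo table; only the first satisfies all the conditions and can use the fast lookup path:

-- eligible: the full composite primary key is in the WHERE clause
select prop_val from atm_prop_ro where atm_id = 367 and prop_id = 42;

-- not eligible: only part of the composite primary key is supplied
select prop_val from atm_prop_ro where atm_id = 367;

-- not eligible: a non-PK column appears in the WHERE clause
select prop_val from atm_prop_ro
where atm_id = 367 and prop_id = 42 and prop_val is not null;

The second and third queries still return correct results; they simply take the conventional buffer cache path.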

Comparison with Other Optimizations

Although the results were impressive, Betty still has some lingering doubts. Since the objective is to do a lookup from a table on primary keys, there are other ways to achieve the same result, including a nondefault buffer cache such as the KEEP pool and using an index-organized table—features that have been available in Oracle Database for years. Why is this new approach better, she wonders.

KEEP pool. The KEEP pool is a cache separate from the default buffer cache, with its own latches and access mechanism. Jane concedes that placing the table in the KEEP pool limits competition for its buffers to the queries accessing that table, which makes it better than the default buffer cache. However, she asks, is that good enough?

To address that question, she creates a new table similar to ATM_PROP_RO in the conventional manner and explicitly puts it and its primary key index in the KEEP buffer pool.

create table atm_prop_kp
(
        atm_id  number,
        prop_id number,
        prop_val varchar2(100),
        constraint pk_atm_prop_kp primary key (atm_id, prop_id)
)
storage (buffer_pool keep)
/
alter index pk_atm_prop_kp storage (buffer_pool keep)
/

Then she inserts the data from the original table and collects the statistics:

SQL> insert into atm_prop_kp select * from atm_prop;
SQL> commit;
SQL> exec dbms_stats.gather_table_stats(ownname=>USER, 
                                        tabname=>'ATM_PROP_KP')

Now she executes the same query to examine the execution plan and statistics of the query:

SQL> set autot on explain stat
SQL> select * from atm_prop_kp where atm_id = 367 and prop_id = 42;

    ATM_ID    PROP_ID PROP_VAL
---------- ---------- ------------------------------
       367	   42 UlSJKKkMrKeQMjgMZtQtYkPu


Execution Plan
----------------------------------------------------------
Plan hash value: 2329927764

-------------------------------------------------------------------------------------
| Id |Operation                   |Name           |Rows |Bytes |Cost(%CPU)|Time     |
-------------------------------------------------------------------------------------
|  0 |SELECT STATEMENT            |               |   1 |   58 |   2   (0)|00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID|ATM_PROP_KP    |   1 |   58 |   2   (0)|00:00:01 |
|* 2 |  INDEX UNIQUE SCAN         |PK_ATM_PROP_KP |   1 |      |   1   (0)|00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ATM_ID"=367 AND "PROP_ID"=42)


Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  3  consistent gets
	  0  physical reads
	  0  redo size
	592  bytes sent via SQL*Net to client
	412  bytes received via SQL*Net from client
	  1  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

Pointing to the output, Jane explains that there are still three consistent gets with this query, so this method is not better than the Memoptimized Rowstore approach.

Index-organized tables. Index-organized tables are special tables in Oracle Database that are essentially indexes with all the non-key columns stored alongside the primary key values, so there is no separate index for the table. This eliminates at least one block get per lookup (from the index to the table) and is therefore theoretically faster. But again, is it fast enough, Jane asks?

She creates an index-organized table like the original table, inserts the same amount of data (from the original table), gathers the statistics, and checks the execution plan and statistics of the same query:

create table atm_prop_iot
(
        atm_id  number,
        prop_id number,
        prop_val varchar2(100),
        constraint pk_atm_prop_iot primary key (atm_id, prop_id)
)
organization index
/

SQL> insert into atm_prop_iot select * from atm_prop;
SQL> commit;

SQL> exec dbms_stats.gather_table_stats(ownname=>USER, tabname=>'ATM_PROP_IOT')

SQL> select prop_val from atm_prop_iot where atm_id = 367 and prop_id = 42;

PROP_VAL
------------------------------
UlSJKKkMrKeQMjgMZtQtYkPu


Execution Plan
----------------------------------------------------------
Plan hash value: 2993696207

-------------------------------------------------------------------------------
| Id | Operation         | Name            |Rows |Bytes |Cost(%CPU)| Time     |
-------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                 |   1 |   58 |   2   (0)| 00:00:01 |
|* 1 |  INDEX UNIQUE SCAN| PK_ATM_PROP_IOT |   1 |   58 |   2   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ATM_ID"=367 AND "PROP_ID"=42)


Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	732  bytes sent via SQL*Net to client
	424  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

Pointing to the output, Jane notes that the consistent gets statistics dropped to two, from three in the default case, but it’s still two too many. The read-optimized table performs much better than these conventional solutions and meets the requirements of the IoT program. Mark is relieved.

Fast Ingest

But there is still the second problem Acme needs to solve. The IoT devices generate a lot of data to be ingested, and the database simply can't keep up with that volume of concurrent inserts. Is there a solution to that, asks Mark.

Yes, there is, smiles Jane. Alongside the fast lookup feature, the Memoptimized Rowstore also includes a fast ingest feature that uses a write-optimized table, she responds. To demonstrate, she creates a table named ATM_PROP_WO, similar to the ATM_PROP table, with the new memoptimize for write clause. Just as with read-optimized tables, these tables need their segments created along with the table, so she includes the segment creation immediate clause, as she did in the previous (memoptimize for read) case. Here is the SQL:

create table atm_prop_wo
(
        atm_id  number,
        prop_id number,
        prop_val varchar2(100),
        constraint pk_atm_prop_wo primary key (atm_id, prop_id)
)
segment creation immediate
memoptimize for write
/

Now this table is ready for fast ingestion of data. However, this in itself is not enough, she continues. The INSERT statement must include the memoptimize_write hint to quickly insert the data. Jane writes two PL/SQL blocks, one each to insert 100,000 records into ATM_PROP (the conventional table) and ATM_PROP_WO (the write-optimized table), respectively. She sets the timing on the SQL*Plus prompt to record the time it takes for each block. First, she truncates the ATM_PROP table and inserts into it:

begin
     for i in 1..1000 loop
              for j in 1..100 loop
                      insert into atm_prop
                      values
                      (i,j,
                      dbms_random.string('A',dbms_random.value(1,100))
                      );
              end loop;
     end loop;
end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:28.37

Then she inserts into the write-optimized table, taking care to include the required memoptimize_write hint in the INSERT statement:

begin
      for i in 1..1000 loop
             for j in 1..100 loop
                     insert /*+ memoptimize_write */ into atm_prop_wo
                     values
                     (i,j,
                     dbms_random.string('A',dbms_random.value(1,100))
                     );
             end loop;
      end loop;
end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:12.16

This time it took only 12.16 seconds, down from the original 28.37 seconds—a huge drop. The audience is impressed but curious nonetheless about how that performance actually came about.

When a user issues an INSERT statement, Oracle Database puts that data into a buffer in the buffer cache, Jane explains, and competition for the buffer cache affects performance. For write-optimized tables, Oracle Database adds the data not to the buffer cache but to a new region in the large pool, she elaborates. This bypasses the checks and overhead associated with placing data in the buffer cache. The database then drains the records from the large pool in batches, in the background, and writes them to the table; the user doesn't have to wait for the write to disk to happen.

How will those records be written to the data files, Betty wants to know. They are written to the database eventually and automatically, Jane explains, but there is a way to force the writing as well. She uses the following SQL to force the writing of the inserted records into the database:

SQL> exec dbms_memoptimize_admin.writes_flush

Now the records are written from the write-optimized table to the database. Jane further shows ways to examine the activities of the flushing mechanism from the V$MEMOPTIMIZE_WRITE_AREA dynamic performance view:

SQL> select * from v$memoptimize_write_area;

TOTAL_SIZE USED_SPACE FREE_SPACE NUM_WRITES NUM_WRITERS     CON_ID
---------- ---------- ---------- ---------- ----------- ----------
959447040     1121696  958325344          0           1          3

Pointing to the output, Jane explains what these columns mean:

  • TOTAL_SIZE shows the total size of the pool available for inserts into write-optimized tables.

  • USED_SPACE shows the space currently occupied by the buffered writes.

  • FREE_SPACE shows the free space available for future writes.

  • NUM_WRITES shows the number of write operations performed so far.

  • NUM_WRITERS shows the number of processes writing the data.
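A quick way to watch the buffer fill up is to derive a utilization percentage from the same view. This query is a simple sketch of that idea, built only on the columns shown above:

```sql
-- Percentage of the fast-ingest buffer currently in use,
-- plus writer statistics, from V$MEMOPTIMIZE_WRITE_AREA.
select round(100 * used_space / total_size, 2) as pct_used,
       num_writes,
       num_writers
  from v$memoptimize_write_area;
```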

The earlier example was a record-by-record insert. Jane demonstrates a second example, in which she inserts the same 100,000 records into both tables (standard and write-optimized) from a table she built earlier, ATM_PROP_RO. She truncates both tables, inserts the records in a batch, and records the time:

SQL> truncate table atm_prop_wo;
SQL> truncate table atm_prop;
SQL> insert into atm_prop select * from atm_prop_ro;

100000 rows created.

Elapsed: 00:00:01.61

SQL> insert into atm_prop_wo select * from atm_prop_ro;

100000 rows created.

Elapsed: 00:00:01.36

The time dropped from 1.61 seconds to 1.36 seconds, which is not as dramatic as the time drop for record-by-record inserts from the standard to the write-optimized table. The reason, Jane explains, is that the records were already somewhat batched in this case. But in a typical IoT device, the data comes in streams, not bulk inserts, so the write-optimized-table approach produces huge performance increases. Mark concurs.

Can any table be write-optimized, Betty asks. No, answers Jane. Tables with the following properties can’t be write-optimized:

  • Disk compression
  • In-memory compression
  • Column default values
  • Encryption
  • Function-based indexes
  • Domain indexes
  • Bitmap indexes
  • Bitmap join indexes
  • Ref types
  • Varray types
  • OID$ types
  • Subpartition stats
  • Unused columns
  • Virtual columns
  • LOBs
  • Triggers
  • Binary columns
  • Foreign keys
  • Row archival
  • Invisible columns
  • Temporary tables
  • Nested tables
  • Index-organized tables
  • External tables
  • Materialized views with on-demand refresh

Although the list seems extraordinarily long, it’s not all that bad, Jane hastens to add. For Acme, the write-optimized table use case is simple: to ingest as much data as possible in the shortest-possible time from the IoT devices that generate the data, such as ATMs and POS devices. These tables are usually standard database tables, so this seemingly long exclusion list has no impact.
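To check which tables already have fast ingest enabled, the data dictionary can be queried. This sketch assumes the MEMOPTIMIZE_WRITE column of DBA_TABLES, which recent releases expose:

```sql
-- List tables with fast ingest (MEMOPTIMIZE FOR WRITE) enabled.
select owner, table_name
  from dba_tables
 where memoptimize_write = 'ENABLED'
 order by owner, table_name;
```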

Because the inserts are so fast, Betty asks, can we convert all our tables to write-optimized ones, assuming, of course, that they qualify?

No, responds Jane quickly. Again, write-optimized tables are just for a specific use case where data comes in record by record and must be ingested as quickly as possible. There are limitations to this approach we have to consider, Jane warns:

  • If the database crashes, the data may be lost if it hasn’t been written to the database yet.

  • Because the data first goes to the large pool, other sessions can’t see that data.

  • Because the data can’t be validated at INSERT time, it is validated later, when it is written to the database. Any errors, such as running out of space, surface only at that point.

  • Parent key values must be written before child data.

All of these limitations may sound ominous, but in reality, Jane assures her colleagues, they don’t have much impact on the IoT data being ingested at Acme. The data comes in streams, and the primary key values are always incremented, so there is no chance of conflict. In the rare chance that there is a conflict, it’s likely a bug and the record should be discarded anyway. The data is merely inserted for analysis later; no other process looks for the data at the same time, unlike in transactional systems.

To reduce the possibility of data loss due to a system failure, put the data in a simple queue, call dbms_memoptimize_admin.writes_flush periodically, and delete the queue after a successful call. Overall, this provides a very fast ingestion mechanism for data coming from IoT devices.
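That queue-and-flush idea can be sketched as a small procedure, which would typically be run from a scheduler job. The ingest_queue table, its queued_at column, and the five-minute retention window are all assumptions for illustration:

```sql
-- Hedged sketch: force buffered fast-ingest rows to disk, then prune
-- queue entries old enough that their rows are safely persisted.
-- ingest_queue and queued_at are hypothetical names.
create or replace procedure flush_and_prune as
begin
    dbms_memoptimize_admin.writes_flush;   -- flush the large-pool buffer
    delete from ingest_queue
     where queued_at < systimestamp - interval '5' minute;
    commit;
end;
/
```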

Mark and the rest of the team are very impressed and thank Jane profusely. The meeting is adjourned.

Next Steps

LEARN more about the Memoptimized Rowstore.

READ more about the memoptimize pool setting.

Illustration by Wes Rowell