Lock-free reservation: scale your apps

In a previous post, my colleague Ulrike Schwinn introduced the new “lock-free reservation” feature. I recommend you to read this post (Lock-free reservation: how to start with) before entering in scalability considerations: it’s important that you understand the concepts and their implementation in the Oracle database before any further reading.

In the context of transactions updating low cardinality tables, scalability problems usually arise, and grow exponentially with concurrency. Imagine an ecommerce application that would update a number of units of a product each time this product is being purchased. Typically, a potentially huge number of concurrent sessions would try to update a low cardinality table, to decrease the number of units of a given product. This would cause scalability problems, as each concurrent session will need to acquire an exclusive lock on a particular row in the stock table. This is not easy to mitigate with standard locking mechanism.

Lock-free reservation features introduces the concept of “journal table”: instead of updating a row, we insert a new row in a journal table, avoiding to lock the row we want to update. On commit, an exclusive lock is needed to actually update the row. But as it is only “on commit”, the lock is released very quickly.

In this context, what is the scalability benefit that we can get with this new feature: let’s go through a few tests, against a Oracle Database 23ai free!

Test 1: Standard locking – OLTP mode

For this test, we are going to use the following code:

--- Create a table: we will update concurrently the VAL column
    create table T_COUNTER2
(
    ID number primary key,
    VAL NUMBER
);

-- Insert a couple of rows

insert into T_COUNTER2(ID,VAL) values (0,0),(1,0);

commit;

-- Create a procedure that will act as an online application
-- We add a sleep to simulate a think time 

CREATE OR REPLACE PROCEDURE PC_UPD_COUNTER2_TT
IS
    v_id PLS_INTEGER;
    v_incr PLS_INTEGER;
BEGIN
    v_id := mod(floor(dbms_random.value(1,1000000)),2);
    v_incr := mod(floor(dbms_random.value(1,1000000)),2);
    --
    update T_COUNTER2
    set VAL = VAL+v_incr
    where ID = v_id;
    --- Sleep 1s to simulate think time
    DBMS_SESSION.SLEEP(1);
    --
    commit;
END PC_UPD_COUNTER2_TT;
/

Now we will use this PL/SQL procedure to generate some concurrency. Open two SQL*Plus sessions and run the following code in each one:

BEGIN
    for i in 1..200
    LOOP
        PC_UPD_COUNTER2_TT;
    END LOOP;
END;
/

Take care to collect an AWR snapshot before and after running this code. Then generate an AWR report between these two snapshots.

This is the AWR output for the TOP 10 foreground events:

Top10 foreground events with standar row locking

You can observe that more than 99% of the DB Time was spent on event “enq: TX – row lock contention”, which will generate a clear scalability problem. This is because of the standard row locking mechanism: updating a row requires acquiring an exclusive lock on this row.
And because of the 1sec think time, this lock is held 1sec during each update operation.

Test 2: Using lock-free reservation columns – OLTP mode

Now let’s build the same use case using the new lock-free reservation columns:

-- Create a table with a RESERVABLE column:

create table T_COUNTER
(
    ID number primary key,
    VAL NUMBER RESERVABLE
);

-- Insert a couple of rows:

insert into T_COUNTER(ID,VAL) values (0,0),(1,0);

commit;

-- Create a PL/SQL procedure, that will be used to simulate an online application
--  We add a sleep to simulate a think time

CREATE OR REPLACE PROCEDURE PC_UPD_COUNTER_TT
IS
    v_id PLS_INTEGER;
    v_incr PLS_INTEGER;
BEGIN
    v_id := mod(floor(dbms_random.value(1,1000000)),2);
    v_incr := mod(floor(dbms_random.value(1,1000000)),2);
    --
    update T_COUNTER
    set VAL = VAL+v_incr
    where ID = v_id;
    --- Sleep 1s to simulate think time
    DBMS_SESSION.SLEEP(1);
    --
    commit;
END PC_UPD_COUNTER_TT;
/

Now we will use this PL/SQL procedure to generate some concurrency. Open two SQL*Plus sessions and run the following code in each one:

BEGIN
    for i in 1..200
    LOOP
        PC_UPD_COUNTER_TT;
    END LOOP;
END;
/

Take care to collect an AWR snapshot before and after running this code. Then generate an AWR report between these two snapshots.

This is the AWR output for the TOP 10 foreground events:

Top10 foreground events using reservable columns

Observe how the waits on “enq: TX – row lock contention” disappeared: as the exclusive lock is only required on commit, the application sessions don’t spend think time holding an exclusive lock on the row they want to update!

But what if there is no think time ?
In other words: what if we remove the “sleep” instruction from our code, to simulate a yet not so uncommon “row by row” batch?

Test 3: Standard locking – batch mode

For this test, we use the following code:

--- Create a table: we will update concurrently the VAL column

create table T_COUNTER2
(
    ID number primary key,
    VAL NUMBER
);

-- Insert a couple of rows

insert into T_COUNTER2(ID,VAL) values (0,0),(1,0);

commit;

-- Create a procedure that will act as an row by row batch processing

CREATE OR REPLACE PROCEDURE PC_UPD_COUNTER2
IS
    v_id PLS_INTEGER;
    v_incr PLS_INTEGER;
BEGIN
    v_id := mod(floor(dbms_random.value(1,1000000)),2);
    v_incr := mod(floor(dbms_random.value(1,1000000)),2);
    --
    update T_COUNTER2
    set VAL = VAL+v_incr
    where ID = v_id;
    --
    commit;
END PC_UPD_COUNTER2;
/

Now we will use this PL/SQL procedure to generate some concurrency. Open two SQL*Plus sessions and run the following code in each one:

BEGIN
    for i in 1..100000
    LOOP
        PC_UPD_COUNTER2;
    END LOOP;
END;
/

Take care to collect an AWR snapshot before and after running this code. Then generate an AWR report between these two snapshots.

This is the AWR output for the TOP 10 foreground events:

Top10 foreground events - Standard locking with batch processing

Observe that even if there are waits on event “enq: TX – row lock contention”, their percentage of the DB time is not representing a scalability problem. It’s because as there is no think time, the exclusive lock is held very short time (70 microseconds on average vs almost 1 sec on average during Test 1).

Now let’s repeat this test with a reservable column.

Test 4: Using lock-free reservation columns – batch mode

For this last test, we will use the following code:

-- Create a table with a RESERVABLE column:

create table T_COUNTER
(
    ID number primary key,
    VAL NUMBER RESERVABLE
);

-- Insert a couple of rows:

insert into T_COUNTER(ID,VAL) values (0,0),(1,0);

commit;

-- Create a PL/SQL procedure that will be used to simulate a row-by-row batch

CREATE OR REPLACE PROCEDURE PC_UPD_COUNTER
IS
    v_id PLS_INTEGER;
    v_incr PLS_INTEGER;
BEGIN
    v_id := mod(floor(dbms_random.value(1,1000000)),2);
    v_incr := mod(floor(dbms_random.value(1,1000000)),2);
    --
    update T_COUNTER
    set VAL = VAL+v_incr
    where ID = v_id;
    --
    commit;
END PC_UPD_COUNTER;
/

Now we will use this PL/SQL procedure to generate some concurrency. Open two SQL*Plus sessions and run the following code in each one:

BEGIN
    for i in 1..100000
    LOOP
        PC_UPD_COUNTER;
    END LOOP;
END;
/

Take care to collect an AWR snapshot before and after running this code. Then generate an AWR report between these two snapshots.

This is the AWR output for the TOP10 foreground events:

Top10 foreground events - Reservable column in Batch mode

In this case, we observe no real mitigation on the “enq: TX – row lock contention” wait event. We observe 4.9 sec waiting on “enq: TX – row lock contention” event, versus 1.6 sec during the previous test, being the lock held on average 248 microseconds versus 70 microseconds during the previous test. And the CPU time was significantly higher during this test: 116 sec versus 21 sec during the standard locking test.

Conclusions

During the “online application tests” (test 1 and 2), I can observe an awesome difference on the “enq: TX – row lock contention” wait event: from more than 99% of the DB time using standard row locking, it disappears from the Top 10 foreground wait events when using reservable columns!

In this case, the new lock-free reservation mechanism has killed the potential scalability problem. So we can say that online applications updating intensively and with huge concurrency low cardinality tables are very likely to take advantage of this new feature.

On the other hand please note lock-free reservation is designed for OLTP concurrency such as online shopping cart applications and not primarily for row-by-row batch processing as shown in the last test.

Lock-free reservation: scale your apps

Test 1: Standard locking – OLTP mode

Test 2: Using lock-free reservation columns – OLTP mode

Test 3: Standard locking – batch mode

Test 4: Using lock-free reservation columns – batch mode

Conclusions

Stephane Duprat

Technical License Eng Specialist

Oracle Database - fit for Kubernetes

New full-text search in 23ai: Ubiquitous Database Search

Lock-free reservation: scale your apps

Test 1: Standard locking – OLTP mode

Test 2: Using lock-free reservation columns – OLTP mode

Test 3: Standard locking – batch mode

Test 4: Using lock-free reservation columns – batch mode

Conclusions

Authors

Stephane Duprat

Technical License Eng Specialist

Oracle Database - fit for Kubernetes

New full-text search in 23ai: Ubiquitous Database Search