DBA

Are We All on the Same Page?

Pagination of data can make (or break) your database.

By Connor McDonald

October 18, 2018

For as long as user-facing computer applications have existed, there has been the concept of the page, where a page on the screen is a metaphor for a page in a book. The page presents a subset of the total information available, and once that information has been digested by the user, the page is “turned” and the next subset of information is presented.

I started my computer programming career observing pages of data on a green-screen mainframe terminal, where the size of the page was fixed at a constant 24 rows, each containing up to 80 characters of data. The advent of graphical user interfaces evolved the concept of a page into a more dynamic set of information, with the user controlling the amount of content displayed on the page, and the current trend seems to be toward the “never-ending” page, where content is retrieved and rendered in real time as the user consumes it on screen. For example, most blog platforms dynamically display more and more blog posts as the user scrolls downward. As such, there is no longer the concept of a “next” page—the current page just continues to grow. Despite these advances, the underlying challenge for the database developer remains the same: to source from the database an ordered subset of an entire dataset, namely a subset defined by an ordering on one or more of the attributes in that data.

First Principles

Because this ordered-subset requirement has existed for so long, the SQL technique for accessing a page of data in an ordered fashion is well known: sort the data within an inline view and only then retrieve the number of rows required to render the page. This is typically known as a Top-N query, because it retrieves the top “n” rows from a dataset.

For example, the SQL in Listing 1, which does not use an inline view, aims to show the most recently submitted questions from the Ask TOM database, but it clearly returns incorrect data, since we’ve definitely taken questions more recently than 18 years ago! The problem is that ROWNUM is assigned to rows before the ORDER BY is applied, so the query grabs the first five rows it happens to encounter and only then sorts them.

Listing 1: Incorrect Top-N query

SQL> select question_id, created
  2  from   asktom.ate_submitted_questions
  3  where  rownum <= 5
  4  order by created desc;

         QUESTION_ID CREATED
———————————————————— ———————————
        348459360553 10-MAR-2001
        348431169703 10-MAR-2001
          3612348048 02-MAY-2000
           212348047 02-MAY-2000
           612348047 02-MAY-2000

Transforming this into the correctly functioning SQL statement in Listing 2 is easy, and reading the text of the query in top-down fashion maps to the logical operation to be performed: first sort the rows and then extract five rows from that sorted dataset.

Listing 2: The familiar inline view approach

SQL> select *
  2  from (
  3    select question_id, created
  4    from   asktom.ate_submitted_questions
  5    order by created desc
  6  )
  7  where  rownum <= 5;

         QUESTION_ID CREATED
———————————————————— ———————————
 9533142000346451403 19-AUG-2018
 9537700100346275665 18-AUG-2018
 9537699900346476947 18-AUG-2018
 9537697900346933719 18-AUG-2018
 9537696800346355200 18-AUG-2018
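As an aside, on Oracle Database 12c and later the same Top-N intent can also be expressed with the ANSI row-limiting clause, avoiding the inline view entirely; the ROWNUM form shown above works on all versions. The sketch below is only an illustration of that syntax: it assembles the statement text as a string, reusing the Ask TOM table from Listing 2, rather than executing anything against a database.

```java
// Sketch: the ANSI row-limiting clause (Oracle Database 12c and later)
// expresses the Top-N intent of Listing 2 without an inline view.
// This helper only assembles the statement text for illustration.
public class RowLimiting {
    static String topN(int n) {
        return "select question_id, created"
             + " from asktom.ate_submitted_questions"
             + " order by created desc"
             + " fetch first " + n + " rows only";
    }

    public static void main(String[] args) {
        System.out.println(topN(5));
    }
}
```

Everything that follows about memory utilization applies equally to both syntaxes, because both tell the database up front how many rows are required.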

As I mentioned, this technique is generally well known and widely adopted in the developer community, so why am I covering old ground here? Because I am seeing questions on Ask TOM and other Q&A sites indicating that an alternative mechanism is becoming popular. I think this is a reflection of the challenges modern developers face: they are now called upon to know many more components of the entire technology stack. While developers are growing their skills in HTML5, CSS, JavaScript, back-end languages, and the infrastructure that supports them, they often avoid direct interaction with SQL, instead relying on an object-relational mapping (ORM) library, such as Hibernate, to handle that task for them. Pagination of data is therefore now often performed not with the SQL techniques described above but in the application tier.

The recent advent of the never-ending-page metaphor could also inspire a developer to adopt a similar mindset when retrieving data from a database. A developer could issue a single query to obtain all the data in the desired order, and the application tier could fetch only the rows required to be rendered, yielding a code structure along the lines of Listing 3.

Listing 3: Application tier processing of Top-N (pseudocode)

public static void ...
{
  PreparedStatement sql_stmt = 
    conn.prepareStatement(
    "select  question_id, created             //
     from    asktom.ate_submitted_questions   // Query to get all rows
     order by created desc");                 //

  ResultSet rset = sql_stmt.executeQuery();
  int i = 0;
  while( rset.next() )
  {                                //
     ...                           //  Fetch only the number of
     i = i + 1;                    //  rows required, and then exit
     if (i > 10) {                 //  the fetch loop
        break;                     //
      }   
  }
  rset.close();
}

This approach is often attractive to developers unfamiliar with SQL, because the query text is a very simple “retrieve data and sort it”—which could be generated automatically by an ORM—and developers do not need to trouble themselves with the idiosyncrasies of inline views and the like. Although this approach might be fine for small, fixed-size datasets, the risks grow as the data volumes increase.

Scalability Threats

To demonstrate these risks, I’ll now explore the “retrieve data and sort it” approach on a larger dataset. I have an existing table, SALES_TRANSACTIONS, that is a simulation of a fictitious high-volume retailer that is recording sales of its products. The table has approximately 10 million rows and occupies 1.5 GB on disk, as shown in Listing 4. Obviously, real-world applications could have even larger volumes of data, but this will suffice for the demonstration.

Listing 4: Disk size of total sales transactions

SQL> select num_rows, blocks*8192/1024/1024 mb
  2  from   user_tables
  3  where  table_name = 'SALES_TRANSACTIONS';

  NUM_ROWS         MB
——————————  —————————
  10308870   1522.625

To track the work being performed by the database to execute Top-N queries, I am going to focus on the dynamic performance view V$SQL_WORKAREA_HISTOGRAM. It displays cumulative work area execution statistics, where work areas include the memory required for the sorting operations that Top-N queries must perform to return the correct page of results. The columns for this view are shown in Listing 5.

Listing 5: The V$SQL_WORKAREA_HISTOGRAM columns

SQL> desc V$SQL_WORKAREA_HISTOGRAM
 Name                                 Null?    Type
 ———————————————————————————————————— ———————— ——————
 LOW_OPTIMAL_SIZE                              NUMBER
 HIGH_OPTIMAL_SIZE                             NUMBER
 OPTIMAL_EXECUTIONS                            NUMBER
 ONEPASS_EXECUTIONS                            NUMBER
 MULTIPASSES_EXECUTIONS                        NUMBER
 TOTAL_EXECUTIONS                              NUMBER
 CON_ID                                        NUMBER

The V$SQL_WORKAREA_HISTOGRAM view contains one row per band of memory size used by database sessions for operations such as sorting, yielding a frequency histogram of work area memory utilization for the database instance. Here are two sample rows from this view:

LOW_OPTIMAL_SIZE HIGH_OPTIMAL_SIZE OPTIMAL_EXECUTIONS ONEPASS_EXECUTIONS
———————————————— ————————————————— —————————————————— ——————————————————
         1048576           2097151                  3                  0
        16777216          33554431                  0                  2

The data in row 1 shows that since database instance startup, there have been three operations between 1 MB and 2 MB in size (LOW_OPTIMAL_SIZE and HIGH_OPTIMAL_SIZE). These operations were deemed to be optimal executions (OPTIMAL_EXECUTIONS), where optimal means that the operation could be completed in memory without needing to spill to temporary storage.

The data in row 2 shows that since instance startup there have been two operations between 16 MB and 32 MB in size that were one-pass executions (ONEPASS_EXECUTIONS), where one-pass means that the operation could be completed only by spillage of some data to temporary storage in one phase of the processing.
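The one-pass mechanism can be sketched in code. The following is a conceptual illustration only, assuming a deliberately tiny work area: data that does not fit in memory is sorted in runs, spilled to temporary files, and merged back in a single pass. This is the general external-sort shape that the ONEPASS_EXECUTIONS statistic counts, not Oracle's actual implementation.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

// Illustrative sketch of a "one-pass" sort: when the data exceeds the
// work area, sorted runs are spilled to temporary storage and merged
// back in a single pass. Not Oracle's implementation.
public class OnePassSort {
    static List<Integer> sort(List<Integer> data, int workAreaRows) throws IOException {
        // Phase 1: fill the work area, sort each run, spill it to temp storage.
        List<Path> runs = new ArrayList<>();
        for (int i = 0; i < data.size(); i += workAreaRows) {
            List<Integer> run = new ArrayList<>(
                data.subList(i, Math.min(i + workAreaRows, data.size())));
            Collections.sort(run);
            Path f = Files.createTempFile("sortrun", ".txt");
            try (BufferedWriter w = Files.newBufferedWriter(f)) {
                for (int v : run) { w.write(Integer.toString(v)); w.newLine(); }
            }
            runs.add(f);
        }
        // Phase 2: merge all runs in one pass, keeping only one value
        // per run in memory ({value, run index} pairs in a min-heap).
        PriorityQueue<int[]> heads = new PriorityQueue<>(Comparator.comparingInt(a -> a[0]));
        List<BufferedReader> readers = new ArrayList<>();
        for (int r = 0; r < runs.size(); r++) {
            BufferedReader rd = Files.newBufferedReader(runs.get(r));
            readers.add(rd);
            String line = rd.readLine();
            if (line != null) heads.add(new int[]{Integer.parseInt(line), r});
        }
        List<Integer> out = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] h = heads.poll();
            out.add(h[0]);
            String line = readers.get(h[1]).readLine();
            if (line != null) heads.add(new int[]{Integer.parseInt(line), h[1]});
        }
        for (BufferedReader rd : readers) rd.close();
        for (Path f : runs) Files.deleteIfExists(f);
        return out;
    }

    public static void main(String[] args) throws IOException {
        // A "work area" of only 3 rows forces three spilled runs.
        System.out.println(sort(Arrays.asList(9, 4, 7, 1, 8, 2, 6, 3, 5), 3));
    }
}
```

The key point is that the in-memory footprint is bounded by the work area size plus one row per run, however large the input; the cost is the extra I/O of writing and re-reading every row.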

V$SQL_WORKAREA_HISTOGRAM displays data that is cumulative since instance startup for all database sessions, so to track the usage for a single query, I will need to capture the contents of the view immediately before executing a Top-N query and then, immediately after its execution, extract the delta between the two. Another obvious prerequisite is that I must be the only person doing any work on this database, which is why I’m using my fictitious SALES_TRANSACTIONS data rather than the Ask TOM database tables used earlier.

First I capture the current state of the view into a global temporary table called QUERY_BEFORE, as shown in Listing 6.

Listing 6: Snapshot of current histogram data

SQL> create global temporary table QUERY_BEFORE
  2  on commit preserve rows as
  3  select * from V$SQL_WORKAREA_HISTOGRAM;

Table created.

Next I’ll get the most recent 10 rows from my SALES_TRANSACTIONS table, as defined by the SALE_TIMESTAMP column. My table currently has no indexes, so all rows must be scanned to determine the 10 most recent ones. I’ll use the standard inline view mechanism to let the database know that I am interested in only the first 10 rows, as shown in Listing 7.

Listing 7: Inline view to fetch the 10 most recent transactions

SQL> set timing on
SQL> set feedback only
SQL> select *
  2  from
  3    ( select *
  4      from   SALES_TRANSACTIONS
  5      order by SALE_TIMESTAMP desc
  6    )
  7  where rownum <= 10
  8  /

10 rows selected.

Elapsed: 00:00:05.01

This query took five seconds. Now, in Listing 8, I’ll perform a simple join from my previously created QUERY_BEFORE table to the current state of the V$SQL_WORKAREA_HISTOGRAM view, comparing before and after values to measure the memory utilization of the Top-N query just executed.

Listing 8: Memory consumption histogram

SQL> select
  2    s.low_optimal_size,
  3    s.high_optimal_size,
  4    s.optimal_executions - b.optimal_executions delta_opt,
  5    s.onepass_executions - b.onepass_executions delta_onepass,
  6    s.multipasses_executions - b.multipasses_executions delta_multi,
  7    s.total_executions - b.total_executions total
  8  from v$sql_workarea_histogram s,
  9       query_before b
 10  where s.low_optimal_size = b.low_optimal_size
 11  and s.total_executions - b.total_executions > 0 ;

LOW_OPTIMAL_SIZE HIGH_OPTIMAL_SIZE DELTA_OPT DELTA_ONEPASS DELTA_MULTI TOTAL
———————————————— ————————————————— ————————— ————————————— ——————————— —————
         1048576           2097151         3             0           0     3

1 row selected.

The memory consumption to retrieve the 10 most recent transactions is three chunks, each between 1 MB and 2 MB in size. Even in the worst case, it takes only 6 MB to perform the operation.

When I perform this demo live at conferences, people are astounded at this result, to the extent of disbelieving the validity of the measurements in V$SQL_WORKAREA_HISTOGRAM. Recall that the SALES_TRANSACTIONS table has 10 million rows consuming 1,500 MB of storage. How is it possible to sort all that data in only 6 MB of memory and yet not spill to temporary storage? It seems impossible.

The true implementation details are buried deep in the Oracle Database kernel, but even conceptually, it is possible to come up with a strategy using only a small amount of memory to complete the operation. Here is an example of such a strategy:

  1. Read the first 10 rows from SALES_TRANSACTIONS, where first is simply the first encountered rows from storage. The rows are not guaranteed to be the 10 most recent rows I am seeking.
  2. Sort these rows, based on SALE_TIMESTAMP, and hold them in a sorted memory structure. This structure is fixed in size to hold only 10 rows.
  3. Now read the 11th row from the SALES_TRANSACTIONS table. There are two possibilities to consider:
    • This 11th row has a SALE_TIMESTAMP older than that of the oldest row in the sorted memory structure, in which case I simply discard it and continue on to the next row.
    • This 11th row has a SALE_TIMESTAMP newer than that of the oldest row in the sorted memory structure. In this case, I insert this row into its appropriate sequence in the sorted memory structure, so the oldest row in the structure is pushed out and discarded.
  4. Repeat step 3 for the remaining rows in the table.

Once all of the rows have been read, the 10 rows remaining in the sorted memory structure will be the 10 most recent rows from the SALES_TRANSACTIONS table. You can see from this style of implementation that there is a small sorting cost in step 2 but the entire set of 10 million rows is never sorted. Those rows are definitely read, because I must visit every single row to find the 10 most recent sales, but the large sorting cost has been avoided. I stress that this is conceptually what I envisage is happening. The true code implementation could be entirely different, but it demonstrates that a low-memory-utilization approach is definitely possible.
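The four conceptual steps above can be sketched directly in code. The following Java illustration (my own sketch, assumed names throughout, and certainly not the database kernel's actual algorithm) keeps the n newest values in a fixed-size min-heap whose root is the oldest candidate retained so far, which is exactly the row to evict when something newer arrives.

```java
import java.util.PriorityQueue;

// Conceptual sketch of the bounded-memory Top-N strategy in steps 1-4
// above. A min-heap of fixed size n holds the candidates; its root is
// the OLDEST value retained so far.
public class TopNSketch {
    static long[] topN(long[] timestamps, int n) {
        PriorityQueue<Long> heap = new PriorityQueue<>(n);
        for (long ts : timestamps) {
            if (heap.size() < n) {
                heap.add(ts);               // steps 1-2: fill the structure
            } else if (ts > heap.peek()) {
                heap.poll();                // step 3: evict the oldest candidate
                heap.add(ts);               //         and insert the newer row
            }                               // otherwise: discard the row
        }
        long[] result = new long[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();        // drain into newest-first order
        }
        return result;
    }

    public static void main(String[] args) {
        long[] top = topN(new long[]{5, 1, 9, 3, 7, 2, 8, 6, 4}, 3);
        for (long t : top) System.out.println(t);  // prints 9, 8, 7
    }
}
```

Memory use is bounded by n entries regardless of how many rows are scanned, which is why the database can find the 10 newest of 10 million rows within a few megabytes—provided it knows n up front.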

Now I will repeat the task, this time taking the approach that the application code bears responsibility for the pagination. In this example, I will present a simple ORDER BY query to the database and then the application tier will fetch from the result set until the number of rows to be rendered is retrieved.

Again I capture the current state of the V$SQL_WORKAREA_HISTOGRAM view into a global temporary table, this time called QUERY_BEFORE2, as shown in Listing 9.

Listing 9: Snapshot of histogram memory data

SQL> create global temporary table QUERY_BEFORE2
  2  on commit preserve rows as
  3  select * from V$SQL_WORKAREA_HISTOGRAM;

Now I’ll get the most recent 10 rows from my SALES_TRANSACTIONS table, as defined by the SALE_TIMESTAMP column, not directly via SQL but by using a simple PL/SQL FOR loop, as shown in Listing 10. I’ll increment a counter on each fetch and exit the loop after 10 fetches. To the database, this approach looks like a query to retrieve all rows.

Listing 10: PL/SQL loop to fetch 10 rows from a sorted result

SQL> set timing on
SQL> declare
  2    cnt int := 0;
  3  begin
  4  for i in
  5    ( select *
  6      from   SALES_TRANSACTIONS
  7      order by SALE_TIMESTAMP desc
  8    )
  9  loop
 10    cnt := cnt + 1;
 11    exit when cnt = 10;
 12  end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.83

Recall that the elapsed time for the first example was five seconds. Before we look at the underlying memory utilization of this example, it is already apparent that this block of code made the database work harder: the execution time nearly doubled. Again, as shown in Listing 11, I’ll perform a simple join from my previously created QUERY_BEFORE2 table to the current state of the V$SQL_WORKAREA_HISTOGRAM view.

Listing 11: Memory utilization for application tier fetch test

SQL> select
  2    s.low_optimal_size,
  3    s.high_optimal_size,
  4    s.optimal_executions - b.optimal_executions delta_opt,
  5    s.onepass_executions - b.onepass_executions delta_onepass,
  6    s.multipasses_executions - b.multipasses_executions delta_multi,
  7    s.total_executions - b.total_executions delta
  8  from v$sql_workarea_histogram s,
  9       query_before2 b
 10  where s.low_optimal_size = b.low_optimal_size
 11  and s.total_executions - b.total_executions > 0 ;

LOW_OPTIMAL_SIZE HIGH_OPTIMAL_SIZE DELTA_OPT DELTA_ONEPASS DELTA_MULTI DELTA
———————————————— ————————————————— ————————— ————————————— ——————————— —————
         1048576           2097151         1             0           0     1
      1073741824        2147483647         0             2           0     2

2 rows selected.

The first row in the output is a (worst-case) 2 MB chunk of memory for an operation that was performed optimally, but the second row is the main culprit here. There were two operations, each sized between 1 GB and 2 GB, which—given that the source data is 1,500 MB—suggests that the entire set of source data was processed in a sorting operation, potentially more than once. Note that these operations fall in the DELTA_ONEPASS column, indicating that my database server could not process them solely in RAM: during the operation, data was forced to spill to temporary storage on disk.

Now consider a production use case. If every connected database session could consume up to 2 GB of memory just to produce simple pagination results, then even the most powerful database server, with terabytes of RAM, will be unable to cope once the connected session count climbs to moderate levels. Performing pagination control in the application tier is a major scalability risk.

You may be thinking, “Why didn’t the database server use the same conceptual approach of not sorting the entire set of data?” That approach can be applied only if the number of rows desired for the first page is known to the database. Without the information that only 10 rows were required, the database could not determine the size of the sorted memory structure and hence fell back to sorting the entire source dataset.

Refactoring Existing Code

Once developers observe the memory and performance benefits of structuring their SQL statements appropriately, refactoring existing code is easy. Even if the application tier was previously responsible for fetching the desired number of rows, that logic can be left in place; the fetch-count check becomes redundant, because the query now returns only the number of rows required, but it does no harm. For example, the previous Java pseudocode needs only its SQL statement replaced to reap the benefits, as shown in Listing 12.

Listing 12: Drop-in replacement SQL for application code

public static void ...
{
  PreparedStatement sql_stmt = 
    conn.prepareStatement(
    // Previous query, which fetched and sorted all rows:
    //   select question_id, created
    //   from   asktom.ate_submitted_questions
    //   order by created desc
    "select * from (
       select  question_id, created            //
       from    asktom.ate_submitted_questions  // Improved query
       order by created desc                   //
     ) where rownum <= 10");                   //

  ResultSet rset = sql_stmt.executeQuery();
  int i = 0;
  while( rset.next() )
  {                                //
     ...                           //  Existing row check code is
     i = i + 1;                    //  now redundant but there is
     if (i > 10) {                 //  no detriment to leaving it
        break;                     //
      }   
  }
  rset.close();
}

Because the efficiency benefits materialize only at production-scale data volumes, it is difficult for developers working in smaller development and testing databases to validate through performance tests alone that their SQL changes will have the desired results. But developers can use the execution plans for their SQL statements to confirm that the memory-efficient operations are in play.

For a conventional sorting operation in a SQL query that sorts all the rows in the source data, the execution plan will show the SORT ORDER BY line, as shown in Listing 13.

Listing 13: Standard sorting operation in execution plan

SQL> select empno, ename, hiredate
  2  from   emp
  3  order by hiredate desc

———————————————————————————————————————————
| Id  | Operation          | Name | Rows  |
———————————————————————————————————————————
|   0 | SELECT STATEMENT   |      |    14 |
|   1 |  SORT ORDER BY     |      |    14 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |
———————————————————————————————————————————

However, when the database can switch to the efficient stop-after-“n”-rows style of sorting operation for pagination queries, you will see the STOPKEY keyword in the plan, as shown in Listing 14.

Listing 14: STOPKEY optimization for Top-N queries

SQL> select *
  2  from (
  3    select empno, ename, hiredate
  4    from   emp
  5    order by hiredate desc
  6  ) 
  7  where rownum <= 5;

————————————————————————————————————————————————
| Id  | Operation               | Name | Rows  |
————————————————————————————————————————————————
|   0 | SELECT STATEMENT        |      |    5  |
|*  1 |  COUNT STOPKEY          |      |       |
|   2 |   VIEW                  |      |   14  |
|*  3 |    SORT ORDER BY STOPKEY|      |   14  |
|   4 |     TABLE ACCESS FULL   | EMP  |   14  |
————————————————————————————————————————————————

For small, fixed-size data sources such as static lists, a refactoring effort may not be required to present ordered subsets of data, but for any data source where the total volume is unknown, using an appropriate Top-N query syntax is vital for application performance and scalability.

Summary

A property of relational databases is that data is never stored or presented in any guaranteed order. Data will almost always need to be sorted to be presented in a meaningful way to users who consume that data. When that sorting also involves providing subsets of data, it is important to convey the size of the subset required to the database engine via the Top-N SQL syntax. Failing to do so will result in poorly performing applications that will not scale under production workloads and data volumes.

Next Steps

LEARN more about Top-N SQL.

DOWNLOAD Oracle Database 18c.

TRY Oracle Database cloud services.

Illustration by Wes Rowell