Subscribe

Share

Database, SQL and PL/SQL

Long and Overflowing

LISTAGG in Oracle Database 12c Release 2 solves the problem of excessively long lists.

By Connor McDonald

January/February 2017

In the next few columns, I’ll spend some time looking at new features in Oracle Database 12c Release 2. These features come from the “12 Things About Oracle Database 12c” presentation series that Chris Saxon and I, the Ask Tom team, gave at Oracle OpenWorld 2016 in San Francisco. (You can find the slides for these presentations on asktom.oracle.com, under the Resources tab). In this article, I’ll take a look at improvements to the LISTAGG function in Oracle Database 12c Release 2.

Background

The relational model was first described by the pioneer of relational theory, Edgar F. Codd. In 1970 he published his paper “A Relational Model of Data for Large Shared Data Banks,” in what would become the catalyst for the creation of commercial relational database management systems (RDBMSs), including Oracle Database. The core principle of the relational model is that “all data is represented as mathematical n-ary relations, an n-ary relation being a subset of the Cartesian product of n domains.” Thus the origins of relational databases are based on strict mathematical theory, such databases being a mechanism to store data based on the relational model.

Of course, mention Cartesian products or n-ary relations in casual conversation, and quickly you’ll find yourself standing alone. Most users of databases have a much simpler definition of database: “It’s a spreadsheet, only bigger” with third normal form being “something the data architect did to make our lives harder.” Hence, as long as databases have existed, there has been a requirement to transform the data from how it is stored in database tables to a format that users feel is more natural to be displayed by applications on their devices. One of the common transformations is to transpose and consolidate rows into a list. For example, given a set of employees identified by name and associated with a department

SQL> select deptno, ename
  2  from   emp
  3  order by 1,2;

    DEPTNO ENAME
—————————— ——————————
        10 CLARK
        10 KING
        10 MILLER
        20 ADAMS
        20 FORD
        20 JONES
        20 SCOTT
        20 SMITH
        30 ALLEN
        30 BLAKE
        30 JAMES
        30 MARTIN
        30 TURNER
        30 WARD

I may want to produce a concatenated list of employee names where the concatenation is segmented into the respective departments:

    DEPTNO MEMBERS
—————————— —————————————————————————————————————
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

In older versions of Oracle Database, this was quite a difficult problem. There were various techniques that members of the Oracle developer community provided to meet the requirement. For example, in Oracle9i Database, user-defined aggregate functions could be used to achieve the desired result, as shown in Listing 1.

Code Listing 1: User-defined aggregate functions concatenate and segment results

SQL> create or replace type string_agg_type as object
  2  (
  3     total varchar2(4000),
  4  
  5     static function
  6          ODCIAggregateInitialize(sctx IN OUT string_agg_type )
  7          return number,
  8  
  9     member function
 10          ODCIAggregateIterate(self IN OUT string_agg_type ,
 11                               value IN varchar2 )
 12          return number,
 13  
 14     member function
 15          ODCIAggregateTerminate(self IN string_agg_type,
 16                                 returnValue OUT  varchar2,
 17                                 flags IN number)
 18          return number,
 19  
 20     member function
 21          ODCIAggregateMerge(self IN OUT string_agg_type,
 22                             ctx2 IN string_agg_type)
 23          return number
 24  );

SQL> create or replace type body string_agg_type
  2  is
  3
  4    static function ODCIAggregateInitialize(sctx IN OUT string_agg_type)
  5    return number
  6    is
  7    begin
  8      sctx := string_agg_type( null );
  9      return ODCIConst.Success;
 10    end;
 11
 12    member function ODCIAggregateIterate(self IN OUT string_agg_type,
 13    value IN varchar2 )
 14    return number
 15    is
 16    begin
 17      self.total := self.total || ',' || value;
 18      return ODCIConst.Success;
 19    end;
 20
 21    member function ODCIAggregateTerminate(self IN string_agg_type,
 22    returnValue OUT varchar2,
 23    flags IN number)
 24    return number
 25    is
 26    begin
 27      returnValue := ltrim(self.total,',');
 28      return ODCIConst.Success;
 29    end;
 30
 31    member function ODCIAggregateMerge(self IN OUT string_agg_type,
 32    ctx2 IN string_agg_type)
 33    return number
 34    is
 35    begin
 36      self.total := self.total || ctx2.total;
 37      return ODCIConst.Success;
 38    end;
 39
 40  end;

With Oracle Database 10g, the MODEL clause or the SYS_CONNECT_BY_PATH hierarchical query function could be used to achieve the same result with less coding, but comprehending the mechanism of precisely how the code worked in each case still required careful thought, as shown in Listing 2.

Code Listing 2: MODEL clause and SYS_CONNECT_BY_PATH concatenate and segment results

SQL> select deptno , rtrim(ename,',') enames
  2  from ( select deptno,ename,rn
  3         from emp
  4         model
  5         partition by (deptno)
  6         dimension by (
  7            row_number() over
  8             (partition by deptno order by ename) rn
  9               )
 10         measures (cast(ename as varchar2(40)) ename)
 11         rules
 12         ( ename[any]
 13             order by rn desc = ename[cv()]||','||ename[cv()+1])
 14         )
 15   where rn = 1
 16   order by deptno;

SQL> select deptno,
  2         substr(max(sys_connect_by_path(ename, ',')), 2) members
  3  from (select deptno, ename,
  4               row_number ()
  5                 over (partition by deptno order by empno) rn
  6        from emp)
  7  start with rn = 1
  8  connect by prior rn = rn - 1
  9  and prior deptno = deptno
 10  group by deptno;

And with Oracle Database 11g, database rows could be treated as an XML structure, thereby enabling an XML stylesheet to transpose the rows into the desired result, as shown in Listing 3.

Code Listing 3: XML stylesheet transposes rows into desired result

SQL> select deptno,
  2     xmltransform
  3     ( sys_xmlagg
  4        ( sys_xmlgen(ename)
  5        ),
  6       xmltype
  7       (
  8         '<?xml version="1.0"?><xsl:stylesheet version="1.0"
  9            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 10            <xsl:template match="/">
 11              <xsl:for-each select="/ROWSET/ENAME">
 12                <xsl:value-of select="text()"/>;</xsl:for-each>
 13            </xsl:template>
 14          </xsl:stylesheet>'
 15       )
 16    ).getstringval() members
 17  from emp
 18  group by deptno;

All of these solutions are a credit to the ingenuity of their original coders, but in Oracle Database 11g Release 2, the task became simple, with the arrival of the LISTAGG function. With LISTAGG the concatenated employee list is achieved in just a few lines of easily understood SQL:

SQL> select deptno,
  2         listagg( ename, ',')
  3              within group (order by empno) members
  4  from   emp
  5  group  by deptno;

    DEPTNO MEMBERS
—————————— —————————————————————————————————————————
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

I regularly conduct presentations on the topic of SQL analytic functions, the umbrella under which the LISTAGG function resides, and there is always a collective sigh of relief when LISTAGG is demonstrated and compared to the coding solutions that were required historically.

Long Lists

Perusing Oracle Database Reference in the Oracle documentation library reveals, as expected, that there is no limit on the number of rows in a database table.

However, there is a limit on the length of a VARCHAR2 column. And LISTAGG, by its very definition, can generate VARCHAR2 results of arbitrary length, dependent on the data contained within the source table. Having a few dozen more employees for department 10 in the EMPLOYEE table does not cause any problems:

SQL> select deptno,
  2         listagg( ename, ',')
  3              within group (order by empno) members
  4  from   emp
  5  group  by deptno;

  DEPTNO MEMBERS
———————— ————————————————————————————————————————————————————————————
      10 BROOKS,DOMINGUEZ,SIMON,SOLOMON,HAYS,MACIAS,HUDSON,LYNCH,CONN
         ER,CONRAD,IBARRA,MITCHELL,COMBS,WILCOX,SANTIAGO,FLOWERS,SPEN
         CER,WIGGINS,CASTRO,BENNETT,MCCARTHY,BROCK,CALHOUN,LAM,JACOBS
         ,GRIFFITH,WERNER,DUNCAN,GILMORE,MORALES,WEEKS,SIMS,OWEN,MCNE
         IL,CUMMINGS,ROY,HAYES,LUNA,MORA,ROBERSON,BURNETT,SNYDER,MAYS
         ,SHAFFER,MCDANIEL,BELTRAN,POTTER,MYERS,MONTES,SCHMIDT,CLARK,
         KING,MILLER
      20 SMITH,JONES,SCOTT,ADAMS,FORD
      30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

But if I continue to extend the example by adding hundreds more employees to department 10, LISTAGG will reach the VARCHAR2 limit and the query will no longer execute successfully.

SQL> select deptno,
  2           listagg( ename, ',')
  3               within group (order by empno) members
  4  from   t	
  5  group  by deptno;
select deptno,
*
ERROR at line 1:
ORA-01489: result of string concatenation is too long

Handling this error is a “chicken and egg” problem, because I won’t know if the length of the result of the LISTAGG expression exceeds the limits for a VARCHAR2 until I actually exceed it. There is no way to intercept the LISTAGG processing at the penultimate point where the logical “next” element would exceed the VARCHAR2 limit.

LISTAGG with Oracle Database 12c Release 2

A typical behavior on a website that displays lengthy text is that if that text exceeds what the site deems “reasonable,” the text will be truncated and suffixed in some fashion to indicate that the entirety of the text was not displayed. With Oracle Database 12c Release 2, this functionality has been added to the LISTAGG function.

I’ll return to the previous example, but this time I’ll include the new ON OVERFLOW TRUNCATE clause in the LISTAGG specification:

SQL> select deptno,
  2         listagg( ename, ',' ON OVERFLOW TRUNCATE)
  3              within group (order by empno) members
  4  from   emp
  5  group  by deptno;

  DEPTNO MEMBERS
———————— ————————————————————————————————————————————————————————————
      10 BROOKS,DOMINGUEZ,SIMON,SOLOMON,HAYS,MACIAS,HUDSON,LYNCH,CONN
         ER,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,C[lines omitted]
         TER,POTTER,POTTER,POTTER,POTTER,POTTER,POTTER,MYERS,MYERS,MY
         ERS,MYERS,MYERS,MYERS,MYERS,MYERS,MONTES,MONTES,MONTES,MONTE
         S,MONTES,...(1806)
      20 SMITH,JONES,SCOTT,ADAMS,FORD
      30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

Instead of returning an error message, the LISTAGG function now truncates the full string but appends the familiar ellipsis (...) suffix by default. Also provided in this result is “1806,” the number of characters truncated to enable LISTAGG to return a valid-length VARCHAR2 string.

I have control over these parameters, using variations on the OVERFLOW clause. For example, I can specify what suffix I prefer if ellipses are not appropriate. In the example below, I’ve used the string “[snip]” to indicate that some of the output was “snipped.”

SQL> select deptno,
  2         listagg( ename, ',' ON OVERFLOW TRUNCATE '[snip]')
  3              within group (order by empno) members
  4  from   emp
  5  group  by deptno;

  DEPTNO MEMBERS
———————— ————————————————————————————————————————————————————————————
      10 BROOKS,DOMINGUEZ,SIMON,SOLOMON,HAYS,MACIAS,HUDSON,LYNCH,CONN
         ER,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,C[lines omitted]
         TER,POTTER,POTTER,POTTER,POTTER,POTTER,POTTER,MYERS,MYERS,MY
         ERS,MYERS,MYERS,MYERS,MYERS,MYERS,MONTES,MONTES,MONTES,MONTE
         S,MONTES,[snip](1807)
      20 SMITH,JONES,SCOTT,ADAMS,FORD
      30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

Similarly, if I do not want to display the count of characters that have exceeded the allowable length of the VARCHAR2 result, I can specify the additional WITHOUT COUNT clause.

SQL> select deptno,
  2         listagg( ename, ',' ON OVERFLOW TRUNCATE '[snip]'  WITHOUT COUNT)
  3              within group (order by empno) members
  4  from   emp
  5  group  by deptno;

  DEPTNO MEMBERS
———————— ————————————————————————————————————————————————————————————
      10 BROOKS,DOMINGUEZ,SIMON,SOLOMON,HAYS,MACIAS,HUDSON,LYNCH,CONN
         ER,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,BROOKS,C[lines omitted]
         TER,POTTER,POTTER,POTTER,POTTER,POTTER,POTTER,MYERS,MYERS,MY
         ERS,MYERS,MYERS,MYERS,MYERS,MYERS,MONTES,MONTES,MONTES,MONTE
         S,MONTES,[snip]
      20 SMITH,JONES,SCOTT,ADAMS,FORD
      30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

Without any of these additional clauses, LISTAGG works as it did in Oracle Database 12c Release 1 and earlier, although this operation is, in fact, an application of the new—and default—ON OVERFLOW ERROR clause. If I want LISTAGG to truncate the output with no indication that truncation has actually occurred—for example, if I simply want to avoid an error—I can specify the ON OVERFLOW TRUNCATE clause with a null string as the suffix.

Summary

The LISTAGG function was a long-awaited addition to the suite of analytic functions available in Oracle Database. With Oracle Database 12c Release 2, extensions to the function make it an even more useful tool for database developers.

Next Steps

LEARN more about Oracle Database 12c.

DOWNLOAD Oracle Database 12c.

 

Photography by Ricardo Gomez, Unsplash