Autonomous JSON Database under the covers: OSON format

September 12, 2020 | 4 minute read
Zhen Hua Liu
Text Size 100%:

One of the secrets behind Autonomous JSON Database is a new optimized native binary storage format for JSON, called OSON. All JSON documents in Autonomous JSON Database are automatically stored in OSON format. This delivers big performance benefits to your JSON applications, including faster query performance, more efficient updates, and reduced storage sizes. This native JSON format is completely transparent to your applications; your application always uses standard JSON documents (e.g. text strings), but all database operations on JSON documents are optimized under the covers via this binary format. 

OSON is based on a tree encoding. Following Figure shows OSON encoding of a simple JSON document as a serialized OSON byte array having tree pointers represented as jump navigation offsets.


Such encoding gives the following benefits when processing SQL/JSON path query and update.

  • Fast query performance: OSON supports ‘tree-based jump navigation’. If you want to retrieve only the field “name” of the second element of the array entry “items”, that is, evaluating the dot expression ‘items[1].name’,  then Autonomous JSON Database will ‘jump’ directly to the value “PC” within the OSON buffer, instead of linearly parsing and scanning the text of the JSON document. For a JSON document having few fields, the performance gains may be modest -- but for a typical JSON document having dozens or hundreds of fields and array elements nested within each other forming complex tree structures, then ‘tree-based jump navigation’ will enable Autonomous JSON Database to efficiently skip past the irrelevant portions of the document to retrieve only the requested fields.  OSON format often delivers order of magnitude query performance gains (as compared to JSON stored in its raw text format). 
  • Efficient updates:  In many applications, you may want to update only a subset of the fields of a JSON document (known as a ‘partial update’). For example, you may want to update the “id” field of document from “CDEG4” to “CDEG52” in above Figure. Autonomous JSON Database performs partial updates of OSON whenever possible, by changing only a subset of nodes in the JSON tree structure – even if the new data is larger than the original field(s). Even for complex update operations, full document replacement is typically not necessary. Partial updates of OSON always improve the efficiency of modifying JSON documents, especially for large documents which would be expensive if the entire document was replaced. Partial update often significantly reduces database redo/replication log size and improves document update performance for medium to large sized documents. 
  • Reduced storage: Medium to large sized JSON documents often contain nested object array structures and/or recursive structures. In those documents, the field names may be repeatedly stored many times in JSON text. For example, In above Figure, the field “name” is repeated in each element of the array. OSON will only store each distinct field name once using a dictionary encoding to shrink larger JSON documents.


There are even more benefits to OSON, which benefit other aspects of your JSON applications:

  • Leaf scalar Data in OSON uses native Oracle scalar data types (number, date, timestamp, binary float/double, year-month, day-time intervals, binary) and encoding formats.  Consequently, there is no data-conversion required when extracting JSON leaf data as SQL scalar value in Autonomous JSON Database and its client drivers.

  • OSON processing is fully integrated into Oracle’s client drivers. JSON encoding into OSON and decoding from OSON can occur at the client side whenever appropriate and possible. This saves server CPU to encode JSON into OSON for ingest or to decode OSON into JSON for retrieval.Furthermore, Oracle client drivers are capable of directly leverage optimizations like ‘jump navigation’ over OSON format. SODA client driver for Autonomous JSON database uses OSON client side encoder and decoder autonomously.

  • OSON is fully and deeply integrated into all aspects of Oracle Autonomous Database: JSON functional index, JSON search index,  JSON_TABLE() materialized views, JSON Data-Guide, parallel query processing, In-Memory columnar store, ExaData smart scan, have all been enhanced to fully take advantage of the OSON format to speed up SQL/JSON processing.

    Finally, while OSON is available today in Autonomous JSON Database, Autonomous Transaction processing, Autonomous Data Warehouse, it will also soon be available as JSON datatype in all versions of Oracle Database. The preview version of Oracle Database 20c JSON datatype is available today.  Oracle 21c Database JSON datatype information is available here .


You can learn more in our VLDB 2020 conference paper; the presentation's recording can be watched here





Zhen Hua Liu


The leading architect for semi-structured data management and multi-model data management for Oracle Converged Database Server and Autonomous Database Cloud Servcie. The originator of SQL/JSON standard for DBMS industry. Principal architect for Oracle database server support of SQL/JSON, SQL/XML. Original designer and developer of Oracle optimized binary JSON format, aka, OSON, SQL/JSON path OSON processor engine, JSON, full text, spatial in memory database processing, JSON  sharding support, , JSON/XML search index, JSON DOM API design. One of the Oracle standard representatives to standardize SQL/JSON, SQL/XML, W3C XQuery, SQL/PGQ (Property Graph Query Language). Prior to Oracle, he had done research and development of Object Relational DBMS with integration of Postgres/Illustra into Informix  RDBMS and had done  development of ANSI SQL 89/92 standard compliance feature for Sybase RDBMS.

Previous Post

Introducing Oracle Autonomous JSON Database for application developers

Beda Hammerschmidt | 10 min read

Next Post

Make Someone Else do the Work - Managing Oracle Database 19c Users in Active Directory (part 1 - Kerberos)

Russ Lowenthal | 6 min read