X

A blog about JSON in the Oracle database Technology

Autonomous JSON Database under the covers: OSON format

Zhen Hua Liu
Architect

One of the secrets behind Autonomous JSON Database is a new optimized native binary storage format for JSON, called OSON. All JSON documents in Autonomous JSON Database are automatically stored in OSON format. This delivers big performance benefits to your JSON applications, including faster query performance, more efficient updates, and reduced storage sizes. This native JSON format is completely transparent to your applications; your application always uses standard JSON documents (e.g. text strings), but all database operations on JSON documents are optimized under the covers via this binary format. 


OSON is based on a tree encoding. Following Figure shows OSON encoding of a simple JSON document as a serialized OSON byte array having tree pointers represented as jump navigation offsets.
 

 

Such encoding gives the following benefits when processing SQL/JSON path query and update.

  • Fast query performance: OSON supports ‘tree-based jump navigation’. If you want to retrieve only the field “name” of the second element of the array entry “items”, that is, evaluating the dot expression ‘items[1].name’,  then Autonomous JSON Database will ‘jump’ directly to the value “PC” within the OSON buffer, instead of linearly parsing and scanning the text of the JSON document. For a JSON document having few fields, the performance gains may be modest -- but for a typical JSON document having dozens or hundreds of fields and array elements nested within each other forming complex tree structures, then ‘tree-based jump navigation’ will enable Autonomous JSON Database to efficiently skip past the irrelevant portions of the document to retrieve only the requested fields.  OSON format often delivers order of magnitude query performance gains (as compared to JSON stored in its raw text format). 
  • Efficient updates:  In many applications, you may want to update only a subset of the fields of a JSON document (known as a ‘partial update’). For example, you may want to update the “id” field of document from “CDEG4” to “CDEG52” in above Figure. Autonomous JSON Database performs partial updates of OSON whenever possible, by changing only a subset of nodes in the JSON tree structure – even if the new data is larger than the original field(s). Even for complex update operations, full document replacement is typically not necessary. Partial updates of OSON always improve the efficiency of modifying JSON documents, especially for large documents which would be expensive if the entire document was replaced. Partial update often significantly reduces database redo/replication log size and improves document update performance for medium to large sized documents. 
  • Reduced storage: Medium to large sized JSON documents often contain nested object array structures and/or recursive structures. In those documents, the field names may be repeatedly stored many times in JSON text. For example, In above Figure, the field “name” is repeated in each element of the array. OSON will only store each distinct field name once using a dictionary encoding to shrink larger JSON documents.

 

There are even more benefits to OSON, which benefit other aspects of your JSON applications:

  • Leaf scalar Data in OSON uses native Oracle scalar data types (number, date, timestamp, binary float/double, year-month, day-time intervals, binary) and encoding formats.  Consequently, there is no data-conversion required when extracting JSON leaf data as SQL scalar value in Autonomous JSON Database and its client drivers.

  • OSON processing is fully integrated into Oracle’s client drivers. JSON encoding into OSON and decoding from OSON can occur at the client side whenever appropriate and possible. This saves server CPU to encode JSON into OSON for ingest or to decode OSON into JSON for retrieval.Furthermore, Oracle client drivers are capable of directly leverage optimizations like ‘jump navigation’ over OSON format. SODA client driver for Autonomous JSON database uses OSON client side encoder and decoder autonomously.

  • OSON is fully and deeply integrated into all aspects of Oracle Autonomous Database: JSON functional index, JSON search index,  JSON_TABLE() materialized views, JSON Data-Guide, parallel query processing, In-Memory columnar store, ExaData smart scan, have all been enhanced to fully take advantage of the OSON format to speed up SQL/JSON processing.

    Finally, while OSON is available today in Autonomous JSON Database, Autonomous Transaction processing, Autonomous Data Warehouse, it will also soon be available as JSON datatype in all versions of Oracle Database. The preview version of Oracle Database 20c JSON datatype is available today.

 

You can learn more in our VLDB 2020 conference paper; the presentation's recording can be watched here



 

 

 

 

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.