X

A blog about JSON in the Oracle database Technology

  • May 7, 2015

Storing JSON in the Oracle database

Beda Hammerschmidt
Consulting (Coding) Member of Technical Staff

In this posting  you learn how to:

Store JSON data in the Oracle
database

Pick a good data type for your JSON
data

Identify rows that are JSON (or not
JSON)

Ensure that only JSON data can be
inserted into a table

Find all columns that store JSON
data

Prerequisites:

  • You need the Oracle Database Release 12.1.0.2 or later
  • Ensure that the COMPATIBLE init.ora parameter is set to 12.1.0.2 or later (you may see an error message otherwise)

To learn what JSON is please visit [http://json.org/ or http://en.wikipedia.org/wiki/JSON]

We have chosen to store JSON data in existing data types instead of adding a new type. This decision allows you to choose the best data type for you use case and use existing clients. Also you get cross functional completeness right away (import export, partitioning, replication, etc). JSON can be stored as VARCHAR2,
CLOB or BLOB. 

The supported data types have different properties: Varchar2
has a maximum length of 32k which is probably sufficient for many use
cases. Clob and Blob have no length
limitation. Internally, Clob encodes characters as UCS2 (similar to UTF16)
which means every character takes up two bytes. Blob does not perform such
re-encoding but instead stores the Unicode bytes unmodified.

For this posting we use Varchar2 as data type. In our example we store a list of color names and their codes.

CREATE TABLE colorTab
(
id NUMBER,
color VARCHAR2(32767));

(In order to allow the 32k length for VARCHAR2 the init.ora
parameter MAX_STRING_SIZE has to be set to EXTENDED. See [https://docs.oracle.com/database/121/REFRN/refrn10321.htm#REFRN10321]). Or replace 32767 with 4000.

Let's insert a few rows:

INSERT INTO colorTab
VALUES(1, '


{

"color": "black",
"rgb": [0,0,0],
"hex": "#000000"

}'
);

INSERT INTO colorTab
VALUES(2, '


{

"color": "orange red",
"rgb": [
255,69,0],
"hex": "
#FF4500"

}
');

INSERT INTO colorTab
VALUES(3, '


{

color: "gold",
"rgb": [
255,215,0],
"hex": "
#FFD700 "

}
');

INSERT INTO colorTab
VALUES(4, 
'I am not valid JSON');

COMMIT;

As one can see the last insert statement inserts data that
is not JSON.

How can we identify such a rows? And how can we prevent invalid JSON from being inserted
in the first place?

This brings us to the first new JSON operator: ‘IS JSON’. It
allows us to filter column values.

SELECT id, color

FROM colorTab

WHERE color IS JSON;

(Instead
of
IS
JSON
one can use IS NOT JSON to find all rows which are not JSON.)

If you
look carefully at the third row (gold) you’ll see that the field name ‘color’
is not inside quotes. Technically, this makes the data invalid JSON because key names need to be enquoted but not everyone follows that rule. The IS JSON operator still accepts this row as JSON because our JSON parser by default runs in a tolerant mode. But not every JSON tool accepts unquoted names (try out 
http://jsonlint.com/).

So for JSON you're sending to others you may want to be extra polite and make sure it complies to all rules. You can do this by running IS JSON in a strict mode:

SELECT id, color

FROM colorTab

WHERE color IS JSON STRICT;

The JSON
specification (unfortunately) says nothing about duplicate key names within the
same object. For instance this is technically valid JSON altough it has two keys with the name name.

{
"color": "
black",
"rgb": [
0,0,0],
"hex": "
#000000",
"color": "
white"

}

Again, try this out in  http://jsonlint.com/,

In our
experience duplicate key names 
occur vary rarely but if they do they are big troublemakers. So how can we avoid them? The clause ‘WITH UNIQUE KEYS’ checks that keys are not
repeated within the same object level (keys with the same name can occur at
multiple places in the data but not in the same object).

SELECT id, color

FROM colorTab

WHERE color IS JSON STRICT WITH UNIQUE KEYS;

(Checking
for unique keys comes at a price. It slows down IS JSON because names
have to me memorized and checked for duplicates. So use it carefully!)

So now
that we understand IS JSON how can we use it to prevent non-JSON data to be
inserted into our table in the first place?
The answer is a check constraint. Lets
add a constraint to our table:

TRUNCATE TABLE
colorTab;

ALTER TABLE colorTab


ADD CONSTRAINT ensure_json CHECK (color IS JSON STRICT);

Now,
when inserting any non-JSON row will be rejected with
ORA
-02290: check constraint (SYS.ENSURE_JSON) violated

Also,
since we have added a IS JSON check constraint the table/column will show up in
a new dictionary view:

SQL> SELECT *
FROM USER_JSON_COLUMNS;

TABLE_NAME COLUMN_NAME FORMAT DATA_TYPE

----------
---------- ---------- ----------

COLORTAB COLOR TEXT VARCHAR2

The view's are named USER_JSON_COLUMNS, DBA_JSON_COLUMNS and ALL_JSON_COLUMNS. The rely on the presence of a IS JSON check constraint. 

We invite you to give us feedback; for instance as a comment under each posting or in our LinkedIn group (https://www.linkedin.com/grp/home?gid=6966766). You can also follow the Twitter hashtag #OracleJSON and #OracleDocDB.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.