This blog walks you through modeling a parent-child data structure, identifying the benefits, and exploring different use cases. This blog introduces the table hierarchy concept and benefits, constructs the model with examples, and illustrates how the hierarchy works at the shard level. 

Introduction

Developers of modern applications are constantly challenged by businesses trying to gain market share and delighting their customers with highly responsive and rich user experiencesThe need to serve data at scale with low latency requires database systems that can scale on demand to meet the extreme workloads. Such a challenge gives rise to NoSQL databases’ promise to scale horizontally and distribute the workloads across a pool of commodity computing resources. NoSQL databases with key-value, document, and columnar flavors partition the data and distribute the partitions across nodes of a cluster to scale horizontally.

However, the promise of superior scalability comes with the cost of trading off normalized data relationships compared to RDBMS. In the world of NoSQL databases, storing denormalized data is a given, even advised by many vendors. The fact is that the underlying distributed nature of the NoSQL architecture makes it well suited for the scalability gig. It’s fast and efficient. But, joining data across distributed partitions is prohibitively expensive, and conflicts with the goals of predictable, low latency, and scalability.

A hidden gem in Oracle NoSQL Database breaks this norm, offering high scalability while maintaining the data relationship.

The NoSQL table hierarchy is an ideal data model for applications that need some data normalization, but also require predictable, low latency at scale. The hierarchy links different tables to enable left outer joins, combining rows from two or more tables based on related columns between them. Such joins execute efficiently since rows from the parent-child tables are co-located in the same shards.  Also, writes to multiple tables in a table hierarchy obey transactional ACID properties since the records residing in each table of the hierarchy share the same shard key. All write operations perform as a single atomic unit. So all of the write operations will execute successfully, or none of them will.

Why Oracle NoSQL Database Table Hierarchy

Many NoSQL databases support data types like arrays and maps. When modeling a data relationship, application developers may find it easier to have each parent row store its child rows inside an array or a map in a nested structure. By doing so, not only is the data relationship denormalized but it has the potential for creating large parent rows, especially when the hierarchy is heavily nested, resulting in inefficient storage and poor performance. Oracle NoSQL Database’s table hierarchy is the ideal data model to avoid issues associated with arrays and maps. Utilizing a table hierarchy should be considered when building data relationship in Oracle NoSQL Database.

The Benefits of a Table Hierarchy

Oracle NoSQL Database table hierarchy comes with the following benefits:

  • Highly efficient for storing data in a parent-child hierarchy – Parent and child rows are stored in separate NoSQL tables, reducing the size of parent rows compared with the single parent with child rows in nested arrays or maps. Write operations on parent or child create new versions of smaller rows and store these changes efficiently, given the append-only architecture of Oracle NoSQL Database.
  • Highly performant for read and write workloads – Parent and child rows reside in the same local shard, enabling write and read operations to achieve high performance, since all records in the hierarchy can be read or written in a single network call.
  • Highly flexible for fine-grained authorization – Access rights to a parent or child table can be configured individually based on conditions at run-time, offering granular and flexible authorization.
  • Scalable ACID transactions – Uniquely balances the goals of scalability, low latency, and ACID by co-locating parent and child data on the same shard.
  • Table joins – Data can be queried using the nested table clause or left outer joins.

How Oracle NoSQL Database Models a Table Hierarchy with Parent-Child Tables

A table hierarchy, consisting of a tree of tables connected by parent-child relationships, is formed using the CREATE TABLE statement. This is done by using a composite name (a name_path) for the child table. A composite name consists of a number ( > 1 ) of identifiers separated by dots. The last identifier is the local child table’s local name, and the first N-1 identifiers are the parent’s names.

Take a simple example where parent-child tables consisting of user ids and names with id as the primary key. The following CREATE TABLE statements form a table hierarchy. Table A is the root (parent) of all child tables. Table B is the direct child of A. Table C is the direct child of B, and the grandchild of A. Table G is a direct child of A in a different tree branch. Each CREATE TABLE statement is executed separately to form a table and its relationship to other tables.  For the root table (Table A), the shard key idA is identical to the primary key.   In Oracle NoSQL Database, the default behavior is the shard key maps to the entire primary key. Every table has a shard key that can be either defined explicitly or implicitly, as in this example.

After completing the above CREATE TABLE operations, let’s look at the construct of the parent-child relationship in further details. The CREATE TABLE statements form the following table hierarchy:

Let’s explore the schema of each Table below. First, notice the child records automatically inherit the shard key of their parent. The following text highlighted in green shows that in the parent Table A, the shard Key is “idA” and the primary key is “idA”.

The following text highlighted in green shows that in the child Table A.B, the shard key = “idA” which is inherited from its parent Table A. The primary keys contain “idA” (for the parents shard key) and “idB”.

The same is true for A.B.C. The following text highlighted in green shows that in Table A.B.C, the shard key = “idA” is inherited from its parent Table A. The primary keys contain “idA”, “idB”, “idC”. The same is true for A.G.

Real-World Use Case of Table Hierarchy

Many airlines offer their passengers real-time baggage tracking self-services. Passengers install an application on their mobile phones to monitor their baggage whereabouts during flights. Let’s illustrate how a table hierarchy offers an alternative way for modeling the normalized data relationship, in addition to the popular JSON document-based model supported by Oracle NoSQL Database.

Below is a sample JSON document capturing a passenger profile, baggage, and its location. Ticket number (“ticketNo”), bag ID (“id”), flight number (“flightNo) are key identifiers for tracking a bag. The sample document below tracks only a bag. The document size grows tremendously when tracking just three bags.  

Let’s take the JSON document, break it into different tables, and normalize the data. The left document is converted into four tables on the right. Each table contains different pieces of data in the document. Parent-child keys are “ticketNo”, “contactPhone”, “id”, and “flightNo” with “ticketNo” as the parent key (or shard key).

The parent-child relationship is illustrated below with the “ticket” table as the parent, the “bagInfo” table as its child, the “flightLegs” table as the grandchild in the same hierarchy. The “passengerInfo” table is a child of the “ticket” parent table in a separate hierarchy. The parent-child keys, “ticketNo”, “id”, “flightNo”, “contactPhone”, linked these tables together to form the hierarchy that represents the normalized data relationship.

Following steps, walk through the commands to create the table hierarchy and show their relationship similar to the steps shown at the blog’s beginning. The load data and query steps illustrate writing records to the table hierarchy and performing a left outer join. I will dive further into these steps in part 2 of this series which is coming up soon.

Deep Dive of Oracle NoSQL Table Hierarchy using a Simple Example

After the breathtaking illustration of table hierarchy using the airline baggage tracking use case above, let’s return to the simple table hierarchy example below used at the beginning of this blog.This example allows us to understand the table hierarchy concept better. In the next blog, I will use the airline baggage tracking data again to explore queries in table hierarchy.

          

Let’s continue our journey to dive further into the underlying structure that constructs the table hierarchy. You can easily map Table A, A.B, A.B.C, and A.G to the airline baggage tracking use case with the following mapping.

A table hierarchy is constructed using the CREATE TABLE statements. To link Tables A and B with A as the parent and B as the child, CREATE TABLE A.B is used to construct such a relationship. The primary key “idA” of Table A is automatically created in the child Table B as an implicit key. Typically, the shard portion of the primary key is created in the child.  In this case, the entire primary key is created in the child because we did not specify a portion of the primary key as the shard key. To link Tables A, B, and C in a hierarchy, CREATE TABLE A.B.C is used, and primary keys of Tables A and B are automatically created in Table C as implicit keys. Since Table A is the parent of all child tables, “idA” becomes the shard key for partitioning all parent-child records.

Implicit keys in a NoSQL table hierarchy are synonymous with foreign keys in relational databases for building relationships.

Let’s populate the tables with values for each parent-child relationship. The name paths of the parent-child tables, e.g., A.B, A.B.C, A.G, are used to specify the table to which values are written. Different paths require a different number of primary keys to link the parent-child rows together, e.g., the row in child Table A.B with primary key values “2”,”2” establishes a parent-child relationship with the row in parent Table A with primary key value “2”. Table A.B.C row requires primary keys “2”,”2”,”2” to establish the parent-child relationship with the row above in Tables A, A.B.

Once the write operations are complete for all parent-child tables, the table hierarchy contains the row values and parent-child relationship in the diagram below.

The name paths of the parent-child tables, e.g., A.B, A.B.C, A.G, are used to specify the table to which values are written. Different paths require a different number of primary keys to link the parent-child rows together, e.g., the row in child Table A.B with primary key values “2”,”2” establishes a parent-child relationship with the row in parent Table A with primary key value “2”. Table A.B.C row requires primary keys “2”,”2”,”2” to establish the parent-child relationship with the row above in Tables A, A.B. Once the write operations are complete for all parent-child tables, the table hierarchy contains the row values and parent-child relationship in the diagram below. For the ease of viewing, let’s orientate the table hierarchy horizontally.

How Table Hierarchy Works

An Oracle NoSQL Database cluster consists of a collection of Storage Nodes, each of which hosts one or more replication nodes. These nodes are organized into shards. A single shard contains multiple replication nodes and a leader node. A shard key is identified in the CREATE TABLE statement for distributing data across different shards. Records with the same shard key are co-located for easy reference and access.

The following diagram shows how Oracle NoSQL Database distributes rows based on shard key. The section later will illustrate this implementation with sample data to show how parent-child rows are distributed and co-located in the same shard. Before that, let’s dive into the semantic of a parent-child relationship.

Table Hierarchy Semantic

Following are the semantic implications of a parent-child relationship:

  • A child table inherits its parent table’s primary key column(s) – As explained above, this is done implicitly, without including the parent columns in the create table statement of the child.
  • All tables in the hierarchy have the same shard key columns – The shard key is inherited from the root table. In the example above, the common shard key is Table A’s column “idA.” Since shard key columns are a subset of the primary key columns, by definition, child tables also inherit the shard key. Attempting to include a shard key clause in the create table statement of a non-root table will raise an error.
  • A parent table cannot be dropped before its children are dropped – A child table must be deleted before deleting its parent to avoid breaking the table hierarchy relationship. In the example above, removing Table A.B without removing Table A.B.C will raise an error.
  • Parent and its child row(s) are co-located because their shard keys match – When two rows RC and RP from a child table C and its parent table P, respectively, have the same values on their common primary key columns, RP and RC will be co-located physically, because they have the same shard key. 

How Parent-Child Rows are Distributed and Co-Located

The diagrams below illustrate a table hierarchy with the parent-child relationship and co-location of the parent and child rows.

Using the sample table hierarchy earlier, the SQL command: “desc as json table <parent/child table name> shows the following schema descriptions for Tables A, B, C, their primary keys, and shard keys. This implicit primary and shard key sharing builds the bond between the parent and child tables in a hierarchy.

Using the earlier sample data and the diagram below, let’s explore how the parent and child rows are bound together, distributed across shards, and co-located in the same shard.

The parent-child relationship is established when two rows share one or more common primary key values. For example, the row in Table A with the primary key (“idA”) value equals to “2” shares the same primary key (“idA”) value in Table A.B. A similar parent-child relationship is established between the rows in Table A.B and Table A.B.C with the primary key (“idA”) and (“idB”) values equal to “2” and “2”. With this linkage, a table hierarchy with rows of one root parent, a child, and a grandchild is born.

The parent-child rows are distributed among the NoSQL shards based on the root parent’s shard key (“idA”) values. For simplicity, assume the parent Table A’s shard key values are hashed to distribute the parent-child rows across shards 1,2,3. The table below shows the mapping of the root parent shard key to Oracle NoSQL Database:

The following diagram illustrates the sample parent-child rows above are stored in Oracle NoSQL Database shard #2 since the root parent Table A’s shard key (“idA”) equals “2”.

We can extend this idea in the diagram below by inserting rows across shards 1, 2, 3. Each parent-child row in a shard is replicated across multiple storage nodes to provide high availability, high performance for read and write operations.

With the co-location of parent-child rows in the same shard, developers can implement transactional operations. All data in a parent-child row can be committed as a single database operation to ensure atomicity, consistency, isolation, durability (ACID).

Now, you see why the table hierarchy is a hidden gem in Oracle NoSQL Database.

Summary

The takeaway is that the NoSQL table hierarchy not only captures the relationship between data entities but also takes advantage of the co-location of the parent-child rows to offer highly performant retrievals and superior scalability. Another takeaway is a table hierarchy enables applications to implement ACID transactions. All data in the same parent-child rows are stored in the same shard and can be committed as a single database operation to ensure atomicity, consistency, isolation, durability.

For the time being, go ahead and try all the examples. You can try it at Oracle NoSQL Database Cloud Service. Or, download and install Oracle NoSQL Database in less than 1 minute.

Once the download and installation are complete, launch KVLITE, a light version of Oracle NoSQL Database, in seconds using the following command:

      java -jar <your KVHOME directory>/lib/kvstore.jar kvlite -secure-config disable

You can get started with the SQL shell using the following command and run all the examples:

      java -jar <your KVHOME directory>/lib/sql.jar -helper-hosts localhost:5000 -store kvstore