Get started with property graphs in Oracle Database 23c Free – Developer Release

April 10, 2023 | 11 minute read
Jayant Sharma
Sr. Director, Product Mgmt
Text Size 100%:

Oracle Database 23c Free – Developer Release is a new, free offering of the industry-leading Oracle Database that enterprises worldwide rely on every day. Among its new features is support for creating and querying property graphs in SQL using constructs, such as GRAPH_TABLE and MATCH, defined in SQL:2023. These enable you to write simple SQL queries to follow connections in data.

In this blog, we will use the VirtualBox VM and a demo dataset to create a property on two existing tables and then query them using the GRAPH_TABLE and MATCH clause constructs.

Prerequisites

•    Download and set up the VirtualBox VM with Oracle Database 23c Free – Developer Release.
•    Download the Bank Graph Dataset contents into a local directory, e.g. downloads/bankgraphdataset.
•    Start the VM.

The scenario

The sample data consists of two CSV files containing synthetic bank accounts and money transfers. The goal is to determine if there are any circular payment chains. That is, a chain of money transfers that start and end at the same account after going through 3, 4, or 5 intermediate accounts.

The steps

•    Create a user, grant necessary roles and privileges
•    Load the CSV files into the database
•    Create a property graph as a view on these tables
•    Query the property graph

Create the user GRAPHUSER

Connect to the database as SYS, once the VM has started and the database is up, and create a new user named GRAPHUSERwith the necessary quota and grants.
We’ll use SQLcl and connect to the PDB named FREEPDB1.

sql sys@localhost:1521/freepdb1 as sysdba

Enter the password when prompted. See the Readme file in the VM for the initial password which hopefully you have changed!

create user graphuser identified by <supply-a-password> quota unlimited on users ;
grant connect, resource to graphuser ;

Connect as GRAPHUSER and load the data

conn graphuser@localhost:1521/freepdb1

Enter the password when prompted.

Enter the LOAD command to load the data from a CSV file after creating the table.

load bank_accounts bankgraphdataset/bank_accounts.csv new ;

LOAD command to load data

Repeat for the money transfers.

load bank_transfers bankgraphdataset/bank_transfers.csv new ;

LOAD command to load more data

Create a property graph named BANK_GRAPH

The BANK_TRANSFERS table has columns named TXN_ID, SRC_ACCT_ID, DST_ACCT_ID, and AMOUNT.  Each row represents a money transfer from the SRC_ACCT_ID to the DST_ACCT_ID. So this table represents the connections between two accounts and hence becomes an edge in the property graph. The TXN_ID value identifies an edge and the AMOUNT, and optionally the SRC_ACCT_ID and DST_ACCT_ID, become properties of the edge.

An edge connects vertices. In this instance the BANK_ACCOUNTS table represents those vertices. The ID column identifies a vertex while the NAME and BALANCE columns become its properties.

So in property graph term, BANK_ACCOUNTS is a vertex table and BANK_TRANSFERS is an edge table.

Now let’s create the property graph. Enter and execute the following statement in SQLcl while connected as GRAPHUSER.

CREATE PROPERTY GRAPH BANK_GRAPH
    VERTEX TABLES (
        BANK_ACCOUNTS
        KEY (ID)
        PROPERTIES (ID, Name, Balance)
    )
    EDGE TABLES (
        BANK_TRANSFERS
        KEY (TXN_ID)
        SOURCE KEY (src_acct_id) REFERENCES BANK_ACCOUNTS(ID)
        DESTINATION KEY (dst_acct_id) REFERENCES BANK_ACCOUNTS(ID)
        PROPERTIES (src_acct_id, dst_acct_id, amount)
    );

See the section on “SQL DDL Statements for Property Graphs” in the Graph Developer’s Guide for Property Graph for more details.

Next, we’ll query the BANK_GRAPH to look for circular payment chains.

First, here’s a very brief introduction to the GRAPH_TABLE and MATCH clause constructs.

Querying property graphs

GRAPH_TABLE is an operator that enables you to query the property graph by specifying a graph pattern to look for and then returning the results as a set of columns, i.e. a normal SQL table.

The MATCH clause lets you specify the graph patterns. The following example,

(src) – [e] -> (dst)

consists of two vertex patterns and one edge pattern. The () indicates a vertex, [] an edge and the arrow -> specifies the edge’s direction.

See the section on “SQL GRAPH_TABLE Queries” in the Graph Developer’s Guide for Property Graph for more details.

Now let’s query the BANK_GRAPH.
First, we look for the top 10 accounts by the number of incoming transfers. Then, the top 10 accounts that are at the center of 2-hop transfers.

REM Find the top 10 accounts by incoming transfers
SELECT acct_id, COUNT(1) AS Num_Transfers
    FROM graph_table ( BANK_GRAPH
        MATCH (src) - [IS BANK_TRANSFERS] -> (dst)
        COLUMNS ( dst.id AS acct_id )
    ) GROUP BY acct_id ORDER BY Num_Transfers DESC
FETCH FIRST 10 ROWS ONLY;

The result should be:

   ACCT_ID    NUM_TRANSFERS
__________ ________________
       387               39
       934               39
       135               36
       534               32
       380               31
       330               30
       406               28
       746               28
       920               26
       259               26

10 rows selected.
REM Find the top 10 accounts in the middle of a 2-hop chain of transfers
SELECT acct_id, COUNT(1) AS Num_In_Middle
    FROM graph_table ( BANK_GRAPH
        MATCH (src) - [IS BANK_TRANSFERS] -> (via) - [IS BANK_TRANSFERS] -> (dst)
        COLUMNS ( via.id AS acct_id )
    ) GROUP BY acct_id ORDER BY Num_In_Middle DESC FETCH FIRST 10 ROWS ONLY;

The result should be:

   ACCT_ID    NUM_IN_MIDDLE
__________ ________________
       387              195
       934              195
       135              180
       534              160
       380              155
       330              150
       406              140
       746              140
       920              130
       259              130

10 rows selected.

Next let’s check if there are any 3-, 4-, or 5-hop circular payment chains.

REM Check if there are any 3-hop (triangles) transfers that start and end at the same account
SELECT acct_id, COUNT(1) AS Num_Triangles
    FROM graph_table (BANK_GRAPH
        MATCH (src) - []->{3} (src)
        COLUMNS (src.id AS acct_id)
    ) GROUP BY acct_id ORDER BY Num_Triangles DESC;

   ACCT_ID    NUM_TRIANGLES
__________ ________________
       918                3
       751                3
       534                3
       359                3
       119                2
       677                2
       218                2
…
118 rows selected.
REM Check if there are any 4-hop transfers that start and end at the same account
SELECT acct_id, COUNT(1) AS Num_4hop_Chains
    FROM graph_table (BANK_GRAPH
        MATCH (src) - []->{4} (src)
        COLUMNS (src.id AS acct_id)
    ) GROUP BY acct_id ORDER BY Num_4hop_Chains DESC;

   ACCT_ID    NUM_4HOP_CHAINS
__________ __________________
       397                  8
       387                  7
       579                  7
       801                  6
       559                  6
       499                  6
       716                  5
…
329 rows selected.
REM Check if there are any 5-hop transfers that start and end at the same account
SELECT acct_id, COUNT(1) AS Num_5hop_Chains
    FROM graph_table (BANK_GRAPH
        MATCH (src) - []->{5} (src)
        COLUMNS (src.id AS acct_id)
    ) GROUP BY acct_id ORDER BY Num_5hop_Chains DESC;

   ACCT_ID    NUM_4HOP_CHAINS
__________ __________________
       397                  8
       387                  7
       579                  7
       801                  6
       559                  6
       499                  6
       716                  5
…
619 rows selected.

Finally, let’s list the top 10 accounts that have such circular payment chains of between 3 to 5-hops in length.

REM Query by number of 3 to 5 hops cycles in descending order. Show top 10.
SELECT DISTINCT(account_id), COUNT(1) AS Num_Cycles
    FROM graph_table(BANK_GRAPH
        MATCH (v1)-[IS BANK_TRANSFERS]->{3, 5}(v1)
        COLUMNS (v1.id AS account_id)
    ) GROUP BY account_id ORDER BY Num_Cycles DESC FETCH FIRST 10 ROWS ONLY;

   ACCOUNT_ID    NUM_CYCLES
_____________ _____________
          135            37
          387            34
          934            30
          640            28
          458            27
           13            27
          559            25
          352            23
          406            23
          499            22

10 rows selected.

We hope you found that quick introduction useful and are motivated to experiment with your datasets and use cases. Please discuss your experiences, post ideas, or ask questions on the public Oracle Database Free – Developer Release forum. If you would like a hands on experience exploring Operational Property Graphs in 23c, try our LiveLab. Oracle Database 23c Free – Developer Release is the first release of the next-generation Oracle Database, allowing developers a head-start on building applications with innovative 23c features that simplify development of modern data-driven apps. The entire feature set of Oracle Database 23c is planned to be generally available within the next 12 months.

 

Jayant Sharma

Sr. Director, Product Mgmt

Jayant Sharma is Sr. Director, Product Management at Oracle. He is currently focused on product strategy and design of spatial and graph offerings on the Autonomous Database platform. 

He works closely with business partners and strategic customers, and has delivered numerous presentations, tutorials, and research papers  at various academic and industry conferences. 


Previous Post

Use JSON Relational Duality with Oracle Database API for Mongo DB

Hermann Baer | 4 min read

Next Post


Introduction to JavaScript in Oracle Database 23ai

Martin Bach | 7 min read