new features is support for creating and querying property graphs in SQL using constructs, such as GRAPH_TABLE and MATCH, defined in SQL:2023. These enable you to write simple SQL queries to follow connections in data.
Oracle Database 23c Free – Developer Release is a new, free offering of the industry-leading Oracle Database that enterprises worldwide rely on every day. Among itsIn this blog, we will use the VirtualBox VM and a demo dataset to create a property on two existing tables and then query them using the GRAPH_TABLE and MATCH clause constructs.
• Download and set up the VirtualBox VM with Oracle Database 23c Free – Developer Release.
• Download the Bank Graph Dataset contents into a local directory, e.g. downloads/bankgraphdataset.
• Start the VM.
The sample data consists of two CSV files containing synthetic bank accounts and money transfers. The goal is to determine if there are any circular payment chains. That is, a chain of money transfers that start and end at the same account after going through 3, 4, or 5 intermediate accounts.
• Create a user, grant necessary roles and privileges
• Load the CSV files into the database
• Create a property graph as a view on these tables
• Query the property graph
Connect to the database as SYS, once the VM has started and the database is up, and create a new user named GRAPHUSERwith the necessary quota and grants.
We’ll use SQLcl and connect to the PDB named FREEPDB1.
sql sys@localhost:1521/freepdb1 as sysdba
Enter the password when prompted. See the Readme file in the VM for the initial password which hopefully you have changed!
create user graphuser identified by <supply-a-password> quota unlimited on users ;
grant connect, resource to graphuser ;
conn graphuser@localhost:1521/freepdb1
Enter the password when prompted.
Enter the LOAD command to load the data from a CSV file after creating the table.
load bank_accounts bankgraphdataset/bank_accounts.csv new ;
Repeat for the money transfers.
load bank_transfers bankgraphdataset/bank_transfers.csv new ;
The BANK_TRANSFERS table has columns named TXN_ID, SRC_ACCT_ID, DST_ACCT_ID, and AMOUNT. Each row represents a money transfer from the SRC_ACCT_ID to the DST_ACCT_ID. So this table represents the connections between two accounts and hence becomes an edge in the property graph. The TXN_ID value identifies an edge and the AMOUNT, and optionally the SRC_ACCT_ID and DST_ACCT_ID, become properties of the edge.
An edge connects vertices. In this instance the BANK_ACCOUNTS table represents those vertices. The ID column identifies a vertex while the NAME and BALANCE columns become its properties.
So in property graph term, BANK_ACCOUNTS is a vertex table and BANK_TRANSFERS is an edge table.
Now let’s create the property graph. Enter and execute the following statement in SQLcl while connected as GRAPHUSER.
CREATE PROPERTY GRAPH BANK_GRAPH
VERTEX TABLES (
BANK_ACCOUNTS
KEY (ID)
PROPERTIES (ID, Name, Balance)
)
EDGE TABLES (
BANK_TRANSFERS
KEY (TXN_ID)
SOURCE KEY (src_acct_id) REFERENCES BANK_ACCOUNTS(ID)
DESTINATION KEY (dst_acct_id) REFERENCES BANK_ACCOUNTS(ID)
PROPERTIES (src_acct_id, dst_acct_id, amount)
);
See the section on “SQL DDL Statements for Property Graphs” in the Graph Developer’s Guide for Property Graph for more details.
Next, we’ll query the BANK_GRAPH to look for circular payment chains.
First, here’s a very brief introduction to the GRAPH_TABLE and MATCH clause constructs.
GRAPH_TABLE is an operator that enables you to query the property graph by specifying a graph pattern to look for and then returning the results as a set of columns, i.e. a normal SQL table.
The MATCH clause lets you specify the graph patterns. The following example,
(src) – [e] -> (dst)
consists of two vertex patterns and one edge pattern. The () indicates a vertex, [] an edge and the arrow -> specifies the edge’s direction.
See the section on “SQL GRAPH_TABLE Queries” in the Graph Developer’s Guide for Property Graph for more details.
Now let’s query the BANK_GRAPH.
First, we look for the top 10 accounts by the number of incoming transfers. Then, the top 10 accounts that are at the center of 2-hop transfers.
REM Find the top 10 accounts by incoming transfers
SELECT acct_id, COUNT(1) AS Num_Transfers
FROM graph_table ( BANK_GRAPH
MATCH (src) - [IS BANK_TRANSFERS] -> (dst)
COLUMNS ( dst.id AS acct_id )
) GROUP BY acct_id ORDER BY Num_Transfers DESC
FETCH FIRST 10 ROWS ONLY;
The result should be:
ACCT_ID NUM_TRANSFERS
__________ ________________
387 39
934 39
135 36
534 32
380 31
330 30
406 28
746 28
920 26
259 26
10 rows selected.
REM Find the top 10 accounts in the middle of a 2-hop chain of transfers
SELECT acct_id, COUNT(1) AS Num_In_Middle
FROM graph_table ( BANK_GRAPH
MATCH (src) - [IS BANK_TRANSFERS] -> (via) - [IS BANK_TRANSFERS] -> (dst)
COLUMNS ( via.id AS acct_id )
) GROUP BY acct_id ORDER BY Num_In_Middle DESC FETCH FIRST 10 ROWS ONLY;
The result should be:
ACCT_ID NUM_IN_MIDDLE
__________ ________________
387 195
934 195
135 180
534 160
380 155
330 150
406 140
746 140
920 130
259 130
10 rows selected.
Next let’s check if there are any 3-, 4-, or 5-hop circular payment chains.
REM Check if there are any 3-hop (triangles) transfers that start and end at the same account
SELECT acct_id, COUNT(1) AS Num_Triangles
FROM graph_table (BANK_GRAPH
MATCH (src) - []->{3} (src)
COLUMNS (src.id AS acct_id)
) GROUP BY acct_id ORDER BY Num_Triangles DESC;
ACCT_ID NUM_TRIANGLES
__________ ________________
918 3
751 3
534 3
359 3
119 2
677 2
218 2
…
118 rows selected.
REM Check if there are any 4-hop transfers that start and end at the same account
SELECT acct_id, COUNT(1) AS Num_4hop_Chains
FROM graph_table (BANK_GRAPH
MATCH (src) - []->{4} (src)
COLUMNS (src.id AS acct_id)
) GROUP BY acct_id ORDER BY Num_4hop_Chains DESC;
ACCT_ID NUM_4HOP_CHAINS
__________ __________________
397 8
387 7
579 7
801 6
559 6
499 6
716 5
…
329 rows selected.
REM Check if there are any 5-hop transfers that start and end at the same account
SELECT acct_id, COUNT(1) AS Num_5hop_Chains
FROM graph_table (BANK_GRAPH
MATCH (src) - []->{5} (src)
COLUMNS (src.id AS acct_id)
) GROUP BY acct_id ORDER BY Num_5hop_Chains DESC;
ACCT_ID NUM_4HOP_CHAINS
__________ __________________
397 8
387 7
579 7
801 6
559 6
499 6
716 5
…
619 rows selected.
Finally, let’s list the top 10 accounts that have such circular payment chains of between 3 to 5-hops in length.
REM Query by number of 3 to 5 hops cycles in descending order. Show top 10.
SELECT DISTINCT(account_id), COUNT(1) AS Num_Cycles
FROM graph_table(BANK_GRAPH
MATCH (v1)-[IS BANK_TRANSFERS]->{3, 5}(v1)
COLUMNS (v1.id AS account_id)
) GROUP BY account_id ORDER BY Num_Cycles DESC FETCH FIRST 10 ROWS ONLY;
ACCOUNT_ID NUM_CYCLES
_____________ _____________
135 37
387 34
934 30
640 28
458 27
13 27
559 25
352 23
406 23
499 22
10 rows selected.
We hope you found that quick introduction useful and are motivated to experiment with your datasets and use cases. Please discuss your experiences, post ideas, or ask questions on the public Oracle Database Free – Developer Release forum. If you would like a hands on experience exploring Operational Property Graphs in 23c, try our LiveLab. Oracle Database 23c Free – Developer Release is the first release of the next-generation Oracle Database, allowing developers a head-start on building applications with innovative 23c features that simplify development of modern data-driven apps. The entire feature set of Oracle Database 23c is planned to be generally available within the next 12 months.
Jayant Sharma is Sr. Director, Product Management at Oracle. He is currently focused on product strategy and design of spatial and graph offerings on the Autonomous Database platform.
He works closely with business partners and strategic customers, and has delivered numerous presentations, tutorials, and research papers at various academic and industry conferences.
Previous Post