new features is support for creating and querying property graphs in SQL using constructs, such as GRAPH_TABLE and MATCH, defined in SQL:2023. These enable you to write simple SQL queries to follow connections in data.
Oracle Database 23c Free – Developer Release is a new, free offering of the industry-leading Oracle Database that enterprises worldwide rely on every day. Among itsIn this blog, we will use the VirtualBox VM and a demo dataset to create a property on two existing tables and then query them using the GRAPH_TABLE and MATCH clause constructs.
• Download and set up the VirtualBox VM with Oracle Database 23c Free – Developer Release.
• Download the Bank Graph Dataset contents into a local directory, e.g. downloads/bankgraphdataset.
• Start the VM.
The sample data consists of two CSV files containing synthetic bank accounts and money transfers. The goal is to determine if there are any circular payment chains. That is, a chain of money transfers that start and end at the same account after going through 3, 4, or 5 intermediate accounts.
• Create a user, grant necessary roles and privileges
• Load the CSV files into the database
• Create a property graph as a view on these tables
• Query the property graph
Connect to the database as SYS, once the VM has started and the database is up, and create a new user named GRAPHUSERwith the necessary quota and grants.
We’ll use SQLcl and connect to the PDB named FREEPDB1.
sql sys@localhost:1521/freepdb1 as sysdba
Enter the password when prompted. See the Readme file in the VM for the initial password which hopefully you have changed!
create user graphuser identified by <supply-a-password> quota unlimited on users ; grant connect, resource to graphuser ;
conn graphuser@localhost:1521/freepdb1
Enter the password when prompted.
Enter the LOAD command to load the data from a CSV file after creating the table.
load bank_accounts bankgraphdataset/bank_accounts.csv new ;
Repeat for the money transfers.
load bank_transfers bankgraphdataset/bank_transfers.csv new ;
The BANK_TRANSFERS table has columns named TXN_ID, SRC_ACCT_ID, DST_ACCT_ID, and AMOUNT. Each row represents a money transfer from the SRC_ACCT_ID to the DST_ACCT_ID. So this table represents the connections between two accounts and hence becomes an edge in the property graph. The TXN_ID value identifies an edge and the AMOUNT, and optionally the SRC_ACCT_ID and DST_ACCT_ID, become properties of the edge.
An edge connects vertices. In this instance the BANK_ACCOUNTS table represents those vertices. The ID column identifies a vertex while the NAME and BALANCE columns become its properties.
So in property graph term, BANK_ACCOUNTS is a vertex table and BANK_TRANSFERS is an edge table.
Now let’s create the property graph. Enter and execute the following statement in SQLcl while connected as GRAPHUSER.
CREATE PROPERTY GRAPH BANK_GRAPH VERTEX TABLES ( BANK_ACCOUNTS KEY (ID) PROPERTIES (ID, Name, Balance) ) EDGE TABLES ( BANK_TRANSFERS KEY (TXN_ID) SOURCE KEY (src_acct_id) REFERENCES BANK_ACCOUNTS(ID) DESTINATION KEY (dst_acct_id) REFERENCES BANK_ACCOUNTS(ID) PROPERTIES (src_acct_id, dst_acct_id, amount) );
See the section on “SQL DDL Statements for Property Graphs” in the Graph Developer’s Guide for Property Graph for more details.
Next, we’ll query the BANK_GRAPH to look for circular payment chains.
First, here’s a very brief introduction to the GRAPH_TABLE and MATCH clause constructs.
GRAPH_TABLE is an operator that enables you to query the property graph by specifying a graph pattern to look for and then returning the results as a set of columns, i.e. a normal SQL table.
The MATCH clause lets you specify the graph patterns. The following example,
(src) – [e] -> (dst)
consists of two vertex patterns and one edge pattern. The () indicates a vertex, [] an edge and the arrow -> specifies the edge’s direction.
See the section on “SQL GRAPH_TABLE Queries” in the Graph Developer’s Guide for Property Graph for more details.
Now let’s query the BANK_GRAPH.
First, we look for the top 10 accounts by the number of incoming transfers. Then, the top 10 accounts that are at the center of 2-hop transfers.
REM Find the top 10 accounts by incoming transfers SELECT acct_id, COUNT(1) AS Num_Transfers FROM graph_table ( BANK_GRAPH MATCH (src) - [IS BANK_TRANSFERS] -> (dst) COLUMNS ( dst.id AS acct_id ) ) GROUP BY acct_id ORDER BY Num_Transfers DESC FETCH FIRST 10 ROWS ONLY;
The result should be:
ACCT_ID NUM_TRANSFERS __________ ________________ 387 39 934 39 135 36 534 32 380 31 330 30 406 28 746 28 920 26 259 26 10 rows selected.
REM Find the top 10 accounts in the middle of a 2-hop chain of transfers SELECT acct_id, COUNT(1) AS Num_In_Middle FROM graph_table ( BANK_GRAPH MATCH (src) - [IS BANK_TRANSFERS] -> (via) - [IS BANK_TRANSFERS] -> (dst) COLUMNS ( via.id AS acct_id ) ) GROUP BY acct_id ORDER BY Num_In_Middle DESC FETCH FIRST 10 ROWS ONLY;
The result should be:
ACCT_ID NUM_IN_MIDDLE __________ ________________ 387 195 934 195 135 180 534 160 380 155 330 150 406 140 746 140 920 130 259 130 10 rows selected.
Next let’s check if there are any 3-, 4-, or 5-hop circular payment chains.
REM Check if there are any 3-hop (triangles) transfers that start and end at the same account SELECT acct_id, COUNT(1) AS Num_Triangles FROM graph_table (BANK_GRAPH MATCH (src) - []->{3} (src) COLUMNS (src.id AS acct_id) ) GROUP BY acct_id ORDER BY Num_Triangles DESC; ACCT_ID NUM_TRIANGLES __________ ________________ 918 3 751 3 534 3 359 3 119 2 677 2 218 2 … 118 rows selected.
REM Check if there are any 4-hop transfers that start and end at the same account SELECT acct_id, COUNT(1) AS Num_4hop_Chains FROM graph_table (BANK_GRAPH MATCH (src) - []->{4} (src) COLUMNS (src.id AS acct_id) ) GROUP BY acct_id ORDER BY Num_4hop_Chains DESC; ACCT_ID NUM_4HOP_CHAINS __________ __________________ 397 8 387 7 579 7 801 6 559 6 499 6 716 5 … 329 rows selected.
REM Check if there are any 5-hop transfers that start and end at the same account SELECT acct_id, COUNT(1) AS Num_5hop_Chains FROM graph_table (BANK_GRAPH MATCH (src) - []->{5} (src) COLUMNS (src.id AS acct_id) ) GROUP BY acct_id ORDER BY Num_5hop_Chains DESC; ACCT_ID NUM_4HOP_CHAINS __________ __________________ 397 8 387 7 579 7 801 6 559 6 499 6 716 5 … 619 rows selected.
Finally, let’s list the top 10 accounts that have such circular payment chains of between 3 to 5-hops in length.
REM Query by number of 3 to 5 hops cycles in descending order. Show top 10. SELECT DISTINCT(account_id), COUNT(1) AS Num_Cycles FROM graph_table(BANK_GRAPH MATCH (v1)-[IS BANK_TRANSFERS]->{3, 5}(v1) COLUMNS (v1.id AS account_id) ) GROUP BY account_id ORDER BY Num_Cycles DESC FETCH FIRST 10 ROWS ONLY; ACCOUNT_ID NUM_CYCLES _____________ _____________ 135 37 387 34 934 30 640 28 458 27 13 27 559 25 352 23 406 23 499 22 10 rows selected.
We hope you found that quick introduction useful and are motivated to experiment with your datasets and use cases. Please discuss your experiences, post ideas, or ask questions on the public Oracle Database Free – Developer Release forum. If you would like a hands on experience exploring Operational Property Graphs in 23c, try our LiveLab. Oracle Database 23c Free – Developer Release is the first release of the next-generation Oracle Database, allowing developers a head-start on building applications with innovative 23c features that simplify development of modern data-driven apps. The entire feature set of Oracle Database 23c is planned to be generally available within the next 12 months.
Jayant Sharma is Sr. Director, Product Management at Oracle. He is currently focused on product strategy and design of spatial and graph offerings on the Autonomous Database platform.
He works closely with business partners and strategic customers, and has delivered numerous presentations, tutorials, and research papers at various academic and industry conferences.
Previous Post
Next Post