This is a guest blog by Oracle ACE Pro Deepak Vohra
Property Graphs and the concomitant Property Graph Query Language (PGQL) provide a new opportunity to use the Oracle Database as if it were a Graph Database. The syntax of PGQL is very close to the one of Property Graph Queries (SQL/PGQ), the new standard specified in the ISO/IEC 9075-16:2023 Information technology — Database languages SQL Part 16. Both define an operator called GRAPH_TABLE and a new construct called MATCH to discover connections in a SQL property graph. PGQL as opposed to SQL/PGQ is its own Property Graph Query Language, and since PGQL has been there for quite some time, it has more features than SQL/PGQ and its underlying standard specification.
Oracle Database 23ai introduces SQL property graphs to represent relational table data as a graph. Property graphs are a way to discover connections between data points which otherwise could involve prohibitive SQL joins to discover, if at all discoverable.
In graph theory a directed graph is made of a set of vertices, and a set of directed edges. An edge is a connection between two vertices. A directed edge has one of its endpoints as the source vertex and the other endpoint as the destination vertex. A directed graph can be described by G = (V, A).
V is a set of vertices
A is a set of edges
A single edge consists of an ordered pair of vertices (v1 , v2), the direction of the edge being from v1 to v2. Every edge in a directed graph is directed.
A property graph data model is a named directed graph with the following description:
Labels are strings. A property/attribute is a key-value pair with property of type string and value of a scalar type (numeric, string, boolean). In Oracle's implementation for 23ai, properties can also be non-scalar such as a JSON column. Typically vertices represent entities such as an employee, and edges represent relations between the entities.
SQL/PGQ is the extension to SQL for querying property graphs. You can query a SQL property graph using the GRAPH_TABLE operator to express graph pattern matching queries. The following parameters need to be defined for the GRAPH_TABLE operator:
We use an Oracle Database 23ai instance created from a Docker image. To start with, pull the Docker image for Oracle Database 23 Free Edition, latest.
sudo docker pull container-registry.oracle.com/database/free:latest
latest: Pulling from database/free
6d6e36f7c9fb: Pull complete
21def9023b6f: Pull complete
5e7b2cfeb7fa: Pull complete
b4a24759beff: Pull complete
78bba54e9814: Pull complete
716b489ad5ad: Pull complete
c23fd8c6cbee: Pull complete
79dea26b3a5a: Pull complete
5dfbcf799df3: Pull complete
154719a62576: Pull complete
Digest: sha256:83edd0756fda0e5faecc0fdf047814f0177d4224d7bf037e4900123ee3e08718
Status: Downloaded newer image for container-registry.oracle.com/database/free:latest
container-registry.oracle.com/database/free:latest
Run a Docker container from the Docker image with the following docker run command.
sudo docker run --name 23ai \
-p 1521:1521 \
-e ORACLE_PWD=orcl23ai \
-e ENABLE_ARCHIVELOG=true \
-e ENABLE_FORCE_LOGGING=true \
-v /opt/oracle/oradata \
container-registry.oracle.com/database/free:latest
Oracle Database 23ai Free Release 23.0.0.0.0 Version 23.4.0.24.05 is started and is ready to use, as partial output from the command shows.
sudo docker run --name 23ai \
> -p 1521:1521 \
> -e ORACLE_PWD=orcl23ai \
> -e ENABLE_ARCHIVELOG=true \
> -e ENABLE_FORCE_LOGGING=true \
> -v /opt/oracle/oradata \
> container-registry.oracle.com/database/free:latest
Starting Oracle Net Listener.
Oracle Net Listener started.
Starting Oracle Database instance FREE.
Oracle Database instance FREE started.
The Oracle base remains unchanged with value /opt/oracle
SQL*Plus: Release 23.0.0.0.0 - Production on Sun Jun 9 21:37:26 2024
Version 23.4.0.24.05
…
…
#########################
DATABASE IS READY TO USE!
#########################
Completed: Pluggable database FREEPDB1 opened read write
Download the sample schemas with the following wget command.
wget https://github.com/oracle-samples/db-sample-schemas/archive/refs/tags/v23.3.zip
Extract the contents of the v23.3.zip file downloaded.
unzip v23.3.zip
The sample schemas get extracted to the db-sample-schemas-23.3 directory.
cd db-sample-schemas-23.3
ls -l
drwxrwxr-x 2 4096 Mar 28 12:32 customer_orders
drwxrwxr-x 2 4096 Mar 28 12:32 human_resources
-rw-rw-r-- 1 1094 Mar 28 12:32 LICENSE.txt
drwxrwxr-x 3 4096 Mar 28 12:32 order_entry
drwxrwxr-x 2 4096 Mar 28 12:32 product_media
-rw-rw-r-- 1 3784 Mar 28 12:32 README.md
-rw-rw-r-- 1 3613 Mar 28 12:32 README.txt
drwxrwxr-x 2 4096 Mar 28 12:32 sales_history
Each set of sample schemas includes SQL scripts to create, install, and populate the schema tables. We use the human_resources schema.
cd human_resources
ls -l
-rw-rw-r-- 1 3849 Mar 28 12:32 hr_code.sql
-rw-rw-r-- 1 17112 Mar 28 12:32 hr_create.sql
-rw-rw-r-- 1 8275 Mar 28 12:32 hr_install.sql
-rw-rw-r-- 1 41301 Mar 28 12:32 hr_populate.sql
-rw-rw-r-- 1 2920 Mar 28 12:32 hr_uninstall.sql
-rw-rw-r-- 1 2642 Mar 28 12:32 README.md
-rw-rw-r-- 1 2666 Mar 28 12:32 README.txt
Run the scripts to create and populate the sample tables.
SQL> desc employees;
Name Null? Type
----------------------------------------- -------- ----------------------------
EMPLOYEE_ID NOT NULL NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
SQL> desc departments;
Name Null? Type
----------------------------------------- -------- ----------------------------
DEPARTMENT_ID NOT NULL NUMBER(4)
DEPARTMENT_NAME NOT NULL VARCHAR2(30)
MANAGER_ID NUMBER(6)
LOCATION_ID NUMBER(4)
SQL> desc job_history;
Name Null? Type
----------------------------------------- -------- ----------------------------
EMPLOYEE_ID NOT NULL NUMBER(6)
START_DATE NOT NULL DATE
END_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
DEPARTMENT_ID NUMBER(4)
SQL> desc jobs;
Name Null? Type
----------------------------------------- -------- ----------------------------
JOB_ID NOT NULL VARCHAR2(10)
JOB_TITLE NOT NULL VARCHAR2(35)
MIN_SALARY NUMBER(6)
MAX_SALARY NUMBER(6)
SQL> desc regions;
Name Null? Type
----------------------------------------- -------- ----------------------------
REGION_ID NOT NULL NUMBER
REGION_NAME VARCHAR2(25)
SQL> desc locations;
Name Null? Type
----------------------------------------- -------- ----------------------------
LOCATION_ID NOT NULL NUMBER(4)
STREET_ADDRESS VARCHAR2(40)
POSTAL_CODE VARCHAR2(12)
CITY NOT NULL VARCHAR2(30)
STATE_PROVINCE VARCHAR2(25)
COUNTRY_ID CHAR(2)
SQL> desc countries;
Name Null? Type
----------------------------------------- -------- ----------------------------
COUNTRY_ID NOT NULL CHAR(2)
COUNTRY_NAME VARCHAR2(60)
REGION_ID NUMBER
SQL>
SQL DDL can be used to create a property graph using the new SQL statement CREATE PROPERTY GRAPH [IF NOT EXISTS]. A property graph is an abstract data structure consisting of two or more vertices and one or more edges.
A property graph definition consists of the following components:
A schema.graph_name
The VERTEX TABLES clause
The EDGE TABLES clause
An edge table definition consists of a graph element name and key pair that is directed from a SOURCE vertex table reference to a DESTINATION vertex table reference.
Create an example property graph from the HR schema tables using the CREATE PROPERTY GRAPH statement. You can use a SQL client of your choice (e.g.; SQL Developer, the SQL Developer extension in Visual Studio, SQLcl, or others) to execute the following statements:
CREATE PROPERTY GRAPH hr
VERTEX TABLES (
employees LABEL employee
PROPERTIES ARE ALL COLUMNS EXCEPT ( job_id, manager_id, department_id ),
departments LABEL department
PROPERTIES ( department_id, department_name ),
jobs LABEL job
PROPERTIES ARE ALL COLUMNS,
job_history
PROPERTIES ( start_date, end_date ),
locations LABEL location
PROPERTIES ARE ALL COLUMNS EXCEPT ( country_id ),
countries LABEL country
PROPERTIES ARE ALL COLUMNS EXCEPT ( region_id ),
regions LABEL region
)
EDGE TABLES (
employees AS works_for
SOURCE KEY ( employee_id ) REFERENCES employees ( employee_id )
DESTINATION KEY ( manager_id ) REFERENCES employees ( employee_id )
NO PROPERTIES,
employees AS works_at
SOURCE KEY ( employee_id ) REFERENCES employees ( employee_id )
DESTINATION departments
NO PROPERTIES,
employees AS works_as
SOURCE KEY ( employee_id ) REFERENCES employees ( employee_id )
DESTINATION jobs
NO PROPERTIES,
departments AS managed_by
SOURCE KEY ( department_id ) REFERENCES departments ( department_id )
DESTINATION employees
NO PROPERTIES,
job_history AS for_employee
SOURCE KEY ( employee_id, start_date ) REFERENCES job_history ( employee_id, start_date)
DESTINATION employees
NO PROPERTIES,
job_history AS for_department
SOURCE KEY ( employee_id, start_date ) REFERENCES job_history ( employee_id, start_date)
DESTINATION departments
NO PROPERTIES,
job_history AS for_job
SOURCE KEY ( employee_id, start_date ) REFERENCES job_history ( employee_id, start_date)
DESTINATION jobs
NO PROPERTIES,
departments AS department_located_in
SOURCE KEY ( department_id ) REFERENCES departments ( department_id )
DESTINATION locations
LABEL located_in
NO PROPERTIES,
locations AS location_located_in
SOURCE KEY ( location_id ) REFERENCES locations ( location_id )
DESTINATION countries
LABEL located_in
NO PROPERTIES,
countries AS country_located_in
SOURCE KEY ( country_id ) REFERENCES countries ( country_id )
DESTINATION regions
LABEL located_in
NO PROPERTIES
)
Query the property graph using the GRAPH_TABLE operator.
Starting with a simple example, search for employees and the manager each employee works for with the following SQL/PGQquery.
SELECT a, b
FROM GRAPH_TABLE(
hr
MATCH (a IS Employee)-[e is works_for]->(b is Employee)
COLUMNS(a.employee_id as a, b.employee_id as b)
);
The output lists the EMPLOYEE_IDs and the ID of the managing employee each one works for. Keep in mind that the person each employee works for is also an employee of the same organization.
SELECT a, b
FROM GRAPH_TABLE(
hr
MATCH (a IS Employee)-[e is works_for]->(b is Employee)
COLUMNS(a.employee_id as a, b.employee_id as b)
);
A B
---------- ----------
101 100
102 100
103 102
104 103
105 103
106 103
107 103
108 101
109 108
110 108
111 108
…
205 101
206 205
106 rows selected.
Take the first example further by discovering an additional connection to the department (id) of the manager that each employee works for.
SELECT a, b, c
FROM GRAPH_TABLE(
hr
MATCH (a IS Employee)-[works_for]->(b is Employee)-[works_at]->(c is Department)
COLUMNS(a.employee_id as a, b.employee_id as b, c.department_id as c)
);
A B C
---------- ---------- ----------
101 100 90
102 100 90
103 102 90
104 103 60
105 103 60
106 103 60
107 103 60
108 101 90
109 108 100
110 108 100
111 108 100
…
205 101 90
206 205 110
106 rows selected.
The preceding SQL/PGQ query lists the employee IDs, manager IDs, and department IDs. We can very well list the employee names, manager names, and department names discovered by the same connections in the graph with the following SQL/PGQ query.
SELECT employee, manager, department
FROM GRAPH_TABLE(
hr
MATCH (a IS Employee)-[works_for]->(b is Employee)-[works_at]->(c is Department)
COLUMNS(a.first_name as employee, b.first_name as manager, c.department_name as department)
);
This time the output lists the employee name, manager name, and department name.;
EMPLOYEE MANAGER DEPARTMENT
-------------------- -------------------- ------------------------------
Neena Steven Executive
Lex Steven Executive
Alexander Lex Executive
Bruce Alexander IT
David Alexander IT
Valli Alexander IT
Diana Alexander IT
Nancy Neena Executive
Daniel Nancy Finance
John Nancy Finance
Ismael Nancy Finance
…
Shelley Neena Executive
William Shelley Accounting
106 rows selected.
With the numerous tables in the schema, we can discover complex connections. As an example, list each employee’s id along with the employee’s department name, the location id of the department, the country name of the location, and the region name of the country with the query:
SELECT employee,department,location,country,region
FROM GRAPH_TABLE(
hr
MATCH (employee IS Employee)-[works_at]->(department is Department)-[department_located_in]->(location is Location)-[location_located_in]->(country is Country)-[country_located_in]->(region is Region)
COLUMNS(employee.employee_id as employee, department.department_name as department, location.location_id as location, country.country_name as country,region.region_name as region)
);
The query discovers the graph connections and lists them in a table output that is similar to the output for any other query.
EMPLOYEE DEPARTMENT LOCATION
---------- ------------------------------ ----------
COUNTRY
------------------------------------------------------------
REGION
-------------------------
100 Executive 1700
United States of America
Americas
101 Executive 1700
United States of America
Americas
…
…
206 Accounting 1700
United States of America
Americas
106 rows selected.
In the following example, use the WHERE clause to filter the manager name. Select all employees who work for manager Steven, along with listing the manager’s department name.
SELECT employee, manager, department
FROM GRAPH_TABLE(
hr
MATCH (a IS Employee)-[works_for]->(b is Employee)-[works_at]->(c is Department)
WHERE b.first_name = 'Steven'
COLUMNS(a.first_name as employee, b.first_name as manager, c.department_name as department)
);
As expected, the output includes only the employees who work for “Steven”; the MANAGER r column has “Steven” for all rows.
EMPLOYEE MANAGER DEPARTMENT
-------------------- -------------------- ------------------------------
Neena Steven Executive
Lex Steven Executive
Den Steven Executive
Matthew Steven Executive
Adam Steven Executive
Payam Steven Executive
Shanta Steven Executive
Kevin Steven Executive
John Steven Executive
Karen Steven Executive
Alberto Steven Executive
Gerald Steven Executive
Eleni Steven Executive
Michael Steven Executive
14 rows selected.
The department is “Executive” for all rows, and a user may be tempted to draw inferences about connections that the query is not meant to discover. As a note, the manager “Steven” could be the same person, or there could be two employees named “Steven”, each of whom works in the “Executive” department. For a user to discover additional connections if any, a user may need to run a different query.
The preceding graph query does not imply that all the employees listed work in the “Executive” department, as one might infer at first. The “Executive” department is where “Steven” works at. To discover all employees who work for manager “Steven” at the same department that “Steven” works for, the query would be the following.
SELECT employee, department
FROM GRAPH_TABLE(
hr
MATCH (d IS Department) <-[IS works_at]- (e1 IS Employee) -[IS works_for]-> (e2 IS Employee) -[IS works_at]-> (d)
WHERE e2.first_name = 'Steven'
COLUMNS(e1.first_name AS employee, d.department_name AS department)
);
The output lists only two employees.
EMPLOYEE DEPARTMENT
-------------------- ------------------------------
Neena Executive
Lex Executive
The same query as the preceding one can be designed differently as follows, this time including the manager’s name in the output.
SELECT employee, manager, department
FROM GRAPH_TABLE(
hr
MATCH (e1 IS Employee)-[IS works_for]-> (e2 IS Employee),
(e1)-[IS works_at]->(d IS Department),(e2)-[IS works_at]->(d)
WHERE e2.first_name = 'Steven'
COLUMNS(e1.first_name AS employee,e2.first_name as manager,d.department_name AS department)
);
The output; however, is the same with one additional column.
EMPLOYEE MANAGER DEPARTMENT
-------------------- -------------------- ------------------------------
Neena Steven Executive
Lex Steven Executive
Our first example in this article was querying all employee ids along with the corresponding manager ids. How would you discover all employee names along with each employee’s department, the manager each employee works for along with the manager’s department? As follows:
SELECT employee, employee_department, manager, manager_department
FROM GRAPH_TABLE(
hr
MATCH (a IS Employee)-[IS works_for]->(b is Employee)-[IS works_at]->(c is Department), (a)-[IS works_at]->(d is Department)
WHERE b.first_name = 'Steven'
COLUMNS( a.first_name as employee, d.department_name as employee_department, b.first_name as manager, c.department_name as manager_department)
);
EMPLOYEE EMPLOYEE_DEPARTMENT MANAGER
-------------------- ------------------------------ --------------------
MANAGER_DEPARTMENT
------------------------------
Michael Marketing Steven
Executive
Den Purchasing Steven
Executive
Matthew Shipping Steven
Executive
Adam Shipping Steven
Executive
Payam Shipping Steven
Executive
Shanta Shipping Steven
Executive
Kevin Shipping Steven
Executive
John Sales Steven
Executive
Karen Sales Steven
Executive
Alberto Sales Steven
Executive
Gerald Sales Steven
Executive
Eleni Sales Steven
Executive
Neena Executive Steven
Executive
Lex Executive Steven
Executive
14 rows selected.
These are only a few of the connections that one might discover in the sample schema.
The new SQL Property Graph support in Oracle Database 23ai simplifies the task of discovering direct and indirect connections between data points for data stored in a relational database table. What is remarkable is that a user does not need to change the design of the tables to use them with the Property Graph Query Language (PGQL); existing tables can be used. A user does not need to install any additional packages or extensions to use the new feature, as we discovered in this article.
Previous Post