This is a guest blog by Oracle ACE Pro Deepak Vohra
Property Graphs and the concomitant Property Graph Query Language (PGQL) provide a new opportunity to use the Oracle Database as if it were a Graph Database. The syntax of PGQL is very close to the one of Property Graph Queries (SQL/PGQ), the new standard specified in the ISO/IEC 9075-16:2023 Information technology — Database languages SQL Part 16. Both define an operator called GRAPH_TABLE and a new construct called MATCH to discover connections in a SQL property graph. PGQL as opposed to SQL/PGQ is its own Property Graph Query Language, and since PGQL has been there for quite some time, it has more features than SQL/PGQ and its underlying standard specification.
What is a Property Graph
Oracle Database 23ai introduces SQL property graphs to represent relational table data as a graph. Property graphs are a way to discover connections between data points which otherwise could involve prohibitive SQL joins to discover, if at all discoverable.
Property Graph Data Model
In graph theory a directed graph is made of a set of vertices, and a set of directed edges. An edge is a connection between two vertices. A directed edge has one of its endpoints as the source vertex and the other endpoint as the destination vertex. A directed graph can be described by G = (V, A).
-
V is a set of vertices
-
A is a set of edges
A single edge consists of an ordered pair of vertices (v1 , v2), the direction of the edge being from v1 to v2. Every edge in a directed graph is directed.
A property graph data model is a named directed graph with the following description:
- A set of vertices with each vertex having zero or more labels, and zero or more properties/attributes.
- A set of edges with each edge being directed from one vertex to another vertex. Each edge has zero or more labels and zero or more properties/attributes.
Labels are strings. A property/attribute is a key-value pair with property of type string and value of a scalar type (numeric, string, boolean). In Oracle’s implementation for 23ai, properties can also be non-scalar such as a JSON column. Typically vertices represent entities such as an employee, and edges represent relations between the entities.
SQL/PGQ
SQL/PGQ is the extension to SQL for querying property graphs. You can query a SQL property graph using the GRAPH_TABLE operator to express graph pattern matching queries. The following parameters need to be defined for the GRAPH_TABLE operator:
- The name of the graph is the first parameter.
- The MATCH clause expresses the graph element patterns (vertex or edge pattern) to be searched on the SQL property graph.
- The WHERE clause is optional and defines one or predicates that apply to the entire pattern in the MATCH clause.
- The COLUMNS clause defines the columns that are passed to the outer SQL query.
- Regular expressions can be included in the outer SQL statement, either in the WHERE or the SELECT clauses.
Prerequisite Setup
We use an Oracle Database 23ai instance created from a Docker image. To start with, pull the Docker image for Oracle Database 23 Free Edition, latest.
sudo docker pull container-registry.oracle.com/database/free:latest latest: Pulling from database/free 6d6e36f7c9fb: Pull complete 21def9023b6f: Pull complete 5e7b2cfeb7fa: Pull complete b4a24759beff: Pull complete 78bba54e9814: Pull complete 716b489ad5ad: Pull complete c23fd8c6cbee: Pull complete 79dea26b3a5a: Pull complete 5dfbcf799df3: Pull complete 154719a62576: Pull complete Digest: sha256:83edd0756fda0e5faecc0fdf047814f0177d4224d7bf037e4900123ee3e08718 Status: Downloaded newer image for container-registry.oracle.com/database/free:latest container-registry.oracle.com/database/free:latest
Run a Docker container from the Docker image with the following docker run command.
sudo docker run --name 23ai \ -p 1521:1521 \ -e ORACLE_PWD=orcl23ai \ -e ENABLE_ARCHIVELOG=true \ -e ENABLE_FORCE_LOGGING=true \ -v /opt/oracle/oradata \ container-registry.oracle.com/database/free:latest
Oracle Database 23ai Free Release 23.0.0.0.0 Version 23.4.0.24.05 is started and is ready to use, as partial output from the command shows.
sudo docker run --name 23ai \ > -p 1521:1521 \ > -e ORACLE_PWD=orcl23ai \ > -e ENABLE_ARCHIVELOG=true \ > -e ENABLE_FORCE_LOGGING=true \ > -v /opt/oracle/oradata \ > container-registry.oracle.com/database/free:latest Starting Oracle Net Listener. Oracle Net Listener started. Starting Oracle Database instance FREE. Oracle Database instance FREE started.
The Oracle base remains unchanged with value /opt/oracle
SQL*Plus: Release 23.0.0.0.0 - Production on Sun Jun 9 21:37:26 2024 Version 23.4.0.24.05 … … ######################### DATABASE IS READY TO USE! ######################### Completed: Pluggable database FREEPDB1 opened read write
Installing Sample Schemas and Tables
Download the sample schemas with the following wget command.
wget https://github.com/oracle-samples/db-sample-schemas/archive/refs/tags/v23.3.zip
Extract the contents of the v23.3.zip file downloaded.
unzip v23.3.zip
The sample schemas get extracted to the db-sample-schemas-23.3 directory.
cd db-sample-schemas-23.3 ls -l drwxrwxr-x 2 4096 Mar 28 12:32 customer_orders drwxrwxr-x 2 4096 Mar 28 12:32 human_resources -rw-rw-r-- 1 1094 Mar 28 12:32 LICENSE.txt drwxrwxr-x 3 4096 Mar 28 12:32 order_entry drwxrwxr-x 2 4096 Mar 28 12:32 product_media -rw-rw-r-- 1 3784 Mar 28 12:32 README.md -rw-rw-r-- 1 3613 Mar 28 12:32 README.txt drwxrwxr-x 2 4096 Mar 28 12:32 sales_history
Each set of sample schemas includes SQL scripts to create, install, and populate the schema tables. We use the human_resources schema.
cd human_resources ls -l -rw-rw-r-- 1 3849 Mar 28 12:32 hr_code.sql -rw-rw-r-- 1 17112 Mar 28 12:32 hr_create.sql -rw-rw-r-- 1 8275 Mar 28 12:32 hr_install.sql -rw-rw-r-- 1 41301 Mar 28 12:32 hr_populate.sql -rw-rw-r-- 1 2920 Mar 28 12:32 hr_uninstall.sql -rw-rw-r-- 1 2642 Mar 28 12:32 README.md -rw-rw-r-- 1 2666 Mar 28 12:32 README.txt
Run the scripts to create and populate the sample tables.
SQL> desc employees; Name Null? Type ----------------------------------------- -------- ---------------------------- EMPLOYEE_ID NOT NULL NUMBER(6) FIRST_NAME VARCHAR2(20) LAST_NAME NOT NULL VARCHAR2(25) EMAIL NOT NULL VARCHAR2(25) PHONE_NUMBER VARCHAR2(20) HIRE_DATE NOT NULL DATE JOB_ID NOT NULL VARCHAR2(10) SALARY NUMBER(8,2) COMMISSION_PCT NUMBER(2,2) MANAGER_ID NUMBER(6) DEPARTMENT_ID NUMBER(4)
SQL> desc departments; Name Null? Type ----------------------------------------- -------- ---------------------------- DEPARTMENT_ID NOT NULL NUMBER(4) DEPARTMENT_NAME NOT NULL VARCHAR2(30) MANAGER_ID NUMBER(6) LOCATION_ID NUMBER(4) SQL> desc job_history; Name Null? Type ----------------------------------------- -------- ---------------------------- EMPLOYEE_ID NOT NULL NUMBER(6) START_DATE NOT NULL DATE END_DATE NOT NULL DATE JOB_ID NOT NULL VARCHAR2(10) DEPARTMENT_ID NUMBER(4)
SQL> desc jobs; Name Null? Type ----------------------------------------- -------- ---------------------------- JOB_ID NOT NULL VARCHAR2(10) JOB_TITLE NOT NULL VARCHAR2(35) MIN_SALARY NUMBER(6) MAX_SALARY NUMBER(6) SQL> desc regions; Name Null? Type ----------------------------------------- -------- ---------------------------- REGION_ID NOT NULL NUMBER REGION_NAME VARCHAR2(25)
SQL> desc locations; Name Null? Type ----------------------------------------- -------- ---------------------------- LOCATION_ID NOT NULL NUMBER(4) STREET_ADDRESS VARCHAR2(40) POSTAL_CODE VARCHAR2(12) CITY NOT NULL VARCHAR2(30) STATE_PROVINCE VARCHAR2(25) COUNTRY_ID CHAR(2) SQL> desc countries; Name Null? Type ----------------------------------------- -------- ---------------------------- COUNTRY_ID NOT NULL CHAR(2) COUNTRY_NAME VARCHAR2(60) REGION_ID NUMBER
SQL>
Creating a Property Graph
SQL DDL can be used to create a property graph using the new SQL statement CREATE PROPERTY GRAPH [IF NOT EXISTS]. A property graph is an abstract data structure consisting of two or more vertices and one or more edges.
A property graph definition consists of the following components:
-
A schema.graph_name
-
The VERTEX TABLES clause
-
The EDGE TABLES clause
An edge table definition consists of a graph element name and key pair that is directed from a SOURCE vertex table reference to a DESTINATION vertex table reference.
Create an example property graph from the HR schema tables using the CREATE PROPERTY GRAPH statement. You can use a SQL client of your choice (e.g.; SQL Developer, the SQL Developer extension in Visual Studio, SQLcl, or others) to execute the following statements:
CREATE PROPERTY GRAPH hr VERTEX TABLES ( employees LABEL employee PROPERTIES ARE ALL COLUMNS EXCEPT ( job_id, manager_id, department_id ), departments LABEL department PROPERTIES ( department_id, department_name ), jobs LABEL job PROPERTIES ARE ALL COLUMNS, job_history PROPERTIES ( start_date, end_date ), locations LABEL location PROPERTIES ARE ALL COLUMNS EXCEPT ( country_id ), countries LABEL country PROPERTIES ARE ALL COLUMNS EXCEPT ( region_id ), regions LABEL region ) EDGE TABLES ( employees AS works_for SOURCE KEY ( employee_id ) REFERENCES employees ( employee_id ) DESTINATION KEY ( manager_id ) REFERENCES employees ( employee_id ) NO PROPERTIES, employees AS works_at SOURCE KEY ( employee_id ) REFERENCES employees ( employee_id ) DESTINATION departments NO PROPERTIES, employees AS works_as SOURCE KEY ( employee_id ) REFERENCES employees ( employee_id ) DESTINATION jobs NO PROPERTIES, departments AS managed_by SOURCE KEY ( department_id ) REFERENCES departments ( department_id ) DESTINATION employees NO PROPERTIES, job_history AS for_employee SOURCE KEY ( employee_id, start_date ) REFERENCES job_history ( employee_id, start_date) DESTINATION employees NO PROPERTIES, job_history AS for_department SOURCE KEY ( employee_id, start_date ) REFERENCES job_history ( employee_id, start_date) DESTINATION departments NO PROPERTIES, job_history AS for_job SOURCE KEY ( employee_id, start_date ) REFERENCES job_history ( employee_id, start_date) DESTINATION jobs NO PROPERTIES, departments AS department_located_in SOURCE KEY ( department_id ) REFERENCES departments ( department_id ) DESTINATION locations LABEL located_in NO PROPERTIES, locations AS location_located_in SOURCE KEY ( location_id ) REFERENCES locations ( location_id ) DESTINATION countries LABEL located_in NO PROPERTIES, countries AS country_located_in SOURCE KEY ( country_id ) REFERENCES countries ( country_id ) DESTINATION regions LABEL located_in NO PROPERTIES )
Querying the SQL Property Graph
Query the property graph using the GRAPH_TABLE operator.
Starting with a simple example, search for employees and the manager each employee works for with the following SQL/PGQquery.
SELECT a, b FROM GRAPH_TABLE( hr MATCH (a IS Employee)-[e is works_for]->(b is Employee) COLUMNS(a.employee_id as a, b.employee_id as b) );
The output lists the EMPLOYEE_IDs and the ID of the managing employee each one works for. Keep in mind that the person each employee works for is also an employee of the same organization.
SELECT a, b FROM GRAPH_TABLE( hr MATCH (a IS Employee)-[e is works_for]->(b is Employee) COLUMNS(a.employee_id as a, b.employee_id as b) );
A B ---------- ---------- 101 100 102 100 103 102 104 103 105 103 106 103 107 103 108 101 109 108 110 108 111 108 … 205 101 206 205
106 rows selected.
Take the first example further by discovering an additional connection to the department (id) of the manager that each employee works for.
SELECT a, b, c FROM GRAPH_TABLE( hr MATCH (a IS Employee)-[works_for]->(b is Employee)-[works_at]->(c is Department) COLUMNS(a.employee_id as a, b.employee_id as b, c.department_id as c) ); A B C ---------- ---------- ---------- 101 100 90 102 100 90 103 102 90 104 103 60 105 103 60 106 103 60 107 103 60 108 101 90 109 108 100 110 108 100 111 108 100 … 205 101 90 206 205 110
106 rows selected.
The preceding SQL/PGQ query lists the employee IDs, manager IDs, and department IDs. We can very well list the employee names, manager names, and department names discovered by the same connections in the graph with the following SQL/PGQ query.
SELECT employee, manager, department FROM GRAPH_TABLE( hr MATCH (a IS Employee)-[works_for]->(b is Employee)-[works_at]->(c is Department) COLUMNS(a.first_name as employee, b.first_name as manager, c.department_name as department) );
This time the output lists the employee name, manager name, and department name.;
EMPLOYEE MANAGER DEPARTMENT -------------------- -------------------- ------------------------------ Neena Steven Executive Lex Steven Executive Alexander Lex Executive Bruce Alexander IT David Alexander IT Valli Alexander IT Diana Alexander IT Nancy Neena Executive Daniel Nancy Finance John Nancy Finance Ismael Nancy Finance … Shelley Neena Executive William Shelley Accounting
106 rows selected.
With the numerous tables in the schema, we can discover complex connections. As an example, list each employee’s id along with the employee’s department name, the location id of the department, the country name of the location, and the region name of the country with the query:
SELECT employee,department,location,country,region FROM GRAPH_TABLE( hr MATCH (employee IS Employee)-[works_at]->(department is Department)-[department_located_in]->(location is Location)-[location_located_in]->(country is Country)-[country_located_in]->(region is Region) COLUMNS(employee.employee_id as employee, department.department_name as department, location.location_id as location, country.country_name as country,region.region_name as region) );
The query discovers the graph connections and lists them in a table output that is similar to the output for any other query.
EMPLOYEE DEPARTMENT LOCATION ---------- ------------------------------ ---------- COUNTRY ------------------------------------------------------------ REGION ------------------------- 100 Executive 1700 United States of America Americas 101 Executive 1700 United States of America Americas … … 206 Accounting 1700 United States of America Americas
106 rows selected.
In the following example, use the WHERE clause to filter the manager name. Select all employees who work for manager Steven, along with listing the manager’s department name.
SELECT employee, manager, department FROM GRAPH_TABLE( hr MATCH (a IS Employee)-[works_for]->(b is Employee)-[works_at]->(c is Department) WHERE b.first_name = 'Steven' COLUMNS(a.first_name as employee, b.first_name as manager, c.department_name as department) );
As expected, the output includes only the employees who work for “Steven”; the MANAGER r column has “Steven” for all rows.
EMPLOYEE MANAGER DEPARTMENT -------------------- -------------------- ------------------------------ Neena Steven Executive Lex Steven Executive Den Steven Executive Matthew Steven Executive Adam Steven Executive Payam Steven Executive Shanta Steven Executive Kevin Steven Executive John Steven Executive Karen Steven Executive Alberto Steven Executive Gerald Steven Executive Eleni Steven Executive Michael Steven Executive
14 rows selected.
The department is “Executive” for all rows, and a user may be tempted to draw inferences about connections that the query is not meant to discover. As a note, the manager “Steven” could be the same person, or there could be two employees named “Steven”, each of whom works in the “Executive” department. For a user to discover additional connections if any, a user may need to run a different query.
The preceding graph query does not imply that all the employees listed work in the “Executive” department, as one might infer at first. The “Executive” department is where “Steven” works at. To discover all employees who work for manager “Steven” at the same department that “Steven” works for, the query would be the following.
SELECT employee, department FROM GRAPH_TABLE( hr MATCH (d IS Department) <-[IS works_at]- (e1 IS Employee) -[IS works_for]-> (e2 IS Employee) -[IS works_at]-> (d) WHERE e2.first_name = 'Steven' COLUMNS(e1.first_name AS employee, d.department_name AS department) ); The output lists only two employees.
EMPLOYEE DEPARTMENT -------------------- ------------------------------ Neena Executive Lex Executive
The same query as the preceding one can be designed differently as follows, this time including the manager’s name in the output.
SELECT employee, manager, department FROM GRAPH_TABLE( hr MATCH (e1 IS Employee)-[IS works_for]-> (e2 IS Employee), (e1)-[IS works_at]->(d IS Department),(e2)-[IS works_at]->(d) WHERE e2.first_name = 'Steven' COLUMNS(e1.first_name AS employee,e2.first_name as manager,d.department_name AS department) );
The output; however, is the same with one additional column.
EMPLOYEE MANAGER DEPARTMENT -------------------- -------------------- ------------------------------ Neena Steven Executive Lex Steven Executive
Our first example in this article was querying all employee ids along with the corresponding manager ids. How would you discover all employee names along with each employee’s department, the manager each employee works for along with the manager’s department? As follows:
SELECT employee, employee_department, manager, manager_department FROM GRAPH_TABLE( hr MATCH (a IS Employee)-[IS works_for]->(b is Employee)-[IS works_at]->(c is Department), (a)-[IS works_at]->(d is Department) WHERE b.first_name = 'Steven' COLUMNS( a.first_name as employee, d.department_name as employee_department, b.first_name as manager, c.department_name as manager_department) ); EMPLOYEE EMPLOYEE_DEPARTMENT MANAGER -------------------- ------------------------------ -------------------- MANAGER_DEPARTMENT ------------------------------ Michael Marketing Steven Executive Den Purchasing Steven Executive Matthew Shipping Steven Executive
Adam Shipping Steven Executive Payam Shipping Steven Executive Shanta Shipping Steven Executive Kevin Shipping Steven Executive John Sales Steven Executive Karen Sales Steven Executive
Alberto Sales Steven Executive Gerald Sales Steven Executive Eleni Sales Steven Executive Neena Executive Steven Executive Lex Executive Steven Executive
14 rows selected.
These are only a few of the connections that one might discover in the sample schema.
Conclusion
The new SQL Property Graph support in Oracle Database 23ai simplifies the task of discovering direct and indirect connections between data points for data stored in a relational database table. What is remarkable is that a user does not need to change the design of the tables to use them with the Property Graph Query Language (PGQL); existing tables can be used. A user does not need to install any additional packages or extensions to use the new feature, as we discovered in this article.