SQL Property Graph Queries (SQL/PGQ) is a new addition to the ISO/IEC SQL Standard that lets you create and query property graphs. It is Part 16 of the latest version of the SQL Standard, SQL:2023, which was published in June 2023. See Announcing the General Availability of the SQL:2023 Standard. SQL/PGQ has been made available in the Oracle Database with the release of Oracle Database 23ai.

Oracle is committed to building standards-based products and property graphs in SQL is another testament of this. Oracle not only spearheaded the standardization effort but also introduced the first commercially available SQL/PGQ implementation as part of Oracle Database 23ai. Property graphs are deeply integrated into SQL and form an integral component of Oracle’s converged database.

Why have Property Graphs in SQL?

Use cases for property graphs are numerous. They include fraud detection (for example, fraud ring detection), social network analysis (for example, community detection), supply chain and logistics (finding shortest paths, or network disruption impact analysis), telecommunications (customer churn analysis), healthcare (patient journey analysis) and cybersecurity (attack path analysis). Many of these use cases require traversing a variable number of links or connections, essentially a variable number of joins in a relational database. Recursive WITH and CONNECT BY in Oracle Database were already capable of doing so, but native graph support in SQL allows for all kinds of graph queries to be expressed very directly and intuitively. It also makes the user’s intent more precisely known to the system, enabling graph-specific evaluation strategies and performance optimizations.

Other languages for querying graphs exist. ISO/IEC GQL and W3C SPARQL are standards for property graphs and RDF graphs respectively, while Oracle PGQL, Neo4j Cypher, TigerGraph GSQL and Apache TinkerPop Gremlin are proprietary property graph query languages (also see Querying Graphs with SQL and PGQL: What is the difference?). What sets SQL apart from these graph-specific languages is its tight integration with the relational model: property graphs in SQL are view-like objects on top of existing tables. Therefore, since lots of data is already stored in tables, there is no need to replicate any data. Existing data can remain securely stored in one place but can now be queried as a property graph too. Moreover, property graph queries always execute on the most up-to-date data. Furthermore, the same data can be used as relational data by one component in an application and as graph data by another component. Oracle’s converged database fully embraces this multi-model design, seamlessly integrating property graphs into SQL and the rest of the database ecosystem. Not only can property graph queries be combined with relational queries, but they can also be combined with JSON queries, vector search, spatial queries, and any other feature in the database.

SQL has a vast ecosystem of developers and tools, and SQL/PGQ instantly puts the ability to create and query property graphs into the hands of SQL developers. Furthermore, they can use their favorite tool to work with property graphs, as any tool that supports SQL can be used to work with property graphs. Property graphs can move from being a specialized technology requiring the knowledge of a special-purpose language to a technology that can be widely integrated into applications.

Integration of Property Graphs into SQL

Property graphs are seamlessly integrated into SQL:

  • Property graphs reside within an SQL schema alongside tables, views and other schema objects. Their definitions are stored in the data dictionary together with definitions of other types of objects. New dictionary views provide access to this property graph metadata.
  • Property graphs are part of SQL’s security model with permissions managed via familiar GRANT and REVOKE statements.
  • Data types for vertex and edge properties can be any existing SQL data type, including new JSON and VECTOR types in Oracle Database 23ai.
  • GRAPH_TABLE is a new operator in the FROM clause which executes the graph query (pattern) against a given graph and returns matches in tabular form for further processing with regular SQL.
  • WHERE and COLUMNS clauses inside GRAPH_TABLE use the same operators, functions and predicates as are available elsewhere in SQL.
  • Simple JSON dot-notation syntax is available for vertex and edge properties of JSON type.
  • Graphs are queried in a transactionally consistent manner when updates occur to underlying tables.
  • In Oracle Database 23ai, the AS OF clause can be placed after a graph name to query a graph at a particular timestamp or SCN.
  • All existing SQL functionality can be used in combination with graphs: GROUP BY, row pattern matching, window/analytics functions, etc.
  • Data from tables, views and graphs can be joined together within a single query.

Example CREATE PROPERTY GRAPH statement and GRAPH_TABLE query

The following statement creates a property graph named bank_graph on top of three existing tables: persons, accounts and transfers. You can find the CREATE TABLE and INSERT statements for these tables at the end of this blog post.

CREATE PROPERTY GRAPH bank_graph
  VERTEX TABLES (
    persons AS person,
    accounts AS account
  )
  EDGE TABLES (
    transfers AS transfer
      SOURCE KEY (from_account) REFERENCES account (account_number)
      DESTINATION KEY (to_account) REFERENCES account (account_number)
      PROPERTIES (transfer_id, time, amount),
    accounts AS owner
      SOURCE KEY (account_number) REFERENCES account (account_number)
      DESTINATION person
      NO PROPERTIES
  );

The graph has two vertex tables named person and account, and two edge tables named transfer and owner. Each row in a vertex table becomes a vertex in the graph and each row in an edge table becomes an edge in the graph. Vertex and edge tables have one or more labels, and each label can have any number of properties. If no labels are specified then the default applies, which is that vertices and edges have single labels corresponding to element table names, as in this example (person, account, transfer, owner). Also, in this example, properties are specified for edges while the default applies for vertices, which is that the properties are all the columns. If primary keys are defined on the underlying tables, they can be used to automatically infer keys for vertices and edges. And if foreign keys are defined, they can be used to automatically infer the source and destination of edges. In the example, vertex and edge keys are inferred from primary keys and the keys for the destination of edge table owner are inferred from an existing foreign key, hence the omission of the REFERENCES clause for that edge table.

Once created, a property graph can be queried via the new GRAPH_TABLE operator that is placed in the FROM clause of an SQL query. GRAPH_TABLE takes a graph as input, performs graph pattern matching, and returns a table as output. For instance, the following query finds out who person Camille transferred money to, either directly or indirectly via an intermediary:

SELECT *
FROM GRAPH_TABLE ( bank_graph
       MATCH (p1 IS person) <-[IS owner]- (a1 IS account),
             (a1) -[e IS transfer]->{1,2} (a2 IS account),
             (a2) -[IS owner]-> (p2 IS person)
       WHERE p1.first_name = 'Camille'
       COLUMNS (p2.first_name, p2.last_name,
                p2.more_data.address.address_line_1.string() AS street,
                COUNT(e.transfer_id) AS path_length,
                JSON_ARRAYAGG(e.amount) AS amounts))
ORDER BY path_length, amounts;

For each match to the graph pattern, the query returns:

  • The first name, last name and street address of the person who Camille transferred money to, either directly or indirectly via an intermediary.
  • The number of transfers along the transfer path to that person: 1 for direct transfers and 2 for indirect transfers via an intermediary.
  • The transaction amounts along the transfer path, as a JSON array.

The result is:

FIRST_NAME LAST_NAME  STREET                     PATH_LENGTH AMOUNTS
---------- ---------- -------------------------- ----------- -------------
Nikita     Ivanov     1 Boolean Boulevard                  1 [1000]
Jake       Zimmerman  500 Property Graph Parkway           2 [1000,1500.3]
Jake       Zimmerman  500 Property Graph Parkway           2 [1000,3000]

Schema flexibility through JSON

Traditionally, graph databases are schemaless, whereas SQL databases like Oracle Database are inherently schema driven. However, SQL offers flexible schema capabilities for graphs through JSON (and XML):

  • Vertex and edge properties can be of type JSON. This is a new data type based on OSON, an optimized native binary storage format for JSON.
  • Simple JSON dot-notation syntax is available for vertex and edge properties of JSON type. The following example retrieves a number (zip_code) from a nested JSON object and compares it against an Oracle NUMBER: WHERE vertex.json_property.address.zip_code = 94065.
  • Oracle Database 23ai supports JSON Schema definitions on JSON columns to validate the structure and contents of JSON documents. By default, only the minimum requirements are validated such that things remain extensible unless otherwise specified. Property graphs can be created on top of tables with columns that have such attached JSON schemas.

By combining graph and JSON, SQL offers the best of both worlds when it comes to type safety and flexibility. Property graphs are inherently schema driven, ensuring type safety by default — for instance, a typo in a label or a property name triggers an error rather than returning an empty result. Meanwhile, JSON provides flexibility and dynamic typing for property graphs when needed.

Documentation for SQL property graphs in Oracle Database 23ai

Property graph support in Oracle Database 23ai is documented in the following places:

Example CREATE TABLE and INSERT statements

The following statements create the necessary data to run the above example:

CREATE TABLE persons (
  person_id NUMBER PRIMARY KEY,
  first_name VARCHAR2(200),
  last_name VARCHAR2(200),
  more_data JSON
);

INSERT INTO persons VALUES (
  1, 'Camille', 'Nichols', '{"address":{"address_line_1":"101 Ellison Avenue",
                                        "zip_code":10001,
                                        "city":"Larry Islands",
                                        "state":"AL"}}');
INSERT INTO persons VALUES (
  2, 'Jake', 'Zimmerman', '{"address":{"address_line_1":"500 Property Graph Parkway",
                                       "zip_code":75002,
                                       "city":"Oracle City",
                                       "state":"AK"}}');
INSERT INTO persons VALUES (
  3, 'Nikita', 'Ivanov', '{"address":{"address_line_1":"1 Boolean Boulevard",
                                      "zip_code":48201,
                                      "city":"Vectorville",
                                      "state":"AZ"}}');
INSERT INTO persons VALUES (
  4, 'Liam', 'O''Connor', '{"address":{"address_line_1":"16 SQL Street",
                                       "zip_code":9075,
                                       "city":"Json Town",
                                       "state":"CA"}}');

CREATE TABLE accounts (
  account_number NUMBER PRIMARY KEY,
  owner_id NUMBER,
  creation_date TIMESTAMP,
  FOREIGN KEY (owner_id) REFERENCES persons (person_id)
);

INSERT INTO accounts VALUES (1001, 2, TIMESTAMP '2000-01-01 14:31:00');
INSERT INTO accounts VALUES (2090, 4, TIMESTAMP '2004-12-15 08:15:00');
INSERT INTO accounts VALUES (8021, 3, TIMESTAMP '2005-03-20 10:45:00');
INSERT INTO accounts VALUES (10039, 1, TIMESTAMP '2020-12-15 14:17:00');

CREATE TABLE transfers (
  transfer_id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  from_account NUMBER,
  to_account NUMBER,
  time TIMESTAMP,
  amount NUMBER(10,2)
);

INSERT INTO transfers (from_account, to_account, time, amount)
  VALUES (10039, 8021, TIMESTAMP '2024-03-12 14:30:00', 1000.00);
INSERT INTO transfers (from_account, to_account, time, amount)
  VALUES (8021, 1001, TIMESTAMP '2024-03-12 16:30:00', 1500.30);
INSERT INTO transfers (from_account, to_account, time, amount)
  VALUES (8021, 1001, TIMESTAMP '2024-03-12 16:31:00', 3000.00);
INSERT INTO transfers (from_account, to_account, time, amount)
  VALUES (1001, 2090, TIMESTAMP '2024-03-12 19:03:00', 9999.50);
INSERT INTO transfers (from_account, to_account, time, amount)
  VALUES (2090, 10039, TIMESTAMP '2024-03-12 19:45:00', 9900.00);