Oracle Big Data Spatial and Graph - technical tips, best practices, and news from the product team

Using PGQL in Python

Alan Wu
Architect

Yesterday I got a question on how to run and consume PGQL in Python, so I decided to write a short blog post about it. Find below a complete example of executing a PGQL query, iterating through its result set, and generating a bar chart. Note that I am using BDSG v2.4 and Python 2.7. The same flow applies to Oracle Spatial and Graph (other than the graph configuration part).

Start the pyopg shell script. You can also use Jupyter if you want.

$ sh ./pyopg.sh

    ____        ____  ____  ______
   / __ \__  __/ __ \/ __ \/ ____/
  / /_/ / / / / / / / /_/ / / __
 / ____/ /_/ / /_/ / ____/ /_/ /
/_/    \__, /\____/_/    \____/
      /____/

Oracle Big Data Spatial and Graph Property Graph Python Shell ...
Context available as opg
Class loading done
>>> >>>

# JClass and JPackage come from JPype and are already available in the pyopg shell
gcb = JClass('oracle.pgx.config.GraphConfigBuilder')
pgx_types = JPackage('oracle.pgx.common.types')

# Host and port of the Oracle NoSQL Database store that holds the graph
server = JClass('java.util.ArrayList')()
server.add("bigdatalite:5000")

cfg = (gcb.forPropertyGraphNosql()
       .setName("connections").setStoreName("kvstore").setHosts(server)
       .hasEdgeLabel(True).setLoadEdgeLabel(True)
       .addVertexProperty("name", pgx_types.PropertyType.STRING, "empty name")
       .setMaxNumConnections(2).build())

pgx_param=JClass("java.util.HashMap")()
instance=JClass("oracle.pgx.api.Pgx").getInstance()
if not instance.isEngineRunning():
  instance.startEngine(pgx_param)

session=instance.createSession("mysession1")
pgxGraph = session.readGraphWithProperties(cfg)

pgxResultSet = pgxGraph.queryPgql("SELECT n,m WHERE (n) -> (m)")
it=pgxResultSet.getResults().iterator()

while it.hasNext():
  element = it.next()
  print element.toString()
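
The loop above drives the Java iterator by hand with hasNext() and next(). Here is a minimal sketch of wrapping that pattern in a Python generator, so the results can be consumed with a plain for loop or a list comprehension. The names java_iter and FakeJavaIterator are hypothetical; FakeJavaIterator only stands in for the object returned by getResults().iterator() so that the sketch runs without PGX.

```python
def java_iter(it):
    """Yield items from any object exposing Java-style hasNext()/next()."""
    while it.hasNext():
        yield it.next()


class FakeJavaIterator(object):
    """Hypothetical stand-in for a java.util.Iterator; not part of PGX."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def hasNext(self):
        return self._pos < len(self._items)

    def next(self):
        item = self._items[self._pos]
        self._pos += 1
        return item


# Collect every row into a Python list
rows = list(java_iter(FakeJavaIterator(["row-1", "row-2", "row-3"])))
```

In the shell, the same helper would presumably be called as java_iter(pgxResultSet.getResults().iterator()).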



The output may look like this:

...
n(VERTEX)=56 m(VERTEX)=58
n(VERTEX)=56 m(VERTEX)=57
n(VERTEX)=56 m(VERTEX)=5
n(VERTEX)=59 m(VERTEX)=60
n(VERTEX)=67 m(VERTEX)=37
n(VERTEX)=67 m(VERTEX)=73
n(VERTEX)=67 m(VERTEX)=72
...
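
Each printed row has a regular shape, so it can be parsed back into vertex IDs if the text is all you have. The sketch below does that for the seven rows visible in the excerpt above and counts outgoing edges per source vertex. parse_row and the regular expression are illustrative assumptions based only on the format shown here, not a PGX API, and the counts cover only the visible rows.

```python
import re
from collections import Counter

# Pattern assumed from the printed rows, e.g. "n(VERTEX)=56 m(VERTEX)=58"
ROW_PATTERN = re.compile(r"n\(VERTEX\)=(\d+)\s+m\(VERTEX\)=(\d+)")


def parse_row(line):
    """Return (source_id, destination_id), or None if the line does not match."""
    match = ROW_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The rows visible in the excerpt; the elided rows are not reconstructed
sample_output = """\
n(VERTEX)=56 m(VERTEX)=58
n(VERTEX)=56 m(VERTEX)=57
n(VERTEX)=56 m(VERTEX)=5
n(VERTEX)=59 m(VERTEX)=60
n(VERTEX)=67 m(VERTEX)=37
n(VERTEX)=67 m(VERTEX)=73
n(VERTEX)=67 m(VERTEX)=72
"""

parsed = (parse_row(line) for line in sample_output.splitlines())
edges = [edge for edge in parsed if edge is not None]

# Number of outgoing edges per source vertex among the parsed rows
out_degree = Counter(src for src, _ in edges)
```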

While command-line output is useful, we can also do a bit of charting with pyplot. The query itself is a simple aggregation: it groups the matches by the source vertex's name property and counts the destination vertices in each group.

pgxResultSet = pgxGraph.queryPgql("SELECT n.name,count(m) as size WHERE (n) -> (m) group by n.name limit 10")
it=pgxResultSet.getResults().iterator()

graph_communities = [{"name":i.get(0),"size":int(i.get(1).toString())} for i in it]

import pandas as pd
import numpy as np

community_frame = pd.DataFrame(graph_communities)
community_frame[:5]

import matplotlib as mpl
import matplotlib.pyplot as plt

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(16,22))
# Pass ax so the bars are drawn on the figure created above
community_frame.plot.bar(x='name', y='size', rot=0, ax=ax)
plt.show()
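
The query uses LIMIT 10 but requests no ordering, so the bars appear in whatever order the rows come back. If a ranked chart is preferred, one option is to sort the records in Python before building the DataFrame. Here is a minimal sketch with made-up records (the names and sizes are invented for illustration, not results from the graph):

```python
# Invented records in the same shape as graph_communities
records = [
    {"name": "alpha", "size": 3},
    {"name": "beta", "size": 7},
    {"name": "gamma", "size": 5},
]

# Largest groups first
ranked = sorted(records, key=lambda r: r["size"], reverse=True)
names_in_order = [r["name"] for r in ranked]
```

The sorted list can then be passed to pd.DataFrame in place of graph_communities.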

Cheers,

Zhe
