
Oracle Big Data Spatial and Graph - technical tips, best practices, and news from the product team

  • November 15, 2017

Sanity Checking Property Graph Functions After Upgrade

Alan Wu
Architect

Recently I worked with Oracle Support to resolve a property graph issue. One question that came up was: "How do we quickly sanity check property graph functions after an upgrade?" I usually run through the following steps.

- Make sure the Big Data Spatial and Graph (BDSG) property graph feature works with Oracle NoSQL Database, using the built-in Groovy shell.

cd /opt/oracle/oracle-spatial-graph/property_graph/dal/groovy/

sh gremlin-opg-nosql.sh


// Provide Oracle NoSQL Database server name and port
server = new ArrayList<String>();
server.add("bigdatalite:5000");

// Create a graph config that contains the graph name "connections"
// KV store name "kvstore", edge property "weight" to be loaded into
// in-memory graph, etc.
cfg = GraphConfigBuilder.forPropertyGraphNosql()             \
  .setName("connections").setStoreName("kvstore")            \
  .setHosts(server)                                          \
  .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000") \
  .setMaxNumConnections(2).build();

// Get an instance of OraclePropertyGraph which is a key Java
// class to manage property graph data
opg = OraclePropertyGraph.getInstance(cfg);

// Set the Degree of Parallelism (DOP) for clearing graph data
// from the existing OraclePropertyGraph instance
opg.setClearTableDOP(2); // the shell prints null here because
                         // this API has no return value;
                         // that is expected
opg.clearRepository();   // remove all vertices and edges
opgdl=OraclePropertyGraphDataLoader.getInstance();

vfile="../../data/connections.opv"  // vertex flat file
efile="../../data/connections.ope"  // edge flat file

// Load the above property graph data files into the database
// in parallel, with a Degree of Parallelism (DOP) of 2.
opgdl.loadData(opg, vfile, efile, 2);

// List the loaded vertices as a quick check
opg.getVertices();

// Create in-memory analytics session and analyst
session=Pgx.createSession("session_ID_1");
analyst=session.createAnalyst();

// Read the graph from database into memory
pgxGraph = session.readGraphWithProperties(opg.getConfig());

// Create a helper function for pretty printing: pad the vertex
// name to 30 characters so that the ids line up. Names longer
// than 30 characters are printed unpadded.
def p(v) { def name = v.getProperty("name");                      \
    return ("vertex " + name.padRight(30) + " id " + v.getId());  \
}

// Execute Page Rank
rank=analyst.pagerank(pgxGraph, 0.00000001, 0.85, 5000);

// Get 3 vertices with highest PR values
it = rank.getTopKValues(3).iterator();     \
while(it.hasNext()) {                      \
  v=it.next();                             \
  id=v.getKey().getId();                   \
  pr=v.getValue();                         \
  System.out.println("Influencers --->" +  \
    p(opg.getVertex(id)) + " pr= " + pr);  \
}

:quit

All of the above steps should complete without errors.

- Next, repeat the load test against Apache HBase.

cd /opt/oracle/oracle-spatial-graph/property_graph/dal/groovy/
sh gremlin-opg-hbase.sh
  
// Get a graph config that has graph name "connections" and
// Zookeeper host, port, and some other parameters
cfg = GraphConfigBuilder.forPropertyGraphHbase()            \
 .setName("connections")                                    \
 .setZkQuorum("bigdatalite").setZkClientPort(2181)          \
 .setZkSessionTimeout(120000).setInitialEdgeNumRegions(3)   \
 .setInitialVertexNumRegions(3).setSplitsPerRegion(1)       \
 .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000") \
 .build();
 
// Get an instance of OraclePropertyGraph which is a key Java
// class to manage property graph data
opg = OraclePropertyGraph.getInstance(cfg);
opg.clearRepository();
 
// OraclePropertyGraphDataLoader is a key Java class
// to bulk load property graph data into the backend databases.
opgdl=OraclePropertyGraphDataLoader.getInstance();
vfile="../../data/connections.opv"
efile="../../data/connections.ope"
opgdl.loadData(opg, vfile, efile, 2);

 

Note that the HBase steps above are shorter than those for Oracle NoSQL Database. There is no need to retest the embedded PGX, because that was already covered in the Oracle NoSQL Database steps.

- Finally, start a PGX server that remote PGX clients can connect to.

For simplicity, I am using HTTP (instead of HTTPS or two-way SSL). This requires setting "enable_tls": false and "enable_client_authentication": false in the following configuration file.

  /opt/oracle/oracle-spatial-graph/property_graph/pgx/conf/server.conf
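
As a rough sketch, and assuming the file is JSON (which the quoted key syntax suggests), the two relevant entries would look something like the fragment below. Only these two keys come from this post; any other settings already in your server.conf should be left as they are.

```json
{
  "enable_tls": false,
  "enable_client_authentication": false
}
```

Keep in mind that this turns off transport encryption and client certificate checks, so it is only suitable for a quick test on a trusted network.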

To start the PGX server:

cd /opt/oracle/oracle-spatial-graph/property_graph/pgx/bin/
./start-server

Open a browser and go to the following URL. You should see a single line of text describing the version.

    http://<hostname>:7007/version
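
If you prefer the command line to a browser, the same check can be scripted with curl. The two helper functions below are hypothetical names of my own, not part of the product. The sketch assumes curl is installed and that the server listens on port 7007 over plain HTTP, as configured above.

```shell
# Hypothetical helper: build the version URL for a PGX server.
# Port 7007 is the one used in this post; pass a second argument
# to override it.
pgx_version_url() {
  printf 'http://%s:%s/version\n' "$1" "${2:-7007}"
}

# Hypothetical helper: print the version text and return 0 only if
# the endpoint answers successfully.
# -s hides the progress meter, -f makes HTTP errors a non-zero exit.
check_pgx_server() {
  curl -sf "$(pgx_version_url "$@")"
}
```

For example, `check_pgx_server bigdatalite && echo "PGX server is up"` would test a server on the host used earlier in this post. Substitute your own hostname if the PGX server runs elsewhere.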

If you want to tune this endpoint a bit, take a look at a previous blog post I wrote.

Good luck!

 
