
Oracle Big Data Spatial and Graph - technical tips, best practices, and news from the product team

Persisting Results of Graph Analytics into BDSG Graph Database (Part II)

Alan Wu
Architect

This is the second part of Persisting Results of Graph Analytics into BDSG Graph Database. See the following blog for the first installment.

https://blogs.oracle.com/bigdataspatialgraph/entry/persisting_results_of_graph_analytics

Part II: When the number of results to be persisted is large

In this case, it is no longer efficient to store computation results into vertices/edges on the fly, because doing so incurs too many small, incremental changes. A much more efficient way is to convert the computation results into Oracle-defined flat files (a format designed for property graphs), and then load them into a BDSG graph database in parallel.

Here is an example code snippet:

      OraclePropertyGraph opg = OraclePropertyGraph.getInstance(cfg);
      ...
      Partition communities = analyst.communitiesLabelPropagation(g);
      BufferedWriter fw = new BufferedWriter(new FileWriter("/tmp/communities.opv"));

      int i = 0;
      while (i < communities.size()) {
          // All vertices assigned to the i-th community
          VertexCollection commu = communities.getPartitionByIndex(i);
          Iterator it = commu.iterator();

          while (it.hasNext()) {
              PgxVertex v = (PgxVertex) (it.next());
              // One line per vertex: <id>,community,1,community_<i>,,
              fw.write(""
                     + v.getId()
                     + ","
                     + "community"
                     + ","
                     + "1"
                     + ","
                     + "community_" + i
                     + ","
                     + ",\n"
                     );
          }

          i++;
      }

      fw.close();

The above code is straightforward. It detects communities in a property graph (by running label propagation), iterates through all the communities, and writes one text line for each vertex in a community, denoting the community assignment for that vertex. In this case, there is no need to do any escaping/encoding because the property key and values contain no special characters. When commas, newlines, etc. are involved, please refer to the "Oracle Flat File Format" section in Chapter 5 of the following development guide for the exact encoding.
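To make the line layout easier to see, here is a minimal sketch of a helper that builds one such line. The class and method names are hypothetical and not part of any Oracle API. It assumes the six-field layout produced by the snippet above (id, key, type, value, and two empty trailing fields), with type code 1 used for a string value. It deliberately rejects keys and values that would need the encoding described in the development guide instead of attempting that encoding.

```java
class OpvLineSketch {
    // Hypothetical helper, not an Oracle API.
    // Builds "<id>,<key>,1,<value>,," as written by the snippet above.
    static String stringPropertyLine(long vertexId, String key, String value) {
        requirePlain(key);
        requirePlain(value);
        return vertexId + "," + key + ",1," + value + ",,";
    }

    // This sketch does not implement escaping, so refuse input that needs it.
    private static void requirePlain(String s) {
        if (s.indexOf(',') >= 0 || s.indexOf('\n') >= 0 || s.indexOf('\r') >= 0) {
            throw new IllegalArgumentException("needs flat-file encoding: " + s);
        }
    }

    public static void main(String[] args) {
        // prints 42,community,1,community_3,,
        System.out.println(stringPropertyLine(42L, "community", "community_" + 3));
    }
}
```

Failing fast on special characters keeps a malformed line from reaching the loader; real data with such characters would need the documented encoding.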

http://docs.oracle.com/bigdata/bda47/BDSPA/toc.htm

Now that we have the computed results in a flat file, we can easily persist them into the same property graph with a single loadData API call. Note that because the community assignments are only for vertices, we simply provide an empty flat file for the edges (.ope).

 OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
 opgdl.loadData(
   opg,
   "/tmp/communities.opv",
   "/tmp/empty_file.ope",  // It is OK to use an empty edge flat file
   8                       // 8 threads
   );
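The empty edge file has to exist before the loader is called. One way to create it, sketched here with standard java.nio calls only, is shown below. The class and method names are hypothetical; the path matches the one used in the loadData call above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class EmptyEdgeFileSketch {
    // Hypothetical helper: creates the file if missing, truncates it if present.
    static Path ensureEmptyFile(String location) throws IOException {
        Path path = Paths.get(location);
        // Default options for Files.write are CREATE, TRUNCATE_EXISTING, WRITE.
        Files.write(path, new byte[0]);
        return path;
    }

    public static void main(String[] args) throws IOException {
        Path p = ensureEmptyFile("/tmp/empty_file.ope");
        System.out.println(p + " size=" + Files.size(p));
    }
}
```

Truncating an existing file guards against stale edge lines from an earlier run being loaded by accident.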

Thanks,
