This is the second part of Persisting Results of Graph Analytics into BDSG Graph Database. See the following blog for the first installment.
https://blogs.oracle.com/bigdataspatialgraph/entry/persisting_results_of_graph_analytics
Part II: When the number of results to be persisted is large
In this case, it is no longer efficient to store computation results into vertices/edges on the fly, because doing so incurs too many small, incremental changes. A much more efficient way is to convert the computation results into Oracle-defined flat files (a format designed for property graphs), and then load them into a BDSG graph database in parallel.
Here is an example code snippet:
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(cfg);
...
Partition communities = analyst.communitiesLabelPropagation(g);
// One flat-file record per vertex: id,key,type,string value,numeric value,date value
BufferedWriter fw = new BufferedWriter(new FileWriter("/tmp/communities.opv"));
int i = 0;
while (i < communities.size()) {
  VertexCollection commu = communities.getPartitionByIndex(i);
  Iterator it = commu.iterator();
  while (it.hasNext()) {
    PgxVertex v = (PgxVertex) (it.next());
    fw.write(""
      + v.getId()
      + ","
      + "community"        // property key
      + ","
      + "1"                // value type 1: string
      + ","
      + "community_" + i   // property value
      + ","
      + ",\n"              // numeric and date fields are left empty
    );
  }
  i++;
}
fw.close();
The code above is straightforward: it detects communities in a property graph by running label propagation, iterates through the communities, and writes one text line per vertex recording that vertex's community assignment. No escaping or encoding is needed here because the values contain no special characters. When values contain commas, newlines, and the like, please refer to the "Oracle Flat File Format" section in Chapter 5 of the following development guide for the exact encoding.
http://docs.oracle.com/bigdata/bda47/BDSPA/toc.htm
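To make the record layout easier to see in isolation, here is a minimal, self-contained sketch of a helper that builds one such line. The class and method names (OpvLineSketch, stringPropertyLine) are hypothetical and not part of any Oracle API. The vertex ID is assumed to be a long. The helper performs no escaping, so it simply rejects values containing a comma or a line break rather than guessing at the encoding.

```java
public class OpvLineSketch {

    // Hypothetical helper: builds one vertex flat-file record for a
    // string-valued property, mirroring the six comma-separated fields
    // written by the snippet above: id, key, type "1", value, and two
    // trailing fields left empty.
    static String stringPropertyLine(long vertexId, String key, String value) {
        requirePlain(key);
        requirePlain(value);
        return vertexId + "," + key + ",1," + value + ",,\n";
    }

    // This sketch does no escaping, so refuse characters that would
    // otherwise break the comma-separated, line-oriented layout.
    private static void requirePlain(String s) {
        if (s.indexOf(',') >= 0 || s.indexOf('\n') >= 0 || s.indexOf('\r') >= 0) {
            throw new IllegalArgumentException("needs encoding first: " + s);
        }
    }

    public static void main(String[] args) {
        // Prints the record for vertex 42 assigned to community 3.
        System.out.print(stringPropertyLine(42L, "community", "community_3"));
    }
}
```

Failing fast on special characters keeps the sketch honest: a value that needs encoding should go through the encoding described in the development guide rather than being written as-is.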
Now that we have the computed results in a flat file, we can easily persist them into the same property graph with a single loadData API call. Note that because the community assignments are only for vertices, we simply provide an empty flat file for the edges (.ope).
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(
opg,
"/tmp/communities.opv",
"/tmp/empty_file.ope", // It is OK to use an empty edge flat file
8 // 8 threads
);
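The call above presumably needs /tmp/empty_file.ope to exist before it runs. As a small sketch using only the standard java.nio.file API, the empty file can be created like this. The class and method names (EmptyEdgeFileSketch, createEmptyFile) are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EmptyEdgeFileSketch {

    // Creates a zero-byte file at the given location, truncating it if it
    // already exists, and returns its path.
    static Path createEmptyFile(String location) throws IOException {
        Path path = Paths.get(location);
        // With no options given, Files.write uses CREATE, TRUNCATE_EXISTING, WRITE.
        Files.write(path, new byte[0]);
        return path;
    }

    public static void main(String[] args) throws IOException {
        Path ope = createEmptyFile("/tmp/empty_file.ope");
        System.out.println(ope + " has " + Files.size(ope) + " bytes");
    }
}
```

Truncating an existing file is deliberate here: it guarantees that no stale edge records from an earlier run are loaded by accident.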
Thanks,