Oracle Big Data Spatial and Graph - technical tips, best practices, and news from the product team

  • October 15, 2017

3 Ways to Serialize and Write a Sub-Graph to Client Side (III)

Alan Wu
This is the third installment of the "3 Ways to Serialize and Write a Sub-Graph to Client Side" series. The first installment showed an approach that serializes a sub-graph on the server side and copies the graph data files to the client side. The second installment showed a more direct way: read the sub-graph to the client side first, then use utility methods in the OraclePropertyGraphUtilsBase Java class to serialize the graph data.

I saved the easiest for last :)
  • Approach #3: Use grep, AWK, or Your Favorite Text Processing Tool to Filter the Flat Files Directly
The flat file format used by Oracle Big Data Spatial and Graph (BDSG) is in fact quite friendly to text processing. Assume you have a large graph stored in flat files (.opv, .ope) and you want to create a sub-graph on the client side. Chances are you can use grep, egrep, gawk, or whatever your favorite text processing tool is to filter the flat files directly, as long as the filtering condition can be evaluated on each record by itself.

For example, the following egrep command keeps only the edges with the label "collaborates".

cd /opt/oracle/oracle-spatial-graph/property_graph/data

cat connections.ope | egrep ",collaborates" > /tmp/my_subgraph.ope

If you are worried about accidental matches in other text fields, a slightly longer regular expression can make sure we match only against the edge label field, which is the fourth comma-separated field of each record.

cat connections.ope | egrep "^[^,]*,[^,]*,[^,]*,collaborates," > /tmp/my_subgraph.ope
head -5 /tmp/my_subgraph.ope
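
As a rough awk equivalent of the regular expression above, the sketch below compares the fourth comma-separated field to the label directly. The function name filter_edges_by_label and the sample records are made up for illustration, and the sketch assumes that no field before the label contains an embedded comma.

```shell
#!/bin/sh
# Sketch only: filter_edges_by_label is a hypothetical helper, not part of BDSG.
# Assumes the edge label is the fourth comma-separated field of each .ope record.
filter_edges_by_label() {
    # $1 = edge label to keep, $2 = path to an .ope file
    awk -F',' -v label="$1" '$4 == label' "$2"
}

# Made-up sample records, for illustration only.
# Record 1003 contains ",collaborates" in a later field, so a plain
# text match would keep it, while the field comparison drops it.
workdir=$(mktemp -d)
cat > "$workdir/sample.ope" <<'EOF'
1000,1,2,collaborates,weight,4,,1.0,
1001,2,3,admires,weight,4,,2.0,
1002,3,1,collaborates,weight,4,,0.5,
1003,1,3,feuds,note,1,collaborates,,
EOF

filter_edges_by_label collaborates "$workdir/sample.ope" > "$workdir/my_subgraph.ope"
cat "$workdir/my_subgraph.ope"
```

Because the comparison is on the whole field, there is no regular expression to escape if a label happens to contain characters such as dots or brackets.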


To use this approach well, one needs to have a solid understanding of the flat file format.
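
One place where knowledge of the format pays off: a filtered .ope file still refers to vertices whose records live in the .opv file. The sketch below is one possible way to pull out the matching vertex records with awk. It assumes that fields 2 and 3 of each edge record are the source and destination vertex IDs and that field 1 of each vertex record is the vertex ID; please check these assumptions against the flat file format documentation for your release. The function name and the sample data are hypothetical. Note that this is a two-pass filter, so it goes a step beyond a purely per-record condition.

```shell
#!/bin/sh
# Sketch only: filter_vertices_for_edges is a hypothetical helper, not part of BDSG.
# Assumed layout: .ope fields 2 and 3 are vertex IDs, .opv field 1 is the vertex ID.
filter_vertices_for_edges() {
    # $1 = already filtered .ope file, $2 = full .opv file
    awk -F',' '
        # First file (edges): remember both endpoint IDs of every edge.
        NR == FNR { keep[$2] = 1; keep[$3] = 1; next }
        # Second file (vertices): print records whose ID was remembered.
        ($1 in keep)
    ' "$1" "$2"
}

# Made-up sample data, for illustration only.
demo_dir=$(mktemp -d)
cat > "$demo_dir/edges.ope" <<'EOF'
1000,1,2,collaborates,weight,4,,1.0,
1002,3,1,collaborates,weight,4,,0.5,
EOF
cat > "$demo_dir/people.opv" <<'EOF'
1,name,1,Alice,,
2,name,1,Bob,,
3,name,1,Carol,,
4,name,1,Dave,,
EOF

filter_vertices_for_edges "$demo_dir/edges.ope" "$demo_dir/people.opv" > "$demo_dir/my_subgraph.opv"
cat "$demo_dir/my_subgraph.opv"
```

In this sample, vertex 4 is not an endpoint of any kept edge, so its record is left out of the vertex file.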

