
Oracle Big Data Spatial and Graph - technical tips, best practices, and news from the product team

Detecting Communities in a Social Graph

Alan Wu
Architect

Hi,

In this post, I will demonstrate a simple flow for detecting communities in a social network. Communities matter in a social graph because the individuals in a community tend to share common characteristics or exhibit one or more common behaviors.

The following Groovy-based scripts require Oracle Big Data Spatial and Graph v1.1, which is bundled with Oracle Big Data Appliance v4.3.0 and is also available from My Oracle Support. To follow along, start the Groovy shell for Apache HBase:

cd /opt/oracle/oracle-spatial-graph/property_graph/dal/groovy/
sh gremlin-opg-hbase.sh


  • First, load a small test graph "connections" into Apache HBase:

// Get a graph config that has the graph name "connections", the
// ZooKeeper host and port, and some other parameters
cfg = GraphConfigBuilder.forPropertyGraphHbase()            \
 .setName("connections")                                    \
 .setZkQuorum("bigdatalite").setZkClientPort(2181)          \
 .setZkSessionTimeout(120000).setInitialEdgeNumRegions(3)   \
 .setInitialVertexNumRegions(3).setSplitsPerRegion(1)       \
 .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000") \
 .build();


// Get an instance of OraclePropertyGraph, which is a key Java
// class to manage property graph data
opg = OraclePropertyGraph.getInstance(cfg);

// Remove any existing data stored under this graph name
opg.clearRepository();

// OraclePropertyGraphDataLoader is a key Java class
// to bulk load property graph data into the backend databases.
opgdl = OraclePropertyGraphDataLoader.getInstance();
vfile = "../../data/connections.opv"   // vertex file
efile = "../../data/connections.ope"   // edge file

// The last argument is the degree of parallelism used for loading
opgdl.loadData(opg, vfile, efile, 2);


  • Next, add a tiny loop of just two vertices, vx and vy, using the Blueprints Java API:

vx = opg.addVertex(1234l);
vy = opg.addVertex(1235l);

// Add an edge from vx to vy, and another from vy to vx
e1 = opg.addEdge(3000l, vx, vy, "likes");
e1.setProperty("weight", 1.1d);
e2 = opg.addEdge(3001l, vy, vx, "likes");
e2.setProperty("weight", 1.5d);
opg.commit();


  • Get an in-memory analyst:

// Create an in-memory analytics session and analyst
session = Pgx.createSession("session_ID_1");
analyst = session.createAnalyst();

// Read graph data from the database into memory
pgxGraph = session.readGraphWithProperties(opg.getConfig());

  • Run community detection algorithms:

// Run the WCC (weakly connected components) algorithm
partition = analyst.wcc(pgxGraph)

// Should be 2: the loaded "connections" graph and the vx/vy loop added above
partition.size()

// Get the first community (collection of vertices)
vertexCollection = partition.getPartitionByIndex(0);
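
For readers who want to see what a weakly connected components pass computes, here is a minimal sketch in plain Java. It is not Oracle's implementation of `analyst.wcc`; it uses a generic union-find structure, and the class name `WccSketch` and the small "main" graph (vertices 1 to 4) are made up for illustration. Only the two-vertex loop 1234 <-> 1235 mirrors the script above. Edge direction is ignored, which is why the loop ends up as a second component of its own.

```java
import java.util.*;

// Illustrative only: a generic union-find version of WCC, not the PGX implementation.
class WccSketch {
    // Union-find parent pointers, keyed by vertex ID.
    private final Map<Long, Long> parent = new HashMap<>();

    // Returns the representative of v's component, registering v if it is new.
    long find(long v) {
        parent.putIfAbsent(v, v);
        long root = v;
        while (parent.get(root).longValue() != root) {
            root = parent.get(root);
        }
        // Path compression: point every vertex on the path directly at the root.
        long cur = v;
        while (cur != root) {
            long next = parent.get(cur);
            parent.put(cur, root);
            cur = next;
        }
        return root;
    }

    // Direction is ignored on purpose: that is what makes the components "weak".
    void addEdge(long src, long dst) {
        long a = find(src);
        long b = find(dst);
        if (a != b) {
            parent.put(a, b);
        }
    }

    // Groups vertices by representative; each group is one component.
    Collection<Set<Long>> components() {
        Map<Long, Set<Long>> groups = new TreeMap<>();
        List<Long> vertices = new ArrayList<>(parent.keySet());
        for (long v : vertices) {
            groups.computeIfAbsent(find(v), k -> new TreeSet<>()).add(v);
        }
        return groups.values();
    }

    // Hypothetical toy data: a small connected graph plus the vx <-> vy loop.
    static WccSketch toyGraph() {
        WccSketch g = new WccSketch();
        long[][] edges = {
            {1, 2}, {2, 3}, {4, 3},        // made-up "main" graph
            {1234, 1235}, {1235, 1234}     // the two-vertex loop from the script
        };
        for (long[] e : edges) {
            g.addEdge(e[0], e[1]);
        }
        return g;
    }

    public static void main(String[] args) {
        System.out.println(toyGraph().components());
    }
}
```

Running `main` prints the components of the toy graph; there are two, one for the made-up main graph and one for the loop.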


// Run Label Propagation
partition = analyst.communitiesLabelPropagation(pgxGraph)

// How many communities do we have?
partition.size()

// Get the first community by index
vertexCollection = partition.getPartitionByIndex(0);


  • An example output of the above command is as follows. Yours might be different:

==>PgxVertex with ID 77
==>PgxVertex with ID 78
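
To give a feel for the idea behind label propagation, here is a simplified sketch in plain Java. It is not the code behind `analyst.communitiesLabelPropagation`. It assumes synchronous rounds and a deterministic tie-break (the smallest label wins), both chosen here only so the result is repeatable, and the class name and toy graph (two triangles joined by one bridge edge) are invented for the example. Label propagation implementations often break ties randomly, which is one reason results can vary between runs.

```java
import java.util.*;

// Illustrative only: simplified synchronous label propagation on an undirected graph.
class LabelPropagationSketch {

    // Each vertex repeatedly adopts the most frequent label among its neighbors.
    static Map<Integer, Integer> propagate(Map<Integer, List<Integer>> adj, int maxRounds) {
        Map<Integer, Integer> labels = new TreeMap<>();
        for (int v : adj.keySet()) {
            labels.put(v, v); // every vertex starts in its own community
        }
        // Synchronous updates can oscillate on some graphs, hence the round limit.
        for (int round = 0; round < maxRounds; round++) {
            Map<Integer, Integer> next = new TreeMap<>();
            for (int v : adj.keySet()) {
                // Count how often each label occurs among v's neighbors.
                Map<Integer, Integer> counts = new TreeMap<>();
                for (int n : adj.get(v)) {
                    counts.merge(labels.get(n), 1, Integer::sum);
                }
                int best = labels.get(v); // a vertex without neighbors keeps its label
                int bestCount = 0;
                for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
                    // TreeMap iterates labels in ascending order, so the strict ">"
                    // keeps the smallest label when counts are tied.
                    if (e.getValue() > bestCount) {
                        best = e.getKey();
                        bestCount = e.getValue();
                    }
                }
                next.put(v, best);
            }
            if (next.equals(labels)) {
                break; // no label changed in this round
            }
            labels = next;
        }
        return labels;
    }

    // Builds an undirected adjacency map from an edge list.
    static Map<Integer, List<Integer>> undirected(int[][] edges) {
        Map<Integer, List<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }
        return adj;
    }

    // Hypothetical toy graph: triangles {1,2,3} and {4,5,6} joined by the edge 3-4.
    static Map<Integer, Integer> toyResult() {
        int[][] edges = {{1, 2}, {1, 3}, {2, 3}, {3, 4}, {4, 5}, {4, 6}, {5, 6}};
        return propagate(undirected(edges), 20);
    }

    public static void main(String[] args) {
        System.out.println(toyResult()); // vertex -> community label
    }
}
```

On this toy graph the two triangles end up with two different labels, so the bridge edge does not merge them into one community.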

  • Now that all the communities have been detected, you can run the following to look into the community that contains an entity of interest.

// Look into the community that has the vertex Alibaba
v = opg.getVertices("name", "Alibaba").iterator().next();
vertexCollection = partition.getPartitionByVertex(pgxGraph.getVertex(v.getId()));

// Get details of the 4 vertices in this community
// (the IDs are the ones listed in vertexCollection for this run).
// "l" below indicates a Long integer
opg.getVertex(69l);
opg.getVertex(68l);
opg.getVertex(65l);
opg.getVertex(71l);

:quit
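
To make the two lookups used above, by index and by vertex, a little more concrete, here is a small sketch in plain Java of how a partition can answer both. It says nothing about how the PGX partition object works internally. The class `PartitionSketch`, its method names, and the label values 100 and 200 are invented for illustration. The vertex IDs are borrowed from the sample run in this post, and the index order here (by smallest vertex ID) need not match what PGX returns.

```java
import java.util.*;

// Illustrative only: a partition that supports lookup by index and by vertex.
class PartitionSketch {
    private final List<Set<Integer>> communities = new ArrayList<>();
    private final Map<Integer, Integer> indexOfVertex = new HashMap<>();

    // Builds communities from a vertex -> label map, such as the output
    // of a community detection algorithm.
    PartitionSketch(Map<Integer, Integer> labelOfVertex) {
        Map<Integer, Integer> indexOfLabel = new HashMap<>();
        Map<Integer, Integer> sorted = new TreeMap<>(labelOfVertex);
        for (Map.Entry<Integer, Integer> e : sorted.entrySet()) {
            Integer idx = indexOfLabel.get(e.getValue());
            if (idx == null) {
                // First vertex seen with this label: open a new community.
                idx = communities.size();
                indexOfLabel.put(e.getValue(), idx);
                communities.add(new TreeSet<Integer>());
            }
            communities.get(idx).add(e.getKey());
            indexOfVertex.put(e.getKey(), idx);
        }
    }

    int size() {
        return communities.size();
    }

    Set<Integer> byIndex(int i) {
        return communities.get(i);
    }

    // Any member of a community leads to the same vertex set.
    Set<Integer> byVertex(int v) {
        return communities.get(indexOfVertex.get(v));
    }

    // Vertex IDs from the sample run; the labels 100 and 200 are made up.
    static PartitionSketch sample() {
        Map<Integer, Integer> labels = new HashMap<>();
        for (int v : new int[] {65, 68, 69, 71}) {
            labels.put(v, 100);
        }
        for (int v : new int[] {77, 78}) {
            labels.put(v, 200);
        }
        return new PartitionSketch(labels);
    }

    public static void main(String[] args) {
        PartitionSketch p = sample();
        System.out.println(p.size());
        System.out.println(p.byVertex(69));
    }
}
```

Keeping a vertex-to-index map next to the list of communities makes the by-vertex lookup a single map access, at the cost of some extra memory.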

Cheers,

Zhe Wu
