This blog post is for Python users of Oracle Big Data Spatial and Graph (BDSG). It shows how to develop a demo flow with IPython Notebook and BDSG functions. As usual, we use the Big Data Lite VM (version 4.5.0), and the demo flow builds a recommender system based on Personalized PageRank (PPR).
- Step 1: Start IPython Notebook (setup of IPython is omitted for brevity) with "$ ipython notebook --no-mathjax". In the Notebook page that opens in the browser, the first few lines set the UTF-8 encoding and import several packages.
- Step 2: Create a graph config. The graph data is stored in Apache HBase. The graph is named "user_movie", and this property graph has two kinds of vertices: users and movies. If a user U clicked a movie M, there is an edge labeled "click" from U to M and a reverse edge labeled "clickedBy" from M to U.
- Step 3: Get an instance of OraclePropertyGraph, start the in-memory analyst (PGX), and create a session for running the recommendation.
- Step 4: Read the property graph from Apache HBase into the in-memory analyst.
- Step 5: Use text search to find a vertex whose first_name starts with "nathan". Note that "first_name" is a property of the vertices that represent users in this graph.
- Step 6: Say we want to recommend movies for the user "Nathaniel" we just found. Create a vertex set that contains this user.
- Step 7: Execute Personalized PageRank to recommend movies (and also similar users) to Nathaniel.
- Step 8: Prepare the data for plotting a chart of the top Personalized PageRank values.
- Step 9: Plot the chart in IPython Notebook.
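To make the idea behind Steps 2 and 5 through 8 concrete, here is a minimal pure-Python sketch of the same flow on a toy graph. It does not use the BDSG or PGX APIs: the click data, the function names (`build_graph`, `find_users_by_prefix`, `personalized_pagerank`), and the damping factor of 0.85 are all assumptions made for illustration, and the power iteration shown is a generic textbook formulation rather than the in-memory analyst's implementation.

```python
from collections import defaultdict

# Hypothetical toy data standing in for the "user_movie" graph.
# In the real flow the graph is read from Apache HBase.
clicks = [
    ("Nathaniel", "Movie A"),
    ("Nathaniel", "Movie B"),
    ("Alice", "Movie B"),
    ("Alice", "Movie C"),
    ("Bob", "Movie C"),
    ("Bob", "Movie D"),
]


def build_graph(click_pairs):
    """Build adjacency lists with a "click" edge U -> M and a "clickedBy" edge M -> U."""
    out_edges = defaultdict(list)
    for user, movie in click_pairs:
        out_edges[user].append(movie)  # edge labeled "click"
        out_edges[movie].append(user)  # reverse edge labeled "clickedBy"
    return dict(out_edges)


def find_users_by_prefix(user_names, prefix):
    """Case-insensitive prefix match, a simple stand-in for the text search in Step 5."""
    prefix = prefix.lower()
    return sorted(name for name in user_names if name.lower().startswith(prefix))


def personalized_pagerank(out_edges, seeds, damping=0.85, max_iter=200, tol=1e-12):
    """Power iteration in which the random walk restarts only at the seed vertices."""
    nodes = list(out_edges)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(max_iter):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            # Every vertex here has at least one outgoing edge, because each
            # click contributes an edge in both directions.
            share = damping * rank[n] / len(out_edges[n])
            for neighbor in out_edges[n]:
                new_rank[neighbor] += share
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


users = {user for user, _ in clicks}
movies = {movie for _, movie in clicks}
graph = build_graph(clicks)

# Step 5: find the user by first-name prefix.
matches = find_users_by_prefix(users, "nathan")
seed = matches[0]

# Steps 6 and 7: a one-element vertex set, then Personalized PageRank from it.
scores = personalized_pagerank(graph, {seed})

# Step 8: rank the movies the user has not clicked yet, highest score first.
already_clicked = set(graph[seed])
recommended = sorted(
    (m for m in movies if m not in already_clicked),
    key=lambda m: scores[m],
    reverse=True,
)
for movie in recommended:
    print(movie, round(scores[movie], 4))
```

Because the walk always restarts at Nathaniel, vertices close to him in the click graph collect more score than distant ones, which is why "Movie C" (clicked by Alice, who shares "Movie B" with him) ranks above "Movie D". The sorted `(movie, score)` pairs are the kind of data one would then hand to a plotting library for Step 9.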
Acknowledgement: thanks to Jay Banerjee for his input on this blog post.