ODI - Java Table Function for MongoDB

Behind the scenes of the MongoDB posting was a very simple JavaDB/Derby table function. The function implemented a couple of methods - the table function readCollection and the function makeRow which creates an array from a Java Object. It can't get much simpler. The iteration through the collection is handled by the class I extended from EnumeratorTableFunction  which came from the posting by Rick Hillegas, and it fits nicely into ODIs source/target generic integration task in the KM framework. Here is a viewlet I have created showing you everything very briefly but end to end.

The makeRow function uses the MongoDB java SDK, and produces a row for each BasicDBObject, each value in the document is output as a column in the table. Nested/complex values are serialized as Java Strings - so you will get a JSON string for anything complex.

  1. public String[] makeRow(Object obj) throws SQLException
  2. {
  3.   int idx = 0;
  4.   BasicDBObject dbo = (BasicDBObject) obj;
  5.   Iterator it = dbo.entrySet().iterator();
  6.   String[]    row = new String[ getColumnCount() ];
  7.   it.next(); // skip the 'id' column
  8.   while (it.hasNext()) {
  9.     Map.Entry pairs = (Map.Entry)it.next();
  10.     row[ idx++ ] = pairs.getValue().toString();
  11.   }
  12.   return row;
  13. }

The readCollection table function is a static method and has a couple of parameters (for demonstration) - one is the MongoDB database name and the other is the collection name. The function initializes the object instance with the column names which are defined to be the key names for the objects in the collection (the first object is taken and its keys used as the column names);

  1. public static ResultSet readCollection(String dbName, String collectionName)
  2.   throws SQLException, UnknownHostException
  3. {
  4.   int idx = 0;
  5.   MongoClient mongoClient = new MongoClient();
  6.   DB db = mongoClient.getDB(dbName);
  7.   DBCollection coll = db.getCollection(collectionName);
  8.   DBCursor cursor = coll.find();
  9.   BasicDBObject dbo = (BasicDBObject)  coll.findOne();
  10.   Set<String> keys = dbo.keySet();
  11.   String[] skeys = new String[keys.size()];
  12.   Iterator it = keys.iterator();
  13.   it.next(); // skip the id
  14.   while (it.hasNext()) {
  15.     skeys[idx++] = it.next().toString();
  16.   }
  17.   return new mongo_table( skeys, cursor );
  18. }

The mongo_table constructor just initializes itself and sets the enumeration to iterate over - the class I extend from is very useful, it can iterate over Java Enumeration, Iterator, Iterable, or array objects - the super class initializes the column names, and the setEnumeration defines the collection/iterator - which in this case is a MongoDB DBCursor which happens to be a Java Iterator<DBObject>.

  1. public mongo_table(String[] column_names, DBCursor cursor)
  2.   throws SQLException
  3. {
  4.   super( column_names );
  5.   setEnumeration( cursor );
  6. }

This approach can be used for sourcing pretty much anything, which is great for integration needs. The ODI Knowledge Module is an LKM and stages the result of the table function into a work table, then everything else is as normal. The KM creates the work table and also registers the table function with JavaDB/Derby. My code for the function registration is as follows;

  1. create function <%=odiRef.getSrcTablesList("","[TABLE_NAME]", "","")%>( dbName varchar( 330), collName varchar( 30))
  2. returns table
  3. (
  4. <%=odiRef.getSrcColList("","[COL_NAME] [SOURCE_CRE_DT]","[COL_NAME] [SOURCE_CRE_DT]",",\n","")%> )
  5. language java
  6. parameter style DERBY_JDBC_RESULT_SET
  7. no sql
  8. external name '<%=odiRef.getOption("TABLE_FUNCTION_NAME")%>'

This creates the table function with the same name as the datastore in the interface, plus the resultant table of the function has the columns (and types) from that datastore. The external JavaDB function name is taken from the KM option TABLE_FUNCTION_NAME. As I mentioned I have hard-wired 2 parameters just now. The Java code implementing this should be created and put in a JAR in the normal userlib directory for adding custom code including JDBC drivers. The other JARs needed are the MongoDB Java SDK jar, derby.jar and vtis-example.jar (from the zip here). You can get the Java source for mongo_table.java here, it is compiled using the MongoDB Java SDK on the classpath as follows (on Windows).

  1. javac -classpath mongo-2.10.1.jar;vtis-example.jar mongo_table.java
  2. jar cvf mongo_table.jar mongo_table.class

The LKM is here it needs imported into your project.

Anyway...this wasn't all about MongoDB per se, it was also about the JavaDB table function capability, any other examples spring to mind about integration capabilities using this route? Going to post about loading into MongoDB and how an IKM is built for this. Interested to hear any ideas/feedback from you on this....so don't be shy!


Thank you for your post. I have a question regarding what i can read on slide 7-32 of your viewlet. You create a datastore with a Derby's physical Architecture and with org.apache.derby.jdbc.EmbeddedDriver but i can't view this driver on list and i am not able to add it, even if i have added the jar on the folder like slide 3-32.


Posted by guest on July 24, 2014 at 05:52 AM PDT #

You can just type the driver name into the JDBC Driver text field (rather than selecting from a list).


Posted by David on July 24, 2014 at 07:19 AM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality


« March 2015