Friday Jan 11, 2013

ODI - Java Table Function for MongoDB

Behind the scenes of the MongoDB posting was a very simple JavaDB/Derby table function. The function implemented a couple of methods: the table function readCollection and the method makeRow, which creates a row array from a Java object. It can't get much simpler. The iteration through the collection is handled by the EnumeratorTableFunction class I extended, which came from the posting by Rick Hillegas, and it fits nicely into ODI's source/target generic integration task in the KM framework. Here is a viewlet I have created showing you everything, very briefly but end to end.

The makeRow function uses the MongoDB Java SDK and produces a row for each BasicDBObject; each value in the document is output as a column in the table. Nested/complex values are serialized as Java Strings, so you will get a JSON string for anything complex.

    public String[] makeRow(Object obj) throws SQLException
    {
      int idx = 0;
      BasicDBObject dbo = (BasicDBObject) obj;
      Iterator it = dbo.entrySet().iterator();
      String[] row = new String[ getColumnCount() ];
      it.next(); // skip the '_id' column
      while (it.hasNext()) {
        Map.Entry pairs = (Map.Entry) it.next();
        row[ idx++ ] = pairs.getValue().toString();
      }
      return row;
    }
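
Since BasicDBObject implements java.util.Map, the flattening behaviour can be tried out against a plain LinkedHashMap. This is a standalone sketch (the class and field names here are mine, not from the posting); the real method receives a BasicDBObject from the driver and gets its column count from the superclass:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class MakeRowSketch {
    // Mirrors makeRow: skip the first ('_id') entry, stringify the rest.
    static String[] makeRow(Map<String, Object> doc, int columnCount) {
        String[] row = new String[columnCount];
        int idx = 0;
        Iterator<Map.Entry<String, Object>> it = doc.entrySet().iterator();
        it.next(); // skip the '_id' entry
        while (it.hasNext()) {
            row[idx++] = it.next().getValue().toString();
        }
        return row;
    }

    public static void main(String[] args) {
        // LinkedHashMap preserves insertion order, like the driver's document view.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", 1);
        doc.put("name", "scott");
        doc.put("salary", 3000);
        String[] row = makeRow(doc, 2);
        System.out.println(row[0] + "," + row[1]); // prints scott,3000
    }
}
```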

The readCollection table function is a static method with a couple of parameters (for demonstration): one is the MongoDB database name and the other is the collection name. The function initializes the object instance with the column names, which are defined to be the key names of the objects in the collection (the first object is taken and its keys are used as the column names):

    public static ResultSet readCollection(String dbName, String collectionName)
      throws SQLException, UnknownHostException
    {
      int idx = 0;
      MongoClient mongoClient = new MongoClient();
      DB db = mongoClient.getDB(dbName);
      DBCollection coll = db.getCollection(collectionName);
      DBCursor cursor = coll.find();
      BasicDBObject dbo = (BasicDBObject) coll.findOne();
      Set<String> keys = dbo.keySet();
      String[] skeys = new String[keys.size()];
      Iterator it = keys.iterator();
      it.next(); // skip the id
      while (it.hasNext()) {
        skeys[idx++] = it.next().toString();
      }
      return new mongo_table( skeys, cursor );
    }

The mongo_table constructor just initializes itself and sets the enumeration to iterate over. The class I extend from is very useful: it can iterate over Java Enumeration, Iterator, Iterable, or array objects. The superclass constructor initializes the column names, and setEnumeration defines the collection/iterator, which in this case is a MongoDB DBCursor (which happens to be a Java Iterator&lt;DBObject&gt;).

    public mongo_table(String[] column_names, DBCursor cursor)
      throws SQLException
    {
      super( column_names );
      setEnumeration( cursor );
    }
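
Conceptually, the superclass just pulls objects from whatever iterator you hand it and asks the subclass to turn each one into a row of strings. The following is a stripped-down, hypothetical sketch of that loop (my own stand-in names, not the actual vtis-example code), driven here by a plain list instead of a DBCursor:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for EnumeratorTableFunction: walks any Iterator
// and delegates row construction to makeRow(), as the real class does.
abstract class EnumeratorSketch {
    private Iterator<?> it;

    void setEnumeration(Iterator<?> it) { this.it = it; }

    abstract String[] makeRow(Object obj);

    // Roughly what the ResultSet's next() amounts to: one makeRow per element.
    String[] nextRow() {
        return it.hasNext() ? makeRow(it.next()) : null;
    }
}

public class CursorWalk extends EnumeratorSketch {
    @Override
    String[] makeRow(Object obj) {
        return new String[] { obj.toString() };
    }

    public static void main(String[] args) {
        CursorWalk w = new CursorWalk();
        List<String> fakeCursor = Arrays.asList("docA", "docB");
        w.setEnumeration(fakeCursor.iterator()); // a DBCursor would slot in here
        String[] row;
        while ((row = w.nextRow()) != null) {
            System.out.println(row[0]); // prints docA then docB
        }
    }
}
```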

This approach can be used for sourcing pretty much anything, which is great for integration needs. The ODI Knowledge Module is an LKM and stages the result of the table function into a work table; then everything else is as normal. The KM creates the work table and also registers the table function with JavaDB/Derby. My code for the function registration is as follows:

    create function <%=odiRef.getSrcTablesList("","[TABLE_NAME]", "","")%>( dbName varchar( 330), collName varchar( 30))
    returns table
    (
    <%=odiRef.getSrcColList("","[COL_NAME] [SOURCE_CRE_DT]","[COL_NAME] [SOURCE_CRE_DT]",",\n","")%> )
    language java
    parameter style DERBY_JDBC_RESULT_SET
    no sql
    external name '<%=odiRef.getOption("TABLE_FUNCTION_NAME")%>'
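
For illustration, assuming a hypothetical datastore named MONGO_EMPS with two VARCHAR columns ENAME and SAL, and TABLE_FUNCTION_NAME set to the static method above, the template would expand to something like the following; Derby then invokes a registered table function via its TABLE() syntax (the names here are my assumptions, not from the KM):

```sql
create function MONGO_EMPS( dbName varchar( 330), collName varchar( 30))
returns table
(
ENAME VARCHAR(30),
SAL VARCHAR(30) )
language java
parameter style DERBY_JDBC_RESULT_SET
no sql
external name 'mongo_table.readCollection'

-- Derby's table-function invocation syntax:
select s.* from table( MONGO_EMPS('test', 'emps') ) s
```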

This creates the table function with the same name as the datastore in the interface, and the resultant table of the function has the columns (and types) from that datastore. The external JavaDB function name is taken from the KM option TABLE_FUNCTION_NAME. As I mentioned, I have hard-wired 2 parameters just now. The Java code implementing this should be compiled and put in a JAR in the normal userlib directory used for adding custom code, including JDBC drivers. The other JARs needed are the MongoDB Java SDK jar, derby.jar and vtis-example.jar (from the zip here). You can get the Java source for mongo_table.java here; it is compiled with the MongoDB Java SDK on the classpath as follows (on Windows):

    javac -classpath mongo-2.10.1.jar;vtis-example.jar mongo_table.java
    jar cvf mongo_table.jar mongo_table.class

The LKM is here; it needs to be imported into your project.

Anyway... this wasn't all about MongoDB per se; it was also about the JavaDB table function capability. Do any other examples spring to mind of integration capabilities using this route? I am going to post about loading into MongoDB and how an IKM is built for this. Interested to hear any ideas/feedback from you on this... so don't be shy!

Monday Apr 25, 2011

Accelerating development via the Oracle Data Integrator SDK

A nice post and example of ODI 11g SDK use by David Allan can be seen on this blog.

David writes:


Often, using ANY tool, there are scenarios where there is a lot of grunt work; imagine Microsoft tools like Excel without VB and macros to accelerate and customize those boring repetitive tasks. Data integration and ETL design is exactly the same: the tool needs to expose an SDK on the base platform that you can use to make your life easier, something to automate the grunt work that is common and very repetitive. The ODI 11g SDK lets you script these kinds of repetitive tasks. As an aside, the ODI common format designer (see this post here) has a way of migrating like-named objects; however, using the SDK lets you control much, much more.

To illustrate, I have created a simple interface construction accelerator that you can download (interfaceAccelerator.java); the accelerator generates ODI interfaces from a control file that defines the interface name, the source and the target. Simple, and a nice example for demo purposes. If you look at the Java code, it is very basic (no pun intended); it literally is a dozen lines of code. The image below illustrates the Java program interfaceAccelerator using the ODI 11g SDK, taking as inputs the configuration of the connection details and a control file specifying the source-to-target datastore mappings.


The code, when called, takes a bunch of command line parameters, shown below, and the standard input stream is the interface control file, so the command line looks like:

java -classpath <cp> interfaceAccelerator <url> <driver> <schema> <pwd> <workrep> <odiuser> <odiuserpwd> <project> <folder> < <control_file>

The control file provided on the standard input stream needs to be a comma-delimited file with the following structure:

    • interface_name,source_model,source_table,target_model,target_table
    • ...
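
The per-line handling in the accelerator boils down to splitting each record into its five fields before handing them to the SDK calls. A minimal sketch of just that parsing step (class and type names here are mine, for illustration; interfaceAccelerator.java has the real code):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ControlFileSketch {
    // One record of the control file:
    // interface_name,source_model,source_table,target_model,target_table
    record Mapping(String interfaceName, String sourceModel, String sourceTable,
                   String targetModel, String targetTable) {}

    static Mapping parse(String line) {
        String[] f = line.split(",");
        return new Mapping(f[0], f[1], f[2], f[3], f[4]);
    }

    public static void main(String[] args) throws IOException {
        // The real program reads System.in; a StringReader stands in here.
        BufferedReader in = new BufferedReader(
            new StringReader("INTFC1,SCOTT,EMP,STG_MODEL,STG_EMP"));
        String line;
        while ((line = in.readLine()) != null) {
            Mapping m = parse(line);
            System.out.println(m.interfaceName() + " maps " + m.sourceTable()
                + " to " + m.targetTable()); // prints INTFC1 maps EMP to STG_EMP
        }
    }
}
```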

For example, a sample command line using an Oracle repository could be:

java -classpath <cp> interfaceAccelerator jdbc:oracle:thin:@localhost:1521:ora112 oracle.jdbc.OracleDriver ODI_MASTER mypwd WORKREP1 SUPERVISOR myodipwd STARTERS SDK < icontrol.csv


The interfaces will be created in the folder SDK, and the project code is STARTERS. The icontrol.csv file used above was as follows (remember the format is interface_name,source_model,source_table,target_model,target_table; this is just what I happened to use in this simple demo program):

    • INTFC1,SCOTT,EMP,STG_MODEL,STG_EMP
    • INTFC2,SCOTT,DEPT,STG_MODEL,STG_DEPT
    • INTFC3,SCOTT,BONUS,STG_MODEL,STG_BONUS

You can create as many interfaces from this driver control file as you desire. The interface generated will map from the source table to the target table and use ODI's auto mapping to perform column-level mapping of the source to the target table; it will also create default source sets and use the default KM assignment. So you get a pretty useful set of stuff as a basis here.

The interfaces generated whilst executing this accelerator look like the following: the table-to-table map with all of the like-named columns mapped, and the physical flow configured with default KMs.


You can take this code and customize it to fit your needs, or send in comments on how to do things. In summary, if you find yourself wanting ways of tuning your work to make using ODI even more productive, then you should look into the ODI 11g SDK and see if you can automate, automate, automate.

