X

Welcome to All Things Data Integration: Announcements, Insights, Best Practices, Tips & Tricks, and Trend Related...

ODI - Loading MongoDB (API as Target)

David Allan
Architect

In this post I will show how to load documents into a MongoDB collection. The interface design looks just like all other ODI interfaces, but behind the scenes the KM configured in the physical design uses the MongoDB SDK (see MongoDB SDK here) to insert the documents. The target datastore below represents a MongoDB document, the columns are the keys in the document. Each row is inserted as a document, and each column is a key, the column value is the value. The ENAME value below is shown as a complex JSON value.

The IKM I have used is a multi-connect IKM, the source is a SQL data source and the target uses a MongoDB groovy command. The heart of the IKM to insert the documents into the collection has a SQL select as the source command and the following groovy code for the target command;

  1. import com.mongodb.*
  2. MongoClient mongoClient = new MongoClient(" <%=odiRef.getOption("MONGO_SERVER")%> ", <%=odiRef.getOption("MONGO_PORT")%> );
  3. DB db = mongoClient.getDB("<%=odiRef.getOption("MONGODB")%>");
  4. DBCollection coll = db.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")
  5. BasicDBObject doc = new BasicDBObject();

  6. <%=odiRef.getColList(" ", "doc.put(\u0022[COL_NAME]\u0022, \u0022#[CX_COL_NAME]\u0022);", " \n ", "", "((INS and !TRG) and REW)")%>

  7. coll.insert(doc);

The odiRef.getColList method call above generates code for every target column, the code performs a doc.put invocation to add the key-value pairs into the document. For example this is the code generated and executed based on the interface design discussed above;

  1. import com.mongodb.*
  2. MongoClient mongoClient = new MongoClient("DALLAN-SVR", 27017);
  3. DB db = mongoClient.getDB("test");
  4. DBCollection coll = db.getCollection("testCollection")
  5. BasicDBObject doc = new BasicDBObject();

  6.  doc.put("EMPNO", "#EMPNO"); 
  7.  doc.put("ENAME", "#ENAME"); 
  8.  doc.put("JOB", "#JOB"); 
  9.  doc.put("MGR", "#MGR"); 
  10.  doc.put("HIREDATE", "#HIREDATE"); 
  11.  doc.put("SAL", "#SAL"); 
  12.  doc.put("COMM", "#COMM"); 
  13.  doc.put("DEPTNO", "#DEPTNO");

  14. coll.insert(doc);

 This is a simple illustration of how to load documents into MongoDB. We can go into the MongoDB command line and execute the command to see all objects in the collection and get the list of documents, below you can see a preview of executing db.testCollection.find()

  • { "_id" : ObjectId("51dc3ded6c4b9a5bd07d68a6"), "EMPNO" : "7369", "ENAME" : "{ NAME : SMITH, DESCR : 22 }", "JOB" : "CLERK", "MGR" : "7902", "HIREDATE" : "1980-12-17 00:00:00.0", "SAL" : "801", "COMM" : "", "DEPTNO" : "20" }
  • { "_id" : ObjectId("51dc3ded6c4b9a5bd07d68a7"), "EMPNO" : "7499", "ENAME" : "{ NAME : ALLEN, DESCR : 22 }", "JOB" : "SALESMAN", "MGR" : "7698", "HIREDATE" : "1981-02-20 00:00:00.0", "SAL" : "1601", "COMM" : "300", "DEPTNO" : "30" }

 You can see the key:value pairs in our document. For those MongoDB gurus, you'll notice in the 'complex' data illustration, this is really a string and not a MongoDB complex object - that discussion is for another day.

This post is not just about MongoDB, but also a useful post on how to integrate APIs as a target in a data flow. 

Join the discussion

Comments ( 32 )
  • Jan Christian Leonhard Tuesday, January 7, 2014

    Dear David,

    thank you very much for sharing this article.

    Do i get this right - first of all i'd have to install the mongo sdk on the machine where i have the odi running - vice versa - the agents?

    my best wishes

    /J


  • David Tuesday, January 7, 2014

    Hi Jan

    Yes, the MongoDB API (or any other API you might use from KMs, procedures etc) would have to be available for the agent executing the code.

    Cheers

    David


  • Vincenzo Tuesday, January 13, 2015

    Dear David,

    I have some unclear points:

    - when you create a target datastore which is the technology applied to?

    - you use a multi-connect IKM?

    - if i create a new multi-connect IKM which are source and target technologies that i have to specify?

    - which task/command on target and source did you create on IKM?

    - can you share us your IKM with a link?

    Thanks and Regards

    Vincenzo


  • David Tuesday, January 13, 2015

    Hi Vincenzo

    The KM is on the java.net share site below;

    https://java.net/projects/oracledi/downloads/download/ODI/Knowledge%20Modules/KM_IKM_SQL_to_Mongo_DB.xml

    For the target model, I simply made it a File technology model. Then I can define the shape of the datastore. The IKM uses a groovy task - none of the target model technology is really used so you could actually make the target model any technology (the IKM has target tech as undefined). Specifying technology on the IKM helps with default primarily - I left both source and target as undefined and it is a multi connect IKM, I change the name to IKM SQL to Mongo DB to make that a little clearer.

    Cheers

    David


  • Vincenzo Wednesday, January 14, 2015

    Hi David,

    thanks for your reply.

    We added some lines to authenticate with mongo:

    ServerAddress server=new ServerAddress(" <%=odiRef.getOption("MONGO_SERVER")%> ", <%=odiRef.getOption("MONGO_PORT")%>);

    MongoCredential credential=MongoCredential.createMongoCRCredential("usr","<%=odiRef.getOption("MONGODB")%>","pwd".toCharArray());

    MongoClient mongoClient = new MongoClient(server,Arrays.asList(credential));

    Cheers

    Vincenzo


  • David Wednesday, January 14, 2015

    Hi Vincenzo

    That's great! You could use the username and password field on the dataserver and then make the KM more secure and generic. The password would not be shown in the KM or the execution logs.

    <%=odiRef.getInfo("DEST_USER_NAME")%>

    <%=odiRef.getInfo("DEST_PASS")%>

    Then your code could become;

    ServerAddress server=new ServerAddress("<%=odiRef.getOption("MONGO_SERVER")%>", <%=odiRef.getOption("MONGO_PORT")%>);

    MongoCredential credential=MongoCredential.createMongoCRCredential("<%=odiRef.getInfo("DEST_USER_NAME")%>","<%=odiRef.getOption("MONGODB")%>","<%=odiRef.getInfo("DEST_PASS")%>".toCharArray());

    MongoClient mongoClient = new MongoClient(server,Arrays.asList(credential));

    Cheers

    David


  • guest Tuesday, January 20, 2015

    Hi David,

    thank for your reply.

    I made some tests and I realized that for every line I write on mongo are made more connections to the server. I think this depends on how IKM is structured. Could you suggested me some trick to avoid this? Is there a way to initialize the connection once and reuse it?

    I have add mongoClient.close, in this way at least the connections are opened and closed.

    If you had some links where deepening the use of IKM task would be welcome.

    Cheers,

    Vincenzo


  • David Tuesday, January 20, 2015

    Hi Vincenzo

    Yes, you can create a connection in a task and reuse across many subsequent tasks.

    For example is this is the 1st task and it's groovy....

    import com.mongodb.MongoClient;

    mongoClient = new MongoClient("DALLAN-LAP", 27017)

    adb = mongoClient.getDB("<%=odiRef.getOption("MONGODB")%>")

    coll = adb.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")

    Then in the 2nd task which is also groovy you can reference mongoClient and adb for example.... in the next task I may do this and reference adb;

    coll = adb.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")

    coll.drop()

    coll = adb.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")

    Cheers

    David


  • Vincenzo Wednesday, January 21, 2015

    Great,

    I had some problems with groovy syntax :P

    I think that the connections to a new db as mongo could be better structured in ODI defining a new technology in the physical architecture panel (ODI 12), but to do some testing this approach is fine.

    look then your new article ... ;)

    Thanks a lot!

    Vincenzo


  • David Thursday, January 22, 2015

    Hi Vincenzo

    Something like this a little better for defining the data server plus schema? I have defined the collection name in the ODI physical schema and the MongoDB server credentials etc in the ODI data server;
    https://blogs.oracle.com/dataintegration/resource/odi_11g/mongodb_in_odi.jpg

    I tried to draw a parallel with Couchbase also (Bucket is in ODI Physical Schema);
    https://blogs.oracle.com/dataintegration/resource/odi_11g/couchbase_in_odi.jpg

    Cheers

    David


  • Vincenzo Monday, January 26, 2015

    Very well,

    thanks for your screenshot. Gives an idea of what I meant.

    Cheers

    Vincenzo


  • Kenstep Monday, January 26, 2015

    Hi David,

    I use this KM but when i run the interface, It get this error, I think I add mongodb library wrong, plz give me the instruction how to add mongodb SDK to use in KM, and what version of this tutorial.

    org.apache.bsf.BSFException: exception from Groovy: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:

    Initialize: 2: unable to resolve class MongoClient

    @ line 2, column 13.

    MongoClient mongoClient = new MongoClient("DALLAN-LAP", 27017);

    Thanks and Best Regards,


  • David Monday, January 26, 2015

    Hi Kenstep

    If you are on windows copy the Java MongoDB driver into the userlib directory for ODI (on Linux this is under ~/.odi/oracledi/userlib) and restart your ODI client or agent if using agent;

    C:\Users\your_user\AppData\Roaming\odi\oracledi\userlib

    You need to download the Java driver from the web and copy to your host running the ODI studio or agent;

    http://docs.mongodb.org/ecosystem/drivers/java/

    The class MongoClient is provided in the MongoDB java driver download.

    Cheers

    David


  • guest Tuesday, January 27, 2015

    Hi David,

    Thanks for your help, but it's still not work.

    I add mongo API driver to this folder (I'm using windows XP so the path is a little difference)

    http://i79.photobucket.com/albums/j151/khoa_anh1/2_zpsbbnhw66e.jpg

    (I also found some folder lib or driver like: Oracle_ODI1\oracledi.sdk\lib or Oracle_ODI1\oracledi\agent\drivers)

    Then I run the interface on local Agent

    http://i79.photobucket.com/albums/j151/khoa_anh1/3_zpsw4l3ngh6.jpg

    It's still got error

    http://i79.photobucket.com/albums/j151/khoa_anh1/1_zpso4rpxhno.jpg

    I download KM from the link here & do not change anything in code: https://java.net/projects/oracledi/downloads/download/ODI/Knowledge%20Modules/KM_IKM_SQL_to_Mongo_DB.xml

    It's seem that the ODI doesn't load mongo-java-driver-2.12.0 or do I need to specify this file in the file KM_IKM_SQL_to_Mongo_DB.xml.

    do I forgot something ? plz help me. Thank you very much

    Regards,


  • David Tuesday, January 27, 2015

    Hi

    Are you logged in to the OS as the Administrator account? Did you restart the ODI studio after copying the JAR file?

    Cheers

    David


  • Kenstep Tuesday, January 27, 2015

    Hi David,

    I'm sure that I logged in as Administrator (I test on local machine, It's has only administrator account) --> so when i run the ODI, i cann't remove the file mongo-java-driver-2.9.3 (because ODI is reading it).

    I also restart ODI (restart machine) after copy mongo-java-driver-2.9.3

    can you make a tutorial video of how to add new API that is used in KM, or can you describe detail anything you use in this tutorial (version of ODI, version of OS, interface, version of mongo driver)

    Plz support me

    Thanks you mery much,

    Kenstep


  • guest Tuesday, January 27, 2015

    Hi Kenstep,

    if you run the interface on local Agent, maybe you must use <YourODIAgentHome>\odi\agent\lib for mongo jar driver.

    Vincenzo


  • Vincenzo Tuesday, January 27, 2015

    Sorry, I meant if you run the interface on standalone Agent...but now I've seen your screenshot.

    Vincenzo


  • guest Tuesday, January 27, 2015

    Hi David,

    I try to read a collection from mongoDB by ODI, i have 2 systems, windows, and linux.

    On windows system it work, on linux system ODi return this error :

    java.lang.RuntimeException: java.lang.IllegalArgumentException: Could not load JDBC driver class [org.apache.derby.jdbc.EmbeddedDriver]

    at oracle.odi.core.datasource.provider.AbstractDataSourceProvider.configure(AbstractDataSourceProvider.java:106)

    at

    suggestions?

    Thanks you mery much,

    Davide.


  • David Tuesday, January 27, 2015

    Hi

    For the 'Could not load JDBC driver class [org.apache.derby.jdbc.EmbeddedDriver]' you will need to install/copy the Derby/JavaDB JDBC driver jar file to the oracledi/userlib directory and restart studio/agent.

    Cheers

    David


  • David Tuesday, January 27, 2015

    Hi Kenstep

    You are using an old version (you have 2.9.3) of the MongoDB java jar file.

    I think the latest production release right now is below;

    http://central.maven.org/maven2/org/mongodb/mongo-java-driver/2.12.5/mongo-java-driver-2.12.5.jar

    There is a 3.0 version in beta I have also tried. Remove the old version from your userlib directory and install/copy the later one I mention and restart ODI studio and you should be up and running.

    Cheers

    David


  • Kenstep Wednesday, January 28, 2015

    Hi David,

    I used the newest version then It's work, Thank you

    After running, It got this error (after loaded a lot of row from MySQL to MongoDB)

    java.lang.OutOfMemoryError: Not enough storage is available to process this command in tsStartJavaThread (src/jvm/threads/vmthread/lifecycle.c:1097).

    Attempting to allocate 640M bytes

    There is insufficient native memory for the Java

    Runtime Environment to continue.

    Possible reasons:

    The system is out of physical RAM or swap space

    In 32 bit mode, the process size limit was hit

    Possible solutions:

    Reduce memory load on the system

    Increase physical memory or swap space

    Check if swap backing store is full

    Use 64 bit Java on a 64 bit OS

    Decrease Java heap size (-Xmx/-Xms)

    Decrease number of Java threads

    Decrease Java thread stack sizes (-Xss)

    Disable compressed references (-XXcompressedRefs=false)

    at java.lang.Thread.start0(Native Method)

    at java.lang.Thread.start(Thread.java:640)

    at java.util.concurrent.ThreadPoolExecutor.addIfUnderCorePoolSize(ThreadPoolExecutor.java:703)

    at java.util.concurrent.ThreadPoolExecutor.prestartCoreThread(ThreadPoolExecutor.java:1381)

    at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:222)

    at java.util.concurrent.ScheduledThreadPoolExecutor.scheduleAtFixedRate(ScheduledThreadPoolExecutor.java:426)

    at java.util.concurrent.Executors$DelegatedScheduledExecutorService.scheduleAtFixedRate(Executors.java:656)

    at com.mongodb.Mongo.createCursorCleaningService(Mongo.java:760)

    at com.mongodb.Mongo.<init>(Mongo.java:307)

    at com.mongodb.Mongo.<init>(Mongo.java:287)

    at com.mongodb.Mongo.<init>(Mongo.java:283)

    at com.mongodb.MongoClient.<init>(MongoClient.java:178)

    at com.mongodb.MongoClient.<init>(MongoClient.java:153)

    at com.mongodb.MongoClient.<init>(MongoClient.java:142)

    I tested on my virtual local machine, My Real server has 16GB of RAM, will it work well ?, I need to run a large ETL from MySQL to MongoDB.

    Is that not a performence way to load data from MySQL to MongoDB Because it use a lot of Memory ?

    Any Suggestions or can we use a alternative way to do it ?

    Thank you and Best Regards,

    -----------------

    Kenstep


  • David Wednesday, January 28, 2015

    Hi Kenstep

    There is an updated KM on java.net you can try, as Vincenzo pointed out, the one previously posted was connecting on each insert so would consume many resources plus there was a debug loop that I removed to print out content. I updated the KM on java.net and posted there. The KM is on the java.net share site below;

    https://java.net/projects/oracledi/downloads/download/ODI/Knowledge%20Modules/KM_IKM_SQL_to_Mongo_DB.xml

    The other alternative is building a KM 'MySQL to MongoDB batch' that does an unload from MySQL and uses the CSV bulk loader into MongoDB. Let me know how the updated KM goes.

    Cheers

    David


  • guest Thursday, March 19, 2015

    Hi David,

    I'm trying to load some data into mongodb using odi load plan but I had some problems with LogicalOracleDIAgentJEE agent.

    The "mongo-java-driver-2.12.3.jar" java class located into C:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib is visible only by local agent.

    Could you tell me which is the correct directory used by LogicalOracleDIAgentJEE agent?

    In some blog I read that I have to copy the java class mentioned above into LogicalOracleDIAgentJEE agent directory. I need to copy anything else?

    Thanks and Regards,

    Davide


  • guest Thursday, March 19, 2015

    Hi David,

    I'm trying to load some data into mongodb using odi load plan but I had some problems with LogicalOracleDIAgentJEE agent.

    The "mongo-java-driver-2.12.3.jar" java class located into C:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib is visible only by local agent.

    Could you tell me which is the correct directory used by LogicalOracleDIAgentJEE agent?

    In some blog I read that I have to copy the java class mentioned above into LogicalOracleDIAgentJEE agent directory. I need to copy anything else?

    Thanks and Regards,

    Davide


  • Ursula Tuesday, March 24, 2015

    Hi David,

    I'm trying to load some data into mongodb using odi load plan but I had some problems with LogicalOracleDIAgentJEE agent.

    The "mongo-java-driver-2.12.3.jar" java class located into C:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib is visible only by local agent.

    Could you tell me which is the correct directory used by LogicalOracleDIAgentJEE agent?

    In some blog I read that I have to copy the java class mentioned above into LogicalOracleDIAgentJEE agent directory. In case you think is the correct solution, do I need to copy anything else?

    Thanks and Regards,

    Ursula


  • David Tuesday, March 24, 2015

    Hi Ursula

    There are different locations for studio and agent, for standalone agent copy into YourODIAgentHome\odi\agent\lib

    Cheers

    David


  • Ursula Wednesday, March 25, 2015

    Hi David,

    thanks for your reply but the I hadn't solved yet.

    I copied the following files:

    additional_path.txt(modified with the correct path for derby.jar file)

    derby.jar

    mongo_table.class

    mongo_table.java

    mongo_table.jar

    mongo-java-driver-2.12.3

    vtis-example.jar

    into the directory suggested by you(in my case C:\Oracle\Middleware\Oracle_Home\odi\agent\lib), but when I tried to run the ODI plan using standalone Agent I got this error:

    ODI-1590: The execution of the script failed.

    Caused By: org.apache.bsf.BSFException: exception from Groovy: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:

    _: 3: unable to resolve class com.mongodb.ServerAddress

    @ line 3, column 1.

    import com.mongodb.ServerAddress;

    ^

    _: 4: unable to resolve class com.mongodb.MongoCredential

    @ line 4, column 1.

    import com.mongodb.MongoCredential;

    ^

    _: 2: unable to resolve class com.mongodb.MongoClient

    @ line 2, column 1.

    import com.mongodb.MongoClient;

    ^

    I tried also, to restart ODI services after coping files but it didn't affect ODI execution. Any suggestion?

    Thanks and Regards,

    Ursula


  • David Wednesday, March 25, 2015

    Copy to the [domain]/lib directory for your agent, you can see an example in bloe below;

    http://oracle-dba.yoga-supreme.com/?p=1676

    Cheers

    David


  • Ursula Thursday, March 26, 2015

    Hi David,

    Thanks a lot for your support. After doing the steps suggested by you, I solved the problem I had. Now I can execute a mapping that read an oracle table and write its data into a mongodb collection using an ODI load plan.

    I have another problem with a mapping that performs the opposite operation, that is read a collection from mongodb and write it into an oracle table. It works fine using local agent, but I have still problem if I try to include it into a load plan and execute it using standalone agent.

    The error I get is the following one:

    oracle.odi.runtime.agent.exception.ExecutionEngineException: java.sql.SQLException: The exception 'java.sql.SQLException: Unimplemented method: notImplemented' was thrown while evaluating an expression.

    at oracle.odi.runtime.agent.execution.sql.concurrent.FastJDBCRecordSet.<init>(FastJDBCRecordSet.java:138)

    at oracle.odi.runtime.agent.execution.sql.SQLDataProvider.getJDBCRecordSet(SQLDataProvider.java:133)

    at oracle.odi.runtime.agent.execution.sql.SQLDataProvider.readData(SQLDataProvider.java:105)

    at oracle.odi.runtime.agent.execution.sql.SQLDataProvider.readData(SQLDataProvider.java:38)

    at oracle.odi.runtime.agent.execution.Executor.handleDataMovementTask(Executor.java:309)

    at oracle.odi.runtime.agent.execution.DataMovementTaskExecutionHandler.handleTask(DataMovementTaskExecutionHandler.java:51)

    at oracle.odi.runtime.agent.execution.SessionTask.processTask(SessionTask.java:189)

    at oracle.odi.runtime.agent.execution.SessionTask.doExecuteTask(SessionTask.java:111)

    at oracle.odi.runtime.agent.execution.AbstractSessionTask.execute(AbstractSessionTask.java:864)

    at oracle.odi.runtime.agent.execution.SessionExecutor$SerialTrain.runTasks(SessionExecutor.java:2024)

    at oracle.odi.runtime.agent.execution.SessionExecutor.executeSession(SessionExecutor.java:562)

    at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:718)

    at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:611)

    at oracle.odi.core.persistence.dwgobject.DwgObjectTemplate.execute(DwgObjectTemplate.java:203)

    ...

    Thanks and Regards,

    Ursula.


  • David Friday, March 27, 2015

    Hi Ursula

    Over on this blog post is the same error and I think you might need the code I pointed to there. See blog comments in https://blogs.oracle.com/dataintegration/entry/odi_java_table_function_for

    The latest Derby code has changes that make the older Sun sample table function support classes fail, here is a zip of the vtis-example.jar, unzip this zip into the odi/userlib and it should work with the old and new Derby releases.

    https://blogs.oracle.com/dataintegration/resource/odi_12c/vtis-example.zip

    Hope that gets you working, its been a while since I looked at that stuff.

    Cheers

    David


  • Ursula Friday, March 27, 2015

    Thanks David,

    I solved the problem, now it works fine both in reading and in writing from/to MongoDB using ODI load plan!

    Thanks a lot for your precious support!

    Best Regards,

    Ursula.


Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.