ODI - Loading MongoDB (API as Target)

In this post I will show how to load documents into a MongoDB collection. The interface design looks just like all other ODI interfaces, but behind the scenes the KM configured in the physical design uses the MongoDB SDK (see MongoDB SDK here) to insert the documents. The target datastore below represents a MongoDB document, the columns are the keys in the document. Each row is inserted as a document, and each column is a key, the column value is the value. The ENAME value below is shown as a complex JSON value.

The IKM I have used is a multi-connect IKM, the source is a SQL data source and the target uses a MongoDB groovy command. The heart of the IKM to insert the documents into the collection has a SQL select as the source command and the following groovy code for the target command;

  1. import com.mongodb.*
  2. MongoClient mongoClient = new MongoClient(" <%=odiRef.getOption("MONGO_SERVER")%> ", <%=odiRef.getOption("MONGO_PORT")%> );
  3. DB db = mongoClient.getDB("<%=odiRef.getOption("MONGODB")%>");
  4. DBCollection coll = db.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")
  5. BasicDBObject doc = new BasicDBObject();

  6. <%=odiRef.getColList(" ", "doc.put(\u0022[COL_NAME]\u0022, \u0022#[CX_COL_NAME]\u0022);", " \n ", "", "((INS and !TRG) and REW)")%>

  7. coll.insert(doc);

The odiRef.getColList method call above generates code for every target column, the code performs a doc.put invocation to add the key-value pairs into the document. For example this is the code generated and executed based on the interface design discussed above;

  1. import com.mongodb.*
  2. MongoClient mongoClient = new MongoClient("DALLAN-SVR", 27017);
  3. DB db = mongoClient.getDB("test");
  4. DBCollection coll = db.getCollection("testCollection")
  5. BasicDBObject doc = new BasicDBObject();

  6.  doc.put("EMPNO", "#EMPNO"); 
  7.  doc.put("ENAME", "#ENAME"); 
  8.  doc.put("JOB", "#JOB"); 
  9.  doc.put("MGR", "#MGR"); 
  10.  doc.put("HIREDATE", "#HIREDATE"); 
  11.  doc.put("SAL", "#SAL"); 
  12.  doc.put("COMM", "#COMM"); 
  13.  doc.put("DEPTNO", "#DEPTNO");

  14. coll.insert(doc);

 This is a simple illustration of how to load documents into MongoDB. We can go into the MongoDB command line and execute the command to see all objects in the collection and get the list of documents, below you can see a preview of executing db.testCollection.find()

  • { "_id" : ObjectId("51dc3ded6c4b9a5bd07d68a6"), "EMPNO" : "7369", "ENAME" : "{ NAME : SMITH, DESCR : 22 }", "JOB" : "CLERK", "MGR" : "7902", "HIREDATE" : "1980-12-17 00:00:00.0", "SAL" : "801", "COMM" : "", "DEPTNO" : "20" }
  • { "_id" : ObjectId("51dc3ded6c4b9a5bd07d68a7"), "EMPNO" : "7499", "ENAME" : "{ NAME : ALLEN, DESCR : 22 }", "JOB" : "SALESMAN", "MGR" : "7698", "HIREDATE" : "1981-02-20 00:00:00.0", "SAL" : "1601", "COMM" : "300", "DEPTNO" : "30" }

 You can see the key:value pairs in our document. For those MongoDB gurus, you'll notice in the 'complex' data illustration, this is really a string and not a MongoDB complex object - that discussion is for another day.

This post is not just about MongoDB, but also a useful post on how to integrate APIs as a target in a data flow. 

Comments:

Dear David,

thank you very much for sharing this article.

Do i get this right - first of all i'd have to install the mongo sdk on the machine where i have the odi running - vice versa - the agents?

my best wishes

/J

Posted by Jan Christian Leonhard on January 07, 2014 at 04:18 AM PST #

Hi Jan

Yes, the MongoDB API (or any other API you might use from KMs, procedures etc) would have to be available for the agent executing the code.

Cheers
David

Posted by David on January 07, 2014 at 08:28 AM PST #

Dear David,
I have some unclear points:
- when you create a target datastore which is the technology applied to?
- you use a multi-connect IKM?
- if i create a new multi-connect IKM which are source and target technologies that i have to specify?
- which task/command on target and source did you create on IKM?
- can you share us your IKM with a link?

Thanks and Regards
Vincenzo

Posted by Vincenzo on January 13, 2015 at 01:13 AM PST #

Hi Vincenzo

The KM is on the java.net share site below;
https://java.net/projects/oracledi/downloads/download/ODI/Knowledge%20Modules/KM_IKM_SQL_to_Mongo_DB.xml

For the target model, I simply made it a File technology model. Then I can define the shape of the datastore. The IKM uses a groovy task - none of the target model technology is really used so you could actually make the target model any technology (the IKM has target tech as undefined). Specifying technology on the IKM helps with default primarily - I left both source and target as undefined and it is a multi connect IKM, I change the name to IKM SQL to Mongo DB to make that a little clearer.

Cheers
David

Posted by David on January 13, 2015 at 09:00 AM PST #

Hi David,
thanks for your reply.

We added some lines to authenticate with mongo:

ServerAddress server=new ServerAddress(" <%=odiRef.getOption("MONGO_SERVER")%> ", <%=odiRef.getOption("MONGO_PORT")%>);
MongoCredential credential=MongoCredential.createMongoCRCredential("usr","<%=odiRef.getOption("MONGODB")%>","pwd".toCharArray());
MongoClient mongoClient = new MongoClient(server,Arrays.asList(credential));

Cheers
Vincenzo

Posted by Vincenzo on January 14, 2015 at 03:45 AM PST #

Hi Vincenzo

That's great! You could use the username and password field on the dataserver and then make the KM more secure and generic. The password would not be shown in the KM or the execution logs.

<%=odiRef.getInfo("DEST_USER_NAME")%>
<%=odiRef.getInfo("DEST_PASS")%>

Then your code could become;

ServerAddress server=new ServerAddress("<%=odiRef.getOption("MONGO_SERVER")%>", <%=odiRef.getOption("MONGO_PORT")%>);
MongoCredential credential=MongoCredential.createMongoCRCredential("<%=odiRef.getInfo("DEST_USER_NAME")%>","<%=odiRef.getOption("MONGODB")%>","<%=odiRef.getInfo("DEST_PASS")%>".toCharArray());
MongoClient mongoClient = new MongoClient(server,Arrays.asList(credential));

Cheers
David

Posted by David on January 14, 2015 at 07:50 AM PST #

Hi David,
thank for your reply.
I made some tests and I realized that for every line I write on mongo are made more connections to the server. I think this depends on how IKM is structured. Could you suggested me some trick to avoid this? Is there a way to initialize the connection once and reuse it?
I have add mongoClient.close, in this way at least the connections are opened and closed.
If you had some links where deepening the use of IKM task would be welcome.

Cheers,
Vincenzo

Posted by guest on January 20, 2015 at 01:19 AM PST #

Hi Vincenzo

Yes, you can create a connection in a task and reuse across many subsequent tasks.

For example is this is the 1st task and it's groovy....
import com.mongodb.MongoClient;
mongoClient = new MongoClient("DALLAN-LAP", 27017)
adb = mongoClient.getDB("<%=odiRef.getOption("MONGODB")%>")
coll = adb.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")

Then in the 2nd task which is also groovy you can reference mongoClient and adb for example.... in the next task I may do this and reference adb;

coll = adb.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")
coll.drop()
coll = adb.getCollection("<%=odiRef.getOption("MONGOCOLL")%>")

Cheers
David

Posted by David on January 20, 2015 at 01:14 PM PST #

Great,
I had some problems with groovy syntax :P

I think that the connections to a new db as mongo could be better structured in ODI defining a new technology in the physical architecture panel (ODI 12), but to do some testing this approach is fine.

look then your new article ... ;)

Thanks a lot!
Vincenzo

Posted by Vincenzo on January 21, 2015 at 01:28 AM PST #

Hi Vincenzo

Something like this a little better for defining the data server plus schema? I have defined the collection name in the ODI physical schema and the MongoDB server credentials etc in the ODI data server;
https://blogs.oracle.com/dataintegration/resource/odi_11g/mongodb_in_odi.jpg
I tried to draw a parallel with Couchbase also (Bucket is in ODI Physical Schema);
https://blogs.oracle.com/dataintegration/resource/odi_11g/couchbase_in_odi.jpg

Cheers
David

Posted by David on January 22, 2015 at 10:38 AM PST #

Very well,
thanks for your screenshot. Gives an idea of what I meant.

Cheers
Vincenzo

Posted by Vincenzo on January 26, 2015 at 03:33 AM PST #

Hi David,

I use this KM but when i run the interface, It get this error, I think I add mongodb library wrong, plz give me the instruction how to add mongodb SDK to use in KM, and what version of this tutorial.

org.apache.bsf.BSFException: exception from Groovy: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
Initialize: 2: unable to resolve class MongoClient
@ line 2, column 13.
MongoClient mongoClient = new MongoClient("DALLAN-LAP", 27017);

Thanks and Best Regards,

Posted by Kenstep on January 26, 2015 at 08:35 AM PST #

Hi Kenstep

If you are on windows copy the Java MongoDB driver into the userlib directory for ODI (on Linux this is under ~/.odi/oracledi/userlib) and restart your ODI client or agent if using agent;
C:\Users\your_user\AppData\Roaming\odi\oracledi\userlib

You need to download the Java driver from the web and copy to your host running the ODI studio or agent;
http://docs.mongodb.org/ecosystem/drivers/java/

The class MongoClient is provided in the MongoDB java driver download.
Cheers
David

Posted by David on January 26, 2015 at 08:53 AM PST #

Hi David,

Thanks for your help, but it's still not work.

I add mongo API driver to this folder (I'm using windows XP so the path is a little difference)
http://i79.photobucket.com/albums/j151/khoa_anh1/2_zpsbbnhw66e.jpg

(I also found some folder lib or driver like: Oracle_ODI1\oracledi.sdk\lib or Oracle_ODI1\oracledi\agent\drivers)

Then I run the interface on local Agent
http://i79.photobucket.com/albums/j151/khoa_anh1/3_zpsw4l3ngh6.jpg

It's still got error
http://i79.photobucket.com/albums/j151/khoa_anh1/1_zpso4rpxhno.jpg

I download KM from the link here & do not change anything in code: https://java.net/projects/oracledi/downloads/download/ODI/Knowledge%20Modules/KM_IKM_SQL_to_Mongo_DB.xml

It's seem that the ODI doesn't load mongo-java-driver-2.12.0 or do I need to specify this file in the file KM_IKM_SQL_to_Mongo_DB.xml.

do I forgot something ? plz help me. Thank you very much

Regards,

Posted by guest on January 26, 2015 at 08:38 PM PST #

Hi

Are you logged in to the OS as the Administrator account? Did you restart the ODI studio after copying the JAR file?

Cheers
David

Posted by David on January 27, 2015 at 07:38 AM PST #

Hi David,

I'm sure that I logged in as Administrator (I test on local machine, It's has only administrator account) --> so when i run the ODI, i cann't remove the file mongo-java-driver-2.9.3 (because ODI is reading it).

I also restart ODI (restart machine) after copy mongo-java-driver-2.9.3

can you make a tutorial video of how to add new API that is used in KM, or can you describe detail anything you use in this tutorial (version of ODI, version of OS, interface, version of mongo driver)

Plz support me

Thanks you mery much,
Kenstep

Posted by Kenstep on January 27, 2015 at 07:54 AM PST #

Hi Kenstep,
if you run the interface on local Agent, maybe you must use <YourODIAgentHome>\odi\agent\lib for mongo jar driver.

Vincenzo

Posted by guest on January 27, 2015 at 08:28 AM PST #

Sorry, I meant if you run the interface on standalone Agent...but now I've seen your screenshot.

Vincenzo

Posted by Vincenzo on January 27, 2015 at 08:35 AM PST #

Hi David,

I try to read a collection from mongoDB by ODI, i have 2 systems, windows, and linux.
On windows system it work, on linux system ODi return this error :

java.lang.RuntimeException: java.lang.IllegalArgumentException: Could not load JDBC driver class [org.apache.derby.jdbc.EmbeddedDriver]
at oracle.odi.core.datasource.provider.AbstractDataSourceProvider.configure(AbstractDataSourceProvider.java:106)
at

suggestions?
Thanks you mery much,
Davide.

Posted by guest on January 27, 2015 at 09:05 AM PST #

Hi
For the 'Could not load JDBC driver class [org.apache.derby.jdbc.EmbeddedDriver]' you will need to install/copy the Derby/JavaDB JDBC driver jar file to the oracledi/userlib directory and restart studio/agent.

Cheers
David

Posted by David on January 27, 2015 at 11:01 AM PST #

Hi Kenstep
You are using an old version (you have 2.9.3) of the MongoDB java jar file.

I think the latest production release right now is below;
http://central.maven.org/maven2/org/mongodb/mongo-java-driver/2.12.5/mongo-java-driver-2.12.5.jar

There is a 3.0 version in beta I have also tried. Remove the old version from your userlib directory and install/copy the later one I mention and restart ODI studio and you should be up and running.

Cheers
David

Posted by David on January 27, 2015 at 11:16 AM PST #

Hi David,

I used the newest version then It's work, Thank you

After running, It got this error (after loaded a lot of row from MySQL to MongoDB)

java.lang.OutOfMemoryError: Not enough storage is available to process this command in tsStartJavaThread (src/jvm/threads/vmthread/lifecycle.c:1097).

Attempting to allocate 640M bytes

There is insufficient native memory for the Java
Runtime Environment to continue.

Possible reasons:
The system is out of physical RAM or swap space
In 32 bit mode, the process size limit was hit

Possible solutions:
Reduce memory load on the system
Increase physical memory or swap space
Check if swap backing store is full
Use 64 bit Java on a 64 bit OS
Decrease Java heap size (-Xmx/-Xms)
Decrease number of Java threads
Decrease Java thread stack sizes (-Xss)
Disable compressed references (-XXcompressedRefs=false)

at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at java.util.concurrent.ThreadPoolExecutor.addIfUnderCorePoolSize(ThreadPoolExecutor.java:703)
at java.util.concurrent.ThreadPoolExecutor.prestartCoreThread(ThreadPoolExecutor.java:1381)
at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:222)
at java.util.concurrent.ScheduledThreadPoolExecutor.scheduleAtFixedRate(ScheduledThreadPoolExecutor.java:426)
at java.util.concurrent.Executors$DelegatedScheduledExecutorService.scheduleAtFixedRate(Executors.java:656)
at com.mongodb.Mongo.createCursorCleaningService(Mongo.java:760)
at com.mongodb.Mongo.<init>(Mongo.java:307)
at com.mongodb.Mongo.<init>(Mongo.java:287)
at com.mongodb.Mongo.<init>(Mongo.java:283)
at com.mongodb.MongoClient.<init>(MongoClient.java:178)
at com.mongodb.MongoClient.<init>(MongoClient.java:153)
at com.mongodb.MongoClient.<init>(MongoClient.java:142)

I tested on my virtual local machine, My Real server has 16GB of RAM, will it work well ?, I need to run a large ETL from MySQL to MongoDB.

Is that not a performence way to load data from MySQL to MongoDB Because it use a lot of Memory ?

Any Suggestions or can we use a alternative way to do it ?

Thank you and Best Regards,
-----------------
Kenstep

Posted by Kenstep on January 27, 2015 at 07:33 PM PST #

Hi Kenstep

There is an updated KM on java.net you can try, as Vincenzo pointed out, the one previously posted was connecting on each insert so would consume many resources plus there was a debug loop that I removed to print out content. I updated the KM on java.net and posted there. The KM is on the java.net share site below;

https://java.net/projects/oracledi/downloads/download/ODI/Knowledge%20Modules/KM_IKM_SQL_to_Mongo_DB.xml

The other alternative is building a KM 'MySQL to MongoDB batch' that does an unload from MySQL and uses the CSV bulk loader into MongoDB. Let me know how the updated KM goes.

Cheers
David

Posted by David on January 28, 2015 at 08:05 AM PST #

Hi David,
I'm trying to load some data into mongodb using odi load plan but I had some problems with LogicalOracleDIAgentJEE agent.
The "mongo-java-driver-2.12.3.jar" java class located into C:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib is visible only by local agent.
Could you tell me which is the correct directory used by LogicalOracleDIAgentJEE agent?
In some blog I read that I have to copy the java class mentioned above into LogicalOracleDIAgentJEE agent directory. I need to copy anything else?

Thanks and Regards,
Davide

Posted by guest on March 19, 2015 at 01:52 AM PDT #

Hi David,
I'm trying to load some data into mongodb using odi load plan but I had some problems with LogicalOracleDIAgentJEE agent.
The "mongo-java-driver-2.12.3.jar" java class located into C:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib is visible only by local agent.
Could you tell me which is the correct directory used by LogicalOracleDIAgentJEE agent?
In some blog I read that I have to copy the java class mentioned above into LogicalOracleDIAgentJEE agent directory. I need to copy anything else?

Thanks and Regards,
Davide

Posted by guest on March 19, 2015 at 03:37 AM PDT #

Hi David,
I'm trying to load some data into mongodb using odi load plan but I had some problems with LogicalOracleDIAgentJEE agent.
The "mongo-java-driver-2.12.3.jar" java class located into C:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib is visible only by local agent.
Could you tell me which is the correct directory used by LogicalOracleDIAgentJEE agent?
In some blog I read that I have to copy the java class mentioned above into LogicalOracleDIAgentJEE agent directory. In case you think is the correct solution, do I need to copy anything else?
Thanks and Regards,
Ursula

Posted by Ursula on March 24, 2015 at 03:27 AM PDT #

Hi Ursula

There are different locations for studio and agent, for standalone agent copy into YourODIAgentHome\odi\agent\lib

Cheers
David

Posted by David on March 24, 2015 at 08:01 AM PDT #

Hi David,
thanks for your reply but the I hadn't solved yet.
I copied the following files:
additional_path.txt(modified with the correct path for derby.jar file)
derby.jar
mongo_table.class
mongo_table.java
mongo_table.jar
mongo-java-driver-2.12.3
vtis-example.jar

into the directory suggested by you(in my case C:\Oracle\Middleware\Oracle_Home\odi\agent\lib), but when I tried to run the ODI plan using standalone Agent I got this error:

ODI-1590: The execution of the script failed.
Caused By: org.apache.bsf.BSFException: exception from Groovy: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
_: 3: unable to resolve class com.mongodb.ServerAddress
@ line 3, column 1.
import com.mongodb.ServerAddress;
^

_: 4: unable to resolve class com.mongodb.MongoCredential
@ line 4, column 1.
import com.mongodb.MongoCredential;
^

_: 2: unable to resolve class com.mongodb.MongoClient
@ line 2, column 1.
import com.mongodb.MongoClient;
^

I tried also, to restart ODI services after coping files but it didn't affect ODI execution. Any suggestion?

Thanks and Regards,
Ursula

Posted by Ursula on March 25, 2015 at 03:30 AM PDT #

Copy to the [domain]/lib directory for your agent, you can see an example in bloe below;
http://oracle-dba.yoga-supreme.com/?p=1676

Cheers
David

Posted by David on March 25, 2015 at 01:45 PM PDT #

Hi David,
Thanks a lot for your support. After doing the steps suggested by you, I solved the problem I had. Now I can execute a mapping that read an oracle table and write its data into a mongodb collection using an ODI load plan.
I have another problem with a mapping that performs the opposite operation, that is read a collection from mongodb and write it into an oracle table. It works fine using local agent, but I have still problem if I try to include it into a load plan and execute it using standalone agent.
The error I get is the following one:

oracle.odi.runtime.agent.exception.ExecutionEngineException: java.sql.SQLException: The exception 'java.sql.SQLException: Unimplemented method: notImplemented' was thrown while evaluating an expression.
at oracle.odi.runtime.agent.execution.sql.concurrent.FastJDBCRecordSet.<init>(FastJDBCRecordSet.java:138)
at oracle.odi.runtime.agent.execution.sql.SQLDataProvider.getJDBCRecordSet(SQLDataProvider.java:133)
at oracle.odi.runtime.agent.execution.sql.SQLDataProvider.readData(SQLDataProvider.java:105)
at oracle.odi.runtime.agent.execution.sql.SQLDataProvider.readData(SQLDataProvider.java:38)
at oracle.odi.runtime.agent.execution.Executor.handleDataMovementTask(Executor.java:309)
at oracle.odi.runtime.agent.execution.DataMovementTaskExecutionHandler.handleTask(DataMovementTaskExecutionHandler.java:51)
at oracle.odi.runtime.agent.execution.SessionTask.processTask(SessionTask.java:189)
at oracle.odi.runtime.agent.execution.SessionTask.doExecuteTask(SessionTask.java:111)
at oracle.odi.runtime.agent.execution.AbstractSessionTask.execute(AbstractSessionTask.java:864)
at oracle.odi.runtime.agent.execution.SessionExecutor$SerialTrain.runTasks(SessionExecutor.java:2024)
at oracle.odi.runtime.agent.execution.SessionExecutor.executeSession(SessionExecutor.java:562)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:718)
at oracle.odi.runtime.agent.processor.TaskExecutorAgentRequestProcessor$1.doAction(TaskExecutorAgentRequestProcessor.java:611)
at oracle.odi.core.persistence.dwgobject.DwgObjectTemplate.execute(DwgObjectTemplate.java:203)
...

Thanks and Regards,
Ursula.

Posted by Ursula on March 26, 2015 at 08:01 AM PDT #

Hi Ursula
Over on this blog post is the same error and I think you might need the code I pointed to there. See blog comments in https://blogs.oracle.com/dataintegration/entry/odi_java_table_function_for

The latest Derby code has changes that make the older Sun sample table function support classes fail, here is a zip of the vtis-example.jar, unzip this zip into the odi/userlib and it should work with the old and new Derby releases.

https://blogs.oracle.com/dataintegration/resource/odi_12c/vtis-example.zip

Hope that gets you working, its been a while since I looked at that stuff.
Cheers
David

Posted by David on March 26, 2015 at 08:19 PM PDT #

Thanks David,
I solved the problem, now it works fine both in reading and in writing from/to MongoDB using ODI load plan!

Thanks a lot for your precious support!
Best Regards,
Ursula.

Posted by Ursula on March 27, 2015 at 01:58 AM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

Learn the latest trends, use cases, product updates, and customer success examples for Oracle's data integration products-- including Oracle Data Integrator, Oracle GoldenGate and Oracle Enterprise Data Quality

Search

Archives
« September 2015
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today