X

The Oracle NoSQL Database Blog covers all things Oracle NoSQL Database. On-Prem, Cloud and more.

  • August 12, 2015

Migrating/Importing MongoDB Documents into Nosql Tables

Summary 

This paper presents a how to to migrate documents in MongoDB's collections into tables and child tables in Oracle Nosql. The idea is to take as example a relatively complex document, define a mapping file to map the basic fields of the document into a table,  and to map the embedded collections of the document into child tables. The java class that we provide will generate the Nosql structure of the tables and insert the data. The set of components of each element of the collection is inserted in the same operation into the store.

A Json example 

 Let's use an example of a family item from a MongoDB collection:

{ "_id" : ObjectId("55c4c6576e4ae64b5997d39e"),

"firstname" : "lena",

"lastname" : "clark",

"gender" : "W",

"childrens" : [

{ "name" : "bob",

"schools" : [ {"street" : "90, pine street","id" : "Saint volume"},

{"street" : "134, mice street","id" : "Saint appearance"}

],

"hobbies" : ["soccer","photo"]

},

{ "name" : "joseph",

"schools" : [ {"street" : "168, merely street","id" : "Saint slipped"} ],

"hobbies" : ["tennis","piano"]

},

{ "name" : "sandy",

"schools" : [{"street" : "227, thread street","id" : "Saint discovery"}],

"hobbies" : ["football","guitar"]

}

]

}

In this case the main document has the the following fields : '_id', 'firstname', 'lastname', 'gender' and childrens'. 'childrens' is an embedded collection, containing 'name', 'schools' and 'hobbies'. 'schools' is again a nested collection with 'street and 'id' fields and 'hobbies' is a list. We can map them into several nested tables:

  • the main table represents FAMILY,
  • FAMILY.CHILDREN  gets 'childrens' items and
  • FAMILY.CHILDREN.SCHOOLS and FAMILY.CHILDREN.HOBBIES store schools and hobbies information.

The mapping file 

The mapping file, is a properties file, it contains also connect information to access MongoDB database and Nosql store:

  • the name of the Nosql store: Nosql.Store=kvstore
  • the host and port of the nosql store: Nosql.URL=bigdatalite:5000
  • the mongodb host: MongoDB.host=localhost
  • the mongodb port: MongoDB.port=27017
  • the mongodb database: MongoDB.DB=gadb

Mapping principles

Define the main collection, its fields and its main table mapping

For each field define its type and its mapping value. Note that this can be a recursive step.

For each table define the primary key index components. 

Mapping extracts

Mapping collection and table with its primary keys

  • mongo.collection=family
  • mongo.collection.map=FAMILY
  • FAMILY.indexcols=LASTNAME,FIRSTNAME
indexcols is the keyword to introduce the comma separated list of columns of the key, order is important. The indexcol prefix is a Nosql table name

Family fields

  • family.fields=lastname,firstname,gender,childrens
  • family.firstname.type=string
  • family.firstname.map=FIRSTNAME
  • family.childrens.type=collection
  • family.childrens.map=CHILDREN
fields is the keyword to introduce the comma separated list of fields of a collection. For each field type corresponds to the type of a column in a Nosql table (string, integer, long, float, double or boolean are accepted). Two other values are used: array or collection. array is for lists of basic types, collection is for more complex collections. When  type is a basic type, map indicates a column of the mapped table, when the type is array or collection, map introduces a new table.

Children mappings

  • CHILDREN.indexcols=NAME
  • childrens.fields=name,schools,hobbies
  • childrens.name.type=string
  • childrens.name.map=NAME
  • childrens.schools.type=collection
  • childrens.schools.map=SCHOOLS
  • childrens.hobbies.type=array
  • childrens.hobbies.map=HOBBIES

School mappings 

  • schools.fields=street,id
  • schools.indexcols=ID
street and id are basic string fields, their type and map are not shown.

Hobbies mappings

  • hobbies.fields=hobbies
  • hobbies.hobbies.type=string
  • hobbies.hobbies.map=HOBBY
  • HOBBIES.indexcols=HOBBY

children.hobbies is an array of strings mapped to child table HOBBIES, there is no name in the main collection for the field, I've chosen to use hobbies (the name of the collection) as the field name to be able to define a mapping. 

Tables generated

Get child tables from FAMILY   

kv-> show tables -parent FAMILY

Tables: 

FAMILY.CHILDREN

 FAMILY.CHILDREN.HOBBIES

 FAMILY.CHILDREN.SCHOOLS

Get table indexes

kv-> show indexes -table FAMILY

Indexes on table FAMILY

FAMILYIndex (LASTNAME, FIRSTNAME)

kv-> show indexes -table FAMILY.CHILDREN

Indexes on table FAMILY.CHILDREN

CHILDRENIndex (NAME)

kv-> show indexes -table FAMILY.CHILDREN.SCHOOLS

Indexes on table FAMILY.CHILDREN.SCHOOLS

SCHOOLSIndex (ID)

kv-> show indexes -table FAMILY.CHILDREN.HOBBIES

Indexes on table FAMILY.CHILDREN.HOBBIES

HOBBIESIndex (HOBBY) 

Getting data from tables

Get our example family

kv-> get table -name FAMILY -field LASTNAME -value "clark" -field FIRSTNAME -value "lena"

{"FIRSTNAME":"lena","LASTNAME":"clark","GENDER":"W"}

Get our family children

kv-> get table -name FAMILY.CHILDREN -field LASTNAME -value "clark" -field FIRSTNAME -value "lena"

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"bob"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"joseph"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"sandy"}

Get our family children schools

kv-> get table -name FAMILY.CHILDREN.SCHOOLS -field LASTNAME -value "clark" -field FIRSTNAME -value "lena"

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"bob","STREET":"134, mice street","ID":"Saint appearance"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"bob","STREET":"90, pine street","ID":"Saint volume"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"joseph","STREET":"168, merely street","ID":"Saint slipped"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"sandy","STREET":"227, thread street","ID":"Saint discovery"} 

Get our family children hoobies

kv-> get table -name FAMILY.CHILDREN.HOBBIES -field LASTNAME -value "clark" -field FIRSTNAME -value "lena"

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"bob","HOBBY":"photo"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"bob","HOBBY":"soccer"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"joseph","HOBBY":"piano"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"joseph","HOBBY":"tennis"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"sandy","HOBBY":"football"}

{"LASTNAME":"clark","FIRSTNAME":"lena","NAME":"sandy","HOBBY":"guitar"}

Running the example

jar files needed

MongoJavaDriver: mongo-java-driver-3.0.0.jar (this is the version we have used)

Nosql client: kvclient.jar (it should be a version containing tables 12.3+, it had been tested with 3.3.4)

main java class : mongoloadnosql.MongoDB2Nosql (java source code is here)

Parameters

The tool has 5 parameters:

  • -limit <integer>, number of documents to load, 0 is for all the documents
  • -skip <integer>, offset of the first document to load, similar to the skip function in MongoDB, 0 means no skip of documents
  • -mapfile <file>, properties file to load
  • -create [true|<anything else than true>], if true the Nosql API functions for creation of tables and indexes are issued
  • -insert [true|<anything else than true>], if true the Nosql API functions  for insertion are issued

Launch command

This command creates nosql tables and indexes if there do not exists, and insert the whole collection items using the properties file  mappingfam.properties:

java -classpath <tool_dir>/classes:<KVHOME>/lib/kvclient.jar:<MONGODB_CLASSPATH>/mongo-java-driver-3.0.0.jar mongoloadnosql.Mongo2Nosql -limit 0 -skip 0 -mapfile mappingfam.properties -create true -insert true

Caveats and warnings

Actually there is no possibility to map  MongoDB references (neither referenced relationships nor DBRefs)

Fields should be presented in the order defined by their primary keys (lastname,firstname) instead of (firstname,lastname) 

The java code attached is just  to illustrate how to  import/migrate MongoDB data into Nosql tables in an efficient and consistent way, it has not been tested in all kinds of situations and it is not intended to be free of bugs.

Bonus

The following mapping file, allows to map MongoDB documents having the structure of a post.  

In this case there is an embedded object "origine" which is defined as: {"owner" : "gus","site" : "recent_safety.com"}) which is not a collection.

There is no primary key other than the MongoDB '_id' field.

Enjoy trying this example also.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha
Oracle

Integrated Cloud Applications & Platform Services