Migrating R models from Development to Production

Users of Oracle R Enterprise (ORE) embedded R execution will often calibrate R models in a development environment and promote the final models to a production database. In most cases, the development and production databases are distinct, and model serialization between databases is not effective if the underlying tables are not identical.  To facilitate the migration process, ORE includes scripts to transport the ORE system schema, RQSYS, and ORE objects such as tables, scripts, and models from one database to another.

Migration Scripts

The ORE migration utility scripts and documentation reside in $ORACLE_HOME/R/migration after the ORE server component is installed. Navigate to the server directory and change to the migration subdirectory:

/oreserver_install_dir/server/migration

The migration subdirectory contains a README and the following subdirectories:

exp: Contains a script to migrate RQSYS and all ORE user data to a dump file.
imp: Contains a script for importing ORE user data from the dump file created by the script in exp.
oreuser: Contains scripts for exporting and importing data for a specific ORE user.


Instructions for running the migration scripts are provided in the README. Note that the current version of the migration scripts requires the source and target environments to run the same versions of Oracle Database and Oracle R Enterprise.
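
Because the versions must match, it can help to compare the version strings from both environments before exporting. The following is a minimal shell sketch of such a pre-flight check. The function name versions_match is hypothetical and is not part of the ORE migration utility; how you obtain the four version strings is left open.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not part of the ORE migration scripts).
# Arguments: source DB version, source ORE version,
#            target DB version, target ORE version.
versions_match() {
    if [ "$1" != "$3" ]; then
        echo "Oracle Database versions differ: $1 vs $3"
        return 1
    fi
    if [ "$2" != "$4" ]; then
        echo "Oracle R Enterprise versions differ: $2 vs $4"
        return 1
    fi
    echo "Versions match: Oracle Database $1, ORE $2"
}
```

For the example that follows, the call would be: versions_match 12.1.0.2 1.5.0 12.1.0.2 1.5.0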

Migration Example

Here's an example that migrates models from a single ORE 1.5.0 schema in a local Oracle 12.1.0.2 database to an ORE 1.5.0 schema in a remote Oracle 12.1.0.2 database.  In this case, the databases reside on different servers. 

Create model and predictions:

R> ore.create(iris, "IRIS")
R> mod <- ore.randomForest(Species~., IRIS)
R> mod

Call:
 ore.randomForest(formula = Species ~ ., data = IRIS)
               Type of random forest: classification
                     Number of trees: 500
                    Number of groups:  1
 No. of variables tried at each split: 2

R> pred <- predict(mod, IRIS, type="all", supplemental.cols="Species")
R> head(pred)
  setosa versicolor virginica prediction Species
1  1.000      0.000         0     setosa  setosa
2  0.998      0.002         0     setosa  setosa
3  1.000      0.000         0     setosa  setosa
4  1.000      0.000         0     setosa  setosa
5  1.000      0.000         0     setosa  setosa
6  1.000      0.000         0     setosa  setosa

Save the models to a datastore named myModels:

R> ore.save(mod, pred, name = "myModels")

Run the migration utility. Refer to $ORACLE_HOME/R/migration/oreuser/exp/README for syntax details. In this case, I'm using the Big Data Lite VM with schema moviedemo and instance orcl. I'm exporting all ORE data, including the model mod and the predictions pred, to a dump file in /tmp/moviedemodsi. The predictions are included to illustrate the capability; in practice, scoring would likely occur on new data in the production environment.

$ cd $ORACLE_HOME/R/migration/oreuser/exp

$ perl -I$ORACLE_HOME/R/migration/perl $ORACLE_HOME/bin/ore_dsiexport.pl orcl /tmp/moviedemodsi moviedemodsi.zip MOVIEDEMO
Checking connection to orcl.........
 Enter db user system password:welcome1
Connect to db connect_str .....   Pass
Checking ORE version .........Pass
****Step1 Setup before export ******
/u01/app/oracle/product/12.1.0.2/dbhome_1/bin/sqlplus -L -S system/welcome1@"orcl" @/u01/app/oracle/product/12.1.0.2/dbhome_1/R/migration/oreuser/exp/setup.sql system welcome1 orcl /tmp/moviedemodsi MOVIEDEMO /u01/app/oracle/product/12.1.0.2/dbhome_1/R/migration/oreuser/exp/storedproc.sql > /tmp/moviedemodsi/tmpstep1.log
******Step2 export schema with data store ****
****Step3 cleanup ****
Export complete
dump files are in /tmp/moviedemodsi
*****Step4 creating zip file moviedemodsi.zip ***
  adding: MOVIEDEMO_rqds.dmp (deflated 90%)
  adding: MOVIEDEMO_rqdsob.dmp (deflated 89%)
  adding: MOVIEDEMO_rqrefdbobj.dmp (deflated 90%)
  adding: MOVIEDEMO_rqdsref.dmp (deflated 91%)
  adding: MOVIEDEMO_rqdsaccess.dmp (deflated 90%)
  adding: MOVIEDEMO_src.dmp (deflated 70%)
  adding: MOVIEDEMO_srcdsi.dmp (deflated 91%)
  adding: exp_dsi_user.sh (deflated 59%)
  adding: imp_dsi_user.sh (deflated 78%)
Created moviedemodsi.zip. Use this for file importing ORE data from MOVIEDEMO into target db
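
Before copying the archive to the target server, it may be worth confirming that it contains every dump file listed in the export log. The sketch below is one way to do that. The function name missing_dumps is hypothetical, and the list of file suffixes is taken from the log output above, so it may not hold for other ORE versions.

```shell
#!/bin/sh
# Hypothetical helper: report dump files missing from an archive listing.
# Reads one file name per line on stdin; $1 is the schema name (e.g. MOVIEDEMO).
# The suffix list is copied from the export log shown in this post and is an
# assumption for any other ORE version.
missing_dumps() {
    schema=$1
    listing=$(cat)
    for suffix in rqds rqdsob rqrefdbobj rqdsref rqdsaccess src srcdsi; do
        name="${schema}_${suffix}.dmp"
        # -x matches the whole line, -F treats the name as a fixed string
        if ! printf '%s\n' "$listing" | grep -qxF "$name"; then
            echo "$name"
        fi
    done
}
```

A listing with one name per line can be produced with zipinfo -1, for example: zipinfo -1 moviedemodsi.zip | missing_dumps MOVIEDEMO. The function prints nothing when every expected dump file is present.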


On my target database, I ran the import script, which imports the dumped ORE user data. As with the export, the script runs as the system user, and you will be prompted for the password:

$ perl -I$ORACLE_HOME/R/migration/perl ore_dsiimport.pl orcl /home/oracle
Checking connection to orcl.........
 Enter db user system password:welcome1
Connect to db connect_str .....   Pass
Checking ORE version .........Pass
****Step2 import of schema with datastoreinventory ****
****Step3 staging datastore metadata *****

*****************************IMPORTANT **************************
Check dstorestg.log for errors. Then run the fllowing scripts to complete the import ***
****** Run as sysdba : 1. rqdatastoremig.sql <rquser>
******************************************************************
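
The import output asks you to check dstorestg.log for errors before running rqdatastoremig.sql. A minimal sketch of that check follows. The function name check_import_log is hypothetical, and the search patterns are an assumption: Oracle error codes begin with "ORA-", and the word "error" is matched case-insensitively. Adjust the patterns to whatever the log actually contains.

```shell
#!/bin/sh
# Hypothetical helper: print suspicious lines from an import log, with line numbers.
# Returns 0 when nothing suspicious is found, 1 when matches exist,
# and 2 when the log file is missing.
check_import_log() {
    log=$1
    if [ ! -f "$log" ]; then
        echo "log file not found: $log"
        return 2
    fi
    # Assumed failure markers: Oracle error codes (ORA-nnnnn) or the word "error".
    if grep -n -i -E 'ORA-[0-9]+|error' "$log"; then
        return 1
    fi
    echo "no errors found in $log"
}
```

For example, run check_import_log dstorestg.log and proceed to rqdatastoremig.sql only if it reports no errors.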

Then, after running rqdatastoremig.sql, I can log into my target MOVIEDEMO schema and load the models:

> ore.load("myModels")
[1] "mod"  "pred"
> mod

Call:
 ore.randomForest(formula = Species ~ ., data = IRIS)
               Type of random forest: classification
                     Number of trees: 500
                    Number of groups:  1
 No. of variables tried at each split: 2

> head(pred)
    setosa versicolor virginica prediction    Species
1    1.000      0.000     0.000     setosa     setosa
2    1.000      0.000     0.000     setosa     setosa
3    1.000      0.000     0.000     setosa     setosa
4    1.000      0.000     0.000     setosa     setosa
5    1.000      0.000     0.000     setosa     setosa
6    1.000      0.000     0.000     setosa     setosa

ORE Migration Utility Benefits

The ORE migration utility enables data scientists to test and deploy models quickly, shortening the feedback loop for retraining and fine-tuning models. ORE users don't have to worry about introducing errors by rewriting models, because the models are deployed in the same language in which they were trained and tested. In addition, if the tables in the model training environment are identical to those in production, you'll cut out another large chunk of testing time.
