Best practices, news, tips and tricks - learn about Oracle's R Technologies for Oracle Database and Big Data

Using rJava in Embedded R Execution

Integration with high-performance programming languages is one way to tackle big data with R. Portions of the R code are moved to another language to avoid bottlenecks and to speed up expensive procedures. The goal is to balance R’s elegant handling of data with the heavy-duty computing capabilities of other languages.

Outsourcing R to another language can easily be hidden inside R functions, so users of these functions do not need to be proficient in the target language. The rJava package by Simon Urbanek is one such example: it provides an interface from R to Java that works very much like R's native .C/.Call interface. rJava allows users to create objects, call methods and access fields of Java objects from R.

Oracle R Enterprise (ORE) provides an additional boost to rJava when used in embedded R script execution on the database server machine. Embedded R Execution allows R scripts to take advantage of a likely more powerful database server machine, with more memory, more CPUs, and greater CPU power. Through embedded R, ORE enables R to leverage database support for data-parallel and task-parallel execution of R scripts and to operationalize R scripts in database applications. The net result is the ability to analyze larger data sets in parallel from a single R or SQL interface, depending on your preference.

In this post, we demonstrate a basic example of configuring and deploying rJava in base R and embedded R execution.

1. Install Java

To start, you need Java. If you are not using a pre-configured engineered system like Exadata or the Big Data Appliance, you can download the Java Runtime Environment (JRE) and Java Development Kit (JDK) from the Java download page.

To verify the JRE is installed on your system, execute the command:

$ java -version
java version "1.7.0_67"


If the JRE is installed on the system, the version number is returned. The equivalent check for JDK is:

$ javac -version
javac 1.7.0_67


A "command not found" error indicates that either Java is not installed or the directory containing the java and javac executables is missing from your PATH environment variable.

2. Configure Java Parameters for R

R provides the javareconf utility to configure Java support in R.  To prepare the R environment for Java, execute this command in R's home directory:

$ echo $R_HOME
/usr/lib64/R


$ cd /usr/lib64/R

$ sudo R CMD javareconf
or
$ R CMD javareconf -e

3.  Install rJava Package

rJava release versions can be obtained from CRAN.  Assuming an internet connection is available, the install.packages command in an R session will do the trick.

> install.packages("rJava")
..
..
* installing *source* package ‘rJava’ ...
** package ‘rJava’ successfully unpacked and MD5 sums checked
checking for gcc... gcc -m64 -std=gnu99
..
..
** testing if installed package can be loaded
* DONE (rJava)

4. Configure the Environment Variable CLASSPATH

The CLASSPATH environment variable must contain the directories with the jar and class files.  The class files in this example will be created in /home/oracle/tmp.

  export CLASSPATH=$ORACLE_HOME/jlib:/home/oracle/tmp

Alternatively, use the rJava function .jaddClassPath to define the path to the class files.
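A misspelled or missing CLASSPATH entry is easy to overlook, so it can help to check the variable before starting R. The following is a minimal sketch, not part of the original setup: the class name ClasspathCheck and the method missingEntries are hypothetical, and the check only tests that each entry exists on disk (wildcard entries ending in * are skipped).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class ClasspathCheck {

    // Returns the classpath entries that do not exist on disk.
    // Empty entries and wildcard entries (ending in "*") are skipped.
    static List<String> missingEntries(String classPath) {
        List<String> missing = new ArrayList<String>();
        if (classPath == null) {
            return missing;
        }
        String separator = Pattern.quote(File.pathSeparator);
        for (String entry : classPath.split(separator)) {
            if (entry.isEmpty() || entry.endsWith("*")) {
                continue;
            }
            if (!new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String classPath = System.getenv("CLASSPATH");
        if (classPath == null) {
            System.out.println("CLASSPATH is not set");
            return;
        }
        List<String> missing = missingEntries(classPath);
        if (missing.isEmpty()) {
            System.out.println("All CLASSPATH entries exist");
        } else {
            for (String entry : missing) {
                System.out.println("Missing CLASSPATH entry: " + entry);
            }
        }
    }
}
```

Run it in the same shell where CLASSPATH was exported, after the directories it names have been created.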

5. Create and Compile Java Program

For this test, we create a simple "Hello, World!" example. Create the file HelloWorld.java in /home/oracle/tmp with the contents:

  public class HelloWorld {
      public String SayHello(String str) {
          String a = "Hello,";
          return a.concat(str);
      }
  }


Compile the Java code.

$ javac HelloWorld.java


6.  Call Java from R


In R, execute the following commands to call the rJava package and initialize the Java Virtual Machine (JVM).

R> library(rJava)
R> .jinit()


Instantiate the class HelloWorld in R. In other words, tell R to look at the compiled HelloWorld program.

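R> obj <- .jnew("HelloWorld")

Call the SayHello method directly. The "S" argument tells .jcall that the method returns a Java String.

R> str <- "World!"
R> .jcall(obj, "S", "SayHello", str)
[1] "Hello,World!"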


7.  Call Java In Embedded R Execution


Oracle R Enterprise uses external procedures in Oracle Database to support embedded R execution. In the default configuration, the external procedure agent is spawned directly by Oracle Database. The path to the JVM shared library, libjvm.so, must therefore be added to the environment variable LD_LIBRARY_PATH so that it is found in the shell where Oracle is started. This is defined in two places: at the OS shell and in the external procedures configuration file, extproc.ora.

In the OS shell:

$ locate libjvm.so
/usr/java/jdk1.7.0_45/jre/lib/amd64/server/libjvm.so

$ export LD_LIBRARY_PATH=/usr/java/jdk1.7.0_45/jre/lib/amd64/server:$LD_LIBRARY_PATH
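If locate is not available or its database is stale, the directory can also be found from Java itself. The following is a minimal sketch under stated assumptions: the class JvmLibraryLocator and its find method are hypothetical names, and the search starts from the java.home system property of whichever Java installation runs the program, which may differ from the one R was configured with.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class JvmLibraryLocator {

    // Walks the directory tree under root and returns every file
    // whose name equals fileName.
    static List<Path> find(Path root, final String fileName) {
        final List<Path> matches = new ArrayList<Path>();
        try {
            Files.walkFileTree(root.toRealPath(), new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    Path name = file.getFileName();
                    if (name != null && name.toString().equals(fileName)) {
                        matches.add(file);
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) {
                    // Skip entries that cannot be read.
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new IllegalStateException("Could not scan " + root, e);
        }
        return matches;
    }

    public static void main(String[] args) {
        Path javaHome = Paths.get(System.getProperty("java.home"));
        // mapLibraryName("jvm") yields libjvm.so on Linux.
        String libraryName = System.mapLibraryName("jvm");
        for (Path library : find(javaHome, libraryName)) {
            // Print the directory that belongs in LD_LIBRARY_PATH.
            System.out.println(library.getParent());
        }
    }
}
```

Each printed directory is a candidate for LD_LIBRARY_PATH. If more than one is listed, the server variant is the one used in this post.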


In extproc.ora:

$ cd $ORACLE_HOME/hs/admin


Edit the file extproc.ora to add the path to libjvm.so in LD_LIBRARY_PATH:

SET EXTPROC_DLLS=ANY
SET LD_LIBRARY_PATH=/usr/java/jdk1.7.0_45/jre/lib/amd64/server
export LD_LIBRARY_PATH


You will need to bounce the database instance after updating extproc.ora.
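Because a wrong path in extproc.ora only shows up later as a package load failure, it may be useful to check the file before restarting the instance. The sketch below assumes the simple SET NAME=value line format shown above and ignores every other line. The class ExtprocEnv and the method parseSetLines are hypothetical names, and this is not an Oracle-provided tool.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only.
class ExtprocEnv {

    // Collects NAME -> value pairs from lines of the form "SET NAME=value".
    // Lines that do not start with SET are ignored.
    static Map<String, String> parseSetLines(List<String> lines) {
        Map<String, String> env = new LinkedHashMap<String, String>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.length() < 4 || !line.substring(0, 4).equalsIgnoreCase("SET ")) {
                continue;
            }
            String rest = line.substring(4).trim();
            int eq = rest.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            env.put(rest.substring(0, eq).trim(), rest.substring(eq + 1).trim());
        }
        return env;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java ExtprocEnv <path to extproc.ora>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        String libraryPath = parseSetLines(lines).get("LD_LIBRARY_PATH");
        if (libraryPath == null) {
            System.out.println("LD_LIBRARY_PATH is not set in " + args[0]);
            return;
        }
        // Report, for each directory, whether it actually holds libjvm.so.
        for (String dir : libraryPath.split(":")) {
            boolean found = new File(dir, "libjvm.so").isFile();
            System.out.println(dir + (found ? " contains libjvm.so" : " does not contain libjvm.so"));
        }
    }
}
```

Pass the full path to extproc.ora as the only argument.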

Now load rJava in embedded R:

> library(ORE)
> ore.connect(user     = 'oreuser',
             password = 'password',
             sid      = 'sid',
             host     = 'hostname',
             all      = TRUE)


> TEST <- ore.doEval(function(str) {
                       library(rJava)
                       .jinit()
                       obj <- .jnew("HelloWorld")
                       val <- .jcall(obj, "S", "SayHello", str)
                       return(as.data.frame(val))
                     },
                     str = 'World!',
                     FUN.VALUE = data.frame(VAL = character())
                    )

> print(TEST)
           VAL
1 Hello,World!


If you receive this error, LD_LIBRARY_PATH is not set correctly in extproc.ora:

Error in .oci.GetQuery(conn, statement, data = data, prefetch = prefetch,  :
  Error in try({ : ORA-20000: RQuery error
Error : package or namespace load failed for ‘rJava’
ORA-06512: at "RQSYS.RQEVALIMPL", line 104
ORA-06512: at "RQSYS.RQEVALIMPL", line 101


Once you've mastered this simple example, you can move to your own use case. If you get stuck, the rJava package has very good documentation. Start with the information on the rJava CRAN page. Then, from an R session with the rJava package loaded, execute the command help(package="rJava") to list the available functions.

After that, the source code of R packages that use rJava is a useful source of further inspiration: look at the reverse dependencies list for rJava on CRAN. In particular, the helloJavaWorld package is a tutorial on how to include Java code in an R package.
