R Package Installation with Oracle R Enterprise


Programming languages give developers the opportunity to write reusable functions and to bundle those functions into logical deployable entities. In R, these are called packages. R has thousands of such packages provided by an almost equally large group of third-party contributors. To allow others to benefit from these packages, users can share packages on the CRAN system for use by the vast R development community worldwide.

R's package system along with the CRAN framework provides a process for authoring, documenting and distributing packages to millions of users. In this post, we'll illustrate the various ways in which such R packages can be installed for use with R and together with Oracle R Enterprise. In the following, the same instructions apply when using either open source R or Oracle R Distribution.

In this post, we cover the following package installation scenarios:


R command line
Linux shell command line
Use with Oracle R Enterprise
Installation on Exadata or RAC
Installing all packages in a CRAN Task View
Troubleshooting common errors


1. R Package Installation Basics

R package installation basics are outlined in Chapter 6 of the R Installation and Administration Guide. There are two ways to install packages from the command line: from the R command line and from the shell command line. For this first example on Oracle Linux using Oracle R Distribution, we’ll install the arules package as root so that it is installed in the default R system-wide location, /usr/lib64/R/library, where all users can access it.

Within R, the install.packages function always attempts to install the latest version of the requested package available on CRAN:

R> install.packages("arules")


If the arules package depends upon other packages that are not already installed locally, the R installer automatically downloads and installs those required packages. This is a huge benefit that frees users from the task of identifying and resolving those dependencies.


You can also install R packages from the shell command line. This is useful when an internet connection is not available on the target machine or for installing packages not uploaded to CRAN. To install packages this way, first locate the package on CRAN and then download the package source to your local machine. For example:

$ wget http://cran.r-project.org/src/contrib/arules_1.1-2.tar.gz

Then, install the package using the command R CMD INSTALL:

$ R CMD INSTALL arules_1.1-2.tar.gz


A major difference between installing R packages with install.packages at the R command line and with R CMD INSTALL at the shell command line is that, at the shell command line, package dependencies must be resolved manually.
Package dependencies are listed in the Depends and Imports fields on the package’s CRAN page. If dependencies are not identified and installed prior to the package’s installation, you will see an error similar to:

ERROR: dependency ‘xxx’ is not available for package ‘yyy’


As a best practice and to save time, always refer to the package’s CRAN site to understand the package dependencies prior to attempting an installation.
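If the source tarball is already on your machine, the same dependency information can be read from its DESCRIPTION file. The following is a minimal sketch, not part of the original workflow: tarball_deps and missing_deps are hypothetical helper names, and the code assumes the usual CRAN layout in which the archive contains a top-level directory named after the package.

```r
# Hypothetical helper: read the dependency fields of a source tarball
# without installing it.
tarball_deps <- function(tarball) {
  pkg <- sub("_.*$", "", basename(tarball))   # "arules_1.1-2.tar.gz" -> "arules"
  exdir <- tempfile("desc")
  dir.create(exdir)
  on.exit(unlink(exdir, recursive = TRUE))

  # Extract only the DESCRIPTION file from the archive
  untar(tarball, files = file.path(pkg, "DESCRIPTION"), exdir = exdir)
  desc <- read.dcf(file.path(exdir, pkg, "DESCRIPTION"),
                   fields = c("Depends", "Imports", "LinkingTo"))

  deps <- unlist(strsplit(desc[!is.na(desc)], ","))
  deps <- sub("\\([^)]*\\)", "", deps)                   # drop version constraints
  deps <- gsub("^[[:space:]]+|[[:space:]]+$", "", deps)  # trim whitespace
  setdiff(unique(deps), c("R", ""))                      # R itself is not a package
}

# Dependencies that are not yet in any library on the search path
missing_deps <- function(tarball) {
  setdiff(tarball_deps(tarball), rownames(installed.packages()))
}

# Example call (assumes the file was downloaded as shown earlier):
# missing_deps("arules_1.1-2.tar.gz")
```

Anything returned by missing_deps needs to be installed before running R CMD INSTALL on the tarball.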


If you don’t run R as root, you won’t have permission to write packages into the default system-wide location and you will be prompted to create a personal library accessible by your userid. You can accept the personal library path chosen by R, or specify the library location by passing parameters to the install.packages function. For example, to install into a package library in your home directory:

R> install.packages("arules", lib="/home/username/Rpackages")

or


$ R CMD INSTALL arules_1.1-2.tar.gz --library=/home/username/Rpackages


Refer to the install.packages help file in R or execute R CMD INSTALL --help at the shell command line for a full list of command line options.


To set the library location and avoid having to specify it at every package install, create the R startup environment file .Renviron in your home directory if it does not already exist, and add the following line to it:

R_LIBS_USER = "/home/username/Rpackages"
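R only adds library directories that exist when the session starts, so it is worth creating the directory and confirming that it ends up first on the library path. The sketch below does this from within a running session; use_library is a hypothetical helper name, and the path in the commented call mirrors the example above.

```r
# Hypothetical helper: create a personal library if needed and put it
# first on the library search path for the current session.
use_library <- function(path) {
  path <- path.expand(path)
  if (!file.exists(path)) dir.create(path, recursive = TRUE)
  # .libPaths() silently drops directories that do not exist,
  # which is why the directory is created first
  .libPaths(c(path, .libPaths()))
  invisible(.libPaths()[1])
}

# use_library("/home/username/Rpackages")
# .libPaths()   # the first entry is where install.packages() installs by default
```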


2. Setting the Repository

When you install an R package from the R command line without a repository set, you are asked which CRAN mirror, or server, R should use. To set the repository and avoid having to specify this in every R session, create the R startup command file .Rprofile in your home directory and add the following R code to it:

# Runs at the start of every R session
cat("Setting Seattle repository\n")
r <- getOption("repos")
r["CRAN"] <- "http://cran.fhcrc.org/"   # Seattle CRAN mirror
options(repos = r)
rm(r)

This code snippet sets the R package repository to the Seattle CRAN mirror at the start of each R session.
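The same idea can be wrapped in a small function so that the mirror can be switched and checked within a session. This is only a sketch: set_cran_mirror is a hypothetical name, and https://cloud.r-project.org is used here as an illustrative mirror URL rather than the one from the original example.

```r
# Hypothetical helper: set the CRAN entry of the "repos" option and
# return the previous value invisibly.
set_cran_mirror <- function(url) {
  r <- getOption("repos")
  old <- r["CRAN"]
  r["CRAN"] <- url
  options(repos = r)
  invisible(old)
}

set_cran_mirror("https://cloud.r-project.org")
getOption("repos")["CRAN"]   # confirm which mirror install.packages() will use
```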


3. Installing R Packages for use with Oracle R Enterprise

Embedded R execution with Oracle R Enterprise allows the use of CRAN or other third-party R packages in user-defined R functions executed on the Oracle Database server. The steps for installing and configuring packages for use with Oracle R Enterprise are the same as for open source R. The database-side R engine just needs to know where to find the R packages.

The Oracle R Enterprise installation is performed by user oracle, which typically does not have write permission to the default site-wide library, /usr/lib64/R/library. On Linux and UNIX platforms, the Oracle R Enterprise Server installation provides the ORE script, which is executed from the operating system shell to install R packages and to start R. The ORE script is a wrapper for the default R script, a shell wrapper for the R executable. It can be used to start R, run batch scripts, and build or install R packages. Unlike the default R script, the ORE script installs packages to a location writable by user oracle and accessible by all ORE users: $ORACLE_HOME/R/library.

To install a package on the database server so that it can be used by any R user and for use in embedded R execution, an Oracle DBA would typically download the package source from CRAN using wget. If the package depends on any packages that are not in the R distribution in use, download the sources for those packages as well.

For a single Oracle Database instance, replace the R script with ORE to install the packages in the same location as the Oracle R Enterprise packages.

$ wget http://cran.r-project.org/src/contrib/arules_1.1-2.tar.gz
$ ORE CMD INSTALL arules_1.1-2.tar.gz


Behind the scenes, the ORE script performs the equivalent of setting R_LIBS_USER to the value of $ORACLE_HOME/R/library, and all R packages installed with the ORE script are installed to this location. For installing a package on multiple database servers, such as those in an Oracle Real Application Clusters (Oracle RAC) or a multinode Oracle Exadata Database Machine environment, use the ORE script in conjunction with the Exadata Distributed Command Line Interface (DCLI) utility.

$ dcli -g nodes -l oracle ORE CMD INSTALL arules_1.1-2.tar.gz

The DCLI -g flag designates a file containing a list of nodes to install on, and the -l flag specifies the user id to use when executing the commands. For more information on using DCLI with Oracle R Enterprise, see Chapter 5 in the Oracle R Enterprise Installation Guide.

If you are using an Oracle R Enterprise client, install the package the same as any R package, bearing in mind that you must install the same version of the package on both the client and server machines to avoid incompatibilities.
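Once you have the version strings from both machines, comparing them is straightforward. The sketch below assumes you have already obtained the two strings, for example with as.character(packageVersion("arules")) on each side; versions_match is a hypothetical helper name.

```r
# Hypothetical helper: compare a client-side and a server-side version string.
versions_match <- function(client, server) {
  package_version(client) == package_version(server)
}

versions_match("1.1-2", "1.1.2")   # "-" and "." separators are treated alike
versions_match("1.1-2", "1.1-1")   # different patch levels do not match
```

Comparing parsed versions rather than raw strings avoids false mismatches caused by the two separator styles.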


4. CRAN Task Views

CRAN also maintains a set of Task Views that identify packages associated with a particular task or methodology. Task Views are helpful in guiding users through the huge set of available R packages. They are actively maintained by volunteers who include detailed annotations for routines and packages. If you find one of the task views is a perfect match, you can install every package in that view using the ctv package - an R package for automating package installation.

To use the ctv package to install a task view, first, install and load the ctv package.

R> install.packages("ctv")

R> library(ctv)


Then query the names of the available task views and install the view you choose.


R> available.views()
R> install.views("TimeSeries")


5. Using and Managing R packages

To use a package, start up R and load packages one at a time with the library command.

Load the arules package in your R session.

R> library(arules)

Verify the version of arules installed.

R> packageVersion("arules")

[1] '1.1.2'


Verify the version of arules installed on the database server using embedded R execution.


R> ore.doEval(function() packageVersion("arules"))


View the help file for the apriori function in the arules package.


R> ?apriori


Over time, your package library will contain more and more packages, especially if you are using the system-wide library where others are adding packages. It’s good to know the entire set of R packages accessible in your environment. To list all available packages in your local R session, use the installed.packages function:

R> myLocalPackages <- row.names(installed.packages())

R> myLocalPackages


To access the list of available packages on the ORE database server from the ORE client, use the following embedded R syntax:

R> myServerPackages <- ore.doEval(function() row.names(installed.packages()))
R> myServerPackages
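With both lists in hand as plain character vectors, a quick set comparison shows which packages exist on only one side. This is a minimal sketch: compare_libraries is a hypothetical helper, and the vectors in the example are made-up stand-ins for the local and server package lists.

```r
# Hypothetical helper: report packages present on only one side.
compare_libraries <- function(local, server) {
  list(only_local  = setdiff(local, server),
       only_server = setdiff(server, local))
}

# Illustrative input; in practice pass the local and server package name vectors
compare_libraries(local  = c("arules", "ctv", "stats"),
                  server = c("arules", "stats", "ORE"))
```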


6. Troubleshooting Common Problems

Installing Older Versions of R packages

If you always upgrade to the latest version of R, you will have no problem installing the most recent versions of R packages. However, if your version of R is older, some of the more recent package releases will not work, and install.packages will generate a message such as:

Warning message:
In install.packages("arules") :
  package ‘arules’ is not available


This is when you have to go to the Old sources link on the CRAN page for the arules package and determine which version is compatible with your version of R.


Begin by determining what version of R you are using:


$ R --version

Oracle Distribution of R version 3.0.1 (--) -- "Good Sport"
Copyright (C) The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)


Given that R-3.0.1 was released May 16, 2013, any version of the arules package released after this date may work. Scanning the arules archive, we might try installing version 1.1-1, released in January of 2014:

$ wget http://cran.r-project.org/src/contrib/Archive/arules/arules_1.1-1.tar.gz
$ R CMD INSTALL arules_1.1-1.tar.gz

For use with ORE:

$ ORE CMD INSTALL arules_1.1-1.tar.gz
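The same download-and-install step can be done from within R. The following is a sketch under stated assumptions: cran_archive_url and install_archived are hypothetical helper names, and the URL pattern is simply the one used in the wget command above. The install call is left commented out because it needs network access and build tools.

```r
# Hypothetical helper: build the CRAN archive URL for a given package version.
cran_archive_url <- function(pkg, version,
                             base = "http://cran.r-project.org/src/contrib/Archive") {
  sprintf("%s/%s/%s_%s.tar.gz", base, pkg, pkg, version)
}

# Hypothetical helper: download an archived source tarball and install it.
install_archived <- function(pkg, version, lib = .libPaths()[1]) {
  url <- cran_archive_url(pkg, version)
  tarball <- file.path(tempdir(), basename(url))
  download.file(url, tarball)
  # repos = NULL installs the local file; dependencies are NOT resolved
  install.packages(tarball, repos = NULL, type = "source", lib = lib)
}

cran_archive_url("arules", "1.1-1")
# install_archived("arules", "1.1-1")
```

As with R CMD INSTALL, any dependencies must already be installed when using this route.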

The "package not available" error can also be thrown if the package you’re trying to install lives elsewhere, for example on another R package site, or if it has been removed from CRAN. A quick Google search usually leads to more information on the package’s location and status.


Oracle R Enterprise is not in the R library path

On Linux hosts, after installing the ORE server components, starting R, and attempting to load the ORE packages, you may receive the error:


R> library(ORE)
Error in library(ORE) : there is no package called ‘ORE’

If you know the ORE packages have been installed and you receive this error, this is the result of not starting R with the ORE script. To resolve this problem, exit R and restart using the ORE script. After restarting R and running the command to load the ORE packages, you should not receive any errors.

$ ORE
R> library(ORE)

On Windows servers, the solution is to make the location of the ORE packages visible to R by adding it to the R library paths. To accomplish this, exit R, then edit the .Rprofile file. On Windows, the .Rprofile file is located in the R\etc directory, C:\Program Files\R\R-<version>\etc. Add the following line:

.libPaths("<path to $ORACLE_HOME>/R/library")

The above line will tell R to include the R directory in the Oracle home as part of its search path. When you start R, the path above will be included, and future R package installations will also be saved to $ORACLE_HOME/R/library. This path should be writable by the user oracle, or the userid for the DBA tasked with installing R packages.

Binary package compiled with different version of R

By default, R will install pre-compiled versions of packages if they are found. If the version of R under which the package was compiled does not match your installed version of R, you will get a warning such as:

Warning message: package ‘xxx’ was built under R version 3.0.0

The solution is to download the package source and build it for your version of R.

$ wget http://cran.r-project.org/src/contrib/Archive/arules/arules_1.1-1.tar.gz
$ R CMD INSTALL arules_1.1-1.tar.gz

For use with ORE:

$ ORE CMD INSTALL arules_1.1-1.tar.gz
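To find candidates for rebuilding before the warning appears, you can compare the Built field reported by installed.packages with the running R version. This is a rough sketch: built_under_other_r is a hypothetical helper, and it flags every package whose build version differs from the running version, which is broader than the set of packages that actually trigger the warning.

```r
# Hypothetical helper: list installed packages that were built under an
# R version different from the one currently running.
built_under_other_r <- function(pkgs = installed.packages()) {
  built <- pkgs[, "Built"]
  running <- as.character(getRversion())
  differs <- !is.na(built) & built != running
  pkgs[differs, c("Package", "Version", "Built"), drop = FALSE]
}

built_under_other_r()
```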

Unable to execute files in /tmp directory

By default, R uses the /tmp directory to install packages. On security-conscious machines, the /tmp directory is often marked as "noexec" in the /etc/fstab file. This means that no file under /tmp can ever be executed, and users who attempt to install an R package will receive an error:

ERROR: 'configure' exists but is not executable -- see the 'R Installation and Administration Manual'

The solution is to set the TMP and TMPDIR environment variables to a location which R will use as the compilation directory. For example:

$ mkdir <some path>/tmp
$ export TMPDIR=<some path>/tmp
$ export TMP=<some path>/tmp

This error typically appears on Linux client machines and not database servers, as Oracle Database writes to the value of the TMP environment variable for several tasks, including holding temporary files during database installation.
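Because R picks its temporary directory at startup, the environment variables must be set before R is started. To check whether the directory a session is actually using allows execution, a small probe like the following can help. It is a sketch for Linux and UNIX only, and tmp_is_executable is a hypothetical helper name.

```r
# Hypothetical helper: write a trivial shell script into the session's
# temporary directory and try to run it.
tmp_is_executable <- function(dir = tempdir()) {
  script <- tempfile("exec-test", tmpdir = dir, fileext = ".sh")
  on.exit(unlink(script))
  writeLines(c("#!/bin/sh", "exit 0"), script)
  Sys.chmod(script, mode = "0755")
  # A non-zero status means the script could not be executed
  status <- suppressWarnings(system2(script, stdout = FALSE, stderr = FALSE))
  identical(as.integer(status), 0L)
}

tempdir()             # the directory R chose at startup
tmp_is_executable()   # FALSE suggests a noexec mount or similar restriction
```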


7. Creating your own R package

Creating your own package and submitting it to CRAN is for advanced users, but it is not difficult. The procedure to follow, along with details of R's package system, is described in the Writing R Extensions manual.

Comments:

Thanks for the great article.

Does ORE use both /usr/lib64/R/library and $ORACLE_HOME/R/library to look for installed packages?

In other words, is it enough if I install as root using R's install.packages()? Would this package then be visible to an Oracle Data Mining database on that server?

Posted by Ruslan on June 26, 2015 at 09:17 AM PDT #

If R is initiated with the ORE script, packages will be installed into $ORACLE_HOME/R/library. The default location for root is /usr/lib64/R/library. Executing the command:

R> .libPaths()

in R will show you the paths available for R package installation. The first path returned is the location where R packages will be installed.

Posted by guest on June 26, 2015 at 01:55 PM PDT #

Thank you Sherry!

It would be great to have information on how to register an R package for use from within the Oracle Data Miner engine. It has an API with sys.rqScriptCreate() I believe... Found it here http://docs.oracle.com/cd/E11882_01/doc.112/e36761/scripts.htm#OREUG168
but it is not documented very well.
More examples / explanation would be really helpful - a good topic for another blog post if you feel it would be too long to reply in a comment.

Posted by Ruslan on June 26, 2015 at 03:14 PM PDT #

Hi Ruslan,

The best examples demonstrating registering R packages within Oracle R Enterprise are in our online training materials.

Go here:

http://www.oracle.com/technetwork/database/database-technologies/r/r-enterprise/learnmore/index.html

and look for the training session titled, "Oracle R Enterprise 1.4 Embedded R Execution - SQL API"

Here's the direct link: http://www.oracle.com/technetwork/database/database-technologies/r/r-enterprise/learnmore/ore-1-4-embedded-r-execution-sql-2159066.pdf

In addition, this blog post demonstrates how to register an R script that includes the use of the open source R package e1071:

https://blogs.oracle.com/R/entry/invoking_r_scripts_via_oracle1

Hope it helps.

Sherry

Posted by guest on June 26, 2015 at 03:38 PM PDT #
