Wednesday Dec 04, 2013

Using DCLI to install Oracle R Distribution and Oracle R Enterprise

Oracle R Enterprise is commonly used to apply parallel resources to R computations in Oracle's Exadata Database Machine. To take advantage of Exadata's massively parallel grid infrastructure, Oracle R Distribution and the Oracle R Enterprise server components must be installed on each node. We've now streamlined the installation of Oracle R on Exadata, allowing users to get up and running quickly.

This is where Exadata's distributed command line interface utility (DCLI) comes in handy - it can be used to control multiple nodes with a single command. In Exadata environments, it's common to use DCLI to manage or monitor multiple nodes simultaneously to eliminate having to log in to each node individually.  In this post, we will use DCLI to install Oracle R Distribution and Oracle R Enterprise server components on Exadata compute nodes.

DCLI comes with a help flag that indicates the various options and commands.  We will use some of these commands in the following steps. The flags for non-Exadata RAC systems may differ slightly, so these instructions may require slight modifications for non-Exadata RAC environments. Refer to My Oracle Support for assistance with DCLI options.

$ dcli -h

Distributed Shell for Oracle Storage

This script executes commands on multiple cells in parallel threads.
The cells are referenced by their domain name or ip address.
Local files can be copied to cells and executed on cells.
This tool does not support interactive sessions with host applications.
Use of this tool assumes ssh is running on local host and cells.
The -k option should be used initially to perform key exchange with
cells.  User may be prompted to acknowledge cell authenticity, and
may be prompted for the remote user password.  This -k step is serialized
to prevent overlayed prompts.  After -k option is used once, then
subsequent commands to the same cells do not require -k and will not require
passwords for that user from the host.
Command output (stdout and stderr) is collected and displayed after the
copy and command execution has finished on all cells.
Options allow this command output to be abbreviated.

Return values:
 0 -- file or command was copied and executed successfully on all cells
 1 -- one or more cells could not be reached or remote execution
 returned non-zero status.
 2 -- An error prevented any command execution

Examples:
 dcli -g mycells -k
 dcli -c stsd2s2,stsd2s3 vmstat
 dcli -g mycells cellcli -e alter iormplan active
 dcli -g mycells -x reConfig.scl

usage: dcli [options] [command]

options:
 --version           show program's version number and exit
 -c CELLS            comma-separated list of cells
 -d DESTFILE         destination directory or file
 -f FILE             file to be copied
 -g GROUPFILE        file containing list of cells
 -h, --help          show help message and exit
 -k                  push ssh key to cell's authorized_keys file
 -l USERID           user to login as on remote cells (default: celladmin)
 -n                  abbreviate non-error output
 -r REGEXP           abbreviate output lines matching a regular expression
 -s SSHOPTIONS       string of options passed through to ssh
 --scp=SCPOPTIONS    string of options passed through to scp if different
 from sshoptions
 --serial            serialize execution over the cells
 -t                  list target cells
 --unkey             drop keys from target cells' authorized_keys file
 -v                  print extra messages to stdout
 --vmstat=VMSTATOPS  vmstat command options
 -x EXECFILE         file to be copied and executed
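The return values in the help text above are easy to forget when DCLI is called from a script. As a minimal sketch, the function below maps a status code to the meaning documented above. The name dcli_status_message is our own and is not part of DCLI.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCLI): translate a dcli exit status
# into the meaning listed under "Return values" in the help text.
dcli_status_message() {
    case "$1" in
        0) echo "success on all cells" ;;
        1) echo "one or more cells unreachable or remote command failed" ;;
        2) echo "error prevented any command execution" ;;
        *) echo "unexpected status: $1" ;;
    esac
}

# Example usage (commented out because it needs a real dcli and group file):
# dcli -g mycells vmstat
# dcli_status_message "$?"
```

Checking the status after each step makes it easier to stop a multi-step installation as soon as one node fails.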

I. Install Oracle R Distribution across Exadata compute nodes

Oracle R Distribution is distributed as a set of RPMs. Root or sudo access is required only to install the RPMs, which is typical of RPM-based installs.  Root access is not necessary for running the R software - no R process will ever run as root.  The Oracle R Enterprise server installation steps are executed by user oracle, or any database user that meets the requirements listed in the Oracle R Enterprise Installation and Administration Guide. We strongly recommend reviewing the prerequisites and installation steps in the documentation prior to beginning the installation.

Step 1: SSH Trust and User Equivalence

The first task is to establish trust between your hosts. In other words, configure the Exadata environment to enable automatic authentication as DCLI executes remote commands.

 a. Generate an SSH public-private key pair for the root user, on any compute node, as root.

$ ssh-keygen -N '' -f ~/.ssh/id_dsa -t dsa

This places the generated public and private key files in the .ssh sub-directory of the root user's home directory.

 b. Using your text editor, create a file that contains the host or node names of all the compute nodes in the rack, separated by newlines. For example, the nodes file for a 2-node cluster may contain the following entries:

$ cat nodes
exadb01
exadb02

 c. Run the DCLI command with the -k option to push the SSH public key to each compute node's authorized_keys file and establish SSH trust. You will be prompted for the password of each compute node, but only this once. With the -k option, each compute node is contacted sequentially rather than in parallel, giving you a chance to enter the password for each node.

$ dcli -t -g nodes -l root -k -s "\-o StrictHostkeyChecking=no"

Once the DCLI command completes, you have established SSH Trust and User Equivalence. Subsequent DCLI commands issued will be executed without being prompted for passwords.
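Because every later command depends on the nodes file from step b, it can be worth checking that file before the key exchange. The sketch below is one possible pre-flight check. The function name check_nodes_file is hypothetical, and the rules it enforces (one host name per line, no blank lines, no duplicates) are our assumption based on the example above, so they may be stricter than DCLI itself requires.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the nodes file used with "dcli -g".
# Assumption: one host name per line, no blank lines, no duplicates.
check_nodes_file() {
    file=$1
    # The file must exist and must not be empty.
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    # Reject blank lines and lines that contain whitespace.
    if grep -q -E '^$|[[:space:]]' "$file"; then
        echo "blank line or whitespace in $file" >&2
        return 1
    fi
    # Reject host names that appear more than once.
    dups=$(sort "$file" | uniq -d)
    if [ -n "$dups" ]; then
        echo "duplicate entries: $dups" >&2
        return 1
    fi
    return 0
}

# Example usage:
# check_nodes_file nodes && echo "nodes file looks fine"
```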

Step 2: Log in as root to any compute node in the Exadata and download the file ord-linux-x86_64-version.tar.gz, where version is the R version you want to install. For example, the file name for R-2.15.3 is ord-linux-x86_64-2.15.3.tar.gz.
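If you script the remaining steps, the file name can be derived from the R version instead of being typed repeatedly. This is a small sketch of the naming pattern described above. The function names are our own.

```shell
#!/bin/sh
# Hypothetical helpers: build the download file name, and the directory
# the archive unpacks into, from an R version string.
ord_tarball_name() {
    printf 'ord-linux-x86_64-%s.tar.gz\n' "$1"
}

ord_directory_name() {
    printf 'ord-linux-x86_64-%s\n' "$1"
}

# Example usage:
# ORD_FILE=$(ord_tarball_name 2.15.3)
# ORD_DIR=$(ord_directory_name 2.15.3)
```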

Step 3: Create a directory and replicate the file ord-linux-x86_64-2.15.3.tar.gz in this directory across all nodes. Here, we create a directory named ORD in /home/oracle and replicate ord-linux-x86_64-2.15.3.tar.gz in /home/oracle/ORD.

$ dcli -t -g nodes -l root mkdir -p /home/oracle/ORD
$ dcli -t -g nodes -l root -f ord-linux-x86_64-2.15.3.tar.gz -d /home/oracle/ORD/ord-linux-x86_64-2.15.3.tar.gz
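The two commands above follow a pattern that recurs later in this post: create a directory on every node, then copy a file into it. A minimal sketch of a reusable wrapper follows. The function name push_to_nodes and the DCLI variable are our own conventions. Only the -g, -l, -f and -d options shown in the help text are used.

```shell
#!/bin/sh
# DCLI can be overridden to preview commands; for example, DCLI=echo
# prints the arguments instead of contacting any node.
DCLI=${DCLI:-dcli}

# Hypothetical helper: create dest_dir on every node in the group file,
# then copy the local file src into it under the same base name.
push_to_nodes() {
    group=$1
    user=$2
    src=$3
    dest_dir=$4
    # Stop if the directory cannot be created on every node.
    $DCLI -g "$group" -l "$user" mkdir -p "$dest_dir" || return
    $DCLI -g "$group" -l "$user" -f "$src" -d "$dest_dir/$(basename "$src")"
}

# Example usage:
# push_to_nodes nodes root ord-linux-x86_64-2.15.3.tar.gz /home/oracle/ORD
```

Stopping after a failed mkdir avoids copying the file to only some of the nodes.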

Step 4: Uncompress and untar ord-linux-x86_64-2.15.3.tar.gz to get the dependent RPMs across all nodes.

$ dcli -t -g nodes -l root tar xvfz /home/oracle/ORD/ord-linux-x86_64-2.15.3.tar.gz -C /home/oracle/ORD
$ ls /home/oracle/ORD/ord-linux-x86_64-2.15.3

NOTE: You can also download these RPMs from
           http://public-yum.oracle.com/

At the time of this blog post, several of the dependencies required by R's development RPMs cause conflicts during standard Exadata upgrades. To avoid this, remove gcc-gfortran, mesa-libGL-devel, libpng-devel, and R-devel-<version>.el5.x86_64.rpm from the list before installing. For Oracle R Distribution 2.15.3, the R development RPM is R-devel-2.15.3-1.el5.x86_64.rpm.
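One way to apply this exclusion is to filter the RPM file names before deciding what to install. The sketch below reads file names on standard input and prints only those that do not belong to the four packages named above. The function name is hypothetical, and the pattern assumes each file name starts with the package name followed by a hyphen.

```shell
#!/bin/sh
# Hypothetical filter: read RPM file names (without directory) on stdin
# and drop gcc-gfortran, mesa-libGL-devel, libpng-devel and R-devel.
# Assumption: file names begin with "<package name>-".
filter_conflicting_rpms() {
    grep -v -E '^(gcc-gfortran|mesa-libG[Ll]-devel|libpng-devel|R-devel)-'
}

# Example usage: list the RPMs that remain after the exclusion.
# ls /home/oracle/ORD/ord-linux-x86_64-2.15.3 | filter_conflicting_rpms
```

Reviewing the filtered list first, rather than deleting files straight away, keeps the step easy to undo.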

Step 5: To install these new RPMs and update existing RPMs across nodes, issue the following RPM command.

$ dcli -t -g nodes -l root rpm -i --force /home/oracle/ORD/ord-linux-x86_64-2.15.3/*.rpm

The --force flag is used here to avoid errors regarding circular dependencies. 

Step 6: Verify R installations on each node by first returning the location where R is installed and then starting R.

$ dcli -g nodes -l oracle R RHOME
$ dcli -g nodes -l oracle R --vanilla
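On a larger rack it is tedious to compare the R RHOME output by eye. The sketch below checks that every node reported the same value. It assumes DCLI prefixes each output line with the node name and a colon, for example "exadb01: ...", and the function name all_nodes_agree is our own.

```shell
#!/bin/sh
# Hypothetical check: succeed only if every "node: value" line read from
# stdin carries the same value.
# Assumption: dcli output lines have the form "<node>: <value>".
all_nodes_agree() {
    # Strip the node prefix, then count the distinct values that remain.
    count=$(sed 's/^[^:]*: *//' | sort -u | wc -l | tr -d ' ')
    [ "$count" -eq 1 ]
}

# Example usage:
# dcli -g nodes -l oracle R RHOME | all_nodes_agree && echo "same RHOME everywhere"
```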

Oracle R Distribution installation on Exadata commands summary:

ssh-keygen -N '' -f ~/.ssh/id_dsa -t dsa

vi nodes # enter node names

dcli -t -g nodes -l root -k -s "\-o StrictHostkeyChecking=no" 

dcli -t -g nodes -l root mkdir -p /home/oracle/ORD
dcli -t -g nodes -l root -f ord-linux-x86_64-2.15.3.tar.gz -d /home/oracle/ORD/ord-linux-x86_64-2.15.3.tar.gz 
dcli -t -g nodes -l root tar xvfz /home/oracle/ORD/ord-linux-x86_64-2.15.3.tar.gz -C /home/oracle/ORD

dcli -t -g nodes -l root rpm -i --force /home/oracle/ORD/ord-linux-x86_64-2.15.3/*.rpm

dcli -g nodes -l oracle R RHOME
dcli -g nodes -l oracle R --vanilla


II. Install Oracle R Enterprise Server across Exadata compute nodes

Before installing Oracle R Enterprise Server, ensure that environment variables are set on each node as shown in Table 4-1 in the Oracle R Enterprise Installation and Administration Guide. 
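A quick way to confirm this on a node is to test that each required variable is non-empty. The function below is a sketch that takes the variable names as arguments. The names in the usage line (ORACLE_HOME, ORACLE_SID, LD_LIBRARY_PATH, PATH) are our assumption of what the table contains, so check them against the guide.

```shell
#!/bin/sh
# Hypothetical helper: report every named environment variable that is
# unset or empty. Returns non-zero if any are missing.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (variable names are an assumption, see Table 4-1):
# require_env ORACLE_HOME ORACLE_SID LD_LIBRARY_PATH PATH
```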

Step 1: Download Oracle R Enterprise server and supporting installers.

On the Oracle R Enterprise page, download Oracle R Enterprise Server and Oracle R Enterprise client supporting packages for Linux. The following files are downloaded for Oracle R Enterprise 1.3.1:

ore-server-linux-x86-64-1.3.1.zip
ore-supporting-linux-x86-64-1.3.1.zip

Step 2: Copy the Oracle R Enterprise server and supporting packages installers across nodes.

$ dcli -g nodes -l oracle mkdir -p /home/oracle/ORE
$ dcli -g nodes -l oracle -f ore-server-linux-x86-64-1.3.1.zip -d /home/oracle/ORE/ore-server-linux-x86-64-1.3.1.zip
$ dcli -g nodes -l oracle -f ore-supporting-linux-x86-64-1.3.1.zip -d /home/oracle/ORE/ore-supporting-linux-x86-64-1.3.1.zip

Step 3: Unzip Oracle R Enterprise server and supporting packages installers.

$ dcli -t -g nodes -l oracle unzip /home/oracle/ORE/ore-server-linux-x86-64-1.3.1.zip -d /home/oracle/ORE
$ dcli -t -g nodes -l oracle unzip /home/oracle/ORE/ore-supporting-linux-x86-64-1.3.1.zip -d /home/oracle/ORE

Step 4: Install Oracle R Enterprise server components.

$ dcli -t -g nodes -l oracle /home/oracle/ORE/server/./install.sh

Step 5: Create an Oracle R Enterprise user by executing the demo_user.sh script. The default user is rquser.

$ dcli -t -g nodes -l oracle /home/oracle/ORE/server/./demo_user.sh

Step 6: Apply grants to the Oracle R Enterprise user. As this is a shared database, the grants need only be applied from a single node.

$ cd /home/oracle/ORE 
$ sqlplus / as sysdba

SQL> grant RQADMIN to rquser;
SQL> grant CREATE TABLE to rquser;
SQL> grant CREATE SESSION to rquser;
SQL> grant CREATE VIEW to rquser;
SQL> grant CREATE PROCEDURE to rquser;
SQL> grant CREATE MINING MODEL to rquser;
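If the Oracle R Enterprise user is not named rquser, the same grants can be generated for any user name. The sketch below prints the six statements shown above. The function name ore_grants_sql is our own, and piping its output into sqlplus is one possible way to run it, not a documented installation step.

```shell
#!/bin/sh
# Hypothetical helper: print the grant statements from Step 6 for the
# given database user.
ore_grants_sql() {
    user=$1
    for priv in RQADMIN "CREATE TABLE" "CREATE SESSION" "CREATE VIEW" \
                "CREATE PROCEDURE" "CREATE MINING MODEL"; do
        printf 'grant %s to %s;\n' "$priv" "$user"
    done
}

# Example usage, run on a single node:
# ore_grants_sql rquser | sqlplus -s / as sysdba
```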

Step 7: Install Oracle R Enterprise client supporting packages.

$ dcli -t -g nodes -l oracle R CMD INSTALL /home/oracle/ORE/supporting/DBI_0.2-5_R_x86_64-unknown-linux-gnu.tar.gz
$ dcli -t -g nodes -l oracle R CMD INSTALL /home/oracle/ORE/supporting/ROracle_1.1-9_R_x86_64-unknown-linux-gnu.tar.gz
$ dcli -t -g nodes -l oracle R CMD INSTALL /home/oracle/ORE/supporting/png_0.1-4_R_x86_64-unknown-linux-gnu.tar.gz
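The order of these commands matters because ROracle depends on DBI. As a sketch, the loop below installs a list of package archives in the order given and stops at the first failure. The function name install_r_packages and the DCLI variable are our own conventions.

```shell
#!/bin/sh
# DCLI can be overridden (for example DCLI=echo) to preview the commands.
DCLI=${DCLI:-dcli}

# Hypothetical helper: install each package archive on every node, in the
# order given, and stop as soon as one installation fails.
install_r_packages() {
    group=$1
    shift
    for pkg in "$@"; do
        $DCLI -t -g "$group" -l oracle R CMD INSTALL "$pkg" || return
    done
}

# Example usage, with DBI listed before ROracle:
# install_r_packages nodes \
#     /home/oracle/ORE/supporting/DBI_0.2-5_R_x86_64-unknown-linux-gnu.tar.gz \
#     /home/oracle/ORE/supporting/ROracle_1.1-9_R_x86_64-unknown-linux-gnu.tar.gz \
#     /home/oracle/ORE/supporting/png_0.1-4_R_x86_64-unknown-linux-gnu.tar.gz
```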

Step 8: Verify Oracle R Enterprise loads.

$ dcli -t -g nodes -l oracle ORE -e "library(ORE)"

Additional steps for validating the Oracle R Enterprise installation are in sections 6.3 and 6.4 of the Oracle R Enterprise Installation and Administration Guide.

Oracle R Enterprise installation on Exadata commands summary:


dcli -g nodes -l oracle mkdir -p /home/oracle/ORE
dcli -g nodes -l oracle -f ore-server-linux-x86-64-1.3.1.zip -d /home/oracle/ORE/ore-server-linux-x86-64-1.3.1.zip
dcli -g nodes -l oracle -f ore-supporting-linux-x86-64-1.3.1.zip -d /home/oracle/ORE/ore-supporting-linux-x86-64-1.3.1.zip

dcli -t -g nodes -l oracle unzip /home/oracle/ORE/ore-server-linux-x86-64-1.3.1.zip -d /home/oracle/ORE
dcli -t -g nodes -l oracle unzip /home/oracle/ORE/ore-supporting-linux-x86-64-1.3.1.zip -d /home/oracle/ORE

dcli -t -g nodes -l oracle /home/oracle/ORE/server/./install.sh
dcli -t -g nodes -l oracle /home/oracle/ORE/server/./demo_user.sh

cd /home/oracle/ORE 
sqlplus / as sysdba

SQL> grant RQADMIN to rquser;
SQL> grant CREATE TABLE to rquser;
SQL> grant CREATE SESSION to rquser;
SQL> grant CREATE VIEW to rquser;
SQL> grant CREATE PROCEDURE to rquser;
SQL> grant CREATE MINING MODEL to rquser;

dcli -t -g nodes -l oracle R CMD INSTALL /home/oracle/ORE/supporting/DBI_0.2-5_R_x86_64-unknown-linux-gnu.tar.gz
dcli -t -g nodes -l oracle R CMD INSTALL /home/oracle/ORE/supporting/ROracle_1.1-9_R_x86_64-unknown-linux-gnu.tar.gz
dcli -t -g nodes -l oracle R CMD INSTALL /home/oracle/ORE/supporting/png_0.1-4_R_x86_64-unknown-linux-gnu.tar.gz

dcli -t -g nodes -l oracle ORE -e "library(ORE)"



Conclusion: DCLI is a powerful utility that provides the ability to install Oracle R Distribution and Oracle R Enterprise on multiple Exadata compute nodes without the effort of repeating commands on each node.

