Saturday May 23, 2015

sqlplus and DYLD_LIBRARY_PATH on Mac OS/X

At some point in the past I followed directions in "Oracle Database Client Installation Guide for Apple Mac OS X (Intel)" so that I could use sqlplus on my MacBook. Following the directions, I set the DYLD_LIBRARY_PATH environment variable in my .bashrc.

Today, I noticed this:

$ brew doctor
Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
working fine: please don't worry and just ignore them. Thanks!

Warning: Setting DYLD_* vars can break dynamic linking.
Set variables:
    DYLD_LIBRARY_PATH: :/Applications/instantclient_11_2

Hmm, what to do? Ignore? Fix? Google?

The top Google hit was "Oracle sqlplus and instant client on Mac OS/X without DYLD_LIBRARY_PATH".

Everything in Casey's blog checked out:

$ which sqlplus
/Applications/instantclient_11_2/sqlplus

$ otool -L /Applications/instantclient_11_2/sqlplus
/Applications/instantclient_11_2/sqlplus:
    /ade/b/1891624078/oracle/sqlplus/lib/libsqlplus.dylib (compatibility version 0.0.0, current version 0.0.0)
    /ade/b/2649109290/oracle/rdbms/lib/libclntsh.dylib.11.1 (compatibility version 0.0.0, current version 0.0.0)
    /ade/b/2649109290/oracle/ldap/lib/libnnz11.dylib (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)

When I unset the variable, sqlplus stopped working:

$ unset DYLD_LIBRARY_PATH

$ sqlplus
dyld: Library not loaded: /ade/b/1891624078/oracle/sqlplus/lib/libsqlplus.dylib
  Referenced from: /Applications/instantclient_11_2/sqlplus
  Reason: image not found
Trace/BPT trap: 5


OS X also displayed a dialog: "sqlplus cannot be opened because of a problem. Check with the developer to make sure sqlplus works with this version of OS X. You may need to reinstall the application. Be sure to install any available updates." (Ignore / Report)

SIMPLE SOLUTION: I was about to follow Casey's instructions when a simpler solution popped into my mind. I removed the DYLD_LIBRARY_PATH environment variable from my ~/.bashrc and replaced it with an alias:

alias sqlplus="DYLD_LIBRARY_PATH=/Applications/instantclient_11_2 sqlplus"

After quitting and restarting the OS X Terminal, and verifying that the environment variable was gone and the alias was present, both "brew doctor" and "sqlplus" were happy.

DISCLAIMER: My alias only addresses sqlplus usage. If you are using more Instant Client functionality, you may need to use Casey's solution or bluebinary's solution. I can't vouch for either Casey's solution or bluebinary's solution, but both approaches seem reasonable.

Wednesday May 20, 2015

Python: Old Dog Learns New Trick

I'm now a confident Python programmer. I've recently wrapped up a sequence of two online classes that focus on programming with Python.

MITx: 6.00.1x Introduction to Computer Science and Programming Using Python - Covered Python data types, flow control, iteration, numerical methods, functions, scope, recursion, Python data structures such as lists, sets and dictionaries, comprehensions, debugging, assertions and exceptions, efficiency (Big O), sorting, search, hashing, trees, classes, object oriented programming, and inheritance.
MITx: 6.00.2x Introduction to Computational Thinking and Data Science - Covered plotting using PyLab, simulations and random walks, probability, stochastic programming, hashing, histograms, Monte Carlo simulations, curve fitting, prediction, graph traversal algorithms (used to optimize cost or distance), feature vectors, distance metrics and data clustering.


Development Tools - The course material focused on IDLE, but I chose instead to use a combination of PyDev (the Python IDE for Eclipse), Spyder and IPython Notebook.

Why Python? I've observed that in general, the use of Python has been expanding. Current relevant usage includes:
  • The Apache Spark Python API (PySpark)
  • OpenStack development
  • Some Oracle SuperCluster management tools
Was the course hard? Both courses included video lectures, programming exercises, and exams. For me, it was more "time consuming" than "hard", largely because I came to Python already knowing C, C++, Java, R, bash and some others.

Were the courses interesting? Two modules that I found to be particularly interesting were the knapsack problem and graph traversal. This was the first time that I had written code for either one. The concepts didn't seem odd, partially because I've previously spent plenty of time working with tree structures.
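For a taste of those exercises, here is a minimal 0/1 knapsack sketch (my own illustration, not code from the course); it assumes items are given as (value, weight) pairs and that the capacity is an integer:

def knapsack_max_value(items, capacity):
    # items: list of (value, weight) pairs; capacity: integer weight limit
    # best[c] holds the best value achievable with total weight <= c
    best = [0] * (capacity + 1)
    for value, weight in items:
        # walk capacities downward so each item is used at most once
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]

# Example: capacity 50, items given as (value, weight)
print(knapsack_max_value([(60, 10), (100, 20), (120, 30)], 50))   # 220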

Do I like Python? Yes. The syntax and semantics are enough like C++ and Java that the language quickly felt familiar. There are pros and cons to dynamically typed vs. statically typed languages; both approaches will be used for the foreseeable future with no clear winner. I definitely like having a REPL, and I'm eagerly awaiting the REPL in Java 9.

To Do - I'd like to get more experience with Python packages including NumPy, SciPy and pandas.



Saturday Apr 18, 2015

Hello World with Oracle PL/SQL

Hello World with Oracle PL/SQL. The most boring blog entry, ever.

  Obi-Wan: "These aren't the droids you're looking for."

  Agent Kay: "Nothing to see here. Move along folks."



set serveroutput on;

--------------------
-- Drop package body hw;
-- drop package hw;

CREATE OR REPLACE PACKAGE hw AS
   PROCEDURE hello_world;
END hw;
/
CREATE OR REPLACE PACKAGE BODY hw AS
  PROCEDURE hello_world
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str); 
  END hello_world;
END hw;
/

begin
  hw.hello_world;
end;
/

--------------------
-- Drop package body hw_2;
-- drop package hw_2;

CREATE OR REPLACE PACKAGE hw_2 AS
   PROCEDURE hello_world_2(
     num IN NUMBER);
END hw_2;
/
CREATE OR REPLACE PACKAGE BODY hw_2 AS
  PROCEDURE hello_world_2 (
    num IN NUMBER)
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str); 
    DBMS_OUTPUT.put_line (num+1); 
  END hello_world_2;
END hw_2;
/
begin
  hw_2.hello_world_2(3);
end;
/

--------------------
-- Drop package body hw_34;
-- drop package hw_34;

CREATE OR REPLACE PACKAGE hw_34 AS
   PROCEDURE hello_world_3;
   PROCEDURE hello_world_4(
     num IN NUMBER);
END hw_34;
/
CREATE OR REPLACE PACKAGE BODY hw_34 AS
  PROCEDURE hello_world_3 
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str); 
  END hello_world_3;

  PROCEDURE hello_world_4 (
    num IN NUMBER)
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str); 
    DBMS_OUTPUT.put_line (num+2); 
  END hello_world_4;
END hw_34;
/
begin
  hw_34.hello_world_3;
  hw_34.hello_world_4(3);
end;
/

Tuesday Apr 07, 2015

useradd with "-d localhost:...".

When creating Solaris users with useradd, I prefer using "-m" (make the directory) and specifying the directory with "-d localhost:...". Here is a quick comparison:

root@first-server:~# useradd -m -P "System Administrator" joerg1
root@first-server:~# useradd -m -d localhost:/export/home/joerg2 -P "System Administrator" joerg2

The /etc/passwd entry is different:

root@first-server:~# tail -2 /etc/passwd
joerg1:x:100:10::/export/home/joerg1:/usr/bin/bash
joerg2:x:101:10::/home/joerg2:/usr/bin/bash

joerg1@first-server:~$ pwd
/export/home/joerg1

joerg2@first-server:~$ pwd
/home/joerg2


Using "-d localhost:..." causes an auto_home entry to be created:

# cat /etc/auto_home
#
# Copyright 2005 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# ident "%Z%%M% %I%     %E% SMI"
#
# Home directory map for automounter
#
joerg2 localhost:/export/home/joerg2
+auto_home


Both commands create a ZFS dataset:

root@first-server:~# zfs list | grep joerg
rpool/export/home/joerg1                              35K   201G    35K  /export/home/joerg1
rpool/export/home/joerg2                              35K   201G    35K  /export/home/joerg2


But the next step is more consistent if you are going to use a home directory that is shared across servers. From the first server, NFS share /export/home/joerg2. Then use a similar syntax on the second server (without the -m option) and (ta-da) useradd will do all of the right stuff:

root@second-server:~# useradd  -d first-server:/export/home/joerg2 -P "System Administrator" joerg2

root@second-server:~# tail -1 /etc/passwd
joerg2:x:103:10::/home/joerg2:/usr/bin/bash

root@second-server:~# tail -2 /etc/auto_home
joerg2 first-server:/export/home/joerg2
+auto_home

root@second-server:~# su - joerg2
Oracle Corporation      SunOS 5.11      11.2    December 2014

joerg2@second-server:~$ pwd
/home/joerg2

joerg2@second-server:~$ df -h .
Filesystem             Size   Used  Available Capacity  Mounted on
first-server:/export/home/joerg2
                       201G    35K       201G     1%    /home/joerg2


BTW, adding the -m is a really good idea. It is more powerful than running useradd without "-m" and then using mkdir, which would have been fine in the days of yore.

-m

         If the directory does not already  exist and the  parent
         directory  is  the  mount point of a ZFS dataset, then a
         child of that dataset will be created and mounted at the
         specified location. The user is delegated permissions to
         create ZFS snapshots and promote them. The newly created
         dataset  will  inherit  the  encryption setting from its
         parent. If it is encrypted, the user is granted  permis-
         sion to change its wrapping key.

Friday Apr 03, 2015

Cloudera Hive client on Solaris

Goal: From an Oracle Solaris server, access Hive data that is stored in a Cloudera Hadoop cluster, for example, access Hive data in an Oracle Big Data Appliance from a Solaris SPARC RDBMS server.


This blog assumes that the Cloudera Hadoop cluster is already up and running. To test, I started with an Oracle Big Data Lite Virtual Machine that was installed using the process documented at Big Data Lite with a static IP address on Oracle Linux 6 / EL6.


Step 1: On the Solaris server, create an "oracle" user:

The user doesn't have to be "oracle", but that fits well with my planned usage.

# /usr/sbin/groupadd oinstall
# /usr/sbin/groupadd dba
# /usr/sbin/useradd -d localhost:/export/home/oracle -m -g oinstall -G dba oracle
# passwd oracle
# echo "oracle ALL=(ALL) ALL" > /etc/sudoers.d/oracle
# su - oracle


Set up a VNC server using the process at Installing VNC server on Solaris 11.

Step 2: Visit the Cloudera Manager to determine the version of Cloudera that is running:

  • Click on the "Support" pull down and the "About":


  • In my case, I was using Cloudera Express 5.1.2

Step 3: On the Oracle Solaris server, install the Hive and Hadoop tarballs that match your cluster

  • Choose your version. In my case, CDH 5.1.2.




  • Unpack the tarballs on your Solaris server:
$ tar xzf Downloads/hadoop-2.3.0-cdh5.1.2.tar.gz
$ tar xzf Downloads/hive-0.12.0-cdh5.1.2.tar.gz


  • And verify:
$ ls hadoop-2.3.0-cdh5.1.2
bin                  examples             libexec
bin-mapreduce1       examples-mapreduce1  sbin
cloudera             include              share
etc                  lib                  src

$ ls hive-0.12.0-cdh5.1.2/

bin                examples           LICENSE            RELEASE_NOTES.txt
conf               hcatalog           NOTICE             scripts
docs               lib                README.txt


Step 4: Download the HDFS configuration files to the Solaris server:
  • In the Cloudera Manager, go to the hdfs page:

  • From the hdfs page, download the client configuration and place it on the Solaris server.


  • Unpack the HDFS client configuration on your Solaris server:

$ unzip Downloads/hdfs-clientconfig.zip
Archive:  Downloads/hdfs-clientconfig.zip
  inflating: hadoop-conf/hdfs-site.xml 
  inflating: hadoop-conf/log4j.properties 
  inflating: hadoop-conf/topology.py 
  inflating: hadoop-conf/topology.map 
  inflating: hadoop-conf/hadoop-env.sh 
  inflating: hadoop-conf/core-site.xml 

Step 5: Configure the HDFS client software on the Solaris server

  • Edit hadoop-conf/hadoop-env.sh to set JAVA_HOME correctly:

export JAVA_HOME=/usr/jdk/instances/jdk1.7.0

  • Move the configuration files into place:

$ cp hadoop-conf/* hadoop-2.3.0-cdh5.1.2/etc/hadoop/

  • Add the Hadoop bin directory to your PATH. You may want to do this in your local shell or .bashrc

$ export PATH=~/hadoop-2.3.0-cdh5.1.2/bin:$PATH

Step 6: Test the HDFS client software on Solaris

  • Verify that the remote HDFS filesystem is visible from your Solaris server:

$ hdfs dfs -ls
15/04/02 14:40:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 6 items
drwx------   - oracle oracle          0 2014-08-25 01:55 .Trash
drwx------   - oracle oracle          0 2014-09-23 09:25 .staging
drwxr-xr-x   - oracle oracle          0 2014-01-12 15:15 moviedemo
drwxr-xr-x   - oracle oracle          0 2014-09-24 05:38 moviework
drwxr-xr-x   - oracle oracle          0 2014-09-08 11:50 oggdemo
drwxr-xr-x   - oracle oracle          0 2014-09-20 09:59 oozie-oozi


Step 7: Download the Hive configuration files to the Solaris server:

  • In the Cloudera Manager, go to the hive page:

  • From the hive page, download the client configuration and place it on the Solaris server.


  • Unpack the hive client configuration on your Solaris server:

$ unzip Downloads/hive-clientconfig.zip
Archive:  Downloads/hive-clientconfig(1).zip
  inflating: hive-conf/hive-site.xml 
  inflating: hive-conf/hive-env.sh  
  inflating: hive-conf/log4j.properties 
  inflating: hive-conf/hadoop-env.sh 
  inflating: hive-conf/core-site.xml 
  inflating: hive-conf/mapred-site.xml 
  inflating: hive-conf/topology.py  
  inflating: hive-conf/yarn-site.xml 
  inflating: hive-conf/hdfs-site.xml 
  inflating: hive-conf/topology.map  


Step 8: Configure the Hive client software on the Solaris server

  • Move the configuration files into place:

$ cp hive-conf/* hive-0.12.0-cdh5.1.2/conf/

Step 9: YARN configuration option for Hadoop:

  • The HDFS configuration files don't include yarn-site.xml. Copy the YARN configuration file from the Hive tree to the Hadoop tree:

$ cp hive-0.12.0-cdh5.1.2/conf/yarn-site.xml hadoop-2.3.0-cdh5.1.2/etc/hadoop/

Step 10: Hide hive-env.sh:

$ mv hive-0.12.0-cdh5.1.2/conf/hive-env.sh hive-0.12.0-cdh5.1.2/conf/hive-env.sh.HIDDEN

  • Add the Hive bin directory to your PATH. You may want to do this in your local shell or .bashrc

$ export PATH=~/hive-0.12.0-cdh5.1.2/bin:$PATH

Step 11: Test the Hive client software on Solaris

  • Verify that the remote Hive tables are visible from your Solaris server

$ hive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/03 12:20:37 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/hive-common-0.12.0-cdh5.1.2.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/oracle/hadoop-2.3.0-cdh5.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive> show tables;
OK
cust
movie
movie_rating
movieapp_log_avro
movieapp_log_json
movieapp_log_odistage
movieapp_log_stage
movielog
session_stats
Time taken: 1.799 seconds, Fetched: 9 row(s)
hive> 

Conclusion: I successfully configured an Oracle Solaris server to access Hive data that is stored in a Cloudera Hadoop cluster.

I am unfriending hive-env.sh. As shipped, the downloaded hive-env.sh broke the Hive CLI on my Solaris server; hiding it fixed the problem:


BEFORE:

$ hive
usage: tail [+/-[n][lbc][f]] [file]
       tail [+/-[n][l][r|f]] [file]
/home/oracle/hive-0.12.0-cdh5.1.2/conf/hive-env.sh: line 5: ,: command not found
usage: tail [+/-[n][lbc][f]] [file]
       tail [+/-[n][l][r|f]] [file]
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:1138)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:1099)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:74)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:58)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:639)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobConf
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 11 more



HIDE hive-env.sh:

$ mv hive-0.12.0-cdh5.1.2/conf/hive-env.sh hive-0.12.0-cdh5.1.2/conf/hive-env.sh.HIDDEN

AFTER:

oracle@p3231-zone14:~$ hive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/03 12:20:37 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/hive-common-0.12.0-cdh5.1.2.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/oracle/hadoop-2.3.0-cdh5.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive> 

Thursday Feb 19, 2015

"ipadm show-addr" with name resolution on Solaris 11

With Solaris 11, use "ipadm show-addr" to show all of the IP addresses on a multi-homed server:

# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           31.41.59.163/20
net0/v4a          static   ok           31.41.59.203/20
net1/v4           static   ok           192.168.20.201/24
net1/v4a          static   ok           169.254.245.72/16
net2/v4           static   ok           192.168.10.201/24
net3/v4           static   ok           192.168.30.201/24
net3/v4a          static   ok           192.168.30.101/24
net3/v4b          static   ok           192.168.30.104/24
net3/v4c          static   ok           192.168.30.103/24
lo0/v6            static   ok           ::1/128

I grew bored of looking up the hostnames of the IP addresses, so I wrote a short Python script to append the hostname to each line:

# show-addr.py
ADDROBJ           TYPE     STATE        ADDR ()
lo0/v4            static   ok           127.0.0.1/8 (localhost)
net0/v4           static   ok           31.41.59.163/20 (proj-1-db.spacely-sprockets.com)
net0/v4a          static   ok           31.41.59.203/20 (proj-owan-vip1.spacely-sprockets.com)
net1/v4           static   ok           192.168.20.201/24 (proj-1-db-stor)
net1/v4a          static   ok           169.254.245.72/16 ()
net2/v4           static   ok           192.168.10.201/24 (proj-1-db-priv)
net3/v4           static   ok           192.168.30.201/24 (proj-1-db)
net3/v4a          static   ok           192.168.30.101/24 (proj-1-vip.jdbc.spacely-sprockets.com)
net3/v4b          static   ok           192.168.30.104/24 (proj-scan.jdbc.spacely-sprockets.com)
net3/v4c          static   ok           192.168.30.103/24 (proj-scan.jdbc.spacely-sprockets.com)
lo0/v6            static   ok           ::1/128 ()

Here is the script, "show-addr.py":

#!/bin/env python2.6
# Append the hostname that each address resolves to onto the output of "ipadm show-addr".

from socket import gethostbyaddr
from subprocess import Popen, PIPE

proc=Popen(['ipadm', 'show-addr'], stdout=PIPE)
for line in proc.stdout:
    words=line.split()
    cidr=words[-1]                     # last column, e.g. 192.168.30.201/24
    addr=cidr.split('/')[0]            # strip the prefix length
    try:
        hostname=gethostbyaddr(addr)[0]
    except:
        hostname=''                    # header line, link-local address, or no PTR record
    print line[:-1] + " (" + hostname + ")"




Saturday Dec 20, 2014

SPARC T5-4 RAC and WebLogic Cluster

This entry provides background for related blog entries.

The system was being set up to test an application:
  • Oracle Financial Services Analytical Applications
  • WebLogic Cluster
  • Oracle Database

Here is a brief description of the hardware that was available for the test:

Compute nodes: Oracle's SPARC T5-4 servers
Each T5-4 has:
  • Sixteen-core 3.6 GHz SPARC T5 processors
  • Up to 128 threads per processor for a maximum 512 threads per system
  • Sixteen floating-point units per SPARC T5 processor
  • 1 TB (using 32x 32 GB 1,066 MHz DDR3 DIMMs)
Storage: Sun ZFS Storage Appliance 7420
This particular unit was populated with:
  • 4x 8-core Intel® Xeon® processors
  • 20 HDDs
  • 4 SSDs for logging
  • 4 SSDs for cache

The deployment was an active/active cluster: if either SPARC T5-4 Server were to go down, the system would be able to continue processing transactions.

Two virtualization layers were used. Solaris/SPARC has many virtualization alternatives and features, as discussed in Oracle Solaris 11 Virtualization Technology, allowing users to deploy virtualized environments that are specifically tailored to meet the needs of their site.  For this deployment, I used:
  • Oracle VM Server for SPARC. "Oracle VM Server leverages the built-in SPARC hypervisor to subdivide a supported platform’s resources (CPUs, memory, network, and storage) by creating partitions called logical (or virtual) domains. Each logical domain can run an independent operating system."
  • Solaris 11 Zones. I like Jeff Victor's blog Comparing Solaris 11 Zones to Solaris 10 Zones: "Zones are a form of server virtualization called "OS (Operating System) Virtualization." They improve consolidation ratios by isolating processes from each other so that they cannot interact. Each zone has its own set of users, naming services, and other software components. One of the many advantages is that there is no need for a hypervisor, so there is no performance overhead. Many data centers run tens to hundreds of zones per server!"

Virtualized layout of the first T5-4 Server:

LDom      Global Zone      Non-Global Zones                   Application
-------   --------------   --------------------------------   -----------------
Primary   proj-1-control   n/a                                n/a
Guest     proj-1-wls       proj-1-obiee                       OBIEE
                           proj-1-ohs                         OHS
                           proj-1-z1, z2..z14, proj-1-z15     WLS Cluster Nodes
Guest     proj-1-db        n/a                                RAC Node


Virtualized layout of the second T5-4 Server:

LDom      Global Zone      Non-Global Zones                             Application
-------   --------------   ------------------------------------------   -----------------
Primary   proj-2-control   n/a                                          n/a
Guest     proj-2-wls       proj-2-ohs                                   OHS
                           proj-2-z1, proj-2-z2, z3..z14, proj-2-z15    WLS Cluster Nodes
Guest     proj-2-db        n/a                                          RAC Node



Network Connectivity:

Each Solaris Zone is dual-ported to isolate network layers, as discussed in Onion Security.


Resource allocation: two Oracle SPARC T5-4 servers to simultaneously host both Oracle RAC and a WebLogic Server Cluster. I chose to use Oracle VM Server for SPARC to create a cluster like this:


There are plenty of trade-offs and decisions that need to be made, for example:


LDom Core and Memory Allocation:

LDom    CPUs    Cores    Memory
Cntl    0.25    4        64 GB
App     2.75    44       704 GB
DB      1.00    16       256 GB
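As a sanity check, these allocations add up to the whole box: 4 + 44 + 16 = 64 cores (all four 16-core SPARC T5 processors) and 64 GB + 704 GB + 256 GB = 1,024 GB, i.e. the full 1 TB of memory in each T5-4.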

Additional detail can be found in other blog entries in this series.


Friday Dec 19, 2014

DNS Bind Server configuration on Solaris 11.2

This blog is part of the SPARC T5-4 RAC and WebLogic Cluster series.

Virtualized layout of the T5-4 Servers:

LDom      Global Zone      Non-Global Zones                             Application
-------   --------------   ------------------------------------------   -----------------
Primary   proj-2-control   n/a                                          n/a
Guest     proj-2-wls       proj-2-ohs                                   OHS
                           proj-2-z1, proj-2-z2, z3..z14, proj-2-z15    WLS Cluster Nodes
Guest     proj-2-db        n/a                                          RAC Node


Network Connectivity:

Each Solaris Zone is dual-ported to isolate network layers, as discussed in Onion Security.



DNS Configuration:

Normally, I use systems where the naming is defined in the corporate DNS server. For this test, the private subnets needed a private DNS server. Here are the files that are used to configure the DNS server.

/etc/named.conf
options {
       directory       "/var/named";
        pid-file        "/var/named/tmp/pid";
        dump-file       "/var/named/dump/named_dump.db";
        statistics-file "/var/named/named.stats";
        forward         first;
        forwarders { 130.35.249.52; 130.35.249.41; 192.135.82.132; };
};

zone "jdbc.bigcorp.com" {
        type master;
        file "jdbc.db";
};

zone "30.168.192.in-addr.arpa" {
        type master;
        file "30.168.192.db";
};

zone "http.bigcorp.com" {
        type master;
        file "jdbc.db";
};

zone "40.168.192.in-addr.arpa" {
        type master;
        file "40.168.192.db";
};
logging {
category "default" { "debug"; };
category "general" { "debug"; };
category "database" { "debug"; };
category "security" { "debug"; };
category "config" { "debug"; };
category "resolver" { "debug"; };
category "xfer-in" { "debug"; };
category "xfer-out" { "debug"; };
category "notify" { "debug"; };
category "client" { "debug"; };
category "unmatched" { "debug"; };
category "network" { "debug"; };
category "update" { "debug"; };
category "queries" { "debug"; };
category "dispatch" { "debug"; };
category "dnssec" { "debug"; };
category "lame-servers" { "debug"; };
channel "debug" {
file "/tmp/nameddbg" versions 2 size 50m;
print-time yes;
print-category yes;
};
};


HTTP Network

/var/named/http.db:
$TTL 3h
@       IN      SOA     proj-1-db jeff  (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
)

             IN    NS  proj-1-db.bigcorp.com
proj-1-z1    IN    A   192.168.40.51
proj-1-z2    IN    A   192.168.40.52
proj-1-z3    IN    A   192.168.40.53
proj-1-z4    IN    A   192.168.40.54
proj-1-z5    IN    A   192.168.40.55
proj-2-z1    IN    A   192.168.40.71
proj-2-z2    IN    A   192.168.40.72
proj-2-z3    IN    A   192.168.40.73
proj-2-z4    IN    A   192.168.40.74
proj-2-z5    IN    A   192.168.40.75
proj-3-oats  IN    A   192.168.40.103
proj-4-oats  IN    A   192.168.40.104
proj-1-obiee IN    A   192.168.40.221
proj-1-ohs   IN    A   192.168.40.231

/var/named/40.168.192.db:

$TTL 3h
@       IN      SOA    proj-1-db.http.bigcorp.com. jeff.http.bigcorp.com. (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
)

    IN  NS   proj-1-db.bigcorp.com.
51  IN  PTR  proj-1-z1.http.bigcorp.com.
52  IN  PTR  proj-1-z2.http.bigcorp.com.
53  IN  PTR  proj-1-z3.http.bigcorp.com.
54  IN  PTR  proj-1-z4.http.bigcorp.com.
55  IN  PTR  proj-1-z5.http.bigcorp.com.
71  IN  PTR  proj-2-z1.http.bigcorp.com.
72  IN  PTR  proj-2-z2.http.bigcorp.com.
73  IN  PTR  proj-2-z3.http.bigcorp.com.
74  IN  PTR  proj-2-z4.http.bigcorp.com.
75  IN  PTR  proj-2-z5.http.bigcorp.com.
103 IN  PTR  proj-3-oats.http.bigcorp.com.
104 IN  PTR  proj-4-oats.http.bigcorp.com.
221 IN  PTR  proj-1-obiee.http.bigcorp.com.
231 IN  PTR  proj-1-ohs.http.bigcorp.com.



JDBC Network

/var/named/jdbc.db:
$TTL 3h
@       IN      SOA     proj-1-db jeff  (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
)

              IN   NS  proj-1-db
proj-1-z1     IN   A   192.168.30.51
proj-1-z2     IN   A   192.168.30.52
proj-1-z3     IN   A   192.168.30.53
proj-1-z4     IN   A   192.168.30.54
proj-1-z5     IN   A   192.168.30.55
proj-2-z1     IN   A   192.168.30.71
proj-2-z2     IN   A   192.168.30.72
proj-2-z3     IN   A   192.168.30.73
proj-2-z4     IN   A   192.168.30.74
proj-2-z5     IN   A   192.168.30.75
proj-1-db-vip IN   A   192.168.30.101
proj-2-db-vip IN   A   192.168.30.102
proj-scan     IN   A   192.168.30.103
proj-scan     IN   A   192.168.30.104
proj-scan     IN   A   192.168.30.105
proj-1-db     IN   A   192.168.30.201
proj-2-db     IN   A   192.168.30.202
proj-1-obiee  IN   A   192.168.30.221
proj-1-ohs    IN   A   192.168.30.231
proj-2-ohs    IN   A   192.168.30.232

/var/named/30.168.192.db:

$TTL 3h
@       IN      SOA    proj-1-db.jdbc.bigcorp.com. jeff.jdbc.bigcorp.com. (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
)

    IN  NS   proj-1-db.jdbc.bigcorp.com.
51  IN  PTR  proj-1-z1.jdbc.bigcorp.com.
52  IN  PTR  proj-1-z2.jdbc.bigcorp.com.
53  IN  PTR  proj-1-z3.jdbc.bigcorp.com.
54  IN  PTR  proj-1-z4.jdbc.bigcorp.com.
55  IN  PTR  proj-1-z5.jdbc.bigcorp.com.
71  IN  PTR  proj-2-z1.jdbc.bigcorp.com.
72  IN  PTR  proj-2-z2.jdbc.bigcorp.com.
73  IN  PTR  proj-2-z3.jdbc.bigcorp.com.
74  IN  PTR  proj-2-z4.jdbc.bigcorp.com.
75  IN  PTR  proj-2-z5.jdbc.bigcorp.com.
101 IN  PTR  proj-1-vip.jdbc.bigcorp.com.
102 IN  PTR  proj-2-vip.jdbc.bigcorp.com.
103 IN  PTR  proj-scan.jdbc.bigcorp.com.
104 IN  PTR  proj-scan.jdbc.bigcorp.com.
105 IN  PTR  proj-scan.jdbc.bigcorp.com.
201 IN  PTR  proj-1-db.jdbc.bigcorp.com.
202 IN  PTR  proj-2-db.jdbc.bigcorp.com.
221 IN  PTR  proj-1-obiee.jdbc.bigcorp.com.
231 IN  PTR  proj-1-ohs.jdbc.bigcorp.com.
232 IN  PTR  proj-2-ohs.jdbc.bigcorp.com.


Configuring a DNS Server:

# mkdir /var/named
# mkdir /var/named/dump
# mkdir /var/named/tmp
# pkg install pkg:/service/network/dns/bind

# named-checkconf -z /etc/named.conf
zone jdbc.bigcorp.com/IN: loaded serial 2013022744
zone 30.168.192.in-addr.arpa/IN: loaded serial 2013022744
zone http.bigcorp.com/IN: loaded serial 2013022744
zone 40.168.192.in-addr.arpa/IN: loaded serial 2013022744

Start the DNS Server:

# svcadm enable network/dns/server


Configure the DNS Client:

root@proj-1-db:~# svccfg -s network/dns/client
svc:/network/dns/client> setprop config/search = astring: ("jdbc.bigcorp.com" "bigcorp.com")
svc:/network/dns/client> setprop config/nameserver = net_address: (192.168.30.201)
svc:/network/dns/client> refresh
svc:/network/dns/client> quit

Test DNS:

root@proj-1-db:~# nslookup proj-2-z4
Server:        192.168.30.201
Address:    192.168.30.201#53

Name:    proj-2-z4.jdbc.bigcorp.com
Address: 192.168.30.74
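
Beyond spot checks with nslookup, a short Python loop can confirm that forward and reverse lookups agree. This is just an illustrative sketch; the hostnames are examples taken from the zone files above:

#!/bin/env python2.6
# Verify forward (A) and reverse (PTR) lookups for a few names from jdbc.db
from socket import gethostbyname, gethostbyaddr

for name in ['proj-1-db.jdbc.bigcorp.com',
             'proj-2-z4.jdbc.bigcorp.com',
             'proj-scan.jdbc.bigcorp.com']:
    try:
        addr = gethostbyname(name)        # forward lookup
        back = gethostbyaddr(addr)[0]     # reverse lookup
        print name, '->', addr, '->', back
    except Exception, e:
        print name, 'lookup failed:', e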


I didn't use DNS for the Storage network (20) or the Cluster Interconnect (10); instead, I just used /etc/hosts.

root@proj-1-db:~# cat /etc/hosts
#
# Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# Internet host table
#
::1             localhost
127.0.0.1       localhost loghost
192.168.30.201  proj-1-db

## RAC Private /24 Subnet
192.168.10.201  proj-1-db-priv
192.168.10.202  proj-2-db-priv

## Storage /24 Subnet
192.168.20.201  proj-1-db-stor
192.168.20.202  proj-2-db-stor
192.168.20.205  proj-5-s7420-stor


WebLogic Server zones will each have 3 IPs:

root@proj-2-z1:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
z1/v4             static   ok           xx.xx.xx.xx/20   # Management Subnet
jdbc_z1/v4        static   ok           192.168.30.71/24 # JDBC Subnet
http_z1/v4        static   ok           192.168.40.71/24 # HTTP Subnet
lo0/v6            static   ok           ::1/128


When Oracle Clusterware is up, the database LDom will have many IPs:

root@proj-1-db:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           xx.xx.xx.xx/20    # Management Subnet
net1/v4           static   ok           192.168.20.201/24 # Storage Subnet
net2/v4           static   ok           192.168.10.201/24 # Clusterware interconnect
net3/v4           static   ok           192.168.30.201/24 # JDBC Subnet
net3/v4a          static   ok           192.168.30.103/24 # SCAN
net3/v4d          static   ok           192.168.30.105/24 # SCAN
net3/v4e          static   ok           192.168.30.101/24 # VIP
lo0/v6            static   ok           ::1/128


Tuesday Dec 16, 2014

Oracle SQL Connector for HDFS on SPARC

This blog describes the steps necessary to configure Oracle SQL Connector for Hadoop Distributed File System (OSCH) to enable Oracle Database running on Oracle Solaris to access and analyze data residing in Cloudera HDFS running on an Oracle Big Data Appliance. The steps have been verified against a Cloudera Distribution including Apache Hadoop running in the Oracle Big Data Lite Virtual Machine. My configuration adventures regarding the Oracle Big Data Lite Virtual Machine and VirtualBox are shown in this previous blog entry.

Although similar steps are expected to work on many similar hardware and software configurations, it is worth noting that this document has been tested using the following two configurations:

Tested Configuration 1

RDBMS:
  • Oracle SPARC T4-2 Server
  • Oracle Solaris 11.2
  • Oracle Database 12c Enterprise Edition Release 12.1.0.2.0
Connector:
  • Oracle SQL Connector for Hadoop Distributed File System v3.1.0
Hadoop:
  • Oracle Big Data Appliance
  • Oracle Enterprise Linux 6.4
  • Cloudera Manager (5.1.0)
  • Cloudera Distribution including Apache Hadoop (CDH5.1.0)



Tested Configuration 2

RDBMS:
  • SPARC SuperCluster T4-4
  • Oracle Solaris 11.1
  • Oracle Database 12c Enterprise Edition Release 12.1.0.2.0
Connector:
  • Oracle SQL Connector for Hadoop Distributed File System v3.1.0
Hadoop:
  • Oracle Big Data Lite Virtual Machine 4.0.1
  • Oracle Enterprise Linux 6.4
  • Cloudera Manager (5.1.2)
  • Cloudera Distribution including Apache Hadoop (CDH5.1.2)


Part A: Preparing the initial environment

Step 1: Oracle Database 12c deployment

* Follow the typical database deployment guides:

Oracle Database Online Documentation 12c Release 1 (12.1)
-> Installing and Upgrading
--> Under "Solaris Installation Guides", follow the guide that suits your needs:
---> Grid Infrastructure Installation Guide for Oracle Solaris
---> Database Installation Guide for Oracle Solaris
---> Database Quick Installation Guide for Oracle Solaris on SPARC (64-Bit)


Step 2: Oracle Big Data deployment

* For the Big Data Appliance, follow the instructions in the Oracle Big Data Documentation.

* For the Oracle Big Data Lite Virtual Machine, follow the Quick Deployment Step-by-step instructions. Also, see my configuration adventures recounted in this previous blog entry.


Part B: Enable the Cloudera HDFS client on Solaris

* It is assumed that both Oracle database and Hadoop cluster are up and running independently before you attempt to install the connector.

Step 1: Make sure that Java is installed on Solaris

$ sudo pkg install --accept jdk-7

$ /usr/java/bin/java -version
java version "1.7.0_65"
Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
Java HotSpot(TM) Server VM (build 24.65-b04, mixed mode)


Step 2: Determine the Version of Cloudera that is running on your Big Data Appliance:

* Starting with the BDA up and running:


* Click on the "Support" pull down and the "About":


* In my case, I was using Cloudera Express 5.1.2




Step 3: On the Oracle Database servers, install the version of Hadoop that matches your cluster

* Visit Cloudera Downloads, click on "CDH Download Now", and choose your version. In my case, CDH 5.1.2.


* This took me to the Packaging and Tarballs page for all of the CDH components. Download the Apache Hadoop Tarball


Place this tarball onto your Solaris server, or in the case of Oracle RAC, copy the tarball to every node. There are many ways to get the tarball onto your Solaris server. In my case, the most convenient method was to use wget:

$ export http_proxy=http://the-proxy-server.bigcorp.com:80
$ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.3.0-cdh5.1.2.tar.gz
$ scp hadoop-2.3.0-cdh5.1.2.tar.gz oracle@my-rac-node-2:


Unpack the tarball on your Solaris server, or on every RAC node:

$ tar xzf ../Downloads/hadoop-2.3.0-cdh5.1.2.tar.gz

And verify:

$ ls hadoop-2.3.0-cdh5.1.2
bin                  examples             libexec
bin-mapreduce1       examples-mapreduce1  sbin
cloudera             include              share
etc                  lib                  src


Step 4: Download the hdfs configuration files to the Solaris RDBMS server

* In the Cloudera Manager, go to the hdfs page:


* From the hdfs page, download the client configuration. Place the client configuration on the Solaris server(s)



* Unpack the client configuration on your Solaris server(s):

$ unzip ./Downloads/hdfs-clientconfig.zip

$ ls hadoop-conf
core-site.xml     hdfs-site.xml     topology.map
hadoop-env.sh     log4j.properties  topology.py



Step 5: Configure the Hadoop client software on the Solaris server(s)

* Edit hadoop-conf/hadoop-env.sh to set JAVA_HOME correctly:

export JAVA_HOME=/usr/jdk/instances/jdk1.7.0


* Move the configuration files into place:

$ cp hadoop-conf/* hadoop-2.3.0-cdh5.1.2/etc/hadoop/


* Add the Hadoop bin directory to your PATH. You may want to do this in your local shell or .bashrc

$ export PATH=~/hadoop-2.3.0-cdh5.1.2/bin:$PATH


Step 6: Test the Hadoop client software on Solaris

* Verify that the remote hdfs filesystem is visible from your Solaris server(s)

$ hdfs dfs -ls
14/12/12 09:35:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 6 items
drwx------   - oracle oracle          0 2014-08-25 02:55 .Trash
drwx------   - oracle oracle          0 2014-09-23 10:25 .staging
drwxr-xr-x   - oracle oracle          0 2014-01-12 15:15 moviedemo
drwxr-xr-x   - oracle oracle          0 2014-09-24 06:38 moviework
drwxr-xr-x   - oracle oracle          0 2014-09-08 12:50 oggdemo
drwxr-xr-x   - oracle oracle          0 2014-09-20 10:59 oozie-oozi



Part C: Install "Oracle SQL Connector for HDFS" (OSCH) on your Solaris server(s)

Step 1: Download OSCH


* Download Oracle SQL Connector for Hadoop Distributed File System Release 3.1.0 from the Oracle Big Data Connectors Downloads page.

* Unzip OSCH

$ unzip oraosch-3.1.0.zip
Archive:  oraosch-3.1.0.zip
 extracting: orahdfs-3.1.0.zip      
  inflating: README.txt             

$ unzip orahdfs-3.1.0
Archive:  orahdfs-3.1.0.zip
replace orahdfs-3.1.0/doc/README.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
  inflating: orahdfs-3.1.0/doc/README.txt 
  inflating: orahdfs-3.1.0/jlib/ojdbc6.jar 
  inflating: orahdfs-3.1.0/jlib/osdt_cert.jar 
  inflating: orahdfs-3.1.0/jlib/orahdfs.jar 
  inflating: orahdfs-3.1.0/jlib/oraclepki.jar 
  inflating: orahdfs-3.1.0/jlib/osdt_core.jar 
  inflating: orahdfs-3.1.0/jlib/ora-hadoop-common.jar 
  inflating: orahdfs-3.1.0/bin/hdfs_stream 
  inflating: orahdfs-3.1.0/examples/sql/mkhive_unionall_view.sql  


Step 2: Install OSCH

* Follow the instructions in the Connectors User's Guide which is available as part of the Oracle Big Data Documentation. Also see
Getting Started with Oracle Big Data Connectors.

* I'm tempted to cut & paste everything from "1.4.3 Installing Oracle SQL Connector for HDFS" into this document, but I won't.

* Your mileage may vary, but for me, it looked like this:

$ tail -1 /var/opt/oracle/oratab
dbmc1:/u01/app/oracle/product/12.1.0.2/dbhome_1:N               # line added by Agent

$ export PATH=/usr/local/bin:$PATH

$ export ORACLE_SID=dbmc1

$ . /usr/local/bin/oraenv
ORACLE_SID = [dbmc1] ?
The Oracle base has been set to /u01/app/oracle

$ env | grep ORA
ORACLE_SID=dbmc1
ORACLE_BASE=/u01/app/oracle
ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1

$ srvctl status database -d dbmc1
Instance dbmc11 is running on node etc20dbadm01
Instance dbmc12 is running on node etc20dbadm02

$ export OSCH_HOME=/export/home/oracle/orahdfs-3.1.0

$ export HADOOP_CLASSPATH=$OSCH_HOME/jlib/*


$ sqlplus / as sysdba

SQL> CREATE USER hdfsuser IDENTIFIED BY n0ne0fyourBusiness
   DEFAULT TABLESPACE users
   QUOTA UNLIMITED ON users;

SQL> GRANT CREATE SESSION, CREATE TABLE, CREATE VIEW TO hdfsuser;

SQL> GRANT EXECUTE ON sys.utl_file TO hdfsuser;

SQL> CREATE OR REPLACE DIRECTORY osch_bin_path AS '/export/home/oracle/orahdfs-3.1.0/bin';
SQL> GRANT READ, EXECUTE ON DIRECTORY osch_bin_path TO hdfsuser;

Notice that MOVIEDEMO_DIR needs to be on shared storage, visible to both RAC nodes. From a Solaris shell prompt, create the MOVIEDEMO_DIR directory, substituting the ZFS_SA InfiniBand hostname for XXX, below. Then allow the database user to access the directory:

$ mkdir /net/XXX/export/test/hdfsuser

$ sqlplus / as sysdba
SQL> CREATE OR REPLACE DIRECTORY MOVIEDEMO_DIR AS '/net/XXX/export/test/hdfsuser';
SQL> GRANT READ, WRITE ON DIRECTORY MOVIEDEMO_DIR TO hdfsuser;

Step 3: Test using the Movie Demo

Test using the movie demo, which is documented in the Big Data Connectors User's Guide. Cut and paste moviefact_hdfs.sh and moviefact_hdfs.xml from Example 2-1, "Accessing HDFS Data Files from Oracle Database".

In moviefact_hdfs.sh, for my configuration, I needed to change the path to OSCH_HOME and to moviefact_hdfs.xml.

In moviefact_hdfs.xml, I needed to change two properties, as follows. For the database connection, use the Oracle Single Client Access Name (SCAN).

    <property>
      <name>oracle.hadoop.connection.url</name>
      <value>jdbc:oracle:thin:@sc-scan:1521/dbmc1</value>
    </property>

    <property>
      <name>oracle.hadoop.connection.user</name>
      <value>hdfsuser</value>
    </property>


Run the script:

$ sh moviefact_hdfs.sh
Oracle SQL Connector for HDFS Release 3.1.0 - Production

Copyright (c) 2011, 2014, Oracle and/or its affiliates. All rights reserved.

[Enter Database Password:]
14/12/12 12:36:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
The create table command succeeded.

CREATE TABLE "HDFSUSER"."MOVIE_FACTS_EXT"
(
 "CUST_ID"                        VARCHAR2(4000),
 "MOVIE_ID"                       VARCHAR2(4000),
 "GENRE_ID"                       VARCHAR2(4000),
 "TIME_ID"                        TIMESTAMP(9),
 "RECOMMENDED"                    NUMBER,
 "ACTIVITY_ID"                    NUMBER,
 "RATING"                         NUMBER,
 "SALES"                          NUMBER
)
ORGANIZATION EXTERNAL
(
   TYPE ORACLE_LOADER
   DEFAULT DIRECTORY "MOVIEDEMO_DIR"
   ACCESS PARAMETERS
   (
     RECORDS DELIMITED BY 0X'0A'
     CHARACTERSET AL32UTF8
     PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream'
     FIELDS TERMINATED BY 0X'09'
     MISSING FIELD VALUES ARE NULL
     (
       "CUST_ID" CHAR(4000),
       "MOVIE_ID" CHAR(4000),
       "GENRE_ID" CHAR(4000),
       "TIME_ID" CHAR DATE_FORMAT TIMESTAMP MASK 'YYYY-MM-DD:HH:MI:SS',
       "RECOMMENDED" CHAR,
       "ACTIVITY_ID" CHAR,
       "RATING" CHAR,
       "SALES" CHAR
     )
   )
   LOCATION
   (
     'osch-20141212123602-2522-1',
     'osch-20141212123602-2522-2',
     'osch-20141212123602-2522-3',
     'osch-20141212123602-2522-4'
   )
) PARALLEL REJECT LIMIT UNLIMITED;

The following location files were created.

osch-20141212123602-2522-1 contains 1 URI, 12754882 bytes

    12754882 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00001

osch-20141212123602-2522-2 contains 1 URI, 438 bytes

         438 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00002

osch-20141212123602-2522-3 contains 1 URI, 432 bytes

         432 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00003

osch-20141212123602-2522-4 contains 1 URI, 202 bytes

         202 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00004



Final verification with SQL Developer:


Wednesday Dec 10, 2014

(OATS) Oracle Application Testing Suite Report

When using Oracle Application Testing Suite (OATS), I like to monitor and save reports that look like this:


Name          Value   Min     Max     Avg      Pass   Warning   Fail
Transaction1  70.62   51.84   81.08   72.503   84     0         0
Transaction2  72.72   55.01   82.7    73.195   95     0         0
Transaction3  72.22   54.99   82.86   73.542   83     0         0


A colleague asked how I generate the reports. Here is the answer.


Step 1:

OATS_a.png


Step 2:

OATS_b.png


Steps 3 & 4: 

OATS_c.png


Finally, once the report is open in Firefox, I can cut & paste the table into Thunderbird and save or send to my colleagues.

Friday Oct 24, 2014

Onion Security

This blog is part of the SPARC T5-4 RAC and WebLogic Cluster series.

I was asked why I used so many subnets in the system which is described in my blog entry SPARC T5-4 LDoms for RAC and WebLogic Clusters:
  • Management Subnet
  • HTTP Public Network
  • JDBC subnet
  • Storage subnet
The short answer: I didn't need to. I could have used one public network and one RAC private network.

Longer answer:
  • Better observability
  • Better isolation
  • It enables a better security model


Onion View





LDom View




If Joe Blackhat is able to compromise our HTTP server, that is a bad thing, but hopefully he will only be able to access a subset of the data.  To get any additional data, he will need to request the data from the WebLogic server. The HTTP-to-WebLogic network layer can be monitored, firewalled, and logged. Again, if Joe Blackhat is able to penetrate one layer deeper, into the WebLogic layer, he will only be able to access additional data via JDBC calls to Oracle RAC. Again, the WebLogic-to-RAC network layer can be monitored, firewalled, and logged. And so forth...

In case it isn't obvious, the management network is intended to be used only infrequently by DBA's and System Administrators. This network should be tightly controlled and only enabled when system administration is required.

Wednesday Aug 20, 2014

Quadratic Programming with Oracle R Enterprise

I wanted to use quadprog with ORE on an Oracle SPARC T4 server running Oracle Solaris 11.2.

For background, see:


Problem: path to Solaris Studio doesn't match my installation:

$ ORE CMD INSTALL quadprog_1.5-5.tar.gz
* installing to library '/u01/app/oracle/product/12.1.0/dbhome_1/R/library'
* installing *source* package 'quadprog' ...
** package 'quadprog' successfully unpacked and MD5 sums checked
** libs
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64   -PIC  -g  -c aind.f -o aind.o
bash: /opt/SunProd/studio12u3/solarisstudio12.3/bin/f95: No such file or directory
*** Error code 1
make: Fatal error: Command failed for target `aind.o'
ERROR: compilation failed for package 'quadprog'
* removing '/u01/app/oracle/product/12.1.0/dbhome_1/R/library/quadprog'

$ ls -l /opt/solarisstudio12.3/bin/f95
lrwxrwxrwx   1 root     root          15 Aug 19 17:36 /opt/solarisstudio12.3/bin/f95 -> ../prod/bin/f90

Solution: a symbolic link:

$ sudo mkdir -p /opt/SunProd/studio12u3

$ sudo ln -s /opt/solarisstudio12.3 /opt/SunProd/studio12u3/

Now, it is all good:

$ ORE CMD INSTALL quadprog_1.5-5.tar.gz
* installing to library '/u01/app/oracle/product/12.1.0/dbhome_1/R/library'
* installing *source* package 'quadprog' ...
** package 'quadprog' successfully unpacked and MD5 sums checked
** libs
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64   -PIC  -g  -c aind.f -o aind.o
/opt/SunProd/studio12u3/solarisstudio12.3/bin/ cc -xc99 -m64 -I/usr/lib/64/R/include -DNDEBUG -KPIC  -xlibmieee  -c init.c -o init.o
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64  -PIC -g  -c -o solve.QP.compact.o solve.QP.compact.f
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64  -PIC -g  -c -o solve.QP.o solve.QP.f
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64   -PIC  -g  -c util.f -o util.o
/opt/SunProd/studio12u3/solarisstudio12.3/bin/ cc -xc99 -m64 -G -o quadprog.so aind.o init.o solve.QP.compact.o solve.QP.o util.o -xlic_lib=sunperf -lsunmath -lifai -lsunimath -lfai -lfai2 -lfsumai -lfprodai -lfminlai -lfmaxlai -lfminvai -lfmaxvai -lfui -lfsu -lsunmath -lmtsk -lm -lifai -lsunimath -lfai -lfai2 -lfsumai -lfprodai -lfminlai -lfmaxlai -lfminvai -lfmaxvai -lfui -lfsu -lsunmath -lmtsk -lm -L/usr/lib/64/R/lib -lR
installing to /u01/app/oracle/product/12.1.0/dbhome_1/R/library/quadprog/libs
** R
** preparing package for lazy loading
** help
*** installing help indices
  converting help for package 'quadprog'
    finding HTML links ... done
    solve.QP                                html 
    solve.QP.compact                        html 
** building package indices
** testing if installed package can be loaded
* DONE (quadprog)

======

Here is an example from http://cran.r-project.org/web/packages/quadprog/quadprog.pdf
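
For context, solve.QP() solves quadratic programs of the form described in the quadprog documentation:

    minimize    -d'b + (1/2) b'Db
    subject to  A'b >= b_0

In the example below, Dmat is the 3x3 identity matrix, dvec = (0, 5, 0), and the columns of Amat together with bvec define three linear inequality constraints.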

> require(quadprog)
> Dmat <- matrix(0,3,3)
> diag(Dmat) <- 1
> dvec <- c(0,5,0)
> Amat <- matrix(c(-4,-3,0,2,1,0,0,-2,1),3,3)
> bvec <- c(-8,2,0)
> solve.QP(Dmat,dvec,Amat,bvec=bvec)
$solution
[1] 0.4761905 1.0476190 2.0952381

$value
[1] -2.380952

$unconstrained.solution
[1] 0 5 0

$iterations
[1] 3 0

$Lagrangian
[1] 0.0000000 0.2380952 2.0952381

$iact
[1] 3 2


Here, the standard example is modified to work with Oracle R Enterprise:


require(ORE)
ore.connect("my-name", "my-sid", "my-host", "my-pass", 1521)

ore.doEval(
  function () {
    require(quadprog)
  }
)

ore.doEval(
  function () {
    Dmat <- matrix(0,3,3)
    diag(Dmat) <- 1
    dvec <- c(0,5,0)
    Amat <- matrix(c(-4,-3,0,2,1,0,0,-2,1),3,3)
    bvec <- c(-8,2,0)
    solve.QP(Dmat,dvec,Amat,bvec=bvec)
  }
)

$solution
[1] 0.4761905 1.0476190 2.0952381

$value
[1] -2.380952

$unconstrained.solution
[1] 0 5 0

$iterations
[1] 3 0

$Lagrangian
[1] 0.0000000 0.2380952 2.0952381

$iact
[1] 3 2

Now I can combine the quadprog compute algorithms with the Oracle R Enterprise Database engine functionality:
  • Scale to large datasets
  • Access to tables, views, and external tables in the database, as well as those accessible through database links
  • Use SQL query parallel execution
  • Use in-database statistical and data mining functionality

Tuesday Mar 18, 2014

What compiler options were used to create an executable?


Not stripped:

$ cc -O hello.c

$ file a.out

a.out: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+ Required, dynamically linked, not stripped

$ strings -a a.out | grep cc

/opt/solarisstudio12.3/prod/bin/cc -O hello.c

$ dwarfdump -i a.out | grep compile_o

DW_AT_SUN_compile_options Xa;O;R=Sun C 5.12 SunOS_sparc Patch 148917-07 2013/10/18;backend;raw;cd;

Stripped:

$ strip a.out

$ file a.out

a.out: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+ Required, dynamically linked, stripped

$ strings -a a.out | grep cc

(none)

Tuesday Apr 14, 2009

Algorithmics Financial Risk Management Software on the Sun Fire X4270


For the past several months, I've been slaving over a new server powered by Intel® Xeon® 5500 Series processors.  I have been using the system to investigate the characteristics of Algorithmics' Risk Analysis application, and I've found that the new server and the Solaris OS run great.  Solaris lets you take advantage of all the great new features of these new CPUs, such as Turbo Boost and Hyper-Threading. The big takeaway is that the new version of Algo's software really benefits from the new CPU: this platform runs faster than the other Solaris systems I've tested.

If you don't know Algorithmics, their software analyzes the risk of financial instruments, and financial institutions are using it to help keep the current financial meltdown from worsening. Algo is used by banks and financial institutions in 30 countries. I'm part of the team at Sun that has worked closely with Algorithmics to optimize performance of Algorithmics' risk management solutions on Sun Fire servers running Solaris.

Algo's software has a flexible simulation framework for a broad class of financial instruments.  SLIMs are their newest generation of very fast simulation engines. Different SLIMs are available to simulate interest rate products (swaps, FRAs, bonds, etc.), CDSs and some option products. The simulation performance with the new SLIMs is spectacular.

 Solaris provides everything that is required for Algo to take advantage of the new server line's performance and energy efficiency:
  • Sun Studio Express has been optimized to generate efficient SIMD instructions
  • ZFS provides the high performance I/O
  • Hyperthreading:  Solaris can easily take advantage of the heavily threaded architecture.
  • Turbo Boost.  Yes, it kicks in immediately when a SLIM is launched
  • QuickPath chip-to-chip interconnect.  Solaris is NUMA aware.


 Here are the answers to some questions that you may have about SLIMs on Solaris and the Xeon® Processor 5570:

 Q: Are SLIMs I/O bound or CPU bound?
 A: SLIMs require both the computational and the I/O subsystems to be excellent.  For the data sets that have been provided by Algo, running on the X4270 with ZFS striped over 12 internal hard drives, results vary.  The majority of the SLIMs are CPU bound, but a few are I/O bound.

 Q: What are the computational characteristics of SLIMs running on the X4270?
 A: The number of threads to be used is specified by the user.  Most of the SLIMs scale well up to 8 threads and some scale all the way up to 16 threads.  SLIMs that don't scale to 16 threads on the X4270 with internal hard drives are primarily I/O bound and benefit from even faster storage. (Hint: check this Algo blog again in the future for more discussion regarding faster I/O devices.)

 Q: What are the I/O characteristics of SLIMs running on the X4270 with ZFS?
 A: The I/O pattern of SLIMs is large-block sequential writes.  ZFS caches the writes and flushes the cache approximately once every 10 seconds.  Each hard drive hits peaks of 80 MB/second.  With 12 striped internal hard drives, the total system throughput can reach close to 1.0 GB/second.
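 As a quick sanity check on that aggregate number: 12 drives x ~80 MB/second per drive is roughly 960 MB/second, consistent with the observed throughput of close to 1.0 GB/second.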

 Q: Is external storage required for SLIMs?
 A: 14 internal drives (plus 2 for mirrored OS disks) at 146GB each can hold a data set upwards of 1.7TB.  If your data fits, internal drives will be a cost-effective solution.

 Q: Is hardware RAID required for the SLIMs data?  Background: RAID can potentially fulfill needs including (a) striping to create larger storage units from smaller disks, (b) data redundancy so that you don't lose important data, and (c) increasing I/O performance via caching and fast checksum block computation.
 A: No, hardware RAID is not required.  (a) ZFS has a full range of RAID capabilities.  (b) The cube of data produced by SLIMs is normally considered to be temporary data that can be recalculated if necessary, and therefore redundancy is not required.  (c) If redundancy is required for your installation, RAID-Z has been shown to have a negligible impact on SLIMs' I/O performance. The SLIMs' write-intensive I/O pattern will blow through cache and be bound by disk write performance, so there is no advantage to adding an additional layer of cache.

 Q: Algo often recommends using Direct I/O filesystems instead of buffering.  Should we use Direct I/O with ZFS?
 A: No.  All ZFS I/O is buffered.  Based on comparisons against QFS with Direct I/O enabled, ZFS is recommended.

 Q: In some cases ZFS performance can be improved by disabling the ZIL or by putting the ZIL on a faster device in a ZFS hybrid storage pool.  Does it help SLIMs performance?
 A: No. SLIMs' I/O is not synchronous. SSD for ZIL will not improve SLIMs' performance when writing to direct-attached storage.

 Q: Is the use of Power Management recommended?
 A: Yes.  PowerTOP was used to enable and monitor the CPU power management.  When the machine is idle, power is reduced.  When SLIMs are executing, the CPUs quickly jump into Turbo Boost mode.  There was no significant performance difference between running SLIMs with and without power management enabled.

Q: The Sun Fire X4270 DDR3 memory can be configured to run at 800MHz, 1066MHz or 1333MHz.  Does the DDR3 speed affect SLIMs' performance?
A: Yes, several of the SLIMs (but not all) run better on systems configured to run DDR3 at 1333MHz.

Q: Would it be better to deploy SLIMs on a blade server (like the X6275 Server Module) or a rack-mounted server (like the X4270)?
A: Again, the answer revolves around storage.  If the SLIMs' time series data fits onto the internal disk drives, the rack-mounted server will be a cost-effective solution.  If your time series data is greater than 1.5 TB, it will be necessary to use external storage, and the blade server will be a more cost-effective solution.



 Platform tested:

      • Sun Fire X4270 2RU rack-mount server
      • 2 Intel® Xeon® Processor 5500 Series CPUs
      • 14 x 146GB 10K RPM disk drives
            o 2 for the operating system
            o 12 for data storage
            o 2 empty slots
      • Memory configurations ranged from 24GB to 72GB
