
Jeff Taylor's Weblog

Recent Posts


libCrun.so.1: version 'SUNW_1.6' not found

We were getting the errors below when trying to launch a program on Solaris 11.2:

ld.so.1: am: fatal: libCrun.so.1: version 'SUNW_1.6' not found (required by file prog)
ld.so.1: am: fatal: /usr/lib/libCrun.so.1: wrong ELF class: ELFCLASS32

I quickly recognized the ELFCLASS32, but needed to do a little research to get to the bottom of the "libCrun.so.1: version 'SUNW_1.6' not found".

1) fatal: /usr/lib/libCrun.so.1: wrong ELF class: ELFCLASS32

The application was picking up the 32-bit version of the library. It turned out that the 64-bit libraries were already installed on the system, and it was only necessary to modify the environment to ensure that the 64-bit libraries would be picked up at runtime.

One solution is to set one of the LD_LIBRARY_PATH variants; however, it is generally a bad idea to set any of those environment variables, except for short-term testing of an application or library. If an application routinely needs one or more of the LD_LIBRARY_PATH variables, the app was probably not built correctly. I would first check whether it was built with a correct runpath (linker -R option). For a full discussion of what's wrong with LD_LIBRARY_PATH and how to avoid using it, see these classic blog entries:

https://blogs.oracle.com/rie/ldlibrarypath-just-say-no
https://blogs.oracle.com/ali/avoiding-ldlibrarypath:-the-options

It turned out that our problem was that LD_LIBRARY_PATH was set and /usr/lib was BEFORE /usr/lib/sparcv9. If you choose to use one of the LD_LIBRARY_PATH variables for testing, LD_LIBRARY_PATH_64 is the better choice: "The 64-bit dynamic linker's search path can be completely overridden using the LD_LIBRARY_PATH_64 environment variable." (see https://docs.oracle.com/cd/E19455-01/806-0477/dev-env-7/index.html). The advantage of LD_LIBRARY_PATH_64 is that it only applies to 64-bit programs. If you modify LD_LIBRARY_PATH to solve the 64-bit problem, a side effect can be that some 32-bit programs fail because they pick up the 64-bit libraries.

FYI, here are some commands that were useful during the investigation.

a) Find out which package supplies libCrun.so.1:

# pkg search -l -H -o pkg.name libCrun.so.1
system/library/c++-runtime

b) If necessary, install the package:

# pkg install system/library/c++-runtime
Linked image publisher check
No updates necessary for this image.
...

c) What is the path to the various versions of the library?

# pkg search libCrun.so.1
INDEX      ACTION VALUE                        PACKAGE
basename   file   usr/lib/amd64/libCrun.so.1   pkg:/system/library/c++-runtime@0.5.11-0.175.2.0.0.37.0
basename   file   usr/lib/sparcv9/libCrun.so.1 pkg:/system/library/c++-runtime@0.5.11-0.175.2.0.0.37.0
basename   file   usr/lib/libCrun.so.1         pkg:/system/library/c++-runtime@0.5.11-0.175.2.0.0.37.0

d) Which library is which?

# file /usr/lib/libCrun.so.1
/usr/lib/libCrun.so.1:  ELF 32-bit MSB dynamic lib SPARC32PLUS Version 1, V8+ Required, dynamically linked, not stripped, no debugging information available
# file /usr/lib/sparcv9/libCrun.so.1
/usr/lib/sparcv9/libCrun.so.1:  ELF 64-bit MSB dynamic lib SPARCV9 Version 1, dynamically linked, not stripped, no debugging information available
# file /usr/lib/amd64/libCrun.so.1
/usr/lib/amd64/libCrun.so.1:    cannot open: No such file or directory

2) libCrun.so.1: version 'SUNW_1.6' not found

The solution to an error similar to "libCrun.so.1: version 'SUNW_1.6' not found" is to use a build machine that is at a lower version than the target execution machine.

If you have a C++ application that you are planning to distribute, you should review "Using and Redistributing Oracle Developer Studio Libraries in an Application", which is about redistributing libraries that are provided only as part of Studio. The libstl library is used as an example: an application using those libraries may not run unless the libraries are distributed with it. The article explains how to ship the library with your app.

Libraries that are distributed with Solaris (like libCrun) should NOT be distributed with your app. In fact, you don't have permission to do so; the Distribution README does not list these libraries as redistributable. The reason is that a program could potentially link to two copies of the runtime library -- the one shipped with your app and the one from /usr/lib.

If you encounter an error similar to "libCrun.so.1: version 'SUNW_1.6' not found", you have several choices:

1. Build on a system with an older version of the Solaris library,
2. Require that clients install the newer Solaris library on their systems, or
3. For a program whose executable does not use any shared libraries other than the basic system libraries (like libc), link the C++ runtime library statically.

Again, here are some commands that were useful during the investigation.

Version 1.6 is required by the app:

# pvs -r /weblogic/OFSAA_HOME/ficdb/bin/am
        libsocket.so.1 (SUNW_0.7);
        libnsl.so.1 (SUNW_1.7, SUNWprivate_1.1);
        libpthread.so.1 (SUNW_1.2);
        librt.so.1 (SUNW_0.7);
        libCstd.so.1 (SUNW_1.1.1, SUNW_1.3);
        libCrun.so.1 (SUNW_1.6);
        libc.so.1 (SUNW_1.1);

The node that fails is at Solaris 11.2:

# pkg info entire
          Name: entire
       Summary: Incorporation to lock all system packages to the same build
   Description: This package constrains system package versions to the same
                build.  WARNING: Proper system update and correct package
                selection depend on the presence of this incorporation.
                Removing this package will result in an unsupported system.
      Category: Meta Packages/Incorporations
         State: Installed
     Publisher: solaris
       Version: 0.5.11
 Build Release: 5.11
        Branch: 0.175.2.0.0.42.0
Packaging Date: Tue Jun 24 19:38:32 2014
          Size: 5.46 kB
          FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.2.0.0.42.0:20140624T193832Z

Its library doesn't supply version 1.6:

# pvs /usr/lib/64/libCrun.so.1
        libc.so.1 (SUNW_1.1, SUNWprivate_1.1);
        libCrun.so.1;
        SUNW_1.1;
        SUNW_1.2;
        SUNW_1.3;
        SUNW_1.4;
        SUNW_1.5;

A node that successfully runs the application is at Solaris 11.3 SRU 18:

# pkg info entire
             Name: entire
          Summary: entire incorporation including Support Repository Update (Oracle Solaris 11.3.18.6.0).
      Description: This package constrains system package versions to the same
                   build.  WARNING: Proper system update and correct package
                   selection depend on the presence of this incorporation.
                   Removing this package will result in an unsupported system.
                   For more information see: https://support.oracle.com/rs?type=doc&id=2045311.1
         Category: Meta Packages/Incorporations
            State: Installed
        Publisher: solaris
          Version: 0.5.11 (Oracle Solaris 11.3.18.6.0)
    Build Release: 5.11
           Branch: 0.175.3.18.0.6.0
   Packaging Date: Wed Mar 22 19:24:51 2017
             Size: 5.46 kB
             FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.3.18.0.6.0:20170322T192451Z

Its library does supply version 1.6:

# pvs /usr/lib/64/libCrun.so.1
        libc.so.1 (SUNW_1.1, SUNWprivate_1.1);
        libCrun.so.1;
        SUNW_1.1;
        SUNW_1.2;
        SUNW_1.3;
        SUNW_1.4;
        SUNW_1.5;
        SUNW_1.6;

The fundamental behavior is described in the Oracle Solaris 11.1 Linkers and Libraries Guide, "Internal Versioning".

One possible workaround to get you going *today* is to set LD_NOVERSION. This isn't a good workaround, and you should quickly move to a longer-term solution.

Before:

# ldd /weblogic/OFSAA_HOME/ficdb/bin/am | grep run
        libCrun.so.1 =>  /usr/lib/64/libCrun.so.1
        libCrun.so.1 (SUNW_1.6) =>       (version not found)

After:

# export LD_NOVERSION=yes
# ldd /weblogic/OFSAA_HOME/ficdb/bin/am | grep run
        libCrun.so.1 =>  /usr/lib/64/libCrun.so.1

From the ld.so.1 man page:

    LD_NOVERSION, LD_NOVERSION_32, and LD_NOVERSION_64
        By default, the runtime linker verifies version dependencies for the
        primary executable and all of its dependencies. When LD_NOVERSION is
        set to any non-null value, the runtime linker disables this version
        checking.

In general, the LD_NOVERSION environment variable is not suitable for the version problem described here. When a new symbol is added to a C++ system library, it goes under a new version number. If you disable the version checking, the program still may not run, because it may need a symbol that is not present.

Hope this helps...
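If you suspect the ELFCLASS32 half of the problem is really a build issue, a quick check is to look at the runpath and version requirements recorded in the binary, and to embed a 64-bit runpath at link time instead of relying on LD_LIBRARY_PATH. A minimal sketch, assuming a hypothetical prog.cc compiled with the Studio CC compiler; the -R path is illustrative, not taken from the application above:

# Build 64-bit and record a runpath in the executable (illustrative path):
$ CC -m64 -o prog prog.cc -R/opt/myapp/lib/sparcv9

# Confirm the runpath that was embedded:
$ elfdump -d prog | egrep 'RUNPATH|RPATH'

# Confirm which libraries and versions the executable will demand at runtime:
$ pvs -r prog

# Short-term test only: influence just the 64-bit search path
$ LD_LIBRARY_PATH_64=/usr/lib/sparcv9 ./prog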


Demangling Java symbols on Solaris 11.3

I was investigating a Java application problem on a SPARC server running Oracle Solaris 11.3 SRU 18 and was having trouble understanding the Java stack traces. The symbols in a Java stack trace are not necessarily human readable:

$ pstack 17510/3
17510:    /weblogic/jdk1.8.0_121/bin/java -server -d64 -Xms32m -Xmx200m -XX:MaxP
------------  lwp# 3 / thread# 3  ---------------
 ffffffff6d04e3c4 lwp_cond_wait (10026fa48, 10026fa30, 0, 0)
 ffffffff432846e8 __1cCosNPlatformEventEpark6M_v_ (10026fa00, 233d40, 233c00, 10026fa30, ffffffff437cd708, 10026fa20) + 100
 ffffffff432165ec __1cHMonitorFIWait6MpnGThread_l_i_ (100272670, 10026e800, 0, 8, 0, 0) + a4
 ffffffff432174e0 __1cHMonitorEwait6Mblb_b_ (100272670, 10026e800, 0, 0, 1, 5b659c) + 378
 ffffffff42d34080 __1cNGCTaskManagerIget_task6MI_pnGGCTask__ (1002725f0, 0, 100278328, ffffffff435de094, ffffffff437cd708, 100278240) + a8
 ffffffff42d3675c __1cMGCTaskThreadDrun6M_v_ (10026e800, 0, 10040a360, ffffffff436554a0, ffffffff438dd6cd, 3d8) + e4
 ffffffff43275f50 java_start (10026e800, 228000, 228344, 194800, ffffffff437cd708, ffffffff42be6558) + 388
 ffffffff6d04931c _lwp_start (0, 0, 0, 0, 0, 0)

It is expected that c++filt can demangle the Java symbols, but I found two versions and wasn't sure which to use:

# pkg search -l c++filt
INDEX      ACTION VALUE                               PACKAGE
basename   file   opt/developerstudio12.5/bin/c++filt pkg:/developer/developerstudio-125/c++@12.5-1.0.0.0
basename   file   usr/gnu/bin/c++filt                 pkg:/developer/gnu-binutils@2.23.1-0.175.3.0.0.30.0

The version of c++filt that comes with pkg:/developer/gnu-binutils@2.23.1-0.175.3.0.0.30.0 did NOT help:

$ pstack 17510/3 | /usr/gnu/bin/c++filt
17510:    /weblogic/jdk1.8.0_121/bin/java -server -d64 -Xms32m -Xmx200m -XX:MaxP
------------  lwp# 3 / thread# 3  ---------------
 ffffffff6d04e3c4 lwp_cond_wait (10026fa48, 10026fa30, 0, 0)
 ffffffff432846e8 __1cCosNPlatformEventEpark6M_v_ (10026fa00, 233d40, 233c00, 10026fa30, ffffffff437cd708, 10026fa20) + 100
 ffffffff432165ec __1cHMonitorFIWait6MpnGThread_l_i_ (100272670, 10026e800, 0, 8, 0, 0) + a4
 ffffffff432174e0 __1cHMonitorEwait6Mblb_b_ (100272670, 10026e800, 0, 0, 1, 5b659c) + 378
 ffffffff42d34080 __1cNGCTaskManagerIget_task6MI_pnGGCTask__ (1002725f0, 0, 100278328, ffffffff435de094, ffffffff437cd708, 100278240) + a8
 ffffffff42d3675c __1cMGCTaskThreadDrun6M_v_ (10026e800, 0, 10040a360, ffffffff436554a0, ffffffff438dd6cd, 3d8) + e4
 ffffffff43275f50 java_start (10026e800, 228000, 228344, 194800, ffffffff437cd708, ffffffff42be6558) + 388
 ffffffff6d04931c _lwp_start (0, 0, 0, 0, 0, 0)

The version of c++filt that comes with Oracle Developer Studio successfully demangled the symbols:

$ pstack 17510/3 | /opt/developerstudio12.5/bin/c++filt
17510:    /weblogic/jdk1.8.0_121/bin/java -server -d64 -Xms32m -Xmx200m -XX:MaxP
------------  lwp# 3 / thread# 3  ---------------
 ffffffff6d04e3c4 lwp_cond_wait (10026fa48, 10026fa30, 0, 0)
 ffffffff432846e8 void os::PlatformEvent::park() (10026fa00, 233d40, 233c00, 10026fa30, ffffffff437cd708, 10026fa20) + 100
 ffffffff432165ec int Monitor::IWait(Thread*,long) (100272670, 10026e800, 0, 8, 0, 0) + a4
 ffffffff432174e0 bool Monitor::wait(bool,long,bool) (100272670, 10026e800, 0, 0, 1, 5b659c) + 378
 ffffffff42d34080 GCTask*GCTaskManager::get_task(unsigned) (1002725f0, 0, 100278328, ffffffff435de094, ffffffff437cd708, 100278240) + a8
 ffffffff42d3675c void GCTaskThread::run() (10026e800, 0, 10040a360, ffffffff436554a0, ffffffff438dd6cd, 3d8) + e4
 ffffffff43275f50 java_start (10026e800, 228000, 228344, 194800, ffffffff437cd708, ffffffff42be6558) + 388
 ffffffff6d04931c _lwp_start (0, 0, 0, 0, 0, 0)

Now, back to solving the actual problem that I was investigating...
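As a quick sanity check before piping a whole pstack through it, you can feed a single mangled name to the Studio c++filt (path as installed above); the symbol and its demangled form here are taken from the trace above:

$ echo __1cHMonitorEwait6Mblb_b_ | /opt/developerstudio12.5/bin/c++filt
bool Monitor::wait(bool,long,bool)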


CDH 5.5 hive and hadoop start up problems on Solaris

This was a bit of a puzzler that I don't think many other people will have to deal with, but I thought that it was interesting:

% hive
/opt/cloudera/cdh5.5.0/hive-1.1.0-cdh5.5.0/bin/hive: line 184: [: too many arguments
Usage: dirname [ path ]
/opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/bin/hadoop: line 26: /opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/libexec/hadoop-config.sh: No such file or directory
/opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/bin/hadoop: line 144: exec: : not found
Unable to determine Hadoop version information.
'hadoop version' returned:
Usage: dirname [ path ]
/opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/bin/hadoop: line 26: /opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/libexec/hadoop-config.sh: No such file or directory
/opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/bin/hadoop: line 144: exec: : not found
%

What?? But the hive startup script used to work! It works when logged in as another user! It works when run on Linux! What went wrong?

A little detective work showed this:

% bash -x hive
...
'[' -f Running .cshrc /opt/cloudera/cdh5.5.0/hadoop-2.6.0-cdh5.5.0/bin/hadoop ']'
...

OK, so someone was just trying to debug .cshrc. Let's try running the script from a bash shell ... same problem! Additionally, the hive startup script should be running with bash in any case, because it has this at the top:

#!/usr/bin/env bash

It turned out that the HADOOP_IN_PATH variable contained the "Running .cshrc" string:

HADOOP_IN_PATH=`which hadoop 2>/dev/null`
if [ -f ${HADOOP_IN_PATH} ]; then
  HADOOP_DIR=`dirname "$HADOOP_IN_PATH"`/..
fi

This is because on Solaris, "which" is a C shell script, so whatever .cshrc echoes ends up in the command substitution:

$ head -3 /bin/which
#! /usr/bin/csh -f
#
# Copyright 2005 Sun Microsystems, Inc.  All rights reserved.

This was not a problem on Linux, because there "which" is an executable, not a C shell script:

$ file /usr/bin/which
/usr/bin/which: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), stripped

One solution on Solaris is to use the GNU which program, "gwhich", in the hive and hadoop startup scripts. Or, of course, solve the problem by taking the echos out of .cshrc.
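For reference, here is a sketch of the failure mode and one possible guard; the gwhich fallback and the quoting are my own suggestions (not the shipped CDH code), so treat this as illustrative rather than a drop-in patch:

# The csh-based /bin/which sources ~/.cshrc, so anything .cshrc echoes ends up
# in the command substitution along with the path:
HADOOP_IN_PATH=`which hadoop 2>/dev/null`    # may contain "Running .cshrc /path/to/hadoop"

# Illustrative guard: prefer GNU which when present, and quote the test so a
# polluted value fails cleanly instead of producing "[: too many arguments".
if command -v gwhich >/dev/null 2>&1; then
  HADOOP_IN_PATH=`gwhich hadoop 2>/dev/null`
fi
if [ -f "${HADOOP_IN_PATH}" ]; then
  HADOOP_DIR=`dirname "$HADOOP_IN_PATH"`/..
fi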


Table of Contents

Java Stored Procedures and SQLJUTL2.INVOKE
sqlplus and DYLD_LIBRARY_PATH on Mac OS/X
Python: Old Dog Learns New Trick
Hello World with Oracle PL/SQL
useradd with "-d localhost:...".
Cloudera Hive client on Solaris
I am unfriending hive-env.sh
"ipadm show-addr" with name resolution on Solaris 11
SPARC T5-4 RAC and WebLogic Cluster
DNS Bind Server configuration on Solaris 11.2
Oracle SQL Connector for HDFS on SPARC
Big Data Lite with a static IP address on Oracle Linux 6 / EL6
(OATS) Oracle Application Testing Suite Report
JAVA_HOME on Solaris 11
Enterprise Manager agentTZRegion
Onion Security
netperf on Solaris 11
emca and ORA-12537: TNS:connection closed
Installing VNC server on Solaris 11
Quadratic Programming with Oracle R Enterprise
SPARC T5-4 LDoms for RAC and WebLogic Clusters
VNC Cut & Paste on Solaris 10
Using R to analyze Java G1 garbage collector log files
What compiler options were used to create an executable?
Quadratic data in Oracle R Enterprise and Oracle Data Mining
Finding datasets for South African Heart Disease
R attributes and regexpr
Redo Log Switches
Gathering Database Statistics in a Test Harness
Hadoop on an Oracle SPARC T4-2 Server
Error in invoking target 'all_no_orcl' of makefile
Hive 0.11 (May, 15 2013) and Rank() within a category
Ganglia on Solaris 11.1
Adding users in Solaris 11 with power like the initial account
Debugging Hadoop using Solaris Studio in a Solaris 11 Zone
non-interactive zone configuration
Hadoop Java Error logs
Solaris 11 VNC Server is "blurry" or "smeared"
ZFS for Database Log Files
ndd on Solaris 10
Java EE Application Servers, SPARC T4, Solaris Containers, and Resource Pools
What is bondib1 used for on SPARC SuperCluster with InfiniBand, Solaris 11 networking & Oracle RAC?
Watch out for a trailing slash on $ORACLE_HOME
User "oracle" unable to start or stop listeners
NFS root access for Oracle RAC on Sun ZFS Storage 7x20 Appliance
Flash Archive with MPXIO
Solaris installation on a SPARC T3 from a remote CDROM ISO
11gR2 Grid Infrastructure Patch Set for Solaris x86-64
Run level verification check failed when installing 11.2.0.2 Grid Upgrade (p10098816)
Useful GNU grep option
Adding a hard drive for /export/home under ZFS
Java Monitoring and Tuning Example
Installing the Solaris 10 OS Companion CD
Solaris installation on a SPARC T3 from a remote CDROM ISO
OpenOffice Calc cut&paste to Thunderbird
NFS Tuning for HPC Streaming Applications
Graphing Solaris Performance Stats with gnuplot
Solaris/x64 VNC with Cut & Paste
Solaris Containers & 32-bit Java Heap allocation
Algorithmics Financial Risk Management Software on the Sun Fire X4270
Jumpstart: ERROR: could not load the media (/cdrom)
Fino's Additions for wt.properties
Configuring jumbo frames on the V490’s ce and the T2000's e1000g
FAQ for Windchill on Solaris
JVM Tuning for Windchill


Java Stored Procedures and SQLJUTL2.INVOKE

BEFORE: I was having trouble accessing Java Stored Procedures from an Oracle 12c client:

java.sql.SQLException: ORA-06550: line 1, column 13:
PLS-00201: identifier 'SQLJUTL2.INVOKE' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:450)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:399)
    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1059)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:522)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587)
    at oracle.jdbc.driver.T4CCallableStatement.doOall8(T4CCallableStatement.java:220)
    at oracle.jdbc.driver.T4CCallableStatement.doOall8(T4CCallableStatement.java:48)
    at oracle.jdbc.driver.T4CCallableStatement.executeForRows(T4CCallableStatement.java:938)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1150)
    at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4798)
    at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:4875)
    at oracle.jdbc.driver.OracleCallableStatement.executeUpdate(OracleCallableStatement.java:5661)
    at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeUpdate(OraclePreparedStatementWrapper.java:1361)
    at oracle.jpub.reflect.Client.invoke(Client.java:97)
    at JavaStoredProcedureStubs.RecommendationEngine.newInstance(RecommendationEngine.java:31)
    at client.myClient.doit(myClient.java:80)
    at client.myClient.main(myClient.java:126)

The problem was also affecting Oracle's jpub client:

$ jpub -user=scott/tiger -url=jdbc:oracle:thin:@myserver:1521/orcl -java=java_stored_procedures.StoredProcedure1
Note: Oracle Databases 10g Release 2 or later is recommended for publishing server-side Java.
ERROR: Unable to obtain information on server-side Java classes: java.sql.SQLException: ORA-06550: line 1, column 13:
PLS-00201: identifier 'SQLJUTL2.REFLECT' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
 Please ensure you have installed [ORACLE_HOME]/sqlj/lib/sqljutl.sql and sqljutl.jar.
J2T-106, ERROR: Sorry, unable to continue due to oracle.jpub.JPubException: ORA-06550: line 1, column 13:
PLS-00201: identifier 'SQLJUTL2.REFLECT' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored

WORKAROUND:

$ sqlplus scott/tiger@myserver:1521/orcl @$ORACLE_HOME/jpub/sql/sqljutl2.sql

SQL*Plus: Release 12.1.0.2.0 Production on Tue Jun 16 07:05:43 2015
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Package created.

Package body created.

SQL>

AFTER:

$ jpub -user=scott/tiger -url=jdbc:oracle:thin:@myserver:1521/orcl -java=java_stored_procedures.StoredProcedure1
ERROR: error reflecting Java class in server: reflect("java_stored_procedures.StoredProcedure1",91)
./java_stored_procedures/StoredProcedure1.java

$ ls -l java_stored_procedures/
total 13
-rw-r--r--   1 oracle   oinstall    2233 Jun 16 07:06 StoredProcedure1.class
-rw-r--r--   1 oracle   oinstall    2867 Jun 16 07:06 StoredProcedure1.java

COMMENTS:
1) The workaround is NOT required for 11g clients.
2) This problem in the 12c client affects BOTH 11g and 12c servers.
3) The workaround must be run as the current user (e.g. scott/tiger). Running sqljutl2.sql as 'SYS as SYSDBA' is NOT sufficient.
4) The workaround is necessary even when "describe sys.SQLJUTL2" indicates that FUNCTION REFLECT RETURNS LONG exists in the server.
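If you want to check the state of things from the command line before and after running the workaround, one hedged way is the snippet below; the credentials and connect string are reused from the example above, and note that, per comment 4, a successful describe is not by itself proof that the 12c client will work:

$ sqlplus scott/tiger@myserver:1521/orcl <<'EOF'
describe sys.SQLJUTL2
select object_name, object_type, status from all_objects where object_name = 'SQLJUTL2';
EOF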


sqlplus and DYLD_LIBRARY_PATH on Mac OS/X

At some point in the past I followed the directions in "Oracle Database Client Installation Guide for Apple Mac OS X (Intel)" so that I could use sqlplus on my MacBook. Following the directions, I set the DYLD_LIBRARY_PATH environment variable in my .bashrc. Today, I noticed this:

$ brew doctor
Please note that these warnings are just used to help the Homebrew maintainers
with debugging if you file an issue. If everything you use Homebrew for is
working fine: please don't worry and just ignore them. Thanks!

Warning: Setting DYLD_* vars can break dynamic linking.
Set variables:
    DYLD_LIBRARY_PATH: :/Applications/instantclient_11_2

Hmm, what to do? Ignore? Fix? Google? The top Google hit was "Oracle sqlplus and instant client on Mac OS/X without DYLD_LIBRARY_PATH". Everything in Casey's blog checked out:

$ which sqlplus
/Applications/instantclient_11_2/sqlplus

$ otool -L /Applications/instantclient_11_2/sqlplus
/Applications/instantclient_11_2/sqlplus:
    /ade/b/1891624078/oracle/sqlplus/lib/libsqlplus.dylib (compatibility version 0.0.0, current version 0.0.0)
    /ade/b/2649109290/oracle/rdbms/lib/libclntsh.dylib.11.1 (compatibility version 0.0.0, current version 0.0.0)
    /ade/b/2649109290/oracle/ldap/lib/libnnz11.dylib (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)

When I unset the variable, sqlplus stopped working:

$ unset DYLD_LIBRARY_PATH
$ sqlplus
dyld: Library not loaded: /ade/b/1891624078/oracle/sqlplus/lib/libsqlplus.dylib
  Referenced from: /Applications/instantclient_11_2/sqlplus
  Reason: image not found
Trace/BPT trap: 5

SIMPLE SOLUTION: I was about to follow Casey's instructions when a simpler solution popped into my mind. I removed the DYLD_LIBRARY_PATH environment variable from my ~/.bashrc and replaced it with an alias:

alias sqlplus="DYLD_LIBRARY_PATH=/Applications/instantclient_11_2 sqlplus"

After killing the OS X Terminal and restarting, and verifying that the environment variable was gone and the alias was present, both "brew doctor" and "sqlplus" were happy.

DISCLAIMER: My alias only addresses sqlplus usage. If you are using more Instant Client functionality, you may need to use Casey's solution or bluebinary's solution. I can't vouch for either of them, but both approaches seem reasonable.
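If you also need sqlplus from scripts or cron, where shell aliases don't apply, a small wrapper script is one hedged alternative; the ~/bin location is illustrative and assumes ~/bin precedes /Applications/instantclient_11_2 in your PATH:

$ cat > ~/bin/sqlplus <<'EOF'
#!/bin/bash
# Set the variable only for this one invocation, then hand off to the real binary.
export DYLD_LIBRARY_PATH=/Applications/instantclient_11_2
exec /Applications/instantclient_11_2/sqlplus "$@"
EOF
$ chmod +x ~/bin/sqlplus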


Python: Old Dog Learns New Trick

I'm now a confident Python programmer. I've recently wrapped up a sequence of two online classes that focus on programming with Python.

MITx: 6.00.1x Introduction to Computer Science and Programming Using Python - Covered Python data types, flow control, iteration, numerical methods, functions, scope, recursion, Python data structures such as lists, sets and dictionaries, comprehensions, debugging, assertions and exceptions, efficiency (Big O), sorting, search, hashing, trees, classes, object-oriented programming, and inheritance.

MITx: 6.00.2x Introduction to Computational Thinking and Data Science - Covered plotting using PyLab, simulations and random walks, probability, stochastic programming, hashing, histograms, Monte Carlo simulations, curve fitting, prediction, graph traversal algorithms (used to optimize cost or distance), feature vectors, distance metrics and data clustering.

Development Tools - The course material focused on IDLE, but I chose instead to use a combination of PyDev (the Python IDE for Eclipse), Spyder and IPython Notebook.

Why Python? I've observed that, in general, the use of Python has been expanding. Current relevant usage includes the Apache Spark Python API (PySpark), OpenStack development, and some Oracle SuperCluster management tools.

Was the course hard? Both courses included video lectures, programming exercises, and exams. For me, it was more "time consuming" than "hard", largely because I came to Python already knowing C, C++, Java, R, bash and some others.

Were the courses interesting? Two modules that I found to be particularly interesting were the knapsack problem and graph traversal. This was the first time that I had written code for either one. The concepts didn't seem odd, partially because I've previously spent plenty of time working with tree structures.

Do I like Python? Yes. The syntax and semantics are enough like C++ and Java that the language quickly felt familiar. There are pros and cons to dynamically typed vs. statically typed languages; both approaches will be used for the foreseeable future with no clear winner. I definitely like having a REPL, and I'm eagerly awaiting the REPL in Java 9.

To Do - I'd like to get more experience with Python packages including NumPy, SciPy and pandas.


Hello World with Oracle PL/SQL

Hello World with Oracle PL/SQL. The most boring blog entry, ever.

  Obi-Wan: "These aren't the droids you're looking for."
  Agent Kay: "Nothing to see here. Move along folks."

set serveroutput on;

--------------------
-- Drop package body hw;
-- drop package hw;

CREATE OR REPLACE PACKAGE hw AS
   PROCEDURE hello_world;
END hw;
/

CREATE OR REPLACE PACKAGE BODY hw AS
  PROCEDURE hello_world
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str);
  END hello_world;
END hw;
/

begin
  hw.hello_world;
end;
/

--------------------
-- Drop package body hw_2;
-- drop package hw_2;

CREATE OR REPLACE PACKAGE hw_2 AS
   PROCEDURE hello_world_2(
     num IN NUMBER);
END hw_2;
/

CREATE OR REPLACE PACKAGE BODY hw_2 AS
  PROCEDURE hello_world_2 (
    num IN NUMBER)
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str);
    DBMS_OUTPUT.put_line (num+1);
  END hello_world_2;
END hw_2;
/

begin
  hw_2.hello_world_2(3);
end;
/

--------------------
-- Drop package body hw_34;
-- drop package hw_34;

CREATE OR REPLACE PACKAGE hw_34 AS
   PROCEDURE hello_world_3;
   PROCEDURE hello_world_4(
     num IN NUMBER);
END hw_34;
/

CREATE OR REPLACE PACKAGE BODY hw_34 AS
  PROCEDURE hello_world_3
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str);
  END hello_world_3;
  PROCEDURE hello_world_4 (
    num IN NUMBER)
  IS
    hw_str VARCHAR2 (42) := 'Hello World!';
  BEGIN
    DBMS_OUTPUT.put_line (hw_str);
    DBMS_OUTPUT.put_line (num+2);
  END hello_world_4;
END hw_34;
/

begin
  hw_34.hello_world_3;
  hw_34.hello_world_4(3);
end;
/
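To try the listing end to end, one approach (a hedged example; the file name hello_world.sql and the scott/tiger credentials are placeholders) is to save the listing to a file and run it through sqlplus, where each anonymous block should print its "Hello World!" output:

$ sqlplus scott/tiger @hello_world.sql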


Cloudera Hive client on Solaris

Goal: From an Oracle Solaris server, access Hive data that is stored in a Cloudera Hadoop cluster; for example, access Hive data in an Oracle Big Data Appliance from a Solaris SPARC RDBMS server. This blog assumes that the Cloudera Hadoop cluster is already up and running. To test, I started with an Oracle Big Data Lite Virtual Machine that was installed using the process documented at "Big Data Lite with a static IP address on Oracle Linux 6 / EL6".

Step 1: On the Solaris server, create an "oracle" user

The user doesn't have to be "oracle", but that fits well with my planned usage.

# /usr/sbin/groupadd oinstall
# /usr/sbin/groupadd dba
# /usr/sbin/useradd -d localhost:/export/home/oracle -m -g oinstall -G dba oracle
# passwd oracle
# echo "oracle ALL=(ALL) ALL" > /etc/sudoers.d/oracle
# su - oracle

Set up a VNC server using the process at "Installing VNC server on Solaris 11".

Step 2: Visit the Cloudera Manager to determine the version of Cloudera that is running

Click on the "Support" pull-down and then "About". In my case, I was using Cloudera Express 5.1.2.

Step 3: On the Oracle Solaris server, install the Hive and Hadoop tarballs that match your cluster

Visit Cloudera Downloads and click on "CDH Download Now". Choose your version; in my case, CDH 5.1.2. This took me to the Packaging and Tarballs page for all of the CDH components. Download the Apache Hadoop tarball and the Apache Hive tarball.

Unpack the tarballs on your Solaris server:

$ tar xzf Downloads/hadoop-2.3.0-cdh5.1.2.tar.gz
$ tar xzf Downloads/hive-0.12.0-cdh5.1.2.tar.gz

And verify:

$ ls hadoop-2.3.0-cdh5.1.2
bin                  examples             libexec
bin-mapreduce1       examples-mapreduce1  sbin
cloudera             include              share
etc                  lib                  src

$ ls hive-0.12.0-cdh5.1.2/
bin                examples           LICENSE            RELEASE_NOTES.txt
conf               hcatalog           NOTICE             scripts
docs               lib                README.txt

Step 4: Download the HDFS configuration files to the Solaris server

In the Cloudera Manager, go to the hdfs page and download the client configuration. Place the client configuration onto the Solaris server and unpack it:

$ unzip Downloads/hdfs-clientconfig.zip
Archive:  Downloads/hdfs-clientconfig.zip
  inflating: hadoop-conf/hdfs-site.xml
  inflating: hadoop-conf/log4j.properties
  inflating: hadoop-conf/topology.py
  inflating: hadoop-conf/topology.map
  inflating: hadoop-conf/hadoop-env.sh
  inflating: hadoop-conf/core-site.xml

Step 5: Configure the HDFS client software on the Solaris server

Edit hadoop-conf/hadoop-env.sh to set JAVA_HOME correctly:

export JAVA_HOME=/usr/jdk/instances/jdk1.7.0

Move the configuration files into place:

$ cp hadoop-conf/* hadoop-2.3.0-cdh5.1.2/etc/hadoop/

Add the Hadoop bin directory to your PATH. You may want to do this in your local shell or .bashrc:

$ export PATH=~/hadoop-2.3.0-cdh5.1.2/bin:$PATH

Step 6: Test the HDFS client software on Solaris

Verify that the remote HDFS filesystem is visible from your Solaris server:

$ hdfs dfs -ls
15/04/02 14:40:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 6 items
drwx------   - oracle oracle          0 2014-08-25 01:55 .Trash
drwx------   - oracle oracle          0 2014-09-23 09:25 .staging
drwxr-xr-x   - oracle oracle          0 2014-01-12 15:15 moviedemo
drwxr-xr-x   - oracle oracle          0 2014-09-24 05:38 moviework
drwxr-xr-x   - oracle oracle          0 2014-09-08 11:50 oggdemo
drwxr-xr-x   - oracle oracle          0 2014-09-20 09:59 oozie-oozi

Step 7: Download the Hive configuration files to the Solaris server

In the Cloudera Manager, go to the hive page and download the client configuration. Place the client configuration on the Solaris server and unpack it:

$ unzip Downloads/hive-clientconfig.zip
Archive:  Downloads/hive-clientconfig(1).zip
  inflating: hive-conf/hive-site.xml
  inflating: hive-conf/hive-env.sh
  inflating: hive-conf/log4j.properties
  inflating: hive-conf/hadoop-env.sh
  inflating: hive-conf/core-site.xml
  inflating: hive-conf/mapred-site.xml
  inflating: hive-conf/topology.py
  inflating: hive-conf/yarn-site.xml
  inflating: hive-conf/hdfs-site.xml
  inflating: hive-conf/topology.map

Step 8: Configure the Hive client software on the Solaris server

Move the configuration files into place:

$ cp hive-conf/* hive-0.12.0-cdh5.1.2/conf/

Step 9: YARN configuration for Hadoop

The HDFS configuration files don't include yarn-site.xml. Copy the YARN configuration file from the Hive tree to the Hadoop tree:

$ cp hive-0.12.0-cdh5.1.2/conf/yarn-site.xml hadoop-2.3.0-cdh5.1.2/etc/hadoop/

Step 10: Hide hive-env.sh

See: I am unfriending hive-env.sh

$ mv hive-0.12.0-cdh5.1.2/conf/hive-env.sh hive-0.12.0-cdh5.1.2/conf/hive-env.sh.HIDDEN

Add the Hive bin directory to your PATH. You may want to do this in your local shell or .bashrc:

$ export PATH=~/hive-0.12.0-cdh5.1.2/bin:$PATH

Step 11: Test the Hive client software on Solaris

Verify that the remote Hive tables are visible from your Solaris server:

$ hive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/03 12:20:37 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/hive-common-0.12.0-cdh5.1.2.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/oracle/hadoop-2.3.0-cdh5.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive> show tables;
OK
cust
movie
movie_rating
movieapp_log_avro
movieapp_log_json
movieapp_log_odistage
movieapp_log_stage
movielog
session_stats
Time taken: 1.799 seconds, Fetched: 9 row(s)
hive>

Conclusion: The Oracle Solaris server is now configured to access Hive data that is stored in the Cloudera Hadoop cluster.
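For convenience, the JAVA_HOME and PATH settings above can be made persistent, and both clients can be smoke-tested non-interactively; this is only a suggested wrap-up, with paths mirroring the versions used above:

# Append to ~/.bashrc so new shells pick up the client environment:
export JAVA_HOME=/usr/jdk/instances/jdk1.7.0
export PATH=~/hadoop-2.3.0-cdh5.1.2/bin:~/hive-0.12.0-cdh5.1.2/bin:$PATH

# Non-interactive smoke tests of the HDFS and Hive clients:
hdfs dfs -ls /
hive -e 'show tables;'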


I am unfriending hive-env.sh

BEFORE:

$ hive
usage: tail [+/-[n][lbc][f]] [file]
       tail [+/-[n][l][r|f]] [file]
/home/oracle/hive-0.12.0-cdh5.1.2/conf/hive-env.sh: line 5: ,: command not found
usage: tail [+/-[n][lbc][f]] [file]
       tail [+/-[n][l][r|f]] [file]
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:1138)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:1099)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:74)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:58)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:639)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobConf
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 11 more

HIDE hive-env.sh:

$ mv hive-0.12.0-cdh5.1.2/conf/hive-env.sh hive-0.12.0-cdh5.1.2/conf/hive-env.sh.HIDDEN

AFTER:

oracle@p3231-zone14:~$ hive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/04/03 12:20:37 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/04/03 12:20:37 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.

Logging initialized using configuration in jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/hive-common-0.12.0-cdh5.1.2.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/oracle/hadoop-2.3.0-cdh5.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/oracle/hive-0.12.0-cdh5.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive>
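Before resorting to hiding the whole file, it can be worth looking at the line the error message points to; this is only a suggested diagnostic, using the same path as above:

# Show the first few lines, including the "line 5" that bash complains about:
$ sed -n '1,10p' hive-0.12.0-cdh5.1.2/conf/hive-env.sh

# Syntax-check the script without executing it:
$ bash -n hive-0.12.0-cdh5.1.2/conf/hive-env.sh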


SPARC T5-4 RAC and WebLogic Cluster

Background for related blog entries, including:
SPARC T5-4 LDoms for RAC and WebLogic Clusters
Onion Security
DNS Bind Server configuration on Solaris 11.2
NFS root access for Oracle RAC on Sun ZFS Storage 7x20 Appliance

The system was being set up to test an application stack consisting of Oracle Financial Services Analytical Applications, a WebLogic Cluster, and Oracle Database.

Here is a brief description of the hardware that was available for the test:

Compute nodes: Oracle's SPARC T5-4 servers. Each T5-4 has:
  Sixteen-core 3.6 GHz SPARC T5 processors
  Up to 128 threads per processor for a maximum of 512 threads per system
  Sixteen floating-point units per SPARC T5 processor
  1 TB of memory (using 32x 32 GB 1,066 MHz DDR3 DIMMs)

Storage: Sun ZFS Storage Appliance 7420. This particular unit was populated with:
  4x 8-core Intel® Xeon® processors
  20 HDDs
  4 SSDs for logging
  4 SSDs for cache

The deployment was an active/active cluster: if either SPARC T5-4 server were to go down, the system would be able to continue processing transactions.

Two virtualization layers were used. Solaris/SPARC has many virtualization alternatives and features, as discussed in "Oracle Solaris 11 Virtualization Technology", allowing users to deploy virtualized environments that are specifically tailored to meet the needs of their site. For this deployment, I used:

Oracle VM Server for SPARC. "Oracle VM Server leverages the built-in SPARC hypervisor to subdivide a supported platform's resources (CPUs, memory, network, and storage) by creating partitions called logical (or virtual) domains. Each logical domain can run an independent operating system."

Solaris 11 Zones. I like Jeff Victor's blog "Comparing Solaris 11 Zones to Solaris 10 Zones": "Zones are a form of server virtualization called 'OS (Operating System) Virtualization.' They improve consolidation ratios by isolating processes from each other so that they cannot interact. Each zone has its own set of users, naming services, and other software components. One of the many advantages is that there is no need for a hypervisor, so there is no performance overhead. Many data centers run tens to hundreds of zones per server!"

Virtualized layout of the first T5-4 server:

                    Control LDom     WLS LDom (guest)                 DB LDom (guest)
  Application       n/a              OBIEE, OHS, WLS Cluster Nodes    RAC Node
  Non-Global Zones  n/a              proj-1-obiee, proj-1-ohs,        n/a
                                     proj-1-z1 .. proj-1-z15
  Global Zone       proj-1-control   proj-1-wls                       proj-1-db
  LDom              Primary          Guest                            Guest

Virtualized layout of the second T5-4 server:

                    Control LDom     WLS LDom (guest)                 DB LDom (guest)
  Application       n/a              OHS, WLS Cluster Nodes           RAC Node
  Non-Global Zones  n/a              proj-2-ohs,                      n/a
                                     proj-2-z1 .. proj-2-z15
  Global Zone       proj-2-control   proj-2-wls                       proj-2-db
  LDom              Primary          Guest                            Guest

Network connectivity: Each Solaris Zone is dual-ported to isolate network layers, as discussed in "Onion Security".

Resource allocation: two Oracle SPARC T5-4 servers simultaneously host both Oracle RAC and a WebLogic Server cluster. I chose to use Oracle VM Server for SPARC to create the cluster layout shown above. There are plenty of trade-offs and decisions that need to be made, for example: rather than configuring the system by hand, you might want to use an Oracle SuperCluster T5-8. My configuration is similar to jsavit's "Availability Best Practices - Example configuring a T5-8", but I chose to ignore some of the advice. Maybe I should have included an alternate service domain, but I decided that I already had enough redundancy.

LDom core and memory allocation:

  LDom       CPUs        Cores   Memory
  Control    0.25 CPU    4       64 GB
  App LDom   2.75 CPUs   44      704 GB
  DB LDom    1 CPU       16      256 GB

Additional detail can be found in other blog entries, including:
SPARC T5-4 LDoms for RAC and WebLogic Clusters
Onion Security
DNS Bind Server configuration on Solaris 11.2
NFS root access for Oracle RAC on Sun ZFS Storage 7x20 Appliance
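For readers who want to see what the CPU/memory carve-up above looks like in practice, here is a hedged sketch of the corresponding Oracle VM Server for SPARC commands; the domain names app-ldom and db-ldom are illustrative, and the real domains also need virtual disks, virtual networks, and (for the control domain) a delayed reconfiguration/reboot cycle:

# Control domain: 4 cores (0.25 CPU), 64 GB
ldm set-core 4 primary
ldm set-memory 64G primary

# Application domain: 44 cores (2.75 CPUs), 704 GB
ldm add-domain app-ldom
ldm set-core 44 app-ldom
ldm set-memory 704G app-ldom

# Database domain: 16 cores (1 CPU), 256 GB
ldm add-domain db-ldom
ldm set-core 16 db-ldom
ldm set-memory 256G db-ldom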


DNS Bind Server configuration on Solaris 11.2

This blog is part of the SPARC T5-4 RAC and WebLogic Cluster series:
Background
SPARC T5-4 LDoms for RAC and WebLogic Clusters
Onion Security
DNS Bind Server configuration on Solaris 11.2
NFS root access for Oracle RAC on Sun ZFS Storage 7x20 Appliance

Virtualized layout of the T5-4 servers:

                    Control LDom     WLS LDom (guest)                 DB LDom (guest)
  Application       n/a              OHS, WLS Cluster Nodes           RAC Node
  Non-Global Zones  n/a              proj-2-ohs,                      n/a
                                     proj-2-z1 .. proj-2-z15
  Global Zone       proj-2-control   proj-2-wls                       proj-2-db
  LDom              Primary          Guest                            Guest

Network connectivity: Each Solaris Zone is dual-ported to isolate network layers, as discussed in "Onion Security".

DNS configuration: Normally, I use systems where the naming is defined in the corporate DNS server. For this test, the private subnets needed a private DNS server. Here are the files that are used to configure the DNS server.

/etc/named.conf:

options {
        directory       "/var/named";
        pid-file        "/var/named/tmp/pid";
        dump-file       "/var/named/dump/named_dump.db";
        statistics-file "/var/named/named.stats";
        forward         first;
        forwarders { 130.35.249.52; 130.35.249.41; 192.135.82.132; };
};
zone "jdbc.bigcorp.com" {
        type master;
        file "jdbc.db";
};
zone "30.168.192.in-addr.arpa" {
        type master;
        file "30.168.192.db";
};
zone "http.bigcorp.com" {
        type master;
        file "jdbc.db";
};
zone "40.168.192.in-addr.arpa" {
        type master;
        file "40.168.192.db";
};
logging {
        category "default" { "debug"; };
        category "general" { "debug"; };
        category "database" { "debug"; };
        category "security" { "debug"; };
        category "config" { "debug"; };
        category "resolver" { "debug"; };
        category "xfer-in" { "debug"; };
        category "xfer-out" { "debug"; };
        category "notify" { "debug"; };
        category "client" { "debug"; };
        category "unmatched" { "debug"; };
        category "network" { "debug"; };
        category "update" { "debug"; };
        category "queries" { "debug"; };
        category "dispatch" { "debug"; };
        category "dnssec" { "debug"; };
        category "lame-servers" { "debug"; };
        channel "debug" { file "/tmp/nameddbg" versions 2 size 50m; print-time yes; print-category yes; };
};

HTTP network:

/var/named/http.db:

$TTL 3h
@       IN      SOA     proj-1-db jeff  (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
        )
             IN    NS  proj-1-db.bigcorp.com
proj-1-z1    IN    A   192.168.40.51
proj-1-z2    IN    A   192.168.40.52
proj-1-z3    IN    A   192.168.40.53
proj-1-z4    IN    A   192.168.40.54
proj-1-z5    IN    A   192.168.40.55
proj-2-z1    IN    A   192.168.40.71
proj-2-z2    IN    A   192.168.40.72
proj-2-z3    IN    A   192.168.40.73
proj-2-z4    IN    A   192.168.40.74
proj-2-z5    IN    A   192.168.40.75
proj-3-oats  IN    A   192.168.40.103
proj-4-oats  IN    A   192.168.40.104
proj-1-obiee IN    A   192.168.40.221
proj-1-ohs   IN    A   192.168.40.231

/var/named/40.168.192.db:

$TTL 3h
@       IN      SOA    proj-1-db.http.bigcorp.com. jeff.http.bigcorp.com. (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
        )
    IN  NS   proj-1-db.bigcorp.com.
51  IN  PTR  proj-1-z1.http.bigcorp.com.
52  IN  PTR  proj-1-z2.http.bigcorp.com.
53  IN  PTR  proj-1-z3.http.bigcorp.com.
54  IN  PTR  proj-1-z4.http.bigcorp.com.
55  IN  PTR  proj-1-z5.http.bigcorp.com.
71  IN  PTR  proj-2-z1.http.bigcorp.com.
72  IN  PTR  proj-2-z2.http.bigcorp.com.
73  IN  PTR  proj-2-z3.http.bigcorp.com.
74  IN  PTR  proj-2-z4.http.bigcorp.com.
75  IN  PTR  proj-2-z5.http.bigcorp.com.
103 IN  PTR  proj-3-oats.http.bigcorp.com.
104 IN  PTR  proj-4-oats.http.bigcorp.com.
221 IN  PTR  proj-1-obiee.http.bigcorp.com.
231 IN  PTR  proj-1-ohs.http.bigcorp.com.

JDBC network:

/var/named/jdbc.db:

$TTL 3h
@       IN      SOA     proj-1-db jeff  (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
        )
              IN   NS  proj-1-db
proj-1-z1     IN   A   192.168.30.51
proj-1-z2     IN   A   192.168.30.52
proj-1-z3     IN   A   192.168.30.53
proj-1-z4     IN   A   192.168.30.54
proj-1-z5     IN   A   192.168.30.55
proj-2-z1     IN   A   192.168.30.71
proj-2-z2     IN   A   192.168.30.72
proj-2-z3     IN   A   192.168.30.73
proj-2-z4     IN   A   192.168.30.74
proj-2-z5     IN   A   192.168.30.75
proj-1-db-vip IN   A   192.168.30.101
proj-2-db-vip IN   A   192.168.30.102
proj-scan     IN   A   192.168.30.103
proj-scan     IN   A   192.168.30.104
proj-scan     IN   A   192.168.30.105
proj-1-db     IN   A   192.168.30.201
proj-2-db     IN   A   192.168.30.202
proj-1-obiee  IN   A   192.168.30.221
proj-1-ohs    IN   A   192.168.30.231
proj-2-ohs    IN   A   192.168.30.232

/var/named/30.168.192.db:

$TTL 3h
@       IN      SOA    proj-1-db.jdbc.bigcorp.com. jeff.jdbc.bigcorp.com. (
        2013022744 ;serial (change after every update)
        3600 ;refresh (1 hour)
        3600 ;retry (1 hour)
        604800 ;expire (1 week)
        38400 ;minimum (1 day)
        )
    IN  NS   proj-1-db.jdbc.bigcorp.com.
51  IN  PTR  proj-1-z1.jdbc.bigcorp.com.
52  IN  PTR  proj-1-z2.jdbc.bigcorp.com.
53  IN  PTR  proj-1-z3.jdbc.bigcorp.com.
54  IN  PTR  proj-1-z4.jdbc.bigcorp.com.
55  IN  PTR  proj-1-z5.jdbc.bigcorp.com.
71  IN  PTR  proj-2-z1.jdbc.bigcorp.com.
72  IN  PTR  proj-2-z2.jdbc.bigcorp.com.
73  IN  PTR  proj-2-z3.jdbc.bigcorp.com.
74  IN  PTR  proj-2-z4.jdbc.bigcorp.com.
75  IN  PTR  proj-2-z5.jdbc.bigcorp.com.
101 IN  PTR  proj-1-vip.jdbc.bigcorp.com.
102 IN  PTR  proj-2-vip.jdbc.bigcorp.com.
103 IN  PTR  proj-scan.jdbc.bigcorp.com.
104 IN  PTR  proj-scan.jdbc.bigcorp.com.
105 IN  PTR  proj-scan.jdbc.bigcorp.com.
201 IN  PTR  proj-1-db.jdbc.bigcorp.com.
202 IN  PTR  proj-2-db.jdbc.bigcorp.com.
221 IN  PTR  proj-1-obiee.jdbc.bigcorp.com.
231 IN  PTR  proj-1-ohs.jdbc.bigcorp.com.
232 IN  PTR  proj-2-ohs.jdbc.bigcorp.com.

Configuring the DNS server:

# mkdir /var/named
# mkdir /var/named/dump
# mkdir /var/named/tmp
# pkg install pkg:/service/network/dns/bind
# named-checkconf -z /etc/named.conf
zone jdbc.bigcorp.com/IN: loaded serial 2013022744
zone 30.168.192.in-addr.arpa/IN: loaded serial 2013022744
zone http.bigcorp.com/IN: loaded serial 2013022744
zone 40.168.192.in-addr.arpa/IN: loaded serial 2013022744

Start the DNS server:

# svcadm enable network/dns/server

Configure the DNS client:

root@proj-1-db:~# svccfg -s network/dns/client
svc:/network/dns/client> setprop config/search = astring: ("jdbc.bigcorp.com" "bigcorp.com")
svc:/network/dns/client> setprop config/nameserver = net_address: (192.168.30.201)
svc:/network/dns/client> refresh
svc:/network/dns/client> quit

Test DNS:

root@proj-1-db:~# nslookup proj-2-z4
Server:         192.168.30.201
Address:        192.168.30.201#53

Name:    proj-2-z4.jdbc.bigcorp.com
Address: 192.168.30.74

I didn't use DNS for the storage network (20) or the cluster interconnect (10); instead, I just used /etc/hosts:

root@proj-1-db:~# cat /etc/hosts
#
# Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.
#
# Internet host table
#
::1             localhost
127.0.0.1       localhost loghost
192.168.30.201  proj-1-db

## RAC Private /24 Subnet
192.168.10.201  proj-1-db-priv
192.168.10.202  proj-2-db-priv

## Storage /24 Subnet
192.168.20.201  proj-1-db-stor
192.168.20.202  proj-2-db-stor
192.168.20.205  proj-5-s7420-stor

The WebLogic Server zones each have 3 IPs:

root@proj-2-z1:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
z1/v4             static   ok           xx.xx.xx.xx/20   # Management Subnet
jdbc_z1/v4        static   ok           192.168.30.71/24 # JDBC Subnet
http_z1/v4        static   ok           192.168.40.71/24 # HTTP Subnet
lo0/v6            static   ok           ::1/128

When Oracle Clusterware is up, the database LDom has many IPs:

root@proj-1-db:~# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
lo0/v4            static   ok           127.0.0.1/8
net0/v4           static   ok           xx.xx.xx.xx/20    # Management Subnet
net1/v4           static   ok           192.168.20.201/24 # Storage Subnet
net2/v4           static   ok           192.168.10.201/24 # Clusterware interconnect
net3/v4           static   ok           192.168.30.201/24 # JDBC Subnet
net3/v4a          static   ok           192.168.30.103/24 # SCAN
net3/v4d          static   ok           192.168.30.105/24 # SCAN
net3/v4e          static   ok           192.168.30.101/24 # VIP
lo0/v6            static   ok           ::1/128
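A couple of extra checks can be run once the service is enabled; this is only a suggested verification, reusing addresses from the zone files above, to exercise the reverse zones and to confirm that the name-service switch is actually resolving through DNS:

root@proj-1-db:~# nslookup 192.168.30.74     # should return proj-2-z4.jdbc.bigcorp.com
root@proj-1-db:~# getent hosts proj-2-z4     # resolves via the configured name-service switch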


Oracle SQL Connector for HDFS on SPARC

This blog describes the steps necessary to configure Oracle SQL Connector for Hadoop Distributed File System (OSCH) to enable Oracle Database running on Oracle Solaris to access and analyze data residing in Cloudera HDFS running on an Oracle Big Data Appliance. The steps have been verified against a Cloudera Distribution including Apache Hadoop running in the Oracle Big Data Lite Virtual Machine. My configuration adventures regarding the Oracle Big Data Lite Virtual Machine and VirtualBox are shown in this previous blog entry. Although similar steps are expected to work on many similar hardware and software configurations, it is worth noting that this document has been tested using the following two configurations: Tested Configuration 1 RDBMS Connector Hadoop Oracle SPARC T4-2 Server Oracle Solaris 11.2 Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 Oracle SQL Connector for Hadoop Distributed File System v3.1.0 Oracle Big Data Appliance Oracle Enterprise Linux 6.4 Cloudera Manager (5.1.0) Cloudera Distribution including Apache Hadoop (CDH5.1.0) Tested Configuration 2 RDBMS Connector Hadoop SPARC SuperCluster T4-4 Oracle Solaris 11.1 Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 Oracle SQL Connector for Hadoop Distributed File System v3.1.0 Oracle Big Data Lite Virtual Machine 4.0.1 Oracle Enterprise Linux 6.4 Cloudera Manager (5.1.2) Cloudera Distribution including Apache Hadoop (CDH5.1.2) Part A: Preparing the initial environment Step 1: Oracle Database 12c deployment * Follow the typical database deployment guides: Oracle Database Online Documentation 12c Release 1 (12.1) -> Installing and Upgrading --> Under, "Solaris Installation Guides", follow the guide that suites your needs:---> Grid Infrastructure Installation Guide for Oracle Solaris---> Database Installation Guide for Oracle Solaris ---> Database Quick Installation Guide for Oracle Solaris on SPARC (64-Bit) Step 2: Oracle Big Data deployment * For the Big Data Appliance, follow the instructions in the Oracle Big Data Documentation. * For the Oracle Big Data Lite Virtual Machine, follow the Quick Deployment Step-by-step instructions. Also, see my configuration adventures recounted in this previous blog entry. Part B: Enable the Cloudera HDFS client on Solaris * It is assumed that both Oracle database and Hadoop cluster are up and running independently before you attempt to install the connector. Step 1: Make sure that Java is installed on Solaris $ sudo pkg install --accept jdk-7 $ /usr/java/bin/java -version java version "1.7.0_65" Java(TM) SE Runtime Environment (build 1.7.0_65-b17) Java HotSpot(TM) Server VM (build 24.65-b04, mixed mode) Step 2: Determine the Version of Cloudera that is running on your Big Data Appliance: * Starting with the BDA up and running: * Click on the "Support" pull down and the "About": * In my case, I was using Cloudera Express 5.1.2 Step 3: On the Oracle Database servers, install the version of Hadoop that matches your cluster * Visit Cloudera Downloads, click on "CDH Download Now", and choose your version. In my case, CDH 5.1.2. * This took me to the Packaging and Tarballs page for all of the CDH components. Download the Apache Hadoop Tarball Place this tarball onto your Solaris server, or in the case of Oracle RAC, copy the tarball to every node.There are many ways to get the tarball onto your Solaris server. 
In my case, the most convenient method was to use wget:

$ export http_proxy=http://the-proxy-server.bigcorp.com:80
$ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.3.0-cdh5.1.2.tar.gz
$ scp hadoop-2.3.0-cdh5.1.2.tar.gz oracle@my-rac-node-2:

Unpack the tarball on your Solaris server, or on every RAC node:

$ tar xzf ../Downloads/hadoop-2.3.0-cdh5.1.2.tar.gz

And verify:

$ ls hadoop-2.3.0-cdh5.1.2
bin                  examples             libexec
bin-mapreduce1       examples-mapreduce1  sbin
cloudera             include              share
etc                  lib                  src

Step 4: Download the hdfs configuration files to the Solaris RDBMS server
* In the Cloudera Manager, go to the hdfs page:
* From the hdfs page, download the client configuration. Place the client configuration on the Solaris server(s).
* Unpack the client configuration on your Solaris server(s):

$ unzip ./Downloads/hdfs-clientconfig.zip
$ ls hadoop-conf
core-site.xml     hdfs-site.xml     topology.map
hadoop-env.sh     log4j.properties  topology.py

Step 5: Configure the Hadoop client software on the Solaris server(s)
* Edit hadoop-conf/hadoop-env.sh to set JAVA_HOME correctly:
export JAVA_HOME=/usr/jdk/instances/jdk1.7.0
* Move the configuration files into place:
$ cp hadoop-conf/* hadoop-2.3.0-cdh5.1.2/etc/hadoop/
* Add the Hadoop bin directory to your PATH. You may want to do this in your local shell or .bashrc:
$ export PATH=~/hadoop-2.3.0-cdh5.1.2/bin:$PATH

Step 6: Test the Hadoop client software on Solaris
* Verify that the remote hdfs filesystem is visible from your Solaris server(s):

$ hdfs dfs -ls
14/12/12 09:35:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 6 items
drwx------   - oracle oracle          0 2014-08-25 02:55 .Trash
drwx------   - oracle oracle          0 2014-09-23 10:25 .staging
drwxr-xr-x   - oracle oracle          0 2014-01-12 15:15 moviedemo
drwxr-xr-x   - oracle oracle          0 2014-09-24 06:38 moviework
drwxr-xr-x   - oracle oracle          0 2014-09-08 12:50 oggdemo
drwxr-xr-x   - oracle oracle          0 2014-09-20 10:59 oozie-oozi

Part C: Install "Oracle SQL Connector for HDFS" (OSCH) on your Solaris server(s)

Step 1: Download OSCH
* Download Oracle SQL Connector for Hadoop Distributed File System Release 3.1.0 from the Oracle Big Data Connectors Downloads page.
* Unzip OSCH:

$ unzip oraosch-3.1.0.zip
Archive:  oraosch-3.1.0.zip
 extracting: orahdfs-3.1.0.zip
  inflating: README.txt

$ unzip orahdfs-3.1.0
Archive:  orahdfs-3.1.0.zip
replace orahdfs-3.1.0/doc/README.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
  inflating: orahdfs-3.1.0/doc/README.txt
  inflating: orahdfs-3.1.0/jlib/ojdbc6.jar
  inflating: orahdfs-3.1.0/jlib/osdt_cert.jar
  inflating: orahdfs-3.1.0/jlib/orahdfs.jar
  inflating: orahdfs-3.1.0/jlib/oraclepki.jar
  inflating: orahdfs-3.1.0/jlib/osdt_core.jar
  inflating: orahdfs-3.1.0/jlib/ora-hadoop-common.jar
  inflating: orahdfs-3.1.0/bin/hdfs_stream
  inflating: orahdfs-3.1.0/examples/sql/mkhive_unionall_view.sql

Step 2: Install OSCH
* Follow the instructions in the Connectors User's Guide, which is available as part of the Oracle Big Data Documentation. Also see Getting Started with Oracle Big Data Connectors.
* I'm tempted to cut & paste everything from "1.4.3 Installing Oracle SQL Connector for HDFS" into this document, but I won't.
* Your mileage may vary, but for me, it looked like this:

$ tail -1 /var/opt/oracle/oratab
dbmc1:/u01/app/oracle/product/12.1.0.2/dbhome_1:N               # line added by Agent

$ export PATH=/usr/local/bin:$PATH
$ export ORACLE_SID=dbmc1
$ . /usr/local/bin/oraenv
ORACLE_SID = [dbmc1] ?
The Oracle base has been set to /u01/app/oracle

$ env | grep ORA
ORACLE_SID=dbmc1
ORACLE_BASE=/u01/app/oracle
ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1

$ srvctl status database -d dbmc1
Instance dbmc11 is running on node etc20dbadm01
Instance dbmc12 is running on node etc20dbadm02

$ export OSCH_HOME=/export/home/oracle/orahdfs-3.1.0
$ export HADOOP_CLASSPATH=$OSCH_HOME/jlib/*

$ sqlplus / as sysdba
SQL> CREATE USER hdfsuser IDENTIFIED BY n0ne0fyourBusiness
     DEFAULT TABLESPACE users
     QUOTA UNLIMITED ON users;
SQL> GRANT CREATE SESSION, CREATE TABLE, CREATE VIEW TO hdfsuser;
SQL> GRANT EXECUTE ON sys.utl_file TO hdfsuser;
SQL> CREATE OR REPLACE DIRECTORY osch_bin_path AS '/export/home/oracle/orahdfs-3.1.0/bin';
SQL> GRANT READ, EXECUTE ON DIRECTORY osch_bin_path TO hdfsuser;

Notice that MOVIEDEMO_DIR needs to be on shared storage, visible to both RAC nodes. From a Solaris shell prompt, create the MOVIEDEMO_DIR, substituting the ZFS_SA InfiniBand hostname for XXX, below. Then allow the database user to access the directory:

$ mkdir /net/XXX/export/test/hdfsuser
$ sqlplus / as sysdba
SQL> CREATE OR REPLACE DIRECTORY MOVIEDEMO_DIR AS '/net/XXX/export/test/hdfsuser';
SQL> GRANT READ, WRITE ON DIRECTORY MOVIEDEMO_DIR TO hdfsuser;

Step 3: Test using the Movie Demo

Test using the movie demo, which is documented in the Big Data Connectors User's Guide. Cut and paste moviefact_hdfs.sh and moviefact_hdfs.xml from Example 2-1, Accessing HDFS Data Files from Oracle Database. In moviefact_hdfs.sh, for my configuration I needed to change the path to OSCH_HOME and to moviefact_hdfs.xml. In moviefact_hdfs.xml, I needed to change two properties, as follows. For the database connection, use the Oracle Single Client Access Name (SCAN).

    <property>
      <name>oracle.hadoop.connection.url</name>
      <value>jdbc:oracle:thin:@sc-scan:1521/dbmc1</value>
    </property>

    <property>
      <name>oracle.hadoop.connection.user</name>
      <value>hdfsuser</value>
    </property>

Run the script:

$ sh moviefact_hdfs.sh
Oracle SQL Connector for HDFS Release 3.1.0 - Production
Copyright (c) 2011, 2014, Oracle and/or its affiliates. All rights reserved.
[Enter Database Password:]
14/12/12 12:36:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
The create table command succeeded.
CREATE TABLE "HDFSUSER"."MOVIE_FACTS_EXT"( "CUST_ID"                        VARCHAR2(4000), "MOVIE_ID"                       VARCHAR2(4000), "GENRE_ID"                       VARCHAR2(4000), "TIME_ID"                        TIMESTAMP(9), "RECOMMENDED"                    NUMBER, "ACTIVITY_ID"                    NUMBER, "RATING"                         NUMBER, "SALES"                          NUMBER)ORGANIZATION EXTERNAL(   TYPE ORACLE_LOADER   DEFAULT DIRECTORY "MOVIEDEMO_DIR"   ACCESS PARAMETERS   (     RECORDS DELIMITED BY 0X'0A'     CHARACTERSET AL32UTF8     PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream'     FIELDS TERMINATED BY 0X'09'     MISSING FIELD VALUES ARE NULL     (       "CUST_ID" CHAR(4000),       "MOVIE_ID" CHAR(4000),       "GENRE_ID" CHAR(4000),       "TIME_ID" CHAR DATE_FORMAT TIMESTAMP MASK 'YYYY-MM-DD:HH:MI:SS',       "RECOMMENDED" CHAR,       "ACTIVITY_ID" CHAR,       "RATING" CHAR,       "SALES" CHAR     )   )   LOCATION   (     'osch-20141212123602-2522-1',     'osch-20141212123602-2522-2',     'osch-20141212123602-2522-3',     'osch-20141212123602-2522-4'   )) PARALLEL REJECT LIMIT UNLIMITED; The following location files were created. osch-20141212123602-2522-1 contains 1 URI, 12754882 bytes     12754882 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00001 osch-20141212123602-2522-2 contains 1 URI, 438 bytes          438 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00002 osch-20141212123602-2522-3 contains 1 URI, 432 bytes          432 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00003 osch-20141212123602-2522-4 contains 1 URI, 202 bytes          202 hdfs://ofsaa-bdl.us.oracle.com:8020/user/oracle/moviework/data/part-00004 Final verification with SQL Developer:


Sun

Big Data Lite with a static IP address on Oracle Linux 6 / EL6

Quest: to run the Oracle Big Data Lite Virtual Machine in a VirtualBox guest, where the VBox guest is configured with a static IP address and running in a headless mode, running on a VBox host running Oracle Linux 6. Step 1: Install VirtualBox on the host You might be tempted to start a download from the Oracle VM VirtualBox page, but that isn't the best practice.Yes, an "Oracle Linux 6 / EL6" RPM is published on that page, but instead, install from the yum public_ol6_addons repository. See Wim Coekaerts blog: Setting up Oracle Linux 6 with public-yum for all updates. VirtualBox-4.3 is in the public_ol6_addons repository: # repoquery -i VirtualBox-4.3 Name        : VirtualBox-4.3Version     : 4.3.20_96996_el6Release     : 1Architecture: x86_64Size        : 151242167Packager    : NoneGroup       : Applications/SystemURL         : http://www.virtualbox.org/Repository  : public_ol6_addonsSummary     : Oracle VM VirtualBoxSource      : VirtualBox-4.3-4.3.20_96996_el6-1.src.rpmDescription :VirtualBox is a powerful PC virtualization solution allowingyou to run a wide range of PC operating systems on your Linuxsystem. This includes Windows, Linux, FreeBSD, DOS, OpenBSDand others. VirtualBox comes with a broad feature set andexcellent performance, making it the premier virtualizationsoftware solution on the market. When I was ready to install VirtualBox, the repositories on my VirtualBox host were set up like this: # yum repolist Loaded plugins: refresh-packagekit, security repo id               repo name                                cloudera-cdh5         Cloudera's Distribution for Hadoop public_ol6_UEK_latest Latest Unbreakable Enterprise Kernel public_ol6_addons     Oracle Linux 6Server Add ons public_ol6_latest     Oracle Linux 6Server Latest FYI, the Cloudera repository is a hold-over from a previous project and *NOT* required to for the VirtualBox host. Cloudera will be running in the VBox guest, not host. Here we go: # yum install VirtualBox-4.3 Loaded plugins: refresh-packagekit, security Setting up Install Process Resolving Dependencies --> Running transaction check ---> Package VirtualBox-4.3.x86_64 0:4.3.20_96996_el6-1 will be installed --> Finished Dependency Resolution Dependencies Resolved ===============================================================================================  Package               Arch          Version                    Repository                Size =============================================================================================== Installing:  VirtualBox-4.3        x86_64        4.3.20_96996_el6-1         public_ol6_addons         68 M Transaction Summary =============================================================================================== Install       1 Package(s) Total download size: 68 M Installed size: 144 M Is this ok [y/N]: y Downloading Packages: VirtualBox-4.3-4.3.20_96996_el6-1.x86_64.rpm                            |  68 MB     00:26      Running rpm_check_debug Running Transaction Test Transaction Test Succeeded Running Transaction   Installing : VirtualBox-4.3-4.3.20_96996_el6-1.x86_64                                    1/1 Creating group 'vboxusers'. VM users must be member of that group! No precompiled module for this kernel found -- trying to build one. Messages emitted during module compilation will be logged to /var/log/vbox-install.log. 
Stopping VirtualBox kernel modules [  OK  ] Recompiling VirtualBox kernel modules [FAILED]   (Look at /var/log/vbox-install.log to find out what went wrong)   Verifying  : VirtualBox-4.3-4.3.20_96996_el6-1.x86_64                                    1/1 Installed:   VirtualBox-4.3.x86_64 0:4.3.20_96996_el6-1                                                    Complete! WWW-wisdom indicates that you should install kernel headers, and I noticed that a newer version was available. Could that be the problem? # yum list kernel* Loaded plugins: refresh-packagekit, security Installed Packages kernel.x86_64                2.6.32-358.14.1.el6     ol6_latest           kernel.x86_64                2.6.32-431.1.2.el6      @public_ol6_latest    kernel.x86_64                2.6.32-431.29.2.el6     installed             kernel-devel.x86_64          2.6.32-358.14.1.el6     @ol6_latest           kernel-devel.x86_64          2.6.32-431.1.2.el6      @public_ol6_latest    kernel-devel.x86_64          2.6.32-431.29.2.el6     installed             kernel-firmware.noarch       2.6.32-431.29.2.el6     installed             kernel-uek.x86_64            2.6.39-400.109.5.el6uek @ol6_UEK_latest       kernel-uek.x86_64            2.6.39-400.212.1.el6uek @public_ol6_UEK_latest kernel-uek.x86_64            2.6.39-400.215.10.el6uek public_ol6_UEK_latest kernel-uek-firmware.noarch   2.6.39-400.109.5.el6uek @ol6_UEK_latest       kernel-uek-firmware.noarch   2.6.39-400.212.1.el6uek ublic_ol6_UEK_latest kernel-uek-firmware.noarch   2.6.39-400.215.10.el6uek@public_ol6_UEK_latest kernel-uek-headers.x86_64    2.6.32-400.36.8.el6uek  installed Available Packages kernel.x86_64                2.6.32-504.1.3.el6       public_ol6_latest     kernel-abi-whitelists.noarch 2.6.32-504.1.3.el6       public_ol6_latest     kernel-debug.x86_64          2.6.32-504.1.3.el6       public_ol6_latest     kernel-debug-devel.x86_64    2.6.32-504.1.3.el6       public_ol6_latest     kernel-devel.x86_64          2.6.32-504.1.3.el6       public_ol6_latest     kernel-doc.noarch            2.6.32-504.1.3.el6       public_ol6_latest     kernel-firmware.noarch       2.6.32-504.1.3.el6       public_ol6_latest     kernel-headers.x86_64        2.6.32-504.1.3.el6       public_ol6_latest     kernel-uek.x86_64            2.6.39-400.215.13.el6uek public_ol6_UEK_latest kernel-uek-debug.x86_64      2.6.39-400.215.13.el6uek public_ol6_UEK_latest kernel-uek-debug-devel.x86_64 2.6.39-400.215.13.el6uekpublic_ol6_UEK_latest kernel-uek-devel.x86_64      2.6.39-400.215.13.el6uek public_ol6_UEK_latest kernel-uek-doc.noarch        2.6.39-400.215.13.el6uek public_ol6_UEK_latest kernel-uek-firmware.noarch   2.6.39-400.215.13.el6uek public_ol6_UEK_latest kernel-uek-headers.x86_64    2.6.32-400.36.11.el6uek  public_ol6_latest So I updated the headers... # yum install kernel-uek-headers.x86_64 Loaded plugins: refresh-packagekit, security... Downloading Packages: kernel-uek-headers-2.6.32-400.36.11.el6uek.x86_64.rpm   | 742 kB  00:00     ... Updated:   kernel-uek-headers.x86_64 0:2.6.32-400.36.11.el6uek   Complete! But I still couldn't build the kernel module: # /etc/init.d/vboxdrv setup Stopping VirtualBox kernel modules                         [  OK  ] Recompiling VirtualBox kernel modules                      [FAILED]   (Look at /var/log/vbox-install.log to find out what went wrong) # more /var/log/vbox-install.log Makefile:183: *** Error: unable to find the sources of your current Linux kernel. Specify KERN_DIR=<directory> and run Make again.  Stop. 
I'm told that I need to specify a KERN_DIR, and this is what I have to choose from: # ls /usr/src/kernels/ 2.6.32-358.14.1.el6.x86_64  2.6.32-431.1.2.el6.x86_64  2.6.32-431.29.2.el6.x86_64 So I do as I'm told: # export KERN_DIR=/usr/src/kernels/2.6.32-358.14.1.el6.x86_64 # /etc/init.d/vboxdrv setupStopping VirtualBox kernel modules                         [  OK  ]Recompiling VirtualBox kernel modules                      [  OK  ]Starting VirtualBox kernel modules                         [FAILED]  (modprobe vboxdrv failed. Please use 'dmesg' to find out why) # dmesg | tail -1 vboxdrv: disagrees about version of symbol module_layout Observe the result: the error message changed, but it still didn't work. Which directory in /usr/src/kernels should I use? Choose a different one? Well, duh, I say to myself, which version is running? # uname -a Linux p3231-03 2.6.39-400.215.10.el6uek.x86_64 #1 SMP Wed Sep 10 00:07:12 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux But wait! None of the versions in /usr/src/kernels match my "uname -a" output. Looking at the list, above, I see that I have Installed Packages of "kernel-uek-devel.x86_64" matching all 3 versions in /usr/src/kernels and one Available Packages of  "kernel-uek-devel.x86_64" which matches my  "uname -a". Try installing that: # yum install kernel-uek-devel.x86_64 Loaded plugins: refresh-packagekit, security...Downloading Packages: kernel-uek-devel-2.6.39-400.215.13.el6uek.x86_64.rpm    | 8.1 MB 00:03     ... Installed:   kernel-uek-devel.x86_64 0:2.6.39-400.215.13.el6uek  Complete! And now a 4th version, matching my "uname -a" is in /usr/src/kernels. # ls /usr/src/kernels/ 2.6.32-358.14.1.el6.x86_64  2.6.32-431.29.2.el6.x86_64 2.6.32-431.1.2.el6.x86_64   2.6.39-400.215.13.el6uek.x86_64 Try again: # export KERN_DIR=/usr/src/kernels/2.6.39-400.215.13.el6uek.x86_64 # /etc/init.d/vboxdrv setup Stopping VirtualBox kernel modules                         [  OK  ] Removing old VirtualBox pci kernel module                  [  OK  ] Removing old VirtualBox netadp kernel module               [  OK  ] Removing old VirtualBox netflt kernel module               [  OK  ] Removing old VirtualBox kernel module                      [  OK  ] Recompiling VirtualBox kernel modules                      [  OK  ] Starting VirtualBox kernel modules                         [  OK  ] Victory!! Now VirtualBox starts. Step 2: Install the extension pack Recall that my quest is to run in a headless mode, so I need VirtualBox Remote Desktop Protocol (VRDP) support; see Section 7.1, “Remote display (VRDP support)”. Download the extension pack from http://download.virtualbox.org/virtualbox/4.3.20 and install the extention pack on the VirtualBox HOST: $ sudo VBoxManage extpack install Oracle_VM_VirtualBox_Extension_Pack-4.3.20-96996.vbox-extpack [sudo] password for cloudera: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%Successfully installed "Oracle VM VirtualBox Extension Pack". Step 3: Oracle Big Data Lite deployment Send the HOST GUI display back to my desktop and follow the Quick Deployment Step-by-step instructions to deploy the GUEST - the Oracle Big Data Lite Virtual Machine. desktop$ ssh -X vbox-host vbox-host$ virtualbox Step 4: Update the Guest Additions Send the HOST GUI display back to my desktop and mount the Guest Additions CD on the GUEST: desktop$ ssh -X vbox-host vbox-host$ virtualbox Step 5: Start the guest and update the Guest Additions Since I'm using SSH X11 Forwarding, when I start the guest, the console is displayed on my desktop. 
Login to the GUEST as "oracle" with password "welcome1" and update the guest additions. [oracle@bigdatalite ~]$ sudo /media/VBOXADDITIONS_4.3.20_96996/VBoxLinuxAdditions.run Verifying archive integrity... All good. Uncompressing VirtualBox 4.3.20 Guest Additions for Linux............ VirtualBox Guest Additions installer Removing installed version 4.2.16 of VirtualBox Guest Additions... Copying additional installer modules ... Installing additional modules ... Removing existing VirtualBox non-DKMS kernel modules       [  OK  ] Building the VirtualBox Guest Additions kernel modules The headers for the current running kernel were not found. If the following module compilation fails then this could be the reason. The missing package can be probably installed with yum install kernel-uek-devel-2.6.39-400.109.6.el6uek.x86_64 Building the main Guest Additions module                   [  OK  ] Building the shared folder support module                  [  OK  ] Building the OpenGL support module                         [  OK  ] Doing non-kernel setup of the Guest Additions              [  OK  ] You should restart your guest to make sure the new modules are actually used Installing the Window System drivers Installing X.Org Server 1.13 modules                       [  OK  ] Setting up the Window System to use the Guest Additions    [  OK  ] You may need to restart the hal service and the Window System (or just restart the guest system) to enable the Guest Additions. Installing graphics libraries and desktop services componen[  OK  ] Step 6: Configure a bridged adapter For my particular needs a static IP address was required for the Oracle Big Data Lite Virtual Machine. It is unlikely that you will need to follow this step. The static IP address for the GUEST had been published to a our DNS server. Further, a dedicated network port for the GUEST was available. Q: Which network port is available? A: Used "mii-tool" and "ifconfig -a" to determine that eth0 was the primary port for the HOST and eth2 was the spare port the had been wired for the GUEST: # for i in `seq 0 3`; do mii-tool -V eth$i| grep -v 2000; doneeth0: negotiated 100baseTx-FD, link oketh1: no linketh2: negotiated 100baseTx-FD, link oketh3: no link After starting the GUEST, I was able to configure the hostname and IP of the GUEST. # ip addr add xx.xx.xx.xx/20 dev eth4# route add default gw xx.xx.xx.xx eth4 Interesting that eth2 on the HOST is connected to eth4 on the GUEST. Step 7: Configure VirtualBox Remote Desktop Protocol (VRDP): See Section 7.1, “Remote display (VRDP support)”. Step 8 : Start the GUEST in headless mode: On the HOST, run: $ VBoxManage list vms $ nohup VBoxHeadless -s BigDataLite-4.0.1 & At this point, you should be able to ssh to the IP of the GUEST Step 9: Open a hole in the HOST firewall for RDP: On the HOST, run: # iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 3389 -j ACCEPT # service iptables save That surprised me. To get to the VirtualBox Remote Desktop of the GUEST, connect your remote desktop client to port 3389 of the HOST. At first, I incorrectly guessed that the IP address for the GUEST VRDP would be the GUEST's IP, but in fact, I found the GUEST's VRDP on the HOST IP address.   
Step 10: Start the Cloudera Manager
On the GUEST, run:
$ sudo service cloudera-scm-server start
Starting cloudera-scm-server:                              [  OK  ]
$ sudo service cloudera-scm-agent start
Starting cloudera-scm-agent:                               [  OK  ]
Step 11: Open a hole in the GUEST firewall for the Cloudera Manager
On the GUEST, run:
# iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 7180 -j ACCEPT
# service iptables save
Step 12: Visit the Cloudera Manager with a web browser
Visit http://<guest-ip>:7180/cmf/login (credentials: admin/welcome1)
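A quick way to confirm the firewall hole from another machine, before reaching for a browser (a sketch; substitute the guest's hostname or IP address):

$ curl -sI http://<guest-ip>:7180/cmf/login | head -1

Any HTTP status line coming back means port 7180 is reachable; no output suggests the port is still blocked.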


Sun

JAVA_HOME on Solaris 11

I recently asserted that it is not a good idea to set the JAVA_HOME environment variable to /usr/java on Solaris 11. Instead, I'd recommend something like this: $ export JAVA_HOME=/usr/jdk/instances/jdk1.7.0 Why? Because it isn't clear where /usr/java will point to over time. Here are some details. Working with Solaris 11: $ cat /etc/release                             Oracle Solaris 11.2 SPARC   Copyright (c) 1983, 2014, Oracle and/or its affiliates.  All rights reserved.                              Assembled 23 June 2014 # pkg info entire | grep //           FMRI: pkg://solaris/entire@0.5.11,5.11-0.175.2.0.0.42.0:20140624T193832Z At first glance, it seems that /usr/java is a symbolic link to JDK 1.7: $ ls -l /usr/java lrwxrwxrwx   1 root     root          15 Aug 18 08:02 /usr/java -> jdk/jdk1.7.0_60 But on closer examination, it isn't a full JDK. Notice that jconsole, jps, etc are missing: $ ls /usr/java/bin ControlPanel  jcontrol      pack200       rmiregistry   tnameserv java          keytool       policytool    servertool    unpack200 javaws        orbd          rmid          sparcv9 Installing the full JDK is easy: $ sudo pkg install --accept jdk-7 And now the rest of the expected JDK programs are present: $ ls /usr/java/binappletviewer  javah         jmap          native2ascii  servertoolextcheck      javap         jps           orbd          sparcv9idlj          jcmd          jrunscript    pack200       tnameservjar           jconsole      jsadebugd     policytool    unpack200jarsigner     jdb           jstack        rmic          wsgenjava          jdeps         jstat         rmid          wsimportjava-rmi.cgi  jhat          jstatd        rmiregistry   xjcjavac         jinfo         jvisualvm     schemagenjavadoc       jjs           keytool       serialver Note that /usr/java is a symbolic link, to a second symbolic link, which links to a directory: $ ls -ld /usr/java lrwxrwxrwx   1 root     root          15 Aug 18 08:02 /usr/java -> jdk/jdk1.7.0_60 Following symbolic link 1: $ ls -ld /usr/jdk/jdk1.7.0_60 lrwxrwxrwx   1 root     root          18 Aug 18 08:02 /usr/jdk/jdk1.7.0_60 -> instances/jdk1.7.0 Following symbolic link 2: $ ls -ld /usr/jdk/instances/jdk1.7.0/ drwxr-xr-x   6 root     bin            7 Aug 18 08:02 /usr/jdk/instances/jdk1.7.0/ But if you install jdk-8, the links change: $ ls -ld /usr/javalrwxrwxrwx   1 root     root          15 Nov 21 13:08 /usr/java -> jdk/jdk1.8.0_20 $ ls -ld /usr/jdk/jdk1.8.0_20lrwxrwxrwx   1 root     root          18 Nov 21 13:08 /usr/jdk/jdk1.8.0_20 -> instances/jdk1.8.0 So if your JAVA_HOME was set to /usr/java, your application would start using JDK-8. $ export JAVA_HOME=/usr/java $ $JAVA_HOME/bin/java -version java version "1.8.0_20" Java(TM) SE Runtime Environment (build 1.8.0_20-b26) Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode) It is interesting that, in contrast, when you install JDK-6 and JDK-7 is already installed, the /usr/java links will continue to point to the JDK with the higher major version. My advice is that if your application is certified with JDK-7, and you'd like the users to automatically pick up the newest bug fixes and security updates, this is the safest bet. $ export JAVA_HOME=/usr/jdk/instances/jdk1.7.0 Hope this helps.
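As a footnote, if you want a one-liner to see where /usr/java currently ends up, without walking the symbolic links by hand, something like this works (a small sketch; pwd -P prints the physical, fully resolved directory):

$ ls -l /usr/java
$ (cd /usr/java && pwd -P)      # e.g. /usr/jdk/instances/jdk1.7.0 before jdk-8 is installed
$ /usr/java/bin/java -version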


Database

Enterprise Manager agentTZRegion

EM didn't like the switch from daylight saving time to standard time - it was locked into Pacific Daylight Time. In $ORACLE_HOME/node1_sid/sysman/log/emdb.nohup:

----- Mon Nov  3 19:36:01 2014::tzOffset for -07:00 is -420(min), but agent is runnning with tzOffset -480(min)

$ grep TZ $ORACLE_HOME/node1_sid/sysman/config/emd.properties
agentTZRegion=-07:00

$ grep US/Pacific $ORACLE_HOME/sysman/admin/supportedtzs.lst
US/Pacific
US/Pacific-New

$ export TZ=US/Pacific
$ $ORACLE_HOME/bin/emctl resetTZ agent
Oracle Enterprise Manager 11g Database Control Release 11.2.0.4.0
Copyright (c) 1996, 2013 Oracle Corporation.  All rights reserved.
Updating /u01/app/oracle/product/11.2.0/dbhome_1/node1_sid/sysman/config/emd.properties...
Successfully updated /u01/app/oracle/product/11.2.0/dbhome_1/node1_sid/sysman/config/emd.properties.
Login as the em repository user and run the script:
exec mgmt_target.set_agent_tzrgn('node1:3938','US/Pacific')
and commit the changes
This can be done for example by logging into sqlplus and doing
SQL> exec mgmt_target.set_agent_tzrgn('node1:3938','US/Pacific')
SQL> commit

[also ran "emctl resetTZ agent" on the other RAC node]

$ grep TZ $ORACLE_HOME/node1_sid/sysman/config/emd.properties
agentTZRegion=US/Pacific

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 4 07:28:57 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> alter session set current_schema = SYSMAN;
Session altered.

SQL> exec mgmt_target.set_agent_tzrgn('node1:3938','US/Pacific')
PL/SQL procedure successfully completed.

SQL> exec mgmt_target.set_agent_tzrgn('node2:3938','US/Pacific')
PL/SQL procedure successfully completed.

$ emctl stop dbconsole
$ emctl start dbconsole
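To double-check that the repository now agrees with both agents, a query like the following can be run as SYSDBA; a sketch, assuming your repository version exposes the timezone_region column on sysman.mgmt_targets (the target names follow the node:port form used above):

SQL> SELECT target_name, timezone_region
  2  FROM sysman.mgmt_targets
  3  WHERE target_name LIKE '%:3938';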


Analytics

Onion Security

This blog is part of the SPARC T5-4 RAC and WebLogic Cluster series:
Background
SPARC T5-4 LDoms for RAC and WebLogic Clusters
Onion Security
DNS Bind Server configuration on Solaris 11.2
NFS root access for Oracle RAC on Sun ZFS Storage 7x20 Appliance

I was asked why I used so many subnets in the system which is described in my blog entry SPARC T5-4 LDoms for RAC and WebLogic Clusters:
* Management subnet
* HTTP public network
* JDBC subnet
* Storage subnet

The short answer: I didn't need to. I could have used one public network and one RAC private network.

The longer answer:
* Better observability
* Better isolation
* It enables a better security model

Onion View
LDom View

If Joe Blackhat is able to compromise our HTTP server, that is a bad thing, but hopefully he will only be able to access a subset of the data. To get any additional data, he will need to request it from the WebLogic server. The HTTP-to-WebLogic network layer can be monitored, firewalled, and logged. Again, if Joe Blackhat is able to penetrate one layer deeper, into the WebLogic layer, he will only be able to access additional data via JDBC calls to Oracle RAC. Again, the WebLogic-to-RAC network layer can be monitored, firewalled, and logged. And so forth...

In case it isn't obvious, the management network is intended to be used only infrequently by DBAs and System Administrators. This network should be tightly controlled and only enabled when system administration is required.
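To make the "firewalled" part concrete, here is a minimal IP Filter sketch of the kind of rule set that could sit on the WebLogic domain's HTTP-facing interface. The interface name, subnet address, and port are placeholders (7001 is the WebLogic default listen port), not values taken from this deployment. Two lines appended to /etc/ipf/ipf.conf on the WebLogic domain:

pass in quick on vnet1 proto tcp from 10.10.20.0/24 to any port = 7001 flags S keep state
block in log quick on vnet1 all

Then enable the service and reload the rules:

# svcadm enable network/ipfilter
# ipf -Fa -f /etc/ipf/ipf.conf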


Sun

netperf on Solaris 11

I wanted to use netperf 2.6.0 on Solaris 11. See http://www.netperf.org/netperf/ for details. I had this error when I tried to build:

$ make
make  all-recursive
Making all in src
Making all in missing
Making all in m4
gcc -DHAVE_CONFIG_H -I. -I..       -MT netlib.o -MD -MP -MF .deps/netlib.Tpo -c -o netlib.o netlib.c
In file included from netlib.c:2245:0:
/usr/include/sys/processor.h: In function ‘bind_to_specific_processor’:
/usr/include/sys/processor.h:128:12: error: ‘processor_affinity’ redeclared as different kind of symbol
netlib.c:2222:32: note: previous definition of ‘processor_affinity’ was here

This was with gcc version 4.5.2. I made the following changes in src/netlib.c:

$ diff netlib.c.ORIG netlib.c
2222c2222
< bind_to_specific_processor(int processor_affinity, int use_cpu_map)
---
> bind_to_specific_processor(int j_processor_affinity, int use_cpu_map)
2233c2233
<     mapped_affinity = lib_cpu_map[processor_affinity];
---
>     mapped_affinity = lib_cpu_map[j_processor_affinity];
2236c2236
<     mapped_affinity = processor_affinity;
---
>     mapped_affinity = j_processor_affinity;

And all was good. FYI, I added these settings to increase performance on the 10-Gb network:

/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 16384
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 16384
/usr/sbin/ndd -set /dev/tcp tcp_max_buf 2097152
/usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 524288
/usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 524288
/usr/sbin/ndd -set /dev/tcp tcp_cwnd_max 2097152

The performance on the 10-Gb network is about 30% faster with the tunables in place. The 1-Gb network is not faster than it had been. See "ndd on Solaris 10" - https://blogs.oracle.com/taylor22/entry/ndd_on_solaris_10

Performance now looks good:

$ /usr/local/bin/netperf -fg -H my-server
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to my-server () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^9bits/sec

524288  65536  65536    10.00       9.90

References:
My blog about ndd settings on Solaris
How to Measure the Network Bandwidth Between Solaris Nodes?
2009 article about Netperf 2.4.5 on Solaris

Hope this helps.
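One small addition for reference: the receiving side is netserver from the same netperf build. A minimal client/server pair looks like this (host name and control port are placeholders; -fg reports 10^9 bits/sec and -l sets the test length in seconds):

remote$ /usr/local/bin/netserver -p 12865
local$  /usr/local/bin/netperf -H my-server -p 12865 -fg -l 30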


Database

emca and ORA-12537: TNS:connection closed

Problem: I couldn't configure Enterprise Manager for a RAC database running on two SPARC T5-4 servers.

$ emca -deconfig dbcontrol db -repos drop  -cluster
$ emca -config dbcontrol db -repos create -cluster

STARTED EMCA at Sep 6, 2014 1:03:42 PM
EM Configuration Assistant, Version 11.2.0.3.0 Production
Copyright (c) 2003, 2011, Oracle.  All rights reserved.

Enter the following information:
Database unique name: orcl2
Service name: orcl2
Listener port number: 1521
Listener ORACLE_HOME [ /u01/app/11.2.0/grid ]:
Password for SYS user:
Password for DBSNMP user:
Password for SYSMAN user:
Cluster name: ofsaa-scan
Email address for notifications (optional):
Outgoing Mail (SMTP) server for notifications (optional):
-----------------------------------------------------------------
You have specified the following settings

Database ORACLE_HOME ................ /u01/app/oracle/product/11.2.0/dbhome_1
Database instance hostname ................
Listener ORACLE_HOME ................ /u01/app/11.2.0/grid
Listener port number ................ 1521
Cluster name ................ proj-scan
Database unique name ................ orcl2
Email address for notifications ...............
Outgoing Mail (SMTP) server for notifications ...............
-----------------------------------------------------------------
Do you wish to continue? [yes(Y)/no(N)]: y
Sep 6, 2014 1:04:28 PM oracle.sysman.emcp.EMConfig perform
INFO: This operation is being logged at /u01/app/oracle/cfgtoollogs/emca/orcl2/emca_2014_09_06_13_03_41.log.
Sep 6, 2014 1:04:33 PM oracle.sysman.emcp.EMReposConfig createRepository
INFO: Creating the EM repository (this may take a while) ...
Sep 6, 2014 1:09:20 PM oracle.sysman.emcp.EMReposConfig invoke
INFO: Repository successfully created
Sep 6, 2014 1:09:22 PM oracle.sysman.emcp.util.GeneralUtil initSQLEngineRemotely
WARNING: Error during db connection : ORA-12537: TNS:connection closed
Sep 6, 2014 1:09:33 PM oracle.sysman.emcp.EMReposConfig invoke
SEVERE: Failed to unlock all EM-related accounts
Sep 6, 2014 1:09:33 PM oracle.sysman.emcp.EMConfig perform
SEVERE: Failed to unlock all EM-related accounts
Refer to the log file at /u01/app/oracle/cfgtoollogs/emca/orcl2/emca_2014_09_06_13_03_41.log for more details.
Could not complete the configuration. Refer to the log file at /u01/app/oracle/cfgtoollogs/emca/orcl2/emca_2014_09_06_13_03_41.log for more details.

Very strange. Everything else seemed fine:

$ srvctl status scan
$ srvctl status scan_listener
$ srvctl status listener
$ srvctl status database -d orcl2

I spent many hours digging through log files, listener.ora, and tnsnames.ora, and running netca and netmgr. Nothing helped. Two other strange things that I eventually noticed:
- sqlplus connections to the SCAN address had intermittent failures.
- Successful connections were always to one instance, never the other.
All of the failures were ORA-12537: TNS:connection closed.

Eventually, I found MOS note 1069517.1, "ORA-12537 / ORA-12547 or TNS-12518 if Listener (including SCAN Listener) and Database are Owned by Different OS User". Sure enough, on one of the SPARC T5-4 servers the setgid bit wasn't set on the oracle executable, but it was set on the other server.

# export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
# cd $ORACLE_HOME/bin
# chmod u+s oradism nmo nmhs emtgtctl2 jssu extjob nmb oracle
# chmod g+s emtgtctl2 oracle
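Before running the chmod commands, it's worth comparing the two nodes; a quick check on each node shows whether the setuid/setgid bits are in place (on the healthy node the oracle binary should show up as -rwsr-s--x):

$ ls -l $ORACLE_HOME/bin/oracle $ORACLE_HOME/bin/emtgtctl2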


Sun

Installing VNC server on Solaris 11

I tend to forget the exact command line for installing VNC on Solaris 11. VNC isn't installed by default: # vncserver-bash: vncserver: command not found But it is in the repository. (You need to use the --accept option because java/jre-7  requires BCL License acceptance ) # pkg install --accept solaris-desktop            Packages to install: 463            Mediators to change:   1             Services to change:  16        Create boot environment:  No Create backup boot environment:  No DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED Completed                            463/463   68277/68277  729.0/729.0  6.2M/s PHASE                                          ITEMS Installing new actions                   104012/104012 Updating package state database                 Done Updating package cache                           0/0 Updating image state                            Done Creating fast lookup database                   Done Updating package cache                           2/2 You have new mail in /var/mail/root Now the VNC server is installed: $ vncserver You will require a password to access your desktops. Password:Verify:xauth:  file /home/jeff/.Xauthority does not exist New 'my-host:1 (root)' desktop is my-host:1 Creating default startup script /home/jeff/.vnc/xstartupStarting applications specified in /home/jeff/.vnc/xstartupLog file is /home/jeff/.vnc/my-host:1.log By default, you get the tiled window manager (twm), but I prefer Gnome. # vi /home/jeff/.vnc/xstartup At the bottom of xstartup, change "twm &" -to- "gnome-session &": Kill the session running twm and restart with Gnome: $ vncserver -kill :1Killing Xvnc process ID 2392 $ vncserver New 'my-host:1 (root)' desktop is my-host:1 Starting applications specified in /home/jeff/.vnc/xstartupLog file is /home/jeff/.vnc/my-host:1.log Now you're good to go. Visit the vncserver with a vncclient: You may also want to review: VNC Cut & Paste on Solaris 10 Solaris/x64 VNC with Cut & Paste Solaris 11 VNC Server is "blurry" or "smeared" Hope this helps.
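For reference, after the edit my xstartup ends up looking roughly like this; a sketch, since the stock file may carry a few extra lines (vncconfig -nowin is optional, but it is what makes cut & paste work without keeping a vncconfig window open):

#!/bin/sh
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
vncconfig -nowin &
gnome-session &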


Analytics

Quadratic Programming with Oracle R Enterprise

I wanted to use quadprog with ORE on a server running Oracle Solaris 11.2 on an Oracle SPARC T4 server. For background, see:

Oracle SPARC T4-2
http://docs.oracle.com/cd/E23075_01/
Oracle Solaris 11.2
http://www.oracle.com/technetwork/server-storage/solaris11/overview/index.html
quadprog: Functions to solve Quadratic Programming Problems
http://cran.r-project.org/web/packages/quadprog/index.html
Oracle R Enterprise ("ORE") 1.4
http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/ore-downloads-1502823.html

Problem: the path to Solaris Studio doesn't match my installation:

$ ORE CMD INSTALL quadprog_1.5-5.tar.gz
* installing to library ‘/u01/app/oracle/product/12.1.0/dbhome_1/R/library’
* installing *source* package ‘quadprog’ ...
** package ‘quadprog’ successfully unpacked and MD5 sums checked
** libs
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64   -PIC  -g  -c aind.f -o aind.o
bash: /opt/SunProd/studio12u3/solarisstudio12.3/bin/f95: No such file or directory
*** Error code 1
make: Fatal error: Command failed for target `aind.o'
ERROR: compilation failed for package ‘quadprog’
* removing ‘/u01/app/oracle/product/12.1.0/dbhome_1/R/library/quadprog’

$ ls -l /opt/solarisstudio12.3/bin/f95
lrwxrwxrwx   1 root     root          15 Aug 19 17:36 /opt/solarisstudio12.3/bin/f95 -> ../prod/bin/f90

Solution: a symbolic link:

$ sudo mkdir -p /opt/SunProd/studio12u3
$ sudo ln -s /opt/solarisstudio12.3 /opt/SunProd/studio12u3/

Now, it is all good:

$ ORE CMD INSTALL quadprog_1.5-5.tar.gz
* installing to library ‘/u01/app/oracle/product/12.1.0/dbhome_1/R/library’
* installing *source* package ‘quadprog’ ...
** package ‘quadprog’ successfully unpacked and MD5 sums checked
** libs
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64   -PIC  -g  -c aind.f -o aind.o
/opt/SunProd/studio12u3/solarisstudio12.3/bin/ cc -xc99 -m64 -I/usr/lib/64/R/include -DNDEBUG -KPIC  -xlibmieee  -c init.c -o init.o
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64  -PIC -g  -c -o solve.QP.compact.o solve.QP.compact.f
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64  -PIC -g  -c -o solve.QP.o solve.QP.f
/opt/SunProd/studio12u3/solarisstudio12.3/bin/f95 -m64   -PIC  -g  -c util.f -o util.o
/opt/SunProd/studio12u3/solarisstudio12.3/bin/ cc -xc99 -m64 -G -o quadprog.so aind.o init.o solve.QP.compact.o solve.QP.o util.o -xlic_lib=sunperf -lsunmath -lifai -lsunimath -lfai -lfai2 -lfsumai -lfprodai -lfminlai -lfmaxlai -lfminvai -lfmaxvai -lfui -lfsu -lsunmath -lmtsk -lm -lifai -lsunimath -lfai -lfai2 -lfsumai -lfprodai -lfminlai -lfmaxlai -lfminvai -lfmaxvai -lfui -lfsu -lsunmath -lmtsk -lm -L/usr/lib/64/R/lib -lR
installing to /u01/app/oracle/product/12.1.0/dbhome_1/R/library/quadprog/libs
** R
** preparing package for lazy loading
** help
*** installing help indices
  converting help for package ‘quadprog’
    finding HTML links ...
done     solve.QP                                html      solve.QP.compact                        html  ** building package indices ** testing if installed package can be loaded * DONE (quadprog) ====== Here is an example from http://cran.r-project.org/web/packages/quadprog/quadprog.pdf > require(quadprog) > Dmat <- matrix(0,3,3) > diag(Dmat) <- 1 > dvec <- c(0,5,0) > Amat <- matrix(c(-4,-3,0,2,1,0,0,-2,1),3,3) > bvec <- c(-8,2,0) > solve.QP(Dmat,dvec,Amat,bvec=bvec) $solution [1] 0.4761905 1.0476190 2.0952381 $value [1] -2.380952 $unconstrained.solution [1] 0 5 0 $iterations [1] 3 0 $Lagrangian [1] 0.0000000 0.2380952 2.0952381 $iact [1] 3 2 Here, the standard example is modified to work with Oracle R Enterprise require(ORE) ore.connect("my-name", "my-sid", "my-host", "my-pass", 1521) ore.doEval(   function () {     require(quadprog)   } ) ore.doEval(   function () {     Dmat <- matrix(0,3,3)     diag(Dmat) <- 1     dvec <- c(0,5,0)     Amat <- matrix(c(-4,-3,0,2,1,0,0,-2,1),3,3)     bvec <- c(-8,2,0)    solve.QP(Dmat,dvec,Amat,bvec=bvec)   } ) $solution [1] 0.4761905 1.0476190 2.0952381 $value [1] -2.380952 $unconstrained.solution [1] 0 5 0 $iterations [1] 3 0 $Lagrangian [1] 0.0000000 0.2380952 2.0952381 $iact [1] 3 2 Now I can combine the quadprog compute algorithms with the Oracle R Enterprise Database engine functionality: Scale to large datasets Access to tables, views, and external tables in the database, as well as those accessible through database links Use SQL query parallel execution Use in-database statistical and data mining functionality
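A closing footnote on the compiler-path issue: if you'd rather see up front which compiler paths R will try to use (and therefore where the symbolic link needs to point), the compiler variables live in R's Makeconf. A sketch, assuming the Oracle R Distribution follows the standard R layout under /usr/lib/64/R, as the -I flag in the build output above suggests:

$ egrep '^(CC|F77|FC) ' /usr/lib/64/R/etc/Makeconf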


Sun

SPARC T5-4 LDoms for RAC and WebLogic Clusters

This blog is part of the SPARC T5-4 RAC and WebLogic Cluster series:BackgroundSPARC T5-4 LDoms for RAC and WebLogic ClustersOnion SecurityDNS Bind Server configuration on Solaris 11.2NFS root access for Oracle RAC on Sun ZFS Storage 7x20 ApplianceI wanted to use two Oracle SPARC T5-4 servers to simultaneously host both Oracle RAC and a WebLogic Server Cluster. I chose to use Oracle VM Server for SPARC to create a cluster like this: There are plenty of trade offs and decisions that need to be made, for example: Rather than configuring the system by hand, you might want to use an Oracle SuperCluster T5-8 My configuration is similar to jsavit's: Availability Best Practices - Example configuring a T5-8 but I chose to ignore some of the advice. Maybe I should have included an  alternate service domain, but I decided that I already had enough redundancy Both Oracle SPARC T5-4 servers were to be configured like this: Cntl0.25  4  64GB                     App LDom                    2.75 CPU's                                        44 cores                                          704 GB              DB LDom      One CPU         16 cores         256 GB   The systems started with everything in the primary domain: # ldm list NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIMEprimary          active     -n-c--  UART    512   1023G    0.0%  0.0%  11m # ldm list-spconfigfactory-default [current]primary # ldm list -o core,memory,physioNAME             primary           CORE    CID    CPUSET    0      (0, 1, 2, 3, 4, 5, 6, 7)    1      (8, 9, 10, 11, 12, 13, 14, 15)    2      (16, 17, 18, 19, 20, 21, 22, 23)-- SNIP    62     (496, 497, 498, 499, 500, 501, 502, 503)    63     (504, 505, 506, 507, 508, 509, 510, 511) MEMORY    RA               PA               SIZE                0x30000000       0x30000000       255G    0x80000000000    0x80000000000    256G    0x100000000000   0x100000000000   256G    0x180000000000   0x180000000000   256G # Give this memory block to the DB LDom IO    DEVICE                           PSEUDONYM        OPTIONS    pci@300                          pci_0               pci@340                          pci_1               pci@380                          pci_2               pci@3c0                          pci_3               pci@400                          pci_4               pci@440                          pci_5               pci@480                          pci_6               pci@4c0                          pci_7               pci@300/pci@1/pci@0/pci@6        /SYS/RCSA/PCIE1     pci@300/pci@1/pci@0/pci@c        /SYS/RCSA/PCIE2     pci@300/pci@1/pci@0/pci@4/pci@0/pci@c /SYS/MB/SASHBA0     pci@300/pci@1/pci@0/pci@4/pci@0/pci@8 /SYS/RIO/NET0       pci@340/pci@1/pci@0/pci@6        /SYS/RCSA/PCIE3     pci@340/pci@1/pci@0/pci@c        /SYS/RCSA/PCIE4     pci@380/pci@1/pci@0/pci@a        /SYS/RCSA/PCIE9     pci@380/pci@1/pci@0/pci@4        /SYS/RCSA/PCIE10    pci@3c0/pci@1/pci@0/pci@e        /SYS/RCSA/PCIE11    pci@3c0/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE12    pci@400/pci@1/pci@0/pci@e        /SYS/RCSA/PCIE5     pci@400/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE6     pci@440/pci@1/pci@0/pci@e        /SYS/RCSA/PCIE7     pci@440/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE8     pci@480/pci@1/pci@0/pci@a        /SYS/RCSA/PCIE13    pci@480/pci@1/pci@0/pci@4        /SYS/RCSA/PCIE14    pci@4c0/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE15    pci@4c0/pci@1/pci@0/pci@4        /SYS/RCSA/PCIE16    pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c /SYS/MB/SASHBA1     
pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@4 /SYS/RIO/NET2    Added an additional service processor configuration: # ldm add-spconfig split # ldm list-spconfig factory-default primary split [current] And removed many of the resources from the primary domain: # ldm start-reconf primary# ldm set-core 4 primary# ldm set-memory 32G primary# ldm rm-io pci@340 primary# ldm rm-io pci@380 primary# ldm rm-io pci@3c0 primary# ldm rm-io pci@400 primary# ldm rm-io pci@440 primary# ldm rm-io pci@480 primary# ldm rm-io pci@4c0 primary# init 6 Needed to add resources to the guest domains: # ldm add-domain db # ldm set-core cid=`seq -s"," 48 63` db # ldm add-memory mblock=0x180000000000:256G db # ldm add-io pci@480 db # ldm add-io pci@4c0 db # ldm add-domain app # ldm set-core 44 app# ldm set-memory 704G  app # ldm add-io pci@340 app# ldm add-io pci@380 app# ldm add-io pci@3c0 app# ldm add-io pci@400 app# ldm add-io pci@440 app Needed to set up services: # ldm add-vds primary-vds0 primary # ldm add-vcc port-range=5000-5100 primary-vcc0 primary Needed to add a virtual network port for the WebLogic application domain: # ipadm NAME              CLASS/TYPE STATE        UNDER      ADDR lo0               loopback   ok           --         --    lo0/v4         static     ok           --         ...    lo0/v6         static     ok           --         ... net0              ip         ok           --         ...    net0/v4        static     ok           --         xxx.xxx.xxx.xxx/24    net0/v6        addrconf   ok           --         ....    net0/v6        addrconf   ok           --         ... net8              ip         ok           --         --    net8/v4        static     ok           --         ... # dladm show-phys LINK              MEDIA                STATE      SPEED  DUPLEX    DEVICE net1              Ethernet             unknown    0      unknown   ixgbe1 net0              Ethernet             up         1000   full      ixgbe0 net8              Ethernet             up         10     full      usbecm2 # ldm add-vsw net-dev=net0 primary-vsw0 primary # ldm add-vnet vnet1 primary-vsw0 app Needed to add a virtual disk to the WebLogic application domain: # format Searching for disks...done AVAILABLE DISK SELECTIONS:        0. c0t5000CCA02505F874d0 <HITACHI-H106060SDSUN600G-A2B0-558.91GB>           /scsi_vhci/disk@g5000cca02505f874           /dev/chassis/SPARC_T5-4.AK00084038/SYS/SASBP0/HDD0/disk        1. c0t5000CCA02506C468d0 <HITACHI-H106060SDSUN600G-A2B0-558.91GB>           /scsi_vhci/disk@g5000cca02506c468           /dev/chassis/SPARC_T5-4.AK00084038/SYS/SASBP0/HDD1/disk        2. c0t5000CCA025067E5Cd0 <HITACHI-H106060SDSUN600G-A2B0-558.91GB>           /scsi_vhci/disk@g5000cca025067e5c           /dev/chassis/SPARC_T5-4.AK00084038/SYS/SASBP0/HDD2/disk        3. 
c0t5000CCA02506C258d0 <HITACHI-H106060SDSUN600G-A2B0-558.91GB>           /scsi_vhci/disk@g5000cca02506c258           /dev/chassis/SPARC_T5-4.AK00084038/SYS/SASBP0/HDD3/disk Specify disk (enter its number): ^C # ldm add-vdsdev /dev/dsk/c0t5000CCA02506C468d0s2 HDD1@primary-vds0 # ldm add-vdisk HDD1 HDD1@primary-vds0 app Add some additional spice to the pot: # ldm set-variable auto-boot\\?=false db # ldm set-variable auto-boot\\?=false app # ldm set-var boot-device=HDD1 app Bind the logical domains: # ldm bind db # ldm bind app At the end of the process, the system is set up like this: # ldm list -o core,memory,physio NAME             primary          CORE     CID    CPUSET     0      (0, 1, 2, 3, 4, 5, 6, 7)     1      (8, 9, 10, 11, 12, 13, 14, 15)     2      (16, 17, 18, 19, 20, 21, 22, 23)     3      (24, 25, 26, 27, 28, 29, 30, 31) MEMORY     RA               PA               SIZE                0x30000000       0x30000000       32G IO     DEVICE                           PSEUDONYM        OPTIONS     pci@300                          pci_0               pci@300/pci@1/pci@0/pci@6        /SYS/RCSA/PCIE1     pci@300/pci@1/pci@0/pci@c        /SYS/RCSA/PCIE2     pci@300/pci@1/pci@0/pci@4/pci@0/pci@c /SYS/MB/SASHBA0     pci@300/pci@1/pci@0/pci@4/pci@0/pci@8 /SYS/RIO/NET0   ------------------------------------------------------------------------------ NAME             app              CORE     CID    CPUSET     4      (32, 33, 34, 35, 36, 37, 38, 39)     5      (40, 41, 42, 43, 44, 45, 46, 47)     6      (48, 49, 50, 51, 52, 53, 54, 55)     7      (56, 57, 58, 59, 60, 61, 62, 63)     8      (64, 65, 66, 67, 68, 69, 70, 71)     9      (72, 73, 74, 75, 76, 77, 78, 79)     10     (80, 81, 82, 83, 84, 85, 86, 87)     11     (88, 89, 90, 91, 92, 93, 94, 95)     12     (96, 97, 98, 99, 100, 101, 102, 103)     13     (104, 105, 106, 107, 108, 109, 110, 111)     14     (112, 113, 114, 115, 116, 117, 118, 119)     15     (120, 121, 122, 123, 124, 125, 126, 127)     16     (128, 129, 130, 131, 132, 133, 134, 135)     17     (136, 137, 138, 139, 140, 141, 142, 143)     18     (144, 145, 146, 147, 148, 149, 150, 151)     19     (152, 153, 154, 155, 156, 157, 158, 159)     20     (160, 161, 162, 163, 164, 165, 166, 167)     21     (168, 169, 170, 171, 172, 173, 174, 175)     22     (176, 177, 178, 179, 180, 181, 182, 183)     23     (184, 185, 186, 187, 188, 189, 190, 191)     24     (192, 193, 194, 195, 196, 197, 198, 199)     25     (200, 201, 202, 203, 204, 205, 206, 207)     26     (208, 209, 210, 211, 212, 213, 214, 215)     27     (216, 217, 218, 219, 220, 221, 222, 223)     28     (224, 225, 226, 227, 228, 229, 230, 231)     29     (232, 233, 234, 235, 236, 237, 238, 239)     30     (240, 241, 242, 243, 244, 245, 246, 247)     31     (248, 249, 250, 251, 252, 253, 254, 255)     32     (256, 257, 258, 259, 260, 261, 262, 263)     33     (264, 265, 266, 267, 268, 269, 270, 271)     34     (272, 273, 274, 275, 276, 277, 278, 279)     35     (280, 281, 282, 283, 284, 285, 286, 287)     36     (288, 289, 290, 291, 292, 293, 294, 295)     37     (296, 297, 298, 299, 300, 301, 302, 303)     38     (304, 305, 306, 307, 308, 309, 310, 311)     39     (312, 313, 314, 315, 316, 317, 318, 319)     40     (320, 321, 322, 323, 324, 325, 326, 327)     41     (328, 329, 330, 331, 332, 333, 334, 335)     42     (336, 337, 338, 339, 340, 341, 342, 343)     43     (344, 345, 346, 347, 348, 349, 350, 351)     44     (352, 353, 354, 355, 356, 357, 358, 359)     45     (360, 361, 362, 363, 364, 365, 366, 367)     46     
(368, 369, 370, 371, 372, 373, 374, 375)     47     (376, 377, 378, 379, 380, 381, 382, 383) MEMORY     RA               PA               SIZE                0x30000000       0x830000000      192G     0x4000000000     0x80000000000    256G     0x8080000000     0x100000000000   256G IO     DEVICE                           PSEUDONYM        OPTIONS     pci@340                          pci_1               pci@380                          pci_2               pci@3c0                          pci_3               pci@400                          pci_4               pci@440                          pci_5               pci@340/pci@1/pci@0/pci@6        /SYS/RCSA/PCIE3     pci@340/pci@1/pci@0/pci@c        /SYS/RCSA/PCIE4     pci@380/pci@1/pci@0/pci@a        /SYS/RCSA/PCIE9     pci@380/pci@1/pci@0/pci@4        /SYS/RCSA/PCIE10     pci@3c0/pci@1/pci@0/pci@e        /SYS/RCSA/PCIE11     pci@3c0/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE12     pci@400/pci@1/pci@0/pci@e        /SYS/RCSA/PCIE5     pci@400/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE6     pci@440/pci@1/pci@0/pci@e        /SYS/RCSA/PCIE7     pci@440/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE8 ------------------------------------------------------------------------------ NAME             db               CORE     CID    CPUSET     48     (384, 385, 386, 387, 388, 389, 390, 391)     49     (392, 393, 394, 395, 396, 397, 398, 399)     50     (400, 401, 402, 403, 404, 405, 406, 407)     51     (408, 409, 410, 411, 412, 413, 414, 415)     52     (416, 417, 418, 419, 420, 421, 422, 423)     53     (424, 425, 426, 427, 428, 429, 430, 431)     54     (432, 433, 434, 435, 436, 437, 438, 439)     55     (440, 441, 442, 443, 444, 445, 446, 447)     56     (448, 449, 450, 451, 452, 453, 454, 455)     57     (456, 457, 458, 459, 460, 461, 462, 463)     58     (464, 465, 466, 467, 468, 469, 470, 471)     59     (472, 473, 474, 475, 476, 477, 478, 479)     60     (480, 481, 482, 483, 484, 485, 486, 487)     61     (488, 489, 490, 491, 492, 493, 494, 495)     62     (496, 497, 498, 499, 500, 501, 502, 503)     63     (504, 505, 506, 507, 508, 509, 510, 511) MEMORY     RA               PA               SIZE                0x80000000       0x180000000000   256G IO     DEVICE                           PSEUDONYM        OPTIONS     pci@480                          pci_6               pci@4c0                          pci_7               pci@480/pci@1/pci@0/pci@a        /SYS/RCSA/PCIE13     pci@480/pci@1/pci@0/pci@4        /SYS/RCSA/PCIE14     pci@4c0/pci@1/pci@0/pci@8        /SYS/RCSA/PCIE15     pci@4c0/pci@1/pci@0/pci@4        /SYS/RCSA/PCIE16     pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c /SYS/MB/SASHBA1     pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@4 /SYS/RIO/NET2   Start the domains: # ldm start app LDom app started # ldm start db LDom db started Make sure to start the vntsd service that was created, above. # svcs -a | grep ldo disabled        8:38:38 svc:/ldoms/vntsd:default online          8:38:58 svc:/ldoms/agents:default online          8:39:25 svc:/ldoms/ldmd:default # svcadm enable vntsd Now use the MAC address to configure the Solaris 11 Automated Installation. 
Database Logical Domain # telnet localhost 5000 {0} ok devalias screen                   /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@7/display@0 disk7                    /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c/scsi@0/disk@p3 disk6                    /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c/scsi@0/disk@p2 disk5                    /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c/scsi@0/disk@p1 disk4                    /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c/scsi@0/disk@p0 scsi1                    /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@c/scsi@0 net3                     /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@4/network@0,1 net2                     /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@4/network@0 virtual-console          /virtual-devices/console@1 name                     aliases {0} ok boot net2 Boot device: /pci@4c0/pci@1/pci@0/pci@c/pci@0/pci@4/network@0  File and args: 1000 Mbps full duplex Link up Requesting Internet Address for xx:xx:xx:xx:xx:xx Requesting Internet Address for xx:xx:xx:xx:xx:xx WLS Logical Domain # telnet localhost 5001 {0} ok devalias hdd1                     /virtual-devices@100/channel-devices@200/disk@0 vnet1                    /virtual-devices@100/channel-devices@200/network@0 net                      /virtual-devices@100/channel-devices@200/network@0 disk                     /virtual-devices@100/channel-devices@200/disk@0 virtual-console          /virtual-devices/console@1 name                     aliases {0} ok boot net Boot device: /virtual-devices@100/channel-devices@200/network@0  File and args: Requesting Internet Address for xx:xx:xx:xx:xx:xx Requesting Internet Address for xx:xx:xx:xx:xx:xx Repeat the process for the second SPARC T5-4, install Solaris, RAC and WebLogic Cluster, and you are ready to go. Maybe buying a SuperCluster would have been easier.
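As a footnote, before moving on to the OS installs it's handy to confirm the bindings and find the guest console ports in one place; a small sketch using the same ldm subcommands:

# ldm list
# ldm list -o console app db        # shows the port each guest console was given by primary-vcc0
# ldm list-services primary         # confirms primary-vds0 and primary-vcc0 are in place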

This blog is part of the SPARC T5-4 RAC and WebLogic Cluster series: Background SPARC T5-4 LDoms for RAC and WebLogic Clusters Onion Security DNS Bind Server configuration on Solaris 11.2 NFS root access...

Sun

VNC Cut & Paste on Solaris 10

I love being able to cut & paste between my laptop (e-mail, web browser, etc.) and my VNC session. This functionality is controlled by vncconfig. From the man page: vncconfig is used to configure and control a running instance of Xvnc, or any other X server with the VNC extension. Note that it cannot be used to control VNC servers prior to version 4. I hate this message: $ vncconfig No VNC extension on display :1.0 When I hit the problem again, I found my way back to one of my old blog entries: Solaris/x64 VNC with Cut & Paste, but this time, the situation was slightly different: on SPARC, SFWvnc was not installed. When I re-read my old blog, it didn't seem to apply. But I found: $ which Xvnc /usr/local/bin/Xvnc $ pkginfo -L | grep vnc SUNWvncviewer SUNWxvnc $ /usr/sbin/pkgchk -l SUNWxvnc | grep Xvnc NOTE: Couldn't lock the package database. Pathname: /usr/X11/bin/Xvnc Pathname: /usr/X11/share/man/man1/Xvnc.1 Someone had put Xvnc version 3.x in /usr/local. Because /usr/local was near the start of my PATH, I wasn't using the version of Xvnc that is supplied with Solaris 10 Update 10. Again, the solution is simple. 1) Kill the running Xvnc process. 2) Start a new Xvnc process using the Solaris-supplied executable: $ export PATH=/usr/X11/bin:$PATH $ vncserver 3) Use a VNC viewer to visit the Xvnc server and run vncconfig: $ vncconfig & 4) Now, cut & paste works between my laptop (e-mail, web browser, etc.) and my VNC session. 5) You will want to make sure that vncconfig is started automatically in your .vnc/xstartup file (see the sketch below). Hope this helps!
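The simplest way to handle step 5 is to append vncconfig to the xstartup script so that clipboard sharing comes up with every new VNC session. A minimal sketch, assuming a default ~/.vnc/xstartup and that vncconfig is found via the PATH set in step 2 (the -nowin flag runs it without opening its control window):

$ echo "vncconfig -nowin &" >> ~/.vnc/xstartup
$ chmod +x ~/.vnc/xstartup

The next time the Xvnc session is started, cut & paste should work without having to launch vncconfig by hand.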


Sun

Using R to analyze Java G1 garbage collector log files

Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern application, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First (G1) Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. The logs are from a very early implementation of the G1 collector, prior to it becoming supported. The G1 collector is supported starting in JDK 7 update 4 and there are substantial performance improvements between the experimental version and the supported version. I was told that the Solaris/SPARC server was running a Java process launched using a command line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files.
There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. From the R project's home page, "R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS." If you would like to install R, there are many options for downloading R, including: one of the CRAN mirrors listed at http://cran.r-project.org/mirrors.html the Oracle R Distribution from https://oss.oracle.com/ORD/ Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000)No shared spaces configured.2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs]Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000)No shared spaces configured.} A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs]2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs]2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs]2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs]2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs]2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs]2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. 
$ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" > summary.txt $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' >> summary.txt "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -fBEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n")}####################### Save count data from lines that are at the start of each G1GC trace.# Each trace starts out like this:# {Heap before GC invocations=14 (full 0):# garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000)######################/{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1];FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; }}####################### Pull out time stamps that are in lines with this format:# 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs]######################/GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1);}####################### Heap sizes are in lines that look like this:# [ 4842M->4838M(9216M)] ######################/\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];}}####################### Emit an output line when you find input that looks like this:# [Times: user=1.41 sys=0.08, real=0.24 secs]######################/\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; }} The resulting summary is about 25X smaller than the original file, but still difficult for a human to digest. 
SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize...2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 94371842014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 94371842014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 94371842014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 94371842014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 94371842014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 94371842014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 94371842014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 94371842014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="")str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables:## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ...## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ...## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ...## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ...## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ...## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ...## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ...## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ...## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ...## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount## 1 2014-05-12T14:00:32.868-0700: 1.161 0## 2 2014-05-12T14:00:33.179-0700: 1.472 1## 3 2014-05-12T14:00:33.677-0700: 1.969 2## 4 2014-05-12T14:00:35.538-0700: 3.830 3## 5 2014-05-12T14:00:37.811-0700: 6.103 4## 6 2014-05-12T14:00:41.428-0700: 9.720 5## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize## 1 0 0.11 0.04 0.02 8192 1400 9437184## 2 0 0.05 0.01 0.02 5496 1672 9437184## 3 0 0.04 0.01 0.01 5768 2557 9437184## 4 0 0.21 0.05 0.04 22528 4907 9437184## 5 0 0.08 0.01 0.02 24576 7072 9437184## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. 
:1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need to be independently analyzed, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSizesummary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. 
par(mfrow=c(2,1))boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red")crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,]g1gc.df=g1gc.df[g1gc.df$RealTime < 400,]boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red")box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount## 8233 2014-05-12T23:15:43.903-0700: 20741 8316## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184## Delta## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo## ## Attaching package: 'zoo'## ## The following objects are masked from 'package:base':## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:"## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:"## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3)times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:")g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times)head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount## 2014-05-12 17:00:32.868 1.161 0 0## 2014-05-12 17:00:33.178 1.472 1 0## 2014-05-12 17:00:33.677 1.969 2 0## 2014-05-12 17:00:35.538 3.830 3 0## 2014-05-12 17:00:37.811 6.103 4 0## 2014-05-12 17:00:41.427 9.720 5 0## UserTime SysTime RealTime BeforeSize AfterSize## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336## TotalSize Delta## 2014-05-12 17:00:32.868 9437184 6792## 2014-05-12 17:00:33.178 9437184 3824## 2014-05-12 17:00:33.677 9437184 3211## 2014-05-12 17:00:35.538 9437184 17621## 2014-05-12 17:00:37.811 9437184 17504## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collection in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. 
Lets clip the data to the interesting period: par(mfrow=c(2,1))plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77")clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00"))plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77")box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. On the far right side, the steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cumulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage collection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage collection is surprisingly low. par(mfrow=c(2,1))boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red")hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red")box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA)legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a"))box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explanation for the large amount of referenced data resident in the Java heap. 
There are two possibilities: (1) There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the application needs to be debugged to identify why old objects are referenced when they are no longer needed. (2) The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.
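As a footnote to the data cleansing discussion, the Format 1 pipeline described above can be wrapped in one small script so the summary file is easy to regenerate. This is just a sketch of the same grep/sed commands shown earlier, assuming the log is named g1gc.log and that Rscript is on the PATH:

#!/bin/bash
# Rebuild summary.txt from a Format 1 G1GC log
echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" > summary.txt
grep "\[GC pause (young" g1gc.log | grep -v mark | \
  sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' >> summary.txt
# Quick sanity check of the cleansed data without leaving the shell
Rscript -e 'g1gc.df <- read.csv("summary.txt", row.names=NULL, stringsAsFactors=FALSE, sep=""); summary(g1gc.df)'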


Sun

Quadratic data in Oracle R Enterprise and Oracle Data Mining

I was working with some data which was stored in an Oracle database on a SPARC T4 server. I thought that the data had a quadratic component and I wanted to analyze the data using SQL Developer and Oracle Data Mining, a component of the Oracle Advanced Analytics Option. When I reviewed the initial analysis, I wasn't getting results that I had expected, and the fit of the model wasn't very good. I decided to feed some simple, synthetic quad data into Oracle Data Miner to ensure that I was using the tool properly. Oracle R Enterprise was used as the tool to create and view the synthetic data. From an R session that has the Oracle R Enterprise package installed, it is easy to access an Oracle Database: require(ORE) ## Loading required package: ORE## Loading required package: OREbase## ## Attaching package: 'OREbase'## ## The following object(s) are masked from 'package:base':## ## cbind, data.frame, eval, interaction, order, paste, pmax,## pmin, rbind, table## ## Loading required package: OREstats## Loading required package: MASS## Loading required package: OREgraphics## Loading required package: OREeda## Loading required package: OREdm## Loading required package: lattice## Loading required package: OREpredict## Loading required package: ORExml ore.connect("SCOTT", "orcl", "sparc-T4", "TIGER", 1521) ## Loading required package: ROracle## Loading required package: DBI The following R function, quad1(), is used to calculate "y=ax^2 + bx + c", where: - the data frame that is passed in has a column of x values. - a is in coefficients[feature, 1] - b is in coefficients[feature, 2] - c is in coefficients[feature, 3] The function will simply calculate points along a parabolic line and is more complicated than it needs to be. I will leave it in this complicated format so that I can extend it to work with more interesting functions, such as a parabolic surface, later.   quad1 <- function(df, coefficients) { feature <- 1 coefficients[feature, 1] * df[, feature] * df[, feature] + coefficients[feature, 2] * df[, feature] + coefficients[feature, 3]} The following function, genData(), creates random "x" data points and uses func() to calculate the y values that correspond to the random x values. genData <- function(nObservations, func, coefficients, nFeatures, scale) { dframe <- data.frame(x1 = rep(1, nObservations)) for (feature in seq(nFeatures)) { name <- paste("x", feature, sep = "") dframe[name] <- runif(nObservations, -scale[feature], scale[feature]) } dframe["y"] <- func(dframe, coefficients)return(dframe)} The following function, quadGraph(), is used for graphing. The points in dframe are displayed in a scatter plot. The coefficients for the known synthetic data is passed in and the corresponding line is sketched in blue. (Obviously, if you aren't working with synthetic data, it is unlikely that you will know the "true" coefficients.) The R model that is the best estimate of the data based on regression is passed in and sketched in blue. quadGraph <- function(dframe, coefficients = NULL, model = NULL, ...) {with(dframe, plot(x1, y))title(main = "Quadratic Fit")legend("topright", inset = 0.05, c("True", "Model"), lwd = c(2.5, 2.5), col = c("blue", "red")) xRange <- range(dframe[, "x1"]) smoothX <- seq(xRange[1], xRange[2], length.out = 50) trueY <- quad1(data.frame(smoothX), coefficients)lines(smoothX, trueY, col = "blue") new = data.frame(x1 = smoothX) y_estimated <- predict(model, new)lines(smoothX, y_estimated, col = "red")} Here are the settings that will be used. 
nFeatures <- 1 # one feature can sketch a line, 2 a surface, ...nObservations <- 20 # How many rows of data to create for modelingdegree <- 2 # 2 is quadratic, 3 is cubic, etcset.seed(2) # I'll get the same coefficients every time I run coefficients <- matrix(rnorm(nFeatures * (degree + 1)), nFeatures, degree + 1)scale <- (10^rpois(nFeatures, 2)) * rnorm(nFeatures, 3) Here, synthetic data is created that matches the quadratic function and the random coefficients. modelData <- genData(nObservations, quad1, coefficients, nFeatures, scale) We can make this exercise at least slightly more realistic by adding some irreducible error for the regression algorithm to deal with. Add noise. yRange <- range(modelData[, "y"])yJitter <- (yRange[2] - yRange[1])/10modelData["y"] <- modelData["y"] + rnorm(nObservations, 0, yJitter) Great. At this point I have good quadratic synthetic data which can be analyzed. Feed the synthetic data to the Oracle Database. oreDF <- ore.push(modelData)tableName <- paste("QuadraticSample_", nObservations, "_", nFeatures, sep = "")ore.drop(table = tableName)ore.create(oreDF, table = tableName) The Oracle R Enterprise function to fit the linear model works as expected. m = ore.lm(y ~ x1 + I(x1 * x1), dat = oreDF)summary(m) ## ## Call:## ore.lm(formula = y ~ x1 + I(x1 * x1), data = oreDF)## ## Residuals:## Min 1Q Median 3Q Max ## -2.149 -0.911 -0.156 0.888 1.894 ## ## Coefficients:## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 1.3264 0.4308 3.08 0.0068 ** ## x1 -0.0640 0.1354 -0.47 0.6428 ## I(x1 * x1) -0.8392 0.0662 -12.68 4.3e-10 ***## ---## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 1.28 on 17 degrees of freedom## Multiple R-squared: 0.912,Adjusted R-squared: 0.901 ## F-statistic: 87.7 on 2 and 17 DF, p-value: 1.1e-09 coefficients ## [,1] [,2] [,3]## [1,] -0.8969 0.1848 1.588 Notice that the "true" coefficients, that were used to create the synthetic data are close to the values from the regression. For example, the true "a" is stored in coefficients[1,1] = -0.8969 and is close to the model's I(x1 * x1) = -0.8392. Not bad given that the model was created from only 20 data points. quadGraph(modelData, coefficients, m) The 20 data points, which were calculated from the "true" equation, but with noisy irreducible error added, are shown in the graph. The model, estimated by ore.lm() from the 20 noisy data points, is close to true. At this point, my job is either complete, or ready to start, depending on your perspective. I'm happy that ore.lm() does a nice job of fitting, so maybe I'm done. But if you remember that my initial goal was to validate that SQL Developer and Oracle Data Miner work with quadratic data, my job has just begun. Now that I have known good quadratic synthetic data in the database, I'll use SQL Developer and the Oracle Data Mining to verify that everything is fine. One more step in R. Create a second Oracle Database table that will be used to test the regression model.  testData <- genData(nObservations, quad1, coefficients, nFeatures, scale)oreTestData <- ore.push(testData)tableName <- paste("QuadraticTest_", nObservations, "_", nFeatures, sep = "")ore.drop(table = tableName)ore.create(oreTestData, table = tableName)   Here is the SQL Developer workflow that will be used. The synthetic data is in the Oracle Database table "QuadraticSample_20_1". The "Regress Build" node will run linear regression on the synthetic data. 
The test data, which was generated using R in the previous paragraph, is stored in an Oracle Database table named "QuadraticTest_20_1". The Apply node will use the regression model that has been created and use the "x1" values from the test data, storing the y values in an Oracle Database table named "QUADTESTRESULTS".  So how did it work? A PhD in statistics would quickly tell you, "not well", and might look at you like you're an idiot if you don't know that a Model F Value Statistic of 3.25 isn't good. My more pedestrian approach is to plot the results of applying the model to the test data. The predictive confidence of the model that was created is poor: Pull the test result data into R for viewing: ore.sync()ore.attach()testResults <- ore.pull(QUADTESTRESULTS) ## Warning: ORE object has no unique key - using random order colnames(testResults)[1] <- "y" with(testResults, plot(x1, y))title(main = "Results of Applying Model to Test Data")  Hmm, that doesn't look parabolic to me: Now that I'm quite sure that SQL Developer and Oracle Data Mining aren't giving the expected fit, check through the advanced settings:  There it is!!  Set the feature generation to use quadratic candidates and re-run the model. The predictive confidence has improved.  Bring the new results back into R: Also, your statistician friends will be happy because the new model has a Model F Value Statistic of 124. Exciting, right?  Now, off to work on parabolic surfaces...
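If you want to spot-check the scored rows without going back into R, the results table can also be queried directly with SQL*Plus. A minimal sketch, reusing the SCOTT schema and the orcl database from the ore.connect() call above (the connect string is shown EZConnect-style and the column names are whatever the Apply node generated, so SELECT * is the safe choice):

$ sqlplus scott/TIGER@//sparc-T4:1521/orcl
SQL> select count(*) from QUADTESTRESULTS;
SQL> select * from QUADTESTRESULTS where rownum <= 10;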


Database

Redo Log Switches

In Gathering Database Statistics in a Test Harness I said that use of the Automatic Workload Repository (AWR) is fundamental to understanding Oracle Database performance. When I finished the scripts for the blog and ran the first test run, the AWR report was quite clear: I need to increase the size of the redo logs: Report Summary Top ADDM Findings by Average Active Sessions Finding Name Avg active sessions of the task Percent active sessions of finding Task Name Begin Snap Time End Snap Time Top SQL Statements 1.00 80.56 ADDM:1351308822_1_2046 24-Jan-14 11:36 24-Jan-14 11:42 Log File Switches 1.00 5.38 ADDM:1351308822_1_2046 24-Jan-14 11:36 24-Jan-14 11:42 Instance Activity Stats - Thread Activity Statistics identified by '(derived)' come from sources other than SYSSTAT Statistic Total per Hour log switches (derived) 13 126.78 Finding 2: Log File SwitchesImpact is .05 active sessions, 5.38% of total activity.-------------------------------------------------------Log file switch operations were consuming significant database time whilewaiting for checkpoint completion.This problem can be caused by use of hot backup mode on tablespaces. DML totablespaces in hot backup mode causes generation of additional redo.... Recommendation 2: Database Configuration Estimated benefit is .05 active sessions, 5.38% of total activity. ------------------------------------------------------------------ Action Increase the size of the log files to 1552 M to hold at least 20 minutes of redo information. Original log files and groups: SQL> select GROUP#,THREAD#,BYTES from v$log;     GROUP#    THREAD#      BYTES ---------- ---------- ----------      1         1        52428800      2         1        52428800      3         1        52428800 SQL> select MEMBER from v$logfile; MEMBER -------------------------------------------------------------------------------- /u01/app/oracle/oradata/orcl/redo03.log /u01/app/oracle/oradata/orcl/redo02.log /u01/app/oracle/oradata/orcl/redo01.log Create new ones: SQL > ALTER DATABASE   ADD LOGFILE GROUP 4 ('/u01/app/oracle/oradata/orcl/redo04.log')       SIZE 5G; SQL> ALTER DATABASE   ADD LOGFILE GROUP 5 ('/u01/app/oracle/oradata/orcl/redo05.log')       SIZE 5G; SQL> ALTER DATABASE   ADD LOGFILE GROUP 6 ('/u01/app/oracle/oradata/orcl/redo06.log')       SIZE 5G; Drop the old one: SQL> ALTER SYSTEM SWITCH LOGFILE;SQL> ALTER SYSTEM SWITCH LOGFILE;SQL> ALTER DATABASE DROP LOGFILE GROUP 1; SQL> ALTER DATABASE DROP LOGFILE GROUP 2; ERROR at line 1:ORA-01624: log 2 needed for crash recovery of instance orcl (thread 1)ORA-00312: online log 2 thread 1: '/u01/app/oracle/oradata/orcl/redo02.log' SQL> ALTER DATABASE DROP LOGFILE GROUP 3; SQL> alter system checkpoint; SQL> ALTER DATABASE DROP LOGFILE GROUP 2; (Dropping group 2 is OK after checkpoint) Done: SQL> select GROUP#,THREAD#,BYTES from v$log;     GROUP#    THREAD#       BYTES---------- ---------- ----------     4          1     5368709120     5          1     5368709120     6          1     5368709120
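After the switch to 5 GB log files, it is worth confirming that the switch rate really has dropped from the 126-per-hour figure that ADDM reported. A minimal sketch of a follow-up query, based on v$log_history so no extra AWR snapshot is needed:

SQL> select to_char(first_time, 'YYYY-MM-DD HH24') sample_hour, count(*) log_switches
     from v$log_history
     group by to_char(first_time, 'YYYY-MM-DD HH24')
     order by 1;

A handful of switches per hour after the change indicates that the new sizing is holding roughly 20 minutes of redo, as ADDM recommended.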


Sun

Gathering Database Statistics in a Test Harness

Use of the Automatic Workload Repository (AWR) is fundamental to understanding Oracle Database performance. By default, Oracle Database automatically generates snapshots of the performance data once every hour and retains the statistics in the workload repository for 8 days. You can also manually create snapshots, but this is usually not necessary. When benchmarking, it is helpful to be able to create an AWR report that covers the exact interval of benchmark execution. Since my benchmarks don't begin or end at the top of the hour, I take an AWR snapshot immediately before and after benchmark execution. At benchmark completion, an AWR report is generated that exactly covers the benchmark execution interval. Here is a simple shell script "snap.sh" that takes a new snapshot and returns the ID of the snapshot. I call this script before and after benchmark execution. #!/bin/bash result=$(sqlplus -S sys/SysPass as sysdba << EOF set head off set term off set echo off set trim on set trimspool on set feedback off set pagesize 0 exec sys.dbms_workload_repository.create_snapshot; select max(snap_id) from dba_hist_snapshot; EOF) echo $result Here is a script "gen_awr.sh". I call this script after benchmark execution with my preferred name for the report, and the before and after snap IDs. sqlplus / as sysdba << @EOF @$ORACLE_HOME/rdbms/admin/awrrpt html 1 $2 $3 $1 exit @EOF Here is a script "run.sh" that controls the gathering of data for the benchmark execution environment: Comments: run.sh #!/usr/bin/env bash ### # Usage ### [ $# -eq 0 ] && { echo "Usage: $0 <TestSize>"; exit 1; } ### # Set up ### size=$1 export DateStamp=`date +%y%m%d_%H%M%S` export DIR=/STATS/`hostname`_${size}_$DateStamp mkdir -p $DIR ### # Start gathering statistics ### vmstat 1 > $DIR/vmstat.out & vmstat_pid=$! Take an AWR snapshot before execution first=`./snap.sh` ### # Run the benchmark ### time sqlplus User/UserPass @RunAll $size > $DIR/std.out 2> $DIR/std.err ### # Stop gathering statistics ### Take an AWR snapshot after execution last=`./snap.sh` kill $vmstat_pid ### # Generate the report ### Generate the AWR report ./gen_awr.sh $size $first $last > /dev/null mv $size.lst $DIR/awr_$size.html ### # View the AWR report ### firefox $DIR/awr_$size.html After many benchmark runs, I have one directory per benchmark execution, named <nodename>_<size>_<date>_<time>, and each directory contains one AWR report that exactly covers the benchmark interval.
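One more sanity check that can save a confusing AWR report: before calling gen_awr.sh, verify that the two snapshot IDs returned by snap.sh really bracket the run. A minimal sketch, using the same sys connection style as snap.sh:

#!/bin/bash
# List the most recent AWR snapshots with their begin times
sqlplus -S sys/SysPass as sysdba << EOF
set pagesize 100
select snap_id,
       to_char(begin_interval_time, 'YYYY-MM-DD HH24:MI:SS') begin_time
from dba_hist_snapshot
where snap_id >= (select max(snap_id) - 5 from dba_hist_snapshot)
order by snap_id;
EOF

If the first and last IDs from run.sh appear here with timestamps just before and just after the benchmark window, the generated report will cover exactly the interval of interest.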


Sun

Hadoop on an Oracle SPARC T4-2 Server

I recently configured an Oracle SPARC T4-2 server to store and process a combination of 2 types of data: Critical and sensitive data. ACID transactions are required. Security is critical. This data needs to be stored in an Oracle Database. High-volume/low-risk data that needs to be processed using Apache Hadoop. This data is stored in HDFS. Based on the requirements, I configured the server using a combination of: Oracle VM Server for SPARC, used for hard partitioning of system resources such as CPU, memory, PCIe buses and devices. Oracle Solaris Zones to host a Hadoop cluster as shown in Orgad Kimchi's How to Set Up a Hadoop Cluster Using Oracle Solaris Zones. The configuration is shown in the following diagram: Hadoop Low CPU utilization: As you can see in the diagram, a T4 CPU is allocated for Hadoop map/reduce processing. The T4 CPU has 8 cores and 64 virtual processors, enabling it to simultaneously run up to 64 software threads: # psrinfo -pv The physical processor has 8 cores and 64 virtual processors (0-63)   The core has 8 virtual processors (0-7)   The core has 8 virtual processors (8-15)   The core has 8 virtual processors (16-23)   The core has 8 virtual processors (24-31)   The core has 8 virtual processors (32-39)   The core has 8 virtual processors (40-47)   The core has 8 virtual processors (48-55)   The core has 8 virtual processors (56-63)     SPARC-T4 (chipid 0, clock 2848 MHz) To test the Hadoop configuration, I created a large Hive table and launched Hadoop map/reduce processes using Hive: INSERT into table Table2  SELECT ColumnA, SUM(ColumnB)  FROM Table1  GROUP BY ColumnA; Out-of-the-box Hadoop performance was not well tuned. Simple Hive functions were not able to take advantage of the T4's CPU resources. While the Hive job was running, I ran iostat in the Global Zone and could see that: The CPU was not very busy. The 3 data-node disks would spike, but were not stressed.  # iostat -MmxPznc 60...
cpu us sy wt id 12  1  0 88                    extended device statistics                  r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device    2.2   19.1    0.0    0.1  0.0  0.0    0.0    1.8   0   1 (Primary LDOM root)   23.4    2.9    3.0    0.0  0.0  0.0    0.0    1.3   0   2 (data-node1)  105.5    5.5   11.7    0.0  0.0  0.4    0.0    3.2   0  10 (data-node2)    0.0   23.7    0.0    0.3  0.0  0.0    0.0    1.9   0   1 (Guest LDOM root)   24.2    2.9    1.9    0.0  0.0  0.0    0.0    1.2   0   2 (data-node3)    7.2   22.9    0.4    0.3  0.0  0.1    0.0    5.0   0   6 (/ZONES)      cpu us sy wt id 12  1  0 87                    extended device statistics                  r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device    2.3   19.2    0.0    0.1  0.0  0.0    0.0    1.8   0   1 (Primary LDOM root)    3.8    4.0    0.4    0.0  0.0  0.0    0.0    1.4   0   1 (data-node1)   47.9    5.4    4.1    0.0  0.0  0.1    0.0    1.6   0   3 (data-node2)    0.0   25.6    0.0    0.3  0.0  0.0    0.0    1.5   0   1 (Guest LDOM root)   38.2    3.9    3.2    0.0  0.0  0.1    0.0    1.4   0   3 (data-node3)    9.5   21.9    0.6    0.3  0.0  0.1    0.0    4.4   0   6 (/ZONES)      cpu us sy wt id 11  1  0 88                    extended device statistics                  r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device    5.3   18.6    0.1    0.1  0.0  0.1    0.0    4.4   0   4 (Primary LDOM root)    0.5    3.6    0.0    0.0  0.0  0.0    0.0    1.1   0   0 (data-node1)    0.4    3.6    0.0    0.0  0.0  0.0    0.0    0.8   0   0 (data-node2)    0.0   23.5    0.0    0.3  0.0  0.0    0.0    1.3   0   1 (Guest LDOM root)  124.9    7.2   10.3    0.0  0.0  0.2    0.0    1.8   0  10 (data-node3)    8.5   24.4    0.6    0.4  0.0  0.2    0.0    4.6   0   6 (/ZONES) To understand the low CPU activity, I looked at active software threads on the Hadoop cluster from the Global Zone.  Six (6) threads were active, one thread per process from map/reduce processes. 
$ prstat -mL   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID   27529 hadoop    98 0.5 0.0 0.0 0.0 1.0 0.0 0.0  14  46  3K   0 java/2 27534 hadoop    98 0.5 0.0 0.0 0.0 1.1 0.0 0.0  13  46  3K   0 java/2 27575 hadoop    98 0.5 0.0 0.0 0.0 1.1 0.0 0.0  15  53  3K   0 java/2 27577 hadoop    98 0.6 0.0 0.0 0.0 1.2 0.0 0.0  14  53  3K   0 java/2 27576 hadoop    98 0.6 0.0 0.0 0.0 1.1 0.8 0.0  46  57  3K   0 java/2 27578 hadoop    97 0.6 0.0 0.0 0.0 1.1 1.0 0.0  53  53  4K   0 java/2 27575 hadoop   5.8 0.0 0.0 0.0 0.0  94 0.0 0.0  19   4  26   0 java/32 27578 hadoop   5.6 0.0 0.0 0.0 0.0  94 0.0 0.0  19   5  35   0 java/33 27529 hadoop   5.6 0.0 0.0 0.0 0.0  94 0.0 0.0   2   8   2   0 java/32 27576 hadoop   5.5 0.0 0.0 0.0 0.0  95 0.0 0.0  21   6  36   0 java/33 27028 hadoop   1.2 1.3 0.0 0.0 0.0 0.0  97 0.1 254   5  2K   0 java/87 27028 hadoop   1.2 1.2 0.0 0.0 0.0 0.0  97 0.1 251   2  2K   0 java/86   958 root     1.9 0.1 0.0 0.0 0.0  98 0.4 0.0   9   4  27   0 fmd/36 27005 hadoop   1.2 0.8 0.0 0.0 0.0 0.0  98 0.0  99   2  2K   0 java/86 27005 hadoop   1.1 0.8 0.0 0.0 0.0 0.0  98 0.0  98   3  2K   0 java/87 11956 root     1.8 0.1 0.0 0.0 0.0 0.0  98 0.0  44   3 882   0 Xvnc/1 27016 hadoop   1.0 0.8 0.0 0.0 0.0 0.0  98 0.0  95   2  2K   0 java/86 27577 hadoop   1.7 0.0 0.0 0.0 0.0  98 0.0 0.0  18   1  28   0 java/32 27016 hadoop   1.0 0.7 0.0 0.0 0.0 0.0  98 0.0  93   2  2K   0 java/85 27576 hadoop   1.7 0.0 0.0 0.0 0.0  98 0.0 0.0  20   1  33   0 java/32 27577 hadoop   1.6 0.0 0.0 0.0 0.0  98 0.0 0.0  18   0  31   0 java/33Total: 619 processes, 4548 lwps, load averages: 3.46, 8.63, 27.90 From the Hadoop JobTracker UI, it could be seen that Hadoop was only scheduling a few of the tasks. This was because there were only 6 Map and 6 Reduce job slots. (See the "Map Task Capacity" and "Reduce Task Capacity"columns):  Increasing the number of Job Slots: The T4 CPU is able to run 64 software threads simultaneously, and over-subscribing the Hadoop CPU's is recommended. I enabled 25 Job Slots per node, for a total of 75, by adding these properties to mapred-site.xml:  <property>    <name>mapred.tasktracker.map.tasks.maximum</name>    <value>25</value>  </property>   <property>    <name>mapred.tasktracker.reduce.tasks.maximum</name>    <value>25</value>  </property> Improved Hadoop CPU utilization: After add Job slots, while running iostat in the Global Zone, I could see that: The CPU was very busy The 3 data-node disk were active (but not stressed) # iostat -MmxPznc 30... 
us sy wt id  98  2  0  0                     extended device statistics                  r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device     1.2   20.6    0.0    0.1  0.0  0.0    0.0    1.5   0   1 c0t5000CCA025261B74d0s0     0.9   12.9    0.0    0.1  0.0  0.0    0.0    3.2   0   2 c0t5000CCA025311930d0s0     1.0    9.7    0.0    0.0  0.0  0.0    0.0    1.2   0   1 c0t5000CCA02530B058d0s0     0.0   22.4    0.0    0.2  0.0  0.0    0.0    1.0   0   1 c0t5000CCA0250CB198d0s0     1.3   15.7    0.0    0.1  0.0  0.0    0.0    1.1   0   1 c0t5000CCA025324D98d0s0     2.8   28.3    0.1    0.4  0.0  0.1    0.0    3.5   0   4 c0t5000CCA0253C11B0d0s0      cpu  us sy wt id  98  2  0  0                     extended device statistics                  r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device     1.3   21.1    0.0    0.1  0.0  0.1    0.0    3.0   0   3 c0t5000CCA025261B74d0s0     0.6    7.2    0.0    0.0  0.0  0.0    0.0    1.3   0   0 c0t5000CCA025311930d0s0     0.8    5.7    0.0    0.0  0.0  0.0    0.0    1.5   0   0 c0t5000CCA02530B058d0s0     0.0   22.6    0.0    0.3  0.0  0.0    0.0    1.1   0   1 c0t5000CCA0250CB198d0s0     0.5    7.7    0.0    0.1  0.0  0.0    0.0    1.1   0   0 c0t5000CCA025324D98d0s0     2.2   24.7    0.1    0.3  0.0  0.1    0.0    2.4   0   2 c0t5000CCA0253C11B0d0s0      cpu  us sy wt id  98  2  0  0                     extended device statistics                  r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device     0.9   20.6    0.0    0.1  0.0  0.1    0.0    2.6   0   2 c0t5000CCA025261B74d0s0     0.5    7.1    0.0    0.0  0.0  0.0    0.0    0.9   0   0 c0t5000CCA025311930d0s0     4.8   10.3    0.4    0.0  0.0  0.0    0.0    1.2   0   1 c0t5000CCA02530B058d0s0     0.0   21.7    0.0    0.2  0.0  0.0    0.0    1.1   0   1 c0t5000CCA0250CB198d0s0     2.4   15.1    0.2    0.1  0.0  0.0    0.0    1.2   0   1 c0t5000CCA025324D98d0s0     4.8   25.3    0.5    0.4  0.0  0.2    0.0    5.4   0   5 c0t5000CCA0253C11B0d0s0 Now all 69 maps jobs can run in parallel: Because there are 75 map job slots: (See the "Map Task Capacity" and "Reduce Task Capacity"columns) Hadoop Memory Usage: For this simple case, I found that allocating additional memory to the Java map/reduce processes did not help, but for reference, this is how I increased the memory available to the Hive map/reduce processes: $ hive --hiveconf mapred.map.child.java.opts="-Xmx1000m -Xms500m"
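One practical note: the new mapred.tasktracker.*.tasks.maximum values only take effect after the tasktrackers are restarted. A minimal sketch of the restart on each data node, assuming the standard Hadoop 1.x layout and the HADOOP_PREFIX convention used elsewhere on this blog:

$ $HADOOP_PREFIX/bin/hadoop-daemon.sh stop tasktracker
$ $HADOOP_PREFIX/bin/hadoop-daemon.sh start tasktracker
$ hadoop job -list-active-trackers      # confirm that all trackers have rejoined the JobTracker

Once the trackers rejoin, the JobTracker UI should show the increased Map Task Capacity and Reduce Task Capacity.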


Sun

Hive 0.11 (May, 15 2013) and Rank() within a category

This is a follow up to a Stack Overflow question HiveQL and rank(): libjack recommended that I upgrade to Hive 0.11 (May, 15 2013) to take advantage of Windowing and Analytics functions. His recommendation worked immediately, but it took a while for me to find the right syntax to sort within categories. This blog entry records the correct syntax. 1. Sales Rep data Here is a CSV file with Sales Rep data: $ more reps.csv 1,William,2 2,Nadia,1 3,Daniel,2 4,Jana,1 Create a Hive table for the Sales Rep data: create table SalesRep (   RepID INT,   RepName STRING,   Territory INT   )   ROW FORMAT DELIMITED     FIELDS TERMINATED BY ','     LINES TERMINATED BY '\n'; ... and load the CSV into the Hive Sales Rep table: LOAD DATA  LOCAL INPATH '/home/hadoop/MyDemo/reps.csv'  INTO TABLE SalesRep; 2. Purchase Order data Here is a CSV file with PO data: $ more purchases.csv 4,1,100 2,2,200 2,3,600 3,4,80 4,5,120 1,6,170 3,7,140 Create a Hive table for the PO's: create table purchases (  SalesRepId INT,   PurchaseOrderId INT,   Amount INT  )   ROW FORMAT DELIMITED    FIELDS TERMINATED BY ','    LINES TERMINATED BY '\n'; ... and load CSV into the Hive PO table: LOAD DATA  LOCAL INPATH '/home/hadoop/MyDemo/purchases.csv' INTO TABLE purchases; 3. Hive JOIN So this is the underlining data that is being worked with: SELECT p.PurchaseOrderId, s.RepName, p.amount, s.TerritoryFROM purchases p JOIN SalesRep sWHERE p.SalesRepId = s.RepID; PO ID Rep Amount Territory 1 Jana 100 1 2 Nadia 200 1 3 Nadia 600 1 4 Daniel 80 2 5 Jana 120 1 6 William 170 2 7 Daniel 140 2 4. Hive Rank by Volume only SELECT   s.RepName, s.Territory, V.volume, rank() over (ORDER BY V.volume DESC) as rank FROM   SalesRep s   JOIN     ( SELECT       SalesRepId, SUM(amount) as Volume       FROM purchases       GROUP BY SalesRepId) V   WHERE V.SalesRepId = s.RepID   ORDER BY V.volume DESC; Rep Territory Amount Rank Nadia 1 800 1 Daniel 2 220 2 Jana 1 220 2 William 2 170 4 The ranking over the entire data set - Daniel is tied for second among all Reps. 5. Hive Rank within Territory, by Volume SELECT   s.RepName, s.Territory, V.volume,   rank() over (PARTITION BY s.Territory ORDER BY V.volume DESC) as rank FROM   SalesRep s   JOIN     ( SELECT       SalesRepId, SUM(amount) as Volume       FROM purchases       GROUP BY SalesRepId) V   WHERE V.SalesRepId = s.RepID   ORDER BY V.volume DESC; Rep Territory Amount Rank Nadia 1 800 1 Jana 1 220 2 Daniel 2 220 1 William 2 170 2 The ranking is within the territory - Daniel is the best is his territory. 6. FYI: this example was developed on a SPARC T4 server with Oracle Solaris 11 and Apache Hadoop 1.0.4
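Once the syntax is settled, the per-territory ranking is easier to reuse from a script than from the interactive Hive shell. A minimal sketch, assuming the query from step 5 has been saved to a file (the file name here is just a placeholder):

$ hive -f /home/hadoop/MyDemo/rank_in_territory.hql > /home/hadoop/MyDemo/rank_in_territory.out
$ cat /home/hadoop/MyDemo/rank_in_territory.out

Hive 0.11 also provides dense_rank() and row_number() in the same windowing framework, so the OVER (PARTITION BY ... ORDER BY ...) clause shown above carries over unchanged if a different ranking behavior is needed.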


Sun

Ganglia on Solaris 11.1

Here are some notes that I took while building Ganglia Core 3.6.0 and Ganglia Web 3.5.7 with Solaris Studio 12.3 and installing on Solaris 11.1. These notes are only intended augment (not replace) other Ganglia install guides. 1) Add a ganglia user to build with (This is an optional step, you may build as any user, gmond will run as root )     # useradd -d localhost:/export/home/ganglia -m ganglia     # passwd ganglia     # usermod -R root ganglia     # echo "ganglia ALL=(ALL) ALL" > /etc/sudoers.d/ganglia     # chmod 440 /etc/sudoers.d/ganglia # su - ganglia 2) These packages are needed:     $ sudo pkg install system/header gperf glibmm apache-22 php-53 apache-php53 apr-util-13 libconfuse rrdtool $ sudo svcadm enable apache22 3) Download and unpack: $ gzip -dc ganglia-3.6.0.tar.gz | tar xvf - $ gzip -dc ganglia-web-3.5.7.tar.gz | tar xvf - $ cd ganglia-3.6.0 4) Compiler error when building libgmond.c: "default_conf.h", line 75: invalid directive". In lib/default_conf.h, Removed the 3 lines with a '#' character. udp_recv_channel {\n\ mcast_join = 239.2.11.71\n\ port = 8649\n\   bind = 239.2.11.71\n\ retry_bind = true\n\   # Size of the UDP buffer. If you are handling lots of metrics you really\n\   # should bump it up to e.g. 10MB or even higher.\n\ # buffer = 10485760\n\ }\n\ 5) Compiler error: "data_thread.c", line 143: undefined symbol: FIONREAD. In gmetad/data_thread.c, added: #include <sys/filio.h> 6) Runtime error: "Cannot load /usr/local/lib/ganglia/modcpu.so metric module: ld.so.1: gmond: fatal: relocation error: file /usr/local/lib/ganglia/modcpu.so: symbol cpu_steal_func: referenced symbol not found". Stub added to the bottom of ./libmetrics/solaris/metrics.c g_val_t cpu_steal_func ( void ) {   static g_val_t val=0; return val; } 7) Build Ganglia Core 3.6.0 without gmetad for all of the machines in the cluster, except a primary node: $ export PATH=/usr/local/bin:/usr/local/sbin:$PATH $ export LD_LIBRARY_PATH=/usr/local/lib:/usr/apr/1.3/lib $ export PKG_CONFIG_PATH=/usr/apr/1.3/lib/pkgconfig $ cd ganglia-3.6.0 $ ./configure $ make 8) Install Ganglia Core 3.6.0 (on all machines in the cluster): $ sudo make install $ gmond --default_config > /tmp/gmond.conf $ sudo cp /tmp/gmond.conf /usr/local/etc/gmond.conf $ sudo vi /usr/local/etc/gmond.conf  # (Remove these lines) metric {    name = "cpu_steal" value_threshold = "1.0"   title = "CPU steal" } 9) Start gmond as root (on all machines in the cluster): # export PATH=/usr/local/bin:/usr/local/sbin:$PATH # export LD_LIBRARY_PATH=/usr/local/lib:/usr/apr/1.3/lib # gmond 7) Build Ganglia Core 3.6.0 with gmetad for the primary node:     $ export PATH=/usr/local/bin:/usr/local/sbin:$PATH     $ export LD_LIBRARY_PATH=/usr/local/lib:/usr/apr/1.3/lib     $ export PKG_CONFIG_PATH=/usr/apr/1.3/lib/pkgconfig     $ cd ganglia-3.6.0     $ ./configure --with-gmetad     $ make 10) Install ganglia-web-3.5.7 (on a primary server)     $ cd ganglia-web-3.5.7     $ vi Makefile # Set these variables    GDESTDIR = /var/apache2/2.2/htdocs/    APACHE_USER =  webservd     $ sudo make install     $ sudo mkdir -p /var/lib/ganglia/rrds $ sudo chown -Rh nobody:nobody /var/lib/ganglia 11) Start gmond and gmetad on the primary node 12) You need to remove the "It works" page so that index.php is the default     # sudo rm /var/apache2/2.2/htdocs/index.html Now you can visit the primary node with a web browser.
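To confirm that gmond is actually answering before pointing the web front end at it, you can pull the XML dump from the port configured in gmond.conf (8649 in the default config generated above). A minimal sketch, run on any node in the cluster:

$ telnet localhost 8649 > /tmp/gmond.xml
$ grep "<HOST NAME" /tmp/gmond.xml

Each node that gmond knows about should appear as a <HOST NAME="..."> element; if the list is empty, check the udp_recv_channel settings in /usr/local/etc/gmond.conf before debugging the web tier.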


Sun

Adding users in Solaris 11 with power like the initial account

During Solaris 11.1 installation, the system administrator is prompted for a user name and password which will be used to create an unprivileged account. For security reasons, by default, root is a role, not a user, therefore the initial login can't be to a root account. The first login must use the initial unprivileged account. Later, the initial unprivileged user can acquire privileges through either "su" or "sudo". For enterprise class deployments, the system administrator should be familiar with RBAC and create users with least privileges. In contrast, I'm working in a lab environment and want to be able to simply and quickly create new users with power like the initial user. With Solaris 10, this was straight forward, but Solaris 11 adds a couple of twists. Create a new user jenny in Solaris 11: # useradd -d localhost:/export/home/jenny -m jenny # passwd jenny Jenny can't su to root: jenny@app61:~$ su - Password: Roles can only be assumed by authorized users su: Sorry Because Jenny doesn't have that role: jenny@app61:~$ roles No roles Give her the role: root@app61:~# usermod -R root jenny And then Jenny can su to root: jenny@app61:~$ roles root jenny@app61:~$ su - Password: Oracle Corporation      SunOS 5.11      11.1    September 2012 You have new mail. root@app61:~# But even when jenny has the root role, she can't use sudo: jenny@app61:~$ sudo -l Password: Sorry, user jenny may not run sudo on app61. jenny@app61:~$ sudo touch /jenny Password: jenny is not in the sudoers file.  This incident will be reported. Oh no, she is in big trouble, now. User jeff was created as the initial account, and he can use sudo: jeff@app61:~$ sudo -l Password: User jeff may run the following commands on this host:     (ALL) ALL But jeff isn't in the sudoers file: root@app61:~# grep jeff /etc/sudoers So how do you make jenny as powerful as jeff with respect to sudo? Turns out that jeff, created during the Solaris installation, is in here: root@app61:~# cat /etc/sudoers.d/svc-system-config-user jeff ALL=(ALL) ALL My coworker, Andrew, offers the following advice: "The last line of /etc/sudoers is a directive to read "drop-in" files from the /etc/sudoers.d directory. You can still edit /etc/sudoers. It may better to leave svc-system-config-user alone and create another drop-in file for local edits. If you want to edit sudoers as part of an application install then you should create a drop-in for the application - this makes the edits easy to undo if you remove the application. If you have multiple drop-ins in /etc/sudoers.d they are processed in alphabetical (sorted) order.  There are restrictions on file names and permissions for drop-ins. The permissions must be 440 (read only for owner and group) and the file name can't have a dot or ~ in it. These are in the very long man page."
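Following Andrew's advice, the cleanest way to give jenny the same sudo rights as jeff is a separate drop-in file rather than an edit to the installer-created svc-system-config-user file. A minimal sketch (note the 440 permissions and the dot-free file name that the drop-in rules require):

root@app61:~# echo "jenny ALL=(ALL) ALL" > /etc/sudoers.d/jenny
root@app61:~# chmod 440 /etc/sudoers.d/jenny
root@app61:~# su - jenny -c "sudo -l"

After this, "sudo -l" should report that jenny may run (ALL) ALL on app61, just like jeff.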


Sun

Debugging Hadoop using Solaris Studio in a Solaris 11 Zone

I've found Orgad Kimchi's How to Set Up a Hadoop Cluster Using Oracle Solaris Zones to be very useful, however, for a development environment, it is too complex. When map/reduce tasks are running in a clustered environment, it is challenging to isolate bugs. Debugging is easier when working within a standalone Hadoop installation. I've put the following instructions together for installation of a standalone Hadoop configuration in a Solaris Zone with Solaris Studio for application development. A lovely feature of Solaris is that your global zone may host both a Hadoop cluster set up in a manner similar to Orgad's instructions and simultaneously host a zone for development that is running a Hadoop standalone configuration. Create the ZoneThese instructions assume that Solaris 11.1 is already running in the Global Zone. Add the Hadoop Studio Zone # dladm create-vnic -l net0 hadoop_studio # zonecfg -z hadoop-studioUse 'create' to begin configuring a new zone.zonecfg:hadoop-studio> createcreate: Using system default template 'SYSdefault'zonecfg:hadoop-studio> set zonepath=/ZONES/hadoop-studiozonecfg:hadoop-studio> add net zonecfg:hadoop-studio:net> set physical=hadoop_studiozonecfg:hadoop-studio:net> endzonecfg:hadoop-studio> verifyzonecfg:hadoop-studio> commitzonecfg:hadoop-studio> exit Install and boot the zone # zoneadm -z hadoop-studio install # zoneadm -z hadoop-studio boot Login to the zone console to set the network, time, root password, and unprivileged user. # zlogin -C hadoop-studio After the zone's initial configuration steps, nothing else needs to be done from within the global zone. You should be able to log into the Hadoop Studio zone with ssh as the unprivileged user and gain privileges with "su" and "sudo". All of the remaining instructions are from inside the Hadoop Studio Zone. Install extra Solaris software and set up the development environment I like to start with both JDK's installed and not rely on the "/usr/java" symbolic link: # pkg install  jdk-6 # pkg install --accept jdk-7 Verify the JDKs: # /usr/jdk/instances/jdk1.6.0/bin/java -version java version "1.6.0_35" Java(TM) SE Runtime Environment (build 1.6.0_35-b10) Java HotSpot(TM) Server VM (build 20.10-b01, mixed mode) # /usr/jdk/instances/jdk1.7.0/bin/java -version java version "1.7.0_07" Java(TM) SE Runtime Environment (build 1.7.0_07-b10) Java HotSpot(TM) Server VM (build 23.3-b01, mixed mode) Add VNC Remote Desktop software # pkg install --accept solaris-desktop Create a Hadoop user: # groupadd hadoop # useradd -d localhost:/export/home/hadoop -m -g hadoop hadoop # passwd hadoop# usermod -R root hadoop Edit /home/hadoop/.bashrc: export PATH=/usr/bin:/usr/sbinexport PAGER="/usr/bin/less -ins"typeset +x PS1="\u@\h:\w\\$ " # Hadoopexport HADOOP_PREFIX=/home/hadoop/hadoopexport PATH=$HADOOP_PREFIX/bin:$PATH # Javaexport JAVA_HOME=/usr/jdk/instances/jdk1.6.0export PATH=$JAVA_HOME/bin:$PATH # Studioexport PATH=$PATH:/opt/solarisstudio12.3/binalias solstudio='solstudio --jdkhome /usr/jdk/instances/jdk1.6.0' Edit /home/hadoop/.bash_profile: . ~/.bashrc And make sure that the ownership and permission make sense: # ls -l /home/hadoop/.bash*       -rw-r--r--   1 hadoop   hadoop        12 May 22 05:24 /home/hadoop/.bash_profile -rw-r--r--   1 hadoop   hadoop       372 May 22 05:24 /home/hadoop/.bashrc Now is a good time to a start remote VNC desktop for this zone: # su - hadoop $ vncserver You will require a password to access your desktops. 
Password: Verify: xauth:  file /home/hadoop/.Xauthority does not exist New 'hadoop-studio:1 ()' desktop is hadoop-studio:1 Creating default startup script /home/hadoop/.vnc/xstartup Starting applications specified in /home/hadoop/.vnc/xstartup Log file is /home/hadoop/.vnc/hadoop-studio:1.log Access the remote desktop with your favorite VNC client The default 10 minute time out on the VNC desktop is too fast for my preferences: System -> Preferences -> Screensaver   Display Modes:   Blank after: 100   Close the window (I always look for a "save" button, but no, just close the window without explicitly saving.) Download and Install HadoopFor this article, I used the "12 October, 2012 Release 1.0.4" release. Download the Hadoop tarball and copy it into the home directory of hadoop: $ ls -l hadoop-1.0.4.tar.gz -rw-r--r--   1 hadoop   hadoop   62793050 May 21 12:03 hadoop-1.0.4.tar.gz Unpack the tarball into the home directory of the hadoop user: $ gzip -dc hadoop-1.0.4.tar.gz  | tar -xvf - $ mv hadoop-1.0.4 hadoop Hadoop comes pre-configured in Standalone ModeEdit /home/hadoop/hadoop/conf/hadoop-env.sh, and set JAVA_HOME: export JAVA_HOME=/usr/jdk/instances/jdk1.6.0 That is all. Now, you can run a Hadoop example: $ hadoop jar hadoop/hadoop-examples-1.0.4.jar pi 2 10 Number of Maps  = 2 Samples per Map = 10 Wrote input for Map #0 Wrote input for Map #1 Starting Job ... Job Finished in 10.359 seconds Estimated value of Pi is 3.80000000000000000000 Install Solaris Studio: Visit https://pkg-register.oracle.com/ to obtain Oracle_Solaris_Studio_Support.key.pem, Oracle_Solaris_Studio_Support.certificate.pem and follow the instructions for "pkg set-publisher" and "pkg update" or "pkg install" # sudo pkg set-publisher \           -k /var/pkg/ssl/Oracle_Solaris_Studio_Support.key.pem \           -c /var/pkg/ssl/Oracle_Solaris_Studio_Support.certificate.pem \           -G '*' -g https://pkg.oracle.com/solarisstudio/support solarisstudio # pkg install developer/solarisstudio-123/* If your network requires a proxy, you will need set the proxy before starting Solaris Studio: Start Solaris Studio: $ solstudio (Notice the alias in .bashrc that adds --jdkhome to the solstudio start up command.) Go to "Tools -> Plugins. Click on "Reload Catalog" Load the Java SE plugins. I ran into a problem when the Maven plug in was installed. Something that I should diagnose at a future date. Create a New Project: File -> New Project Step 1: - Catagory: Java - Project: Java Application- Next Step 2: Fill in similar to this: Copy the example source into the project: $ cp -r \     $HADOOP_PREFIX/src/examples/org/apache/hadoop/examples/* \     ~/SolStudioProjects/examples/src/org/apache/hadoop/examples/ Starting to look like a development environment: Modify the Project to compile with Hadoop jars. Right-click on the project and select "Properties" Add in the necessary Hadoop compile jars: I found that I needed these jars at run time: Add Program Arguments (2 10): Now, if you click on the "Run" button. PiEstimators will run inside the IDE: And the set up behaves as expected if you set a break point and click on "Debug":
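Before attaching the IDE, I find it reassuring to run one more example end-to-end from the command line. This is just the classic wordcount smoke test, a sketch that assumes the hadoop user's home directory and the tarball layout described above (the exact output file name may vary):

    $ mkdir ~/input
    $ cp ~/hadoop/conf/*.xml ~/input
    $ hadoop jar ~/hadoop/hadoop-examples-1.0.4.jar wordcount ~/input ~/output
    $ head ~/output/part-r-00000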


Sun

non-interactive zone configuration

When creating new Solaris zones, at initial boot up, the system administrator is prompted for the new hostname, network settings, etc. of the new zone. I get tired of the brittle process of manually entering the initial settings and I prefer to be able to automate the process. I had previously figured out the process for Solaris 10, but I've only recently figured out the process for Solaris 11.

As a review, with Solaris 10, use your favorite editor to create a sysidcfg file:

system_locale=C
terminal=dtterm
security_policy=NONE
network_interface=primary {
                hostname=app-41
}
name_service=DNS {
    domain_name=us.mycorp.com
    name_server=232.23.233.33,154.45.155.15,77.88.21.211
    search=us.mycorp.com,yourcorp.com,thecorp.com
}
nfs4_domain=dynamic
timezone=US/Pacific
root_password=xOV2PpE67YUzY

1) Solaris 10 Install: Using sysidcfg to avoid answering the configuration questions in a newly installed zone. After the "zoneadm -z app-41 install" you can copy the sysidcfg file to "/ZONES/app-41/root/etc/sysidcfg" (assuming your "zonepath" is "/ZONES/app-41") and the initial boot process will read the settings from the file and not prompt the system administrator to manually enter the settings.

2) Solaris 10 Clone: Using sysidcfg when cloning the zone. I used a similar trick on Solaris 10 when cloning old zone "app-41" to new zone "app-44":

#  zonecfg -z app-41 export | sed -e 's/app-41/app-44/g' | zonecfg -z app-44
#  zoneadm -z app-44 clone app-41
#  cat /ZONES/app-41/root/etc/sysidcfg | sed -e 's/app-41/app-44/g' > /ZONES/app-44/root/etc/sysidcfg
#  zoneadm -z app-44 boot

With Solaris 11, instead of a small human readable file containing the configuration information, the information is contained in an XML file that would be difficult to create using an editor. Instead, create the initial profile by executing "sysconfig":

# sysconfig create-profile -o sc_profile.xml
# mkdir /root/profiles/app-61
# mv sc_profile.xml /root/profiles/app-61/sc_profile.xml

The new XML format is longer, so I won't include it in this blog entry; it is left as an exercise for the reader to review the file that has been created.

1) Solaris 11 Install:

# dladm create-vnic -l net0 app_61
# zonecfg -z app-61
Use 'create' to begin configuring a new zone.
zonecfg:app-61> create
create: Using system default template 'SYSdefault'
zonecfg:app-61> set zonepath=/ZONES/app-61
zonecfg:app-61> add net
zonecfg:app-61:net> set physical=app_61
zonecfg:app-61:net> end
zonecfg:app-61> verify
zonecfg:app-61> commit
zonecfg:app-61> exit

# zoneadm -z app-61 install -c /root/profiles/app-61
# zoneadm -z app-61 boot
# zlogin -C app-61

2) Solaris 11 Clone: If you want to clone app-61 to app-62 and have an existing sc_profile.xml, you can re-use most of the settings and only adjust what has changed:

# dladm create-vnic -l net0 app_62
# zoneadm -z app-61 halt
# mkdir /root/profiles/app-62
# sed \
-e 's/app-61/app-62/g' \
-e 's/app_61/app_62/g' \
-e 's/11.22.33.61/11.22.33.62/g' \
< /root/profiles/app-61/sc_profile.xml \
> /root/profiles/app-62/sc_profile.xml
# zonecfg -z app-61 export | sed -e 's/61/62/g' | zonecfg -z app-62
# zoneadm -z app-62 clone -c /root/profiles/app-62 app-61
# zoneadm -z app-62 boot
# zlogin -C app-62

I hope this trick saves you some time and makes your process less brittle.
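To make the Solaris 11 clone step repeatable, the commands above can be wrapped in a small script. This is only a sketch: the zone names, VNIC names, and IP addresses are the example values from this post, and the sed substitutions assume the same naming pattern (app-NN / app_NN), so adapt it to your environment:

    #!/usr/bin/env bash
    # Clone a Solaris 11 zone non-interactively, following the steps above.
    # Usage: ./clone_zone.sh app-61 app-62 11.22.33.61 11.22.33.62
    SRC=$1 DST=$2 SRCIP=$3 DSTIP=$4
    SRCVNIC=${SRC/-/_}      # app-61 -> app_61, matching the VNIC naming above
    DSTVNIC=${DST/-/_}

    dladm create-vnic -l net0 $DSTVNIC
    zoneadm -z $SRC halt
    mkdir -p /root/profiles/$DST
    sed -e "s/$SRC/$DST/g" -e "s/$SRCVNIC/$DSTVNIC/g" -e "s/$SRCIP/$DSTIP/g" \
        < /root/profiles/$SRC/sc_profile.xml > /root/profiles/$DST/sc_profile.xml
    zonecfg -z $SRC export | sed -e "s/$SRC/$DST/g" -e "s/$SRCVNIC/$DSTVNIC/g" | zonecfg -z $DST
    zoneadm -z $DST clone -c /root/profiles/$DST $SRC
    zoneadm -z $DST boot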


Sun

Hadoop Java Error logs

I was having trouble isolating a problem with "reduce" tasks running on Hadoop slave servers. After poking around on the Hadoop slave, I found an interesting lead in /var/log/hadoop/userlogs/job_201302111641_0057/attempt_201302111641_0057_r_000001_1/stdout:

$ cat /tmp/hadoop-hadoop/mapred/local/userlogs/job_201302111641_0059/attempt_201302111641_0059_r_000001_1/stdout
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xfe67cb31, pid=25828, tid=2
#
# JRE version: 6.0_35-b10
# Java VM: Java HotSpot(TM) Server VM (20.10-b01 mixed mode solaris-x86 )
# Problematic frame:
# C  [libc.so.1+0xbcb31]  pthread_mutex_trylock+0x29
#
# An error report file with more information is saved as:
# /tmp/hadoop-hadoop/mapred/local/taskTracker/hadoop/jobcache/job_201302111641_0059/attempt_201302111641_0059_r_000001_1/work/hs_err_pid25828.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

The HotSpot crash log (hs_err_pid25828.log in my case) will be very interesting because it contains information obtained at the time of the fatal error, including the following information, where possible:

- The operating exception or signal that provoked the fatal error
- Version and configuration information
- Details on the thread that provoked the fatal error and the thread's stack trace
- The list of running threads and their state
- Summary information about the heap
- The list of native libraries loaded
- Command line arguments
- Environment variables
- Details about the operating system and CPU

Great, but hs_err_pid25654.log had been cleaned up before I could get to it. In fact, I found that the hs_err_pid.log files were available for less than a minute and they were always gone before I could capture one. To try to retain the Java error log file, my first incorrect guess was:

 <property>
   <name>keep.failed.task.files</name>
   <value>true</value>
 </property>

My next approach was to add "-XX:ErrorFile=/tmp/hs_err_pid%p.log" to the Java command line for the reduce task. When I tried adding the Java option to HADOOP_OPTS in /usr/local/hadoop/conf/hadoop-env.sh, I realized that this setting isn't applied to the Map and Reduce Task JVMs. Finally, I found that adding the Java option to the mapred.child.java.opts property in mapred-site.xml WORKED!!

$ cat /usr/local/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>p3231-name-node:8021</value>
     </property>
     <property>
         <name>mapred.child.java.opts</name>
         <value>-XX:ErrorFile=/tmp/hs_err_pid%p.log</value>
     </property>
</configuration>

Now I can view the Java error logs on my Hadoop slaves:

$ ls -l /tmp/*err*
-rw-r--r--   1 hadoop   hadoop     15626 May 16 15:42 /tmp/hs_err_pid10028.log
-rw-r--r--   1 hadoop   hadoop     15795 May 16 15:43 /tmp/hs_err_pid10232.log
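Since the crash logs now land in /tmp on each slave, a small collection script saves time when several slaves are involved. This is a sketch that assumes passwordless ssh as the hadoop user and a slaves.txt file listing the slave hostnames; both are my assumptions, not part of the setup above:

    #!/usr/bin/env bash
    # Gather HotSpot crash logs from the Hadoop slaves into one directory.
    mkdir -p ./hs_err_logs
    for slave in `cat slaves.txt`
    do
      scp "hadoop@$slave:/tmp/hs_err_pid*.log" ./hs_err_logs/ 2>/dev/null
    done
    # The "Problematic frame" line is usually the quickest clue.
    grep "Problematic frame" ./hs_err_logs/*.log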


Sun

ZFS for Database Log Files

I've been troubled by drop outs in CPU usage in my application server, characterized by the CPUs suddenly going from close to 90% CPU busy to almost completely CPU idle for a few seconds. Here is an example of a drop out as shown by a snippet of vmstat data taken while the application server is under a heavy workload. # vmstat 1 kthr      memory            page            disk          faults      cpu r b w   swap  free  re  mf pi po fr de sr s3 s4 s5 s6   in   sy   cs us sy id 1 0 0 130160176 116381952 0 16 0 0 0 0  0  0  0  0  0 207377 117715 203884 70 21 9 12 0 0 130160160 116381936 0 25 0 0 0 0 0  0  0  0  0 200413 117162 197250 70 20 9 11 0 0 130160176 116381920 0 16 0 0 0 0 0  0  1  0  0 203150 119365 200249 72 21 7 8 0 0 130160176 116377808 0 19 0 0 0 0  0  0  0  0  0 169826 96144 165194 56 17 27  0 0 0 130160176 116377800 0 16 0 0 0 0  0  0  0  0  1 10245 9376 9164 2  1 97 0 0 0 130160176 116377792 0 16 0 0 0 0  0  0  0  0  2 15742 12401 14784 4 1 95 0 0 0 130160176 116377776 2 16 0 0 0 0  0  0  1  0  0 19972 17703 19612 6 2 92 14 0 0 130160176 116377696 0 16 0 0 0 0 0  0  0  0  0 202794 116793 199807 71 21 8 9 0 0 130160160 116373584 0 30 0 0 0 0  0  0 18  0  0 203123 117857 198825 69 20 11 This behavior occurred consistently while the application server was processing synthetic transactions: HTTP requests from JMeter running on an external machine. I explored many theories trying to explain the drop outs, including: Unexpected JMeter behavior Network contention Java Garbage Collection Application Server thread pool problems Connection pool problems Database transaction processing Database I/O contention Graphing the CPU %idle led to a breakthrough: Several of the drop outs were 30 seconds apart. With that insight, I went digging through the data again and looking for other outliers that were 30 seconds apart. In the database server statistics, I found spikes in the iostat "asvc_t" (average response time of disk transactions, in milliseconds) for the disk drive that was being used for the database log files. Here is an example:                     extended device statistics     r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device     0.0 2053.6    0.0 8234.3  0.0  0.2    0.0    0.1   0  24 c3t60080E5...F4F6d0s0     0.0 2162.2    0.0 8652.8  0.0  0.3    0.0    0.1   0  28 c3t60080E5...F4F6d0s0     0.0 1102.5    0.0 10012.8  0.0  4.5    0.0    4.1   0  69 c3t60080E5...F4F6d0s0     0.0   74.0    0.0 7920.6  0.0 10.0    0.0  135.1   0 100 c3t60080E5...F4F6d0s0     0.0  568.7    0.0 6674.0  0.0  6.4    0.0   11.2   0  90 c3t60080E5...F4F6d0s0     0.0 1358.0    0.0 5456.0  0.0  0.6    0.0    0.4   0  55 c3t60080E5...F4F6d0s0     0.0 1314.3    0.0 5285.2  0.0  0.7    0.0    0.5   0  70 c3t60080E5...F4F6d0s0 Here is a little more information about my database configuration: The database and application server were running on two different SPARC servers. Storage for the database was on a storage array connected via 8 gigabit Fibre Channel Data storage and log file were on different physical disk drives Reliable low latency I/O is provided by battery backed NVRAM Highly available: Two Fibre Channel links accessed via MPxIO Two Mirrored cache controllers The log file physical disks were mirrored in the storage device Database log files on a ZFS Filesystem with cutting-edge technologies, such as copy-on-write and end-to-end checksumming Why would I be getting service time spikes in my high-end storage? 
First, I wanted to verify that the database log disk service time spikes aligned with the application server CPU drop outs, and they did: At first, I guessed that the disk service time spikes might be related to flushing the write through cache on the storage device, but I was unable to validate that theory. After searching the WWW for a while, I decided to try using a separate log device: # zpool add ZFS-db-41 log c3t60080E500017D55C000015C150A9F8A7d0 The ZFS log device is configured in a similar manner as described above: two physical disks mirrored in the storage array. This change to the database storage configuration eliminated the application server CPU drop outs: Here is the zpool configuration: # zpool status ZFS-db-41   pool: ZFS-db-41  state: ONLINE  scan: none requested config:         NAME                                     STATE         ZFS-db-41                                ONLINE           c3t60080E5...F4F6d0  ONLINE         logs           c3t60080E5...F8A7d0  ONLINE Now, the I/O spikes look like this:                     extended device statistics                  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device    0.0 1053.5    0.0 4234.1  0.0  0.8    0.0    0.7   0  75 c3t60080E5...F8A7d0s0                    extended device statistics                  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device    0.0 1131.8    0.0 4555.3  0.0  0.8    0.0    0.7   0  76 c3t60080E5...F8A7d0s0                    extended device statistics                  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device    0.0 1167.6    0.0 4682.2  0.0  0.7    0.0    0.6   0  74 c3t60080E5...F8A7d0s0     0.0  162.2    0.0 19153.9  0.0  0.7    0.0    4.2   0  12 c3t60080E5...F4F6d0s0                    extended device statistics                  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device    0.0 1247.2    0.0 4992.6  0.0  0.7    0.0    0.6   0  71 c3t60080E5...F8A7d0s0     0.0   41.0    0.0   70.0  0.0  0.1    0.0    1.6   0   2 c3t60080E5...F4F6d0s0                    extended device statistics                  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device    0.0 1241.3    0.0 4989.3  0.0  0.8    0.0    0.6   0  75 c3t60080E5...F8A7d0s0                    extended device statistics                  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device    0.0 1193.2    0.0 4772.9  0.0  0.7    0.0    0.6   0  71 c3t60080E5...F8A7d0s0 We can see the steady flow of 4k writes to the ZIL device from O_SYNC database log file writes. The spikes are from flushing the transaction group. Like almost all problems that I run into, once I thoroughly understand the problem, I find that other people have documented similar experiences. Thanks to all of you who have documented alternative approaches. Saved for another day: now that the problem is obvious, I should try "zfs:zfs_immediate_write_sz" as recommended in the ZFS Evil Tuning Guide. References: The ZFS Intent Log Solaris ZFS, Synchronous Writes and the ZIL Explained ZFS Evil Tuning Guide: Cache Flushes ZFS Evil Tuning Guide: Tuning ZFS for Database Performance
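To keep an eye on how the separate log device behaves after the change, zpool iostat can report each vdev independently. A small sketch using the pool name from above:

    # Sample the pool every 5 seconds; the log vdev should show the steady
    # stream of small synchronous writes, while the data vdev shows the
    # periodic transaction group flushes.
    zpool iostat -v ZFS-db-41 5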


Sun

ndd on Solaris 10

This is mostly a repost of LaoTsao's Weblog, with some tweaks and additions. In 2008, his blog pointed out that with Solaris 9 and earlier, an rc3 script would be used to specify ndd parameters at boot up. With Solaris 10 and later, it is more elegant to use SMF. The last time that I tried to cut & paste directly off of his page, some of the XML was messed up, so I am reposting working Solaris 10 XML in this blog entry. Additionally, I am including scripts that I use to distribute the settings to multiple servers. I run the distribution scripts from my MacBook, but they should also work from a Windows laptop using cygwin, or from an existing Solaris installation.

Why is it necessary to set ndd parameters at boot up? The problem being addressed is how to set ndd parameters so that they survive a reboot. It is easy to specify ndd settings from a shell, but they only apply to the running OS and don't survive reboots. Examples of ndd settings being necessary include performance tuning, as described in NFS Tuning for HPC Streaming Applications, and installing Oracle Database 11gR2 on Solaris 10 with prerequisites, as shown here. On Solaris 10 Update 10, the default network settings don't match the Oracle 11gR2 prerequisites:

                          Expected Value   Actual Value
  tcp_smallest_anon_port  9000             32768
  tcp_largest_anon_port   65500            65535
  udp_smallest_anon_port  9000             32768
  udp_largest_anon_port   65500            65535

To distribute the SMF files, and for future administration, it is helpful to enable passwordless ssh from your secure laptop:

================
If not already present, create a ssh key on your laptop
================

# ssh-keygen -t rsa

================
Enable passwordless ssh from my laptop.
Need to type in the root password for the remote machines.
Then, I no longer need to type in the password when I ssh or scp from my laptop to servers.
================

#!/usr/bin/env bash
for server in `cat servers.txt`
do
  echo root@$server
  cat ~/.ssh/id_rsa.pub | ssh root@$server "cat >> .ssh/authorized_keys"
done

Specify the servers to distribute to:

================ servers.txt ================
testhost1
testhost2

In addition to ndd values, I often use the following /etc/system settings:

================ etc_system_addins ================
set rpcmod:clnt_max_conns=8
set zfs:zfs_arc_max=0x1000000000
set nfs:nfs3_bsize=131072
set nfs:nfs4_bsize=131072

Modify ndd-nettune.txt with the ndd values that are appropriate for your deployment:

================ ndd-nettune.txt ================
#!/sbin/sh
#
# ident   "@(#)ndd-nettune.xml    1.0     01/08/06 SMI"

. /lib/svc/share/smf_include.sh
. /lib/svc/share/net_include.sh

# Make sure that the libraries essential to this stage of booting can be found.
LD_LIBRARY_PATH=/lib; export LD_LIBRARY_PATH
echo "Performing Directory Server Tuning..." >> /tmp/smf.out
#
# Performance Settings
#
/usr/sbin/ndd -set /dev/tcp tcp_max_buf 2097152
/usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 1048576
/usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 1048576
#
# Oracle Database 11gR2 Settings
#
/usr/sbin/ndd -set /dev/tcp tcp_smallest_anon_port 9000
/usr/sbin/ndd -set /dev/tcp tcp_largest_anon_port 65500
/usr/sbin/ndd -set /dev/udp udp_smallest_anon_port 9000
/usr/sbin/ndd -set /dev/udp udp_largest_anon_port 65500

# Reset the library path now that we are past the critical stage
unset LD_LIBRARY_PATH

================ ndd-nettune.xml ================
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<!-- ident "@(#)ndd-nettune.xml 1.0 04/09/21 SMI" -->
<service_bundle type='manifest' name='SUNWcsr:ndd'>
  <service name='network/ndd-nettune' type='service' version='1'>
    <create_default_instance enabled='true' />
    <single_instance />
    <dependency name='fs-minimal' type='service' grouping='require_all' restart_on='none'>
      <service_fmri value='svc:/system/filesystem/minimal' />
    </dependency>
    <dependency name='loopback-network' grouping='require_any' restart_on='none' type='service'>
      <service_fmri value='svc:/network/loopback' />
    </dependency>
    <dependency name='physical-network' grouping='optional_all' restart_on='none' type='service'>
      <service_fmri value='svc:/network/physical' />
    </dependency>
    <exec_method type='method' name='start' exec='/lib/svc/method/ndd-nettune' timeout_seconds='3' > </exec_method>
    <exec_method type='method' name='stop'  exec=':true'                       timeout_seconds='3' > </exec_method>
    <property_group name='startd' type='framework'>
      <propval name='duration' type='astring' value='transient' />
    </property_group>
    <stability value='Unstable' />
    <template>
      <common_name>
        <loctext xml:lang='C'> ndd network tuning </loctext>
      </common_name>
      <documentation>
        <manpage title='ndd' section='1M' manpath='/usr/share/man' />
      </documentation>
    </template>
  </service>
</service_bundle>

Execute this shell script to distribute the files. The ndd values will be modified immediately and will then survive reboot. The servers will need to be rebooted to pick up the /etc/system settings:

================ system_tuning.sh ================
#!/usr/bin/env bash
for server in `cat servers.txt`
do
  cat etc_system_addins | ssh root@$server "cat >> /etc/system"
  scp ndd-nettune.xml root@${server}:/var/svc/manifest/site/ndd-nettune.xml
  scp ndd-nettune.txt root@${server}:/lib/svc/method/ndd-nettune
  ssh root@$server chmod +x /lib/svc/method/ndd-nettune
  ssh root@$server svccfg validate /var/svc/manifest/site/ndd-nettune.xml
  ssh root@$server svccfg import /var/svc/manifest/site/ndd-nettune.xml
done
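After running system_tuning.sh, I like to confirm that the service imported cleanly and that the values actually changed on each host. Here is a companion verification sketch, assuming the same servers.txt and the passwordless ssh setup described above:

================ verify_tuning.sh (sketch) ================
#!/usr/bin/env bash
# Confirm the ndd-nettune service is online and the ndd values took effect.
for server in `cat servers.txt`
do
  echo "=== $server ==="
  ssh root@$server svcs -l ndd-nettune
  ssh root@$server /usr/sbin/ndd /dev/tcp tcp_smallest_anon_port
  ssh root@$server /usr/sbin/ndd /dev/tcp tcp_largest_anon_port
  ssh root@$server /usr/sbin/ndd /dev/udp udp_smallest_anon_port
  ssh root@$server /usr/sbin/ndd /dev/udp udp_largest_anon_port
done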


Sun

Java EE Application Servers, SPARC T4, Solaris Containers, and Resource Pools

I've obtained a substantial performance improvement on a SPARC T4-2 Server running a Java EE Application Server Cluster by deploying the cluster members into Oracle Solaris Containers and binding those containers to cores of the SPARC T4 Processor. This is not a surprising result, in fact, it is consistent with other results that are available on the Internet. See the "references", below, for some examples. Nonetheless, here is a summary of my configuration and results. (1.0) Before deploying a Java EE Application Server Cluster into a virtualized environment, many decisions need to be made. I'm not claiming that all of the decisions that I have a made will work well for every environment. In fact, I'm not even claiming that all of the decisions are the best possible for my environment. I'm only claiming that of the small sample of configurations that I've tested, this is the one that is working best for me. Here are some of the decisions that needed to be made: (1.1) Which virtualization option? There are several virtualization options and isolation levels that are available. Options include: Hard partitions:  Dynamic Domains on Sun SPARC Enterprise M-Series Servers Hypervisor based virtualization such as Oracle VM Server for SPARC (LDOMs) on SPARC T-Series Servers OS Virtualization using Oracle Solaris Containers Resource management tools in the Oracle Solaris OS to control the amount of resources an application receives, such as CPU cycles, physical memory, and network bandwidth. Oracle Solaris Containers provide the right level of isolation and flexibility for my environment. To borrow some words from my friends in marketing, "The SPARC T4 processor leverages the unique, no-cost virtualization capabilities of Oracle Solaris Zones"  (1.2) How to associate Oracle Solaris Containers with resources? There are several options available to associate containers with resources, including (a) resource pool association (b) dedicated-cpu resources and (c) capped-cpu resources. I chose to create resource pools and associate them with the containers because I wanted explicit control over the cores and virtual processors.  (1.3) Cluster Topology? Is it best to deploy (a) multiple application servers on one node, (b) one application server on multiple nodes, or (c) multiple application servers on multiple nodes? After a few quick tests, it appears that one application server per Oracle Solaris Container is a good solution. (1.4) Number of cluster members to deploy? I chose to deploy four big application servers. I would like go back and test many 32-bit application servers, but that is left for another day. (2.0) Configuration tested. (2.1) I was using a SPARC T4-2 Server which has 2 CPU and 128 virtual processors. 
To understand the physical layout of the hardware on Solaris 10, I used the OpenSolaris psrinfo perl script available at http://hub.opensolaris.org/bin/download/Community+Group+performance/files/psrinfo.pl: test# ./psrinfo.pl -pv The physical processor has 8 cores and 64 virtual processors (0-63) The core has 8 virtual processors (0-7)   The core has 8 virtual processors (8-15)   The core has 8 virtual processors (16-23)   The core has 8 virtual processors (24-31)   The core has 8 virtual processors (32-39)   The core has 8 virtual processors (40-47)   The core has 8 virtual processors (48-55)   The core has 8 virtual processors (56-63)     SPARC-T4 (chipid 0, clock 2848 MHz) The physical processor has 8 cores and 64 virtual processors (64-127)   The core has 8 virtual processors (64-71)   The core has 8 virtual processors (72-79)   The core has 8 virtual processors (80-87)   The core has 8 virtual processors (88-95)   The core has 8 virtual processors (96-103)   The core has 8 virtual processors (104-111)   The core has 8 virtual processors (112-119)   The core has 8 virtual processors (120-127)     SPARC-T4 (chipid 1, clock 2848 MHz) (2.2) The "before" test: without processor binding. I started with a 4-member cluster deployed into 4 Oracle Solaris Containers. Each container used a unique gigabit Ethernet port for HTTP traffic. The containers shared a 10 gigabit Ethernet port for JDBC traffic. (2.3) The "after" test: with processor binding. I ran one application server in the Global Zone and another application server in each of the three non-global zones (NGZ):  (3.0) Configuration steps. The following steps need to be repeated for all three Oracle Solaris Containers. (3.1) Stop AppServers from the BUI. (3.2) Stop the NGZ. test# ssh test-z2 init 5 (3.3) Enable resource pools: test# svcadm enable pools (3.4) Create the resource pool: test# poolcfg -dc 'create pool pool-test-z2' (3.5) Create the processor set: test# poolcfg -dc 'create pset pset-test-z2' (3.6) Specify the maximum number of CPU's that may be addd to the processor set: test# poolcfg -dc 'modify pset pset-test-z2 (uint pset.max=32)' (3.7) bash syntax to add Virtual CPUs to the processor set: test# (( i = 64 )); while (( i < 96 )); do poolcfg -dc "transfer to pset pset-test-z2 (cpu $i)"; (( i = i + 1 )) ; done (3.8) Associate the resource pool with the processor set: test# poolcfg -dc 'associate pool pool-test-z2 (pset pset-test-z2)' (3.9) Tell the zone to use the resource pool that has been created: test# zonecfg -z test-z2 set pool=pool-test-z2 (3.10) Boot the Oracle Solaris Container test# zoneadm -z test-z2 boot (3.11) Save the configuration to /etc/pooladm.conf test# pooladm -s (4.0) Verification (4.1) View the processors in each processor set  test# psrset user processor set 5: processors 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 user processor set 6: processors 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 user processor set 7: processors 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 12 (4.2) Verify that the Java processes are associated with the processor sets: test# ps -e -o,vsz,rss,pid,pset,comm | grep java | sort -n   VSZ     RSS    PID PSET COMMAND 3715416 1543344 25143   5 <JAVA_HOME>/bin/sparcv9/java 3772120 1600088 25142   - <JAVA_HOME>/bin/sparcv9/java 3780960 1608832 25144   6 <JAVA_HOME>/bin/sparcv9/java 3792648 1620560 25145   7 
<JAVA_HOME>/bin/sparcv9/java (5.0) Results. Using the resource pools improves both throughput and response time: (6.0) Run Time Changes (6.1) I wanted to show an example which started from scratch, which is why I stopped the Oracle Solaris Containers, configured the pools and booted up fresh. There is no room for confusion. However, the steps should work for running containers. One exception is "zonecfg -z test-z2 set pool=pool-test-z2" which will take effect when the zone is booted. (6.2) I've shown poolcfg with the '-d' option which specifies that the command will work directly on the kernel state. For example, at runtime, you can move CPU core 12 (virtual processors 96-103) from test-z3 to test-z2 with the following command: test# (( i = 96 )); while (( i < 104 )); do poolcfg -dc "transfer to pset pset-test-z2 (cpu $i)"; (( i = i + 1 )) ; done (6.3) To specify a run-time change to a container's pool binding, use the following steps: Identify the zone ID (first column) test# zoneadm list -vi   ID NAME        STATUS     PATH                      BRAND    IP    0 global      running    /                         native   shared   28 test-z3     running    /zones/test-z3            native   shared   31 test-z1     running    /zones/test-z1            native   shared   32 test-z2     running    /zones/test-z2            native   shared Modify binding if necessary: test# poolbind -p pool-test-z2 -i zoneid 32 (7.0) Processor sets are particularly relevant to multi-socket configurations:Processor sets reduce cross calls (xcal) and migrations (migr) in multi-socket configurations: Single Socket Test 1 x SPARC T4 Socket2 x Oracle Solaris Containersmpstat samplesThe impact of processor sets was hardly measurable (about a 1% throughput difference) Without Processor Binding CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl 40    1   0  525   933   24 1793  124  363  153    1  2551   50   7   0  43 41    2   0  486  1064   24 1873  137  388  159    2  2560   51   7   0  42 42    1   0  472   973   23 1770  124  352  153    1  2329   49   7   0  44 43    1   0  415   912   22 1697  115  320  153    1  2175   47   7   0  47 44    1   0  369   884   22 1665  111  300  150    1  2008   45   6   0  49 45    2   0  494   902   23 1730  116  324  152    1  2233   46   7   0  47 46    3   0  918  1075   26 2087  163  470  172    1  2935   55   8   0  38 47    2   0  672   999   25 1955  143  416  162    1  2777   53   7   0  40 48    2   0  691   976   25 1904  136  396  159    1  2748   51   7   0  42 49    3   0  849  1081   24 1933  145  411  163    1  2670   52   7   0  40 With each container bound to 4 cores. 
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl 40    1   0  347  1164   20 1810  119  311  210    1  2079   42   6   0  51 41    1   0  406  1219   21 1898  131  344  214    1  2266   45   7   0  48 42    1   0  412  1214   21 1902  130  342  212    1  2289   45   7   0  49 43    2   0  410  1208   21 1905  130  343  219    1  2304   45   7   0  48 44    1   0  411  1208   21 1906  131  343  214    1  2313   45   7   0  48 45    1   0  433  1209   21 1917  133  344  215    1  2337   45   7   0  48 46    2   0  500  1244   24 1989  141  368  218    1  2482   46   7   0  47 47    1   0  377  1183   21 1871  127  331  211    1  2289   45   7   0  49 48    1   0  358   961   23 1699   77  202  183    1  2255   41   6   0  53 49    1   0  339  1008   21 1739   84  216  188    1  2231   41   6   0  53 Two Socket Test 2 x T4 Sockets4 x Oracle Solaris Containermpstat sampleThe impact of processor sets was substantial(~25% better throughput) Without Processor Binding CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl 40    1   0 1277  1553   32 2726  317  942   70    2  2620   66  11   0  24 41    0   0 1201  1606   30 2618  309  895   71    2  2603   67  11   0  23 42    1   0 1104  1517   30 2519  295  846   70    2  2499   65  10   0  24 43    1   0  997  1447   28 2443  283  807   69    2  2374   64  10   0  26 44    1   0  959  1402   28 2402  277  776   67    2  2336   64  10   0  26 45    1   0 1057  1466   29 2538  294  841   68    2  2400   64  10   0  26 46    3   0 2785  1776   35 3273  384 1178   74    2  2841   68  12   0  20 47    1   0 1508  1610   33 2949  346 1039   72    2  2764   67  11   0  22 48    2   0 1486  1594   33 2963  346 1036   72    2  2761   67  11   0  22 49    1   0 1308  1589   32 2741  325  952   71    2  2694   67  11   0  22 With each container bound to 4 cores. CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl 40    1   0  423  1223   20 1841  157  377   60    1  2185   48   7   0  45 41    1   0  505  1279   22 1942  168  405   65    1  2396   50   7   0  43 42    1   0  500  1278   22 1941  170  405   65    1  2413   50   7   0  42 43    2   0  492  1277   22 1955  171  408   64    1  2422   50   8   0  42 44    1   0  504  1269   22 1941  167  407   64    1  2430   50   7   0  42 45    1   0  513  1284   22 1977  173  412   64    1  2475   50   8   0  42 46    2   0  582  1302   25 2021  177  431   67    1  2612   52   8   0  41 47    1   0  462  1247   21 1918  168  400   62    1  2392   50   7   0  43 48    1   0  466  1055   25 1777  120  282   56    1  2424   47   7   0  47 49    1   0  412  1080   22 1789  122  285   56    1  2354   46   7   0  47 (8.0) References: Thread placement policies on NUMA systems - update System Administration Guide: Oracle Solaris Containers-Resource Management and Oracle Solaris Zones Capitalizing on large numbers of processors with WebSphere Portal on Solaris WebSphere Application Server and T5440 (Dileep Kumar's Weblog)  http://www.brendangregg.com/zones.html Reuters Market Data System, RMDS 6 Multiple Instances (Consolidated), Performance Test Results in Solaris, Containers/Zones Environment on Sun Blade X6270 by Amjad Khan, 2009.
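The virtual-processor transfer loops in steps 3.7 and 6.2 follow the same pattern, so they can be folded into a small helper. This is only a sketch; the function name is mine, and it assumes contiguous virtual processor numbering as shown by psrinfo.pl above:

    #!/usr/bin/env bash
    # Move a contiguous range of virtual processors into a processor set
    # at run time, generalizing the poolcfg loops shown above.
    # Usage: transfer_vcpus <pset-name> <first-vcpu> <count>
    transfer_vcpus() {
      pset=$1; first=$2; count=$3
      (( i = first ))
      while (( i < first + count )); do
        poolcfg -dc "transfer to pset $pset (cpu $i)"
        (( i = i + 1 ))
      done
    }

    # Example: give pset-test-z2 the 32 virtual processors (4 cores) starting at 64.
    transfer_vcpus pset-test-z2 64 32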


Sun

What is bondib1 used for on SPARC SuperCluster with InfiniBand, Solaris 11 networking & Oracle RAC?

A co-worker asked the following question about a SPARC SuperCluster InfiniBand network: > on the database nodes the RAC nodes communicate over the cluster_interconnect. This is the > 192.168.10.0 network on bondib0. (according to ./crs/install/crsconfig_params NETWORKS> setting) > What is bondib1 used for? Is it a HA counterpart in case bondib0 dies? This is my response: Summary: In a SPARC SuperCluster installation, bondib0 and bondib1 are the InfiniBand links that are used for the private interconnect (usage includes global cache data blocks and heartbeat) and for communication to the Exadata storage cells. Currently, the database is idle, so bondib1 is currently only being used for outbound cluster interconnect traffic. Details: bondib0 is the cluster_interconnect $ oifcfg getif            bondeth0  10.129.184.0  global  publicbondib0  192.168.10.0  global  cluster_interconnectipmpapp0  192.168.30.0  global  public bondib0 and bondib1 are on 192.168.10.1 and 192.168.10.2 respectively. # ipadm show-addr | grep bondibondib0/v4static  static   ok           192.168.10.1/24bondib1/v4static  static   ok           192.168.10.2/24 This private network is also used to communicate with the Exadata Storage Cells. Notice that the network addresses of the Exadata Cell Disks are on the same subnet as the private interconnect:  SQL> column path format a40 SQL> select path from v$asm_disk; PATH                                      ----------------------------------------  o/192.168.10.9/DATA_SSC_CD_00_ssc9es01   o/192.168.10.9/DATA_SSC_CD_01_ssc9es01... Hostnames tied to the IPs are node1-priv1 and node1-priv2  # grep 192.168.10 /etc/hosts192.168.10.1    node1-priv1.us.oracle.com   node1-priv1192.168.10.2    node1-priv2.us.oracle.com   node1-priv2 For the four compute node RAC: Each compute node has two IP address on the 192.168.10.0 private network. Each IP address has an active InfiniBand link and a failover InfiniBand link. Thus, the compute nodes are using a total of 8 IP addresses and 16 InfiniBand links for this private network. 
bondib1 isn't being used for the Virtual IP (VIP): $ srvctl config vip -n node1VIP exists: /node1-ib-vip/192.168.30.25/192.168.30.0/255.255.255.0/ipmpapp0, hosting node node1VIP exists: /node1-vip/10.55.184.15/10.55.184.0/255.255.255.0/bondeth0, hosting node node1 bondib1 is on bondib1_0 and fails over to bondib1_1: # ipmpstat -g GROUP       GROUPNAME   STATE     FDT       INTERFACESipmpapp0    ipmpapp0    ok        --        ipmpapp_0 (ipmpapp_1)bondeth0    bondeth0    degraded  --        net2 [net5]bondib1     bondib1     ok        --        bondib1_0 (bondib1_1)bondib0     bondib0     ok        --        bondib0_0 (bondib0_1) bondib1_0 goes over net24 # dladm show-link | grep bondLINK                CLASS     MTU    STATE    OVERbondib0_0           part      65520  up       net21bondib0_1           part      65520  up       net22bondib1_0           part      65520  up       net24bondib1_1           part      65520  up       net23 net24 is IB Partition FFFF # dladm show-ibLINK         HCAGUID         PORTGUID        PORT STATE  PKEYSnet24        21280001A1868A  21280001A1868C  2    up     FFFFnet22        21280001CEBBDE  21280001CEBBE0  2    up     FFFF,8503net23        21280001A1868A  21280001A1868B  1    up     FFFF,8503net21        21280001CEBBDE  21280001CEBBDF  1    up     FFFF On Express Module 9 port 2: # dladm show-phys -LLINK              DEVICE       LOCnet21             ibp4         PCI-EM1/PORT1net22             ibp5         PCI-EM1/PORT2net23             ibp6         PCI-EM9/PORT1net24             ibp7         PCI-EM9/PORT2 Outbound traffic on the 192.168.10.0 network will be multiplexed between bondib0 & bondib1 # netstat -rn Routing Table: IPv4  Destination           Gateway           Flags  Ref     Use     Interface -------------------- -------------------- ----- ----- ---------- --------- 192.168.10.0         192.168.10.2         U        16    6551834 bondib1   192.168.10.0         192.168.10.1         U         9    5708924 bondib0   The database is currently idle, so there is no traffic to the Exadata Storage Cells at this moment, nor is there currently any traffic being induced by the global cache. Thus, only the heartbeat is currently active. There is more traffic on bondib0 than bondib1 # /bin/time snoop -I bondib0 -c 100 > /dev/nullUsing device ipnet/bondib0 (promiscuous mode)100 packets captured real        4.3user        0.0sys         0.0 (100 packets in 4.3 seconds = 23.3 pkts/sec) # /bin/time snoop -I bondib1 -c 100 > /dev/nullUsing device ipnet/bondib1 (promiscuous mode)100 packets captured real       13.3user        0.0sys         0.0 (100 packets in 13.3 seconds = 7.5 pkts/sec) Half of the packets on bondib0 are outbound (from self). The remaining packet are split evenly, from the other nodes in the cluster. # snoop -I bondib0 -c 100 | awk '{print $1}' | sort | uniq -cUsing device ipnet/bondib0 (promiscuous mode)100 packets captured  49 node1-priv1.us.oracle.com  24 node2-priv1.us.oracle.com  14 node3-priv1.us.oracle.com  13 node4-priv1.us.oracle.com 100% of the packets on bondib1 are outbound (from self), but the headers in the packets indicate that they are from the IP address associated with bondib0: # snoop -I bondib1 -c 100 | awk '{print $1}' | sort | uniq -cUsing device ipnet/bondib1 (promiscuous mode)100 packets captured 100 node1-priv1.us.oracle.com The destination of the bondib1 outbound packets are split evenly, to node3 and node 4. 
# snoop -I bondib1 -c 100 | awk '{print $3}' | sort | uniq -cUsing device ipnet/bondib1 (promiscuous mode)100 packets captured  51 node3-priv1.us.oracle.com  49 node4-priv1.us.oracle.com Conclusion: In a SPARC SuperCluster installation, bondib0 and bondib1 are the InfiniBand links that are used for the private interconnect (usage includes global cache data blocks and heartbeat) and for communication to the Exadata storage cells. Currently, the database is idle, so bondib1 is currently only being used for outbound cluster interconnect traffic.
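To repeat the traffic comparison on both InfiniBand IPMP interfaces in one pass, the snoop commands above can be looped. A sketch, run as root on one of the database nodes:

    #!/usr/bin/env bash
    # Compare packet rates and traffic sources on the two private IB interfaces.
    for link in bondib0 bondib1
    do
      echo "=== $link: time to capture 100 packets ==="
      /bin/time snoop -I $link -c 100 > /dev/null
      echo "=== $link: packet sources ==="
      snoop -I $link -c 100 | awk '{print $1}' | sort | uniq -c
    done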


Sun

NFS root access for Oracle RAC on Sun ZFS Storage 7x20 Appliance

When installing Oracle Real Application Clusters (Oracle RAC) 11g Release 2 for Solaris Operating System it is necessary to first install Oracle Grid Infrastructure, and it is also necessary to configure shared storage. I install the Grid Infrastructure and then the Database largely following the on-line instructions: Grid Infrastructure Installation Guide for Solaris Operating System Database Installation Guide for Oracle Solaris Real Application Clusters Installation Guide for Linux and UNIX I ran into an interesting problem when installing Oracle RAC in a system that included SPARC Enterprise M5000 Servers and a Sun ZFS Storage 7x20 Appliance, illustrated in the following diagram: (Click to Expand) When configuring the shared storage for Oracle RAC, you may decide to use NFS for Data Files. In this case, you must set up the NFS mounts on the storage appliance to allow root access from all of the RAC clients. This allows files being created from the RAC nodes to be owned by root on the mounted NFS filesystems, rather than an anonymous user, which is the default behavior. In a default configuration, a Solaris NFS server maps "root" access to "nobody". This can be overridden as stated on the share_nfs(1M) man page: Only root users from the hosts specified in access_list will have root access... By default, no host has root access, so root users are mapped to an anonymous user ID... Example: The following will give root read-write permissions to hostb: share -F nfs -o ro=hosta,rw=hostb,root=hostb /var The Sun ZFS Storage 7x20 Appliance features a browser user interface (BUI), "a graphical tool for administration of the appliance. The BUI provides an intuitive environment for administration tasks, visualizing concepts, and analyzing performance data." The following clicks can be used to allow root on the clients mounting to the storage to be considered root: (Click to Expand) Go the the "Shares" page (not shown) Select the "pencil" to edit the share that will be used for Oracle RAC shared storage. Go to the Protocols page. For the NFS Protocol, un-check "inherit from project". Click the "+" to add an NFS exception. Enter the hostname of the RAC node. Allow read/write access. Check the "Root Access" box. Click "Apply" to save the changes. Repeat steps 3-8 for each RAC node. Repeat steps 1-8 for every share that will be used for RAC shared storage. More intuitive readers, after reviewing the network diagram and the screenshot of the S7420 NFS exceptions screen, may immediately observe that it was a mistake to enter the hostname of the RAC nodes associated with the gigabit WAN network. In hindsight, this was an obvious mistake, but at the time I was entering the data, I simply entered the names of the machines, which did not strike me as a "trick question". The next step is to configure the RAC nodes as NFS clients. After the shares have been set up on the Sun ZFS Storage Appliance, the next step is to mount the shares onto the RAC nodes. For each RAC node, update the /etc/vfstab file on each node with an entry similar to the following: nfs_server:/vol/DATA/oradata /u02/oradata nfs rw,bg,hard,nointr,rsize=32768,wsize=32768,proto=tcp,noac,forcedirectio, vers=3,suid Here's a tip of the hat to Pythian's "Installing 11gR2 Grid Infrastructure in 5 Easy Lessons": Lesson #3: Grid is very picky and somewhat uninformative about its NFS support Like an annoying girlfriend, the installer seems to say “Why should I tell you what’s the problem? 
If you really loved me, you’d know what you did wrong!” You need to trace the installer to find out what exactly it doesn’t like about your configuration. Running the installer normally, the error message is: [FATAL] [INS-41321] Invalid Oracle Cluster Registry (OCR) location. CAUSE: The installer detects that the storage type of the location (/cmsstgdb/crs/ocr/ocr1) is not supported for Oracle Cluster Registry. ACTION: Provide a supported storage location for the Oracle Cluster Registry. OK, so Oracle says the storage is not supported, but I know that ... NFS is support just fine. This means I used the wrong parameters for the NFS mounts. But when I check my vfstab and /etc/mount, everything looks A-OK. Can Oracle tell me what exactly bothers it? It can. If you run the silent install by adding the following flags to the command line: -J-DTRACING.ENABLED=true -J-DTRACING.LEVEL=2 If you get past this stage, it is clear sailing up until you run "root.sh" near the end of the Grid Installation, which is the stage that will fail if the root user's files are mapped to anonymous. So now, I will finally get to the piece to the puzzle that I found to be perplexing. Remember that in my configuration (see diagram, above) each RAC node has two potential paths to the Sun ZFS storage appliance, one path via the router that is connected to the corporate WAN, and one path via the private 10 gigabit storage network. When I accessed the NAS storage via the storage network, root was always mapped to nobody, despite my best efforts. While trying to debug, I discovered that when I accessed the NAS storage via the corporate WAN network, root was mapped to root: # ping -s s7420-10g0 1 1PING s7420-10g0: 1 data bytes9 bytes from s7420-10g0 (192.168.42.15): icmp_seq=0.# ping -s s7420-wan 1 1PING s7420-wan: 1 data bytes9 bytes from s7420-wan (10.1.1.15): icmp_seq=0.# nfsstat -m/S7420/OraData_WAN from s7420-wan:/export/OraData Flags: vers=3,proto=tcp,sec=sys,hard,nointr,noac,link,symlink,acl,rsize=32768,wsize=32768,retrans=5,timeo=600 Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60/S7420/OraData_10gbe from s7420-10g0:/export/OraData Flags: vers=3,proto=tcp,sec=sys,hard,nointr,noac,link,symlink,acl,rsize=32768,wsize=32768,retrans=5,timeo=600 Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60# touch /S7420/OraData_10gbe/foo1# touch /S7420/OraData_WAN/foo2# touch /net/s7420-10g0/export/OraData/foo3# touch /net/s7420-wan/export/OraData/foo4# touch /net/192.168.42.15/export/OraData/foo5# touch /net/10.1.1.15/export/OraData/foo6# ls -l /S7420/OraData_10gbe/foo*-rw-r--r-- 1 nobody nobody 0 Sep 20 12:54 /S7420/OraData_10gbe/foo1-rw-r--r-- 1 root root 0 Sep 20 12:54 /S7420/OraData_10gbe/foo2-rw-r--r-- 1 nobody nobody 0 Sep 20 12:55 /S7420/OraData_10gbe/foo3-rw-r--r-- 1 root root 0 Sep 20 12:56 /S7420/OraData_10gbe/foo4-rw-r--r-- 1 nobody nobody 0 Sep 20 12:58 /S7420/OraData_10gbe/foo5-rw-r--r-- 1 root root 0 Sep 20 13:04 /S7420/OraData_10gbe/foo6 Having discovered that when I accessed the NAS storage via the storage network, root was always mapped to nobody, but that when I accessed the NAS storage via the corporate WAN network, root was mapped to root, I investigated the mounts on the S7420: # ssh osm04 -l rootPassword: Last login: Thu Sep 15 22:46:46 2011 from 192.168.42.11s7420:> shellExecuting shell commands may invalidate your service contract. Continue? 
(Y/N) Executing raw shell; "exit" to return to appliance shell ...+-----------------------------------------------------------------------------+| You are entering the operating system shell. By confirming this action in || the appliance shell you have agreed that THIS ACTION MAY VOID ANY SUPPORT |...+-----------------------------------------------------------------------------+s7420# showmount -a | grep OraData192.168.42.11:/export/OraDatarac1.bigcorp.com:/export/OraData When I saw the "showmount" output, the lightbulb in my brain turned on and I understood the problem: I had entered the node names associated with the WAN, rather than node names associated with the private storage network. When NFS packets were arriving from the corporate WAN, the S7420 was using DNS to resolve WAN IP addresses into the WAN hostnames, which matched the hostnames that I had entered into the S7420 NFS Exception form. In contrast, when NFS packets were arriving from the 10 gigabit private storage network, the system was not able to resolve the IP address into hostname because the private storage network data did not exist in DNS. Even if the name resolution was successful, if would have been necessary to enter the the node names associated private storage area network into the S7420 NFS Exceptions form. Several solutions spring to mind: (1) On a typical Solaris NFS server, I would have enabled name resolution of the 10 gigabit private storage network addresses by adding entries to /etc/hosts, and used those node names for the NFS root access. This was not possible because on the appliance, /etc is mounted as read-only. (2) It occurred to me to enter the IP addresses into the S7420 NFS exceptions form, but the BUI would only accept hostnames. (3) One potential solution is to put the private 10 gigabit IP addresses into the corporate DNS server. (4) Instead, I chose to give root read-write permissions to all clients on the 10 gigabit private storage network: (Click to Expand) Now, the RAC installation will be able to complete successfully with RAC nodes accessing the Sun ZFS Storage 7x20 Appliance via the private 10 gigabit storage network.
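Before re-running root.sh, it is worth checking every RAC node the same way the foo files were checked above: create a file over the storage-network mount and see which owner it gets. This is a sketch; the node names and mount point are examples from this post, so substitute your own:

    #!/usr/bin/env bash
    # Verify root is mapped to root (not nobody) over the storage network.
    for node in rac1 rac2 rac3 rac4
    do
      echo "=== $node ==="
      ssh root@$node "touch /S7420/OraData_10gbe/root_check.$node && \
                      ls -l /S7420/OraData_10gbe/root_check.$node"
    done
    # Files owned by root:root mean the NFS root exception matched the
    # storage-network client; nobody:nobody means it did not.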


Sun

Flash Archive with MPXIO

It was necessary to exchange one SPARC Enterprise M4000 Server running Solaris 10 Update 9 with a replacement server. I thought, "no problem."

I ran "flar create -n ben06 -S /S7420/ben06-`date '+%m-%d-%y'`.flar" on the original server to create a flash archive on NAS storage. Upon restoring the flar onto the replacement SPARC Enterprise M4000 Server, problems appeared:

Rebooting with command: boot
Boot device: disk  File and args:
SunOS Release 5.10 Version Generic_144488-17 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: ben06
The / file system (/dev/rdsk/c0t0d0s0) is being checked.
WARNING - Unable to repair the / filesystem. Run fsck manually (fsck -F ufs /dev/rdsk/c0t0d0s0).
Aug 22 10:25:31 svc.startd[7]: svc:/system/filesystem/usr:default: Method "/lib/svc/method/fs-usr" failed with exit status 95.
Aug 22 10:25:31 svc.startd[7]: system/filesystem/usr:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)

It got fairly ugly from there:

- When I tried to run the fsck command, above, it reported "No such device"
- Although "mount" reported that / was mounted read/write, other commands, such as "vi", reported that everything was read-only
- "format" reported that I didn't have any devices at all!!

Eventually I realized that there seems to be a bug with Flash Archives + MPXIO. After I installed with the flar, I rebooted into single user mode from alternate media (bootp or cdrom), mounted the internal drive, and modified some files, changing "no" to "yes":

/kernel/drv/fp.conf:mpxio-disable="yes";
/kernel/drv/iscsi.conf:mpxio-disable="yes";
/kernel/drv/mpt.conf:mpxio-disable="yes";
/kernel/drv/mpt.conf:disable-sata-mpxio="yes";
/kernel/drv/mpt_sas.conf:mpxio-disable="yes";

Then, after a "reboot -- -r", everything was fine.
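For reference, here is roughly how that edit can be scripted from the single-user shell. It assumes the restored root filesystem has been mounted at /a (the mount point is my assumption; adjust it, and double-check each file before and after):

    ROOTDIR=/a
    for f in kernel/drv/fp.conf kernel/drv/iscsi.conf kernel/drv/mpt.conf kernel/drv/mpt_sas.conf
    do
      # change mpxio-disable="no" to "yes" (and disable-sata-mpxio in mpt.conf)
      sed -e 's/^mpxio-disable="no";/mpxio-disable="yes";/' \
          -e 's/^disable-sata-mpxio="no";/disable-sata-mpxio="yes";/' \
          $ROOTDIR/$f > /tmp/$$.conf && cp /tmp/$$.conf $ROOTDIR/$f
    done
    rm -f /tmp/$$.conf
    grep mpxio $ROOTDIR/kernel/drv/*.conf    # verify before "reboot -- -r"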


Sun

Solaris installation on a SPARC T3 from a remote CDROM ISO

I wanted to install Solaris 11 Express on a SPARC T3-1 Server that was located in California. I was working in Massachusetts and had the ISO image of the Solaris 11 Express media on my laptop. Reading the SPARC T3-2 Server Documentation and the Integrated Lights Out Manager (ILOM) 3.0 Documentation, it seemed that it would be possible to remotely mount my ISO image and use it for the installation, but I found that the documentation was rather spotty. Specifically, I didn't stumble across any documentation that led me to use the "rcdrom" device in the OpenBoot PROM (OBP, aka "the OK prompt"). Here is the recipe:

The SPARC T3-1 server was powered up, pre-installed with Solaris 10, and properly wired:
Ethernet for Solaris was available via the NET0 port.
Ethernet for the Service Processor was available via the SP NET MGT Ethernet port.
The SP SER MGT port was connected to a terminal concentrator.
The Service Processor had a static IP address, so it could be accessed with either SSH or HTTPS.

The ILOM remote CD-ROM image functionality isn't supported on Mac OS X, so I used Windows XP running in VirtualBox to access the ILOM via HTTPS.
In the ILOM HTTPS interface, I clicked on "Redirection", "Use serial redirection", "Launch Remote Console".
In the ILOM Remote Console window, there is a pulldown for "Devices", "CD-ROM image...". Selected the ISO image on my laptop.
In Solaris 10, I was able to see that the remote ISO was mounted ("df -h").
Shut down Solaris 10, taking me back to the OK prompt.
I was able to see the "rcdrom" device:   ok devalias
And boot:   ok boot rcdrom

Confirmed success.  The T3 booted off of my ISO (slowly) and eventually prompted for the installation language (English, Spanish, etc).
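Condensed, the console end of the recipe looks like this (a sketch; the shutdown flags are one common way to drop back to the OK prompt):

# shutdown -y -g0 -i0      (from Solaris 10, back to the OK prompt)
ok devalias                (look for the "rcdrom" alias in the listing)
ok boot rcdrom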


Sun

11gR2 Grid Infrastructure Patch Set for Solaris x86-64

"Oracle Recommended Patches" (Doc ID 756671.1) recommends 9655006 - 11.2.0.1.2 Grid Infrastructure Patch Set Update (GI PSU2) for 11.2.0.1 installations, but I couldn't find the patch for Solaris x86-64.  In fact, How to Upgrade to Oracle Grid Infrastructure 11g Release 2 states, "GI PSU2 is unavailable for Solaris x86-64 systems." How disappointing, right?  Not really.  Instead of going to Grid 11.2.0.1.2, go to Grid 11.2.0.2 with patch 10098816. Sounds easy enough, except that How to Upgrade to Oracle Grid Infrastructure 11g Release 2 also states: To upgrade existing 11.2.0.1 Oracle Grid Infrastructure installations to Oracle Grid Infrastructure 11.2.0.2, you must first do at least one of the following: Patch the release 11.2.0.1 Oracle Grid Infrastructure home with the 9413827 and 9706490 patches. Install Oracle Grid Infrastructure Patch Set 1 (GI PSU1) or Oracle Grid Infrastructure Patch Set 2 (GI PSU2). Since GI PSU2 is not available, and I had trouble finding the patch number of GI PSU1, I went with p6880880 (OPatch), followed by 9413827 followed by 10098816.  (FYI, once 9413827 is installed, it is not necessary to install 9706490 because patch 9706490 is Subset of 9413827.) Ta-da!! (According to yourdictionary.com, sound of a fanfare: an exclamation of triumph or pride accompanying an announcement, a bow, etc)

"Oracle Recommended Patches" (Doc ID 756671.1) recommends 9655006 - 11.2.0.1.2 Grid Infrastructure Patch Set Update (GI PSU2) for 11.2.0.1 installations, but I couldn't find the patch for...

Sun

Useful GNU grep option

I liked "How to get the serial number of a Sun machine through iLOM" at http://blog.philipp-michels.de/?p=102: $ ipmitool -v -H {IP} -U root -P {PW} fru | grep /SYS -A 5 FRU Device Description : /SYS (ID 20) Product Manufacturer  : SUN MICROSYSTEMS Product Name          : SUN FIRE X4200 M2 Product Part Number   : 602-3891-01 Product Serial        : 0116SL46F2 Product Extra         : 080020FFFFFFFFFFFFFF00144F7DB00A The problem is that the "grep -A" option is not supported by the Solaris default grep. The solution is to use GNU grep (ggrep at /usr/sfw/bin/ggrep) which is packaged on the Solaris Companion CD (also see http://blogs.sun.com/taylor22/entry/installing_the_solaris_10_os) While I was searching the www for grep -A, I found many more interesting usage examples, such as finding stack traces for exceptions in Java log files and  finding the text surrounding any ORA- messages in the Oracle alert.log Update: Solaris 11 installation and usage: # pkg install gnu-grep # which ggrep /usr/bin/ggrep # ls -l /usr/bin/ggrep lrwxrwxrwx   1 root     root          15 Jun  2 13:08 /usr/bin/ggrep -> ../gnu/bin/greproot@ For example, some message was repeated # grep  "repeated 2 times"  /var/adm/messages.*/var/adm/messages.3:Feb 11 16:43:56 p3231-05 last message repeated 2 times Which one? Can't use Solaris grep to find the line before the pattern. # grep -B1 "repeated 2 times"  /var/adm/messages.* grep: illegal option -- B grep: illegal option -- 1 Usage: grep [-c|-l|-q] -bhinsvw pattern file . . . But ggrep does the trick. # ggrep -B1 "repeated 2 times"  /var/adm/messages.* /var/adm/messages.3-Feb 11 16:43:54 p3231-05 dlmgmtd[49]: [ID 183745 daemon.warning] Duplicate links in the repository: net0 /var/adm/messages.3:Feb 11 16:43:56 p3231-05 last message repeated 2 times

I liked "How to get the serial number of a Sun machine through iLOM" at http://blog.philipp-michels.de/?p=102: $ ipmitool -v -H {IP} -U root -P {PW} fru | grep /SYS -A 5 FRU Device Description : /SYS...

Sun

Adding a hard drive for /export/home under ZFS

Summary: I added a new hard drive to my "OpenSolaris 2009.06 snv_111b" system and attached it to /export/home.  The first step was easy: my Sun Ultra 24 had a free drive bay, so I stuck the new drive into the running system, ran "devfsadm -C -v", and then from "format" I could see that the disk was available to OpenSolaris as "c9d0".

Next, a decision needed to be made with respect to how to utilize the drive. I considered using several approaches, such as adding the disk to the existing rpool.  After considering my potential future projects, I decided to attach the disk as a new zpool that would only contain /export/home.  Isolating /export/home should yield some flexibility when I play nasty games with the root drive. The process was slightly complicated by the fact that logging into OpenSolaris as root is disabled by default. To allow root login, I removed "type=role;" from the root line in /etc/user_attr.  Then, the following commands were executed while logged in as root.

Before:

# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
new                         79K   685G    19K  /new
rpool                     26.2G   202G  82.5K  /rpool
rpool/ROOT                16.4G   202G    19K  legacy
rpool/ROOT/opensolaris     262M   202G  5.25G  /
rpool/ROOT/opensolaris-1  25.0M   202G  5.65G  /tmp/tmpOM0whB
rpool/ROOT/opensolaris-2  33.4M   202G  6.59G  /
rpool/ROOT/opensolaris-3   171M   202G  6.59G  /
rpool/ROOT/opensolaris-4  17.4M   202G  6.72G  /
rpool/ROOT/opensolaris-5   106M   202G  6.72G  /
rpool/ROOT/opensolaris-6  15.8G   202G  14.6G  /
rpool/dump                3.97G   202G  3.97G  -
rpool/export              1.82G   202G    21K  /export
rpool/export/home         1.82G   202G    22K  /export/home
rpool/export/home/jeff    1.82G   202G  1.82G  /export/home/jeff
rpool/swap                3.97G   206G   101M  -

Commands:

# 1) Create a zpool on the new disk
zpool create new c9d0

# 2) Create a snapshot of the existing data
zfs snapshot -r rpool/export@July26

# 3) Copy the data to the new hard drive
zfs send rpool/export@July26 | zfs receive new/export
zfs send rpool/export/home@July26 | zfs receive new/export/home
zfs send rpool/export/home/jeff@July26 | zfs receive new/export/home/jeff

# 4) Rename the old dataset
zfs rename rpool/export rpool/export_old

# 5) Push the old dataset to a new mount point for archival
zfs unmount rpool/export_old/home/jeff
zfs unmount rpool/export_old/home
zfs set mountpoint=/export_old rpool/export_old
zfs set mountpoint=/export_old/home rpool/export_old/home
zfs set mountpoint=/export_old/home/jeff rpool/export_old/home/jeff
zfs mount rpool/export_old/home
zfs mount rpool/export_old/home/jeff

# 6) Put the new hard drive on /export
zfs set mountpoint=/export new/export

When I was satisfied that I could log in as "jeff" and see the data on the new disk, I cleaned up the old disk:

# 7) Clean up the old disk
zfs destroy rpool/export_old/home/jeff@July26
zfs destroy rpool/export_old/home/jeff
zfs destroy rpool/export_old/home@July26
zfs destroy rpool/export_old/home
zfs destroy rpool/export_old@July26
zfs destroy rpool/export_old

After:

# zpool status
  pool: new
 state: ONLINE
 scrub: none requested
config:
        NAME      STATE     READ WRITE CKSUM
        new       ONLINE       0     0     0
          c9d0    ONLINE       0     0     0
errors: No known data errors

  pool: rpool
 state: ONLINE
 scrub: none requested
config:
        NAME      STATE     READ WRITE CKSUM
        rpool     ONLINE       0     0     0
          c8d0s0  ONLINE       0     0     0
errors: No known data errors

# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
new                       1.86G   683G    19K  /new
new/export                1.86G   683G    21K  /export
new/export/home           1.86G   683G    22K  /export/home
new/export/home/jeff      1.86G   683G  1.82G  /export/home/jeff
rpool                     24.4G   204G  82.5K  /rpool
rpool/ROOT                16.4G   204G    19K  legacy
rpool/ROOT/opensolaris     262M   204G  5.25G  /
rpool/ROOT/opensolaris-1  25.0M   204G  5.65G  /tmp/tmpOM0whB
rpool/ROOT/opensolaris-2  33.4M   204G  6.59G  /
rpool/ROOT/opensolaris-3   171M   204G  6.59G  /
rpool/ROOT/opensolaris-4  17.4M   204G  6.72G  /
rpool/ROOT/opensolaris-5   106M   204G  6.72G  /
rpool/ROOT/opensolaris-6  15.8G   204G  14.7G  /
rpool/dump                3.97G   204G  3.97G  -
rpool/swap                3.97G   208G   101M  -

Mission accomplished without needing to reboot the system!
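As an aside, on ZFS versions that support recursive send/receive (snv_111b does, as far as I know), steps 2 and 3 could be collapsed into a single recursive stream, for example:

# snapshot and copy rpool/export and all descendants in one pass
zfs snapshot -r rpool/export@July26
zfs send -R rpool/export@July26 | zfs receive -d new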


Sun

Java Monitoring and Tuning Example

Introduction: I've recently been working with several ISV's who were concerned about application server performance issues. Various application servers have been investigated, including WebLogic, WebSphere, Tomcat and JBoss; however, some of the tools and techniques necessary to monitor and resolve the application server performance were constant. Since Java is the underlying technology, one of the first issues to investigate is Java garbage collection. VisualGC is a good tool to understand the garbage collection patterns. This blog entry is a case study of a tuning exercise, shown in three stages.  In the first stage, the system is running well under a medium load. In the second stage, the load has been increased, causing response times to spike.  In the third stage, the Java parameters have been modified and the system is able to run well under a heavier load.   This blog entry is not intended to be an exhaustive and free-standing description of Java garbage collection; rather, it is intended to supplement other descriptions such as Tuning Garbage Collection with the 5.0 Java Virtual Machine and Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning.

Environment and tools: In this case study, a web container is deployed into its own Solaris Container, and it is the only busy process in the container.  This Solaris Container is using a pool of two quad-core SPARC64 VII CPUs, with 2 virtual CPUs per core, yielding a total of 16 virtual CPUs. Screen shots show the output from tools including VisualGC, prstat, jstat, and xcpustate:
VisualGC is a tool that is available for download with the jvmstat 3.0 package and displays a Java Virtual Machine's memory usage.
xcpustate is a tool that is available on the Solaris Companion CD and shows the status of each virtual CPU: blue indicates idle, green is busy in application code and yellow is busy in Solaris kernel code.
jstat is a text based tool that is included with the JDK and is used to display Java garbage collection statistics.  Not to be confused with jvmstat.
prstat is a text based tool that is included with Solaris and shows resource usage per process.

Java Options: The following Java options were in place as the project began and were left intact:
-server -XX:PermSize=512m -XX:MaxPermSize=720m -Xss256k -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -XX:+UseTLAB -XX:+DisableExplicitGC -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000

The following options were modified during the course of the project:

Initial Java Options    Modified Java Options
-Xms1024m               -Xms4096m
-Xmx2048m               -Xmx6144m
-XX:NewSize=128m        -XX:NewSize=1536m
-XX:MaxNewSize=256m     -XX:MaxNewSize=1536m
N/A                     -XX:SurvivorRatio=4
N/A                     -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -Xloggc:/web_dir/log/gc.log

Stage 1: "if it ain't broke, don't fix it"

In the first stage, the system is running well under a medium load. Here is a screen shot of VisualGC running a scenario where the end-user response times are good. The window on the left shows an instantaneous view of the Java memory statistics.  The window on the right shows strip charts of the statistics over time, where new data is published on the right side and slides to the left over time.  Eventually the historic data falls off the left side of the strip chart.
The right window has the following strip charts, from top to bottom:
Compile Time strip chart shows that the Java run time compiler has been kicking in, but this is a fairly light load, which was subsequently seen to disappear as all of the classes in use were fully compiled and optimized.
Class Loader strip chart shows that the class loader has recently been idle.
GC Time shows that Java garbage collection has been active, but the narrow spikes indicate short bursts of garbage collection. The narrow spikes are a good omen. If wide peaks are seen, it could be indicative of a problem.
Eden Space strip chart shows that the data quickly fills the 128MB Eden space and is then collected. When the frequency of the garbage collections approaches 1 collection per second or greater, aliasing can be a problem and you should use jstat to ascertain the actual frequency.
Survivor space strip chart shows that the survivor space is not being used.  Combined with a view of the Old Gen activity, I deduce that the tenuring process is not working and that data that survives one collection is promoted directly into the Old Generation.  I don't like this configuration, but the user response times are good, so I wouldn't change it, yet.  As the load increases, this will need to be resolved.
Old Gen strip chart shows that the memory is being filled as data is promoted but quickly cleaned up by the CMS collector.
Perm Gen strip chart shows a steady state of used space and some available space, which is good.

I believe in the idiom "if it ain't broke, don't fix it", so although I'm not in love with the above screen shot, I wouldn't change anything unless there is a tangible performance problem.  With this load, the current settings are working. That being said, I don't like that Eden is so small and that the Survivor space is not being used.  Objects that survive an initial young generation GC are being promoted directly to the Old Generation, bypassing the tenuring process.

The following screen shot was also taken while running the medium load scenario where the end-user response times are good. On the left, prstat shows that there is one busy Java process which has been using 7.5% of this Solaris Container's CPU resource. The xcpustate window shows that all of the CPUs are lightly loaded by the Java process.

The following jstat data was gathered while running the medium load scenario where the end-user response times are good. The following columns are displayed:
The first ten columns describe memory usage:
The first four columns, S0C, S1C, S0U, and S1U, are about survivor space: Capacity (C) and Utilization (U) for spaces 0 and 1. The capacity is 64 KB but neither space is utilized.
The next two columns are about Eden, showing Capacity (EC) and Utilization (EU).  The capacity is constant at approximately 128MB (130944KB). How much data is in Eden at the instantaneous time that the trace is taken depends on how much data has filled in since the most recent young generation collection and varies wildly.
The next two columns are about the Old Generation, showing Capacity (OC) and Utilization (OU).  The capacity is constant at 1.2 GB. How much data is in the Old Generation at the instantaneous time that the trace is taken depends on how much data has filled in since the most recent young generation collection and varies slowly.
Then there are two columns about the Permanent Generation, showing Capacity (PC) and Utilization (PU).
The capacity is constant at 720 MB (737280 KB) and the utilization is fairly flat near 440 MB (450560 KB) The final 5 columns are about the number of collections and time spent in garbage collection: YGC is the count of the total number of Young Generation Collections since the Java process was launched. #jstat -gc 15515 10000 S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT 64.0 64.0 0.0 0.0 130944.0 128571.1 1226784.0 657397.4 737280.0 450352.0 497 15.956 158 53.966 69.922 64.0 64.0 0.0 0.0 130944.0 32841.4 1226784.0 624092.3 737280.0 450367.8 503 16.154 160 54.918 71.072 64.0 64.0 0.0 0.0 130944.0 0.0 1226784.0 631067.9 737280.0 450251.8 506 16.245 162 54.963 71.208 64.0 64.0 0.0 0.0 130944.0 76982.1 1226784.0 603735.6 737280.0 450255.0 511 16.383 163 55.725 72.107 64.0 64.0 0.0 0.0 130944.0 118467.4 1226784.0 606041.5 737280.0 450265.4 515 16.538 165 56.504 73.042 64.0 64.0 0.0 0.0 130944.0 108089.0 1226784.0 644641.9 737280.0 450273.4 520 16.795 165 56.504 73.298 64.0 64.0 0.0 0.0 130944.0 65716.7 1226784.0 627548.7 737280.0 449934.1 523 16.901 167 57.483 74.384 64.0 64.0 0.0 0.0 130944.0 93575.9 1226784.0 602544.6 737280.0 449690.5 526 17.021 168 58.691 75.712 64.0 64.0 0.0 0.0 130944.0 65022.2 1226784.0 635921.9 737280.0 449698.6 531 17.217 169 58.736 75.953 64.0 64.0 0.0 0.0 130944.0 23657.6 1226784.0 626730.9 737280.0 449708.1 537 17.471 171 59.584 77.055\^C The Verbose GC log file confirms the jstat data. 1398.985: [GC[YG occupancy: 32842 K (131008 K)]1398.985: [Rescan (parallel) , 0.0162640 secs]1399.002: [weak refs processing, 0.3515853 secs]1399.353: [class unloading, 0.3569879 secs]1399.711: [scrub symbol & string tables, 0.1207449 secs] [1 CMS-remark: 757680K(1226784K)] 790522K(1357792K), 0.9462555 secs]1399.932: [CMS-concurrent-sweep-start]1400.006: [GC 1400.006: [ParNew: 130904K->0K(131008K), 0.0295234 secs] 887457K->758257K(1357792K), 0.0304601 secs]1400.485: [GC 1400.486: [ParNew: 130944K->0K(131008K), 0.0344479 secs] 872839K->751757K(1357792K), 0.0351573 secs]1401.453: [GC 1401.453: [ParNew: 130944K->0K(131008K), 0.0309997 secs] 830358K->704885K(1357792K), 0.0318843 secs]1402.550: [CMS-concurrent-sweep: 1.110/2.618 secs]1402.550: [CMS-concurrent-reset-start]1402.563: [CMS-concurrent-reset: 0.013/0.013 secs]1403.901: [GC 1403.901: [ParNew: 130944K->0K(131008K), 0.0325028 secs] 835829K->711510K(1357792K), 0.0332487 secs]1404.156: [GC [1 CMS-initial-mark: 711510K(1226784K)] 744661K(1357792K), 0.0442969 secs]1404.201: [CMS-concurrent-mark-start]1404.834: [GC 1404.835: [ParNew: 130941K->0K(131008K), 0.0357494 secs] 842451K->717262K(1357792K), 0.0365797 secs]1404.953: [GC 1404.954: [ParNew: 130893K->0K(131008K), 0.0183660 secs] 848155K->717517K(1357792K), 0.0190584 secs]1405.055: [GC 1405.055: [ParNew: 130944K->0K(131008K), 0.0173852 secs] 848461K->717814K(1357792K), 0.0180515 secs]1405.342: [GC 1405.343: [ParNew: 130944K->0K(131008K), 0.0203568 secs] 848758K->720848K(1357792K), 0.0211302 secs]1405.763: [GC 1405.763: [ParNew: 130944K->0K(131008K), 0.0204437 secs] 851792K->722671K(1357792K), 0.0211113 secs]1406.192: [GC 1406.192: [ParNew: 130944K->0K(131008K), 0.0204619 secs] 853615K->724768K(1357792K), 0.0214120 secs]1407.042: [GC 1407.042: [ParNew: 130944K->0K(131008K), 0.0326238 secs] 855712K->734062K(1357792K), 0.0334577 secs]1409.453: [GC 1409.453: [ParNew: 130944K->0K(131008K), 0.0340238 secs] 865006K->741292K(1357792K), 0.0347661 secs]1410.355: [CMS-concurrent-mark: 3.050/6.154 secs]1410.355: [CMS-concurrent-preclean-start]1412.357: [GC 1412.358: [ParNew: 
130944K->0K(131008K), 0.0334870 secs] 872236K->747988K(1357792K), 0.0344020 secs]1412.936: [CMS-concurrent-preclean: 0.366/2.581 secs]1412.936: [CMS-concurrent-abortable-preclean-start]1412.941: [CMS-concurrent-abortable-preclean: 0.005/0.006 secs]1412.948: [GC[YG occupancy: 66917 K (131008 K)]1412.949: [Rescan (parallel) , 0.0494186 secs]1412.998: [weak refs processing, 0.4237737 secs]1413.422: [class unloading, 0.2246387 secs]1413.647: [scrub symbol & string tables, 0.1256452 secs] [1 CMS-remark: 747988K(1226784K)] 814906K(1357792K), 0.9224302 secs]1413.872: [CMS-concurrent-sweep-start]1417.184: [GC 1417.185: [ParNew: 130944K->0K(131008K), 0.0439868 secs] 878065K->755909K(1357792K), 0.0450652 secs]

Resident Set Size

When memory is allocated to a Solaris process, virtual memory pages to fulfill the request are reserved in swap space, but this is a very lightweight event that does not require allocating RAM or writing data to the backing store (virtual memory disk).  The first time that any page is accessed, RAM is allocated and zeroed for the page. Finally, in a scenario where RAM is scarce, Solaris may need to page out the data to the backing store. It is not uncommon for Java users to notice a substantial gap between the swap space that has been reserved for the process compared to the amount of RAM that is being used.  Observe the prstat SIZE and RSS columns.  The "RSS" column (Resident Set Size) indicates how much RAM is in use by the process. The "SIZE" column displays the total virtual size of the process, including RAM that is allocated to the process, pages that have been reserved but never accessed, and data that has been paged out due to memory pressure.  In this case, there is no virtual memory pressure (seen via vmstat, not shown), so we know that there is more than 2 GB of memory space that has been reserved by the process but not yet accessed.

Going back to the VisualGC screen shot above, in the top left window, notice that Old Gen has:
Active data, solid orange, 664MB.
Empty (but already allocated) space, shown with a green hash.  The sum of the active data and the empty space is 1.17 GB.
Space that has been reserved but not yet allocated, shown with a gray hash.

The virtual memory footprint of the Old Generation is 1.87 GB, including active, empty but allocated, and reserved space. The jstat Old Capacity "OC" column (1226784 KB) matches the 1.17 GB in the VisualGC output and indicates the sum of the memory that has been allocated to the Old Generation, but not pages that have been reserved and not yet allocated. More detail about Solaris memory can be found at sources including Solaris 10 Physical Memory Management and The Solaris Memory System - Sizing, Tools and Architecture.

The following graph was produced from vmstat data gathered while running the medium load scenario where the end-user response times are good. (See Graphing Solaris Performance Stats with gnuplot) You can see that there is a consistent light CPU load on the system throughout the half hour test.

Stage 2: It's broken

In the second stage, the load has been increased, causing response times to spike. The processing alternates between two phases.
Garbage collection runs constantly while the test is running, but the Old Generation CMS collector can't clear out the data fast enough.

What I don't like about the following VisualGC:
1) GC is constantly running
2) Survivor space is not being used
3) Two phases: (a) a phase that can be seen in the Eden Space strip chart with very frequent Eden GC's concurrent with application work, and (b) a phase that can be seen in the Old Gen strip chart where the application can not execute due to exclusive Old Generation GC work.
4) CMS is not able to hold Old Gen down

The following jstat data was gathered during a run where the load has been increased, causing response times to spike. There is one sample every 10 seconds (10000 ms) and twenty samples to cover a total of 200 seconds.  Observing the YGC column, notice that there are 685 (3675 - 2990) Young Generation Collections, for an average rate of 3.4 collections/second.  A tuning goal will be to slow this rate down.

#jstat -gc $PID 10000 20
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1553383.3 737280.0 449596.9 2990 119.803 123 283.914 403.717
64.0 64.0 0.0 0.0 130944.0 93869.1 1966080.0 1886113.2 737280.0 449599.9 3046 121.996 123 283.914 405.910
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1942614.0 737280.0 449599.9 3055 122.564 124 283.914 406.478
64.0 64.0 0.0 0.0 130944.0 57856.7 1966080.0 1364918.4 737280.0 449602.5 3093 124.077 125 294.691 418.768
64.0 64.0 0.0 0.0 130944.0 0.0 1966080.0 1714499.8 737280.0 449606.8 3149 126.198 125 294.691 420.889
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1943757.0 737280.0 449608.7 3186 127.848 126 294.691 422.539
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1943757.0 737280.0 449605.8 3186 127.848 126 294.691 422.539
64.0 64.0 0.0 0.0 130944.0 0.0 1966080.0 1389764.8 737280.0 449616.0 3231 129.533 127 307.723 437.256
64.0 64.0 0.0 0.0 130944.0 101211.9 1966080.0 1741419.7 737280.0 449619.9 3286 131.759 127 307.723 439.483
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1943870.4 737280.0 449629.2 3323 133.224 128 307.723 440.947
64.0 64.0 0.0 0.0 130944.0 91907.9 1966080.0 1116053.3 737280.0 449621.3 3330 133.492 129 319.118 452.610
64.0 64.0 0.0 0.0 130944.0 0.0 1966080.0 1541494.5 737280.0 449631.2 3392 135.838 129 319.118 454.956
64.0 64.0 0.0 0.0 130944.0 53983.1 1966080.0 1907711.7 737280.0 449651.6 3444 137.955 129 319.118 457.073
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1942148.9 737280.0 449659.3 3450 138.658 130 319.118 457.776
64.0 64.0 0.0 0.0 130944.0 71632.8 1966080.0 1291300.9 737280.0 449672.5 3484 139.935 131 331.870 471.805
64.0 64.0 0.0 0.0 130944.0 0.0 1966080.0 1661931.9 737280.0 449687.8 3535 142.096 131 331.870 473.966
64.0 64.0 0.0 0.0 130944.0 55150.8 1966080.0 1930792.5 737280.0 449693.2 3581 143.826 131 331.870 475.696
64.0 64.0 0.0 0.0 130944.0 130944.0 1966080.0 1941935.7 737280.0 449693.2 3583 144.291 132 331.870 476.161
64.0 64.0 0.0 0.0 130944.0 66316.6 1966080.0 1405189.1 737280.0 449701.0 3631 146.142 133 342.789 488.931
64.0 64.0 0.0 0.0 130944.0 0.0 1966080.0 1744177.7 737280.0 449704.4 3675 148.066 133 342.789 490.855

The Verbose GC log file confirms the jstat data.
1445.781: [GC 1445.781: [ParNew: 130944K->0K(131008K), 0.0528409 secs] 2048880K->1929346K(2097088K), 0.0733931 secs]1446.057: [GC 1446.077: [ParNew: 130944K->0K(131008K), 0.0507988 secs] 2060290K->1937743K(2097088K), 0.0519872 secs]1446.328: [GC 1446.330: [ParNew (promotion failed): 130944K->130944K(131008K), 0.3677102 secs]1446.698: [CMS1448.338: [CMS-concurrent-mark: 8.326/20.506 secs] (concurrent mode failure): 1943651K->1332161K(1966080K), 13.5947895 secs] 2068687K->1332161K(2097088K), 13.9636616 secs]1460.373: [GC [1 CMS-initial-mark: 1332161K(1966080K)] 1364958K(2097088K), 0.0570809 secs]1460.431: [CMS-concurrent-mark-start]1460.513: [GC 1460.513: [ParNew: 130944K->0K(131008K), 0.0629766 secs] 1463105K->1340611K(2097088K), 0.0638556 secs]1460.644: [GC 1460.645: [ParNew: 130944K->0K(131008K), 0.0287314 secs] 1471555K->1344385K(2097088K), 0.0295894 secs]1460.752: [GC 1460.752: [ParNew: 130944K->0K(131008K), 0.0313205 secs] 1475329K->1349195K(2097088K), 0.0325400 secs]1460.859: [GC 1460.860: [ParNew: 130944K->0K(131008K), 0.0354078 secs] 1480139K->1353073K(2097088K), 0.0363298 secs]1460.988: [GC 1460.988: [ParNew: 130944K->0K(131008K), 0.0363377 secs] 1484017K->1356257K(2097088K), 0.0374367 secs]1461.164: [GC 1461.164: [ParNew: 130944K->0K(131008K), 0.0479562 secs] 1487201K->1362016K(2097088K), 0.0488208 secs]1461.439: [GC 1461.439: [ParNew: 130944K->0K(131008K), 0.0507014 secs] 1492960K->1372934K(2097088K), 0.0515876 secs]1461.579: [GC 1461.580: [ParNew: 130944K->0K(131008K), 0.0319675 secs] 1503878K->1375415K(2097088K), 0.0332522 secs]1461.693: [GC 1461.693: [ParNew: 130944K->0K(131008K), 0.0340291 secs] 1506359K->1378926K(2097088K), 0.0348877 secs]1461.821: [GC 1461.821: [ParNew: 130944K->0K(131008K), 0.0314839 secs] 1509870K->1382392K(2097088K), 0.0323204 secs]1461.950: [GC 1461.951: [ParNew: 130943K->0K(131008K), 0.0323091 secs] 1513336K->1386662K(2097088K), 0.0331815 secs]1462.123: [GC 1462.124: [ParNew: 130944K->0K(131008K), 0.0385567 secs] 1517606K->1391853K(2097088K), 0.0398456 secs]1462.380: [GC 1462.380: [ParNew: 130944K->0K(131008K), 0.0698981 secs] 1522797K->1405319K(2097088K), 0.0708113 secs]1462.627: [GC 1462.628: [ParNew: 130944K->0K(131008K), 0.0482389 secs] 1536263K->1417126K(2097088K), 0.0493317 secs]1462.835: [GC 1462.835: [ParNew: 130944K->0K(131008K), 0.0438596 secs] 1548070K->1423124K(2097088K), 0.0447105 secs]1462.989: [GC 1462.989: [ParNew: 130944K->0K(131008K), 0.0356933 secs] 1554068K->1430934K(2097088K), 0.0365304 secs]1463.156: [GC 1463.156: [ParNew: 130944K->0K(131008K), 0.0426650 secs] 1561878K->1437914K(2097088K), 0.0438581 secs]1463.342: [GC 1463.342: [ParNew: 130944K->0K(131008K), 0.0369751 secs] 1568858K->1445730K(2097088K), 0.0378227 secs]1463.508: [GC 1463.508: [ParNew: 130944K->0K(131008K), 0.0393694 secs] 1576674K->1458926K(2097088K), 0.0402032 secs]1463.709: [GC 1463.709: [ParNew: 130944K->0K(131008K), 0.0351472 secs] 1589870K->1465137K(2097088K), 0.0364246 secs]1463.860: [GC 1463.860: [ParNew: 130944K->0K(131008K), 0.0399990 secs] 1596081K->1468331K(2097088K), 0.0408329 secs]1463.998: [GC 1463.998: [ParNew: 130944K->0K(131008K), 0.0338172 secs] 1599275K->1471847K(2097088K), 0.0346756 secs]1464.123: [GC 1464.123: [ParNew: 130944K->0K(131008K), 0.0281716 secs] 1602791K->1475013K(2097088K), 0.0292473 secs]1464.235: [GC 1464.235: [ParNew: 130944K->0K(131008K), 0.0282486 secs] 1605957K->1477824K(2097088K), 0.0293566 secs]1464.361: [GC 1464.361: [ParNew: 130944K->0K(131008K), 0.0319564 secs] 1608768K->1480943K(2097088K), 0.0328203 secs]1464.590: [GC 
1464.590: [ParNew: 130944K->0K(131008K), 0.0430505 secs] 1611887K->1486579K(2097088K), 0.0439285 secs]1464.875: [GC 1464.875: [ParNew: 130944K->0K(131008K), 0.0404393 secs] 1617523K->1496962K(2097088K), 0.0415705 secs]1465.079: [GC 1465.079: [ParNew: 130944K->0K(131008K), 0.0389969 secs] 1627906K->1509873K(2097088K), 0.0398899 secs]1465.318: [GC 1465.318: [ParNew: 130944K->0K(131008K), 0.0377232 secs] 1640817K->1518156K(2097088K), 0.0386229 secs]1465.533: [GC 1465.534: [ParNew: 130939K->0K(131008K), 0.0433747 secs] 1649096K->1527800K(2097088K), 0.0446424 secs]1465.652: [GC 1465.653: [ParNew: 130944K->0K(131008K), 0.0267976 secs] 1658744K->1530747K(2097088K), 0.0276501 secs]1465.754: [GC 1465.755: [ParNew: 130944K->0K(131008K), 0.0382624 secs] 1661691K->1533772K(2097088K), 0.0390969 secs]1465.883: [GC 1465.884: [ParNew: 130944K->0K(131008K), 0.0346790 secs] 1664716K->1538809K(2097088K), 0.0355499 secs]1466.010: [GC 1466.010: [ParNew: 130944K->0K(131008K), 0.0360124 secs] 1669753K->1542202K(2097088K), 0.0368312 secs]1466.133: [GC 1466.134: [ParNew: 130896K->0K(131008K), 0.0264424 secs] 1673098K->1545037K(2097088K), 0.0276615 secs]1466.284: [GC 1466.284: [ParNew: 130944K->0K(131008K), 0.0334272 secs] 1675981K->1551352K(2097088K), 0.0342620 secs]1466.486: [GC 1466.487: [ParNew: 130944K->0K(131008K), 0.0468365 secs] 1682296K->1559392K(2097088K), 0.0476777 secs]1466.761: [GC 1466.762: [ParNew: 130944K->0K(131008K), 0.0668140 secs] 1690336K->1571022K(2097088K), 0.0679247 secs]1467.043: [GC 1467.043: [ParNew: 130944K->0K(131008K), 0.0595864 secs] 1701966K->1579608K(2097088K), 0.0604566 secs]1467.294: [GC 1467.294: [ParNew: 130944K->0K(131008K), 0.0422655 secs] 1710552K->1586731K(2097088K), 0.0433685 secs]1467.412: [GC 1467.413: [ParNew: 130915K->0K(131008K), 0.0357374 secs] 1717646K->1590261K(2097088K), 0.0365601 secs]1467.523: [GC 1467.523: [ParNew: 130944K->0K(131008K), 0.0368735 secs] 1721205K->1592832K(2097088K), 0.0376957 secs]1467.637: [Full GC 1467.637: [ParNew: 130901K->0K(131008K), 0.0387518 secs] 1723877K->1595360K(2097088K), 0.0396553 secs]1467.755: [GC 1467.755: [ParNew: 130944K->0K(131008K), 0.0328768 secs] 1726304K->1597889K(2097088K), 0.0337467 secs]1467.913: [GC 1467.913: [ParNew: 130943K->0K(131008K), 0.0321930 secs] 1728833K->1601670K(2097088K), 0.0332656 secs]1468.115: [GC 1468.115: [ParNew: 130944K->0K(131008K), 0.0378134 secs] 1732614K->1608985K(2097088K), 0.0386423 secs]1468.259: [GC 1468.260: [ParNew: 130944K->0K(131008K), 0.0389691 secs] 1739929K->1619450K(2097088K), 0.0398194 secs]1468.378: [GC 1468.379: [ParNew: 130939K->0K(131008K), 0.0346053 secs] 1750390K->1622375K(2097088K), 0.0354541 secs]1468.509: [GC 1468.509: [ParNew: 130931K->0K(131008K), 0.0329106 secs] 1753307K->1625703K(2097088K), 0.0339618 secs]1468.627: [GC 1468.636: [ParNew: 130874K->0K(131008K), 0.0315224 secs] 1756578K->1630866K(2097088K), 0.0323764 secs]1468.751: [GC 1468.752: [ParNew: 130944K->0K(131008K), 0.0316954 secs] 1761810K->1633494K(2097088K), 0.0325242 secs]1468.926: [GC 1468.933: [ParNew: 130944K->0K(131008K), 0.0361699 secs] 1764438K->1641317K(2097088K), 0.0371095 secs]1469.145: [GC 1469.146: [ParNew: 130944K->0K(131008K), 0.0367411 secs] 1772261K->1647218K(2097088K), 0.0378685 secs]1469.380: [GC 1469.380: [ParNew: 130944K->0K(131008K), 0.0446357 secs] 1778162K->1653998K(2097088K), 0.0454686 secs]1469.688: [GC 1469.689: [ParNew: 130944K->0K(131008K), 0.0390912 secs] 1784942K->1661565K(2097088K), 0.0402177 secs]1469.951: [GC 1469.951: [ParNew: 130944K->0K(131008K), 0.0486676 secs] 
1792509K->1671504K(2097088K), 0.0495607 secs]1470.248: [GC 1470.249: [ParNew: 130944K->0K(131008K), 0.0414052 secs] 1802448K->1676829K(2097088K), 0.0425514 secs]1470.447: [GC 1470.447: [ParNew: 130944K->0K(131008K), 0.0342887 secs] 1807773K->1683630K(2097088K), 0.0351706 secs]1470.610: [GC 1470.611: [ParNew: 130940K->0K(131008K), 0.0319829 secs] 1814571K->1688471K(2097088K), 0.0332171 secs]1470.722: [GC 1470.722: [ParNew: 130944K->0K(131008K), 0.0311625 secs] 1819415K->1692388K(2097088K), 0.0319923 secs]1470.852: [GC 1470.852: [ParNew: 130944K->0K(131008K), 0.0314821 secs] 1823332K->1696163K(2097088K), 0.0389738 secs]1470.967: [GC 1470.979: [ParNew: 130944K->0K(131008K), 0.0318559 secs] 1827107K->1700627K(2097088K), 0.0326812 secs]1471.085: [GC 1471.086: [ParNew: 130937K->0K(131008K), 0.0328143 secs] 1831564K->1704737K(2097088K), 0.0336577 secs]1471.229: [GC 1471.229: [ParNew: 130944K->0K(131008K), 0.0335674 secs] 1835681K->1709344K(2097088K), 0.0343768 secs]1471.385: [GC 1471.386: [ParNew: 130944K->0K(131008K), 0.0406892 secs] 1840288K->1714787K(2097088K), 0.0419120 secs]1471.516: [GC 1471.516: [ParNew: 130944K->0K(131008K), 0.0295618 secs] 1845731K->1717584K(2097088K), 0.0303950 secs]1471.625: [GC 1471.625: [ParNew: 130944K->0K(131008K), 0.0376045 secs] 1848528K->1720845K(2097088K), 0.0384246 secs]1471.750: [GC 1471.750: [ParNew: 130870K->0K(131008K), 0.0260799 secs] 1851715K->1723365K(2097088K), 0.0269050 secs]1471.839: [GC 1471.840: [ParNew: 130944K->0K(131008K), 0.0255170 secs] 1854309K->1724726K(2097088K), 0.0263177 secs]1471.988: [GC 1471.988: [ParNew: 130944K->0K(131008K), 0.0285268 secs] 1855670K->1728103K(2097088K), 0.0295660 secs]1472.143: [GC 1472.143: [ParNew: 130944K->0K(131008K), 0.0392845 secs] 1859047K->1738951K(2097088K), 0.0401130 secs]1472.362: [GC 1472.362: [ParNew: 130944K->0K(131008K), 0.0379261 secs] 1869895K->1746709K(2097088K), 0.0387573 secs]1472.554: [GC 1472.554: [ParNew: 130944K->0K(131008K), 0.0431822 secs] 1877653K->1752618K(2097088K), 0.0442519 secs]1472.672: [GC 1472.673: [ParNew: 130944K->0K(131008K), 0.0434003 secs] 1883562K->1754549K(2097088K), 0.0442311 secs]1472.795: [GC 1472.796: [ParNew: 130944K->0K(131008K), 0.0295359 secs] 1885493K->1757812K(2097088K), 0.0304006 secs]1472.904: [GC 1472.904: [ParNew: 130944K->0K(131008K), 0.0386588 secs] 1888756K->1760377K(2097088K), 0.0395649 secs]1473.035: [GC 1473.036: [ParNew: 130861K->0K(131008K), 0.0297446 secs] 1891238K->1763237K(2097088K), 0.0308653 secs]1473.188: [GC 1473.201: [ParNew: 130944K->0K(131008K), 0.0319938 secs] 1894181K->1767389K(2097088K), 0.0333206 secs]1473.401: [GC 1473.401: [ParNew: 130944K->0K(131008K), 0.0387580 secs] 1898333K->1773804K(2097088K), 0.0396182 secs]1473.522: [GC 1473.538: [ParNew: 130944K->0K(131008K), 0.0312055 secs] 1904748K->1778637K(2097088K), 0.0320711 secs]1473.650: [GC 1473.650: [ParNew: 130944K->0K(131008K), 0.0346679 secs] 1909581K->1784622K(2097088K), 0.0355099 secs]1473.769: [GC 1473.780: [ParNew: 130944K->0K(131008K), 0.0325029 secs] 1915566K->1787148K(2097088K), 0.0336199 secs]1473.889: [GC 1473.890: [ParNew: 130944K->0K(131008K), 0.0306312 secs] 1918092K->1791684K(2097088K), 0.0314725 secs]1474.023: [GC 1474.025: [ParNew: 130944K->0K(131008K), 0.0346581 secs] 1922628K->1800742K(2097088K), 0.0354589 secs]1474.219: [GC 1474.219: [ParNew: 130944K->0K(131008K), 0.0431710 secs] 1931686K->1811004K(2097088K), 0.0440093 secs]1474.392: [GC 1474.393: [ParNew: 130944K->0K(131008K), 0.0384690 secs] 1941948K->1820039K(2097088K), 0.0396068 secs]1474.572: [GC 1474.584: 
[ParNew: 130944K->0K(131008K), 0.0413289 secs] 1950983K->1827458K(2097088K), 0.0426382 secs]1474.801: [GC 1474.813: [ParNew: 130944K->0K(131008K), 0.0385987 secs] 1958402K->1837161K(2097088K), 0.0627487 secs]1475.046: [GC 1475.047: [ParNew: 130944K->0K(131008K), 0.0488532 secs] 1968105K->1845650K(2097088K), 0.0499528 secs]1475.262: [GC 1475.262: [ParNew: 130944K->0K(131008K), 0.0512485 secs] 1976594K->1853035K(2097088K), 0.0521045 secs]1475.458: [GC 1475.458: [ParNew: 130944K->0K(131008K), 0.0461453 secs] 1983979K->1862019K(2097088K), 0.0469998 secs]1475.652: [GC 1475.653: [ParNew: 130944K->0K(131008K), 0.0447249 secs] 1992963K->1868886K(2097088K), 0.0458052 secs]1475.763: [GC 1475.780: [ParNew: 130928K->0K(131008K), 0.0324786 secs] 1999815K->1873299K(2097088K), 0.0333689 secs]1475.882: [GC 1475.882: [ParNew: 130944K->0K(131008K), 0.0340021 secs] 2004243K->1876399K(2097088K), 0.0348220 secs]1475.993: [GC 1475.994: [ParNew: 130944K->0K(131008K), 0.0299435 secs] 2007343K->1883168K(2097088K), 0.0308629 secs]1476.097: [GC 1476.097: [ParNew: 130944K->0K(131008K), 0.0343554 secs] 2014112K->1887994K(2097088K), 0.0351879 secs]1476.218: [GC 1476.218: [ParNew: 130904K->0K(131008K), 0.0331282 secs] 2018898K->1891141K(2097088K), 0.0341797 secs]1476.375: [GC 1476.376: [ParNew: 130944K->0K(131008K), 0.0460486 secs] 2022085K->1896937K(2097088K), 0.0472445 secs]1476.522: [GC 1476.522: [ParNew: 130937K->0K(131008K), 0.0305741 secs] 2027875K->1900685K(2097088K), 0.0313665 secs]1476.630: [GC 1476.630: [ParNew: 130944K->0K(131008K), 0.0354049 secs] 2031629K->1903695K(2097088K), 0.0362252 secs]1476.749: [GC 1476.750: [ParNew: 130944K->0K(131008K), 0.0332512 secs] 2034639K->1906008K(2097088K), 0.0343826 secs]1476.861: [GC 1476.882: [ParNew: 130944K->0K(131008K), 0.0302215 secs] 2036952K->1908090K(2097088K), 0.0310552 secs]1477.030: [GC 1477.030: [ParNew: 130944K->0K(131008K), 0.0308026 secs] 2039034K->1910954K(2097088K), 0.0316133 secs]1477.250: [GC 1477.251: [ParNew: 130944K->0K(131008K), 0.0385982 secs] 2041898K->1916426K(2097088K), 0.0397542 secs]1477.451: [GC 1477.452: [ParNew: 130944K->0K(131008K), 0.0355202 secs] 2047370K->1922843K(2097088K), 0.0363764 secs]1477.641: [GC 1477.641: [ParNew: 130944K->0K(131008K), 0.0374713 secs] 2053787K->1928347K(2097088K), 0.0383333 secs]1477.864: [GC 1477.865: [ParNew: 130944K->0K(131008K), 0.0420856 secs] 2059291K->1935857K(2097088K), 0.0432463 secs]1478.088: [GC 1478.089: [ParNew (promotion failed): 130944K->130944K(131008K), 0.2115119 secs]1478.300: [CMS1480.678: [CMS-concurrent-mark: 8.657/20.247 secs] (concurrent mode failure): 1943323K->1067360K(1966080K), 12.8686984 secs] 2066801K->1067360K(2097088K), 13.0813570 secs]1491.257: [GC [1 CMS-initial-mark: 1067360K(1966080K)] 1106126K(2097088K), 0.0660923 secs]1491.325: [CMS-concurrent-mark-start]1491.415: [GC 1491.416: [ParNew: 130944K->0K(131008K), 0.0628546 secs] 1198304K->1074342K(2097088K), 0.0642408 secs]1491.569: [GC 1491.569: [ParNew: 130944K->0K(131008K), 0.0283941 secs] 1205286K->1077410K(2097088K), 0.0292610 secs] The following screen shot shows VisualGC, prstat and xcpustate.  There is a sustained period of single threaded behavior. Of the 664 threads in the Java process, only one is able to run.  The following screen shot shows VisualGC, prstat and xcpustate.  After the single threaded phase, there is a phase where many of the threads that had been blocked by garbage collection become runnable.  Every virtual CPU's is busy processing runnable threads.  
The vmstat data shows the stages of single-threaded garbage collection (1/16 CPUs = 6.25%) alternating with stages where the CPUs are extremely busy processing application code.  The pattern in the vmstat data is more obvious when only 10 minutes of data is displayed.

Stage 3: Fix it

In the third stage, the Java parameters have been modified and the system is able to run well under a heavier load.

Initial Java Options    Modified Java Options
-Xms1024m               -Xms4096m
-Xmx2048m               -Xmx6144m
-XX:NewSize=128m        -XX:NewSize=1536m
-XX:MaxNewSize=256m     -XX:MaxNewSize=1536m
N/A                     -XX:SurvivorRatio=4
N/A                     -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -Xloggc:/web_dir/log/gc.log

What I like about the following VisualGC:
1) GC is not constantly running
2) Survivor space is being used (5 generations) but not full
3) CMS is able to hold Old Gen down
4) Eden GCs are about one per second.

The following jstat data was gathered during a run where the Java parameters have been modified and the system is able to run well under a heavier load.  There is one sample every 10 seconds (10000 ms) and twenty samples to cover a total of 200 seconds.  Observing the YGC column, notice that there are 126 (943 - 817) Young Generation Collections, for an average rate of 0.63 collections/second (one every 1.6 seconds).  Also notice that the Survivor space is now active. In the VisualGC Survivor Age Histogram, above, observe that data survives 5 generations before being promoted to the Old Generation.  This implies that promoted objects are those still referenced after about 8 seconds (1.6 seconds x 5 collections).  Most Java objects are not referenced for 8 seconds and therefore are never promoted into the Old Generation.

#jstat -gc $PID 10000 20
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
262144.0 262144.0 0.0 135491.4 1048576.0 740774.1 2621440.0 1518371.0 737280.0 454756.7 817 204.672 119 105.404 310.076
262144.0 262144.0 143538.3 0.0 1048576.0 429976.2 2621440.0 1667952.6 737280.0 454763.5 826 206.506 119 105.404 311.911
262144.0 262144.0 127788.8 0.0 1048576.0 648808.0 2621440.0 1772166.9 737280.0 454766.5 832 207.969 120 105.404 313.373
262144.0 262144.0 149593.1 0.0 1048576.0 598492.1 2621440.0 1515637.6 737280.0 454766.5 840 209.517 121 107.803 317.320
262144.0 262144.0 162023.5 0.0 1048576.0 1008524.0 2621440.0 1656240.4 737280.0 454768.1 846 210.922 121 107.803 318.725
262144.0 262144.0 0.0 125334.4 1048576.0 953097.1 2621440.0 1836384.7 737280.0 454769.0 853 212.890 121 107.803 320.694
262144.0 262144.0 151573.6 152908.7 1048576.0 1048576.0 2621440.0 1687193.8 737280.0 454769.8 859 214.153 122 109.747 323.900
262144.0 262144.0 0.0 127784.9 1048576.0 170487.4 2621440.0 1597313.5 737280.0 454764.8 865 216.265 123 110.197 326.462
262144.0 262144.0 125183.1 0.0 1048576.0 679868.2 2621440.0 1674807.1 737280.0 454765.9 870 217.484 123 110.197 327.682
262144.0 262144.0 0.0 138781.8 1048576.0 941507.7 2621440.0 1500453.3 737280.0 454766.5 875 218.688 124 112.160 330.848
262144.0 262144.0 154681.9 98652.8 1048576.0 1048576.0 2621440.0 1419845.8 737280.0 454770.0 881 220.185 125 112.576 332.761
262144.0 262144.0 140635.1 0.0 1048576.0 184017.0 2621440.0 1541698.1 737280.0 454774.3 888 222.260 125 112.576 334.836
262144.0 262144.0 126225.1 0.0 1048576.0 800352.6 2621440.0 1664977.6 737280.0 454777.0 894 224.013 125 112.576 336.589
262144.0 262144.0 128930.2 0.0 1048576.0 709898.1 2621440.0 1423721.6 737280.0 454779.2 900 225.276 126 114.513 339.788
262144.0 262144.0 0.0 135929.5 1048576.0 901493.1 2621440.0
1500049.7 737280.0 454783.5 907 227.254 127 114.962 342.216262144.0 262144.0 78006.9 156657.5 1048576.0 1048576.0 2621440.0 1682226.6 737280.0 454785.8 916 229.098 127 114.962 344.060262144.0 262144.0 112938.7 0.0 1048576.0 707242.9 2621440.0 1833853.7 737280.0 454790.2 922 230.749 128 114.962 345.712262144.0 262144.0 0.0 119538.9 1048576.0 466447.4 2621440.0 1587738.4 737280.0 454790.8 929 232.260 129 117.110 349.370262144.0 262144.0 141025.9 131586.9 1048576.0 1048576.0 2621440.0 1774825.3 737280.0 454791.9 937 233.762 129 117.110 350.872262144.0 262144.0 0.0 116953.7 1048576.0 562706.0 2621440.0 1886004.9 737280.0 454792.8 943 235.572 129 117.110 352.683 The Verbose GC log file confirms the jstat data. 1063.533: [GC[YG occupancy: 740130 K (1310720 K)]1063.534: [Rescan (parallel) , 0.2932129 secs]1063.827: [weak refs processing, 0.8300947 secs]1064.658: [class unloading, 0.2556826 secs]1064.914: [scrub symbol & string tables, 0.1225031 secs] [1 CMS-remark: 1362667K(2621440K)] 2102798K(3932160K), 1.6268772 secs]1065.161: [CMS-concurrent-sweep-start]1065.744: [GC 1065.748: [ParNew: 1199794K->136002K(1310720K), 0.3157680 secs] 2562442K->1538117K(3932160K), 0.3173270 secs]1067.540: [GC 1067.541: [ParNew: 1183955K->134897K(1310720K), 0.3643224 secs] 2516046K->1487739K(3932160K), 0.3657230 secs]1069.449: [GC 1069.452: [ParNew: 1183473K->132663K(1310720K), 0.4440330 secs] 2398115K->1372567K(3932160K), 0.4454642 secs]1071.165: [GC 1071.165: [ParNew: 1181239K->117685K(1310720K), 0.2458847 secs] 2366563K->1322400K(3932160K), 0.2472765 secs]1073.099: [GC 1073.100: [ParNew: 1166261K->143983K(1310720K), 0.2949013 secs] 2230754K->1208476K(3932160K), 0.2964760 secs]1074.020: [CMS-concurrent-sweep: 3.145/8.858 secs]1074.020: [CMS-concurrent-reset-start]1074.056: [CMS-concurrent-reset: 0.036/0.036 secs]1074.700: [GC 1074.701: [ParNew: 1192514K->130782K(1310720K), 0.2122237 secs] 2257007K->1216167K(3932160K), 0.2136974 secs]1075.304: [GC [1 CMS-initial-mark: 1085384K(2621440K)] 1482522K(3932160K), 0.3090166 secs]1075.614: [CMS-concurrent-mark-start]1076.299: [GC 1076.300: [ParNew: 1179358K->142731K(1310720K), 0.2360216 secs] 2264743K->1228116K(3932160K), 0.2373256 secs]1078.147: [GC 1078.148: [ParNew: 1191307K->146389K(1310720K), 0.3161631 secs] 2276692K->1249628K(3932160K), 0.3176056 secs]1080.368: [GC 1080.369: [ParNew: 1194965K->140667K(1310720K), 0.2221885 secs] 2298204K->1266323K(3932160K), 0.2235858 secs]1082.719: [GC 1082.720: [ParNew: 1189243K->156618K(1310720K), 0.4543953 secs] 2314899K->1292754K(3932160K), 0.4557551 secs]1083.887: [GC 1083.888: [ParNew: 1205108K->101152K(1310720K), 0.1484881 secs] 2341243K->1271768K(3932160K), 0.1498135 secs]1084.462: [CMS-concurrent-mark: 3.962/8.848 secs]1084.462: [CMS-concurrent-preclean-start]1085.434: [CMS-concurrent-preclean: 0.859/0.971 secs]1085.434: [CMS-concurrent-abortable-preclean-start]1086.081: [GC 1086.082: [ParNew: 1149728K->135612K(1310720K), 0.3098950 secs] 2320344K->1306228K(3932160K), 0.3113219 secs]1087.294: [CMS-concurrent-abortable-preclean: 0.459/1.860 secs]1087.342: [GC[YG occupancy: 667129 K (1310720 K)]1087.342: [Rescan (parallel) , 0.3074071 secs]1087.650: [weak refs processing, 0.9052933 secs]1088.555: [class unloading, 0.2571331 secs]1088.813: [scrub symbol & string tables, 0.1228506 secs] [1 CMS-remark: 1170615K(2621440K)] 1837745K(3932160K), 1.7140749 secs]1089.057: [CMS-concurrent-sweep-start]1089.396: [GC 1089.397: [ParNew: 1184134K->139422K(1310720K), 0.1621469 secs] 2354726K->1320441K(3932160K), 0.1636681 secs]1091.280: [GC 
1091.281: [ParNew: 1187998K->154229K(1310720K), 0.2434008 secs] 2295339K->1273265K(3932160K), 0.2449105 secs]1093.547: [GC 1093.548: [ParNew: 1202805K->150707K(1310720K), 0.3279749 secs] 2156871K->1125761K(3932160K), 0.3295486 secs]

The application is able to run most of the time, but even with the improved Java options, there are short periods of single-threaded computation, which are likely attributable to the "initial mark" or the "remark" phases of CMS.  See this description of CMS for more details. When zooming in on the vmstat data, the short and acceptable phases of single-threaded activity are more obvious.  Notice how much better this is than the long phases of single-threaded behavior that were visible with the initial Java options.

Conclusion: "One-size fits all" Java options are not necessarily a good match for your application.  Monitoring garbage collection with a combination of several tools and using the feedback to adjust the Java options for your application can result in a substantial performance improvement.
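To tie the two tables together, the final launch options look roughly like this; the flags are simply the unchanged options plus the "Modified" column from above, and the application main class and any app-specific arguments (the trailing "...") are omitted:

java -server \
  -Xms4096m -Xmx6144m \
  -XX:NewSize=1536m -XX:MaxNewSize=1536m -XX:SurvivorRatio=4 \
  -XX:PermSize=512m -XX:MaxPermSize=720m -Xss256k \
  -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode \
  -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled \
  -XX:+UseTLAB -XX:+DisableExplicitGC \
  -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 \
  -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
  -XX:+PrintTenuringDistribution -Xloggc:/web_dir/log/gc.log \
  ...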


Sun

OpenOffice Calc cut&paste to Thunderbird

This blog entry is about how to copy OpenOffice 3.1 data into a Thunderbird 2.0 e-mail. First I used "gimp" to save a jpg version of the graph, then inserted it into the e-mail: Thunderbird -> Insert -> Image. OK, not too hard, but a more direct method would be convenient.

Inserting the OpenOffice Calc cells as a Thunderbird table is almost easy.  Just select the cells in Calc, <ctrl>C, and paste into Thunderbird with <ctrl>V. The problem, IMHO, is that when the OpenOffice Calc data is pasted into the Thunderbird message, all of the existing text in the Thunderbird message is assigned a font that is "too small".  Changing the font of the existing text was not an intended consequence of inserting a table into the message. If you use "Paste Without Formatting", Thunderbird puts text, not an HTML table, into your email, which may or may not be OK depending on the task at hand.

After you have pasted a table into your e-mail, if you want to increase the font size of some or all of your data, you can select any area and use "<ctrl>+" (or Format -> Size -> Larger) to manually set the font size to what you had in mind. Instead, I prefer to use the following approach to return to the original font size:
1) After the paste, in Thunderbird, type "<ctrl>A" to select all of the text
2) Click "Insert" -> "HTML" from the Thunderbird pulldown menus.  This brings up a window with all of the HTML for the current window
3) Delete:    ; font-size:x-small
4) Click "Insert"
Voila!! You may want to experiment with removing the entire "style" tag if you don't like the Liberation Sans font. I hope that someone out there in the WWW finds this blog entry helpful.
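For illustration only (the exact markup Thunderbird generates will differ), the edit in step 3 amounts to removing the font-size declaration from the style attribute that came along with the paste, something like:

<!-- before: the paste drags in a tiny default size -->
<span style="font-family: Liberation Sans; font-size:x-small">
<!-- after: delete "; font-size:x-small" and keep the rest -->
<span style="font-family: Liberation Sans">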


Sun

NFS Streaming over 10 GbE

NFS Tuning for HPC Streaming Applications

Overview: I was recently working in a lab environment with the goal of setting up a Solaris 10 Update 8 (s10u8) NFS server application that would be able to stream data to a small number of s10u8 NFS clients with the highest possible throughput for a High Performance Computing (HPC) application.  The workload varied over time: at some points the workload was read-intensive, while at other times the workload was write-intensive.  Regardless of read or write, the application's I/O pattern was always "large block sequential I/O", which was easily modeled with a "dd" stream from one or several clients.

Due to business considerations, 10 gigabit ethernet (10GbE) was chosen for the network infrastructure.  It was necessary not only to install appropriate server, network and I/O hardware, but also to tune each subsystem.  I wish it were more obvious whether a gigabit implies 1024^3 or 1000^3 bits. In either case, one might naively assume that the connection should be able to reach NFS speeds of 1.25 gigabytes per second; however, my goal was to be able to achieve NFS end-to-end throughput close to 1.0 gigabytes per second.

Hardware: A network with the following components worked well:
Sun Fire X4270 servers
Intel Xeon X5570 CPUs @ 2.93 GHz
Solaris 10 Update 8
A network, consisting of:
  Force10 S2410 Data Center 10 GbE Switch
  10-Gigabit Ethernet PCI Express Ethernet Controller, either:
    375-3586 (aka Option X1106A-Z) with the Intel 82598EB chip (ixgbe driver), or
    501-7283 (aka Option X1027A-Z) with the "Neptune" chip (nxge driver)
An I/O system on the NFS server:
  375-3487 Sun StorageTek PCI Express SAS 8-Channel HBAs (SG-XPCIE8SAS-E-Z, LSI-based disk controllers)
  Storage, either Sun Storage J4400 Arrays, or Sun Storage F5100 Flash Array

This configuration based on recent hardware was able to reach close to full line speed performance.  In contrast, a slightly older server with Intel Xeon E5440 CPUs @ 2.83 GHz was not able to reach full line speed. The application's I/O pattern is large block sequential and known to be 4K aligned, so the Sun Storage F5100 Flash Array is a good fit.  Sun does not recommend this device for general purpose NFS storage.

Network

When the hardware was initially installed, rather than immediately measuring NFS performance, the individual network and I/O subsystems were tested.  To measure the network performance, I used netperf. I found that the "out of the box" s10u8 performance was below my expectations; it seems that the Solaris "out of the box" settings are better fitted to a web server with a large number of potentially slow (WAN) connections.
To get the network humming for a large block LAN workload, I made several changes:

a) The TCP sliding window settings in /etc/default/inetinit:

ndd -set /dev/tcp tcp_xmit_hiwat 1048576
ndd -set /dev/tcp tcp_recv_hiwat 1048576
ndd -set /dev/tcp tcp_max_buf 16777216
ndd -set /dev/tcp tcp_cwnd_max 1048576

b) The network interface card "NIC" settings, depending on the card:

/kernel/drv/ixgbe.conf:
default_mtu=8150;
tx_copy_threshold=1024;

/platform/i86pc/kernel/drv/nxge.conf:
accept_jumbo = 1;
soft-lso-enable = 1;
rxdma-intr-time=1;
rxdma-intr-pkts=8;

/etc/system:
* From http://www.solarisinternals.com/wiki/index.php/Networks
* For ixgbe or nxge
set ddi_msix_alloc_limit=8
* For nxge
set nxge:nxge_bcopy_thresh=1024
set pcplusmp:apic_multi_msi_max=8
set pcplusmp:apic_msix_max=8
set pcplusmp:apic_intr_policy=1
set nxge:nxge_msi_enable=2

c) Some seasoning :-)  Added to /etc/system on S10U8 x64 systems, based on http://www.solarisinternals.com/wiki/index.php/Networks (Nov 18, 2009):

* For few TCP connections
set ip:tcp_squeue_wput=1
* Bursty
set hires_tick=1

d) Make sure that you are using jumbo frames. I used mtu 8150, which I know made both the NICs and the switch happy.  Maybe I should have tried a slightly more aggressive setting of 9000.

/etc/hostname.nxge0:
192.168.2.42 mtu 8150

/etc/hostname.ixgbe0:
192.168.1.44 mtu 8150

e) Verifying the MTU with ping and snoop.  Some ping implementations include a flag to allow the user to set the "do not fragment" (DNF) flag, which is very useful for verifying that the MTU is properly set.  With the ping implementation that ships with s10u8, you can't set the DNF flag.  To verify the MTU, use snoop to see if large pings are fragmented:

server# snoop -r -d nxge0 192.168.1.43
Using device nxge0 (promiscuous mode)

// Example 1: An 8000 byte packet is not fragmented

client% ping -s 192.168.1.43 8000 1
PING 192.168.1.43: 8000 data bytes
8008 bytes from server10G-43 (192.168.1.43): icmp_seq=0. time=0.370 ms

192.168.1.42 -> 192.168.1.43 ICMP Echo request (ID: 14797 Sequence number: 0)
192.168.1.43 -> 192.168.1.42 ICMP Echo reply (ID: 14797 Sequence number: 0)

// Example 2: A 9000 byte ping is broken into 2 packets in both directions

client% ping -s 192.168.1.43 9000 1
PING 192.168.1.43: 9000 data bytes
9008 bytes from server10G-43 (192.168.1.43): icmp_seq=0. time=0.383 ms

192.168.1.42 -> 192.168.1.43 ICMP IP fragment ID=32355 Offset=0    MF=1 TOS=0x0 TTL=255
192.168.1.42 -> 192.168.1.43 ICMP IP fragment ID=32355 Offset=8128 MF=0 TOS=0x0 TTL=255
192.168.1.43 -> 192.168.1.42 ICMP IP fragment ID=49788 Offset=0    MF=1 TOS=0x0 TTL=255
192.168.1.43 -> 192.168.1.42 ICMP IP fragment ID=49788 Offset=8128 MF=0 TOS=0x0 TTL=255

// Example 3: A 32000 byte ping is broken into 4 packets in both directions

client% ping -s 192.168.1.43 32000 1
PING 192.168.1.43: 32000 data bytes
32008 bytes from server10G-43 (192.168.1.43): icmp_seq=0. time=0.556 ms
192.168.1.42 -> 192.168.1.43 ICMP IP fragment ID=32356 Offset=0     MF=1 TOS=0x0 TTL=255
192.168.1.42 -> 192.168.1.43 ICMP IP fragment ID=32356 Offset=8128  MF=1 TOS=0x0 TTL=255
192.168.1.42 -> 192.168.1.43 ICMP IP fragment ID=32356 Offset=16256 MF=1 TOS=0x0 TTL=255
192.168.1.42 -> 192.168.1.43 ICMP IP fragment ID=32356 Offset=24384 MF=0 TOS=0x0 TTL=255
192.168.1.43 -> 192.168.1.42 ICMP IP fragment ID=49789 Offset=0     MF=1 TOS=0x0 TTL=255
192.168.1.43 -> 192.168.1.42 ICMP IP fragment ID=49789 Offset=8128  MF=1 TOS=0x0 TTL=255
192.168.1.43 -> 192.168.1.42 ICMP IP fragment ID=49789 Offset=16256 MF=1 TOS=0x0 TTL=255
192.168.1.43 -> 192.168.1.42 ICMP IP fragment ID=49789 Offset=24384 MF=0 TOS=0x0 TTL=255

f) Verification: after network tuning was complete, the network had very impressive performance, with either the nxge or the ixgbe driver.  The end-to-end measurement of 9.78 Gb/sec reported by netperf is very close to full line speed and indicates that the switch, network interface cards, drivers and Solaris system call overhead are minimally intrusive.

$ /usr/local/bin/netperf -fg -H 192.168.1.43 -t TCP_STREAM -l 60
TCP STREAM TEST from ::ffff:0.0.0.0 (0.0.0.0) port 0 AF_INET to ::ffff:192.168.1.43 (192.168.1.43) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^9bits/sec

1048576 1048576 1048576    60.56       9.78

$ /usr/local/bin/netperf -fG -H 192.168.1.43 -t TCP_STREAM -l 60
TCP STREAM TEST from ::ffff:0.0.0.0 (0.0.0.0) port 0 AF_INET to ::ffff:192.168.1.43 (192.168.1.43) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time    Throughput
bytes  bytes   bytes    secs.   GBytes/sec

1048576 1048576 1048576   60.00       1.15

Both of the tests above show about the same throughput, but in different units.  With the network performance tuning above, the system can achieve slightly less than 10 gigabits per second, which is slightly more than 1 gigabyte per second.

g) Observability: I found "nicstat" (download link at http://www.solarisinternals.com/wiki/index.php/Nicstat) to be a very valuable tool for observing network performance.  To compare the network performance with the running application against synthetic tests, I found that it was useful to graph the "nicstat" output (see http://blogs.sun.com/taylor22/entry/graphing_solaris_performance_stats). You can verify that the Jumbo Frames MTU is working as expected by checking that the average packet payload is big, dividing Bytes-per-sec by Packets-per-sec to get average bytes-per-packet.

Network Link Aggregation for load spreading

I initially hoped to use link aggregation to get a 20GbE NFS stream using 2 X 10GbE ports on the NFS server and 2 X 10GbE ports on the NFS client.  I hoped that packets would be distributed to the link aggregation group in a "per packet round robin" fashion.  What I found is that, regardless of whether the negotiation is based on L2, L3 or L4, LACP will negotiate port pairs based on a source/destination mapping, so that each stream of packets will only use one specific port from a link aggregation group.  Link aggregation can be useful in spreading multiple streams over ports, but the streams will not necessarily be evenly divided across ports.  The distribution of data over ports in a link aggregation group can be viewed with "nicstat".  After reviewing the literature, I concluded that it is best to use IPMP for failover, but link aggregation for load spreading.
"Link aggregation"has finer control for load spreading than IPMP:  ComparingIPMPand Link Aggregation IPMP can be used to protect against switch/router failure becauseeach NIC can be connected to a different switch and therefore canprotect against either NIC or switch failure. With Link Aggregation, all of the ports in the group must beconnected to a single switch/router, and that switch/router mustsupport Link Aggregation Control Protocol (LACP), so there is noprotection against switch failure. With the Force10 switch that I tested, I was disappointed that the LACPalgorithm was not doing a good job of spreading inbound packets to myhotserver.  Again, once the switch mapped a client to one of the ports inthe "hot server's link group", it stuck so it was not unusual forseveral clients to be banging hard on one port while another port wasidle. Multiple subnets for network load spreading After trying several approaches for load spreading with S10u8, I choseto use multiple subnets.  (If I had been using Nevada 107 or newerwhich includes ProjectClearview, I might have come to a different conclusion.)  In theend, I decided that the best solution was an old fashion approach usinga single common management subnet combined with multiple data subnets: All of the machines were able to to communicate with each otheron a slower "management network", specifically Sun Grid Engine jobswere managed on the slower network. The clients were partitioned into a small number of "datasubnets". The "hot" NFS data server had multiple NIC's, with each NIC on aseparate "data subnet". A limitation of this approach that is that clients in one subnetpartition only have a low bandwidth connection to clients in differentsubnet partitions. This was OK for my project. The advantage of manually preallocating the port distribution wasthat my benchmark was more deterministic.  I did not get overloadedports in a seemingly random pattern. Disk IO: Storage tested Configurations that had sufficient bandwidth for this environment included: A Sun Storage F5100 Flash Array using 80 FMods in a ZFS RAID 0 stripe to createa 2TB volume.  A "2 X Sun Storage J4400Arrays" JBOD configuration with a ZFS RAID 0 stripe A "4 X Sun Storage J4400Arrays" configuration with a ZFS RAID 1+0 mirrored and striped Disk I/O: SAS HBA's The Sun Storage F5100 Flash Array was connected to the the Sun Fire X4270 server using 4 PCIe375-3487 Sun StorageTek PCI Express SAS 8-Channel HBAs (SG-XPCIE8SAS-E-Z, LSI-based disk controllers) so that each F5100 domain with 20 FMods used anindependent HBA.  Using 4 SAS HBAs has a 20% better theoreticalthroughput than using 2 SAS HBA's: A fully loaded Sun Storage F5100 Flash Array with 80 FMods has 4 domains with 20FMods per domain Each  375-3487 Sun StorageTek PCI Express SAS 8-Channel HBA (SG-XPCIE8SAS-E-Z, LSI-based disk controller) has two 4x wide SAS ports an 8x wide PCIe bus connection Each F5100 domain to SAS HBA, connected with a single 4x wide SASport, will have a maximum half duplex speed of (3Gb/sec \* 4) = 12Gb/Sec=~ 1.2 GB/Sec per F5100 domain PCI Express x8 (half duplex) = 2 GB/sec A full F5100 (4 domains) connected using 2 SAS HBA's would belimited by PCIe to 4.0 GB/Sec A full F5100 (4 domains) connected using 4 SAS HBA's would belimited by SAS to 4.8 GB/Sec. Therefore a full Sun Storage F5100 Flash Array has 20%theoretically better throughput when connected using 4 SAS HBA's ratherthan 2SAS HBA's. 
The "mirrored and striped" configuration using 4 X Sun Storage J4400 Arrays was connected using 3 PCIe 375-3487 Sun StorageTek PCI Express SAS 8-Channel HBAs (SG-XPCIE8SAS-E-Z, LSI-based disk controllers).

Multipathing (MPXIO)

MPXIO was used in the "mirrored and striped" 4 X Sun Storage J4400 Array configuration so that, for every disk in the JBOD configuration, I/O could be requested by either of the 2 SAS cards connected to the array. To eliminate any single point of failure, I chose to mirror all of the drives in one tray with drives in another tray, so that any tray could be removed without losing data.

The command for creating a ZFS RAID 1+0 "mirrored and striped" volume out of MPXIO devices looks like this:

zpool create -f jbod \
    mirror c3t5000C5001586BE93d0 c3t5000C50015876A6Ed0 \
    mirror c3t5000C5001586E279d0 c3t5000C5001586E35Ed0 \
    mirror c3t5000C5001586DCF2d0 c3t5000C50015825863d0 \
    mirror c3t5000C5001584C8F1d0 c3t5000C5001589CEB8d0 \
    ...

It was a bit tricky to figure out which disks (i.e. "c3t5000C5001586BE93d0") were in which trays. I ended up writing a surprisingly complicated Ruby script to choose devices to mirror. This script worked for me. Your mileage may vary. Use at your own risk.

#!/usr/bin/env ruby -W0
# Pair up MPxIO devices from different trays and print "mirror devA devB \"
# lines that can be pasted into a "zpool create" command.
all = `stmsboot -L`

Device = Struct.new(:path_count, :non_stms_path_array, :target_ports, :expander, :location)
Location = Struct.new(:device_count, :devices)

my_map = Hash.new
all.each { |s|
  if s =~ /\/dev/ then
    s2 = s.split
    if !my_map.has_key?(s2[1]) then
      ## puts "creating device for #{s2[1]}"
      my_map[s2[1]] = Device.new(0, [], [])
    end
    ## puts my_map[s2[1]]
    my_map[s2[1]].path_count += 1
    my_map[s2[1]].non_stms_path_array.push(s2[0])
  else
    puts "no match on #{s}"
  end
}

my_map.each { |k, v|
  ## puts "key is #{k}"
  mpath_data = `mpathadm show lu #{k}`
  in_target_section = false
  mpath_data.each { |line|
    if !in_target_section then
      if line =~ /Target Ports:/ then
        in_target_section = true
      end
      next
    end
    if line =~ /Name:/ then
      my_map[k].target_ports.push(line.split[1])
      ## break
    end
  }
  ## puts "key is #{k} value is #{v}"
  ## puts k, v.non_stms_path_array[0], v.non_stms_path_array[1]
}

location_array = []
location_map = Hash.new
my_map.each { |k, v|
  # The target port name encodes the expander (tray) and the slot within it.
  my_map[k].expander = my_map[k].target_ports[0][0, 14]
  my_map[k].location = my_map[k].target_ports[0][14, 2].hex % 64
  if !location_map.has_key?(my_map[k].location) then
    puts "creating entry for #{my_map[k].location}"
    location_map[my_map[k].location] = Location.new(0, [])
    location_array.push(my_map[k].location)
  end
  location_map[my_map[k].location].device_count += 1
  location_map[my_map[k].location].devices.push(k)
}

location_array.sort.each { |location|
  puts "mirror #{location_map[location].devices[0].gsub('/dev/rdsk/', '')} #{location_map[location].devices[1].gsub('/dev/rdsk/', '')} \\"
}

Separate ZFS Intent Logs?

Based on http://blogs.sun.com/perrin/entry/slog_blog_or_blogging_on, I ran some tests comparing separate intent logs on either disk or SSD (slogs) against the default chained logs (clogs). For the large block sequential workload, "tuning" the configuration by adding separate ZFS Intent Logs actually slowed the system down slightly.
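For that comparison, no special tooling is needed; this is just a minimal sketch (the device name is made up) of how a separate log device can be added to the pool for an A/B test:

# Add a separate intent log (slog) device to the "jbod" pool for comparison testing
zpool add jbod log c4t0d0
# The device then appears under a "logs" section in the pool status
zpool status jbod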
ZFS Tuning

/etc/system parameters for ZFS:

* For ZFS
set zfs:zfetch_max_streams=64
set zfs:zfetch_block_cap=2048
set zfs:zfs_txg_synctime=1
set zfs:zfs_vdev_max_pending = 8

NFS Tuning

a) Kernel settings in the Solaris /etc/system file:

* For NFS
set nfs:nfs3_nra=16
set nfs:nfs3_bsize=1048576
set nfs:nfs3_max_transfer_size=1048576

* Added to /etc/system on S10U8 x64 systems based on
*   http://www.solarisinternals.com/wiki/index.php/Networks (Nov 18, 2009)
* For NFS throughput
set rpcmod:clnt_max_conns = 8

b) Mounting the NFS filesystem in /etc/vfstab:

192.168.1.5:/nfs  -  /mnt/nfs  nfs  -  no  vers=3,rsize=1048576,wsize=1048576

c) Verifying the NFS mount parameters:

# nfsstat -m
/mnt/ar from 192.168.1.7:/export/ar
 Flags: vers=3,proto=tcp,sec=sys,hard,intr,link,symlink,acl,rsize=1048576,wsize=1048576,retrans=5,timeo=600
 Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60

Results:

With tuning, the Sun Storage J4400 Arrays via NFS achieved write throughput of 532 MB/sec and read throughput of 780 MB/sec for a single 'dd' stream.

$ /bin/time dd if=/dev/zero of=/mnt/jbod/test-80g bs=2048k count=40960; umount /mnt/jbod; mount /mnt/jbod; /bin/time dd if=/mnt/jbod/test-80g of=/dev/null bs=2048k
40960+0 records in
40960+0 records out

real     2:33.7
user        0.1
sys      1:30.9

40960+0 records in
40960+0 records out

real     1:44.9
user        0.1
sys      1:04.0

With tuning, the Sun Storage F5100 Flash Array via NFS achieved write throughput of 496 MB/sec and read throughput of 832 MB/sec for a single 'dd' stream.

$ /bin/time dd if=/dev/zero of=/mnt/lf/test-80g bs=2048k count=40960; umount /mnt/lf; mount /mnt/lf; /bin/time dd if=/mnt/lf/test-80g of=/dev/null bs=2048k
40960+0 records in
40960+0 records out

real     2:45.0
user        0.2
sys      2:17.3

40960+0 records in
40960+0 records out

real     1:38.4
user        0.1
sys      1:19.6

To reiterate, this testing was done in preparation for work with an HPC application that is known to have large block sequential I/O that is aligned on 4K boundaries. The Sun Storage F5100 Flash Array would not be recommended for general purpose NFS storage that is not known to be 4K aligned.

References:
- http://blogs.sun.com/brendan/entry/1_gbyte_sec_nfs_streaming
- http://blogs.sun.com/dlutz/entry/maximizing_nfs_client_performance_on
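As a cross-check on the throughput numbers quoted above, the dd timings can be converted to MB/sec by hand (here MB means 2^20 bytes, matching the 2048k dd block size); this arithmetic is mine, not part of the original post:

# 40960 records x 2 MB = 81920 MB transferred in each direction
echo "scale=1; 40960*2/153.7" | bc    # J4400 write: ~533 MB/sec (real 2:33.7)
echo "scale=1; 40960*2/104.9" | bc    # J4400 read:  ~781 MB/sec (real 1:44.9)
echo "scale=1; 40960*2/165.0" | bc    # F5100 write: ~496 MB/sec (real 2:45.0)
echo "scale=1; 40960*2/98.4"  | bc    # F5100 read:  ~832 MB/sec (real 1:38.4)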


Sun

Graphing Solaris Performance Stats

Graphing Solaris Performance Stats with gnuplot

It is not unusual to see an engineer import text from "vmstat" or "iostat" into a spreadsheet application such as Microsoft Office Excel or OpenOffice Calc to visualize the data. This is a fine approach when used periodically, but impractical when used frequently. The process of transferring the data to a laptop, manually massaging the data, launching the office application, importing the data and selecting the columns to chart is too cumbersome when used as a daily process or when a large number of machines are being monitored. In my case, I needed to visualize the performance of a few servers that were under test, and needed a few graphs from the servers, a few times a day. I used some traditional Unix scripts and gnuplot (http://www.gnuplot.info) from the Companion CD (http://www.sun.com/software/solaris/freeware) to quickly graph the data.

The right tool for graphing Solaris data depends on your use case:

- One or two graphs, now and then: import the data into your favorite spreadsheet application.
- Historic data, more graphs, more frequently: use gnuplot.
- Many graphs, real-time or historic data, for more machines, such as a grid of servers being managed by Sun Grid Engine: a formal tool such as Ganglia (http://ganglia.info, http://www.sunfreeware.com/) is recommended. An advantage of Ganglia is that performance data is exposed via a web interface to a potentially large number of viewers in real time.

That being said, here are some scripts that I used to view Solaris performance data with gnuplot.

1. Gathering data. For each benchmark run, a script was used to start gathering performance data:

#!/usr/bin/ksh
dir=$1
mkdir $dir
vmstat 1 > $dir/vmstat.out 2>&1 &
zpool iostat 1 > $dir/zpool_iostat.out 2>&1 &
nicstat 1 > $dir/nicstat.out 2>&1 &
iostat -nmzxc 1 > $dir/iostat.out 2>&1 &
/opt/DTraceToolkit-0.99/Bin/iopattern 1 > $dir/iopattern.out 2>&1 &

The statistics gathering processes were all killed at the end of the benchmark run. Hence, each test had a directory with a comprehensive set of statistics files. Next it was necessary to write a set of scripts to operate on the directories.

2. Graphing CPU utilization from "vmstat". This script is fairly short and straightforward. The "User CPU Utilization" and "System CPU Utilization" are in the 20th and 21st columns. I added an optional argument to truncate the graph after a specific amount of time, to account for cases where the vmstat process was not killed immediately after the benchmark. A bash "here document" is used to enter the gnuplot commands.

#!/usr/bin/bash
dir=$1
file=$1/vmstat.out
if [ $# == 2 ] ; then
    minutes=$2
    (( seconds = minutes * 60 ))
    cat $file | head -$seconds > /tmp/data
    file=/tmp/data
fi
gnuplot -persist <<EOF
set title "$dir"
plot "$file" using 20 title "%user" with lines, \
     "$file" using 21 title "%sys" with lines
EOF
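As a usage sketch (the script names are my own; the original scripts are unnamed above): if the data-gathering script is saved as start_stats.ksh and the CPU-graphing script as graph_vmstat.bash, a run might look like this.

./start_stats.ksh NFS_client_10GbE      # start the collectors, then launch the benchmark
# ... when the benchmark completes, stop the collectors (blunt, but effective
# if nothing else on the system is running these tools):
pkill vmstat; pkill nicstat; pkill iostat; pkill -f 'zpool iostat'; pkill -f iopattern
./graph_vmstat.bash NFS_client_10GbE 5  # graph the first 5 minutes of CPU utilization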
3. Graphing I/O throughput from "iostat -nmzxc 1" data.

This script is a little more complicated, for three reasons:

- The data file contains statistics for several filesystems that are not interesting and need to be filtered out. The script is launched with an argument that is used to select one device.
- I used the 'z' option to iostat, which does not print traces when the device is idle (zero I/O). The 'z' option makes a smaller file that is more human readable, but it is not good for graphing. Thus I needed to synthesize the zero traces before passing the data to gnuplot.
- I wanted to include a smooth line for the iostat "%w" and "%b" columns with a scale of 0 to 100.

#!/usr/bin/bash
# This script is used to parse "iostat -nmzxc" data which is formatted like this:
#
#                 extended device statistics
# r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
# 0.0 0.9 0.8 3.8 0.0 0.0 0.0 0.5 0 0 c0t1d0
# 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.4 0 0 sge_master:/opt/sge6-2/default/common
# 0.0 0.8 1.9 184.5 0.0 0.0 4.1 31.1 0 1 192.168.2.9:/jbod
if [ $# -lt 2 -o $# -gt 3 ] ; then
    echo "Usage: $0 pattern dir [minutes]"
    exit 1
fi
pattern=$1
dir=$2
(( minutes = 24 * 60 ))   #default: graph 1 day
if [ $# == 3 ] ; then
    minutes=$3
fi
(( seconds = minutes * 60 ))
all_data=$dir/iostat.out
plot_data=/tmp/plot_data
if [ ! -r $all_data ] ; then
    echo "can not read $all_data"
    exit 1
fi
# For each time interval, either:
#   print the trace for the device that matches the pattern, or
#   print a "zero" trace if there is not one in the data file.
# You can tell that there was no trace for the device during an
# interval if you reach the "extended device statistics" line
# without finding a trace.
gawk -v pattern=$pattern '
$0 ~ pattern {
    printf("%s\n",$0);
    found = 1;
}
/extended/ {
    if (found == 0)
        printf(" 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 192.168.2.9:/jbod \n")
    found = 0;
} ' $all_data | head -$seconds > $plot_data
gnuplot -persist <<EOF
set title "$pattern - $dir"
set ytics nomirror
set y2range [0:100]
set y2tics 0, 20
plot "$plot_data" using 3 title "read (kb/sec)" axis x1y1 with lines, \
     "$plot_data" using 4 title "write (kb/sec)" axis x1y1 with lines, \
     "$plot_data" using 9 title "%w" axis x1y2 smooth bezier with lines, \
     "$plot_data" using 10 title "%b" axis x1y2 smooth bezier with lines
EOF

I created the following graph with the command "graph_iostat.bash jbod NFS_client_10GbE 5" to select data only from the "jbod" NFS mount, where the data is stored in the directory named "NFS_client_10GbE", and to graph only the first 5 minutes worth of data. The iostat data was collected on an NFS client connected with a 10 gigabit network. There is some write activity (green) at the start of the 5 minute sample period, followed by several minutes of intense reading (red) where the client hits speeds of 600-700 MB/sec. The purple "%b" line, with values on the right x1y2 axis, indicates that during the intense read phase, the mount point is busy about 90% of the time.

4. Graphing I/O service time from "iostat -nmzxc" data.

I also find that columns 6 and 7 from iostat are very interesting and can be graphed using a simplification of the previous script.
actv:  average number of transactions actively being serviced
svc_t: average response time of transactions, in milliseconds

#!/usr/bin/bash
#                 extended device statistics
# r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
# 0.0 0.9 0.8 3.8 0.0 0.0 0.0 0.5 0 0 c0t1d0
# 0.0 0.0 0.0 0.0 0.0 0.0 0.0 2.4 0 0 sge_master:/opt/sge6-2/default/common
# 0.0 0.8 1.9 184.5 0.0 0.0 4.1 31.1 0 1 192.168.2.9:/jbod
if [ $# -lt 2 -o $# -gt 3 ] ; then
    echo "Usage: $0 pattern dir [minutes]"
    exit 1
fi
pattern=$1
dir=$2
(( minutes = 24 * 60 ))   #default: graph 1 day
if [ $# == 3 ] ; then
    minutes=$3
fi
(( seconds = minutes * 60 ))
all_data=$dir/iostat.out
plot_data=/tmp/plot_data
# For each time interval, either:
#   print the trace for the device that matches the pattern, or
#   print a "zero" trace if there is not one in the data file.
# You can tell that there was no trace for the device during an
# interval if you reach the "extended device statistics" line
# without finding a trace.
gawk -v pattern=$pattern '
$0 ~ pattern {
    printf("%s\n",$0);
    found = 1;
}
/extended/ {
    if (found == 0)
        printf(" 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 192.168.2.9:/jbod \n")
    found = 0;
} ' $all_data | head -$seconds > $plot_data
gnuplot -persist <<EOF
set title "$pattern - $dir"
set log y
plot "$plot_data" using 6 title "wsvc_t" with lines, \
     "$plot_data" using 7 title "asvc_t" with lines
EOF

Here is the graph produced by the command "graph_iostat_svc_t.bash jbod NFS_client_10GbE 5".

5. Graphing network throughput data from "nicstat".

Another very valuable Solaris performance statistics tool is "nicstat". For the download link, see http://blogs.sun.com/timc/entry/nicstat_the_solaris_and_linux. A script to graph the data from nicstat follows the same pattern.

#!/usr/bin/bash
if [ $# -lt 2 -o $# -gt 3 ] ; then
    echo "Usage: $0 interface dir [minutes]"
    exit 1
fi
interface=$1
dir=$2
(( minutes = 24 * 60 ))   #default: graph 1 day
if [ $# == 3 ] ; then
    minutes=$3
fi
(( seconds = $minutes * 60 ))
all_data=$dir/nicstat.out
plot_data=/tmp/plot_data
if [ ! -r $all_data ] ; then
    echo "can not read $all_data"
    exit 1
fi
grep $interface $all_data | head -$seconds > $plot_data
gnuplot -persist <<EOF
set title "$interface - $dir"
plot "$plot_data" using 3 title "read" with lines, \
     "$plot_data" using 4 title "write" with lines
EOF

Here is the graph produced by the command "graph_nicstat.bash ixgbe2 NFS_server_10GbE 5".
6. Graphing I/O throughput from "zpool iostat" data.

The challenge in plotting "zpool iostat" data is that the traces are not in constant units, and therefore it is necessary to re-compute the data in constant units, in this example, MB/sec.

#!/usr/bin/bash
if [ $# -lt 2 -o $# -gt 3 ] ; then
    echo "Usage: $0 pattern dir [minutes]"
    exit 1
fi
pool=$1
dir=$2
(( minutes = 24 * 60 ))   #default: graph 1 day
if [ $# == 3 ] ; then
    minutes=$3
fi
(( seconds = minutes * 60 ))
all_data=$dir/zpool_iostat.out
plot_data1=/tmp/plot_data1
plot_data2=/tmp/plot_data2
if [ ! -r $all_data ] ; then
    echo "can not read $all_data"
    exit 1
fi
grep $pool $all_data | awk '{printf("%s/1048576\n",$6)}' | sed -e 's/K/*1024/g' -e 's/M/*1048576/g' -e 's/G/*1073741824/g' | bc | head -$seconds > $plot_data1
grep $pool $all_data | awk '{printf("%s/1048576\n",$7)}' | sed -e 's/K/*1024/g' -e 's/M/*1048576/g' -e 's/G/*1073741824/g' | bc | head -$seconds > $plot_data2
gnuplot -persist <<EOF
set title "$pool - $dir"
set log y
plot "$plot_data1" using 1 title "read (MB/sec)" with lines, \
     "$plot_data2" using 1 title "write (MB/sec)" with lines
EOF

Graphing the I/O throughput of the zpool named "jbod" using the command "graph_iostat_svc_t.bash jbod NFS_client_10GbE 5" shows that the zpool can deliver data at speeds of close to one gigabyte per second.

It is easy to modify the scripts above to graph the output of many tools that produce a table of data in text format.
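For instance, here is a minimal sketch of my own (not one of the original scripts) that follows the same pattern to graph free memory, which is the 5th column of the "vmstat 1" output collected by the gathering script:

#!/usr/bin/bash
# Hypothetical graph_vmstat_free.bash: plot the "free" column of vmstat.out
dir=$1
file=$dir/vmstat.out
gnuplot -persist <<EOF
set title "$dir - free memory (KB)"
plot "$file" using 5 title "free" with lines
EOF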


Sun

Solaris/x64 VNC with Cut & Paste

Yesterday, I was trying to get cut & paste to work between various VNC clients and a VNC server that was running on a Solaris 10 Update 8 x64 server. The VNC software that was first in my PATH was from the SFWvnc package that is shipped on the Solaris Companion CD.

I was quite confused:

1) Various Google searches revealed that vncconfig must be running on the server for cut and paste to work; however, it would not start:

$ vncconfig
No VNC extension on display :1.0

2) The man page for vncconfig indicates that this may be caused by using a version 3 Xvnc.
3) SFWvnc is version 3.3.7.
4) There is no free version for Solaris x64 at www.realvnc.com (but there is a SPARC build).

So I was left trying to figure out the easiest way to get Solaris/x64 VNC with cut & paste to work. Did I need to download RealVNC's 4.X source and build the server? Did I need to purchase the Enterprise Edition of RealVNC, even though I was not intending to use enterprise features?

Solution: As it turns out, the solution is simple: "pkgrm SFWvnc". This SFW package has VNC 3 files from the Solaris Companion CD that compete with the VNC 4 files that come with SUNWxvnc and SUNWvncviewer in S10U5 and newer. I've asked the owner to have SFWvnc removed from the S10U9 Solaris Companion CD.
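In case it saves someone a few minutes, the check and the fix boil down to a couple of commands (a sketch of what I did, from memory):

# Which VNC packages are installed?  Expect SFWvnc, SUNWxvnc and SUNWvncviewer.
pkginfo | grep -i vnc
# Remove the VNC 3 Companion CD package so the VNC 4 Xvnc from SUNWxvnc is used
pkgrm SFWvnc
# After restarting the VNC server, vncconfig should start, enabling cut & paste
vncconfig &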


PTC-Windchill

Solaris Containers & 32-bit Java Heap allocation

A note from Steve Dertien at PTC:

Jeff,

We've solved the memory issue with zones. The issue is affected by the kernel version on the server and the zone type that they have created.

What we've discovered is that older kernel versions do not adequately support the larger heap sizes in a whole zone configuration. The kernel version can be output using uname -a as follows.

# uname -a
SunOS mlxsv016 5.10 Generic_137111-04 sun4v sparc SUNW,T5240

With that particular version you can allocate a JVM with a 2Gb heap in the global zone and in a sparse zone. In a whole zone you will not be able to allocate the full 2Gb to the JVM. The output of a JVM failure will look like the following in this case:

# ./java -Xmx2048m -version
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.

This issue is resolved when you upgrade the kernel to a newer version. The Sun servers at PTC are using a newer version of the kernel and therefore we're not experiencing this issue. For Windchill support on Solaris zones (whole or sparse) we should indicate that the customer must be on this kernel version or newer (assuming this does not regress).

# uname -a
SunOS edc-sunt5120 5.10 Generic_137137-09 sun4v sparc SUNW,SPARC-Enterprise-T5120

I don't know if there are newer kernel versions, but we should probably put a customer document together that states that when running Windchill in a Solaris zone of any kind the kernel must be patched to this level or higher, and how to test for this condition. [Jeff adds: See http://blogs.sun.com/patch/entry/solaris_10_kernel_patchid_progression]

Also, this issue does not exist with the 64-bit JVM when the -d64 option is supplied to the command.

This is the definition of a whole zone versus a sparse zone (lifted from here: http://opensolaris.org/os/community/zones/faq/):

Q: What is a global zone? Sparse-root zone? Whole-root zone? Local zone?
A: After installing Solaris 10 on a system, but before creating any zones, all processes run in the global zone. After you create a zone, it has processes that are associated with that zone and no other zone. Any process created by a process in a non-global zone is also associated with that non-global zone. Any zone which is not the global zone is called a non-global zone. Some people call non-global zones simply "zones." Others call them "local zones" but this is discouraged.

The default zone filesystem model is called "sparse-root." This model emphasizes efficiency at the cost of some configuration flexibility. Sparse-root zones optimize physical memory and disk space usage by sharing some directories, like /usr and /lib. Sparse-root zones have their own private file areas for directories like /etc and /var. Whole-root zones increase configuration flexibility but increase resource usage. They do not use shared filesystems for /usr, /lib, and a few others.

There is no supported way to convert an existing sparse-root zone to a whole-root zone. Creating a new zone is required.

Wikipedia also indicates that the penalty for a whole zone is mitigated if the file system that the zone is installed on is a ZFS clone of the global image. This means that the system will only require additional file system space for data that uses different blocks. Essentially, two copies of the same thing occupy only one block of space instead of the traditional two. For those who are concerned about the consumed disk space for whole zones, that can be mitigated using the ZFS file system.
Instructions for creating a sparse zone are outlined rather well here: http://www.logiqwest.com/dataCenter/Demos/RunBooks/Zones/createBasicZone.html

Instructions for creating a whole zone are outlined rather well here: http://www.logiqwest.com/dataCenter/Demos/RunBooks/Zones/createSelfContainedZone.html

The major difference is in the second step. A sparse zone uses the "create" command while a whole zone uses "create -b".

Jeff Taylor also sent a link to a nice tool called Zonestat (http://opensolaris.org/os/project/zonestat/). The output of the tool does a great job of showing you the distribution of resources across the zones. The command below assumes that you placed the zonestat.pl script into /usr/bin, as it does not exist by default.

# perl /usr/bin/zonestat.pl
         |--Pool--|Pset|-------Memory-----|
Zonename | IT|Size|Used| RAM| Shm| Lkd|  VM|
--------------------------------------------
  global  0D   64  0.0 318M  0.0  0.0 225M
 edc-sne  0D   64  0.0 273M  0.0  0.0 204M
edc-sne2  0D   64  0.0 258M  0.0  0.0 190M
==TOTAL= ===   64  0.0 1.1G  0.0  0.0 620M

One last detail. To get the configuration of a zone we should ask customers for the output of their zone configuration, using the zonecfg utility from the global zone. The commands that will work best are either "zonecfg -z <zone_name> info" or "zonecfg -z <zone_name> export". We need to carefully evaluate any capped-memory settings or other settings defined in the zone to determine whether those are also causing any potential issues for Windchill. The output of that command will indicate whether the zone is a sparse or whole zone, as the inherit section is likely not there in a whole zone.

SPARSE/SHARED ZONE

# zonecfg -z edc-sne info
zonename: edc-sne
zonepath: /home/edc-sne
brand: native
autoboot: true
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
net:
        address: 132.253.33.85
        physical: e1000g0
        defrouter: 132.253.33.1

WHOLE ZONE

# zonecfg -z edc-sne2 info
zonename: edc-sne2
zonepath: /home/edc-sne2
brand: native
autoboot: true
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
net:
        address: 132.253.33.86
        physical: e1000g0
        defrouter: 132.253.33.1

The export command will allow us to create the zones internally for our own testing purposes, and the output should contain the create command that was used, but it's misleading. If you use the standard "create" command it will automatically create the required inherited properties. But the export command will use "create -b" and you will see several of the following instead:

add inherit-pkg-dir
set dir=/sbin
end

I currently do not have an opinion as to whether we want to advocate a whole vs. sparse zone. It seems a whole zone creates more independence from the global zone. This may be necessary if you want to have more absolute control over the configuration.

Best Regards,
Steve
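[Jeff adds: for reference, a minimal zonecfg sketch of the two flavors Steve describes; the zone names, paths and addresses below are made up, and the only difference is "create" versus "create -b".]

# Sparse-root zone: plain "create" adds the default inherit-pkg-dir entries
zonecfg -z sparse1 <<'EOF'
create
set zonepath=/zones/sparse1
add net
set address=192.168.10.21
set physical=e1000g0
end
commit
EOF

# Whole-root zone: "create -b" starts from a blank configuration (no inherit-pkg-dir)
zonecfg -z whole1 <<'EOF'
create -b
set zonepath=/zones/whole1
set autoboot=true
add net
set address=192.168.10.22
set physical=e1000g0
end
commit
EOF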


Analytics

Algorithmics Financial Risk Management Software on the Sun Fire X4270

For the past several months, I've been slaving over a new server powered by Intel® Xeon® 5500 Series processors. I have been using the system to investigate the characteristics of Algorithmics' risk analysis application, and I've found that the new server and the Solaris OS run great. Solaris lets you take advantage of all the great new features of these new CPUs, such as Turbo Boost and Hyper-Threading. The big takeaway is that the new version of Algo's software really benefits from the new CPU: this platform runs faster than the other Solaris systems I've tested.

If you don't know Algorithmics, their software analyzes the risk of financial instruments, and financial institutions are using it to help keep the current financial meltdown from worsening. Algo is used by banks and financial institutions in 30 countries. I'm part of the team at Sun that has worked closely with Algorithmics to optimize performance of Algorithmics' risk management solutions on Sun Fire servers running Solaris.

Algo's software has a flexible simulation framework for a broad class of financial instruments. SLIMs are their newest generation of very fast simulation engines. Different SLIMs are available to simulate interest rate products (swaps, FRAs, bonds, etc.), CDSs and some option products. The simulation performance with the new SLIMs is spectacular.

Solaris provides everything that is required for Algo to take advantage of the new server line's performance and energy efficiency:

- Sun Studio Express has been optimized to generate efficient SIMD instructions.
- ZFS provides the high performance I/O.
- Hyper-Threading: Solaris can easily take advantage of the heavily threaded architecture.
- Turbo Boost: yes, it kicks in immediately when a SLIM is launched.
- QuickPath chip-to-chip interconnect: Solaris is NUMA aware.

Here are the answers to some questions that you may have about SLIMs on Solaris and the Xeon® Processor 5570:

Q: Are SLIMs I/O bound or CPU bound?
A: SLIMs require both the computational and the I/O subsystems to be excellent. For the data sets that have been provided by Algo running on the X4270 with ZFS striped over 12 internal hard drives, results vary. The majority of the SLIMs are CPU bound, but a few are I/O bound.

Q: What are the computational characteristics of SLIMs running on the X4270?
A: The number of threads to be used is specified by the user. Most of the SLIMs scale well up to 8 threads, and some scale all the way up to 16 threads. SLIMs that don't scale to 16 threads on the X4270 with internal hard drives are primarily I/O bound and benefit from even faster storage. (Hint: check this Algo blog again in the future for more discussion regarding faster I/O devices.)

Q: What are the I/O characteristics of SLIMs running on the X4270 with ZFS?
A: The I/O pattern of SLIMs is large block sequential write. ZFS caches the writes and flushes the cache approximately once every 10 seconds. Each hard drive hits peaks of 80 MB/second. With 12 striped internal hard drives, the total system throughput can reach close to 1.0 GB/second.

Q: Is external storage required for SLIMs?
A: 14 internal drives (plus 2 for mirrored OS disks) at 146GB each can hold a data set upwards of 1.7TB. If your data fits, internal drives will be a cost effective solution.

Q: Is hardware RAID required for the SLIMs data? Background: RAID can potentially fulfill needs including (a) striping to create larger storage units from smaller disks, (b) data redundancy so that you don't lose important data, and
(c) hardware RAID can increase I/O performance via caching and fast checksum block computation.
A: No, hardware RAID is not required. (a) ZFS has a full range of RAID capabilities. (b) The cube of data produced by SLIMs is normally considered to be temporary data that can be recalculated if necessary, and therefore redundancy is not required. (c) If redundancy is required for your installation, RAID-Z has been shown to have a negligible impact on SLIMs' I/O performance. The SLIMs' write intensive I/O pattern will blow through cache and be bound by disk write performance, so there is no advantage to adding an additional layer of cache.

Q: Algo often recommends using Direct I/O filesystems instead of buffering. Should we use Direct I/O with ZFS?
A: No. All ZFS I/O is buffered. Based on comparisons against QFS with Direct I/O enabled, ZFS is recommended.

Q: In some cases ZFS performance can be improved by disabling the ZIL or by putting the ZIL on a faster device in a ZFS hybrid storage pool. Does it help SLIMs performance?
A: No. SLIMs' I/O is not synchronous. An SSD for the ZIL will not improve SLIMs' performance when writing to direct attached storage.

Q: Is the use of power management recommended?
A: Yes. PowerTOP was used to enable and monitor the CPU power management. When the machine is idle, power is reduced. When SLIMs are executing, the CPUs quickly jump into Turbo Boost mode. There was no significant performance difference between running SLIMs with and without power management enabled.

Q: The Sun Fire X4270 DDR3 memory can be configured to run at 800MHz, 1066MHz or 1333MHz. Does the DDR3 speed affect SLIMs performance?
A: Yes, several of the SLIMs (but not all) run better on systems configured to run DDR3 at 1333MHz.

Q: Would it be better to deploy SLIMs on a blade server (like the X6275 Server Module) or a rack mounted server (like the X4270)?
A: Again, the answer revolves around storage. If the SLIMs time series data fits onto the internal disk drives, the rack mounted server will be a cost effective solution. If your time series data is greater than 1.5 TB, it will be necessary to use external storage, and the blade server will be a more cost effective solution.

Platform tested:
    * Sun Fire X4270 2RU rack-mount server
    * 2 Intel® Xeon® Processor 5500 Series CPUs
    * 14 146GB 10K RPM disk drives
          o 2 for the operating system
          o 12 for data storage
          o 2 empty slots
    * Memory configurations ranged from 24GB to 72GB
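To make the storage answer concrete, the ZFS striping mentioned above requires nothing beyond a plain zpool create; this is a minimal sketch with illustrative device names, not the exact layout used in the tests.

# Stripe the 12 internal data drives into one pool (no redundancy, per the Q&A above)
zpool create algodata \
    c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
    c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0
zfs create algodata/slims    # one filesystem for the SLIMs output cube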


Sun

Configuring jumbo frames on the V490’s ce and the T2000's e1000g

Configuring jumbo frames on the V490's ce and the T2000's e1000g

I was hoping that using Solaris 10's jumbo frames would reduce Windchill's latency as seen by end-users. I was surprised at how difficult it was to find consistent documentation and, in the end, disappointed that it did not make much of a difference. I am writing this blog in the hope that it may help others configure jumbo frames more quickly for similar experiments.

As stated by Wikipedia, "In computer networking, jumbo frames are Ethernet frames above 1518 bytes in size. The most commonly supported implementations of hardware support for jumbo frames have a MTU of 9000 bytes. Jumbo frames, while sometimes used on a LAN, are rarely used when exchanging data, especially over the Internet."

In my test, I wanted to use jumbo frames for links "inside the data center", specifically for (1) the Sun Cluster interconnect between two V490 Sun Cluster Oracle RAC nodes, and (2) the communication between the T2000 application tier servers and the V490 database servers. The hope was that I would see a substantial reduction in system time (i.e. CPU time inside the kernel) when fewer I/O operations were required for a given payload. I did not attempt to use jumbo frames for the final HTML traffic to the end-users, who would be "outside of the data center".

The V490's Sun Cluster interconnect was via crossover cables, so there was no limit imposed by a router on the MTU size. The router between the application servers and the database servers advertised that it supported "8k" frames, so I set the MTU to less than 8192 bytes.

A) Configuring jumbo frames on the V490's

1) Find the path to the device:

# grep ce /etc/path_to_inst
"/scsi_vhci/ssd@g60020f20000063f0438c7cce0006cd71" 14 "ssd"
"/pci@8,700000/pci@2/network@0" 2 "ce"
"/pci@8,700000/pci@2/network@1" 3 "ce"
"/pci@9,700000/network@2" 0 "ce"
"/pci@9,600000/network@1" 1 "ce"

2) Configure the ce driver:

# cat /platform/sun4u/kernel/drv/ce.conf
name="pci108e,abba" parent="/pci@8,700000/pci@2" unit-address="0" accept-jumbo=1;
name="pci108e,abba" parent="/pci@8,700000/pci@2" unit-address="1" accept-jumbo=1;
name="pci108e,abba" parent="/pci@9,700000" unit-address="2" accept-jumbo=1;
name="pci108e,abba" parent="/pci@9,600000" unit-address="1" accept-jumbo=1;

3) Set the MTU in the hostname file:

# cat /etc/hostname.ce0
scnode1 mtu 8168 group sc_ipmp0 -failover

4) Verify after reboot:

# ifconfig -a | grep mtu
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
ce0: flags=1009000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FIXEDMTU> mtu 8168 index 2
ce2: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 9194 index 4
ce3: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 9194 index 3
clprivnet0: flags=1009843<UP,BROADCAST,RUNNING,MULTICAST,MULTI_BCAST,PRIVATE,IPv4> mtu 9194 index 5

B) Configuring jumbo frames on the T2000's

1) Configure the e1000g driver:

# grep MaxF /kernel/drv/e1000g.conf
MaxFrameSize=2,2,2,2,0,0,0,0,0,0,0,0,0,0,0,0;

2) Reboot and verify:

# ifconfig -a
...
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 8168 index 2
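A quick end-to-end check, borrowed from my NFS tuning notes (the application-tier hostname here is illustrative): a large ping that fits within the MTU should cross the wire as a single packet, and snoop on the receiver will show "ICMP IP fragment" lines if it does not.

appnode# ping -s scnode1 8000 1     # 8000 data bytes fits within the 8168-byte MTU
scnode1# snoop -r -d ce0 appnode    # look for "ICMP IP fragment" lines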


PTC-Windchill

FAQ for Windchill on Solaris

FAQ for Windchill on Solaris

Introduction

I work in Sun's "ISV Engineering" team. Our responsibilities include working with Sun's key ISV partners to port, tune and optimize industry leading applications on Sun's hardware and software stack, and to ensure that the latest solutions from the ISVs are certified on the latest products from Sun. One of the applications that I focus on is Windchill from PTC. As such, I am in frequent contact with PTC's R&D, Global Services, QA and customers, all of whom bring varying degrees of Solaris expertise. This is a collection of questions and answers.

Section I: HARDWARE CONFIGURATION

1. What server configurations are recommended?

Windchill will run on hardware ranging from laptops to large servers. Solaris is typically used for installations where the user count ranges from 50 to several thousands of engineers. While you can run a full Windchill installation on a single server, typical installations split the Oracle database and the Windchill application tier onto different tiers.

2. Does Sun recommend horizontal or vertical scaling for Windchill?

"Vertical scaling" is typically used at the database tier. This strategy is where multiple CPUs are used in a single server. For example, if four SPARC64 VI processors are required at the database tier, but you would like room for future expansion, a Sun SPARC Enterprise M5000 with eight CPU slots could be used. Four slots would be populated, and four additional CPU slots would be available for future expansion.

"Horizontal scaling" is often used at the application tier. This strategy is where multiple servers are combined to accomplish a larger workload. Often, one Sun Fire T2000 is sufficient to meet the current requirements at the application tier, but the installation does not leave room for future expansion. This is OK because the Windchill application tier scales horizontally. A drawback is that a "Windchill cluster" is somewhat more difficult to administer, so it is better if you have a rather savvy IT staff. If you are using an ASP with limited Windchill experience, vertical scaling at the application tier may be a better approach.

3. Is a "highly available" solution recommended?

Yes, the cost of engineering talent is too high to allow for long downtimes. A "Windchill cluster" is highly available at the application tier. If one node fails, users who were logged into Windchill on the failing node will lose their Windchill sessions. When they attempt to reconnect, they will establish a session with one of the other nodes.

In addition to the Windchill cluster, we recommend an "active/active" Sun Cluster installation, as follows:

- Standalone Oracle is a potential single point of failure, and therefore we recommend Sun Cluster HA Oracle for failover at the database tier. (Oracle RAC is an alternative, but quite expensive, and not expected to scale well beyond two nodes.) With Sun Cluster HA Oracle, one node is actively running Oracle while the other node is passive. Oracle will be launched on the passive node in the event of a failure.
- There are several Windchill components that are single points of failure, including the Windchill master cache server, Aphelion and the background method server. We recommend that these services be run on the passive Oracle node. On failure, a Sun Cluster agent can launch the Windchill services on the active Oracle node.

4. Does Sun publish server sizing recommendations for Windchill?
Yes, see https://www.sun.com/third-party/srsc/resources/ptc/PTCWindchill8.0T2000SizingGuide.pdf

5. What is the hardware configuration?

# prtdiag
# prtconf -pv | more

6. How can the disk devices and disk partition table be viewed?

# format
# prtvtoc /dev/dsk/c3t1d0s2

7. How much free disk space is available?

# df -h

8. How much disk space is a file/directory using?

# du -sh my_dir

9. Where did all of the space on this disk go?

# cd mountpoint (identify the mount point with df -h)
# du -sk * | sort -n
# cd biggest_dir
(repeat, working your way down to the offensive directories)

10. What fibre channel devices are on line?

# luxadm probe

Section II: SOLARIS KERNEL SETTINGS FOR WINDCHILL

1. What version of Solaris is running?

# cat /etc/release

2. What kernel tuning is necessary for Windchill?

Solaris 10 out of the box is well tuned.

3. Any tweaks? Add this to /etc/system:

* slow down the fsflush daemon
set autoup=600

4. Are there any T2000 specific settings? Add this to /etc/system:

* Sun recommended settings for T2000's running S10u3
set pcie:pcie_aer_ce_mask=0x1
set segkmem_lpsize=0x400000

5. Are there any Oracle kernel parameters? Add this to /etc/system:

* Oracle 10g Settings
set noexec_user_stack=1

6. Sun Cluster kernel parameters? Add this to /etc/system:

* Start of lines added by SUNWscr
set rpcmod:svc_default_stksize=0x6000
set ge:ge_intr_mode=0x833
* Disable task queues and send all packets up to Layer 3
* in interrupt context.
* Uncomment the appropriate line below to use the corresponding
* network interface as a Sun Cluster private interconnect. This
* change will affect all corresponding network-interface instances.
* For more information about performance tuning, see
* http://www.sun.com/blueprints/0404/817-6925.pdf
* set ipge:ipge_taskq_disable=1
set ce:ce_taskq_disable=1
* End of lines added by SUNWscr
* When you use the ce Sun Ethernet driver for public network connections
set ce:ce_reclaim_pending=1

7. What about the recommendation for Oracle /etc/system changes such as shmsys:shminfo_shmmax?

No longer required with Solaris 10. Most are obsolete or dynamically adjustable. Instead use:

   projadd -c "Oracle Project" -U oracle,root -K \
   "project.max-shm-memory=(priv,6GB,deny)" -K \
   "process.max-sem-nsems=(priv,2048,deny)" user.oracle

Section III: PATCH LEVELS

1. What patches are on the system?

# showrev -p

2. How can the Solaris patch level be kept up to date using a GUI?

# updatemanager

3. How can the Solaris patch level be kept up to date using a CLI?

# smpatch

Section IV: SAR

1. Setting up sar for one minute samples and long term logging:

# su - sys
# crontab -l > /tmp/crontab.txt
# vi /tmp/crontab.txt
0 * * * 0-6 /usr/lib/sa/sa1 60 60
# 20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
0 1 * * 2 /opt/sar_bk/mk_sar_bk.sh
# crontab /tmp/crontab.txt
# vi /opt/sar_bk/mk_sar_bk.sh
mkdir /opt/sar_bk/sar_`date +%Y%m%d`
cp /var/adm/sa/* /opt/sar_bk/sar_`date +%Y%m%d`

Section V: MONITORING CPU ACTIVITY

1. What tools are used to monitor the current CPU activity level? All of these work:

# prstat
# top
# xcpustate -disk &
# vmstat 10 10
# mpstat 10 10

2. How busy was the CPU after lunch?

# sar -u -s 13:00 -e 13:30

3. Which cores/CPUs are busy?

# mpstat

Section VI: PROCESSORS

1. Are the processors online?

# psrinfo

2. How can you take a processor offline for performance analysis?

# psradm -f 16 (where "16" is the id of the processor)

3. How can you put the processor back online?
# psradm -n 16 (where "16" is the id of the processor)

Section VII: RUN QUEUE

If you have more requests for processing than compute cycles, processes are scheduled in the run queue. A large run queue indicates that you need to find more compute cycles (i.e. buy more/faster CPUs) or reduce the workload (i.e. application tuning).

1. How can you see the current run queue length?

# vmstat 10 10 (watch the "r" column)

2. How can you see the historical run queue length?

# sar -q (specifically, look at runq-sz and %runocc)

Section VIII: DISK ACTIVITY

1. Which disks are currently busy?

# iostat -mxPzn 10 10

2. Which disks were historically busy?

# sar -f /usr/adm/sa/sa13 -s 20:19 -e 21:19 -d -i 3400 | grep -v , | grep -v .fp | grep -v md | more

Section IX: PROCESSES, THREADS, SYSTEM CALLS AND LOCKS

1. Which processes are busy?

# prstat

2. Which processes are making a lot of system calls?

# prstat -m (watch the "SCL" system call column)

3. Which threads inside of a process are busy?

# prstat -L -p 1332

4. Which processes can benefit from a multi-core server like the T2000?

Processes with a significant number of threads or processes may benefit. Windchill method servers and Tomcat both run well with a large number of cores and run well on the T2000. (In contrast, Pro/E is not a good match for the T2000.)

5. Which processes have many threads (LWPs)?

# ps -e -o "nlwp,pid,args" | sort -n

6. What system calls is a process making?

# truss -c -p 1332

7. The "ps" command truncates the Java arguments. How do you see the full list?

# pargs 1332

8. How can you see how much time each thread has taken?

# ps -o "lwp,time,args" -L -p 2282

9. Is there locking?

# plockstat -C -p 5992
# lockstat -CPD 5 sleep 10

10. If the process is locking on malloc, how can you use a threaded malloc?

# LD_PRELOAD=/usr/lib/libumem.so

11. How can you see the current stack trace of a process?

# pstack 1332

Section X: MEMORY, VIRTUAL MEMORY AND SWAP SPACE

1. What swap devices are mounted?

# swap -l

2. How much swap space is used/remaining?

# swap -s
# vmstat 10 10

3. Is there pressure on the virtual memory system, currently?

# vmstat 10 10 (watch the "sr" scan rate column)

4. How much free memory and swap have been available, historically?

# sar -r (freemem, freeswap)

5. Was there pressure on the virtual memory system, historically?

# sar -g (watch pgscan/s)

6. Which processes are using the most RAM? Sort processes by resident set size:

# ps -e -o "rss,pid,args" | sort -n

7. Which processes are using the swap space? Sort processes by virtual size:

# ps -e -o "vsz,pid,args" | sort -n

Section XI: IO

1. What files does a process have open?

# pfiles 4514 | grep /

2. Which disks are busy?

# iostat -mxPzn 10 10
# xcpustate -disk &

Section XII: NETWORK STATUS

1. Overall network status?

# netstat -i
# netstat -sP tcp

2. What ports are processes listening on?

# netstat -a | grep LISTEN

3. What sockets does a process have open? Here is an example that shows that Apache has port 80 open.
# pfiles 4514
   3: S_IFSOCK mode:0666 dev:310,0 ino:59012 uid:0 gid:0 size:0
      O_RDWR
        SOCK_STREAM
        SO_REUSEADDR,SO_KEEPALIVE,SO_SNDBUF(49152),SO_RCVBUF(49152),IP_NEXTHOP(0.0.192.0)
        sockname: AF_INET6 ::  port: 80

# netstat -an | grep LIST | grep 1158
# snoop

In my environment, running "netstat -s" on the Windchill application tier reported thousands of tcpListenDrops, and therefore network tuning was required:

    # cat /etc/init.d/network-tuning
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 2048
    /usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 8192
    /usr/sbin/ndd -set /dev/udp udp_smallest_anon_port 8192
    /usr/sbin/ndd -set /dev/tcp tcp_smallest_anon_port 8192

    # ln -s /etc/init.d/network-tuning /etc/rc2.d/S99network-tuning

I also needed to increase maxSockets in wt.properties:

    wt.method.rmi.maxSockets=800

SECTION XIII: RANDOM CRASHES

1. How can you detect random application crashes?

AppCrash: http://blogs.sun.com/gregns/

SECTION XIV: ORACLE CONFIGURATION AND TUNING

The following settings worked well for the test database and use patterns provided by PTC. Your mileage will vary.

    a) A 5 GB SGA was required for the test database:
        ALTER SYSTEM SET sga_max_size=5g SCOPE=spfile;
        ALTER SYSTEM SET sga_target=5g SCOPE=spfile;

        Total System Global Area 5368709120 bytes
        Fixed Size 2037688 bytes
        Variable Size 939526216 bytes
        Database Buffers 4412407808 bytes
        Redo Buffers 14737408 bytes

    b) Gather statistics:
        exec DBMS_STATS.GATHER_SCHEMA_STATS ( OWNNAME=>'DUBLIN80M010',
        estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, CASCADE=>TRUE );

    c) System statistics:
        execute dbms_stats.gather_system_stats('Start');

    d) Verify system statistics:
        select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN'

    e) Increase cursors:
        ALTER SYSTEM SET open_cursors = 2500 SCOPE=SPFILE;

    f) Use push join union:
        alter system set "_push_join_union_view"=true scope=spfile;

    g) Add some indexes recommended by the Oracle Enterprise Manager advisors:
        CREATE INDEX "DUBLIN80M010"."MILESTONE_IDX$$_012C000B"
        ON "DUBLIN80M010"."MILESTONE"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."DELIVERABLE_IDX$$_012C000C"
        ON "DUBLIN80M010"."DELIVERABLE"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."PROJECTACTIVITY_IDX$$_012C000D"
        ON "DUBLIN80M010"."PROJECTACTIVITY"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."MANAGEDBASELINE_IDX$$_012C000E"
        ON "DUBLIN80M010"."MANAGEDBASELINE"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."WTORGANIZATION_IDX$$_012C000F"
        ON "DUBLIN80M010"."WTORGANIZATION"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."PROJECTPLAN_IDX$$_012C0010"
        ON "DUBLIN80M010"."PROJECTPLAN"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."REPORTTEMPLATE_IDX$$_012C0011"
        ON "DUBLIN80M010"."REPORTTEMPLATE"
        ("MARKFORDELETEA2",UPPER("NAME"))
        COMPUTE STATISTICS;
        CREATE INDEX "DUBLIN80M010"."WTPRODUCT_IDX$$_00740001"
        ON "DUBLIN80M010"."WTPRODUCT"
        ("IDA3CONTAINERREFERENCE","LATESTITERATIONINFO","MARKFORDELETEA2")
        COMPUTE STATISTICS;

    h) Analyze Oracle:
        emctl start dbconsole
        http://scnode2:1158/em

    i) AWR and ASH reports:
        sqlplus sys/manager as sysdba
        @$ORACLE_HOME/rdbms/admin/awrrpt.sql
        @$ORACLE_HOME/rdbms/admin/ashrpt.sql

SECTION XV: METHOD SERVER LOG

1. Any fancy Unix commands to summarize the Method Server log?

# cat M*log | grep Exception | cut -d: -f 5- | sed -e 's/[0-9]*_OutdoorProducts_Org1_Admin_ActionItem[-_0-9]*/XXXX/' | sort | uniq -c | sort -n
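One small addition of my own to Section IX: prstat can combine the microstate (-m) and per-thread (-L) views for a single busy method server, which is often the quickest way to see whether it is burning user CPU, waiting on locks, or making system calls.

# 1332 is a placeholder for a method server pid (pick it from prstat)
prstat -mL -p 1332 10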


PTC-Windchill

JVM Tuning for Windchill

JVM Tuning for Windchill

Is Your Windchill Installation Using Memory Efficiently?

Jeff Taylor
Sun Microsystems
Updated: April 13, 2007

Notes

December 11, 2006: Reduced the recommended Java heap size from 3.6 GB to 3.2 GB to avoid "Out of Memory" errors. This seems counter-intuitive, but the explanation is simple. The Java process needs storage outside of the Java heap. For example, when the JVM attempts to create a new thread, thread local storage is allocated from outside of the Java heap. Thus, an excessively large Java heap limits the number of threads that the JVM can allocate, and instead of creating the new thread, the JVM crashes.

April 13, 2007: (1) Replaced setting Eden in megabytes with -XX:NewRatio=2, (2) added -XX:MaxLiveObjectEvacuationRatio=33, (3) removed explicit setting of MaxTenuringThreshold and (4) added a setting for the RMI GC interval.

Introduction

The World Wide Web is full of articles that cover Java tuning. With so much information available, it is hard for a Windchill administrator to know where to start. Which approaches are useful? Which articles and options apply to Windchill? How do you get started? What are the right settings for Windchill?

Why is this so hard? The best Java options depend on the hardware that is used for the Windchill server and the usage patterns of the Windchill users. One of the fundamental questions that needs to be answered by every Windchill administrator is: "Is memory being used efficiently?" Every Windchill installation will need to customize -Xmx, which sets the maximum size of the Java heap. The default value is 64MB1, much too small for Windchill. Unfortunately, there is not one correct answer for every installation. When a well written Java program performs poorly, there are two typical causes: either the Java heap is too small, causing an excessive amount of garbage collection, or the Java heap is so big that portions are paged to virtual memory. Either problem can be severe, so finding a balance is important. In conjunction with setting the right heap size, every administrator will need to set the size of the Eden (young), survivor, and tenured (old) generations2. Also, there are a large number of other Java options, some that help Windchill, some that have minimal impact, and some that only apply to JVM releases that are not supported by Windchill3.

This article will focus on Java 1.4.2. Windchill 8.0 M020, released in May 2006, was tested with the Sun Microsystems Java Software Developer Kit version 1.4.2_09 for Solaris 9 and 10. The Support Matrix notes that higher versions in the 1.4.2_xx series are also expected to work.

A Reasonable Goal

Try to keep garbage collection at 5% or less of the JVM's CPU time. If you can't accomplish this by adjusting the JVM parameters, consider purchasing additional RAM. The information presented in this article is intended to help you understand how to measure the current status to achieve this goal.

The Three Generations

The Java heap contains Eden, the survivor spaces and the old generation2. There are many other articles available which describe the generations and the garbage collection algorithm, so I will only state the following observations:

- Tenured (old) generation garbage collections use more resources than Eden (young) generation garbage collections. My goal as a tuner is to stop objects from being unnecessarily tenured.
- The rate at which new Java objects are created cannot be controlled by the Windchill administrator; rather, it is a function of the Windchill usage pattern and the algorithms implemented by the Java programmers who wrote Windchill.
- The time interval between young generation collections is simply the size of Eden divided by the rate at which new Java objects are being created. Increasing the size of Eden makes the time interval between new generation collections longer.
- One path for an object to be tenured is to survive a certain number of young generation garbage collections. Therefore, increasing the time interval between young generation collections implies that an object will be older before being tenured. Therefore, increasing the size of Eden is good.
- Another path for an object to be tenured is for the survivor space to be too small to contain all of the objects which survive a young generation collection, in which case the survivors spill into the old generation. Therefore, a large survivor space is a good thing.
- When the old generation is full, with acknowledgement of the "young generation guarantee"2, the old generation will be collected. Therefore, a large tenured generation is a good thing.

So a large Java heap is desirable so long as the data fits into RAM. However, finding a balance in the size of the three generations is a key to successful JVM tuning.

How big can you make the Windchill heap?

There are at least 3 potential limits to the size of the Java heap: physical memory, virtual memory and process address space. For a discussion of physical memory and virtual memory, please see the section "How Big is Too Big?" below. Regarding the process address space, the Windchill program consists of Java byte code that runs correctly on either the 32-bit or the 64-bit Java Virtual Machine. The 64-bit JVM has a bigger process address space, but I am not sure that PTC has a clearly stated support policy for the 64-bit JVM.

With the 32-bit version of Java, the size of the Java heap can almost reach the full 32-bit / 4GB process address space. In addition to the Java heap, other segments need to be mapped into the process address space, such as the executable "text" (machine instructions) and the stack. You can use "pmap -x" to view all of the memory segments of a process. With Windchill, I have found that the largest usable 32-bit heap is close to 3.2 GB.

You can run Windchill inside the 64-bit JVM on Solaris simply by installing the 64-bit Java supplement and adding the -d64 option to the Java command line. The size of the heap with the 64-bit version of Java can be much larger; however, with Windchill, you may have a better overall user experience by using several 32-bit JVMs (for example, 2 Tomcat JVMs and 4 foreground method servers), rather than two very large JVMs (1 Tomcat plus 1 MS). With huge heaps, users will experience long, seemingly random pauses when the GC finally kicks in.

Enabling GC logging and Example Java Options

The options shown here would be a good starting point for a new installation. The sizes should be evaluated for your hardware and usage patterns.
Enabling GC logging and Example Java Options

The options shown here would be a good starting point for a new installation. The sizes should be evaluated for your hardware and usage patterns.

Tomcat:

setenv JAVA_OPTS "-server -Xms3200m -Xmx3200m -XX:NewRatio=2 -XX:MaxLiveObjectEvacuationRatio=33 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:PermSize=64m -XX:MaxPermSize=64m -XX:+UseTLAB -XX:+ResizeTLAB -XX:+DisableExplicitGC -XX:+UseMPSS -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/opt/ptc/Windchill_8.0/logs/tomcat_gc.log"

Windchill 8.0:

wt.manager.cmd.MethodServer.java.command=$(wt.java.cmd.quoted) $(wt.manager.cmd.common.java.args) -Djava.protocol.handler.pkgs\=HTTPClient -DHTTPClient.disableKeepAlives\=true -DHTTPClient.dontChunkRequests\=true -Xms3200m -Xmx3200m -XX:NewRatio=2 -XX:MaxLiveObjectEvacuationRatio=33 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:+UseTLAB -XX:+ResizeTLAB -XX:+DisableExplicitGC -XX:+UseMPSS -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/opt/ptc/Windchill_8.0/logs/MethodServer_gc.log $(wt.manager.cmd.MethodServer.platform.java.args) wt.method.MethodServerMain
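Once GC logging is enabled, it is worth confirming that each log is actually being written and getting a rough feel for how often full collections occur. The check below is only a sketch: the log path matches the -Xloggc option above, and it assumes the 1.4.2 -verbose:gc/-XX:+PrintGCDetails format, in which every collection line contains "GC" and major collections are tagged "Full GC".

#!/bin/ksh
# Rough count of minor vs. full collections recorded so far in one GC log.
GCLOG=/opt/ptc/Windchill_8.0/logs/tomcat_gc.log
full=$(grep -c "Full GC" $GCLOG)     # lines for major (full) collections
total=$(grep -c "GC" $GCLOG)         # lines for all collections
echo "$GCLOG: $((total - full)) minor and $full full collections"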
Multiple Method Servers

If your Windchill application tier server has more than 8 GB of RAM, you will want to consider using multiple method servers and/or multiple Tomcat instances. With the Windchill parameters above, the multiple JVMs would write all of the logging data into one file, making analysis very difficult. Here is an example that forces each Windchill method server to write into a unique log file.

Changes to wt.properties:

wt.manager.cmd.BackgroundMethodServer=/opt/ptc/Windchill_8.0/launch_bms.ksh
wt.manager.cmd.MethodServer=/opt/ptc/Windchill_8.0/launch_ms.ksh
wt.manager.cmd.ServerManager.java.command=$(wt.java.cmd.quoted) $(wt.manager.cmd.common.java.args) -Xms512m -Xmx512m -XX:NewRatio=2 -XX:MaxLiveObjectEvacuationRatio=33 -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:+UseTLAB -XX:+ResizeTLAB -XX:+UseMPSS -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/opt/ptc/Windchill_8.0/logs/ServerManager_gc.log wt.manager.ServerManagerMain

/opt/ptc/Windchill_8.0/launch_ms.ksh:

#!/bin/ksh
WT_HOME=/opt/ptc/Windchill_8.0
WEB_INF=$WT_HOME/codebase/WEB-INF
CLASSPATH=\
$WT_HOME/codebase:\
$WEB_INF/lib/activation.jar:\
$WEB_INF/lib/ie.jar:\
$WEB_INF/lib/ie3rdpartylibs.jar:\
$WEB_INF/lib/install.jar:\
$WEB_INF/lib/mail.jar:\
$WEB_INF/lib/wc3rdpartylibs.jar:\
$WEB_INF/lib/wncWeb.jar:\
$WEB_INF/lib/pdmlWeb.jar:\
$WEB_INF/lib/pjlWeb.jar:\
$WT_HOME/lib/servlet.jar:\
$WT_HOME/lib/CounterPart.jar:\
$WT_HOME/lib/wnc.jar:\
$WT_HOME/lib/pdml.jar:\
$WT_HOME/lib/pjl.jar:\
$WT_HOME/lib/wnc-test.jar:\
/usr/j2se/lib/tools.jar

exec /usr/j2se/jre/bin/java \
    -server \
    -classpath $CLASSPATH \
    -noverify \
    -Djava.protocol.handler.pkgs=HTTPClient -DHTTPClient.disableKeepAlives=true -DHTTPClient.dontChunkRequests=true \
    -Xms3200m -Xmx3200m -XX:NewRatio=2 -XX:MaxLiveObjectEvacuationRatio=33 \
    -XX:PermSize=96m -XX:MaxPermSize=96m \
    -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:SurvivorRatio=4 \
    -XX:TargetSurvivorRatio=90 \
    -XX:+UseTLAB -XX:+ResizeTLAB \
    -XX:+DisableExplicitGC \
    -XX:+UseMPSS \
    -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 \
    -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
    -Xloggc:$WT_HOME/logs/MethodServer`date +%y%m%d%H%M%S`_gc.log \
    wt.method.MethodServerMain \
    wt.method.log.file=$WT_HOME/logs/MethodServer`date +%y%m%d%H%M%S`.log

/opt/ptc/Windchill_8.0/launch_bms.ksh:

#!/bin/ksh
WT_HOME=/opt/ptc/Windchill_8.0
WEB_INF=$WT_HOME/codebase/WEB-INF
CLASSPATH=    # Same as in /opt/ptc/Windchill_8.0/launch_ms.ksh (removed to save space)

exec /usr/j2se/jre/bin/java \
    -server \
    -classpath $CLASSPATH \
    -noverify \
    -Djava.protocol.handler.pkgs=HTTPClient -DHTTPClient.disableKeepAlives=true -DHTTPClient.dontChunkRequests=true \
    -Xms1000m -Xmx1000m -XX:NewRatio=2 -XX:MaxLiveObjectEvacuationRatio=33 \
    -XX:PermSize=96m -XX:MaxPermSize=96m \
    -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:SurvivorRatio=4 \
    -XX:TargetSurvivorRatio=90 \
    -XX:+UseTLAB -XX:+ResizeTLAB \
    -XX:+DisableExplicitGC \
    -XX:+UseMPSS \
    -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 \
    -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
    -Xloggc:$WT_HOME/logs/BackgroundMethodServer`date +%y%m%d%H%M%S`_gc.log \
    wt.method.MethodServerMain \
    wt.method.log.file=$WT_HOME/logs/BackgroundMethodServer`date +%y%m%d%H%M%S`.log \
    wt.method.serviceName=BackgroundMethodServer \
    wt.queue.executeQueues=true wt.queue.queueGroup\=default \
    wt.adapter.enabled\=false \
    wt.method.minPort\=3000

Using this change to the wt.properties file and the two scripts provided, a unique garbage collection log file will be written for every Windchill method server. This is a necessary step in preparing to use more advanced analysis tools.

VisualGC, jstat and PrintGCStats

Three tools that I like to use to analyze Java performance are VisualGC [5], jstat [6] and PrintGCStats [7]. Each tool is useful and provides a unique view of Java's memory management.

A good place to start is VisualGC, which provides a graphical display of the Java generations. VisualGC can help a Windchill administrator get an intuitive feel for how memory is being used. Because the graphical display could impact performance, I run jstatd on the Windchill server and VisualGC on a remote Solaris workstation or Windows laptop. The jstatd daemon seems to have minimal impact on Windchill performance. One example of information that can be quickly gleaned from VisualGC comes from observing the Survivor Age Histogram window. If survivors walk the whole way down the histogram before being promoted to the tenured generation, they may be legitimate long-lived objects. On the other hand, if the objects march very quickly down the histogram, try increasing the size of Eden to slow the speed at which survivors march down the histogram. If survivors are being dumped into the tenured generation before surviving the full MaxTenuringThreshold young GCs, you should increase the size of the survivor spaces by lowering the SurvivorRatio.
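jstatd will not export any instrumentation unless it is started with a security policy that grants it the needed permissions. The fragment below is a minimal sketch of starting jstatd on the Windchill server so that VisualGC can attach from a remote workstation; the JDK path matches the 1.5 installation used with jstat later in this article, and the policy file location (/var/tmp/jstatd.policy) is an arbitrary choice.

#!/bin/ksh
# Create a policy file that grants jstatd's code in tools.jar the permissions
# it needs, then start the daemon.  VisualGC on a remote machine can then
# connect to this host by name.
JDK=/usr/jdk/jdk1.5.0_06
cat > /var/tmp/jstatd.policy <<'EOF'
grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};
EOF
$JDK/bin/jstatd -J-Djava.security.policy=/var/tmp/jstatd.policy &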
While VisualGC helps the administrator understand the behavior of the Windchill method servers and the Tomcat JVMs, it is not a useful basis for comparison. In other words, you can't put the VisualGC feedback into a spreadsheet. To record and compare behavior, use jstat. Jstat is a tool that is provided with Java 1.5 but can nevertheless be used to monitor the Windchill Java 1.4 JVMs.

Here is an example of the output (after some clean up with a text editor to align the columns and annotate the rows):

# pgrep java | xargs -n 1 /usr/jdk/jdk1.5.0_06/bin/jstat -gc
S0C      S1C        S0U     S1U        EC        EU        OC       OU      PC      PU  YGC     YGCT FGC    FGCT     GCT
273024.0 273024.0    0.0  1751.0 1092352.0  882496.2 2048000.0 805586.6 65536.0 31095.8 115   82.588   2  39.730 122.318 (tomcat)
273024.0 273024.0 1745.9     0.0 1092352.0 1070228.3 2048000.0 923693.3 65536.0 30294.0 138   97.911   4  90.186 188.097 (tomcat)
273024.0 273024.0    0.0 60328.1 1092352.0  892815.4 2048000.0 659197.6 98304.0 62288.3 361  387.152   2  29.291 416.443 (MethodServer)
273024.0 273024.0    0.0 66246.3 1092352.0  622980.6 2048000.0 587980.6 98304.0 61971.7 298  356.208   1  13.550 369.758 (MethodServer)

The Windchill server is running two Tomcat instances and two Windchill method servers. Jstat displays the capacity and utilization of all of the generations, the number of garbage collections, and the time spent in garbage collection. I used PC, the permanent generation capacity, to determine which JVMs were Tomcat and which were method servers. Based on historic PU values (permanent generation utilization), a 64 MB permanent generation is more than sufficient for Tomcat but sometimes too small for Windchill method servers. If you have more than a handful of full garbage collections (FGC) per hour, it is time to make substantial changes to your configuration. Also, if the total garbage collection time divided by the JVM's CPU time (ps -ef | grep java) is greater than 5%, you need to modify your configuration.

Jstat is a tool that the Windchill administrator can use to sample the JVMs. By sample, I mean "a finite part of a statistical population whose properties are studied to gain information about the whole" [8]. Sampling is an important technique, but some details are smoothed over. Jstat would not be a particularly good tool to answer questions such as "Were the user performance complaints that came in after everyone returned from lunch due to excessive garbage collections at that particular time?" or "How much space is available after the garbage collection completes?" The final and most detailed level of garbage collection analysis comes from using the Java options -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails combined with PrintGCStats to summarize the results. Here is an example of PrintGCStats output:

# /opt/PrintGCStats -v ncpu=8 MethodServer060718120205_gc.log
what      count       total       mean        max  stddev
gen0(s)     206     248.765     1.20760     4.000  0.3632
gen0t(s)    206     248.815     1.20784     4.001  0.3632
GC(s)       206     248.815     1.20784     4.001  0.3632
alloc(MB)   206  219750.464  1066.74982  1066.750  0.0025
promo(MB)   206     645.217     3.13212    53.974  4.2256
alloc/elapsed_time = 219750.464 MB /  4653.224 s = 47.225 MB/s
alloc/tot_cpu_time = 219750.464 MB / 37225.792 s =  5.903 MB/s
alloc/mut_cpu_time = 219750.464 MB / 35235.275 s =  6.237 MB/s
promo/elapsed_time =    645.217 MB /  4653.224 s =  0.139 MB/s
promo/gc0_time     =    645.217 MB /   248.815 s =  2.593 MB/s
gc_seq_load        =   1990.517 s  / 37225.792 s =  5.347%
gc_conc_load       =      0.000 s  / 37225.792 s =  0.000%
gc_tot_load        =   1990.517 s  / 37225.792 s =  5.347%

All three tools are useful, and each has a unique place in JVM tuning.
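A quick, rough way to check the 5% rule between PrintGCStats runs is to compare jstat's cumulative GC time (the GCT column) with the cumulative CPU time reported by ps. The script below is only a sketch under a couple of assumptions: it uses the same JDK 1.5 jstat path shown above, relies on nawk, and its simple time parsing assumes each JVM has accumulated less than a day of CPU time.

#!/bin/ksh
# For each running java process, compare total GC seconds (jstat GCT) with
# total CPU seconds (ps -o time, parsed as [hh:]mm:ss) and print the ratio.
JSTAT=/usr/jdk/jdk1.5.0_06/bin/jstat
for pid in $(pgrep java); do
    gct=$($JSTAT -gc $pid | nawk 'NR == 2 { print $NF }')
    cpu=$(ps -o time -p $pid | nawk -F: 'NR == 2 { s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }')
    [ "$cpu" -gt 0 ] || continue    # skip JVMs with no accumulated CPU time yet
    echo "$pid $gct $cpu" | nawk '{ printf("pid %d: gc %.1fs of %ds CPU = %.1f%%\n", $1, $2, $3, 100 * $2 / $3) }'
done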
How big is too big? (Resident Set Size vs. Virtual Size)

The summary of the article up to this point is to increase the size of the JVM generations and/or add additional Windchill method servers and Tomcat instances to take advantage of your RAM. This should leave you with questions such as:

- How do you know if the sizes that are currently set are too big?
- How do you know if the sizes that are currently set are too small?
- How much RAM is available to allow the JVM sizes to be increased?

The heap sizes are too big if memory pressure is causing excessive virtual memory activity. A good indicator is the scan rate, displayed by either "sar -g" or vmstat. The scan rate should be at or close to zero. If the scanner kicks in for a short time but returns to zero, virtual memory pressure is not having a significant impact on your performance. If the system is always scanning, you need to kill non-critical processes, reduce the size of your Java heap, or add more RAM to the system.

The heap sizes are too small and/or the ratios need to be adjusted if you are spending more than 5% of your CPU time in garbage collection. See the jstat discussion above.

If you want to increase the size of your heaps, you will need to determine how much RAM is available. It is important to differentiate between a process's working set, resident set size, and virtual size. The working set is the set of memory addresses that a program will need to use in the near future [9]. The resident set size of a process is the portion of the process's memory address space that is currently in RAM. The virtual size is the entire size of the process's memory, including pages that are currently in RAM, pages that the operating system has paged out to the disk drive, and addresses that have been allocated but not yet touched and hence not yet mapped. It is common for the resident set size of a Windchill process to be smaller than its virtual size because the process will have pages that have not yet been mapped. Conversely, it is a problem if memory pressure has caused the OS to page your Java heap out to disk.

One tool that can be used to view the resident set size and virtual size is the ps command. To view the resident set size and virtual size of your Java processes:

# ps -e -o rss,vsz,args | grep java | sort -n
 413888  704712 /usr/j2se/jre/bin/java -server -classpath /opt/ptc/Windchill_8.0/codebase:/opt/
 605496 1248608 /usr/j2se/jre/bin/java -server -classpath /opt/ptc/Windchill_8.0/codebase:/opt/
2525888 3984944 /usr/j2se/jre/bin/java -server -classpath /opt/ptc/Windchill_8.0/codebase:/opt/
2590776 3993064 /usr/j2se/jre/bin/java -server -classpath /opt/ptc/Windchill_8.0/codebase:/opt/
2636216 3968328 /usr/j2se/bin/java -server -Xms3200m -Xmx3200m -XX:NewSize=1400m -XX:MaxNewSize
2748040 3964296 /usr/j2se/bin/java -server -Xms3200m -Xmx3200m -XX:NewSize=1400m -XX:MaxNewSize

To add up the resident set sizes of all of your processes (12.3 GB in this example):

# ps -e -o rss,vsz,args | awk '{printf("%d+", $1)} END {print 0}' | bc
12354976

To add up the virtual sizes of all of your processes (19.3 GB in this example):

# ps -e -o rss,vsz,args | awk '{printf("%d+", $2)} END {print 0}' | bc
19383912

I should point out that this technique is not 100% accurate. When two or more processes map a given shared memory segment or shared library, the shared memory is added once for each process rather than just once. In the case of Java programs with large heaps, this is an insignificant rounding error. On a system that is running Oracle, a different tool needs to be used (perhaps pmap -x), because Oracle's large SGA would be counted many times.
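If you prefer to watch the scan rate without reading the full vmstat output, the "sr" column can be pulled out directly. The snippet below is only a sketch: it finds the column by matching the header (column positions can differ between Solaris releases) and skips vmstat's first data line, which reports averages since boot.

#!/bin/ksh
# Print the page scan rate every 5 seconds for about a minute; a sustained
# non-zero value indicates memory pressure (the Java heaps may be too large
# for the available RAM).
vmstat 5 13 | nawk '
    NR == 2       { for (i = 1; i <= NF; i++) if ($i == "sr") col = i; next }
    NR > 3 && col { print "scan rate:", $col, "pages/sec" }'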
The ps example above is taken from a server with 16 GB of RAM. The fact that the total virtual size is bigger than RAM should be noted and investigated. Has the working set been paged out? Again, a good indicator is the scan rate, displayed by either "sar -g" or vmstat, as discussed above.

Another view of your system memory is available from "mdb -k". Here is an example from a time when there was a lot of free memory:

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace ufs sd ip sctp usba random fcp fctl nca lofs logindmux ptm md cpc fcip sppp crypto nfs ipc ]
> ::memstat
Page Summary            Pages               MB  %Tot
------------ ----------------  ---------------  ----
Kernel                  89123              696    4%
Anon                   589676             4606   28%
Exec and libs            3973               31    0%
Page cache             196119             1532    9%
Free (cachelist)       141024             1101    7%
Free (freelist)       1067444             8339   51%

Total                 2087359            16307
Physical              2054319            16049

::memstat is a heavyweight command that can impact system performance for several minutes, so you will want to use it during off-hours.

Conclusion

A Windchill administrator needs to observe and minimize both Java garbage collection and virtual memory pressure. Java garbage collection should take no more than 5% of the JVM's CPU cycles. The virtual memory scan rate should remain at zero most of the time. If the administrator cannot accomplish both goals on a given server, RAM needs to be added to the server.

References

[1] "java - the Java application launcher" (http://java.sun.com/j2se/1.4.2/docs/tooldocs/solaris/java.html)
[2] "Tuning Garbage Collection with the 1.4.2 Java Virtual Machine" (http://java.sun.com/docs/hotspot/gc1.4.2/)
[3] "A Collection of JVM Options" (http://blogs.sun.com/roller/resources/watt/jvm-options-list.html)
[4] "Java Tuning White Paper" (http://java.sun.com/performance/reference/whitepapers/tuning.html)
[5] http://java.sun.com/performance/jvmstat/visualgc.html
[6] http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstat.html
[7] "Turbo-charging Java HotSpot Virtual Machine, v1.4.x to Improve the Performance and Scalability of Application Servers" (http://java.sun.com/developer/technicalArticles/Programming/turbo/)
[8] http://www.m-w.com/dictionary/sample
[9] http://www.memorymanagement.org/glossary/w.html
[10] "Java Performance Documentation" (http://java.sun.com/docs/performance/index.html)
[11] "jvmstat 3.0" (http://java.sun.com/performance/jvmstat/)
[12] "GC Portal" (http://java.sun.com/developer/technicalArticles/Programming/GCPortal/)
[13] "Java 2 Platform, Standard Edition (J2SE Platform), version 1.4.2" (http://java.sun.com/j2se/1.4.2/1.4.2_whitepaper.html)
