Saturday Mar 09, 2013

OBI Sample VM -- Part 3 Check It Out

"What changed?"

When you open an Oracle Service Request, this is one of the first questions Tech Support will ask.
Your OBIEE application was working;  now it's not;  what changes were made?
It's a question that is often hard to answer ... unless ... you've been disciplined about keeping track of changes.

But ... but ... who has time for that?   You're busy working on business apps for your users ...

just exploded ... I'm surprised too

It isn't difficult to keep track of changes with some simple tools;  you focus on getting the job done, and the tools make your job easier.

The "Revision Control System" ( RCS ) is one of the oldest unix tools for tracking changes.  It's simple ( or can be made simple ), it's small and available everywhere.  
I use it often to track even small changes to simple things. It is especially important if you have frequent disruptions and have to restart hours or days later.   The benefits:

  1. A time-stamped log of changes for every file.
  2. Easy to search if you also forget the file or the directory where you were working
  3. Easy review of the changes with a simple command.
    And easier to spot typos when you're altering a critical file.
  4. Now you're ready for that Oracle Tech Support question:  "What changed?"

The OBIEE Sample Virtual Machine, and Windows MobaXterm are used here to test RCS.  
Much of the software supplied in this virtual machine is new to me, and I can "check it out" without destroying the integrity of the environment.
It allows for quick "try it", and easy back out if the test doesn't work out.

Download the MobaXterm Personal extension set of utilities.

http://goo.gl/0aoO3

Place the URL in your browser to download to your browser environment;  several environments work.

  1. The Windows host machine where you run VirtualBox and MobaXterm
  2. The obiee sample virtual machine running in VirtualBox
  3. A cygwin environment running on the Windows host machine
  4. Your production obiee deployment running on any unix platform

The /etc/moba.alias aliases work in all of these environments, including AIX, Linux, Solaris, HPUX.  The RCS executables  must match your system.  
RCS is usually in the default system;  it's a unix classic.

The wget command is often more helpful when you're working in a particular environment, because you can download a file to the exact directory where you are working.  No need to transfer files from a browser download directory to the machine where you want it deployed.   The wget command looks like:

$ wget --no-check-certificate http://goo.gl/0aoO3

The link will download file extz.tgz; a compressed tar file. 
You would deploy the files on MobaXterm Personal by downloading to your $HOME directory, then move to the root directory (/) and  use these commands:

$ cd /
$ tar -xzvf $HOME/extz.tgz
$ source /etc/moba.alias

Caution:

The extz.tgz contains binary files that are only appropriate for the MobaXterm Personal environment.  
Do not use cd / before extracting files on other platforms.  
Expand the archive in a normal directory;  all you need from this distribution will be in the relative directory file ./etc/moba.alias
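
On a platform other than MobaXterm, the extraction described in the caution above might look like this sketch.  The function name unpack_moba_alias and the default directory $HOME/moba_ext are made up for illustration;  gzip -dc piped into tar is used instead of tar -z because some older unix tar commands lack the -z option.

```shell
# Hypothetical helper: unpack extz.tgz into a scratch directory (never /)
# and print the path of the alias file it should contain.
unpack_moba_alias() {
    archive=$1
    dest=${2:-$HOME/moba_ext}      # assumed location; any normal directory works
    mkdir -p "$dest" || return 1
    # gzip reads the archive from the current directory; tar extracts inside $dest.
    gzip -dc "$archive" | ( cd "$dest" && tar -xf - ) || return 1
    # Only report success if the alias file really came out of the archive.
    [ -f "$dest/etc/moba.alias" ] && echo "$dest/etc/moba.alias"
}

# Usage:
#   aliases=$(unpack_moba_alias "$HOME/extz.tgz") && . "$aliases"
```

Nothing outside the scratch directory is touched, so the MobaXterm binaries in the archive never land on your unix system paths.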

How do we use this?

/etc/moba.alias provides simple shell functions that expose the basics of RCS and are portable across platforms.

Install the aliases using this command:

$ source /etc/moba.alias

The alias file provides:

 alias    Usage
 rcsText  rcsText="project notes"
          An environment variable containing a brief description of what you are working on.  This description gets recorded by RCS for each file that you change.
          When you switch tasks, change the rcsText variable.  Put it in your .profile or .bashrc file for subsequent logins and then put your login profile under RCS control too.  Easy time-stamped revisions of your projects.
 _co      _co file_name
          The file_name is added to RCS controls.  If it isn't already being tracked, it's added with a checkin to establish the base file setting.
 _ls      _ls
          List the files that are tracked in this directory.
 _diff    _diff filename
          List the differences (if any) with the previous version.
 _ci      _ci filename
          Commit filename changes to RCS.
 abm      abm command
          "A Better Man" reduces the output from manual pages for review on limited screen space.   Ok, it's not needed for revision control.

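The contents of moba.alias are not reproduced in this post, so here is only a guess at how helpers like these could be built on the standard RCS commands ci, co and rcsdiff.  Treat every function body below as an assumption, not as the real alias file.

```shell
# Guesses at the helpers, built on the standard RCS commands ci, co, rcsdiff.
# These are NOT the real contents of /etc/moba.alias.
rcsText=${rcsText:-"project notes"}

_co() {   # start tracking a file, or make it editable again
    [ -d RCS ] || mkdir RCS
    if [ -f "RCS/$1,v" ]; then
        [ -w "$1" ] || co -l "$1"       # already writable: nothing to do
    else
        # First check-in: -t- supplies the file description,
        # -l keeps a locked, writable working copy.
        ci -l -t-"$rcsText" -m"$rcsText" "$1"
    fi
}

_diff() { rcsdiff "$1"; }               # working file vs. latest revision
_ci()   { ci -l -m"$rcsText" "$1"; }    # commit and keep the file editable
_ls()   { ls -l RCS/${1:-*},v; }        # list tracked files
```

Using ci -l rather than plain ci is a deliberate choice in this sketch:  plain ci removes the working file after check-in, which is a nasty surprise for something like /etc/hosts.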
As an example, we need to modify /etc/hosts to add the IP address of the obiee sample machine to access it on the network.

I provided a shorter machine alias (obisamp) to reduce typing.  The IP used here might be different if your VirtualBox deploys other VMs.

Here we go -- ($) is the bash prompt:

$ cd /etc
$ _co hosts
$ vi hosts
  O         -- add a line in vi above line 1, then type/paste the next line
  192.168.56.101 obisamp obieesampleapp obieesampleapp.us.oracle.com
  ESC       -- return to vi command mode
  shift-ZZ  -- save the file. ( one-handed, easier than :wq )

$ _ls hosts
----- Mar 9 19:47 RCS/hosts,v

$ _diff hosts
RCS file: RCS/hosts,v
retrieving revision 1.1
diff -r1.1 hosts
22d21
< 192.168.56.101 obisamp obieesampleapp obieesampleapp.us.oracle.com

$ _ci hosts
RCS/hosts,v
new revision: 1.2; previous revision: 1.1
done


To search for files you have modified in a high-level directory, you can use a command like this:

$ find /etc -name '*,v'
/etc/RCS/hosts,v

And then simplify things with a shell function:

$ _find() { find "$@" -name '*,v'; }
$ _find /etc
/etc/RCS/hosts,v

$ _find /home/oracle
.... obisamp is still thinking about that ...

Add the _find function to /etc/moba.alias ... you are now managing  moba.alias with RCS ... right?

Check it out

Thursday Mar 07, 2013

OBI Sample Virtual Machine as an appliance on Windows

MobaXterm, DropBox and OBIEE Sample Virtual Machine

I've had several questions about how to use the OBI Virtual Machine image, and have been busy exploring a number of options.

  1. Use VirtualBox to start the machine and use the resulting desktop.
    Desktop icons can be double clicked to start/stop services
    firefox browser can be used to interact with obiee
    Terminal sessions allow you to type Linux commands, edit and repair files
    Suppose this is your first Linux experience and this all seems very foreign
  2. Start the VM, but then use Windows services to connect and interact.
    This will require an entry in the Windows "hosts" file.  Add the obisamp IP address 
    notepad C:\Windows\System32\drivers\etc\hosts
    192.168.56.101 obisamp obieesampleapp obieesampleapp.us.oracle.com
  3. Now launch the Windows browser of your choice to access the OBI application.
    https://obisamp:7001/analytics/saw.dll?bieehome&startPage=1
  4. What about editing files, working with Linux?  Can that be done from Windows?
    Yes it can.    MobaXterm is a windows utility that interacts with other machines.

Click on this link to download the prepared image to Windows, and unpack the .zip file.

https://dl.dropbox.com/s/igsfzigpt4451ih/obisamp_mobaxterm.zip

Go to the MobaXterm Home Page and watch the small video demo running at the bottom of the page.

http://mobaxterm.mobatek.net/

Launch the application: MobaXterm_Personal-6.2.exe 

Under "Saved sessions", click on "obisamp [SSH]"

A new tab opens up, containing a terminal session to the obisamp virtual machine, and on the left is an sftp file manager showing you all of the files in [ /home/oracle ]

Scroll down until you see the file .bashrc_profile
Click and you are editing that file in a MobaTextEditor that behaves like Notepad.

In the right terminal pane, type xclock
A window appears with an analog clock.  This is a Unix application in the virtual machine, showing in an X-server window running on your Windows machine.

This free MobaXterm application has limitations which you can review in the Downloads section on the MobaXterm home page.

There's more to come ... this gets you started ... explore.

Friday Jan 04, 2013

Deploying the OBI Sample Virtual Machine

Cliff Notes:  step-by-step instructions to deploy the virtual machine

Announcement:  A new virtual machine image is being prepared.
Version 303 is in preparation, and will be publicly available soon as version 304.
Enjoy the Version 303 videos prepared by the development staff.

The documentation for the OBI sample application machine is very thorough and well written.  
It's just a lot of reading if you just want to get started.  
 Here is the short recipe for success.  
 I deployed this on Windows 8.   The VM guest environment includes:

  • Oracle Linux 5 (el5PAE )
  • OBIEE 11.1.6.2 
  • Oracle Database 11g Release 2
  • Oracle TimesTen
  • Essbase Server
  • Access Provisioning Services
  • Oracle JDev
  • Oracle SQL Developer
  • Oracle Internet Directory

Step by step.

  1. Prepare your host system.
    Minimum 4GB of real memory;  more is better.
    Turn on Virtual Assist features in the BIOS.
    73GB disk space needed to install/deploy 
    50GB of free disk space to run

    Download and install Free Download Manager.
    Download and install 7Zip.
    Download and install md5sum

  2. Download and install Oracle VirtualBox ( I used VirtualBox-4.2.6.82870-Win.exe )
    I save the download images to an external USB 3.0 disk

  3. Sign in to the Oracle Tech Network
    click on Oracle Business Intelligence SampleAppV207

  4. click on Downloads and Instructions
    click to accept the OTN license
    Select the V207 version and download:
    - VB Image-Deployment Guide ( where VB means VirtualBox)
    - What's new in V207
    - V207 known issues

  5. Download the virtual image components, one-by-one
    This is where Free Download Manager is useful.  These images are huge.
    While fdm will maintain a queue of downloads, the Oracle Network will start the first download and reject subsequent ones.  
    It took me about 25 minutes per zip file.
    iPhone Clock/Timer kept me focused. I hate it when machines program me.
    VB Image Key File (.ovf) is used to import the VM into VirtualBox
    SampleApp207GA_OBI_BP1.zip.###  ( 001 to 011 )
    Those filenames are correct;  do not rename them. They are part of a "split" archive ... 2GB per segment.

  6. When all zip files are downloaded, unpack them with 7zip.   In a command window:
    7z x SampleApp207GA_OBI_BP1.zip.001
    Result:
    SampleApp207GA_OBI_BP1-disk1.vmdk ( 22GB)
    SampleApp207GA_OBI_BP1-disk2.vmdk ( 1.5MB)

    You can discard the zip files.   I retained them on the USB drive to avoid downloading again.

  7. Start Oracle VM VirtualBox Manager
    Click File / Import Appliance ( Ctrl-I )
    Click Open appliance...
    Select SampleApp207GA_OBI_BP1.ovf  machine definition file
    Check [x] Reinitialize the MAC address of all network cards
    Wait for it to complete.  Your virtual machine is ready.

    Now you can discard the original .vmdk image.  It has been cloned to the new VB location you chose.

  8. Before you start the VM, click on Settings
    System: Motherboard tab
    Select memory size ( I used 4000 MB )
    [x] Enable IO APIC
    [x] Enable absolute pointing device
    System: Processor
    I selected 2 processors
    [x] Enable PAE/NX
    System: Acceleration
    [x] Enable VT-x/AMD-V
    [x] Enable Nested Paging

    Network: Adapter 1
    [x] Enable Network Adapter
      [ Bridged Adapter ]
      Name:   You will likely have to change this if using Wireless. Use the dropdown box to select
      [ Intel(R) Centrino(R) Wireless N-2230 ]
    Network Adapter 2
    [ ] Not enabled

    USB:
    [x] Enable USB Controller
    [x] Enable USB 2.0 Controller
    Click [OK]

  9. Ladies and Germs, Start your VM
    Username [ oracle ]
    Password [ oracle ]

  10. You'll soon discover the mouse doesn't seem to work correctly.
    The reason is that this virtual machine was built on an earlier VB version, and the "Guest Additions for Linux" need to be updated ( See section 4.2.2 in VB User Guide )

    In the top menu, Click "Machine", then "Settings" ( Host+S, where Host means Right-CTRL key)
    Select Storage
    (O) Empty means no CD installed.   We will put a virtual CD ( .iso) here.
    Click Disc icon on the right, Click "VBoxGuestAdditions.iso"
    Click [ OK ]  
    You can't correctly launch VBoxLinuxAdditions.run from the file menu; it requires root privilege.
    Minimize the files folder window.
    R-Click an empty spot on the Desktop
    Open Terminal
    $ su     # switch to root user; Password: oracle.
       " #" is the terminal prompt for root ## are my comments
    # cd /media
    # cd VBOX<tab> ## to complete the pathname "VBOXADDITIONS_4.2.6_82870"
    # sh V*.run<tab> ## to complete the file "VBoxLinuxAdditions.run"
    <enter>

    " Removing installed 4.1.2 of VB Guest Additions"

  11. The mouse is much smoother now.
    Click "Applications" / "System Tools" / "Software Updater"

    Enter the password for root:  password [ oracle ]
    Click [ Apply updates ]
    Click [ Reboot Now ]

  12. Congratulations.  You just passed your first Linux Administration Certification Exam.
    You are now free to explore the image on your own, while I construct the next section.

  13. Reading Homework.  Explore according to your interests.
    sampleapp207-deploymentguide-1719589.pdf
    VirtualBoxUserManual.pdf
    Linux Blog  See:  Oracle Linux Hands-on Lab from your Home? Yes You Can Do That!
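
Steps 5 and 6 above only work when all eleven segments have arrived, and a rejected download is easy to miss.  Here is a small sketch for listing the missing ones before you run 7z;  it assumes a bash-like shell on the host ( cygwin or MobaXterm ), and the function name missing_segments is my own invention.

```shell
# Hypothetical helper: report which of the 11 split-archive segments are
# missing (or empty) in a download directory. Default: current directory.
missing_segments() {
    dir=${1:-.}
    i=1
    while [ "$i" -le 11 ]; do
        seg=$(printf 'SampleApp207GA_OBI_BP1.zip.%03d' "$i")
        [ -s "$dir/$seg" ] || echo "$seg"
        i=$((i + 1))
    done
}

# Usage:
#   missing_segments /path/to/downloads
```

No output means every segment from .001 to .011 is present and non-empty.  It does not prove the contents are intact;  that is what md5sum is for.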

Postscript:  

I just ran through the steps above, and after rebooting the virtual machine, the mouse was jumpy again.

I don't know why this happened, but some VM Settings were changed and I had to restore the settings ( step 8 above ).
System: Motherboard
[x] Enable IO APIC
[x] Enable absolute pointing device

Storage: CD Controller Change back to Host Drive.  We don't need the Guest Images iso.

3.2 Configuring Hosts File

The description in this section is incorrect.  Order is important in the /etc/hosts file.
The original file used for testing this VM is very likely:  /etc/hosts_orig
It contains the machine name at the bottom, leaving the original local host name in place ( 127.0.0.1)
To provide a real IP, the entry must come before localhost.
Here is a short script that could be part of machine bootup, that will repair /etc/hosts, depending on the VirtualBox host machine.

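#!/bin/bash
# Put this machine's real IP address ahead of localhost in /etc/hosts.
myHost=$(uname -n)
myID=$(whoami)
# First non-loopback IPv4 address ( 'inet addr:' is the el5 ifconfig format )
myIP=$(ifconfig -a | grep 'inet addr:' | grep -v 'addr:127' | tr ':' ' ' | awk '{print $3; exit}')
# Existing hosts entry that starts with this IP, if any
hostIP=$(grep "^$myIP[[:space:]]" /etc/hosts)
SUDO=
[ "$myID" = root ] || SUDO=sudo
if [ -n "$myIP" ] && [ -z "$hostIP" ]; then
   [ -e /etc/hosts.orig ] || $SUDO cp -p /etc/hosts /etc/hosts.orig
   # tee runs under sudo;  a plain > redirect would be opened by the unprivileged shell
   { echo "$myIP $myHost"; cat /etc/hosts.orig; } | $SUDO tee /etc/hosts.$$ >/dev/null
   $SUDO ln -fs /etc/hosts.$$ /etc/hosts
   ls -l /etc/hosts /etc/hosts.orig
fi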
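
To see whether the ordering problem described above applies to a given hosts file, a check along these lines can help.  It is only a sketch:  first_match is a hypothetical name, and it reads the file top to bottom and reports the first entry that carries the name.

```shell
# Hypothetical check: which kind of address does a name hit first in a
# hosts-format file? Prints "real", "loopback", or "missing".
first_match() {   # usage: first_match hostname [hosts-file]
    awk -v name="$1" '
        $1 ~ /^#/ { next }                        # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break              # ignore trailing comments
                if ($i == name) {
                    if ($1 ~ /^127\./) print "loopback"; else print "real"
                    found = 1
                    exit
                }
            }
        }
        END { if (!found) print "missing" }
    ' "${2:-/etc/hosts}"
}

# Usage:
#   first_match "$(uname -n)"
```

An answer of "loopback" means the machine name still sits on a 127.x.x.x line ahead of any real IP entry.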

Other useful Linux tools missing from the VM

# yum install yum-utils   ## Note dash (-), not dot (.) 

Useful aliases in the works

I wrote a "dirstats" script for Solaris to get an overview of a directory I'm unfamiliar with;
Chris Garber improved the performance.  Here are the results in /home/oracle:

[oracle@obieesampleapp ~]$ dirstats 

      Size    Dirs   Links   Files Path
       59M       1       0       5 apex 
       12M      15       0     110 apex_listener 
       18G    3779      96   41287 app 
      8.0K       0       0       1 bea 
      5.9M       1       0       1 catalogmanager 
      4.0K       0       0       0 data 
      312K      19      16      24 Desktop 
      4.0K       0       0       0 Downloads 
      5.5G    8831      15   81441 epm 
       32M      16       0      96 firefox 
      4.0K       0       0       0 hostshare 
       28K       6       0       1 http: 
       12G   15697      11  176930 obiee 
      2.4G    2582      10   21889 oid 
      1.6M      32       0      52 oradiag_oracle 
       16K       0       0       1 ore 
      4.0K       0       0       0 RCS 
      155M      39       6     118 scripts 
      2.0M       5       0       4 workspace 

 An automated startup script to make this a service appliance.
Mar 4, 2013.

Each of the components in this software stack has a separate startup script, with a numbered entry on the desktop to start in the proper order by clicking the Desktop entry. The shell script launches a terminal shell that observes the launch and pauses to review success or error messages.  

2-StartWLS.sh is different;  you are expected to watch the output of this script run in a terminal session until a message is displayed showing successful launch  and manually place this script in the background so that Web Logic Services stays running.

I needed to automate this step and avoid human interaction, so I used awk to watch for the key line in the output like this, and then exit the invocation script to proceed to the next component.

echo "#--- 2-StartWLS.sh "
set +o noclobber     # allow > to overwrite an existing /tmp/wls.out
nohup /home/oracle/obiee/user_projects/domains/bifoundation_domain/startWebLogic.sh >/tmp/wls.out &

# Watch the log;  when the key line appears, echo it to stderr and stop waiting.
tail -f /tmp/wls.out \
| awk '/<Server started in RUNNING mode>/ { print "\n\n" $0 "\n\n" > "/dev/stderr"; exit }'

The complete script is available here:   start.sh
The companion script to bring services down in the proper order:   stop.sh
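
One weakness of the tail | awk approach is that it waits forever if WebLogic never reaches RUNNING mode.  A different technique, sketched here and not taken from start.sh, is to poll the log with grep and give up after a timeout.  The function name wait_for_line and the 600 second default are assumptions.

```shell
# Hypothetical alternative: poll the log with grep instead of tail | awk,
# and give up after a timeout (seconds) so the boot sequence cannot hang.
wait_for_line() {   # usage: wait_for_line logfile 'text' [timeout]
    waited=0
    until grep -q -F -- "$2" "$1" 2>/dev/null; do
        [ "$waited" -ge "${3:-600}" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Usage (log path taken from the snippet above):
#   wait_for_line /tmp/wls.out '<Server started in RUNNING mode>' 900 || exit 1
```

The return code lets the calling script decide whether to continue with the next component or stop and complain.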

Comments are turned off on this blog.   If you have feedback, contact me via email.
dick.dunbar@oracle.com
I promise to simplify and clean this item.

Wednesday Oct 24, 2012

SafariBooks: Oracle BI 11g Developer's Guide

  • Oracle Business Intelligence 11g Developer’s Guide

  • By: Mark Rittman

  • Publisher: McGraw-Hill

  • Pub. Date: October 11, 2012

  • Print ISBN-13: 978-0-07-179874-7

  • E-Book ISBN-13: 978-0-07-179875-4

  • Pages in Print Edition: 1088

    http://techbus.safaribooksonline.com/book/-/9780071798747 

Tuesday Aug 28, 2012

obiee memory usage

Heap memory is a frequent customer topic.

This is a quick refresher, oriented towards AIX, but the principles apply to other unix implementations.

Another aspect of system memory consumption on unix systems is described here.

http://www.linuxatemyram.com


1. 32-bit processes have a maximum addressability of 4GB; usable application heap size of 2-3 GB. 
On AIX it is controlled by an environment variable:
export LDR_CNTRL=....=MAXDATA=0x080000000   # 2GB ( The leading zero (0x08) is deliberate, not required )
   1a. It is  possible to get 3.25GB  heap size for a 32-bit process using @DSA (Discontiguous Segment Allocation)
    export LDR_CNTRL=MAXDATA=0xd0000000@DSA  # 3.25 GB 32-bit only
        One side-effect of using AIX segments "c" and "d" is that shared libraries will be loaded privately, and not shared.
        If you need the additional heap space, this is worth the trade-off.  This option is frequently used for 32-bit java.
   1b. 64-bit processes have no need for the @DSA option.

2. 64-bit processes can double the 32-bit heap size to 4GB using:
export LDR_CNTRL=....=MAXDATA=0x100000000  # 1 with 8-zeros
    2a. But this setting would place the same memory limitations on obiee as a 32-bit process
    2b. The major benefit of 64-bit is to break the binds of 32-bit addressing.  At a minimum, use 8GB
export LDR_CNTRL=....=MAXDATA=0x200000000  # 2 with 8-zeros
    2c.  Many large customers are providing extra safety to their servers by using 16GB:
export LDR_CNTRL=....=MAXDATA=0x400000000  # 4 with 8-zeros

There is no performance penalty for providing virtual memory allocations larger than required by the application.

 - If the server only uses 2GB of space in 64-bit ... specifying 16GB just provides an upper bound cushion.
    When an unexpected user query causes a sudden memory surge, the extra memory keeps the server running.

3.  The next benefit to 64-bit is that you can provide huge thread stack sizes for
     strange queries that might otherwise crash the server.
    nqsserver uses fast recursive algorithms to traverse complicated control structures.
    This means lots of thread space to hold the stack frames.
    3a. Stack frames mostly contain register values;  64-bit registers are twice as large as 32-bit.
         At a minimum you should quadruple the size of the server thread stacks in NQSConfig.INI
         when migrating from 32- to 64-bit, to prevent a rogue query from crashing the server.
         Allocate more than is normally necessary for safety.
    3b. There is no penalty for allocating more stack size than you need ...
          it is just virtual memory;   no real resources  are consumed until the extra space is needed.
    3c. Increasing thread stack sizes may require the process heap size (MAXDATA) to be increased.
          Heap space is used for dynamic memory requests, and for thread stacks.
          No performance penalty to run with large heap and thread stack sizes.
          In a 32-bit world, this safety would require careful planning to avoid exceeding 2GB usable storage.
     3d. Increasing the number of threads also may require additional heap storage.
          Most thread stack frames on obiee are allocated when the server is started,
          and the real memory usage increases as threads run work.
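
The MAXDATA values above are easier to sanity-check when converted from hex to megabytes.  A quick sketch using shell arithmetic ( the function name maxdata_mb is made up;  it assumes a shell with 64-bit arithmetic, such as bash ):

```shell
# Convert a MAXDATA hex value to megabytes; shell arithmetic accepts 0x... input.
maxdata_mb() {
    echo $(( $1 / 1048576 ))
}

for v in 0x080000000 0xd0000000 0x100000000 0x200000000 0x400000000; do
    printf '%-12s %6d MB\n' "$v" "$(maxdata_mb "$v")"
done
```

The second value comes out as 3328 MB, which is the 3.25GB quoted for the @DSA setting;  the others are 2048, 4096, 8192 and 16384 MB.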

Does 2.8GB sound like a lot of memory for an AIX application server?

- I guess it is what you are accustomed to seeing from "grandpa's applications".
- One of the primary design goals of obiee is to trade memory for services ( db, query caches, etc)
- 2.8GB is still well under the 4GB heap size allocated with MAXDATA=0x100000000
- 2.8GB process size is also possible even on 32-bit Windows applications
- It is not unusual to receive a sudden request for 30MB of contiguous storage on obiee.
- This is not a memory leak;  eventually the nqsserver storage will stabilize, but it may take days to do so.

vmstat is the tool of choice to observe memory usage.  On AIX vmstat will show  something that may be 
startling to some people ... that available free memory ( the 2nd column ) is always  trending toward zero ... no available free memory.  Some customers have concluded that "nearly zero memory free" means it is time to upgrade the server with more real memory.   After the upgrade, the server again shows very little free memory available.

Should you be concerned about this?   Many customers are !!  Here is what is happening:

- AIX filesystems are built on a paging model.  
If you read/write a  filesystem block it is paged into memory ( no read/write system calls )
- This filesystem "page" has its own "backing store" on disk, the original filesystem block.
   When the system needs the real memory page holding the file block, there is no need to "page out".
   The page can be stolen immediately, because the original is still on disk in the filesystem.
- The filesystem  pages tend to collect ... every filesystem block that was ever seen since
   system boot is available in memory.  If another application needs the file block, it is retrieved with no physical I/O.

What happens if the system does need the memory ... to satisfy a 30MB heap
request by nqsserver, for example?

- Since the filesystem blocks have their own backing store ( not on a paging device )
  the kernel can just steal any filesystem block ... on a least-recently-used basis
  to satisfy a new real memory request for "computation pages".

No cause for alarm.   vmstat is accurately displaying whether all filesystem blocks have been touched, and now reside in memory.  

Back to nqsserver:  when should you be worried about its memory footprint?
Answer:  Almost never.   Stop monitoring it ... stop fussing over it ... stop trying to optimize it.
This is a production application, and nqsserver uses the memory it requires to accomplish the job, based on demand.

C'mon ... never worry?   I'm from New York ... worry is what we do best.


Ok, here is the metric you should be watching, using vmstat:

- Are you paging ... there are several columns of vmstat output

bash-2.04$ vmstat 3 3

System configuration: lcpu=4 mem=4096MB


kthr  memory       page                     faults       cpu
----- ------------ ------------------------ ------------ -----------
 r  b    avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 208492  2600   0   0   0   0    0   0  13   45  73  0  0 99  0
 0  0 208492  2600   0   0   0   0    0   0   9   12  77  0  0 99  0
 0  0 208492  2600   0   0   0   0    0   0   9   40  86  0  0 99  0


fre   is the "free memory" indicator that trends toward zero
re    is "re-page".  The kernel steals a real memory page for one process;  it is immediately repaged back to the original process
pi    is "page in".  A process memory page previously paged out is now paged back in because the process needs it
po    is "page out".  A process memory block was paged out, because it was needed by some other process

Light paging activity ( re, pi, po ) is not a cause for worry.   Processes get started, need some memory, go away.

Sustained paging activity is cause for concern.   obiee users are having a terrible day if these counters are always changing.
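
To tell "light" from "sustained" without staring at the screen, you can count how many samples show any page-in or page-out activity.  This is a sketch that assumes the AIX column layout shown above ( pi is the 6th column, po the 7th );  other unix flavors order the vmstat columns differently.  The name count_paging is hypothetical.

```shell
# Hypothetical filter: count vmstat samples that show page-ins or page-outs.
# Assumes the AIX column layout shown above (pi = field 6, po = field 7).
count_paging() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 17 {
             total++
             if ($6 + $7 > 0) busy++
         }
         END { printf "%d of %d samples paging\n", busy, total }'
}

# Usage:
#   vmstat 3 20 | count_paging
```

A handful of busy samples is normal;  a report where most samples are paging is the sustained activity to worry about.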

Hang on ... if nqsserver needs that memory and I reduce MAXDATA to keep the process under control, won't the nqsserver process crash when the memory is needed?


Yes it will.
  It means that nqsserver is configured to require too much memory and there are  lots of options to reduce the real memory requirement.
 - number of threads
 - size of query cache
 - size of sort

But I need nqsserver to keep running.


Real memory is over-committed.  Many things can cause this:
- Running all application processes on a single server
   ... DB server, web servers, WebLogic/WebSphere, sawserver, nqsserver, etc.
  You could move some of those to another host machine and communicate over the network.
  The need for real memory doesn't go away, it's just distributed to other host machines.
- AIX LPAR is configured with too little memory.
  The AIX admin needs to provide more real memory to the LPAR running obiee.
- More memory to this LPAR affects other partitions.
  Then it's time to visit your friendly IBM rep and buy more memory.

Monday Aug 27, 2012

AIX Checklist for stable obiee deployment

Common AIX configuration issues     ( last updated 26 Jun 2013 )

OBIEE is a complicated system with many moving parts and connection points.
The purpose of this article is to provide a checklist to discuss OBIEE deployment with your systems administrators.

The information in this article is time sensitive, and updated as I discover new issues or details and broken URLs.
Apologies for lack of updates.  I just discovered this blog software doesn't work with Internet Explorer.
Last 4 updates were discarded.   -- 2013-06-26

What makes OBIEE different?

When Tech Support suggests AIX component upgrades to a stable, locked-down production AIX environment, it is common to get "push back".  "Why is this necessary?  Why aren't we seeing issues with other software?"

It's a fair question that I have often struggled to answer; here are the talking points:

  • OBIEE is memory intensive.  It is the entire purpose of the software to trade memory for repetitive, more expensive database requests across a network.
  • OBIEE is implemented in C++ and is very dependent on the C++ runtime to behave correctly.
  • OBIEE is aggressively thread efficient;  if atomic operations on a particular architecture do not work correctly, the software crashes.
  • OBIEE dynamically loads third-party database client libraries directly into the nqsserver process.  If the library is not thread-safe, or corrupts process memory the OBIEE crash happens in an unrelated part of the code.  These are extremely difficult bugs to find.
  • OBIEE software uses 99% common source across multiple platforms:  Windows, Linux, AIX, Solaris and HPUX.  If a crash happens on only one platform, we begin to suspect other factors:  load intensity, system differences, configuration choices, hardware failures.

It is rare to have a single product require so many diverse technical skills.   My role in support is to understand system configurations, performance issues, and crashes.   An analyst trained in Business Analytics can't be expected to know AIX internals in the depth required to make configuration choices.  Here are some guidelines.

  1. AIX C++ Runtime must be at  version 12.1.0.1 (was: 11.1.0.4, which still works fine )
    $ lslpp -L | grep xlC.aix
    obiee software will crash if xlC.aix.rte is downlevel;  this is not a "try it" suggestion.
    Aug 2012 version 12.1.0.1  is appropriate for all AIX versions ( 5.3, 6.1, 7.1 )
    Download from here:
    http://www-01.ibm.com/support/docview.wss?uid=swg24033340
    No reboot is necessary to install, it can even be installed while applications are using the current version.
    Restart the apps, and they will pick up the latest version.


  2. AIX 5.3 Technology Level 12 is required when running on Power5,6,7 processors
    AIX 6.1 was introduced with the newer Power chips, and we have seen no issues with 6.1 or 7.1 versions.
    Customers with an unstable deployment, dozens of unexplained crashes, became stable after the TL12 upgrade.
    If your AIX system is 5.3, the minimum TL level should be at or higher than this:
    $ oslevel -s
      5300-12-03-1107
    IBM typically supports only the two latest versions of AIX ( 6.1 and 7.1, for example).  AIX 5.3 is still supported and popular running in an LPAR.

  3. Java runtime should be downloaded from IBM's FixCentral.
    IBM now requires registration and login.
      Search FixCentral [ java ]
      [x] Runtimes for Java

    Fixes are available for Java 4,5,6,7.  OBIEE supports 1.5 and 1.6
    Test your installed version using this command:
      java -version

    Recent fixes from IBM have resolved Java performance and stability issues on AIX.
    I see customers have deployed these versions to repair problems.
      "SR9   FP1" 
       "SR12 FP5"
       "SR13 FP3"  ( a popular version: IBM fix IZ94331 )


  4. obiee userid limits
    $ ulimit -Ha  ( hard limits )
    $ ulimit -a   ( default limits )
    core file size (blocks)     unlimited
    data seg size (kbytes)      unlimited
    file size (blocks)          unlimited
    max memory size (kbytes)    unlimited
    open files                  10240
    cpu time (seconds)          unlimited
    virtual memory (kbytes)     unlimited

    It is best to establish the values in /etc/security/limits
    root user is needed to observe and modify this file.
    If you modify a limit in /etc/security/limits , you will need to log in again to have the change take effect.  For example,
    $ ulimit -c 0
    $ ulimit -c 2097151
    cannot modify limit: Operation not permitted
    $ ulimit -c unlimited
    $ ulimit -c
    0

    There are only two meaningful values for core files ( ulimit -c ) ; zero or unlimited.
    Anything else is likely to produce a truncated core file that cannot be analyzed.
    Lack of filesystem space and file size limit ( ulimit -f ) may also produce truncated cores.

  5. Deploy 32-bit or 64-bit ?
    Early versions of OBIEE offered 32-bit or 64-bit choice to AIX customers.
    The 32-bit choice was needed if a database vendor did not supply a 64-bit client library.
    That's no longer an issue and beginning with OBIEE 11, 32-bit code is no longer shipped.

    A common error that leads to "out of memory" conditions is to accept the 32-bit memory configuration choices on 64-bit deployments.  The significant configuration choices are:
    • Maximum process data (heap) size is in an AIX environment variable
      LDR_CNTRL=IGNOREUNLOAD@LOADPUBLIC@PREREAD_SHLIB@MAXDATA=0x...
    • Two thread stack sizes are made in obiee NQSConfig.INI
      [ SERVER ]
      SERVER_THREAD_STACK_SIZE = 0;
      DB_GATEWAY_THREAD_STACK_SIZE = 0;
    • Sort memory in NQSConfig.INI
      [ GENERAL ]
      SORT_MEMORY_SIZE = 4 MB ;
      SORT_BUFFER_INCREMENT_SIZE = 256 KB ;


    Choosing a value for MAXDATA:
    0x080000000  2GB Default maximum 32-bit heap size  ( 8 with 7 zeros )
    0x100000000  4GB 64-bit effectively same as 32-bit ( 1 with 8 zeros )
    0x200000000  8GB 64-bit quadruple the 32-bit maximum
    0x400000000 16GB 64-bit extra memory for safety ( could cause large core files)
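
    The hex values are easy to mistype, so it can help to compute them.  A small sketch, assuming a shell with 64-bit arithmetic ( bash or ksh93 );  maxdata_for_gb is an illustrative name only.

```shell
# Print the MAXDATA hex value for a heap size given in whole gigabytes.
# maxdata_for_gb is a hypothetical helper; it needs 64-bit shell arithmetic.
maxdata_for_gb() {
    printf '0x%X\n' $(( $1 * 1024 * 1024 * 1024 ))
}

# Regenerate the sizes discussed in the text.
for gb in 2 4 8 16; do
    printf '%2dGB  %s\n' "$gb" "$(maxdata_for_gb "$gb")"
done
```

    The 2GB result prints without the leading zero shown in the table;  it is the same value.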


    Using 2GB heap size for a 64-bit process will almost certainly lead to an out-of-memory situation.
    Registers are twice as big, so register-sized values consume twice as much memory in the heap.
    Upgrading to a 4GB heap for a 64-bit process is just "breaking even" with 32-bit.

    A 32-bit process is constrained by the 32-bit virtual addressing limits.  Heap memory is used for dynamic requirements of obiee software, thread stacks for each of the configured threads, and sometimes for shared libraries.

    64-bit processes are not constrained in this way;  extra heap space can be configured for safety against a query that might create a sudden requirement for excessive storage.  If the storage is not available, this query might crash the whole server and disrupt existing users.

    The MAXDATA settings on obiee10 are changed in two files in the ./setup directory:
    .variant.sh and systunesrv.sh

    There is no performance penalty on AIX for configuring more memory than required;  extra memory can be configured for safety.  If there are no other considerations, start with 8GB.


    Choosing a value for Thread Stack size:
    Zero is the value documented to select an appropriate default for thread stack size.  My preference is to change this to an absolute value, even if you intend to use the documented default;  it provides better documentation and removes the "surprise" factor.

    There are two thread types that can be configured.
    • GATEWAY is used by a thread pool to call a database client library to establish a DB connection.
      The default size is 256KB;  many customers raise this to 512KB ( no performance penalty for over-configuring ).
      This value must be set to 1 MB if Teradata connections are used.
    • SERVER threads are used to run queries.  OBIEE uses recursive algorithms during the analysis of query structures which can consume significant thread stack storage.  It's difficult to provide guidance on a value that depends on data and complexity.  The general notion is to provide more space than you think you need,  "double down" and increase the value if you run out, otherwise inspect the query to understand why it is too complex for the thread stack.  There are protections built into the software to abort a single user query that is too complex, but the algorithms don't cover all situations.
      256 KB  The default 32-bit stack size.  Many customers increased this to 512KB on 32-bit.  A 64-bit server is very likely to crash with this value;  the stack contains mostly register values, which are twice as big.
      512 KB  The documented 64-bit default.  Some early releases of obiee didn't set this correctly, resulting in 256KB stacks.
      1 MB  The recommended 64-bit setting.  If your system only ever uses 512KB of stack space, there is no performance penalty for using 1MB stack size.
      2 MB  Many large customers use this value for safety.  No performance penalty.
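
      To find stack parameters that still rely on the implicit default, a grep is enough.  This is a sketch only:  zero_stack_params is a made-up name, and the pattern assumes the "NAME = 0;" layout shown earlier.

```shell
# List thread stack parameters in NQSConfig.INI that are still set to 0.
# zero_stack_params is a hypothetical helper; pass the path to your NQSConfig.INI.
zero_stack_params() {
    grep -E '^[[:space:]]*(SERVER|DB_GATEWAY)_THREAD_STACK_SIZE[[:space:]]*=[[:space:]]*0[[:space:]]*;' "$1"
}
```

      Any line it prints is a candidate for an explicit value, such as 1 MB for SERVER threads on 64-bit.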

      nqscheduler does not use the NQSConfig.INI file to set thread stack size.
      If this process crashes because the thread stack is too small, use this to set 2MB:
      export OBI_BACKGROUND_STACK_SIZE=2048

  6. Shared libraries are not (shared)
    1. When application libraries are loaded at run-time, AIX makes a decision on whether to load the libraries in a "public" memory segment.  If the filesystem library permissions do not have the "Read-Other" permission bit, AIX loads the library into private process memory with two significant side-effects:
      * The libraries reduce the heap storage available.  
          Might be significant in 32-bit processes;  irrelevant in 64-bit processes.
      * Library code is loaded into multiple real pages for execution;  one copy for each process.
      Multiple execution images are a significant issue for both 32- and 64-bit processes.

      The "real memory pages" saved by using public memory segments is a minor concern.  Today's machines typically have plenty of real memory.
      The real problem with private copies of libraries is that they consume processor cache blocks, which are limited.   The same library instructions executing in different real pages will cause memory delays as the i-cache ( instruction cache 128KB blocks) is refreshed from real memory.   Performance loss because instructions are delayed is something that is difficult to measure without access to low-level cache fault data.   The machine just appears to be running slowly for no observable reason.

      This is an easy problem to detect, and an easy problem to correct.

      Detection:  The "genld -l" AIX command produces a list of the libraries used by each process and the AIX memory address where they are loaded.
      32-bit public segment is 13 ( "dxxxxxxx" ).   private segments are 2-a.
      64-bit public segment is 9 ( "9xxxxxxxxxxxxxxx") ; private segment is 8.

      genld -l | grep -E -v ' d| 9' | sort +2

      provides a list of privately loaded libraries. 

      Repair: chmod o+r <libname>
      AIX shared libraries will have a suffix of ".so" or ".a".
      Another technique is to change all libraries in a selected directory to repair those that might not be currently loaded.   The usual directories that need repair are obiee code, httpd code and plugins, database client libraries and java.
      chmod o+r /shr/dir/*.a /shr/dir/*.so
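
      Before changing permissions on a whole directory, you can list only the libraries that actually lack the Read-Other bit.  A minimal sketch using POSIX find;  libs_missing_read_other is an illustrative name and /shr/dir is the placeholder directory from above.

```shell
# List shared libraries ( *.a, *.so ) under a directory that lack Read-Other.
# "! -perm -004" selects files where the other-read bit is NOT set.
# libs_missing_read_other is a hypothetical helper name.
libs_missing_read_other() {
    find "$1" -type f \( -name '*.a' -o -name '*.so' \) ! -perm -004 -print
}
```

      Review the output of "libs_missing_read_other /shr/dir" first, then apply chmod o+r to the files listed.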

  7. Configure your system for diagnostics
    Production systems shouldn't crash, and yet bad things happen to good software.
    If obiee software crashes and produces a core, you should configure your system for reliable transfer of the failing conditions to Oracle Tech Support.  Here's what we need to be able to diagnose a core file from your system.
    * fullcore enabled. chdev -lsys0 -a fullcore=true
    * core naming enabled. chcore -n on -d
    * ulimit must not truncate core. see item 3.
    * pstack.sh is used to capture core documentation.
    * obidoc is used to capture current AIX configuration.
    * snapcore  AIX utility captures core and libraries. Use the proper syntax.
     $ snapcore -r corename executable-fullpath
       /tmp/snapcore will contain the .pax.Z output file.  It is compressed.
    * If cores are directed to a common directory, ensure obiee userid can write to the directory.  ( chcore -p /cores -d ; chmod 777 /cores )
    The filesystem must have sufficient space to hold a crashing obiee application.
    Use:  df -k
      Check the "Free" column ( not "% Used" )
      8388608 is 8GB.
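
    The free-space check can be scripted.  This sketch assumes df accepts the POSIX -P flag, which prints available space in the fourth column;  has_free_kb is a made-up name, and the 8GB figure is the one used above.

```shell
# Succeed if the filesystem holding a directory has at least the required free KB.
# df -P requests POSIX output; column 4 of the data line is available space in KB.
# has_free_kb is a hypothetical helper name.
has_free_kb() {
    dir=$1
    need_kb=$2
    free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge "$need_kb" ]
}

# 8 GB in 1 KB blocks: 8 * 1024 * 1024 = 8388608
if has_free_kb . 8388608; then
    echo "enough space for an 8GB core"
else
    echo "less than 8GB free"
fi
```

    Replace "." with the directory where cores are written.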

  8. Disable Oracle Client Library signal handling
    The Oracle DB Client Library is frequently distributed with the sqlplus development kit.
    By default, the library enables a signal handler, which will document a call stack if the application crashes.   The signal handler is not needed, and is definitely disruptive to obiee diagnostics.   It needs to be disabled.   sqlnet.ora is typically located at:
       $ORACLE_HOME/network/admin/sqlnet.ora
    Add this line at the top of the file:
       DIAG_SIGHANDLER_ENABLED=FALSE
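
    If you manage several client installs, the edit can be scripted.  A sketch only:  disable_sighandler is a made-up name, and it leaves the file alone when the parameter is already present so you can review the existing value by hand.

```shell
# Put DIAG_SIGHANDLER_ENABLED=FALSE at the top of a sqlnet.ora file,
# unless the parameter already appears in it ( any value, any case ).
# disable_sighandler is a hypothetical helper name.
disable_sighandler() {
    file=$1
    if grep -qi '^[[:space:]]*DIAG_SIGHANDLER_ENABLED' "$file"; then
        return 0    # already present; review the value manually
    fi
    tmp="$file.tmp.$$"
    { echo 'DIAG_SIGHANDLER_ENABLED=FALSE'; cat "$file"; } > "$tmp" || return 1
    # Copy back through redirection so the original owner and permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

    Usage:  disable_sighandler $ORACLE_HOME/network/admin/sqlnet.ora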

  9. Disable async query in the RPD connection pool.
    This might be an obiee 10.1.3.4 issue only ( still checking  ).
    "async query" must be disabled in the connection pools.
    It was designed to enable query cancellation to a database, and turned out to have too many edge conditions in normal communication that produced random corruption of data and crashes.  Please ensure it is turned off in the RPD.

  10. Check AIX error report (errpt).
    Errors external to obiee applications can trigger crashes.
     $ /bin/errpt -a
    Hardware errors ( firmware, adapters, disks ) should be reported to IBM support.
    All application core files are recorded by AIX;  the most recent ones are listed first.

  11. Capture pstack output for the most recent crash
    $ errpt -A |grep core |head -1 |xargs pstack.sh
    produces a core*.pstack file in directory set by $obiCollect

  12. Reserved for something important to say.

Wednesday Aug 22, 2012

Active File Sparsing

Core files are often sparse files.  This article demonstrates that a core file written by HP-UX consumes filesystem space equivalent to the physical size of the core file.   'pax' and 'gzip' demonstrate the core file is highly compressible.

pax actively sparses a file extracted from an archive.  This (mostly zeros) core file appears to be 7200 times smaller with respect to the amount of filesystem space used.

A small demonstration program shows how to write sparse files from an application.

Two additional uses for the pax utility.

[Read More]

Thursday Jul 19, 2012

HPUX

Collection of tools to gather information for Tech Support for HPUX

pstack.sh Capture call stacks for a core file ( updated 2012-08-20).

Under Construction.

[Read More]

Friday Jun 22, 2012

Browser Alert -- cannot download links using Internet Explorer

Internet Explorer ( ie8, ie9 ) is mangling downloads from this blog.

(2013-03-15:  Internet Explorer 10 works fine for downloading;  it doesn't work to upgrade the Apache Roller Blogs.)

Links to files on this blog ( eg., dirstats ) are typically downloaded using browser:
R-click, SaveAs

This works fine on Chrome, Firefox and Safari.  Internet Explorer is not handling the html reference to the file, and adds .html to the filename.   The file will be saved in an incorrect format.   Relatively harmless for a script file that is plain text, but binary files like obiaix.tar.gz , will be corrupted, and there is nothing you can do about it.

"Don't get corrupted, get rid of cable  Internet Explorer, use firefox"  ( sorry, US TV advert reference )

The useful part of the compressed tar file is that you don't have to worry about Windows line-end characters corrupting the scripts, and you don't have to change execution permissions to get the scripts to work.  A script downloaded on its own may need both repairs:
dos2unix dirstats
chmod +x dirstats

Tuesday Jun 05, 2012

Data Mining Email with Thunderbird

Description of a technique to efficiently search email content using Thunderbird and organize the results.

[Read More]

Monday Jun 04, 2012

obiee 10g teradata 13.10 deployment on Solaris 64-bit

A work in progress.

  • New system verification script for Teradata on Solaris. ( 2012-07-01 )
    obi_td.sh  collects Teradata installation details;  version numbers, symbolic links, etc.
    Upload the results to Oracle Technical Support.


    Completed:
  • odbc environment
  • nqsserver files used to integrate Teradata runtime 
  • Teradata simplified client install
  • Useful shell aliases and environment variables
  • example Solaris LD_LIBRARY_PATH
  • 2012-06-28:  detail added for Teradata Installation.

To Do:

  • Additional obiee parameter tuning ( performance and stability )
  • Testing Teradata connections, graciously shared by Teradata Support

[Read More]

Thursday May 31, 2012

Working with Windows and Unix

Common problems exchanging files between Unix and Windows.
* Use "binary" mode when transferring files
* dos2unix utility to repair scripts transferred to unix in ascii mode

Note: There is no repair for binary files uploaded with ftp using ascii mode.
corefiles, archives ( tar, pax ) will be corrupted.

[Read More]

Sunday May 27, 2012

Minimalist

Small software:  description of iterations on the unpackDbx.sh script to extract dbx from full Studio 12.3 distribution.

References:
unpackDbx.sh  -- Final script to extract dbx
dirstats -- metrics on resulting directory (size, subdirs, links, files)

[Read More]

Thursday May 24, 2012

Solaris

Solaris Summary:  obisolaris.tar.gz contains these scripts;

  • obimon - monitor memory and connection growth.
  • pcore.sh - miniature version of pstack.sh for Solaris, if Studio12.3 dbx is not installed.
  • packcore.sh - bundle core file and libraries for Oracle Tech Support
  • pstack.sh  - uses dbx to obtain call stacks from a core file.

Oracle Solaris Studio 12.3 download: extract/install dbx using unpackDbx.sh

Solaris Core Naming (root command): 
  coreadm -i core.%f.%p -e log -e global

[Read More]

Linux

Getting started with Linux technical support

Browser Alert: Use Firefox, Chrome, Safari or IE10 to download files in links below (R-Click, Save-As ).  Previous Internet Explorer versions do not work for download ( and IE10 does not work to edit Roller Blogs;  edited with Chrome 3/15/2013 ).

  • obilinux.tar.gz (2013-07-10) - Archive collection of files listed below
  • obidoc (v2.5) - collects critical Linux system options needed by Tech Support
  • obimon (v2.5) - monitor memory and connection growth.
  • netstat.sh (v2.2) - monitor network changes with timestamped output
  • vmstat.sh (v2.5) - monitor system memory usage and paging with timestamped output.
  • strace.sh (v2.5) - monitor system calls from an active process.
  • pstack.sh (v2.4) - If you have a core file, this script will document the information Tech Support needs.
  • packcore.sh (v2.1) - bundle core file and libraries for Oracle Tech Support. pstack.sh output preferred.
  • dos2unix Converts ascii file line-end from Windows to Unix.  This is an improved version over the Linux supplied utility; it only removes line-end chars and retains the original owner and timestamp.

Notes:

  • /bin/ksh is unreliable. Scripts use #!/bin/bash
    Symptom:   "bad interpreter: "
  • obidoc should be run from the userid that launches obiee servers. 
    It expects to find (whoami, id, uname, uptime, host, hostname, uname, nohup, w, who, vmstat, ps, ifconfig, netstat, df, rpm, egrep, sort, printf, which) in standard directories: 
    /bin:/usr/bin:/sbin
    It will use lsof if it finds it in $PATH.
  • pstack.sh expects to find ( rpm, gdb, grep, egrep,  sort, head, tail, find, which, basename, strings, uname, whoami, sed ).  Output is written to $obiCollect or the current directory.
  • obimon should be run from the userid that owns the process to be monitored.  It needs permissions to access the /proc filesystem.   It needs permission to write to the current directory, or to the directory identified with environment variable $obiCollect.

2013-07-10: obilinux.tar.gz refreshed
2013-04-12: pstack.sh, obimon, new strace.sh
2013-03-04: Revise pstack.sh algorithm to discover executable fullpath from a core file.
2013-01-29: Link only to obilinux.tar.gz;  scripts have been updated to v2.1 to repair pstack.sh, packcore.sh, obidoc; v2.2 obidoc

About

Dick Dunbar
is an escalation engineer working in the Customer Engineering & Advocacy Lab (CEAL team)
for Oracle Analytics and Performance Management.
I live and work in Santa Cruz, California.
I'll share the techniques I use to detect, avoid and repair problems.
