Tuesday Aug 28, 2012

obiee memory usage

Heap memory is a frequent customer topic.

This is a quick refresher, oriented towards AIX, but the principles apply to other unix implementations.

Another aspect of system memory consumption on unix systems is described here.

http://www.linuxatemyram.com


1. 32-bit processes have a maximum addressability of 4GB, and a usable application heap size of 2-3 GB. 
On AIX it is controlled by an environment variable:
export LDR_CNTRL=...@MAXDATA=0x080000000   # 2GB ( the leading zero (0x08) is deliberate, not required )
   1a. It is possible to get a 3.25GB heap size for a 32-bit process using @DSA (Discontiguous Segment Allocation):
    export LDR_CNTRL=MAXDATA=0xd0000000@DSA  # 3.25 GB, 32-bit only
        One side-effect of using AIX segments "c" and "d" is that shared libraries will be loaded privately, and not shared.
        If you need the additional heap space, this is worth the trade-off.  This option is frequently used for 32-bit java.
   1b. 64-bit processes have no need for the @DSA option.

2. 64-bit processes can double the 32-bit heap size to 4GB using:
export LDR_CNTRL=...@MAXDATA=0x100000000  # 1 with 8 zeros
    2a. But this setting would place the same memory limitations on obiee as on a 32-bit process.
    2b. The major benefit of 64-bit is to break the bounds of 32-bit addressing.  At a minimum, use 8GB:
export LDR_CNTRL=...@MAXDATA=0x200000000  # 2 with 8 zeros
    2c. Many large customers are providing extra safety to their servers by using 16GB:
export LDR_CNTRL=...@MAXDATA=0x400000000  # 4 with 8 zeros
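A miscounted zero in a MAXDATA value silently changes the heap limit by a factor of 16, so it is worth letting the shell check the arithmetic.  A minimal sketch that converts each of the values above to GB:

```shell
# Convert each candidate MAXDATA hex value to GB using shell arithmetic.
for h in 0x080000000 0x100000000 0x200000000 0x400000000; do
  printf '%s = %2d GB\n' "$h" $(( h / 1024 / 1024 / 1024 ))
done
```

The loop prints 2, 4, 8 and 16 GB for the four settings above.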

There is no performance penalty for providing virtual memory allocations larger than required by the application.

 - If the server only uses 2GB of space in 64-bit ... specifying 16GB just provides an upper bound cushion.
    When an unexpected user query causes a sudden memory surge, the extra memory keeps the server running.

3.  The next benefit of 64-bit is that you can provide huge thread stack sizes for strange queries that might otherwise crash the server.
    nqsserver uses fast recursive algorithms to traverse complicated control structures.
    This means lots of thread stack space is needed to hold the stack frames.
    3a. Stack frames mostly contain register values;  64-bit registers are twice as large as 32-bit.
         At a minimum you should quadruple the size of the server thread stacks in NQSConfig.INI
         when migrating from 32-bit to 64-bit, to prevent a rogue query from crashing the server.
        Allocate more than is normally necessary, for safety.
    3b. There is no penalty for allocating more stack size than you need ...
          it is just virtual memory;   no real resources  are consumed until the extra space is needed.
    3c. Increasing thread stack sizes may require the process heap size (MAXDATA) to be increased.
          Heap space is used for dynamic memory requests, and for thread stacks.
          No performance penalty to run with large heap and thread stack sizes.
          In a 32-bit world, this safety would require careful planning to avoid exceeding 2GB usable storage.
     3d. Increasing the number of threads also may require additional heap storage.
          Most thread stack frames on obiee are allocated when the server is started,
          and the real memory usage increases as threads run work.
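Putting 3a-3d together, here is a sketch of the relevant NQSConfig.INI entries for a 64-bit server.  The values are illustrative starting points, not tuned recommendations for any particular workload; zero selects the documented defaults.

```ini
[ SERVER ]
SERVER_THREAD_STACK_SIZE = 1 MB ;
DB_GATEWAY_THREAD_STACK_SIZE = 512 KB ;
```

Pair these with a MAXDATA heap large enough to hold the configured number of threads times the stack size.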

Does 2.8GB sound like a lot of memory for an AIX application server?

- I guess it is what you are accustomed to seeing from "grandpa's applications".
- One of the primary design goals of obiee is to trade memory for services ( db, query caches, etc)
- 2.8GB is still well under the 4GB heap size allocated with MAXDATA=0x100000000
- 2.8GB process size is also possible even on 32-bit Windows applications
- It is not unusual to receive a sudden request for 30MB of contiguous storage on obiee.
- This is not a memory leak;  eventually the nqsserver storage will stabilize, but it may take days to do so.

vmstat is the tool of choice to observe memory usage.  On AIX, vmstat will show something that may be startling to some people:  available free memory ( the 2nd column ) is always trending toward zero ... no available free memory.

Some customers have concluded that "nearly zero memory free" means it is time to upgrade the server with more real memory.  After the upgrade, the server again shows very little free memory available.

Should you be concerned about this?   Many customers are !!  Here is what is happening:

- AIX filesystems are built on a paging model.  
If you read/write a  filesystem block it is paged into memory ( no read/write system calls )
- This filesystem "page" has its own "backing store" on disk, the original filesystem block.
   When the system needs the real memory page holding the file block, there is no need to "page out".
   The page can be stolen immediately, because the original is still on disk in the filesystem.
- The filesystem  pages tend to collect ... every filesystem block that was ever seen since
   system boot is available in memory.  If another application needs the file block, it is retrieved with no physical I/O.

What happens if the system does need the memory ... to satisfy a 30MB heap
request by nqsserver, for example?

- Since the filesystem blocks have their own backing store ( not on a paging device ),
  the kernel can just steal any filesystem block ... on a least-recently-used basis ...
  to satisfy a new real memory request for "computation pages".
No cause for alarm.   vmstat is accurately showing that all the filesystem blocks touched since boot now reside in memory.

Back to nqsserver:  when should you be worried about its memory footprint?
Answer:  Almost never.   Stop monitoring it ... stop fussing over it ... stop trying to optimize it.
This is a production application, and nqsserver uses the memory it requires to accomplish the job, based on demand.

C'mon ... never worry?   I'm from New York ... worry is what we do best.


Ok, here is the metric you should be watching, using vmstat:

- Are you paging ... there are several columns of vmstat output

bash-2.04$ vmstat 3 3

System configuration: lcpu=4 mem=4096MB


kthr    memory              page                     faults        cpu
----- ------------ ------------------------ ------------ -----------
 r  b    avm fre re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 0  0 208492  2600   0   0   0   0    0   0  13   45  73  0  0 99  0
 0  0 208492  2600   0   0   0   0    0   0   9   12  77  0  0 99  0
 0  0 208492  2600   0   0   0   0    0   0   9   40  86  0  0 99  0


fre  is the "free memory" indicator that trends toward zero.
re   is "re-page":  the kernel steals a real memory page from one process, and immediately re-pages it back to the original process.
pi   is "page in":  a process memory page previously paged out, now paged back in because the process needs it.
po   is "page out":  a process memory page was paged out because the real page was needed by some other process.

Light paging activity ( re, pi, po ) is not a cause for concern.   Processes get started, need some memory, go away.

Sustained paging activity is cause for concern.   obiee users are having a terrible day if these counters are always changing.
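A small filter makes sustained paging easy to spot without staring at the numbers.  A sketch, assuming the AIX vmstat layout shown above, where pi and po are the 6th and 7th fields of each data line:

```shell
# flag_paging: read vmstat output on stdin; print only the intervals
# where page-ins (field 6) or page-outs (field 7) are nonzero.
flag_paging() {
  awk '$1 ~ /^[0-9]+$/ && NF >= 16 && $6 + $7 > 0 { print "paging:", $0 }'
}

# Typical use on a live system:
#   vmstat 3 20 | flag_paging
```

If the filter prints on most intervals, the paging is sustained and real memory is over-committed.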

Hang on ... if nqsserver needs that memory and I reduce MAXDATA to keep the process under control, won't the nqsserver process crash when the memory is needed?


Yes, it will.
  It means that nqsserver is configured to require too much memory, and there are lots of options to reduce the real memory requirement:
 - number of threads
 - size of query cache
 - size of sort memory

But I need nqsserver to keep running.


Real memory is over-committed.   Many things can cause this:
- Running all application processes on a single server
   ... DB server, web servers, WebLogic/WebSphere, sawserver, nqsserver, etc.
  You could move some of those to another host machine and communicate over the network.
  The need for real memory doesn't go away; it's just distributed to other host machines.
- The AIX LPAR is configured with too little memory.
  The AIX admin needs to provide more real memory to the LPAR running obiee.
- More memory for this LPAR affects other partitions.
  Then it's time to visit your friendly IBM rep and buy more memory.

Monday Aug 27, 2012

AIX Checklist for stable obiee deployment

Common AIX configuration issues     ( last updated 26 Jun 2013 )

OBIEE is a complicated system with many moving parts and connection points.
The purpose of this article is to provide a checklist to discuss OBIEE deployment with your systems administrators.

The information in this article is time sensitive, and is updated as I discover new issues, details, or broken URLs.
Apologies for the lack of updates; I just discovered this blog software doesn't work with Internet Explorer.
The last 4 updates were discarded.   -- 2013-06-26

What makes OBIEE different?

When Tech Support suggests AIX component upgrades to a stable, locked-down production AIX environment, it is common to get "push back".  "Why is this necessary?  Why aren't we seeing issues with other software?"

It's a fair question that I have often struggled to answer; here are the talking points:

  • OBIEE is memory intensive.  It is the entire purpose of the software to trade memory for repetitive, more expensive database requests across a network.
  • OBIEE is implemented in C++ and is very dependent on the C++ runtime to behave correctly.
  • OBIEE is aggressively thread efficient;  if atomic operations on a particular architecture do not work correctly, the software crashes.
  • OBIEE dynamically loads third-party database client libraries directly into the nqsserver process.  If the library is not thread-safe, or corrupts process memory the OBIEE crash happens in an unrelated part of the code.  These are extremely difficult bugs to find.
  • OBIEE software uses 99% common source across multiple platforms:  Windows, Linux, AIX, Solaris and HPUX.  If a crash happens on only one platform, we begin to suspect other factors:  load intensity, system differences, configuration choices, hardware failures. 

It is rare to have a single product require so many diverse technical skills.   My role in support is to understand system configurations, performance issues, and crashes.   An analyst trained in Business Analytics can't be expected to know AIX internals in the depth required to make configuration choices.  Here are some guidelines.

  1. AIX C++ Runtime must be at  version 12.1.0.1 (was: 11.1.0.4, which still works fine )
    $ lslpp -L | grep xlC.aix
    obiee software will crash if xlC.aix.rte is downlevel;  this is not a "try it" suggestion.
    Aug 2012 version 12.1.0.1  is appropriate for all AIX versions ( 5.3, 6.1, 7.1 )
    Download from here:
    http://www-01.ibm.com/support/docview.wss?uid=swg24033340
    No reboot is necessary to install, it can even be installed while applications are using the current version.
    Restart the apps, and they will pick up the latest version.
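The runtime check can be scripted so it doesn't rely on eyeballing the lslpp output.  A hedged sketch: the helper name is mine, and the awk assumes the usual fileset-name / level column layout of `lslpp -L`:

```shell
# check_xlc_level: read `lslpp -L` output on stdin; fail if the
# xlC.aix*.rte fileset is missing or below the required level.
check_xlc_level() {
  req=${1:-12.1.0.1}
  awk -v req="$req" '
    $1 ~ /^xlC\.aix.*\.rte$/ {
      found = 1
      n = split($2, have, "."); split(req, want, ".")
      for (i = 1; i <= n; i++) {
        if (have[i] + 0 < want[i] + 0) { print "downlevel:", $1, $2; exit 1 }
        if (have[i] + 0 > want[i] + 0) break
      }
    }
    END { if (!found) { print "xlC.aix.rte not found"; exit 1 } }'
}

# Typical use:  lslpp -L | check_xlc_level
```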


  2. AIX 5.3 Technology Level 12 is required when running on Power5,6,7 processors
    AIX 6.1 was introduced with the newer Power chips, and we have seen no issues with 6.1 or 7.1 versions.
    Customers with unstable deployments and dozens of unexplained crashes became stable after the TL12 upgrade.
    If your AIX system is 5.3, the minimum TL level should be at or higher than this:
    $ oslevel -s
      5300-12-03-1107
    IBM typically supports only the two latest versions of AIX ( 6.1 and 7.1, for example).  AIX 5.3 is still supported and popular running in an LPAR.
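The TL comparison can be automated as well.  A sketch, assuming the usual `oslevel -s` format VRMF-TL-SP-build; the helper name is mine:

```shell
# tl_ok: given an `oslevel -s` string such as 5300-12-03-1107, succeed
# when a 5.3 system is at Technology Level 12 or later.  6.1 and 7.1
# systems pass unconditionally, per the note above.
tl_ok() {
  case "$1" in
    5300-*) tl=${1#5300-}; tl=${tl%%-*}; [ "$tl" -ge 12 ] ;;
    *)      return 0 ;;
  esac
}

# Typical use:  tl_ok "$(oslevel -s)" || echo "AIX 5.3 below TL12: upgrade"
```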

  3. Java runtime should be downloaded from IBM's FixCentral.
    IBM now requires registration and login.
      Search FixCentral [ java ]
      [x] Runtimes for Java

    Fixes are available for Java 4,5,6,7.  OBIEE supports 1.5 and 1.6
    Test your installed version using this command:
      java -version

    Recent fixes from IBM have resolved Java performance and stability issues on AIX.
    I see customers have deployed these versions to repair problems.
      "SR9   FP1" 
       "SR12 FP5"
       "SR13 FP3"  ( a popular version: IBM fix IZ94331 )


  4. obiee userid limits
    $ ulimit -Ha  ( hard limits )
    $ ulimit -a   ( default limits )
    core file size (blocks)     unlimited
    data seg size (kbytes)      unlimited
    file size (blocks)          unlimited
    max memory size (kbytes)    unlimited
    open files                  10240
    cpu time (seconds)          unlimited
    virtual memory (kbytes)     unlimited

    It is best to establish the values in /etc/security/limits
    root user is needed to observe and modify this file.
    If you modify a limit in /etc/security/limits , you will need to log in again for the change to take effect.  For example,
    $ ulimit -c 0
    $ ulimit -c 2097151
    cannot modify limit: Operation not permitted
    $ ulimit -c unlimited
    $ ulimit -c
    0

    There are only two meaningful values for core files ( ulimit -c ) ; zero or unlimited.
    Anything else is likely to produce a truncated core file that cannot be analyzed.
    Lack of filesystem space and file size limit ( ulimit -f ) may also produce truncated cores.
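That zero-or-unlimited rule is easy to enforce in a startup script.  A tiny sketch; the helper name is mine:

```shell
# core_ulimit_ok VALUE: the only two meaningful core-file limits are
# 0 and unlimited; anything else risks a truncated, unanalyzable core.
core_ulimit_ok() {
  [ "$1" = "0" ] || [ "$1" = "unlimited" ]
}

# Typical use before starting the server:
#   core_ulimit_ok "$(ulimit -c)" || echo "core ulimit will truncate cores"
```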

  5. Deploy 32-bit or 64-bit ?
    Early versions of OBIEE offered 32-bit or 64-bit choice to AIX customers.
    The 32-bit choice was needed if a database vendor did not supply a 64-bit client library.
    That's no longer an issue and beginning with OBIEE 11, 32-bit code is no longer shipped.

    A common error that leads to "out of memory" conditions is to accept the 32-bit memory configuration choices on 64-bit deployments.  The significant configuration choices are:
    • Maximum process data (heap) size is in an AIX environment variable
      LDR_CNTRL=IGNOREUNLOAD@LOADPUBLIC@PREREAD_SHLIB@MAXDATA=0x...
    • Two thread stack sizes are made in obiee NQSConfig.INI
      [ SERVER ]
      SERVER_THREAD_STACK_SIZE = 0;
      DB_GATEWAY_THREAD_STACK_SIZE = 0;
    • Sort memory in NQSConfig.INI
      [ GENERAL ]
      SORT_MEMORY_SIZE = 4 MB ;
      SORT_BUFFER_INCREMENT_SIZE = 256 KB ;


    Choosing a value for MAXDATA:
    0x080000000  2GB Default maximum 32-bit heap size  ( 8 with 7 zeros )
    0x100000000  4GB 64-bit effectively same as 32-bit ( 1 with 8 zeros )
    0x200000000  8GB 64-bit quadruple the 32-bit max for 64-bit
    0x400000000 16GB 64-bit extra memory for safety ( could cause large core files)


    Using 2GB heap size for a 64-bit process will almost certainly lead to an out-of-memory situation.
    Registers are twice as big ... consume twice as much memory in the heap.
    Upgrading to a 4GB heap for a 64-bit process is just "breaking even" with 32-bit.

    A 32-bit process is constrained by the 32-bit virtual addressing limits.  Heap memory is used for dynamic requirements of obiee software, thread stacks for each of the configured threads, and sometimes for shared libraries.

    64-bit processes are not constrained in this way;  extra heap space can be configured for safety against a query that might create a sudden requirement for excessive storage.  If the storage is not available, this query might crash the whole server and disrupt existing users.

    The MAXDATA settings on obiee10 are changed in the ./setup directory files:
    .variant.sh and systunesrv.sh

    There is no performance penalty on AIX for configuring more memory than required;  extra memory can be configured for safety.  If there are no other considerations, start with 8GB.


    Choosing a value for Thread Stack size:
    zero is the value documented to select an appropriate default for thread stack size.  My preference is to change this to an absolute value, even if you intend to use the documented default;  it provides better documentation and removes the "surprise" factor.

    There are two thread types that can be configured.
    • GATEWAY is used by a thread pool to call a database client library to establish a DB connection.
      The default size is 256KB;  many customers raise this to 512KB ( no performance penalty for over-configuring ).
      This value must be set to 1 MB if Teradata connections are used.
    • SERVER threads are used to run queries.  OBIEE uses recursive algorithms during the analysis of query structures which can consume significant thread stack storage.  It's difficult to provide guidance on a value that depends on data and complexity.  The general notion is to provide more space than you think you need,  "double down" and increase the value if you run out, otherwise inspect the query to understand why it is too complex for the thread stack.  There are protections built into the software to abort a single user query that is too complex, but the algorithms don't cover all situations.
      256 KB  The default 32-bit stack size.  Many customers increased this to 512KB on 32-bit.  A 64-bit server is very likely to crash with this value;  the stack contains mostly register values, which are twice as big.
      512 KB  The documented 64-bit default.  Some early releases of obiee didn't set this correctly, resulting in 256KB stacks.
      1 MB  The recommended 64-bit setting.  If your system only ever uses 512KB of stack space, there is no performance penalty for using 1MB stack size.
      2 MB  Many large customers use this value for safety.  No performance penalty.

      nqscheduler does not use the NQSConfig.INI file to set thread stack size.
      If this process crashes because the thread stack is too small, use this to set 2MB:
      export OBI_BACKGROUND_STACK_SIZE=2048

  6. Shared libraries are not (shared)
    1. When application libraries are loaded at run-time, AIX makes a decision on whether to load the libraries in a "public" memory segment.  If the filesystem library permissions do not have the "Read-Other" permission bit, AIX loads the library into private process memory with two significant side-effects:
      * The libraries reduce the heap storage available.
          This might be significant in 32-bit processes;  it is irrelevant in 64-bit processes.
      * Library code is loaded into multiple real pages for execution;  one copy for each process.
          Multiple execution images are a significant issue for both 32- and 64-bit processes.

      The "real memory pages" saved by using public memory segments is a minor concern.  Today's machines typically have plenty of real memory.
      The real problem with private copies of libraries is that they consume processor cache blocks, which are limited.   The same library instructions executing in different real pages will cause memory delays as the i-cache ( instruction cache 128KB blocks) are refreshed from real memory.   Performance loss because instructions are delayed is something that is difficult to measure without access to low-level cache fault data.   The machine just appears to be running slowly for no observable reason.

      This is an easy problem to detect, and an easy problem to correct.

      Detection:  the "genld -l" AIX command produces a list of the libraries used by each process and the AIX memory address where they are loaded.
      The 32-bit public segment is 13 ( "dxxxxxxx" );  private segments are 2-a.
      The 64-bit public segment is 9 ( "9xxxxxxxxxxxxxxx" );  the private segment is 8.

      genld -l | grep -Ev ' d| 9' | sort +2

      provides a list of privately loaded libraries. 

      Repair: chmod o+r <libname>
      AIX shared libraries will have a suffix of ".so" or ".a".
      Another technique is to change all libraries in a selected directory to repair those that might not be currently loaded.   The usual directories that need repair are obiee code, httpd code and plugins, database client libraries and java.
      chmod o+r /shr/dir/*.a /shr/dir/*.so
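To touch only the libraries that actually need repair, the permission test can be folded into find.  A sketch; the directory argument is a placeholder for your own library paths:

```shell
# find_private_libs DIR: list .so/.a libraries under DIR that are
# missing the Read-Other bit -- the ones AIX will load privately.
find_private_libs() {
  find "$1" \( -name '*.so' -o -name '*.a' \) ! -perm -004 -print
}

# Repair pass over a library directory (path is a placeholder):
#   find_private_libs /shr/dir | xargs chmod o+r
```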

  7. Configure your system for diagnostics
    Production systems shouldn't crash, and yet bad things happen to good software.
    If obiee software crashes and produces a core, you should configure your system for reliable transfer of the failing conditions to Oracle Tech Support.  Here's what we need to be able to diagnose a core file from your system.
    * fullcore enabled. chdev -l sys0 -a fullcore=true
    * core naming enabled. chcore -n on -d
    * ulimit must not truncate cores. see item 4.
    * pstack.sh is used to capture core documentation.
    * obidoc is used to capture current AIX configuration.
    * snapcore  AIX utility captures core and libraries. Use the proper syntax.
     $ snapcore -r corename executable-fullpath
       /tmp/snapcore will contain the .pax.Z output file.  It is compressed.
    * If cores are directed to a common directory, ensure obiee userid can write to the directory.  ( chcore -p /cores -d ; chmod 777 /cores )
    The filesystem must have sufficient space to hold a crashing obiee application.
    Use:  df -k
      Check the "Free" column ( not "% Used" )
      8388608 is 8GB.
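The Free-column check can be scripted as well.  A sketch that reads `df -k` output on stdin; it assumes the AIX layout where Free is the 3rd column (other platforms put free space in a different column):

```shell
# free_kb_ok: read `df -k DIR` output on stdin; succeed when the Free
# column (field 3 in AIX df output) is at least 8 GB (8388608 KB).
free_kb_ok() {
  awk 'NR == 2 { exit ($3 >= 8388608) ? 0 : 1 }'
}

# Typical use:  df -k /cores | free_kb_ok || echo "not enough space for a core"
```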

  8. Disable Oracle Client Library signal handling
    The Oracle DB Client Library is frequently distributed with the sqlplus development kit.
    By default, the library enables a signal handler, which will document a call stack if the application crashes.   The signal handler is not needed, and definitely disruptive to obiee diagnostics.   It needs to be disabled.   sqlnet.ora is typically located at:
       $ORACLE_HOME/network/admin/sqlnet.ora
    Add this line at the top of the file:
       DIAG_SIGHANDLER_ENABLED=FALSE

  9. Disable async query in the RPD connection pool.
    This might be an obiee 10.1.3.4 issue only ( still checking  ).
    "async query" must be disabled in the connection pools.
    It was designed to enable query cancellation to a database, and turned out to have too many edge conditions in normal communication that produced random corruption of data and crashes.  Please ensure it is turned off in the RPD.

  10. Check AIX error report (errpt).
    Errors external to obiee applications can trigger crashes.
     $ /bin/errpt -a
    Hardware errors ( firmware, adapters, disks ) should be reported to IBM support.
    All application core files are recorded by AIX;  the most recent ones are listed first.

  11. Capture pstack output for the most recent crash
    $ errpt -A |grep core |head -1 |xargs pstack.sh
    produces a core*.pstack file in directory set by $obiCollect

  12. Reserved for something important to say.

Wednesday Aug 22, 2012

Active File Sparsing

core files are often sparse files.  This article demonstrates that a core file written by HP-UX consumes filesystem space equivalent to the physical size of the core file.   'pax' and 'gzip' demonstrate that the core file is highly compressible.

pax actively sparses a file extracted from an archive.  This (mostly zeros) core file appears to be 7200 times smaller with respect to the amount of filesystem space used.

A small demonstration program shows how to write sparse files from an application.

Two additional uses for the pax utility.

About

Dick Dunbar
is an escalation engineer working in the Customer Engineering & Advocacy Lab (CEAL team)
for Oracle Analytics and Performance Management.
I live and work in Santa Cruz, California.
I'll share the techniques I use to detect, avoid and repair problems.
