Saturday Nov 11, 2006

NAS architecture presentation at OSWOUG

I had the pleasure of hearing an old Sequent friend Kevin Closson speak about NAS architecture at a recent OSWOUG meeting. It was an interesting and energetic discussion on the direction of NAS in commodity servers. If you are interested at all in the direction of storage technology for databases, you should check out Kevin's blog and paper on this technology.

Tuesday Oct 31, 2006

Solaris - Unbreakable, Unfakeable, Open, and the most Affordable!

In light of the recent announcements by Oracle on the creation of "Larry Linux", I thought it would be prudent to comment about where Solaris fits into this picture. Oracle was hoping to fill a perceived hole in the support of Linux for the Enterprise. Oracle believes they can do this better than Red Hat. While won't comment on whether or not Oracle can support Linux better than Red Hat, I am looking forward to seeing the "tars" from Larry Linux :)

Anyway, this recent announcement led me to realize that Solaris "now more than ever" is the best choice to run your enterprise. If you want open, Solaris is open. If you want iron-clad Unfakeable, Unshakeable, Unbreakable, ... well Solaris has been running enterprises since before Linux was a twinkle in Torvalds eye. Finally, if it comes down to price - Oracle will not cut you any breaks on Database support... You may have to mortgage your house just to get a years support for Oracle. But if you want to save some pocket change on OS support, Solaris beats Red Hat and Larry Linux in that category as well.
    Red Hat: $999
Larry Linux: $399
    Solaris: $120

Seems like a pretty easy choice to me...

Monday Oct 30, 2006

Dueling DUAL with BEA Weblogic and TestConnectionsOnReserve.

You would think that the "DUAL" table, a simple stub table, would not be a performance topic - but I have seen this for years on high-end benchmarks. People develop applications or tests for applications which tend to over-use the DUAL table. Most commonly, this comes in the form of "select abc.nextseq from DUAL" and "select sysdate from DUAL". This is typically, not a problem for small severs with a low level of concurrency, but it can be bottle-neck on high-end severs with lots of processors.

The problem with DUAL (in Oracle 9i and below) is that this "fake table" hashes to a "real" cache line :) If over-used it can cause a "cache buffers chains" latch contention like crazy. The most dangerous over-use situations are systemic ones. I can get around these issues in most benchmark environments, but cringe when I see the embedded use DUAL.

In BEA websphere, there is a parameter called "TestConnectionsOnReservere". This parameter sends a SQL statement to the database before \*EVERY\* user statement.... talk about overhead! This not only adds SQL\*Net round trips increasing network use, but most commonly uses the "SQL SELECT 1 from DUAL" as the test statement :) What is worse, the overhead just continues to increase as the load is increased. Ken Gottry discusses the performance impact in an article he wrote. This study used a 2-way server to show the performance impact. It is much worse on a high-end server.

What can you do?

Avoid setting the TestConnectionsOnReserve within BEA. The performance cost in terms of potential latch contention and network over-head is too high. If you must use this paramenter, use the "X$DUAL" table instead. Oracle 10g, uses this by default and while it avoids the latching issues, the networking component this parameter is still present.

Wednesday Aug 16, 2006

Solaris Applications Specific Tuning Wiki

As part of the Second Edition of the famous Solaris Internals and the new Solaris Performance and Tools book a performance tuning Wiki has been created. This site is meant to be a living document where best practices, tuning information, and tips are collected.

I have began contributing Oracle performance information to the Solaris applications specific tuning Wiki. I hope you enjoy this repository of information regarding performance on Sun systems.

Friday Aug 04, 2006

High "times()" syscall count with Oracle processes

"Why does Oracle call times() so often? Is something broken? When using truss or dtrace to profile Oracle shadow processes, one often sees a lot of calls to "times". Sysadmins often approach me with this query.

root@catscratchb> truss -cp 7700
syscall               seconds   calls  errors
read                     .002     120
write                    .008     210
times                    .053   10810
semctl                   .000      17
semop                    .000       8
semtimedop               .000       9
mmap                     .003      68
munmap                   .003       5
yield                    .002     231
pread                    .150    2002
kaio                     .003      68
kaio                     .001      68
                     --------  ------   ----
sys totals:              .230   13616      0
usr time:               1.127
elapsed:               22.810

At first glance it would seem alarming to have so many times() calls, but how much does this really effect performance? This question can best be answered by looking at the overall "elapsed" and "cpu" time. Below is output from the "procsystime" tool included in the Dtrace toolkit.

root@catscratchb> ./procsystime -Teco -p 7700
Hit Ctrl-C to stop sampling...
Elapsed Times for PID 7700,
         SYSCALL          TIME (ns)
            mmap           17615703
           write           21187750
          munmap           21671772
           times           90733199       <<== Only 0.28% of elapsed time
          semsys          188622081
            read          226475874
           yield          522057977
           pread        31204749076
          TOTAL:        32293113432

CPU Times for PID 7700,
         SYSCALL          TIME (ns)
          semsys            1346101
           yield            3283406
            read            7511421
            mmap           16701455
           write           19616610
          munmap           21576890
           times           33477300         <<== 10.6% of CPU time for the times syscall
           pread          211710238
          TOTAL:          315223421

Syscall Counts for PID 7700,
         SYSCALL              COUNT
          munmap                 17
          semsys                 84
            read                349
            mmap                350
           yield                381
           write                540
           pread               3921
           times              24985    <<== 81.6% of syscalls.
          TOTAL:              30627

According to the profile above, the times() syscall accounts for only 0.28% of the overall response time. It does use 10.6% of sys CPU. The usr/sys CPU percentages are "83/17" for this application. So, using the 17% for system CPU we can calculate the overall amount of CPU for the times() syscall: 100\*(.17\*.106)= 1.8%.

Oracle uses the times() syscall to keep track of timed performance statistics. Timed statistics can be enabled/disabled by setting the init.ora parameter "TIMED_STATISTICS=TRUE". In fact, it is an \*old\* benchmark trick to disable TIMED_STATISTICS after all tuning has been done. This is usually good for another 2% in overall throughput. In a production environment, it is NOT advisable to ever disable TIMED_STATISTICS. These statistics are extremely important to monitor and maintain application performance. I would argue that disabling timed statistics would actually hurt performance in the long run.

Tuesday Jul 11, 2006

Threshold login triggers for Oracle 10046 event trace collection

There are multiple ways to gather trace data. You can instrument the application, pick an oracle sid from sysdba, turn on tracing for all users (ouch), or use a login trigger to narrow down to a specific user. Each of these methods have merit, but recently I desired to gather traces at various user levels.

The problem with most packaged applications, is that they all use the \*same\* userid. For this Oracle 10G environment, I used this fact to filter only connections of the type that I wanted to sample. I wanted to gather 10046 event trace data when the number of connections was 10, 20, or 30. To achieve this, I used a logon trigger and sampled the number of sessions from v$session to come up with the connection count. I have found this little trick to be very useful in automating collection without modifying the application. I hope this can be useful to you as well.

create or replace trigger trace_my_user
  after logon on database

  mycnt  int;


SELECT count(\*)
 INTO mycnt
 FROM v$session
 WHERE username='GLENNF';

 if (user='GLENNF') and ((mycnt=10) or (mycnt=20) or (mycnt=30)) then
 end if;

Tuesday Mar 14, 2006

Decoding 'latch:cache buffers chains' object from Oracle trace file

If you have been using Method-R for response time based profiling, then you will appreciate this note. The "Cache Buffers Chains" latch pops up from time-to-time when trying to scale applications on high-end systems. It is usually a sign of an application issue so locating the source of contention is critical. There are several notes in metalink (42152.1 and 163424.1) which describe how to find hot blocks, but nothing that uses the the Oracle "10046 event" trace files.
Below is output from a trace file (10gR1) which shows the application waiting on a CBC latch:
   WAIT #3: nam='latch: cache buffers chains' ela= 18996 p1=15245584968 p2=116 p3=1

The ADDR in "p1=" is a decimal value which can be converted into hex and used to query the x$bh, v$latch_children, and sys.dba_extents tables to find the objects that are contending CBCs.

I created a script "CBC_p1_to_obj.sql" which hides the nasty sql and takes the ADDR as input. Below is an example from a recent experiment:
SQL> @CBC_p1_to_obj

Function created.

Enter value for cbc_addr_p1: 15245584968
old  12:   x.hladdr  = to_hex('&&CBC_ADDR_P1') and
new  12:   x.hladdr  = to_hex('15245584968') and

SEGMENT_NAME                           EXTENT#     BLOCK#        TCH     CHILD#     SLEEPS
----------------------------------- ---------- ---------- ---------- ---------- ----------
SYSMAN.MGMT_METRICS                          6          8          2        944          0
SYSMAN.MGMT_METRICS_RAW_PK                  32        113          2        944          0
SYSMAN.MGMT_METRICS_RAW_PK                  33         90          2        944          0
SYSMAN.MGMT_METRICS_1HOUR_PK                18         21          2        944          0
SYSMAN.MGMT_METRICS_1HOUR_PK                17         44          2        944          0
SYS.I_DEPENDENCY1                           18         14          1        944          0
SYS.WRI$_ADV_TASKS_IDX_01                    0          1          1        944          0
XDB.SYS_LOB0000043477C00008$$                0          3          1        944          0
OLAPSYS.MRAC_OLAP2_AW_PHYS_OBJ_T             0          3          1        944          0
OLAPSYS.MRAC_OLAP2_AW_PHYS_OBJ_T             0          3          1        944          0
XDB.SYS_LOB0000043477C00008$$                0          3          1        944          0

SEGMENT_NAME                           EXTENT#     BLOCK#        TCH     CHILD#     SLEEPS
----------------------------------- ---------- ---------- ---------- ---------- ----------
SYS.C_OBJ#                                  16        118          0        944          0
DG.T1PK                                     19        115          0        944          0

13 rows selected.

I hope you will find this script useful. Let me know if you experience any issues.

Thursday Feb 16, 2006

Oracle's vision of multi-core processors on Solaris

There seems to be some confusion about how an Oracle instance uses multi-core processors. From a database point of view, it can use all CPU resource that is offered by Solaris. To find out what CPU resources are available, use the "prtdiag" and "psrinfo" commands. The example below shows a single board USIV domain on a SF25K. There are a total of 4 \* USIV Processors running 8 cores @1.2GHz each. The prtdiag output shows the 4 processors with two CPU ID's each. The psrinfo command simply shows all 8 cores.

catscratchb:root> /usr/sbin/prtdiag
System Configuration:  Sun Microsystems  sun4u Sun Fire 15000
System clock frequency: 150 MHz
Memory size: 32768 Megabytes

========================= CPUs =========================

         CPU      Run    E$    CPU     CPU
Slot ID   ID       MHz    MB   Impl.    Mask
--------  -------  ----  ----  -------  ----
/SB01/P0   32, 36  1200  16.0  US-IV    2.2
/SB01/P1   33, 37  1200  16.0  US-IV    2.2
/SB01/P2   34, 38  1200  16.0  US-IV    2.2
/SB01/P3   35, 39  1200  16.0  US-IV    2.2

catscratchb:root> /usr/sbin/psrinfo
32      on-line   since 02/07/2006 18:00:23
33      on-line   since 02/07/2006 18:00:25
34      on-line   since 02/07/2006 18:00:25
35      on-line   since 02/07/2006 18:00:25
36      on-line   since 02/07/2006 18:00:25
37      on-line   since 02/07/2006 18:00:25
38      on-line   since 02/07/2006 18:00:25
39      on-line   since 02/07/2006 18:00:25

Oracle does size internal latching structures based on the number of CPUs available. This sizing does NOT disallow Oracle shadow process from using CPU resource, it simply makes the system better suited to scale. To find out how many CPUs Oracle thinks are available, run the following SQL command as sysdba. We have found that Oracle sizes its data structures based on the number or CPU IDs that are reported by psrinfo.

SQL> connect / as sysdba

SQL> select ksppinm name, ksppstvl value, ksppdesc description
       from x$ksppi x, x$ksppcv y
       where (x.indx = y.indx)
         and ksppinm like '%cpu_count%'
       order by name;

--------------- ------ ----------------------------------
cpu_count       8      number of CPUs for this instance

Friday Dec 09, 2005

Request for Statspack Data

I am working on shaping the next generation of the Transaction Process Council OLTP benchmark "TPC-E". To make this benchmark more useful and avoid the problems of TPC-C, I would like to have input from people who actually run \*real\* database systems. I am looking for characteristics such as Locical IO per transaction to help better determine if we are going the right direction with the TPC-E workload.

So, if you have statspack data that you would not mind sharing, please send it my way along with a short description of the machine and CPU resource consumed.


Monday Sep 19, 2005

tnsping response time can be misleading.

On several occasions, I have ran into situations where Sun customers are using tnsping as an indicator of network performance. While tnsping does show if a connection to a database exists, the response time is not a true indicator of connect time.

"tnsping" works in the idle loop of the Oracle listener and will not respond until all connections queued by the listener have completed. On a busy system, new connections will be serviced before tnsping responses. This is especially apparent when the "queuesize" parameter has been increased in the listener.ora file. Generally, it is good to service real clients before a ping request, however if you are using this to determine response time, it is not valid. A real performance issue still may exist, but you won't know until digging further.

To get true connect times, I wrote a simple script with a "timex" in front of "select \* from dual;" through a listener. "" measures the true connect time as well as getting the session count and timestamp. You have to modify the connect strings to connect to your database.

I hope this is helpful,

Usage:  ./  intvl  count

oracle@sumocat:~% 20 10
Date_and_Timestamp      SessionCNT      Connect_Time(sec)
09/19/05 11:08:05               20      0.260000
09/19/05 11:08:25               19      0.280000
09/19/05 11:08:46               19      0.270000
09/19/05 11:09:06               20      0.260000
09/19/05 11:09:27               20      0.270000

Monday Aug 29, 2005

Oracle/Sun Performance Factors

Recently, I have done a few presentations for local customers on Oracle/Sun Performance. I used a presentation originally developed for an internal performance round-table. This original presentation was created by Doug Miller, Vincent Carbone, and I. I changed it around a little for customers. I hope you find this useful.
  Presentation: "Oracle/Sun Performance Factors"

Tuesday Jun 21, 2005

"orasrp" - Free Oracle response time based trace profiler.

Ever since I started reading the "Optimizing Oracle Performance" book by Cary Millsap, I have been salivating over getting a copy of the Hotsos profiler - but alas it is too expensive. Recently I tried to find a copy of their free version "Sparky" but it had been pulled due to support issues.

Finally, I stumbled upon orasrp. This analyzer is written in python and can be used standalone to produce html, or you can browse a directory of trace files via your web browser.

Monday Aug 30, 2004

Session Wait History in Oracle 10g

In the past you could only sample Oracle Wait Events and often missed interesting information. In 10g Oracle introduced a new view, V$ACTIVE_SESSION_HISTORY, which keeps a historical view of wait events. I ran across a great article in the Database Journal which describes how to use this new feature... Pretty cool!

Oracle performance on RAW or VxFS with QIO/ODM.

The RAW vs COOKED debate that has been going on for some time. You will find little argument which is more efficient at caching DB buffers. If and Oracle process can get a block by doing a block get, this uses much less code than issuing an IO and getting it from the UFS cache. If this is the case, why have I ran into numerous DBAs that have tried Raw once but saw worse or no performance improvement from RAW?

Further investigation usually reveals that the FS cache was providing caching benefit, that was not able to be realized within Oracle. Without modifying the SGA size and possibly storage parameters on active tables/indexes, it is hard to realize any benefit with Raw.

A good place to start is by looking at your largest objects. If your statistics are current, you can use the following SQL. OEM also does a good job of showing table sizes.
SQL> select table_name, CACHE, num_rows, BLOCKS/128 MB 
     from all_tables
     order by MB desc
TABLE_NAME                     CACHE   NUM_ROWS         MB
------------------------------ ----- ---------- ----------
BIGACTIVETABLE                     N   20000000 13258.2891
LOOKUP                             Y      50000   150.5703
Also, You need to know the index sizes so..
SQL> select index_name,
            leaf_blocks/128 MB ,
     from all_indexes
     where owner = 'STSC'
     order by MB desc
INDEX_NAME                       NUM_ROWS         MB BUFFER_
------------------------------ ---------- ---------- -------
BIGACTIVETABLE_PK                20000000  9245.6875 DEFAULT
Since we now know what the largest tables and indexes are, we should expect to see them in the buffer cache. This can be done by querying the v$bh table.
SQL> SELECT object_name, count(\*)/128 MB 
     FROM v$bh, all_objects
     WHERE object_id=objd
     GROUP BY object_name
     ORDER BY MB desc

OBJECT_NAME                            MB
------------------------------ ----------
LOOKUP                          120.09375
BIGACTIVETABLE                  100.98438
BIGACTIVETABLE_PK                75.23438
Notice that "BIGACTIVETABLE" and it's index are not being cached very well. This is due to the fact that the storage parameter on the table and index are set to "NO CACHE". This causes Oracle to not actively try to cache these blocks. By alter these storage parameters
"alter table BIGACTIVETABLE cache;"
and increasing the SGA, these tables and indexes can use more Oracle buffer cache. Note, that you can also separate objects into the DEFAULT, KEEP, and RECYCLE buffer pools. Statspack reports calculate hit ratios for the DEFAULT, KEEP, are RECYCLE buffer pools separately. I tend to put all objects I am actively trying to KEEP into the KEEP pool so I can figure my caching efficiency via statspack.

I hope this helps inspire you to give this a try. It is not black magic, give it a try.

Take Care,

Tuesday Jun 08, 2004

Monitoring "Temporary Tablespace" Usage in Oracle 9iR2 and 10g

This is useful for monitoring index builds and large DSS queries. The last thing you want to happen after a query has been running for a few hours is to run out of temporary space. Statspack will give you essentially the same information, but I like to have a simple little query for interactive monitoring.
SQL> select 100\*(u.tot/d.tot) "pct_temp_used" FROM
     (select sum(u.blocks) tot from v$tempseg_usage u) u,
     (select sum(d.blocks) tot from dba_temp_files d) d



This blog discusses performance topics as running on Sun servers. The main focus is in database performance and architecture but other topics can and will creep in.


« March 2015

No bookmarks in folder


No bookmarks in folder