Oracle 10g on Solaris 10 Hidden parameters to optimize Oracle 10g on Solaris 10 Hidden parameters to optimize Oracle 10g on Solaris 10

Hidden parameters to optimize Oracle 10g on Solaris 10 Hidden parameters to optimize Oracle 10g on Solaris 10



You may have recently installed Oracle 10g on Solaris 10 and wander
into the wonderful world of Oracle hidden parameters. Every time Oracle
is producing a new vintage of the unbreakable database we get a bulkload
of new mysterious parameters. For the Oracle DBA eye, some of them have
a very explicit name (_lgwr_async_io). Some of them have names directly
extracted from a martian dictionary (see _kghdsidx_count).

Now, of course, your noble intent is to do tuning, not debugging. What about
if you obtain a very sexy
"ORA-03113: end-of-file on communication channel"
on your first 1000 users attempt ?

Well, looking into the Oracle Net Dispatcher log, you will see an helpful :

"NS Primary Error: TNS-12535: TNS:operation timed out
NS Secondary Error: TNS-12606: TNS: Application timeout occurred"


And you call Oracle and they will tell us : This is a bug, Sir. Please go
in sqlnet.ora and do not specify the SQLNET.INBOUND_CONNECT_TIMEOUT parameter.

One problem fixed....the only fixed by a documented feature that you should
not use....great start.

Starting the workload again and now you observe some FULL TABLE SCAN. Oops...
I know how to fix this one and here are the "create index" statements.
Unfortunately, the unbreakable database send you a very rude
ORA-00600 [kcbgtcr_5], or ORA-00600 [kcbgcur_3] error message.

Good thing this young lady from oracle had the coolest voice in the world
so it not a problem to call again. And a certain  John answers the phone...
Excuse me, may I speak with Virgina ?...ok, I'll wait.

Yes, this is a bug again (3392439) and to fix it , just type :
"ALTER SYSTEM FLUSH BUFFER_CACHE" . Interesting... or you can put this in
your pfile "_db_cache_pre_warm=false" . Oracle is easy.


(By the way, some more 600 errors can occur on Oracle 10g for Solaris x86
and the previous parameters do not fix them. You will need the very
entertaining "_enable_NUMA_optimization = FALSE" to keep going...)

Here we are... my 1000 users are running.
Looking at Statspack and system statistics, I notice a lot of pressure on
the shared pool and latch contention.
First, I made sure I was using ISM with " _use_ism_for_pga = true" Yep...
Then, I discovered that we can now segment the shared pool into multiple separate
zones, each protected by bound latches. How to do this ?
Just say " _kghdsidx_count = 4" and you will get four of those. The maximum
is apparently seven. No idea why....And I can not find this martian dictionary.

And running again.... but oracle is still singing the latch contention hymn.
Could I have a high level of contention on certain blocks ?
To find the culprit, I queried V$LATCH_CHILDREN for the address and joined it
to V$BH to identify the blocks protected by this latch (doing so will show all
 blocks that are affected by the warm block).
 
Two way to fix this :

 - If this is on an index (use DBA_EXTENTS to find out this common case) ,
 use a reverse-key index.
 - If not, set _db_block_hash_buckets to the prime number just larger than twice
 the number of buffers.
 
Do not forget you must have one LRU latch minimum for each database writer.
You can increase them with a very elegant "_db_block_lru_latches= xx"
Just tell me why this is undocumented as it appears absolute best practice ?

And here I am, running again. Now that I fixed the latch issue, the contention
has moved to the log writer. No surprise.
 A new feature of Oracle 10g is log parallelism that you can obtain with :
_lgwr_async_io=false
_log_parallelism_dynamic=true
and the tuning of _log_parallelism_max

Looking further into this, it does not provide full parallelism.
And because this is not a 24x7 production system, looks like you can also
do a really,really exciting :
_log_private_parallelism=true

(Common sense could have been _log_parallelism_private=true but this
Oracle engineers like poetry too...)

Oracle did not crash (unbreakable,right ) and I am running as fast as ever.
I realized later that I really did not need to update v$pga_advice all the time
(_smm_advice_enabled=false) or enable auto tuning of undo_retention
(_undo_autotune=false) as I really need this CPU cycles for my transactions
and not for the Oracle kernel.

Finally, here I am using the 21st century software jewel, DTrace
And realize that I am not using malloc() anymore but mmap(). Great !
But can I tune the mmap byte preallocation....oh,yes. Here is our final
undocumented pearl : _realfree_heap_pagesize_hint . Only 28 letters, what
do you think ?

Unbreakable, yes ! Simple, not yet ....

Comments:

I appreciate you taking the time to post these performance parameters for Oracle 10g on Solaris 10.

I recently discovered that another cause of the mysterious "ORA-03113: end-of-file on communication channel" message is Solaris 10 patch 119564-01. It affects Oracle 10.1.0.2 and 10.1.0.4 on Solaris 10 (at least with my test setup running in a zone). The Oracle trace output does not immediately provide any help (see below), but selectively backing out patches one at a time identified 119564-01.

Someone should probably file a bug on this with Sun...

So if your trace output looks like this (note the prune_cpus()), you're probably affected:
Redo thread mounted by this instance: 0 <none>
Oracle process number: 0
18196

kstwlb: SGA is no longer mapped
Exception signal: 10 (SIGBUS), code: 1 (Invalid address alignment), addr: 0x600000007, PC: [
0xffffffff7b900d3c, prune_cpus()+164]
\*\*\* 2005-05-27 09:31:57.866
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [prune_cpus()+164] [SIGBUS] [Invalid address ali
gnment] [0x600000007] [] []
Current SQL information unavailable - no SGA.
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+1008        CALL     ksedst()             1052D83F0 ? 104BFE418 ?
                                                   104BFE428 ? 000000000 ?
                                                   1052D8CE8 ? 000000008 ?
ssexhd()+992         CALL     ksedmp()             000000002 ? 000105000 ?
                                                   105128000 ? 000105128 ?
                                                   000105000 ? 000000001 ?
__sighndlr()+12      PTR_CALL 0000000000000000     00000000A ? 1052DFEF0 ?
                                                   000105000 ? 000105000 ?
                                                   000000010 ? 10512C000 ?
call_user_handler()  CALL     __sighndlr()         00000000A ? 1052DFEF0 ?
+992                                               1052DFC10 ? 1002CDB40 ?
                                                   000000000 ? 000000009 ?

Posted by Matt on May 27, 2005 at 02:42 AM PDT #

Hi Matt - sorry for the delay in my response, but there is not anything actually wrong with patch 119564-01, you just need to follow the README's recommendation and install the kernel update patch (at least rev 118822-03). We are going to fix the wording in the README to include a warning about what will happen if you don't additionally upgrade the kernel. Thanks, Valerie Solaris 10 ON update & patch c-team lead

Posted by Valerie on June 02, 2005 at 07:31 AM PDT #

Have "ORA-03113: end-of-file on communication channel" on initial install of Oracle Application Server 10g. Occurred when installing the infastructure, when the install reached the metadata repository part of the install. Operating system Solaris 10, 1/06. Patches Applied: 118822-30, 119578-18 I'm a newbie, and would appreciate any ideas. Right now I am not sure it is something that I've done, or if the bug mentioned above still exists. Need to be pointed in the right direction. Thanks in advance for any help you can provide. VR/ Jim W.

Posted by Jim Watson on May 02, 2006 at 05:40 AM PDT #

and because all of then are hidden, in some cases if the is_system_modifiable is false, then the only way (if i rememember correctly)is to startup the instance from a pfile. meaning shutdown and startup on non-productive hours. your post is great. thanks k. Hairopoulos GREECE

Posted by Konstantinos Hairopoulos on October 22, 2006 at 02:06 AM PDT #

Nope.

Posted by guest on March 17, 2008 at 01:14 AM PDT #

pls could any one tell me how to instal oracle 10g on solaris 10

Posted by wairhe Emefe on May 28, 2009 at 12:45 PM PDT #

Operating system Solaris 10, 1/06. Patches Applied: 118822-30, 119578-18 " i have, but i cant installing."

Posted by Egitim on December 09, 2010 at 10:39 PM PST #

Post a Comment:
  • HTML Syntax: NOT allowed
About

mrbenchmark

Search

Categories
Archives
« July 2015
SunMonTueWedThuFriSat
   
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
       
Today
News
Blogroll
deepdive

No bookmarks in folder