Oracle 10g on Solaris 10 Hidden parameters to optimize Oracle 10g on Solaris 10 Hidden parameters to optimize Oracle 10g on Solaris 10
By mrbenchmark on May 10, 2005
You may have recently installed Oracle 10g on Solaris 10 and wander
into the wonderful world of Oracle hidden parameters. Every time Oracle
is producing a new vintage of the unbreakable database we get a bulkload
of new mysterious parameters. For the Oracle DBA eye, some of them have
a very explicit name (_lgwr_async_io). Some of them have names directly
extracted from a martian dictionary (see _kghdsidx_count).
Now, of course, your noble intent is to do tuning, not debugging. What about
if you obtain a very sexy
"ORA-03113: end-of-file on communication channel"
on your first 1000 users attempt ?
Well, looking into the Oracle Net Dispatcher log, you will see an helpful :
"NS Primary Error: TNS-12535: TNS:operation timed out
NS Secondary Error: TNS-12606: TNS: Application timeout occurred"
And you call Oracle and they will tell us : This is a bug, Sir. Please go
in sqlnet.ora and do not specify the SQLNET.INBOUND_CONNECT_TIMEOUT parameter.
One problem fixed....the only fixed by a documented feature that you should
not use....great start.
Starting the workload again and now you observe some FULL TABLE SCAN. Oops...
I know how to fix this one and here are the "create index" statements.
Unfortunately, the unbreakable database send you a very rude
ORA-00600 [kcbgtcr_5], or ORA-00600 [kcbgcur_3] error message.
Good thing this young lady from oracle had the coolest voice in the world
so it not a problem to call again. And a certain John answers the phone...
Excuse me, may I speak with Virgina ?...ok, I'll wait.
Yes, this is a bug again (3392439) and to fix it , just type :
"ALTER SYSTEM FLUSH BUFFER_CACHE" . Interesting... or you can put this in
your pfile "_db_cache_pre_warm=false" . Oracle is easy.
(By the way, some more 600 errors can occur on Oracle 10g for Solaris x86
and the previous parameters do not fix them. You will need the very
entertaining "_enable_NUMA_optimization = FALSE" to keep going...)
Here we are... my 1000 users are running.
Looking at Statspack and system statistics, I notice a lot of pressure on
the shared pool and latch contention.
First, I made sure I was using ISM with " _use_ism_for_pga = true" Yep...
Then, I discovered that we can now segment the shared pool into multiple separate
zones, each protected by bound latches. How to do this ?
Just say " _kghdsidx_count = 4" and you will get four of those. The maximum
is apparently seven. No idea why....And I can not find this martian dictionary.
And running again.... but oracle is still singing the latch contention hymn.
Could I have a high level of contention on certain blocks ?
To find the culprit, I queried V$LATCH_CHILDREN for the address and joined it
to V$BH to identify the blocks protected by this latch (doing so will show all
blocks that are affected by the warm block).
Two way to fix this :
- If this is on an index (use DBA_EXTENTS to find out this common case) ,
use a reverse-key index.
- If not, set _db_block_hash_buckets to the prime number just larger than twice
the number of buffers.
Do not forget you must have one LRU latch minimum for each database writer.
You can increase them with a very elegant "_db_block_lru_latches= xx"
Just tell me why this is undocumented as it appears absolute best practice ?
And here I am, running again. Now that I fixed the latch issue, the contention
has moved to the log writer. No surprise.
A new feature of Oracle 10g is log parallelism that you can obtain with :
and the tuning of _log_parallelism_max
Looking further into this, it does not provide full parallelism.
And because this is not a 24x7 production system, looks like you can also
do a really,really exciting :
(Common sense could have been _log_parallelism_private=true but this
Oracle engineers like poetry too...)
Oracle did not crash (unbreakable,right ) and I am running as fast as ever.
I realized later that I really did not need to update v$pga_advice all the time
(_smm_advice_enabled=false) or enable auto tuning of undo_retention
(_undo_autotune=false) as I really need this CPU cycles for my transactions
and not for the Oracle kernel.
Finally, here I am using the 21st century software jewel, DTrace
And realize that I am not using malloc() anymore but mmap(). Great !
But can I tune the mmap byte preallocation....oh,yes. Here is our final
undocumented pearl : _realfree_heap_pagesize_hint . Only 28 letters, what
do you think ?
Unbreakable, yes ! Simple, not yet ....