Thursday Jan 19, 2012

Fundamental Oracle flaw revealed??? Really ...?

This Infoworld article from Jan 17, 2012  Fundamental Oracle flaw revealed did alert Oracle database customers.Infoworld has raised this issue to Oracle before going public with it. Patches are included in the Jan 2012 CPU and PSU. So again, it's strongly recommended to apply the Jan 2012 PSU (or CPU if you are just asking for security fixes) to your environments.

What is the background of this issue?
Everything in an Oracle database is dependent on the SCN (System Change Number). This number is crucial to ensure read consistency. It will always be just incremented and is defined as a large 48-bit integer (281 trillion SCNs). But the SCN can jump as well - especially in cases of distributed transactions. Besides that hard limit there's also a soft limit for the SCN (see the MOS Note for more information).
Distributed Transaction

Hot backup bug
Now there's a backup bug which will increment the SCN to a much higher value once ALTER DATABASE BEGIN BACKUP gets used. We call this putting tablespaces into hot backup mode. Actually I'd assume that most people out there (at least those doing backups on a regular basis) use RMAN - and RMAN does not need to put anything into hot backup mode when creating online backups as the real downside of the hot backup mode is an increased value of log information.
Strong recommendation: Use RMAN! And you may apply patch 12371955: "High SCN growth rate from ALTER DATABASE BEGIN BACKUP in 11g" to your environment.

Combination of backup up and distributed transactions
The people who've detected this issue paint now a large Oracle database infrastructure to the wall - with many databases running distributed transactions - and a misbehaving BEGIN BACKUP routine in combination. This would elevate the SCN over and over again - on all interconnected databases - over time as the SCN will be synched over and over again - and will do huge jumps because of the backup bug.

What's the real risk?
I'm not a security expert - but I've seen many customer environments in the real world. I'd say (and skilled DBAs gotten interviewed by Infoworld and others stated similar opinions) it may be just a small risk in larger environments where many databases are connected together - and CPUs or PSUs got not applied on a regular basis. The PSU/CPU fix will prevent the SCN to be incremented in extensive jumps by several ways.
I'd completly disagree with Infoworld's prediction that databases will crash or abandon - transactions won't be executed anymore and an error will be raised. Yes, this is bad enough - true - but the database(s) will remain open.

What should you do?
Apply the January 2012 PSU or CPU and hot backup fix covered by patch 12371955. But keep in mind

  • Take the PSUs over CPUs as PSUs will contain also important non-dictionary changing fixes whereas CPUs contain security fixes only
  • You can't put a CPU on top of a previous applied PSU
  • Both CPUs and PSU are cummulative 
  • And well, you'll need Extended Support to get acces to PSUs or CPUs for Oracle Database 10.1 and 10.2 - and yes, please don't cry: We've asked you to upgrade a looooooong time ago ;-)

Tuesday Nov 29, 2011

Wrong statistics in AUX_STATS$ might puzzle the optimizer

We do recommend the creation of System Statistics for quite a long time. Since Oracle 9i the optimizer works with a CPU and IO cost based model. And in order to give the optimizer some knowledge about the IO subsystem's performance and throughput - once System Statistics are collected - they'll get stored in AUX_STATS$. For this purpose in the old Oracle 9i days some default values had been defined - and you'll still find those defaults in Oracle Database 11g Release 2 in AUX_STATS$. But these old values don't reflect the performance of modern IO systems. So it might be a good best practice post upgrade to create fresh System Statistics if you haven't done this before. 

You can collect System Statistics with:

exec DBMS_STATS.GATHER_SYSTEM_STATS('start');

and end it later by executing:

exec DBMS_STATS.GATHER_SYSTEM_STATS('stop');


You could also run DBMS_STATS.GATHER_SYSTEM_STATS('interval', interval=>N) instead where N is the number of minutes when statistics gathering is stopped automatically.

Please make sure you'll do this on a real workload period. It won't make sense to gather these values while the database is in an idle state. You should do this ideally for several hours. It doesn't affect performance in a negative way as the values are anyway collected in V$SYSSTAT and V$SESSTAT. And in case you'd like to delete the stats and revert to the old default values you'd simply execute:
exec DBMS_STATS.DELETE_SYSTEM_STATS;

The tricky thing in Oracle Database 11.2 - and that's why I'm actually writing this blog post today - is bug9842771. This leads to wrong values in AUX_STATS$ for SREADTIM and MREADTIM by factor 1000 guiding the optimizer sometimes into the totally wrong directon. The workaround is to overwrite these values manually and divide them by 10000. Use the DBMS_STATS.SET_SYSTEM_STATS procedure. See this MOS Note:9842771.8 for the above bug for some further information. This issue is fixed in Oracle Database 11.2.0.3 and above.

To get some background information about the statistics collected in please read this section in the Oracle Database 11.2 Performance Tuning Guide. And gathering System Statistics might have some implication if you have mixed workloads - and interacts with DB_FILE_MULTIBLOCK_READ_COUNT. For more information please read section 13.4.1.2.


Correction: I had to correct the factor (marked in RED above) from 1000 to 10000. Thanks to Andre Duvekot from the Netherlands for highlighting this to me!!!


About

Mike Dietrich - Oracle Mike Dietrich
Senior Principal Technologist - Database Upgrade Development Group - Oracle Corporation

Based near Munich/Germany and spending plenty of time in airplanes to run either upgrade workshops or work onsite with reference customers. Acting as interlink between customers and the Upgrade Development.

Contact me either via XING or LinkedIn

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
2
3
4
5
6
9
10
12
13
15
16
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today
Slides Download Center
OOW Slides Download
Visitors since 17-OCT-2011
White Paper and Docs
Oracle Blogs
Workshops
Viewlets and Videos
This week on my Rega/iPod/CD
Workshop Map
Upgrade Reference Papers