Tuesday Mar 19, 2013

Improve Database Performance

Untitled Document

Best Practices: Proactive Data Collection for Performance Issues

train



A common problem for analysis of performance issues is having the right diagnostic tool or trace in place during the first occurrence of the problem. This is not always possible, but chances for a quicker resolution and potential Root Cause Analysis (RCA) are possible if you adopt a proactive best practice around performance data collection.



The best practice described here includes a methodology around data collection:

  • Top-Down approach to data collection
  • Establishing multiple baselines
  • Having the right tools already installed and running before the problem occurs
  • Deploying specialized tools for unstable environments

The performance of the Oracle database depends upon the performance of the hardware and operating system that it runs on. This is why it is important to consider operating system metrics as part of a "big picture" when diagnosing performance problems. Oracle support has a utility named OSWatcher Black Box, which will capture information about the operating system. We strongly recommended all customers run OSWatcher Black Box because it provides a "look back" capability on the OS. If for example, we had a node eviction at 2:00 in the morning, we can look back in time to see what happened on the OS at 2:00 or more importantly, what happened at 1:59, just prior to the actual event.

Document 301137.1 OSWatcher Black Box User Guide (Includes: [Video])

If we drill down from the operating system into the different layers of Oracle, we can take advantage of utilities which have been available since version 10g. The Automatic Workload Repository (AWR) and Active Session History (ASH) are two very important utilities to allow us to see what is happening on the database both from a holistic point of view and also at the session level. These 2 utilities are part of the Oracle Diagnostic Pack and require a special license to use. In order to obtain the most effective diagnostics possible, it is strongly recommended that an Oracle Diagnostic Pack license is obtained.

Document 1363422.1 Automatic Workload Repository (AWR) Reports
Document 243132.1 Active Session History (ASH)

The AWR reports can be taken at regular intervals and then used to determine what if any performance problems exist on the database. The use of ASH allows us to see specifically what individual sessions are waiting on or what resources these sessions are consuming.

Red path The good news is that if you proactively take advantage of these tools, it may be possible to diagnose a system or database hang, just by having these tools in place. Many times bug signatures are available within AWR and hangs can be matched to these bug signatures. This may allow for a diagnostic analysis without the need to take more intrusive diagnostics such as systemstate dumps, hanganalyze dumps, etc.

It is very important for you to establish baselines using OSWatcher Black Box and AWR. Baselines should be used as points of reference and are extremely valuable for differential analysis when problems arise. AWR has a compare utility when can be run on two different reports such as your baseline report and the report from the period when the problem actually occurred. This facilitates quick identification of any differences in statistics between the two reports and allows us to focus and identify specific problems. Baselines should be taken during different phases of your business cycle. Examples would be at the busiest time of the day, batch reporting at night, monthly or quarterly end of business cycles, etc.

In addition to OSWatcher Black Box, AWR and ASH, there are other support tools worth considering when it comes to being proactive.

Oracle support has a tool which automates the collection of hang diagnostics. This tool, hangfg, is a simple UNIX shell script that allows users to choose how intrusive they want hang diagnostic traces like systemstate dumps and hang analyze dumps to be on their production servers. The complex syntax and kind of diagnostic trace is embedded in the utility. This frees you from having to know the kind and level of trace and the cryptic syntax to generate these traces. All you need to do when experiencing a hang is run the shell script, which will collect the appropriate diagnostic trace depending upon how intrusive you choose to make it.

There are 3 levels:

  1. Light impact on system. This option collects 2 hanganalyze level 3 traces and then determines whether it can also collect 1 hanganalyze level 4 trace with minimal impact to the system. If so, it collects the hanganalyze level 4 trace. If not, it does not collect an additional trace file.
  2. Medium impact on system (default value). This option collects 1 hanganalyze level 3 trace and then determines whether it can also collect 2 hanganalyze level 4 traces with minimal impact to the system. If so, it collects the 2 additional hanganalyze level 4 traces. If not, it collects an additional hanganalyze level 3 trace. This option also collects 1 systemstate level 266 trace.
  3. Heavy impact on system. This option collects 2 hanganalyze level 4 traces and 2 systemstate level 266 traces.

Document 362094.1 HANGFG User Guide

SQL Tuning Tools

Oracle support also has specialized tools around SQL tuning. SQLT and SQLHC are recommended best practice tools to have downloaded and installed on you systems if possible. SQLT is the best diagnostic tool support has to resolve complicated SQL tuning issues. Both tools extract statistical and environmental data from the database. SQLT is a far more extensive diagnostic tool but does require a separate scheme to be installed on the database. SQLHC does not have a footprint on the database and provides quick health check information. Both tools can be explored in more detail by clicking on the links associated with them.

Document 215187.1 SQLT (SQLTXPLAIN) - Tool that helps to diagnose a SQL statement performing poorly
Document 1366133.1 SQL Tuning Health-Check Script (SQLHC)

Hopefully these tips will encourage you to become more proactive with data collection. Our ultimate goal is to reduce your resolution time if you have a performance problem.

Related Resources: Advisor Webcast Recordings

Friday Jan 11, 2013

Leverage New High Availability Checks

Untitled Document

Leverage New High Availability Checks In Oracle RACcheck Utility

Oracle originally designed the Oracle RACcheck as a RAC Configuration Audit tool to review important configuration settings within a Real Application Clusters (RAC). Over time, we enhanced Oracle RACcheck to cover additional areas. These include single instance (non-rac) databases, Exadata, and now high availability (HA) checks.

This article will walk you through executing the new high availability (HA) checks in the RACcheck utility and reading the new report it generates.

Executing RACcheck with high availability checks

To execute RACcheck and include HA checks, you have two options:

$ ./raccheck -r (for regular healthcheck, including the new HA best practices)
$ ./raccheck -c hacheck (for only HA best practices)

Once you execute the command, RACcheck will detect and report all instances running on your server and give you the option to run all databases or a specific database. The list of databases will include any standby databases running on the server.

First


Once the execution completes, the RACcheck utility will create a report and a zip file containing all necessary files. This zip file can submitted to support for review. The results from our RACcheck example are:

TWO

Reading the report

Now we can look more closely at our results. Remember, we used the new flag to show only the high availability (HA) checks.

The report gives a general summary followed by a high-level list of items needing attention and/or items that passed the HA checks.


THREE

For each item, you are able to click on “view” to show details on the check and the settings the check recommends. In our example, if we look more closely at the failed check:

FAIL

SQL Check The recommended minimum number of archive processes on PRIMARY DB is not set All Databases View

 

We see both the current setting and the recommended change.

FOUR

There are details on the instance, as well as a link to allow for additional information.

This is the first set of High Availability checks in the RACcheck tool, but we will add more in the future. If you have any input on these checks, suggestions on which checks we should add, or questions on the new HA checks, please join us in the discussion we created for this conversation in My Oracle Support Community within the Database Backup and Recovery Community.

Related Informations:

RACcheck - RAC Configuration Audit Tool [ID 1268927.1]
Resolve - Troubleshooting and Preventing RAC issues with RACCheck [recorded Advisor Webcast; ID 1456176.1; Archived 2012 tab; Sept 12, 2012]

The new HA checks are:

  • Check that database is running in archivelog mode
  • Check that log_archive_config is set
  • Check archive processes, log_archive_max_processes parameter, setting.
  • Check that standby redo log files are configured.
  • Check Status of the log transport services for destination
  • Checks for a current gap for destination
  • Check for at least one active RFS-Process when log transport services to a standby database are active
  • Check current status of the managed recovery process
  • Check that FAL is configured
  • Check that data guard broker is running
  • Check for any unrecoverable datafile with no backup following the nologging command
  • Check for any offline datafiles.
  • Check that controlfile autobackup is configured to ON
  • Check that control_file_record_keep_time is not set too high or too low
  • Check that archivelog deletion policy is configured
  • For RAC database, check snapshot controlfile is configured and suggest it be placed on a on a shared device
  • Check if flashback database is set to ON
  • Check that recycle bin is ON
  • When FRA is configured, check that archivelog destination is set to FRA and not directory directly
About

Let the 'Get Proactive' team help you achieve resolution faster; simplify your daily operations; reduce risks and maximize uptime through preventative maintenance.

How? By sharing our own best practices, tips, and tricks. Get on the ‘Insider's Track.’

Stay Connected

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today