FEATURED POST

After motherboard change on BDA server eth0 network interface is missing

Recently I had a BDA server that was refusing to come up. Every attempt to run a start /SYS failed. We were able to identify this as a motherboard failure, then...

Recent Posts

ASM

How to replace old Exadata storage cells with the new X7-2 storage cells, without downtime

Lately I had to help various customers replace their old storage cells with the new X7-2 ones. There are huge benefits in doing this: the X7 has 25.5 TB of flash, 12 x 10 TB disks and 192 GB of DDR4 memory.

The question my customers asked the most was: can we do the migration from our old storage to the new X7 without downtime and without risk? The answer was: YES! For doing this I prepared and implemented a procedure that covers, step by step, how to migrate critical data from production databases while these databases are online, without downtime and without risk.

So far I have done two of these migrations in 2 weeks, one in Haifa and one in Istanbul. The Haifa migration was run on a machine without customers working on it. The Istanbul migration was implemented on a critical production database with full customer load. The customer was very happy to see how the data was transferred in a fast and safe way without affecting the online business. This is the power of both Exadata and ASM: a migration that only a few years ago may have imposed a tremendous planning effort, and most probably required downtime, can now be run online without affecting the performance of critical workloads!

In summary, the steps of the migration include the following:

1. Attach the new cells to Exadata and set up the ILOM to the final IP on the customer network
2. Connect via ILOM to the new storage and set up the network to customer values
3. Upgrade the new storage servers to the latest Exadata storage version
4. Customize the new servers to reflect customer preferences: mail alerts, writeback, ASR server, etc.
5. Create celldisks and griddisks to match the existing disk groups
6. Extend the existing disk groups into the new cells and wait for the first rebalance to complete (see the sketch after this list)
7. Once the first rebalance completes, drop the failgroups from the old cells and wait for the second rebalance to complete
8. Once the second rebalance completes, flush the flash cache on the old cells, drop their griddisks and celldisks, and shut down the cells
9. Check the free space available on the new cells and increase the size of the griddisks to use all of it as required
10. On the ASM instance, resize all griddisks in the disk groups where you increased the size of the griddisks, and wait for the third rebalance to complete
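A minimal sketch of steps 6 and 7 as run from the ASM instance, assuming a disk group named DATA, a new cell reachable at 192.168.10.9 and an old cell failgroup named EXA01CEL01 (all names and the rebalance power are illustrative):

    sqlplus / as sysasm
    SQL> alter diskgroup data add disk 'o/192.168.10.9/DATA_CD_*' rebalance power 32;
    SQL> select * from gv$asm_operation;   -- wait until no rows are returned
    SQL> alter diskgroup data drop disks in failgroup exa01cel01 rebalance power 32;
    SQL> select * from gv$asm_operation;   -- again, wait for the rebalance to finish

The same drop is repeated for each old cell's failgroup; ASM keeps the disk groups fully redundant and online during both rebalances.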


Engineered Systems

How to setup passwordless ssh in Exadata using dcli

Setting up passwordless ssh root connections using dcli is fast and simple, and will make it easy to execute commands on all servers using this utility later on. In order to do that you should either have DNS resolution to all database and storage nodes, OR have them registered in /etc/hosts.

1) Create a parameter file that contains all the server names you want to reach via dcli. Typically we have a cell_group for storage cells, a dbs_group for database servers, and an all_group for both of them. The parameter files will have only the server names, in short format. For example, on an Exadata quarter rack all_group will have:

    dbnode1
    dbnode2
    cell1
    cell2
    cell3

2) As the root user, create the ssh equivalence:

    ssh-keygen -t rsa

3) Distribute the key to all servers:

    dcli -g ./all_group -l root -k -s '-o StrictHostKeyChecking=no'

4) Check:

    dcli -g all_group -l root hostname
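Once the equivalence is in place, dcli can run a command on every node in the group or push a file to all of them. A small sketch, with an illustrative script name:

    dcli -g all_group -l root 'uptime'
    dcli -g all_group -l root -f /tmp/fix.sh -d /tmp    # copy /tmp/fix.sh to /tmp on all nodes
    dcli -g all_group -l root 'sh /tmp/fix.sh'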


Engineered Systems

Bash security fix made available for Exadata

Complete information about the security fix availability should be reviewed, before applying the fix, in the MOS document: Responses to common Exadata security scan findings (Doc ID 1405320.1)

The security fix is available for download from:
http://public-yum.oracle.com/repo/OracleLinux/OL5/latest/x86_64/getPackage/bash-3.2-33.el5_11.4.x86_64.rpm

The summary installation instructions are as follows:

1) Download bash-3.2-33.el5_11.4.x86_64.rpm
2) Copy bash-3.2-33.el5_11.4.x86_64.rpm into /tmp on both database and storage nodes.
3) Remove the rpm exadata-sun-computenode-exact:

    rpm -e exadata-sun-computenode-exact

4) On compute nodes, install bash-3.2-33.el5_11.4.x86_64.rpm using this command:

    rpm -Uvh /tmp/bash-3.2-33.el5_11.4.x86_64.rpm

5) On storage nodes, install bash-3.2-33.el5_11.4.x86_64.rpm using this command:

    rpm -Uvh --nodeps /tmp/bash-3.2-33.el5_11.4.x86_64.rpm

6) Remove /tmp/bash-3.2-33.el5_11.4.x86_64.rpm from all nodes

As a side effect of applying this fix, during future upgrades on the database nodes a warning will appear informing that the "exact package" was not found and that minimal will be used instead. That is a normal and expected message and will not interfere with the upgrade.
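A quick way to verify the fix on each node, using the rpm version and the widely known Shellshock test string (shown as a sketch):

    rpm -q bash        # should report bash-3.2-33.el5_11.4
    env x='() { :;}; echo vulnerable' bash -c 'echo patched'
    # a patched bash prints only "patched"; an unpatched one prints "vulnerable" first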


HA

Exadata Parameter _AUTO_MANAGE_EXADATA_DISKS

Exadata auto disk management is controlled by the parameter _AUTO_MANAGE_EXADATA_DISKS. The default value for this parameter is TRUE.

When _AUTO_MANAGE_EXADATA_DISKS is enabled, Exadata automates the following disk operations:

- If a griddisk becomes unavailable/available, ASM will OFFLINE/ONLINE it.
- If a physicaldisk fails or its status changes to predictive failure, then for all griddisks built on it ASM will DROP FORCE the failed ones and DROP the ones with predictive failures.
- If a flashdisk's performance degrades and there are griddisks built on it, they will be DROPPED FORCE in ASM.
- If a physicaldisk is replaced, the celldisk and griddisks will be recreated and the griddisks will be automatically ADDED in ASM, provided they were automatically dropped by ASM. If you manually drop the disks, that will not happen.
- If a NORMAL, ONLINE griddisk is manually dropped, the FORCE option should not be used; otherwise the disk will be automatically added back in ASM.
- If a griddisk is inactivated, ASM will automatically OFFLINE it.
- If a griddisk is activated, ASM will automatically ONLINE it.

There are some error conditions that may require temporarily disabling _AUTO_MANAGE_EXADATA_DISKS. Details are in MOS 1408865.1 - Exadata Auto Disk Management Add disk failing and ASM Rebalance interrupted with error ORA-15074. Immediately after taking care of the problem, _AUTO_MANAGE_EXADATA_DISKS should be set back to its default value of TRUE.

Full details on the auto disk management feature in Exadata are in Doc ID 1484274.1.
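A minimal sketch of disabling and re-enabling the parameter on the ASM instance; as with any underscore parameter, this should only be done when a MOS note or Oracle Support instructs it:

    sqlplus / as sysasm
    SQL> alter system set "_AUTO_MANAGE_EXADATA_DISKS"=FALSE scope=both;
    -- ... resolve the error condition ...
    SQL> alter system set "_AUTO_MANAGE_EXADATA_DISKS"=TRUE scope=both;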


Oracle Engineered Systems, Amazing Exalogic

Sometimes I have heard that Exalogic is just a bunch of servers connected using InfiniBand, something that you could easily build yourself at a lower cost. That comment completely misses two things: 1) the idea behind an Oracle engineered system, and the backing that Oracle provides for it; 2) what Exalogic actually is. This amazing 5 minute presentation explains Exalogic's potential: Oracle Exalogic Elastic Cloud


RAC

RAC 11.2.0.2 redundant interconnect and the Cluster Health Monitor

There are 2 interesting new features in RAC 11.2.0.2. The first is the cluster HAIP resource, which makes it possible to have up to 4 redundant interconnects that are automatically managed by the cluster for failover and load balancing. The second one is the Cluster Health Monitor. It was previously available as a utility that you could download and install; now it is a resource in the cluster and will start to collect valuable OS statistics from the moment the cluster is installed. You can see details about both features in this file: HAIP and CHM 11.2.0.2 RAC Features
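A short sketch of how to see both features on a running 11.2.0.2 cluster (resource names as I have seen them; verify on your own system):

    oifcfg getif                 # lists the interfaces registered for public and cluster_interconnect use
    crsctl stat res -t -init     # shows ora.cluster_interconnect.haip (HAIP) and ora.crf (CHM)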


Linux

Configuring FTP on Exadata

Exadata is installed with the minimum set of rpm's required to make it work as a database server. In many cases you will need to install by yourself the rpms required to enable some specific functions, like FTP.

Exadata is installed with either Oracle Enterprise Linux or Solaris Express. These instructions match the Linux distribution, and can be used on any RH compatible Linux, not only OEL on Exadata. You can find the rpm's on the Oracle Enterprise Linux distribution disk, downloadable from edelivery.oracle.com

Install the following rpms:

    [root@exand02 rpms]# ls
    ftp-0.17-35.el5.x86_64.rpm    pam-rpms    vsftpd-2.0.5-16.el5_4.1.x86_64.rpm
    lftp-3.7.11-4.el5.x86_64.rpm  tftp-server-0.49-2.0.1.x86_64.rpm

The command to install:

    [root@exand02 rpms]# rpm -Uivh vsftpd-2.0.5-16.el5_4.1.x86_64.rpm ftp-0.17-35.el5.x86_64.rpm lftp-3.7.11-4.el5.x86_64.rpm

Start the vsftpd service:

    [root@exand02 rpms]# service vsftpd start
    Starting vsftpd for vsftpd: [ OK ]
    [root@exand02 rpms]# service vsftpd status
    vsftpd (pid 9274) is running...

Configure automatic vsftpd start:

    [root@exand02 rpms]# chkconfig vsftpd on
    [root@exand02 rpms]# chkconfig --list | grep vsftpd
    vsftpd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
    echo "service vsftpd start" >> /etc/rc.local
    [root@exand02 rpms]# tail -2 /etc/rc.local
    ########### END DO NOT REMOVE Added by Oracle Exadata ###########
    service vsftpd start

Edit /etc/vsftpd.conf and set the following parameters:

    #anonymous_enable=YES (changed to NO to allow Exadata users to ftp)
    anonymous_enable=NO
    #userlist_enable=YES (changed to NO to allow Exadata users to ftp)
    userlist_enable=NO

Test:

    [root@exand02 vsftpd]# ftp exand02
    Connected to exand02 (10.25.104.130).
    220 (vsFTPd 2.0.5)
    Name (exand02:root): oracle
    331 Please specify the password.
    Password:
    230 Login successful.
    Remote system type is UNIX.
    Using binary mode to transfer files.
    ftp> pwd
    257 "/home/oracle"
    ftp> ls
    227 Entering Passive Mode (10,25,104,130,85,192)
    150 Here comes the directory listing.
    drwxr-xr-x 3 1001 500 4096 May 20 19:47 local
    drwxr----- 3 1001 500 4096 May 03 12:20 oradiag_oracle
    -rw-r--r-- 1 1001 500 1020 Jun 01 14:41 ~oraclec
    226 Directory send OK.
    ftp> bye
    221 Goodbye.


Data Guard

11.2.0.2 Creating a Standby or a Clone Database using Rman Duplicate From Active Database

There are a few things on 11.2.0.2 that you need to take into account to create a standby database or a clone database from an active database using the Duplicate Database command. Points 2, 9 and 10 of this document contain the details I found I needed to change in order to get the clone, or the duplicate for standby, running smoothly on this release.

The steps required to complete the task are:

1) Create a parameter file for the clone or standby.
2) Add the following 3 parameters to the standby or clone database parameter file, even if the paths are identical on both the source and standby or clone servers:

    *._compression_compatibility='11.2.0'
    *.db_file_name_convert=('trg_path','aux_path'…)
    *.log_file_name_convert=('trg_path','aux_path'…)

3) Copy the password file from the source database to the clone or standby Oracle_Home/database folder, or $ORACLE_HOME/dbs on Linux/Unix.
4) Configure the network on the clone or standby server so that you can connect to the database.
5) Startup nomount the clone or standby database.
6) Add a tnsnames.ora entry for the clone or standby on your source database server.
7) Check that you can connect to the clone or standby database as sysdba.
8) On the primary server, connect to the source database and to the clone or standby database (auxiliary) using RMAN:

    > rman target / auxiliary sys/@prodcl
    Recovery Manager: Release 11.2.0.2.0 - Production on Sun Jul 24 15:49:25 2011
    Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
    connected to target database: PROD (DBID=8114723455)
    connected to auxiliary database: PRODCL (not mounted)

9) Use the following syntax to duplicate to a clone (use as many channels as the CPUs you can dedicate):

    run {
    allocate channel c1 type disk;
    allocate auxiliary channel cr1 type disk;
    duplicate target database to 'prodcl' from active database nofilenamecheck;
    }

10) Use the following syntax to duplicate for standby (use as many channels as the CPUs you can dedicate):

    run {
    allocate channel c1 type disk;
    allocate auxiliary channel cr1 type disk;
    duplicate target database for standby from active database nofilenamecheck;
    }


RAC

RAC, ASM and Linux Forum, May 18, 2011: EXADATA Production Customers updates

Exadata is changing the world of database performance; at this forum we will have two Exadata production customer updates.

Turkcell, a telecom with 75 million customers, will be represented by Ferhat Sengonul, Senior OS Administrator, DBA, and Data Warehouse Project Leader, who led the Exadata implementation and its scale-out to their current 3 full Exadata racks with 24 database nodes. Ferhat will present his experience with a very large data warehouse on Exadata, including online high performance reporting, VLDB backup and recovery best practices, and upgrading from a traditional 11-rack data warehouse (1 Sun M9000 Sparc plus 10 storage racks, 250 TB uncompressed) to a full rack and then to multiple racks. We will also hear about his project to consolidate all data warehouse databases on Exadata.

Golden Pages, the first consolidation-on-Exadata project implemented in Israel, will be presented by Shimi Nahum, Senior DBA and Exadata Project Leader. Shimi will tell us about the challenges the Exadata environment presented to him as a DBA and how he faced them, and the impact of using Oracle Exadata to consolidate multiple customer databases, including Siebel and ERP databases.

A practical dive into the technology will be presented by Oracle's Ofir Manor, who is responsible for the several POCs being run by different Israeli customers. And finally I will talk about experiences from the field, installing and implementing Exadata at different customers around the world.

Exadata is radically changing the rules and the expectations a DBA can have of an Oracle database; these first-hand experiences promise to make this one of the most interesting conferences in Israel this year.

The conference will be held on May 18 at the Sharon Conference Center, starting at 14:00.

REGISTRATION: ILOUG RAC, ASM and Linux Forum Registration

SCHEDULE:
14:00 – 14:30 Registration
14:30 – 14:40 Welcome
14:40 – 15:25 Shimi Nahum, Dapei Zahab, Senior Oracle DBA, responsible for the Exadata project. The first production Oracle Exadata in Israel: challenges for the DBA, and the speedup impact of Exadata on the end customer.
15:25 – 16:15 Ferhat Sengönül, Senior OS administrator and DBA, Turkcell, responsible for the DW project. A very large data warehouse on Exadata: the migration process, backup and recovery strategies, scaling up from 1 Exadata rack to 3.
16:15 – 16:30 Refreshments break
16:30 – 17:15 Ofir Manor, Oracle Senior Sales Consultant and Exadata expert. Preparing the IT infrastructure for Exadata; lifetime maintenance procedures.
17:15 – 17:45 Alejandro Vargas, Oracle Principal Support Consultant and EMEA Exadata Core Team member. Inside the Oracle Database Machine: secrets about the configuration, install and support procedures.
17:45 – 18:15 Questions and answers


Linux

How to Install the Oracle-Validated rpm Using a Local Repository

One of the steps required to install Oracle on Linux is to install all the Linux packages (rpm's) needed by Oracle, together with their dependencies. That is followed by creating the Oracle user account and groups, and setting up the kernel parameters. All of these tasks can be automated by installing a single rpm distributed by Oracle: the oracle-validated rpm.

The Oracle Enterprise Linux 5.5 distribution disk includes the oracle-validated rpm and also the ASMLib related rpms. The rpm can be installed as part of the Linux install process, as explained in Sergio Leunissen's post from 2009. Another option, if you have your server connected to the Internet, is to run the install using yum; it will install the oracle-validated rpm and download all required dependencies. Yet another option, if you don't have access to the Internet, is to set up a local or NFS-mounted repository that contains all the rpm's included on the Linux distribution disk.

In this post I'm including an example of the steps required to set up a local rpm repository and install the oracle-validated rpm and its dependencies from it: How to Install The Oracle-Validated rpm From a Local Repository
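A minimal sketch of such a local repository, assuming the OEL 5.5 disk contents were copied under /media/oel55 (paths and repo id are illustrative):

    createrepo /media/oel55/Server

    cat > /etc/yum.repos.d/local.repo <<'EOF'
    [local-oel55]
    name=Local OEL 5.5
    baseurl=file:///media/oel55/Server
    enabled=1
    gpgcheck=0
    EOF

    yum install oracle-validated    # resolves and installs all dependencies from the local repo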


Using 11g RMAN Duplicate Command to Create a Physical Standby Database Over the Network

This post contains a quick, step-by-step walk-through of the procedure for creating a physical standby database using the Rman Duplicate command, without using any backup. Setting up a physical standby database is a simple operation when the required infrastructure is ready. We need 2 servers, a network that connects them, and storage attached to the servers proportional to the database size, plus extra space for archive logs and backups. The Oracle Home is installed on both servers at the same patch level; you may also use cloning to install the RDBMS home on the standby server. You can find the details of the test in this document: how to create a physical standby database using Rman Duplicate command.pdf


Oracle RDBMS Home Install Using Cloning

Using a standard Oracle Home that is updated to the latest patches as the source to install new Oracle Homes can save a lot of time, compared to installing the same Oracle Home + patches from scratch. The procedure to clone an Oracle Home is simple and is well documented in a set of My Oracle Support documents that can be found, ordered by release, in Document 1154613.1. In this post I'm providing a step by step example of cloning an 11g R2 home: How to clone a 11g R2 Oracle Home. This is a nice solution to have if you need to do multiple installs on many servers: you do one install + patches, then move that copy over to all the other servers.
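A minimal sketch of the flow, with illustrative 11.2 paths; the MOS documents above are the authoritative reference:

    # on the source server, archive the patched home
    cd /u01/app/oracle/product/11.2.0
    tar czf /tmp/dbhome_1.tgz dbhome_1
    # copy the archive to the target server, extract it in the same location,
    # then register and relink it with the clone script:
    perl /u01/app/oracle/product/11.2.0/dbhome_1/clone/bin/clone.pl \
         ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1 \
         ORACLE_HOME_NAME=OraDb11g_home1 \
         ORACLE_BASE=/u01/app/oracle
    # run root.sh when prompted, as in a regular install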


HA

Data Guard for Manual Failover, Step by Step

In this post I'm showing the steps used to implement a manual failover scenario. My customer did not want to enable fast-start failover, preferring to leave the decision to fail over, in case of a major crash, to the management team. In the example I'm providing here I configured flashback database with a one hour retention time, so that the OS team has this time to solve any issues on the primary. If they manage to solve the problem within this time, the old primary can easily be reinstated as the new standby; otherwise it will need to be recreated from a backup taken from the new primary. All the details of this experience can be found in this document: "Step by Step Configuration of a Physical Standby Database for Manual Failover"
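A minimal sketch of the two pieces involved, assuming a physical standby managed without the broker (standard 11g statements; the document above is the full reference):

    -- on the databases, flashback with a one-hour retention window:
    alter system set db_flashback_retention_target=60 scope=both;
    alter database flashback on;

    -- on the standby, the manual failover once management gives the go:
    alter database recover managed standby database finish;
    alter database commit to switchover to primary with session shutdown;
    alter database open;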


How to Calculate TCP Socket Buffer Sizes for Data Guard Environments

The MAA best practices contain an example of how to calculate the optimal TCP socket buffer sizes, something that is quite important for very busy Data Guard environments. This document, Formula to Calculate TCP Socket Buffer Sizes.pdf, contains an example of applying the instructions provided in the best practices document. In order to execute the calculation you need to know the bandwidth of your network interface, usually 1Gb (in my example it is a 10Gb network), and the round trip time (RTT), which is the time it takes a packet to travel to the other end of the network and come back; in my example it was provided by the network administrator and was 3 ms (milliseconds).
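A worked version of that calculation with those numbers, using the usual bandwidth-delay product (BDP) approach from the MAA paper:

    BDP = bandwidth x RTT
        = 10,000,000,000 bits/s x 0.003 s
        = 30,000,000 bits = 3,750,000 bytes (about 3.6 MB)

    socket buffer size = 3 x BDP = 11,250,000 bytes (about 10.7 MB)

The factor of 3 is the MAA recommendation for sizing the send and receive buffers; with a 1Gb interface and the same 3 ms RTT the result would be ten times smaller, about 1,125,000 bytes.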


HA

Impact of Truncate or Drop Table When Flashback Database is Enabled

Recently I was working on a VLDB implementing a disaster recovery environment configured with a Data Guard physical standby and fast-start failover. One of the questions that came up was about the overhead of truncating and dropping tables. There are daily jobs on the database that truncate extremely large partitions, and as note 565535.1 explains, we knew there is an overhead for these operations. But the information in the note was not clear enough; with the additional information I got from senior Oracle colleagues I compiled this document, "Impact of Truncate or Drop Table When Flashback Database is Enabled", which further explains the case.


HA

Oracle Database 11g Underground Advice for Database Administrators, by April C. Sims

Recently I received a request to review the book "Oracle Database 11g Underground Advice for Database Administrators" by April C. Sims. I was happy to have the opportunity to learn some details about the author: she is an active contributor to the Oracle DBA community through her blog "Oracle High Availability". The book is a serious and interesting work; I think it provides a good study and reference guide for DBAs who want to understand and implement highly available environments. She starts by walking through the more general aspects and skills required of a DBA, and then goes on to explain the steps required to implement Data Guard, use RMAN, upgrade to 11g, etc.


ASM

The Next RAC, ASM and Linux Forum. May 4, 2010 Beit HP Raanana

The next RAC, ASM and Linux forum will take place next week; you are still in time to register: Israel Oracle Users Group RAC, ASM and Linux Forum.

This time we will have a panel formed by Principal Oracle Advanced Customer Services engineers and RAC experts Galit Elad and Nickita Chernovski, and Senior Oracle Advanced Customer Services engineers and RAC experts Roy Burstein and Dorit Noga. They will address the subject: 5 years of experience with RAC at Israeli customers, lessons learned. It is a wonderful opportunity to meet the people who are present at most major implementations and have helped solve all the major issues over the last years.

In addition we will have 2 most interesting customer presentations. Visa Cal DBA Team Leader Harel Safra will talk about their experience with scalability using standard Linux servers for their mission critical data warehouse. Bank Discount Infrastructure DBA Uril Levin, who is in charge of the bank's backup and recovery project, will speak about their corporate backup solution using RMAN, which includes an end to end solution for VLDBs and mission critical databases - one of the most interesting RMAN implementations in Israel.

This time I will not be able to attend myself as I'm abroad on business; Galit Elad will greet you and will lead the meeting. I'm sure you will enjoy a very, very interesting meeting.

Beit HP is located at 9 Dafna Street, Raanana.

Best Regards, Alejandro


RMAN

Cloning A Database On The Same Server Using Rman Duplicate From Active Database

To clone a database using Rman we used to require an existing Rman backup; on 11g we can clone databases using the "from active database" option. In this case we do not require an existing backup: the active datafiles are used as the source for the clone. In order to clone with the source database open, it must be in archivelog mode. Otherwise we can make the clone mounting the source database, as shown in this example.

These are the steps required to complete the clone (see the sketch after this list):

1. Configure the network
2. Create a password file for the new database
3. Create an init.ora for the new database
4. Create the admin directory for the new database
5. Shutdown and startup mount the source database
6. Startup nomount the new database
7. Connect to the target (source) and auxiliary (new clone) databases using Rman
8. Execute the Duplicate command
9. Remove the old pfile
10. Check the new database

A step by step example is provided in this file: rman-duplicate-from-active-database.pdf
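A minimal sketch of steps 7 and 8, assuming the clone is named prodcl and its network entry is already in place (names illustrative):

    rman target / auxiliary sys/password@prodcl
    RMAN> duplicate target database to 'prodcl' from active database nofilenamecheck;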


Compression for tables with more than 250 columns

Tables with more than 255 columns are not supported for compression; this restriction remains in place even in 11g R2. In the 11g R2 SQL Language Reference manual, page 16-36, we can read:

    Restrictions on Table Compression
    * COMPRESS FOR OLTP and COMPRESS BASIC are not supported for tables with more than 255 columns.

This is a serious limitation, especially for telecoms, where CDR tables can have a number of columns way over 255.

The available workaround is to split the table into 2 sub-tables, each with fewer than 255 columns, joined by the primary key, and to access them through a view that has all the columns of the original table:

    create table a as select pk, field1 ... field150 from origtable;
    create table b as select pk, field151 ... field300 from origtable;

    create view origtable as
      select a.pk, a.field1 ... a.field150, b.field151 ... b.field300
      from a, b
      where a.pk = b.pk;
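A concrete sketch of the same workaround with illustrative names: a 300-column CDR table split into two OLTP-compressed halves behind a view:

    create table cdr_a compress for oltp as
      select call_id, col001, /* ... */ col150 from cdr;
    create table cdr_b compress for oltp as
      select call_id, col151, /* ... */ col300 from cdr;

    drop table cdr;

    create view cdr as
      select a.call_id, a.col001, /* ... */ a.col150,
             b.col151, /* ... */ b.col300
      from cdr_a a, cdr_b b
      where a.call_id = b.call_id;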


ASM

RAC, ASM and Linux Forum, December 15, 13:30 - 18:00 Beit HP Raanana

It's time for our 2nd 2009 RAC, ASM and Linux Forum in Tel Aviv! I would like to warmly invite you to our next RAC, ASM and Linux forum, to be held at Beit HP in Raanana on December 15. You can register on the Israel Oracle User Group site.

At this forum we will have an 11g R2 technology update and 2 very interesting customer presentations, about a minimum-downtime VLDB migration to RAC on Linux and about creating and managing RPMs for Oracle silent install on Linux.

Panel on Grid Infrastructure, ASM, Clusterware and RAC 11g R2, Technology Update
Annie Flint, Principal Member of Technical Staff, Oracle Corporation
Ofir Manor, Senior Sales Consultant, Oracle Israel
Alejandro Vargas, Principal Support Consultant, Oracle Advanced Customer Services

In the few months since the last forum in June many things have happened: 11g Release 2 is already production and brought a revolution in terms of performance and deep changes to the world of ASM, Oracle Clusterware and RAC. Exadata Release 2 was released, opening the way for OLTP databases based on the new Oracle-Sun Database Machine. In this seminar we will review the new face of ASM, Oracle Clusterware and RAC in 11g Release 2, and we will comment on some of the incredible performance enhancements of the new version.

Migration of a VLDB to RAC 11g with Minimal Downtime
Dotan Mor, Senior DBA, Pelephone DBA Team

Dotan will tell us the whole story of migrating an 8TB data warehouse, with near 0 downtime, from Linux 3 with OCFS2 to Linux 5 with CRS, RAC and ASM 11g, and InfiniBand; and how his customer still cannot believe the incredible improvements they got in performance. He will also tell us about the several problems faced on the way to this big success and how all of them were overcome.

Packaging Application and Database Together On Linux for Super-Silent-Installs
Liron Amitzi, VP Professional Services, Brillix

Liron will explain how to build a Linux RPM that contains the whole set of files required to quickly and easily deploy a complete application, from the database to the last application executable.

See you there!
Best Regards, Alejandro


Check IO Scripts

These scripts are very useful to check throughput. The original version can be found in My Oracle Support Note 422414.1 by Luca Canali.

    set lines 250 pages 50000
    alter session set nls_date_format='dd-mm-yyyy hh24:mi';
    col Phys_Read_Total_Bps for 999999999999
    col Phys_Write_Total_Bps for 999999999999
    col Redo_Bytes_per_sec for 999999999999
    col Phys_Read_IOPS for 999999999999
    col Phys_write_IOPS for 999999999999
    col Phys_redo_IOPS for 999999999999
    col OS_LOad for 999999999999
    col DB_CPU_Usage_per_sec for 999999999999
    col Host_CPU_util for 999999999999
    col Network_bytes_per_sec for 999999999999
    col Phys_IO_Tot_MBps for 999999999999
    col Phys_IOPS_Tot for 999999999999

    spool io_max_checkup.log
    select min(begin_time), max(end_time),
           sum(case metric_name when 'Physical Read Total Bytes Per Sec' then maxval end) Phys_Read_Tot_Bps,
           sum(case metric_name when 'Physical Write Total Bytes Per Sec' then maxval end) Phys_Write_Tot_Bps,
           sum(case metric_name when 'Redo Generated Per Sec' then maxval end) Redo_Bytes_per_sec,
           sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then maxval end) Phys_Read_IOPS,
           sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then maxval end) Phys_write_IOPS,
           sum(case metric_name when 'Redo Writes Per Sec' then maxval end) Phys_redo_IOPS,
           sum(case metric_name when 'Current OS Load' then maxval end) OS_LOad,
           sum(case metric_name when 'CPU Usage Per Sec' then maxval end) DB_CPU_Usage_per_sec,
           sum(case metric_name when 'Host CPU Utilization (%)' then maxval end) Host_CPU_util, --NOTE 100% = 1 loaded RAC node
           sum(case metric_name when 'Network Traffic Volume Per Sec' then maxval end) Network_bytes_per_sec,
           snap_id
    from dba_hist_sysmetric_summary
    group by snap_id
    order by snap_id;
    spool off

    spool io_maxtot_summary.log
    select min(begin_time), max(end_time),
           sum(case metric_name when 'Physical Read Total Bytes Per Sec' then maxval end)/1024/1024 +
           sum(case metric_name when 'Physical Write Total Bytes Per Sec' then maxval end)/1024/1024 +
           sum(case metric_name when 'Redo Generated Per Sec' then maxval end)/1024/1024 Phys_IO_Tot_MBps,
           sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then maxval end) +
           sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then maxval end) +
           sum(case metric_name when 'Redo Writes Per Sec' then maxval end) Phys_IOPS_Tot,
           sum(case metric_name when 'Current OS Load' then maxval end) OS_LOad,
           sum(case metric_name when 'CPU Usage Per Sec' then maxval end) DB_CPU_Usage_per_sec,
           sum(case metric_name when 'Host CPU Utilization (%)' then maxval end) Host_CPU_util, --NOTE 100% = 1 loaded RAC node
           sum(case metric_name when 'Network Traffic Volume Per Sec' then maxval end) Network_bytes_per_sec,
           snap_id
    from dba_hist_sysmetric_summary
    group by snap_id
    order by snap_id;
    spool off


ASM Hands-On Training, Server Environment Setup And Aliases

The server where the labs of the ASM Hands-On Training run is configured with tcsh as the default oracle user shell. A .cshrc file containing several aliases was set up to ease moving around and executing repetitive commands. The document gives a general explanation of how the environment can be used. Details in this file: Server Environment Setup And Aliases


ASM

ASM Hands-On Training, Lab 21, ASMCMD Use And Options

In this lab we will review some of the useful commands provided by the ASM command line utility. Some of the asmcmd commands display information (this information is based on the v$asm views); other commands actually make changes to the structure, like mkdir or rm. Details in the file: ASMCMD Use And Options
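A few illustrative examples of both kinds of commands (disk group and path names are hypothetical):

    asmcmd lsdg                       # disk group state and capacity, from v$asm_diskgroup
    asmcmd ls +DATA/ORCL/DATAFILE     # browse files under a directory
    asmcmd du +DATA/ORCL              # space used under a directory
    asmcmd mkdir +DATA/ORCL/dumps     # structural change: create a directory
    asmcmd rm +DATA/ORCL/dumps        # structural change: remove it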


ASM

ASM Hands-On Training, Lab 20, Storage Consolidation With ASM

In this lab we will share our ASM disks with a second server and open the sati12 database on it. To do that we copied the VM to another location without including the ASM disks, which are located in a separate folder; this way the second VM points to the same ASM disks as the first one. Details in the file: Storage Consolidation With ASM


ASM

ASM Hands-On Training, Lab 18, ASM Metadata and Other Checkups

In this lab we will review various utilities that provide further information for managing ASM: checksum mismatch after a storage crash; AMDU, an 11g tool that can be used with 10g as well; block dumps; asmiostat; and asmdebug. Details in this file: ASM Metadata and Other Checkups


ASM Hands-On Training, Lab 17, Measuring Throughput

In this lab we will review simple methods for getting information about database throughput and response time. We will produce the load using Swingbench and we will gather AWR snapshots every 10 minutes. After some time we will be able to check throughput statistics based on the AWR snapshot information. Details in this document: Measuring Throughput

Another useful script for checking IO:

    set pages 50000 lines 250
    alter session set nls_date_format='dd-mm-yy hh24:mi';

    spool chkio1-Physical-rw.log
    select min(begin_time), max(end_time),
           sum(case metric_name when 'Physical Read Total Bytes Per Sec' then average end) Physical_Read_Total_Bps,
           sum(case metric_name when 'Physical Write Total Bytes Per Sec' then average end) Physical_Write_Total_Bps,
           snap_id
    from dba_hist_sysmetric_summary
    group by snap_id
    order by snap_id;
    spool off

    select min(begin_time), max(end_time),
           sum(case metric_name when 'Physical Read Total Bytes Per Sec' then maxval end) Physical_Read_Total_Bps,
           sum(case metric_name when 'Physical Write Total Bytes Per Sec' then maxval end) Physical_Write_Total_Bps,
           sum(case metric_name when 'Physical Read Total Bytes Per Sec' then maxval end) +
           sum(case metric_name when 'Physical Write Total Bytes Per Sec' then maxval end) Total_IO,
           snap_id
    from dba_hist_sysmetric_summary
    group by snap_id
    order by snap_id
    /


ASM Hands-On Training, Lab 16, Configuring And Running Swingbench And OSWatcher

Whenever implementing new environments it is convenient to run stress tests and to gather OS statistics that will provide valuable diagnostic information. Swingbench provides an easily configurable tool to run OLTP or DSS-like stress tests, which is very useful when the customer has no other way to benchmark the new configuration. OSWatcher collects statistics from the OS over a cyclic 2-day period by default, which can be extended to whatever period is convenient for you. Details in this file: Lab16-Configure-SwingBench-and-OsWatcher.pdf
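A sketch of starting OSWatcher with a longer retention window, assuming the OSWatcher Black Box scripts (older versions name the script startOSW.sh); the arguments are the sample interval in seconds and the hours of data to keep:

    cd oswbb
    nohup ./startOSWbb.sh 30 72 &     # 30-second samples, kept for 72 hours
    ./stopOSWbb.sh                    # to stop the collection later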


ASM

ASM Hands-On Training, Lab 15, Cloning A Disk Group

Sometimes it is necessary to have several clones of the same database running on the same server. That was simple to do when the database was based on a file system, and it is still simple to do with the help of Rman. But there are very large databases, usually cloned using storage tools like BCV, SRDF or snapshots, for which Rman is not a viable possibility. For these databases it is possible to implement the procedure we will test in this lab.

Note that the rename disk group script has not been made public. The rename disk group functionality is available in 11g R2, which was made available to the general public in August 2009.

These are the steps required to complete the process:

1. Prepare a set of storage devices with enough space to hold a copy of the disk group to be copied
2. Shutdown the source database
3. Make a bit-by-bit copy of the source database on the target devices
4. Add the kfed path to the path of user root
5. Execute the rename diskgroup script
6. Rename the ASM disks of the cloned devices
7. Rescan the ASM disks
8. Start the ASM instance and mount the new disk group
9. Recreate the controlfile and rename the database based on the cloned disk group
10. Open the clone with the resetlogs option

Details in this file: Cloning Disk Group


ASM Hands-On Training, Lab 14, 11g Compatibility Parameters and Resilience Test

In this lab we will review the 11g compatibility parameters and run 3 resilience tests. The compatibility parameters compatible.asm and compatible.rdbms define the minimum ASM and database versions that will be able to connect to a disk group. These parameters can only be advanced, never lowered.

In the resilience tests we will:

- Add ASM disks and bring down the ASM instance in the middle of the operation
- Remove ASM disks and bring down the ASM instance in the middle of the operation
- Destroy the ASM metadata and restore the disk

Details in this file: 11g Compatibility Parameters and Resilience Test
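A minimal sketch of checking and advancing both values on a disk group named DATA (name illustrative):

    select name, compatibility, database_compatibility from v$asm_diskgroup;

    alter diskgroup data set attribute 'compatible.asm'   = '11.1.0.0.0';
    alter diskgroup data set attribute 'compatible.rdbms' = '11.1.0.0.0';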


ASM

ASM Hands-On Training, Lab 13, ASM And Rman: Crash, Restore And Recovery Scenarios

In this lab we will review the following crash and recovery scenarios, and we will implement some of them:

- System tablespace loss
- Recover system tablespace
- User datafile loss
- Recover users tablespace
- Online redo loss
- Recover from redo loss
- Controlfile loss
- Recover from controlfile loss
- Database loss
- Recover from total loss

Details in this file: ASM and Rman Crash Restore and Recovery Scenarios


ASM

ASM Hands-On Training, Labs 11 and 12 : Restoring A Lost Database From Backup And Cleanup File Based ASM Diskgroups

To work on the 11g part of the labs we will start by restoring an 11g database we have in a backup. The backup consists of an incremental level 0 backupset, a controlfile and spfile backup, and the archived logs required to recover the database.

The restore is done in 6 steps (see the sketch after this list):

- Re-create the ASM metadata
- Restore the spfile
- Restore the controlfile
- Restore the datafiles
- Recover the database
- Open resetlogs

This is covered in this file: Restoring a Lost Database From Backup

In the previous labs we created ASM disks based on OS files; before continuing with the next labs we need to clear them. This is explained in this file: Cleanup File based ASM Diskgroups
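A minimal Rman sketch of those 6 steps once the disk groups exist again (backup piece paths are illustrative):

    rman target /
    RMAN> startup nomount force;
    RMAN> restore spfile from '/backup/spfile_orcl.bkp';
    RMAN> startup nomount force;
    RMAN> restore controlfile from '/backup/ctl_orcl.bkp';
    RMAN> alter database mount;
    RMAN> restore database;
    RMAN> recover database;
    RMAN> alter database open resetlogs;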


ASM Hands-On Training, Lab 10, Migrating From 10g ASM To 11g

In this lab we will migrate our 10g disk groups to 11g. The first steps of the migration, installing 11.1.0.6 in its own Oracle Home and applying the 11.1.0.7 patch, are already done in this environment, so we can go ahead with the migration process. In this lab we will migrate only ASM; the database migration is a little too long for the time frame we have, so we will continue by moving to ASM an 11g database that already exists on this server. Details in this document: Migrating 10g ASM to 11g


ASM Hands-On Training, Lab 9, Migrating Raw And Block Devices To ASMLib

ASMLib is the best practice for managing ASM disks on Linux. It may happen that a database server configured a couple of years ago with ASM 10g R1 is still using raw devices, or that a new ASM 10g R2 or even 11g was configured using block devices instead of ASMLib. ASMLib provides a global open/close call that improves performance, and it transparently implements device labeling, an important aspect when managing storage on Linux. This lab explains the procedure to migrate these devices to ASMLib. Details in this document: Migrating Raw and Block Devices to ASMLib


ASM Hands-On Training, Lab 8, Configure XDB for ASM Access Through FTP and HTML

The use of FTP and HTML adds flexibility to managing the ASM environment; here is a brief description of how to implement this configuration. XDB enables the use of the FTP and HTML protocols to access and manage files located in ASM disk groups. Files can easily be browsed or moved in and out of ASM this way. This option is available only when the database is fully available. Details in this document: Configure XDB for ASM Access Through FTP and HTML


ASM

ASM Hands-On Training, Lab 7, 10g Normal Redundancy, Resilience Test

For databases requiring normal redundancy I definitely recommend 11g instead of 10g, because of the fast mirror resync new feature, which makes it possible to reinstate a failed disk without rebuilding it, as is required on 10g. A later lab in this hands-on series moves the database to 11g to test that option; in this lab we still work with 10g. Normal redundancy on 10g provides protection against disk failures, but it does require the failed disk to be reconfigured after the failure. In this lab we will simulate a disk failure and then restore the failed disk. Details in this document: Normal Redundancy Resilience Test 10g


RAC

ASM Hands-On Training, Lab 6, Looking into PST Metadata

This lab looks into the ASM disk header section that contains the Partnership and Status Table, the PST. I have found on some limited occasions, especially following storage crashes that result in problems mounting the ASM disk groups, that looking into the PST may help to clarify the extent of the damage and assess whether we need to go to a backup or not. The lab can be found in this document: Looking into PST Metadata. Note that when reading the metadata to see the PST you must add the parameter "aunum = 1" to the kfed command; this is written in the guide but is not clear enough.


ASM Hands-On Training, Lab 5, Using RMAN To Migrate a Database Into ASM

One of the ways to migrate a database to ASM is to use the Rman "Backup as Copy" command to create a database copy on ASM storage and then switch the database to the copy. This technique can be used, combined with incremental backups, to move even very large databases into ASM: the first backup may take a long time, then one or more incremental backups can be used to update the first backup and close the gap, until a downtime window is obtained to switch the database on the file system to the backup in ASM. This lab can be downloaded here: Using RMAN To Migrate a Database Into ASM

Lab topics:

- Summary
- Migrating a Database Into ASM
- Backup Database Into ASM
- Spfile Backup into ASM
- Consistent database shutdown
- Prepare Pfile for the ASM Database
- Start the database in NOMOUNT mode
- Change Parameters on Spfile to point to ASM
- Move the controlfiles into ASM
- Switch the Database from File System to ASM
- Recover The Database
- Migrate the Temporary Datafiles to ASM
- Move Flashback logs into flash recovery Area
- Move RMAN Change Tracking File Into ASM
- Remove the File System Old Files
- Remove the Old Spfile from Filesystem
- Scripts
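A minimal sketch of the core Rman commands behind that flow, assuming a disk group named +DATA (the lab document is the full reference):

    rman target /
    RMAN> backup as copy database format '+DATA';
    RMAN> backup as copy current controlfile format '+DATA';
    # after pointing the spfile at ASM and mounting the database:
    RMAN> switch database to copy;
    RMAN> recover database;
    RMAN> alter database open;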


ASM Hands-On Training, Lab 4, Install, Configure and Run ORION

Oracle Input Output Numbers, ORION, provides a simple way to check storage performance before the Oracle database is created. That gives us a clean storage performance report that can later be compared with the AWR IO statistics once the database is running. When problems caused by poor performance at the storage level are detected at a late implementation stage, the number of variables involved may make finding the bottleneck difficult, and the time to find a solution scarce. Orion can simulate the type of IOs an OLTP or DSS system does and provide detailed response times in terms of IOPS and MBPS. The lab can be downloaded here: Lab 4, Install, Configure and Run ORION

Lab index:

- Download
- Install
- Running a Simple Test
- Orion Generated Reports
- Running an advanced test
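A minimal sketch of a simple run; Orion reads the devices to exercise from a <testname>.lun file (device names illustrative, and write tests must never point at disks holding data):

    cat > mytest.lun <<'EOF'
    /dev/sdc
    /dev/sdd
    EOF

    ./orion -run simple -testname mytest -num_disks 2
    # results are written to mytest_* output files (IOPS, MBPS and latency)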


ASM

ASM Hands-On Training, Lab 3, Creating The ASM Instance And Managing ASM Disk Groups

In this lab we create the ASM instance in a single instance environment, and create ASM disk groups in several different ways using different kinds of devices. Note that these exercises provide some degree of confidence in managing devices but are valid only for training. In a production environment we will mostly use high end SAN or NAS storage, usually connected by fiber channel, and we will provision LUNs for ASM. The lab can be accessed here: Lab3-Create-ASM-Instance-and-Managing-Diskgroups.pdf

Lab index:

- Create The ASM Instance
- Create the Cluster Synchronization Services Daemon
- Configure and Start a Listener
- Sizing the ASM Shared Pool
- Setup a minimal ASM Pfile
- Disk Discovery
- Discovering Raw Devices
- Discovering Block Devices
- Discovering ASMLib Devices
- Creating Disk Groups Using Loop Devices
- KFED
- Build the Kfed Utility
- Kfed Help
- Check Disk Headers Using Kfed
- Creating Normal Redundancy Disk Groups Using Block Devices
- Comparing External And Normal Redundancy Disks Using Kfed
- Using File System Based Files As ASM Disks
- Creating a diskgroup on ASM based on files
- Check the diskgroups created on this Lab
- Scripts
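For reference, a sketch of the kind of minimal ASM pfile the lab builds (all values illustrative):

    instance_type=asm
    asm_diskstring='/dev/oracleasm/disks/*'
    asm_diskgroups=DATA,RECO
    asm_power_limit=1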


ASM Hands-On Training, Lab 2, Install and Configure ASMLib

ASMLib provides an easy way to present storage to ASM on Linux; it takes care of device labeling and improves the performance of open/close calls to the devices. It is Oracle's best practice for managing disks on Linux. Lab2-Install-and-Configure-ASMLib.pdf

The following points are covered in this lab:

Install The ASMLib:
- Checking The Kernel Version
- Downloading the Drivers for ASMLib
- Installing the ASMLib Drivers

Configure ASMLib:
- Create ASM Disks
- ASMLib Directory Structures
- ASM Directories
- ASM files on /opt directory
- Map ASMLib Disks to Physical Devices
- File /etc/sysconfig/oracleasm
- ASM init file

Annex: Scripts
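A minimal sketch of the configure-and-create flow once the rpms are in place (device names illustrative):

    /etc/init.d/oracleasm configure                    # set owner oracle, group dba, load on boot
    /etc/init.d/oracleasm createdisk DISK1 /dev/sdb1   # label a partition for ASM
    /etc/init.d/oracleasm listdisks                    # DISK1 should be listed
    /etc/init.d/oracleasm querydisk /dev/sdb1          # confirm the device is marked as an ASM disk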


ASM

ASM Hands-On Training

Early this year I developed an ASM training that was delivered in Madrid and Tel Aviv. The idea was to have a set of exercises that would provide lots of hands-on experience to the participants in a few days. In this and the following posts I will present the labs of the training.

The training was implemented using the students' laptops. A virtual machine containing a Linux 5 environment set up with Oracle 10g and 11g, ready for implementing the labs, was distributed to the participants. The requirements were:

- Laptop with a minimum of 30 GB free space on the hard disk or an external disk
- 2 / 4 GB memory
- Sun VirtualBox 2.2 installed on the laptop

In the first lab, included here, Lab1 Configuration of Devices for ASM, we created devices to be used by ASM in several different ways. The idea was to provide many examples and train the fingers on how to build ASM disks on top of them; that also provided some insight into how the discovery process works.

The whole program covers these subjects:

ASM Hands-On Training Program

DAY 1
- Getting around on the system
- ASM Architecture
- Preparing storage devices for ASM: loop devices, file system, raw devices, block devices
- ASMLib: install ASMLib
- Create the ASM Instance: create the ASM instance, managing diskgroups
- Kfed: using kfed to identify ASM headers, using kfed to dump the headers of a physical device
- ORION: installing, configuring and running an ORION simple test
- Using RMAN for restoring a database into ASM: configure diskgroups for restoring a database into ASM, restore database from backup
- Normal redundancy: PST table
- 10g ASM normal redundancy resilience test: drop database, create normal redundancy disk group, restore database, resilience test
- Additional means for managing ASM: configure XDB, using FTP with ASM, browsing ASM directories with HTML
- Migrating devices to ASMLib: migrate raw devices and block devices to ASMLib

DAY 2
- Migration to 11g: 10g ASM migration to 11g, create an 11g database in ASM
- ASM backup, crash, restore and recovery scenarios: backup and restore ASM metadata
- ASM and Rman crash, restore and recovery scenarios hands-on - Part I
- ASM and Rman crash, restore and recovery scenarios hands-on - Part II
- 11g ASM new features: setting up 11g compatibility parameters, resilience tests
- ASM cloning: cloning a disk group on the same server, bringing up a copy of the database on a cloned disk group
- Benchmarking, monitoring and troubleshooting: configure and run Swingbench, configure and run OSWatcher, resilience tests while running a benchmark load

DAY 3
- ASM storage sizing: measuring throughput, defining infrastructure requirements for optimal performance with ASM, hardware, ASM on SAN and NAS
- ASM metadata: ASM directories, COD and ACD, block dump, DbVerify, header dumps, dd
- Storage consolidation: plugging ASM into a 2nd node, open the database on the 2nd node
- ASMCMD: using ASMCMD commands and options
- ASM on extended clusters: normal redundancy, preferred mirror read, fast mirror resync, DWDM, buffer credits
- ASM on Exadata
- Frequent SR issues on ASM


RAC and ASM on Linux Forum Meeting, June 3 at Beit HP in Raanana

On June 3, 2009 we had the 5th conference of the RAC/ASM/Linux Forum, this time at Beit HP in Raanana, Israel. RAC, ASM and Linux are building blocks of the HP Oracle Database Machine, and we wanted to start this meeting with a review of the technology revolution that Exadata is bringing in. Ehood Baratz from HP introduced the concepts, and Annie Flint from the Oracle RACPack team, which is already working on implementation projects with customers around Europe, provided a deep technology analysis.

This was followed by two customer case studies. Dina Raphael from Rafael presented their implementation of Oracle Applications on a 3-node Linux cluster with 11g CRS and ASM and 10g RDBMS. They are using ASM disk group rename to mount several clones of the production database on the same development server; Dina's implementation has become a reference in Israel and Europe for implementations of Oracle Applications on Linux with 11g RAC and ASM.

The second case was presented by Alon Spiegel, Senior Oracle Technology Consultant and CEO of Brillix. This is a large implementation that required higher availability, achieved using both RAID 1 + ASM normal redundancy with failure groups located on a campus cluster, as well as the highest performance. A very interesting design, implementation and testing effort worth hearing about.

Beit HP's address is 9 Dafna Street, Ra'anana Industrial Zone, Israel. The conference will start at 14:00. You can download the program here: RAC And ASM On Linux Forum Agenda, June 3 2009

The presentations can be downloaded here:
RAC And ASM Forum 2009 - Alejandro Vargas
HP Oracle new products - Ehood Baratz
RAC, ASM and Linux on the architecture of the HP Oracle Database Machine - Annie Flint
Oracle Applications implementation over 11g Oracle Clusterware + 11g ASM + 10g RDBMS on Rafael - Dina Raphael
http://www.oracle.com/il/RACForum/presentations_030609/RAC-And-ASM-Forum-2009-AlonSpiegel.zip


RAC

11.1.0.7 CRS Bundle Patch 1 Released

CRS Bundle Patch 1 was released for version 11.1.0.7. It is available for Linux x86 and x86-64, AIX 64-bit, HP-UX Itanium and HP-UX RISC 64-bit. Details can be found in Metalink Note 810663.1, 11.1.0.X CRS Bundle Patch Information. Before applying the patch, be sure to follow the instructions in Note 458485.1, "How to find whether the one-off Patches will conflict or not?", to check for potential conflicts between the patches already applied and the new patches. Apply it on test, check, and then move on to production.
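A sketch of the kind of conflict check Note 458485.1 describes, run with OPatch from the unzipped patch directory (directory name illustrative):

    $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir ./crs_bundle1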


High Level Tracing For Monitoring And Debug With Event 10046

Whenever there is an error condition that does not provide enough information to immediately diagnose the root cause and find a solution, tracing the session that is generating the problem is the best first option to get more information. In many cases I found in the raw trace the additional information required to get to the root cause of a problem. This method is also good for generating high-level SQL traces to be used for SQL tuning.

This is a short reference for running oradebug with event 10046, which enables SQL statement tracing including binds and waits.

To trace your own session:

oradebug setmypid ;

To trace another session, first you need to get its spid:

select a.username, b.spid
from v$session a, v$process b
where a.username like '%&user%'
and a.paddr = b.addr ;

Then from your session, connected as sys, attach to the other session:

oradebug setospid &ospid ;

Then check the tracefile name:

oradebug tracefile_name ;

At this moment you can start tracing. There are many different events that can be used for different purposes; event 10046 is very useful for debugging. Level 12 provides information about the waits the session is waiting for and also the values of the bind variables:

oradebug event 10046 trace name context forever, level 12 ;

Check the trace as it runs:

! tail

To stop the trace you can either exit the session or execute this command:

oradebug event 10046 trace name context off ;

More information about oradebug and its many options can be found on this page: "Utiliwiki"


Oracle Open World Call for Papers

Answering feedback from attendees of the last Open World, the possibility to contribute experiences to the next conference has been opened to the entire Oracle community: customers, partners and employees. If you wish to send a paper for consideration, log into this site: OOW 2009 Call for Papers. The deadline for submission was extended until April 26.


Useful Scripts

- Service level objectives setup: have the DBA speak the same language as the storage admin.

- Throughput: a measure of transferred bits across the media. Throughput cannot be faster than the slowest link of the path from source to destination. Take into account that a 2 Gbit HBA does not provide 2 GBytes of throughput but 256 MBytes.

To convert 2 Gbit into MBytes:
2 Gbit = 2 * 1024 Mbit = 2048 Mbit
2048 Mbit / 8 = 256 MBytes

Always size for throughput.

- Good-put: a measure of the usable amount of data; schedule appropriately. Good-put is the measure of usable data transferred over a given period of time. So even though you have great throughput, your good-put may not be good enough. This happens when queries are not well tuned (though they may show good response times) and thus request more data than is necessary. Good-put is a measurement of resource efficiency.

- IOPS: IO operations per second (OLTP).

set lines 250
set pages 9999
spool sysmetric_outp.log
alter session set nls_date_format='dd-mm-yyyy hh24:mi';
select min(begin_time), max(end_time),
       sum(case metric_name when 'Physical Read Total Bytes Per Sec' then average end) Physical_Read_Total_Bps,
       sum(case metric_name when 'Physical Write Total Bytes Per Sec' then average end) Physical_Write_Total_Bps,
       sum(case metric_name when 'Redo Generated Per Sec' then average end) Redo_Bytes_per_sec,
       sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then average end) Physical_Read_IOPS,
       sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then average end) Physical_write_IOPS,
       sum(case metric_name when 'Redo Writes Per Sec' then average end) Physical_redo_IOPS,
       sum(case metric_name when 'Current OS Load' then average end) OS_Load,
       sum(case metric_name when 'CPU Usage Per Sec' then average end) DB_CPU_Usage_per_sec,
       sum(case metric_name when 'Host CPU Utilization (%)' then average end) Host_CPU_util, --NOTE 100% = 1 loaded RAC node
       sum(case metric_name when 'Network Traffic Volume Per Sec' then average end) Network_bytes_per_sec,
       snap_id
from dba_hist_sysmetric_summary
group by snap_id
order by snap_id;
spool off

- MBytes per second: large block sequential IO (DSS/OLAP).
- Transactions per second does not translate well to storage, so we don't use it.
- Determine application IO characteristics: Recovery Point Objective (RPO), Recovery Time Objective (RTO), response time SLA, IO rates.
- Understand each IO component's bandwidth limits: CPU, HBA, switch, controller, disks.
- Choose the IO components that match your application throughput requirements: design for throughput (not capacity), design for scalability and availability.
- AWR can be used to determine current IO:
  IOPS = "physical reads total I/O requests" + "physical writes total I/O requests"
  MBytes/s = "physical reads total bytes" + "physical writes total bytes"
- For RAC environments, aggregate IOPS or MBytes/s for all nodes (see the sketch after this list).
- Estimate expected IO throughput rates (new users, reports, etc.): extrapolate known loads to estimate what will be.
- Use the following as basic guidelines for OLTP systems:
  Low transaction system: 1,000 IOPS or 200 MBytes/s
  Medium transaction system: 5,000 IOPS or 600 MBytes/s
  High-end transaction system: 10,000 IOPS or 1 GByte/s
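A note on the RAC aggregation point above: since dba_hist_sysmetric_summary keeps one row per instance, the query shown, grouping by snap_id alone, already sums the metrics across all nodes. To see the per-node breakdown instead, add instance_number; a minimal variation of the same query:

select snap_id, instance_number,
       sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then average end) +
       sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then average end) read_write_IOPS,
       (sum(case metric_name when 'Physical Read Total Bytes Per Sec' then average end) +
        sum(case metric_name when 'Physical Write Total Bytes Per Sec' then average end)) / 1048576 MBytes_per_sec
from dba_hist_sysmetric_summary
group by snap_id, instance_number
order by snap_id, instance_number;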


HA

How To Manually Open The Standby Database When The Primary Is Lost

This post explains how to manually convert the Physical Standby into the Primary database when the Primary is not available anymore. The procedure is very simple; still, it is convenient to have it available, so that in the stress situation of a production crash we can implement it without problems. Personally I prefer to work harder at the implementation stage and build an infrastructure strong enough to be configured using DG Broker, the Observer and Fast Start Failover.

Step by step details of manually converting the standby into the new primary can be found in this file: How to open the standby when the primary is lost
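As a minimal sketch of the kind of commands the document walks through (run on the standby; the linked document is the authoritative, complete sequence):

-- Apply whatever redo is still available, then convert the standby
alter database recover managed standby database finish force;
alter database commit to switchover to primary;
alter database open;

If no terminal recovery is possible, the more drastic "alter database activate physical standby database;" achieves the role change at the cost of any unapplied redo.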


How to Synchronize the Standby When there is a Big gap, and all Required Archived Logs are Lost

This post describes a situation that should not happen: having a standby out of sync and not having the required logs to close the gap. While it is nice to have a procedure to deal with the situation, it is better to ensure that archived logs are backed up on a schedule, and that the backup and cleanup script is automatically run every time a threshold of 20% free space on the archive log destination is reached before the next scheduled cleanup run.

In any case this technique is quite nice, as it permits re-synchronizing the standby without needing to copy over the whole database, but only the blocks that were changed since the last standby SCN.

Full details in this document: How to Synchronize the Standby When there is a Big gap, and all Required Archived Logs are Lost
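The core of the technique is an RMAN incremental backup taken from the standby's current SCN (SCN value and paths are illustrative; the linked document covers the full sequence):

-- On the standby: find the SCN the standby needs to roll forward from
select current_scn from v$database;

-- On the primary: back up only the blocks changed since that SCN
RMAN> backup incremental from scn 1234567 database format '/stage/stby_%U';

-- Transfer the pieces to the standby, then catalog and apply them
RMAN> catalog start with '/stage';
RMAN> recover database noredo;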


Data Guard Fast Start Failover Implementation Step by Step

In this post I'm adding a step by step implementation procedure for a Data Guard site configured using fast-start failover; it includes tips for configuring the observer process on a Windows server and for configuring automatic client failover in case of role change.

All details in this document: Dataguard FSFO Step by Step
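For flavor, the broker side of it condenses to a few DGMGRL commands (connect string and property value are illustrative; the document has the complete setup):

DGMGRL> connect sys/password@prod
DGMGRL> edit configuration set property FastStartFailoverThreshold = 30;
DGMGRL> enable fast_start failover;

and on the third machine, left running in the background:

DGMGRL> start observer;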


RAC

RAC and ASM Upgrade from 10.2.0.3 to 11.1.0.7

11g is becoming more and more interesting for many companies. Since last year I have been implementing 11g Oracle Clusterware with 11g ASM and 10g RAC databases; in the last months I started to work on pure 11g environments soon to be production, so I thought that having experience on how to migrate 10g RAC environments to 11g was necessary.

In this post I'm presenting a step by step procedure for migrating 10.2.0.3 Oracle Clusterware, ASM and RDBMS to 11g. Each step is presented in a separate document:
10g Oracle Clusterware Upgrade to 11.1.0.7
10g ASM Upgrade to 11.1.0.7
10g RAC Database Upgrade to 11.1.0.7
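A small sanity check worth running before and after the Clusterware step (both queries are read-only):

crsctl query crs softwareversion
crsctl query crs activeversion

The software version reflects the binaries installed on the node; the active version only moves to 11.1.0.7 once the rolling upgrade has completed on all nodes.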


Data Guard

Data Guard Fast Start Failover Reference Implementation

This post contains a reference of parameter files and database parameters used on a healthy Data Guard Physical Standby implementation, configured with FSFO in Max Availability protection mode. This environment consistently succeeded to complete Switchover or FSFO in case of crash, and reinstated the failed primary as soon as it became available again. The details can be found in this document: DATA GUARD FSFO AND SWITCHOVER REFERENCE CONFIGURATION

I was motivated to write this down after I spent several hours implementing a Data Guard FSFO environment on Windows. Switchover succeeded, but the broker failed to start the databases with error ORA-12514. After some rounds of review I noticed that there is a huge number of Service Requests opened with Oracle Support for the same or similar issues. This reference of a healthy configuration may help to quickly find errors in the parameter files or broker-related parameters.
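One frequent cause of that ORA-12514 pattern is a missing static registration of the broker service: the broker connects to db_unique_name_DGMGRL when it needs to restart an instance that is down, and dynamic registration cannot help at that point. A minimal sketch of the listener.ora entry on each site (names and paths are illustrative):

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = proddb_DGMGRL)
      (ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1)
      (SID_NAME = proddb)
    )
  )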


The Oracle-Validated rpm is available for all users

Preparing Linux for Oracle Clusterware and RAC installation is made fast and simple with the help of the Oracle-Validated rpm. The Oracle-Validated rpm automates the install of the numerous components required to set up OCW and RAC on Linux, and executes most of the required OS configuration steps. That was a long and tedious task in the time before the rpm was made available.

Until recently, to be able to use the Oracle-Validated rpm it was necessary to have an Unbreakable Linux Network (ULN) support contract. Now this valuable rpm is available to the general public. Note 728346.1 contains general information about it. In my test I installed the rpm on Enterprise Linux 5; I downloaded the rpm matching my kernel from http://oss.oracle.com/el5/oracle-validated/ Enterprise Linux can be downloaded from the Oracle eDelivery site.

The following document contains details of the rpm install: Installing the Oracle-Validated rpm on EL5
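The install itself is a one-liner; a sketch assuming you downloaded the rpm matching your kernel (the file name below is illustrative, and dependencies may pull in additional packages):

rpm -ivh oracle-validated-1.0.0-18.el5.x86_64.rpm

With a yum repository configured against the public channel, "yum install oracle-validated" resolves the dependencies for you.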


ASM

Database Migration From Windows to Linux Using RMAN

This post describes the procedure required to migrate a database from Windows to Linux using the RMAN Convert Database command. Both Windows and Linux platforms have the same endian format, which makes it possible to transfer the whole database, making the migration process very straightforward and simple. To migrate between platforms that have a different endian format, Cross Platform Transportable Tablespaces (XTTS) needs to be used instead.

List of Steps Needed to Complete the Migration

The migration process is simple, but as it has several steps it is convenient to be familiar with them before running it (a sketch of steps 3 and 5 follows the list):

1. Check platform compatibility between source and target OS
2. Start the database in read only mode
3. Check database readiness for transport from Windows to Linux using DBMS_TDB.CHECK_DB
4. Check if there are any external objects
5. Execute the RMAN Convert Database command
6. Copy converted datafiles, generated Transport Script and Parameter File to Linux
7. Edit the init.ora for the new database
8. Edit the Transport Script and Parameter File changing the Windows paths to Linux paths
9. Execute the Transport Script
10. Change the Database ID
11. Check database integrity

For a detailed step by step guide please check this document: Database Migration From Windows to Linux Using RMAN
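As a minimal sketch of steps 3 and 5 (platform string, database name and paths are illustrative):

On the source database, opened read only:

set serveroutput on
declare
  db_ready boolean;
begin
  -- check the database can be transported to the target platform
  db_ready := dbms_tdb.check_db('Linux IA (32-bit)', dbms_tdb.skip_readonly);
end;
/

Then the conversion itself, still on the source:

RMAN> convert database new database 'lnxdb'
        transport script '/stage/transport_lnxdb.sql'
        to platform 'Linux IA (32-bit)'
        db_file_name_convert 'C:\oradata\windb' '/u01/oradata/lnxdb';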


Real Application Testing on 10g/11g Step-by-Step

In this post I'm reviewing Real Application Testing Database Replay and SQL Performance Analyzer step by step. The initial understanding of the implementation process demands some work, but the results are excellent and worth the effort: we get valuable information to assess how our systems will behave after a change, something much needed for production environments. The document I'm attaching to this post covers the functional tests I did of both Database Replay and SQL Performance Analyzer; I'm still working on the following stages of evaluating a production load on a different platform.

The test steps can be followed in this document, Real Application Testing on 10g, and they are the following (a small sketch of the capture step follows the list):

1. Clone production database on test server, start it up
2. Create 11g database on test server on Windows
3. Download patch 6998002 for Windows 32-bit
4. Download patch 6903335 for Windows 32-bit
5. Download patch 7044721 for 11.0.6.1 on Windows
6. Download patch 6974999 for 10.2.0.3 on Linux
7. Download RAT scripts and docs from OTN
8. Install patch 6998002 on Windows 10g clone database
9. Install patch 6903335 on Windows 10g clone database
10. Install patch 7044721 on Windows 11g test database
11. Install patch 6974999 on Linux 10g test database
12. Configure and execute functional RAT tests on Windows
13. Install 11g on Linux
14. Create 11g database on Linux
15. Migrate clone of production database from Windows to Linux
16. Configure and execute functional RAT tests on Linux
17. Install patch 6903335 on production database
18. Capture load on production database
19. Replay production load on Linux
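As a small sketch of the capture step on the production side (directory path and capture name are illustrative; the one-off patches above are what make DBMS_WORKLOAD_CAPTURE available on 10.2.0.3):

create directory rat_capture_dir as '/u01/rat/capture';

begin
  dbms_workload_capture.start_capture(name => 'peak_load_cap', dir => 'RAT_CAPTURE_DIR');
end;
/

-- ... let the representative workload run, then:
begin
  dbms_workload_capture.finish_capture;
end;
/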


RAC and ASM on Linux Forum, July 16, Sheraton Hotel, Tel Aviv

On Wednesday, July 16 we will have our 3rd RAC and ASM on Linux Forum at the Sheraton Hotel in Tel Aviv, this time dedicated to Performance and Maximum Availability Architecture (MAA). We have several invited speakers from Oracle Israel: Oded Ilan, Gadi Chen and myself will speak about ASM performance; Eyal Druskin and Nickita Chernovsky will explain the concept of MAA, and I'll present a short demo of Active Data Guard.

Then we have two special invited speakers: Saar Maoz, from Oracle Corporation's RACPack Team, will speak about performance tuning of RAC using AWR reports. Husnu Sensoy, from Turkcell, the main cellphone services provider in Turkey, will speak about his experience implementing a 50TB database based on ASM.

The presentations can be downloaded here: Rac on Linux Forum Presentations. If you are based in Israel and didn't register yet, please go to this link to see the program and register: Oracle RAC on Linux Forum

See you there!


ASM Administration

Manually changing rebalance power on an ongoing ASM rebalance operation

A couple of times I've removed ASM disks from a diskgroup and found that the rebalance operation would take too long. This is usually not a problem: the rebalance process only locks 1 MB of data at a time for writes, so letting the rebalance run slowly has no performance impact on online users. My problem was that I needed to free the disks as fast as possible in order to finish some IO tests within a limited timeframe.

The procedure to change the rebalance power of an ongoing operation is very simple:

alter diskgroup DATADG rebalance power 6;

After that, the rebalance operation will restart with the newly set rebalance power. For a complete example check this document: Changing the rebalance power of an ongoing rebalance operation
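To verify that the new power took effect and watch the progress, query the ongoing operation from the ASM instance:

select group_number, operation, state, power, sofar, est_work, est_minutes
from v$asm_operation;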


SQL Performance Analyzer and Database Replay are available for pre 11g Releases

Change control is one of the most important aspects of managing a production database environment; the backport of this functionality is a major breakthrough that will make it easier to migrate 9i production environments to 10g and 11g.

SQL Performance Analyzer permits identifying SQL performance issues caused by system changes; the performance analyzer assesses the impact of those changes and provides a report that highlights any differences between the before and after situations. Database Replay permits performing comprehensive change control testing, using workloads taken from production environments. Problems can be identified and solved at the test stage, before implementing the changes on production. Full details of the process can be obtained from Metalink Note 560977.1.

This 11g functionality has been ported to Oracle Server Enterprise Edition, versions 9.0.1.0 to 11.1.0.6:
- 9.2.0.8.0 + one-off patch 6973309
- 10.2.0.3.0 + one-off patch 6974999


Oracle Recovery Manager (RMAN) Hands On Practice

Oracle Recovery Manager, RMAN, is at the same time simple and complex: its syntax can be as direct as 'backup database' to make a whole database backup, and as complex as needed to make possible the most sophisticated backup and recovery tasks. This post covers a series of tests I prepared to show how RMAN works; the tar of the scripts contained in the guide can be downloaded here.

This RMAN hands-on guide, once you have configured the environment, can be followed in about 3 hours. It covers the basics of backup, restore and recovery of Oracle databases using RMAN. The practice is organized into 4 chapters:

CATALOG
ADMINISTRATION
BACKUP
RESTORE AND RECOVERY

Each chapter contains a set of scripts that can be executed in order to implement the practice. It teaches the following points:
- Build and administer an RMAN catalog
- Use different techniques for backing up a database
- And the most important: how to restore and recover the database in different crisis scenarios
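For a taste of the first two chapters, the skeleton looks like this (connect strings, tablespace name and passwords are illustrative):

SQL> create user rman identified by rman default tablespace rman_ts quota unlimited on rman_ts;
SQL> grant recovery_catalog_owner to rman;

$ rman catalog rman/rman@catdb
RMAN> create catalog;

$ rman target / catalog rman/rman@catdb
RMAN> register database;
RMAN> backup database plus archivelog;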


Bigfile Tablespaces

Bigfile Tablespaces may become a standard implementation on databases using high-end storage and Oracle Automatic Storage Management. In order to avoid potential performance issues, Bigfile tablespaces must be built on top of a file system that can grow dynamically and that supports file striping; both characteristics are native to ASM.

This feature not only addresses the needs of very large databases, enabling datafile sizes of up to 4G blocks per tablespace; the maximum database size using Bigfile tablespaces is 8 exabytes with a 32K blocksize. Combined with Oracle Managed Files (OMF), Bigfile tablespaces make datafile management completely transparent to the DBA, thus greatly simplifying storage management for small, medium and very large databases. The DBA will only need to check that enough free space is always available at the level of the ASM diskgroup, to support the natural growth pace of the applications.

Bigfile tablespaces can be implemented at database creation. Using DBCA, you will need to edit the definition of each tablespace and check the 'Use Bigfile' checkbox.

To add a Bigfile tablespace using OMF enter this command:

SQL> create bigfile tablespace dev;

Tablespace created.

SQL> select tablespace_name, block_size, bigfile, segment_space_management from dba_tablespaces;

TABLESPACE_NAME                BLOCK_SIZE BIG SEGMEN
------------------------------ ---------- --- ------
SYSTEM                              16384 YES MANUAL
SYSAUX                              16384 YES AUTO
UNDOTBS1                            16384 YES MANUAL
TEMP                                16384 YES MANUAL
USERS                               16384 YES AUTO
DEV                                 16384 YES AUTO

6 rows selected.

SQL> select bytes, file_name from dba_data_files where tablespace_name='DEV';

     BYTES FILE_NAME
---------- --------------------------------------------------
 104857600 +DATADG/redpanda/datafile/dev.276.654704777
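A nice side effect is that space management moves up to the tablespace level; with a bigfile tablespace you can resize or autoextend the tablespace itself (sizes are illustrative):

SQL> alter tablespace dev resize 200g;
SQL> alter tablespace dev autoextend on next 1g;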


Extended RAC on Pure Oracle Stack Summary

Extended RAC is a Tier 6 DRP technology that enables zero data loss in case of a major disaster at a local site, and near-zero downtime if well configured and well managed. It is an expensive technology because of the costs associated with the required infrastructure: dark fiber over long distances and redundant components on all hardware layers. A strict change control procedure must be in place, with an identical test environment where all changes to the infrastructure and new code versions are carefully tested before implementation on the production site.

Oracle 11g provides new features that greatly help in implementing extended clusters, and that makes it the version that best fits the challenge. The synchronization between sites is obtained using ASM redundancy; reads are made locally at each site using the ASM preferred mirror read feature, and in case of a local storage failure ASM fast mirror resync makes recovery much easier.

10g R2 and 11g Data Guard FSFO (Fast Start Failover) technology provides a similar level of protection at less cost and at greater distances, as it relies on TCP for communications, so it is convenient to consider it as an option when evaluating extended RAC for DRP.

This document, Extended RAC On Pure Oracle Stack Summary, contains basic implementation guidelines. More information can be found in these excellent documents:
Erik Peterson: RAC at a Distance
Clemens Bleile: Extended RAC POC
Christian Bilien: Post on Extended RAC
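As a small illustration of the 11g site awareness mentioned above, each site's ASM instance can be told to read from its local failure group (diskgroup and failgroup names are illustrative):

-- On the ASM instance at site A:
alter system set asm_preferred_read_failure_groups = 'DATA.SITEA' scope=both;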


How To Resize the Online Redologs on a RAC Environment

Resizing the online redologs on a RAC environment is a simple task that requires a little more attention than on a single instance. Usually we resize when redolog switches happen too frequently. Take care to add all the new, larger groups, then switch over until the current redolog for each instance is on the new larger groups, before starting to remove the old groups. You can find a step by step description of the procedure in this document: How to Resize the Online Redologs on a RAC Environment

In some cases it may happen that you have only one instance of the RAC database up:

SQL> select inst_id,thread#,status,enabled,groups,instance,sequence# from gv$thread;

   INST_ID    THREAD# STATUS ENABLED      GROUPS INSTANCE    SEQUENCE#
---------- ---------- ------ -------- ---------- ---------- ----------
         1          1 CLOSED PUBLIC            3 sati1              41
         1          2 OPEN   PUBLIC            3 sati2              58

You will still see that both threads have a CURRENT redolog:

SQL> select inst_id,group#,thread#,sequence#,bytes,members,status from gv$log;

   INST_ID     GROUP#    THREAD#  SEQUENCE#      BYTES    MEMBERS STATUS
---------- ---------- ---------- ---------- ---------- ---------- ----------------
         1          1          1          0   10485760          2 UNUSED
         1          2          1          0   10485760          2 UNUSED
         1          3          1          0   10485760          2 UNUSED
         1          4          2         59   10485760          2 CURRENT
         1          5          2          0   10485760          2 UNUSED
         1          6          2          0   10485760          2 UNUSED
         1         10          1         39   52428800          2 INACTIVE
         1         11          2         56   52428800          2 INACTIVE
         1         12          1         40   52428800          2 INACTIVE
         1         13          2         57   52428800          2 INACTIVE
         1         14          1         41   52428800          2 CURRENT
         1         15          2         58   52428800          2 INACTIVE

In this case you will not be able to remove the current group of the thread belonging to the instance that is down, until you start the instance and switch this thread over to the new groups. If you try to remove the current group of a down instance you will get the following error:

SQL> alter database drop logfile group 14;
alter database drop logfile group 14
*
ERROR at line 1:
ORA-01623: log 14 is current log for instance sati1 (thread 1) - cannot drop
ORA-00312: online log 14 thread 1: '+DATADG/sati/onlinelog/group_14.303.648232897'
ORA-00312: online log 14 thread 1: '+DATADG/sati/onlinelog/group_14.289.648232903'
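The add/switch/drop cycle itself condenses to a few statements (group numbers and sizes are illustrative):

-- Add new, larger groups to each thread
alter database add logfile thread 1 group 20 size 512m;
alter database add logfile thread 2 group 21 size 512m;

-- On each instance, switch until its CURRENT group is one of the new ones
alter system switch logfile;

-- Drop each old group once it shows INACTIVE
alter database drop logfile group 1;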


Disaster Recovery Stories

In the last weeks I had the opportunity to work on two cases of disaster recovery. There are a couple of lessons learned from both cases:
- Check your backups
- Back up your archived logs
- Make backups of the database structure (full export without data)
- Make backups of your controlfiles (backup controlfile to trace)

The first case was a development environment, where a month's worth of work by a team of several developers was lost, despite the database working in archivelog mode and having cold backups. The last backup was corrupted, and the archived logs of the next backup were missing. The database undo tablespace and one of the data tablespace datafiles were lost. In this case I opened the old backup to create a full database export, and used this export to create the application user objects in a new database. Then I dumped the data from the corrupted database using DUL and imported it into the new database.

The second case was related to a production clone that was periodically refreshed by an automatic job. As part of the job, the metadata of one tablespace was exported in order to be able to plug it back into the refreshed database. That tablespace contained critical information and the delta of the developers' work. In this case both the metadata export and the database refresh failed; the transportable tablespace datafile was left orphaned, without its metadata export. Since I didn't have the data dictionary, the data dumps lacked the original create table statements; the only reference I had was the object number and an old metadata export. I extracted the create table statements from this export, linked them to their original object dumps, and imported the data. Some new objects, not contained in the old metadata export, were left unnamed, to be identified by the owners of the application.

It is good to be able to recover valuable data when all hope is almost lost. But it is better to have good, validated backups!
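The last two lessons cost almost nothing to implement (paths and the directory object are illustrative):

-- A text copy of the controlfile creation script
alter database backup controlfile to trace as '/u01/backup/control_trace.sql';

-- A structure-only export of the database (10g Data Pump shown)
expdp system full=y content=metadata_only directory=backup_dir dumpfile=db_structure.dmp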


How to Rename and Register with CRS a Clone of a RAC Database

To rename a database we set the new name recreating the controlfile. In the case of a RAC database the procedure is slightly more complex, as we need to recreate the controlfile in single instance mode using a modified pfile, and then recreate the spfile. Once the new controlfile is ready we need to register the database with CRS in order to enable its management as a RAC database. This document, "How to Recreate the Controlfile on a RAC Environment", illustrates the steps required to accomplish that.
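The CRS registration part is done with srvctl (database, instance and node names are illustrative):

srvctl add database -d testdb -o /u01/app/oracle/product/10.2.0/db_1
srvctl add instance -d testdb -i testdb1 -n node1
srvctl add instance -d testdb -i testdb2 -n node2
srvctl start database -d testdb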


ASM Database Clones Naming Issues

There are several ways to clone an ASM-based database; some of them result in a directory structure that does not match the current database name. I.e., the source database directory structure is built this way:

+DATADG/proddb/datafile
+DATADG/proddb/controlfile
+DATADG/proddb/onlineredolog
... etc

When cloning we will keep the same directory structure and change the database name to testdb, devdb or repdb; some DBAs do not feel comfortable having a "testdb" database in a directory like "+DATADG/proddb/...". In this post I will mention two possible workarounds:

1) Using a generic db_unique_name to be shared by all clones
2) Creating a set of directories and moving the database Oracle files over to it

1. Using a generic db_unique_name to be shared by all clones

The easiest one is to use a generic db_unique_name for the set of databases to be cloned. For instance, if the database to be cloned is a billing system we may set the db_unique_name at database creation to "billing"; in this case we will get for all our clones a generic root directory named "billing" instead of one that is the name of the source database in the group.

When using Database Configuration Assistant (DBCA) you can set the parameter db_unique_name manually by choosing Parameters -> Advanced Parameters Configuration. The static parameter db_unique_name can be changed at any moment; it will require an instance restart to become active. If you are using Oracle Managed Files (OMF), the new db_unique_name is used as the root directory for any new Oracle files created from the moment the new name is made active.

Following you can see an example of a database created with db_unique_name='generic' and db_name='redhorse':

SQL> select name from v$database;

NAME
---------
REDHORSE

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
REDHORSE

SQL> select db_unique_name from v$database;

DB_UNIQUE_NAME
------------------------------
GENERIC

SQL> select file_name from dba_data_files;

FILE_NAME
--------------------------------------------------------------------------------
+DB1DATADG/generic/datafile/system.292.643421261
+DB1DATADG/generic/datafile/undotbs1.293.643421325
+DB1DATADG/generic/datafile/sysaux.294.643421347
+DB1DATADG/generic/datafile/users.296.643421387

2. Creating a set of directories and moving the database Oracle files over to it

This is a more laborious workaround than setting the db_unique_name of the clone to be the same as the db_name. Once the clone is created you will manually create the directories matching the new name, and move the Oracle files from the old directory to the new directory using RMAN, i.e.:

Create a set of directories reflecting the new database structure; in my case the original database was cpdg1 and the clone is cpdg2. (It may happen that your spfile is sitting on an alias with the name of the database; you will need to remove the alias to be able to create the directories.)

ASMCMD> ls cpdg1
CONTROLFILE/
DATAFILE/
ONLINELOG/
PARAMETERFILE/
TEMPFILE/
ASMCMD> mkdir cpdg2
ASMCMD> mkdir cpdg2/CONTROLFILE/
ASMCMD> mkdir cpdg2/DATAFILE/
ASMCMD> mkdir cpdg2/PARAMETERFILE/
ASMCMD> mkdir cpdg2/TEMPFILE

From now on, every create of an Oracle file on database cpdg2 will go to the cpdg2 directory, i.e.:

SQL> select name from v$database;

NAME
---------
CPDG2

SQL> create tablespace users2;

Tablespace created.

SQL> select file_name from dba_data_files where tablespace_name='USERS2';

FILE_NAME
-------------------------------------------------------
+DB3CLNDG/cpdg2/datafile/users2.267.643260907

To move tablespaces based on the old directory cpdg1, set the tablespace offline and move the datafiles with RMAN:

SQL> alter tablespace users offline;

Tablespace altered.

Start an RMAN session:

RMAN> copy datafile '+DB3CLNDG/cpdg1/datafile/users.264.640191515' to '+DB3CLNDG';

Starting backup at 06/01/2008 03:42:53
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=+DB3CLNDG/cpdg1/datafile/users.264.640191515
output file name=+DB3CLNDG/cpdg2/datafile/users.268.643261377 tag=TAG20080106T034255 RECID=1 STAMP=643261380
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
Finished backup at 06/01/2008 03:43:03

RMAN> switch tablespace users to copy;

datafile 4 switched to datafile copy "+DB3CLNDG/cpdg2/datafile/users.268.643261377"

Bring the tablespace online again:

SQL> alter tablespace users online;

Tablespace altered.

SQL> select file_name from dba_data_files where tablespace_name='USERS';

FILE_NAME
-----------------------------------------------------
+DB3CLNDG/cpdg2/datafile/users.268.643261377

Now you can delete the old file:

{oracle} /oradisk/app01/oracle/product/11db/dbs [pollux.com] > asmcmd ls +DB3CLNDG/cpdg1/datafile/users.264.640191515
users.264.640191515
{oracle} /oradisk/app01/oracle/product/11db/dbs [pollux.com] > asmcmd rm +DB3CLNDG/cpdg1/datafile/users.264.640191515
{oracle} /oradisk/app01/oracle/product/11db/dbs [pollux.com] > asmcmd ls +DB3CLNDG/cpdg1/datafile/users.264.640191515
ASMCMD-08002: entry 'users.264.640191515' does not exist in directory '+DB3CLNDG/cpdg1/datafile/'


MAA Seminar: Questions and Answers

There were a couple of questions I took home from the seminar:

1. What is the minimal retention time that can be used when using Flashback Database?
2. Is it possible to open a 10g Physical Standby read-write, perform application testing on it, and reinstantiate it as a Physical Standby using flashback technologies?
3. What is the correct High Availability configuration for tnsnames.ora when using Data Guard?
4. How can I avoid redo shipping when the standby database is not available?
5. What are the 10g Data Guard data type restrictions?
6. Is it possible to use tape compression on top of RMAN compression?
7. Is it possible to use the RMAN Convert command on a small-endian ASM diskgroup mounted on a big-endian server?
8. Is it possible to integrate Oracle and MySQL?
9. Is it possible to use storage backups with ASM?

Here you can find the answers: MAA Seminar Questions and Answers


ASM Based Tablespaces Backup With Rman for Long Term Offline Storing

This post is the answer to a conversation with a Senior DBA. He asked me if it is possible to set a tablespace offline, use RMAN to back it up, then remove the tablespace's underlying datafiles and, when needed, restore the tablespace without performing recovery. The idea behind it is to be able to free the space used by tablespaces that we do not need to have online, but that we need to keep, usually for legal reasons, for a period of several years. The standard solution would be to move and compress this data into historical databases based on cheaper storage; the data transfer can be achieved using transportable tablespaces, cross-platform transportable tablespaces or Data Pump.

I tested two approaches: using transportable tablespaces and using offline tablespaces. Full details can be found here: ASM Based Tablespaces Backup With Rman for Long Term Offline Storing


How to generate a series of AWR reports

To check performance issues I usually get AWR and ASH reports every 15 minutes along the period I want to examine. This script helps to automatically generate a set of AWR reports: Run_AWR_Reports
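If the script is not at hand, the same reports can be produced directly from the workload repository API (the dbid and snapshot ids below are illustrative; pick them from v$database and dba_hist_snapshot):

select snap_id, begin_interval_time from dba_hist_snapshot order by snap_id;

select output
from table(dbms_workload_repository.awr_report_html(1234567890, 1, 100, 101));

Spool the output of the second query to a file, and loop over consecutive snapshot pairs to build the 15-minute series.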


MAA Seminar in Istanbul: using Cross Platform Transportable Tablespaces into ASM

One of the interesting questions I got during the seminar was related to the possibility of migrating a terabyte database from a little-endian machine to a big-endian machine, using an ASM diskgroup to transfer the data. Unfortunately ASM diskgroups are not cross-platform enabled. There is no problem mounting the same diskgroup on multiple servers at the same time, provided that the OS is the same on all the servers. You can also unmount the ASM disks from one server and mount them on another, but it is still required to have the same OS to be able to work with the diskgroup on the new server.

The alternative solution is to use Cross-Platform Transportable Tablespaces (XTTS). There are several possible strategies to accomplish the migration of a terabyte database using XTTS. In this post I'm presenting an example of how this technology is implemented, step by step: Cross Platform Transportable Tablespaces Into ASM
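As a minimal flavor of the XTTS conversion step (platform strings, tablespace and paths are illustrative), the endian conversion can run on the source:

RMAN> convert tablespace users
        to platform 'AIX-Based Systems (64-bit)'
        format '/stage/xtts/%U';

or on the target, where RMAN can write the converted copies straight into the ASM diskgroup:

RMAN> convert datafile '/stage/xtts/users01.dbf'
        from platform 'Linux IA (32-bit)'
        format '+DATADG';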


MAA Seminar in Istanbul: Using a Physical Standby For Read/Write Testing and Reporting

Another interesting subject we tested during the seminar was activating a physical standby database for testing, reporting, etc. Usually we use storage snapshots for this purpose: we mount a storage-generated copy of the production database on a report server, perform recovery and use it during the day; the process of resynchronizing the storage copy is repeated each night.

Using a physical standby for the same purpose is possible on 10g when flashback database is enabled on the standby. A guaranteed restore point is created to be able to re-activate and resynchronize the standby when the period of reporting/testing is over. This is a powerful feature, worth knowing and using. You can find step by step implementation details in this document: Using a Physical Standby For Read/Write Testing and Reporting
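In a nutshell the cycle looks like this (the restore point name is illustrative; the linked document has the complete, tested sequence):

-- On the standby: stop apply, set the rollback point, activate
alter database recover managed standby database cancel;
create restore point before_testing guarantee flashback database;
alter database activate standby database;
alter database open;

-- When testing ends: rewind and become a standby again
shutdown immediate;
startup mount;
flashback database to restore point before_testing;
alter database convert to physical standby;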


MAA Seminar in Istanbul: Database Migration to ASM

Last week I had the privilege to share experiences about Oracle technology with a group of colleagues in Istanbul. During the meeting we focused on Maximum Availability Architecture topics; one of the subjects I presented was Database Migration to ASM. I think this is an important subject: many critical production databases based on 8i and 9i are at the planning stage of migration to 10g with ASM. RMAN provides a very efficient means to perform the migration, with full control in the hands of the DBA; a minimal downtime window can be achieved, even for databases in the multi-terabyte range. This document, Database Migration to Oracle Automatic Storage Management, contains a step by step description of this technique and how to implement it.

You can read Tonguç Yilmaz's notes on the seminar on his excellent blog: H.Tonguç YILMAZ Oracle Blog

A few comments about my visit to Istanbul: Istanbul is a vibrant city, modern and ancient, plenty of colorful restaurants and typical stores, rich in ceramics and jewelry that remind us of the greatness of the Ottoman Empire. The Mosques in the fog, the ships crossing the Bosphorus in all directions, the youth filling Liberty Street and Taksim Square made a strong impression on my memory. The taste of real Turkish coffee in the company of my dear colleagues, in this cold Istanbul winter, is something I'll remember for a long time.

From left to right: Alejandro Vargas, Tonguç Yilmaz and Hüsnü Şensoy


ASM Seminar, December 20, Hotel Daniel Hertzeliya

Oracle Week Israel will take place from December 16 to 20. On the 20th we have a full-day seminar about ASM features and future. I chose as the main body of the seminar a very good course that until today was offered only to a limited number of people; it will be a unique opportunity to get a deep dive into ASM technology. So if you are around in Israel, I warmly invite you to come over. Hertzeliya, at the Mediterranean seaside, is a very nice place to spend the day discussing high technology.

This is the seminar program:

Automatic Storage Management Seminar

Part I - Introduction and Architecture
1 - Introduction: What is ASM? Performance benefits, striping/rebalancing, mirroring, ASM tools
2 - ASM Architecture
3 - Concepts: diskgroup, disk, AU, ASM files
4 - Diskgroup fault tolerance: failuregroup, partnering, mirroring, striping
5 - ASM setup
6 - Handling disks & diskgroups
7 - ASM filenames

Part II - ASM Internals
1 - ASM storage devices: overview of possible devices, block and raw devices, NAS/SAN/NFS, LVM
2 - Raw files on different platforms: Sun, Linux, AIX, HP
3 - ASMLib
4 - ASM metadata
5 - ASM diagnosis & troubleshooting:
  5.1 - Disk discovery: basics, disk strings on different platforms, troubleshooting
  5.2 - CSS (cluster manager) issues
  5.3 - Block dumps: DbVerify, Oracle block dump, examining the disk header (kfed), OS block dump (dd)
  5.4 - Troubleshooting
6 - ASM backup

Part III - ASM 11g New Features


Mapping ASM disks to Physical Devices

Sometimes you may need to map ASM disks to their physical devices. If they are based on ASMLib you will see their ASM name, i.e. ORCL:VOL1, when querying v$asm_disk. When running oracleasm querydisk VOL1 you will get in addition the major and minor numbers, which can be used to match the physical device, i.e.:

[root@orcldb2 ~]# /etc/init.d/oracleasm querydisk VOL1
Disk "VOL1" is a valid ASM disk on device [8, 97]
[root@orcldb2 ~]# ls -l /dev | grep "8," | grep 97
brw-rw----   1 root disk     8,      97 Nov  4 13:02 sdg1

This script can do the job for a group of ASM disks (note that querydisk reports the device as [major, minor]):

---------- start  here ------------
#!/bin/ksh
# For every ASMLib disk, extract the major and minor device numbers
# reported by querydisk and match them against the device list in /dev
for i in `/etc/init.d/oracleasm listdisks`
do
v_asmdisk=`/etc/init.d/oracleasm querydisk $i | awk '{print $2}'`
v_major=`/etc/init.d/oracleasm querydisk $i | awk -F[ '{print $2}' | awk -F] '{print $1}' | awk '{print $1}'`
v_minor=`/etc/init.d/oracleasm querydisk $i | awk -F[ '{print $2}' | awk -F] '{print $1}' | awk '{print $2}'`
v_device=`ls -la /dev | grep "$v_major" | grep "$v_minor" | awk '{print $10}'`
echo "ASM disk $v_asmdisk based on /dev/$v_device  [$v_major $v_minor]"
done
---------- finish here ------------

The output looks like this:

ASM disk "VOL1" based on /dev/sdg1  [8, 97]
ASM disk "VOL10" based on /dev/sdp1  [8, 241]
ASM disk "VOL2" based on /dev/sdh1  [8, 113]
ASM disk "VOL3" based on /dev/sdk1  [8, 161]
ASM disk "VOL4" based on /dev/sdi1  [8, 129]
ASM disk "VOL5" based on /dev/sdl1  [8, 177]
ASM disk "VOL6" based on /dev/sdj1  [8, 145]
ASM disk "VOL7" based on /dev/sdn1  [8, 209]
ASM disk "VOL8" based on /dev/sdo1  [8, 225]
ASM disk "VOL9" based on /dev/sdm1  [8, 193]

If you are using multipathing you will need an additional step to map the physical device to the multipath device. For instance, with EMC Powerpath, if you want to map sdf1:

[root@orclp ~]# /etc/init.d/oracleasm querydisk vol1
Disk "VOL1" is a valid ASM disk on device [8, 81]
[root@orclp ~]# ls -l /dev | grep "8," | grep 81
brw-rw----   1 root disk     8,      81 Oct 29 20:42 sdf1
[root@orclp ~]# powermt display dev=all
......
Pseudo name=emcpowerg
Symmetrix ID=000290101698
Logical device ID=0214
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 qla2xxx                   sdf       FA  7bB   active  alive      0      0
   2 qla2xxx                   sdq       FA 10bB   active  alive      0      0
......

The last step is to check the partition assigned to the emcpower device, i.e.:

[root@orclp ~]# ls -l /dev/emcpowerg*
brw-------  1 root root 120, 96 Oct 29 20:41 /dev/emcpowerg
brw-------  1 root root 120, 97 Nov 15 13:08 /dev/emcpowerg1


10g database with 11g ASM

I just finished a test install on Oracle Enterprise Linux 5; I wanted to check how well 11g ASM works with a 10g database. I installed Linux 2.6.18-8.el5, Oracle 11.1.0.6, ASMLib 2.0.3-1.el5 and Oracle 10.2.0.1 + patch set 10.2.0.3. All the installs completed smoothly on my laptop virtual machine. Note that you can also download Oracle VM as of November 14; you can see the "Oracle Unveils Oracle® VM" announcement here. At last my configuration includes 11g ASM + 11g database + 10g database, all living together in harmony. You can download Oracle Enterprise Linux 5, 11g and 10g from the Oracle eDelivery page.


CRS and ASM Fail to Start After Linux Kernel Upgrade

The case I'm describing applies to a specific configuration: RAC 10g R2 on Linux x86_64 with EMC storage using EMC Powerpath and ASM. It may not be relevant in other configurations; still, it provides guidelines to diagnose and solve similar issues.

In brief the problem was the following: CRS failed to start after the Linux kernel was upgraded from version 2.6.9-22.ELsmp to 2.6.9-55.0.2.ELsmp; ASM also failed to start.

The solution steps were:
- Reinstall EMC Powerpath
- Reconfigure /etc/sysconfig/rawdevices
- Install ASMLib rpm's matching the new kernel

This document provides a full description of the case and the solution implementation: RAC and ASM issues after Upgrade of Linux Kernel
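The ASMLib step, sketched (the rpm version below is illustrative; the driver package must match uname -r exactly):

uname -r                                        # e.g. 2.6.9-55.0.2.ELsmp
rpm -ivh oracleasm-2.6.9-55.0.2.ELsmp-2.0.3-1.x86_64.rpm

/etc/init.d/oracleasm scandisks
/etc/init.d/oracleasm listdisks                 # the ASM disks should be visible again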


How to Generate a Script to Kill the Sessions Holding an Object

The main issue in this post is a simple script you can use to find the sessions holding an object, and thus blocking the possibility of compiling it. The script generates kill -9 commands to clear their shadow processes at the OS level.

select 'kill -9 '||b.spid||' # username '||c.username
from v$access a, v$process b, v$session c
where a.object=upper('&name')
and a.sid=c.sid
and b.addr=c.paddr
/

If you want an example of using this script you can continue reading :-)

This morning we found that a package holding a new version was invalid, because a dependent object was not changed accordingly, as it was projected to be. The dependent package, which remained unchanged, was being executed by about ~200 of ~5000 concurrent users. The options were: roll back the package that failed compilation, or introduce the new version of the dependent and after that recompile the invalid object. We decided to introduce the changes to the dependent.

In one screen I ran this script to generate kill -9 commands for all shadow sessions holding the dependent:

set echo off head off veri off feed off pages 50000
spool killsess.sh
select 'kill -9 '||b.spid||' # username '||c.username
from v$access a, v$process b, v$session c
where a.object=upper('&name')
and a.sid=c.sid
and b.addr=c.paddr
/
spool off

In another screen I started a sqlplus session to compile the new version, wrote down the compile command and left the screen open, ready to hit the enter key. Once the generation of kill -9 commands finished, I set execution privileges for the script and executed it, from within sqlplus:

!chmod 700 killsess.sh
!./killsess.sh

After that, I immediately compiled the dependent, and then the invalid object, from the second screen:

@compile_version
alter package <pkgname> compile body;


10g Dataguard, Flashback and New Features Q&A

Last week I had the privilege to share some time with a select group of DBAs. We did a review of some of the strong 10g features like ADDM, AWR, Data Guard, Flashback Technologies and ASM. There were many interesting questions raised; I took note of some of them and compiled them into this document: 10g Questions and Answers


Oracle Flashback Technologies

Flashback Database, Table, Query, Transaction and Drop provide an extremely useful set of tools to the DBA. Once understood, they are easily configured, implemented and used in case of need. They are not only useful for recovery purposes but sometimes for informational purposes as well; a question like "what was the value of some record 6 hours ago?" can be easily answered using flashback records.

Metalink Note 435998.1, "What Do All 10g Flashback Features Rely on and what are their Limitations?", contains additional conceptual information and links to some of the sources I used to prepare the document I'm attaching here: Flashback Technologies Step by Step
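For instance, the "value 6 hours ago" question maps directly to a flashback query (table and column names are illustrative, and undo_retention must cover the period):

select salary
from employees as of timestamp (systimestamp - interval '6' hour)
where employee_id = 100;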


Data Guard, Broker, Observer and Switchover, Step by Step

This post is the continuation of the previous Data Guard, Physical Standby Implementation, Step by Step. This time I focused on configuring the Data Guard Broker and the Observer, and on implementing switchover from the Primary to the Standby and back. The step by step instructions are contained in this document: Configuration of 10g Data Guard Broker and Observer for Switchover

This configuration does require a third server, to run the Observer process from it in the background. It proved to be more efficient than other hardware cluster solutions in terms of the time required to switch over: about 5 minutes. The best solution is to upgrade the configuration to RAC + Data Guard, which provides maximum protection and availability.


Data Guard, Physical Standby Implementation, Step by Step

Although Enterprise Manager Grid Control provides the simplest way to implement Data Guard, in this post I'm presenting an exercise of manual implementation, step by step. I think this is a valuable exercise that provides close insight into the basic parameters and steps required to set up Data Guard; it also gives a good feeling for how this technology works, which the high level of automation we get from Grid Control cannot give. This exercise can be completed in a couple of hours if you have two servers with 10g Oracle homes already installed. The step by step guide can be downloaded here: 10g Data Guard, Physical Standby Creation, step by step.
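As a tiny taste of what the guide configures, the parameters doing most of the work on the primary (service name and db_unique_name are illustrative):

alter system set log_archive_dest_2 =
  'service=stby lgwr async valid_for=(online_logfiles,primary_role) db_unique_name=stby' scope=both;
alter system set log_archive_dest_state_2 = enable scope=both;
alter system set fal_server = 'stby' scope=both;
alter system set standby_file_management = auto scope=both;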


Data Guard or Storage Snapshots?

Customer sites often choose storage-related technologies to ensure recoverability for large databases. I've been involved several times in recovering production databases from a BCV or SRDF backup. Mounting the storage copy back on the production server was usually done in about 30 minutes, after a period of deciding what to do that could take another 30 minutes to 1 hour. Then we needed to apply the relevant redo logs, which sometimes took several hours.

In my opinion, for sites requiring maximum availability, storage snapshots do not provide the best high availability option. They are excellent for building report databases that are refreshed every day, or for cloning test, QA, development or training databases from production.

10g Data Guard provides the most efficient environment for disaster recovery. It can be configured in several ways to match different sets of requirements and, together with RAC, forms the Maximum Availability Architecture, which ensures the maximum levels of service. Configuration and management of Data Guard was made simple through the interface that Oracle Enterprise Manager Grid Control provides; still, it can also be manually implemented and managed with some training.


Oracle RAC on Linux Conference - Sep 4 - Tel Aviv

RAC, ASM and Linux: configuration, monitoring/managing and extended RAC configurations, plus a successful implementation report, are the topics of the conference of the RAC on Linux Forum to be held on September 4 at the Sheraton Hotel in Tel Aviv. If you are around I will be happy to see you there; the conference is free, and you can register at the Oracle Israel events page.

You can download my presentation here: "Rac on Linux Configuration". This presentation was compiled and adapted by myself from presentations by various Oracle colleagues.

This is the conference schedule:

14:00 - 14:30  Welcoming
14:30 - 14:40  Opening words
14:40 - 15:30  Configuration & Installation Best Practices - Alejandro Vargas, Oracle Corporation
15:30 - 16:15  Monitoring & Management Best Practices - Ofir Manor, Oracle Israel
16:15 - 16:45  Coffee Break
16:45 - 17:30  Scalability Best Practices - Annie Flint, Oracle Corporation
17:30 - 18:15  Customer Success: Bank Mizrahi Multi-Node DW Implementation - Itamar Tadmor, Iteam

The speakers:
Alejandro Vargas is a Senior DBA at Oracle Israel and a RAC enthusiast :-)
Ofir Manor is a Senior Consultant at Oracle Israel and an excellent speaker.
Annie Flint is a Senior Consultant with the Oracle RAC Pack group, and a world-class RAC expert.
Itamar Tadmor is a Senior DBA and Team Leader with many years of experience.

See you there!

IO Throughput Verification Before RAC Implementation: ORION

When we plan a RAC environment we need to define the throughput requirements early in the design phase, and then validate that our servers are able to perform at the required levels, even before we have created the RAC database. When performance problems rooted in poor storage performance are detected at a late implementation stage, the number of variables may make the bottleneck difficult to find, and the time available to find a solution scarce.

There is a tool that can be used to validate that a configuration is capable of providing the throughput we expect from it. It is called Orion and can be downloaded from this link: OTN Download Site. The Orion install guide can be downloaded from the same page. Orion can simulate the type of IOs an OLTP or DSS system does, and provide detailed response times in terms of IOPS and MBPS.

Once the Orion executable is unzipped into a separate directory you will need to create a parameter file, i.e. mystat.lun, that contains the list of the LUNs to include in the test. Be aware that any information on them will be destroyed.

[oracle@rac1 ~/ORION]$ cat mystat.lun
/dev/raw/raw1
/dev/raw/raw2
/dev/raw/raw3

A simple test can be run using the following parameters (note that testname and the lun parameter file name must match):

> orion -run simple -testname mystat -numdisks 3

The output will be in the form of:

TEST START
Point 1 (small=0, large=0) of 22
Valid small 1 Valid large 1 Valid
ran (small): VLun = 0 Size = 8578870272
ran (small): Index = 0  Count = 10770  Avg Lat =   5.56
ran (small): nio=10770 nior=10770 niow=0 req w%=0 act w%=0
ran (small): my   1 oth   0 iops 179 size 8 K lat   5.56 ms bw =   1.40 MBps dur  59.93 s READ

and will be logged into the file mystat_trace.txt. Several output files will be generated as well:

[oracle@rac1 ~/ORION]$ ls -ltr
total 2156
-rwxrwxrwx  1 root   root     1914905 Aug  9 07:06 orion
-rwxrwxrwx  1 root   root      233611 Aug  9 07:07 Orion_Users_Guide.pdf
-rwxrwxrwx  1 root   root          42 Aug  9 07:17 mystat.lun
-rw-r--r--  1 oracle oinstall   13350 Aug  9 08:52 mystat_trace.txt
-rw-r--r--  1 oracle oinstall     698 Aug  9 08:52 mystat_summary.txt
-rw-r--r--  1 oracle oinstall     260 Aug  9 08:52 mystat_mbps.csv
-rw-r--r--  1 oracle oinstall     336 Aug  9 08:52 mystat_lat.csv
-rw-r--r--  1 oracle oinstall     336 Aug  9 08:52 mystat_iops.csv

The csv files can be uploaded to a worksheet to produce graphical representations of the data collected. In summary this is a great tool, worth knowing and using. It is light, simple to use and free.

Log Miner Implementation Summary

This is a quick example of Log Miner implementation. Log Miner is a powerful tool and all DBAs should know and master it. Attached is a summary of the first steps for mining archived logs or online redo logs: Log Miner, first steps example. This is the link to the complete reference in the Utilities Guide: Using Log Miner to Analyze Redo Log Files. And this is an excellent article by Arup Nanda on OTN: Mining For Clues.
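As a taste of what the attached summary covers, a minimal Log Miner session looks roughly like this; the archived log name and table name are illustrative, and this sketch assumes you can use the online catalog as the dictionary source:

-- Register the log to mine (file name is an example)
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => '/u01/arch/1_123_600000000.arc', OPTIONS => DBMS_LOGMNR.NEW);

-- Start the miner using the online dictionary
EXECUTE DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

-- Query the contents, for example all operations against a given table
SELECT scn, operation, sql_redo, sql_undo
FROM   v$logmnr_contents
WHERE  seg_name = 'EMPLOYEES';

-- Close the session
EXECUTE DBMS_LOGMNR.END_LOGMNR;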

ORA-1410 invalid row id

Problem solving strategies sometimes find clear causes and answers, and sometimes not; in this case analyzing the facts did show the way to a workaround. It happened on a multi TB RAC database. The symptom observed was that queries on a partitioned table started to fail with ORA-1410 on some partitions. The initial diagnosis was a logical corruption affecting one specific datafile, in this case a shared raw device. Some of these partitions were dropped and recreated using create table as select from a backup. Once the create as select finished, a check on the new partitions again returned ORA-1410, with all traces pointing to the same raw device. The common factor was always the same device. Building a new raw device and moving all partitions sitting on the affected one solved the problem.
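A query like the following helps list the segments that need to be moved off the affected device; a sketch assuming you already know the suspect file name (the device name here is an example):

SELECT DISTINCT owner, segment_name, partition_name, segment_type
FROM   dba_extents
WHERE  file_id = (SELECT file_id FROM dba_data_files
                  WHERE file_name = '/dev/raw/raw9');  -- example device name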

Database location on the storage frame

Is the location of database files irrelevant these days, when the capabilities of high end storage seem to address all possible issues? According to a good friend I was working with the last two weeks, it is still extremely relevant: the performance gain from following this rule is high, as high as the penalty for not following it. Every production environment needs to be built according to this standard.

The formula for database location on the frame is 20%-40%-20%-20%:

- The 20% of the disk located on the external cylinder is the best performing area of the frame, to be used by online redo logs, undo segments, temp segments and the most accessed database files.
- The 40% near the external cylinder is a good performing area, to be used by all other database files.
- The next 20% is a slow performing area and should not be used for database files, except parameter files, archived logs and software installs.
- The last 20% corresponds to the inner cylinder, the slowest performing area, and should not be used for database files at all.

LUNs to be used by the database need to be mapped to ensure that the database is located in the best performing areas of your frame. This is a task that must be accomplished working together with the storage vendor and the storage administrators. If this task was not done, a mapping of the actual LUNs can be requested from the vendor or storage administrator.

How to setup raw devices on Linux when /dev/raw does not exist

Sometimes you need to configure raw devices for the OCR, voting disk and ASM spfile, but the directory /dev/raw does not exist.

As root, create the /dev/raw directory:

mkdir /dev/raw

Then create the required raw devices using the following syntax (162 is the raw device major number):

mknod raw<raw_dev_number> c 162 <raw_dev_number>

i.e.:

mknod raw1 c 162 1

For setting up 12 raw devices use a loop:

#!/bin/ksh
# create raw1 .. raw12 under /dev/raw
x=1
cd /dev/raw
while [ $x -le 12 ]; do
  mknod raw$x c 162 $x
  x=`expr $x + 1`
done

Once configured, create or update the script /etc/udev/scripts/dev-raw.sh to automate the configuration of raw devices for RAC on each startup:

# dev-raw.sh
MAKEDEV raw
mv /dev/raw/raw1 /dev/raw/votingdisk
mv /dev/raw/raw2 /dev/raw/ocr.dbf
mv /dev/raw/raw3 /dev/raw/spfile+ASM.ora
chmod 660 /dev/raw/votingdisk
chmod 660 /dev/raw/ocr.dbf
chmod 660 /dev/raw/spfile+ASM.ora
chown oracle:dba /dev/raw/votingdisk
chown oracle:dba /dev/raw/ocr.dbf
chown oracle:dba /dev/raw/spfile+ASM.ora

RAC on ASM replication using an NFS mounted File System

One of the RAC environments I'm working with is a very critical production database; there are several development, test and training clones of it, and from time to time a new clone needs to be created. The main backup strategy we are using is based on EMC snapshots, so we can easily create clones from snapshots. Still, sometimes you get non standard requirements, or find circumstances that require creative solutions. The special request was:

- Replicate the 64 bit prod database to a 32 bit training environment
- Production is ASM based, training will be file system based
- There is no, and cannot be, communication between Prod and Train

The workaround I used has several steps but worked fine:

1. Open a prod snapshot read-only on a recovery server. In my case I used a server where we automatically open the prod snapshots read-only for backup verification.
2. NFS mount the data file system of the training server on the recovery server.
3. Generate a create controlfile script using: alter database backup controlfile to trace;
4. Find the trace on user_dump_dest, rename it to CRCTL.sql and move it to the NFS mounted file system.
5. Backup the database using rman backup as copy. You will get files named like: data_D-PRODDB_I-2123417810_TS-UNDOTBS2_FNO-21_g5il8cge
6. Once the backup finishes, make a list of the file names ordered according to the FNO-#, from FNO-1 until the last file.
7. Optionally you may rename the files to simpler names, like: mv data_D-PRODDB_I-2123417810_TS-UNDOTBS2_FNO-21_g5il8cge UNDOTBS2_01.dbf
8. Edit the create controlfile script and change the database name using the 'set database' clause. Replace the datafile names on the CRCTL.sql script you generated on step 3 with the names of the datafiles you got from RMAN (or renamed on step 7); take care to order them according to their file number: FNO-1, FNO-2, etc. A sketch of the resulting script is shown after the RMAN script below.
9. If this is the first time you clone a database on this server you will need to create the init.ora, and set the bdump, udump, cdump and archived log destinations.
10. Recreate the controlfile using CRCTL.sql and open the database with resetlogs.
11. Shutdown the database and open it in migrate mode: startup migrate;
12. Execute utlirp.sql to change the word size from 64 to 32 bits: @?/rdbms/admin/utlirp.sql
13. Shutdown immediate.
14. Startup normal.

The database I worked with was a 500GB database; I used a 2 CPU Linux box to open it read-only and backed it up using 8 channels. The throughput was 1GB per minute.

This is the rman script:

#!/bin/ksh
echo `date` > start.backup
# Environment Variables
# ---------------------
export ORACLE_SID=proddb
export ORACLE_BASE=/u01/app01/oracle
export ORACLE_HOME=/u01/app01/oracle/product/10.2.0.3
# Path Variables
# --------------
export BASE_PATH=/u01/app01/oracle/scripts/general:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/root/bin:/u01/app01/oracle/scripts
v_bdest=/NFSmnt/traindb
export PATH=${ORACLE_HOME}/bin:${BASE_PATH}
# Backup database
# ---------------
rman target / nocatalog <<EOF
run {
allocate channel backup_disk1 type disk format '$v_bdest/%U';
allocate channel backup_disk2 type disk format '$v_bdest/%U';
allocate channel backup_disk3 type disk format '$v_bdest/%U';
allocate channel backup_disk4 type disk format '$v_bdest/%U';
allocate channel backup_disk5 type disk format '$v_bdest/%U';
allocate channel backup_disk6 type disk format '$v_bdest/%U';
allocate channel backup_disk7 type disk format '$v_bdest/%U';
allocate channel backup_disk8 type disk format '$v_bdest/%U';
backup as COPY database include current controlfile;
release channel backup_disk1;
release channel backup_disk2;
release channel backup_disk3;
release channel backup_disk4;
release channel backup_disk5;
release channel backup_disk6;
release channel backup_disk7;
release channel backup_disk8;
}
exit
EOF
echo `date` > end.backup
# EOF backup_to_fs
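For step 8, the edited create controlfile script ends up looking roughly like this. A minimal sketch with illustrative names only: TRAINDB, the redo log definitions and the two datafiles are examples; your trace will contain the full datafile list and the correct log sizes:

CREATE CONTROLFILE SET DATABASE "TRAINDB" RESETLOGS NOARCHIVELOG
    MAXLOGFILES 16
    MAXDATAFILES 100
LOGFILE
  GROUP 1 '/NFSmnt/traindb/redo01.log' SIZE 100M,
  GROUP 2 '/NFSmnt/traindb/redo02.log' SIZE 100M
DATAFILE
  '/NFSmnt/traindb/SYSTEM_01.dbf',      -- FNO-1
  '/NFSmnt/traindb/UNDOTBS2_01.dbf'     -- FNO-2, and so on in file number order
CHARACTER SET WE8ISO8859P1
;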

Corrupted Archived Log While Performing Recovery of a Crashed Instance

When there is no more than the current backup, the database serves a critical online production facility, and the facility virtually stops because of the problem, then the last thing you want to meet is a corrupted archived log!

It started early in the morning: a top industry had its ERP database down, a colleague had been working on the case all night, but the database was still down...

The facts were:
- The database was performing a hot backup
- While being backed up, it crashed
- Startup normal does not work
- Recovery fails, with Oracle pointing to a corrupted archived log

Fortunately the problem was not that complicated:
- The crash happened while the database was in backup mode. Select status from v$backup revealed that several datafiles were still in backup mode; to perform recovery you must first end backup mode.
- Recovery again failed, with the corruption being reported on the same archived log as before, but we noticed that the sequence of this archived log was in the range of the online logs.
- We ran recovery again and applied the matching online log instead of the corrupted archived log, then the following sequences up to the current one; recovery finished successfully.
- Before opening the database with the resetlogs option we backed up the online logs, to replace the corrupted one.
- The database opened with resetlogs successfully.

Note: To end backup mode in 10g you can use: alter database end backup;
In this case we were working with a 9i database, so we used: alter tablespace <tbsname> end backup;

A script like this can be used to match datafiles to their tablespaces:

set echo off head off
SELECT 'alter tablespace '||name||' end backup;'
FROM v$tablespace
WHERE ts# IN (
    SELECT ts# FROM v$datafile WHERE file# IN (
        SELECT file# FROM v$backup WHERE status='ACTIVE'))
/
alter tablespace USERS end backup;
alter tablespace PERFSTAT end backup;
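Applying an online log in place of a corrupted archived log looks like this from SQL*Plus; a sketch with illustrative paths, the key point being that at the log prompt you can supply an online redo log whose sequence matches the one recovery asks for:

SQL> recover database until cancel;
-- when prompted with "Specify log: {<RET>=suggested | filename | AUTO | CANCEL}"
-- enter the path of the matching online redo log instead of the suggested archived log:
/u01/oradata/proddb/redo02.log
-- repeat for the following sequences, then:
SQL> alter database open resetlogs;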

After Production Upgrade Thoughts, or Think Twice Before Dropping a Big Table

On a previous post, "Production Upgrade from 8.1.7.4 to 9.2.0.8", I described the preparation steps for a critical production database upgrade. I was responsible for the rdbms upgrade, a small part of the whole project, which included several OS and application layers. In a few words, the bottom idea of this post is to summarize some lessons learned during the upgrade.

When executing a critical process that needs to finish within a restricted time window:
- Build a step by step detailed checklist for every action related to the process.
- Prepare ready made scripts for each step that can be executed through scripts.
- Rehearse and improve the checklists on test until they run smoothly.
- Make at least 2 general rehearsals on a full scale environment.
- On production, follow your checklists carefully.
- If something arises that can wait until after the procedure finishes, put it on the waiting list.
- If something unexpected arises, before taking any action: a) gather the relevant people and review the facts, b) review the options, c) only then decide on a course of action.
- Once the process finishes, gather the relevant people and perform an after action review to summarize the lessons learned.

This is the anecdotal part of the post:

I started the database upgrade at 15:30 and finished at 17:00, less than 50% of the time it took on the general rehearsal; we were running on a much more powerful machine, and the compile, done using parallel 35, finished really fast. Being 3 hours ahead of the planned time was nice. Then the problems started.

A colleague was upgrading statspack, got some errors, and decided to remove the Perfstat tables. One of them was a dictionary managed table that had several hundreds of thousands of extents. We were ready to turn the database back into archive log mode to proceed with application checks, but the drop was still running. We went to have dinner while the drop was still running.

Archive log mode was a precondition to move to the next stage, and we needed to shutdown the database to move it to archive log mode. But the drop was running... My colleague killed his session expecting that this would let us proceed with the other steps, but this was not the case.

Once a session that has been running for more than an hour dropping a dictionary managed object with more than 200,000 extents is killed, smon takes over the task of moving used extents to free extents; the only correct action in this case is to wait and let smon finish its work. Shutdown immediate will not work, because Oracle waits for smon to finish. The original drop statement converted the segment to a temporary segment; when the drop is interrupted, the temporary segment is cleaned up by smon. Shutdown abort will work and the database will come up, but you will not be able to set archive log mode until smon finishes its cleanup.

Shutdown abort in this case was a bad idea: on startup it forced smon to re-scan all extents already processed, a slow and painful process... a trace of smon showed that it scanned uet$ row by row, reviewing all the work that was already completed:

PARSING IN CURSOR #2 len=94 dep=1 uid=0 oct=3 lid=0 tim=169417024926 hv=1948998214 ad='99fffd90'
select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4
END OF STMT
...
value=203542
...
=====================
PARSING IN CURSOR #2 len=94 dep=1 uid=0 oct=3 lid=0 tim=169417086781 hv=1948998214 ad='99fffd90'
select file#,block#,length from uet$ where ts#=:1 and segfile#=:2 and segblock#=:3 and ext#=:4
END OF STMT
...
value=203541
...etc, etc.

It took 3 hours just to get to the point where it was able to resume the cleanup process; then the trace started to show work on the free extents table:

select length from fet$ where file#=:1 and block#=:2 and ts#=:3

And after a while it began to move extents into fet$:

insert into fet$ (file#,block#,ts#,length) values (:1,:2,:3,:4)

At this moment we were able to estimate how long the whole cleanup would take. I set up this script to show how things were going; 550 was the average number of extents cleaned per minute:

#!/usr/bin/ksh
export ORACLE_HOME=/u01/app01/oracle/product/9i
export ORACLE_SID=proddb
x=1
vfree=sys.fet$
vused=sys.uet$
while [ $x -eq 1 ]; do
sqlplus -s / <<eof
set echo off feed off
col "Expected Time" for 99.99
select  sysdate,
        (select count(block#) from $vused) Used,
        (select count(block#) from $vfree) Free,
        (select count(block#) from $vused)/550 "Expected Time"
from dual
/
exit
eof
sleep 60
done

After 5 long hours the cleanup finished; we were able to restart the database in archive log mode and start application migration and checkups.

Worth knowing and remembering:

a) If you kill a session that was dropping an object with many extents, and then shutdown the database, the time required by smon to get back to the point where it starts cleaning up extents again is about 3 times the time the drop was running. Every time you restart the process, smon needs to skip all the extents on fet$ it already visited before it can resume its work.

b) If you have an object with many extents that you must drop, use the procedure described in Note 68836.1, How To Efficiently Drop A Table With Many Extents.

There were several other highlights during this 36 hour upgrade, which finished with great success and on time. That is another rule: always add at least 40% spare time for each task :-)
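The technique in that note boils down to trimming the segment's extents in steps before the final drop, so no single operation has to free hundreds of thousands of extents at once. A rough sketch of the idea (BIGTAB and the sizes are illustrative; see the note itself for the exact procedure):

-- Drop the data but keep the allocated space
TRUNCATE TABLE bigtab REUSE STORAGE;

-- Release the space gradually, decreasing KEEP in steps
ALTER TABLE bigtab DEALLOCATE UNUSED KEEP 1500M;
ALTER TABLE bigtab DEALLOCATE UNUSED KEEP 1000M;
ALTER TABLE bigtab DEALLOCATE UNUSED KEEP 500M;
ALTER TABLE bigtab DEALLOCATE UNUSED KEEP 0;

-- Now the drop only has to remove an almost empty segment
DROP TABLE bigtab;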

Moving a datafile from File System to ASM

Moving datafiles from file system to ASM can be done in different ways, one of them using the RMAN copy command. These are the steps:

1. Check where to build a new file system based tablespace:

[oracle@rac1 ~]$ cd /u01/oradata/racdb
[oracle@rac1 ~]$ df -k .
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/vx/dsk/u01       31457280  17540576  13819976  56% /u01

2. Connect to sqlplus and create a new tablespace:

[oracle@rac1 racdb]$ sqlplus / as sysdba
SQL> create tablespace trtst datafile '/u01/oradata/racdb/trtst01.dbf' size 150M;
Tablespace created.
SQL> select file_name from dba_data_files where tablespace_name = 'TRTST';
FILE_NAME
--------------------------------------------------------------------------------
/u01/oradata/racdb/trtst01.dbf

3. Take the tablespace offline; you may take offline a single datafile of a multi-file tablespace if required:

SQL> ALTER TABLESPACE TRTST OFFLINE;
Tablespace altered.

4. Check where your datafiles are located on ASM:

SQL> SELECT FILE_NAME FROM DBA_DATA_FILES;
FILE_NAME
--------------------------------------------------------------------------------
+DATADG/racdb/datafile/users.259.623629857
+DATADG/racdb/datafile/sysaux.257.623629849
+DATADG/racdb/datafile/undotbs1.258.623629855
+DATADG/racdb/datafile/system.256.623629845
+DATADG/racdb/datafile/undotbs2.261.623630209
/u01/oradata/racdb/trtst01.dbf
6 rows selected.

5. Log out from sqlplus, start an RMAN session and execute the copy command. Note that for the destination inside ASM you just need to pass the disk group name when using OMF (Oracle Managed Files), which is the best practice in ASM:

[oracle@rac1 racdb]$ rman target / nocatalog
connected to target database: RACDB (DBID=621246832)
using target database control file instead of recovery catalog
RMAN> copy datafile '/u01/oradata/racdb/trtst01.dbf' to '+DATADG';
Starting backup at 19-JUN-07
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile fno=00006 name=/u01/oradata/racdb/trtst01.dbf
output filename=+DATADG/racdb/datafile/trtst.263.625644857 tag=TAG20070619T061416 recid=1 stamp=625644858
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:04
Finished backup at 19-JUN-07
RMAN> exit

6. Start a sqlplus session and rename the old file to the new ASM file:

[oracle@rac1 racdb]$ sqlplus / as sysdba
SQL> alter database rename file '/u01/oradata/racdb/trtst01.dbf' to '+DATADG/racdb/datafile/trtst.263.625644857';
Database altered.
SQL> alter tablespace trtst online;
Tablespace altered.

7. Remove the old file:

rm /u01/oradata/racdb/trtst01.dbf

8. Create a test object:

SQL> create table testtb tablespace trtst as select * from dba_source;
Table created.

End of the example.

RAC Proactive Monitoring Using OS Watcher

RAC, and single instance as well... OS Watcher is a monitoring application written by Carl Davis from the Oracle Center of Expertise. And I think it is wonderful! It installs in minutes and works straight away without any problems. It is the kind of application you may keep on your disk-on-key and install at a client site, so that after a couple of hours you will be able to check whether there is an OS related issue and where the bottleneck is located. OSW gathers statistics using the OS commands vmstat, iostat, mpstat, netstat and top. It creates one file per hour, which makes it easy to drill down to a problematic time and compare it to baseline information. OSW also provides a java interface to display the collected information in graphic format. Here you can find a quick start guide I prepared; you can review the full guide on Metalink: OS Watcher Guide.
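Starting the collector is a one liner. A sketch assuming OSW was unzipped under /opt/osw, taking a snapshot every 60 seconds and keeping the last 10 hours of data; check the guide for the exact script names and arguments in your version:

cd /opt/osw
# interval in seconds, hours of data to keep
nohup ./startOSW.sh 60 10 &
# ... later, to stop collection:
./stopOSW.sh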

Production Upgrade from 8.1.7.4 to 9.2.0.8

It is not uncommon for very critical databases to remain on obsolete versions longer than is convenient. If the database provides good service and reasonable performance, the risks related to any radical change make the people who need to make the decision to upgrade delay it as much as possible. Sooner or later the day comes when the increasing amount of problems, usually related to a big increase in the load and complexity of the applications running on the database, raises the need to upgrade to a more powerful version. At this moment the DBA gets a small window to shutdown the application during the weekend and upgrade the database as fast and safely as possible. The attached document, "8.1.7.4 - 32 bits Database Upgrade to 9.2.0.8 - 64 bits", is the checklist I prepared for such a case. Most of the work is done in the pre-migration phase. All upgrade steps are automated as much as possible. During the upgrade, the longest part of it, the compilation of thousands of user objects, is done in parallel. I tried this procedure running the compilation with one session only on a general rehearsal; I expect to decrease the execution time from 3.5 hours to about 2 hours on the next try, running on a more powerful machine with parallel (15) for the compilation process. Today most upgrades are done from 9i to 10g, and 11g is waiting around the corner. Nevertheless, there are still many critical production applications running on 8i databases. This procedure may serve as an example for similar situations.
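The parallel compilation itself can be done in more than one way. This is a rough sketch of the idea, not the exact script from the checklist: split the invalid objects among N concurrent sqlplus sessions by a modulo on object_id (names, the object type list and the session count are illustrative):

#!/bin/ksh
# Illustrative sketch: compile invalid objects in N concurrent sqlplus sessions
N=15
i=0
while [ $i -lt $N ]; do
  sqlplus -s "/ as sysdba" <<eof &
set head off feed off pages 0 lines 200
spool compile_$i.sql
select 'alter '||decode(object_type,'PACKAGE BODY','PACKAGE',object_type)||
       ' '||owner||'.'||object_name||' compile'||
       decode(object_type,'PACKAGE BODY',' body;',';')
from   dba_objects
where  status = 'INVALID'
and    object_type in ('VIEW','PROCEDURE','FUNCTION','TRIGGER','PACKAGE','PACKAGE BODY')
and    mod(object_id,$N) = $i;
spool off
@compile_$i.sql
exit
eof
  i=`expr $i + 1`
done
wait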

Linux and RAC with ASM install on Virtual Nodes, Step 2 of 5

In the previous post we installed the Oracle Distribution of Enterprise Linux. This second post, Step By Step Configure Linux for RAC on a Virtual Node, contains the configuration steps required to finish the setup of the first node:

- Install and configure VMware tools
- Configure the network
- Oracle user configuration steps for RAC on a virtual node
- General Linux configuration steps for RAC on a virtual node
- Storage configuration steps for RAC on a virtual node

Once these steps are completed we will be ready to replicate the node and build from it the second node, in step #3.

Linux and RAC with ASM install on Virtual Nodes, Step 1 of 5

Installing Linux and RAC with ASM in the constrained environment of a desktop PC or laptop is a very interesting learning experience for people interested in RAC who lack the possibility to install and try RAC on real servers. My first trial was made using the excellent step by step guide by Vincent Chan that was published on OTN. Since then I have tried this process a couple of times and developed a guide divided into 5 chapters:

- Linux Install
- Configure Linux for RAC
- Create and Configure the 2nd Linux Node
- Install RAC on 2 Linux Nodes
- Tests and Trials with RAC on Linux

In this post I'm attaching the first document, Step By Step Install of Linux Enterprise Edition on a Virtual Machine. The steps covered in this document are:

- Download and install VMware Server
- Download Enterprise Linux from Oracle
- Create a Linux virtual machine
- Install Linux Enterprise Edition

This RAC on virtual machines installation is not supported and should only be used for private learning purposes. All the steps to implement this trial on real servers are also covered in a previous post: Step by Step RAC on Linux; And Using Virtualization for Test and Training. Another excellent guide to installing RAC on Linux can be found in OTN's article Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and iSCSI, by Oracle Ace Jeffrey Hunter.

Clusterware Install hang when running root.sh on second node

Lately I've seen this behavior twice, once on HPUX and once on Linux: the Oracle Clusterware 10.2.0.1 install went on smoothly and then hung when running root.sh on the 2nd node, while displaying this message:

Startup will be queued to init within 90 seconds.

There were no errors logged on the second server; truss or strace showed the processes waiting... On the first server we saw errors logged in $ORA_CRS_HOME/log/<servername>/alert<servername>.log pointing to a problem accessing the OCR raw device.

On HP the problem was solved by configuring the raw devices using a whole LUN, as described in the install guide. On Linux the problem was solved by pointing directly to the raw devices instead of to soft links to them.
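A quick way to catch this kind of problem is to verify that the OCR device is readable from both nodes before running root.sh. A sketch assuming the OCR sits on /dev/raw/raw2, as in the raw device setup shown in an earlier post (adjust the device name to your environment):

# On each node, as root: verify the OCR raw device can be read
dd if=/dev/raw/raw2 of=/dev/null bs=1M count=1

# And that permissions and ownership match on both nodes
ls -l /dev/raw/raw2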

Enterprise Manager Database Control Reconfiguration

Enterprise Manager Database Control is of great help for RAC environment administration. For production environments it is advisable to set up a dedicated Grid Control server and keep the EM repository on it. For test and training purposes Database Control is excellent. This link has a guide to re-configuring EM using the Enterprise Manager Configuration Assistant (emca) when some misconfiguration makes Database Control incomplete or unavailable; removing it and reconfiguring it is a quick way to get it back to work.
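The core of that reconfiguration, from what I recall of 10.2, is a drop and recreate with emca; a sketch, with emca prompting for the SID, ports and passwords (on RAC, add the -cluster flag to both commands):

# Remove the existing Database Control configuration and repository
emca -deconfig dbcontrol db -repos drop

# Recreate the repository and the Database Control configuration
emca -config dbcontrol db -repos create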

RAC with ASM on Linux, Crash Scenario: All Oracle Homes Loss

In this 6th scenario we lose all Oracle Homes on one node. In this environment the Oracle Homes are installed under this path: /oradisk/app01/oracle/product. These are the installed Oracle Homes:

- ASM Home
- RDBMS Home
- CRS Home

The scenario is triggered using the following script:

cd /oradisk/app01
rm -rf *

Recovery from such a loss is greatly simplified by having a good backup of the Oracle Homes. On many sites you will not find such a backup, and while reinstalling the software is not an issue, restoring all the parameter files may sometimes be difficult.

The recovery steps include:

- Check available backups
- Restore the ASM and RDBMS homes from a good backup
- Reinstall Oracle Clusterware
- Restore the OCR from a backup taken before the crash
- Restart all RAC components
- Check

Crash script and full details of the test: Crash Scenario 5 - Oracle Homes Loss

Important!: Before running this crash scenario be sure to backup the Oracle Homes directories and the OCR; details on how to backup and recover them are included in the attached file.
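A cold backup of the homes can be as simple as a tar from root; a sketch assuming the homes live under /oradisk/app01/oracle/product, the backup destination is /backup, and the instances and CRS on this node are stopped first:

# As root, with ASM, the database and CRS stopped on this node
cd /oradisk/app01/oracle
tar czf /backup/oracle_homes_`hostname`_`date +%Y%m%d`.tar.gz product

# OCR logical backup to the same location (as root)
ocrconfig -export /backup/ocr_`date +%Y%m%d`.exp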

RAC with ASM on Linux, Crash Scenario: ASM Spfile Loss

In this 5th scenario we lose the ASM spfile. The ASM spfile in this case is located on a raw device, so backup and recovery from and to a raw device is explained. Backup and restore using create pfile from spfile and create spfile from pfile can also be used; in that case it will be necessary to shutdown the database and ASM instances, then start the ASM instance with the pfile and recreate the spfile. To simulate the ASM spfile loss we write zeros to its raw device; because the ASM spfile is accessed only at startup, this loss does not affect online work with the database or ASM instances.

The recovery steps include:

- Restore the ASM spfile from backup using dd
- Check

Crash script and full details of the test: Crash Scenario 4 - ASM Spfile Loss

Important!: Before running a crash scenario be sure to backup the ASM spfile; details on how to backup it are included in the attached file.
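The dd based backup and restore are symmetric. A sketch assuming the spfile raw device is /dev/raw/spfile+ASM.ora, the device name used in the raw device setup shown in an earlier post, and that a 1MB copy is enough to cover the spfile (block size and count are assumptions, adjust to your device):

# Backup the ASM spfile raw device to a file
dd if=/dev/raw/spfile+ASM.ora of=/backup/spfileASM.bak bs=1M count=1

# Restore it after the loss
dd if=/backup/spfileASM.bak of=/dev/raw/spfile+ASM.ora bs=1M count=1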

RAC with ASM on Linux, Crash Scenario: Voting Disk Loss

In this 4th scenario we lose the voting disk. The voting disk in this case is located on a raw device, so backup and recovery from and to a raw device is explained. To simulate the voting disk loss we write zeros to the raw device; this will cause CRS to fail, the database to crash and the server to get stuck. The voting disk and the OCR can and should be multiplexed. To multiplex a voting disk execute as root:

crsctl add css votedisk <path>

The recovery steps include:

- Reboot both servers
- Restore the voting disk from backup
- Restart CRS
- Restart all cluster components
- Check

Crash script and full details of the test: Crash Scenario 4 - Voting Disk Loss

Important!: Before running a crash scenario be sure to backup the voting disk; details on how to backup it are included in the attached file.
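Backing up the voting disk in 10g is a plain dd of the device; a sketch assuming the device name used earlier in this series, /dev/raw/votingdisk, and /backup as the destination (take the backup while the cluster is healthy):

# Backup the voting disk raw device
dd if=/dev/raw/votingdisk of=/backup/votingdisk.bak

# Restore after the loss, with CRS down
dd if=/backup/votingdisk.bak of=/dev/raw/votingdisk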
