Out Of Time? - Use The Shotgun or Reboot..

Preface

Last weekend was pretty exciting. We were upgrading the database of our customer's production 11.5.9 Oracle Applications instance to 10.2.0.3, and part of the effort was applying the ATG Family pack H.RUP6 patch. During the production run we hit some really weird errors that we had never seen in the pre-production rounds (we did 6 of them, to be exact). The possibility of rolling back seemed quite real.

Essentially, the AD.I.6 patchset failed while running autoconfig implicitly (the patch completed, but autoconfig was failing). Interestingly, Autoconfig was running fine on the admin/concurrent tier, but not working on the web/forms tier. Just to add to the details, the APPL_TOPs for the middle tiers (shared across multiple tiers) and Admin/concurrent tier were different.

This article talks about how we overcame this issue. Hopefully it will help someone else in the future too.

A Race against time..

It was a perplexing situation. We were budgeted/negotiated 36 hrs for the entire downtime (Sun JRE Vista Patching, HRMS family pack K3 and 10g upgrade), and time was slipping by. We couldn't afford to lose more time.

Truncated Class File?

The error we were getting on the web/forms tier was as follows:

middle_tier_1:web_prod> ./adautocfg.sh
Enter the APPS user password:
The log file for this session is located at:
/ORACLE/apps/prod/admin/prod_middle_tier_1/log/11151046/adconfig.log

AutoConfig is configuring the Applications environment...

AutoConfig will consider the custom templates if present.
Using APPL_TOP location : /ORACLE/apps/prod
Classpath : /usr/java/j2sdk1.4.2_07/jre/lib/rt.jar:/usr/java/j2sdk1.4.2_07/lib/dt.jar:/usr/java/j2sdk1.4.2_07/lib/tools.jar:/ORACLE/apps/prod/common/java/appsborg2.zip:/ORACLE/apps/prod/common/java

Exception in thread "main" java.lang.ClassFormatError: oracle/apps/ad/autoconfig/InstantiateFile
(Truncated class file)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:539)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:251)
at java.net.URLClassLoader.access$100(URLClassLoader.java:55)
at java.net.URLClassLoader$1.run(URLClassLoader.java:194)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
at oracle.apps.ad.tools.configuration.Customizer.getProdTopDrivers(Customizer.java:380)
at oracle.apps.ad.tools.configuration.Customizer.getAllDrivers(Customizer.java:358)
at oracle.apps.ad.tools.configuration.VersionConflictListGenerator.getAllConflicts(VersionConflictListGenerator.java:170)
at oracle.apps.ad.tools.configuration.VersionConflictListGenerator.main(VersionConflictListGenerator.java:426)
ERROR: Version Conflicts utility failed.
Terminate.

Ideas, anyone?

All right, so a class file is truncated. Our very first thought was that regenerating the jar files through the adadmin utility might fix it. It did not help.
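One quick sanity check at this point is whether the .class file on disk is structurally complete, which helps separate a genuinely damaged file from something stale in a cache. The sketch below is illustrative Python, not part of the AD toolset; `Reader` and `class_file_problem` are names invented here. It only walks the class-file layout (magic number, constant pool, fields, methods, attributes) and reports whether the bytes run out early. The JVM checks far more than this.

```python
import struct

# Byte length of each fixed-size constant pool entry, keyed by tag.
CP_FIXED = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
            12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}


class Reader:
    """Cursor over a byte string that raises EOFError when data runs out."""

    def __init__(self, data):
        self.data, self.pos = data, 0

    def take(self, n):
        if self.pos + n > len(self.data):
            raise EOFError("truncated: data ends at byte %d" % len(self.data))
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def u2(self):
        return struct.unpack(">H", self.take(2))[0]

    def u4(self):
        return struct.unpack(">I", self.take(4))[0]


def skip_attributes(r):
    for _ in range(r.u2()):
        r.u2()            # attribute_name_index
        r.take(r.u4())    # attribute body


def skip_members(r):
    # Fields and methods share the same layout.
    for _ in range(r.u2()):
        r.take(6)         # access_flags, name_index, descriptor_index
        skip_attributes(r)


def class_file_problem(data):
    """Return None if the layout is complete, else a short description."""
    r = Reader(data)
    try:
        if r.take(4) != b"\xca\xfe\xba\xbe":
            return "bad magic number"
        r.take(4)                      # minor_version, major_version
        count, i = r.u2(), 1
        while i < count:
            tag = r.take(1)[0]
            if tag == 1:               # Utf8: length-prefixed bytes
                r.take(r.u2())
            elif tag in CP_FIXED:
                r.take(CP_FIXED[tag])
            else:
                return "unknown constant pool tag %d" % tag
            i += 2 if tag in (5, 6) else 1   # Long/Double use two slots
        r.take(6)                      # access_flags, this_class, super_class
        r.take(2 * r.u2())             # interfaces
        skip_members(r)                # fields
        skip_members(r)                # methods
        skip_attributes(r)             # class attributes
    except EOFError as exc:
        return str(exc)
    if r.pos != len(data):
        return "%d trailing bytes" % (len(data) - r.pos)
    return None
```

Reading $JAVA_TOP/oracle/apps/ad/autoconfig/InstantiateFile.class in binary mode on each middle tier and passing the bytes to `class_file_problem` would show whether any host is actually reading a short file.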

Another thought was that a "Version Conflicts utility failed" message typically means something is pointing to incorrect code. The most common cause is the JDK version. Someone pointed out that the JDK we were using was 1.4.2_07, whereas 1.3.1 was certified with 11.5.9. But since our AD level was pretty high and the application had been working fine with JDK 1.4.2_07 before, we didn't think this was relevant.

Yet another person thought of removing the entry for Web tier from the TNS topology model itself. But, as can be seen here, even that idea did not work.

$ perl $AD_TOP/bin/adgentns.pl appspass=XXXXXX contextfile=$APPL_TOP/admin/prod_middle_tier_1.xml
##########################################################################
Generate Tns Names

##########################################################################
Logfile: /ORACLE/apps/prod/admin/prod_middle_tier_1/log/11151856/NetServiceHandler.log
Classpath :
/usr/java/j2sdk1.4.2_07/jre/lib/rt.jar:/usr/java/j2sdk1.4.2_07/lib/dt.jar:/usr/java/j2sdk1.4.2_07/lib/tools.jar:/ORACLE/apps/prod/common/java/appsborg2.zip:/ORACLE/apps/prod/common/java

Updating s_tnsmode to 'generateTNS'
UpdateContext exited with status: 0
Exception in thread "main" java.lang.ClassFormatError: oracle/apps/ad/tools/configuration/NetServiceHandler (Truncated class file)
at java.lang.ClassLoader.defineClass(ClassLoader.java:539)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:251)
at java.net.URLClassLoader.access$100(URLClassLoader.java:55)
at java.net.URLClassLoader$1.run(URLClassLoader.java:194)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
Error generating tnsnames.ora from the database, temperory tnsnames.ora will be generated using
templates
Instantiating Tools tnsnames.ora
Exception in thread "main" java.lang.ClassFormatError: oracle/apps/ad/autoconfig/InstantiateFile
(Truncated class file)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:539)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:251)
at java.net.URLClassLoader.access$100(URLClassLoader.java:55)
at java.net.URLClassLoader$1.run(URLClassLoader.java:194)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
Error in instantiating tools tnsnames.ora:
Exception in thread "main" java.lang.ClassFormatError: oracle/apps/ad/autoconfig/InstantiateFile (Truncated class file)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:539)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:251)
at java.net.URLClassLoader.access$100(URLClassLoader.java:55)
at java.net.URLClassLoader$1.run(URLClassLoader.java:194)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)
Error in instantiating web tnsnames.ora:

adgentns.pl exiting with status 256
ERRORCODE = 256 ERRORCODE_END

Well, so far, no luck. Then we realized that the context file is also uploaded into the database by OAM (remember the OAM screen through which you can edit the autoconfig tokens?). We had thoroughly checked the context file of the middle tier, but maybe the database copy was corrupt? We came across some Metalink notes (e.g. Note:463895.1, Note:352916.1) that talked about purging all uploaded context files (kept in applsys.FND_OAM_CONTEXT_FILES). So potentially, we could take a backup of the applsys.FND_OAM_CONTEXT_FILES table, delete the entry corresponding to middle_tier_1 and then run autoconfig on middle_tier_1 to reload the data. This was a realistic possibility.

Wait a minute .. Can we take the shotgun approach?

OK, so while we were thinking of all these options, the thought came to my mind that we had 5 middle tiers, so why don't we try to run autoconfig on one of the other middle tiers. Most surprisingly, Autoconfig ran fine on another middle tier!!

Hmph, all of them were sharing the same $JAVA_TOP, so what could make the other middle tiers behave differently? Maybe a difference in an autoconfig token or environment variable? But, even after comparing all the entries in XML file and environment variables, there was no perceptible difference. Strange, very strange.
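For anyone repeating that comparison, here is a rough Python sketch of diffing the tokens in two context files. It assumes each token is an XML element carrying an `oa_var` attribute, which is how the 11i context files I have seen are laid out; the function names are made up for this post.

```python
import xml.etree.ElementTree as ET


def context_tokens(xml_data):
    """Map each oa_var token name to its stripped text value."""
    tokens = {}
    for elem in ET.fromstring(xml_data).iter():
        name = elem.get("oa_var")   # assumption: tokens are tagged with oa_var
        if name is not None:
            tokens[name] = (elem.text or "").strip()
    return tokens


def diff_tokens(left_xml, right_xml):
    """Return {token: (left_value, right_value)} for tokens that differ.

    A token missing from one side shows up with None on that side.
    """
    left, right = context_tokens(left_xml), context_tokens(right_xml)
    return {name: (left.get(name), right.get(name))
            for name in sorted(set(left) | set(right))
            if left.get(name) != right.get(name)}
```

Feeding it the bytes of $APPL_TOP/admin/prod_middle_tier_1.xml and the context file of a second tier would list only the tokens that differ. Host names and ports are expected to show up; anything else deserves a closer look.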

This was a good validation point for us, as we could conclude that there was something wrong with middle_tier_1, but not with the others. So, we could simply continue patching with another middle tier and hope to not get similar issues in the future. middle_tier_1 could be taken out of the load balancer, dealt with later and when fixed, put back in the load balancer. In the interim, it sounded like a workable strategy, so we took it. This is what saved us from spinning our wheels on this issue.

So what was it, really?

We continued patching and successfully migrated the production instance to 10g (Yippee!).

One thing did strike me though much later: the versions of the InstantiateFile.class were different in $JAVA_TOP and $AD_TOP when we were getting this error. For example, this was the situation when we were having this error:

middle_tier_1:web_prod> adident Header $JAVA_TOP/oracle/apps/ad/autoconfig/InstantiateFile.class
/ORACLE/apps/prod/common/java/oracle/apps/ad/autoconfig/InstantiateFile.class:
$Header InstantiateFile.java 115.212 2007/07/10 11:20:16 schagant ship $

middle_tier_1:web_prod> adident Header $AD_TOP/java/oracle/apps/ad/autoconfig/InstantiateFile.class
/ORACLE/apps/prod/ad/11.5.0/java/oracle/apps/ad/autoconfig/InstantiateFile.class:
$Header InstantiateFile.java 115.203 2006/11/01 08:05:36 subhroy ship $

Later on, after the outage, when I checked the versions again, this is what I saw:

middle_tier_1:web_prod> adident Header $JAVA_TOP/oracle/apps/ad/autoconfig/InstantiateFile.class
/ORACLE/apps/prod/common/java/oracle/apps/ad/autoconfig/InstantiateFile.class:
$Header InstantiateFile.java 115.212 2007/07/10 11:20:16 schagant ship $

middle_tier_1:web_prod> adident Header $AD_TOP/java/oracle/apps/ad/autoconfig/InstantiateFile.class
/ORACLE/apps/prod/ad/11.5.0/java/oracle/apps/ad/autoconfig/InstantiateFile.class:
$Header InstantiateFile.java 115.212 2006/11/01 08:05:36 subhroy ship $

I am now forced to think that maybe this contributed to the issue at hand. If we had known that this .class file had multiple copies across the APPL_TOP and JAVA_TOP, we could have simply run adadmin with the "Maintain files -> Copy files to destinations" option to sync up the duplicate copies. But then, why didn't the other middle tiers complain? The APPL_TOP and JAVA_TOP were shared amongst all of them. So maybe it wasn't the real problem.
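adident compares header versions one file at a time. To sweep a whole tree for duplicate copies that have drifted apart, a byte-level comparison is one option. The following is a small Python sketch under my own assumptions (the helper names are hypothetical, and it only looks at files that exist under both roots):

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's contents, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def mismatched_copies(source_root, dest_root, pattern="*.class"):
    """List relative paths present under both roots whose bytes differ."""
    source_root, dest_root = Path(source_root), Path(dest_root)
    stale = []
    for src in sorted(source_root.rglob(pattern)):
        rel = src.relative_to(source_root)
        dst = dest_root / rel
        # Files that exist only on one side are ignored here.
        if dst.is_file() and file_digest(src) != file_digest(dst):
            stale.append(rel.as_posix())
    return stale
```

Called with the $AD_TOP/java directory as the source and $JAVA_TOP as the destination, it would have flagged oracle/apps/ad/autoconfig/InstantiateFile.class in our case. A difference in bytes does not say which copy is right, only that the two are out of sync.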

Later on, when we rebooted the middle_tier_1 box (RHAS3 32-bit Linux), autoconfig ran successfully! The idea of rebooting came from Note 556107.1, Java.Lang.Classformaterror: oracle/apps/ibe/store/StoreCurrency (Truncated class file). If the reboot really was what resolved the issue, perhaps some corrupted shared module in memory got cleared out?

Come to think of it, when nothing seems to work on Windows, we do reboot it.

In Retrospect..

So, I don't have all the answers, but a little lateral thinking saved us and enabled us to proceed. Sometimes, the shotgun approach does work. It pays to have multiple middle tiers in the architecture, so that there is no single point of failure. Also, sometimes, rebooting can resolve some really weird errors.

I'm reminded of a phrase in Hindi -- अकल बड़ी या भैंस ? (loose translation: sometimes brute force is superior to elaborate reasoning). You can be the judge.

Comments:

Good one !!

Posted by Atul on November 23, 2008 at 10:44 PM EST #

I would also consider the NAS/OS caching as a possible cause as well. Regardless, rebooting the client would resolve both.

Posted by Pete on November 24, 2008 at 05:32 AM EST #
