In-Depth: Thoughts on Testing from a Battle-Scarred Support Engineer
By Nick Quarmby on Jan 24, 2011
"The more I practice, the luckier I seem to get."
"If you think safety is expensive, just try having an accident."
I think the two quotes above sum up the importance of preparation and planning pretty accurately. The first has been attributed to various professional golfers over the years and suggests luck is never accidental. If things work, they work for a reason and that reason is usually good preparation, not good luck.
The second quote has been attributed to various airline industry executives. Here the message is that however much time, and inevitably money, that you spend making sure your planes do not crash, it will cost you considerably more if one of them does.
The real message is of course that planning and preparation pay off in the long term and, in the context of your Oracle Applications upgrade, you can conclude that successful upgrades are the result of good testing, not good fortune.
This article is just a few thoughts from a battle-scarred Support Engineer on why good testing is crucial to the success of your system and why it is important for you to thoroughly test any patches, upgrades or migrations before carrying out the work on your production environment. The great majority of the work that you do in any project will be testing. Underestimate it at your peril. Testing is not simply a necessary evil which has to be endured before the inevitable, white-knuckle ride of the production upgrade. Testing is not where you should be cutting corners. Testing is where you should be going into every conceivable corner you can find and looking for something that might, without warning, jump out of the shadows and derail your production upgrade.
There are links at the end of the document to various tools you have access to which may help your testing. This article is not intended to go into the specifics of testing, however. It simply discusses how testing should be approached and will hopefully raise awareness of the consequences of not doing enough testing.
A significant Oracle Applications upgrade is a project that will be measured in months, not weeks. The most important part of that long process will be the work that is done before the production upgrade.
An upgrade or maintenance exercise is usually triggered by a business or technical requirement. Once a need to upgrade is identified, the work always seems to be considered urgent - these days everything is urgent. Early on in the project you need to clearly identify what is needed to achieve a successful upgrade and present this to the people driving the project. You may sometimes feel you are presented with unreasonable timescales to complete a project and that commercial pressures beyond your control have placed you in this position. This is sometimes inevitable, but it is for these reasons that, as early as possible in a project, you should emphasise to the people driving your business that systems do not change over the course of one frenetic weekend. Well, actually, they do, but what happens during that weekend is simply the tip of the iceberg.
Behind the scenes you must do a lot of work to prepare for that tiny weekend window at the end of your long project plan. It's sometimes inevitable that tasks tend to concertina towards the end of a project and, whilst some people work better under pressure with a visible and looming deadline, many others do not. Use your time wisely unless you look forward to sleepless nights.
Testing should be performed on a complete cloned copy of your production environment, and, where possible, on an identical hardware environment. If migrating to new hardware, your upgrade testing should include the new hardware so you are confident you can configure the software on the new hardware and that it performs correctly in the new environment.
You should never consider applying untested patches on your production environment. However urgent a patch may be, it is never so urgent you should risk the stability of the whole production environment by applying it untested. Oracle Applications patches (patches applied using adpatch) are not reversible without using database rollback/flashback features, and reversing one would also require a detailed analysis of the patch and its log files in conjunction with your specific system. You should never assume you can manually reverse even the simplest of Oracle Applications patches. Oracle Support cannot provide this service for you remotely via a Service Request (SR).
Do not rely on the assurances of your suppliers and assume everything will work on the day. You need to prove to yourself that your suppliers will deliver what you need and when you need it. Contact them early in the project - they will be grateful for the opportunity to be part of your plans. If things go wrong, having a supplier to blame for things not working may absolve you of some responsibility but you will still have to present this failure within your organisation and that will reflect on you, no matter how much you consider that you were not responsible for that failure.
Always follow the documentation. Context sensitive documents exist for upgrading all components of Oracle Applications and should be followed. If you cannot find an appropriate document for a process in your upgrade, contact Oracle Support who should be able to advise you of the right Note or manual to follow. Do not assume generic documentation will be applicable to your Oracle Applications environment.
Our published documentation is designed to achieve a successful upgrade. It is not tested by trying to perform the upgrade steps in a different order or by deliberately omitting steps to see how the upgrade will turn out. You should perform all steps in the upgrade documentation unless you have a clear reason not to, or clear guidance from Oracle Support that a step can be omitted.
Consider phasing your upgrade. If a major upgrade involves both a database and Applications upgrade you may be able to perform this in separate, shorter downtimes. 10gR2 and 11gR2 are certified on both 11i and R12 so you could perform a database upgrade during your first downtime, return the system to your users and perform the Applications upgrade in the second downtime at a later date. This will increase the testing you have to do and will also require additional user acceptance testing but you may find this method works for you.
Testing is not just to establish that the technical upgrade is a success. Testing should also include ensuring that the system you build performs acceptably under load and remains stable. The links at the end of this article refer to products you can use to measure performance and load on your environment.
The truth is the hardest work you will do in your whole upgrade project will be the testing and preparation. It is during testing that you will find where you have to reduce what takes two weeks into something that you'll be lucky to get 48 hours to perform. It is during testing that you will have to find imaginative ways to solve the issues you encounter.
Issues Encountered During Testing
Some people see testing as something that is only for the overly cautious - those people who are too scared to confront real-time events when they arise. We may live in an increasingly risk-averse society but the truth is that smart people are always testing. Only fools rush in. A macho or cavalier attitude to testing rarely produces a reliable or stable environment.
Your testing will reveal issues that have to be solved. These truly are opportunities and not problems. This is why you are testing in the first place - so that you do not see these errors when you come to the production upgrade. If you encounter issues during testing then you must find a solution to these issues or find a workaround that does not compromise the rest of the upgrade. Skipping a documented step because you don't think it's significant can easily cause problems further down the line.
Perform at least one test upgrade before you make a commitment to how soon you can perform your production upgrade. A well run project should allow the key people involved time to scope out their work before committing to any deadlines. If you do not know the scale of the job, you cannot accurately plan for the production upgrade. As part of the whole project, you should plan to perform a minimum of three successful test upgrades.
During testing you will see software acting inconsistently under apparent laboratory conditions. Some people seem to accept software is a somewhat capricious product that can behave unpredictably and that it's acceptable to see it acting differently from one day to the next so long as it doesn't actually do the wrong thing. This is not how commercial business software is designed or expected to behave. You should be able to find a reason for that inconsistent behaviour. If you cannot, then you have to make a call on how significant you believe that inconsistency is, but be assured, there is a reason for this apparently erratic behaviour. Software - even fiendishly complicated software - does not have a mind of its own.
Use Service Requests (SR) with Oracle Support to validate your upgrade plans and to help you if you encounter any technical issues during your testing. An upgrade where the first we hear from you is when you raise a Severity one SR during the production upgrade is already a failing upgrade. We want to know about any issues you encounter during testing when these can be handled as Severity two and three type SRs where you can work closely with a single engineer, usually in your own time zone, for the duration of the SR.
Be flexible in your testing. Oracle may release a new maintenance pack or updates to patches or the technology stack in the middle of your testing. Consider whether you should now integrate these later releases into your planned upgrade. This will require additional testing but you may not get another downtime window for some time so it's important you upgrade to the latest software whenever possible.
The Production Upgrade
Some DBAs approach a production upgrade with about as much relish as they approach invasive dental work. They know it will be painful, there may be some sleepless nights, and it will cost much more than expected. This is a worrying but increasingly common scenario and it's a real concern to think that some customers feel that the production phase of an upgrade is not only the most stressful part of their project but also the stage most likely to produce an unpredictable outcome.
It would be wrong to dismiss the above concerns as groundless but it's important to realise that, with good planning and testing, the production upgrade may not exactly be a formality, but it should certainly be something for which you are fully prepared: ready for every scenario you can think of, and probably a few you have not thought of.
If you approach your production upgrade with dread, unsure of what might go wrong, and wondering if it will succeed then this is almost certainly due to a lack of effective preparation and testing. You may feel under pressure to deliver a system but this is nothing compared to the pressure you will feel if you deliver nothing at all. If you really go into your production upgrade being unsure of its success then you should not be contemplating performing it. Production downtime is not something you will be offered regularly and it should not be lost on speculative or poorly tested work.
The Apollo astronauts, on their way to the Moon, had a "go/no go" decision at each critical point in their mission. Your production upgrade should run on similar lines. At each significant stage in the production upgrade you should decide whether you are on course to reach your objective. You should not be afraid to abandon a production upgrade which has gone wrong and instead focus on returning the old system to your users as quickly as possible. This should be the worst scenario you can contemplate but your project plan should include a contingency for it. Your users may not thank you for this but having the old system available on Monday morning is always more acceptable than having no system at all to offer them. You can always come back to fight another day. At no time should you be in a position that you have nothing to offer your users but a broken, part-upgraded environment.
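As a rough illustration of the "go/no go" idea, the sketch below checks each stage of an upgrade against a time budget taken from your test runs and reports the first stage at which you should stop and restore the old system. The `Stage` fields, the stage names and the hours are all hypothetical and are not taken from any Oracle tool; treat it as one possible way of writing the decision down, not a prescribed method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stage:
    name: str               # hypothetical stage label from your project plan
    deadline_hours: float   # latest acceptable elapsed time, based on test upgrades
    elapsed_hours: float    # actual elapsed time when the stage finished
    succeeded: bool         # did the stage complete without unresolved errors?


def first_no_go(stages: List[Stage]) -> Optional[str]:
    """Return the name of the first stage that is a "no go", or None if all are "go".

    A stage is a "no go" if it failed or if it finished later than its deadline.
    """
    for stage in stages:
        if not stage.succeeded or stage.elapsed_hours > stage.deadline_hours:
            return stage.name
    return None


# Example with made-up numbers: the second stage overruns its budget.
stages = [
    Stage("database upgrade", deadline_hours=10, elapsed_hours=9, succeeded=True),
    Stage("apply patches", deadline_hours=30, elapsed_hours=34, succeeded=True),
]
stop_at = first_no_go(stages)
if stop_at is not None:
    print(f"NO GO at '{stop_at}': return the old system to your users")
```

The point of writing the deadlines down before the weekend starts is that the decision to abandon is then made against figures you agreed in advance, not against how tired or optimistic you feel at the time.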
Your production upgrade is not the place where you should be trying any of the following:-
- Fixing errors that you have not seen during testing
- Trying out a different upgrade path that you did not think of during testing
- Applying patches that you did not require during testing
The common message in the above is that you should not be trying anything in your production upgrade that you have not done during testing. This should be pretty much a golden rule of your production upgrade. One day you may have to break that rule but that day will be an exceptional one.
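One simple way to hold yourself to this rule is to compare what you plan to do in production against what you actually did in your test upgrades. The sketch below is only an illustration under that assumption: the function name and the placeholder identifiers such as "patch-A" are hypothetical, and in practice the two lists would come from your own run books or patch logs.

```python
from typing import List


def untested_items(production_plan: List[str], tested: List[str]) -> List[str]:
    """Return the items in the production plan that never appeared in testing.

    The order of the production plan is preserved so the result reads like
    the plan itself.
    """
    tested_set = set(tested)
    return [item for item in production_plan if item not in tested_set]


# Placeholder identifiers, not real patch numbers.
plan = ["patch-A", "patch-B", "patch-C"]
done_in_testing = ["patch-A", "patch-C"]

leftover = untested_items(plan, done_in_testing)
if leftover:
    print("Not seen in testing:", ", ".join(leftover))
```

An empty result does not prove the upgrade will succeed, but a non-empty one tells you that you are about to break the golden rule.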
Using standby database functionality from Data Guard is very useful when upgrading. You can configure up to nine standby databases off your production database. If, for example, you have a standby database, you can decouple this at the start of the upgrade. This is then immediately available as a pre-upgrade backup should your production upgrade fail. If the production upgrade succeeds then you just bring this standby database back online after the upgrade and it will automatically resynchronise with the new production database through Data Guard. This may be a little more complicated if your upgrade includes a database upgrade but standby databases can still have an important part to play in any maintenance exercise.
It's important to approach a production upgrade with a positive mental attitude. A belief that what you are going to do will succeed means you will approach any problems with a belief that you can solve them. If you expect your upgrade to fail then problems come like body blows and you struggle to pick yourself up and fight back. You will only have that positive mental attitude if the weeks and months you have spent prior to the production upgrade have been used constructively giving you the confidence you need to make your production upgrade a success.
When Things Go Wrong
Fans of The Hitchhiker's Guide to the Galaxy will know that printed on the cover of this book is the phrase "DON'T PANIC" (always in upper case). This is always good advice in difficult circumstances. A calm head is always best when things don't seem to be going to plan.
Always have a contingency plan. Expect things to go wrong and congratulate yourself when they don't, but do not assume that the most fastidious of testing will guarantee a success. Having a contingency or backup plan will support you. You'll be reassured that you were smart enough to plan for every eventuality.
Do not expect to rely on others in a crisis. This is your upgrade and your responsibility. Your colleague at the next desk may be sympathetic to your predicament but that does not mean they do not have their own issues to deal with. Nobody knows your upgrade as intimately as you do. If your production upgrade fails and you need help, consider how much time you may have to spend familiarising a third party with your upgrade and your environment before they can offer constructive help as to how you might overcome your current problem. They may be able to help you but if they cannot, then the onus is back on you to find a solution to your problem. Good preparation before you started your upgrade will be invaluable here.
Do not assume that if a component of the upgrade fails then you can resort to the supplier and they will be responsible for the failure. That may give you a solution in the long term but it may not fix your immediate short term problem. Having a new solution after the event may deflect some of the recriminations as to why the upgrade failed but that will not make the failure go away.
We want your upgrade to succeed. Oracle Support is available to help during testing and also during your production upgrade. Make sure you use us to ensure everything we offer is available to you throughout your project. Engage your Service Delivery Manager (SDM) and ensure they know about your upgrade. They can make Oracle Support aware of your plans.
The media is full of stories about IT upgrades that have failed. Outside of promotional literature, you rarely read a headline that says "IT Project Goes Exactly to Plan." That does not mean that successes do not exist -- it just means that they do not make the news.
There is rarely much glory in doing your work quietly, efficiently and delivering what is promised, on time and within budget. But if you look around your organisation you will notice that some of the most successful people in the business are the ones who do exactly that. They're not lucky. They planned it that way.
- Real Application Testing Certified With E-Business Suite
- EBS 12.1.1 Test Starter Kit now Available for Oracle Application Testing Suite
- Evolutionary Steps for Automated Testing for E-Business Suite
- Oracle Application Testing Suite 9.0 Supported with Oracle E-Business Suite - http://blogs.oracle.com/stevenChan/2009/10/oats_ebs_certified.html
- Automated Testing for the E-Business Suite - http://blogs.oracle.com/stevenChan/2006/06/automated_testing_for_the_ebus.html
- Field-Tested Advice for Smooth EBS 18.104.22.168 Upgrades
- Upgrading EBS 12 to DB 11g with Physical Standby Enabled