Accrue Economically: Don't recalculate what you don't need
By gaurav.verma on Nov 12, 2007
Preface
In one customer's production 11.5.9 eBusiness Suite instance, POXACRCR (PO Accrual Rebuild Reconciliation Report) was one of the highest consumers of I/O, specifically logical reads, which ran into the billions. The report used to take 10-12 hours every night and do ~2 billion logical reads in a single execution.
Tying I/O audit information to concurrent job information is an effective way of finding the highest I/O batch consumers in a production system. Please look at the "How was it done" section in the We're Skewed! Using 10053 event for being Histogram-wise article.
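As a hedged illustration of that tie-up, a query along the following lines can map a running concurrent request to its session-level logical reads. The join between FND_CONCURRENT_REQUESTS.ORACLE_SESSION_ID and V$SESSION.AUDSID is an assumption that holds on typical 11i instances; verify it on yours.

```sql
-- Sketch only: map running concurrent requests to session logical reads.
-- Assumes FND_CONCURRENT_REQUESTS.ORACLE_SESSION_ID stores the AUDSID.
select r.request_id,
       r.concurrent_program_id,
       st.value        as logical_reads
  from apps.fnd_concurrent_requests r,
       v$session  s,
       v$sesstat  st,
       v$statname n
 where r.phase_code   = 'R'                    -- currently running requests
   and s.audsid       = r.oracle_session_id
   and st.sid         = s.sid
   and st.statistic#  = n.statistic#
   and n.name         = 'session logical reads'
 order by st.value desc;
```

Run periodically during the batch window, the top rows of this output point straight at the biggest I/O consumers among the concurrent jobs.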
The fact that it was a seeded Oracle report made it more perplexing. This case study deals with the travails of the performance investigation effort for this concurrent report. It should be useful for clients who have implemented the Purchasing (PO) and Payables (AP) modules.
What does POXACRCR do in 11.5.9?
There are several informative notes on Metalink about POXACRCR (Accrual Reconciliation Report) and the associated POXACREC (Accrual Rebuild Reconciliation Report). One calculates the accrual data (POXACREC); the other displays it (POXACRCR).
The FAQ Note 433683.1 (FAQ For On Line Accruals And Accrual Reconciliation Reports) talks about various aspects of this report and associated issues. The Accrual Reconciliation Report is used to track and reconcile On Line Accrual Balances to the GL Accrual Account.
The report helps you monitor potential problems
with purchasing and receiving activities that can affect
the accuracy of the A/P accrual accounts.
Why was it running every day?
Now we come to the interesting part of why this report was running every night. The customer had really old blanket purchase orders (dating back to 2005) which were being matched to invoices in 2007. So the financial users thought this required running the report every night to match the accrual invoice transactions in AP against the purchasing sub-ledger.
What was the caveat?
As of a certain version in the 11.5.9 release, POXACRCR has a parameter for rebuilding accrual data too. This was being set to Yes in the nightly run. Also, the incremental calculation functionality was not present until patch 5637381 (CST: ONDEMAND: POOR PERFORMANCE OF ACCRUAL RECON REBUILD - POXACRCR), which delivers the incremental refresh mode functionality.
Since the customer was not at this level, the system was recalculating accruals from the beginning of the implementation to the current day, every night. No one had realized what a colossal waste of resources it was to keep recalculating the same 99% of the data every night.
Did it really need to run every night?
Now we come to the question of whether this reinvent-the-wheel approach to getting the same accrual calculation was really needed. Obviously, after patch 5637381, we could leverage the incremental rebuild calculation and get away with running only a day's worth of accrual transactions, adding to the previously calculated data.
Also, since a month end closing is done at the sub-ledger and main ledger level, the financial data is not expected to change after the statement generation program is run. Changing the accrual data in the sub-ledger for previous months or years is not a usual financial practice. Therefore, there was no need to wipe out the previous financial information and start from ground zero again. Extending this thought, calculating the accrual information for the current month alone would be sufficient to make the calculations current (assuming the previous periods' accrual calculations are retained).
The common perception on the financial business users' side was: "Since I am matching receipts to blanket purchase orders dating back to 2004, I think I need to run the report again and again to make the accrual data current." While this was true before the incremental functionality patch was applied, after it we could get away with just adding the deltas to the existing data.
Change of Behaviour with an example
After the Incremental Refresh enhancement, the behaviour of the report changes such that existing data in the PO_ACCRUAL_RECONCILE_TEMP_ALL table is deleted only if data exists in the table for the given period. Prior period data remains as is and does not get deleted. Data for new periods gets added on to the existing data.
Let's take the following scenario where the feature is in use.
When we run the report for the first time for the period 01-JAN-1999 to 31-DEC-2001, data for this period is populated into the PO_ACCRUAL_RECONCILE_TEMP_ALL table, and the same is used for generating the report, if required.
Now, when the Accrual Reconciliation Rebuild program is run with a date range of, say, 01-DEC-2001 to 31-DEC-2004, only the already existing data for the overlapping period of 01-DEC-2001 to 31-DEC-2001 (i.e. from the 'From Date' of the current run, 01-DEC-2001, to the 'To Date' of the previous run, 31-DEC-2001) gets deleted. The existing data for the non-overlapping period from the previous run, i.e. 01-JAN-1999 to 30-NOV-2001, remains as is, while data for the new period gets processed and inserted into the table. Thus the table PO_ACCRUAL_RECONCILE_TEMP_ALL would now have data from 01-JAN-1999 to 31-DEC-2004. We can use the Accrual Reconciliation Report concurrent program to generate reports for any date range between 01-JAN-1999 and 31-DEC-2004.
This way, the accrual data can be built up over successive runs/periods, and the Accrual Reconciliation Report can be used to generate report outputs for (taking the above example) 01-JAN-1999 to 31-DEC-2004 without having to rebuild the data for the earlier periods.
After applying this patch, we can build the data from the date we wish to start tracking accruals by running the Accrual Rebuild, either in one run or in multiple runs with successive date ranges (depending on the volume of data) up to the last cutoff date. Thereafter, subsequent runs of the Accrual Rebuild program continue to build on the data going forward.
PS: This content has been taken from the README of patch 5637381.
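Before choosing the next incremental rebuild date range, it helps to know which periods are already built. A minimal sketch follows; the TRANSACTION_DATE column name is an assumption about the temp table's layout, so check the actual columns on your instance first.

```sql
-- Sketch: check the date coverage already present in the accrual temp table,
-- so the next incremental rebuild run starts at the right 'From Date'.
select min(transaction_date) as earliest_built,
       max(transaction_date) as latest_built,
       count(*)              as rows_built
  from po.po_accrual_reconcile_temp_all;
```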
How do we test it?
Coming up with a credible test plan was a big challenge. Ultimately, we came up with the following plan to test both the basic functionality and the I/O gain:
Assumptions: This was done on a test instance cloned in August 2007. Since the month end closing for July 2007 had already been done on this instance, the initial baseline was run from Jan 2004 to Jul 2007.
1. Turn session audit on to get I/O related statistics
2. Run Accrual Rebuild Reconciliation Report from Jan 1, 2004 thru July 31, 2007
3. Apply identified patches (DBA stuff)
4. Rerun Accrual Rebuild Reconciliation Report from Jan 1, 2004 thru July 31, 2007 & compare the timings
5. Financial business users validate data
6. Financial business users do write-offs against the Blanket POs & Returns --> match receipts
7. Do postings in GL
8. Financial business users rerun rebuild report for that day only when 6) and 7) were done
9. Run 'Accrual Reconciliation Report' and validate accrual data information (done by Financial business users)
10. Compare I/O information for Accrual Rebuild Reconciliation Report before and after the patches
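For step 10, one hedged way of comparing I/O (assuming the statements are still in the shared pool when you sample) is to rank the library cache by logical reads right after each run:

```sql
-- Sketch: top 10 statements by logical I/O. Run after the baseline and
-- again after the patched run, then compare buffer_gets for the POXACREC sqls.
select *
  from (select sql_text, executions, buffer_gets, disk_reads
          from v$sqlarea
         order by buffer_gets desc)
 where rownum <= 10;
```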
Teething troubles...
The world is not perfect. One would have expected the runtime of a complete run to come down drastically after applying the patches for incremental functionality, but instead, it became significantly worse! Compare the 10-hour baseline run in DEV against a 20-hour run in the same instance for a period of 3 years.
Even gathering statistics on the PO_ACCRUAL_RECONCILE_TEMP_ALL table did not help. That's when we found patch 5669511 (CST: POOR PERFORMANCE ON ACCRUAL REBUILD AFTER INCREMENTAL PATCH).
Still, the following select statement was coming as the highest time taker:
On doing some research, it turned out that Bug 5978709 had been logged for the same scenario. That customer had aborted the request in the TAR associated with the bug after it hung at the same select statement for 30 hours. Later on, the customer removed the hint and used that as a workaround. I tried the same in the DEV instance and was able to bring the runtime down to 3 hours flat.
The problematic query without the hint took less than 1 hour to execute. Compare that against the previous run, which took many hours!
There was another patch, namely 5186477 (POXACRCR PERFORMANCE ISSUE), which had also addressed something similar (and a little more) by adding another hint, /*+ leading (aah) */, to the query. But taking out the hint worked better than that. A backport to version 115.64.1159.44 of POXACREC.rdf was probably not worth it.
Teething troubles continued...
After this adventure, it was observed that concurrent runs for 1-day, 15-day, or 30-day periods were all taking approximately 1 hour. What an oddity!
Tkprofs consistently showed the following INSERT INTO ... SELECT FROM ... statement as the highest time taker.
If you look at the tkprof output above, the insert statement processed 0 rows and took 5,700 seconds, which is outrageous. It seemed that the /*+ ORDERED USE_NL (PAT WTA WTS) */ hint in the select statement was bogging it down, or rather, the plan being used did not seem to follow the hint properly.
The problem is the full table scan on WIP_TRANSACTION_ACCOUNTS in the plan:
0 TABLE ACCESS GOAL: ANALYZED (FULL) OF 'WIP_TRANSACTION_ACCOUNTS'
This insert statement was the single biggest pain point in the current run. The full table scan cost of PO_ACCRUAL_ACCOUNTS_TEMP is just 2, whereas the FTS cost of WIP_TRANSACTION_ACCOUNTS is 47,223!
After removing the hint, the overall cost of the query came down to 4,273. Compare that against the previous cost of 125,520!
I searched the stored PL/SQL code for where this hint was being set and could not find it anywhere in the APPL_TOP patch/115/sql directories. Then I searched the POXACREC.rdf file using the strings command and found it there.
Working it around using Stored Outlines...
Although I have bug 6438458 open with development, a quick test of the new plans without the hint could be done using stored outlines. Using stored outlines is easy.
You first create the stored outlines using these templates:
SQL> create or replace outline poxacrcr_select_with_hint on <sql with the hint included>;
SQL> create or replace outline poxacrcr_select_without_hint on <sql without the hint included>;
Now, as the SYS user, you exchange the hints between the two outlines (the classic OL$HINTS name-swap technique):
SQL> show user
USER is "SYS"
SQL> update outln.ol$hints
  2  set ol_name = decode(ol_name,
  3        'POXACRCR_SELECT_WITH_HINT', 'POXACRCR_SELECT_WITHOUT_HINT',
  4        'POXACRCR_SELECT_WITHOUT_HINT', 'POXACRCR_SELECT_WITH_HINT')
  5  where ol_name in ('POXACRCR_SELECT_WITH_HINT', 'POXACRCR_SELECT_WITHOUT_HINT');
53 rows updated.
You can now drop the outline for the sql without the hint, because we are not going to use it anymore; its utility (exchanging the hints) is over.
SQL> drop outline poxacrcr_select_without_hint;
Similarly, the plan for the INSERT INTO < > SELECT ... FROM ... statement could be influenced as well.
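One operational detail worth noting: stored outlines are only picked up once outline usage is switched on. USE_STORED_OUTLINES is not an init.ora parameter; it can only be set via ALTER SESSION/SYSTEM. A sketch:

```sql
-- Outlines in the DEFAULT category are used once this is enabled;
-- 'true' selects the DEFAULT category, or name a specific category instead.
alter system set use_stored_outlines = true;
-- To stop using them again:
-- alter system set use_stored_outlines = false;
```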
Summary of test runs in Development instance:
The Recipe of Patches
Patch application order for 5637381
1) 4649230: INV: POXACREC IS GETTING ORA-01427 (Prereq for 5637381)
2) 4706120: CST: POXACREC.RDF ACCRUAL REBUILD RECONCILIATION REPORT - SLOW PERFORMANCE (prereq of 4760373)
a) Gather statistics for table PO_ACCRUAL_RECONCILE_TEMP_ALL and its indexes.
b) Compile Flexfields as a new parameter is added to the concurrent program.
3) 4760373: CST: ACCRUAL REBUILD RECONCILIATION REPORT ENDS WITH ERROR (post req of 4649230)
4) 5637381: CST: ONDEMAND:POOR PERFORMANCE OF ACCRUAL RECON REBUILD - POXACRCR
5) 5669511 CST: POOR PERFORMANCE ON ACCRUAL REBUILD AFTER INCREMENTAL PATCH
6) Use stored outlines to get a better plan for bottleneck queries 1 and 2, as suggested in the "Working it around using Stored Outlines" section above.
Summary of learning:
It pays to ask the important question: "Are we the first people to encounter this performance issue?" Usually, for seeded reports, someone else has already gone through the legwork of alleviating the pain. Searching through http://metalink.oracle.com and published bugs can yield valuable information. Patches may not be available for all scenarios, but for a few known sqls that don't have an optimal plan, the stored outline database feature may be used to force good plans.
Often, the optimizer is intelligent enough to find the optimal plan based on data distribution statistics. While a hint may work for some 11i production instances, it may not work for other data distributions. Sometimes it may be best to leave it to the judgement of the cost-based optimizer (CBO).
It also pays to use the I/O auditing information of concurrent jobs to your benefit and attack the biggest I/O consumers to get the biggest dividends.