By StuRamage on Jun 19, 2008
I tend to focus on a number of factors in an attempt to speed things up, but my top 11 are:
- Are the Staging and Production schemas on the same database/tablespace?
- Are the disks experiencing hotspots as a result of this shared implementation?
- If the two schemas are not on the same database, what is the theoretical maximum capacity of the link between the two instances?
- Is the machine CPU- or memory-constrained?
- Can multithreaded runs be threaded further?
- Can indexes be dropped for specific conversion steps? In most instances it is cheaper to drop the indexes before any mass-insert steps and rebuild them afterwards than to wear the cost of extra I/O on every insert.
- In instances where the indexes cannot be dropped, have these been created on their own database tablespace, with a dedicated I/O channel?
- For multi-threaded jobs, are the thread ranges reasonable, or can better results be achieved by running a number of single-threaded jobs with artificial high/low limits? (CC&B threading does not dynamically assign high/low limits; it simply takes the maximum key range and divides it by the number of threads, which can leave almost all of the work with the first couple of threads.)
- How regularly are database statistics being gathered... massive numbers of inserts can leave statistics stale and skew the optimizer's choice of access path, so regular stats collection is essential.
- Can complex data manipulation be better served by creating a temporary table in the Staging or Input schema?
- Consider whether all documented Conversion steps are required. For example, I tend to avoid running Tidy Balances, preferring instead to build all required components and associated values into my data extracts and mapping rules.
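The index drop/rebuild point can be scripted so it brackets each mass-insert step. This is only a sketch that generates the DDL strings; the index name, table, and PARALLEL degree are illustrative assumptions, not CC&B specifics:

```python
# Hedged sketch: generate DDL to bracket a mass-insert conversion step.
# Names like XT_ACCT_IX1 / CI_ACCT are illustrative examples only.
def pre_insert_ddl(indexes):
    # Dropping first avoids paying index-maintenance I/O on every inserted row.
    return [f"DROP INDEX {ix}" for ix in indexes]

def post_insert_ddl(index_defs):
    # Rebuilding once after the load is a single set-based operation;
    # NOLOGGING / PARALLEL (Oracle options) speed up the rebuild itself.
    return [f"CREATE INDEX {name} ON {table} ({cols}) NOLOGGING PARALLEL 4"
            for name, table, cols in index_defs]

print(pre_insert_ddl(["XT_ACCT_IX1"]))
print(post_insert_ddl([("XT_ACCT_IX1", "CI_ACCT", "ACCT_ID")]))
```

Remember to re-enable logging (and re-create any dropped constraints) once the conversion step completes.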
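To make the threading point concrete, here is a small Python sketch of the static split described above (the key values and thread count are made up). Because the raw key range is divided evenly by thread count, a sparse or skewed key distribution leaves almost all the work with one thread:

```python
def naive_thread_ranges(low_key, high_key, threads):
    """Mimic a static split: divide the raw key range evenly by thread count."""
    span = (high_key - low_key + 1) // threads
    ranges, start = [], low_key
    for t in range(threads):
        end = high_key if t == threads - 1 else start + span - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

# Keys clustered at the bottom of a sparse range: one stray high key
# stretches the range, so nearly everything lands on thread 1.
keys = list(range(1, 90001)) + [9_999_999]   # 90,000 real keys plus one outlier
ranges = naive_thread_ranges(min(keys), max(keys), 4)
work = [sum(lo <= k <= hi for k in keys) for lo, hi in ranges]
print(work)   # thread 1 gets ~all the rows, threads 2-3 sit idle
```

Hand-set high/low limits based on the actual key distribution (or quantiles of it) avoid exactly this imbalance.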
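For the statistics point, the collection can be scripted between conversion steps. The sketch below only formats the standard Oracle `DBMS_STATS.GATHER_SCHEMA_STATS` call; the schema name `STGADM` and the sampling percentage are assumed examples:

```python
def gather_stats_sql(schema, estimate_percent=None):
    # DBMS_STATS.GATHER_SCHEMA_STATS is Oracle's supported stats API;
    # re-running it after each large insert step lets the optimizer see
    # current row counts rather than pre-load ones.
    opts = ""
    if estimate_percent is not None:
        opts = f", estimate_percent => {estimate_percent}"
    return (f"BEGIN DBMS_STATS.GATHER_SCHEMA_STATS("
            f"ownname => '{schema}'{opts}); END;")

print(gather_stats_sql("STGADM", estimate_percent=10))
```

A lower `estimate_percent` trades a little accuracy for a much faster gather, which matters when stats are refreshed between every major step.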