This article is part of a series of “Best Practice” blogs for Oracle Analytics.
Many customers have unused columns in their analyses. Columns that are no longer required but still selected significantly impact performance and the way to improve performance is very simple.
Scenario – Removing Columns That Aren't Required
When you include columns in the analysis (in this example “Year”) that have been excluded from the view, the following impacts performance:
As an example, this simple report shows the number of customers by region and year:
The end report that's displayed is a graph showing the number of customers by region. When viewing this report, you notice that the “Year” column has been excluded and can be removed from the selection criteria.
Even though the “Year” column isn't displayed in the view, it's still selected as part of the logical query.
Logical Query:
Including the "Year" column in the analysis has the following impact:
Physical Query:
You can review the physical query to identify areas where performance is impacted:
This example shows a select count distinct from the customer number aggregation rule. In some situations, it also impacts reports with a sum aggregation rule. The generated query in this scenario also uses a grouping set. At the database level, it might be selecting many rows (millions) and then having to group by Year and Region as well as Region. This can consume significant database resources unnecessarily.
You can remove the “Year” column to analyze the impact on the logical and physical queries generated.
Column That's Deleted from the Selection Criteria:
Logical Query:
The logical query now doesn't contain the "Year" column and more importantly, the report aggregation is removed. This results in a much simpler physical query that doesn't include grouping sets. The number of records selected is also greatly reduced.
Physical Query:
Call to Action
By reviewing the analysis of non-performant reports, and in the first instance simply removing redundant unused columns, you can achieve significant performance gains. The physical query generated has reduced complexity and fewer records are returned, therefore less processing is required. For more information regarding performance, review the “Best Practice” blogs.