Evaluating I/O Performance for Large Data Warehouses

One topic that often comes up when talking with customers is the I/O performance of large-scale data warehouses, and the methodology we recommend for improving performance when there is a serious I/O bottleneck. Effective utilization of I/O infrastructure capacity is the key to improving performance in VLDB systems and can have a substantial impact on users' perception of system performance.

Because large database servers are a dynamic, evolving environment, most I/O tuning work may become obsolete when the application's I/O utilization pattern changes. Performance tuning of such environments should therefore be treated as a continuous process rather than a one-time effort and action plan. To keep each tuning iteration consistent, a standard task plan should be established and rigorously followed.

Below is a sample process that was developed by Pat Sodia, Oracle Performance Architect. Pat is one of our top DW performance architects at Oracle, and he often uses this process for I/O optimization. A plan such as this should be used every time there is a change in the application environment that affects its I/O utilization pattern.

I/O Performance Optimization Process

1.     Evaluate and implement all possible database server optimizations for improved I/O distribution.

2.     Identify the high-utilization devices (hot spots) on the I/O infrastructure. Using standard operating system tools or any other related tools, identify those devices (LUNs) with utilization greater than 80%.

3.     Identify the logical volumes associated with the high-utilization devices.

4.     Identify the database structures associated with those logical volumes.

5.     Verify that the contents of each high-utilization logical volume can be distributed across less utilized and/or spare logical volumes. Execute the I/O distribution and verify the new utilization numbers for the affected device.

a.     If there are more possible I/O distributions than there are underutilized devices, add new devices/LUNs to the environment and implement further application-level I/O distribution.

6.     If utilization rates for all devices are under 80%, document new volume configurations to be reproduced in the next database build and re-establish the Baseline System Performance Model.

a.     Collect system utilization information; re-calibrate the capacity planning model and re-run load growth simulations for the new environment.

7.     If device utilization is still high, check whether those devices share a controller with other high-utilization devices.

8.     Distribute the high-utilization devices evenly across the available controllers, for example by relocating logical volume contents.

9.     If utilization rates for all devices are under 80%, document new volume contents mapping to controllers, to be reproduced in the next database build. Re-establish the Baseline System Performance Model.

a.     Collect system utilization information; re-calibrate the capacity planning model and re-run load growth simulations for the new environment.

10. If the logical volume contents cannot be relocated for better I/O distribution and the controller hosting the high-utilization device is not overloaded, consider striping the logical volume across more than one LUN.

11. Evaluate the operational impact of volume striping. Operations permitting, create “high performance” volumes by striping a logical volume across two LUNs. Two non-striped logical volumes can be converted into two striped logical volumes on the respective two LUNs. Adding new devices to the environment may make this easier to execute and more viable.

12. Relocate the identified high-utilization devices/database structures to the newly created striped volumes. Evaluate utilization of the new devices.

13. If utilization rates for all devices are under 80%, document new volume contents mapping to controllers, to be reproduced in the next database build. Re-establish the Baseline System Performance Model.

a.     Collect system utilization information; re-calibrate the capacity planning model and re-run load growth simulations for the new environment.

14. Depending on the impact of the striping on performance, consider further logical volume striping over a higher number of LUNs.

15. If a given controller still has more than three high-utilization devices after the operations listed above, consider adding a new HBA to the I/O infrastructure for better I/O distribution.

16. Verify device utilization after the new controller is added. If improvement is verified, re-establish the Baseline System Performance Model.

a.     Collect system utilization information; re-calibrate the capacity planning model and re-run load growth simulations for the new environment.
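
The two numeric checks in the process above (the 80% utilization threshold from step 2 and the "more than three high-utilization devices per controller" test from step 15) can be sketched in a few lines of Python. This is only an illustration, not the tooling Pat uses: the function names, the sample format of (device, utilization percent) pairs, and the device-to-controller mapping are all hypothetical, and averaging the samples per device is an assumption, since the process does not say how utilization should be aggregated.

```python
from collections import defaultdict

UTILIZATION_THRESHOLD = 80.0  # percent; the threshold named in step 2
MAX_HOT_PER_CONTROLLER = 3    # step 15: "more than three" triggers a new HBA


def find_hot_devices(samples, threshold=UTILIZATION_THRESHOLD):
    """Return {device: average utilization} for devices above the threshold.

    `samples` is an iterable of (device_name, utilization_percent) pairs,
    e.g. parsed from an OS monitoring tool. Averaging is an assumption.
    """
    readings = defaultdict(list)
    for device, utilization in samples:
        readings[device].append(utilization)
    averages = {dev: sum(vals) / len(vals) for dev, vals in readings.items()}
    return {dev: avg for dev, avg in averages.items() if avg > threshold}


def overloaded_controllers(hot_devices, device_to_controller,
                           limit=MAX_HOT_PER_CONTROLLER):
    """Return {controller: sorted hot devices} for controllers over the limit."""
    per_controller = defaultdict(list)
    for device in hot_devices:
        per_controller[device_to_controller[device]].append(device)
    return {ctrl: sorted(devs)
            for ctrl, devs in per_controller.items()
            if len(devs) > limit}


if __name__ == "__main__":
    # Hypothetical utilization samples and controller layout.
    samples = [("lun0", 92.0), ("lun0", 88.0),
               ("lun1", 35.0), ("lun1", 45.0),
               ("lun2", 85.0), ("lun2", 81.0)]
    controllers = {"lun0": "hba0", "lun1": "hba0", "lun2": "hba1"}

    hot = find_hot_devices(samples)
    print("Hot devices:", hot)
    print("Overloaded controllers:", overloaded_controllers(hot, controllers))
```

Running a check like this after each redistribution gives a repeatable way to decide whether to stop and re-baseline (no devices over the threshold) or to continue to the controller and striping steps.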

By following a process such as this, you can proactively identify potential I/O bottlenecks before they become a major performance issue. Of course, you could always leverage Oracle Exadata or the Oracle Database Machine to obtain a pre-configured I/O system that is optimized for VLDB performance.

About

Focusing on implementing successful Analytics solutions and the world of Oracle BI & EPM.


Rob Reynolds leads Oracle's Business Analytics Consulting organization in North America.


Our charter is to help customers successfully leverage their investment in Oracle software for maximum returns. We assist customers in developing BI & EPM strategies, implementation roadmaps , architectural services and full-life cycle implementations of Oracle BI, EPM & Analytics products.

