IT organizations at large companies are complex entities where people are partitioned by function. There are SAN people, system administrators, database administrators, and developers. While it is good to specialize by function, there seems to be a mismatch when each organization optimizes its own internal operations. Let me walk you through a common situation where the SAN administrators and system administrators each try to optimize performance without considering the overall picture.
- The DBA requests storage for a new application, expecting that filesystem(s) or raw luns will be presented for ASM use.
- The system administrators request luns from the storage administrators to fulfill the request.
- The storage administrators supply the luns.
The system administrators' job is to make sure the performance of the supplied luns maps cleanly to the database environment. For years, system administrators have been using SW volume management/RAID to improve performance. So, naturally, they request a large number of luns (say 128) from the storage administrators so they can stripe. Past experimentation has shown that a 32k stripe width was best.
The storage people take care of large Hitachi or EMC boxes. Their job is to supply luns to applications and make sure their "san-box" performs well. They gladly supply the luns to the system administrators, but to ensure the performance of the SAN box, they must prevent "resets" on the fiber. The maximum number of outstanding requests on a fiber is 256. So, no problem: they have the system administrators adjust the "sd_max_throttle" parameter so the OS will queue requests rather than cause resets. The rule of thumb is to set it to:
sd_max_throttle = 256/#luns = 256/128 = 2
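The rule of thumb above is a simple division, sketched here in Python so it can be tried with other lun counts. The function and parameter names are made up for illustration; only the 256-request limit and the 128 luns come from the scenario.

```python
def rule_of_thumb_throttle(fiber_queue_limit: int, lun_count: int) -> int:
    """Divide the fiber's request limit evenly across the luns.

    Hypothetical helper: it mirrors the rule of thumb in the text and is
    not part of any Solaris tool.
    """
    if lun_count <= 0:
        raise ValueError("lun_count must be positive")
    # Integer division, but never go below one request per lun.
    return max(1, fiber_queue_limit // lun_count)


# Values from the scenario: 256 requests per fiber, 128 luns.
print(rule_of_thumb_throttle(256, 128))  # 2
```

Notice how quickly the allowance shrinks: every doubling of the lun count halves the number of requests each lun is allowed to have in flight.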
Putting it all together
So, now the system administrator takes these 128 luns and creates four file systems, each built by striping 32 luns together with a 32k stripe width using SVM. Since this is a SAN, there are multiple connections from the host to the SAN; in this case there are 4. MPxIO is used to round-robin IO requests across the 4 connections to balance load and allow for fail-over in case of an HBA failure.
This environment is turned over to the DBA who finds the performance is less than stellar.
The DBA is running 10 jobs that result in queries which full scan 10 tables. These queries request 1MB per IO. Now a stripe width of 32k breaks each 1MB IO into 32 equal pieces... and since there are 10 concurrent jobs, that equates to 32*10 or 320 concurrent requests for IO. Finally, these 320 requests are spread across the four channels, so that would be 320/4 or 80 requests per channel. Are you beginning to see the problem?
Given the "sd_max_throttle" setting of 2, the OS will allow only 2 outstanding requests per lun at a time; everything else waits in the driver queue. If you look at the array, the performance will look great... so it must be an OS problem.
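The arithmetic in this walk-through can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is invented, and it assumes every IO is split evenly and round-robins perfectly across the channels, just as the text does.

```python
import math


def io_fanout(io_kb: int, stripe_kb: int, jobs: int, paths: int):
    """Return (pieces per IO, total concurrent requests, requests per channel)."""
    # The volume manager splits one large IO into stripe-sized pieces.
    pieces_per_io = math.ceil(io_kb / stripe_kb)
    # Every concurrent job issues its own set of pieces.
    total_requests = pieces_per_io * jobs
    # Round-robin multipathing spreads the requests over the channels.
    per_channel = math.ceil(total_requests / paths)
    return pieces_per_io, total_requests, per_channel


# Scenario values: 1MB IOs, 32k stripe width, 10 jobs, 4 channels.
pieces, total, per_channel = io_fanout(io_kb=1024, stripe_kb=32, jobs=10, paths=4)
print(pieces, total, per_channel)  # 32 320 80

# The throttle was set to 2, far below the 80 requests heading down each channel.
throttle = 2
print(per_channel > throttle)  # True
```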
This issue was solved in multiple phases.
- Quick fix: Raise "sd_max_throttle" to at least 80. This will prevent queuing at the driver level.
- Increased stripe width: Use an SVM stripe width of 1MB or greater. This will reduce the number of IOs being broken down by the SW volume manager.
- Optimal solution: Eliminate SW striping altogether and build larger luns within the SAN box.
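To get a feel for why a wider stripe helps, here is a rough Python sketch that recomputes the per-channel request count for a few stripe widths. The helper name and the 128k middle value are illustrative choices; the defaults (1MB IOs, 10 jobs, 4 channels) come from the scenario, and the model assumes an even split across channels.

```python
import math


def requests_per_channel(stripe_kb: int, io_kb: int = 1024,
                         jobs: int = 10, paths: int = 4) -> int:
    """Estimate concurrent requests per channel for a given stripe width."""
    # Each IO is cut into stripe-sized pieces by the volume manager.
    pieces_per_io = math.ceil(io_kb / stripe_kb)
    # All jobs run concurrently and the load is spread over the channels.
    return math.ceil(pieces_per_io * jobs / paths)


for stripe_kb in (32, 128, 1024):
    print(f"{stripe_kb:>5}k stripe -> {requests_per_channel(stripe_kb)} requests per channel")
```

With a 1MB stripe each IO stays in one piece, so the per-channel load drops from 80 requests to about 3 in this simplified model.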
Storage issues often involve multiple layers of HW, SW, people, and organizations. To architect a well thought out solution, all aspects must be taken into consideration. Get everyone talking and sharing information so that your organizational stove-pipes don't cripple application performance.