By glennf on Jan 04, 2008
- The DBA requests storage for a new application, expecting either filesystem(s) or raw luns to be presented for ASM use.
- System administrators request luns from the storage administrators to fulfill the request.
- Storage administrators supply the luns.
Systems Administrators
Their job is to make sure the performance of the supplied luns maps cleanly to the database environment. For years, system administrators have used software volume management/RAID to improve performance. So, naturally, they request a large number of luns (say 128) from the storage administrators so they can stripe across them. Past experimentation had shown that a 32k stripe width was best.
Storage Administrators
The storage people take care of large Hitachi or EMC boxes. Their job is to supply luns to applications and make sure their "san-box" performs well. They gladly supply the luns to the system administrators, but to ensure the performance of the SAN box, they must prevent "resets" on the fiber. The maximum number of outstanding requests on a fiber is 256. So, no problem: they have the system administrators adjust the "sd_max_throttle" parameter so the OS will queue requests and not cause resets. The rule of thumb is to set it to:
sd_max_throttle = 256/#luns = 256/128 = 2
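The rule of thumb is just an integer division. Here is a minimal Python sketch of it; the function name and the floor of 1 are illustrative assumptions, and the 256 limit is the figure quoted in this post, not a vendor formula.

```python
def throttle_rule_of_thumb(num_luns, max_fiber_requests=256):
    """Throttle value under the 256/#luns rule of thumb (hypothetical helper)."""
    if num_luns <= 0:
        raise ValueError("num_luns must be positive")
    # A throttle is a whole number of requests; assume it should never drop below 1.
    return max(1, max_fiber_requests // num_luns)

print(throttle_rule_of_thumb(128))  # 2
```

With 128 luns the result is 2, which is the value used in the rest of this story.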
Putting it all together
So, now the system administrator takes these 128 luns and creates four file systems with SVM, each striping 32 luns together with a 32k stripe width. Since this is a SAN, there are multiple connections from the host to the SAN; in this case there are 4. MPxIO is used to round-robin IO requests across the 4 connections, balancing load and allowing fail-over in case of an HBA failure.
This environment is turned over to the DBA who finds the performance is less than stellar.
Analysis
The DBA is running 10 jobs whose queries full-scan 10 tables. These queries request 1MB per IO. A stripe width of 32k breaks each 1MB IO into 32 equal pieces, and since there are 10 concurrent jobs, that equates to 32\*10 or 320 concurrent IO requests. Finally, these 320 requests are spread across the four channels, so that is 320/4 or 80 requests per channel. Are you beginning to see the problem?
Given the "sd_max_throttle" setting of 2, the OS will allow only 2 outstanding requests at a time, so the rest wait in a queue at the driver level. If you look at the array, the performance will look great... so it must be an OS problem.
The Fix
This issue was solved in multiple phases.
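The arithmetic in the analysis can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and it assumes every IO is aligned to a stripe boundary and that MPxIO spreads requests evenly across the channels.

```python
def requests_per_channel(io_size_kb, stripe_width_kb, concurrent_jobs, channels):
    # Each IO is split into one piece per stripe unit it covers (ceiling division).
    pieces_per_io = -(-io_size_kb // stripe_width_kb)
    total_requests = pieces_per_io * concurrent_jobs
    # Assumes a perfectly even round-robin across channels.
    return pieces_per_io, total_requests, total_requests / channels

pieces, total, per_channel = requests_per_channel(1024, 32, 10, 4)
print(pieces, total, per_channel)  # 32 320 80.0
```

Compare the 80 requests per channel with a throttle of 2 and the source of the queuing becomes obvious.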
- Quick fix: Simply increase "sd_max_throttle" to 80 or more. This will prevent queuing at the driver level.
- Increased stripe width: Use an SVM stripe width of 1MB or greater. This reduces the number of pieces each IO is broken into by the software volume manager.
- Optimal solution: Eliminate software striping altogether and build larger luns within the SAN box.
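To see why a wider stripe helps, here is a rough Python sketch that recomputes the request fan-out for several stripe widths using the workload from this post (10 jobs, 1MB IOs, 4 channels). The helper name and the intermediate 128k width are illustrative assumptions, and the sketch assumes aligned IOs spread evenly over the channels.

```python
JOBS, CHANNELS, IO_KB = 10, 4, 1024  # workload figures from the analysis

def fan_out(stripe_kb, io_kb=IO_KB, jobs=JOBS, channels=CHANNELS):
    """Return (total concurrent requests, requests per channel) for a stripe width."""
    pieces_per_io = -(-io_kb // stripe_kb)  # ceiling division
    total = pieces_per_io * jobs
    return total, total / channels

# 32k is the original setting, 1024k is the suggested fix, 128k is a hypothetical midpoint.
for stripe_kb in (32, 128, 1024):
    total, per_channel = fan_out(stripe_kb)
    print(f"{stripe_kb}k stripe: {total} requests, {per_channel:g} per channel")
```

At a 1MB stripe width each 1MB IO stays whole, so the 10 jobs generate only 10 concurrent requests instead of 320.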