Testing using remote power

I've blogged about the remote power systems we have in our labs before, here and here. Over the last few weeks, however, I've been investigating how long it takes a system to recover from a disk failure. The customer has a test case that involves pulling the drive and then seeing how long the application stalls. It's not a perfect test, but it is a reasonable simulation of a drive failing. The goal is to have no more than a 30 second pause when the drive fails.

The trouble with this test is that I need to pull the drive, so I have to be in the lab, and I'm not even in the same country as the test case.

If, however, I put the drive in a unipack and then arrange for that to be on remote power, I can power off the drive remotely and automatically. This helps me as the test case is in Germany, so I don't have to move the systems. It helps even more as I can now write a script that runs the test automatically in a loop. By doing this I can get a graph of the results by running the test overnight while I sleep.
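
A loop like that can be quite small. What follows is a minimal sketch in Python, not the script used for the results here. The callables `power_off`, `power_on` and `probe` are hypothetical: the first two stand in for whatever commands drive your remote power controller, and `probe` is whatever operation stalls when the drive fails (a small synchronous write, for example). The 60 second window and the settle time are likewise assumptions.

```python
import time


def measure_stall(probe, duration, clock=time.monotonic):
    """Call probe() repeatedly for `duration` seconds.

    Returns the longest time any single call took. A healthy call
    returns almost at once, so the longest call approximates the stall.
    """
    deadline = clock() + duration
    longest = 0.0
    while clock() < deadline:
        start = clock()
        probe()  # hypothetical: e.g. write a block and sync it
        longest = max(longest, clock() - start)
    return longest


def run_trials(power_off, power_on, probe, trials,
               window=60.0, settle=0.0,
               clock=time.monotonic, sleep=time.sleep):
    """Fail the drive `trials` times and record the stall each time."""
    results = []
    for _ in range(trials):
        power_off()  # hypothetical: switch the drive's remote power off
        try:
            results.append(measure_stall(probe, window, clock=clock))
        finally:
            power_on()  # always restore power, even if the probe raises
        sleep(settle)  # assumed pause for the system to recover fully
    return results
```

The `try`/`finally` is there so that a failed probe never leaves the drive powered off overnight. The settle time matters: if the system has not fully recovered before the next power off, the next measurement is not a clean single-failure result.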


The 5 cases where we are over 30 seconds are a bit of a worry, but the others show a nice curve, giving some confidence that in the usual run of things the failure time is actually less than 20 seconds. On inspection of the logs, the outlying results are caused by a Disconnected command timeout on another target on the bus when the target is failed.
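
Checking a night's worth of results against the 30 second goal is easy to automate as well. This is a rough sketch that assumes the stall times have been collected as a list of seconds; the function name `summarise` and the dictionary keys are my own invention for illustration.

```python
import statistics


def summarise(stalls, limit=30.0):
    """Summarise stall times (in seconds) against a pause limit."""
    if not stalls:
        raise ValueError("no results to summarise")
    ordered = sorted(stalls)
    over = [s for s in ordered if s > limit]
    return {
        "runs": len(ordered),
        "over_limit": len(over),  # runs that broke the goal
        "worst": ordered[-1],
        "median": statistics.median(ordered),
    }
```

Calling `summarise(results)` on the list returned by a test loop gives the number of runs over the limit together with the worst and median stall. Those are the figures to look at first, before going to the logs for the outliers.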

About

This is the old blog of Chris Gerhard. It has mostly moved to http://chrisgerhard.wordpress.com
