Multipathing for Tape
By danmas on Jul 02, 2008
It was a dark and stormy night, the backup windows opened and shut with the regularity of a 90-year-old pensioner's clacking of teeth as he snored with the phlegm-filled nervousness of old age and incontinence. The data flowed from the backing store to fixed content archives like a humpback whale strains plankton. The administrator anxiously clacked her gum back and forth against the two teeth she had recently had crowned with a couple of diamonds (one shaped like an "i", the other clearly an "o"...some of the geeks at work thought it was a one and a zero, but they don't understand IO) as she watched the progress of her backup. All of a sudden there was a rigid silence as the status window showed zero throughput and the tortured silent scream of failed IOs bounced off the ear buds of the administrator. With manic obsessiveness the administrator chanted her mantra, "If an IO fails to complete, is it ever an IO?" and she anxiously counted off the seconds before the backup timed out. Over and over again she chanted her mantra as the digital clock ticked with the slowness and regularity of a clock:
7..8..9..."...fails to complete..." 17...18....19... "is it ever an IO?"....28...29... (occasionally she changed her mantra to "I think I can, I think I can" as she took power hits of her Red Bull) 47...48...49... And then, with the almost ecstatic relief one feels as a particularly large boil is lanced, the IO kicked back in down the alternate path and the backup proceeded.
So, with a rather shameless head nod towards the Bulwer-Lytton Fiction Contest, I announce that we have integrated multipathing for tape into Solaris build 93. This is the culmination of a lot of work in both the ST driver and MPxIO. In particular, we have solved the problem of logical block addressing and error recovery, as well as true multipathing to multi-ported tape drives. This is also a platform for innovation. Now that we have highly available tape backup windows that are natively part of the Solaris operating system (you could do some of this before with FCP-2 error recovery, but it wouldn't help with a bad HBA or cable), we can look at bus saturation in an attempt to drive down backup times.
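For anyone wondering why logical block addressing matters to failover, here is a toy sketch in Python. It is not the ST driver or MPxIO code, and every name in it (`ToyTapeDrive`, `PathError`, `backup`, the `hba0`/`hba1` path labels) is made up for illustration. The assumption it models is that a path can die after a block has landed on tape but before the acknowledgement comes back, so the retry on the alternate path has to check the drive's logical block position first or the block would be written twice.

```python
class PathError(Exception):
    """Raised when an I/O fails on a path (hypothetical error type)."""


class ToyTapeDrive:
    """Toy model of a multi-ported drive: one medium, several paths."""

    def __init__(self):
        self.blocks = []                    # blocks on the medium, in order
        self.dead = set()                   # paths that no longer work
        self.dies_after_next_write = set()  # paths about to lose an ack

    def position(self):
        # Logical block address of the next block to be written.
        return len(self.blocks)

    def locate(self, lba):
        # Reposition to lba; anything past it will be overwritten.
        del self.blocks[lba:]

    def write(self, path, data):
        if path in self.dead:
            raise PathError(path)
        self.blocks.append(data)
        if path in self.dies_after_next_write:
            # The block landed, but the acknowledgement was lost.
            self.dies_after_next_write.discard(path)
            self.dead.add(path)
            raise PathError(path)


def backup(drive, paths, records):
    """Write records in order, failing over to the next path on error."""
    active = 0
    start = drive.position()
    for offset, data in enumerate(records):
        expected_lba = start + offset
        while True:
            try:
                drive.write(paths[active], data)
                break
            except PathError:
                active += 1
                if active >= len(paths):
                    raise  # no alternate path left
                # Recover the tape position before retrying, so a block
                # that was written but never acknowledged is not duplicated.
                if drive.position() != expected_lba:
                    drive.locate(expected_lba)
    return paths[active]
```

In this sketch, if `hba0` dies right after committing a block, `backup` repositions the toy drive to the expected block and rewrites it over `hba1`, so the tape ends up with each record exactly once.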
A strong head nod towards our experts in tape and MPxIO. Well done, Randy and Wayne.