Monday May 26, 2014

Validating Petabytes of Data with Regularity and Thoroughness

by Brian Zents

When former Intel CEO Andy Grove said “only the paranoid survive,” he wasn’t necessarily talking about tape storage administrators, but it’s a lesson they’ve learned well. After all, tape storage is the last line of defense to prevent data loss, so tape administrators are extra cautious in making sure their data is secure. Not surprisingly, we are often asked for ways to validate tape media and the files on them.

In the past, an administrator could validate the media, but doing so was often tedious or disruptive or both. The debut of the Data Integrity Validation (DIV) and Library Media Validation (LMV) features in the Oracle T10000C drive helped eliminate many of these pains. Also available with the Oracle T10000D drive, these features use hardware-assisted CRC checks that not only ensure the data is written correctly the first time, but also do so much more efficiently.

Traditionally, a CRC check takes at least 25 seconds per 4GB file with a 2:1 compression ratio, but the T10000C/D drives can reduce the check to a maximum of nine seconds because the entire check is contained within the drive. No data needs to be sent to a host application. A time savings of at least 64 percent is extremely beneficial over the course of checking an entire 8.5TB T10000D tape.

While the DIV and LMV features are better than anything else out there, what storage administrators really need is a way to check petabytes of data with regularity and thoroughness. With the launch of Oracle StorageTek Tape Analytics (STA) 2.0 in April, there is finally a solution that addresses this longstanding need. STA bundles these features into one interface to automate all media validation activities across all Oracle SL3000 and SL8500 tape libraries in an environment. And best of all, the validation process can be associated with the health checks an administrator would be doing already through STA.

In fact, STA validates the media based on any of the following policies:

  • Random Selection – Randomly selects media for validation whenever a validation drive in the standalone library or library complex is available.
  • Media Health = Action – Selects media that have had a specified number of successive exchanges resulting in an Exchange Media Health of “Action.” You can specify from one to five exchanges.
  • Media Health = Evaluate – Selects media that have had a specified number of successive exchanges resulting in an Exchange Media Health of “Evaluate.” You can specify from one to five exchanges.
  • Media Health = Monitor – Selects media that have had a specified number of successive exchanges resulting in an Exchange Media Health of “Monitor.” You can specify from one to five exchanges.
  • Extended Period of Non-Use – Selects media that have not had an exchange for a specified number of days. You can specify from 365 to 1,095 days (one to three years).
  • Newly Entered – Selects media that have recently been entered into the library.
  • Bad MIR Detected – Selects media with an exchange resulting in a “Bad MIR Detected” error. A bad media information record (MIR) indicates degraded high-speed access on the media.

To avoid disrupting host operations, an administrator designates certain drives for media validation operations. If a host requests a file from media currently being validated, the host’s request takes priority. To ensure that the administrator really knows it is the media that is bad, as opposed to the drive, STA includes drive calibration and qualification features. In addition, validation requests can be re-prioritized or cancelled as needed. To ensure that a specific tape isn’t validated too often, STA prevents a tape from being validated twice within 24 hours via one of the policies described above. A tape can be validated more often if the administrator manually initiates the validation.

When the validations are complete, STA reports the results. STA does not report simply a “good” or “bad” status. It also reports if media is even degraded so the administrator can migrate the data before there is a true failure. From that point, the administrators’ paranoia is relieved, as they have the necessary information to make a sound decision about the health of the tapes in their environment.

About the Photograph

Photograph taken by Rick Ramsey in Death Valley, California, May 2014

- Brian

Follow OTN Garage on:
Web | Facebook | Twitter | YouTube
About

Contributors:
Rick Ramsey
Kemer Thomson
and members of the OTN community

Search

Archives
« April 2015
SunMonTueWedThuFriSat
   
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
  
       
Today
Blogs We Like