Detecting data/file corruption

Sometimes I get escalations along the lines of '...I moved this application data from machine fred to machine bob and now the application won't read it. What's happened?'
Debugging the problem from the application down is probably going to be long-winded. So my first action is to verify that the file is actually the same on both machines, i.e. did it get corrupted in the transfer? If it did, then we can forget the application layer and concentrate on the method of transfer. It seems obvious when you think of it, but in the heat of the moment the simplest things get forgotten. What follows are some examples of how to use standard Solaris tools to detect data corruption.

For a long time we've had binaries that generate a checksum of a file, which is a simple way to tell if the source and destination copies are the same. There are sum, cksum and, new in Solaris 10, digest. We also have cmp, which does a byte-for-byte comparison of two files.
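When the two copies live on different machines, cmp can't see both of them, but a checksum can be recorded on one host and checked on the other. Here is a minimal sketch of that idea. The function names file_sig and verify_sig are my own for illustration; only cksum and awk are standard tools.

```shell
# Print "checksum size" for one file. Reading from stdin keeps the file
# name out of the output, so the result is the same on both hosts even
# if the paths differ. (file_sig is a hypothetical helper name.)
file_sig() {
    cksum < "$1" | awk '{ print $1, $2 }'
}

# Compare a signature recorded on the source host with a local copy.
# Usage: verify_sig "<checksum> <size>" /path/to/local/copy
verify_sig() {
    expected=$1
    actual=$(file_sig "$2")
    if [ "$expected" = "$actual" ]; then
        echo "OK"
    else
        echo "MISMATCH: expected $expected, got $actual"
        return 1
    fi
}
```

On the source machine you would run file_sig against the original and note the result; on the destination you pass that string and the copied file to verify_sig.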

Examples

All of these tools can be used on regular files and raw devices.

Copy a raw disk slice to an image file using dd.

# dd if=/dev/rdsk/c0t0d0s3 of=/var/tmp/c0t0d0s3.img bs=1024k
41+1 records in
41+1 records out

Now we can use the comparison tools; they should all come back identical or
clean.  Remember cmp gives no output for a matching pair of files.  For sum and
cksum, the first column is the checksum and the second column is the size: in
512-byte blocks for sum (85050 x 512 = 43545600) and in bytes for cksum.

# cmp /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img

# sum  /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img
28918 85050 /dev/rdsk/c0t0d0s3
28918 85050 /var/tmp/c0t0d0s3.img

# cksum  /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img
3185788260      43545600        /dev/rdsk/c0t0d0s3
3185788260      43545600        /var/tmp/c0t0d0s3.img

# digest -a md5  /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img
(/dev/rdsk/c0t0d0s3) = 0616a55e0a4e30ecf49c974f23a56255
(/var/tmp/c0t0d0s3.img) = 0616a55e0a4e30ecf49c974f23a56255
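Because cmp says nothing for a matching pair, in a script it is easier to go by its exit status than its output. A small sketch, assuming a helper I've called files_match (not a standard command):

```shell
# Report whether two paths (regular files or raw devices) hold the same
# bytes. cmp -s prints nothing and signals the result through its exit
# status: 0 = identical, 1 = different, >1 = error (e.g. unreadable file).
files_match() {
    rc=0
    cmp -s "$1" "$2" || rc=$?
    case $rc in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error" ;;
    esac
    return $rc
}
```

For example, files_match /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img should print "identical" straight after the dd copy above.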

To show what happens when a file is corrupted we will write a single byte to the
front of the file, which is currently all zeros.

The current contents of the first 10 bytes of the file (offsets are in octal):
# od -x -N 10 /var/tmp/c0t0d0s3.img
0000000 0000 0000 0000 0000 0000
0000012

Now we write the first byte of /etc/hosts (any file would do) to the front of
the image file, to simulate corruption.
# dd if=/etc/hosts of=/var/tmp/c0t0d0s3.img bs=1 count=1 conv=notrunc

We now see that the file has changed by one byte.
# od -x -N 10 /var/tmp/c0t0d0s3.img
0000000 3100 0000 0000 0000 0000
0000012
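If you want to try this without touching a real disk image, the same experiment can be reproduced on scratch files. This is only a sketch: the helper name make_corrupt_pair and the file names orig.img and copy.img are made up for the example, and the directory you pass in must already exist.

```shell
# Build a small zero-filled "image", copy it, then overwrite the first
# byte of the copy in place. (make_corrupt_pair, orig.img and copy.img
# are hypothetical names for this sketch.)
make_corrupt_pair() {
    dir=$1
    dd if=/dev/zero of="$dir/orig.img" bs=1024 count=4 2>/dev/null
    cp "$dir/orig.img" "$dir/copy.img"
    # conv=notrunc overwrites in place instead of truncating the output
    # file to one byte, so the file length stays the same.
    printf '1' | dd of="$dir/copy.img" bs=1 count=1 conv=notrunc 2>/dev/null
}
```

After running it, od on copy.img shows the changed first byte, and cmp, sum, cksum and digest can be pointed at the two scratch files.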

Now we will re-run the comparison commands to see what is shown for a
corrupted file.

# cmp /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img
/dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img differ: char 1, line 1

# sum /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img
28918 85050 /dev/rdsk/c0t0d0s3
28967 85050 /var/tmp/c0t0d0s3.img

# cksum /dev/rdsk/c0t0d0s3 /var/tmp/c0t0d0s3.img
3185788260      43545600        /dev/rdsk/c0t0d0s3
1666608083      43545600        /var/tmp/c0t0d0s3.img

Again, note that for both cksum and sum the second column is identical in the
original and corrupt versions, since we have not changed the file length.
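That distinction is useful in practice: a size mismatch usually points at a truncated or padded transfer, while the same size with a different checksum means bytes were altered in place. A rough sketch of that triage follows; diff_kind is a name I've invented, and it is meant for regular files, since wc -c has to read a raw device end to end to count it.

```shell
# Say whether two files differ in length or only in content.
# (diff_kind is a hypothetical helper name.)
diff_kind() {
    size_a=$(wc -c < "$1" | tr -d ' ')
    size_b=$(wc -c < "$2" | tr -d ' ')
    if [ "$size_a" -ne "$size_b" ]; then
        echo "size differs ($size_a vs $size_b bytes)"
    elif cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "same size, content differs"
    fi
}
```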

Timings for comparing two identical files on a filesystem, taken on a single-disk Ultra 10 running Solaris 10. The timings are dominated by waiting for I/O.

# timex cmp /dev/dsk/c0t0d0s3 c0t0d0s3.img.bak

real       12.83
user        4.86
sys         1.31

# timex sum /dev/dsk/c0t0d0s3 c0t0d0s3.img.bak
28918 85050 /dev/dsk/c0t0d0s3
28918 85050 c0t0d0s3.img.bak

real       15.17
user        3.89
sys         1.15

# timex cksum /dev/dsk/c0t0d0s3 c0t0d0s3.img.bak
3185788260      43545600        /dev/dsk/c0t0d0s3
3185788260      43545600        c0t0d0s3.img.bak

real       14.57
user        2.73
sys         1.33

# timex digest -a md5  /dev/dsk/c0t0d0s3 c0t0d0s3.img.bak
(/dev/dsk/c0t0d0s3) = 0616a55e0a4e30ecf49c974f23a56255
(c0t0d0s3.img.bak) = 0616a55e0a4e30ecf49c974f23a56255

real       15.82
user        4.07
sys         1.68