Tuesday Aug 11, 2009

Homebrew Hybrid Storage Pool

I had a bit of trouble with a slow iSCSI connection to my downstairs Solaris box, so I tried something crazy: I used a RAM disk as a ZIL. This is fine, as long as your machine never, ever (ever!) goes down. I made things slightly less crazy by replacing the RAM disk with a file vdev. That lets me power down the machine, but it sends the iSCSI performance back to being pretty terrible.
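For the record, the crazy version is only a couple of commands on Solaris. A sketch, with a made-up ramdisk name and size:

root@blue:~# ramdiskadm -a slog 1g
/dev/ramdisk/slog
root@blue:~# zpool add tank log /dev/ramdisk/slog

The slightly-less-crazy step is just a replace, which is where the /rpool/slog/current you'll see below came from:

root@blue:~# mkfile 1g /rpool/slog/current
root@blue:~# zpool replace tank /dev/ramdisk/slog /rpool/slog/current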

The answer is to use an SSD as a ZIL vdev, so I talked the wife into letting me get a 16GB SATA II SSD, which I just put into the machine. There was a bit of a tense moment (note: this is what we in the business call an understatement) when the file systems on my big pool didn't appear right away (me: ls /tank, machine: I got nothing, me: WTF!?), but they appeared eventually, and all I had to do was svcadm clear a couple of services that depended on them. Note to self: make those services dependent on the ZFS file systems being available.
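That last bit is easy enough to do with svccfg. A minimal sketch, assuming a hypothetical service FMRI and hanging the dependency off svc:/system/filesystem/local (which is where the ZFS mounts happen):

root@blue:~# svccfg -s svc:/site/myservice addpg tank_fs dependency
root@blue:~# svccfg -s svc:/site/myservice setprop tank_fs/grouping = astring: require_all
root@blue:~# svccfg -s svc:/site/myservice setprop tank_fs/restart_on = astring: none
root@blue:~# svccfg -s svc:/site/myservice setprop tank_fs/type = astring: service
root@blue:~# svccfg -s svc:/site/myservice setprop tank_fs/entities = fmri: svc:/system/filesystem/local:default
root@blue:~# svcadm refresh svc:/site/myservice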

The SSD was all the way off on c11d0 for some reason, but ZFS was happy to replace my ZIL with the new vdev (the one-line command is below, after the output), so now I'm sitting here watching this:

root@blue:~# zpool status -v tank
  pool: tank
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h21m, 57.62% done, 0h15m to go
config:

	NAME                     STATE     READ WRITE CKSUM
	tank                     ONLINE       0     0     0
	  raidz1                 ONLINE       0     0     0
	    c4d0                 ONLINE       0     0     0
	    c4d1                 ONLINE       0     0     0
	    c5d0                 ONLINE       0     0     0
	    c5d1                 ONLINE       0     0     0
	logs
	  replacing              ONLINE       0     0     0
	    /rpool/slog/current  ONLINE       0     0     0
	    c11d0                ONLINE       0     0     0

errors: No known data errors

Yes, I called my big pool tank. I'm a ZFS nerd, I guess. Tomorrow I'll plug the MacBook into the gigE and see how Time Machine does over iSCSI. I'm hoping for big numbers.
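And the swap itself was a one-liner, using the same names you can see in the replacing vdev above:

root@blue:~# zpool replace tank /rpool/slog/current c11d0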

Update: I'm glad that I didn't stay up until it finished:

root@blue:~# zpool status -v tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 2h37m with 0 errors on Wed Aug 12 01:27:18 2009
config:

	NAME        STATE     READ WRITE CKSUM
	tank        ONLINE       0     0     0
	  raidz1    ONLINE       0     0     0
	    c4d0    ONLINE       0     0     0
	    c4d1    ONLINE       0     0     0
	    c5d0    ONLINE       0     0     0
	    c5d1    ONLINE       0     0     0
	logs
	  c11d0     ONLINE       0     0     0

errors: No known data errors

Saturday Jul 25, 2009

Dear ZFS and Time Slider teams: Will you marry me?

I'm sure my wife and your wives (or husbands) and children will understand.

You see, I was working on my home system this afternoon, writing code instead of enjoying the summer weather, when I hit the following:

stgreen@blue:~/Projects/silv/work$ hg verify
** unknown exception encountered, details follow
** report bug details to http://www.selenic.com/mercurial/bts
** or mercurial@selenic.com
** Mercurial Distributed SCM (version 1.1.2)
** Extensions loaded:
Traceback (most recent call last):
  File "/usr/bin/hg", line 20, in ?
    mercurial.dispatch.run()
[...]
  File "/usr/lib/python2.4/vendor-packages/mercurial/revlog.py", line 379, in parseindex
    index, nodemap, cache = parsers.parse_index(data, inline)
ValueError: corrupt index file

This made me, shall we say, unhappy. It also made me realize that I hadn't done a push to the "main" hg repository since I started the new project that I had just recently gotten working, so I was looking at losing more than a thousand lines of code.

But there you were, ZFS and Time Slider, ready to pick me up and get me back in the game:
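Under the covers, the recovery is nothing fancy: Time Slider's automatic snapshots sit right under the filesystem's .zfs/snapshot directory, so you can copy the working tree back out of one (or let the Time Slider UI in Nautilus do it for you). A sketch, with made-up dataset and snapshot names:

stgreen@blue:~$ zfs list -t snapshot -r rpool/export/home
stgreen@blue:~$ cp -r ~/.zfs/snapshot/zfs-auto-snap:frequent-2009-07-25-15:30/Projects/silv/work ~/Desktop/work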

You are the wind beneath my wings:

stgreen@blue:~/Desktop/work$ hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
3209 files, 266 changesets, 3862 total revisions

I only lost about 15 minutes of work, and those 15 minutes hardly matter, because I spent them screwing around in a virtualized Ubuntu getting GWT hosted mode working. All told, I lost two small changes.

I have no idea what caused this problem. ZFS isn't reporting any errors on the drive, but the hg and virtualbox forums suggest that the vboxsf filesystem might be corrupting files. So, note to self: push to the main hg repository before cloning to the virtualized Ubuntu!
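The new routine, then, looks something like this (the repository paths here are made up):

stgreen@blue:~/Projects/silv/work$ hg push ssh://main//export/repos/silv
stgreen@blue:~/Projects/silv/work$ hg clone . /path/to/vboxsf/share/silv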

And I'm 100% serious about that marriage thing.

About

This is Stephen Green's blog. It's about the theory and practice of text search engines, with occasional forays into recommendation and other technologies that can use a good text search engine. Steve is the PI of the Information Retrieval and Machine Learning project in Oracle Labs.
