Thursday Sep 06, 2007

7 Easy Tips for ZFS Starters

So you're now curious about ZFS. Maybe you read Jonathan's latest blog entry on ZFS or you've followed some other buzz on the Solaris ZFS file system or maybe you saw a friend using it. Now it's time for you to try it out yourself. It's easy and here are seven tips to get you started quickly and effortlessly:

1. Check out what Solaris ZFS can do for you

First, try to get a picture of what the Solaris ZFS file system is, what features it has and how it can work to your advantage. Check out the CSI:Munich video for a fun demo of how Solaris ZFS can turn 12 cheap USB memory sticks into highly available, enterprise-class, robust storage. Of course, what works with USB sticks also works with your own hard disks or any other storage device. Also, there are great ZFS screencasts that show you some more powerful features in an easy-to-follow way. Finally, there's a nice writeup on "What is ZFS?" at the OpenSolaris ZFS Community's homepage.

2. Read some (easy) documentation

It's easy to configure Solaris ZFS. Really. You just need to know two commands: zpool (1M) and zfs (1M). That's it. So, get your hands on a Solaris system (or download and install it for free) and take a look at those manpages. If you still want more, there's of course the ZFS Administration Guide with detailed planning, configuration and troubleshooting steps. If you want to learn even more, check out the OpenSolaris ZFS Community Links page. German-speaking readers are invited to read my German white paper on ZFS or listen to episode #006 of the POFACS podcast.

3. Dive into the pool

Solaris ZFS manages your storage devices in pools. Pools are a convenient way of abstracting storage hardware and turning it into a repository of blocks to store your data in. Each pool takes a number of devices and applies an availability scheme (or none) to them. Pools can then be easily expanded by adding more disks to them. Use pools to manage your hardware and its availability properties. You could create a mirrored pool for data that should be protected against disk failure and that needs fast access. Then, you could add another pool using RAID-Z (which is similar to, but better than, RAID-5) for data that needs to be protected but where performance is not the first priority. For scratch, test or demo data, a pool without any RAID scheme is fine, too. Pools are easily created:

zpool create mypool mirror c0d0 c1d0

This creates a mirror out of the two disk devices c0d0 and c1d0. Similarly, you can easily create a RAID-Z pool by saying:

zpool create mypool raidz c0d0 c1d0 c2d0

The easiest way to turn a disk into a pool is:

zpool create mypool c0d0

It's that easy. All the complexity of finding, sanity-checking, labeling, formatting and managing disks is hidden behind this simple command.

If you don't have any spare disks to try this out with, then you can just create yourself some files, then use them as if they were block devices:

# mkfile 128m /export/stuff/disk1
# mkfile 128m /export/stuff/disk2
# zpool create testpool mirror /export/stuff/disk1 /export/stuff/disk2
# zpool status testpool
  pool: testpool
 state: ONLINE
 scrub: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        testpool                 ONLINE       0     0     0
          mirror                 ONLINE       0     0     0
            /export/stuff/disk1  ONLINE       0     0     0
            /export/stuff/disk2  ONLINE       0     0     0

errors: No known data errors

The cool thing about this procedure is that you can create as many virtual disks as you like and then test ZFS features such as data integrity, self-healing, hot spares, RAID-Z and RAID-Z2 without having to find any free disks.
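If you want to see self-healing in action on the file-backed mirror from above, one way is to deliberately damage one half of it and let a scrub repair the data from the other half. This is just an illustrative sketch (the dd block size and count are arbitrary); don't try it on a pool holding data you care about:

# dd if=/dev/urandom of=/export/stuff/disk1 bs=1024 count=100 conv=notrunc
# zpool scrub testpool
# zpool status testpool

zpool status should then report checksum errors on the damaged virtual disk while your data remains fully readable from the mirror; zpool clear testpool resets the error counters afterwards.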

When creating a pool for production data, think about redundancy. There are three basic properties to storage: availability, performance and space. And it's a good idea to prioritize them in that order: Make sure you have redundancy (mirroring, RAID-Z, RAID-Z2) so ZFS can self-heal data when stuff goes wrong at the hardware level. Then decide how much performance you want. Generally, mirroring is faster and more flexible than RAID-Z/Z2, especially if the pool is degraded and ZFS needs to reconstruct data. Space is the cheapest of all three, so don't be greedy and try to give priority to the other two. Richard Elling has some great recommendations on RAID, space and MTTDL. Roch has also posted a great article on mirroring vs. RAID-Z.

4. The power to give

Once you have set up your basic pool, you can already access your new ZFS file system: Your pool has been automatically mounted for you in the root directory. If you followed the examples above, then you can just cd to /mypool and start using ZFS!

But there's more: Creating additional ZFS file systems that use your pool's resources is very easy, just say something like:

zfs create mypool/home
zfs create mypool/home/johndoe
zfs create mypool/home/janedoe

Each of these commands takes only seconds to complete, and each time you get a complete new file system, already set up and mounted for you to start using immediately. Notice that you can manage your ZFS filesystems hierarchically, as seen above. Use pools to manage storage properties at the hardware level, use filesystems to present storage to your users and applications. Filesystems have properties (compression, quotas, reservations, etc.) that you can easily administer using zfs set and that are inherited across the hierarchy. Check out Chris Gerhard's blog for more thoughts on file system organization.
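For instance, building on the filesystems created above, a few property settings might look like this (the values are arbitrary examples): enable compression for all home directories at once, give johndoe a quota and janedoe a guaranteed reservation:

# zfs set compression=on mypool/home
# zfs set quota=10g mypool/home/johndoe
# zfs set reservation=1g mypool/home/janedoe
# zfs get compression mypool/home/johndoe

Since properties are inherited, the compression setting on mypool/home automatically applies to both home directories below it.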

5. Snapshot early, snapshot often

ZFS snapshots are quick, easy and cheap. Much cheaper than the horrible experience when you realize that you just deleted a very important file that hasn't been backed up yet! So, use snapshots whenever you can. If you're wondering whether to snapshot or not, just do it. I recently spent only about $220 on two 320 GB USB disks for my home server to expand my pool with. At these prices, the time you spend thinking about whether to snapshot or not may be worth more than simply buying more disk.

Again, Chris has some wisdom on this topic in his ZFS snapshot massacre blog entry. He once had over 60000 snapshots and he's snapshotting filesystems by the minute! Since snapshots in ZFS “just work” and since they only take up the space that actually changes between snapshots, there's really no reason not to snapshot all the time. Maybe once per minute is a bit exaggerated, but once a week, once a day or once an hour per active filesystem is definitely good advice.

Instead of time-based snapshotting, Chris came up with the idea of snapshotting a file system shared with Samba whenever the Samba user logs in!
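Taking a snapshot is a single command, and listing or rolling back to one is just as easy (the snapshot name after the @ sign is an arbitrary example):

# zfs snapshot mypool/home/johndoe@2007-09-06
# zfs list -t snapshot
# zfs rollback mypool/home/johndoe@2007-09-06

If you only need a single file back, you don't even have to roll back: each snapshot is accessible read-only under the hidden .zfs/snapshot directory at the root of the filesystem.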

6. See the Synergy

ZFS by itself is very powerful. But the full beauty of it can be unleashed by combining ZFS with other great Solaris 10 features. Here are some examples:

  • Tim Foster has written a great SMF service that will snapshot your ZFS filesystems on a regular basis. It's fully automatic, configurable and integrated with SMF in a beautiful way.

  • ZFS can create block devices, too. They are called zvols. Since Nevada build 54, they are fully integrated into the Solaris iSCSI infrastructure. See Ben Rockwood's blog entry on the beauty of iSCSI with ZFS.

  • A couple of people are now elevating this concept even further: Take two Thumpers, create big zvols inside them, export them through iSCSI and mirror across them with ZFS on a server. You'll get a huge, distributed storage subsystem that can be easily exported and imported on a regular network. A poor man's SAN and a powerful shared storage for future HA clusters, thanks to ZFS, iSCSI and Thumper! Jörg Möllenkamp is taking this concept a bit further by thinking about ZFS, iSCSI, Thumper and SAM-FS.

  • Check out some cool Sun StorageTek Availability Suite and ZFS demos here.

  • ZFS and boot support is still in the works, but if you're brave, you can try it out with the newer Solaris Nevada distributions on x64 systems. Think about the possibilities together with Solaris Live Upgrade: thanks to snapshots, you can create a new boot environment in seconds, without needing to find or dedicate a new partition, while saving most of the disk space you'd otherwise need!

And that's only the beginning. As ZFS becomes more and more adopted, we'll see many more creative uses of ZFS with other Solaris 10 technologies and other OSes.

7. Beam me up, ZFS!

One of the most amazing features of ZFS is zfs send/receive. zfs send turns a ZFS filesystem into a bitstream that you can save to a file, pipe through bzip2 for compression, or send through ssh to a distant server for archiving or for remote replication through the corresponding zfs receive command. It also supports sending and receiving the incremental changes between subsequent snapshots through the -i option.
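Here's a sketch of what that looks like in practice (the snapshot and host names are made up). Note that zfs send operates on snapshots, so you take one first:

# zfs snapshot mypool/home/johndoe@monday
# zfs send mypool/home/johndoe@monday | bzip2 > johndoe.zfs.bz2
# zfs send mypool/home/johndoe@monday | ssh otherhost zfs receive backuppool/johndoe
# zfs snapshot mypool/home/johndoe@tuesday
# zfs send -i mypool/home/johndoe@monday mypool/home/johndoe@tuesday | ssh otherhost zfs receive backuppool/johndoe

The last command transfers only the blocks that changed between the monday and tuesday snapshots.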

This is a powerful feature with a lot of uses:

  • Create your Solaris zone as a ZFS filesystem, complete with applications, configuration, automation scripts, users etc., then archive it with zfs send | bzip2 >zone_archive.zfs.bz2 for later use. Then, unpack and create hundreds of cloned zones out of this master copy.

  • Easily migrate ZFS filesystems between pools on the same machine or on distant machines (through ssh) with zfs send/receive.

  • Create a crontab entry that takes a snapshot every minute, then zfs send -i it over ssh to a second machine where it is piped into zfs receive. Tadah! You'll get free, fine-grained, online remote replication of your precious data.

  • Easily create efficient full or incremental backups of home directories (each in their own ZFS filesystems) through ZFS send. Again, you can compress them and treat them like you would, say, treat a tar archive.
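The per-minute replication idea above could be sketched as a small script called from cron (the host, pool and file names are hypothetical, and all error handling is omitted):

#!/bin/sh
# take a new snapshot named after the current time
NOW=`date +%Y%m%d-%H%M`
LAST=`cat /var/run/last_snap`
zfs snapshot mypool/data@$NOW
# send only the changes since the previous snapshot to the replica host
zfs send -i mypool/data@$LAST mypool/data@$NOW | ssh replica zfs receive mypool/data
# remember this snapshot for the next run
echo $NOW > /var/run/last_snap

A matching crontab entry would then simply be: * * * * * /usr/local/bin/replicate.sh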

See? It is easy, isn't it? I hope this guide helps you find your way around the world of ZFS. If you want more, drop by the OpenSolaris ZFS Community; we have a mailing list/forum where bright and friendly people hang out who will be glad to help you.

Wednesday Aug 08, 2007

A True Web 2.0 Chip

Yesterday was the big day in which we launched the UltraSPARC T2 chip, code-named Niagara 2.

Few people realize how significant this announcement really is. The UltraSPARC T1 chip already changed the game of providing a powerful web infrastructure: By running 32 threads in parallel, the UltraSPARC T1 chip and the associated T2000 server can provide more than double the performance of today's regular chips, at half the power cost. Even now, 18 months after its introduction, this chip still remains ahead of the pack in absolute web performance, in price/performance and in performance per watt.

UltraSPARC T2 is not just a better version of the T1 chip, it provides three significant improvements:

  • More parallelism: Instead of 32 concurrent threads, UltraSPARC T2 delivers 64 threads running in parallel. Moore's law gives us twice as many transistors to play with every 18 months, and the best way to leverage them is to turn them into parallelism. UltraSPARC T1 and T2 are all about maximizing the return on Moore's Law. Check out the specs.
  • More networking: The UltraSPARC T2 features two 10 Gigabit Ethernet ports directly on the chip. Two. Ten GigaBit. On the chip. The NIC is included, there is no bus system between the NIC and the CPU, the CPU is the NIC is the CPU. Total embedded networking. For applications that live in the network, what more can they ask for in a server?
  • Built-in, free and fast encryption: As the web becomes social, private data becomes more and more common, and more and more important to secure. Making security a default feature of your web service is now available for free, and it does not impact performance.

Of course, there are many other improvements, such as 8 FP units, more memory etc., but the three points above alone make the UltraSPARC T2 the perfect chip for web 2.0 applications.

For instance, check out this analysis of the Facebook platform by Marc Andreessen. If you don't want to read it all, here's a summary: Web 2.0 means explosive growth in server capacity for any reasonably successful application. In the case of iLike, they are growing their user base at the rate of 300k a day! This kind of growth can be fatal for your company if you don't have the infrastructure to sustain it. Well, UltraSPARC T2 is exactly the kind of technology that was designed for this: handling many, many, many concurrent users at once, as efficiently and securely as possible.

So, all you Web 2.0 startups out there, get in touch with your nearest Sun rep or Sun SE and ask them about UltraSPARC T2. Or better yet, get a free 60-day trial of an UltraSPARC T1 system, run your favourite benchmark, then double that number (and forget about that crypto card) to see what UltraSPARC T2 can do for you real soon now. Then, sit back, relax and keep those 300k-a-day users coming!

Tuesday Aug 07, 2007

Consolidating Web 2.0 Services, anyone?

I have profiles on both LinkedIn and XING. And lately, I discovered Facebook, so I created a third profile there as well. And then there are half a dozen web forums here and there that I have a profile with as well.

Wouldn't it be nice to create and update a profile in one place, then have it available from whatever the Web 2.0 networking site du jour is? 

Each of these sites has its own messaging system. No, they don't forward me messages, they just send out notifications, since they want me to spend valuable online time on their websites, not anybody else's.

Wouldn't it be nice to have all the Web 2.0 sites' messaging systems aggregated as simple emails to my personal mailbox of choice?

I also like Plazes.com, and I update my whereabouts and what I do there once in a while. I can also tell Facebook what I'm doing right now. And now, surprise, a colleague tells me that this Twitter (sorry, I don't have a Twitter profile yet...) thing is real cool and I should use it to tell the world what I'm doing right now. That would be the third Web 2.0 service where I can type in what I do and let my friends know.

Wouldn't it be... You get the picture.

I think it would be real nice if Web 2.0 services could sit together at one table, agree on some open standards for Web 2.0 style profiles, messaging, microblogging, geo-tagging etc., and then connect with each other, so one change in one profile is reflected in the other as well, so one message sent to me from one forum reaches my conventional mail box and so one action I post to one microblogging site shows up on Plazes and Facebook as well.

I know I'm asking for a lot: After all, many of the business models of Web 2.0 companies rely on collecting all that data from their users and figuring out how to monetize it. But on the other hand, as a user of such services, I'd like to have a nice user experience, and seriously maintaining three profiles is no fun.

Therefore, I think one of the following will happen:

  • Web 2.0 companies will consolidate in the sense of being merged into very few, but global uber-companies that own all business profiles, all geo-tagging stuff, etc. This is probably why Google is buying some Web 2.0 company on a weekly basis. Maybe I should buy XING stock and wait for them to be acquired by LinkedIn etc., but maybe I'm an investment sissy...
  • Web 2.0 Meta-Companies will emerge that leverage Web 2.0 APIs (or mimic users through traditional HTTP) and offer Meta-Services. I'd love to go to, say, a MetaProfiles.com, set up a real good and thorough profile of my life, then let it automatically export to LinkedIn, XING and whatnot.com, and I'd be a happy person. Let me know if you happen to know such a service.
    The closest thing to such a service is actually Facebook: Since it's not just a social website, but a real application platform, it has the potential to provide meta-services for any other Web 2.0 sites out there. I love being able to pull in data from Plazes, del.icio.us etc. into my Facebook profile and have it all in one place. I love the "My Profiles" app that lets me show off my dozen or so profiles, blogs, etc. in one single list.
  • Since both of the above are quite inevitable, eventually the remaining companies will sit down and start agreeing on unified and open standards for Web 2.0 centric data exchange. We've seen this with many other open standards, so why not the same for personal profiles, geodata etc.?

Meanwhile, I'll check out some of the APIs out there. Maybe I can put together a sync script or something similar to help me across the turbulences of Web 2.0 tryouts.

But first, I'll try out Twitter. Since a couple of friends are using it already, I feel some social pressure 2.0 building up...
