Let's take a look at how ZFS protects data. I plugged in a spare external disk, created two small 1GB partitions on it with fdisk, and set up a ZFS pool for testing:
fdisk /dev/sdc # set up two 1GB partitions

Note that this is just a test set-up. Normally you should definitely use two separate disks to get the full benefit of mirroring; slicing a single disk into partitions defeats the purpose, because if that disk dies, both halves of the mirror die with it.
zpool create testpool mirror /dev/sdc1 /dev/sdc2
zfs create testpool/testfs
Smashing bits

Let's create a test file which fills the file system and make a note of its sha1 fingerprint:
cd /testpool/testfs
dd if=/dev/urandom of=testfile bs=1M count=920
sha1sum testfile # prints a sha1 fingerprint for the file

Now comes the fun part. With a small (and very dangerous) Python script, we can corrupt one of the devices by writing some junk data at regular intervals:

#!/usr/bin/python2.5
openedDevice = open('/dev/sdc1', 'w+b')
interval = 10000000
while True:
    openedDevice.seek(interval, 1)
    print str(openedDevice.tell())
    openedDevice.write('corrupt')

When we reread the file after the corruption, ZFS will transparently pick the data from the healthy half of the mirror. Note that in this case the file cannot be served from the cache, because it is larger than the available system memory.
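To experiment with the same seek-and-write pattern without risking a real device, here is a Python 3 sketch that targets an ordinary scratch file instead of /dev/sdc1. The `corrupt` helper name and its stop-at-end-of-file check are my additions, not part of the original script:

```python
# Safe variant of the corruption script: same relative-seek pattern,
# but aimed at a regular file and stopping before the end of the file.
import os

def corrupt(path, interval=10000000, payload=b'corrupt'):
    """Overwrite `payload` every `interval` bytes; return the offsets hit."""
    size = os.path.getsize(path)
    offsets = []
    with open(path, 'r+b') as f:
        while True:
            f.seek(interval, 1)              # relative seek, as in the original
            if f.tell() + len(payload) > size:
                break                        # stop instead of growing the file
            offsets.append(f.tell())
            f.write(payload)                 # position advances by len(payload)
    return offsets
```

Because the write advances the file position, successive corruption points drift by seven bytes each iteration, just as they do in the original script.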
/home/wim/corruption.py

Strangely enough, running zpool status testpool doesn't report any errors at this point. I have sent a mail to the zfs-fuse mailing list to ask whether this is normal.
sha1sum testfile # still prints the correct fingerprint!
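The fingerprint check doesn't need an external tool, by the way. A small Python 3 equivalent of the sha1sum step, reading in chunks so it also works on files larger than RAM, could look like this (`sha1_fingerprint` is a hypothetical helper name):

```python
# Chunked SHA-1 fingerprint, equivalent to `sha1sum <path>`.
import hashlib

def sha1_fingerprint(path, chunk_size=1024 * 1024):
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        # Read 1MB at a time so memory use stays constant.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()
```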
To detect and fix the errors, we have to run this simple command:
zpool scrub testpool

To protect against bit rot on consumer-grade disks, the recommendation is to run a scrub once a week. In a future post I'll explore how to do that automatically, including some kind of reporting so that I know when a disk is in trouble.
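As a first sketch of what that automation might look like, here is a hypothetical Python 3 helper intended for a weekly cron job. The `pool_looks_healthy` check is an assumption about what zpool status prints for a healthy pool ("state: ONLINE" and "errors: No known data errors"); adjust it to whatever your zpool version actually outputs:

```python
# Hypothetical weekly-scrub helper: kick off a scrub, then look at the
# pool status text for signs of trouble.
import subprocess

def pool_looks_healthy(status_text):
    """Rough health check on `zpool status` output (assumed format)."""
    return ('state: ONLINE' in status_text
            and 'errors: No known data errors' in status_text)

def scrub_and_report(pool):
    subprocess.check_call(['zpool', 'scrub', pool])
    status = subprocess.check_output(['zpool', 'status', pool], text=True)
    if not pool_looks_healthy(status):
        # Replace with mail or any notification mechanism you prefer.
        print('WARNING: pool %s may have problems:\n%s' % (pool, status))
```

Note that `zpool scrub` returns immediately while the scrub runs in the background, so a real version would have to wait for completion (or be a second cron job) before checking the status.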
zpool status testpool # shows progress and results of the scrub