Friday, August 27, 2010

Important ZFS Migration Gotcha

I'm making a note of this.  I am still working on ZFS migration strategies in my spare time.  The snapshots are really important!  I ran a zfs send of a snapshot and piped it to a zfs receive, and had planned on following up with another zfs send as an incremental.  However, I tried to do this from memory ("You just add an '-i' - right?"  No!).  I got confused and deleted the target snapshot, which is bad if you ever want to send an incremental.  Plus - more importantly - I wouldn't have gotten the confusing error if I had sent the incremental like this:

    # zfs send -i store/messaging@today store/messaging@L8R ...

I was simply re-running the first command - "send the snapshot".  With "-i" you are saying: I want an incremental from the earlier snapshot (@today); use the later snapshot (cleverly named @L8R) to create the delta.  Whew!  Glad I caught that in time.  Test systems are good!
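
For future me: before sending the incremental, it is worth checking that the base snapshot (@today here) still exists on BOTH the source and the receiving dataset, because the receive will refuse the stream otherwise.  Something like this shows what is on each side (the receiving dataset name is just a placeholder):

    # snapshots on the source side
    zfs list -t snapshot -r store/messaging
    # snapshots on the receiving side - @today must still be there
    zfs list -t snapshot -r newstore/messaging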

Friday, June 11, 2010

More Fun With ZFS

How do you migrate a message store on a ZFS filesystem?  I was told, "Simple! Just create a snapshot, zfs send it to the other filesystem, then shut down the app, take a second snapshot, and send that one to the new filesystem - minimal downtime."
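
On paper that plan comes out to something like the following sketch - the pool names are made up, and note the -i on the second send, which (per the August post above) is the detail that matters:

    # phase one: full copy while the app is still running
    zfs snapshot mailpool/store@phase1
    zfs send mailpool/store@phase1 | zfs recv newpool/store

    # phase two: stop the app, snapshot again, then send only the delta
    zfs snapshot mailpool/store@phase2
    # -F rolls the target back to @phase1 in case anything touched it in the meantime
    zfs send -i mailpool/store@phase1 mailpool/store@phase2 | zfs recv -F newpool/store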

Sounded pretty simple, so I tried it.  Like earlier attempts at "zfs send | zfs recv"... several minutes went by and NOTHING HAPPENED.  The delay made me nervous (not knowing what was going on), so I aborted it from the NON-test system (which was sending to this remote system) and used tape to get the data over there instead.

Now I am testing with about 20% of the actual message store so whatever the numbers are here, it will be safe to assume 5x longer on the other system. I start the move:


zfs send store@today | zfs recv newmail/store

Nothing showed up in the actual filesystem for 20 minutes.  Meanwhile, "zfs list" showed that the target filesystem had been created and its usage was growing.  When the target filesystem size matched the original, the data seemed to ALL SUDDENLY APPEAR ON THE DISK.  I didn't trust it, so I ran du and sure enough - tens of thousands of user accounts were there.  Pretty wild.  Next test - the 2nd snapshot.  The 2nd snapshot should be ONLY the diffs and should be pretty quick.  I guess the assumption is that there are no other snapshots on the target - it has to be clean.  Interesting.
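
Incidentally, if you want to watch a receive crawl along from another console (since the mountpoint itself shows nothing for ages), a dumb polling loop on zfs list does the trick:

    # poll the receiving dataset every 30 seconds until interrupted
    while true; do zfs list newmail/store; sleep 30; done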


This means I could use the daily snapshot - once that's done, move it to the new disk, take the system down, take a 2nd snapshot, move the diffs & ta-dah!  Very cool stuff, this ZFS.  Then I try part two.


bash-3.00# date && zfs send store@L8R | zfs recv newmail/store  && date
Friday, June 11, 2010 10:55:49 AM CDT
cannot receive new filesystem stream: destination 'newmail/store' exists
must specify -F to overwrite it
Huh?!  It sounds like I have to destroy the content on the new system to do this.  This can't be right - or can it?  It's a test system so I proceed.
bash-3.00# date && zfs send store@L8R | zfs recv -F newmail/store  && date
Friday, June 11, 2010 11:23:23 AM CDT
cannot receive new filesystem stream: destination has snapshots (eg. newmail/store@today)
must destroy them to overwrite it
Oh yeah, I forgot - it duplicates the snapshot on the target when you do this.  No harm in getting rid of that, I suppose...
bash-3.00# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
newmail              41.4G   226G  44.0K  /newmail
newmail/store        41.4G   226G  41.2G  /newmail/store
newmail/store@today   187M      -  41.2G  -
store                41.4G   292G  41.2G  /store
store@today          78.0K      -  41.0G  -
store@L8R             193M      -  41.2G  -
bash-3.00# zfs destroy newmail/store@today
There!  Now check to see that it's all gone and note the filesystem sizes.
bash-3.00# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        41.2G   226G  44.0K  /newmail
newmail/store  41.2G   226G  41.2G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
bash-3.00# date && zfs send store@L8R | zfs recv -F newmail/store  && date
Friday, June 11, 2010 11:23:57 AM CDT
After a bit and while it's doing this, I switch over to another console and run zfs list:
bash-3.00# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        4.39G   263G  44.0K  /newmail
newmail/store  4.39G   263G  4.39G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Hey!  That's exciting!  It did EXACTLY as I thought - it blew away the filesystem I had created.  What was once 41G is gone (the 4.39G showing is already the new receive coming back in).  ;-(  But hang on, let's see what we end up with.  Some time later...
bash-3.00# zfs list && date
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        7.02G   260G  44.0K  /newmail
newmail/store  7.02G   260G  7.02G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Friday, June 11, 2010 11:29:47 AM CDT
7GB?!  I only added about 200MB to the original so this is interesting ... Wait a bit more ...

bash-3.00# zfs list && date
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        18.1G   249G  44.0K  /newmail
newmail/store  18.1G   249G  18.1G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Friday, June 11, 2010 11:35:15 AM CDT
Okay - that is WAY bigger than the snapshot.  Interesting... Wait a bit more ...
bash-3.00# zfs list && date
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        38.4G   229G  44.0K  /newmail
newmail/store  38.4G   229G  38.4G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Friday, June 11, 2010 11:44:58 AM CDT
Okay, it is definitely going to recreate the entire 41GB and (hopefully) include the extra stuff I added.  Not exactly what I expected, BUT I'll take it.

A few minutes later - TA-DAHHH!  It is done.  It took 23 minutes (the first time it took 20-21 minutes).  Weird.  I don't see the point of the 2nd snapshot - you might as well just stop the app and snapshot once - or I've done something wrong (most likely).
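
For the record (see the August post above): the "something wrong" was almost certainly the missing -i.  Without it, the second command is just another full send of @L8R, which is why it re-copied the whole 41GB.  The incremental pass would presumably have looked like this - and it needs the @today snapshot left intact on the receiving side, which is the very one I destroyed:

    # send only the changes between @today and @L8R
    zfs send -i store@today store@L8R | zfs recv newmail/store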

Oh well.  Fun experiment.  But I'd better do some more reading before trying this live.  Incidentally, this Solaris 10 ZFS Essentials is really nice for those times you're away from a PC and want to read about ZFS.  I haven't seen the new Solaris ZFS Administration Guide but I'm putting it on my list of books to acquire.

Hopefully, this was useful or at least entertaining.  I know, I know! "RTFM!" (Which stands for Read the Fine Manual, of course).

Thursday, March 11, 2010

Working with Solaris 10 zones

This year, while creating the disaster backup server, I decided to put all 6 systems on one V490, with a 3310 SCSI array made into a double-parity RAID-Z2 ZFS pool with a hot spare.  I allocated 90% of it and then carved it up among the Oracle Application Servers and the Oracle database.
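
The pool setup boils down to something like this - the pool, dataset, and device names here are made up, as is the disk count:

    # double-parity raidz2 pool with one hot spare (hypothetical device names)
    zpool create drpool raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 spare c1t5d0
    # then carve it up per application
    zfs create drpool/oas1
    zfs create drpool/oradb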

A couple of things I wanted to make note of: I found that if you simply specify the mask in zonecfg, you do not have to add any /etc/netmasks or /etc/networks entries.  After issuing an "add net", just say "set address=192.168.1.1/24" and it saves time.  Left to its own devices it will, of course, assume the class C mask (which happens to be correct in this case), but the ugly nag message can be avoided by doing either of the above.
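
The zonecfg sequence looks like this (the zone name and NIC are placeholders; the address line with the /24 is the part that matters):

    # zonecfg -z mailzone
    zonecfg:mailzone> add net
    zonecfg:mailzone:net> set address=192.168.1.1/24
    zonecfg:mailzone:net> set physical=ce0
    zonecfg:mailzone:net> end
    zonecfg:mailzone> commit
    zonecfg:mailzone> exit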

The other thing I got bit by was non-global-zone Ethernet connections.  I had a zone that needed a special VLAN not used by the global zone, so I just added the Ethernet connection and forgot about it.  After rebooting, NOTHING WORKED!  It turns out the special VLAN carried the connection used by my (virtual) DNS server.  Since the global zone wasn't using that interface, it didn't get plumbed after the reboot, so it wasn't there for the DNS zone, and with no DNS everything suffered slow timeouts, black screens, etc.  I didn't want or need that VLAN in the global zone, so I experimented with touching /etc/hostname.ce2 (the otherwise-unused interface in the global zone) - that is, I created an empty placeholder file.  Amazingly, it worked.
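
The whole workaround, for reference (ce2 being the interface in my case), plus an after-reboot sanity check:

    # empty placeholder so ce2 gets plumbed at boot, even with no address assigned
    touch /etc/hostname.ce2
    # after the next reboot, confirm the interface came up
    ifconfig -a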