Friday, June 11, 2010

More Fun With ZFS

How do you migrate a message store that lives on a ZFS filesystem?  I was told "Simple! Just create a snapshot, zfs send it to the other filesystem, then create a new snapshot, then shut down the app, and send that snapshot to the new filesystem with minimal downtime."

Sounded pretty simple, so I tried it.  As with earlier attempts at "zfs send | zfs recv"... several minutes went by and NOTHING HAPPENED. The delay made me nervous (not knowing what was going on), so I aborted it on the NON-test system (which was sending to this remote system) and used tape to get the data over instead.

Now I am testing with about 20% of the actual message store, so whatever the numbers are here, it's safe to assume roughly 5x longer on the other system. I start the move:


zfs send store@today | zfs recv newmail/store
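
For completeness, the @today snapshot being sent here would have been created beforehand. Using the dataset names from this post, that step would look something like this (a sketch; it assumes root on a ZFS system):

```shell
# Take a read-only, point-in-time snapshot of the source dataset.
# "store" is the message-store dataset; "@today" names the snapshot.
zfs snapshot store@today

# Confirm the snapshot exists before sending it.
zfs list -t snapshot
```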

It did NOTHING on the actual filesystem for 20 minutes. Meanwhile, "zfs list" showed that the target filesystem had been created and its usage was growing. When the target filesystem's size matched the original's, the data seemed to ALL SUDDENLY APPEAR ON THE DISK. I didn't trust it, so I ran du, and sure enough - tens of thousands of user accounts were there. Pretty wild.

Next test - the 2nd snapshot. The 2nd snapshot should be ONLY the diffs and should be pretty quick.  I guess the assumption is that there are no other snapshots on the target.  It has to be clean. Interesting.


This means I could use the daily snapshot - once it's done, move it to the new disk, take the system down, take a 2nd snapshot, move the diffs & ta-dah!  Very cool stuff, this ZFS. Then I try part two.


bash-3.00# date && zfs send store@L8R | zfs recv newmail/store  && date
Friday, June 11, 2010 10:55:49 AM CDT
cannot receive new filesystem stream: destination 'newmail/store' exists
must specify -F to overwrite it
Huh?!  It sounds like I have to destroy the content on the new system to do this.  This can't be right - or can it?  It's a test system, so I proceed.
bash-3.00# date && zfs send store@L8R | zfs recv -F newmail/store  && date
Friday, June 11, 2010 11:23:23 AM CDT
cannot receive new filesystem stream: destination has snapshots (eg. newmail/store@today)
must destroy them to overwrite it
Oh yeah, I forgot - it duplicates the snapshot when you do this.  No harm in getting rid of that, I suppose...
bash-3.00# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
newmail              41.4G   226G  44.0K  /newmail
newmail/store        41.4G   226G  41.2G  /newmail/store
newmail/store@today   187M      -  41.2G  -
store                41.4G   292G  41.2G  /store
store@today          78.0K      -  41.0G  -
store@L8R             193M      -  41.2G  -
bash-3.00# zfs destroy newmail/store@today
There!  Now check to see that it's all gone, and note the filesystem sizes.
bash-3.00# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        41.2G   226G  44.0K  /newmail
newmail/store  41.2G   226G  41.2G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
bash-3.00# date && zfs send store@L8R | zfs recv -F newmail/store  && date
Friday, June 11, 2010 11:23:57 AM CDT
After a bit, while it's still running, I switch over to another console and run zfs list:
bash-3.00# zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        4.39G   263G  44.0K  /newmail
newmail/store  4.39G   263G  4.39G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Hey!  That's exciting!  It did EXACTLY as I thought - it destroyed the new filesystem I created.  What was once 41G is now 78k.  ;-(  But hang on, let's see what we end up with.  Some time later...
bash-3.00# zfs list && date
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        7.02G   260G  44.0K  /newmail
newmail/store  7.02G   260G  7.02G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Friday, June 11, 2010 11:29:47 AM CDT
7GB?!  I only added about 200MB to the original, so this is interesting ... Wait a bit more ...

bash-3.00# zfs list && date
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        18.1G   249G  44.0K  /newmail
newmail/store  18.1G   249G  18.1G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Friday, June 11, 2010 11:35:15 AM CDT
Okay - that is WAY bigger than the snapshot.  Interesting... Wait a bit more ...
bash-3.00# zfs list && date
NAME            USED  AVAIL  REFER  MOUNTPOINT
newmail        38.4G   229G  44.0K  /newmail
newmail/store  38.4G   229G  38.4G  /newmail/store
store          41.4G   292G  41.2G  /store
store@today    78.0K      -  41.0G  -
store@L8R       193M      -  41.2G  -
Friday, June 11, 2010 11:44:58 AM CDT
Okay, it is definitely going to recreate the entire 41GB and (hopefully) include the extra stuff I added.  Not exactly what I expected, BUT I'll take it.

A few minutes later - TA-DAHHH!  It is done.  It took 23 minutes (the first time it took 20-21 minutes).  Weird.  I don't see the point of the 2nd snapshot - just stop the app and snapshot the first time - or I've done something wrong (most likely).

Oh well.  Fun experiment.  But I'd better do some more reading before trying this live.  Incidentally, Solaris 10 ZFS Essentials is really nice for those times you're away from a PC and want to read about ZFS.  I haven't seen the new Solaris ZFS Administration Guide, but I'm putting it on my list of books to acquire.

Hopefully, this was useful or at least entertaining.  I know, I know! "RTFM!" (Which stands for Read the Fine Manual, of course).

4 comments:

zogness said...

Wouldn't it be nice if there was some sort of progress bar printing to screen? Just a series of pound signs is all I'm talking about.

RAT said...

Ah ha! Francisco from the Sun Managers list says here's where I went wrong - I left the "-i" out of the 2nd, incremental send. BIG difference!! It took only seconds to update using this:

zfs send -i store@today store@L8R | zfs receive -F newmail/store@L8R
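
Putting Francisco's correction together with the steps from the post, the whole minimal-downtime migration would look roughly like this (a sketch using the dataset names from the post, not a tested procedure):

```shell
# 1. Take the first snapshot while the app is still running,
#    and do the big full send (this is the slow part).
zfs snapshot store@today
zfs send store@today | zfs recv newmail/store

# 2. Shut down the mail app, then snapshot the final state.
zfs snapshot store@L8R

# 3. Incremental send: only the blocks changed between @today and @L8R
#    go over, so this takes seconds instead of tens of minutes.
zfs send -i store@today store@L8R | zfs receive -F newmail/store@L8R

# 4. Point the app at /newmail/store and restart it.
```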

RAT said...

Here's another example - as I do this live, the ZFS send/recv has been working on a live message store of 200+GB for 21 hours. A df -h shows NOTHING in the new filesystem. When it completes, it will SUDDENLY show 202GB. That's weird to me.

MarkoSchuetz said...

For progress do

zfs send ... | zfs receive ... &
while true ; do zfs list ... ; sleep 100 ; done
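
Filling in the blanks in that loop with the dataset names from this post, a concrete version might look like the following (a sketch; the space usage grows during the receive even though the data isn't visible to df or du until it completes):

```shell
# Kick off the transfer in the background...
zfs send store@today | zfs recv newmail/store &

# ...then poll the receiving dataset's size every 100 seconds.
while true; do
    zfs list -o name,used,refer newmail/store
    sleep 100
done
```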