
Blog/HOWTO: ZFS under Linux using FUSE


Judging just by its features, ZFS looks like a leap forward in file system technology: sheer size thanks to 128-bit addressing, an integrated volume manager, snapshots, the ability to add partitions and RAID disks on the fly (including RAID 5/6, named RAID-Z/Z2), hot spares, and built-in compression. For more on the technical features of ZFS, see some whitepapers.
  Not that it matters for enterprise or data center usage, but besides the possibility of growing it (which nearly all file systems offer) you can also shrink ZFS; XFS (Linux, IRIX) or JFS (Linux, AIX) cannot be shrunk. Also quite "interesting" in terms of computer forensics is the copy-on-write mechanism: with carving techniques you'll probably find all "old" data unless the logic in ZFS decided to overwrite it at some point.

Implementations

The Zettabyte file system has been available for Solaris since Solaris 10 Update 2 (aka HW 6/06), and for OpenSolaris and its distributions. Its source code is freely available, and there are also ports to other operating systems: FreeBSD, and probably soon NetBSD. There are even rumors that it'll be integrated into the upcoming version 10.5 of Mac OS X (aka Leopard).
  However, it's not natively available under the most widespread open source operating system: Linux. The main reason is a license incompatibility: the Linux kernel uses GPLv2, while ZFS, like the whole nine yards of OpenSolaris, is under the CDDL. This means a developer can't just take the source code and modify it to work within the Linux kernel, as e.g. FreeBSD did. There was a lengthy discussion on LKML. Bottom line: aside from the unclear patent situation around ZFS, a reimplementation from scratch is the only legal choice for a real implementation on Linux. But that's a whole lot of work. Just to give you a picture: from the free availability of the source code of SGI's XFS to its inclusion in the mainline kernel, it took SGI and other developers four years, and that was a port from IRIX to Linux. For Sun, implementing ZFS from scratch took five years.

However, there's FUSE, which, as the name suggests, allows the filesystem to live in userspace and thus circumvents the license incompatibility. In kernel space there is a (GPL'ed) fuse module which provides the glue. The catch: a filesystem in userspace has limitations and will never be as good (in terms of features, scalability, performance) as a kernel-based one.
  The ZFS on FUSE project was sponsored by Google's Summer of Code program in 2006, and is beta as of May 2007.

HOWTO

OK, those are the constraints. If you want to give it a shot anyway, here is how it works on a single system. But be warned: it's at your own risk, and zfs-fuse is beta software. Your data is in danger. Got that? ;-)
  I explain ZFS on FUSE using the packages that were the most recent at the time of writing. I recommend that you check for newer versions.
  The operating systems:
  • Nexenta aka GNU/Solaris (Alpha 6/7) which by default creates /export/home using ZFS
  • Opensuse 10.2 (64 Bit), distribution specific settings are in green
  • Ubuntu 7.04 (32 Bit), distribution specific settings denoted in brown
On both Linux distributions you have to download and compile the zfs-fuse part. For this you need the fuse-devel stuff (libfuse, *.h files) and scons. scons is a software construction tool; think of it as a Python-based autoconf/automake. Before you read the description, here is the next catch: you're always supposed to export the ZFS pools on one system in order to be able to import them cleanly on the other one. An unmount or a plain shutdown just won't do. Also worth knowing if you're not familiar with Solaris x86: all Solaris partitions (slices) are, from the Linux point of view, contained in a single primary partition, see below.

Solaris side

This is what you have to do on the Solaris system (here Nexenta):
root@mybox:~# uname -a
SunOS mybox 5.11 NexentaOS_20070402 i86pc i386 i86pc Solaris
root@mybox:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
home 559M 3.95G 559M /export/home
root@mybox:~# zpool status
pool: home
state: ONLINE
scrub: none requested
config:

NAME STATE READ WRITE CKSUM
home ONLINE 0 0 0
c0d0s7 ONLINE 0 0 0

errors: No known data errors
root@mybox:~# mount -p | grep -w zfs
home - /export/home zfs - no rw,devices,setuid,exec,xattr,atime
root@mybox:~# zpool export home
root@mybox:~# zpool status
no pools available
root@mybox:~# /usr/sbin/shutdown -y -g0 -i6

Preparations under Linux

Now you can start importing it on the Linux side. Both distributions explained here use a vendor-provided kernel. If you use a different kernel, make sure that CONFIG_PARTITION_ADVANCED=y, CONFIG_SOLARIS_X86_PARTITION=y and CONFIG_SUN_PARTITION=y are set; CONFIG_UFS_FS=m cannot hurt. First, on both Linux distributions, you need scons, which is likely not installed:
mybox:~ # uname -a
Linux mybox 2.6.18.8-0.3-default #1 SMP Tue Apr 17 08:42:35 UTC 2007 x86_64 x86_64 x86_64 GNU/Linux

mybox:~ # rpm -Uv scons-0.96.91-37.x86_64.rpm
Preparing packages for installation...
scons-0.96.91-37
mybox:~ #
mybox:~ # uname -a
Linux mybox 2.6.20-15-generic #2 SMP Sun Apr 15 07:36:31 UTC 2007 i686 GNU/Linux

mybox:~ # apt-get install scons
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
scons
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 0B/389kB of archives.
After unpacking 1708kB of additional disk space will be used.
Selecting previously deselected package scons.
(Reading database ... 212995 files and directories currently installed.)
Unpacking scons (from .../scons_0.96.93-2_all.deb) ...
Setting up scons (0.96.93-2) ...
mybox:~ #
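If you do run your own kernel, the partition options named at the beginning of this section can be checked with a few lines of shell. This is only a sketch: the function name is made up, and it assumes you have the kernel config as a plain text file (many distributions ship it as /boot/config-<release>, but that path is an assumption, not something the packages above guarantee).

```shell
#!/bin/sh
# Sketch: report which of the partition-related kernel options are built in.
# The option names come from the text above; the function name is hypothetical.

check_partition_options() {
    # $1: path to a kernel config file (plain text, not gzip-compressed)
    config_file=$1
    missing=0
    for opt in CONFIG_PARTITION_ADVANCED CONFIG_SOLARIS_X86_PARTITION CONFIG_SUN_PARTITION; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    # Exit status 0 only if all three options are set to "y".
    return "$missing"
}

# Usage (the path is an assumption, adjust it to your system):
#   check_partition_options "/boot/config-$(uname -r)"
```

A non-zero exit status means at least one option is missing, in which case the Solaris slices will most likely not show up as separate devices later on.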
You also need the fuse and fuse-devel stuff. Suse 10.2 ships with version 2.6.0, which is not sufficient. Pull the following RPMs or later ones from ftp.halifax.rwth-aachen.de:
mybox:~ # rpm -Uv http://ftp.halifax.rwth-aachen.de/opensuse/repositories/filesystems/openSUSE_10.2/x86_64/fuse-devel-2.6.3-4.1.x86_64.rpm \
http://ftp.halifax.rwth-aachen.de/opensuse/repositories/filesystems/openSUSE_10.2/x86_64/fuse-2.6.3-4.1.x86_64.rpm
Retrieving http://ftp[...]
Preparing packages for installation...
fuse-devel-2.6.3-4.1
fuse-2.6.3-4.1
mybox:~ #
Under Ubuntu, if you have configured the multiverse and universe repositories, it's also only one command:
mybox:~ #  apt-get install fuse-utils libfuse-dev libfuse2
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  fuse-utils libfuse-dev libfuse2
0 upgraded, 3 newly installed, 0 to remove and 3 not upgraded.
Need to get 0B/275kB of archives.
After unpacking 770kB of additional disk space will be used.
Selecting previously deselected package fuse-utils.
(Reading database ... 213249 files and directories currently installed.)
Unpacking fuse-utils (from .../fuse-utils_2.6.3-1ubuntu2_i386.deb) ...
Selecting previously deselected package libfuse2.
Unpacking libfuse2 (from .../libfuse2_2.6.3-1ubuntu2_i386.deb) ...
Selecting previously deselected package libfuse-dev.
Unpacking libfuse-dev (from .../libfuse-dev_2.6.3-1ubuntu2_i386.deb) ...
Setting up fuse-utils (2.6.3-1ubuntu2) ...
creating fuse device node...
udev active, devices will be created in /dev/.static/dev/
creating fuse group...

Setting up libfuse2 (2.6.3-1ubuntu2) ...

Setting up libfuse-dev (2.6.3-1ubuntu2) ...

mybox:~ #
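Since the FUSE version matters (2.6.0 was not enough on Suse), a build script may want to compare version numbers before going on. Here is a minimal sketch of such a comparison in plain sh. The function name is made up, and the threshold of 2.6.3 in the usage comment is my assumption: it is simply the version installed above, not a documented minimum.

```shell
#!/bin/sh
# Sketch: compare dotted version strings such as "2.6.0" and "2.6.3".
# version_ge A B succeeds (exit status 0) if A >= B, comparing numerically
# field by field. Only purely numeric fields are handled.

version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; a missing field counts as 0.
        fa=${a%%.*}
        fb=${b%%.*}
        if [ "${fa:-0}" -gt "${fb:-0}" ]; then return 0; fi
        if [ "${fa:-0}" -lt "${fb:-0}" ]; then return 1; fi
        # Drop the leading field; when no dot is left, the string is used up.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0   # all fields were equal
}

# Usage (assumes the FUSE development package provides a pkg-config file
# named "fuse"; the 2.6.3 threshold is an assumption, see above):
#   installed=$(pkg-config --modversion fuse)
#   version_ge "$installed" 2.6.3 || echo "FUSE $installed is too old" >&2
```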
The zfs-fuse archive is what this story is all about (use this one or a later one if available). Pull it e.g. from berlios:
mybox:~ # cd /tmp
mybox:/tmp # wget http://download2.berlios.de/zfs-fuse/zfs-fuse-0.4.0_beta1.tar.bz2
mybox:/tmp # cd /usr/src
mybox:/usr/src # tar xjf /tmp/zfs-fuse-0.4.0_beta1.tar.bz2
mybox:/usr/src # cd /usr/src/zfs-fuse-0.4.0_beta1/src

Compilation of zfs-fuse

Under Suse, fuse.h from the devel RPM above is in a different place, thus you need to make a change before starting the compilation with scons:
mybox:/usr/src/zfs-fuse-0.4.0_beta1/src # mv zfs-fuse/cmd_listener.c zfs-fuse/cmd_listener.c.orig
mybox:/usr/src/zfs-fuse-0.4.0_beta1/src # sed 's/fuse\/fuse.h/linux\/fuse.h/' < zfs-fuse/cmd_listener.c.orig > zfs-fuse/cmd_listener.c
mybox:/usr/src/zfs-fuse-0.4.0_beta1/src # scons
[.. some compilation .. ]
mybox:/usr/src/zfs-fuse-0.4.0_beta1/src # scons install install_dir=/usr/local/sbin && cd
mybox:~ # export PATH=$PATH:/usr/local/sbin
mybox:~ # modprobe fuse
mybox:~ # echo "none /sys/fs/fuse/connections fusectl none 0 0" >>/etc/fstab
mybox:~ # mount -a -t fusectl
mybox:~ # zfs-fuse &
[1] 21202
mybox:~ # dmesg | grep solaris
sdb3: <solaris: [s0] sdb7 [s1] sdb8 [s2] sdb9 [s7] sdb10 >
mybox:~ # fdisk -l /dev/sdb

Disk /dev/sdb: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 * 1 1583 12715416 7 HPFS/NTFS
/dev/sdb2 1584 2221 5124735 7 HPFS/NTFS
/dev/sdb3 * 2222 3924 13679347+ bf Solaris
/dev/sdb4 3925 19457 124768822+ 5 Extended
/dev/sdb5 3925 4964 8353768+ 83 Linux
/dev/sdb6 4965 19457 116414991 83 Linux

As you can see, the Solaris partitions sdb7 through sdb10 are within one primary partition (sdb3). That means: slice 0 is under sdb7, s1 under sdb8, the backup partition (s2) under sdb9 and slice 7 is from the Linux perspective under sdb10.
  Mounting the UFS partition (root under Nexenta) is optional here, but since we are almost there we may as well mount the UFS root partition:
mybox:~ # mkdir /solaris
mybox:~ # echo "/dev/sdb7 /solaris/ ufs ro,ufstype=sunx86,noauto 0 0 # slice 0 is root" >>/etc/fstab
mybox:~ # mount /solaris
mybox:~ #
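The slice-to-device mapping that the kernel printed in dmesg above can also be pulled out by a script, which is handy if you don't want to count partitions by hand. The following is a sketch only: the function name is made up, and it assumes the kernel log line has the same layout as the one shown in this article.

```shell
#!/bin/sh
# Sketch: turn a kernel log line like
#   sdb3: <solaris: [s0] sdb7 [s1] sdb8 [s2] sdb9 [s7] sdb10 >
# into "<slice> <device>" pairs. The function name is hypothetical.

solaris_slices() {
    # Reads kernel log lines on stdin, prints one pair per Solaris slice.
    grep 'solaris:' | tr -d '<>' | awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^\[s[0-9]+\]$/) {
                # Strip the surrounding brackets from e.g. "[s7]".
                slice = substr($i, 2, length($i) - 2)
                print slice, "/dev/" $(i + 1)
            }
    }'
}

# Usage:
#   dmesg | solaris_slices
```

For the log line from my box this prints s0 /dev/sdb7, s1 /dev/sdb8, s2 /dev/sdb9 and s7 /dev/sdb10, one pair per line.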

Importing ZFS under Linux

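Now we can start importing the file system, just by using the same command you would use under Solaris, namely zpool import. But before that, the fuse module must be loaded and the zfs-fuse daemon must be running. If you already did this at the end of the previous section, skip straight to zpool import; in particular, don't append the fusectl line to /etc/fstab a second time: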
mybox:~ # export PATH=$PATH:/usr/local/sbin
mybox:~ # echo "none            /sys/fs/fuse/connections        fusectl none            0 0" >>/etc/fstab
mybox:~ # mount -a -t fusectl
mybox:~ # zfs-fuse &
[1] 21202
mybox:~ # zpool import
  pool: home
    id: 13093621367567478064
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
        The pool may be active on on another system, but can be imported using
        the '-f' flag.
config:

        home        ONLINE
          sdb10     ONLINE
mybox:~ # zpool import home
cannot import 'home': pool may be in use from other system
use '-f' to import anyway
mybox:~ # 
Just for demonstration: in this case the pool was not exported on the Solaris side beforehand. You can now issue zpool import -f home and hope you will continue to be happy, or go back, boot Solaris and issue a zpool export as explained above. Had you exported it successfully, it would look like this:
mybox:~ # zpool import
  pool: home
    id: 13093621367567478064
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        home        ONLINE
          sdb10     ONLINE


mybox:~ # zpool import home
mybox:~ # mount | egrep "export|solaris|fuse"
/dev/sdb7 on /solaris type ufs (ro,ufstype=sunx86)
none on /sys/fs/fuse/connections type fusectl (rw,none)
home on /export/home type fuse (rw,nosuid,nodev,allow_other)
mybox:~ # df -kT /export/home/ /solaris
Filesystem    Type   1K-blocks      Used Available Use% Mounted on
home          fuse     4709297     42995   4666302   1% /export/home
/dev/sdb7      ufs     8171817   1800911   6289188  23% /solaris
That was it already. Now you can do all kinds of stuff: reading from it, even writing to it. If you want to use it on a regular basis (I deliberately avoid the term production system here due to its status), you should put the commands mount -a -t fusectl; zfs-fuse and the import command, as well as the zpool export command (see below), in a startup/shutdown script.
  In the meantime, here are some examples of the ZFS-related commands:
mybox:~ # zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
home                   4.56G    562M   4.01G    12%  ONLINE     -
mybox:~ # zpool status -v
  pool: home
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        home        ONLINE       0     0     0
          sdb10     ONLINE       0     0     0

errors: No known data errors
mybox:~ # zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
home   562M  3.94G   562M  /export/home
mybox:~ # zpool iostat -v home
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
home         562M  4.01G      0      6    502   212K
  sdb10      562M  4.01G      0      6    502   212K
----------  -----  -----  -----  -----  -----  -----
mybox:~ #
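Coming back to the import step: a script can tell the two cases shown above apart (pool exported cleanly or not) by looking for the hint about the '-f' flag in the zpool import listing. This is a minimal sketch; the function name is made up, and it relies on the message wording from the transcripts in this article, which other zfs-fuse versions may phrase differently.

```shell
#!/bin/sh
# Sketch: decide from the `zpool import` listing whether a pool was left
# un-exported on the other system. The function name is hypothetical and the
# matched text is the wording shown in the transcripts above.

import_needs_force() {
    # Reads `zpool import` output on stdin; exit status 0 if the listing
    # carries the hint about the '-f' flag.
    grep -q "'-f' flag"
}

# Usage:
#   if zpool import | import_needs_force; then
#       echo "pool was not exported; boot the other OS and export it first" >&2
#   else
#       zpool import home
#   fi
```

I deliberately don't add -f automatically here: forcing the import is exactly the "hope you will continue to be happy" path.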

Shutdown

As explained above, if you want to use your ZFS pool under any other OS you need to export it first. This is also true if you shut down Linux and boot Solaris. The commands are the same:
mybox:~ # zpool  export home
mybox:~ # zpool  status
no pools available
mybox:~ # mount | egrep -w fuse
none on /sys/fs/fuse/connections type fusectl (rw)
mybox:~ # 
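The startup/shutdown script suggested earlier could look roughly like the sketch below. It just strings together the manual steps from this article. Several things are my assumptions and not part of zfs-fuse: the function names, the DRY_RUN switch (it only prints the commands, useful for testing), the pool name home as default, and the short sleep after starting the daemon. It also expects the fusectl line to be in /etc/fstab already.

```shell
#!/bin/sh
# Sketch of a start/stop helper for zfs-fuse, following the manual steps above.
# Assumptions: pool name "home", binaries in /usr/local/sbin, fusectl entry
# already in /etc/fstab. DRY_RUN is a hypothetical switch for testing.

POOL=${POOL:-home}
PATH=$PATH:/usr/local/sbin

run() {
    # Print the command instead of executing it when DRY_RUN is set.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_daemon() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "zfs-fuse &"
    else
        zfs-fuse &
        sleep 2   # assumption: give the daemon a moment before talking to it
    fi
}

zfs_fuse_ctl() {
    case $1 in
        start)
            run modprobe fuse
            run mount -a -t fusectl
            start_daemon
            run zpool import "$POOL"
            ;;
        stop)
            # Export so the pool can be imported cleanly elsewhere.
            run zpool export "$POOL"
            ;;
        *)
            echo "usage: zfs_fuse_ctl {start|stop}" >&2
            return 2
            ;;
    esac
}

# As a standalone init script you would end the file with:
#   zfs_fuse_ctl "$1"
```

With DRY_RUN=1 set, zfs_fuse_ctl start only lists the four commands, so you can check the order before letting it loose on real data.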

(Dirk Wetter, 5/20/2007)

Incompatible ZFS version

<update> Looks like recent OpenSolaris builds come with a ZFS version which is not recognized anymore by the latest zfs-fuse (by the time of writing this is version 0.4.0_beta1). On another system I have SXCE 65 (Solaris Express Community Edition, build 65). Here is the message if I try to import it under Linux (32 Bit OpenSuse 10.2):
mybox:~ # zpool import
  pool: home
    id: 7799573591936139070
 state: FAULTED
status: The pool is formatted using an incompatible version.
action: The pool cannot be imported.  Access the pool on a system running newer
        software, or recreate the pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-A5
config:

        home        UNAVAIL   newer version
          hda15     ONLINE
mybox:~ # zpool import -f home
cannot import 'home': pool is formatted using a newer ZFS version
mybox:~ # 
Currently I am not aware of a solution or an update which recognizes the new ZFS. </update> (Dirk Wetter, 6/15/2007)
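If the import is scripted, it makes sense to look at the state field first, so that a FAULTED pool like the one above is reported instead of blindly imported. A small sketch, assuming the listing format shown in the transcripts of this article (the function name is made up):

```shell
#!/bin/sh
# Sketch: extract "<pool> <state>" pairs from a `zpool import` listing.
# The function name is hypothetical; the field layout is the one shown in
# the transcripts of this article.

pool_import_state() {
    # Reads `zpool import` output on stdin.
    awk '$1 == "pool:"  { pool = $2 }
         $1 == "state:" { print pool, $2 }'
}

# Usage:
#   zpool import | pool_import_state | while read -r pool state; do
#       [ "$state" = ONLINE ] || echo "$pool is $state, not importing" >&2
#   done
```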



Discussions

Ricardo Correira wrote (6/25/2006, 03:48 PM):
Yes, this is known. If you update to the latest mercurial revision (see instructions in http://www.wizy.org/wiki/ZFS_on_FUSE ) you will no longer have this problem :)



Dirk Wetter wrote (7/5/2006, 01:10 PM):
Yes, now I have a different problem ;-) . After installing mercurial following Ricardo's hint:

mybox:/usr/src # mkdir zfs-fuse; cd zfs-fuse
mybox:/usr/src/zfs-fuse # hg clone http://www.wizy.org/mercurial/zfs-fuse/trunk
requesting all changes
adding changesets
adding manifests
adding file changes
added 239 changesets with 1522 changes to 475 files
443 files updated, 0 files merged, 0 files removed, 0 files unresolved
mybox:/usr/src/zfs-fuse # cd trunk/src

After following the instructions above (i.e. scons; scons install install_dir={install-path}; {install-path}/zfs-fuse &) I get:

mybox:/usr/src/zfs-fuse # zpool import
  pool: home
    id: 7799573591936139070
 state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on on another system, but can be imported using
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-72
config:

        home        FAULTED   corrupted data
          hda15     ONLINE
mybox:/usr/src/zfs-fuse # zpool import -f home
cannot import 'home': I/O error
mybox:/usr/src/zfs-fuse # 

No, the metadata is not corrupted ;-) SXCE 65 runs just fine.
Note: Mercurial (hg is its command-line frontend) is the SCM also used e.g. for OpenSolaris. [The internet DVD image of Opensuse 10.2 doesn't contain mercurial; you need to pull the i386/x86_64 RPMs first.]

