Index: head/en_US.ISO8859-1/books/handbook/zfs/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/handbook/zfs/chapter.xml	(revision 51459)
+++ head/en_US.ISO8859-1/books/handbook/zfs/chapter.xml	(revision 51460)
@@ -1,4347 +1,4333 @@
 <?xml version="1.0" encoding="iso-8859-1"?>
 <!--
      The FreeBSD Documentation Project
      $FreeBSD$
 -->
 
 <chapter xmlns="http://docbook.org/ns/docbook"
   xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
   xml:id="zfs">
 
   <info>
     <title>The Z File System (<acronym>ZFS</acronym>)</title>
 
     <authorgroup>
       <author>
 	<personname>
 	  <firstname>Tom</firstname>
 	  <surname>Rhodes</surname>
 	</personname>
 	<contrib>Written by </contrib>
       </author>
       <author>
 	<personname>
 	  <firstname>Allan</firstname>
 	  <surname>Jude</surname>
 	</personname>
 	<contrib>Written by </contrib>
       </author>
       <author>
 	<personname>
 	  <firstname>Benedict</firstname>
 	  <surname>Reuschling</surname>
 	</personname>
 	<contrib>Written by </contrib>
       </author>
       <author>
 	<personname>
 	  <firstname>Warren</firstname>
 	  <surname>Block</surname>
 	</personname>
 	<contrib>Written by </contrib>
       </author>
     </authorgroup>
   </info>
 
   <para>The <emphasis>Z File System</emphasis>, or
     <acronym>ZFS</acronym>, is an advanced file system designed to
     overcome many of the major problems found in previous
     designs.</para>
 
   <para>Originally developed at &sun;, ongoing open source
     <acronym>ZFS</acronym> development has moved to the <link
       xlink:href="http://open-zfs.org">OpenZFS Project</link>.</para>
 
   <para><acronym>ZFS</acronym> has three major design goals:</para>
 
   <itemizedlist>
     <listitem>
       <para>Data integrity: all data includes a
 	<link linkend="zfs-term-checksum">checksum</link> of the data.
 	When data is written, the checksum is calculated and written
 	along with it.  When that data is later read back, the
 	checksum is calculated again.  If the checksums do not match,
 	a data error has been detected.  <acronym>ZFS</acronym> will
 	attempt to automatically correct errors when data redundancy
 	is available.</para>
     </listitem>
 
     <listitem>
       <para>Pooled storage: physical storage devices are added to a
 	pool, and storage space is allocated from that shared pool.
 	Space is available to all file systems, and can be increased
 	by adding new storage devices to the pool.</para>
     </listitem>
 
     <listitem>
       <para>Performance: multiple caching mechanisms provide increased
 	performance.  <link linkend="zfs-term-arc">ARC</link> is an
 	advanced memory-based read cache.  A second level of
 	disk-based read cache can be added with
 	<link linkend="zfs-term-l2arc">L2ARC</link>, and disk-based
 	synchronous write cache is available with
 	<link linkend="zfs-term-zil">ZIL</link>.</para>
     </listitem>
   </itemizedlist>
 
   <para>A complete list of features and terminology is shown in
     <xref linkend="zfs-term"/>.</para>
 
   <sect1 xml:id="zfs-differences">
     <title>What Makes <acronym>ZFS</acronym> Different</title>
 
     <para><acronym>ZFS</acronym> is significantly different from any
       previous file system because it is more than just a file system.
       Combining the traditionally separate roles of volume manager and
       file system provides <acronym>ZFS</acronym> with unique
       advantages.  The file system is now aware of the underlying
       structure of the disks.  Traditional file systems could only be
       created on a single disk at a time.  If there were two disks
       then two separate file systems would have to be created.  In a
       traditional hardware <acronym>RAID</acronym> configuration, this
       problem was avoided by presenting the operating system with a
       single logical disk made up of the space provided by a number of
       physical disks, on top of which the operating system placed a
       file system.  Even in the case of software
       <acronym>RAID</acronym> solutions like those provided by
       <acronym>GEOM</acronym>, the <acronym>UFS</acronym> file system
       living on top of the <acronym>RAID</acronym> transform believed
       that it was dealing with a single device.
       <acronym>ZFS</acronym>'s combination of the volume manager and
       the file system solves this and allows the creation of many file
       systems all sharing a pool of available storage.  One of the
       biggest advantages to <acronym>ZFS</acronym>'s awareness of the
       physical layout of the disks is that existing file systems can
       be grown automatically when additional disks are added to the
       pool.  This new space is then made available to all of the file
       systems.  <acronym>ZFS</acronym> also has a number of different
       properties that can be applied to each file system, giving many
       advantages to creating a number of different file systems and
       datasets rather than a single monolithic file system.</para>
   </sect1>
 
   <sect1 xml:id="zfs-quickstart">
     <title>Quick Start Guide</title>
 
     <para>There is a startup mechanism that allows &os; to mount
       <acronym>ZFS</acronym> pools during system initialization.  To
       enable it, add this line to
       <filename>/etc/rc.conf</filename>:</para>
 
     <programlisting>zfs_enable="YES"</programlisting>
 
     <para>Then start the service:</para>
 
     <screen>&prompt.root; <userinput>service zfs start</userinput></screen>
 
     <para>The examples in this section assume three
       <acronym>SCSI</acronym> disks with the device names
       <filename><replaceable>da0</replaceable></filename>,
       <filename><replaceable>da1</replaceable></filename>, and
       <filename><replaceable>da2</replaceable></filename>.  Users
       of <acronym>SATA</acronym> hardware should instead use
       <filename><replaceable>ada</replaceable></filename> device
       names.</para>
 
     <sect2 xml:id="zfs-quickstart-single-disk-pool">
       <title>Single Disk Pool</title>
 
       <para>To create a simple, non-redundant pool using a single
 	disk device:</para>
 
       <screen>&prompt.root; <userinput>zpool create <replaceable>example</replaceable> <replaceable>/dev/da0</replaceable></userinput></screen>
 
       <para>To view the new pool, review the output of
 	<command>df</command>:</para>
 
       <screen>&prompt.root; <userinput>df</userinput>
 Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
 /dev/ad0s1a   2026030  235230  1628718    13%    /
 devfs               1       1        0   100%    /dev
 /dev/ad0s1d  54098308 1032846 48737598     2%    /usr
 example      17547136       0 17547136     0%    /example</screen>
 
       <para>This output shows that the <literal>example</literal> pool
 	has been created and mounted.  It is now accessible as a file
 	system.  Files can be created on it and users can browse
 	it:</para>
 
       <screen>&prompt.root; <userinput>cd /example</userinput>
 &prompt.root; <userinput>ls</userinput>
 &prompt.root; <userinput>touch testfile</userinput>
 &prompt.root; <userinput>ls -al</userinput>
 total 4
 drwxr-xr-x   2 root  wheel    3 Aug 29 23:15 .
 drwxr-xr-x  21 root  wheel  512 Aug 29 23:12 ..
 -rw-r--r--   1 root  wheel    0 Aug 29 23:15 testfile</screen>
 
       <para>However, this pool is not taking advantage of any
 	<acronym>ZFS</acronym> features.  To create a dataset on this
 	pool with compression enabled:</para>
 
       <screen>&prompt.root; <userinput>zfs create example/compressed</userinput>
 &prompt.root; <userinput>zfs set compression=gzip example/compressed</userinput></screen>
 
       <para>The <literal>example/compressed</literal> dataset is now a
 	<acronym>ZFS</acronym> compressed file system.  Try copying
 	some large files to
 	<filename>/example/compressed</filename>.</para>
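 
       <para>As a minimal sketch, assuming some data has already been
 	copied to the dataset, the achieved compression ratio can be
 	checked by reading the <literal>compressratio</literal>
 	property:</para>
 
       <screen>&prompt.root; <userinput>zfs get compressratio example/compressed</userinput></screen>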
 
       <para>Compression can be disabled with:</para>
 
       <screen>&prompt.root; <userinput>zfs set compression=off example/compressed</userinput></screen>
 
       <para>To unmount a file system, use
 	<command>zfs umount</command> and then verify with
 	<command>df</command>:</para>
 
       <screen>&prompt.root; <userinput>zfs umount example/compressed</userinput>
 &prompt.root; <userinput>df</userinput>
 Filesystem  1K-blocks    Used    Avail Capacity  Mounted on
 /dev/ad0s1a   2026030  235232  1628716    13%    /
 devfs               1       1        0   100%    /dev
 /dev/ad0s1d  54098308 1032864 48737580     2%    /usr
 example      17547008       0 17547008     0%    /example</screen>
 
       <para>To re-mount the file system to make it accessible again,
 	use <command>zfs mount</command> and verify with
 	<command>df</command>:</para>
 
       <screen>&prompt.root; <userinput>zfs mount example/compressed</userinput>
 &prompt.root; <userinput>df</userinput>
 Filesystem         1K-blocks    Used    Avail Capacity  Mounted on
 /dev/ad0s1a          2026030  235234  1628714    13%    /
 devfs                      1       1        0   100%    /dev
 /dev/ad0s1d         54098308 1032864 48737580     2%    /usr
 example             17547008       0 17547008     0%    /example
 example/compressed  17547008       0 17547008     0%    /example/compressed</screen>
 
       <para>The pool and file system may also be observed by viewing
 	the output from <command>mount</command>:</para>
 
       <screen>&prompt.root; <userinput>mount</userinput>
 /dev/ad0s1a on / (ufs, local)
 devfs on /dev (devfs, local)
 /dev/ad0s1d on /usr (ufs, local, soft-updates)
 example on /example (zfs, local)
 example/compressed on /example/compressed (zfs, local)</screen>
 
       <para>After creation, <acronym>ZFS</acronym> datasets can be
 	used like any other file system.  However, many other features are
 	available which can be set on a per-dataset basis.  In the
 	example below, a new file system called
 	<literal>data</literal> is created.  Important files will be
 	stored here, so it is configured to keep two copies of each
 	data block:</para>
 
       <screen>&prompt.root; <userinput>zfs create example/data</userinput>
 &prompt.root; <userinput>zfs set copies=2 example/data</userinput></screen>
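 
       <para>To confirm that the property took effect, the current
 	value can be read back.  This is only a quick check, and it
 	assumes the dataset name used in the example above:</para>
 
       <screen>&prompt.root; <userinput>zfs get copies example/data</userinput></screen>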
 
       <para>It is now possible to see the data and space utilization
 	by issuing <command>df</command>:</para>
 
       <screen>&prompt.root; <userinput>df</userinput>
 Filesystem         1K-blocks    Used    Avail Capacity  Mounted on
 /dev/ad0s1a          2026030  235234  1628714    13%    /
 devfs                      1       1        0   100%    /dev
 /dev/ad0s1d         54098308 1032864 48737580     2%    /usr
 example             17547008       0 17547008     0%    /example
 example/compressed  17547008       0 17547008     0%    /example/compressed
 example/data        17547008       0 17547008     0%    /example/data</screen>
 
       <para>Notice that each file system on the pool has the same
 	amount of available space.  This is the reason for using
 	<command>df</command> in these examples, to show that the file
 	systems use only the amount of space they need and all draw
 	from the same pool.  <acronym>ZFS</acronym> eliminates
 	concepts such as volumes and partitions, and allows multiple
 	file systems to occupy the same pool.</para>
 
       <para>To destroy the file systems and then destroy the pool as
 	it is no longer needed:</para>
 
       <screen>&prompt.root; <userinput>zfs destroy example/compressed</userinput>
 &prompt.root; <userinput>zfs destroy example/data</userinput>
 &prompt.root; <userinput>zpool destroy example</userinput></screen>
     </sect2>
 
     <sect2 xml:id="zfs-quickstart-raid-z">
       <title>RAID-Z</title>
 
       <para>Disks fail.  One method of avoiding data loss from disk
 	failure is to implement <acronym>RAID</acronym>.
 	<acronym>ZFS</acronym> supports this feature in its pool
 	design.  <acronym>RAID-Z</acronym> pools require three or more
 	disks but provide more usable space than mirrored
 	pools.</para>
 
       <para>This example creates a <acronym>RAID-Z</acronym> pool,
 	specifying the disks to add to the pool:</para>
 
       <screen>&prompt.root; <userinput>zpool create storage raidz da0 da1 da2</userinput></screen>
 
       <note>
 	<para>&sun; recommends that the number of devices used in a
 	  <acronym>RAID-Z</acronym> configuration be between three and
 	  nine.  For environments requiring a single pool consisting
 	  of 10 disks or more, consider breaking it up into smaller
 	  <acronym>RAID-Z</acronym> groups.  If only two disks are
 	  available and redundancy is a requirement, consider using a
 	  <acronym>ZFS</acronym> mirror.  Refer to &man.zpool.8; for
 	  more details.</para>
       </note>
 
       <para>The previous example created the
 	<literal>storage</literal> zpool.  This example makes a new
 	file system called <literal>home</literal> in that
 	pool:</para>
 
       <screen>&prompt.root; <userinput>zfs create storage/home</userinput></screen>
 
       <para>Compression and keeping extra copies of directories
 	and files can be enabled:</para>
 
       <screen>&prompt.root; <userinput>zfs set copies=2 storage/home</userinput>
 &prompt.root; <userinput>zfs set compression=gzip storage/home</userinput></screen>
 
       <para>To make this the new home directory for users, copy the
 	user data to this directory and create the appropriate
 	symbolic links:</para>
 
       <screen>&prompt.root; <userinput>cp -rp /home/* /storage/home</userinput>
 &prompt.root; <userinput>rm -rf /home /usr/home</userinput>
 &prompt.root; <userinput>ln -s /storage/home /home</userinput>
 &prompt.root; <userinput>ln -s /storage/home /usr/home</userinput></screen>
 
       <para>User data is now stored on the freshly-created
 	<filename>/storage/home</filename>.  Test by adding a new user
 	and logging in as that user.</para>
 
       <para>Try creating a file system snapshot which can be rolled
 	back later:</para>
 
       <screen>&prompt.root; <userinput>zfs snapshot storage/home@08-30-08</userinput></screen>
 
       <para>Snapshots can only be made of a full file system, not a
 	single directory or file.</para>
 
       <para>The <literal>@</literal> character is a delimiter between
 	the file system or volume name and the snapshot name.  If an
 	important directory has been accidentally deleted, the file
 	system can be backed up, then rolled back to an earlier
 	snapshot when the directory still existed:</para>
 
       <screen>&prompt.root; <userinput>zfs rollback storage/home@08-30-08</userinput></screen>
 
       <para>To list all available snapshots, run
 	<command>ls</command> in the file system's
 	<filename>.zfs/snapshot</filename> directory.  For example, to
 	see the previously taken snapshot:</para>
 
       <screen>&prompt.root; <userinput>ls /storage/home/.zfs/snapshot</userinput></screen>
 
       <para>It is possible to write a script to perform regular
 	snapshots on user data.  However, over time, snapshots can
 	consume a great deal of disk space.  The previous snapshot can
 	be removed using the command:</para>
 
       <screen>&prompt.root; <userinput>zfs destroy storage/home@08-30-08</userinput></screen>
 
       <para>After testing, <filename>/storage/home</filename> can be
 	made the real <filename>/home</filename> using this
 	command:</para>
 
       <screen>&prompt.root; <userinput>zfs set mountpoint=/home storage/home</userinput></screen>
 
       <para>Run <command>df</command> and <command>mount</command> to
 	confirm that the system now treats the file system as the real
 	<filename>/home</filename>:</para>
 
       <screen>&prompt.root; <userinput>mount</userinput>
 /dev/ad0s1a on / (ufs, local)
 devfs on /dev (devfs, local)
 /dev/ad0s1d on /usr (ufs, local, soft-updates)
 storage on /storage (zfs, local)
 storage/home on /home (zfs, local)
 &prompt.root; <userinput>df</userinput>
 Filesystem   1K-blocks    Used    Avail Capacity  Mounted on
 /dev/ad0s1a    2026030  235240  1628708    13%    /
 devfs                1       1        0   100%    /dev
 /dev/ad0s1d   54098308 1032826 48737618     2%    /usr
 storage       26320512       0 26320512     0%    /storage
 storage/home  26320512       0 26320512     0%    /home</screen>
 
       <para>This completes the <acronym>RAID-Z</acronym>
 	configuration.  Daily status updates about the file systems
 	created can be generated as part of the nightly
 	&man.periodic.8; runs.  Add this line to
 	<filename>/etc/periodic.conf</filename>:</para>
 
       <programlisting>daily_status_zfs_enable="YES"</programlisting>
     </sect2>
 
     <sect2 xml:id="zfs-quickstart-recovering-raid-z">
       <title>Recovering <acronym>RAID-Z</acronym></title>
 
       <para>Every software <acronym>RAID</acronym> has a method of
 	monitoring its <literal>state</literal>.  The status of
 	<acronym>RAID-Z</acronym> devices may be viewed with this
 	command:</para>
 
       <screen>&prompt.root; <userinput>zpool status -x</userinput></screen>
 
       <para>If all pools are
 	<link linkend="zfs-term-online">Online</link> and everything
 	is normal, the message shows:</para>
 
       <screen>all pools are healthy</screen>
 
       <para>If there is a problem, such as a disk in the
 	<link linkend="zfs-term-offline">Offline</link> state, the
 	pool status will look similar to:</para>
 
       <screen>  pool: storage
  state: DEGRADED
 status: One or more devices has been taken offline by the administrator.
 	Sufficient replicas exist for the pool to continue functioning in a
 	degraded state.
 action: Online the device using 'zpool online' or replace the device with
 	'zpool replace'.
  scrub: none requested
 config:
 
 	NAME        STATE     READ WRITE CKSUM
 	storage     DEGRADED     0     0     0
 	  raidz1    DEGRADED     0     0     0
 	    da0     ONLINE       0     0     0
 	    da1     OFFLINE      0     0     0
 	    da2     ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>This indicates that the device was previously taken
 	offline by the administrator with this command:</para>
 
       <screen>&prompt.root; <userinput>zpool offline storage da1</userinput></screen>
 
       <para>Now the system can be powered down to replace
 	<filename>da1</filename>.  When the system is back online,
 	the failed disk can be replaced in the pool:</para>
 
       <screen>&prompt.root; <userinput>zpool replace storage da1</userinput></screen>
 
       <para>From here, the status may be checked again, this time
 	without <option>-x</option> so that the status of the pool is
 	shown even though it is healthy:</para>
 
       <screen>&prompt.root; <userinput>zpool status storage</userinput>
  pool: storage
  state: ONLINE
  scrub: resilver completed with 0 errors on Sat Aug 30 19:44:11 2008
 config:
 
 	NAME        STATE     READ WRITE CKSUM
 	storage     ONLINE       0     0     0
 	  raidz1    ONLINE       0     0     0
 	    da0     ONLINE       0     0     0
 	    da1     ONLINE       0     0     0
 	    da2     ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>In this example, everything is normal.</para>
     </sect2>
 
     <sect2 xml:id="zfs-quickstart-data-verification">
       <title>Data Verification</title>
 
       <para><acronym>ZFS</acronym> uses checksums to verify the
 	integrity of stored data.  These are enabled automatically
 	upon creation of file systems.</para>
 
       <warning>
 	<para>Checksums can be disabled, but it is
 	  <emphasis>not</emphasis> recommended!  Checksums take very
 	  little storage space and provide data integrity.  Many
 	  <acronym>ZFS</acronym> features will not work properly with
 	  checksums disabled.  There is no noticeable performance gain
 	  from disabling these checksums.</para>
       </warning>
 
       <para>Checksum verification is known as
 	<emphasis>scrubbing</emphasis>.  Verify the data integrity of
 	the <literal>storage</literal> pool with this command:</para>
 
       <screen>&prompt.root; <userinput>zpool scrub storage</userinput></screen>
 
       <para>The duration of a scrub depends on the amount of data
 	stored.  Larger amounts of data will take proportionally
 	longer to verify.  Scrubs are very <acronym>I/O</acronym>
 	intensive, and only one scrub is allowed to run at a time.
 	After the scrub completes, the status can be viewed with
 	<command>zpool status</command>:</para>
 
       <screen>&prompt.root; <userinput>zpool status storage</userinput>
  pool: storage
  state: ONLINE
  scrub: scrub completed with 0 errors on Sat Jan 26 19:57:37 2013
 config:
 
 	NAME        STATE     READ WRITE CKSUM
 	storage     ONLINE       0     0     0
 	  raidz1    ONLINE       0     0     0
 	    da0     ONLINE       0     0     0
 	    da1     ONLINE       0     0     0
 	    da2     ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>The completion date of the last scrub operation is
 	displayed to help track when another scrub is required.
 	Routine scrubs help protect data from silent corruption and
 	ensure the integrity of the pool.</para>
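 
       <para>Scrubs can also be scheduled through &man.periodic.8;.  As
 	a sketch, assuming the stock periodic scripts are in use,
 	these lines in <filename>/etc/periodic.conf</filename> enable
 	regular scrubs.  The threshold is the number of days between
 	scrubs, and the value shown here is only an example:</para>
 
       <programlisting>daily_scrub_zfs_enable="YES"
 daily_scrub_zfs_default_threshold="35"</programlisting>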
 
       <para>Refer to &man.zfs.8; and &man.zpool.8; for other
 	<acronym>ZFS</acronym> options.</para>
     </sect2>
   </sect1>
 
   <sect1 xml:id="zfs-zpool">
     <title><command>zpool</command> Administration</title>
 
     <para><acronym>ZFS</acronym> administration is divided between two
       main utilities.  The <command>zpool</command> utility controls
       the operation of the pool and deals with adding, removing,
       replacing, and managing disks.  The
       <link linkend="zfs-zfs"><command>zfs</command></link> utility
       deals with creating, destroying, and managing datasets,
       both <link linkend="zfs-term-filesystem">file systems</link> and
       <link linkend="zfs-term-volume">volumes</link>.</para>
 
     <sect2 xml:id="zfs-zpool-create">
       <title>Creating and Destroying Storage Pools</title>
 
       <para>Creating a <acronym>ZFS</acronym> storage pool
 	(<emphasis>zpool</emphasis>) involves making a number of
 	decisions that are relatively permanent because the structure
 	of the pool cannot be changed after the pool has been created.
 	The most important decision is which types of vdevs to group
 	the physical disks into.  See the list of
 	<link linkend="zfs-term-vdev">vdev types</link> for details
 	about the possible options.  After the pool has been created,
 	most vdev types do not allow additional disks to be added to
 	the vdev.  The exceptions are mirrors, which allow additional
 	disks to be added to the vdev, and stripes, which can be
 	upgraded to mirrors by attaching an additional disk to the
 	vdev.  Although additional vdevs can be added to expand a
 	pool, the layout of the pool cannot be changed after pool
 	creation.  Instead, the data must be backed up and the
 	pool destroyed and recreated.</para>
 
       <para>Create a simple mirror pool:</para>
 
       <screen>&prompt.root; <userinput>zpool create <replaceable>mypool</replaceable> mirror <replaceable>/dev/ada1</replaceable> <replaceable>/dev/ada2</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada1    ONLINE       0     0     0
             ada2    ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>Multiple vdevs can be created at once.  Specify multiple
 	groups of disks separated by the vdev type keyword,
 	<literal>mirror</literal> in this example:</para>
 
       <screen>&prompt.root; <userinput>zpool create <replaceable>mypool</replaceable> mirror <replaceable>/dev/ada1</replaceable> <replaceable>/dev/ada2</replaceable> mirror <replaceable>/dev/ada3</replaceable> <replaceable>/dev/ada4</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada1    ONLINE       0     0     0
             ada2    ONLINE       0     0     0
           mirror-1  ONLINE       0     0     0
             ada3    ONLINE       0     0     0
             ada4    ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>Pools can also be constructed using partitions rather than
 	whole disks.  Putting <acronym>ZFS</acronym> in a separate
 	partition allows the same disk to have other partitions for
 	other purposes.  In particular, partitions with bootcode and
 	file systems needed for booting can be added.  This allows
 	booting from disks that are also members of a pool.  There is
 	no performance penalty on &os; when using a partition rather
 	than a whole disk.  Using partitions also allows the
 	administrator to <emphasis>under-provision</emphasis> the
 	disks, using less than the full capacity.  If a future
 	replacement disk of the same nominal size as the original
 	actually has a slightly smaller capacity, the smaller
 	partition will still fit, and the replacement disk can still
 	be used.</para>
 
       <para>Create a
 	<link linkend="zfs-term-vdev-raidz">RAID-Z2</link> pool using
 	partitions:</para>
 
       <screen>&prompt.root; <userinput>zpool create <replaceable>mypool</replaceable> raidz2 <replaceable>/dev/ada0p3</replaceable> <replaceable>/dev/ada1p3</replaceable> <replaceable>/dev/ada2p3</replaceable> <replaceable>/dev/ada3p3</replaceable> <replaceable>/dev/ada4p3</replaceable> <replaceable>/dev/ada5p3</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           raidz2-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
             ada3p3  ONLINE       0     0     0
             ada4p3  ONLINE       0     0     0
             ada5p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>A pool that is no longer needed can be destroyed so that
 	the disks can be reused.  Destroying a pool involves first
 	unmounting all of the datasets in that pool.  If the datasets
 	are in use, the unmount operation will fail and the pool will
 	not be destroyed.  The destruction of the pool can be forced
 	with <option>-f</option>, but this can cause undefined
 	behavior in applications which had open files on those
 	datasets.</para>
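 
       <para>For example, to destroy the pool created above, assuming
 	it is still named <replaceable>mypool</replaceable> and that
 	none of its data is needed any longer:</para>
 
       <screen>&prompt.root; <userinput>zpool destroy <replaceable>mypool</replaceable></userinput></screen>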
     </sect2>
 
     <sect2 xml:id="zfs-zpool-attach">
       <title>Adding and Removing Devices</title>
 
       <para>There are two cases for adding disks to a zpool: attaching
 	a disk to an existing vdev with
 	<command>zpool attach</command>, or adding vdevs to the pool
 	with <command>zpool add</command>.  Only some
 	<link linkend="zfs-term-vdev">vdev types</link> allow disks to
 	be added to the vdev after creation.</para>
 
       <para>A pool created with a single disk lacks redundancy.
 	Corruption can be detected but not repaired, because there is
 	no other copy of the data.  The
 	<link linkend="zfs-term-copies">copies</link> property may
 	make it possible to recover from a small failure such as a bad
 	sector, but does not provide the same level of protection as
 	mirroring or <acronym>RAID-Z</acronym>.  Starting with a pool
 	consisting of a single disk vdev,
 	<command>zpool attach</command> can be used to add an
 	additional disk to the vdev, creating a mirror.
 	<command>zpool attach</command> can also be used to add
 	additional disks to a mirror group, increasing redundancy and
 	read performance.  If the disks being used for the pool are
 	partitioned, replicate the layout of the first disk on to the
 	second.  <command>gpart backup</command> and
 	<command>gpart restore</command> can be used to make this
 	process easier.</para>
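 
       <para>A minimal sketch of copying the partition layout, assuming
 	<replaceable>ada0</replaceable> is the existing disk and
 	<replaceable>ada1</replaceable> is the new disk.
 	<option>-F</option> destroys any partition table already
 	present on the target disk, so check the device names
 	carefully before running it:</para>
 
       <screen>&prompt.root; <userinput>gpart backup <replaceable>ada0</replaceable> | gpart restore -F <replaceable>ada1</replaceable></userinput></screen>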
 
       <para>Upgrade the single disk (stripe) vdev
 	<replaceable>ada0p3</replaceable> to a mirror by attaching
 	<replaceable>ada1p3</replaceable>:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           ada0p3    ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool attach <replaceable>mypool</replaceable> <replaceable>ada0p3</replaceable> <replaceable>ada1p3</replaceable></userinput>
 Make sure to wait until resilver is done before rebooting.
 
 If you boot from pool 'mypool', you may need to update
 boot code on newly attached disk 'ada1p3'.
 
 Assuming you use GPT partitioning and 'da0' is your new boot disk
 you may use the following command:
 
         gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
 &prompt.root; <userinput>gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 <replaceable>ada1</replaceable></userinput>
 bootcode written to ada1
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
 status: One or more devices is currently being resilvered.  The pool will
         continue to function, possibly in a degraded state.
 action: Wait for the resilver to complete.
   scan: resilver in progress since Fri May 30 08:19:19 2014
         527M scanned out of 781M at 47.9M/s, 0h0m to go
         527M resilvered, 67.53% done
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0  (resilvering)
 
 errors: No known data errors
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: resilvered 781M in 0h0m with 0 errors on Fri May 30 08:15:58 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>When adding disks to the existing vdev is not an option,
 	as for <acronym>RAID-Z</acronym>, an alternative method is to
 	add another vdev to the pool.  Additional vdevs provide higher
 	performance, distributing writes across the vdevs.  Each vdev
 	is responsible for providing its own redundancy.  It is
 	possible, but discouraged, to mix vdev types, like
 	<literal>mirror</literal> and <literal>RAID-Z</literal>.
 	Adding a non-redundant vdev to a pool containing mirror or
 	<acronym>RAID-Z</acronym> vdevs risks the data on the entire
 	pool.  Writes are distributed, so the failure of the
 	non-redundant disk will result in the loss of a fraction of
 	every block that has been written to the pool.</para>
 
       <para>Data is striped across each of the vdevs.  For example,
 	with two mirror vdevs, this is effectively a
 	<acronym>RAID</acronym> 10 that stripes writes across two sets
 	of mirrors.  Space is allocated so that each vdev reaches 100%
 	full at the same time.  There is a performance penalty if the
 	vdevs have different amounts of free space, as a
 	disproportionate amount of the data is written to the less
 	full vdev.</para>
 
       <para>When attaching additional devices to a boot pool, remember
 	to update the bootcode.</para>
 
       <para>Add a second mirror group (<filename>ada2p3</filename>
 	and <filename>ada3p3</filename>) to the existing
 	pool:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: resilvered 781M in 0h0m with 0 errors on Fri May 30 08:19:35 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool add <replaceable>mypool</replaceable> mirror <replaceable>ada2p3</replaceable> <replaceable>ada3p3</replaceable></userinput>
 &prompt.root; <userinput>gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 <replaceable>ada2</replaceable></userinput>
 bootcode written to ada2
 &prompt.root; <userinput>gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 <replaceable>ada3</replaceable></userinput>
 bootcode written to ada3
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: scrub repaired 0 in 0h0m with 0 errors on Fri May 30 08:29:51 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
           mirror-1  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
             ada3p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>Currently, vdevs cannot be removed from a pool, and disks
 	can only be removed from a mirror if there is enough remaining
 	redundancy.  If only one disk in a mirror group remains, it
 	ceases to be a mirror and reverts to being a stripe, risking
 	the entire pool if that remaining disk fails.</para>
 
       <para>Remove a disk from a three-way mirror group:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: scrub repaired 0 in 0h0m with 0 errors on Fri May 30 08:29:51 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool detach <replaceable>mypool</replaceable> <replaceable>ada2p3</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: scrub repaired 0 in 0h0m with 0 errors on Fri May 30 08:29:51 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
     </sect2>
 
     <sect2 xml:id="zfs-zpool-status">
       <title>Checking the Status of a Pool</title>
 
       <para>Pool status is important.  If a drive goes offline or a
 	read, write, or checksum error is detected, the corresponding
 	error count increases.  The <command>status</command> output
 	shows the configuration and status of each device in the pool
 	and the status of the entire pool.  Actions that need to be
 	taken and details about the last <link
 	  linkend="zfs-zpool-scrub"><command>scrub</command></link>
 	are also shown.</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: scrub repaired 0 in 2h25m with 0 errors on Sat Sep 14 04:25:50 2013
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           raidz2-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
             ada3p3  ONLINE       0     0     0
             ada4p3  ONLINE       0     0     0
             ada5p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
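 
       <para>As a minimal sketch for use in monitoring scripts,
 	<command>zpool status -x</command> limits the output to
 	pools that have errors or are otherwise unavailable.  On a
 	system where every pool is healthy, the output looks
 	like:</para>
 
       <screen>&prompt.root; <userinput>zpool status -x</userinput>
 all pools are healthy</screen>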
     </sect2>
 
     <sect2 xml:id="zfs-zpool-clear">
       <title>Clearing Errors</title>
 
       <para>When an error is detected, the read, write, or checksum
 	counts are incremented.  The error message can be cleared and
 	the counts reset with <command>zpool clear
 	  <replaceable>mypool</replaceable></command>.  Clearing the
 	error state can be important for automated scripts that alert
 	the administrator when the pool encounters an error.  Further
 	errors may not be reported if the old errors are not
 	cleared.</para>
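 
       <para>For illustration, assuming a pool named
 	<replaceable>mypool</replaceable> that contains a device named
 	<replaceable>ada1p3</replaceable>, errors can be cleared for
 	the whole pool or for only a single device:</para>
 
       <screen>&prompt.root; <userinput>zpool clear <replaceable>mypool</replaceable></userinput>
 &prompt.root; <userinput>zpool clear <replaceable>mypool</replaceable> <replaceable>ada1p3</replaceable></userinput></screen>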
     </sect2>
 
     <sect2 xml:id="zfs-zpool-replace">
       <title>Replacing a Functioning Device</title>
 
       <para>There are a number of situations where it may be
 	desirable to replace one disk with a different disk.  When
 	replacing a working disk, the process keeps the old disk
 	online during the replacement.  The pool never enters a
 	<link linkend="zfs-term-degraded">degraded</link> state,
 	reducing the risk of data loss.
 	<command>zpool replace</command> copies all of the data from
 	the old disk to the new one.  After the operation completes,
 	the old disk is disconnected from the vdev.  If the new disk
 	is larger than the old disk, it may be possible to grow the
 	zpool, using the new space.  See <link
 	  linkend="zfs-zpool-online">Growing a Pool</link>.</para>
 
       <para>Replace a functioning device in the pool:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool replace <replaceable>mypool</replaceable> <replaceable>ada1p3</replaceable> <replaceable>ada2p3</replaceable></userinput>
 Make sure to wait until resilver is done before rebooting.
 
 If you boot from pool 'zroot', you may need to update
 boot code on newly attached disk 'ada2p3'.
 
 Assuming you use GPT partitioning and 'da0' is your new boot disk
 you may use the following command:
 
         gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
 &prompt.root; <userinput>gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 <replaceable>ada2</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
 status: One or more devices is currently being resilvered.  The pool will
         continue to function, possibly in a degraded state.
 action: Wait for the resilver to complete.
   scan: resilver in progress since Mon Jun  2 14:21:35 2014
         604M scanned out of 781M at 46.5M/s, 0h0m to go
         604M resilvered, 77.39% done
 config:
 
         NAME             STATE     READ WRITE CKSUM
         mypool           ONLINE       0     0     0
           mirror-0       ONLINE       0     0     0
             ada0p3       ONLINE       0     0     0
             replacing-1  ONLINE       0     0     0
               ada1p3     ONLINE       0     0     0
               ada2p3     ONLINE       0     0     0  (resilvering)
 
 errors: No known data errors
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: resilvered 781M in 0h0m with 0 errors on Mon Jun  2 14:21:52 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
     </sect2>
 
     <sect2 xml:id="zfs-zpool-resilver">
       <title>Dealing with Failed Devices</title>
 
       <para>When a disk in a pool fails, the vdev to which the disk
 	belongs enters the
 	<link linkend="zfs-term-degraded">degraded</link> state.  All
 	of the data is still available, but performance may be reduced
 	because missing data must be calculated from the available
 	redundancy.  To restore the vdev to a fully functional state,
 	the failed physical device must be replaced.
 	<acronym>ZFS</acronym> is then instructed to begin the
 	<link linkend="zfs-term-resilver">resilver</link> operation.
 	Data that was on the failed device is recalculated from
 	available redundancy and written to the replacement device.
 	After completion, the vdev returns to
 	<link linkend="zfs-term-online">online</link> status.</para>
 
       <para>If the vdev does not have any redundancy, or if multiple
 	devices have failed and there is not enough redundancy to
 	compensate, the pool enters the
 	<link linkend="zfs-term-faulted">faulted</link> state.  If a
 	sufficient number of devices cannot be reconnected to the
 	pool, the pool becomes inoperative and data must be restored
 	from backups.</para>
 
       <para>When replacing a failed disk, the name of the failed disk
 	is replaced with the <acronym>GUID</acronym> of the device.
 	A new device name parameter for
 	<command>zpool replace</command> is not required if the
 	replacement device has the same device name.</para>
 
       <para>Replace a failed disk using
 	<command>zpool replace</command>:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: DEGRADED
 status: One or more devices could not be opened.  Sufficient replicas exist for
         the pool to continue functioning in a degraded state.
 action: Attach the missing device and online it using 'zpool online'.
    see: http://illumos.org/msg/ZFS-8000-2Q
   scan: none requested
 config:
 
         NAME                    STATE     READ WRITE CKSUM
         mypool                  DEGRADED     0     0     0
           mirror-0              DEGRADED     0     0     0
             ada0p3              ONLINE       0     0     0
             316502962686821739  UNAVAIL      0     0     0  was /dev/ada1p3
 
 errors: No known data errors
 &prompt.root; <userinput>zpool replace <replaceable>mypool</replaceable> <replaceable>316502962686821739</replaceable> <replaceable>ada2p3</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: DEGRADED
 status: One or more devices is currently being resilvered.  The pool will
         continue to function, possibly in a degraded state.
 action: Wait for the resilver to complete.
   scan: resilver in progress since Mon Jun  2 14:52:21 2014
         641M scanned out of 781M at 49.3M/s, 0h0m to go
         640M resilvered, 82.04% done
 config:
 
         NAME                        STATE     READ WRITE CKSUM
         mypool                      DEGRADED     0     0     0
           mirror-0                  DEGRADED     0     0     0
             ada0p3                  ONLINE       0     0     0
             replacing-1             UNAVAIL      0     0     0
               15732067398082357289  UNAVAIL      0     0     0  was /dev/ada1p3/old
               ada2p3                ONLINE       0     0     0  (resilvering)
 
 errors: No known data errors
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: resilvered 781M in 0h0m with 0 errors on Mon Jun  2 14:52:38 2014
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
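 
       <para>As a sketch of the other case described above, assume
 	the failed disk was physically swapped, and the new disk was
 	partitioned so that it appears under the same device name,
 	<filename>ada1p3</filename>.  The new device argument can
 	then be omitted:</para>
 
       <screen>&prompt.root; <userinput>zpool replace <replaceable>mypool</replaceable> <replaceable>ada1p3</replaceable></userinput></screen>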
     </sect2>
 
     <sect2 xml:id="zfs-zpool-scrub">
       <title>Scrubbing a Pool</title>
 
       <para>It is recommended that pools be
 	<link linkend="zfs-term-scrub">scrubbed</link> regularly,
 	ideally at least once every month.  The
 	<command>scrub</command> operation is very disk-intensive and
 	will reduce performance while running.  Avoid high-demand
 	periods when scheduling <command>scrub</command> or use <link
 	  linkend="zfs-advanced-tuning-scrub_delay"><varname>vfs.zfs.scrub_delay</varname></link>
 	to adjust the relative priority of the
 	<command>scrub</command> to prevent it from interfering with
 	other workloads.</para>
 
       <screen>&prompt.root; <userinput>zpool scrub <replaceable>mypool</replaceable></userinput>
 &prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
   scan: scrub in progress since Wed Feb 19 20:52:54 2014
         116G scanned out of 8.60T at 649M/s, 3h48m to go
         0 repaired, 1.32% done
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           raidz2-0  ONLINE       0     0     0
             ada0p3  ONLINE       0     0     0
             ada1p3  ONLINE       0     0     0
             ada2p3  ONLINE       0     0     0
             ada3p3  ONLINE       0     0     0
             ada4p3  ONLINE       0     0     0
             ada5p3  ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>In the event that a scrub operation needs to be cancelled,
 	issue <command>zpool scrub -s
 	  <replaceable>mypool</replaceable></command>.</para>
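 
       <para>One way to schedule regular scrubs is through
 	&man.periodic.8;.  As a minimal sketch, the following lines
 	in <filename>/etc/periodic.conf</filename> enable the daily
 	scrub check and scrub each pool once 30 days have passed
 	since its last scrub.  The threshold of 30 days is only an
 	example value:</para>
 
       <programlisting>daily_scrub_zfs_enable="YES"
 daily_scrub_zfs_default_threshold="30"</programlisting>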
     </sect2>
 
     <sect2 xml:id="zfs-zpool-selfheal">
       <title>Self-Healing</title>
 
       <para>The checksums stored with data blocks enable the file
 	system to <emphasis>self-heal</emphasis>.  This feature will
 	automatically repair data whose checksum does not match the
 	one recorded on another device that is part of the storage
 	pool.  For example, consider a mirror with two disks where one
 	drive is starting to malfunction and can no longer store data
 	properly.  This is even worse when the data has not been
 	accessed for a long time, as with long-term archive storage.
 	Traditional file systems need to run programs that check and
 	repair the data, like &man.fsck.8;.  These commands take time,
 	and in severe cases, an administrator has to manually decide
 	which repair operation must be performed.  When
 	<acronym>ZFS</acronym> detects a data block with a checksum
 	that does not match, it tries to read the data from the mirror
 	disk.  If that disk can provide the correct data, it will not
 	only give that data to the application requesting it, but also
 	correct the wrong data on the disk that had the bad checksum.
 	This happens without any interaction from a system
 	administrator during normal pool operation.</para>
 
       <para>The next example demonstrates this self-healing behavior.
 	A mirrored pool of disks <filename>/dev/ada0</filename> and
 	<filename>/dev/ada1</filename> is created.</para>
 
       <screen>&prompt.root; <userinput>zpool create <replaceable>healer</replaceable> mirror <replaceable>/dev/ada0</replaceable> <replaceable>/dev/ada1</replaceable></userinput>
 &prompt.root; <userinput>zpool status <replaceable>healer</replaceable></userinput>
   pool: healer
  state: ONLINE
   scan: none requested
 config:
 
     NAME        STATE     READ WRITE CKSUM
     healer      ONLINE       0     0     0
       mirror-0  ONLINE       0     0     0
        ada0     ONLINE       0     0     0
        ada1     ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool list</userinput>
 NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 healer   960M  92.5K   960M     0%  1.00x  ONLINE  -</screen>
 
       <para>Some important data that needs to be protected from data
 	errors using the self-healing feature is copied to the pool.
 	A checksum of the pool is created for later comparison.</para>
 
       <screen>&prompt.root; <userinput>cp /some/important/data /healer</userinput>
 &prompt.root; <userinput>zpool list</userinput>
 NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 healer   960M  67.7M   892M     7%  1.00x  ONLINE  -
 &prompt.root; <userinput>sha1 /healer > checksum.txt</userinput>
 &prompt.root; <userinput>cat checksum.txt</userinput>
 SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f</screen>
 
       <para>Data corruption is simulated by writing random data to the
 	beginning of one of the disks in the mirror.  To prevent
 	<acronym>ZFS</acronym> from healing the data as soon as it is
 	detected, the pool is exported before the corruption and
 	imported again afterwards.</para>
 
       <warning>
 	<para>This is a dangerous operation that can destroy vital
 	  data.  It is shown here for demonstration purposes only
 	  and should not be attempted during normal operation of a
 	  storage pool.  Nor should this intentional corruption
 	  example be run on any disk with a different file system on
 	  it.  Do not use any disk device names other than the
 	  ones that are part of the pool.  Make certain that proper
 	  backups of the pool are created before running the
 	  command!</para>
       </warning>
 
       <screen>&prompt.root; <userinput>zpool export <replaceable>healer</replaceable></userinput>
 &prompt.root; <userinput>dd if=/dev/random of=/dev/ada1 bs=1m count=200</userinput>
 200+0 records in
 200+0 records out
 209715200 bytes transferred in 62.992162 secs (3329227 bytes/sec)
 &prompt.root; <userinput>zpool import healer</userinput></screen>
 
       <para>The pool status shows that one device has experienced an
 	error.  Note that applications reading data from the pool did
 	not receive any incorrect data.  <acronym>ZFS</acronym>
 	provided data from the <filename>ada0</filename> device with
 	the correct checksums.  The device with the wrong checksum can
 	be found easily as the <literal>CKSUM</literal> column
 	contains a nonzero value.</para>
 
       <screen>&prompt.root; <userinput>zpool status <replaceable>healer</replaceable></userinput>
     pool: healer
    state: ONLINE
   status: One or more devices has experienced an unrecoverable error.  An
           attempt was made to correct the error.  Applications are unaffected.
   action: Determine if the device needs to be replaced, and clear the errors
           using 'zpool clear' or replace the device with 'zpool replace'.
      see: http://www.sun.com/msg/ZFS-8000-9P
     scan: none requested
   config:
 
       NAME        STATE     READ WRITE CKSUM
       healer      ONLINE       0     0     0
         mirror-0  ONLINE       0     0     0
          ada0     ONLINE       0     0     0
          ada1     ONLINE       0     0     1
 
 errors: No known data errors</screen>
 
       <para>The error was detected and handled by using the redundancy
 	present in the unaffected <filename>ada0</filename> mirror
 	disk.  A checksum comparison with the original one will reveal
 	whether the pool is consistent again.</para>
 
       <screen>&prompt.root; <userinput>sha1 /healer >> checksum.txt</userinput>
 &prompt.root; <userinput>cat checksum.txt</userinput>
 SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f
 SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f</screen>
 
       <para>The two checksums that were generated before and after the
 	intentional tampering with the pool data still match.  This
 	shows how <acronym>ZFS</acronym> is capable of detecting and
 	correcting any errors automatically when the checksums differ.
 	Note that this is only possible when there is enough
 	redundancy present in the pool.  A pool consisting of a single
 	device has no self-healing capabilities.  That is also the
 	reason why checksums are so important in
 	<acronym>ZFS</acronym> and should not be disabled for any
 	reason.  No &man.fsck.8; or similar file system consistency
 	check program is required to detect and correct this, and the
 	pool remained available while the problem existed.
 	A scrub operation is now required to overwrite the corrupted
 	data on <filename>ada1</filename>.</para>
 
       <screen>&prompt.root; <userinput>zpool scrub <replaceable>healer</replaceable></userinput>
 &prompt.root; <userinput>zpool status <replaceable>healer</replaceable></userinput>
   pool: healer
  state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
         attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
         using 'zpool clear' or replace the device with 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-9P
   scan: scrub in progress since Mon Dec 10 12:23:30 2012
         10.4M scanned out of 67.0M at 267K/s, 0h3m to go
         9.63M repaired, 15.56% done
 config:
 
     NAME        STATE     READ WRITE CKSUM
     healer      ONLINE       0     0     0
       mirror-0  ONLINE       0     0     0
        ada0     ONLINE       0     0     0
        ada1     ONLINE       0     0   627  (repairing)
 
 errors: No known data errors</screen>
 
       <para>The scrub operation reads data from
 	<filename>ada0</filename> and rewrites any data with an
 	incorrect checksum on <filename>ada1</filename>.  This is
 	indicated by the <literal>(repairing)</literal> output from
 	<command>zpool status</command>.  After the operation is
 	complete, the pool status changes to:</para>
 
       <screen>&prompt.root; <userinput>zpool status <replaceable>healer</replaceable></userinput>
   pool: healer
  state: ONLINE
 status: One or more devices has experienced an unrecoverable error.  An
         attempt was made to correct the error.  Applications are unaffected.
 action: Determine if the device needs to be replaced, and clear the errors
         using 'zpool clear' or replace the device with 'zpool replace'.
    see: http://www.sun.com/msg/ZFS-8000-9P
   scan: scrub repaired 66.5M in 0h2m with 0 errors on Mon Dec 10 12:26:25 2012
 config:
 
     NAME        STATE     READ WRITE CKSUM
     healer      ONLINE       0     0     0
       mirror-0  ONLINE       0     0     0
        ada0     ONLINE       0     0     0
        ada1     ONLINE       0     0 2.72K
 
 errors: No known data errors</screen>
 
       <para>After the scrub operation completes and all the data
 	has been synchronized from <filename>ada0</filename> to
 	<filename>ada1</filename>, the error messages can be
 	<link linkend="zfs-zpool-clear">cleared</link> from the pool
 	status by running <command>zpool clear</command>.</para>
 
       <screen>&prompt.root; <userinput>zpool clear <replaceable>healer</replaceable></userinput>
 &prompt.root; <userinput>zpool status <replaceable>healer</replaceable></userinput>
   pool: healer
  state: ONLINE
   scan: scrub repaired 66.5M in 0h2m with 0 errors on Mon Dec 10 12:26:25 2012
 config:
 
     NAME        STATE     READ WRITE CKSUM
     healer      ONLINE       0     0     0
       mirror-0  ONLINE       0     0     0
        ada0     ONLINE       0     0     0
        ada1     ONLINE       0     0     0
 
 errors: No known data errors</screen>
 
       <para>The pool is now back to a fully working state and all the
 	errors have been cleared.</para>
     </sect2>
 
     <sect2 xml:id="zfs-zpool-online">
       <title>Growing a Pool</title>
 
       <para>The usable size of a redundant pool is limited by the
 	capacity of the smallest device in each vdev.  The smallest
 	device can be replaced with a larger device.  After completing
 	a <link linkend="zfs-zpool-replace">replace</link> or
 	<link linkend="zfs-term-resilver">resilver</link> operation,
 	the pool can grow to use the capacity of the new device.  For
 	example, consider a mirror of a 1&nbsp;TB drive and a
 	2&nbsp;TB drive.  The usable space is 1&nbsp;TB.  When the
 	1&nbsp;TB drive is replaced with another 2&nbsp;TB drive, the
 	resilvering process copies the existing data onto the new
 	drive.  Because both of the devices now have 2&nbsp;TB
 	capacity, the mirror's available space can be grown to
 	2&nbsp;TB.</para>
 
       <para>Expansion is triggered by using
 	<command>zpool online -e</command> on each device.  After
 	expansion of all devices, the additional space becomes
 	available to the pool.</para>
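 
       <para>As a sketch, assume the mirror consists of
 	<replaceable>ada0p3</replaceable> and
 	<replaceable>ada1p3</replaceable>, and both devices have
 	already been replaced with larger ones.  These device names
 	are only examples.  Expanding each device makes the new space
 	available, which can then be checked with
 	<command>zpool list</command>:</para>
 
       <screen>&prompt.root; <userinput>zpool online -e <replaceable>mypool</replaceable> <replaceable>ada0p3</replaceable></userinput>
 &prompt.root; <userinput>zpool online -e <replaceable>mypool</replaceable> <replaceable>ada1p3</replaceable></userinput>
 &prompt.root; <userinput>zpool list <replaceable>mypool</replaceable></userinput></screen>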
     </sect2>
 
     <sect2 xml:id="zfs-zpool-import">
       <title>Importing and Exporting Pools</title>
 
       <para>Pools are <emphasis>exported</emphasis> before moving them
 	to another system.  All datasets are unmounted, and each
 	device is marked as exported but still locked so it cannot be
 	used by other disk subsystems.  This allows pools to be
 	<emphasis>imported</emphasis> on other machines, other
 	operating systems that support <acronym>ZFS</acronym>, and
 	even different hardware architectures (with some caveats, see
 	&man.zpool.8;).  When a dataset has open files,
 	<command>zpool export -f</command> can be used to force the
 	export of a pool.  Use this with caution.  The datasets are
 	forcibly unmounted, potentially resulting in unexpected
 	behavior by the applications which had open files on those
 	datasets.</para>
 
       <para>Export a pool that is not in use:</para>
 
       <screen>&prompt.root; <userinput>zpool export mypool</userinput></screen>
 
       <para>Importing a pool automatically mounts the datasets.  This
 	may not be the desired behavior, and can be prevented with
 	<command>zpool import -N</command>.
 	<command>zpool import -o</command> sets temporary properties
 	for this import only.
 	<command>zpool import -o altroot=</command> allows importing a
 	pool with a base mount point instead of the root of the file
 	system.  If the pool was last used on a different system and
 	was not properly exported, an import might have to be forced
 	with <command>zpool import -f</command>.
 	<command>zpool import -a</command> imports all pools that do
 	not appear to be in use by another system.</para>
 
       <para>List all available pools for import:</para>
 
       <screen>&prompt.root; <userinput>zpool import</userinput>
    pool: mypool
      id: 9930174748043525076
   state: ONLINE
  action: The pool can be imported using its name or numeric identifier.
  config:
 
         mypool      ONLINE
           ada2p3    ONLINE</screen>
 
       <para>Import the pool with an alternative root directory:</para>
 
       <screen>&prompt.root; <userinput>zpool import -o altroot=<replaceable>/mnt</replaceable> <replaceable>mypool</replaceable></userinput>
 &prompt.root; <userinput>zfs list</userinput>
 NAME                 USED  AVAIL  REFER  MOUNTPOINT
 mypool               110K  47.0G    31K  /mnt/mypool</screen>
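 
       <para>As a further sketch, a pool that has not yet been imported
 	can be selected by the numeric identifier shown in the
 	listing of available pools instead of by name.  Here,
 	<option>-N</option> is added so that the datasets are not
 	mounted:</para>
 
       <screen>&prompt.root; <userinput>zpool import -N <replaceable>9930174748043525076</replaceable></userinput></screen>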
     </sect2>
 
     <sect2 xml:id="zfs-zpool-upgrade">
       <title>Upgrading a Storage Pool</title>
 
       <para>After upgrading &os;, or if a pool has been imported from
 	a system using an older version of <acronym>ZFS</acronym>, the
 	pool can be manually upgraded to the latest version of
 	<acronym>ZFS</acronym> to support newer features.  Consider
 	whether the pool may ever need to be imported on an older
 	system before upgrading.  Upgrading is a one-way process.
 	Older pools can be upgraded, but pools with newer features
 	cannot be downgraded.</para>
 
       <para>Upgrade a v28 pool to support
 	<literal>Feature Flags</literal>:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
 status: The pool is formatted using a legacy on-disk format.  The pool can
         still be used, but some features are unavailable.
 action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
         pool will no longer be accessible on software that does not support feature
         flags.
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0    ONLINE       0     0     0
             ada1    ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool upgrade</userinput>
 This system supports ZFS pool feature flags.
 
 The following pools are formatted with legacy version numbers and can
 be upgraded to use feature flags.  After being upgraded, these pools
 will no longer be accessible by software that does not support feature
 flags.
 
 VER  POOL
 ---  ------------
 28   mypool
 
 Use 'zpool upgrade -v' for a list of available legacy versions.
 Every feature flags pool has all supported features enabled.
 &prompt.root; <userinput>zpool upgrade mypool</userinput>
 This system supports ZFS pool feature flags.
 
 Successfully upgraded 'mypool' from version 28 to feature flags.
 Enabled the following features on 'mypool':
   async_destroy
   empty_bpobj
   lz4_compress
   multi_vdev_crash_dump</screen>
 
       <para>The newer features of <acronym>ZFS</acronym> will not be
 	available until <command>zpool upgrade</command> has
 	completed.  <command>zpool upgrade -v</command> can be used to
 	see what new features will be provided by upgrading, as well
 	as which features are already supported.</para>
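 
       <para>The state of individual feature flags can also be
 	inspected through pool properties.  As a minimal sketch,
 	assuming a pool named <replaceable>mypool</replaceable>, each
 	feature appears as a property whose name starts with
 	<literal>feature@</literal> and whose value is
 	<literal>disabled</literal>, <literal>enabled</literal>, or
 	<literal>active</literal>:</para>
 
       <screen>&prompt.root; <userinput>zpool get all <replaceable>mypool</replaceable> | grep feature@</userinput></screen>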
 
       <para>Upgrade a pool to support additional feature flags:</para>
 
       <screen>&prompt.root; <userinput>zpool status</userinput>
   pool: mypool
  state: ONLINE
 status: Some supported features are not enabled on the pool. The pool can
         still be used, but some features are unavailable.
 action: Enable all features using 'zpool upgrade'. Once this is done,
         the pool may no longer be accessible by software that does not support
         the features. See zpool-features(7) for details.
   scan: none requested
 config:
 
         NAME        STATE     READ WRITE CKSUM
         mypool      ONLINE       0     0     0
           mirror-0  ONLINE       0     0     0
             ada0    ONLINE       0     0     0
             ada1    ONLINE       0     0     0
 
 errors: No known data errors
 &prompt.root; <userinput>zpool upgrade</userinput>
 This system supports ZFS pool feature flags.
 
 All pools are formatted using feature flags.
 
 
 Some supported features are not enabled on the following pools. Once a
 feature is enabled the pool may become incompatible with software
 that does not support the feature. See zpool-features(7) for details.
 
 POOL  FEATURE
 ---------------
 mypool
       multi_vdev_crash_dump
       spacemap_histogram
       enabled_txg
       hole_birth
       extensible_dataset
       bookmarks
       filesystem_limits
 &prompt.root; <userinput>zpool upgrade mypool</userinput>
 This system supports ZFS pool feature flags.
 
 Enabled the following features on 'mypool':
   spacemap_histogram
   enabled_txg
   hole_birth
   extensible_dataset
   bookmarks
   filesystem_limits</screen>
 
       <warning>
 	<para>The boot code on systems that boot from a pool must be
 	  updated to support the new pool version.  Use
 	  <command>gpart bootcode</command> on the partition that
 	  contains the boot code.  See &man.gpart.8; for more
 	  information.</para>
       </warning>
     </sect2>
 
     <sect2 xml:id="zfs-zpool-history">
       <title>Displaying Recorded Pool History</title>
 
       <para>Commands that modify the pool are recorded.  Recorded
 	actions include the creation of datasets, changing properties,
 	or replacement of a disk.  This history is useful for
 	reviewing how a pool was created and which user performed a
 	specific action and when.  History is not kept in a log file,
 	but is part of the pool itself.  The command to review this
 	history is aptly named
 	<command>zpool history</command>:</para>
 
       <screen>&prompt.root; <userinput>zpool history</userinput>
 History for 'tank':
 2013-02-26.23:02:35 zpool create tank mirror /dev/ada0 /dev/ada1
 2013-02-27.18:50:58 zfs set atime=off tank
 2013-02-27.18:51:09 zfs set checksum=fletcher4 tank
 2013-02-27.18:51:18 zfs create tank/backup</screen>
 
       <para>The output shows <command>zpool</command> and
 	<command>zfs</command> commands that were executed on the pool
 	along with a timestamp.  Only commands that alter the pool in
 	some way are recorded.  Commands like
 	<command>zfs list</command> are not included.  When no pool
 	name is specified, the history of all pools is
 	displayed.</para>
 
       <para><command>zpool history</command> can show even more
 	information when the options <option>-i</option> or
 	<option>-l</option> are provided.  <option>-i</option>
 	displays user-initiated events as well as internally logged
 	<acronym>ZFS</acronym> events.</para>
 
       <screen>&prompt.root; <userinput>zpool history -i</userinput>
 History for 'tank':
 2013-02-26.23:02:35 [internal pool create txg:5] pool spa 28; zfs spa 28; zpl 5;uts  9.1-RELEASE 901000 amd64
 2013-02-27.18:50:53 [internal property set txg:50] atime=0 dataset = 21
 2013-02-27.18:50:58 zfs set atime=off tank
 2013-02-27.18:51:04 [internal property set txg:53] checksum=7 dataset = 21
 2013-02-27.18:51:09 zfs set checksum=fletcher4 tank
 2013-02-27.18:51:13 [internal create txg:55] dataset = 39
 2013-02-27.18:51:18 zfs create tank/backup</screen>
 
       <para>More details can be shown by adding <option>-l</option>.
 	History records are shown in a long format, including
 	information like the name of the user who issued the command
 	and the hostname on which the change was made.</para>
 
       <screen>&prompt.root; <userinput>zpool history -l</userinput>
 History for 'tank':
 2013-02-26.23:02:35 zpool create tank mirror /dev/ada0 /dev/ada1 [user 0 (root) on :global]
 2013-02-27.18:50:58 zfs set atime=off tank [user 0 (root) on myzfsbox:global]
 2013-02-27.18:51:09 zfs set checksum=fletcher4 tank [user 0 (root) on myzfsbox:global]
 2013-02-27.18:51:18 zfs create tank/backup [user 0 (root) on myzfsbox:global]</screen>
 
       <para>The output shows that the
 	<systemitem class="username">root</systemitem> user created
 	the mirrored pool with disks
 	<filename>/dev/ada0</filename> and
 	<filename>/dev/ada1</filename>.  The hostname
 	<systemitem class="systemname">myzfsbox</systemitem> is also
 	shown in the commands after the pool's creation.  The hostname
 	display becomes important when the pool is exported from one
 	system and imported on another.  The commands that are issued
 	on the other system can clearly be distinguished by the
 	hostname that is recorded for each command.</para>
 
       <para>Both options to <command>zpool history</command> can be
 	combined to give the most detailed information possible for
 	any given pool.  Pool history provides valuable information
 	when tracking down the actions that were performed or when
 	more detailed output is needed for debugging.</para>
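 
       <para>For example, assuming a pool named
 	<replaceable>tank</replaceable> as in the examples above,
 	both options can be given together and the output limited to
 	that one pool.  The output depends on the pool and is not
 	shown here:</para>
 
       <screen>&prompt.root; <userinput>zpool history -il <replaceable>tank</replaceable></userinput></screen>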
     </sect2>
 
     <sect2 xml:id="zfs-zpool-iostat">
       <title>Performance Monitoring</title>
 
       <para>A built-in monitoring system can display pool
 	<acronym>I/O</acronym> statistics in real time.  It shows the
 	amount of free and used space on the pool, how many read and
 	write operations are being performed per second, and how much
 	<acronym>I/O</acronym> bandwidth is currently being utilized.
 	By default, all pools in the system are monitored and
 	displayed.  A pool name can be provided to limit monitoring to
 	just that pool.  A basic example:</para>
 
       <screen>&prompt.root; <userinput>zpool iostat</userinput>
                capacity     operations    bandwidth
 pool        alloc   free   read  write   read  write
 ----------  -----  -----  -----  -----  -----  -----
 data         288G  1.53T      2     11  11.3K  57.1K</screen>
 
       <para>To continuously monitor <acronym>I/O</acronym> activity, a
 	number can be specified as the last parameter, indicating an
 	interval in seconds to wait between updates.  The next
 	statistic line is printed after each interval.  Press
 	<keycombo action="simul">
 	  <keycap>Ctrl</keycap>
 	  <keycap>C</keycap>
 	</keycombo> to stop this continuous monitoring.
 	Alternatively, give a second number on the command line after
 	the interval to specify the total number of statistics to
 	display.</para>
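 
       <para>As an illustration, assuming a pool named
 	<replaceable>data</replaceable> as in the example above, this
 	invocation prints a line of statistics every
 	<replaceable>5</replaceable> seconds and exits after
 	<replaceable>3</replaceable> lines.  The interval and count
 	are arbitrary values chosen for this sketch:</para>
 
       <screen>&prompt.root; <userinput>zpool iostat <replaceable>data</replaceable> <replaceable>5</replaceable> <replaceable>3</replaceable></userinput></screen>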
 
       <para>Even more detailed <acronym>I/O</acronym> statistics can
 	be displayed with <option>-v</option>.  Each device in the
 	pool is shown with a statistics line.  This is useful in
 	seeing how many read and write operations are being performed
 	on each device, and can help determine if any individual
 	device is slowing down the pool.  This example shows a
 	mirrored pool with two devices:</para>
 
       <screen>&prompt.root; <userinput>zpool iostat -v</userinput>
                             capacity     operations    bandwidth
 pool                     alloc   free   read  write   read  write
 -----------------------  -----  -----  -----  -----  -----  -----
 data                      288G  1.53T      2     12  9.23K  61.5K
   mirror                  288G  1.53T      2     12  9.23K  61.5K
     ada1                     -      -      0      4  5.61K  61.7K
     ada2                     -      -      1      4  5.04K  61.7K
 -----------------------  -----  -----  -----  -----  -----  -----</screen>
     </sect2>
 
     <sect2 xml:id="zfs-zpool-split">
       <title>Splitting a Storage Pool</title>
 
       <para>A pool consisting of one or more mirror vdevs can be split
 	into two pools.  Unless otherwise specified, the last member
 	of each mirror is detached and used to create a new pool
 	containing the same data.  The operation should first be
 	attempted with <option>-n</option>.  The details of the
 	proposed operation are displayed without it actually being
 	performed.  This helps confirm that the operation will do what
 	the user intends.</para>
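 
       <para>A dry run followed by the actual split might look like
 	this sketch.  The pool names
 	<replaceable>mypool</replaceable> and
 	<replaceable>newpool</replaceable> are placeholders, and
 	<replaceable>mypool</replaceable> is assumed to consist only
 	of mirror vdevs.  Unless an alternate root is given with
 	<option>-R</option>, the new pool is expected to be left
 	unimported, so the last step imports it.  See &man.zpool.8;
 	for details:</para>
 
       <screen>&prompt.root; <userinput>zpool split -n <replaceable>mypool</replaceable> <replaceable>newpool</replaceable></userinput>
 &prompt.root; <userinput>zpool split <replaceable>mypool</replaceable> <replaceable>newpool</replaceable></userinput>
 &prompt.root; <userinput>zpool import <replaceable>newpool</replaceable></userinput></screen>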
     </sect2>
   </sect1>
 
   <sect1 xml:id="zfs-zfs">
     <title><command>zfs</command> Administration</title>
 
     <para>The <command>zfs</command> utility is responsible for
       creating, destroying, and managing all <acronym>ZFS</acronym>
       datasets that exist within a pool.  The pool is managed using
       <link
 	linkend="zfs-zpool"><command>zpool</command></link>.</para>
 
     <sect2 xml:id="zfs-zfs-create">
       <title>Creating and Destroying Datasets</title>
 
       <para>Unlike traditional disks and volume managers, space in
 	<acronym>ZFS</acronym> is <emphasis>not</emphasis>
 	preallocated.  With traditional file systems, after all of the
 	space is partitioned and assigned, there is no way to add an
 	additional file system without adding a new disk.  With
 	<acronym>ZFS</acronym>, new file systems can be created at any
 	time.  Each <link
 	  linkend="zfs-term-dataset"><emphasis>dataset</emphasis></link>
 	has properties including features like compression,
 	deduplication, caching, and quotas, as well as other useful
 	properties like readonly, case sensitivity, network file
 	sharing, and a mount point.  Datasets can be nested inside
 	each other, and child datasets will inherit properties from
 	their parents.  Each dataset can be administered,
 	<link linkend="zfs-zfs-allow">delegated</link>,
 	<link linkend="zfs-zfs-send">replicated</link>,
 	<link linkend="zfs-zfs-snapshot">snapshotted</link>,
 	<link linkend="zfs-zfs-jail">jailed</link>, and destroyed as a
 	unit.  There are many advantages to creating a separate
 	dataset for each different type or set of files.  The only
 	drawbacks to having an extremely large number of datasets are
 	that some commands like <command>zfs list</command> will be
 	slower, and the mounting of hundreds or even thousands of
 	datasets can slow the &os; boot process.</para>
 
       <para>Create a new dataset and enable <link
 	  linkend="zfs-term-compression-lz4">LZ4
 	  compression</link> on it:</para>
 
       <screen>&prompt.root; <userinput>zfs list</userinput>
 NAME                  USED  AVAIL  REFER  MOUNTPOINT
 mypool                781M  93.2G   144K  none
 mypool/ROOT           777M  93.2G   144K  none
 mypool/ROOT/default   777M  93.2G   777M  /
 mypool/tmp            176K  93.2G   176K  /tmp
 mypool/usr            616K  93.2G   144K  /usr
 mypool/usr/home       184K  93.2G   184K  /usr/home
 mypool/usr/ports      144K  93.2G   144K  /usr/ports
 mypool/usr/src        144K  93.2G   144K  /usr/src
 mypool/var           1.20M  93.2G   608K  /var
 mypool/var/crash      148K  93.2G   148K  /var/crash
 mypool/var/log        178K  93.2G   178K  /var/log
 mypool/var/mail       144K  93.2G   144K  /var/mail
 mypool/var/tmp        152K  93.2G   152K  /var/tmp
 &prompt.root; <userinput>zfs create -o compress=lz4 <replaceable>mypool/usr/mydataset</replaceable></userinput>
 &prompt.root; <userinput>zfs list</userinput>
 NAME                   USED  AVAIL  REFER  MOUNTPOINT
 mypool                 781M  93.2G   144K  none
 mypool/ROOT            777M  93.2G   144K  none
 mypool/ROOT/default    777M  93.2G   777M  /
 mypool/tmp             176K  93.2G   176K  /tmp
 mypool/usr             704K  93.2G   144K  /usr
 mypool/usr/home        184K  93.2G   184K  /usr/home
 mypool/usr/mydataset  87.5K  93.2G  87.5K  /usr/mydataset
 mypool/usr/ports       144K  93.2G   144K  /usr/ports
 mypool/usr/src         144K  93.2G   144K  /usr/src
 mypool/var            1.20M  93.2G   610K  /var
 mypool/var/crash       148K  93.2G   148K  /var/crash
 mypool/var/log         178K  93.2G   178K  /var/log
 mypool/var/mail        144K  93.2G   144K  /var/mail
 mypool/var/tmp         152K  93.2G   152K  /var/tmp</screen>
 
       <para>Destroying a dataset is much quicker than deleting all
 	of the files that reside on the dataset, as it does not
 	involve scanning all of the files and updating all of the
 	corresponding metadata.</para>
 
       <para>Destroy the previously created dataset:</para>
 
       <screen>&prompt.root; <userinput>zfs list</userinput>
 NAME                   USED  AVAIL  REFER  MOUNTPOINT
 mypool                 880M  93.1G   144K  none
 mypool/ROOT            777M  93.1G   144K  none
 mypool/ROOT/default    777M  93.1G   777M  /
 mypool/tmp             176K  93.1G   176K  /tmp
 mypool/usr             101M  93.1G   144K  /usr
 mypool/usr/home        184K  93.1G   184K  /usr/home
 mypool/usr/mydataset   100M  93.1G   100M  /usr/mydataset
 mypool/usr/ports       144K  93.1G   144K  /usr/ports
 mypool/usr/src         144K  93.1G   144K  /usr/src
 mypool/var            1.20M  93.1G   610K  /var
 mypool/var/crash       148K  93.1G   148K  /var/crash
 mypool/var/log         178K  93.1G   178K  /var/log
 mypool/var/mail        144K  93.1G   144K  /var/mail
 mypool/var/tmp         152K  93.1G   152K  /var/tmp
 &prompt.root; <userinput>zfs destroy <replaceable>mypool/usr/mydataset</replaceable></userinput>
 &prompt.root; <userinput>zfs list</userinput>
 NAME                  USED  AVAIL  REFER  MOUNTPOINT
 mypool                781M  93.2G   144K  none
 mypool/ROOT           777M  93.2G   144K  none
 mypool/ROOT/default   777M  93.2G   777M  /
 mypool/tmp            176K  93.2G   176K  /tmp
 mypool/usr            616K  93.2G   144K  /usr
 mypool/usr/home       184K  93.2G   184K  /usr/home
 mypool/usr/ports      144K  93.2G   144K  /usr/ports
 mypool/usr/src        144K  93.2G   144K  /usr/src
 mypool/var           1.21M  93.2G   612K  /var
 mypool/var/crash      148K  93.2G   148K  /var/crash
 mypool/var/log        178K  93.2G   178K  /var/log
 mypool/var/mail       144K  93.2G   144K  /var/mail
 mypool/var/tmp        152K  93.2G   152K  /var/tmp</screen>
 
       <para>In modern versions of <acronym>ZFS</acronym>,
 	<command>zfs destroy</command> is asynchronous, and the free
 	space might take several minutes to appear in the pool.  Use
 	<command>zpool get freeing
 	  <replaceable>poolname</replaceable></command> to see the
 	<literal>freeing</literal> property, which shows how much
 	space is still waiting to be reclaimed in the background.
 	If there are child datasets, like
 	<link linkend="zfs-term-snapshot">snapshots</link> or other
 	datasets, then the parent cannot be destroyed.  To destroy a
 	dataset and all of its children, use <option>-r</option>.
 	Use <option>-n</option> <option>-v</option> to list datasets
 	and snapshots that would be destroyed by this operation
 	without actually destroying anything.  Space that would be
 	reclaimed by destruction of snapshots is also shown.</para>
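 
       <para>As a sketch using the dataset and pool names from the
 	examples above, a recursive destroy can be previewed first,
 	and the <literal>freeing</literal> property checked after a
 	real destroy.  The output depends on the pool and is not
 	shown here:</para>
 
       <screen>&prompt.root; <userinput>zfs destroy -rnv <replaceable>mypool/usr/mydataset</replaceable></userinput>
 &prompt.root; <userinput>zpool get freeing <replaceable>mypool</replaceable></userinput></screen>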
     </sect2>
 
     <sect2 xml:id="zfs-zfs-volume">
       <title>Creating and Destroying Volumes</title>
 
       <para>A volume is a special type of dataset.  Rather than being
 	mounted as a file system, it is exposed as a block device
 	under
 	<filename>/dev/zvol/<replaceable>poolname</replaceable>/<replaceable>dataset</replaceable></filename>.
 	This allows the volume to be used for other file systems, to
 	back the disks of a virtual machine, or to be exported using
 	protocols like <acronym>iSCSI</acronym> or
 	<acronym>HAST</acronym>.</para>
 
       <para>A volume can be formatted with any file system, or used
 	without a file system to store raw data.  To the user, a
 	volume appears to be a regular disk.  Putting ordinary file
 	systems on these <emphasis>zvols</emphasis> provides features
 	that ordinary disks or file systems do not normally have.  For
 	example, using the compression property on a 250&nbsp;MB
 	volume allows creation of a compressed <acronym>FAT</acronym>
 	file system.</para>
 
       <screen>&prompt.root; <userinput>zfs create -V 250m -o compression=on tank/fat32</userinput>
 &prompt.root; <userinput>zfs list tank</userinput>
 NAME USED AVAIL REFER MOUNTPOINT
 tank 258M  670M   31K /tank
 &prompt.root; <userinput>newfs_msdos -F32 /dev/zvol/tank/fat32</userinput>
 &prompt.root; <userinput>mount -t msdosfs /dev/zvol/tank/fat32 /mnt</userinput>
 &prompt.root; <userinput>df -h /mnt | grep fat32</userinput>
 Filesystem           Size Used Avail Capacity Mounted on
 /dev/zvol/tank/fat32 249M  24k  249M     0%   /mnt
 &prompt.root; <userinput>mount | grep fat32</userinput>
 /dev/zvol/tank/fat32 on /mnt (msdosfs, local)</screen>
 
       <para>Destroying a volume is much the same as destroying a
 	regular file system dataset.  The operation is nearly
 	instantaneous, but it may take several minutes for the free
 	space to be reclaimed in the background.</para>
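 
       <para>Continuing the example above as a sketch, and assuming
 	the <acronym>FAT</acronym> file system is still mounted on
 	<filename>/mnt</filename>, unmount it before destroying the
 	volume:</para>
 
       <screen>&prompt.root; <userinput>umount /mnt</userinput>
 &prompt.root; <userinput>zfs destroy <replaceable>tank/fat32</replaceable></userinput></screen>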
     </sect2>
 
     <sect2 xml:id="zfs-zfs-rename">
       <title>Renaming a Dataset</title>
 
       <para>The name of a dataset can be changed with
 	<command>zfs rename</command>.  The parent of a dataset can
 	also be changed with this command.  Renaming a dataset to be
 	under a different parent dataset will change the value of
 	those properties that are inherited from the parent dataset.
 	When a dataset is renamed, it is unmounted and then remounted
 	in the new location (which is inherited from the new parent
 	dataset).  This behavior can be prevented with
 	<option>-u</option>.</para>
 
       <para>Rename a dataset and move it to be under a different
 	parent dataset:</para>
 
       <screen>&prompt.root; <userinput>zfs list</userinput>
 NAME                   USED  AVAIL  REFER  MOUNTPOINT
 mypool                 780M  93.2G   144K  none
 mypool/ROOT            777M  93.2G   144K  none
 mypool/ROOT/default    777M  93.2G   777M  /
 mypool/tmp             176K  93.2G   176K  /tmp
 mypool/usr             704K  93.2G   144K  /usr
 mypool/usr/home        184K  93.2G   184K  /usr/home
 mypool/usr/mydataset  87.5K  93.2G  87.5K  /usr/mydataset
 mypool/usr/ports       144K  93.2G   144K  /usr/ports
 mypool/usr/src         144K  93.2G   144K  /usr/src
 mypool/var            1.21M  93.2G   614K  /var
 mypool/var/crash       148K  93.2G   148K  /var/crash
 mypool/var/log         178K  93.2G   178K  /var/log
 mypool/var/mail        144K  93.2G   144K  /var/mail
 mypool/var/tmp         152K  93.2G   152K  /var/tmp
 &prompt.root; <userinput>zfs rename <replaceable>mypool/usr/mydataset</replaceable> <replaceable>mypool/var/newname</replaceable></userinput>
 &prompt.root; <userinput>zfs list</userinput>
 NAME                  USED  AVAIL  REFER  MOUNTPOINT
 mypool                780M  93.2G   144K  none
 mypool/ROOT           777M  93.2G   144K  none
 mypool/ROOT/default   777M  93.2G   777M  /
 mypool/tmp            176K  93.2G   176K  /tmp
 mypool/usr            616K  93.2G   144K  /usr
 mypool/usr/home       184K  93.2G   184K  /usr/home
 mypool/usr/ports      144K  93.2G   144K  /usr/ports
 mypool/usr/src        144K  93.2G   144K  /usr/src
 mypool/var           1.29M  93.2G   614K  /var
 mypool/var/crash      148K  93.2G   148K  /var/crash
 mypool/var/log        178K  93.2G   178K  /var/log
 mypool/var/mail       144K  93.2G   144K  /var/mail
 mypool/var/newname   87.5K  93.2G  87.5K  /var/newname
 mypool/var/tmp        152K  93.2G   152K  /var/tmp</screen>
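 
       <para>A rename that skips the remount could be written as in
 	this sketch.  The dataset names
 	<replaceable>mypool/usr/olddata</replaceable> and
 	<replaceable>mypool/var/newdata</replaceable> are
 	hypothetical.  With <option>-u</option>, the file system is
 	not remounted as part of the rename:</para>
 
       <screen>&prompt.root; <userinput>zfs rename -u <replaceable>mypool/usr/olddata</replaceable> <replaceable>mypool/var/newdata</replaceable></userinput></screen>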
 
       <para>Snapshots can also be renamed like this.  Due to the
 	nature of snapshots, they cannot be renamed into a different
 	parent dataset.  To rename a recursive snapshot, specify
 	<option>-r</option>, and all snapshots with the same name in
 	child datasets will also be renamed.</para>
 
       <screen>&prompt.root; <userinput>zfs list -t snapshot</userinput>
 NAME                                USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/newname@first_snapshot      0      -  87.5K  -
 &prompt.root; <userinput>zfs rename <replaceable>mypool/var/newname@first_snapshot</replaceable> <replaceable>new_snapshot_name</replaceable></userinput>
 &prompt.root; <userinput>zfs list -t snapshot</userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/newname@new_snapshot_name      0      -  87.5K  -</screen>
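 
       <para>A recursive rename might look like this sketch.  It
 	assumes that a snapshot with the hypothetical name
 	<replaceable>old_name</replaceable> exists on
 	<replaceable>mypool</replaceable> and on its child
 	datasets:</para>
 
       <screen>&prompt.root; <userinput>zfs rename -r <replaceable>mypool@old_name</replaceable> <replaceable>mypool@new_name</replaceable></userinput></screen>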
     </sect2>
 
     <sect2 xml:id="zfs-zfs-set">
       <title>Setting Dataset Properties</title>
 
       <para>Each <acronym>ZFS</acronym> dataset has a number of
 	properties that control its behavior.  Most properties are
 	automatically inherited from the parent dataset, but can be
 	overridden locally.  Set a property on a dataset with
 	<command>zfs set
 	  <replaceable>property</replaceable>=<replaceable>value</replaceable>
 	  <replaceable>dataset</replaceable></command>.  Most
 	properties have a limited set of valid values.
 	<command>zfs get</command> displays each possible property
 	and its valid values.  Most properties can be reverted to
 	their inherited values using
 	<command>zfs inherit</command>.</para>
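 
       <para>As a sketch of this cycle, the commands below override
 	a property locally, display it, and then revert it to the
 	inherited value.  The dataset name
 	<replaceable>mypool/usr/home</replaceable> and the choice of
 	the <literal>compression</literal> property are only
 	illustrative, and the output is not shown because it depends
 	on the pool:</para>
 
       <screen>&prompt.root; <userinput>zfs set compression=lz4 <replaceable>mypool/usr/home</replaceable></userinput>
 &prompt.root; <userinput>zfs get compression <replaceable>mypool/usr/home</replaceable></userinput>
 &prompt.root; <userinput>zfs inherit compression <replaceable>mypool/usr/home</replaceable></userinput></screen>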
 
       <para>User-defined properties can also be set.  They become part
 	of the dataset configuration and can be used to provide
 	additional information about the dataset or its contents.  To
 	distinguish these custom properties from the ones supplied as
 	part of <acronym>ZFS</acronym>, a colon (<literal>:</literal>)
 	is used to create a custom namespace for the property.</para>
 
       <screen>&prompt.root; <userinput>zfs set <replaceable>custom</replaceable>:<replaceable>costcenter</replaceable>=<replaceable>1234</replaceable> <replaceable>tank</replaceable></userinput>
 &prompt.root; <userinput>zfs get <replaceable>custom</replaceable>:<replaceable>costcenter</replaceable> <replaceable>tank</replaceable></userinput>
 NAME PROPERTY           VALUE SOURCE
 tank custom:costcenter  1234  local</screen>
 
       <para>To remove a custom property, use
 	<command>zfs inherit</command> with <option>-r</option>.  If
 	the custom property is not defined in any of the parent
 	datasets, it will be removed completely (although the changes
 	are still recorded in the pool's history).</para>
 
       <screen>&prompt.root; <userinput>zfs inherit -r <replaceable>custom</replaceable>:<replaceable>costcenter</replaceable> <replaceable>tank</replaceable></userinput>
 &prompt.root; <userinput>zfs get <replaceable>custom</replaceable>:<replaceable>costcenter</replaceable> <replaceable>tank</replaceable></userinput>
 NAME    PROPERTY           VALUE              SOURCE
 tank    custom:costcenter  -                  -
 &prompt.root; <userinput>zfs get all <replaceable>tank</replaceable> | grep <replaceable>custom</replaceable>:<replaceable>costcenter</replaceable></userinput>
 &prompt.root;</screen>
 
     <sect3 xml:id="zfs-zfs-set-share">
       <title>Getting and Setting Share Properties</title>
 
       <para>Two commonly used and useful dataset properties are the
 	<acronym>NFS</acronym> and <acronym>SMB</acronym> share
 	options.  These properties define whether and how
 	<acronym>ZFS</acronym> datasets may be shared on the network.
 	At present, only sharing via <acronym>NFS</acronym> is
 	supported on &os;.  To get the current status of
 	a share, enter:</para>
 
       <screen>&prompt.root; <userinput>zfs get sharenfs <replaceable>mypool/usr/home</replaceable></userinput>
 NAME             PROPERTY  VALUE    SOURCE
 mypool/usr/home  sharenfs  on       local
 &prompt.root; <userinput>zfs get sharesmb <replaceable>mypool/usr/home</replaceable></userinput>
 NAME             PROPERTY  VALUE    SOURCE
 mypool/usr/home  sharesmb  off      local</screen>
 
       <para>To enable sharing of a dataset, enter:</para>
 
       <screen>&prompt.root; <userinput>zfs set sharenfs=on <replaceable>mypool/usr/home</replaceable></userinput></screen>
 
       <para>It is also possible to set additional options for sharing
       datasets through <acronym>NFS</acronym>, such as
       <option>-alldirs</option>, <option>-maproot</option> and
       <option>-network</option>.  To set additional options on a
       dataset shared through <acronym>NFS</acronym>, enter:</para>
 
       <screen>&prompt.root; <userinput>zfs set sharenfs="-alldirs,-maproot=<replaceable>root</replaceable>,-network=<replaceable>192.168.1.0/24</replaceable>" <replaceable>mypool/usr/home</replaceable></userinput></screen>
       </sect3>
     </sect2>
 
     <sect2 xml:id="zfs-zfs-snapshot">
       <title>Managing Snapshots</title>
 
       <para><link linkend="zfs-term-snapshot">Snapshots</link> are one
 	of the most powerful features of <acronym>ZFS</acronym>.  A
 	snapshot provides a read-only, point-in-time copy of the
 	dataset.  With Copy-On-Write (<acronym>COW</acronym>),
 	snapshots can be created quickly by preserving the older
 	version of the data on disk.  If no snapshots exist, space is
 	reclaimed for future use when data is rewritten or deleted.
 	Snapshots preserve disk space by recording only the
 	differences between the current dataset and a previous
 	version.  Snapshots are allowed only on whole datasets, not on
 	individual files or directories.  When a snapshot is created
 	from a dataset, everything contained in it is duplicated.
 	This includes the file system properties, files, directories,
 	permissions, and so on.  Snapshots use no additional space
 	when they are first created, only consuming space as the
 	blocks they reference are changed.  Recursive snapshots taken
 	with <option>-r</option> create a snapshot with the same name
 	on the dataset and all of its children, providing a consistent
 	moment-in-time snapshot of all of the file systems.  This can
 	be important when an application has files on multiple
 	datasets that are related or dependent upon each other.
 	Without snapshots, a backup would have copies of the files
 	from different points in time.</para>
 
       <para>Snapshots in <acronym>ZFS</acronym> provide a variety of
 	features that even other file systems with snapshot
 	functionality lack.  A typical example of snapshot use is to
 	have a quick way of backing up the current state of the file
 	system when a risky action like a software installation or a
 	system upgrade is performed.  If the action fails, the
 	snapshot can be rolled back and the system has the same state
 	as when the snapshot was created.  If the upgrade was
 	successful, the snapshot can be deleted to free up space.
 	Without snapshots, a failed upgrade often requires a restore
 	from backup, which is tedious, time consuming, and may require
 	downtime during which the system cannot be used.  Snapshots
 	can be rolled back quickly, even while the system is running
 	in normal operation, with little or no downtime.  The time
 	savings are enormous with multi-terabyte storage systems,
 	where copying the data back from backup takes a long time.
 	Snapshots are
 	not a replacement for a complete backup of a pool, but can be
 	used as a quick and easy way to store a copy of the dataset at
 	a specific point in time.</para>
 
       <sect3 xml:id="zfs-zfs-snapshot-creation">
 	<title>Creating Snapshots</title>
 
 	<para>Snapshots are created with <command>zfs snapshot
 	    <replaceable>dataset</replaceable>@<replaceable>snapshotname</replaceable></command>.
 	  Adding <option>-r</option> creates a snapshot recursively,
 	  with the same name on all child datasets.</para>
 
 	<para>Create a recursive snapshot of the entire pool:</para>
 
 	<screen>&prompt.root; <userinput>zfs list -t all</userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool                                 780M  93.2G   144K  none
 mypool/ROOT                            777M  93.2G   144K  none
 mypool/ROOT/default                    777M  93.2G   777M  /
 mypool/tmp                             176K  93.2G   176K  /tmp
 mypool/usr                             616K  93.2G   144K  /usr
 mypool/usr/home                        184K  93.2G   184K  /usr/home
 mypool/usr/ports                       144K  93.2G   144K  /usr/ports
 mypool/usr/src                         144K  93.2G   144K  /usr/src
 mypool/var                            1.29M  93.2G   616K  /var
 mypool/var/crash                       148K  93.2G   148K  /var/crash
 mypool/var/log                         178K  93.2G   178K  /var/log
 mypool/var/mail                        144K  93.2G   144K  /var/mail
 mypool/var/newname                    87.5K  93.2G  87.5K  /var/newname
 mypool/var/newname@new_snapshot_name      0      -  87.5K  -
 mypool/var/tmp                         152K  93.2G   152K  /var/tmp
 &prompt.root; <userinput>zfs snapshot -r <replaceable>mypool@my_recursive_snapshot</replaceable></userinput>
 &prompt.root; <userinput>zfs list -t snapshot</userinput>
 NAME                                        USED  AVAIL  REFER  MOUNTPOINT
 mypool@my_recursive_snapshot                   0      -   144K  -
 mypool/ROOT@my_recursive_snapshot              0      -   144K  -
 mypool/ROOT/default@my_recursive_snapshot      0      -   777M  -
 mypool/tmp@my_recursive_snapshot               0      -   176K  -
 mypool/usr@my_recursive_snapshot               0      -   144K  -
 mypool/usr/home@my_recursive_snapshot          0      -   184K  -
 mypool/usr/ports@my_recursive_snapshot         0      -   144K  -
 mypool/usr/src@my_recursive_snapshot           0      -   144K  -
 mypool/var@my_recursive_snapshot               0      -   616K  -
 mypool/var/crash@my_recursive_snapshot         0      -   148K  -
 mypool/var/log@my_recursive_snapshot           0      -   178K  -
 mypool/var/mail@my_recursive_snapshot          0      -   144K  -
 mypool/var/newname@new_snapshot_name           0      -  87.5K  -
 mypool/var/newname@my_recursive_snapshot       0      -  87.5K  -
 mypool/var/tmp@my_recursive_snapshot           0      -   152K  -</screen>
 
 	<para>Snapshots are not shown by a normal
 	  <command>zfs list</command> operation.  To list snapshots,
 	  <option>-t snapshot</option> is appended to
 	  <command>zfs list</command>.  <option>-t all</option>
 	  displays both file systems and snapshots.</para>
 
 	<para>Snapshots are not mounted directly, so no path is shown
 	  in the <literal>MOUNTPOINT</literal> column.  There is no
 	  mention of available disk space in the
 	  <literal>AVAIL</literal> column, as snapshots cannot be
 	  written to after they are created.  Compare the snapshot
 	  to the original dataset from which it was created:</para>
 
 	<screen>&prompt.root; <userinput>zfs list -rt all <replaceable>mypool/usr/home</replaceable></userinput>
 NAME                                    USED  AVAIL  REFER  MOUNTPOINT
 mypool/usr/home                         184K  93.2G   184K  /usr/home
 mypool/usr/home@my_recursive_snapshot      0      -   184K  -</screen>
 
 	<para>Displaying both the dataset and the snapshot together
 	  reveals how snapshots work in
 	  <link linkend="zfs-term-cow">COW</link> fashion.  They save
 	  only the changes (<emphasis>delta</emphasis>) that were made
 	  and not the complete file system contents all over again.
 	  This means that snapshots take little space when few changes
 	  are made.  Space usage can be made even more apparent by
 	  copying a file to the dataset, then making a second
 	  snapshot:</para>
 
 	<screen>&prompt.root; <userinput>cp <replaceable>/etc/passwd</replaceable> <replaceable>/var/tmp</replaceable></userinput>
 &prompt.root; <userinput>zfs snapshot <replaceable>mypool/var/tmp</replaceable>@<replaceable>after_cp</replaceable></userinput>
 &prompt.root; <userinput>zfs list -rt all <replaceable>mypool/var/tmp</replaceable></userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/tmp                         206K  93.2G   118K  /var/tmp
 mypool/var/tmp@my_recursive_snapshot    88K      -   152K  -
 mypool/var/tmp@after_cp                   0      -   118K  -</screen>
 
 	<para>The second snapshot contains only the changes to the
 	  dataset after the copy operation.  This yields enormous
 	  space savings.  Notice that the size of the snapshot
 	  <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable>
 	  also changed in the <literal>USED</literal>
 	  column to indicate the changes between itself and the
 	  snapshot taken afterwards.</para>
       </sect3>
 
       <sect3 xml:id="zfs-zfs-snapshot-diff">
 	<title>Comparing Snapshots</title>
 
 	<para><acronym>ZFS</acronym> provides a built-in command to
 	  show the differences in content between two snapshots.
 	  This is helpful when many snapshots were taken and the
 	  user wants to see how the file system has changed over time.
 	  For example, <command>zfs diff</command> lets a user find
 	  the latest snapshot that still contains a file that was
 	  accidentally deleted.  Doing this for the two snapshots that
 	  were created in the previous section yields this
 	  output:</para>
 
 	<screen>&prompt.root; <userinput>zfs list -rt all <replaceable>mypool/var/tmp</replaceable></userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/tmp                         206K  93.2G   118K  /var/tmp
 mypool/var/tmp@my_recursive_snapshot    88K      -   152K  -
 mypool/var/tmp@after_cp                   0      -   118K  -
 &prompt.root; <userinput>zfs diff <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable></userinput>
 M       /var/tmp/
 +       /var/tmp/passwd</screen>
 
 	<para>The command lists the changes between the specified
 	  snapshot (in this case
 	  <literal><replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable></literal>)
 	  and the live file system.  The first column shows the
 	  type of change:</para>
 
 	<informaltable pgwide="1">
 	  <tgroup cols="2">
 	    <tbody valign="top">
 	      <row>
 		<entry>+</entry>
 		<entry>The path or file was added.</entry>
 	      </row>
 
 	      <row>
 		<entry>-</entry>
 		<entry>The path or file was deleted.</entry>
 	      </row>
 
 	      <row>
 		<entry>M</entry>
 		<entry>The path or file was modified.</entry>
 	      </row>
 
 	      <row>
 		<entry>R</entry>
 		<entry>The path or file was renamed.</entry>
 	      </row>
 	    </tbody>
 	  </tgroup>
 	</informaltable>
 
 	<para>Comparing the output with the table, it becomes clear
 	  that <filename><replaceable>passwd</replaceable></filename>
 	  was added after the snapshot
 	  <literal><replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable></literal>
 	  was created.  This also resulted in a modification to the
 	  parent directory mounted at
 	  <literal><replaceable>/var/tmp</replaceable></literal>.</para>
 
 	<para>Comparing two snapshots is helpful when using the
 	  <acronym>ZFS</acronym> replication feature to transfer a
 	  dataset to a different host for backup purposes.</para>
 
 	<para>Compare two snapshots by providing the full dataset name
 	  and snapshot name of both datasets:</para>
 
 	<screen>&prompt.root; <userinput>cp /var/tmp/passwd /var/tmp/passwd.copy</userinput>
 &prompt.root; <userinput>zfs snapshot <replaceable>mypool/var/tmp@diff_snapshot</replaceable></userinput>
 &prompt.root; <userinput>zfs diff <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable> <replaceable>mypool/var/tmp@diff_snapshot</replaceable></userinput>
 M       /var/tmp/
 +       /var/tmp/passwd
 +       /var/tmp/passwd.copy
 &prompt.root; <userinput>zfs diff <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable> <replaceable>mypool/var/tmp@after_cp</replaceable></userinput>
 M       /var/tmp/
 +       /var/tmp/passwd</screen>
 
 	<para>A backup administrator can compare two snapshots
 	  received from the sending host and determine the actual
 	  changes in the dataset.  See the
 	  <link linkend="zfs-zfs-send">Replication</link> section for
 	  more information.</para>
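 
 	<para>For scripted comparisons,
 	  <command>zfs diff</command> also accepts
 	  <option>-H</option>, which prints tab-separated fields that
 	  are easier to parse, and <option>-F</option>, which adds a
 	  column showing the type of each file.  A minimal sketch
 	  reusing the snapshot names from the examples above:</para>
 
 	<screen>&prompt.root; <userinput>zfs diff -FH <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable> <replaceable>mypool/var/tmp@diff_snapshot</replaceable></userinput></screen>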
       </sect3>
 
       <sect3 xml:id="zfs-zfs-snapshot-rollback">
 	<title>Snapshot Rollback</title>
 
 	<para>When at least one snapshot is available, the dataset can
 	  be rolled back to it at any time.  This is usually done when
 	  the current state of the dataset is no longer required and
 	  an older version is preferred.  Local development tests
 	  gone wrong, botched system updates that hamper the system's
 	  overall functionality, and the need to restore accidentally
 	  deleted files or directories are all too common
 	  occurrences.  Luckily, rolling back a snapshot is as easy as
 	  typing <command>zfs rollback
 	    <replaceable>snapshotname</replaceable></command>.
 	  The time the operation takes depends on how many changes
 	  are involved.  During that time, the dataset always remains
 	  in a consistent state, much like a database that conforms
 	  to <acronym>ACID</acronym> principles while it performs a
 	  rollback.  This happens while the dataset is live and
 	  accessible, without requiring downtime.  Once the snapshot
 	  has been rolled back, the dataset has the same state as it
 	  had when the snapshot was originally taken.  All other data
 	  in that dataset that was not part of the snapshot is
 	  discarded.  Taking a snapshot of the current state of the
 	  dataset before rolling back to a previous one is a good idea
 	  when some data is required later.  This way, the user can
 	  roll back and forth between snapshots without losing data
 	  that is still valuable.</para>
 
 	<para>In the first example, a snapshot is rolled back because
 	  of a careless <command>rm</command> operation that removes
 	  more data than was intended.</para>
 
 	<screen>&prompt.root; <userinput>zfs list -rt all <replaceable>mypool/var/tmp</replaceable></userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/tmp                         262K  93.2G   120K  /var/tmp
 mypool/var/tmp@my_recursive_snapshot    88K      -   152K  -
 mypool/var/tmp@after_cp               53.5K      -   118K  -
 mypool/var/tmp@diff_snapshot              0      -   120K  -
 &prompt.root; <userinput>ls /var/tmp</userinput>
 passwd          passwd.copy     vi.recover
 &prompt.root; <userinput>rm /var/tmp/passwd*</userinput>
 &prompt.root; <userinput>ls /var/tmp</userinput>
 vi.recover</screen>
 
 	<para>At this point, the user realizes that too many files
 	  were deleted and wants them back.  <acronym>ZFS</acronym>
 	  provides an easy way to get them back using rollbacks, but
 	  only when snapshots of important data are taken on a
 	  regular basis.  To get the files back and start over from
 	  the last snapshot, issue the command:</para>
 
 	<screen>&prompt.root; <userinput>zfs rollback <replaceable>mypool/var/tmp@diff_snapshot</replaceable></userinput>
 &prompt.root; <userinput>ls /var/tmp</userinput>
 passwd          passwd.copy     vi.recover</screen>
 
 	<para>The rollback operation restored the dataset to the state
 	  of the last snapshot.  It is also possible to roll back to a
 	  snapshot that was taken much earlier and has other snapshots
 	  that were created after it.  When trying to do this,
 	  <acronym>ZFS</acronym> will issue this warning:</para>
 
 	<screen>&prompt.root; <userinput>zfs list -rt snapshot <replaceable>mypool/var/tmp</replaceable></userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/tmp@my_recursive_snapshot    88K      -   152K  -
 mypool/var/tmp@after_cp               53.5K      -   118K  -
 mypool/var/tmp@diff_snapshot              0      -   120K  -
 &prompt.root; <userinput>zfs rollback <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable></userinput>
 cannot rollback to 'mypool/var/tmp@my_recursive_snapshot': more recent snapshots exist
 use '-r' to force deletion of the following snapshots:
 mypool/var/tmp@after_cp
 mypool/var/tmp@diff_snapshot</screen>
 
 	<para>This warning means that snapshots exist between the
 	  current state of the dataset and the snapshot to which the
 	  user wants to roll back.  To complete the rollback, these
 	  snapshots must be deleted.  <acronym>ZFS</acronym> cannot
 	  track all the changes between different states of the
 	  dataset, because snapshots are read-only.
 	  <acronym>ZFS</acronym> will not delete the affected
 	  snapshots unless the user specifies <option>-r</option> to
 	  indicate that this is the desired action.  If that is the
 	  intention, and the consequences of losing all intermediate
 	  snapshots are understood, the command can be issued:</para>
 
 	<screen>&prompt.root; <userinput>zfs rollback -r <replaceable>mypool/var/tmp@my_recursive_snapshot</replaceable></userinput>
 &prompt.root; <userinput>zfs list -rt snapshot <replaceable>mypool/var/tmp</replaceable></userinput>
 NAME                                   USED  AVAIL  REFER  MOUNTPOINT
 mypool/var/tmp@my_recursive_snapshot     8K      -   152K  -
 &prompt.root; <userinput>ls /var/tmp</userinput>
 vi.recover</screen>
 
 	<para>The output from <command>zfs list -t snapshot</command>
 	  confirms that the intermediate snapshots
 	  were removed as a result of
 	  <command>zfs rollback -r</command>.</para>
       </sect3>
 
       <sect3 xml:id="zfs-zfs-snapshot-snapdir">
 	<title>Restoring Individual Files from Snapshots</title>
 
 	<para>Snapshots are mounted in a hidden directory under the
 	  parent dataset:
 	  <filename>.zfs/snapshot/<replaceable>snapshotname</replaceable></filename>.
 	  By default, these directories will not be displayed even
 	  when a standard <command>ls -a</command> is issued.
 	  Although the directory is not displayed, it is there
 	  nevertheless and can be accessed like any normal directory.
 	  The property named <literal>snapdir</literal> controls
 	  whether these hidden directories show up in a directory
 	  listing.  Setting the property to <literal>visible</literal>
 	  allows them to appear in the output of <command>ls</command>
 	  and other commands that deal with directory contents.</para>
 
 	<screen>&prompt.root; <userinput>zfs get snapdir <replaceable>mypool/var/tmp</replaceable></userinput>
 NAME            PROPERTY  VALUE    SOURCE
 mypool/var/tmp  snapdir   hidden   default
 &prompt.root; <userinput>ls -a /var/tmp</userinput>
 .               ..              passwd          vi.recover
 &prompt.root; <userinput>zfs set snapdir=visible <replaceable>mypool/var/tmp</replaceable></userinput>
 &prompt.root; <userinput>ls -a /var/tmp</userinput>
 .               ..              .zfs            passwd          vi.recover</screen>
 
 	<para>Individual files can easily be restored to a previous
 	  state by copying them from the snapshot back to the parent
 	  dataset.  The directory structure below
 	  <filename>.zfs/snapshot</filename> has a directory named
 	  exactly like the snapshots taken earlier to make it easier
 	  to identify them.  In the next example, it is assumed that a
 	  file is to be restored from the hidden
 	  <filename>.zfs</filename> directory by copying it from the
 	  snapshot that contained the latest version of the
 	  file:</para>
 
 	<screen>&prompt.root; <userinput>rm /var/tmp/passwd</userinput>
 &prompt.root; <userinput>ls -a /var/tmp</userinput>
 .               ..              .zfs            vi.recover
 &prompt.root; <userinput>ls /var/tmp/.zfs/snapshot</userinput>
 after_cp                my_recursive_snapshot
 &prompt.root; <userinput>ls /var/tmp/.zfs/snapshot/<replaceable>after_cp</replaceable></userinput>
 passwd          vi.recover
 &prompt.root; <userinput>cp /var/tmp/.zfs/snapshot/<replaceable>after_cp/passwd</replaceable> <replaceable>/var/tmp</replaceable></userinput></screen>
 
 	<para>Even if the <literal>snapdir</literal> property had been
 	  set to <literal>hidden</literal>,
 	  <command>ls .zfs/snapshot</command> would still list the
 	  contents of that directory.  It is up to the administrator
 	  to decide whether these directories are displayed.  They
 	  can be displayed for certain datasets and hidden for
 	  others.  Copying files or directories from the hidden
 	  <filename>.zfs/snapshot</filename> is simple enough.  Trying
 	  it the other way around results in this error:</para>
 
 	<screen>&prompt.root; <userinput>cp <replaceable>/etc/rc.conf</replaceable> /var/tmp/.zfs/snapshot/<replaceable>after_cp/</replaceable></userinput>
 cp: /var/tmp/.zfs/snapshot/after_cp/rc.conf: Read-only file system</screen>
 
 	<para>The error reminds the user that snapshots are read-only
 	  and cannot be changed after creation.  Files cannot be
 	  copied into or removed from snapshot directories because
 	  that would change the state of the dataset they
 	  represent.</para>
 
 	<para>Snapshots consume space based on how much the parent
 	  file system has changed since the time of the snapshot.  The
 	  <literal>written</literal> property of a snapshot tracks how
 	  much space is being used by the snapshot.</para>
 
 	<para>Snapshots are destroyed and the space reclaimed with
 	  <command>zfs destroy
 	    <replaceable>dataset</replaceable>@<replaceable>snapshot</replaceable></command>.
 	  Adding <option>-r</option> recursively removes all snapshots
 	  with the same name under the parent dataset.  Adding
 	  <option>-n -v</option> to the command displays a list of the
 	  snapshots that would be deleted and an estimate of how much
 	  space would be reclaimed without performing the actual
 	  destroy operation.</para>
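 
 	<para>As a sketch of such a dry run, this command reports what
 	  destroying the <replaceable>after_cp</replaceable> snapshot
 	  from the earlier examples would do, without removing
 	  anything.  The snapshot name is only illustrative:</para>
 
 	<screen>&prompt.root; <userinput>zfs destroy -n -v <replaceable>mypool/var/tmp@after_cp</replaceable></userinput></screen>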
       </sect3>
     </sect2>
 
     <sect2 xml:id="zfs-zfs-clones">
       <title>Managing Clones</title>
 
       <para>A clone is a copy of a snapshot that is treated more like
 	a regular dataset.  Unlike a snapshot, a clone is writable,
 	is mounted, and can have its own properties.  Once a
 	clone has been created using <command>zfs clone</command>, the
 	snapshot it was created from cannot be destroyed.  The
 	child/parent relationship between the clone and the snapshot
 	can be reversed using <command>zfs promote</command>.  After a
 	clone has been promoted, the snapshot becomes a child of the
 	clone, rather than of the original parent dataset.  This will
 	change how the space is accounted, but not actually change the
 	amount of space consumed.  The clone can be mounted at any
 	point within the <acronym>ZFS</acronym> file system hierarchy,
 	not just below the original location of the snapshot.</para>
 
       <para>To demonstrate the clone feature, this example dataset is
 	used:</para>
 
       <screen>&prompt.root; <userinput>zfs list -rt all <replaceable>camino/home/joe</replaceable></userinput>
 NAME                    USED  AVAIL  REFER  MOUNTPOINT
 camino/home/joe         108K   1.3G    87K  /usr/home/joe
 camino/home/joe@plans    21K      -  85.5K  -
 camino/home/joe@backup    0K      -    87K  -</screen>
 
       <para>A typical use for clones is to experiment with a specific
 	dataset while keeping the snapshot around to fall back to in
 	case something goes wrong.  Since snapshots cannot be
 	changed, a read/write clone of a snapshot is created.  After
 	the desired result is achieved in the clone, the clone can be
 	promoted to a dataset and the old file system removed.  This
 	is not strictly necessary, as the clone and dataset can
 	coexist without problems.</para>
 
       <screen>&prompt.root; <userinput>zfs clone <replaceable>camino/home/joe</replaceable>@<replaceable>backup</replaceable> <replaceable>camino/home/joenew</replaceable></userinput>
 &prompt.root; <userinput>ls /usr/home/joe*</userinput>
 /usr/home/joe:
 backup.txz     plans.txt
 
 /usr/home/joenew:
 backup.txz     plans.txt
 &prompt.root; <userinput>df -h /usr/home</userinput>
 Filesystem          Size    Used   Avail Capacity  Mounted on
 usr/home/joe        1.3G     31k    1.3G     0%    /usr/home/joe
 usr/home/joenew     1.3G     31k    1.3G     0%    /usr/home/joenew</screen>
 
       <para>After a clone is created it is an exact copy of the state
 	the dataset was in when the snapshot was taken.  The clone can
 	now be changed independently from its originating dataset.
 	The only connection between the two is the snapshot.
 	<acronym>ZFS</acronym> records this connection in the property
 	<literal>origin</literal>.  Once the dependency between the
 	snapshot and the clone has been removed by promoting the clone
 	using <command>zfs promote</command>, the
 	<literal>origin</literal> of the clone is removed as it is now
 	an independent dataset.  This example demonstrates it:</para>
 
       <screen>&prompt.root; <userinput>zfs get origin <replaceable>camino/home/joenew</replaceable></userinput>
 NAME                  PROPERTY  VALUE                     SOURCE
 camino/home/joenew    origin    camino/home/joe@backup    -
 &prompt.root; <userinput>zfs promote <replaceable>camino/home/joenew</replaceable></userinput>
 &prompt.root; <userinput>zfs get origin <replaceable>camino/home/joenew</replaceable></userinput>
 NAME                  PROPERTY  VALUE   SOURCE
 camino/home/joenew    origin    -       -</screen>
 
       <para>After making some changes to the promoted clone, such as
 	copying <filename>loader.conf</filename> to it, the old
 	dataset becomes obsolete in this case.
 	Instead, the promoted clone can replace it.  This can be
 	achieved by two consecutive commands: <command>zfs
 	  destroy</command> on the old dataset and <command>zfs
 	  rename</command> on the clone to name it like the old
 	dataset (it could also get an entirely different name).</para>
 
       <screen>&prompt.root; <userinput>cp <replaceable>/boot/defaults/loader.conf</replaceable> <replaceable>/usr/home/joenew</replaceable></userinput>
 &prompt.root; <userinput>zfs destroy -f <replaceable>camino/home/joe</replaceable></userinput>
 &prompt.root; <userinput>zfs rename <replaceable>camino/home/joenew</replaceable> <replaceable>camino/home/joe</replaceable></userinput>
 &prompt.root; <userinput>ls /usr/home/joe</userinput>
 backup.txz     loader.conf     plans.txt
 &prompt.root; <userinput>df -h <replaceable>/usr/home</replaceable></userinput>
 Filesystem          Size    Used   Avail Capacity  Mounted on
 usr/home/joe        1.3G    128k    1.3G     0%    /usr/home/joe</screen>
 
       <para>The cloned snapshot is now handled like an ordinary
 	dataset.  It contains all the data from the original snapshot
 	plus the files that were added to it, like
 	<filename>loader.conf</filename>.  Clones can be used in
 	different scenarios to provide useful features to
 	<acronym>ZFS</acronym> users.
 	For example, jails could be provided as snapshots containing
 	different sets of installed applications.  Users can clone
 	these snapshots and add their own applications as they see
 	fit.  Once they are satisfied with the changes, the clones can
 	be promoted to full datasets and provided to end users to work
 	with like they would with a real dataset.  This saves time and
 	administrative overhead when providing these jails.</para>
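 
       <para>A minimal sketch of that workflow follows.  The dataset
 	names <replaceable>mypool/jails/template</replaceable> and
 	<replaceable>mypool/jails/www</replaceable> and the snapshot
 	name <replaceable>base</replaceable> are hypothetical and
 	only serve as an illustration:</para>
 
       <screen>&prompt.root; <userinput>zfs snapshot <replaceable>mypool/jails/template</replaceable>@<replaceable>base</replaceable></userinput>
 &prompt.root; <userinput>zfs clone <replaceable>mypool/jails/template</replaceable>@<replaceable>base</replaceable> <replaceable>mypool/jails/www</replaceable></userinput>
 &prompt.root; <userinput>zfs promote <replaceable>mypool/jails/www</replaceable></userinput></screen>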
     </sect2>
 
     <sect2 xml:id="zfs-zfs-send">
       <title>Replication</title>
 
       <para>Keeping data on a single pool in one location exposes
 	it to risks like theft and natural or human disasters.  Making
 	regular backups of the entire pool is vital.
 	<acronym>ZFS</acronym> provides a built-in serialization
 	feature that can send a stream representation of the data to
 	standard output.  Using this technique, it is possible to not
 	only store the data on another pool connected to the local
 	system, but also to send it over a network to another system.
 	Snapshots are the basis for this replication (see the section
 	on <link linkend="zfs-zfs-snapshot"><acronym>ZFS</acronym>
 	  snapshots</link>).  The commands used for replicating data
 	are <command>zfs send</command> and
 	<command>zfs receive</command>.</para>
 
       <para>These examples demonstrate <acronym>ZFS</acronym>
 	replication with these two pools:</para>
 
       <screen>&prompt.root; <userinput>zpool list</userinput>
 NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 backup  960M    77K   896M     0%  1.00x  ONLINE  -
 mypool  984M  43.7M   940M     4%  1.00x  ONLINE  -</screen>
 
       <para>The pool named <replaceable>mypool</replaceable> is the
 	primary pool where data is written to and read from on a
 	regular basis.  A second pool,
 	<replaceable>backup</replaceable>, is used as a standby in case
 	the primary pool becomes unavailable.  Note that this
 	fail-over is not done automatically by <acronym>ZFS</acronym>,
 	but must be manually done by a system administrator when
 	needed.  A snapshot is used to provide a consistent version of
 	the file system to be replicated.  Once a snapshot of
 	<replaceable>mypool</replaceable> has been created, it can be
 	copied to the <replaceable>backup</replaceable> pool.  Only
 	snapshots can be replicated.  Changes made since the most
 	recent snapshot will not be included.</para>
 
       <screen>&prompt.root; <userinput>zfs snapshot <replaceable>mypool</replaceable>@<replaceable>backup1</replaceable></userinput>
 &prompt.root; <userinput>zfs list -t snapshot</userinput>
 NAME                    USED  AVAIL  REFER  MOUNTPOINT
 mypool@backup1             0      -  43.6M  -</screen>
 
       <para>Now that a snapshot exists, <command>zfs send</command>
 	can be used to create a stream representing the contents of
 	the snapshot.  This stream can be stored as a file or received
 	by another pool.  The stream is written to standard output,
 	but must be redirected to a file or pipe or an error is
 	produced:</para>
 
       <screen>&prompt.root; <userinput>zfs send <replaceable>mypool</replaceable>@<replaceable>backup1</replaceable></userinput>
 Error: Stream can not be written to a terminal.
 You must redirect standard output.</screen>
 
       <para>To back up a dataset with <command>zfs send</command>,
 	redirect to a file located on the mounted backup pool.  Ensure
 	that the pool has enough free space to accommodate the size of
 	the snapshot being sent, which means all of the data contained
 	in the snapshot, not just the changes from the previous
 	snapshot.</para>
 
       <screen>&prompt.root; <userinput>zfs send <replaceable>mypool</replaceable>@<replaceable>backup1</replaceable> > <replaceable>/backup/backup1</replaceable></userinput>
 &prompt.root; <userinput>zpool list</userinput>
 NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 backup  960M  63.7M   896M     6%  1.00x  ONLINE  -
 mypool  984M  43.7M   940M     4%  1.00x  ONLINE  -</screen>
 
       <para>The <command>zfs send</command> transferred all the data
 	in the snapshot called <replaceable>backup1</replaceable> to
 	the pool named <replaceable>backup</replaceable>.  Creating
 	and sending these snapshots can be done automatically with a
 	&man.cron.8; job.</para>
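 
       <para>As a rough sketch of such automation, a script like the
 	following could be run from &man.cron.8;.  The script path
 	<filename>/usr/local/sbin/zfs-backup.sh</filename>, the
 	snapshot naming scheme, and the schedule are assumptions
 	made for illustration, not part of
 	<acronym>ZFS</acronym>:</para>
 
       <programlisting>#!/bin/sh
 # Hypothetical nightly backup script; adjust pool and path names.
 today=$(date +%Y-%m-%d)
 snap="mypool@backup-${today}"
 
 # Create the snapshot; stop here if that fails.
 zfs snapshot "${snap}" || exit 1
 
 # Store the full stream as a file on the backup pool.
 zfs send "${snap}" > "/backup/backup-${today}"</programlisting>
 
       <para>A matching entry in <filename>/etc/crontab</filename>
 	that runs the script every night at 02:00 might look like
 	this:</para>
 
       <programlisting>0	2	*	*	*	root	/usr/local/sbin/zfs-backup.sh</programlisting>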
 
       <para>Instead of storing the backups as archive files,
 	<acronym>ZFS</acronym> can receive them as a live file system,
 	allowing the backed up data to be accessed directly.  To get
 	to the actual data contained in those streams,
 	<command>zfs receive</command> is used to transform the
 	streams back into files and directories.  The example below
 	combines <command>zfs send</command> and
 	<command>zfs receive</command> using a pipe to copy the data
 	from one pool to another.  The data can be used directly on
 	the receiving pool after the transfer is complete.  A dataset
 	can only be replicated to an empty dataset.</para>
 
       <screen>&prompt.root; <userinput>zfs snapshot <replaceable>mypool</replaceable>@<replaceable>replica1</replaceable></userinput>
 &prompt.root; <userinput>zfs send -v <replaceable>mypool</replaceable>@<replaceable>replica1</replaceable> | zfs receive <replaceable>backup/mypool</replaceable></userinput>
 send from @ to mypool@replica1 estimated size is 50.1M
 total estimated size is 50.1M
 TIME        SENT   SNAPSHOT
 
 &prompt.root; <userinput>zpool list</userinput>
 NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 backup  960M  63.7M   896M     6%  1.00x  ONLINE  -
 mypool  984M  43.7M   940M     4%  1.00x  ONLINE  -</screen>
 
       <sect3 xml:id="zfs-send-incremental">
 	<title>Incremental Backups</title>
 
 	<para><command>zfs send</command> can also determine the
 	  difference between two snapshots and send only the
 	  differences between the two.  This saves disk space and
 	  transfer time.  For example:</para>
 
 	<screen>&prompt.root; <userinput>zfs snapshot <replaceable>mypool</replaceable>@<replaceable>replica2</replaceable></userinput>
 &prompt.root; <userinput>zfs list -t snapshot</userinput>
 NAME                    USED  AVAIL  REFER  MOUNTPOINT
 mypool@replica1         5.72M      -  43.6M  -
 mypool@replica2             0      -  44.1M  -
 &prompt.root; <userinput>zpool list</userinput>
 NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 backup  960M  61.7M   898M     6%  1.00x  ONLINE  -
 mypool  960M  50.2M   910M     5%  1.00x  ONLINE  -</screen>
 
 	<para>A second snapshot called
 	  <replaceable>replica2</replaceable> was created.  This
 	  second snapshot contains only the changes that were made to
 	  the file system between now and the previous snapshot,
 	  <replaceable>replica1</replaceable>.  Using
 	  <command>zfs send -i</command> and indicating the pair of
 	  snapshots generates an incremental replica stream containing
 	  only the data that has changed.  This can only succeed if
 	  the initial snapshot already exists on the receiving
 	  side.</para>
 
 	<screen>&prompt.root; <userinput>zfs send -v -i <replaceable>mypool</replaceable>@<replaceable>replica1</replaceable> <replaceable>mypool</replaceable>@<replaceable>replica2</replaceable> | zfs receive <replaceable>backup/mypool</replaceable></userinput>
 send from @replica1 to mypool@replica2 estimated size is 5.02M
 total estimated size is 5.02M
 TIME        SENT   SNAPSHOT
 
 &prompt.root; <userinput>zpool list</userinput>
 NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
 backup  960M  80.8M   879M     8%  1.00x  ONLINE  -
 mypool  960M  50.2M   910M     5%  1.00x  ONLINE  -
 
 &prompt.root; <userinput>zfs list</userinput>
 NAME                         USED  AVAIL  REFER  MOUNTPOINT
 backup                      55.4M   240G   152K  /backup
 backup/mypool               55.3M   240G  55.2M  /backup/mypool
 mypool                      55.6M  11.6G  55.0M  /mypool
 
 &prompt.root; <userinput>zfs list -t snapshot</userinput>
 NAME                                         USED  AVAIL  REFER  MOUNTPOINT
 backup/mypool@replica1                       104K      -  50.2M  -
 backup/mypool@replica2                          0      -  55.2M  -
 mypool@replica1                             29.9K      -  50.0M  -
 mypool@replica2                                 0      -  55.0M  -</screen>
 
 	<para>The incremental stream was successfully transferred.
 	  Only the data that had changed since
 	  <replaceable>replica1</replaceable> was replicated, which
 	  took much less time to transfer and saved disk space by not
 	  copying the complete pool each time.  This is useful when
 	  having to rely on slow networks or when costs per
 	  transferred byte must be considered.</para>
 
 	<para>A new file system,
 	  <replaceable>backup/mypool</replaceable>, is available with
 	  all of the files and data from the pool
 	  <replaceable>mypool</replaceable>.  If <option>-p</option>
 	  is specified with <command>zfs send</command>, the
 	  properties of the dataset will be copied, including
 	  compression settings, quotas, and mount points.  When
 	  <option>-R</option> is specified, all child datasets of
 	  the indicated dataset will be copied, along with all of
 	  their properties.  Sending and receiving can be automated so
 	  that regular backups are created on the second pool.</para>
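 
 	<para>As a hedged illustration of sending a dataset together
 	  with its properties, the command below uses the lowercase
 	  <option>-p</option> of <command>zfs send</command>.  The
 	  target name <replaceable>backup/mypool-props</replaceable>
 	  is hypothetical, and <option>-u</option> is added to
 	  <command>zfs receive</command> so that the received file
 	  system is not mounted at the replicated mount point:</para>
 
 	<screen>&prompt.root; <userinput>zfs send -p <replaceable>mypool</replaceable>@<replaceable>replica2</replaceable> | zfs receive -u <replaceable>backup/mypool-props</replaceable></userinput></screen>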
       </sect3>
 
       <sect3 xml:id="zfs-send-ssh">
 	<title>Sending Encrypted Backups over
 	  <application>SSH</application></title>
 
 	<para>Sending streams over the network is a good way to keep a
 	  remote backup, but it does come with a drawback.  Data sent
 	  over the network link is not encrypted, allowing anyone to
 	  intercept and transform the streams back into data without
 	  the knowledge of the sending user.  This is undesirable,
 	  especially when sending the streams over the internet to a
 	  remote host.  <application>SSH</application> can be used to
 	  securely encrypt data sent over a network connection.  Since
 	  <acronym>ZFS</acronym> only requires the stream to be
 	  redirected from standard output, it is relatively easy to
 	  pipe it through <application>SSH</application>.  To keep the
 	  contents of the file system encrypted in transit and on the
 	  remote system, consider using <link
 	    xlink:href="https://wiki.freebsd.org/PEFS">PEFS</link>.</para>
 
 	<para>A few settings and security precautions must be
 	  completed first.  Only the necessary steps required for the
 	  <command>zfs send</command> operation are shown here.  For
 	  more information on <application>SSH</application>, see
 	  <xref linkend="openssh"/>.</para>
 
 	<para>This configuration is required:</para>
 
 	<itemizedlist>
 	  <listitem>
 	    <para>Passwordless <application>SSH</application> access
 	      between sending and receiving host using
 	      <application>SSH</application> keys</para>
 	  </listitem>
 
 	  <listitem>
 	    <para>Normally, the privileges of the
 	      <systemitem class="username">root</systemitem> user are
 	      needed to send and receive streams.  This requires
 	      logging in to the receiving system as
 	      <systemitem class="username">root</systemitem>.
 	      However, logging in as
 	      <systemitem class="username">root</systemitem> is
 	      disabled by default for security reasons.  The
 	      <link linkend="zfs-zfs-allow">ZFS Delegation</link>
 	      system can be used to allow a
 	      non-<systemitem class="username">root</systemitem> user
 	      on each system to perform the respective send and
 	      receive operations.</para>
 	  </listitem>
 
 	  <listitem>
 	    <para>On the sending system:</para>
 
 	    <screen>&prompt.root; <userinput>zfs allow -u <replaceable>someuser</replaceable> send,snapshot <replaceable>mypool</replaceable></userinput></screen>
 	  </listitem>
 
 	  <listitem>
 	    <para>To mount the pool, the unprivileged user must own
 	      the directory, and regular users must be allowed to
 	      mount file systems.  On the receiving system:</para>
 
 	    <screen>&prompt.root; <userinput>sysctl vfs.usermount=1</userinput>
 vfs.usermount: 0 -> 1
 &prompt.root; <userinput>sysrc -f /etc/sysctl.conf vfs.usermount=1</userinput>
 &prompt.root; <userinput>zfs create <replaceable>recvpool/backup</replaceable></userinput>
 &prompt.root; <userinput>zfs allow -u <replaceable>someuser</replaceable> create,mount,receive <replaceable>recvpool/backup</replaceable></userinput>
 &prompt.root; <userinput>chown <replaceable>someuser</replaceable> <replaceable>/recvpool/backup</replaceable></userinput></screen>
 	  </listitem>
 	</itemizedlist>
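 
 	<para>One possible way to set up the key-based login and to
 	  check the delegated permissions is sketched below.  The key
 	  type, the default key file name, and an existing
 	  <filename>~/.ssh</filename> directory on the receiving
 	  host are assumptions; any other working
 	  <application>SSH</application> key setup serves equally
 	  well.  When given only a dataset name,
 	  <command>zfs allow</command> prints the permissions that
 	  have been delegated on it:</para>
 
 	<screen>&prompt.user; <userinput>ssh-keygen -t ed25519</userinput>
 &prompt.user; <userinput>cat ~/.ssh/id_ed25519.pub | ssh <replaceable>someuser@backuphost</replaceable> 'cat >> ~/.ssh/authorized_keys'</userinput>
 &prompt.root; <userinput>zfs allow <replaceable>mypool</replaceable></userinput></screen>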
 
 	<para>The unprivileged user now has the ability to receive and
 	  mount datasets, and the <replaceable>home</replaceable>
 	  dataset can be replicated to the remote system:</para>
 
 	<screen>&prompt.user; <userinput>zfs snapshot -r <replaceable>mypool/home</replaceable>@<replaceable>monday</replaceable></userinput>
 &prompt.user; <userinput>zfs send -R <replaceable>mypool/home</replaceable>@<replaceable>monday</replaceable> | ssh <replaceable>someuser@backuphost</replaceable> zfs recv -dvu <replaceable>recvpool/backup</replaceable></userinput></screen>
 
 	<para>A recursive snapshot called
 	  <replaceable>monday</replaceable> is made of the file system
 	  dataset <replaceable>home</replaceable> that resides on the
 	  pool <replaceable>mypool</replaceable>.  Then it is sent
 	  with <command>zfs send -R</command> to include the dataset,
 	  all child datasets, snapshots, clones, and settings in the
 	  stream.  The output is piped to the waiting
 	  <command>zfs receive</command> on the remote host
 	  <replaceable>backuphost</replaceable> through
 	  <application>SSH</application>.  Using a fully qualified
 	  domain name or IP address is recommended.  The receiving
 	  machine writes the data to the
 	  <replaceable>backup</replaceable> dataset on the
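 	  <replaceable>recvpool</replaceable> pool.  Adding
 	  <option>-d</option> to <command>zfs recv</command>
 	  removes the pool name from the path of the sent snapshot
 	  and appends the remainder to the name of the receiving
 	  dataset, so <replaceable>mypool/home</replaceable> arrives
 	  as <replaceable>recvpool/backup/home</replaceable>.
 	  <option>-u</option> causes the
 	  file systems to not be mounted on the receiving side.  When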
 	  <option>-v</option> is included, more detail about the
 	  transfer is shown, including elapsed time and the amount of
 	  data transferred.</para>
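 
 	<para>Later runs can combine this with the incremental
 	  streams described earlier.  In this sketch, the snapshot
 	  name <replaceable>tuesday</replaceable> is hypothetical,
 	  and <replaceable>monday</replaceable> is assumed to exist
 	  already on both hosts:</para>
 
 	<screen>&prompt.user; <userinput>zfs snapshot -r <replaceable>mypool/home</replaceable>@<replaceable>tuesday</replaceable></userinput>
 &prompt.user; <userinput>zfs send -R -i <replaceable>mypool/home</replaceable>@<replaceable>monday</replaceable> <replaceable>mypool/home</replaceable>@<replaceable>tuesday</replaceable> | ssh <replaceable>someuser@backuphost</replaceable> zfs recv -dvu <replaceable>recvpool/backup</replaceable></userinput></screen>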
       </sect3>
     </sect2>
 
     <sect2 xml:id="zfs-zfs-quota">
       <title>Dataset, User, and Group Quotas</title>
 
       <para><link linkend="zfs-term-quota">Dataset quotas</link> are
 	used to restrict the amount of space that can be consumed
 	by a particular dataset.
 	<link linkend="zfs-term-refquota">Reference Quotas</link> work
 	in very much the same way, but only count the space
 	used by the dataset itself, excluding snapshots and child
 	datasets.  Similarly,
 	<link linkend="zfs-term-userquota">user</link> and
 	<link linkend="zfs-term-groupquota">group</link> quotas can be
 	used to prevent users or groups from using all of the
 	space in the pool or dataset.</para>
 
       <para>To enforce a dataset quota of 10&nbsp;GB for
 	<filename>storage/home/bob</filename>:</para>
 
       <screen>&prompt.root; <userinput>zfs set quota=10G storage/home/bob</userinput></screen>
 
       <para>To enforce a reference quota of 10&nbsp;GB for
 	<filename>storage/home/bob</filename>:</para>
 
       <screen>&prompt.root; <userinput>zfs set refquota=10G storage/home/bob</userinput></screen>
 
       <para>To remove the quota for
 	<filename>storage/home/bob</filename>:</para>
 
       <screen>&prompt.root; <userinput>zfs set quota=none storage/home/bob</userinput></screen>
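 
       <para>To check the result, the properties can be queried with
 	<command>zfs get</command>.  A minimal example using the
 	same dataset:</para>
 
       <screen>&prompt.root; <userinput>zfs get quota,refquota storage/home/bob</userinput></screen>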
 
       <para>The general format for setting a user quota is
 	<literal>userquota@<replaceable>user</replaceable>=<replaceable>size</replaceable></literal>,
 	and the user's name must be in one of these formats:</para>
 
       <itemizedlist>
 	<listitem>
 	  <para><acronym>POSIX</acronym> compatible name such as
 	    <replaceable>joe</replaceable>.</para>
 	</listitem>
 
 	<listitem>
 	  <para><acronym>POSIX</acronym> numeric ID such as
 	    <replaceable>789</replaceable>.</para>
 	</listitem>
 
 	<listitem>
 	  <para><acronym>SID</acronym> name
 	    such as
 	    <replaceable>joe.bloggs@example.com</replaceable>.</para>
 	</listitem>
 
 	<listitem>
 	  <para><acronym>SID</acronym>
 	    numeric ID such as
 	    <replaceable>S-1-123-456-789</replaceable>.</para>
 	</listitem>
       </itemizedlist>
 
       <para>For example, to enforce a user quota of 50&nbsp;GB for the
 	user named <replaceable>joe</replaceable>:</para>
 
       <screen>&prompt.root; <userinput>zfs set userquota@joe=50G storage/home/bob</userinput></screen>
 
       <para>To remove any quota:</para>
 
       <screen>&prompt.root; <userinput>zfs set userquota@joe=none storage/home/bob</userinput></screen>
 
       <note>
 	<para>User quota properties are not displayed by
 	  <command>zfs get all</command>.
 	  Non-<systemitem class="username">root</systemitem> users can
 	  only see their own quotas unless they have been granted the
 	  <literal>userquota</literal> privilege.  Users with this
 	  privilege are able to view and set everyone's quota.</para>
       </note>
 
       <para>The general format for setting a group quota is:
 	<literal>groupquota@<replaceable>group</replaceable>=<replaceable>size</replaceable></literal>.</para>
 
       <para>To set the quota for the group
 	<replaceable>firstgroup</replaceable> to 50&nbsp;GB,
 	use:</para>
 
       <screen>&prompt.root; <userinput>zfs set groupquota@firstgroup=50G storage/home/bob</userinput></screen>
 
       <para>To remove the quota for the group
 	<replaceable>firstgroup</replaceable>, or to make sure that
 	one is not set, instead use:</para>
 
       <screen>&prompt.root; <userinput>zfs set groupquota@firstgroup=none storage/home/bob</userinput></screen>
 
       <para>As with the user quota property,
 	non-<systemitem class="username">root</systemitem> users can
 	only see the quotas associated with the groups to which they
 	belong.  However,
 	<systemitem class="username">root</systemitem> or a user with
 	the <literal>groupquota</literal> privilege can view and set
 	all quotas for all groups.</para>
 
       <para>To display the amount of space used by each user on
 	a file system or snapshot along with any quotas, use
 	<command>zfs userspace</command>.  For group information, use
 	<command>zfs groupspace</command>.  For more information about
 	supported options or how to display only specific options,
 	refer to &man.zfs.1;.</para>
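
       <para>As a brief illustration, assuming the
 	<filename>storage/home/bob</filename> dataset and the user
 	<replaceable>joe</replaceable> from the earlier examples,
 	per-user usage, per-group usage, and a single user's quota
 	can be listed like this:</para>

       <screen>&prompt.root; <userinput>zfs userspace storage/home/bob</userinput>
 &prompt.root; <userinput>zfs groupspace storage/home/bob</userinput>
 &prompt.root; <userinput>zfs get userquota@joe storage/home/bob</userinput></screen>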
 
       <para>Users with sufficient privileges, and
 	<systemitem class="username">root</systemitem>, can list the
 	quota for <filename>storage/home/bob</filename> using:</para>
 
       <screen>&prompt.root; <userinput>zfs get quota storage/home/bob</userinput></screen>
     </sect2>
 
     <sect2 xml:id="zfs-zfs-reservation">
       <title>Reservations</title>
 
       <para><link linkend="zfs-term-reservation">Reservations</link>
 	guarantee that a minimum amount of space will always be available
 	on a dataset.  The reserved space will not be available to any
 	other dataset.  This feature can be especially useful to
 	ensure that free space is available for an important dataset
 	or log files.</para>
 
       <para>The general format of the <literal>reservation</literal>
 	property is
 	<literal>reservation=<replaceable>size</replaceable></literal>,
 	so to set a reservation of 10&nbsp;GB on
 	<filename>storage/home/bob</filename>, use:</para>
 
       <screen>&prompt.root; <userinput>zfs set reservation=10G storage/home/bob</userinput></screen>
 
       <para>To clear any reservation:</para>
 
       <screen>&prompt.root; <userinput>zfs set reservation=none storage/home/bob</userinput></screen>
 
       <para>The same principle can be applied to the
 	<literal>refreservation</literal> property for setting a
 	<link linkend="zfs-term-refreservation">Reference
 	  Reservation</link>, with the general format
 	<literal>refreservation=<replaceable>size</replaceable></literal>.</para>
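
       <para>For example, following the same pattern as the
 	<literal>reservation</literal> commands and again using
 	<filename>storage/home/bob</filename> with an illustrative
 	size of 10&nbsp;GB, a reference reservation could be set and
 	later cleared with:</para>

       <screen>&prompt.root; <userinput>zfs set refreservation=10G storage/home/bob</userinput>
 &prompt.root; <userinput>zfs set refreservation=none storage/home/bob</userinput></screen>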
 
       <para>These commands show any reservations or refreservations
 	that exist on <filename>storage/home/bob</filename>:</para>
 
       <screen>&prompt.root; <userinput>zfs get reservation storage/home/bob</userinput>
 &prompt.root; <userinput>zfs get refreservation storage/home/bob</userinput></screen>
     </sect2>
 
     <sect2 xml:id="zfs-zfs-compression">
       <title>Compression</title>
 
       <para><acronym>ZFS</acronym> provides transparent compression.
 	Compressing data at the block level as it is written not only
 	saves space, but can also increase disk throughput.  If data
 	compresses at a ratio of 1.25:1 and the compressed data is
 	written to the disk at the same rate as the uncompressed
 	version, the effective write speed is 125%.  Compression
 	can also be a great alternative to
 	<link linkend="zfs-zfs-deduplication">Deduplication</link>
 	because it does not require additional memory.</para>
 
       <para><acronym>ZFS</acronym> offers several different
 	compression algorithms, each with different trade-offs.  With
 	the introduction of <acronym>LZ4</acronym> compression in
 	<acronym>ZFS</acronym> v5000, it is possible to enable
 	compression for the entire pool without the large performance
 	trade-off of other algorithms.  The biggest advantage to
 	<acronym>LZ4</acronym> is the <emphasis>early abort</emphasis>
 	feature.  If <acronym>LZ4</acronym> does not achieve at least
 	12.5% compression in the first part of the data, the block is
 	written uncompressed to avoid wasting CPU cycles trying to
 	compress data that is either already compressed or
 	incompressible.  For details about the different compression
 	algorithms available in <acronym>ZFS</acronym>, see the
 	<link linkend="zfs-term-compression">Compression</link> entry
 	in the terminology section.</para>
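
       <para>A minimal sketch of enabling <acronym>LZ4</acronym>
 	compression, assuming a dataset named
 	<replaceable>mypool/compressed_dataset</replaceable> as in the
 	example that follows.  Only data written after the property is
 	set is compressed; existing blocks are left as they
 	are:</para>

       <screen>&prompt.root; <userinput>zfs set compression=lz4 <replaceable>mypool/compressed_dataset</replaceable></userinput>
 &prompt.root; <userinput>zfs get compression <replaceable>mypool/compressed_dataset</replaceable></userinput></screen>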
 
       <para>The administrator can monitor the effectiveness of
 	compression using a number of dataset properties.</para>
 
       <screen>&prompt.root; <userinput>zfs get used,compressratio,compression,logicalused <replaceable>mypool/compressed_dataset</replaceable></userinput>
 NAME                       PROPERTY       VALUE  SOURCE
 mypool/compressed_dataset  used           449G   -
 mypool/compressed_dataset  compressratio  1.11x  -
 mypool/compressed_dataset  compression    lz4    local
 mypool/compressed_dataset  logicalused    496G   -</screen>
 
       <para>The dataset is currently using 449&nbsp;GB of space (the
 	<literal>used</literal> property).  Without compression, it
 	would have taken
 	496&nbsp;GB of space (the <literal>logicalused</literal>
 	property).  This results in the 1.11:1 compression
 	ratio.</para>
 
       <para>Compression can have an unexpected side effect when
 	combined with
 	<link linkend="zfs-term-userquota">User Quotas</link>.
 	User quotas restrict how much space a user can consume on a
 	dataset, but the measurements are based on how much space is
 	used <emphasis>after compression</emphasis>.  So if a user has
 	a quota of 10&nbsp;GB, and writes 10&nbsp;GB of compressible
 	data, they will still be able to store additional data.  If
 	they later update a file, say a database, with more or less
 	compressible data, the amount of space available to them will
 	change.  This can result in the odd situation where a user did
 	not increase the actual amount of data (the
 	<literal>logicalused</literal> property), but the change in
 	compression caused them to reach their quota limit.</para>
 
       <para>Compression can have a similar unexpected interaction with
 	backups.  Quotas are often used to limit how much data can be
 	stored to ensure there is sufficient backup space available.
 	However, since quotas do not consider compression, more data
 	may be written than would fit with uncompressed
 	backups.</para>
     </sect2>
 
     <sect2 xml:id="zfs-zfs-deduplication">
       <title>Deduplication</title>
 
       <para>When enabled,
 	<link linkend="zfs-term-deduplication">deduplication</link>
 	uses the checksum of each block to detect duplicate blocks.
 	When a new block is a duplicate of an existing block,
 	<acronym>ZFS</acronym> writes an additional reference to the
 	existing data instead of the whole duplicate block.
 	Tremendous space savings are possible if the data contains
 	many duplicated files or repeated information.  Be warned:
 	deduplication requires an extremely large amount of memory,
 	and most of the space savings can be had without the extra
 	cost by enabling compression instead.</para>
 
       <para>To activate deduplication, set the
 	<literal>dedup</literal> property on the target pool:</para>
 
       <screen>&prompt.root; <userinput>zfs set dedup=on <replaceable>pool</replaceable></userinput></screen>
 
       <para>Only new data being written to the pool will be
 	deduplicated.  Data that has already been written to the pool
 	will not be deduplicated merely by activating this option.  A
 	pool with a freshly activated deduplication property will look
 	like this example:</para>
 
       <screen>&prompt.root; <userinput>zpool list</userinput>
 NAME  SIZE ALLOC  FREE CAP DEDUP HEALTH ALTROOT
 pool 2.84G 2.19M 2.83G  0% 1.00x ONLINE -</screen>
 
       <para>The <literal>DEDUP</literal> column shows the actual rate
 	of deduplication for the pool.  A value of
 	<literal>1.00x</literal> shows that data has not been
 	deduplicated yet.  In the next example, the ports tree is
 	copied three times into different directories on the
 	deduplicated pool created above.</para>
 
       <screen>&prompt.root; <userinput>for d in dir1 dir2 dir3; do</userinput>
 > <userinput>mkdir $d &amp;&amp; cp -R /usr/ports $d &amp;</userinput>
 > <userinput>done</userinput></screen>
 
       <para>Redundant data is detected and deduplicated:</para>
 
       <screen>&prompt.root; <userinput>zpool list</userinput>
 NAME  SIZE ALLOC  FREE CAP DEDUP HEALTH ALTROOT
 pool 2.84G 20.9M 2.82G  0% 3.00x ONLINE -</screen>
 
       <para>The <literal>DEDUP</literal> column shows a factor of
 	<literal>3.00x</literal>.  Multiple copies of the ports tree
 	data were detected and deduplicated, using only a third of the
 	space.  The potential for space savings can be enormous, but
 	comes at the cost of having enough memory to keep track of the
 	deduplicated blocks.</para>
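
       <para>As a small sketch, the same ratio can also be queried on
 	its own through the read-only <literal>dedupratio</literal>
 	pool property, and the current setting through the
 	<literal>dedup</literal> dataset property.  The pool name
 	<replaceable>pool</replaceable> is the one assumed in the
 	examples above:</para>

       <screen>&prompt.root; <userinput>zpool get dedupratio <replaceable>pool</replaceable></userinput>
 &prompt.root; <userinput>zfs get dedup <replaceable>pool</replaceable></userinput></screen>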
 
       <para>Deduplication is not always beneficial, especially when
 	the data on a pool is not redundant.
 	<acronym>ZFS</acronym> can show potential space savings by
 	simulating deduplication on an existing pool:</para>
 
       <screen>&prompt.root; <userinput>zdb -S <replaceable>pool</replaceable></userinput>
 Simulated DDT histogram:
 
 bucket              allocated                       referenced
 ______   ______________________________   ______________________________
 refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
 ------   ------   -----   -----   -----   ------   -----   -----   -----
      1    2.58M    289G    264G    264G    2.58M    289G    264G    264G
      2     206K   12.6G   10.4G   10.4G     430K   26.4G   21.6G   21.6G
      4    37.6K    692M    276M    276M     170K   3.04G   1.26G   1.26G
      8    2.18K   45.2M   19.4M   19.4M    20.0K    425M    176M    176M
     16      174   2.83M   1.20M   1.20M    3.33K   48.4M   20.4M   20.4M
     32       40   2.17M    222K    222K    1.70K   97.2M   9.91M   9.91M
     64        9     56K   10.5K   10.5K      865   4.96M    948K    948K
    128        2   9.50K      2K      2K      419   2.11M    438K    438K
    256        5   61.5K     12K     12K    1.90K   23.0M   4.47M   4.47M
     1K        2      1K      1K      1K    2.98K   1.49M   1.49M   1.49M
  Total    2.82M    303G    275G    275G    3.20M    319G    287G    287G
 
 dedup = 1.05, compress = 1.11, copies = 1.00, dedup * compress / copies = 1.16</screen>
 
       <para>After <command>zdb -S</command> finishes analyzing the
 	pool, it shows the space reduction ratio that would be
 	achieved by activating deduplication.  In this case,
 	<literal>1.16</literal> is a very poor space saving ratio that
 	is mostly provided by compression.  Activating deduplication
 	on this pool would not save any significant amount of space,
 	and is not worth the amount of memory required to enable
 	deduplication.  Using the formula
 	<emphasis>ratio = dedup * compress / copies</emphasis>,
 	system administrators can plan the storage allocation,
 	deciding whether the workload will contain enough duplicate
 	blocks to justify the memory requirements.  If the data is
 	reasonably compressible, the space savings may be very good.
 	Enabling compression first is recommended, and compression can
 	also provide greatly increased performance.  Only enable
 	deduplication in cases where the additional savings will be
 	considerable and there is sufficient memory for the <link
 	  linkend="zfs-term-deduplication"><acronym>DDT</acronym></link>.</para>
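
       <para>As a worked check of the formula, the rounded figures
 	from the last line of the <command>zdb -S</command> output
 	can be multiplied out by hand.  The small difference from the
 	<literal>1.16</literal> printed by <command>zdb</command> is
 	assumed here to come from rounding of the displayed
 	values:</para>

       <programlisting>ratio = dedup * compress / copies
       = 1.05  * 1.11     / 1.00
       = 1.1655</programlisting>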
     </sect2>
 
     <sect2 xml:id="zfs-zfs-jail">
       <title><acronym>ZFS</acronym> and Jails</title>
 
       <para><command>zfs jail</command> and the corresponding
 	<literal>jailed</literal> property are used to delegate a
 	<acronym>ZFS</acronym> dataset to a
 	<link linkend="jails">Jail</link>.
 	<command>zfs jail <replaceable>jailid</replaceable></command>
 	attaches a dataset to the specified jail, and
 	<command>zfs unjail</command> detaches it.  For the dataset to
 	be controlled from within a jail, the
 	<literal>jailed</literal> property must be set.  Once a
 	dataset is jailed, it can no longer be mounted on the
 	host because it may have mount points that would compromise
 	the security of the host.</para>
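
       <para>The sequence might look like this sketch.  The jail ID
 	<replaceable>1</replaceable> and the dataset name
 	<replaceable>mypool/jaildata</replaceable> are hypothetical,
 	and the jail itself is assumed to be running and configured
 	to permit <acronym>ZFS</acronym> management:</para>

       <screen>&prompt.root; <userinput>zfs set jailed=on <replaceable>mypool/jaildata</replaceable></userinput>
 &prompt.root; <userinput>zfs jail <replaceable>1</replaceable> <replaceable>mypool/jaildata</replaceable></userinput>
 &prompt.root; <userinput>zfs unjail <replaceable>1</replaceable> <replaceable>mypool/jaildata</replaceable></userinput></screen>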
     </sect2>
   </sect1>
 
   <sect1 xml:id="zfs-zfs-allow">
     <title>Delegated Administration</title>
 
     <para>A comprehensive permission delegation system allows
       unprivileged users to perform <acronym>ZFS</acronym>
       administration functions.  For example, if each user's home
       directory is a dataset, users can be given permission to create
       and destroy snapshots of their home directories.  A backup user
       can be given permission to use replication features.  A usage
       statistics script can be allowed to run with access only to the
       space utilization data for all users.  It is even possible to
       delegate the ability to delegate permissions.  Permission
       delegation is possible for each subcommand and most
       properties.</para>
 
     <sect2 xml:id="zfs-zfs-allow-create">
       <title>Delegating Dataset Creation</title>
 
       <para><command>zfs allow
 	  <replaceable>someuser</replaceable> create
 	  <replaceable>mydataset</replaceable></command> gives the
 	specified user permission to create child datasets under the
 	selected parent dataset.  There is a caveat: creating a new
 	dataset involves mounting it.  That requires setting the
 	&os; <literal>vfs.usermount</literal> &man.sysctl.8; to
 	<literal>1</literal> to allow non-root users to mount a
 	file system.  There is another restriction aimed at preventing
 	abuse: non-<systemitem class="username">root</systemitem>
 	users must own the mountpoint where the file system is to be
 	mounted.</para>
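
       <para>A minimal sketch of the steps involved, using the
 	placeholder names <replaceable>someuser</replaceable> and
 	<replaceable>mydataset</replaceable> from above.  The
 	<literal>mount</literal> permission is granted together with
 	<literal>create</literal> here because &man.zfs.8; lists it
 	as a companion requirement.  Running
 	<command>zfs allow</command> with only a dataset name prints
 	the permissions currently delegated on it:</para>

       <screen>&prompt.root; <userinput>sysctl vfs.usermount=1</userinput>
 &prompt.root; <userinput>zfs allow <replaceable>someuser</replaceable> create,mount <replaceable>mydataset</replaceable></userinput>
 &prompt.root; <userinput>zfs allow <replaceable>mydataset</replaceable></userinput></screen>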
     </sect2>
 
     <sect2 xml:id="zfs-zfs-allow-allow">
       <title>Delegating Permission Delegation</title>
 
       <para><command>zfs allow
 	  <replaceable>someuser</replaceable> allow
 	  <replaceable>mydataset</replaceable></command> gives the
 	specified user the ability to assign any permission they have
 	on the target dataset, or its children, to other users.  If a
 	user has the <literal>snapshot</literal> permission and the
 	<literal>allow</literal> permission, that user can then grant
 	the <literal>snapshot</literal> permission to other
 	users.</para>
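
       <para>For instance, assuming the same placeholder names, both
 	permissions can be granted in a single command by separating
 	them with a comma:</para>

       <screen>&prompt.root; <userinput>zfs allow <replaceable>someuser</replaceable> snapshot,allow <replaceable>mydataset</replaceable></userinput></screen>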
     </sect2>
   </sect1>
 
   <sect1 xml:id="zfs-advanced">
     <title>Advanced Topics</title>
 
     <sect2 xml:id="zfs-advanced-tuning">
       <title>Tuning</title>
 
       <para>There are a number of tunables that can be adjusted to
 	make <acronym>ZFS</acronym> perform best for different
 	workloads.</para>
 
       <itemizedlist>
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-arc_max"><emphasis><varname>vfs.zfs.arc_max</varname></emphasis>
 	    - Maximum size of the <link
 	      linkend="zfs-term-arc"><acronym>ARC</acronym></link>.
 	    The default is all <acronym>RAM</acronym> less 1&nbsp;GB,
 	    or one half of <acronym>RAM</acronym>, whichever is more.
 	    However, a lower value should be used if the system will
 	    be running any other daemons or processes that may require
 	    memory.  This value can only be adjusted at boot time, and
 	    is set in <filename>/boot/loader.conf</filename>.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-arc_meta_limit"><emphasis><varname>vfs.zfs.arc_meta_limit</varname></emphasis>
 	    - Limit the portion of the
 	    <link linkend="zfs-term-arc"><acronym>ARC</acronym></link>
 	    that can be used to store metadata.  The default is one
 	    fourth of <varname>vfs.zfs.arc_max</varname>.  Increasing
 	    this value will improve performance if the workload
 	    involves operations on a large number of files and
 	    directories, or frequent metadata operations, at the cost
 	    of less file data fitting in the <link
 	      linkend="zfs-term-arc"><acronym>ARC</acronym></link>.
 	    This value can only be adjusted at boot time, and is set
 	    in <filename>/boot/loader.conf</filename>.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-arc_min"><emphasis><varname>vfs.zfs.arc_min</varname></emphasis>
 	    - Minimum size of the <link
 	      linkend="zfs-term-arc"><acronym>ARC</acronym></link>.
 	    The default is one half of
 	    <varname>vfs.zfs.arc_meta_limit</varname>.  Adjust this
 	    value to prevent other applications from pressuring out
 	    the entire <link
 	      linkend="zfs-term-arc"><acronym>ARC</acronym></link>.
 	    This value can only be adjusted at boot time, and is set
 	    in <filename>/boot/loader.conf</filename>.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-vdev-cache-size"><emphasis><varname>vfs.zfs.vdev.cache.size</varname></emphasis>
 	    - A preallocated amount of memory reserved as a cache for
 	    each device in the pool.  The total amount of memory used
 	    will be this value multiplied by the number of devices.
 	    This value can only be adjusted at boot time, and is set
 	    in <filename>/boot/loader.conf</filename>.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-min-auto-ashift"><emphasis><varname>vfs.zfs.min_auto_ashift</varname></emphasis>
 	    - Minimum <varname>ashift</varname> (sector size) that
 	    will be used automatically at pool creation time.  The
 	    value is the exponent of a power of two.  The default
 	    value of <literal>9</literal> represents
 	    <literal>2^9 = 512</literal>, a sector size of 512 bytes.
 	    To avoid <emphasis>write amplification</emphasis> and get
 	    the best performance, set this value to the largest sector
 	    size used by a device in the pool.</para>
 
 	  <para>Many drives have 4&nbsp;KB sectors.  Using the default
 	    <varname>ashift</varname> of <literal>9</literal> with
 	    these drives results in write amplification on these
 	    devices.  Data that could be contained in a single
 	    4&nbsp;KB write must instead be written in eight 512-byte
 	    writes.  <acronym>ZFS</acronym> tries to read the native
 	    sector size from all devices when creating a pool, but
 	    many drives with 4&nbsp;KB sectors report that their
 	    sectors are 512 bytes for compatibility.  Setting
 	    <varname>vfs.zfs.min_auto_ashift</varname> to
 	    <literal>12</literal> (<literal>2^12 = 4096</literal>)
 	    before creating a pool forces <acronym>ZFS</acronym> to
 	    use 4&nbsp;KB blocks for best performance on these
 	    drives.</para>
 
 	  <para>Forcing 4&nbsp;KB blocks is also useful on pools where
 	    disk upgrades are planned.  Future disks are likely to use
 	    4&nbsp;KB sectors, and <varname>ashift</varname> values
 	    cannot be changed after a pool is created.</para>
 
 	  <para>In some specific cases, the smaller 512-byte block
 	    size might be preferable.  When used with 512-byte disks
 	    for databases, or as storage for virtual machines, less
 	    data is transferred during small random reads.  This can
 	    provide better performance, especially when using a
 	    smaller <acronym>ZFS</acronym> record size.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-prefetch_disable"><emphasis><varname>vfs.zfs.prefetch_disable</varname></emphasis>
 	    - Disable prefetch.  A value of <literal>0</literal>
 	    enables prefetch and <literal>1</literal> disables it.
 	    The default is <literal>0</literal>, unless the system has
 	    less than
 	    4&nbsp;GB of <acronym>RAM</acronym>.  Prefetch works by
 	    reading larger blocks than were requested into the
 	    <link linkend="zfs-term-arc"><acronym>ARC</acronym></link>
 	    in hopes that the data will be needed soon.  If the
 	    workload has a large number of random reads, disabling
 	    prefetch may actually improve performance by reducing
 	    unnecessary reads.  This value can be adjusted at any time
 	    with &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-vdev-trim_on_init"><emphasis><varname>vfs.zfs.vdev.trim_on_init</varname></emphasis>
 	    - Control whether new devices added to the pool have the
 	    <literal>TRIM</literal> command run on them.  This ensures
 	    the best performance and longevity for
 	    <acronym>SSD</acronym>s, but takes extra time.  If the
 	    device has already been secure erased, disabling this
 	    setting will make the addition of the new device faster.
 	    This value can be adjusted at any time with
 	    &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-vdev-max_pending"><emphasis><varname>vfs.zfs.vdev.max_pending</varname></emphasis>
 	    - Limit the number of pending I/O requests per device.
 	    A higher value will keep the device command queue full
 	    and may give higher throughput.  A lower value will reduce
 	    latency.  This value can be adjusted at any time with
 	    &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-top_maxinflight"><emphasis><varname>vfs.zfs.top_maxinflight</varname></emphasis>
 	    - Maximum number of outstanding I/Os per top-level
 	    <link linkend="zfs-term-vdev">vdev</link>.  Limits the
 	    depth of the command queue to prevent high latency.  The
 	    limit is per top-level vdev, meaning the limit applies to
 	    each <link linkend="zfs-term-vdev-mirror">mirror</link>,
 	    <link linkend="zfs-term-vdev-raidz">RAID-Z</link>, or
 	    other vdev independently.  This value can be adjusted at
 	    any time with &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-l2arc_write_max"><emphasis><varname>vfs.zfs.l2arc_write_max</varname></emphasis>
 	    - Limit the amount of data written to the <link
 	      linkend="zfs-term-l2arc"><acronym>L2ARC</acronym></link>
 	    per second.  This tunable is designed to extend the
 	    longevity of <acronym>SSD</acronym>s by limiting the
 	    amount of data written to the device.  This value can be
 	    adjusted at any time with &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-l2arc_write_boost"><emphasis><varname>vfs.zfs.l2arc_write_boost</varname></emphasis>
 	    - The value of this tunable is added to <link
 	      linkend="zfs-advanced-tuning-l2arc_write_max"><varname>vfs.zfs.l2arc_write_max</varname></link>
 	    and increases the write speed to the
 	    <acronym>SSD</acronym> until the first block is evicted
 	    from the <link
 	      linkend="zfs-term-l2arc"><acronym>L2ARC</acronym></link>.
 	    This <quote>Turbo Warmup Phase</quote> is designed to
 	    reduce the performance loss from an empty <link
 	      linkend="zfs-term-l2arc"><acronym>L2ARC</acronym></link>
 	    after a reboot.  This value can be adjusted at any time
 	    with &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-scrub_delay"><emphasis><varname>vfs.zfs.scrub_delay</varname></emphasis>
 	    - Number of ticks to delay between each I/O during a
 	    <link
 	      linkend="zfs-term-scrub"><command>scrub</command></link>.
 	    To ensure that a <command>scrub</command> does not
 	    interfere with the normal operation of the pool, if any
 	    other <acronym>I/O</acronym> is happening the
 	    <command>scrub</command> will delay between each command.
 	    This value controls the limit on the total
 	    <acronym>IOPS</acronym> (I/Os Per Second) generated by the
 	    <command>scrub</command>.  The granularity of the setting
 	    is determined by the value of <varname>kern.hz</varname>
 	    which defaults to 1000 ticks per second.  This setting may
 	    be changed, resulting in a different effective
 	    <acronym>IOPS</acronym> limit.  The default value is
 	    <literal>4</literal>, resulting in a limit of:
 	    1000&nbsp;ticks/sec / 4 =
 	    250&nbsp;<acronym>IOPS</acronym>.  Using a value of
 	    <replaceable>20</replaceable> would give a limit of:
 	    1000&nbsp;ticks/sec / 20 =
 	    50&nbsp;<acronym>IOPS</acronym>.  The speed of
 	    <command>scrub</command> is only limited when there has
 	    been recent activity on the pool, as determined by <link
 	      linkend="zfs-advanced-tuning-scan_idle"><varname>vfs.zfs.scan_idle</varname></link>.
 	    This value can be adjusted at any time with
 	    &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-resilver_delay"><emphasis><varname>vfs.zfs.resilver_delay</varname></emphasis>
 	    - Number of ticks of delay inserted between
 	    each I/O during a
 	    <link linkend="zfs-term-resilver">resilver</link>.  To
 	    ensure that a resilver does not interfere with the normal
 	    operation of the pool, if any other I/O is happening the
 	    resilver will delay between each command.  This value
 	    controls the limit of total <acronym>IOPS</acronym> (I/Os
 	    Per Second) generated by the resilver.  The granularity of
 	    the setting is determined by the value of
 	    <varname>kern.hz</varname> which defaults to 1000 ticks
 	    per second.  This setting may be changed, resulting in a
 	    different effective <acronym>IOPS</acronym> limit.  The
 	    default value is 2, resulting in a limit of:
 	    1000&nbsp;ticks/sec / 2 =
 	    500&nbsp;<acronym>IOPS</acronym>.  Returning the pool to
 	    an <link linkend="zfs-term-online">Online</link> state may
 	    be more important if another device failing could
 	    <link linkend="zfs-term-faulted">Fault</link> the pool,
 	    causing data loss.  A value of 0 will give the resilver
 	    operation the same priority as other operations, speeding
 	    the healing process.  The speed of resilver is only
 	    limited when there has been other recent activity on the
 	    pool, as determined by <link
 	      linkend="zfs-advanced-tuning-scan_idle"><varname>vfs.zfs.scan_idle</varname></link>.
 	    This value can be adjusted at any time with
 	    &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-scan_idle"><emphasis><varname>vfs.zfs.scan_idle</varname></emphasis>
 	    - Number of milliseconds since the last operation before
 	    the pool is considered idle.  When the pool is idle the
 	    rate limiting for <link
 	      linkend="zfs-term-scrub"><command>scrub</command></link>
 	    and
 	    <link linkend="zfs-term-resilver">resilver</link> is
 	    disabled.  This value can be adjusted at any time with
 	    &man.sysctl.8;.</para>
 	</listitem>
 
 	<listitem>
 	  <para
 	      xml:id="zfs-advanced-tuning-txg-timeout"><emphasis><varname>vfs.zfs.txg.timeout</varname></emphasis>
 	    - Maximum number of seconds between
 	    <link linkend="zfs-term-txg">transaction group</link>s.
 	    The current transaction group will be written to the pool
 	    and a fresh transaction group started if this amount of
 	    time has elapsed since the previous transaction group.  A
 	    transaction group may be triggered earlier if enough data
 	    is written.  The default value is 5 seconds.  A larger
 	    value may improve read performance by delaying
 	    asynchronous writes, but this may cause uneven performance
 	    when the transaction group is written.  This value can be
 	    adjusted at any time with &man.sysctl.8;.</para>
 	</listitem>
       </itemizedlist>
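
       <para>As a hedged illustration of the two ways these tunables
 	are applied, boot-time values go in
 	<filename>/boot/loader.conf</filename>.  The size shown is a
 	placeholder and not a recommendation:</para>

       <programlisting># Illustrative value only; size the ARC for the actual workload
 vfs.zfs.arc_max="4G"</programlisting>

       <para>Tunables that can be adjusted at any time are changed
 	with &man.sysctl.8;.  In this sketch, the minimum
 	<varname>ashift</varname> is raised before a pool is created,
 	prefetch is disabled, and the current transaction group
 	timeout is displayed.  Values set this way do not survive a
 	reboot unless they are also added to
 	<filename>/etc/sysctl.conf</filename>:</para>

       <screen>&prompt.root; <userinput>sysctl vfs.zfs.min_auto_ashift=12</userinput>
 &prompt.root; <userinput>sysctl vfs.zfs.prefetch_disable=1</userinput>
 &prompt.root; <userinput>sysctl vfs.zfs.txg.timeout</userinput></screen>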
     </sect2>
 
 <!-- These sections will be added in the future
     <sect2 xml:id="zfs-advanced-booting">
       <title>Booting Root on <acronym>ZFS</acronym> </title>
 
       <para></para>
     </sect2>
 
     <sect2 xml:id="zfs-advanced-beadm">
       <title><acronym>ZFS</acronym> Boot Environments</title>
 
       <para></para>
     </sect2>
 
     <sect2 xml:id="zfs-advanced-troubleshoot">
       <title>Troubleshooting</title>
 
       <para></para>
     </sect2>
 -->
 
     <sect2 xml:id="zfs-advanced-i386">
       <title><acronym>ZFS</acronym> on i386</title>
 
       <para>Some of the features provided by <acronym>ZFS</acronym>
 	are memory intensive, and may require tuning for maximum
 	efficiency on systems with limited
 	<acronym>RAM</acronym>.</para>
 
       <sect3>
 	<title>Memory</title>
 
 	<para>As a bare minimum, the total system memory should be at
 	  least one gigabyte.  The amount of recommended
 	  <acronym>RAM</acronym> depends upon the size of the pool and
 	  which <acronym>ZFS</acronym> features are used.  A general
 	  rule of thumb is 1&nbsp;GB of <acronym>RAM</acronym> for
 	  every 1&nbsp;TB of storage.  If the deduplication feature is
 	  used, a general rule of thumb is 5&nbsp;GB of
 	  <acronym>RAM</acronym> per TB of storage to be deduplicated.
 	  While some users successfully use <acronym>ZFS</acronym>
 	  with less <acronym>RAM</acronym>, systems under heavy load
 	  may panic due to memory exhaustion.  Further tuning may be
 	  required for systems with less than the recommended amount
 	  of <acronym>RAM</acronym>.</para>
       </sect3>
 
       <sect3>
 	<title>Kernel Configuration</title>
 
 	<para>Due to the address space limitations of the
 	  &i386; platform, <acronym>ZFS</acronym> users on the
 	  &i386; architecture must add this option to a
 	  custom kernel configuration file, rebuild the kernel, and
 	  reboot:</para>
 
 	<programlisting>options        KVA_PAGES=512</programlisting>
 
 	<para>This expands the kernel address space, allowing
 	  the <varname>vm.kvm_size</varname> tunable to be pushed
 	  beyond the currently imposed limit of 1&nbsp;GB, or the
 	  limit of 2&nbsp;GB for <acronym>PAE</acronym>.  To find the
 	  most suitable value for this option, divide the desired
 	  address space in megabytes by four.  In this example, it
 	  is <literal>512</literal> for 2&nbsp;GB.</para>
       </sect3>
 
       <sect3>
 	<title>Loader Tunables</title>
 
 	<para>The <filename>kmem</filename> address space can be
 	  increased on all &os; architectures.  On a test system with
 	  1&nbsp;GB of physical memory, success was achieved with
 	  these options added to
 	  <filename>/boot/loader.conf</filename>, and the system
 	  restarted:</para>
 
 	<programlisting>vm.kmem_size="330M"
 vm.kmem_size_max="330M"
 vfs.zfs.arc_max="40M"
 vfs.zfs.vdev.cache.size="5M"</programlisting>
 
 	<para>For a more detailed list of recommendations for
 	  <acronym>ZFS</acronym>-related tuning, see <link
 	    xlink:href="https://wiki.freebsd.org/ZFSTuningGuide"></link>.</para>
       </sect3>
     </sect2>
   </sect1>
 
   <sect1 xml:id="zfs-links">
     <title>Additional Resources</title>
 
     <itemizedlist>
       <listitem>
 	<para><link xlink:href="https://wiki.freebsd.org/ZFS">FreeBSD
 	    Wiki - <acronym>ZFS</acronym></link></para>
       </listitem>
 
       <listitem>
 	<para><link
 	    xlink:href="https://wiki.freebsd.org/ZFSTuningGuide">FreeBSD
 	    Wiki - <acronym>ZFS</acronym> Tuning</link></para>
       </listitem>
 
       <listitem>
 	<para><link
 	    xlink:href="http://wiki.illumos.org/display/illumos/ZFS">Illumos
 	    Wiki - <acronym>ZFS</acronym></link></para>
       </listitem>
 
       <listitem>
 	<para><link
 	    xlink:href="http://docs.oracle.com/cd/E19253-01/819-5461/index.html">Oracle
 	    Solaris <acronym>ZFS</acronym> Administration
 	    Guide</link></para>
       </listitem>
 
       <listitem>
 	<para><link
 	    xlink:href="http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide"><acronym>ZFS</acronym>
 	    Evil Tuning Guide</link></para>
       </listitem>
 
       <listitem>
 	<para><link
 	    xlink:href="http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide"><acronym>ZFS</acronym>
 	    Best Practices Guide</link></para>
       </listitem>
 
       <listitem>
 	<para><link
 	    xlink:href="https://calomel.org/zfs_raid_speed_capacity.html">Calomel
 	    Blog - <acronym>ZFS</acronym> Raidz Performance, Capacity
 	    and Integrity</link></para>
       </listitem>
     </itemizedlist>
   </sect1>
 
   <sect1 xml:id="zfs-term">
     <title><acronym>ZFS</acronym> Features and Terminology</title>
 
     <para><acronym>ZFS</acronym> is a fundamentally different file
       system because it is more than just a file system.
       <acronym>ZFS</acronym> combines the roles of file system and
       volume manager, enabling additional storage devices to be added
       to a live system and having the new space available on all of
       the existing file systems in that pool immediately.  By
       combining the traditionally separate roles,
       <acronym>ZFS</acronym> is able to overcome previous limitations
       that prevented <acronym>RAID</acronym> groups being able to
       grow.  Each top level device in a pool is called a
       <emphasis>vdev</emphasis>, which can be a simple disk or a
       <acronym>RAID</acronym> transformation such as a mirror or
       <acronym>RAID-Z</acronym> array.  <acronym>ZFS</acronym> file
       systems (called <emphasis>datasets</emphasis>) each have access
       to the combined free space of the entire pool.  As blocks are
       allocated from the pool, the space available to each file system
       decreases.  This approach avoids the common pitfall with
       extensive partitioning where free space becomes fragmented
       across the partitions.</para>
 
     <informaltable pgwide="1">
       <tgroup cols="2">
 	<tbody valign="top">
 	  <row>
 	    <entry xml:id="zfs-term-pool">pool</entry>
 
 	    <entry>A storage <emphasis>pool</emphasis> is the most
 	      basic building block of <acronym>ZFS</acronym>.  A pool
 	      is made up of one or more vdevs, the underlying devices
 	      that store the data.  A pool is then used to create one
 	      or more file systems (datasets) or block devices
 	      (volumes).  These datasets and volumes share the pool of
 	      remaining free space.  Each pool is uniquely identified
 	      by a name and a <acronym>GUID</acronym>.  The features
 	      available are determined by the <acronym>ZFS</acronym>
 	      version number on the pool.
-
-	      <note>
-		<para>&os;&nbsp;9.0 and 9.1 include support for
-		  <acronym>ZFS</acronym> version 28.  Later versions
-		  use <acronym>ZFS</acronym> version 5000 with feature
-		  flags.  The new feature flags system allows greater
-		  cross-compatibility with other implementations of
-		  <acronym>ZFS</acronym>.</para>
-	      </note>
 	    </entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-vdev">vdev&nbsp;Types</entry>
 
 	    <entry>A pool is made up of one or more vdevs, which
 	      themselves can be a single disk or a group of disks, in
 	      the case of a <acronym>RAID</acronym> transform.  When
 	      multiple vdevs are used, <acronym>ZFS</acronym> spreads
 	      data across the vdevs to increase performance and
 	      maximize usable space.
 
 	      <itemizedlist>
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-disk"><emphasis>Disk</emphasis>
 		    - The most basic type of vdev is a standard block
 		    device.  This can be an entire disk (such as
 		    <filename><replaceable>/dev/ada0</replaceable></filename>
 		    or
 		    <filename><replaceable>/dev/da0</replaceable></filename>)
 		    or a partition
 		    (<filename><replaceable>/dev/ada0p3</replaceable></filename>).
 		    On &os;, there is no performance penalty for using
 		    a partition rather than the entire disk.  This
 		    differs from recommendations made by the Solaris
 		    documentation.</para>
 		</listitem>
 
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-file"><emphasis>File</emphasis>
 		    - In addition to disks, <acronym>ZFS</acronym>
 		    pools can be backed by regular files.  This is
 		    especially useful for testing and experimentation.
 		    Use the full path to the file as the device path
 		    in <command>zpool create</command>.  All vdevs
 		    must be at least 128&nbsp;MB in size.</para>
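 
 		  <para>A minimal sketch for experimentation, assuming
 		    a scratch file at the hypothetical path
 		    <filename>/tmp/zfile0</filename> and a throwaway
 		    pool named
 		    <replaceable>testpool</replaceable>:</para>
 
 		  <screen>&prompt.root; <userinput>truncate -s 256m <replaceable>/tmp/zfile0</replaceable></userinput>
 &prompt.root; <userinput>zpool create <replaceable>testpool</replaceable> <replaceable>/tmp/zfile0</replaceable></userinput></screen>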
 		</listitem>
 
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-mirror"><emphasis>Mirror</emphasis>
 		    - When creating a mirror, specify the
 		    <literal>mirror</literal> keyword followed by the
 		    list of member devices for the mirror.  A mirror
 		    consists of two or more devices, and all data is
 		    written to every member device.  A mirror vdev will
 		    only hold as much data as its smallest member.  A
 		    mirror vdev can withstand the failure of all but
 		    one of its members without losing any data.</para>
 
 		  <note>
 		    <para>A regular single disk vdev can be upgraded
 		      to a mirror vdev at any time with
 		      <command>zpool
 			<link
 			  linkend="zfs-zpool-attach">attach</link></command>.</para>
 		  </note>
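 
 		  <para>For illustration, assuming a pool named
 		    <replaceable>mypool</replaceable> and the
 		    hypothetical disks
 		    <replaceable>ada1</replaceable> and
 		    <replaceable>ada2</replaceable>, a mirror can
 		    either be created directly, or an existing single
 		    disk vdev on <replaceable>ada1</replaceable> can be
 		    converted by attaching
 		    <replaceable>ada2</replaceable> to it:</para>
 
 		  <screen>&prompt.root; <userinput>zpool create <replaceable>mypool</replaceable> mirror <replaceable>ada1</replaceable> <replaceable>ada2</replaceable></userinput></screen>
 
 		  <screen>&prompt.root; <userinput>zpool attach <replaceable>mypool</replaceable> <replaceable>ada1</replaceable> <replaceable>ada2</replaceable></userinput></screen>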
 		</listitem>
 
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-raidz"><emphasis><acronym>RAID-Z</acronym></emphasis>
 		    - <acronym>ZFS</acronym> implements
 		    <acronym>RAID-Z</acronym>, a variation on standard
 		    <acronym>RAID-5</acronym> that offers better
 		    distribution of parity and eliminates the
 		    <quote><acronym>RAID-5</acronym> write
 		    hole</quote> in which the data and parity
 		    information become inconsistent after an
 		    unexpected restart.  <acronym>ZFS</acronym>
 		    supports three levels of <acronym>RAID-Z</acronym>
 		    which provide varying levels of redundancy in
 		    exchange for decreasing levels of usable storage.
 		    The types are named <acronym>RAID-Z1</acronym>
 		    through <acronym>RAID-Z3</acronym> based on the
 		    number of parity devices in the array and the
 		    number of disks which can fail while the pool
 		    remains operational.</para>
 
 		  <para>In a <acronym>RAID-Z1</acronym> configuration
 		    with four disks, each 1&nbsp;TB, usable storage is
 		    3&nbsp;TB and the pool will still be able to
 		    operate in degraded mode with one faulted disk.
 		    If an additional disk goes offline before the
 		    faulted disk is replaced and resilvered, all data
 		    in the pool can be lost.</para>
 
 		  <para>In a <acronym>RAID-Z3</acronym> configuration
 		    with eight disks of 1&nbsp;TB, the volume will
 		    provide 5&nbsp;TB of usable space and still be
 		    able to operate with three faulted disks.  &sun;
 		    recommends no more than nine disks in a single
 		    vdev.  If the configuration has more disks, it is
 		    recommended to divide them into separate vdevs and
 		    the pool data will be striped across them.</para>
 
 		  <para>A configuration of two
 		    <acronym>RAID-Z2</acronym> vdevs consisting of 8
 		    disks each would create something similar to a
 		    <acronym>RAID-60</acronym> array.  A
 		    <acronym>RAID-Z</acronym> group's storage capacity
 		    is approximately the size of the smallest disk
 		    multiplied by the number of non-parity disks.
 		    Four 1&nbsp;TB disks in <acronym>RAID-Z1</acronym>
 		    have an effective size of approximately 3&nbsp;TB,
 		    and an array of eight 1&nbsp;TB disks in
 		    <acronym>RAID-Z3</acronym> will yield 5&nbsp;TB of
 		    usable space.</para>
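 
 		  <para>The rule of thumb can be checked with simple
 		    arithmetic: the number of non-parity disks
 		    multiplied by the size of the smallest disk in
 		    terabytes.  This is only an approximation that
 		    ignores metadata and padding overhead.  For the
 		    four-disk <acronym>RAID-Z1</acronym> and the
 		    eight-disk <acronym>RAID-Z3</acronym>
 		    examples:</para>
 
 		  <screen>&prompt.user; <userinput>echo "(4 - 1) * 1" | bc</userinput>
 3
 &prompt.user; <userinput>echo "(8 - 3) * 1" | bc</userinput>
 5</screen>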
 		</listitem>
 
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-spare"><emphasis>Spare</emphasis>
 		    - <acronym>ZFS</acronym> has a special pseudo-vdev
 		    type for keeping track of available hot spares.
 		    Note that installed hot spares are not deployed
 		    automatically; they must manually be configured to
 		    replace the failed device using
 		    <command>zpool replace</command>.</para>
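 
 		  <para>A sketch of the manual procedure, assuming a
 		    pool named <replaceable>mypool</replaceable>, a
 		    failed disk <replaceable>ada2</replaceable>, and an
 		    unused disk <replaceable>ada5</replaceable> (all
 		    hypothetical names).  The first command registers
 		    the hot spare, and the second puts it to use in
 		    place of the failed device with
 		    <command>zpool replace</command>:</para>
 
 		  <screen>&prompt.root; <userinput>zpool add <replaceable>mypool</replaceable> spare <replaceable>ada5</replaceable></userinput>
 &prompt.root; <userinput>zpool replace <replaceable>mypool</replaceable> <replaceable>ada2</replaceable> <replaceable>ada5</replaceable></userinput></screen>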
 		</listitem>
 
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-log"><emphasis>Log</emphasis>
 		    - <acronym>ZFS</acronym> Log Devices, also known
 		    as <acronym>ZFS</acronym> Intent Log (<link
 		      linkend="zfs-term-zil"><acronym>ZIL</acronym></link>),
 		    move the intent log from the regular pool devices
 		    to a dedicated device, typically an
 		    <acronym>SSD</acronym>.  Having a dedicated log
 		    device can significantly improve the performance
 		    of applications with a high volume of synchronous
 		    writes, especially databases.  Log devices can be
 		    mirrored, but <acronym>RAID-Z</acronym> is not
 		    supported.  If multiple log devices are used,
 		    writes will be load balanced across them.</para>
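 
 		  <para>As an example, assuming a pool named
 		    <replaceable>mypool</replaceable> and two
 		    hypothetical <acronym>SSD</acronym>s
 		    <replaceable>ada3</replaceable> and
 		    <replaceable>ada4</replaceable>, a mirrored log
 		    device could be added like this:</para>
 
 		  <screen>&prompt.root; <userinput>zpool add <replaceable>mypool</replaceable> log mirror <replaceable>ada3</replaceable> <replaceable>ada4</replaceable></userinput></screen>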
 		</listitem>
 
 		<listitem>
 		  <para
 		      xml:id="zfs-term-vdev-cache"><emphasis>Cache</emphasis>
 		    - Adding a cache vdev to a pool will add the
 		    storage of the cache to the <link
 		      linkend="zfs-term-l2arc"><acronym>L2ARC</acronym></link>.
 		    Cache devices cannot be mirrored.  Since a cache
 		    device only stores additional copies of existing
 		    data, there is no risk of data loss.</para>
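 
 		  <para>As an example, assuming a pool named
 		    <replaceable>mypool</replaceable> and a
 		    hypothetical <acronym>SSD</acronym>
 		    <replaceable>ada6</replaceable>, a cache device
 		    could be added like this:</para>
 
 		  <screen>&prompt.root; <userinput>zpool add <replaceable>mypool</replaceable> cache <replaceable>ada6</replaceable></userinput></screen>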
 		</listitem>
 	      </itemizedlist></entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-txg">Transaction Group
 	      (<acronym>TXG</acronym>)</entry>
 
 	    <entry>Transaction Groups are the way changed blocks are
 	      grouped together and eventually written to the pool.
 	      Transaction groups are the atomic unit that
 	      <acronym>ZFS</acronym> uses to assert consistency.  Each
 	      transaction group is assigned a unique 64-bit
 	      consecutive identifier.  There can be up to three active
 	      transaction groups at a time, one in each of these three
 	      states:
 
 	      <itemizedlist>
 		<listitem>
 		  <para><emphasis>Open</emphasis> - When a new
 		    transaction group is created, it is in the open
 		    state, and accepts new writes.  There is always
 		    a transaction group in the open state; however, the
 		    transaction group may refuse new writes if it has
 		    reached a limit.  Once the open transaction group
 		    has reached a limit, or the <link
 		      linkend="zfs-advanced-tuning-txg-timeout"><varname>vfs.zfs.txg.timeout</varname></link>
 		    has been reached, the transaction group advances
 		    to the next state.</para>
 		</listitem>
 
 		<listitem>
 		  <para><emphasis>Quiescing</emphasis> - A short state
 		    that allows any pending operations to finish while
 		    not blocking the creation of a new open
 		    transaction group.  Once all of the transactions
 		    in the group have completed, the transaction group
 		    advances to the final state.</para>
 		</listitem>
 
 		<listitem>
 		  <para><emphasis>Syncing</emphasis> - All of the data
 		    in the transaction group is written to stable
 		    storage.  This process will in turn modify other
 		    data, such as metadata and space maps, that will
 		    also need to be written to stable storage.  The
 		    process of syncing involves multiple passes.  The
 		    first and biggest pass writes all of the changed
 		    data blocks, followed by the metadata, which may
 		    take multiple passes to complete.  Since allocating
 		    space for the data blocks generates new metadata,
 		    the syncing state cannot finish until a pass
 		    completes that does not allocate any additional
 		    space.  The syncing state is also where
 		    <emphasis>synctasks</emphasis> are completed.
 		    Synctasks are administrative operations, such as
 		    creating or destroying snapshots and datasets,
 		    that modify the uberblock.  Once the
 		    sync state is complete, the transaction group in
 		    the quiescing state is advanced to the syncing
 		    state.</para>
 		</listitem>
 	      </itemizedlist>
 
 	      All administrative functions, such as <link
 		linkend="zfs-term-snapshot"><command>snapshot</command></link>,
 	      are written as part of the transaction group.  When a
 	      synctask is created, it is added to the currently open
 	      transaction group, and that group is advanced as quickly
 	      as possible to the syncing state to reduce the
 	      latency of administrative commands.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-arc">Adaptive Replacement
 	      Cache (<acronym>ARC</acronym>)</entry>
 
 	    <entry><acronym>ZFS</acronym> uses an Adaptive Replacement
 	      Cache (<acronym>ARC</acronym>), rather than a more
 	      traditional Least Recently Used (<acronym>LRU</acronym>)
 	      cache.  An <acronym>LRU</acronym> cache is a simple list
 	      of items in the cache, sorted by when each object was
 	      most recently used.  New items are added to the top of
 	      the list.  When the cache is full, items from the
 	      bottom of the list are evicted to make room for more
 	      active objects.  An <acronym>ARC</acronym> consists of
 	      four lists: the Most Recently Used
 	      (<acronym>MRU</acronym>) and Most Frequently Used
 	      (<acronym>MFU</acronym>) objects, plus a ghost list for
 	      each.  These ghost lists track recently evicted objects
 	      to prevent them from being added back to the cache.
 	      This increases the cache hit ratio by avoiding objects
 	      that have a history of only being used occasionally.
 	      Another advantage of using both an
 	      <acronym>MRU</acronym> and <acronym>MFU</acronym> is
 	      that scanning an entire file system would normally evict
 	      all data from an <acronym>MRU</acronym> or
 	      <acronym>LRU</acronym> cache in favor of this freshly
 	      accessed content.  With <acronym>ZFS</acronym>, there is
 	      also an <acronym>MFU</acronym> that only tracks the most
 	      frequently used objects, and the cache of the most
 	      commonly accessed blocks remains.</entry>
 	  </row>
 
 	  <row>
 	    <entry
 	      xml:id="zfs-term-l2arc"><acronym>L2ARC</acronym></entry>
 
 	    <entry><acronym>L2ARC</acronym> is the second level
 	      of the <acronym>ZFS</acronym> caching system.  The
 	      primary <acronym>ARC</acronym> is stored in
 	      <acronym>RAM</acronym>.  Since the amount of
 	      available <acronym>RAM</acronym> is often limited,
 	      <acronym>ZFS</acronym> can also use
 	      <link linkend="zfs-term-vdev-cache">cache vdevs</link>.
 	      Solid State Disks (<acronym>SSD</acronym>s) are often
 	      used as these cache devices due to their higher speed
 	      and lower latency compared to traditional spinning
 	      disks.  <acronym>L2ARC</acronym> is entirely optional,
 	      but having one will significantly increase read speeds
 	      for files that are cached on the <acronym>SSD</acronym>
 	      instead of having to be read from the regular disks.
 	      <acronym>L2ARC</acronym> can also speed up <link
 		linkend="zfs-term-deduplication">deduplication</link>
 	      because a <acronym>DDT</acronym> that does not fit in
 	      <acronym>RAM</acronym> but does fit in the
 	      <acronym>L2ARC</acronym> will be much faster than a
 	      <acronym>DDT</acronym> that must be read from disk.  The
 	      rate at which data is added to the cache devices is
 	      limited to prevent prematurely wearing out
 	      <acronym>SSD</acronym>s with too many writes.  Until the
 	      cache is full (the first block has been evicted to make
 	      room), writing to the <acronym>L2ARC</acronym> is
 	      limited to the sum of the write limit and the boost
 	      limit, and afterwards limited to the write limit.  A
 	      pair of &man.sysctl.8; values control these rate limits.
 	      <link
 		linkend="zfs-advanced-tuning-l2arc_write_max"><varname>vfs.zfs.l2arc_write_max</varname></link>
 	      controls how many bytes are written to the cache per
 	      second, while <link
 		linkend="zfs-advanced-tuning-l2arc_write_boost"><varname>vfs.zfs.l2arc_write_boost</varname></link>
 	      adds to this limit during the
 	      <quote>Turbo Warmup Phase</quote> (Write Boost).</entry>
 	  </row>
 
 	  <row>
 	    <entry
 	      xml:id="zfs-term-zil"><acronym>ZIL</acronym></entry>
 
 	    <entry><acronym>ZIL</acronym> accelerates synchronous
 	      transactions by using storage devices like
 	      <acronym>SSD</acronym>s that are faster than those used
 	      in the main storage pool.  When an application requests
 	      a synchronous write (a guarantee that the data has been
 	      safely stored to disk rather than merely cached to be
 	      written later), the data is written to the faster
 	      <acronym>ZIL</acronym> storage, then later flushed out
 	      to the regular disks.  This greatly reduces latency and
 	      improves performance.  Only synchronous workloads like
 	      databases will benefit from a <acronym>ZIL</acronym>.
 	      Regular asynchronous writes such as copying files will
 	      not use the <acronym>ZIL</acronym> at all.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-cow">Copy-On-Write</entry>
 
 	    <entry>Unlike a traditional file system, when data is
 	      overwritten on <acronym>ZFS</acronym>, the new data is
 	      written to a different block rather than overwriting the
 	      old data in place.  Only when this write is complete is
 	      the metadata then updated to point to the new location.
 	      In the event of a shorn write (a system crash or power
 	      loss in the middle of writing a file), the entire
 	      original contents of the file are still available and
 	      the incomplete write is discarded.  This also means that
 	      <acronym>ZFS</acronym> does not require a &man.fsck.8;
 	      after an unexpected shutdown.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-dataset">Dataset</entry>
 
 	    <entry><emphasis>Dataset</emphasis> is the generic term
 	      for a <acronym>ZFS</acronym> file system, volume,
 	      snapshot or clone.  Each dataset has a unique name in
 	      the format
 	      <replaceable>poolname/path@snapshot</replaceable>.
 	      The root of the pool is technically a dataset as well.
 	      Child datasets are named hierarchically like
 	      directories.  For example,
 	      <replaceable>mypool/home</replaceable>, the home
 	      dataset, is a child of <replaceable>mypool</replaceable>
 	      and inherits properties from it.  This can be expanded
 	      further by creating
 	      <replaceable>mypool/home/user</replaceable>.  This
 	      grandchild dataset will inherit properties from the
 	      parent and grandparent.  Properties on a child can be
 	      set to override the defaults inherited from the parents
 	      and grandparents.  Administration of datasets and their
 	      children can be
 	      <link linkend="zfs-zfs-allow">delegated</link>.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-filesystem">File system</entry>
 
 	    <entry>A <acronym>ZFS</acronym> dataset is most often used
 	      as a file system.  Like most other file systems, a
 	      <acronym>ZFS</acronym> file system is mounted somewhere
 	      in the system's directory hierarchy and contains files
 	      and directories of its own with permissions, flags, and
 	      other metadata.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-volume">Volume</entry>
 
 	    <entry>In addition to regular file system datasets,
 	      <acronym>ZFS</acronym> can also create volumes, which
 	      are block devices.  Volumes have many of the same
 	      features, including copy-on-write, snapshots, clones,
 	      and checksumming.  Volumes can be useful for running
 	      other file system formats on top of
 	      <acronym>ZFS</acronym>, such as <acronym>UFS</acronym>,
 	      for virtualization, or for exporting
 	      <acronym>iSCSI</acronym> extents.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-snapshot">Snapshot</entry>
 
 	    <entry>The
 	      <link linkend="zfs-term-cow">copy-on-write</link>
 	      (<acronym>COW</acronym>) design of
 	      <acronym>ZFS</acronym> allows for nearly instantaneous,
 	      consistent snapshots with arbitrary names.  After taking
 	      a snapshot of a dataset, or a recursive snapshot of a
 	      parent dataset that will include all child datasets, new
 	      data is written to new blocks, but the old blocks are
 	      not reclaimed as free space.  The snapshot contains
 	      the original version of the file system, and the live
 	      file system contains any changes made since the snapshot
 	      was taken.  No additional space is used.  As new data is
 	      written to the live file system, new blocks are
 	      allocated to store this data.  The apparent size of the
 	      snapshot will grow as the blocks are no longer used in
 	      the live file system, but only in the snapshot.  These
 	      snapshots can be mounted read only to allow for the
 	      recovery of previous versions of files.  It is also
 	      possible to
 	      <link linkend="zfs-zfs-snapshot">roll back</link> a live
 	      file system to a specific snapshot, undoing any changes
 	      that took place after the snapshot was taken.  Each
 	      block in the pool has a reference counter which keeps
 	      track of how many snapshots, clones, datasets, or
 	      volumes make use of that block.  As files and snapshots
 	      are deleted, the reference count is decremented.  When a
 	      block is no longer referenced, it is reclaimed as free
 	      space.  Snapshots can also be marked with a
 	      <link linkend="zfs-zfs-snapshot">hold</link>.  When a
 	      snapshot is held, any attempt to destroy it will return
 	      an <literal>EBUSY</literal> error.  Each snapshot can
 	      have multiple holds, each with a unique name.  The
 	      <link linkend="zfs-zfs-snapshot">release</link> command
 	      removes the hold so the snapshot can be deleted.  Snapshots
 	      can be taken on volumes, but they can only be cloned or
 	      rolled back, not mounted independently.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-clone">Clone</entry>
 
 	    <entry>Snapshots can also be cloned.  A clone is a
 	      writable version of a snapshot, allowing the file system
 	      to be forked as a new dataset.  As with a snapshot, a
 	      clone initially consumes no additional space.  As
 	      new data is written to a clone and new blocks are
 	      allocated, the apparent size of the clone grows.  When
 	      blocks are overwritten in the cloned file system or
 	      volume, the reference count on the previous block is
 	      decremented.  The snapshot upon which a clone is based
 	      cannot be deleted because the clone depends on it.  The
 	      snapshot is the parent, and the clone is the child.
 	      Clones can be <emphasis>promoted</emphasis>, reversing
 	      this dependency and making the clone the parent and the
 	      previous parent the child.  This operation requires no
 	      additional space.  Because the amount of space used by
 	      the parent and child is reversed, existing quotas and
 	      reservations might be affected.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-checksum">Checksum</entry>
 
 	    <entry>Every block that is allocated is also checksummed.
 	      The checksum algorithm used is a per-dataset property,
 	      see <link
 		linkend="zfs-zfs-set"><command>set</command></link>.
 	      The checksum of each block is transparently validated as
 	      it is read, allowing <acronym>ZFS</acronym> to detect
 	      silent corruption.  If the data that is read does not
 	      match the expected checksum, <acronym>ZFS</acronym> will
 	      attempt to recover the data from any available
 	      redundancy, like mirrors or <acronym>RAID-Z</acronym>.
 	      Validation of all checksums can be triggered with <link
 		linkend="zfs-term-scrub"><command>scrub</command></link>.
 	      Checksum algorithms include:
 
 	      <itemizedlist>
 		<listitem>
 		  <para><literal>fletcher2</literal></para>
 		</listitem>
 
 		<listitem>
 		  <para><literal>fletcher4</literal></para>
 		</listitem>
 
 		<listitem>
 		  <para><literal>sha256</literal></para>
 		</listitem>
 	      </itemizedlist>
 
 	      The <literal>fletcher</literal> algorithms are faster,
 	      but <literal>sha256</literal> is a strong cryptographic
 	      hash and has a much lower chance of collisions at the
 	      cost of some performance.  Checksums can be disabled,
 	      but it is not recommended.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-compression">Compression</entry>
 
 	    <entry>Each dataset has a compression property, which
 	      defaults to off.  This property can be set to one of a
 	      number of compression algorithms.  This will cause all
 	      new data that is written to the dataset to be
 	      compressed.  Beyond a reduction in space used, read and
 	      write throughput often increases because fewer blocks
 	      are read or written.
 
 	      <itemizedlist>
 		<listitem xml:id="zfs-term-compression-lz4">
 		  <para><emphasis><acronym>LZ4</acronym></emphasis> -
 		    Added in <acronym>ZFS</acronym> pool version
 		    5000 (feature flags), <acronym>LZ4</acronym> is
 		    now the recommended compression algorithm.
 		    <acronym>LZ4</acronym> compresses approximately
 		    50% faster than <acronym>LZJB</acronym> when
 		    operating on compressible data, and is over three
 		    times faster when operating on uncompressible
 		    data.  <acronym>LZ4</acronym> also decompresses
 		    approximately 80% faster than
 		    <acronym>LZJB</acronym>.  On modern
 		    <acronym>CPU</acronym>s, <acronym>LZ4</acronym>
 		    can often compress at over 500&nbsp;MB/s, and
 		    decompress at over 1.5&nbsp;GB/s (per single CPU
 		    core).</para>
-
-		  <note>
-		    <para><acronym>LZ4</acronym> compression is
-		      only available after &os;&nbsp;9.2.</para>
-		  </note>
 		</listitem>
 
 		<listitem xml:id="zfs-term-compression-lzjb">
 		  <para><emphasis><acronym>LZJB</acronym></emphasis> -
 		    The default compression algorithm.  Created by
 		    Jeff Bonwick (one of the original creators of
 		    <acronym>ZFS</acronym>).  <acronym>LZJB</acronym>
 		    offers good compression with less
 		    <acronym>CPU</acronym> overhead compared to
 		    <acronym>GZIP</acronym>.  In the future, the
 		    default compression algorithm will likely change
 		    to <acronym>LZ4</acronym>.</para>
 		</listitem>
 
 		<listitem xml:id="zfs-term-compression-gzip">
 		  <para><emphasis><acronym>GZIP</acronym></emphasis> -
 		    A popular stream compression algorithm available
 		    in <acronym>ZFS</acronym>.  One of the main
 		    advantages of using <acronym>GZIP</acronym> is its
 		    configurable level of compression.  When setting
 		    the <literal>compress</literal> property, the
 		    administrator can choose the level of compression,
 		    ranging from <literal>gzip-1</literal>, the lowest
 		    level of compression, to <literal>gzip-9</literal>,
 		    the highest level of compression.  This gives the
 		    administrator control over how much
 		    <acronym>CPU</acronym> time to trade for saved
 		    disk space.</para>
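 
 		  <para>For illustration, assuming a hypothetical
 		    dataset named
 		    <replaceable>mypool/archive</replaceable>, the
 		    highest level could be selected and the result
 		    inspected as shown here.  The level is written
 		    with a hyphen in the property value:</para>
 
 		  <screen>&prompt.root; <userinput>zfs set compression=gzip-9 <replaceable>mypool/archive</replaceable></userinput>
 &prompt.root; <userinput>zfs get compression,compressratio <replaceable>mypool/archive</replaceable></userinput></screen>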
 		</listitem>
 
 		<listitem xml:id="zfs-term-compression-zle">
 		  <para><emphasis><acronym>ZLE</acronym></emphasis> -
 		    Zero Length Encoding is a special compression
 		    algorithm that only compresses continuous runs of
 		    zeros.  This compression algorithm is only useful
 		    when the dataset contains large blocks of
 		    zeros.</para>
 		</listitem>
 	      </itemizedlist></entry>
 	  </row>
 
 	  <row>
 	    <entry
 	      xml:id="zfs-term-copies">Copies</entry>
 
 	    <entry>When set to a value greater than 1, the
 	      <literal>copies</literal> property instructs
 	      <acronym>ZFS</acronym> to maintain multiple copies of
 	      each block in the
 	      <link linkend="zfs-term-filesystem">File System</link>
 	      or
 	      <link linkend="zfs-term-volume">Volume</link>.  Setting
 	      this property on important datasets provides additional
 	      redundancy from which to recover a block that does not
 	      match its checksum.  In pools without redundancy, the
 	      copies feature is the only form of redundancy.  The
 	      copies feature can recover from a single bad sector or
 	      other forms of minor corruption, but it does not protect
 	      the pool from the loss of an entire disk.</entry>
 	  </row>
 
 	  <row>
 	    <entry
 	      xml:id="zfs-term-deduplication">Deduplication</entry>
 
 	    <entry>Checksums make it possible to detect duplicate
 	      blocks of data as they are written.  With deduplication,
 	      the reference count of an existing, identical block is
 	      increased, saving storage space.  To detect duplicate
 	      blocks, a deduplication table (<acronym>DDT</acronym>)
 	      is kept in memory.  The table contains a list of unique
 	      checksums, the location of those blocks, and a reference
 	      count.  When new data is written, the checksum is
 	      calculated and compared to the list.  If a match is
 	      found, the existing block is used.  The
 	      <acronym>SHA256</acronym> checksum algorithm is used
 	      with deduplication to provide a secure cryptographic
 	      hash.  Deduplication is tunable.  If
 	      <literal>dedup</literal> is <literal>on</literal>, then
 	      a matching checksum is assumed to mean that the data is
 	      identical.  If <literal>dedup</literal> is set to
 	      <literal>verify</literal>, then the data in the two
 	      blocks will be checked byte-for-byte to ensure it is
 	      actually identical.  If the data is not identical, the
 	      hash collision will be noted and the two blocks will be
 	      stored separately.  Because the <acronym>DDT</acronym>
 	      must store the hash of each unique block, it consumes a
 	      very large amount of memory.  A general rule of thumb is
 	      5-6&nbsp;GB of <acronym>RAM</acronym> per 1&nbsp;TB of
 	      deduplicated data.
 	      In situations where it is not practical to have enough
 	      <acronym>RAM</acronym> to keep the entire
 	      <acronym>DDT</acronym> in memory, performance will
 	      suffer greatly as the <acronym>DDT</acronym> must be
 	      read from disk before each new block is written.
 	      Deduplication can use <acronym>L2ARC</acronym> to store
 	      the <acronym>DDT</acronym>, providing a middle ground
 	      between fast system memory and slower disks.  Consider
 	      using compression instead, which often provides nearly
 	      as much space savings without the additional memory
 	      requirement.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-scrub">Scrub</entry>
 
 	    <entry>Instead of a consistency check like &man.fsck.8;,
 	      <acronym>ZFS</acronym> has <command>scrub</command>.
 	      <command>scrub</command> reads all data blocks stored on
 	      the pool and verifies their checksums against the known
 	      good checksums stored in the metadata.  A periodic check
 	      of all the data stored on the pool ensures the recovery
 	      of any corrupted blocks before they are needed.  A scrub
 	      is not required after an unclean shutdown, but is
 	      recommended at least once every three months.  The
 	      checksum of each block is verified as blocks are read
 	      during normal use, but a scrub makes certain that even
 	      infrequently used blocks are checked for silent
 	      corruption.  Data security is improved, especially in
 	      archival storage situations.  The relative priority of
 	      <command>scrub</command> can be adjusted with <link
 		linkend="zfs-advanced-tuning-scrub_delay"><varname>vfs.zfs.scrub_delay</varname></link>
 	      to prevent the scrub from degrading the performance of
 	      other workloads on the pool.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-quota">Dataset Quota</entry>
 
 	    <entry><acronym>ZFS</acronym> provides very fast and
 	      accurate dataset, user, and group space accounting in
 	      addition to quotas and space reservations.  This gives
 	      the administrator fine-grained control over how space is
 	      allocated and allows space to be reserved for critical
 	      file systems.
 
 	      <para><acronym>ZFS</acronym> supports different types of
 		quotas: the dataset quota, the <link
 		  linkend="zfs-term-refquota">reference
 		  quota (<literal>refquota</literal>)</link>, the
 		<link linkend="zfs-term-userquota">user
 		  quota</link>, and the
 		<link linkend="zfs-term-groupquota">group
 		  quota</link>.</para>
 
 	      <para>Quotas limit the amount of space that a dataset
 		and all of its descendants, including snapshots of the
 		dataset, child datasets, and the snapshots of those
 		datasets, can consume.</para>
 
 	      <note>
 		<para>Quotas cannot be set on volumes, as the
 		  <literal>volsize</literal> property acts as an
 		  implicit quota.</para>
 	      </note></entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-refquota">Reference
 	      Quota</entry>
 
 	    <entry>A reference quota limits the amount of space a
 	      dataset can consume by enforcing a hard limit.  However,
 	      this hard limit includes only space that the dataset
 	      references and does not include space used by
 	      descendants, such as file systems or snapshots.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-userquota">User
 	      Quota</entry>
 
 	    <entry>User quotas are useful to limit the amount of space
 	      that can be used by the specified user.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-groupquota">Group
 	      Quota</entry>
 
 	    <entry>The group quota limits the amount of space that a
 	      specified group can consume.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-reservation">Dataset
 	      Reservation</entry>
 
 	    <entry>The <literal>reservation</literal> property makes
 	      it possible to guarantee a minimum amount of space for a
 	      specific dataset and its descendants.  If a 10&nbsp;GB
 	      reservation is set on
 	      <filename>storage/home/bob</filename>, and another
 	      dataset tries to use all of the free space, at least
 	      10&nbsp;GB of space is reserved for this dataset.  If a
 	      snapshot is taken of
 	      <filename>storage/home/bob</filename>, the space used by
 	      that snapshot is counted against the reservation.  The
 	      <link
 		linkend="zfs-term-refreservation"><literal>refreservation</literal></link>
 	      property works in a similar way, but it
 	      <emphasis>excludes</emphasis> descendants like
 	      snapshots.
 
 	      <para>Reservations of any sort are useful in many
 		situations, such as planning and testing the
 		suitability of disk space allocation in a new system,
 		or ensuring that enough space is available on file
 		systems for audit logs or system recovery procedures
 		and files.</para>
 	    </entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-refreservation">Reference
 	      Reservation</entry>
 
 	    <entry>The <literal>refreservation</literal> property
 	      makes it possible to guarantee a minimum amount of
 	      space for the use of a specific dataset
 	      <emphasis>excluding</emphasis> its descendants.  This
 	      means that if a 10&nbsp;GB reservation is set on
 	      <filename>storage/home/bob</filename>, and another
 	      dataset tries to use all of the free space, at least
 	      10&nbsp;GB of space is reserved for this dataset.  In
 	      contrast to a regular
 	      <link linkend="zfs-term-reservation">reservation</link>,
 	      space used by snapshots and descendant datasets is not
 	      counted against the reservation.  For example, if a
 	      snapshot is taken of
 	      <filename>storage/home/bob</filename>, enough disk space
 	      must exist outside of the
 	      <literal>refreservation</literal> amount for the
 	      operation to succeed.  Descendants of the main dataset
 	      are not counted in the <literal>refreservation</literal>
 	      amount and so do not encroach on the space set.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-resilver">Resilver</entry>
 
 	    <entry>When a disk fails and is replaced, the new disk
 	      must be filled with the data that was lost.  The process
 	      of using the parity information distributed across the
 	      remaining drives to calculate and write the missing data
 	      to the new drive is called
 	      <emphasis>resilvering</emphasis>.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-online">Online</entry>
 
 	    <entry>A pool or vdev in the <literal>Online</literal>
 	      state has all of its member devices connected and fully
 	      operational.  Individual devices in the
 	      <literal>Online</literal> state are functioning
 	      normally.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-offline">Offline</entry>
 
 	    <entry>Individual devices can be put in an
 	      <literal>Offline</literal> state by the administrator if
 	      there is sufficient redundancy to avoid putting the pool
 	      or vdev into a
 	      <link linkend="zfs-term-faulted">Faulted</link> state.
 	      An administrator may choose to offline a disk in
 	      preparation for replacing it, or to make it easier to
 	      identify.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-degraded">Degraded</entry>
 
 	    <entry>A pool or vdev in the <literal>Degraded</literal>
 	      state has one or more disks that have been disconnected
 	      or have failed.  The pool is still usable, but if
 	      additional devices fail, the pool could become
 	      unrecoverable.  Reconnecting the missing devices or
 	      replacing the failed disks will return the pool to an
 	      <link linkend="zfs-term-online">Online</link> state
 	      after the reconnected or new device has completed the
 	      <link linkend="zfs-term-resilver">Resilver</link>
 	      process.</entry>
 	  </row>
 
 	  <row>
 	    <entry xml:id="zfs-term-faulted">Faulted</entry>
 
 	    <entry>A pool or vdev in the <literal>Faulted</literal>
 	      state is no longer operational.  The data on it can no
 	      longer be accessed.  A pool or vdev enters the
 	      <literal>Faulted</literal> state when the number of
 	      missing or failed devices exceeds the level of
 	      redundancy in the vdev.  If missing devices can be
 	      reconnected, the pool will return to an
 	      <link linkend="zfs-term-online">Online</link> state.  If
 	      there is insufficient redundancy to compensate for the
 	      number of failed disks, then the contents of the pool
 	      are lost and must be restored from backups.</entry>
 	  </row>
 	</tbody>
       </tgroup>
     </informaltable>
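 
     <para>As a brief illustration of the quota and reservation
       entries in the table above, the following sketch sets each
       property on <filename>storage/home/bob</filename>, the
       example dataset used in those entries, and then displays the
       results.  The 10&nbsp;GB values are arbitrary placeholders,
       and the dataset is assumed to already exist in a pool with
       enough free space to satisfy the reservations:</para>
 
     <screen>&prompt.root; <userinput>zfs set quota=10G storage/home/bob</userinput>
 &prompt.root; <userinput>zfs set refquota=10G storage/home/bob</userinput>
 &prompt.root; <userinput>zfs set reservation=10G storage/home/bob</userinput>
 &prompt.root; <userinput>zfs set refreservation=10G storage/home/bob</userinput>
 &prompt.root; <userinput>zfs get quota,refquota,reservation,refreservation storage/home/bob</userinput></screen>
 
     <para>Setting any of these properties to
       <literal>none</literal> removes the corresponding limit or
       guarantee.</para>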
   </sect1>
 </chapter>