Index: en_US.ISO8859-1/books/handbook/zfs/chapter.xml
===================================================================
--- en_US.ISO8859-1/books/handbook/zfs/chapter.xml
+++ en_US.ISO8859-1/books/handbook/zfs/chapter.xml
@@ -143,6 +143,13 @@
ada device
names.
+ The first step in creating a new ZFS pool
+ is deciding on the disk layout. There are a number of options,
+ and once the pool is created, the layout cannot be changed.
+ For more information, see Advanced Topics - Pool
+ Layout.
+
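+ As a brief illustration of how the layout decision appears
+ on the command line (a sketch with hypothetical pool and
+ device names), a pool could be created as a two-disk mirror
+ or as a three-disk RAID-Z:
+
+&prompt.root; zpool create mirrorpool mirror /dev/ada1 /dev/ada2
+&prompt.root; zpool create raidzpool raidz /dev/ada1 /dev/ada2 /dev/ada3
+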
Single Disk Pool
@@ -163,8 +170,8 @@
This output shows that the example pool
has been created and mounted. It is now accessible as a file
- system. Files can be created on it and users can browse
- it:
+ system. Files can be created on it and users can browse it,
+ as in this example:
&prompt.root; cd /example
&prompt.root; ls
@@ -465,8 +472,8 @@
not recommended! Checksums take very
little storage space and provide data integrity. Many
ZFS features will not work properly with
- checksums disabled. There is no noticeable performance gain
- from disabling these checksums.
+ checksums disabled. There is also no noticeable performance
+ gain from disabling these checksums.
Checksum verification is known as
@@ -716,13 +723,13 @@
as for RAID-Z, an alternative method is to
add another vdev to the pool. Additional vdevs provide higher
performance, distributing writes across the vdevs. Each vdev
- is reponsible for providing its own redundancy. It is
- possible, but discouraged, to mix vdev types, like
+ is responsible for providing its own redundancy. Do not mix
+ different vdev types, like
mirror and RAID-Z.
Adding a non-redundant vdev to a pool containing mirror or
RAID-Z vdevs risks the data on the entire
pool. Writes are distributed, so the failure of the
- non-redundant disk will result in the loss of a fraction of
+ non-redundant vdev will result in the loss of a fraction of
every block that has been written to the pool.
Data is striped across each of the vdevs. For example,
@@ -731,8 +738,8 @@
of mirrors. Space is allocated so that each vdev reaches 100%
full at the same time. There is a performance penalty if the
vdevs have different amounts of free space, as a
- disproportionate amount of the data is written to the less
- full vdev.
+ disproportionate amount of the data is written to the vdev
+ that is less full.
When attaching additional devices to a boot pool, remember
to update the bootcode.
@@ -1933,7 +1940,7 @@
Creating Snapshots
- Snapshots are created with zfs snapshot
+ Snapshots are created with zfs snapshot
dataset@snapshotname.
Adding -r creates a snapshot recursively,
with the same name on all child datasets.
@@ -2171,7 +2178,7 @@
ZFS will issue this warning:

&prompt.root; zfs list -rt snapshot mypool/var/tmp
-AME USED AVAIL REFER MOUNTPOINT
+NAME USED AVAIL REFER MOUNTPOINT
mypool/var/tmp@my_recursive_snapshot 88K - 152K -
mypool/var/tmp@after_cp 53.5K - 118K -
mypool/var/tmp@diff_snapshot 0 - 120K -
@@ -2290,99 +2297,140 @@
Managing Clones
- A clone is a copy of a snapshot that is treated more like
- a regular dataset. Unlike a snapshot, a clone is not read
- only, is mounted, and can have its own properties. Once a
- clone has been created using zfs clone, the
- snapshot it was created from cannot be destroyed. The
- child/parent relationship between the clone and the snapshot
- can be reversed using zfs promote. After a
- clone has been promoted, the snapshot becomes a child of the
- clone, rather than of the original parent dataset. This will
- change how the space is accounted, but not actually change the
- amount of space consumed. The clone can be mounted at any
- point within the ZFS file system hierarchy,
- not just below the original location of the snapshot.
-
- To demonstrate the clone feature, this example dataset is
- used:
-
- &prompt.root; zfs list -rt all camino/home/joe
-NAME USED AVAIL REFER MOUNTPOINT
-camino/home/joe 108K 1.3G 87K /usr/home/joe
-camino/home/joe@plans 21K - 85.5K -
-camino/home/joe@backup 0K - 87K -
+ A clone is an exact copy of a snapshot that is treated
+ more like a regular dataset. Unlike a snapshot, the clone can
+ be changed independently from its originating dataset. A
+ clone can be written to, mounted, and can have its own
+ properties. Similar to how snapshots work, a clone shares
+ unmodified blocks with the origin snapshot
+ it was created from, saving space and only growing as it is
+ modified. A clone can only be created from a snapshot.
+
+ Create a snapshot of a file system, then clone it:
+
+ &prompt.root; echo "first message" > /var/tmp/my_message
+&prompt.root; ls /var/tmp
+my_message vi.recover
+&prompt.root; zfs snapshot mypool/var/tmp@first_snapshot
+&prompt.root; zfs list -rt all mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/tmp 249K 30.5G 249K /var/tmp
+mypool/var/tmp@first_snapshot 0 - 249K -
+&prompt.root; zfs clone mypool/var/tmp@first_snapshot mypool/var/clone
+&prompt.root; zfs list -rt all mypool/var/clone mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/clone 12.8K 30.5G 249K /var/clone
+mypool/var/tmp 249K 30.5G 249K /var/tmp
+mypool/var/tmp@first_snapshot 0 - 249K -
+&prompt.root; ls /var/clone
+my_message vi.recover
+
+ A clone is essentially a fork of a file system, a common
+ base set of blocks that are shared by two file systems. When
+ a file is modified in a clone, additional space is consumed.
+ The original blocks are kept intact because they are still
+ being used by the first file system, and its snapshot(s).
+ When a file is modified in the first file system, additional
+ space is consumed again, this time allocated to the snapshot.
+ The original blocks are still in use, now only by the
+ snapshot. The system now contains all three versions of the
+ file.
A typical use for clones is to experiment with a specific
- dataset while keeping the snapshot around to fall back to in
- case something goes wrong. Since snapshots can not be
- changed, a read/write clone of a snapshot is created. After
- the desired result is achieved in the clone, the clone can be
- promoted to a dataset and the old file system removed. This
- is not strictly necessary, as the clone and dataset can
- coexist without problems.
-
- &prompt.root; zfs clone camino/home/joe@backupcamino/home/joenew
-&prompt.root; ls /usr/home/joe*
-/usr/home/joe:
-backup.txz plans.txt
-
-/usr/home/joenew:
-backup.txz plans.txt
-&prompt.root; df -h /usr/home
-Filesystem Size Used Avail Capacity Mounted on
-usr/home/joe 1.3G 31k 1.3G 0% /usr/home/joe
-usr/home/joenew 1.3G 31k 1.3G 0% /usr/home/joenew
-
- After a clone is created it is an exact copy of the state
- the dataset was in when the snapshot was taken. The clone can
- now be changed independently from its originating dataset.
- The only connection between the two is the snapshot.
- ZFS records this connection in the property
- origin. Once the dependency between the
- snapshot and the clone has been removed by promoting the clone
- using zfs promote, the
- origin of the clone is removed as it is now
- an independent dataset. This example demonstrates it:
-
- &prompt.root; zfs get origin camino/home/joenew
-NAME PROPERTY VALUE SOURCE
-camino/home/joenew origin camino/home/joe@backup -
-&prompt.root; zfs promote camino/home/joenew
-&prompt.root; zfs get origin camino/home/joenew
-NAME PROPERTY VALUE SOURCE
-camino/home/joenew origin - -
-
- After making some changes like copying
- loader.conf to the promoted clone, for
- example, the old directory becomes obsolete in this case.
- Instead, the promoted clone can replace it. This can be
- achieved by two consecutive commands: zfs
- destroy on the old dataset and zfs
- rename on the clone to name it like the old
- dataset (it could also get an entirely different name).
-
- &prompt.root; cp /boot/defaults/loader.conf/usr/home/joenew
-&prompt.root; zfs destroy -f camino/home/joe
-&prompt.root; zfs rename camino/home/joenewcamino/home/joe
-&prompt.root; ls /usr/home/joe
-backup.txz loader.conf plans.txt
-&prompt.root; df -h /usr/home
-Filesystem Size Used Avail Capacity Mounted on
-usr/home/joe 1.3G 128k 1.3G 0% /usr/home/joe
-
- The cloned snapshot is now handled like an ordinary
- dataset. It contains all the data from the original snapshot
- plus the files that were added to it like
- loader.conf. Clones can be used in
- different scenarios to provide useful features to ZFS users.
- For example, jails could be provided as snapshots containing
- different sets of installed applications. Users can clone
- these snapshots and add their own applications as they see
- fit. Once they are satisfied with the changes, the clones can
- be promoted to full datasets and provided to end users to work
- with like they would with a real dataset. This saves time and
- administrative overhead when providing these jails.
+ dataset while keeping the original around to fall back to in
+ case something goes wrong. Clones can also be powerful in the
+ case of databases, jails, and virtual machines, allowing the
+ administrator to create multiple nearly identical versions of
+ the original without consuming additional space. Clones can
+ be kept indefinitely, or, after the desired result is
+ achieved in the clone, the clone can be promoted to become
+ the parent dataset and the old file system can then be
+ destroyed.
+
+ Make a change to the clone, and then the parent:
+
+ &prompt.root; echo "clone message" > /var/clone/my_message
+&prompt.root; zfs list -rt all mypool/var/clone mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/clone 134K 30.5G 249K /var/clone
+mypool/var/tmp 249K 30.5G 249K /var/tmp
+mypool/var/tmp@first_snapshot 0 - 249K -
+&prompt.root; echo "new message" > /var/tmp/my_message
+&prompt.root; zfs list -rt all mypool/var/clone mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/clone 134K 30.5G 249K /var/clone
+mypool/var/tmp 383K 30.5G 249K /var/tmp
+mypool/var/tmp@first_snapshot 134K - 249K -
+
+ After a clone has been created, the snapshot it was
+ created from cannot be destroyed because the clone only
+ contains the blocks that have been modified. The child/parent
+ relationship between the clone and the snapshot can be
+ reversed using zfs promote. The snapshot
+ then becomes a child of the clone, rather than of the original
+ parent dataset. The original dataset can then be destroyed if
+ desired. The way that space is accounted changes when a clone
+ is promoted. The same amount of space is used in total, but
+ which blocks are charged to the parent and which to the child
+ changes.
+
+ The only connection between the clone and the original
+ dataset is the snapshot. ZFS records this
+ connection in the origin property. Once
+ the dependency between the clone and the original dataset has
+ been reversed by promoting the clone using
+ zfs promote, the original dataset becomes
+ the clone. The origin property on the
+ clone will then be blank, and the origin
+ property on the original dataset will now point to the
+ snapshot under the dataset that was formerly the clone.
+
+ Promote the clone:
+
+ &prompt.root; zfs list -rt all mypool/var/clone mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/clone 134K 30.5G 249K /var/clone
+mypool/var/tmp 383K 30.5G 249K /var/tmp
+mypool/var/tmp@first_snapshot 134K - 249K -
+&prompt.root; zfs get origin mypool/var/clone
+NAME PROPERTY VALUE SOURCE
+mypool/var/clone origin mypool/var/tmp@first_snapshot -
+&prompt.root; zfs promote mypool/var/clone
+&prompt.root; zfs list -rt all mypool/var/clone mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/clone 383K 30.5G 249K /var/clone
+mypool/var/clone@first_snapshot 134K - 249K -
+mypool/var/tmp 134K 30.5G 249K /var/tmp
+&prompt.root; zfs get origin mypool/var/clone
+NAME PROPERTY VALUE SOURCE
+mypool/var/clone origin - -
+&prompt.root; zfs get origin mypool/var/tmp
+NAME PROPERTY VALUE SOURCE
+mypool/var/tmp origin mypool/var/clone@first_snapshot -
+
+ After making some changes to the clone, it is now in
+ the state that the administrator wants. The old dataset has
+ become obsolete and the administrator wants to replace it
+ with the clone. After the clone is promoted, this can be
+ achieved by two additional commands: zfs
+ destroy the old dataset and zfs
+ rename the clone to the name of the old dataset.
+ The clone could also keep its original name, and only change
+ its mountpoint property instead.
+
+ &prompt.root; zfs destroy -f mypool/var/tmp
+&prompt.root; zfs rename mypool/var/clone mypool/var/tmp
+&prompt.root; zfs list -rt all mypool/var/tmp
+NAME USED AVAIL REFER MOUNTPOINT
+mypool/var/tmp 383K 30.5G 249K /var/tmp
+mypool/var/tmp@first_snapshot 134K - 249K -
+
+ The original clone is now an ordinary dataset. It
+ contains all the data from the original snapshot plus the
+ files that were added or modified. Any changes made to the
+ original dataset after the snapshot the clone was based on
+ have been destroyed. Now that no other dataset depends on
+ the snapshot, it can be destroyed as well.
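+
+ As a final step in this example, the snapshot could be
+ removed like this:
+
+&prompt.root; zfs destroy mypool/var/tmp@first_snapshot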
@@ -3041,6 +3089,55 @@
Advanced Topics
+
+ Pool Layout
+
+ Choosing which type of vdevs to use to construct a
+ pool requires deciding which factors are most important. The
+ main considerations for a pool are: redundancy, capacity, and
+ performance.
+
+ Mirrors
+ provide the best performance in terms of operations per second
+ (IOPS). With a mirror, every disk in a
+ vdev can be used to service reads, since each disk in the vdev
+ contains a complete copy of the data. Mirrors provide good
+ redundancy since each disk in a vdev contains a complete copy
+ of the data, and a mirror vdev can consist of many disks. The
+ downside to mirrors is that they provide the worst capacity,
+ since each mirror vdev, no matter how many disks it contains,
+ provides only the capacity of the smallest disk. Multiple
+ mirror vdevs can be striped together (similar to RAID-10) to
+ provide more capacity, but the usable capacity will usually be
+ less than the same number of disks in RAID-Z.
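+
+ As a brief sketch (with hypothetical pool and device
+ names), a pool of two striped mirror vdevs, four disks in
+ total, could be created like this:
+
+&prompt.root; zpool create mypool mirror /dev/ada1 /dev/ada2 mirror /dev/ada3 /dev/ada4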
+
+ RAID-Z comes in
+ a number of levels of redundancy. RAID-Z1 provides enough
+ redundancy to withstand the failure of a single disk in each
+ vdev. RAID-Z2 can withstand two concurrent failures, and Z3
+ can withstand three concurrent failures without any data loss.
+ Choosing between these levels of redundancy allows the
+ administrator to balance the level of redundancy against the
+ amount of usable capacity. Each RAID-Z vdev will provide
+ storage capacity equal to the number of disks, less the level
+ of redundancy, multiplied by the size of the smallest disk.
+ Examples of the storage calculations are provided in the
+ RAID-Z definition
+ in the terminology section. Multiple RAID-Z vdevs can be
+ striped together to create an effective RAID-50 or RAID-60
+ type array.
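+
+ As a worked example of that calculation, six 1 TB disks
+ in a single RAID-Z2 vdev yield (6 - 2) x 1 TB = 4 TB of
+ usable space. Such a pool could be created with
+ hypothetical device names like this:
+
+&prompt.root; zpool create mypool raidz2 /dev/ada1 /dev/ada2 /dev/ada3 /dev/ada4 /dev/ada5 /dev/ada6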
+
+ Using more separate vdevs will increase performance.
+ Each vdev is operated as a unit: the effective speed of an
+ individual vdev is determined by the speed of its slowest
+ device. For the best performance, the recommended layout is
+ many mirror vdevs, but this provides the worst effective
+ capacity of the possible configurations. For increased
+ redundancy, an administrator can choose between using RAID-Z2,
+ Z3, or adding more member disks to each mirror vdev.
+
+
Tuning
@@ -3182,9 +3279,16 @@
vdevs to prevent the amount of free space across the vdevs
from becoming unbalanced, which will reduce read and write
performance. This value can be adjusted at any time with
- &man.sysctl.8;.
+ &man.sysctl.8;.
+
+ In &os; releases after August 10, 2014, this
+ sysctl is deprecated. ZFS will always write to a
+ degraded vdev unless it is a non-redundant
+ vdev.
+
+
vfs.zfs.txg.timeout
- Maximum number of seconds between
- transaction groups.
+ transaction groups.
The current transaction group will be written to the pool
and a fresh transaction group started if this amount of
time has elapsed since the previous transaction group. A
@@ -3622,6 +3727,14 @@
and an array of eight 1 TB disks in
RAID-Z3 will yield 5 TB of
usable space.
+
+ For
+ optimal performance, it is best for the number of
+ non-parity drives to be a power of 2 (2, 4, or 8) so
+ that writes can be distributed evenly. The recommended
+ configurations are: RAID-Z1: 3, 5, or 9 disks;
+ RAID-Z2: 6 or 10 disks; RAID-Z3: 5, 7, or 11
+ disks.
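+ For example, in the recommended six-disk RAID-Z2
+ configuration, two of the disks hold parity and the
+ remaining four non-parity drives are a power of 2.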
@@ -4082,7 +4195,10 @@
copies feature is the only form of redundancy. The
copies feature can recover from a single bad sector or
other forms of minor corruption, but it does not protect
- the pool from the loss of an entire disk.
+ the pool from the loss of an entire disk. Each
+ additional copy of a block consumes that much more space,
+ not only in the file system but also in any snapshots
+ where that block has been modified.
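+
+ As a brief sketch (with a hypothetical dataset name), the
+ number of copies kept for every block of a dataset is
+ controlled by the copies property:
+
+&prompt.root; zfs set copies=2 mypool/important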