diff --git a/en_US.ISO8859-1/books/handbook/geom/chapter.sgml b/en_US.ISO8859-1/books/handbook/geom/chapter.sgml index 07a4482622..9551ef043c 100644 --- a/en_US.ISO8859-1/books/handbook/geom/chapter.sgml +++ b/en_US.ISO8859-1/books/handbook/geom/chapter.sgml @@ -1,771 +1,773 @@

Tom Rhodes Written by

GEOM: Modular Disk Transformation Framework

Synopsis

GEOM GEOM Disk Framework GEOM

This chapter covers the use of disks under the GEOM framework in &os;. This includes the major RAID control utilities which use the framework for configuration. This chapter does not provide an in-depth discussion of how GEOM handles or controls I/O, the underlying subsystem, or its code. That information is provided through the &man.geom.4; manual page and its various SEE ALSO references. This chapter is also not a definitive guide to RAID configurations; only GEOM-supported RAID classifications will be discussed.

After reading this chapter, you will know:

What type of RAID support is available through GEOM.

How to use the base utilities to configure, maintain, and manipulate the various RAID levels.

How to mirror, stripe, encrypt, and remotely connect disk devices through GEOM.

How to troubleshoot disks attached to the GEOM framework.

Before reading this chapter, you should:

Understand how &os; treats disk devices ().

Know how to configure and install a new &os; kernel ().

GEOM Introduction

GEOM permits access to and control of classes, such as Master Boot Records and BSD labels, through the use of providers, the special files in /dev. By supporting various software RAID configurations, GEOM transparently provides access to the operating system and operating system utilities.

Tom Rhodes Written by Murray Stokely

RAID0 - Striping

GEOM Striping

Striping is a method used to combine several disk drives into a single volume. In many cases, this is done through the use of hardware controllers. The GEOM disk subsystem provides software support for RAID0, also known as disk striping.

In a RAID0 system, data is split into blocks that are written across all the drives in the array. Instead of having to wait on the system to write 256k to one disk, a RAID0 system can simultaneously write 64k to each of four different disks, offering superior I/O performance. This performance can be enhanced further by using multiple disk controllers.

Each disk in a RAID0 stripe must be of the same size, since I/O requests are interleaved to read or write to multiple disks in parallel.

Disk Striping Illustration

Creating a stripe of unformatted ATA disks

Load the geom_stripe.ko module:

&prompt.root; kldload geom_stripe

Ensure that a suitable mount point exists. If this volume will become a root partition, then temporarily use another mount point such as /mnt:

&prompt.root; mkdir /mnt

Determine the device names for the disks which will be striped, and create the new stripe device. For example, to stripe two unused and unpartitioned ATA disks such as /dev/ad2 and /dev/ad3:

&prompt.root; gstripe label -v st0 /dev/ad2 /dev/ad3
Metadata value stored on /dev/ad2.
Metadata value stored on /dev/ad3.
Done.

Write a standard label, also known as a partition table, on the new volume and install the default bootstrap code:

&prompt.root; bsdlabel -wB /dev/stripe/st0

This process creates two other devices in the /dev/stripe directory in addition to the st0 device: st0a and st0c.
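Before creating a file system, it may be worth confirming that the stripe is up. A quick check with the standard status verb; the output shown is only illustrative of what the two-disk example above might produce:

&prompt.root; gstripe status
       Name  Status  Components
 stripe/st0      UP  ad2
                     ad3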
At this point a file system may be created on the st0a device with the newfs utility:

&prompt.root; newfs -U /dev/stripe/st0a

Many numbers will glide across the screen, and after a few seconds, the process will be complete. The volume has been created and is ready to be mounted. To manually mount the created disk stripe:

&prompt.root; mount /dev/stripe/st0a /mnt

To mount this striped file system automatically during the boot process, place the volume information in the /etc/fstab file. For this purpose, a permanent mount point, named stripe, is created:

&prompt.root; mkdir /stripe
&prompt.root; echo "/dev/stripe/st0a /stripe ufs rw 2 2" \
	>> /etc/fstab

The geom_stripe.ko module must also be automatically loaded during system initialization, by adding a line to /boot/loader.conf:

&prompt.root; echo 'geom_stripe_load="YES"' >> /boot/loader.conf

RAID1 - Mirroring

GEOM Disk Mirroring

Mirroring is a technology used by many corporations and home users to back up data without interruption. When a mirror exists, it simply means that diskB replicates diskA, or perhaps diskC+D replicates diskA+B. Regardless of the disk configuration, the important aspect is that information on one disk or partition is being replicated. Later, that information could be more easily restored, backed up without causing service or access interruption, and even physically stored in a data safe.

To begin, ensure the system has two disk drives of equal size. These exercises assume they are direct access (&man.da.4;) SCSI disks.

Mirroring Primary Disks

Assuming &os; has been installed on the first disk device, da0, &man.gmirror.8; should be told to store its primary data there.

Before building the mirror, enable additional debugging information and open access to the device by setting the kern.geom.debugflags &man.sysctl.8; variable to the following value:

&prompt.root; sysctl kern.geom.debugflags=17

Now create the mirror. Begin the process by storing meta-data information on the primary disk device, effectively creating the /dev/mirror/gm0 device, using the following command:

&prompt.root; gmirror label -vb round-robin gm0 /dev/da0

The system should respond with:

Metadata value stored on /dev/da0.
Done.

Initialize GEOM; this will load the /boot/kernel/geom_mirror.ko kernel module:

&prompt.root; gmirror load

When this command completes successfully, it creates the gm0 device node under the /dev/mirror directory.

Enable loading of the geom_mirror.ko kernel module during system initialization:

&prompt.root; echo 'geom_mirror_load="YES"' >> /boot/loader.conf

Edit the /etc/fstab file, replacing references to the old da0 with the new device nodes of the gm0 mirror device. If &man.vi.1; is your preferred editor, the following is an easy way to accomplish this task:

&prompt.root; vi /etc/fstab

In &man.vi.1;, back up the current contents of fstab by typing :w /etc/fstab.bak. Then replace all old da0 references with gm0 by typing :%s/da/mirror\/gm/g.

The resulting fstab file should look similar to the following. It does not matter if the disk drives are SCSI or ATA; the mirror device will be gm0 regardless.
# Device		Mountpoint	FStype	Options		Dump	Pass#
-/dev/mirror/gm0s2b	none		swap	sw		0	0
-/dev/mirror/gm0s2a	/		ufs	rw		1	1
-#/dev/mirror/gm0s2d	/store		ufs	rw		2	2
-/dev/mirror/gm0s2e	/usr		ufs	rw		2	2
+/dev/mirror/gm0s1b	none		swap	sw		0	0
+/dev/mirror/gm0s1a	/		ufs	rw		1	1
+/dev/mirror/gm0s1d	/usr		ufs	rw		0	0
+/dev/mirror/gm0s1f	/home		ufs	rw		2	2
+#/dev/mirror/gm0s2d	/store		ufs	rw		2	2
+/dev/mirror/gm0s1e	/var		ufs	rw		2	2
 /dev/acd0		/cdrom		cd9660	ro,noauto	0	0

Reboot the system:

&prompt.root; shutdown -r now

During system initialization, the gm0 device should be used in place of the da0 device. Once the system is fully initialized, this may be checked by visually inspecting the output from the mount command:

&prompt.root; mount
Filesystem          1K-blocks    Used    Avail Capacity  Mounted on
/dev/mirror/gm0s1a    1012974  224604   707334    24%    /
devfs                       1       1        0   100%    /dev
/dev/mirror/gm0s1f   45970182   28596 42263972     0%    /home
/dev/mirror/gm0s1d    6090094 1348356  4254532    24%    /usr
/dev/mirror/gm0s1e    3045006 2241420   559986    80%    /var
devfs                       1       1        0   100%    /var/named/dev

The output looks good, as expected. Finally, to begin synchronization, insert the da1 disk into the mirror using the following command:

&prompt.root; gmirror insert gm0 /dev/da1

As the mirror is built, the status may be checked using the following command:

&prompt.root; gmirror status

Once the mirror has been built and all current data has been synchronized, the output from the above command should look like:

      Name    Status  Components
mirror/gm0  COMPLETE  da0
                      da1

If there are any issues, or the mirror is still completing the build process, the example will show DEGRADED in place of COMPLETE.

Troubleshooting

System refuses to boot

If the system boots up to a prompt similar to:

ffs_mountroot: can't find rootvp
Root mount failed: 6
mountroot>

Reboot the machine using the power or reset button. At the boot menu, select option six (6). This will drop the system to a &man.loader.8; prompt. Load the kernel module manually:

OK? load geom_mirror
OK? boot

If this works, then for whatever reason the module was not being loaded properly. Check whether the relevant entry in /boot/loader.conf is correct. If the problem persists, place:

options	GEOM_MIRROR

in the kernel configuration file, then rebuild and reinstall the kernel. That should remedy this issue.

Recovering From Disk Failure

The wonderful part about disk mirroring is that when a disk fails, it may be replaced, presumably without losing any data.

Considering the previous RAID1 configuration, assume that da1 has failed and now needs to be replaced. To replace it, determine which disk has failed and power down the system. At this point, the disk may be swapped with a new one and the system brought back up. After the system has restarted, the following commands may be used to replace the disk:

&prompt.root; gmirror forget gm0
&prompt.root; gmirror insert gm0 /dev/da1

Use the gmirror status command to monitor the progress of the rebuild. It is that simple.

GEOM Gate Network Devices

GEOM supports the remote use of devices, such as disks, CD-ROMs, and files, through the use of the gate utilities. This is similar to NFS.

To begin, an exports file must be created. This file specifies who is permitted to access the exported resources and what level of access they are offered. For example, to export the fourth slice on the first SCSI disk, the following /etc/gg.exports is more than adequate:

192.168.1.0/24 RW /dev/da0s4d

It will allow all hosts inside the private network to access the file system on the da0s4d partition.
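Each line of the exports file names a host or network, an access level, and the resource to export; &man.ggated.8; supports RO, WO, and RW access levels. As an illustrative sketch only, an exports file offering several resources at different access levels might look like the following (the addresses and the backup image path are hypothetical):

192.168.1.0/24 RW /dev/da0s4d
192.168.1.5    RO /dev/acd0
172.16.0.0/16  RO /data/backup.img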
To export this device, ensure it is not currently mounted, and start the &man.ggated.8; server daemon:

&prompt.root; ggated

Now, to mount the device on the client machine, issue the following commands:

&prompt.root; ggatec create -o rw 192.168.1.1 /dev/da0s4d
ggate0
&prompt.root; mount /dev/ggate0 /mnt

From here on, the device may be accessed through the /mnt mount point.

It should be pointed out that this will fail if the device is currently mounted on either the server machine or any other machine on the network.

When the device is no longer needed, it may be safely unmounted with the &man.umount.8; command, similar to any other disk device.

Labeling Disk Devices

GEOM Disk Labels

During system initialization, the &os; kernel will create device nodes as devices are found. This method of probing for devices raises some issues; for instance, what if a new disk device is added via USB? It is very likely that a flash device may be handed the device name of da0 and the original da0 shifted to da1. This will cause issues mounting file systems if they are listed in /etc/fstab, and it may also prevent the system from booting.

One solution to this issue is to chain the SCSI devices in order, so a new device added to the SCSI card will be issued unused device numbers. But what about USB devices which may replace the primary SCSI disk? This happens because USB devices are usually probed before the SCSI card. One solution is to only insert these devices after the system has been booted. Another method could be to use only a single ATA drive and never list the SCSI devices in /etc/fstab.

A better solution is available. By using the glabel utility, an administrator or user may label their disk devices and use these labels in /etc/fstab. Because glabel stores the label in the last sector of a given provider, the label will remain persistent across reboots. By using this label as a device, the file system may always be mounted regardless of what device node it is accessed through.

It goes without saying that a label must be permanent for this purpose. The glabel utility may be used to create both transient and permanent labels. Only the permanent label will remain consistent across reboots. See the &man.glabel.8; manual page for more information on the differences between labels.

Label Types and Examples

There are two types of labels: a generic label and a file system label. Labels can be permanent or temporary. Permanent labels can be created with the &man.tunefs.8; or &man.newfs.8; commands. They will then be created in a sub-directory of /dev, which will be named according to their file system type. For example, UFS2 file system labels will be created in the /dev/ufs directory. Permanent labels can also be created with the glabel label command. These are not file system specific, and will be created in the /dev/label directory.

A temporary label will go away with the next reboot. These labels will be created in the /dev/label directory and are perfect for experimentation. A temporary label can be created using the glabel create command. For more information, please read the manual page of &man.glabel.8;.

To create a permanent label for a UFS2 file system without destroying any data, issue the following command:

&prompt.root; tunefs -L home /dev/da3

If the file system is full, this may cause data corruption; however, if the file system is full then the main goal should be removing stale files and not adding labels.
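To confirm that the label was applied, list the labels the system currently sees. The output below is only a sketch of what this example might produce:

&prompt.root; glabel status
      Name  Status  Components
  ufs/home     N/A  da3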
A label should now exist in /dev/ufs which may be added to /etc/fstab:

/dev/ufs/home		/home		ufs	rw	2	2

The file system must not be mounted while attempting to run tunefs.

Now the file system may be mounted as usual:

&prompt.root; mount /home

From this point on, so long as the geom_label.ko kernel module is loaded at boot with /boot/loader.conf or the GEOM_LABEL kernel option is present, the device node may change without any ill effect on the system.

File systems may also be created with a default label by using the -L flag with newfs. See the &man.newfs.8; manual page for more information.

The following command can be used to destroy the label:

&prompt.root; glabel destroy home

The following example shows how to label the partitions of a boot disk.

Labeling Partitions on the Boot Disk

By permanently labeling the partitions on the boot disk, the system should be able to continue to boot normally, even if the disk is moved to another controller or transferred to a different system. For this example, it is assumed that a single ATA disk is used, which is currently recognized by the system as ad0. It is also assumed that the standard &os; partition scheme is used, with /, /var, /usr and /tmp file systems, as well as a swap partition.

Reboot the system, and at the &man.loader.8; prompt, press 4 to boot into single user mode. Then enter the following commands:

&prompt.root; glabel label rootfs /dev/ad0s1a
GEOM_LABEL: Label for provider /dev/ad0s1a is label/rootfs
&prompt.root; glabel label var /dev/ad0s1d
GEOM_LABEL: Label for provider /dev/ad0s1d is label/var
&prompt.root; glabel label usr /dev/ad0s1f
GEOM_LABEL: Label for provider /dev/ad0s1f is label/usr
&prompt.root; glabel label tmp /dev/ad0s1e
GEOM_LABEL: Label for provider /dev/ad0s1e is label/tmp
&prompt.root; glabel label swap /dev/ad0s1b
GEOM_LABEL: Label for provider /dev/ad0s1b is label/swap
&prompt.root; exit

The system will continue with multi-user boot. After the boot completes, edit /etc/fstab and replace the conventional device names with their respective labels. The final /etc/fstab file will look like the following:

# Device		Mountpoint	FStype	Options	Dump	Pass#
/dev/label/swap		none		swap	sw	0	0
/dev/label/rootfs	/		ufs	rw	1	1
/dev/label/tmp		/tmp		ufs	rw	2	2
/dev/label/usr		/usr		ufs	rw	2	2
/dev/label/var		/var		ufs	rw	2	2

The system can now be rebooted. If everything went well, it will come up normally and mount will show:

&prompt.root; mount
/dev/label/rootfs on / (ufs, local)
devfs on /dev (devfs, local)
/dev/label/tmp on /tmp (ufs, local, soft-updates)
/dev/label/usr on /usr (ufs, local, soft-updates)
/dev/label/var on /var (ufs, local, soft-updates)

UFS Journaling Through GEOM

GEOM Journaling

With the release of &os; 7.0, the long-awaited feature of UFS journaling has been implemented. The implementation itself is provided through the GEOM subsystem and is easily configured via the &man.gjournal.8; utility.

What is journaling? Journaling stores a log of file system transactions, i.e., changes that make up a complete disk write operation, before meta-data and file writes are committed to the disk proper. This transaction log can later be replayed to redo file system transactions, preventing file system inconsistencies. This method is yet another mechanism to protect against data loss and inconsistencies of the file system.
Unlike Soft Updates, which tracks and enforces meta-data updates, and snapshots, which create an image of the file system, an actual log is stored in disk space specifically reserved for this task, and in some cases may be stored on another disk entirely.

Unlike other file system journaling implementations, the gjournal method is block based and is not implemented as part of the file system; it is only a GEOM extension.

To enable support for gjournal, the &os; kernel must have the following option, which is the default on 7.X systems:

options	UFS_GJOURNAL

If journaled volumes need to be mounted during startup, the geom_journal.ko kernel module will also have to be loaded, by adding the following line in /boot/loader.conf:

geom_journal_load="YES"

Alternatively, this function can also be built into a custom kernel, by adding the following line in the kernel configuration file:

options	GEOM_JOURNAL

Creating a journal on a free file system may now be done using the following steps, considering that da4 is a new SCSI disk:

&prompt.root; gjournal label /dev/da4
&prompt.root; gjournal load

At this point, there should be a /dev/da4 device node and a /dev/da4.journal device node. A file system may now be created on this device:

&prompt.root; newfs -O 2 -J /dev/da4.journal

The previous command will create a UFS2 file system with journaling active. Mount the device at the desired point with:

&prompt.root; mount /dev/da4.journal /mnt

In the case of several slices, a journal will be created for each individual slice. For instance, if ad4s1 and ad4s2 are both slices, then gjournal will create ad4s1.journal and ad4s2.journal. Should the command be run twice, the result will be nested journals such as ad4s1.journal.journal.

Under some circumstances, keeping the journal on another disk may be desired. For these cases, the journal provider or storage device should be listed after the device on which to enable journaling.

Journaling may also be enabled on current file systems by using tunefs; however, always make a backup before attempting to alter a file system. In most cases, gjournal will fail if it is unable to create the actual journal, but this does not protect against data loss incurred as a result of misusing tunefs.

It is also possible to journal the boot disk of a &os; system. Please refer to the article Implementing UFS Journaling on a Desktop PC for detailed instructions on this task.
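As a rough sketch of the tunefs route mentioned above, assuming an existing file system on a hypothetical da4s1d partition and a current backup: the provider is labeled first, the journal flag is enabled, and soft updates are switched off, since the journal already provides similar consistency guarantees. This is an illustration of the general pattern, not a definitive procedure; consult &man.gjournal.8; and &man.tunefs.8; before altering real data.

&prompt.root; umount /dev/da4s1d
&prompt.root; gjournal label da4s1d
&prompt.root; tunefs -J enable -n disable da4s1d.journal
&prompt.root; mount -o async /dev/da4s1d.journal /mnt

The async mount option is commonly paired with gjournal, as the journal itself is what guarantees on-disk consistency.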