Characteristics of Two Spindles Organized with Vinum

  Organization                 Total Capacity                             Failure Resilient   Peak Read Performance   Peak Write Performance
  Concatenated Plexes          Unchanged, but appears as a single drive   No                  Unchanged               Unchanged
  Striped Plexes (RAID-0)      Unchanged, but appears as a single drive   No                  2x                      2x
  Mirrored Volumes (RAID-1)    1/2, appearing as a single drive           Yes                 2x                      Unchanged

The table above shows that striping yields
the same capacity and lack of failure resilience
as concatenation, but it has better peak read and write performance.
Hence we will not be using concatenation in any of the examples here.
Mirrored volumes provide the benefits of improved peak read performance
and failure resilience--but this comes at a loss in capacity.

Both concatenation and striping bring their benefits over a
single spindle at the cost of increased likelihood of failure since
more than one spindle is now involved.

When three or more spindles are present,
Vinum also supports rotated,
block-interleaved parity (also called RAID-5)
that provides better
capacity than mirroring (but not quite as good as striping), better
read performance than both mirroring and striping,
and good failure resilience.
There is, however,
a substantial decrease in write performance with RAID-5.
Most of the benefits become more pronounced with five or more
spindles.
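As a rough illustration (not part of the configurations used later in this article), a RAID-5 volume could be described to vinum create along these lines, assuming three Vinum drives with the hypothetical names d1, d2, and d3 have already been defined; stripe size and the initialization of the parity before use are covered in &man.vinum.8;:

volume safe
plex name safe.p0 org raid5 512k volume safe
sd name safe.p0.s0 drive d1 plex safe.p0 len 0
sd name safe.p0.s1 drive d2 plex safe.p0 len 0
sd name safe.p0.s2 drive d3 plex safe.p0 len 0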
The organizations described above may be combined to provide benefits that no single organization can match.
For example, mirroring and striping can be combined to provide
failure-resilience with very fast read performance.
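A sketch of that idea for vinum create, assuming four equally sized Vinum drives with the hypothetical names d1 through d4 (again, an illustration only, not a configuration used later in this article):

volume mirstripe
plex name mirstripe.p0 org striped 512k volume mirstripe
sd name mirstripe.p0.s0 drive d1 plex mirstripe.p0 len 0
sd name mirstripe.p0.s1 drive d3 plex mirstripe.p0 len 0
plex name mirstripe.p1 org striped 512k volume mirstripe
sd name mirstripe.p1.s0 drive d2 plex mirstripe.p1 len 0
sd name mirstripe.p1.s1 drive d4 plex mirstripe.p1 len 0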
Vinum History

Vinum is a standard part of even a "minimum" FreeBSD distribution and
it has been standard since 3.0-RELEASE.
The official pronunciation of the name is
VEE-noom.

&vinum.ap; was inspired by the Veritas Volume Manager, but
was not derived from it.
The name is a play on that history and the Latin adage
In Vino Veritas
(Vino is the ablative form of
Vinum).
Literally translated, that is Truth lies in wine hinting that
drunkards have a hard time lying.
I have been using it in production on six different servers for
over two years with no data loss.
Like the rest of FreeBSD, Vinum
provides rock-stable performance.
(On a personal note, I have seen Vinum
panic when I misconfigured something, but I have
never had any trouble in normal operation.)
Greg Lehey wrote
Vinum for FreeBSD,
but he is seeking
help in porting it to NetBSD and OpenBSD.

Just like the rest of FreeBSD, Vinum
is undergoing continuous
development.
Several subtle, but significant bugs have been fixed in recent
releases.
It is always best to use the most recent code base that meets your
stability requirements.

Vinum Deployment Strategy

Vinum,
coupled with prudent partition management, lets you
keep warm-spare spindles on-line so that failures
are transparent to users. Failed spindles can be replaced
during regular maintenance periods or whenever it is convenient.
When all spindles are working, the server benefits from increased
performance and capacity.

Having redundant copies of your home directory does not
help you if the spindle holding root,
/usr, or swap fails on your server.
Hence I focus here on building a simple
foundation for a failure-resilient server covering the root,
/usr,
/home, and swap partitions.

Vinum
mirroring does not remove the need for making backups!
Mirroring cannot help you recover from site disasters
or the dreaded
rm -r -f / command.
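For example, a periodic dump of the mirrored filesystems to off-line media still makes sense. A minimal sketch, assuming a tape drive at the hypothetical device /dev/nsa0:

&prompt.root; dump 0uaf /dev/nsa0 /home
&prompt.root; dump 0uaf /dev/nsa0 /usr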
Why Bootstrap Vinum?

It is possible to add Vinum to a server configuration after
it is already in production use, but this is much harder than
designing for it from the start. Ironically,
Vinum is not supported by
/stand/sysinstall
and hence you cannot install
/usr right onto a
Vinum volume.

Vinum currently does not support the root filesystem (this feature is in development).

Hence it is a bit
tricky to get started using
Vinum, but these instructions
take you though the process of planning for
Vinum, installing FreeBSD
without it, and then beginning to use it.

I have come to call this whole process bootstrapping Vinum.
That is, the process of getting Vinum
initially installed
and operating to the point where you have met your resilience
or performance goals. My purpose here is to document a
Vinum
bootstrapping method that I have found works well for me.

Vinum Benefits

The server foundation scenario I have chosen here allows me
to show you examples of configuring for resilience on
/usr and
/home.
Yet Vinum
provides benefits other than resilience--namely
performance, capacity, and manageability.
It can significantly improve disk performance (especially
under multi-user loads).
Vinum
can easily concatenate many smaller disks to produce the
illusion of a single larger disk (but my server foundation
scenario does not allow me to illustrate these benefits here).

For servers with many spindles, Vinum
provides substantial
benefits in volume management, particularly when coupled with
hot-pluggable hardware. Data can be moved from spindle to
spindle while the system is running without loss of production
time. Again, details of this will not be given here, but once
you get your feet wet with Vinum,
other documentation will help you do things like this.
See
"The Vinum
Volume Manager" for a technical introduction to
Vinum,
&man.vinum.8; for a description of the vinum
command, and
&man.vinum.4;
for a description of the vinum device
driver and the way Vinum
objects are named.

Breaking up your disk space into smaller and smaller partitions
has the benefit of allowing you to tune for the most common
type of access and tends to keep disk hogs within their pens.
However it also causes some loss in total available disk space
due to fragmentation.

Server Operation in Degraded Mode

Some disk failures in this two-spindle scenario will result in
Vinum
automatically routing
all disk I/O to the remaining good spindle.
Others will require brief manual intervention on the console
to configure the server for degraded mode operation and a quick reboot.
Other than actual hardware repairs, most recovery work
can be done while the server is running in multi-user degraded
mode so there is as little production impact
from failures as possible.

I give the instructions needed to
configure the server for degraded mode operation
in those cases where Vinum
cannot do it automatically.
I also give the instructions needed to
return to normal operation once the failed hardware is repaired.
You might call these instructions Vinum
failure recovery techniques.

I recommend practicing these instructions
by recovering from simulated failures.
For each failure scenario, I also give tips below for simulating
a failure even when your hardware is working well.
Even a minimum Vinum
system as described below can be a good place to experiment with recovery techniques without impacting production equipment.

Hardware RAID vs. Vinum (Software RAID)

Manual intervention is sometimes required to configure a server for
degraded mode because
Vinum
is implemented in software that runs after the FreeBSD
kernel is loaded. One disadvantage of such
software RAID
solutions is that there is nothing that can be done to hide spindle
failures from the BIOS or the FreeBSD boot sequence. Hence
the manual reconfiguration of the server
for degraded operation mentioned
above just informs the BIOS and boot sequence of failed
spindles.
Hardware RAID solutions generally have an
advantage in that they require no such reconfiguration since
spindle failures are hidden from the BIOS and boot sequence.Hardware RAID, however, may have some disadvantages that can
be significant in some cases:
The hardware RAID controller itself may become a single
point of failure for the system.
The data is usually kept in a proprietary
format so that a disk drive cannot be simply plugged
into another main board and booted.
You often cannot mix and
match drives with different sizes and interfaces.
You are often limited to the number of drives supported by the
hardware RAID controller (often only four or eight).
In other words, &vinum.ap; may offer advantages in that
there is no single point of failure,
the drives can boot on most any main board, and
you are free to mix and match as many drives using
whatever interface you choose.Keep your kernel fairly generic (or at least keep
/kernel.GENERIC around).
This will improve the chances that you can come back up on
foreign hardware more quickly.

The pros and cons discussed above suggest
that the root filesystem and swap partition are good
candidates for hardware RAID if available.
This is especially true for servers where it is difficult for
administrators to get console access (recall that this is sometimes
required to configure a server for degraded mode operation).
A server with only software RAID is well suited to office and home
environments where an administrator can be close at hand.

A common myth is that hardware RAID is always faster
than software RAID.
Since it runs on the host CPU, Vinum
often has more CPU power and memory available than a
dedicated RAID controller would have.
If performance is a prime concern, it is best to benchmark
your application running on your CPU with your spindles using
both hardware and software RAID systems before making
a decision.
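As a crude starting point only, you could compare raw sequential read throughput of the same volume under each RAID implementation; this is a sketch and is no substitute for benchmarking your real application workload:

&prompt.root; dd if=/dev/vinum/usr of=/dev/null bs=1m count=1024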
Hardware for Vinum

These instructions may be timely since commodity PC hardware can now easily host several hundred gigabytes of reasonably
high-performance disk space at a low price. Many disk
drive manufacturers now sell 7,200 RPM disk drives with quite
low seek times and high transfer rates through ATA-100
interfaces, all at very attractive prices. Four such drives,
attached to a suitable main board and configured with
Vinum
and prudent partitioning, yields a failure-resilient, high
performance disk server at a very reasonable cost.

However, you can indeed get started with
Vinum very simply.
A minimum system can be as simple as
an old CPU (even a 486 is fine) and a pair of drives
that are 500 MB or more. They need not be the same size or
even use the same interface (i.e., it is fine to mix ATAPI and
SCSI). So get busy and give this a try today! You will have
the foundation of a failure-resilient server running in an
hour or so!

Bootstrapping Phases

Greg Lehey suggested this bootstrapping method.
It uses knowledge of how Vinum
internally allocates disk space to avoid copying data.
Instead, Vinum
objects are configured so that they occupy the
same disk space where /stand/sysinstall built
filesystems.
The filesystems are thus embedded within
Vinum objects without copying.

There are several distinct phases to the
Vinum bootstrapping
procedure. Each of these phases is presented in a separate section below.
The section starts with a general overview of the phase and its goals.
It then gives example steps for the two-spindle scenario
presented here and advice on how to adapt them for your server.
(If you are reading for a general understanding
of Vinum
bootstrapping, the example sections for each phase
can safely be skipped.)
The remainder of this section gives
an overview of the entire bootstrapping process.

Phase 1 involves planning and preparation.
We will balance requirements
for the server against available resources and make design
tradeoffs.
We will plan the transition from no
Vinum to
Vinum
on just one spindle, to Vinum
on two spindles.

In phase 2, we will install a minimum FreeBSD system on a
single spindle using partitions of type
4.2BSD (regular UFS filesystems).

Phase 3 will embed the non-root filesystems from phase 2 in
Vinum objects.
Note that Vinum will be up and
running at this point,
but it cannot yet provide any resilience since it only has
one spindle on which to store data.

Finally, in phase 4, we configure Vinum
on a second spindle and make a backup copy of the root filesystem.
This will give us resilience on all filesystems.

Bootstrapping Phase 1: Planning and Preparation

Our goal in this phase is to define the different partitions
we will need and examine their requirements.
We will also look at available disk drives and controllers and allocate
partitions to them.
Finally, we will determine the size of
each partition and its use during the bootstrapping process.
After this planning is complete, we can optionally prepare to use some
tools that will make bootstrapping Vinum
easier.

Several key questions must be answered in this planning phase:

- What filesystems and partitions will be needed?
- How will they be used?
- How will we name each spindle?
- How will the partitions be ordered for each spindle?
- How will partitions be assigned to the spindles?
- How will partitions be configured? Resilience or performance?
- What technique will be used to achieve resilience?
- What spindles will be used?
- How will they be configured on the available controllers?
- How much space is required for each partition?

Phase 1 Example

In this example, I will assume a scenario
where we are building
a minimal foundation for a failure-resilient server.
Hence we will need at least root,
/usr,
/home,
and swap partitions.
The root,
/usr, and
/home filesystems all need resilience since the
server will not be much good without them.
The swap partition needs performance first and
generally does
not need resilience since nothing it holds needs to be retained
across a reboot.

Spindle Naming

The kernel would refer to the master spindle on
the primary and secondary ATA controllers as
/dev/ad0 and
/dev/ad2 respectively.
This assumes that you have not removed the line
options ATA_STATIC_ID
from your kernel configuration.
But Vinum
also needs to have a name for each spindle
that will stay the same name regardless
of how it is attached to the CPU (i.e., if the drive moves, the
Vinum name moves with the drive).

Some recovery techniques documented below suggest
moving a spindle from
the secondary ATA controller to the primary ATA controller.
(Indeed, the flexibility of making such moves is a key benefit
of Vinum
especially if you are managing a large number of spindles.)
After such a drive/controller swap,
the kernel will see what used to be
/dev/ad2 as
/dev/ad0
but Vinum
will still call
it by whatever name it had when it was attached to
/dev/ad2
(i.e., when it was created or first made known to
Vinum).

Since connections can change, it is best to give
each spindle a unique, abstract
name that gives no hint of how it is attached.
Avoid names that suggest a manufacturer, model number,
physical location, or membership in a sequence
(e.g. avoid names like
upper, lower, etc.,
alpha, beta, etc.,
SCSI1, SCSI2, etc., or
Seagate1, Seagate2 etc.).
Such names are likely to lose their uniqueness or
get out of sequence
someday even if they seem like great names today.

Once you have picked names for your spindles,
label them with a permanent marker.
If you have hot-swappable hardware, write the names on the sleds
in which the spindles are mounted.
This will significantly reduce the likelihood of
error when you are moving spindles around later as
part of failure recovery or routine system management
procedures.

In the instructions that follow,
Vinum
will name the root spindle YouCrazy
and the rootback spindle UpWindow.
I will only use /dev/ad0
when I want to refer to whichever
of the two spindles is currently attached as
/dev/ad0.

Partition Ordering

Modern disk drives operate with fairly uniform areal
density across the surface of the disk.
That implies that more data is available under the heads without
seeking on the outer cylinders than on the inner cylinders.
We will allocate partitions most critical to system performance
from these outer cylinders as
/stand/sysinstall generally does.

The root filesystem is traditionally the outermost, even though
it generally is not as critical to system performance as others.
(However root can have a larger impact on performance if it contains
/tmp and /var as it
does in this example.)
The FreeBSD boot loaders assume that the
root filesystem lives in the a partition.
There is no requirement that the a
partition start on the outermost cylinders, but this
convention makes it easier to manage disk labels.

Swap performance is critical so it comes next on our way toward
the center.
I/O operations here tend to be large and contiguous.
Having as much data under the heads as possible avoids seeking
while swapping.

With all the smaller partitions out of the way, we finish
up the disk with
/home and
/usr.
Access patterns here tend not to be as intense as for other
filesystems (especially if there is an abundant supply of RAM
and read cache hit rates are high).

If the pair of spindles you have is large enough to allow
for more than
/home and
/usr,
it is fine to plan for additional filesystems here.

Assigning Partitions to Spindles

We will want to assign
partitions to these spindles so that either can fail
without loss of data on filesystems configured for
resilience.

Reliability on
/usr and
/home
is best achieved using Vinum
mirroring.
Resilience will have to come differently, however, for the root
filesystem since Vinum
is not a part of the FreeBSD boot sequence.
Here we will have to settle for two identical
partitions with a periodic copy from the primary to the
backup secondary.

The kernel already has support for interleaved swap across
all available partitions so there is no need for help from
Vinum here.
/stand/sysinstall
will automatically configure /etc/fstab
for all swap partitions given.
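For the two spindles used in this example, the relevant /etc/fstab lines would look roughly like this (options and ordering may differ slightly on your system):

/dev/ad0s1b    none    swap    sw    0    0
/dev/ad2s1b    none    swap    sw    0    0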
The &vinum.ap; bootstrapping method given below requires a pair of spindles that I will call the
root spindle and the
rootback spindle.

The rootback spindle must be the same size or larger than the root spindle.

These instructions first allocate all space on the root
spindle and then allocate exactly that amount of space on
a rootback spindle.
(After &vinum.ap; is bootstrapped, there is nothing special
about either of these spindles--they are interchangeable.)
You can later use the remaining space on the rootback spindle for
other filesystems.

If you have more than two spindles, the
bootvinum Perl script and the procedure
below will help you initialize them for use with &vinum.ap;.
However you will have to figure out how to assign partitions
to them on your own.

Assigning Space to Partitions

For this example, I will use two spindles: one with
4,124,673 blocks (about 2 GB) on /dev/ad0
and one with 8,420,769 blocks (about 4 GB) on
/dev/ad2.

It is best to configure your two spindles on separate
controllers so that both can operate in parallel and
so that you will have failure resilience in case a
controller dies.
Note that mirrored volume write performance will be halved
in cases where both spindles share a controller that requires
they operate serially (as is often the case with ATA controllers).
One spindle will be the master on the primary ATA
controller and the other will be the master on the
secondary ATA controller.

Recall that we will be allocating space on the smaller spindle first and the larger spindle second.

Assigning Partitions on the Root Spindle

We will allocate 200,000 blocks (about 93 MB)
for a root filesystem on each spindle
(/dev/ad0s1a and
/dev/ad2s1a).
We will initially allocate 200,265 blocks for a swap partition
on each spindle,
giving a total of about 186 MB of
swap space (/dev/ad0s1b and
/dev/ad2s1b).

We will lose 265 blocks from each swap partition
as part of the bootstrapping process.
This is the size of the space used by
Vinum to store configuration
information.
The space will be taken from swap and given to a vinum
partition but will be unavailable for
Vinum subdisks.

I have done the partition allocation in nice round
numbers of blocks just to emphasize where the 265 blocks go.
There is nothing wrong with allocating space in MB if that is
more convenient for you.

This leaves 4,124,673 - 200,000 - 200,265 = 3,724,408 blocks
(about 1,818 MB) on the root spindle for
Vinum
partitions (/dev/ad0s1e and
/dev/ad0s1f).
From this, allocate the 265 blocks for
Vinum configuration information,
1,000,000 blocks (about 488 MB)
for /home, and the remaining
2,724,408 blocks (about 1,330 MB) for
/usr.
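Putting these numbers together, the disk label planned for the root spindle looks roughly like this before Vinum is installed (fragment and block size columns omitted; the comments are mine):

#        size   offset    fstype
  a:   200000        0    4.2BSD   # root
  b:   200265   200000      swap
  c:  4124673        0    unused   # entire slice
  e:  1000000   400265    4.2BSD   # /home
  f:  2724408  1400265    4.2BSD   # /usr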
See the figure below to see this graphically.

The left-hand side of
below shows what spindle ad0 will
look like at the end of phase 2.
The right-hand side shows what it will look like at the
end of phase 3.

Spindle ad0 Before and After Vinum

   ad0 Before Vinum          Offset (blocks)        ad0 After Vinum
+----------------------+ <-- 0--> +----------------------+
| root | | root |
| /dev/ad0s1a | | /dev/ad0s1a |
+----------------------+ <-- 200000--> +----------------------+
| swap | | swap |
| /dev/ad0s1b | | /dev/ad0s1b |
| | 400000--> +----------------------+
| | | Vinum drive YouCrazy |
| | | /dev/ad0s1h |
+----------------------+ <-- 400265--> +-----------------+ |
| /home | | Vinum sd | |
| /dev/ad0s1e | | home.p0.s0 | |
+----------------------+ <--1400265--> +-----------------+ |
| /usr | | Vinum sd | |
| /dev/ad0s1f | | usr.p0.s0 | |
+----------------------+ <--4124673--> +-----------------+----+
                               Not to scale

Assigning Partitions on the Rootback Spindle

The /rootback and swap partition sizes
on the rootback spindle must
match the root and swap partition sizes on the root spindle.
That leaves 8,420,769 - 200,000 - 200,265 = 8,020,504
blocks for the Vinum partition.
Mirrors of /home and
/usr receive the same allocation as on
the root spindle.
That will leave an extra 2 GB or so that we can deal
with later.
See the figure below to see this graphically.

The left-hand side of
below shows what spindle ad2 will
look like at the beginning of phase 4.
The right-hand side shows what it will look like at the end.

Spindle ad2 Before and After Vinum

   ad2 Before Vinum          Offset (blocks)        ad2 After Vinum
+----------------------+ <-- 0--> +----------------------+
| /rootback | | /rootback |
| /dev/ad2s1e | | /dev/ad2s1a |
+----------------------+ <-- 200000--> +----------------------+
| swap | | swap |
| /dev/ad2s1b | | /dev/ad2s1b |
| | 400000--> +----------------------+
| | | Vinum drive UpWindow |
| | | /dev/ad2s1h |
+----------------------+ <-- 400265--> +-----------------+ |
| /NOFUTURE | | Vinum sd | |
| /dev/ad2s1f | | home.p1.s0 | |
| | 1400265--> +-----------------+ |
| | | Vinum sd | |
| | | usr.p1.s0 | |
| | 4124673--> +-----------------+ |
| | | Vinum sd | |
| | | hope.p0.s0 | |
+----------------------+ <--8420769--> +-----------------+----+
                               Not to scale

Preparation of Tools

The bootvinum Perl script given below
will make the
Vinum bootstrapping process much
easier if you can run it on the machine being bootstrapped.
It is over 200 lines and you would not want to type it in.
At this point, I recommend that you
copy it to a floppy or arrange some
alternative method of making it readily available
so that it can be available later when needed.
For example:

&prompt.root; fdformat -f 1440 /dev/fd0
&prompt.root; newfs_msdos -f 1440 /dev/fd0
&prompt.root; mount_msdos /dev/fd0 /mnt
&prompt.root; cp /usr/share/examples/vinum/bootvinum /mnt

XXX Someday, I would like this script to live in
/usr/share/examples/vinum.
Till then, please use this
link
to get a copy.

Bootstrapping Phase 2: Minimal OS Installation

Our goal in this phase is to complete the smallest possible
FreeBSD installation in such a way that we can later install
Vinum.
We will use only
partitions of type 4.2BSD (i.e., regular UFS file
systems) since that is the only type supported by
/stand/sysinstall.

Phase 2 Example

Start up the FreeBSD installation process by running /stand/sysinstall from installation media as you normally would.

Fdisk partition all spindles as needed. Make sure to select BootMgr for all spindles.

Partition the root spindle with appropriate block allocations as described above.
For this example on a 2 GB spindle, I will use
200,000 blocks for root, 200,265 blocks for swap,
1,000,000 blocks for /home, and
the rest of the spindle (2,724,408 blocks) for
/usr.
(/stand/sysinstall
should automatically assign these to
/dev/ad0s1a,
/dev/ad0s1b,
/dev/ad0s1e, and
/dev/ad0s1f
by default.)

If you prefer Soft Updates as I do and you are
using 4.4-RELEASE or better, this is a good time to enable
them.

Partition the rootback spindle with the appropriate block allocations as described above.
For this example on a 4 GB spindle, I will use
200,000 blocks for /rootback,
200,265 blocks for swap, and
the rest of the spindle (8,020,504 blocks) for
/NOFUTURE.
(/stand/sysinstall
should automatically assign these to
/dev/ad2s1e,
/dev/ad2s1b, and
/dev/ad2s1f by default.)

We do not really want to have a
/NOFUTURE UFS filesystem (we
want a vinum partition instead), but that is the
best choice we have for the space given the limitations of
/stand/sysinstall.
Mount point names beginning with NOFUTURE
and rootback
serve as sentinels to the bootstrapping
script presented below.

Partition any other spindles with swap if desired and a single /NOFUTURExx filesystem.

Select a minimum system install for now even if you want to end up with more distributions loaded later.

Do not worry about system configuration options at this point--get Vinum set up and get the partitions in the right places first.

Exit /stand/sysinstall and reboot.
Do a quick test to verify that the minimum
installation was successful.

The left-hand sides of the two figures above show how the disks will look at this point.

Bootstrapping Phase 3: Root Spindle Setup

Our goal in this phase is to get Vinum
set up and running on the
root spindle.
We will embed the existing
/usr and
/home filesystems in a
Vinum partition.
Note that the Vinum
volumes created will not yet be
failure-resilient since we have
only one underlying Vinum
drive to hold them.
The resulting system will automatically start
Vinum as it boots to multi-user mode.

Phase 3 Example

Login as root.

We will need a directory in the root filesystem in
which to keep a few files that will be used in the
Vinum
bootstrapping process.

&prompt.root; mkdir /bootvinum
&prompt.root; cd /bootvinum

Several files need to be prepared for use in bootstrapping.
I have written a Perl script that makes all the required
files for you.
Copy this script to /bootvinum by
floppy disk, tape, network, or any convenient means and
then run it.
(If you cannot get this script copied onto the machine being
bootstrapped, then see below for a manual alternative.)

&prompt.root; cp /mnt/bootvinum .
&prompt.root; ./bootvinum

bootvinum produces no output
when run successfully.
If you get any errors,
something may have gone wrong when you were creating
partitions with
/stand/sysinstall above.

Running bootvinum will:

- Create /etc/fstab.vinum based on what it finds in your existing /etc/fstab
- Create new disk labels for each spindle mentioned in /etc/fstab and keep copies of the current disk labels
- Create files needed as input to vinum for building Vinum objects on each spindle
- Create many alternates to /etc/fstab.vinum that might come in handy should a spindle fail

You may want to take a look at these files to learn more
about the disk partitioning required for
Vinum or to learn more about the
commands needed to create
Vinum objects.

We now need to install new spindle partitioning for
/dev/ad0.
This requires that
/dev/ad0s1b not be in use for
swapping so we have to reboot in single-user mode.

First, reboot the system.

&prompt.root; reboot

Next, enter single-user mode.

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [kernel] in 8 seconds...
Type '?' for a list of commands, 'help' for more detailed help.
ok boot -s

In single-user mode, install the new partitioning
created above.&prompt.root; cd /bootvinum
&prompt.root; disklabel -R ad0s1 disklabel.ad0s1
&prompt.root; disklabel -R ad2s1 disklabel.ad2s1

If you have additional spindles, repeat the above commands as appropriate for them.

We are about to start Vinum
for the first time.
It is going to want to create several device nodes under
/dev/vinum so we will need to mount the
root filesystem for read/write access.

&prompt.root; fsck -p /
&prompt.root; mount /

Now it is time to create the Vinum
objects that
will embed the existing non-root filesystems on
the root spindle in a
Vinum partition.
This will load the Vinum
kernel module and start Vinum
as a side effect.

&prompt.root; vinum create create.YouCrazy

You should see a list of Vinum objects created that looks like the following:

1 drives:
D YouCrazy State: up Device /dev/ad0s1h Avail: 0/1818 MB (0%)
2 volumes:
V home State: up Plexes: 1 Size: 488 MB
V usr State: up Plexes: 1 Size: 1330 MB
2 plexes:
P home.p0 C State: up Subdisks: 1 Size: 488 MB
P usr.p0 C State: up Subdisks: 1 Size: 1330 MB
2 subdisks:
S home.p0.s0 State: up PO: 0 B Size: 488 MB
S usr.p0.s0 State: up PO: 0 B Size: 1330 MB
You should also see several kernel messages
which state that the Vinum
objects you have created are now up.

Our non-root filesystems should now be embedded in a
Vinum partition and
hence available through Vinum
volumes.
It is important to test that this embedding worked.

&prompt.root; fsck -n /dev/vinum/home
&prompt.root; fsck -n /dev/vinum/usr

This should produce no errors.
If it does produce errors do not fix them.
Instead, go back and examine the root spindle partition tables
before and after Vinum
to see if you can spot the error.
You can back out the partition table changes by using
disklabel -R with the
disklabel.*.b4vinum files.

While we have the root filesystem mounted read/write, this is a good time to install /etc/fstab.

&prompt.root; mv /etc/fstab /etc/fstab.b4vinum
&prompt.root; cp /etc/fstab.vinum /etc/fstab

We are now done with tasks requiring single-user mode, so it is safe to go multi-user from here on.

&prompt.root; ^D

Login as root.

Edit /etc/rc.conf and add this line:

start_vinum="YES"

Bootstrapping Phase 4: Rootback Spindle Setup

Our goal in this phase is to get redundant copies of all data
from the root spindle to the rootback spindle.
We will first create the necessary Vinum
objects on the rootback spindle.
Then we will ask Vinum
to copy the data from the root spindle to the
rootback spindle.
Finally, we use dump and restore
to copy the root filesystem.

Phase 4 Example

Now that Vinum
is running on the root spindle, we can bring
it up on the rootback spindle so that our
Vinum volumes can become
failure-resilient.

&prompt.root; cd /bootvinum
&prompt.root; vinum create create.UpWindow

You should see a list of Vinum objects created that looks like the following:

2 drives:
D YouCrazy State: up Device /dev/ad0s1h Avail: 0/1818 MB (0%)
D UpWindow State: up Device /dev/ad2s1h Avail: 2096/3915 MB (53%)
2 volumes:
V home State: up Plexes: 2 Size: 488 MB
V usr State: up Plexes: 2 Size: 1330 MB
4 plexes:
P home.p0 C State: up Subdisks: 1 Size: 488 MB
P usr.p0 C State: up Subdisks: 1 Size: 1330 MB
P home.p1 C State: faulty Subdisks: 1 Size: 488 MB
P usr.p1 C State: faulty Subdisks: 1 Size: 1330 MB
4 subdisks:
S home.p0.s0 State: up PO: 0 B Size: 488 MB
S usr.p0.s0 State: up PO: 0 B Size: 1330 MB
S home.p1.s0 State: stale PO: 0 B Size: 488 MB
S usr.p1.s0 State: stale PO: 0 B Size: 1330 MB

You should also see several kernel messages which state that some of the Vinum objects you have created are now up while others are faulty or stale.

Now we ask Vinum
to copy each of the subdisks on drive
YouCrazy to drive UpWindow.
This will change the state of the newly created
Vinum subdisks
from stale to up.
It will also change the state of the newly created
Vinum plexes
from faulty to up.

First, we do the new subdisk we added to /home.

&prompt.root; vinum start -w home.p1.s0
reviving home.p1.s0
(time passes . . . )
home.p1.s0 is up by force
home.p1 is up
home.p1.s0 is up
My 5,400 RPM EIDE spindles copied at about 3.5 MBytes/sec.
Your mileage may vary.
Next we do the new subdisk we added to /usr.

&prompt.root; vinum start -w usr.p1.s0
reviving usr.p1.s0
(time passes . . . )
usr.p1.s0 is up by force
usr.p1 is up
usr.p1.s0 is up

All Vinum
objects should be in state up at this point.
The output of
vinum list should look
like the following:

2 drives:
D YouCrazy State: up Device /dev/ad0s1h Avail: 0/1818 MB (0%)
D UpWindow State: up Device /dev/ad2s1h Avail: 2096/3915 MB (53%)
2 volumes:
V home State: up Plexes: 2 Size: 488 MB
V usr State: up Plexes: 2 Size: 1330 MB
4 plexes:
P home.p0 C State: up Subdisks: 1 Size: 488 MB
P usr.p0 C State: up Subdisks: 1 Size: 1330 MB
P home.p1 C State: up Subdisks: 1 Size: 488 MB
P usr.p1 C State: up Subdisks: 1 Size: 1330 MB
4 subdisks:
S home.p0.s0 State: up PO: 0 B Size: 488 MB
S usr.p0.s0 State: up PO: 0 B Size: 1330 MB
S home.p1.s0 State: up PO: 0 B Size: 488 MB
S usr.p1.s0 State: up PO: 0 B Size: 1330 MB

Copy the root filesystem so that you will have a backup.

&prompt.root; cd /rootback
&prompt.root; dump 0f - / | restore rf -
&prompt.root; rm restoresymtable
&prompt.root; cd /

You may see errors like this:

./tmp/rstdir1001216411: (inode 558) not found on tape
cannot find directory inode 265
abort? [yn] n
expected next file 492, got 491

They seem to cause no harm.
I suspect they are a consequence of dumping the filesystem
containing /tmp and/or the pipe
connecting dump and
restore.

Make a directory on which we can mount a damaged root filesystem during the recovery process.

&prompt.root; mkdir /rootbad

Remove sentinel mount points that are now unused.

&prompt.root; rmdir /NOFUTURE*

Create empty &vinum.ap; drives on remaining spindles.

&prompt.root; vinum create create.ThruBank
&prompt.root; ...

At this point, the reliable server foundation is complete. The right-hand sides of the two figures above show how the disks will look.

You may want to do a quick reboot to multi-user and give it
a quick test drive.
This is also a good point to complete installation
of other distributions beyond the minimal install.
Add packages, ports, and users as required.
Configure /etc/rc.conf as required.

After you have completed your server configuration, remember to do one more copy of root to /rootback as shown above before placing the server into production.

Make a schedule to refresh /rootback periodically.

It may be a good idea to mount
/rootback read-only for normal operation
of the server.
This does, however, complicate the periodic refresh a bit.
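One way to handle the refresh with a read-only /rootback is to remount it read/write just for the copy; this is a sketch of the idea, not a command sequence from the original procedure:

&prompt.root; mount -u -o rw /rootback
&prompt.root; cd /rootback
&prompt.root; dump 0f - / | restore rf -
&prompt.root; rm restoresymtable
&prompt.root; cd /
&prompt.root; mount -u -o ro /rootback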
Do not forget to watch /var/log/messages carefully for errors.
Vinum
may automatically avoid failed hardware in a way that users
do not notice.
You must watch for such failures and get them repaired before a
second failure results in data loss.
You may see
Vinum noting damaged objects
at server boot time.

Where to Go from Here?

Now that you have established the foundation of a reliable server, there are several things you might want to try next.

Make a Vinum Volume with Remaining Space

Following are the steps to create another
Vinum volume with space remaining
on the rootback spindle.

This volume will not be resilient to spindle failure since it has only one plex on a single spindle.

Create a file with the following contents:

volume hope
plex name hope.p0 org concat volume hope
sd name hope.p0.s0 drive UpWindow plex hope.p0 len 0

Specifying a length of 0 for
the hope.p0.s0 subdisk
asks Vinum
to use whatever space is left available on the underlying
drive.

Feed these commands into vinum create.

&prompt.root; vinum create filename

Now we newfs the volume and mount it.

&prompt.root; newfs -v /dev/vinum/hope
&prompt.root; mkdir /hope
&prompt.root; mount /dev/vinum/hope /hope

Edit /etc/fstab if you want /hope mounted at boot time.
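The line would look something like the following sketch (dump and pass columns to taste):

/dev/vinum/hope    /hope    ufs    rw    2    2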
Try Out More Vinum Commands

You might already be familiar with vinum list, which shows all Vinum objects. Try the verbose options described in &man.vinum.8; to see more detail.

If you have more spindles and you want to bring them up as
concatenated, mirrored, or striped volumes, then give
vinum concat drivelist, vinum mirror drivelist, or vinum stripe drivelist a try.

See &man.vinum.8; for sample configurations and important
performance considerations before settling on a final organization
for your additional spindles.

The failure recovery instructions below will also give you
some experience using more Vinum
commands.

Failure Scenarios

This section contains descriptions of various failure scenarios.
For each scenario, there is a subsection on how to configure your
server for degraded mode operation, how to recover from the failure,
how to exit degraded mode, and how to simulate the failure.

Make a hard copy of these instructions and leave them inside the CPU case, being careful not to interfere with ventilation.

Root filesystem on ad0 unusable, rest of drive ok

We assume here that the boot blocks and disk label on
/dev/ad0 are ok.
If your BIOS can boot from a drive other than
C:, you may be able to get around this
limitation.

Configure Server for Degraded Mode

Use BootMgr to load kernel from
/dev/ad2s1a.

Hit F5 in BootMgr to select Drive 1.

Hit F1 to select FreeBSD.

After the kernel is loaded, hit any key but Enter to interrupt the boot sequence. Boot into single-user mode and allow explicit entry of a root filesystem.

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [kernel] in 8 seconds...
Type '?' for a list of commands, 'help' for more detailed help.
ok boot -as

Select /rootback as your root filesystem.

Manual root filesystem specification:
<fstype>:<device> Mount <device> using filesystem <fstype>
e.g. ufs:/dev/da0s1a
? List valid disk boot devices
<empty line> Abort manual input
mountroot> ufs:/dev/ad2s1a

Now that you are in single-user mode, change
/etc/fstab to avoid the
bad root filesystem.

If you used the bootvinum Perl script given below, then these commands should configure your server for degraded mode.

&prompt.root; fsck -p /
&prompt.root; mount /
&prompt.root; cd /etc
&prompt.root; mv fstab fstab.bak
&prompt.root; cp fstab_ad0s1_root_bad fstab
&prompt.root; cd /
&prompt.root; mount -o ro /
&prompt.root; vinum start
&prompt.root; fsck -p
&prompt.root; ^D

Recovery

Restore /dev/ad0s1a from
backups or copy
/rootback to it with these commands:

&prompt.root; umount /rootbad
&prompt.root; newfs /dev/ad0s1a
&prompt.root; tunefs -n enable /dev/ad0s1a
&prompt.root; mount /rootbad
&prompt.root; cd /rootbad
&prompt.root; dump 0f - / | restore rf -
&prompt.root; rm restoresymtable

Exiting Degraded Mode

Enter single-user mode.

&prompt.root; shutdown now

Put /etc/fstab back to
normal and reboot.&prompt.root; cd /rootbad/etc
&prompt.root; rm fstab
&prompt.root; mv fstab.bak fstab
&prompt.root; reboot

Reboot and hit F1 to boot from /dev/ad0 when prompted by BootMgr.

Simulation

This kind of failure can be simulated by shutting down to single-user mode and then booting as shown above.

Drive ad2 Fails

This section deals with the total failure of
/dev/ad2.

Configure Server for Degraded Mode

After the kernel is loaded, hit any key but Enter to interrupt the boot sequence. Boot into single-user mode.

Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [kernel] in 8 seconds...
Type '?' for a list of commands, 'help' for more detailed help.
ok boot -s

Change /etc/fstab to avoid the bad drive. If you used the bootvinum Perl script given below, then these commands should configure your server for degraded mode.

&prompt.root; fsck -p /
&prompt.root; mount /
&prompt.root; cd /etc
&prompt.root; mv fstab fstab.bak
&prompt.root; cp fstab_only_have_ad0s1 fstab
&prompt.root; cd /
&prompt.root; mount -o ro /
&prompt.root; vinum start
&prompt.root; fsck -p
&prompt.root; ^D

If you do not have modified versions of /etc/fstab that are ready for use, then you can use ed to make one. Alternatively, you can fsck and mount /usr and then use your favorite editor.
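For example, the following ed session comments out every /etc/fstab line that refers to the failed spindle; treat it as a sketch and adapt the pattern to your device names:

&prompt.root; ed /etc/fstab
g/^\/dev\/ad2/s/^/#/
w
q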
Recovery

We assume here that your server is up and running multi-user in degraded mode on just
/dev/ad0 and that you have
a new spindle now on
/dev/ad2 ready to go.

You will need a new spindle with enough room to hold root and swap partitions plus a Vinum partition large enough to hold /home and /usr.

Create a BIOS partition (slice) on the new spindle.

&prompt.root; /stand/sysinstall

- Select Custom.
- Select Partition.
- Select ad2.
- Create a FreeBSD (type 165) slice large enough to hold everything mentioned above.
- Write changes. Yes, you are absolutely sure.
- Select BootMgr.
- Quit Partitioning.
- Exit /stand/sysinstall.

Create disk label partitioning based on current /dev/ad0 partitioning.

&prompt.root; disklabel ad0 > /tmp/ad0
&prompt.root; disklabel -e ad2

This will drop you into your favorite editor.

- Copy the lines for the a and b partitions from /tmp/ad0 to the ad2 disklabel.
- Add the size of the a and b partitions to find the proper offset for the h partition.
- Subtract this offset from the size of the c partition to find the proper size for the h partition.
- Define an h partition with the size and offset calculated above.
- Set the fstype column to vinum.
- Save the file and quit your editor.
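With the example sizes used in this article, the arithmetic works out as follows: a is 200,000 blocks and b is 200,000 blocks (swap was shrunk by 265 blocks during bootstrapping), so the h partition starts at offset 400,000; if the replacement spindle's c partition is 8,420,769 blocks, h gets 8,420,769 - 400,000 = 8,020,769 blocks. The edited label would then contain lines roughly like:

  a:   200000        0    4.2BSD
  b:   200000   200000      swap
  c:  8420769        0    unused
  h:  8020769   400000     vinum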
Tell Vinum about the new drive. Ask Vinum to start an editor with a copy of the current configuration.

&prompt.root; vinum create

Uncomment the drive line referring to drive UpWindow and set device to /dev/ad2s1h. Save the file and quit your editor.

Now that Vinum has two spindles again, revive the mirrors.

&prompt.root; vinum start -w usr.p1.s0
&prompt.root; vinum start -w home.p1.s0

Now we need to restore
/rootback to a current copy of the
root filesystem.
These commands will accomplish this.

&prompt.root; newfs /dev/ad2s1a
&prompt.root; tunefs -n enable /dev/ad2s1a
&prompt.root; mount /dev/ad2s1a /mnt
&prompt.root; cd /mnt
&prompt.root; dump 0f - / | restore rf -
&prompt.root; rm restoresymtable
&prompt.root; cd /
&prompt.root; umount /mnt

Exiting Degraded Mode

Enter single-user mode.

&prompt.root; shutdown now

Return /etc/fstab to its normal state and reboot.

&prompt.root; cd /etc
&prompt.root; rm fstab
&prompt.root; mv fstab.bak fstab
&prompt.root; reboot

Simulation

You can simulate this kind of failure by unplugging
/dev/ad2, write-protecting it,
or by this procedure:

- Shutdown to single-user mode.
- Unmount all non-root filesystems.
- Clobber any existing Vinum configuration and partitioning on /dev/ad2.

&prompt.root; vinum stop
&prompt.root; dd if=/dev/zero of=/dev/ad2s1h count=512
&prompt.root; dd if=/dev/zero of=/dev/ad2 count=512

Drive ad0 Fails

Some BIOSes can boot from drive 1 or drive 2 (often called
C: or D:),
while others can boot only from drive 1.
If your BIOS can boot from either, the fastest road to recovery
might be to boot directly from /dev/ad2
in single-user mode and
install /etc/fstab_only_have_ad2s1 as
/etc/fstab.
You would then have to adapt the /dev/ad2
failure recovery instructions from above.

If your BIOS can only boot from drive 1, then you will have to
unplug drive YouCrazy from the controller for
/dev/ad2 and plug it
into the controller for /dev/ad0.
Then continue with the instructions for
/dev/ad2 failure recovery
above.

bootvinum Perl Script

The bootvinum Perl script below reads /etc/fstab
and current drive partitioning.
It then writes several files in the current directory and several
variants of /etc/fstab in /etc.
These files significantly simplify the installation of
Vinum and recovery from
spindle failures.

#!/usr/bin/perl -w
use strict;
use FileHandle;
my $config_tag1 = '$Id: article.sgml,v 1.14 2003-10-18 10:39:16 simon Exp $';
# Copyright (C) 2001 Robert A. Van Valzah
#
# Bootstrap Vinum
#
# Read /etc/fstab and current partitioning for all spindles mentioned there.
# Generate files needed to mirror all filesystems on root spindle.
# A new partition table for each spindle
# Input for the vinum create command to create Vinum objects on each spindle
# A copy of fstab mounting Vinum volumes instead of BSD partitions
# Copies of fstab altered for server's degraded modes of operation
# See handbook for instructions on how to use the files generated.
# N.B. This bootstrapping method shrinks size of swap partition by the size
# of Vinum's on-disk configuration (265 sectors). It embeds existing file
# systems on the root spindle in Vinum objects without having to copy them.
# Thanks to Greg Lehey for suggesting this bootstrapping method.
# Expectations:
# The root spindle must contain at least root, swap, and /usr partitions
# The rootback spindle must have matching /rootback and swap partitions
# Other spindles should only have a /NOFUTURE* filesystem and maybe swap
# File systems named /NOFUTURE* will be replaced with Vinum drives
# Change configuration variables below to suit your taste
my $vip = 'h'; # VInum Partition
my @drv = ('YouCrazy', 'UpWindow', 'ThruBank', # Vinum DRiVe names
'OutSnakes', 'MeWild', 'InMovie', 'HomeJames', 'DownPrices', 'WhileBlind');
# No configuration variables beyond this point
my %vols; # One entry per Vinum volume to be created
my @spndl; # One entry per SPiNDLe
my $rsp; # Root SPindle (as in /dev/$rsp)
my $rbsp; # RootBack SPindle (as in /dev/$rbsp)
my $cfgsiz = 265; # Size of Vinum on-disk configuration info in sectors
my $nxtpas = 2; # Next fsck pass number for non-root filesystems
# Parse fstab, generating the version we'll need for Vinum and noting
# spindles in use.
my $fsin = "/etc/fstab";
#my $fsin = "simu/fstab";
open(FSIN, "$fsin") || die("Couldn't open $fsin: $!\n");
my $fsout = "/etc/fstab.vinum";
open(FSOUT, ">$fsout") || die("Couldn't open $fsout for writing: $!\n");
while (<FSIN>) {
my ($dev, $mnt, $fstyp, $opt, $dump, $pass) = split;
next if $dev =~ /^#/;
if ($mnt eq '/' || $mnt eq '/rootback' || $mnt =~ /^\/NOFUTURE/) {
my $dn = substr($dev, 5, length($dev)-6); # Device Name without /dev/
push(@spndl, $dn) unless grep($_ eq $dn, @spndl);
$rsp = $dn if $mnt eq '/';
next if $mnt =~ /^\/NOFUTURE/;
}
# Move /rootback from partition e to a
if ($mnt =~ /^\/rootback/) {
$dev =~ s/e$/a/;
$pass = 1;
$rbsp = substr($dev, 5, length($dev)-6);
print FSOUT "$dev\t\t$mnt\t$fstyp\t$opt\t\t$dump\t$pass\n";
next;
}
# Move non-root filesystems on smallest spindle into Vinum
if (defined($rsp) && $dev =~ /^\/dev\/$rsp/ && $dev =~ /[d-h]$/) {
$pass = $nxtpas++;
print FSOUT "/dev/vinum$mnt\t\t$mnt\t\t$fstyp\t$opt\t\t$dump\t$pass\n";
$vols{$dev}->{mnt} = substr($mnt, 1);
next;
}
print FSOUT $_;
}
close(FSOUT);
die("Found more spindles than we have abstract names\n") if $#spndl > $#drv;
die("Didn't find a root partition!\n") if !defined($rsp);
die("Didn't find a /rootback partition!\n") if !defined($rbsp);
# Table of server's Degraded Modes
# One row per mode with hash keys
# fn FileName
# xpr eXPRession needed to convert fstab lines for this mode
# cm1 CoMment 1 describing this mode
# cm2 CoMment 2 describing this mode
# FH FileHandle (dynamically initialized below)
my @DM = (
{ cm1 => "When we only have $rsp, comment out lines using $rbsp",
fn => "/etc/fstab_only_have_$rsp",
xpr => "s:^/dev/$rbsp:#\$&:",
},
{ cm1 => "When we only have $rbsp, comment out lines using $rsp and",
cm2 => "rootback becomes root",
fn => "/etc/fstab_only_have_$rbsp",
xpr => "s:^/dev/$rsp:#\$&: || s:/rootback:/\t:",
},
{ cm1 => "When only $rsp root is bad, /rootback becomes root and",
cm2 => "root becomes /rootbad",
fn => "/etc/fstab_${rsp}_root_bad",
xpr => "s:\t/\t:\t/rootbad: || s:/rootback:/\t:",
},
);
# Initialize output FileHandles and write comments
foreach my $dm (@DM) {
my $fh = new FileHandle;
$fh->open(">$dm->{fn}") || die("Can't write $dm->{fn}: $!\n");
print $fh "# $dm->{cm1}\n" if $dm->{cm1};
print $fh "# $dm->{cm2}\n" if $dm->{cm2};
$dm->{FH} = $fh;
}
# Parse the Vinum version of fstab written above and write versions needed
# for server's degraded modes.
open(FSOUT, "$fsout") || die("Couldn't open $fsout: $!\n");
while (<FSOUT>) {
my $line = $_;
foreach my $dm (@DM) {
$_ = $line;
eval $dm->{xpr};
print {$dm->{FH}} $_;
}
}
# Parse partition table for each spindle and write versions needed for Vinum
my $rootsiz; # ROOT partition SIZe
my $swapsiz; # SWAP partition SIZe
my $rspminoff; # Root SPindle MINimum OFFset of non-root, non-swap, non-c parts
my $rspsiz; # Root SPindle SIZe
my $rbspsiz; # RootBack SPindle SIZe
foreach my $i (0..$#spndl) {
my $dlin = "disklabel $spndl[$i] |";
# my $dlin = "simu/disklabel.$spndl[$i]";
open(DLIN, "$dlin") || die("Couldn't open $dlin: $!\n");
my $dlout = "disklabel.$spndl[$i]";
open(DLOUT, ">$dlout") || die("Couldn't open $dlout for writing: $!\n");
my $dlb4 = "$dlout.b4vinum";
open(DLB4, ">$dlb4") || die("Couldn't open $dlb4 for writing: $!\n");
my $minoff; # MINimum OFFset of non-root, non-swap, non-c partitions
my $totsiz = 0; # TOTal SIZe of all non-root, non-swap, non-c partitions
my $swapspndl = 0; # True if SWAP partition on this SPiNDLe
while (<DLIN>) {
print DLB4 $_;
my ($part, $siz, $off, $fstyp, $fsiz, $bsiz, $bps) = split;
if ($part && $part eq 'a:' && $spndl[$i] eq $rsp) {
$rootsiz = $siz;
}
if ($part && $part eq 'e:' && $spndl[$i] eq $rbsp) {
if ($rootsiz != $siz) {
die("Rootback size ($siz) != root size ($rootsiz)\n");
}
}
if ($part && $part eq 'c:') {
$rspsiz = $siz if $spndl[$i] eq $rsp;
$rbspsiz = $siz if $spndl[$i] eq $rbsp;
}
# Make swap partition $cfgsiz sectors smaller
if ($part && $part eq 'b:') {
if ($spndl[$i] eq $rsp) {
$swapsiz = $siz;
} else {
if ($swapsiz != $siz) {
die("Swap partition sizes unequal across spindles\n");
}
}
printf DLOUT "%4s%9d%9d%10s\n", $part, $siz-$cfgsiz, $off, $fstyp;
$swapspndl = 1;
next;
}
# Move rootback spindle e partitions to a
if ($part && $part eq 'e:' && $spndl[$i] eq $rbsp) {
printf DLOUT "%4s%9d%9d%10s%9d%6d%6d\n", 'a:', $siz, $off, $fstyp,
$fsiz, $bsiz, $bps;
next;
}
# Delete non-root, non-swap, non-c partitions but note their minimum
# offset and total size that're needed below.
if ($part && $part =~ /^[d-h]:$/) {
$minoff = $off unless $minoff;
$minoff = $off if $off < $minoff;
$totsiz += $siz;
if ($spndl[$i] eq $rsp) { # If doing spindle containing root
my $dev = "/dev/$spndl[$i]" . substr($part, 0, 1);
$vols{$dev}->{siz} = $siz;
$vols{$dev}->{off} = $off;
$rspminoff = $minoff;
}
next;
}
print DLOUT $_;
}
if ($swapspndl) { # If there was a swap partition on this spindle
# Make a Vinum partition the size of all non-root, non-swap,
# non-c partitions + the size of Vinum's on-disk configuration.
# Set its offset so that the start of the first subdisk it contains
# coincides with the first filesystem we're embedding in Vinum.
printf DLOUT "%4s%9d%9d%10s\n", "$vip:", $totsiz+$cfgsiz, $minoff-$cfgsiz,
'vinum';
} else {
# No need to mess with size and offset if there was no swap
printf DLOUT "%4s%9d%9d%10s\n", "$vip:", $totsiz, $minoff,
'vinum';
}
}
die("Swap partition not found\n") unless $swapsiz;
die("Swap partition not larger than $cfgsiz blocks\n") unless $swapsiz>$cfgsiz;
die("Rootback spindle size not >= root spindle size\n") unless $rbspsiz>=$rspsiz;
# Generate input to vinum create command needed for each spindle.
foreach my $i (0..$#spndl) {
my $cfn = "create.$drv[$i]"; # Create File Name
open(CF, ">$cfn") || die("Can't open $cfn for writing: $!\n");
print CF "drive $drv[$i] device /dev/$spndl[$i]$vip\n";
next unless $spndl[$i] eq $rsp || $spndl[$i] eq $rbsp;
foreach my $dev (keys(%vols)) {
my $mnt = $vols{$dev}->{mnt};
my $siz = $vols{$dev}->{siz};
my $off = $vols{$dev}->{off}-$rspminoff+$cfgsiz;
print CF "volume $mnt\n" if $spndl[$i] eq $rsp;
print CF <<EOF;
plex name $mnt.p$i org concat volume $mnt
sd name $mnt.p$i.s0 drive $drv[$i] plex $mnt.p$i len ${siz}s driveoffset ${off}s
EOF
}
}

Manual Vinum Bootstrapping

The bootvinum Perl script above makes life easier, but
it may be necessary to manually perform some or all of the steps that
it automates.
This appendix describes how you would manually mimic the script.

Make a copy of /etc/fstab to be customized.

&prompt.root; cp /etc/fstab /etc/fstab.vinum

Edit /etc/fstab.vinum.

- Change the device column of non-root partitions on the root spindle to /dev/vinum/mnt.
- Change the pass column of non-root partitions on the root spindle to 2, 3, etc.
- Delete any lines with mountpoint matching /NOFUTURE*.
- Change the device column of /rootback from e to a.
- Change the pass column of /rootback to 1.

Prepare disklabels for editing:

&prompt.root; cd /bootvinum
&prompt.root; disklabel ad0s1 > disklabel.ad0s1
&prompt.root; cp disklabel.ad0s1 disklabel.ad0s1.b4vinum
&prompt.root; disklabel ad2s1 > disklabel.ad2s1
&prompt.root; cp disklabel.ad2s1 disklabel.ad2s1.b4vinum

Edit the disklabel.ad?s1 files.

On the root spindle:

- Decrease the size of the b partition by 265 blocks.
- Note the size and offset of the a and b partitions.
- Note the smallest offset for partitions d-h.
- Note the size and offset for all non-root, non-swap partitions (/home was probably on e and /usr was probably on f).
- Delete partitions d-h.
- Create a new h partition with offset 265 blocks less than the smallest offset for partitions d-h noted above. Set its size to the size of the c partition less the smallest offset for partitions d-h noted above + 265 blocks.

Vinum can use any partition other than c. It is not strictly necessary to use h for all your Vinum partitions, but it is good practice to be consistent across all spindles.

- Set the fstype of this new partition to vinum.

On the rootback spindle:

- Move the e partition to a.
- Verify that the size of the a and b partitions matches the root spindle.
- Note the smallest offset for partitions d-h.
- Delete partitions d-h.
- Create a new h partition with offset 265 blocks less than the smallest offset noted above for partitions d-h. Set its size to the size of the c partition less the smallest offset for partitions d-h noted above + 265 blocks.
- Set the fstype of this new partition to vinum.

Create a file named create.YouCrazy that contains:

drive YouCrazy device /dev/ad0s1h
volume home
plex name home.p0 org concat volume home
sd name home.p0.s0 drive YouCrazy plex home.p0 len $hl driveoffset $ho
volume usr
plex name usr.p0 org concat volume usr
sd name usr.p0.s0 drive YouCrazy plex usr.p0 len $ul driveoffset $uo

Where:

- $hl is the length noted above for /home.
- $ho is the offset noted above for /home less the smallest offset noted above + 265 blocks.
- $ul is the length noted above for /usr.
- $uo is the offset noted above for /usr less the smallest offset noted above + 265 blocks.
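With the example numbers from this article ($hl = 1,000,000, $ho = 400,265 - 400,265 + 265 = 265, $ul = 2,724,408, $uo = 1,400,265 - 400,265 + 265 = 1,000,265), create.YouCrazy would come out roughly as follows; your own values come from your disk labels:

drive YouCrazy device /dev/ad0s1h
volume home
plex name home.p0 org concat volume home
sd name home.p0.s0 drive YouCrazy plex home.p0 len 1000000s driveoffset 265s
volume usr
plex name usr.p0 org concat volume usr
sd name usr.p0.s0 drive YouCrazy plex usr.p0 len 2724408s driveoffset 1000265s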
Create a file named create.UpWindow containing:

drive UpWindow device /dev/ad2s1h
plex name home.p1 org concat volume home
sd name home.p1.s0 drive UpWindow plex home.p1 len $hl driveoffset $ho
plex name usr.p1 org concat volume usr
sd name usr.p1.s0 drive UpWindow plex usr.p1 len $ul driveoffset $uo

Where $hl, $ho, $ul, and $uo are set as above.

Acknowledgements

I would like to thank Greg Lehey for writing &vinum.ap; and for
providing very helpful comments on early drafts.
Several others made helpful suggestions after reviewing later drafts
including
Dag-Erling Smørgrav,
Michael Splendoria,
Chern Lee,
Stefan Aeschbacher,
Fleming Froekjaer,
Bernd Walter,
Aleksey Baranov, and
Doug Swarin.
Design elements of the FreeBSD VM system

Matthew Dillon, dillon@apollo.backplane.com
The title is really just a fancy way of saying that I am going to
attempt to describe the whole VM enchilada, hopefully in a way that
everyone can follow. For the last year I have concentrated on a number
of major kernel subsystems within FreeBSD, with the VM and Swap
subsystems being the most interesting and NFS being a necessary
chore. I rewrote only small portions of the code. In the VM
arena the only major rewrite I have done is to the swap subsystem.
Most of my work was cleanup and maintenance, with only moderate code
rewriting and no major algorithmic adjustments within the VM
subsystem. The bulk of the VM subsystem's theoretical base remains
unchanged and a lot of the credit for the modernization effort in the
last few years belongs to John Dyson and David Greenman. Not being a
historian like Kirk I will not attempt to tag all the various features
with people's names, since I will invariably get it wrong.

This article was originally published in the January 2000 issue of
DaemonNews. This
version of the article may include updates from Matt and other authors
to reflect changes in FreeBSD's VM implementation.

Introduction

Before moving along to the actual design let's spend a little time
on the necessity of maintaining and modernizing any long-living
codebase. In the programming world, algorithms tend to be more
important than code and it is precisely due to BSD's academic roots that
a great deal of attention was paid to algorithm design from the
beginning. More attention paid to the design generally leads to a clean
and flexible codebase that can be fairly easily modified, extended, or
replaced over time. While BSD is considered an old
operating system by some people, those of us who work on it tend to view
it more as a mature codebase which has various components
modified, extended, or replaced with modern code. It has evolved, and
FreeBSD is at the bleeding edge no matter how old some of the code might
be. This is an important distinction to make and one that is
unfortunately lost to many people. The biggest error a programmer can
make is to not learn from history, and this is precisely the error that
- many other modern operating systems have made. NT is the best example
+ many other modern operating systems have made. &windowsnt; is the best example
of this, and the consequences have been dire. Linux also makes this
mistake to some degree—enough that we BSD folk can make small
jokes about it every once in a while, anyway. Linux's problem is simply
one of a lack of experience and history to compare ideas against, a
problem that is easily and rapidly being addressed by the Linux
community in the same way it has been addressed in the BSD
- community—by continuous code development. The NT folk, on the
+ community—by continuous code development. The &windowsnt; folk, on the
other hand, repeatedly make the same mistakes solved by &unix; decades ago
and then spend years fixing them. Over and over again. They have a
severe case of "not designed here" and "we are always
right because our marketing department says so". I have little
tolerance for anyone who cannot learn from history.Much of the apparent complexity of the FreeBSD design, especially in
the VM/Swap subsystem, is a direct result of having to solve serious
performance issues that occur under various conditions. These issues
are not due to bad algorithmic design but instead rise from
environmental factors. In any direct comparison between platforms,
these issues become most apparent when system resources begin to get
stressed. As I describe FreeBSD's VM/Swap subsystem the reader should
always keep two points in mind. First, the most important aspect of
performance design is what is known as Optimizing the Critical
Path. It is often the case that performance optimizations add a
little bloat to the code in order to make the critical path perform
better. Second, a solid, generalized design outperforms a
heavily-optimized design over the long run. While a generalized design
may end up being slower than a heavily-optimized design when they are
first implemented, the generalized design tends to be easier to adapt to
changing conditions and the heavily-optimized design winds up having to
be thrown away. Any codebase that will survive and be maintainable for
years must therefore be designed properly from the beginning even if it
costs some performance. Twenty years ago people were still arguing that
programming in assembly was better than programming in a high-level
language because it produced code that was ten times as fast. Today,
the fallibility of that argument is obvious—as are the parallels
to algorithmic design and code generalization.
VM Objects
The best way to begin describing the FreeBSD VM system is to look at
it from the perspective of a user-level process. Each user process sees
a single, private, contiguous VM address space containing several types
of memory objects. These objects have various characteristics. Program
code and program data are effectively a single memory-mapped file (the
binary file being run), but program code is read-only while program data
is copy-on-write. Program BSS is just memory allocated and filled with
zeros on demand, called demand zero page fill. Arbitrary files can be
memory-mapped into the address space as well, which is how the shared
library mechanism works. Such mappings can require modifications to
remain private to the process making them. The fork system call adds an
entirely new dimension to the VM management problem on top of the
complexity already given.A program binary data page (which is a basic copy-on-write page)
illustrates the complexity. A program binary contains a preinitialized
data section which is initially mapped directly from the program file.
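The same copy-on-write behavior is visible from user level through an ordinary private file mapping. The sketch below is purely illustrative (the file name is hypothetical and error handling is minimal): it writes through a MAP_PRIVATE mapping and then shows that the file underneath is left untouched:
/*
 * Illustrative sketch only: a private (copy-on-write) file mapping.
 * "data.bin" is a hypothetical pre-existing, non-empty file.
 */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	struct stat st;
	int fd = open("data.bin", O_RDWR);
	if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0)
		return 1;
	/* Reads are initially backed by the file itself. */
	char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		return 1;
	char before = p[0];
	p[0] = 'X';	/* first write: the kernel copies this page privately */
	char on_disk;
	pread(fd, &on_disk, 1, 0);
	printf("mapping now sees '%c'; the file still holds '%c' (was '%c')\n",
	    p[0], on_disk, before);
	return 0;
}
Once that page differs from the file it can no longer be thrown away and re-read from the file, which is exactly the situation described next.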
When a program is loaded into a process's VM space, this area is
initially memory-mapped and backed by the program binary itself,
allowing the VM system to free/reuse the page and later load it back in
from the binary. The moment a process modifies this data, however, the
VM system must make a private copy of the page for that process. Since
the private copy has been modified, the VM system may no longer free it,
because there is no longer any way to restore it later on.You will notice immediately that what was originally a simple file
mapping has become much more complex. Data may be modified on a
page-by-page basis whereas the file mapping encompasses many pages at
once. The complexity further increases when a process forks. When a
process forks, the result is two processes—each with their own
private address spaces, including any modifications made by the original
process prior to the call to fork(). It would be
silly for the VM system to make a complete copy of the data at the time
of the fork() because it is quite possible that at
least one of the two processes will only need to read from that page
from then on, allowing the original page to continue to be used. What
was a private page is made copy-on-write again, since each process
(parent and child) expects their own personal post-fork modifications to
remain private to themselves and not affect the other.
FreeBSD manages all of this with a layered VM Object model. The
original binary program file winds up being the lowest VM Object layer.
A copy-on-write layer is pushed on top of that to hold those pages which
had to be copied from the original file. If the program modifies a data
page belonging to the original file the VM system takes a fault and
makes a copy of the page in the higher layer. When a process forks,
additional VM Object layers are pushed on. This might make a little
more sense with a fairly basic example. A fork()
is a common operation for any *BSD system, so this example will consider
a program that starts up, and forks. When the process starts, the VM
system creates an object layer, let's call this A:
+---------------+
| A |
+---------------+
A picture
A represents the file—pages may be paged in and out of the
file's physical media as necessary. Paging in from the disk is
reasonable for a program, but we really do not want to page back out and
overwrite the executable. The VM system therefore creates a second
layer, B, that will be physically backed by swap space:
+---------------+
| B |
+---------------+
| A |
+---------------+
On the first write to a page after this, a new page is created in B,
and its contents are initialized from A. All pages in B can be paged in
or out to a swap device. When the program forks, the VM system creates
two new object layers—C1 for the parent, and C2 for the
child—that rest on top of B:
+-------+-------+
| C1 | C2 |
+-------+-------+
| B |
+---------------+
| A |
+---------------+
In this case, let's say a page in B is modified by the original
parent process. The process will take a copy-on-write fault and
duplicate the page in C1, leaving the original page in B untouched.
Now, let's say the same page in B is modified by the child process. The
process will take a copy-on-write fault and duplicate the page in C2.
The original page in B is now completely hidden since both C1 and C2
have a copy (and B could theoretically be destroyed if it does not
represent a real file). However, this sort of optimization is not
trivial to make because it is so fine-grained. FreeBSD does not make
this optimization. Now, suppose (as is often the case) that the child
process does an exec(). Its current address space
is usually replaced by a new address space representing a new file. In
this case, the C2 layer is destroyed:
+-------+
| C1 |
+-------+-------+
| B |
+---------------+
| A |
+---------------+
In this case, the number of children of B drops to one, and all
accesses to B now go through C1. This means that B and C1 can be
collapsed together. Any pages in B that also exist in C1 are deleted
from B during the collapse. Thus, even though the optimization in the
previous step could not be made, we can recover the dead pages when
either of the processes exits or calls exec().
This model creates a number of potential problems. The first is that
you can wind up with a relatively deep stack of layered VM Objects which
can cost scanning time and memory when you take a fault. Deep
layering can occur when processes fork and then fork again (either
parent or child). The second problem is that you can wind up with dead,
inaccessible pages deep in the stack of VM Objects. In our last example
if both the parent and child processes modify the same page, they both
get their own private copies of the page and the original page in B is
no longer accessible by anyone. That page in B can be freed.FreeBSD solves the deep layering problem with a special optimization
called the All Shadowed Case. This case occurs if either
C1 or C2 takes sufficient COW faults to completely shadow all pages in B.
Let's say that C1 achieves this. C1 can now bypass B entirely, so rather
than have C1->B->A and C2->B->A we now have C1->A and C2->B->A. But
look what also happened—now B has only one reference (C2), so we
can collapse B and C2 together. The end result is that B is deleted
entirely and we have C1->A and C2->A. It is often the case that B will
contain a large number of pages and neither C1 nor C2 will be able to
completely overshadow it. If we fork again and create a set of D
layers, however, it is much more likely that one of the D layers will
eventually be able to completely overshadow the much smaller dataset
represented by C1 or C2. The same optimization will work at any point in
the graph and the grand result of this is that even on a heavily forked
machine VM Object stacks tend not to get much deeper than 4. This is
true of both the parent and the children and true whether the parent is
doing the forking or whether the children cascade forks.The dead page problem still exists in the case where C1 or C2 do not
completely overshadow B. Due to our other optimizations this case does
not represent much of a problem and we simply allow the pages to be
dead. If the system runs low on memory it will swap them out, eating a
little swap, but that is it.The advantage to the VM Object model is that
fork() is extremely fast, since no real data
copying need take place. The disadvantage is that you can build a
relatively complex VM Object layering that slows page fault handling
down a little, and you spend memory managing the VM Object structures.
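A user-level sketch (illustrative only, not kernel code) shows what this buys: the fork() below returns essentially immediately even though the parent has dirtied 64MB, and the child's later write is never seen by the parent because the child gets its own copy of only the page it touches:
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NBYTES	(64 * 1024 * 1024)

int
main(void)
{
	char *buf = malloc(NBYTES);
	if (buf == NULL)
		return 1;
	memset(buf, 'A', NBYTES);	/* parent dirties every page */

	pid_t pid = fork();	/* no data is copied here, only COW bookkeeping */
	if (pid == 0) {
		buf[0] = 'B';	/* child faults in a private copy of one page */
		printf("child sees  '%c'\n", buf[0]);
		_exit(0);
	}
	waitpid(pid, NULL, 0);
	printf("parent sees '%c'\n", buf[0]);	/* still 'A' */
	free(buf);
	return 0;
}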
The optimizations FreeBSD makes prove to reduce the problems enough
that they can be ignored, leaving no real disadvantage.
SWAP Layers
Private data pages are initially either copy-on-write or zero-fill
pages. When a change, and therefore a copy, is made, the original
backing object (usually a file) can no longer be used to save a copy of
the page when the VM system needs to reuse it for other purposes. This
is where SWAP comes in. SWAP is allocated to create backing store for
memory that does not otherwise have it. FreeBSD allocates the swap
management structure for a VM Object only when it is actually needed.
However, the swap management structure has had problems
historically.Under FreeBSD 3.X the swap management structure preallocates an
array that encompasses the entire object requiring swap backing
store—even if only a few pages of that object are swap-backed.
This creates a kernel memory fragmentation problem when large objects
are mapped, or processes with large runsizes (RSS) fork. Also, in order
to keep track of swap space, a list of holes is kept in
kernel memory, and this tends to get severely fragmented as well. Since
the list of holes is a linear list, the swap allocation and freeing
performance is a non-optimal O(n)-per-page. It also requires kernel
memory allocations to take place during the swap freeing process, and
that creates low memory deadlock problems. The problem is further
exacerbated by holes created due to the interleaving algorithm. Also,
the swap block map can become fragmented fairly easily resulting in
non-contiguous allocations. Kernel memory must also be allocated on the
fly for additional swap management structures when a swapout occurs. It
is evident that there was plenty of room for improvement.For FreeBSD 4.X, I completely rewrote the swap subsystem. With this
rewrite, swap management structures are allocated through a hash table
rather than a linear array giving them a fixed allocation size and much
finer granularity. Rather than using a linearly linked list to keep
track of swap space reservations, it now uses a bitmap of swap blocks
arranged in a radix tree structure with free-space hinting in the radix
node structures. This effectively makes swap allocation and freeing an
O(1) operation. The entire radix tree bitmap is also preallocated in
order to avoid having to allocate kernel memory during critical low
memory swapping operations. After all, the system tends to swap when it
is low on memory so we should avoid allocating kernel memory at such
times in order to avoid potential deadlocks. Finally, to reduce
fragmentation the radix tree is capable of allocating large contiguous
chunks at once, skipping over smaller fragmented chunks. I did not take
the final step of having an allocating hint pointer that would trundle
through a portion of swap as allocations were made in order to further
guarantee contiguous allocations or at least locality of reference, but
I ensured that such an addition could be made.
When to free a page
Since the VM system uses all available memory for disk caching,
there are usually very few truly-free pages. The VM system depends on
being able to properly choose pages which are not in use to reuse for
new allocations. Selecting the optimal pages to free is possibly the
single-most important function any VM system can perform because if it
makes a poor selection, the VM system may be forced to unnecessarily
retrieve pages from disk, seriously degrading system performance.How much overhead are we willing to suffer in the critical path to
avoid freeing the wrong page? Each wrong choice we make will cost us
hundreds of thousands of CPU cycles and a noticeable stall of the
affected processes, so we are willing to endure a significant amount of
overhead in order to be sure that the right page is chosen. This is why
FreeBSD tends to outperform other systems when memory resources become
stressed.The free page determination algorithm is built upon a history of the
use of memory pages. To acquire this history, the system takes advantage
of a page-used bit feature that most hardware page tables have.In any case, the page-used bit is cleared and at some later point
the VM system comes across the page again and sees that the page-used
bit has been set. This indicates that the page is still being actively
used. If the bit is still clear it is an indication that the page is not
being actively used. By testing this bit periodically, a use history (in
the form of a counter) for the physical page is developed. When the VM
system later needs to free up some pages, checking this history becomes
the cornerstone of determining the best candidate page to reuse.
What if the hardware has no page-used bit?
For those platforms that do not have this feature, the system
actually emulates a page-used bit. It unmaps or protects a page,
forcing a page fault if the page is accessed again. When the page
fault is taken, the system simply marks the page as having been used
and unprotects the page so that it may be used. While taking such page
faults just to determine if a page is being used appears to be an
expensive proposition, it is much less expensive than reusing the page
for some other purpose only to find that a process needs it back and
then have to go to disk.FreeBSD makes use of several page queues to further refine the
selection of pages to reuse as well as to determine when dirty pages
must be flushed to their backing store. Since page tables are dynamic
entities under FreeBSD, it costs virtually nothing to unmap a page from
the address space of any processes using it. When a page candidate has
been chosen based on the page-use counter, this is precisely what is
done. The system must make a distinction between clean pages which can
theoretically be freed up at any time, and dirty pages which must first
be written to their backing store before being reusable. When a page
candidate has been found it is moved to the inactive queue if it is
dirty, or the cache queue if it is clean. A separate algorithm based on
the dirty-to-clean page ratio determines when dirty pages in the
inactive queue must be flushed to disk. Once this is accomplished, the
flushed pages are moved from the inactive queue to the cache queue. At
this point, pages in the cache queue can still be reactivated by a VM
fault at relatively low cost. However, pages in the cache queue are
considered to be immediately freeable and will be reused
in an LRU (least-recently used) fashion when the system needs to
allocate new memory.It is important to note that the FreeBSD VM system attempts to
separate clean and dirty pages for the express reason of avoiding
unnecessary flushes of dirty pages (which eats I/O bandwidth), nor does
it move pages between the various page queues gratuitously when the
memory subsystem is not being stressed. This is why you will see some
systems with very low cache queue counts and high active queue counts
when doing a systat -vm command. As the VM system
becomes more stressed, it makes a greater effort to maintain the various
page queues at the levels determined to be the most effective. An urban
myth has circulated for years that Linux did a better job avoiding
swapouts than FreeBSD, but this in fact is not true. What was actually
occurring was that FreeBSD was proactively paging out unused pages in
order to make room for more disk cache while Linux was keeping unused
pages in core and leaving less memory available for cache and process
pages. I do not know whether this is still true today.
Pre-Faulting and Zeroing Optimizations
Taking a VM fault is not expensive if the underlying page is already
in core and can simply be mapped into the process, but it can become
expensive if you take a whole lot of them on a regular basis. A good
example of this is running a program such as &man.ls.1; or &man.ps.1;
over and over again. If the program binary is mapped into memory but
not mapped into the page table, then all the pages that will be accessed
by the program will have to be faulted in every time the program is run.
This is unnecessary when the pages in question are already in the VM
Cache, so FreeBSD will attempt to pre-populate a process's page tables
with those pages that are already in the VM Cache. One thing that
FreeBSD does not yet do is pre-copy-on-write certain pages on exec. For
example, if you run the &man.ls.1; program while running vmstat
1 you will notice that it always takes a certain number of
page faults, even when you run it over and over again. These are
zero-fill faults, not program code faults (which were pre-faulted in
already). Pre-copying pages on exec or fork is an area that could use
more study.
A large percentage of page faults that occur are zero-fill faults.
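A small user-level sketch (illustrative; exact counts vary with the C library and page size) makes the effect visible: each page of a large uninitialized array costs one fault the first time it is touched, and getrusage(2) reports those as minor faults:
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <unistd.h>

static char bss_area[8 * 1024 * 1024];	/* uninitialized, so it lives in BSS */

int
main(void)
{
	struct rusage before, after;
	long pagesize = sysconf(_SC_PAGESIZE);

	getrusage(RUSAGE_SELF, &before);
	for (size_t i = 0; i < sizeof(bss_area); i += pagesize)
		bss_area[i] = 1;	/* first touch: one zero-fill fault per page */
	getrusage(RUSAGE_SELF, &after);

	printf("minor faults while touching BSS: %ld\n",
	    after.ru_minflt - before.ru_minflt);
	return 0;
}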
You can usually see this by observing the vmstat -s
output. These occur when a process accesses pages in its BSS area. The
BSS area is expected to be initially zero but the VM system does not
bother to allocate any memory at all until the process actually accesses
it. When a fault occurs the VM system must not only allocate a new page,
it must zero it as well. To optimize the zeroing operation the VM system
has the ability to pre-zero pages and mark them as such, and to request
pre-zeroed pages when zero-fill faults occur. The pre-zeroing occurs
whenever the CPU is idle but the number of pages the system pre-zeros is
limited in order to avoid blowing away the memory caches. This is an
excellent example of adding complexity to the VM system in order to
optimize the critical path.
Page Table Optimizations
The page table optimizations make up the most contentious part of
the FreeBSD VM design and they have shown some strain with the advent of
serious use of mmap(). I think this is actually a
feature of most BSDs though I am not sure when it was first introduced.
There are two major optimizations. The first is that hardware page
tables do not contain persistent state but instead can be thrown away at
any time with only a minor amount of management overhead. The second is
that every active page table entry in the system has a governing
pv_entry structure which is tied into the
vm_page structure. FreeBSD can simply iterate
through those mappings that are known to exist while Linux must check
all page tables that might contain a specific
mapping to see if it does, which can achieve O(n^2) overhead in certain
situations. It is because of this that FreeBSD tends to make better
choices on which pages to reuse or swap when memory is stressed, giving
it better performance under load. However, FreeBSD requires kernel
tuning to accommodate large-shared-address-space situations such as
those that can occur in a news system because it may run out of
pv_entry structures.Both Linux and FreeBSD need work in this area. FreeBSD is trying to
maximize the advantage of a potentially sparse active-mapping model (not
all processes need to map all pages of a shared library, for example),
whereas Linux is trying to simplify its algorithms. FreeBSD generally
has the performance advantage here at the cost of wasting a little extra
memory, but FreeBSD breaks down in the case where a large file is
massively shared across hundreds of processes. Linux, on the other hand,
breaks down in the case where many processes are sparsely-mapping the
same shared library and also runs non-optimally when trying to determine
whether a page can be reused or not.
Page Coloring
We will end with the page coloring optimizations. Page coloring is a
performance optimization designed to ensure that accesses to contiguous
pages in virtual memory make the best use of the processor cache. In
ancient times (i.e. 10+ years ago) processor caches tended to map
virtual memory rather than physical memory. This led to a huge number of
problems including having to clear the cache on every context switch in
some cases, and problems with data aliasing in the cache. Modern
processor caches map physical memory precisely to solve those problems.
This means that two side-by-side pages in a process's address space may
not correspond to two side-by-side pages in the cache. In fact, if you
are not careful side-by-side pages in virtual memory could wind up using
the same page in the processor cache—leading to cacheable data
being thrown away prematurely and reducing CPU performance. This is true
even with multi-way set-associative caches (though the effect is
mitigated somewhat).FreeBSD's memory allocation code implements page coloring
optimizations, which means that the memory allocation code will attempt
to locate free pages that are contiguous from the point of view of the
cache. For example, if page 16 of physical memory is assigned to page 0
of a process's virtual memory and the cache can hold 4 pages, the page
coloring code will not assign page 20 of physical memory to page 1 of a
process's virtual memory. It would, instead, assign page 21 of physical
memory. The page coloring code attempts to avoid assigning page 20
because this maps over the same cache memory as page 16 and would result
in non-optimal caching. This code adds a significant amount of
complexity to the VM memory allocation subsystem as you can well
imagine, but the result is well worth the effort. Page Coloring makes VM
memory as deterministic as physical memory in regards to cache
performance.
Conclusion
Virtual memory in modern operating systems must address a number of
different issues efficiently and for many different usage patterns. The
modular and algorithmic approach that BSD has historically taken allows
us to study and understand the current implementation as well as
relatively cleanly replace large sections of the code. There have been a
number of improvements to the FreeBSD VM system in the last several
years, and work is ongoing.
Bonus QA session by Allen Briggs
briggs@ninthwonder.com
What is the interleaving algorithm that you
refer to in your listing of the ills of the FreeBSD 3.X swap
arrangements?
FreeBSD uses a fixed swap interleave which defaults to 4. This
means that FreeBSD reserves space for four swap areas even if you
only have one, two, or three. Since swap is interleaved the linear
address space representing the four swap areas will be
fragmented if you do not actually have four swap areas. For
example, if you have two swap areas A and B FreeBSD's address
space representation for that swap area will be interleaved in
blocks of 16 pages:
A B C D A B C D A B C D A B C D
FreeBSD 3.X uses a sequential list of free
regions approach to accounting for the free swap areas.
The idea is that large blocks of free linear space can be
represented with a single list node
(kern/subr_rlist.c). But due to the
fragmentation the sequential list winds up being insanely
fragmented. In the above example, completely unused swap will
have A and B shown as free and C and D shown as
all allocated. Each A-B sequence requires a list
node to account for because C and D are holes, so the list node
cannot be combined with the next A-B sequence.
Why do we interleave our swap space instead of just tacking swap
areas onto the end and do something fancier? Because it is a whole
lot easier to allocate linear swaths of an address space and have
the result automatically be interleaved across multiple disks than
it is to try to put that sophistication elsewhere.The fragmentation causes other problems. Being a linear list
under 3.X, and having such a huge amount of inherent
fragmentation, allocating and freeing swap winds up being an O(N)
algorithm instead of an O(1) algorithm. Combined with other
factors (heavy swapping), you start getting into O(N^2) and
O(N^3) levels of overhead, which is bad. The 3.X system may also
need to allocate KVM during a swap operation to create a new list
node which can lead to a deadlock if the system is trying to
pageout pages in a low-memory situation.Under 4.X we do not use a sequential list. Instead we use a
radix tree and bitmaps of swap blocks rather than ranged list
nodes. We take the hit of preallocating all the bitmaps required
for the entire swap area up front but it winds up wasting less
memory due to the use of a bitmap (one bit per block) instead of a
linked list of nodes. The use of a radix tree instead of a
sequential list gives us nearly O(1) performance no matter how
fragmented the tree becomes.
I do not get the following:
It is important to note that the FreeBSD VM system attempts
to separate clean and dirty pages for the express reason of
avoiding unnecessary flushes of dirty pages (which eats I/O
bandwidth), nor does it move pages between the various page
queues gratuitously when the memory subsystem is not being
stressed. This is why you will see some systems with very low
cache queue counts and high active queue counts when doing a
systat -vm command.
How is the separation of clean and dirty (inactive) pages
related to the situation where you see low cache queue counts and
high active queue counts in systat -vm? Do the
systat stats roll the active and dirty pages together for the
active queue count?
Yes, that is confusing. The relationship is
goal versus reality. Our goal is to
separate the pages but the reality is that if we are not in a
memory crunch, we do not really have to.What this means is that FreeBSD will not try very hard to
separate out dirty pages (inactive queue) from clean pages (cache
queue) when the system is not being stressed, nor will it try to
deactivate pages (active queue -> inactive queue) when the system
is not being stressed, even if they are not being used.
In the &man.ls.1; / vmstat 1 example,
would not some of the page faults be data page faults (COW from
executable file to private page)? I.e., I would expect the page
faults to be some zero-fill and some program data. Or are you
implying that FreeBSD does do pre-COW for the program data?
A COW fault can be either zero-fill or program-data. The
mechanism is the same either way because the backing program-data
is almost certainly already in the cache. I am indeed lumping the
two together. FreeBSD does not pre-COW program data or zero-fill,
but it does pre-map pages that exist in its
cache.
In your section on page table optimizations, can you give a
little more detail about pv_entry and
vm_page (or should vm_page be
vm_pmap—as in 4.4, cf. pp. 180-181 of
McKusick, Bostic, Karels, Quarterman)? Specifically, what kind of
operation/reaction would require scanning the mappings?
How does Linux do in the case where FreeBSD breaks down
(sharing a large file mapping over many processes)?
A vm_page represents an (object,index#)
tuple. A pv_entry represents a hardware page
table entry (pte). If you have five processes sharing the same
physical page, and three of those processes' page tables actually
map the page, that page will be represented by a single
vm_page structure and three
pv_entry structures.
pv_entry structures only represent pages
mapped by the MMU (one pv_entry represents one
pte). This means that when we need to remove all hardware
references to a vm_page (in order to reuse the
page for something else, page it out, clear it, dirty it, and so
forth) we can simply scan the linked list of
pv_entry's associated with that
vm_page to remove or modify the pte's from
their page tables.Under Linux there is no such linked list. In order to remove
all the hardware page table mappings for a
vm_page, Linux must index into every VM object
that might have mapped the page. For
example, if you have 50 processes all mapping the same shared
library and want to get rid of page X in that library, you need to
index into the page table for each of those 50 processes even if
only 10 of them have actually mapped the page. So Linux is
trading off the simplicity of its design against performance.
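To make the FreeBSD side of that trade-off concrete, here is a deliberately simplified sketch; the field names and the pmap helper are invented for illustration, and the real kernel structures carry far more state (list heads, locking, reference counts):
struct pmap;			/* opaque: one per address space */
struct vm_object;		/* opaque: the backing object */

/* Stand-in for the machine-dependent code that clears a single pte. */
static void
pmap_clear_pte(struct pmap *pm, void *va)
{
	(void)pm;
	(void)va;
}

/* One pv_entry per pte that currently maps the page. */
struct pv_entry {
	struct pmap	*pv_pmap;	/* which address space maps it */
	void		*pv_va;		/* at which virtual address */
	struct pv_entry	*pv_next;
};

/* One vm_page per (object, index) pair. */
struct vm_page {
	struct vm_object *object;
	unsigned long	 pindex;
	struct pv_entry	*pv_list;	/* all known hardware mappings */
};

/*
 * Removing every hardware mapping of a page is a walk of the mappings
 * that actually exist, not a search of every page table in the system.
 * (The real code would also free each pv_entry here.)
 */
static void
page_remove_all(struct vm_page *m)
{
	for (struct pv_entry *pv = m->pv_list; pv != NULL; pv = pv->pv_next)
		pmap_clear_pte(pv->pv_pmap, pv->pv_va);
	m->pv_list = NULL;
}
The shape of that walk is the point: only the ptes that actually exist are ever visited.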
Many VM algorithms which are O(1) or (small N) under FreeBSD wind
up being O(N), O(N^2), or worse under Linux. Since the pte's
representing a particular page in an object tend to be at the same
offset in all the page tables they are mapped in, reducing the
number of accesses into the page tables at the same pte offset
will often avoid blowing away the L1 cache line for that offset,
which can lead to better performance.FreeBSD has added complexity (the pv_entry
scheme) in order to increase performance (to limit page table
accesses to only those pte's that need to be
modified).But FreeBSD has a scaling problem that Linux does not in that
there are a limited number of pv_entry
structures and this causes problems when you have massive sharing
of data. In this case you may run out of
pv_entry structures even though there is plenty
of free memory available. This can be fixed easily enough by
bumping up the number of pv_entry structures in
the kernel config, but we really need to find a better way to do
it.
In regards to the memory overhead of a page table versus the
pv_entry scheme: Linux uses
permanent page tables that are not throwaway, but
does not need a pv_entry for each potentially
mapped pte. FreeBSD uses throwaway page tables but
adds in a pv_entry structure for each
actually-mapped pte. I think memory utilization winds up being
about the same, giving FreeBSD an algorithmic advantage with its
ability to throw away page tables at will with very low
overhead.
Finally, in the page coloring section, it might help to have a
little more description of what you mean here. I did not quite
follow it.
Do you know how an L1 hardware memory cache works? I will
explain: Consider a machine with 16MB of main memory but only 128K
of L1 cache. Generally the way this cache works is that each 128K
block of main memory uses the same 128K of
cache. If you access offset 0 in main memory and then offset
128K in main memory you can wind up throwing away the
cached data you read from offset 0!Now, I am simplifying things greatly. What I just described
is what is called a direct mapped hardware memory
cache. Most modern caches are what are called
2-way-set-associative or 4-way-set-associative caches. The
set-associativity allows you to access up to N different memory
regions that overlap the same cache memory without destroying the
previously cached data. But only N.So if I have a 4-way set associative cache I can access offset
0, offset 128K, 256K and offset 384K and still be able to access
offset 0 again and have it come from the L1 cache. If I then
access offset 512K, however, one of the four previously cached
data objects will be thrown away by the cache.It is extremely important…
extremely important for most of a processor's
memory accesses to be able to come from the L1 cache, because the
L1 cache operates at the processor frequency. The moment you have
an L1 cache miss and have to go to the L2 cache or to main memory,
the processor will stall and potentially sit twiddling its fingers
for hundreds of instructions worth of time
waiting for a read from main memory to complete. Main memory (the
dynamic ram you stuff into a computer) is
slow, when compared to the speed of a modern
processor core.Ok, so now onto page coloring: All modern memory caches are
what are known as physical caches. They
cache physical memory addresses, not virtual memory addresses.
This allows the cache to be left alone across a process context
switch, which is very important.
- But in the Unix world you are dealing with virtual address
+ But in the &unix; world you are dealing with virtual address
spaces, not physical address spaces. Any program you write will
see the virtual address space given to it. The actual
physical pages underlying that virtual
address space are not necessarily physically contiguous! In fact,
you might have two pages that are side by side in a process's
address space which wind up being at offset 0 and offset 128K in
physical memory.A program normally assumes that two side-by-side pages will be
optimally cached. That is, that you can access data objects in
both pages without having them blow away each other's cache entry.
But this is only true if the physical pages underlying the virtual
address space are contiguous (insofar as the cache is
concerned).This is what Page coloring does. Instead of assigning
random physical pages to virtual addresses,
which may result in non-optimal cache performance, Page coloring
assigns reasonably-contiguous physical pages
to virtual addresses. Thus programs can be written under the
assumption that the characteristics of the underlying hardware
cache are the same for their virtual address space as they would
be if the program had been run directly in a physical address
space.Note that I say reasonably contiguous rather
than simply contiguous. From the point of view of a
128K direct mapped cache, the physical address 0 is the same as
the physical address 128K. So two side-by-side pages in your
virtual address space may wind up being offset 128K and offset
132K in physical memory, but could also easily be offset 128K and
offset 4K in physical memory and still retain the same cache
performance characteristics. So page-coloring does
not have to assign truly contiguous pages of
physical memory to contiguous pages of virtual memory, it just
needs to make sure it assigns contiguous pages from the point of
view of cache performance and operation.
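As a rough illustration of the arithmetic involved (a sketch, not the allocator itself), using the 128K direct-mapped cache from the example above and assuming 4K pages, a page's color is simply its physical page number modulo the number of page-sized slots in the cache:
#include <stdio.h>

#define PAGE_SIZE	4096UL
#define CACHE_SIZE	(128UL * 1024)		/* direct mapped, as above */
#define NCOLORS		(CACHE_SIZE / PAGE_SIZE)	/* 32 page colors */

static unsigned long
page_color(unsigned long paddr)
{
	return (paddr / PAGE_SIZE) % NCOLORS;
}

int
main(void)
{
	/* Physically adjacent pages get consecutive colors... */
	printf("paddr 0x00000 -> color %lu\n", page_color(0x00000UL));
	printf("paddr 0x01000 -> color %lu\n", page_color(0x01000UL));

	/* ...while pages 128K apart share a color and evict each other. */
	printf("paddr 0x20000 -> color %lu\n", page_color(0x20000UL));
	return 0;
}
Page coloring tries to hand out physical pages whose colors advance with the virtual page number, so that virtually contiguous data does not collide in the cache.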
diff --git a/en_US.ISO8859-1/articles/zip-drive/article.sgml b/en_US.ISO8859-1/articles/zip-drive/article.sgml
index 5330a94857..eba18e15d6 100644
--- a/en_US.ISO8859-1/articles/zip-drive/article.sgml
+++ b/en_US.ISO8859-1/articles/zip-drive/article.sgml
@@ -1,276 +1,287 @@
%man;
%freebsd;
+
+%trademarks;
]>
- ZIP Drives
+ &iomegazip; Drives
Jason Bacon
acadix@execpc.com
+
+
+ &tm-attrib.freebsd;
+ &tm-attrib.adaptec;
+ &tm-attrib.iomega;
+ &tm-attrib.microsoft;
+ &tm-attrib.opengroup;
+ &tm-attrib.general;
+
- ZIP Drive Basics
+ &iomegazip; Drive Basics
- ZIP disks are high capacity, removable, magnetic disks, which can be
+ &iomegazip; disks are high capacity, removable, magnetic disks, which can be
read or written by ZIP drives from IOMEGA corporation. ZIP disks are
similar to floppy disks, except that they are much faster, and have a
much greater capacity. While floppy disks typically hold 1.44
megabytes, ZIP disks are available in two sizes, namely 100 megabytes
and 250 megabytes. ZIP drives should not be confused with the
super-floppy, a 120 megabyte floppy drive which also handles traditional
1.44 megabyte floppies.IOMEGA also sells a higher capacity, higher performance drive called
- the JAZZ drive. JAZZ drives come in 1 gigabyte and 2 gigabyte
+ the &jaz;/JAZZ drive. Jaz drives come in 1 gigabyte and 2 gigabyte
sizes.ZIP drives are available as internal or external units, using one of
three interfaces:
The SCSI (Small Computer System Interface) interface is the
fastest, most sophisticated, most expandable, and most expensive
interface. The SCSI interface is used by all types of computers
from PC's to RISC workstations to minicomputers, to connect all
types of peripherals such as disk drives, tape drives, scanners, and
so on. SCSI ZIP drives may be internal or external, assuming your
host adapter has an external connector.If you are using an external SCSI device, it is important
never to connect or disconnect it from the SCSI bus while the
computer is running. Doing so may cause file-system damage on the
disks that remain connected.If you want maximum performance and easy setup, the SCSI
interface is the best choice. This will probably require adding a
SCSI host adapter, since most PC's (except for high-performance
servers) do not have built-in SCSI support. Each SCSI host adapter
can support either 7 or 15 SCSI devices, depending on the
model.Each SCSI device has its own controller, and these
controllers are fairly intelligent and well standardized (SCSI is,
after all, a standard interface), so from the operating system's
point of view, all SCSI disk drives look about the same, as do all
SCSI tape drives, etc. To support SCSI devices, the operating
system need only have a driver for the particular host adapter, and
a generic driver for each type of device, i.e. a SCSI disk driver,
SCSI tape driver, and so on. There are some SCSI devices that can
be better utilized with specialized drivers (e.g. DAT tape drives),
but they tend to work OK with the generic driver, too. It is just
that the generic drivers may not support some of the special
features.
Using a SCSI ZIP drive is simply a matter of determining which
device file in the /dev directory represents
the ZIP drive. This can be determined by looking at the boot
messages while FreeBSD is booting (or in
/var/log/messages after booting), where you
will see a line something like this:
da1: <IOMEGA ZIP 100 D.13> Removable Direct Access SCSI-2 Device
This means that the ZIP drive is represented by the file
/dev/da1.
The IDE (Integrated Drive Electronics) interface is a low-cost
disk drive interface used by many desktop PC's. Most IDE devices
are strictly internal.Performance of IDE ZIP drives is comparable to SCSI ZIP drives.
(The IDE interface is not as fast as SCSI, but ZIP drives
performance is limited mainly by the mechanics of the drive, not by
the bus interface.)The drawback of the IDE interface is the limitations it imposes.
Most IDE adapters can only support 2 devices, and IDE interfaces are
not typically designed for the long term. For example, the original
IDE interface would not support hard disks with more than 1024
cylinders, which forced a lot of people to upgrade their hardware
prematurely. If you have plans to expand your PC by adding another
disk, a tape drive, or scanner, you may want to invest in a SCSI
host adapter and a SCSI ZIP drive to avoid problems in the
future.
IDE devices in FreeBSD are prefixed with an a.
For example, an IDE hard disk might be
/dev/ad0, an IDE (ATAPI) CDROM might be
/dev/acd1, and so on.
The parallel port interface is popular for portable external
devices such as external ZIP drives and scanners, because virtually
every computer has a standard parallel port (usually used for
printers). This makes things easy for people to transfer data
between multiple computers by toting around their ZIP drive.Performance will generally be slower than a SCSI or IDE ZIP
drive, since it is limited by the speed of the parallel port.
Parallel port speed varies considerably between various computers,
and can often be configured in the system BIOS. Some machines will
also require BIOS configuration to operate the parallel port in
bidirectional mode. (Parallel ports were originally designed only
for output to printers.)
Parallel ZIP: The vpo Driver
To use a parallel-port ZIP drive under FreeBSD, the
vpo driver must be configured into the kernel.
Parallel port ZIP drives also have a built-in SCSI controller. The vpo
driver allows the FreeBSD kernel to communicate with the ZIP drive's
SCSI controller through the parallel port.Since the vpo driver is not a standard part of the kernel (as of
FreeBSD 3.2), you will need to rebuild the kernel to enable this device.
The process of building a kernel is outlined in detail in another
section. The following steps outline the process in brief for the
purpose of enabling the vpo driver:
Run /stand/sysinstall, and install the kernel
source code on your system.
Create a custom kernel configuration that includes the vpo driver:
&prompt.root; cd /sys/i386/conf
&prompt.root; cp GENERIC MYKERNEL
Edit MYKERNEL, change the
ident line to MYKERNEL, and
uncomment the line describing the vpo driver.If you have a second parallel port, you may need to copy the
section for ppc0 to create a
ppc1 device. The second parallel port usually
uses IRQ 5 and address 378. Only the IRQ is required in the config
file.If your root hard disk is a SCSI disk, you might run into a
problem with probing order, which will cause the system to attempt
to use the ZIP drive as the root device. This will cause a boot
failure, unless you happen to have a FreeBSD root file-system on
your ZIP disk! In this case, you will need to wire
down the root disk, i.e. force the kernel to bind a
specific device to /dev/da0, the root SCSI
disk. It will then assign the ZIP disk to the next available SCSI
disk, e.g. /dev/da1. To wire down your SCSI hard
drive as da0, change the line
device da0
to
disk da0 at scbus0 target 0 unit 0
You may need to change the target above to match the SCSI ID of
your disk drive. You should also wire down the scbus0 entry to your
- controller. For example, if you have an Adaptec 15xx controller,
+ controller. For example, if you have an &adaptec; 15xx controller,
you would change
controller scbus0
to
controller scbus0 at aha0
Finally, since you are creating a custom kernel configuration,
you can take the opportunity to remove all the unnecessary drivers.
This should be done with a great deal of caution, and only if you
feel confident about making modifications to your kernel
configuration. Removing unnecessary drivers will reduce the kernel
size, leaving more memory available for your applications. To
determine which drivers are not needed, go to the end of the file
/var/log/messages, and look for lines reading
"not found". Then, comment out these devices in your config file.
You can also change other options to reduce the size and increase
the speed of your kernel. Read the section on rebuilding your kernel
for more complete information.Now it is time to compile the kernel:&prompt.root; /usr/sbin/config MYKERNEL
&prompt.root; cd ../../compile/MYKERNEL
&prompt.root; make clean depend && make all install
After the kernel is rebuilt, you will need to reboot. Make sure the
ZIP drive is connected to the parallel port before the boot begins. You
should see the ZIP drive show up in the boot messages as device vpo0 or
vpo1, depending on which parallel port the drive is attached to. It
should also show which device file the ZIP drive has been bound to. This
will be /dev/da0 if you have no other SCSI disks in
the system, or /dev/da1 if you have a SCSI hard
disk wired down as the root device.
Mounting ZIP disks
To access the ZIP disk, you simply mount it like any other disk
device. The file-system is represented as slice 4 on the device, so for
SCSI or parallel ZIP disks, you would use:
&prompt.root; mount_msdos /dev/da1s4 /mnt
For IDE ZIP drives, use:
&prompt.root; mount_msdos /dev/ad1s4 /mnt
It will also be helpful to update /etc/fstab to
make mounting easier. Add a line like the following, edited to suit your
system:
/dev/da1s4 /zip msdos rw,noauto 0 0
and create the directory /zip.
Then, you can mount simply by typing
&prompt.root; mount /zip
and unmount by typing
&prompt.root; umount /zip
For more information on the format of
/etc/fstab, see &man.fstab.5;.You can also create a FreeBSD file-system on the ZIP disk using
&man.newfs.8;. However, the disk will only be usable on a FreeBSD
system, or perhaps a few other &unix; clones that recognize FreeBSD
- file-systems. (Definitely not DOS or Windows.)
+ file-systems. (Definitely not DOS or &windows;.)