diff --git a/lib/geom/concat/gconcat.8 b/lib/geom/concat/gconcat.8 index 55c05b469d4a..165f809ffba8 100644 --- a/lib/geom/concat/gconcat.8 +++ b/lib/geom/concat/gconcat.8 @@ -1,230 +1,229 @@ .\" Copyright (c) 2004-2005 Pawel Jakub Dawidek .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd June 14, 2021 +.Dd January 23, 2025 .Dt GCONCAT 8 .Os .Sh NAME .Nm gconcat .Nd "disk concatenation control utility" .Sh SYNOPSIS .Nm .Cm create .Op Fl v .Ar name .Ar prov ... .Nm .Cm destroy .Op Fl fv .Ar name ... .Nm .Cm label .Op Fl hv .Ar name .Ar prov ... .Nm .Cm append .Op Fl hv .Ar name .Ar prov .Nm .Cm stop .Op Fl fv .Ar name ... .Nm .Cm clear .Op Fl v .Ar prov ... .Nm .Cm dump .Ar prov ... 
.Nm .Cm list .Nm .Cm status .Nm .Cm load .Nm .Cm unload .Sh DESCRIPTION The .Nm utility is used for device concatenation configuration. The concatenation can be configured using two different methods: .Dq manual or .Dq automatic . When using the .Dq manual method, no metadata are stored on the devices, so the concatenated device has to be configured by hand every time it is needed. The .Dq automatic method uses on-disk metadata to detect devices. Once devices are labeled, they will be automatically detected and configured. .Pp The first argument to .Nm indicates an action to be performed: .Bl -tag -width ".Cm destroy" .It Cm create Concatenate the given devices with specified .Ar name . This is the .Dq manual method. The kernel module .Pa geom_concat.ko will be loaded if it is not loaded already. .It Cm label Concatenate the given devices with the specified .Ar name . This is the .Dq automatic method, where metadata are stored in every device's last sector. The kernel module .Pa geom_concat.ko will be loaded if it is not loaded already. .Pp Additional options include: .Bl -tag -width ".Fl h" .It Fl h Hardcode providers' names in metadata. .El .It Cm append Append a new device to the end of an existing concatenate device with the specified .Ar name . .Pp If the existing device is using the .Dq manual method, the new device is simply appended as-is. .Pp If the existing device is using the .Dq automatic method, the device is appended persistently. New .Cm gconcat metadata is written to all existing components, as well as to the newly added one. .Pp Additional options include: .Bl -tag -width ".Fl h" .It Fl h Hardcode providers' names in metadata. .El .It Cm stop Turn off existing concatenate device by its .Ar name . This command does not touch on-disk metadata! .Pp Additional options include: .Bl -tag -width ".Fl f" .It Fl f Stop the given device even if it is opened. .El .It Cm destroy Same as .Cm stop . .It Cm clear Clear metadata on the given devices. 
.It Cm dump Dump metadata stored on the given devices. .It Cm list See .Xr geom 8 . .It Cm status See .Xr geom 8 . .It Cm load See .Xr geom 8 . .It Cm unload See .Xr geom 8 . .El .Pp Additional options: .Bl -tag -width indent .It Fl v Be more verbose. .El .Sh SYSCTL VARIABLES The following .Xr sysctl 8 variables can be used to control the behavior of the .Nm CONCAT GEOM class. The default value is shown next to each variable. .Bl -tag -width indent .It Va kern.geom.concat.debug : No 0 Debug level of the .Nm CONCAT GEOM class. This can be set to a number between 0 and 3 inclusive. If set to 0 minimal debug information is printed, and if set to 3 the maximum amount of debug information is printed. .El .Sh EXIT STATUS Exit status is 0 on success, and 1 if the command fails. .Sh EXAMPLES The following example shows how to configure four disks for automatic concatenation, create a file system on it, and mount it: .Bd -literal -offset indent gconcat label -v data /dev/da0 /dev/da1 /dev/da2 /dev/da3 newfs /dev/concat/data mount /dev/concat/data /mnt [...] umount /mnt gconcat stop data gconcat unload .Ed .Pp Configure concatenated provider on one disk only. Create file system. Add two more disks and extend existing file system. .Bd -literal -offset indent gconcat label data /dev/da0 newfs /dev/concat/data gconcat label data /dev/da0 /dev/da1 /dev/da2 growfs /dev/concat/data .Ed .Sh SEE ALSO .Xr geom 4 , .Xr loader.conf 5 , .Xr geom 8 , .Xr growfs 8 , -.Xr gvinum 8 , .Xr mount 8 , .Xr newfs 8 , .Xr sysctl 8 , .Xr umount 8 .Sh HISTORY The .Nm utility appeared in .Fx 5.3 . .Sh AUTHORS .An Pawel Jakub Dawidek Aq Mt pjd@FreeBSD.org diff --git a/lib/geom/mirror/gmirror.8 b/lib/geom/mirror/gmirror.8 index 0d1bb4566d58..aeffb2d948b1 100644 --- a/lib/geom/mirror/gmirror.8 +++ b/lib/geom/mirror/gmirror.8 @@ -1,455 +1,454 @@ .\" Copyright (c) 2004-2009 Pawel Jakub Dawidek .\" All rights reserved. 
.\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd October 5, 2022 +.Dd January 23, 2025 .Dt GMIRROR 8 .Os .Sh NAME .Nm gmirror .Nd "control utility for mirrored devices" .Sh SYNOPSIS To compile GEOM_MIRROR into your kernel, add the following lines to your kernel configuration file: .Bd -ragged -offset indent .Cd "options GEOM_MIRROR" .Ed .Pp Alternatively, to load the GEOM_MIRROR module at boot time, add the following line to your .Xr loader.conf 5 : .Bd -literal -offset indent geom_mirror_load="YES" .Ed .Pp .No Usage of the Nm utility: .Pp .Nm .Cm label .Op Fl Fhnv .Op Fl b Ar balance .Op Fl s Ar slice .Ar name .Ar prov ... .Nm .Cm clear .Op Fl v .Ar prov ... 
.Nm .Cm create .Op Fl Fnv .Op Fl b Ar balance .Op Fl s Ar slice .Ar name .Ar prov ... .Nm .Cm configure .Op Fl adfFhnv .Op Fl b Ar balance .Op Fl s Ar slice .Ar name .Nm .Cm configure .Op Fl v .Fl p Ar priority .Ar name .Ar prov .Nm .Cm rebuild .Op Fl v .Ar name .Ar prov ... .Nm .Cm resize .Op Fl v .Op Fl s Ar size .Ar name .Nm .Cm insert .Op Fl hiv .Op Fl p Ar priority .Ar name .Ar prov ... .Nm .Cm remove .Op Fl v .Ar name .Ar prov ... .Nm .Cm activate .Op Fl v .Ar name .Ar prov ... .Nm .Cm deactivate .Op Fl v .Ar name .Ar prov ... .Nm .Cm destroy .Op Fl fv .Ar name ... .Nm .Cm forget .Op Fl v .Ar name ... .Nm .Cm stop .Op Fl fv .Ar name ... .Nm .Cm dump .Ar prov ... .Nm .Cm list .Nm .Cm status .Nm .Cm load .Nm .Cm unload .Sh DESCRIPTION The .Nm utility is used for mirror (RAID1) configurations. After a mirror's creation, all components are detected and configured automatically. All operations like failure detection, stale component detection, rebuild of stale components, etc.\& are also done automatically. The .Nm utility uses on-disk metadata (stored in the provider's last sector) to store all needed information. Since the last sector is used for this purpose, it is possible to place a root file system on a mirror. .Pp The first argument to .Nm indicates an action to be performed: .Bl -tag -width ".Cm deactivate" .It Cm label Create a mirror. The order of components is important, because a component's priority is based on its position (starting from 0 to 255). The component with the biggest priority is used by the .Cm prefer balance algorithm and is also used as a master component when resynchronization is needed, e.g.\& after a power failure when the device was open for writing. .Pp Additional options include: .Bl -tag -width ".Fl b Ar balance" .It Fl b Ar balance Specifies balance algorithm to use, one of: .Bl -tag -width ".Cm round-robin" .It Cm load Read from the component with the lowest load. This is the default balance algorithm. 
.It Cm prefer Read from the component with the biggest priority. .It Cm round-robin Use round-robin algorithm when choosing component to read. .It Cm split Split read requests, which are bigger than or equal to slice size on N pieces, where N is the number of active components. .El .It Fl F Do not synchronize after a power failure or system crash. Assumes device is in consistent state. .It Fl h Hardcode providers' names in metadata. .It Fl n Turn off autosynchronization of stale components. .It Fl s Ar slice When using the .Cm split balance algorithm and an I/O READ request is bigger than or equal to this value, the I/O request will be split into N pieces, where N is the number of active components. Defaults to 4096 bytes. .El .It Cm clear Clear metadata on the given providers. .It Cm create Similar to .Cm label , but creates mirror without storing on-disk metadata in last sector. This special "manual" operation mode assumes some external control to manage mirror detection after reboot, device hot-plug and other external events. .It Cm configure Configure the given device. .Pp Additional options include: .Bl -tag -width ".Fl p Ar priority" .It Fl a Turn on autosynchronization of stale components. .It Fl b Ar balance Specifies balance algorithm to use. .It Fl d Do not hardcode providers' names in metadata. .It Fl f Synchronize device after a power failure or system crash. .It Fl F Do not synchronize after a power failure or system crash. Assumes device is in consistent state. .It Fl h Hardcode providers' names in metadata. .It Fl n Turn off autosynchronization of stale components. .It Fl p Ar priority Specifies priority for the given component .Ar prov . .It Fl s Ar slice Specifies slice size for .Cm split balance algorithm. .El .It Cm rebuild Rebuild the given mirror components forcibly. If autosynchronization was not turned off for the given device, this command should be unnecessary. .It Cm resize Change the size of the given mirror. 
.Pp Additional options include: .Bl -tag -width ".Fl s Ar size" .It Fl s Ar size New size of the mirror is expressed in logical block numbers. This option can be omitted, then it will be automatically calculated to maximum available size. .El .It Cm insert Add the given component(s) to the existing mirror. .Pp Additional options include: .Bl -tag -width ".Fl p Ar priority" .It Fl h Hardcode providers' names in metadata. .It Fl i Mark component(s) as inactive immediately after insertion. .It Fl p Ar priority Specifies priority of the given component(s). .El .It Cm remove Remove the given component(s) from the mirror and clear metadata on it. .It Cm activate Activate the given component(s), which were marked as inactive before. .It Cm deactivate Mark the given component(s) as inactive, so it will not be automatically connected to the mirror. .It Cm destroy Stop the given mirror and clear metadata on all its components. .Pp Additional options include: .Bl -tag -width ".Fl f" .It Fl f Stop the given mirror even if it is opened. .El .It Cm forget Forget about components which are not connected. This command is useful when a disk has failed and cannot be reconnected, preventing the .Cm remove command from being used to remove it. .It Cm stop Stop the given mirror. .Pp Additional options include: .Bl -tag -width ".Fl f" .It Fl f Stop the given mirror even if it is opened. .El .It Cm dump Dump metadata stored on the given providers. .It Cm list See .Xr geom 8 . .It Cm status See .Xr geom 8 . .It Cm load See .Xr geom 8 . .It Cm unload See .Xr geom 8 . .El .Pp Additional options include: .Bl -tag -width ".Fl v" .It Fl v Be more verbose. .El .Sh EXIT STATUS Exit status is 0 on success, and 1 if the command fails. .Sh EXAMPLES Use 3 disks to setup a mirror. Choose split balance algorithm, split only requests which are bigger than or equal to 2kB. 
Create file system, mount it, then unmount it and stop device: .Bd -literal -offset indent gmirror label -v -b split -s 2048 data da0 da1 da2 newfs /dev/mirror/data mount /dev/mirror/data /mnt \&... umount /mnt gmirror stop data gmirror unload .Ed .Pp Create a mirror on disk with valid data (note that the last sector of the disk will be overwritten). Add another disk to this mirror, so it will be synchronized with existing disk: .Bd -literal -offset indent gmirror label -v -b round-robin data da0 gmirror insert data da1 .Ed .Pp Create a mirror, but do not use automatic synchronization feature. Add another disk and rebuild it: .Bd -literal -offset indent gmirror label -v -n -b load data da0 da1 gmirror insert data da2 gmirror rebuild data da2 .Ed .Pp One disk failed. Replace it with a brand new one: .Bd -literal -offset indent gmirror forget data gmirror insert data da1 .Ed .Pp Create a mirror, deactivate one component, do the backup and connect it again. It will not be resynchronized, if there is no need to do so (there were no writes in the meantime): .Bd -literal -offset indent gmirror label data da0 da1 gmirror deactivate data da1 dd if=/dev/da1 of=/backup/data.img bs=1m gmirror activate data da1 .Ed .Sh SYSCTL VARIABLES The following .Xr sysctl 8 variables can be used to configure behavior for all mirrors. .Bl -tag -width indent .It Va kern.geom.mirror.debug Control the verbosity of kernel logging related to mirrors. A value larger than 0 will enable debug logging. .It Va kern.geom.mirror.timeout The amount of time, in seconds, to wait for all copies of a mirror to appear before starting the mirror. Disks that appear after the mirror has been started are not automatically added to the mirror. .It Va kern.geom.mirror.idletime The amount of time, in seconds, which must elapse after the last write to a mirror before that mirror is marked clean. Clean mirrors do not need to be synchronized after a power failure or system crash. 
A small value may result in frequent overwrites of the disks' metadata sectors, and thus may reduce the longevity of the disks. .It Va kern.geom.mirror.disconnect_on_failure Determine whether a disk is automatically removed from its mirror when an I/O request to that disk fails. .It Va kern.geom.mirror.sync_requests The number of parallel I/O requests used while synchronizing a mirror. This parameter may only be configured as a .Xr loader.conf 5 tunable. .It Va kern.geom.mirror.sync_update_period The period, in seconds, at which a synchronizing mirror's metadata is updated. Periodic updates are used to record a synchronization's progress so that an interrupted synchronization may be resumed starting at the recorded offset, rather than at the beginning. A smaller value results in more accurate progress tracking, but also increases the number of non-sequential writes to the disk being synchronized. If the sysctl value is 0, no updates are performed until the synchronization is complete. .El .Sh NOTES Doing kernel dumps to .Nm providers is possible, but some conditions have to be met. First of all, a kernel dump will go only to one component and .Nm always chooses the component with the highest priority. Reading a dump from the mirror on boot will only work if the .Cm prefer balance algorithm is used (that way .Nm will read only from the component with the highest priority). If you use a different balance algorithm, you should create an .Xr rc 8 script that sets the balance algorithm to .Cm prefer , for example with the following command: .Bd -literal -offset indent gmirror configure -b prefer data .Ed .Pp Make sure that .Xr rcorder 8 schedules the new script before .Xr savecore 8 . The desired balance algorithm can be restored later on by placing the following command in .Xr rc.local 8 : .Bd -literal -offset indent gmirror configure -b round-robin data .Ed .Pp The decision which component to choose for dumping is made when .Xr dumpon 8 is called. 
If on the next boot a component with a higher priority will be available, the prefer algorithm will choose to read from it and .Xr savecore 8 will find nothing. If on the next boot a component with the highest priority will be synchronized, the prefer balance algorithm will read from the next one, thus will find nothing there. .Sh SEE ALSO .Xr geom 4 , .Xr dumpon 8 , .Xr geom 8 , -.Xr gvinum 8 , .Xr mount 8 , .Xr newfs 8 , .Xr savecore 8 , .Xr sysctl 8 , .Xr umount 8 .Sh HISTORY The .Nm utility appeared in .Fx 5.3 . .Sh AUTHORS .An Pawel Jakub Dawidek Aq Mt pjd@FreeBSD.org .Sh BUGS There should be a way to change a component's priority inside a running mirror. .Pp There should be a section with an implementation description. diff --git a/lib/geom/raid/graid.8 b/lib/geom/raid/graid.8 index 50c0116ba22e..f722fd5c17bd 100644 --- a/lib/geom/raid/graid.8 +++ b/lib/geom/raid/graid.8 @@ -1,319 +1,318 @@ .\" Copyright (c) 2010 Alexander Motin .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd April 4, 2013 +.Dd January 23, 2025 .Dt GRAID 8 .Os .Sh NAME .Nm graid .Nd "control utility for software RAID devices" .Sh SYNOPSIS .Nm .Cm label .Op Fl f .Op Fl o Ar fmtopt .Op Fl S Ar size .Op Fl s Ar strip .Ar format .Ar label .Ar level .Ar prov ... .Nm .Cm add .Op Fl f .Op Fl S Ar size .Op Fl s Ar strip .Ar name .Ar label .Ar level .Nm .Cm delete .Op Fl f .Ar name .Op Ar label | Ar num .Nm .Cm insert .Ar name .Ar prov ... .Nm .Cm remove .Ar name .Ar prov ... .Nm .Cm fail .Ar name .Ar prov ... .Nm .Cm stop .Op Fl fv .Ar name ... .Nm .Cm list .Nm .Cm status .Nm .Cm load .Nm .Cm unload .Sh DESCRIPTION The .Nm utility is used to manage software RAID configurations, supported by the GEOM RAID class. GEOM RAID class uses on-disk metadata to provide access to software-RAID volumes defined by different RAID BIOSes. Depending on RAID BIOS type and its metadata format, different subsets of configurations and features are supported. To allow booting from RAID volume, the metadata format should match the RAID BIOS type and its capabilities. To guarantee that these match, it is recommended to create volumes via the RAID BIOS interface, while experienced users are free to do it using this utility. .Pp The first argument to .Nm indicates an action to be performed: .Bl -tag -width ".Cm destroy" .It Cm label Create an array with single volume. The .Ar format argument specifies the on-disk metadata format to use for this array, such as "Intel". 
The .Ar label argument specifies the label of the created volume. The .Ar level argument specifies the RAID level of the created volume, such as: "RAID0", "RAID1", etc. The subsequent list enumerates providers to use as array components. The special name "NONE" can be used to reserve space for absent disks. The order of components can be important, depending on specific RAID level and metadata format. .Pp Additional options include: .Bl -tag -width ".Fl s Ar strip" .It Fl f Enforce specified configuration creation if it is officially unsupported, but technically can be created. .It Fl o Ar fmtopt Specifies metadata format options. .It Fl S Ar size Use .Ar size bytes on each component for this volume. Should be used if several volumes per array are planned, or if smaller components going to be inserted later. Defaults to size of the smallest component. .It Fl s Ar strip Specifies strip size in bytes. Defaults to 131072. .El .It Cm add Create another volume on the existing array. The .Ar name argument is the name of the existing array, reported by label command. The rest of arguments are the same as for the label command. .It Cm delete Delete volume(s) from the existing array. When the last volume is deleted, the array is also deleted and its metadata erased. The .Ar name argument is the name of existing array. Optional .Ar label or .Ar num arguments allow specifying volume for deletion. .Pp Additional options include: .Bl -tag -width ".Fl f" .It Fl f Delete volume(s) even if it is still open. .El .It Cm insert Insert specified provider(s) into specified array instead of the first missing or failed components. If there are no such components, mark disk(s) as spare. .It Cm remove Remove the specified provider(s) from the specified array and erase metadata. If there are spare disks present, the removed disk(s) will be replaced by spares. .It Cm fail Mark the given disks(s) as failed, removing from active use unless absolutely necessary due to exhausted redundancy. 
If there are spare disks present - failed disk(s) will be replaced with one of them. .It Cm stop Stop the given array. The metadata will not be erased. .Pp Additional options include: .Bl -tag -width ".Fl f" .It Fl f Stop the given array even if some of its volumes are opened. .El .It Cm list See .Xr geom 8 . .It Cm status See .Xr geom 8 . .It Cm load See .Xr geom 8 . .It Cm unload See .Xr geom 8 . .El .Pp Additional options include: .Bl -tag -width ".Fl v" .It Fl v Be more verbose. .El .Sh SUPPORTED METADATA FORMATS The GEOM RAID class follows a modular design, allowing different metadata formats to be used. Support is currently implemented for the following formats: .Bl -tag -width "Intel" .It DDF The format defined by the SNIA Common RAID Disk Data Format v2.0 specification. Used by some Adaptec RAID BIOSes and some hardware RAID controllers. Because of high format flexibility different implementations support different set of features and have different on-disk metadata layouts. To provide compatibility, the GEOM RAID class mimics capabilities of the first detected DDF array. Respecting that, it may support different number of disks per volume, volumes per array, partitions per disk, etc. The following configurations are supported: RAID0 (2+ disks), RAID1 (2+ disks), RAID1E (3+ disks), RAID3 (3+ disks), RAID4 (3+ disks), RAID5 (3+ disks), RAID5E (4+ disks), RAID5EE (4+ disks), RAID5R (3+ disks), RAID6 (4+ disks), RAIDMDF (4+ disks), RAID10 (4+ disks), SINGLE (1 disk), CONCAT (2+ disks). .Pp Format supports two options "BE" and "LE", that mean big-endian byte order defined by specification (default) and little-endian used by some Adaptec controllers. .It Intel The format used by Intel RAID BIOS. Supports up to two volumes per array. Supports configurations: RAID0 (2+ disks), RAID1 (2 disks), RAID5 (3+ disks), RAID10 (4 disks). 
Configurations not supported by Intel RAID BIOS, but enforceable on your own risk: RAID1 (3+ disks), RAID1E (3+ disks), RAID10 (6+ disks). .It JMicron The format used by JMicron RAID BIOS. Supports one volume per array. Supports configurations: RAID0 (2+ disks), RAID1 (2 disks), RAID10 (4 disks), CONCAT (2+ disks). Configurations not supported by JMicron RAID BIOS, but enforceable on your own risk: RAID1 (3+ disks), RAID1E (3+ disks), RAID10 (6+ disks), RAID5 (3+ disks). .It NVIDIA The format used by NVIDIA MediaShield RAID BIOS. Supports one volume per array. Supports configurations: RAID0 (2+ disks), RAID1 (2 disks), RAID5 (3+ disks), RAID10 (4+ disks), SINGLE (1 disk), CONCAT (2+ disks). Configurations not supported by NVIDIA MediaShield RAID BIOS, but enforceable on your own risk: RAID1 (3+ disks). .It Promise The format used by Promise and AMD/ATI RAID BIOSes. Supports multiple volumes per array. Each disk can be split to be used by up to two arbitrary volumes. Supports configurations: RAID0 (2+ disks), RAID1 (2 disks), RAID5 (3+ disks), RAID10 (4 disks), SINGLE (1 disk), CONCAT (2+ disks). Configurations not supported by RAID BIOSes, but enforceable on your own risk: RAID1 (3+ disks), RAID10 (6+ disks). .It SiI The format used by SiliconImage RAID BIOS. Supports one volume per array. Supports configurations: RAID0 (2+ disks), RAID1 (2 disks), RAID5 (3+ disks), RAID10 (4 disks), SINGLE (1 disk), CONCAT (2+ disks). Configurations not supported by SiliconImage RAID BIOS, but enforceable on your own risk: RAID1 (3+ disks), RAID10 (6+ disks). .El .Sh SUPPORTED RAID LEVELS The GEOM RAID class follows a modular design, allowing different RAID levels to be used. Full support for the following RAID levels is currently implemented: RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT. The following RAID levels supported as read-only for volumes in optimal state (without using redundancy): RAID4, RAID5, RAID5E, RAID5EE, RAID5R, RAID6, RAIDMDF. 
.Sh RAID LEVEL MIGRATION The GEOM RAID class has no support for RAID level migration, allowed by some metadata formats. If you started migration using BIOS or in some other way, make sure to complete it there. Do not run GEOM RAID class on migrating volumes under pain of possible data corruption! .Sh 2TiB BARRIERS NVIDIA metadata format does not support volumes above 2TiB. .Sh SYSCTL VARIABLES The following .Xr sysctl 8 variable can be used to control the behavior of the .Nm RAID GEOM class. .Bl -tag -width indent .It Va kern.geom.raid.aggressive_spare : No 0 Use any disks without metadata connected to controllers of the vendor matching to volume metadata format as spare. Use it with much care to not lose data if connecting unrelated disk! .It Va kern.geom.raid.clean_time : No 5 Mark volume as clean when idle for the specified number of seconds. .It Va kern.geom.raid.debug : No 0 Debug level of the .Nm RAID GEOM class. .It Va kern.geom.raid.enable : No 1 Enable on-disk metadata taste. .It Va kern.geom.raid.idle_threshold : No 1000000 Time in microseconds to consider a volume idle for rebuild purposes. .It Va kern.geom.raid.name_format : No 0 Providers name format: 0 -- raid/r{num}, 1 -- raid/{label}. .It Va kern.geom.raid.read_err_thresh : No 10 Number of read errors equated to disk failure. Write errors are always considered as disk failures. .It Va kern.geom.raid.start_timeout : No 30 Time to wait for missing array components on startup. .It Va kern.geom.raid. Ns Ar X Ns Va .enable : No 1 Enable taste for specific metadata or transformation module. .El .Sh EXIT STATUS Exit status is 0 on success, and non-zero if the command fails. .Sh SEE ALSO .Xr geom 4 , -.Xr geom 8 , -.Xr gvinum 8 +.Xr geom 8 .Sh HISTORY The .Nm utility appeared in .Fx 9.0 . .Sh AUTHORS .An Alexander Motin Aq Mt mav@FreeBSD.org .An M. 
Warner Losh Aq Mt imp@FreeBSD.org diff --git a/lib/geom/raid3/graid3.8 b/lib/geom/raid3/graid3.8 index 0e8eebc2bd81..e1bcdac17f99 100644 --- a/lib/geom/raid3/graid3.8 +++ b/lib/geom/raid3/graid3.8 @@ -1,255 +1,254 @@ .\" Copyright (c) 2004-2005 Pawel Jakub Dawidek .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd January 15, 2012 +.Dd January 23, 2025 .Dt GRAID3 8 .Os .Sh NAME .Nm graid3 .Nd "control utility for RAID3 devices" .Sh SYNOPSIS .Nm .Cm label .Op Fl Fhnrvw .Op Fl s Ar blocksize .Ar name .Ar prov prov prov ... .Nm .Cm clear .Op Fl v .Ar prov ... 
.Nm .Cm configure .Op Fl adfFhnrRvwW .Ar name .Nm .Cm rebuild .Op Fl v .Ar name .Ar prov .Nm .Cm insert .Op Fl hv .Op Fl n Ar number .Ar name .Ar prov .Nm .Cm remove .Op Fl v .Fl n Ar number .Ar name .Nm .Cm stop .Op Fl fv .Ar name ... .Nm .Cm list .Nm .Cm status .Nm .Cm load .Nm .Cm unload .Sh DESCRIPTION The .Nm utility is used for RAID3 array configuration. After a device is created, all components are detected and configured automatically. All operations such as failure detection, stale component detection, rebuild of stale components, etc.\& are also done automatically. The .Nm utility uses on-disk metadata (the provider's last sector) to store all needed information. .Pp The first argument to .Nm indicates an action to be performed: .Bl -tag -width ".Cm configure" .It Cm label Create a RAID3 device. The last given component will contain parity data, whilst the others will all contain regular data. The number of components must be equal to 3, 5, 9, 17, etc.\& (2^n + 1). .Pp Additional options include: .Bl -tag -width ".Fl h" .It Fl F Do not synchronize after a power failure or system crash. Assumes device is in consistent state. .It Fl h Hardcode providers' names in metadata. .It Fl n Turn off autosynchronization of stale components. .It Fl r Use parity component for reading in round-robin fashion. Without this option the parity component is not used at all for reading operations when the device is in a complete state. With this option specified random I/O read operations are even 40% faster, but sequential reads are slower. One cannot use this option if the .Fl w option is also specified. .It Fl s Manually specify array block size. Block size will be set equal to least common multiple of all component's sector sizes and specified value. Note that array sector size calculated as multiple of block size and number of regular data components. Big values may decrease performance and compatibility, as all I/O requests have to be multiple of sector size. 
.It Fl w Use verify reading feature. When reading from a device in a complete state, also read data from the parity component and verify the data by comparing XORed regular data with parity data. If verification fails, an .Er EIO error is returned and the value of the .Va kern.geom.raid3.stat.parity_mismatch sysctl is increased. One cannot use this option if the .Fl r option is also specified. .El .It Cm clear Clear metadata on the given providers. .It Cm configure Configure the given device. .Pp Additional options include: .Bl -tag -width ".Fl a" .It Fl a Turn on autosynchronization of stale components. .It Fl d Do not hardcode providers' names in metadata. .It Fl f Synchronize device after a power failure or system crash. .It Fl F Do not synchronize after a power failure or system crash. Assumes device is in consistent state. .It Fl h Hardcode providers' names in metadata. .It Fl n Turn off autosynchronization of stale components. .It Fl r Turn on round-robin reading. .It Fl R Turn off round-robin reading. .It Fl w Turn on verify reading. .It Fl W Turn off verify reading. .El .It Cm rebuild Rebuild the given component forcibly. If autosynchronization was not turned off for the given device, this command should be unnecessary. .It Cm insert Add the given component to the existing array, if one of the components was removed previously with the .Cm remove command or if one component is missing and will not be connected again. If no number is given, new component will be added instead of first missed component. .Pp Additional options include: .Bl -tag -width ".Fl h" .It Fl h Hardcode providers' names in metadata. .El .It Cm remove Remove the given component from the given array and clear metadata on it. .It Cm stop Stop the given arrays. .Pp Additional options include: .Bl -tag -width ".Fl f" .It Fl f Stop the given array even if it is opened. .El .It Cm list See .Xr geom 8 . .It Cm status See .Xr geom 8 . .It Cm load See .Xr geom 8 . .It Cm unload See .Xr geom 8 . 
.El .Pp Additional options include: .Bl -tag -width ".Fl v" .It Fl v Be more verbose. .El .Sh EXIT STATUS Exit status is 0 on success, and 1 if the command fails. .Sh EXAMPLES Use 3 disks to setup a RAID3 array (with the round-robin reading feature). Create a file system, mount it, then unmount it and stop device: .Bd -literal -offset indent graid3 label -v -r data da0 da1 da2 newfs /dev/raid3/data mount /dev/raid3/data /mnt \&... umount /mnt graid3 stop data graid3 unload .Ed .Pp Create a RAID3 array, but do not use the automatic synchronization feature. Rebuild parity component: .Bd -literal -offset indent graid3 label -n data da0 da1 da2 graid3 rebuild data da2 .Ed .Pp Replace one data disk with a brand new one: .Bd -literal -offset indent graid3 remove -n 0 data graid3 insert -n 0 data da5 .Ed .Sh SEE ALSO .Xr geom 4 , .Xr geom 8 , -.Xr gvinum 8 , .Xr mount 8 , .Xr newfs 8 , .Xr umount 8 .Sh HISTORY The .Nm utility appeared in .Fx 5.3 . .Sh AUTHORS .An Pawel Jakub Dawidek Aq Mt pjd@FreeBSD.org .Sh BUGS There should be a section with an implementation description. .Pp Documentation for sysctls .Va kern.geom.raid3.* is missing. diff --git a/lib/geom/stripe/gstripe.8 b/lib/geom/stripe/gstripe.8 index 0282faf58b6d..6fd486355a2e 100644 --- a/lib/geom/stripe/gstripe.8 +++ b/lib/geom/stripe/gstripe.8 @@ -1,241 +1,240 @@ .\" Copyright (c) 2004-2005 Pawel Jakub Dawidek .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd May 21, 2004 +.Dd January 23, 2025 .Dt GSTRIPE 8 .Os .Sh NAME .Nm gstripe .Nd "control utility for striped devices" .Sh SYNOPSIS .Nm .Cm create .Op Fl v .Op Fl s Ar stripesize .Ar name .Ar prov prov ... .Nm .Cm destroy .Op Fl fv .Ar name ... .Nm .Cm label .Op Fl hv .Op Fl s Ar stripesize .Ar name .Ar prov prov ... .Nm .Cm stop .Op Fl fv .Ar name ... .Nm .Cm clear .Op Fl v .Ar prov ... .Nm .Cm dump .Ar prov ... .Nm .Cm list .Nm .Cm status .Nm .Cm load .Nm .Cm unload .Sh DESCRIPTION The .Nm utility is used for setting up a stripe on two or more disks. The striped device can be configured using two different methods: .Dq manual or .Dq automatic . When using the .Dq manual method, no metadata are stored on the devices, so the striped device has to be configured by hand every time it is needed. The .Dq automatic method uses on-disk metadata to detect devices. Once devices are labeled, they will be automatically detected and configured. .Pp The first argument to .Nm indicates an action to be performed: .Bl -tag -width ".Cm destroy" .It Cm create Set up a striped device from the given devices with specified .Ar name . 
This is the .Dq manual method and the stripe will not exist after a reboot (see .Sx DESCRIPTION above). The kernel module .Pa geom_stripe.ko will be loaded if it is not loaded already. .It Cm label Set up a striped device from the given devices with the specified .Ar name . This is the .Dq automatic method, where metadata are stored in every device's last sector. The kernel module .Pa geom_stripe.ko will be loaded if it is not loaded already. .It Cm stop Turn off an existing striped device by its .Ar name . This command does not touch on-disk metadata! .It Cm destroy Same as .Cm stop . .It Cm clear Clear metadata on the given devices. .It Cm dump Dump metadata stored on the given devices. .It Cm list See .Xr geom 8 . .It Cm status See .Xr geom 8 . .It Cm load See .Xr geom 8 . .It Cm unload See .Xr geom 8 . .El .Pp Additional options: .Bl -tag -width ".Fl s Ar stripesize" .It Fl f Force the removal of the specified striped device. .It Fl h Hardcode providers' names in metadata. .It Fl s Ar stripesize Specifies size of stripe block in bytes. The .Ar stripesize must be a multiple of the largest sector size of all the providers. .It Fl v Be more verbose. .El .Sh SYSCTL VARIABLES The following .Xr sysctl 8 variables can be used to control the behavior of the .Nm STRIPE GEOM class. The default value is shown next to each variable. .Bl -tag -width indent .It Va kern.geom.stripe.debug : No 0 Debug level of the .Nm STRIPE GEOM class. This can be set to a number between 0 and 3 inclusive. If set to 0 minimal debug information is printed, and if set to 3 the maximum amount of debug information is printed. .It Va kern.geom.stripe.fast : No 0 If set to a non-zero value enable .Dq "fast mode" instead of the normal .Dq "economic mode" . Compared to .Dq "economic mode" , .Dq "fast mode" uses more memory, but it is much faster for smaller stripe sizes. If enough memory cannot be allocated, .Nm STRIPE will fall back to .Dq "economic mode" . 
.It Va kern.geom.stripe.maxmem : No 13107200 Maximum amount of memory that can be consumed by .Dq "fast mode" (in bytes). This .Xr sysctl 8 variable is read-only and can only be set as a tunable in .Xr loader.conf 5 . .It Va kern.geom.stripe.fast_failed A count of how many times .Dq "fast mode" has failed due to an insufficient amount of memory. If this value is large, you should consider increasing the .Va kern.geom.stripe.maxmem value. .El .Sh EXIT STATUS Exit status is 0 on success, and 1 if the command fails. .Sh EXAMPLES The following example shows how to set up a striped device from four disks with a 128KB stripe size for automatic configuration, create a file system on it, and mount it: .Bd -literal -offset indent gstripe label -v -s 131072 data /dev/da0 /dev/da1 /dev/da2 /dev/da3 newfs /dev/stripe/data mount /dev/stripe/data /mnt [...] umount /mnt gstripe stop data gstripe unload .Ed .Sh COMPATIBILITY The .Nm interleave is in number of bytes, unlike .Xr ccdconfig 8 which use the number of sectors. A .Xr ccdconfig 8 .Ar ileave of .Ql 128 is 64 KB (128 512B sectors). The same stripe interleave would be specified as .Ql 65536 for .Nm . .Sh SEE ALSO .Xr geom 4 , .Xr loader.conf 5 , .Xr ccdconfig 8 , .Xr geom 8 , -.Xr gvinum 8 , .Xr mount 8 , .Xr newfs 8 , .Xr sysctl 8 , .Xr umount 8 .Sh HISTORY The .Nm utility appeared in .Fx 5.3 . .Sh AUTHORS .An Pawel Jakub Dawidek Aq Mt pjd@FreeBSD.org diff --git a/sbin/ffsinfo/ffsinfo.8 b/sbin/ffsinfo/ffsinfo.8 index 6daa9a79c9e9..2221b2fb63b6 100644 --- a/sbin/ffsinfo/ffsinfo.8 +++ b/sbin/ffsinfo/ffsinfo.8 @@ -1,145 +1,144 @@ .\" Copyright (c) 2000 Christoph Herrmann, Thomas-Henning von Kamptz .\" Copyright (c) 1980, 1989, 1993 The Regents of the University of California. .\" All rights reserved. .\" .\" This code is derived from software contributed to Berkeley by .\" Christoph Herrmann and Thomas-Henning von Kamptz, Munich and Frankfurt. 
.\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgment: .\" This product includes software developed by the University of .\" California, Berkeley and its contributors, as well as Christoph .\" Herrmann and Thomas-Henning von Kamptz. .\" 4. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $TSHeader: src/sbin/ffsinfo/ffsinfo.8,v 1.3 2000/12/12 19:30:55 tomsoft Exp $ .\" -.Dd November 19, 2024 +.Dd January 23, 2025 .Dt FFSINFO 8 .Os .Sh NAME .Nm ffsinfo .Nd "dump all meta information of an existing ufs file system" .Sh SYNOPSIS .Nm .Op Fl g Ar cylinder_group .Op Fl i Ar inode .Op Fl l Ar level .Op Fl o Ar outfile .Ar special | file .Sh DESCRIPTION The .Nm utility extends the .Xr dumpfs 8 utility. .Pp The output is appended to the file .Pa outfile . Also expect the output file to be rather large. Up to 2 percent of the size of the specified file system is not uncommon. .Pp The following options are available: .Bl -tag -width indent .It Fl g Ar cylinder_group This restricts the dump to information about this cylinder group only. Here .Ar 0 means the first cylinder group and .Ar -1 the last one. .It Fl i Ar inode This restricts the dump to information about this particular inode only. Here the minimum acceptable inode is .Ar 2 . If this option is omitted but a cylinder group is defined then only inodes within that cylinder group are dumped. .It Fl l Ar level The level of detail which will be dumped. This value defaults to .Ar 255 and is the .Dq bitwise or of the following table: .Pp .Bl -hang -width indent -compact .It Ar 0x001 initial superblock .It Ar 0x002 superblock copies in each cylinder group .It Ar 0x004 cylinder group summary in initial cylinder group .It Ar 0x008 cylinder group information .It Ar 0x010 inode allocation bitmap .It Ar 0x020 fragment allocation bitmap .It Ar 0x040 cluster maps and summary .It Ar 0x100 inode information .It Ar 0x200 indirect block dump .El .It Fl o Ar outfile This sets the output filename where the dump is written to, and must be specified. If .Fl is provided, output will be sent to stdout. .El .Sh EXAMPLES .Dl ffsinfo -o /var/tmp/ffsinfo -l 1023 /dev/md0 .Pp will dump .Pa /dev/md0 to .Pa /var/tmp/ffsinfo with all available information. 
.Sh SEE ALSO .Xr ffs 4 , .Xr dumpfs 8 , .Xr fsck 8 , .Xr gpart 8 , .Xr growfs 8 , -.Xr gvinum 8 , .Xr newfs 8 , .Xr tunefs 8 .Sh HISTORY The .Nm utility first appeared in .Fx 4.4 . .Sh AUTHORS .An Christoph Herrmann Aq Mt chm@FreeBSD.org .An Thomas-Henning von Kamptz Aq Mt tomsoft@FreeBSD.org .An The GROWFS team Aq Mt growfs@Tomsoft.COM .Sh BUGS Snapshots are handled like plain files. They should get their own level to provide for independent control of the amount of what gets dumped. It probably also makes sense to some extend to dump the snapshot as a file system. diff --git a/sbin/growfs/growfs.8 b/sbin/growfs/growfs.8 index 9b619613f30e..f23817b0afbe 100644 --- a/sbin/growfs/growfs.8 +++ b/sbin/growfs/growfs.8 @@ -1,135 +1,133 @@ .\" Copyright (c) 2000 Christoph Herrmann, Thomas-Henning von Kamptz .\" Copyright (c) 1980, 1989, 1993 The Regents of the University of California. .\" All rights reserved. .\" .\" This code is derived from software contributed to Berkeley by .\" Christoph Herrmann and Thomas-Henning von Kamptz, Munich and Frankfurt. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgment: .\" This product includes software developed by the University of .\" California, Berkeley and its contributors, as well as Christoph .\" Herrmann and Thomas-Henning von Kamptz. .\" 4. 
Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $TSHeader: src/sbin/growfs/growfs.8,v 1.3 2000/12/12 19:31:00 tomsoft Exp $ .\" -.Dd October 3, 2023 +.Dd January 23, 2025 .Dt GROWFS 8 .Os .Sh NAME .Nm growfs .Nd expand an existing UFS file system .Sh SYNOPSIS .Nm .Op Fl Ny .Op Fl s Ar size .Ar special | filesystem .Sh DESCRIPTION The .Nm utility makes it possible to expand an UFS file system. Before running .Nm the partition or slice containing the file system must be extended using .Xr gpart 8 . -If you are using volumes you must enlarge them by using -.Xr gvinum 8 . The .Nm utility extends the size of the file system on the specified special file. The following options are available: .Bl -tag -width "-s size" .It Fl N .Dq Test mode . Causes the new file system parameters to be printed out without actually enlarging the file system. .It Fl s Ar size Determines the .Ar size of the file system after enlarging in sectors. 
.Ar Size is the number of 512 byte sectors unless suffixed with a .Cm b , k , m , g , or .Cm t which denotes byte, kilobyte, megabyte, gigabyte and terabyte respectively. This value defaults to the size of the raw partition specified in .Ar special (in other words, .Nm will enlarge the file system to the size of the entire partition). .It Fl y Causes .Nm to assume yes as the answer to all operator questions. .El .Sh EXIT STATUS Exit status is 0 on success, and >= 1 on errors. Errors recoverable by user action are indicated by 2. OS errors, which are usually not recoverable, are indicated by 3 or greater. .Sh EXAMPLES Expand root file system to fill up available space: .Dl growfs / .Pp Refresh the LUN size, resize the partition to use all available capacity, and expand the filesystem accordingly: .Dl camcontrol reprobe da0 .Dl gpart recover da0 .Dl gpart resize -i 1 da0 .Dl growfs /dev/da0p1 .Sh SEE ALSO .Xr growfs 7 , .Xr camcontrol 8 , .Xr fsck 8 , .Xr gpart 8 , .Xr newfs 8 , .Xr tunefs 8 .Sh HISTORY The .Nm utility first appeared in .Fx 4.4 . The ability to resize mounted file systems was added in .Fx 10.0 . .Sh AUTHORS .An Christoph Herrmann Aq Mt chm@FreeBSD.org .An Thomas-Henning von Kamptz Aq Mt tomsoft@FreeBSD.org .An The GROWFS team Aq Mt growfs@Tomsoft.COM .An Edward Tomasz Napierala Aq Mt trasz@FreeBSD.org .Sh CAVEATS When expanding a file system mounted read-write, any writes to that file system will be temporarily suspended until the expansion is finished. .Sh BUGS Normally .Nm writes cylinder group summary to disk and reads it again later for doing more updates. This read operation will provide unexpected data when using .Fl N . Therefore, this part cannot really be simulated and will be skipped in test mode. 
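Note on the growfs(8) -s argument described above: an unsuffixed size is counted in 512-byte sectors. The conversion from a byte count can be sketched as below. This is only an illustration: the helper name bytes_to_sectors is hypothetical and not part of growfs, and the device path in the commented-out command is borrowed from the EXAMPLES section purely as a placeholder.

```shell
# Hypothetical helper (not part of growfs): convert a byte count into the
# number of 512-byte sectors that an unsuffixed -s argument expects.
bytes_to_sectors() {
    echo $(( $1 / 512 ))
}

# 20 * 2^30 bytes expressed in 512-byte sectors.
target=$(bytes_to_sectors $((20 * 1024 * 1024 * 1024)))
echo "$target"

# Test mode first: -N prints the new parameters without enlarging anything.
# Left commented out so the sketch never touches a real device.
# growfs -N -s "$target" /dev/da0p1
```

Running with -N before the real invocation is a cheap sanity check, although the BUGS section notes that the cylinder group summary update is skipped in test mode.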
diff --git a/sbin/newfs/newfs.8 b/sbin/newfs/newfs.8 index daf568fe6a33..16bca26f7cd8 100644 --- a/sbin/newfs/newfs.8 +++ b/sbin/newfs/newfs.8 @@ -1,386 +1,385 @@ .\" Copyright (c) 1983, 1987, 1991, 1993, 1994 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" -.Dd November 19, 2024 +.Dd January 23, 2025 .Dt NEWFS 8 .Os .Sh NAME .Nm newfs .Nd construct a new UFS1/UFS2 file system .Sh SYNOPSIS .Nm .Op Fl EJNUjlnt .Op Fl L Ar volname .Op Fl O Ar filesystem-type .Op Fl S Ar sector-size .Op Fl T Ar disktype .Op Fl a Ar maxcontig .Op Fl b Ar block-size .Op Fl c Ar blocks-per-cylinder-group .Op Fl d Ar max-extent-size .Op Fl e Ar maxbpg .Op Fl f Ar frag-size .Op Fl g Ar avgfilesize .Op Fl h Ar avgfpdir .Op Fl i Ar bytes .Op Fl k Ar held-for-metadata-blocks .Op Fl m Ar free-space .Op Fl o Ar optimization .Op Fl p Ar partition .Op Fl r Ar reserved .Op Fl s Ar size .Ar special .Sh DESCRIPTION The .Nm utility is used to initialize and clear file systems before first use. The .Nm utility builds a file system on the specified special file. (We often refer to the .Dq special file as the .Dq disk , although the special file need not be a physical disk. In fact, it need not even be special.) Typically the defaults are reasonable, however .Nm has numerous options to allow the defaults to be selectively overridden. .Pp The following options define the general layout policies: .Bl -tag -width indent .It Fl E Erase the content of the disk before making the filesystem. The reserved area in front of the superblock (for bootcode) will not be erased. Erasing is only relevant to flash-memory or thinly provisioned devices. Erasing may take a long time. If the device does not support BIO_DELETE, the command will fail. .It Fl J Enable journaling on the new file system via gjournal. See .Xr gjournal 8 for details. .It Fl L Ar volname Add a volume label to the new file system. Legal characters are alphanumerics, dashes, and underscores. .It Fl N Cause the file system parameters to be printed out without really creating the file system. .It Fl O Ar filesystem-type Use 1 to specify that a UFS1 format file system be built; use 2 to specify that a UFS2 format file system be built. The default format is UFS2. 
.It Fl T Ar disktype For backward compatibility. .It Fl U Enable soft updates on the new file system. Soft updates are enabled by default for UFS2 format file systems. Use .Xr tunefs 8 to disable soft updates if they are not wanted. .It Fl a Ar maxcontig Specify the maximum number of contiguous blocks that will be laid out before forcing a rotational delay. The default value is 16. See .Xr tunefs 8 for more details on how to set this option. .It Fl b Ar block-size The block size of the file system, in bytes. It must be a power of 2. .\" If changing the default block size and it causes the default .\" fragment size to change, be sure to update the location of .\" the first backup superblock on the fsck_ffs.8 manual page. The default size is 32768 bytes, and the smallest allowable size is 4096 bytes. The optimal block:fragment ratio is 8:1. Other ratios are possible, but are not recommended, and may produce poor results. .It Fl c Ar blocks-per-cylinder-group The number of blocks per cylinder group in a file system. The default is to compute the maximum allowed by the other parameters. This value is dependent on a number of other parameters, in particular the block size and the number of bytes per inode. .It Fl d Ar max-extent-size The file system may choose to store large files using extents. This parameter specifies the largest extent size that may be used. The default value is the file system blocksize. It is presently limited to a maximum value of 16 times the file system blocksize and a minimum value of the file system blocksize. .It Fl e Ar maxbpg Indicate the maximum number of blocks any single file can allocate out of a cylinder group before it is forced to begin allocating blocks from another cylinder group. The default is about one quarter of the total blocks in a cylinder group. See .Xr tunefs 8 for more details on how to set this option. .It Fl f Ar frag-size The fragment size of the file system in bytes. 
It must be a power of two ranging in value between .Ar blocksize Ns /8 and .Ar blocksize . .\" If changing the default fragment size or it changes because of a .\" change to the default block size, be sure to update the location .\" of the first backup superblock on the fsck_ffs.8 manual page. The default is 4096 bytes. .It Fl g Ar avgfilesize The expected average file size for the file system. .It Fl h Ar avgfpdir The expected average number of files per directory on the file system. .It Fl i Ar bytes Specify the density of inodes in the file system. The default is to create an inode for every .Pq 2 * Ar frag-size bytes of data space. If fewer inodes are desired, a larger number should be used; to create more inodes a smaller number should be given. One inode is required for each distinct file, so this value effectively specifies the average file size on the file system. .It Fl j Enable soft updates journaling on the new file system. This flag is implemented by running the .Xr tunefs 8 utility found in the user's .Dv $PATH . .Pp Enabling journaling reduces the time spent by .Xr fsck_ffs 8 cleaning up a filesystem after a crash to a few seconds from minutes to hours. Without journaling, the time to recover after a crash is a function of the number of files in the filesystem and the size of the filesystem. With journaling, the time to recover after a crash is a function of the amount of activity in the filesystem in the minute before the crash. Journaled recovery time is usually only a few seconds and never exceeds a minute. .Pp The drawback to using journaling is that the writes to its log adds an extra write load to the media containing the filesystem. Thus a write-intensive workload will have reduced throughput on a filesystem running with journaling. .Pp Like all journaling filesystems, the journal recovery will only fix issues known to the journal. Specifically if a media error occurs, the journal will not know about it and hence will not fix it. 
Thus when using journaling, it is still necessary to run a full fsck every few months or after a filesystem panic to check for and fix any errors brought on by media failure. A full fsck can be done by running a background fsck on a live filesystem or by running with the .Fl f flag on an unmounted filesystem. When running .Xr fsck_ffs 8 in background on a live filesystem the filesystem performance will be about half of normal during the time that the background .Xr fsck_ffs 8 is running. Running a full fsck on a UFS filesystem is the equivalent of running a scrub on a ZFS filesystem. .It Fl k Ar held-for-metadata-blocks Set the amount of space to be held for metadata blocks in each cylinder group. When set, the file system preference routines will try to save the specified amount of space immediately following the inode blocks in each cylinder group for use by metadata blocks. Clustering the metadata blocks speeds up random file access and decreases the running time of .Xr fsck 8 . By default .Nm sets it to half of the space reserved to minfree. .It Fl l Enable multilabel MAC on the new file system. .It Fl m Ar free-space The percentage of space reserved from normal users; the minimum free space threshold. The default value used is defined by .Dv MINFREE from .In ufs/ffs/fs.h , currently 8%. See .Xr tunefs 8 for more details on how to set this option. .It Fl n Do not create a .Pa .snap directory on the new file system. The resulting file system will not support snapshot generation, so .Xr dump 8 in live mode and background .Xr fsck 8 will not function properly. The traditional .Xr fsck 8 and offline .Xr dump 8 will work on the file system. This option is intended primarily for memory or vnode-backed file systems that do not require .Xr dump 8 or .Xr fsck 8 support. .It Fl o Ar optimization .Cm ( space or .Cm time ) . 
The file system can either be instructed to try to minimize the time spent allocating blocks, or to try to minimize the space fragmentation on the disk. If the value of minfree (see above) is less than 8%, the default is to optimize for .Cm space ; if the value of minfree is greater than or equal to 8%, the default is to optimize for .Cm time . See .Xr tunefs 8 for more details on how to set this option. .It Fl p Ar partition The partition name (a..h) you want to use in case the underlying image is a file, so you do not have access to individual partitions through the filesystem. Can also be used with a device, e.g., .Nm .Fl p Ar f .Ar /dev/da1s3 is equivalent to .Nm .Ar /dev/da1s3f . .It Fl r Ar reserved The size, in sectors, of reserved space at the end of the partition specified in .Ar special . This space will not be occupied by the file system; it can be used by other consumers such as .Xr geom 4 . Defaults to 0. .It Fl s Ar size The size of the file system in sectors. This value defaults to the size of the raw partition specified in .Ar special less the .Ar reserved space at its end (see .Fl r ) . A .Ar size of 0 can also be used to choose the default value. A valid .Ar size value cannot be larger than the default one, which means that the file system cannot extend into the reserved space. .It Fl t Turn on the TRIM enable flag. If enabled, and if the underlying device supports the BIO_DELETE command, the file system will send a delete request to the underlying device for each freed block. The trim enable flag is typically set for flash-memory devices to reduce write amplification which reduces wear on write-limited flash-memory and often improves long-term performance. Thinly provisioned storage also benefits by returning unused blocks to the global pool. .El .Pp The following options override the standard sizes for the disk geometry. Their default values are taken from the disk label. 
Changing these defaults is useful only when using .Nm to build a file system whose raw image will eventually be used on a different type of disk than the one on which it is initially created (for example on a write-once disk). Note that changing any of these values from their defaults will make it impossible for .Xr fsck 8 to find the alternate superblocks if the standard superblock is lost. .Bl -tag -width indent .It Fl S Ar sector-size The size of a sector in bytes (almost never anything but 512). .El .Sh NOTES ON THE NAMING .Dq newfs is a common name prefix for utilities creating filesystems, with the suffix indicating the type of the filesystem, for instance .Xr newfs_msdos 8 . The .Nm utility is a special case which predates that convention. .Sh EXAMPLES .Dl newfs /dev/ada3s1a .Pp Creates a new ufs file system on .Pa ada3s1a . The .Nm utility will use a block size of 32768 bytes, a fragment size of 4096 bytes and the largest possible number of blocks per cylinders group. These values tend to produce better performance for most applications than the historical defaults (8192 byte block size and 1024 byte fragment size). This large fragment size may lead to much wasted space on file systems that contain many small files. .Sh SEE ALSO .Xr ffs 4 , .Xr geom 4 , .Xr disktab 5 , .Xr fs 5 , .Xr camcontrol 8 , .Xr dump 8 , .Xr dumpfs 8 , .Xr fdformat 8 , .Xr fsck 8 , .Xr gjournal 8 , .Xr gpart 8 , .Xr growfs 8 , -.Xr gvinum 8 , .Xr makefs 8 , .Xr mount 8 , .Xr newfs_msdos 8 , .Xr tunefs 8 .Rs .%A M. McKusick .%A W. Joy .%A S. Leffler .%A R. Fabry .%T A Fast File System for UNIX .%J ACM Transactions on Computer Systems 2 .%V 3 .%P pp 181-197 .%D August 1984 .%O (reprinted in the BSD System Manager's Manual) .Re .Sh HISTORY The .Nm utility appeared in .Bx 4.2 . 
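Note on the newfs(8) defaults quoted above: with the default 4096-byte fragment size, the default -i density is one inode per 2 * 4096 = 8192 bytes of data space. A rough estimate of the resulting inode count can be sketched as below. It assumes the whole device is data space and ignores per-cylinder-group rounding and metadata overhead, so the real figure will differ; estimate_inodes is a hypothetical helper, not something newfs provides.

```shell
# Hypothetical helper (not part of newfs): estimate the inode count that the
# default density of one inode per (2 * frag-size) bytes would produce.
#   $1 = data space in bytes
#   $2 = fragment size in bytes
estimate_inodes() {
    echo $(( $1 / (2 * $2) ))
}

# Estimate for 100 * 2^30 bytes of data space with 4096-byte fragments.
estimate_inodes $((100 * 1024 * 1024 * 1024)) 4096
```

For the parameters newfs would actually choose, the -N option prints them without creating the file system.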
diff --git a/share/man/man4/ccd.4 b/share/man/man4/ccd.4 index c44013dab11f..9727fb68064f 100644 --- a/share/man/man4/ccd.4 +++ b/share/man/man4/ccd.4 @@ -1,285 +1,284 @@ .\" $NetBSD: ccd.4,v 1.5 1995/10/09 06:09:09 thorpej Exp $ .\" .\" Copyright (c) 1994 Jason Downs. .\" Copyright (c) 1994, 1995 Jason R. Thorpe. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgement: .\" This product includes software developed for the NetBSD Project .\" by Jason Downs and Jason R. Thorpe. .\" 4. Neither the name of the author nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
.\" IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, .\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; .\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED .\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd August 9, 1995 +.Dd January 23, 2025 .Dt CCD 4 .Os .Sh NAME .Nm ccd .Nd Concatenated Disk driver .Sh SYNOPSIS .Cd "device ccd" .Sh DESCRIPTION The .Nm driver provides the capability of combining one or more disks/partitions into one virtual disk. .Pp This document assumes that you are familiar with how to generate kernels, how to properly configure disks and devices in a kernel configuration file, and how to partition disks. .Pp In order to compile in support for the .Nm , you must add a line similar to the following to your kernel configuration file: .Pp .Dl "device ccd # concatenated disk devices" .Pp As of the .Fx 3.0 release, you do not need to configure your kernel with .Nm but may instead use it as a kernel loadable module. Simply running .Xr ccdconfig 8 will load the module into the kernel. .Pp A .Nm may be either serially concatenated or interleaved. To serially concatenate the partitions, specify the interleave factor of 0. Note that mirroring may not be used with an interleave factor of 0. .Pp There is a run-time utility that is used for configuring .Nm Ns s . See .Xr ccdconfig 8 for more information. .Ss The Interleave Factor If a .Nm is interleaved correctly, a .Dq striping effect is achieved, which can increase sequential read/write performance. The interleave factor is expressed in units of .Dv DEV_BSIZE (usually 512 bytes). 
For large writes, the optimum interleave factor is typically the size of a track, while for large reads, it is about a quarter of a track. (Note that this changes greatly depending on the number and speed of disks.) For instance, with eight 7,200 RPM drives on two Fast-Wide SCSI buses, this translates to about 128 for writes and 32 for reads. A larger interleave tends to work better when the disk is taking a multitasking load by localizing the file I/O from any given process onto a single disk. You lose sequential performance when you do this, but sequential performance is not usually an issue with a multitasking load. .Pp An interleave factor must be specified when using a mirroring configuration, even when you have only two disks (i.e., the layout winds up being the same no matter what the interleave factor). The interleave factor will determine how I/O is broken up, however, and a value 128 or greater is recommended. .Pp .Nm has an option for a parity disk, but does not currently implement it. .Pp The best performance is achieved if all component disks have the same geometry and size. Optimum striping cannot occur with different disk types. .Pp For random-access oriented workloads, such as news servers, a larger interleave factor (e.g., 65,536) is more desirable. Note that there is not much .Nm can do to speed up applications that are seek-time limited. Larger interleave factors will at least reduce the chance of having to seek two disk-heads to read one directory or a file. .Ss Disk Mirroring You can configure the .Nm to .Dq mirror any even number of disks. See .Xr ccdconfig 8 for how to specify the necessary flags. For example, if you have a .Nm configuration specifying four disks, the first two disks will be mirrored with the second two disks. A write will be run to both sides of the mirror. A read will be run to either side of the mirror depending on what the driver believes to be most optimal. 
If the read fails, the driver will automatically attempt to read the same sector from the other side of the mirror. Currently .Nm uses a dual seek zone model to optimize reads for a multi-tasking load rather than a sequential load. .Pp In an event of a disk failure, you can use .Xr dd 1 to recover the failed disk. .Pp Note that a one-disk .Nm is not the same as the original partition. In particular, this means if you have a file system on a two-disk mirrored .Nm and one of the disks fail, you cannot mount and use the remaining partition as itself; you have to configure it as a one-disk .Nm . You cannot replace a disk in a mirrored .Nm partition without first backing up the partition, then replacing the disk, then restoring the partition. .Ss Linux Compatibility The .Tn Linux compatibility mode does not try to read the label that .Tn Linux Ns ' .Xr md 4 driver leaves on the raw devices. You will have to give the order of devices and the interleave factor on your own. When in .Tn Linux compatibility mode, .Nm will convert the interleave factor from .Tn Linux terminology. That means you give the same interleave factor that you gave as chunk size in .Tn Linux . .Pp If you have a .Tn Linux .Xr md 4 device in .Dq legacy mode, do not use the .Dv CCDF_LINUX flag in .Xr ccdconfig 8 . Use the .Dv CCDF_NO_OFFSET flag instead. In that case you have to convert the interleave factor on your own, usually it is .Tn Linux Ns ' chunk size multiplied by two. .Pp Using a .Tn Linux RAID this way is potentially dangerous and can destroy the data in there. Since .Fx does not read the label used by .Tn Linux , changes in .Tn Linux might invalidate the compatibility layer. .Pp However, using this is reasonably safe if you test the compatibility before mounting a RAID read-write for the first time. Just using .Xr ccdconfig 8 without mounting does not write anything to the .Tn Linux RAID. Then you do a .Nm fsck.ext2fs Pq Pa ports/sysutils/e2fsprogs on the .Nm device using the .Fl n flag. 
You can mount the file system read-only to check files in there. If all this works, it is unlikely that there is a problem with .Nm . Keep in mind that even when the .Tn Linux compatibility mode in .Nm is working correctly, bugs in .Fx Ap s .Nm ext2fs implementation would still destroy your data. .Sh WARNINGS If just one (or more) of the disks in a .Nm fails, the entire file system will be lost unless you are mirroring the disks. .Pp If one of the disks in a mirror is lost, you should still be able to back up your data. If a write error occurs, however, data read from that sector may be non-deterministic. It may return the data prior to the write or it may return the data that was written. When a write error occurs, you should recover and regenerate the data as soon as possible. .Pp Changing the interleave or other parameters for a .Nm disk usually destroys whatever data previously existed on that disk. .Sh FILES .Bl -tag -width ".Pa /dev/ccd*" .It Pa /dev/ccd* .Nm device special files .El .Sh SEE ALSO .Xr dd 1 , .Xr ccdconfig 8 , .Xr config 8 , .Xr disklabel 8 , .Xr fsck 8 , -.Xr gvinum 8 , .Xr mount 8 , .Xr newfs 8 .Sh HISTORY The concatenated disk driver was originally written at the University of Utah. diff --git a/share/man/man7/tuning.7 b/share/man/man7/tuning.7 index 8cebfe62bb64..ebba551f65d0 100644 --- a/share/man/man7/tuning.7 +++ b/share/man/man7/tuning.7 @@ -1,709 +1,707 @@ .\" Copyright (C) 2001 Matthew Dillon. All rights reserved. .\" Copyright (C) 2012 Eitan Adler. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2.
Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd July 25, 2024 +.Dd January 23, 2025 .Dt TUNING 7 .Os .Sh NAME .Nm tuning .Nd performance tuning under FreeBSD .Sh SYSTEM SETUP - DISKLABEL, NEWFS, TUNEFS, SWAP The swap partition should typically be approximately 2x the size of main memory for systems with less than 4GB of RAM, or approximately equal to the size of main memory if you have more. Keep in mind future memory expansion when sizing the swap partition. Configuring too little swap can lead to inefficiencies in the VM page scanning code as well as create issues later on if you add more memory to your machine. On larger systems with multiple disks, configure swap on each drive. The swap partitions on the drives should be approximately the same size. The kernel can handle arbitrary sizes but internal data structures scale to 4 times the largest swap partition. Keeping the swap partitions near the same size will allow the kernel to optimally stripe swap space across the N disks. 
Do not worry about overdoing it a little, swap space is the saving grace of .Ux and even if you do not normally use much swap, it can give you more time to recover from a runaway program before being forced to reboot. .Pp It is not a good idea to make one large partition. First, each partition has different operational characteristics and separating them allows the file system to tune itself to those characteristics. For example, the root and .Pa /usr partitions are read-mostly, with very little writing, while a lot of reading and writing could occur in .Pa /var/tmp . By properly partitioning your system fragmentation introduced in the smaller more heavily write-loaded partitions will not bleed over into the mostly-read partitions. .Pp Properly partitioning your system also allows you to tune .Xr newfs 8 , and .Xr tunefs 8 parameters. The only .Xr tunefs 8 option worthwhile turning on is .Em softupdates with .Dq Li "tunefs -n enable /filesystem" . Softupdates drastically improves meta-data performance, mainly file creation and deletion. We recommend enabling softupdates on most file systems; however, there are two limitations to softupdates that you should be aware of when determining whether to use it on a file system. First, softupdates guarantees file system consistency in the case of a crash but could very easily be several seconds (even a minute!\&) behind on pending write to the physical disk. If you crash you may lose more work than otherwise. Secondly, softupdates delays the freeing of file system blocks. If you have a file system (such as the root file system) which is close to full, doing a major update of it, e.g.,\& .Dq Li "make installworld" , can run it out of space and cause the update to fail. For this reason, softupdates will not be enabled on the root file system during a typical install. There is no loss of performance since the root file system is rarely written to. 
.Pp A number of run-time .Xr mount 8 options exist that can help you tune the system. The most obvious and most dangerous one is .Cm async . Only use this option in conjunction with .Xr gjournal 8 , as it is far too dangerous on a normal file system. A less dangerous and more useful .Xr mount 8 option is called .Cm noatime . .Ux file systems normally update the last-accessed time of a file or directory whenever it is accessed. This operation is handled in .Fx with a delayed write and normally does not create a burden on the system. However, if your system is accessing a huge number of files on a continuing basis the buffer cache can wind up getting polluted with atime updates, creating a burden on the system. For example, if you are running a heavily loaded web site, or a news server with lots of readers, you might want to consider turning off atime updates on your larger partitions with this .Xr mount 8 option. However, you should not gratuitously turn off atime updates everywhere. For example, the .Pa /var file system customarily holds mailboxes, and atime (in combination with mtime) is used to determine whether a mailbox has new mail. You might as well leave atime turned on for mostly read-only partitions such as .Pa / and .Pa /usr as well. This is especially useful for .Pa / since some system utilities use the atime field for reporting. .Sh STRIPING DISKS In larger systems you can stripe partitions from several drives together to create a much larger overall partition. Striping can also improve the performance of a file system by splitting I/O operations across two or more disks. The -.Xr gstripe 8 , -.Xr gvinum 8 , +.Xr gstripe 8 and .Xr ccdconfig 8 utilities may be used to create simple striped file systems. Generally speaking, striping smaller partitions such as the root and .Pa /var/tmp , or essentially read-only partitions such as .Pa /usr is a complete waste of time. 
You should only stripe partitions that require serious I/O performance, typically .Pa /var , /home , or custom partitions used to hold databases and web pages. Choosing the proper stripe size is also important. File systems tend to store meta-data on power-of-2 boundaries and you usually want to reduce seeking rather than increase seeking. This means you want to use a large off-center stripe size such as 1152 sectors so sequential I/O does not seek both disks and so meta-data is distributed across both disks rather than concentrated on a single disk. .Sh SYSCTL TUNING .Xr sysctl 8 variables permit system behavior to be monitored and controlled at run-time. Some sysctls simply report on the behavior of the system; others allow the system behavior to be modified; some may be set at boot time using .Xr rc.conf 5 , but most will be set via .Xr sysctl.conf 5 . There are several hundred sysctls in the system, including many that appear to be candidates for tuning but actually are not. In this document we will only cover the ones that have the greatest effect on the system. .Pp The .Va vm.overcommit sysctl defines the overcommit behaviour of the vm subsystem. The virtual memory system always does accounting of the swap space reservation, both total for system and per-user. Corresponding values are available through sysctl .Va vm.swap_total , that gives the total bytes available for swapping, and .Va vm.swap_reserved , that gives number of bytes that may be needed to back all currently allocated anonymous memory. .Pp Setting bit 0 of the .Va vm.overcommit sysctl causes the virtual memory system to return failure to the process when allocation of memory causes .Va vm.swap_reserved to exceed .Va vm.swap_total . Bit 1 of the sysctl enforces .Dv RLIMIT_SWAP limit (see .Xr getrlimit 2 ) . Root is exempt from this limit. 
Bit 2 allows most of the physical memory to be counted as allocatable, except wired and free reserved pages (accounted by .Va vm.stats.vm.v_free_target and .Va vm.stats.vm.v_wire_count sysctls, respectively). .Pp The .Va kern.ipc.maxpipekva loader tunable is used to set a hard limit on the amount of kernel address space allocated to mapping of pipe buffers. Use of the mapping allows the kernel to eliminate a copy of the data from writer address space into the kernel, directly copying the content of mapped buffer to the reader. Increasing this value to a higher setting, such as `25165824', might improve performance on systems where space for mapping pipe buffers is quickly exhausted. This exhaustion is not fatal, however, and it will only cause pipes to fall back to using double-copy. .Pp The .Va kern.ipc.shm_use_phys sysctl defaults to 0 (off) and may be set to 0 (off) or 1 (on). Setting this parameter to 1 will cause all System V shared memory segments to be mapped to unpageable physical RAM. This feature only has an effect if you are either (A) mapping small amounts of shared memory across many (hundreds) of processes, or (B) mapping large amounts of shared memory across any number of processes. This feature allows the kernel to remove a great deal of internal memory management page-tracking overhead at the cost of wiring the shared memory into core, making it unswappable. .Pp The .Va vfs.vmiodirenable sysctl defaults to 1 (on). This parameter controls how directories are cached by the system. Most directories are small and use but a single fragment (typically 2K) in the file system and even less (typically 512 bytes) in the buffer cache. However, when operating in the default mode the buffer cache will only cache a fixed number of directories even if you have a huge amount of memory. Turning on this sysctl allows the buffer cache to use the VM Page Cache to cache the directories. The advantage is that all of memory is now available for caching directories.
The disadvantage is that the minimum in-core memory used to cache a directory is the physical page size (typically 4K) rather than 512 bytes. We recommend turning this option off in memory-constrained environments; however, when on, it will substantially improve the performance of services that manipulate a large number of files. Such services can include web caches, large mail systems, and news systems. Turning on this option will generally not reduce performance even with the wasted memory but you should experiment to find out. .Pp The .Va vfs.write_behind sysctl defaults to 1 (on). This tells the file system to issue media writes as full clusters are collected, which typically occurs when writing large sequential files. The idea is to avoid saturating the buffer cache with dirty buffers when it would not benefit I/O performance. However, this may stall processes and under certain circumstances you may wish to turn it off. .Pp The .Va vfs.hirunningspace sysctl determines how much outstanding write I/O may be queued to disk controllers system-wide at any given time. It is used by the UFS file system. The default is self-tuned and usually sufficient but on machines with advanced controllers and lots of disks this may be tuned up to match what the controllers buffer. Configuring this setting to match tagged queuing capabilities of controllers or drives with average IO size used in production works best (for example: 16 MiB will use 128 tags with IO requests of 128 KiB). Note that setting too high a value (exceeding the buffer cache's write threshold) can lead to extremely bad clustering performance. Do not set this value arbitrarily high! Higher write queuing values may also add latency to reads occurring at the same time. .Pp The .Va vfs.read_max sysctl governs VFS read-ahead and is expressed as the number of blocks to pre-read if the heuristics algorithm decides that the reads are issued sequentially. It is used by the UFS, ext2fs and msdosfs file systems. 
With the default UFS block size of 32 KiB, a setting of 64 will allow speculatively reading up to 2 MiB. This setting may be increased to get around disk I/O latencies, especially where these latencies are large such as in virtual machine emulated environments. It may be tuned down in specific cases where the I/O load is such that read-ahead adversely affects performance or where system memory is really low. .Pp The .Va vfs.ncsizefactor sysctl defines how large VFS namecache may grow. The number of currently allocated entries in namecache is provided by .Va debug.numcache sysctl and the condition debug.numcache < kern.maxvnodes * vfs.ncsizefactor is adhered to. .Pp The .Va vfs.ncnegfactor sysctl defines how many negative entries VFS namecache is allowed to create. The number of currently allocated negative entries is provided by .Va debug.numneg sysctl and the condition vfs.ncnegfactor * debug.numneg < debug.numcache is adhered to. .Pp There are various other buffer-cache and VM page cache related sysctls. We do not recommend modifying these values. .Pp The .Va net.inet.tcp.sendspace and .Va net.inet.tcp.recvspace sysctls are of particular interest if you are running network intensive applications. They control the amount of send and receive buffer space allowed for any given TCP connection. The default sending buffer is 32K; the default receiving buffer is 64K. You can often improve bandwidth utilization by increasing the default at the cost of eating up more kernel memory for each connection. We do not recommend increasing the defaults if you are serving hundreds or thousands of simultaneous connections because it is possible to quickly run the system out of memory due to stalled connections building up. But if you need high bandwidth over a fewer number of connections, especially if you have gigabit Ethernet, increasing these defaults can make a huge difference. You can adjust the buffer size for incoming and outgoing data separately. 
For example, if your machine is primarily doing web serving you may want to decrease the recvspace in order to be able to increase the sendspace without eating too much kernel memory. Note that the routing table (see .Xr route 8 ) can be used to introduce route-specific send and receive buffer size defaults. .Pp As an additional management tool you can use pipes in your firewall rules (see .Xr ipfw 8 ) to limit the bandwidth going to or from particular IP blocks or ports. For example, if you have a T1 you might want to limit your web traffic to 70% of the T1's bandwidth in order to leave the remainder available for mail and interactive use. Normally a heavily loaded web server will not introduce significant latencies into other services even if the network link is maxed out, but enforcing a limit can smooth things out and lead to longer term stability. Many people also enforce artificial bandwidth limitations in order to ensure that they are not charged for using too much bandwidth. .Pp Setting the send or receive TCP buffer to values larger than 65535 will result in a marginal performance improvement unless both hosts support the window scaling extension of the TCP protocol, which is controlled by the .Va net.inet.tcp.rfc1323 sysctl. These extensions should be enabled and the TCP buffer size should be set to a value larger than 65536 in order to obtain good performance from certain types of network links; specifically, gigabit WAN links and high-latency satellite links. RFC1323 support is enabled by default. .Pp The .Va net.inet.tcp.always_keepalive sysctl determines whether or not the TCP implementation should attempt to detect dead TCP connections by intermittently delivering .Dq keepalives on the connection. By default, this is enabled for all applications; by setting this sysctl to 0, only applications that specifically request keepalives will use them. 
In most environments, TCP keepalives will improve the management of system state by expiring dead TCP connections, particularly for systems serving dialup users who may not always terminate individual TCP connections before disconnecting from the network. However, in some environments, temporary network outages may be incorrectly identified as dead sessions, resulting in unexpectedly terminated TCP connections. In such environments, setting the sysctl to 0 may reduce the occurrence of TCP session disconnections. .Pp The .Va net.inet.tcp.delayed_ack TCP feature is largely misunderstood. Historically speaking, this feature was designed to allow the acknowledgement to transmitted data to be returned along with the response. For example, when you type over a remote shell, the acknowledgement to the character you send can be returned along with the data representing the echo of the character. With delayed acks turned off, the acknowledgement may be sent in its own packet, before the remote service has a chance to echo the data it just received. This same concept also applies to any interactive protocol (e.g.,\& SMTP, WWW, POP3), and can cut the number of tiny packets flowing across the network in half. The .Fx delayed ACK implementation also follows the TCP protocol rule that at least every other packet be acknowledged even if the standard 40ms timeout has not yet passed. Normally the worst a delayed ACK can do is slightly delay the teardown of a connection, or slightly delay the ramp-up of a slow-start TCP connection. While we are not sure, we believe that the several FAQs related to packages such as SAMBA and SQUID which advise turning off delayed acks may be referring to the slow-start issue. .Pp The .Va net.inet.ip.portrange.* sysctls control the port number ranges automatically bound to TCP and UDP sockets. There are three ranges: a low range, a default range, and a high range, selectable via the .Dv IP_PORTRANGE .Xr setsockopt 2 call.
Most network programs use the default range which is controlled by .Va net.inet.ip.portrange.first and .Va net.inet.ip.portrange.last , which default to 49152 and 65535, respectively. Bound port ranges are used for outgoing connections, and it is possible to run the system out of ports under certain circumstances. This most commonly occurs when you are running a heavily loaded web proxy. The port range is not an issue when running a server which handles mainly incoming connections, such as a normal web server, or has a limited number of outgoing connections, such as a mail relay. For situations where you may run out of ports, we recommend decreasing .Va net.inet.ip.portrange.first modestly. A range of 10000 to 30000 ports may be reasonable. You should also consider firewall effects when changing the port range. Some firewalls may block large ranges of ports (usually low-numbered ports) and expect systems to use higher ranges of ports for outgoing connections. By default .Va net.inet.ip.portrange.last is set at the maximum allowable port number. .Pp The .Va kern.ipc.soacceptqueue sysctl limits the size of the listen queue for accepting new TCP connections. The default value of 128 is typically too low for robust handling of new connections in a heavily loaded web server environment. For such environments, we recommend increasing this value to 1024 or higher. The service daemon may itself limit the listen queue size (e.g.,\& .Xr sendmail 8 , apache) but will often have a directive in its configuration file to adjust the queue size up. Larger listen queues also do a better job of fending off denial of service attacks. .Pp The .Va kern.maxfiles sysctl determines how many open files the system supports. The default is typically a few thousand but you may need to bump this up to ten or twenty thousand if you are running databases or large descriptor-heavy daemons. 
The read-only .Va kern.openfiles sysctl may be interrogated to determine the current number of open files on the system. .Sh LOADER TUNABLES Some aspects of the system behavior may not be tunable at runtime because memory allocations they perform must occur early in the boot process. To change loader tunables, you must set their values in .Xr loader.conf 5 and reboot the system. .Pp .Va kern.maxusers controls the scaling of a number of static system tables, including defaults for the maximum number of open files, sizing of network memory resources, etc. .Va kern.maxusers is automatically sized at boot based on the amount of memory available in the system, and may be determined at run-time by inspecting the value of the read-only .Va kern.maxusers sysctl. .Pp The .Va kern.dfldsiz and .Va kern.dflssiz tunables set the default soft limits for process data and stack size respectively. Processes may increase these up to the hard limits by calling .Xr setrlimit 2 . The .Va kern.maxdsiz , .Va kern.maxssiz , and .Va kern.maxtsiz tunables set the hard limits for process data, stack, and text size respectively; processes may not exceed these limits. The .Va kern.sgrowsiz tunable controls how much the stack segment will grow when a process needs to allocate more stack. .Pp .Va kern.ipc.nmbclusters may be adjusted to increase the number of network mbufs the system is willing to allocate. Each cluster represents approximately 2K of memory, so a value of 1024 represents 2M of kernel memory reserved for network buffers. You can do a simple calculation to figure out how many you need. If you have a web server which maxes out at 1000 simultaneous connections, and each connection eats a 16K receive and 16K send buffer, you need approximately 32MB worth of network buffers to deal with it. A good rule of thumb is to multiply by 2, so 32MBx2 = 64MB/2K = 32768. So for this case you would want to set .Va kern.ipc.nmbclusters to 32768. 
We recommend values between 1024 and 4096 for machines with moderate amounts of memory, and between 4096 and 32768 for machines with greater amounts of memory. Under no circumstances should you specify an arbitrarily high value for this parameter; it could lead to a boot-time crash. The .Fl m option to .Xr netstat 1 may be used to observe network cluster use. .Pp More and more programs are using the .Xr sendfile 2 system call to transmit files over the network. The .Va kern.ipc.nsfbufs sysctl controls the number of file system buffers .Xr sendfile 2 is allowed to use to perform its work. This parameter nominally scales with .Va kern.maxusers so you should not need to modify this parameter except under extreme circumstances. See the .Sx TUNING section in the .Xr sendfile 2 manual page for details. .Sh KERNEL CONFIG TUNING There are a number of kernel options that you may have to fiddle with in a large-scale system. In order to change these options you need to be able to compile a new kernel from source. The .Xr config 8 manual page and the handbook are good starting points for learning how to do this. Generally the first thing you do when creating your own custom kernel is to strip out all the drivers and services you do not use. Removing things like .Dv INET6 and drivers you do not have will reduce the size of your kernel, sometimes by a megabyte or more, leaving more memory available for applications. .Pp .Dv SCSI_DELAY may be used to reduce system boot times. The defaults are fairly high and can be responsible for 5+ seconds of delay in the boot process. Reducing .Dv SCSI_DELAY to something below 5 seconds could work (especially with modern drives). .Pp There are a number of .Dv *_CPU options that can be commented out. If you only want the kernel to run on a Pentium class CPU, you can easily remove .Dv I486_CPU , but only remove .Dv I586_CPU if you are sure your CPU is being recognized as a Pentium II or better.
Some clones may be recognized as a Pentium or even a 486 and not be able to boot without those options. If it works, great! The operating system will be able to better use higher-end CPU features for MMU, task switching, timebase, and even device operations. Additionally, higher-end CPUs support 4MB MMU pages, which the kernel uses to map the kernel itself into memory, increasing its efficiency under heavy syscall loads. .Sh CPU, MEMORY, DISK, NETWORK The type of tuning you do depends heavily on where your system begins to bottleneck as load increases. If your system runs out of CPU (idle times are perpetually 0%) then you need to consider upgrading the CPU or perhaps you need to revisit the programs that are causing the load and try to optimize them. If your system is paging to swap a lot you need to consider adding more memory. If your system is saturating the disk you typically see high CPU idle times and total disk saturation. .Xr systat 1 can be used to monitor this. There are many solutions to saturated disks: increasing memory for caching, mirroring disks, distributing operations across several machines, and so forth. .Pp Finally, you might run out of network bandwidth. Optimize the network path as much as possible. For example, in .Xr firewall 7 we describe a firewall protecting internal hosts with a topology where the externally visible hosts are not routed through it. Most bottlenecks occur at the WAN link. If expanding the link is not an option it may be possible to use the .Xr dummynet 4 feature to implement peak shaving or other forms of traffic shaping to prevent the overloaded service (such as web services) from affecting other services (such as email), or vice versa. In home installations this could be used to give interactive traffic (your browser, .Xr ssh 1 logins) priority over services you export from your box (web services, email).
.Sh SEE ALSO .Xr netstat 1 , .Xr systat 1 , .Xr sendfile 2 , .Xr ata 4 , .Xr dummynet 4 , .Xr eventtimers 4 , .Xr ffs 4 , .Xr login.conf 5 , .Xr rc.conf 5 , .Xr sysctl.conf 5 , .Xr firewall 7 , .Xr hier 7 , .Xr ports 7 , .Xr boot 8 , .Xr bsdinstall 8 , .Xr ccdconfig 8 , .Xr config 8 , .Xr fsck 8 , .Xr gjournal 8 , .Xr gpart 8 , .Xr gstripe 8 , -.Xr gvinum 8 , .Xr ifconfig 8 , .Xr ipfw 8 , .Xr loader 8 , .Xr mount 8 , .Xr newfs 8 , .Xr route 8 , .Xr sysctl 8 , .Xr tunefs 8 .Sh HISTORY The .Nm manual page was originally written by .An Matthew Dillon and first appeared in .Fx 4.3 , May 2001. The manual page was greatly modified by .An Eitan Adler Aq Mt eadler@FreeBSD.org .