diff --git a/share/man/man9/disk.9 b/share/man/man9/disk.9 index e65dc889b50e..081e63c67000 100644 --- a/share/man/man9/disk.9 +++ b/share/man/man9/disk.9 @@ -1,255 +1,260 @@ .\" .\" Copyright (c) 2003 Robert N. M. Watson .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice(s), this list of conditions and the following disclaimer as .\" the first lines of this file unmodified other than the possible .\" addition of one or more copyright notices. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice(s), this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY .\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE .\" DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY .\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES .\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH .\" DAMAGE. 
.\" .\" $FreeBSD$ .\" -.Dd August 3, 2017 +.Dd April 30, 2020 .Dt DISK 9 .Os .Sh NAME .Nm disk .Nd kernel disk storage API .Sh SYNOPSIS .In geom/geom_disk.h .Ft struct disk * .Fn disk_alloc void .Ft void .Fn disk_create "struct disk *disk" "int version" .Ft void .Fn disk_gone "struct disk *disk" .Ft void .Fn disk_destroy "struct disk *disk" .Ft int .Fn disk_resize "struct disk *disk" "int flags" .Ft void .Fn disk_add_alias "struct disk *disk" "const char *alias" .Sh DESCRIPTION The disk storage API permits kernel device drivers providing access to disk-like storage devices to advertise the device to other kernel components, including .Xr GEOM 4 and .Xr devfs 5 . .Pp Each disk device is described by a .Vt "struct disk" structure, which contains a variety of parameters for the disk device, function pointers for various methods that may be performed on the device, as well as private data storage for the device driver. In addition, some fields are reserved for use by GEOM in managing access to the device and its statistics. .Pp GEOM has the ownership of .Vt "struct disk" , and drivers must allocate storage for it with the .Fn disk_alloc function, fill in the fields and call .Fn disk_create when the device is ready to service requests. .Fn disk_add_alias adds an alias for the disk and must be called before .Fn disk_create , but may be called multiple times. For each alias added, a device node will be created with .Xr make_dev_alias 9 in the same way primary device nodes are created with .Xr make_dev 9 for .Va d_name and .Va d_unit . Care should be taken to ensure that only one driver creates aliases for any given name. .Fn disk_resize can be called by the driver after modifying .Va d_mediasize to notify GEOM about the disk capacity change. The .Fa flags field should be set to either M_WAITOK, or M_NOWAIT. .Fn disk_gone orphans all of the providers associated with the drive, setting an error condition of ENXIO in each one. 
In addition, it prevents a re-taste on last close for writing if an error condition has been set in the provider. After calling .Fn disk_destroy , the device driver is not allowed to access the contents of .Vt "struct disk" anymore. .Pp The .Fn disk_create function takes a second parameter, .Fa version , which must always be passed .Dv DISK_VERSION . If GEOM detects that the driver is compiled against an unsupported version, it will ignore the device and print a warning on the console. .Ss Descriptive Fields The following fields identify the disk device described by the structure instance, and must be filled in prior to submitting the structure to .Fn disk_create and may not be subsequently changed: .Bl -tag -width indent .It Vt u_int Va d_flags Optional flags indicating to the storage framework what optional features or descriptions the storage device driver supports. Currently supported flags are .Dv DISKFLAG_OPEN (maintained by storage framework), .Dv DISKFLAG_CANDELETE (maintained by device driver), and .Dv DISKFLAG_CANFLUSHCACHE (maintained by device driver). .It Vt "const char *" Va d_name Holds the name of the storage device class, e.g., .Dq Li ahd . This value typically uniquely identifies a particular driver device, and must not conflict with devices serviced by other device drivers. .It Vt u_int Va d_unit Holds the instance of the storage device class, e.g., .Dq Li 4 . This namespace is managed by the device driver, and assignment of unit numbers might be a property of probe order, or in some cases topology. Together, the .Va d_name and .Va d_unit values will uniquely identify a disk storage device. .El .Ss Disk Device Methods The following fields identify various disk device methods, if implemented: .Bl -tag -width indent .It Vt "disk_open_t *" Va d_open Optional: invoked when the disk device is opened. If no method is provided, open will always succeed. .It Vt "disk_close_t *" Va d_close Optional: invoked when the disk device is closed. 
Although an error code may be returned, the call should always terminate any state setup by the corresponding open method call. .It Vt "disk_strategy_t *" Va d_strategy Mandatory: invoked when a new .Vt "struct bio" is to be initiated on the disk device. .It Vt "disk_ioctl_t *" Va d_ioctl Optional: invoked when an I/O control operation is initiated on the disk device. Please note that for security reasons these operations should not be able to affect other devices than the one on which they are performed. .It Vt "dumper_t *" Va d_dump Optional: if configured with .Xr dumpon 8 , this function is invoked from a very restricted system state after a kernel panic to record a copy of the system RAM to the disk. .It Vt "disk_getattr_t *" Va d_getattr Optional: if this method is provided, it gives the disk driver the opportunity to override the default GEOM response to BIO_GETATTR requests. This function should return -1 if the attribute is not handled, 0 if the attribute is handled, or an errno to be passed to g_io_deliver(). .It Vt "disk_gone_t *" Va d_gone Optional: if this method is provided, it will be called after disk_gone() is called, once GEOM has finished its cleanup process. Once this callback is called, it is safe for the disk driver to free all of its resources, as it will not be receiving further calls from GEOM. .El .Ss Mandatory Media Properties The following fields identify the size and granularity of the disk device. These fields must stay stable from return of the drivers open method until the close method is called, but it is perfectly legal to modify them in the open method before returning. .Bl -tag -width indent .It Vt u_int Va d_sectorsize The sector size of the disk device in bytes. .It Vt off_t Va d_mediasize The size of the disk device in bytes. .It Vt u_int Va d_maxsize The maximum supported size in bytes of an I/O request. Requests larger than this size will be chopped up by GEOM. 
.El .Ss Optional Media Properties These optional fields can provide extra information about the disk device. Do not initialize these fields if the field/concept does not apply. These fields must stay stable from return of the drivers open method until the close method is called, but it is perfectly legal to modify them in the open method before returning. .Bl -tag -width indent .It Vt u_int Va d_fwsectors , Vt u_int Va d_fwheads The number of sectors and heads advertised on the disk device by the firmware or BIOS. These values are almost universally bogus, but on some architectures necessary for the correct calculation of disk partitioning. .It Vt u_int Va d_stripeoffset , Vt u_int Va d_stripesize These two fields can be used to describe the width and location of natural performance boundaries for most disk technologies. Please see .Pa src/sys/geom/notes for details. .It Vt char Va d_ident[DISK_IDENT_SIZE] This field can and should be used to store disk's serial number if the d_getattr method described above isn't implemented, or if it does not support the GEOM::ident attribute. .It Vt char Va d_descr[DISK_IDENT_SIZE] This field can be used to store the disk vendor and product description. .It Vt uint16_t Va d_hba_vendor This field can be used to store the PCI vendor ID for the HBA connected to the disk. .It Vt uint16_t Va d_hba_device This field can be used to store the PCI device ID for the HBA connected to the disk. .It Vt uint16_t Va d_hba_subvendor This field can be used to store the PCI subvendor ID for the HBA connected to the disk. .It Vt uint16_t Va d_hba_subdevice This field can be used to store the PCI subdevice ID for the HBA connected to the disk. .El .Ss Driver Private Data This field may be used by the device driver to store a pointer to private data to implement the disk service. .Bl -tag -width indent .It Vt "void *" Va d_drv1 Private data pointer. Typically used to store a pointer to the drivers .Vt softc structure for this disk device. 
.El +.Sh HISTORY +The +.Nm kernel disk storage API +first appeared in +.Fx 4.9 . .Sh SEE ALSO .Xr GEOM 4 , .Xr devfs 5 , .Xr MAKE_DEV 9 .Sh AUTHORS This manual page was written by .An Robert Watson . .Sh BUGS Disk aliases are not a general purpose aliasing mechanism, but are intended only to ease the transition from one name to another. They can be used to ensure that nvd0 and nda0 are the same thing. They cannot be used to implement the diskX concept from macOS. diff --git a/share/man/man9/driver.9 b/share/man/man9/driver.9 index a23b1ac41543..5dba1b94a6bb 100644 --- a/share/man/man9/driver.9 +++ b/share/man/man9/driver.9 @@ -1,121 +1,126 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 1998 Doug Rabson .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd November 22, 2011 +.Dd April 30, 2020 .Dt DRIVER 9 .Os .Sh NAME .Nm driver .Nd structure describing a device driver .Sh SYNOPSIS .Bd -literal #include #include #include #include static int foo_probe(device_t); static int foo_attach(device_t); static int foo_detach(device_t); static int foo_frob(device_t, int, int); static int foo_twiddle(device_t, char *); static device_method_t foo_methods[] = { /* Methods from the device interface */ DEVMETHOD(device_probe, foo_probe), DEVMETHOD(device_attach, foo_attach), DEVMETHOD(device_detach, foo_detach), /* Methods from the bogo interface */ DEVMETHOD(bogo_frob, foo_frob), DEVMETHOD(bogo_twiddle, foo_twiddle), /* Terminate method list */ DEVMETHOD_END }; static driver_t foo_driver = { "foo", foo_methods, sizeof(struct foo_softc) }; static devclass_t foo_devclass; DRIVER_MODULE(foo, bogo, foo_driver, foo_devclass, NULL, NULL); .Ed .Sh DESCRIPTION Each driver in the kernel is described by a .Dv driver_t structure. The structure contains the name of the device, a pointer to a list of methods, an indication of the kind of device which the driver implements and the size of the private data which the driver needs to associate with a device instance. Each driver will implement one or more sets of methods (called interfaces). The example driver implements the standard "driver" interface and the fictitious "bogo" interface. 
.Pp When a driver is registered with the system (by the .Dv DRIVER_MODULE macro, see .Xr DRIVER_MODULE 9 ) , it is added to the list of drivers contained in the devclass of its parent bus type. For instance all PCI drivers would be contained in the devclass named "pci" and all ISA drivers would be in the devclass named "isa". The reason the drivers are not held in the device object of the parent bus is to handle multiple instances of a given type of bus. The .Dv DRIVER_MODULE macro will also create the devclass with the name of the driver and can optionally call extra initialisation code in the driver by specifying an extra module event handler and argument as the last two arguments. +.Sh HISTORY +The +.Nm +framework first appeared in +.Fx 2.2.7 . .Sh SEE ALSO .Xr devclass 9 , .Xr device 9 , .Xr DEVICE_ATTACH 9 , .Xr DEVICE_DETACH 9 , .Xr DEVICE_IDENTIFY 9 , .Xr DEVICE_PROBE 9 , .Xr DEVICE_SHUTDOWN 9 , .Xr DRIVER_MODULE 9 .Sh HISTORY The .Nm framework first appeared in .Fx 2.2.7 . .Sh AUTHORS This manual page was written by .An Doug Rabson . diff --git a/share/man/man9/epoch.9 b/share/man/man9/epoch.9 index 48ad09d2e249..bb16980da2da 100644 --- a/share/man/man9/epoch.9 +++ b/share/man/man9/epoch.9 @@ -1,216 +1,221 @@ .\" .\" Copyright (C) 2018 Matthew Macy . .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice(s), this list of conditions and the following disclaimer as .\" the first lines of this file unmodified other than the possible .\" addition of one or more copyright notices. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice(s), this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY .\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE .\" DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY .\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES .\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH .\" DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd June 28, 2019 +.Dd April 30, 2020 .Dt EPOCH 9 .Os .Sh NAME .Nm epoch , .Nm epoch_context , .Nm epoch_alloc , .Nm epoch_free , .Nm epoch_enter , .Nm epoch_exit , .Nm epoch_wait , .Nm epoch_call , .Nm epoch_drain_callbacks , .Nm in_epoch , .Nd kernel epoch based reclamation .Sh SYNOPSIS .In sys/param.h .In sys/proc.h .In sys/epoch.h .Ft epoch_t .Fn epoch_alloc "int flags" .Ft void .Fn epoch_enter "epoch_t epoch" .Ft void .Fn epoch_enter_preempt "epoch_t epoch" "epoch_tracker_t et" .Ft void .Fn epoch_exit "epoch_t epoch" .Ft void .Fn epoch_exit_preempt "epoch_t epoch" "epoch_tracker_t et" .Ft void .Fn epoch_wait "epoch_t epoch" .Ft void .Fn epoch_wait_preempt "epoch_t epoch" .Ft void .Fn epoch_call "epoch_t epoch" "epoch_context_t ctx" "void (*callback) (epoch_context_t)" .Ft void .Fn epoch_drain_callbacks "epoch_t epoch" .Ft int .Fn in_epoch "epoch_t epoch" .Sh DESCRIPTION Epochs are used to guarantee liveness and immutability of data by deferring reclamation and mutation until a grace period has elapsed. Epochs do not have any lock ordering issues. Entering and leaving an epoch section will never block. 
.Pp Epochs are allocated with .Fn epoch_alloc and freed with .Fn epoch_free . The flags passed to epoch_alloc determine whether preemption is allowed during a section or not (the default), as specified by EPOCH_PREEMPT. Threads indicate the start of an epoch critical section by calling .Fn epoch_enter . The end of a critical section is indicated by calling .Fn epoch_exit . The _preempt variants can be used around code which requires preemption. A thread can wait until a grace period has elapsed since any threads have entered the epoch by calling .Fn epoch_wait or .Fn epoch_wait_preempt , depending on the epoch_type. The use of a default epoch type allows one to use .Fn epoch_wait which is guaranteed to have much shorter completion times since we know that none of the threads in an epoch section will be preempted before completing its section. If the thread can't sleep or is otherwise in a performance sensitive path it can ensure that a grace period has elapsed by calling .Fn epoch_call with a callback with any work that needs to wait for an epoch to elapse. Only non-sleepable locks can be acquired during a section protected by .Fn epoch_enter_preempt and .Fn epoch_exit_preempt . INVARIANTS can assert that a thread is in an epoch by using .Fn in_epoch . .Pp The epoch API currently does not support sleeping in epoch_preempt sections. A caller should never call .Fn epoch_wait in the middle of an epoch section for the same epoch as this will lead to a deadlock. .Pp By default mutexes cannot be held across .Fn epoch_wait_preempt . To permit this the epoch must be allocated with EPOCH_LOCKED. When doing this one must be cautious of creating a situation where a deadlock is possible. Note that epochs are not a straight replacement for read locks. Callers must use safe list and tailq traversal routines in an epoch (see ck_queue). 
When modifying a list referenced from an epoch section safe removal routines must be used and the caller can no longer modify a list entry in place. An item to be modified must be handled with copy on write and frees must be deferred until after a grace period has elapsed. .Pp The .Fn epoch_drain_callbacks function is used to drain all pending callbacks which have been invoked by prior .Fn epoch_call function calls on the same epoch. This function is useful when there are shared memory structure(s) referred to by the epoch callback(s) which are not refcounted and are rarely freed. The typical place for calling this function is right before freeing or invalidating the shared resource(s) used by the epoch callback(s). This function can sleep and is not optimized for performance. .Sh RETURN VALUES .Fn in_epoch curepoch will return 1 if curthread is in curepoch, 0 otherwise. .Sh CAVEATS One must be cautious when using .Fn epoch_wait_preempt threads are pinned during epoch sections so if a thread in a section is then preempted by a higher priority compute bound thread on that CPU it can be prevented from leaving the section. Thus the wait time for the waiter is potentially unbounded. .Sh EXAMPLES Async free example: Thread 1: .Bd -literal int in_pcbladdr(struct inpcb *inp, struct in_addr *faddr, struct in_laddr *laddr, struct ucred *cred) { /* ... */ epoch_enter(net_epoch); CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { sa = ifa->ifa_addr; if (sa->sa_family != AF_INET) continue; sin = (struct sockaddr_in *)sa; if (prison_check_ip4(cred, &sin->sin_addr) == 0) { ia = (struct in_ifaddr *)ifa; break; } } epoch_exit(net_epoch); /* ... */ } .Ed Thread 2: .Bd -literal void ifa_free(struct ifaddr *ifa) { if (refcount_release(&ifa->ifa_refcnt)) epoch_call(net_epoch, &ifa->ifa_epoch_ctx, ifa_destroy); } void if_purgeaddrs(struct ifnet *ifp) { /* .... 
* IF_ADDR_WLOCK(ifp); CK_STAILQ_REMOVE(&ifp->if_addrhead, ifa, ifaddr, ifa_link); IF_ADDR_WUNLOCK(ifp); ifa_free(ifa); } .Ed .Pp Thread 1 traverses the ifaddr list in an epoch. Thread 2 unlinks with the corresponding epoch safe macro, marks as logically free, and then defers deletion. More general mutation or a synchronous free would have to follow a call to .Fn epoch_wait . .Sh ERRORS None. .Sh NOTES The .Nm kernel programming interface is under development and is subject to change. .El +.Sh HISTORY +The +.Nm +framework first appeared in +.Fx 11.0 . .Sh SEE ALSO .Xr locking 9 , .Xr mtx_pool 9 , .Xr mutex 9 , .Xr rwlock 9 , .Xr sema 9 , .Xr sleep 9 , .Xr sx 9 , .Xr timeout 9