Index: head/share/man/man9/BUF_ISLOCKED.9 =================================================================== --- head/share/man/man9/BUF_ISLOCKED.9 (revision 275992) +++ head/share/man/man9/BUF_ISLOCKED.9 (revision 275993) @@ -1,69 +1,69 @@ .\" .\" Copyright (C) 2008 Attilio Rao .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice(s), this list of conditions and the following disclaimer as .\" the first lines of this file unmodified other than the possible .\" addition of one or more copyright notices. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice(s), this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY .\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE .\" DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY .\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES .\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH .\" DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd January 22, 2008 .Dt BUF_ISLOCKED 9 .Os .Sh NAME .Nm BUF_ISLOCKED .Nd "returns the state of the lock linked to the buffer" .Sh SYNOPSIS .In sys/param.h .In sys/systm.h .In sys/uio.h .In sys/bio.h .In sys/buf.h .Ft int .Fn BUF_ISLOCKED "struct buf *bp" .Sh DESCRIPTION The .Fn BUF_ISLOCKED function returns the status of the lock linked to the buffer in relation to curthread. .Pp It can return: .Bl -tag -width ".Dv LK_EXCLUSIVE" .It Dv LK_EXCLUSIVE An exclusive lock is held by curthread. .It Dv LK_EXCLOTHER An exclusive lock is held by someone other than curthread. .It Dv LK_SHARED A shared lock is held. .It Li 0 The lock is not held by anyone. .El .Sh SEE ALSO -.Xr lockstatus 9 , .Xr buf 9 , .Xr BUF_LOCK 9 , .Xr BUF_UNLOCK 9 , -.Xr lockmgr 9 +.Xr lockmgr 9 , +.Xr lockstatus 9 .Sh AUTHORS This manual page was written by .An Attilio Rao Aq Mt attilio@FreeBSD.org . Index: head/share/man/man9/BUS_BIND_INTR.9 =================================================================== --- head/share/man/man9/BUS_BIND_INTR.9 (revision 275992) +++ head/share/man/man9/BUS_BIND_INTR.9 (revision 275993) @@ -1,98 +1,98 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 2009 Advanced Computing Technologies LLC .\" Written by: John H. Baldwin .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd October 14, 2009 .Dt BUS_BIND_INTR 9 .Os .Sh NAME .Nm BUS_BIND_INTR , .Nm bus_bind_intr .Nd "bind an interrupt resource to a specific CPU" .Sh SYNOPSIS .In sys/param.h .In sys/bus.h .Ft int .Fo BUS_BIND_INTR .Fa "device_t dev" "device_t child" "struct resource *irq" "int cpu" .Fc .Ft int .Fn bus_bind_intr "device_t dev" "struct resource *irq" "int cpu" .Sh DESCRIPTION The .Fn BUS_BIND_INTR method allows an interrupt resource to be pinned to a specific CPU. The interrupt resource must have an interrupt handler attached via .Xr BUS_SETUP_INTR 9 . The .Fa cpu parameter corresponds to the ID of a valid CPU in the system. Binding an interrupt restricts the .Xr cpuset 2 of any associated interrupt threads to only include the specified CPU. It may also direct the low-level interrupt handling of the interrupt to the specified CPU as well, but this behavior is platform-dependent. If the value .Dv NOCPU is used for .Fa cpu , then the interrupt will be .Dq unbound which restores any associated interrupt threads back to the default cpuset. .Pp Non-sleepable locks such as mutexes should not be held across calls to these functions. 
.Pp The .Fn bus_bind_intr function is a simple wrapper around .Fn BUS_BIND_INTR . .Pp Note that currently there is no attempt made to arbitrate between multiple bind requests for the same interrupt from either the same device or multiple devices. There is also no arbitration between interrupt binding requests submitted by userland via .Xr cpuset 2 and .Fn BUS_BIND_INTR . The most recent binding request is the one that will be in effect. .Sh RETURN VALUES Zero is returned on success, otherwise an appropriate error is returned. .Sh SEE ALSO -.Xr BUS_SETUP_INTR 9 , .Xr cpuset 2 , +.Xr BUS_SETUP_INTR 9 , .Xr device 9 .Sh HISTORY The .Fn BUS_BIND_INTR method and .Fn bus_bind_intr functions first appeared in .Fx 7.2 . Index: head/share/man/man9/BUS_DESCRIBE_INTR.9 =================================================================== --- head/share/man/man9/BUS_DESCRIBE_INTR.9 (revision 275992) +++ head/share/man/man9/BUS_DESCRIBE_INTR.9 (revision 275993) @@ -1,104 +1,104 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 2009 Advanced Computing Technologies LLC .\" Written by: John H. Baldwin .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd October 14, 2009 .Dt BUS_DESCRIBE_INTR 9 .Os .Sh NAME .Nm BUS_DESCRIBE_INTR , .Nm bus_describe_intr .Nd "associate a description with an active interrupt handler" .Sh SYNOPSIS .In sys/param.h .In sys/bus.h .Ft int .Fo BUS_DESCRIBE_INTR .Fa "device_t dev" "device_t child" "struct resource *irq" "void *cookie" .Fa "const char *descr" .Fc .Ft int .Fo bus_describe_intr .Fa "device_t dev" "struct resource *irq" "void *cookie" "const char *fmt" .Fa ... .Fc .Sh DESCRIPTION The .Fn BUS_DESCRIBE_INTR method associates a description with an active interrupt handler. The .Fa cookie parameter must be the value returned by a successful call to .Xr BUS_SETUP_INTR 9 for the interrupt .Fa irq . .Pp The .Fn bus_describe_intr function is a simple wrapper around .Fn BUS_DESCRIBE_INTR . As a convenience, .Fn bus_describe_intr allows the caller to use .Xr printf 9 style formatting to build the description string using .Fa fmt . .Pp When an interrupt handler is established by .Xr BUS_SETUP_INTR 9 , the handler is named after the device the handler is established for. This name is then used in various places such as interrupt statistics displayed by .Xr systat 1 and .Xr vmstat 8 . For devices that use a single interrupt, the device name is sufficiently unique to identify the interrupt handler. However, for devices that use multiple interrupts it can be useful to distinguish the interrupt handlers.
When a description is set for an active interrupt handler, a colon followed by the description is appended to the device name to form the interrupt handler name. .Sh RETURN VALUES Zero is returned on success, otherwise an appropriate error is returned. .Sh SEE ALSO -.Xr BUS_SETUP_INTR 9 , .Xr systat 1 , .Xr vmstat 8 , +.Xr BUS_SETUP_INTR 9 , .Xr device 9 , .Xr printf 9 .Sh HISTORY The .Fn BUS_DESCRIBE_INTR method and .Fn bus_describe_intr functions first appeared in .Fx 8.1 . .Sh BUGS It is not currently possible to remove a description from an active interrupt handler. Index: head/share/man/man9/DB_COMMAND.9 =================================================================== --- head/share/man/man9/DB_COMMAND.9 (revision 275992) +++ head/share/man/man9/DB_COMMAND.9 (revision 275993) @@ -1,110 +1,110 @@ .\"- .\" Copyright (c) 2008 Guillaume Ballet .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd August 27, 2008 .Dt DB_COMMAND 9 .Os .Sh NAME .Nm DB_COMMAND , .Nm DB_SHOW_COMMAND , .Nm DB_SHOW_ALL_COMMAND .Nd Extends the ddb command set .Sh SYNOPSIS .In ddb/ddb.h .Fo DB_COMMAND .Fa command_name .Fa command_function .Fc .Fn DB_SHOW_COMMAND "command_name" "command_function" .Fn DB_SHOW_ALL_COMMAND "command_name" "command_function" .Sh DESCRIPTION The .Fn DB_COMMAND macro adds .Fa command_name to the list of top-level commands. Invoking .Fa command_name from ddb will call .Fa command_function . .Pp The .Fn DB_SHOW_COMMAND and .Fn DB_SHOW_ALL_COMMAND are roughly equivalent to .Fn DB_COMMAND but in these cases, .Fa command_name is a sub-command of the ddb .Sy show command and .Sy show all command, respectively. .Pp The general command syntax: .Cm command Ns Op Li \&/ Ns Ar modifier -.Ar address Ns Op Li , Ns Ar count , +.Ar address Ns Op , Ns Ar count , translates into the following parameters for .Fa command_function : .Bl -tag -width Fa -offset indent .It Fa addr The address passed to the command as an argument. .It Fa have_addr A boolean value that is true if the addr field is valid. .It Fa count The number of quad words starting at offset .Fa addr that the command must process. .It Fa modif A pointer to the string of modifiers. That is, a series of symbols used to pass some options to the command. For example, the .Sy examine command will display words in decimal form if it is passed the modifier "d". 
.El .Sh EXAMPLE In your module, the command is declared as: .Bd -literal DB_COMMAND(mycmd, my_cmd_func) { if (have_addr) db_printf("Calling my command with address %p\\n", addr); } .Ed .Pp Then, when in ddb: .Bd -literal .Bf Sy db> mycmd 0x1000 Calling my command with address 0x1000 db> .Ef .Ed .Sh SEE ALSO .Xr ddb 4 .Sh AUTHORS This manual page was written by .An Guillaume Ballet Aq Mt gballet@gmail.com . Index: head/share/man/man9/EVENTHANDLER.9 =================================================================== --- head/share/man/man9/EVENTHANDLER.9 (revision 275992) +++ head/share/man/man9/EVENTHANDLER.9 (revision 275993) @@ -1,337 +1,337 @@ .\" Copyright (c) 2004 Joseph Koshy .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" $FreeBSD$ .\" .Dd April 19, 2014 .Dt EVENTHANDLER 9 .Os .Sh NAME .Nm EVENTHANDLER .Nd kernel event handling functions .Sh SYNOPSIS .In sys/eventhandler.h .Fn EVENTHANDLER_DECLARE name type .Fn EVENTHANDLER_INVOKE name ... .Ft eventhandler_tag .Fn EVENTHANDLER_REGISTER name func arg priority .Fn EVENTHANDLER_DEREGISTER name tag .Ft eventhandler_tag .Fo eventhandler_register .Fa "struct eventhandler_list *list" .Fa "const char *name" .Fa "void *func" .Fa "void *arg" .Fa "int priority" .Fc .Ft void .Fo eventhandler_deregister .Fa "struct eventhandler_list *list" .Fa "eventhandler_tag tag" .Fc .Ft "struct eventhandler_list *" .Fn eventhandler_find_list "const char *name" .Ft void .Fn eventhandler_prune_list "struct eventhandler_list *list" .Sh DESCRIPTION The .Nm mechanism provides a way for kernel subsystems to register interest in kernel events and have their callback functions invoked when these events occur. .Pp The normal way to use this subsystem is via the macro interface. The macros that can be used for working with event handlers and callback function lists are: .Bl -tag -width indent .It Fn EVENTHANDLER_DECLARE This macro declares an event handler named by argument .Fa name with callback functions of type .Fa type . .It Fn EVENTHANDLER_REGISTER This macro registers a callback function .Fa func with event handler .Fa name . 
When invoked, function .Fa func will be invoked with argument .Fa arg as its first parameter along with any additional parameters passed in via macro .Fn EVENTHANDLER_INVOKE (see below). Callback functions are invoked in order of priority. The relative priority of each callback among other callbacks associated with an event is given by argument .Fa priority , which is an integer ranging from .Dv EVENTHANDLER_PRI_FIRST (highest priority), to .Dv EVENTHANDLER_PRI_LAST (lowest priority). The symbol .Dv EVENTHANDLER_PRI_ANY may be used if the handler does not have a specific priority associated with it. If registration is successful, .Fn EVENTHANDLER_REGISTER returns a cookie of type .Vt eventhandler_tag . .It Fn EVENTHANDLER_DEREGISTER This macro removes a previously registered callback associated with tag .Fa tag from the event handler named by argument .Fa name . .It Fn EVENTHANDLER_INVOKE This macro is used to invoke all the callbacks associated with event handler .Fa name . This macro is a variadic one. Additional arguments to the macro after the .Fa name parameter are passed as the second and subsequent arguments to each registered callback function. .El .Pp The macros are implemented using the following functions: .Bl -tag -width indent .It Fn eventhandler_register The .Fn eventhandler_register function is used to register a callback with a given event. The arguments expected by this function are: .Bl -tag -width ".Fa priority" .It Fa list A pointer to an existing event handler list, or .Dv NULL . If .Fa list is .Dv NULL , the event handler list corresponding to argument .Fa name is used. .It Fa name The name of the event handler list. .It Fa func A pointer to a callback function. Argument .Fa arg is passed to the callback function .Fa func as its first argument when it is invoked. .It Fa priority The relative priority of this callback among all the callbacks registered for this event. 
Valid values are those in the range .Dv EVENTHANDLER_PRI_FIRST to .Dv EVENTHANDLER_PRI_LAST . .El .Pp The .Fn eventhandler_register function returns a .Fa tag that can later be used with .Fn eventhandler_deregister to remove the particular callback function. .It Fn eventhandler_deregister The .Fn eventhandler_deregister function removes the callback associated with tag .Fa tag from the event handler list pointed to by .Fa list . This function is safe to call from inside an event handler callback. .It Fn eventhandler_find_list The .Fn eventhandler_find_list function returns a pointer to event handler list structure corresponding to event .Fa name . .It Fn eventhandler_prune_list The .Fn eventhandler_prune_list function removes all deregistered callbacks from the event list .Fa list . .El .Ss Kernel Event Handlers The following event handlers are present in the kernel: .Bl -tag -width indent .It Vt acpi_sleep_event Callbacks invoked when the system is being sent to sleep. .It Vt acpi_wakeup_event Callbacks invoked when the system is being woken up. .It Vt app_coredump_start Callbacks invoked at start of application core dump. .It Vt app_coredump_progress Callbacks invoked during progress of application core dump. .It Vt app_coredump_finish Callbacks invoked at finish of application core dump. .It Vt app_coredump_error Callbacks invoked on error of application core dump. .It Vt bpf_track Callbacks invoked when a BPF listener attaches to/detaches from network interface. .It Vt cpufreq_levels_changed Callback invoked when cpu frequency levels have changed. .It Vt cpufreq_post_change Callback invoked after cpu frequency has changed. .It Vt cpufreq_pre_change Callback invoked before cpu frequency has changed. .It Vt dcons_poll Callback invoked to poll for dcons changes. .It Vt dev_clone Callbacks invoked when a new entry is created under .Pa /dev . .It Vt group_attach_event Callback invoked when an interface has been added to an interface group.
.It Vt group_change_event Callback invoked when a change has been made to an interface group. .It Vt group_detach_event Callback invoked when an interface has been removed from an interface group. .It Vt ifaddr_event Callbacks invoked when an address is set up on a network interface. .It Vt if_clone_event Callbacks invoked when an interface is cloned. .It Vt iflladdr_event Callback invoked when an if link layer address event has happened. .It Vt ifnet_arrival_event Callbacks invoked when a new network interface appears. .It Vt ifnet_departure_event Callbacks invoked when a network interface is taken down. .It Vt ifnet_link_event Callback invoked when an interface link event has happened. .It Vt kld_load Callbacks invoked after a linker file has been loaded. .It Vt kld_unload Callbacks invoked after a linker file has been successfully unloaded. .It Vt kld_unload_try Callbacks invoked before a linker file is about to be unloaded. These callbacks may be used to return an error and prevent the unload from proceeding. .It Vt lle_event Callback invoked when a link layer event has happened. .It Vt nmbclusters_change Callback invoked when the number of mbuf clusters has changed. .It Vt nmbufs_change Callback invoked when the number of mbufs has changed. .It Vt maxsockets_change Callback invoked when the maximum number of sockets has changed. .It Vt mountroot Callback invoked when root has been mounted. .It Vt power_profile_change Callbacks invoked when the power profile of the system changes. .It Vt power_resume Callback invoked when the system has resumed. .It Vt power_suspend Callback invoked just before the system is suspended. .It Vt process_ctor Callback invoked when a process is created. .It Vt process_dtor Callback invoked when a process is destroyed. .It Vt process_exec Callbacks invoked when a process performs an .Fn exec operation. .It Vt process_exit Callbacks invoked when a process exits.
.It Vt process_fini Callback invoked when a process memory is destroyed. -This is never called. +This is never called. .It Vt process_fork Callbacks invoked when a process forks a child. .It Vt process_init Callback invoked when a process is initialized. .It Vt random_adaptor_attach Callback invoked when a new random module has been loaded. .It Vt register_framebuffer Callback invoked when a new frame buffer is registered. .It Vt route_redirect_event Callback invoked when a route gets redirected to a new location. .It Vt shutdown_pre_sync Callbacks invoked at shutdown time, before file systems are synchronized. .It Vt shutdown_post_sync Callbacks invoked at shutdown time, after all file systems are synchronized. .It Vt shutdown_final Callbacks invoked just before halting the system. .It Vt tcp_offload_listen_start Callback invoked for TCP Offload to start listening for new connections. .It Vt tcp_offload_listen_stop Callback invoked for TCP Offload to stop listening for new connections. .It Vt thread_ctor Callback invoked when a thread object is created. .It Vt thread_dtor Callback invoked when a thread object is destroyed. .It Vt thread_init Callback invoked when a thread object is initialized. .It Vt thread_fini Callback invoked when a thread object is deinitialized. .It Vt usb_dev_configured Callback invoked when a USB device is configured. .It Vt unregister_framebuffer Callback invoked when a frame buffer is deregistered. .It Vt vfs_mounted Callback invoked when a file system is mounted. .It Vt vfs_unmounted Callback invoked when a file system is unmounted. .It Vt vlan_config Callback invoked when the vlan configuration has changed. .It Vt vlan_unconfig Callback invoked when a vlan is destroyed. .It Vt vm_lowmem Callbacks invoked when virtual memory is low. .It Vt watchdog_list Callbacks invoked when the system watchdog timer is reinitialized.
.El .Sh RETURN VALUES The macro .Fn EVENTHANDLER_REGISTER and function .Fn eventhandler_register return a cookie of type .Vt eventhandler_tag , which may be used in a subsequent call to .Fn EVENTHANDLER_DEREGISTER or .Fn eventhandler_deregister . .Pp The .Fn eventhandler_find_list function returns a pointer to an event handler list corresponding to parameter .Fa name , or .Dv NULL if no such list was found. .Sh HISTORY The .Nm facility first appeared in .Fx 4.0 . .Sh AUTHORS This manual page was written by .An Joseph Koshy Aq Mt jkoshy@FreeBSD.org . Index: head/share/man/man9/VFS.9 =================================================================== --- head/share/man/man9/VFS.9 (revision 275992) +++ head/share/man/man9/VFS.9 (revision 275993) @@ -1,61 +1,61 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 1996 Doug Rabson .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd February 9, 2010 .Dt VFS 9 .Os .Sh NAME .Nm VFS .Nd kernel interface to file systems .Sh DESCRIPTION Calls used to set or query file systems for settings or information. .Pp File systems that do not implement a VFS operation should use the appropriate .Fa vfs_std function from .Pa src/sys/kern/vfs_default.c rather than implementing empty functions or casting to .Fa eopnotsupp . .Sh SEE ALSO .Xr VFS_CHECKEXP 9 , .Xr VFS_FHTOVP 9 , .Xr VFS_INIT 9 , .Xr VFS_MOUNT 9 , .Xr VFS_QUOTACTL 9 , .Xr VFS_SET 9 , .Xr VFS_STATFS 9 , .Xr VFS_SYNC 9 , .Xr VFS_UNMOUNT 9 , .Xr VFS_VGET 9 , -.Xr VOP_VPTOFH 9 , -.Xr vnode 9 +.Xr vnode 9 , +.Xr VOP_VPTOFH 9 .Sh AUTHORS This manual page was written by .An Doug Rabson . Index: head/share/man/man9/VFS_CHECKEXP.9 =================================================================== --- head/share/man/man9/VFS_CHECKEXP.9 (revision 275992) +++ head/share/man/man9/VFS_CHECKEXP.9 (revision 275993) @@ -1,88 +1,88 @@ .\" .\" Copyright (c) 1999 Alfred Perlstein .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following condition .\" is met: .\" Redistributions of source code must retain the above copyright .\" notice, this condition and the following disclaimer. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd January 4, 2010 .Dt VFS_CHECKEXP 9 .Os .Sh NAME .Nm VFS_CHECKEXP .Nd check if a file system is exported to a client .Sh SYNOPSIS .In sys/param.h .In sys/mount.h .Ft int .Fn VFS_CHECKEXP "struct mount *mp" "struct sockaddr *nam" "int *exflagsp" "struct ucred **credanonp" .Sh DESCRIPTION The .Fn VFS_CHECKEXP macro is used by the NFS server to check if a mount point is exported to a client. .Pp The arguments it expects are: .Bl -tag -width credanonp .It Fa mp The mount point to be checked. .It Fa nam An mbuf containing the network address of the client. .It Fa exflagsp Return parameter for the export flags for this client. .It Fa credanonp Return parameter for the anonymous credentials for this client. .El .Pp The .Fn VFS_CHECKEXP macro should be called on a file system's mount structure to determine if it is exported to a client whose address is contained in .Fa nam . .Pp It is generally called before .Xr VFS_FHTOVP 9 to validate that a client has access to the file system. .Pp The file system should call .Xr vfs_export_lookup 9 with the address of an appropriate .Vt netexport structure and the address of the client, .Fa nam , to verify that the client can access this file system. 
.Sh RETURN VALUES The export flags and anonymous credentials specific to the client (returned by .Xr vfs_export_lookup 9 ) will be returned in .Fa *exflagsp and .Fa *credanonp . .Sh SEE ALSO .Xr VFS 9 , .Xr VFS_FHTOVP 9 , -.Xr VOP_VPTOFH 9 , -.Xr vnode 9 +.Xr vnode 9 , +.Xr VOP_VPTOFH 9 .Sh AUTHORS This manual page was written by .An Alfred Perlstein . Index: head/share/man/man9/VFS_FHTOVP.9 =================================================================== --- head/share/man/man9/VFS_FHTOVP.9 (revision 275992) +++ head/share/man/man9/VFS_FHTOVP.9 (revision 275993) @@ -1,83 +1,83 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 1996 Doug Rabson .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd January 4, 2010 .Dt VFS_FHTOVP 9 .Os .Sh NAME .Nm VFS_FHTOVP .Nd turn an NFS filehandle into a vnode .Sh SYNOPSIS .In sys/param.h .In sys/mount.h .In sys/vnode.h .Ft int .Fn VFS_FHTOVP "struct mount *mp" "struct fid *fhp" "struct vnode **vpp" .Sh DESCRIPTION The .Fn VFS_FHTOVP macro is used by the NFS server to turn an NFS filehandle into a vnode. .Pp The arguments it expects are: .Bl -tag -width vpp .It Fa mp The file system. .It Fa fhp The filehandle to convert. .It Fa vpp Return parameter for the new locked vnode. .El .Pp The contents of the filehandle are defined by the file system and are not examined by any other part of the system. It should contain enough information to uniquely identify a file within the file system as well as noticing when a file has been removed and the file system resources have been reused for a new file. For instance, UFS file system stores the inode number and inode generation counter in its filehandle. .Pp A call to .Fn VFS_FHTOVP should generally be preceded by a call to .Xr VFS_CHECKEXP 9 to check if the file is accessible to the client. .Sh RETURN VALUES The locked vnode for the file will be returned in .Fa *vpp . .Sh SEE ALSO .Xr VFS 9 , .Xr VFS_CHECKEXP 9 , -.Xr VOP_VPTOFH 9 , -.Xr vnode 9 +.Xr vnode 9 , +.Xr VOP_VPTOFH 9 .Sh AUTHORS This manual page was written by .An Doug Rabson . Index: head/share/man/man9/VFS_SET.9 =================================================================== --- head/share/man/man9/VFS_SET.9 (revision 275992) +++ head/share/man/man9/VFS_SET.9 (revision 275993) @@ -1,111 +1,111 @@ .\" .\" Copyright (C) 2001 Chad David . All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. 
Redistributions of source code must retain the above copyright .\" notice(s), this list of conditions and the following disclaimer as .\" the first lines of this file unmodified other than the possible .\" addition of one or more copyright notices. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice(s), this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY .\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE .\" DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY .\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES .\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH .\" DAMAGE. .\" .\" $FreeBSD$ .\" .Dd February 21, 2013 .Dt VFS_SET 9 .Os .Sh NAME .Nm VFS_SET .Nd set up loadable file system .Vt vfsconf .Sh SYNOPSIS .In sys/param.h .In sys/kernel.h .In sys/module.h .In sys/mount.h .Ft void .Fn VFS_SET "struct vfsops *vfsops" "fsname" "int flags" .Sh DESCRIPTION .Fn VFS_SET creates a .Vt vfsconf structure for the loadable module with the given .Fa vfsops , fsname and .Fa flags , and declares it by calling .Xr DECLARE_MODULE 9 using .Fn vfs_modevent as the event handler. .Pp Possible values for the .Fa flags argument are: .Bl -hang -width ".Dv VFCF_DELEGADMIN" .It Dv VFCF_STATIC File system should be statically available in the kernel. .It Dv VFCF_NETWORK Network exportable file system. 
.It Dv VFCF_READONLY Does not support write operations. .It Dv VFCF_SYNTHETIC Pseudo file system, data does not represent on-disk files. .It Dv VFCF_LOOPBACK Loopback file system layer. .It Dv VFCF_UNICODE File names are stored as Unicode. .It Dv VFCF_JAIL Can be mounted from within a jail if .Va security.jail.mount_allowed sysctl is set to .Dv 1 . .It Dv VFCF_DELEGADMIN Supports delegated administration if .Va vfs.usermount sysctl is set to .Dv 1 . .It Dv VFCF_SBDRY When in VFS method, the thread suspension is deferred to the user boundary upon arrival of stop action. .El .Sh PSEUDOCODE .Bd -literal /* * Fill in the fields for which we have special methods. * The others are initially null. This tells vfs to change them to * pointers to vfs_std* functions during file system registration. */ static struct vfsops myfs_vfsops = { .vfs_mount = myfs_mount, .vfs_root = myfs_root, .vfs_statfs = myfs_statfs, .vfs_unmount = myfs_unmount, }; VFS_SET(myfs_vfsops, myfs, 0); .Ed .Sh SEE ALSO .Xr jail 2 , .Xr jail 8 , .Xr DECLARE_MODULE 9 , -.Xr vfsconf 9 , -.Xr vfs_modevent 9 +.Xr vfs_modevent 9 , +.Xr vfsconf 9 .Sh AUTHORS This manual page was written by .An Chad David Aq Mt davidc@acns.ab.ca . Index: head/share/man/man9/VOP_LOCK.9 =================================================================== --- head/share/man/man9/VOP_LOCK.9 (revision 275992) +++ head/share/man/man9/VOP_LOCK.9 (revision 275993) @@ -1,124 +1,125 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 1996 Doug Rabson .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. 
Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd February 25, 2008 .Dt VOP_LOCK 9 .Os .Sh NAME .Nm VOP_LOCK , .Nm VOP_UNLOCK , .Nm VOP_ISLOCKED , .Nm vn_lock .Nd serialize access to a vnode .Sh SYNOPSIS .In sys/param.h .In sys/lock.h .In sys/vnode.h .Ft int .Fn VOP_LOCK "struct vnode *vp" "int flags" .Ft int .Fn VOP_UNLOCK "struct vnode *vp" "int flags" .Ft int .Fn VOP_ISLOCKED "struct vnode *vp" .Ft int .Fn vn_lock "struct vnode *vp" "int flags" .Sh DESCRIPTION These calls are used to serialize access to the file system, such as to prevent two writes to the same file from happening at the same time. .Pp The arguments are: .Bl -tag -width flags .It Fa vp The vnode being locked or unlocked. .It Fa flags One of the lock request types: .Pp .Bl -tag -width ".Dv LK_CANRECURSE" -offset indent -compact .It Dv LK_SHARED Shared lock. .It Dv LK_EXCLUSIVE Exclusive lock. .It Dv LK_UPGRADE Shared-to-exclusive upgrade. .It Dv LK_DOWNGRADE Exclusive-to-shared downgrade. .It Dv LK_RELEASE Release any type of lock. 
.It Dv LK_DRAIN Wait for all lock activity to end. .El .Pp The lock type may be .Em or Ns 'ed with these lock flags: .Pp .Bl -tag -width ".Dv LK_CANRECURSE" -offset indent -compact .It Dv LK_NOWAIT Do not sleep to wait for lock. .It Dv LK_SLEEPFAIL Sleep, then return failure. .It Dv LK_CANRECURSE Allow recursive exclusive lock. .It Dv LK_NOWITNESS Instruct .Xr witness 4 to ignore this instance. .El .Pp The lock type may be .Em or Ns 'ed with these control flags: .Pp .Bl -tag -width ".Dv LK_CANRECURSE" -offset indent -compact .It Dv LK_INTERLOCK Specify when the caller already has a simple lock -.Fn ( VOP_LOCK -will unlock the simple lock after getting the lock). +.Po Fn VOP_LOCK +will unlock the simple lock after getting the lock +.Pc . .It Dv LK_RETRY Retry until locked. .El .Pp Kernel code should use .Fn vn_lock to lock a vnode rather than calling .Fn VOP_LOCK directly. .Fn vn_lock also does not want a thread specified as argument but it assumes curthread to be used. .El .Sh RETURN VALUES Zero is returned on success, otherwise an error is returned. .Sh SEE ALSO .Xr vnode 9 .Sh AUTHORS This manual page was written by .An Doug Rabson . Index: head/share/man/man9/VOP_VPTOCNP.9 =================================================================== --- head/share/man/man9/VOP_VPTOCNP.9 (revision 275992) +++ head/share/man/man9/VOP_VPTOCNP.9 (revision 275993) @@ -1,92 +1,92 @@ .\" -*- nroff -*- .\" .\" Copyright (c) 2008 Joe Marcus Clarke .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. 
Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd November 19, 2011 .Dt VOP_VPTOCNP 9 .Os .Sh NAME .Nm VOP_VPTOCNP .Nd translate a vnode to its component name .Sh SYNOPSIS .In sys/param.h .In sys/vnode.h .Ft int .Fn VOP_VPTOCNP "struct vnode *vp" "struct vnode **dvp" "char *buf" "int *buflen" .Sh DESCRIPTION This translates a vnode into its component name, and writes that name to the head of the buffer specified by .Fa buf . .Bl -tag -width buflen .It Fa vp The vnode to translate. .It Fa dvp The vnode of the parent directory of .Fa vp . .It Fa buf The buffer into which to prepend the component name. .It Fa buflen The remaining size of the buffer. .El .Pp The default implementation of .Nm scans through .Fa vp Ns 's parent directory looking for a dirent with a matching file number. If .Fa vp is not a directory, then .Nm returns ENOENT. .Sh LOCKS The vnode should be locked on entry and will still be locked on exit. The parent directory vnode will be unlocked on a successful exit. However, it will have its use count incremented. 
.Sh RETURN VALUES Zero is returned on success, otherwise an error code is returned. .Sh ERRORS .Bl -tag -width Er .It Bq Er ENOMEM The buffer was not large enough to hold the vnode's component name. .It Bq Er ENOENT The vnode was not found on the file system. .El .Sh SEE ALSO -.Xr VOP_LOOKUP 9 , -.Xr vnode 9 +.Xr vnode 9 , +.Xr VOP_LOOKUP 9 .Sh NOTES This interface is a work in progress. .Sh HISTORY The function .Nm appeared in .Fx 8.0 . .Sh AUTHORS This manual page was written by .An Joe Marcus Clarke . Index: head/share/man/man9/accf_data.9 =================================================================== --- head/share/man/man9/accf_data.9 (revision 275992) +++ head/share/man/man9/accf_data.9 (revision 275993) @@ -1,78 +1,78 @@ .\" .\" Copyright (c) 2000 Alfred Perlstein .\" .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" " .Dd November 15, 2000 .Dt ACCF_DATA 9 .Os .Sh NAME .Nm accf_data .Nd buffer incoming connections until data arrives .Sh SYNOPSIS .Nm options INET .Nm options ACCEPT_FILTER_DATA .Nm kldload accf_data .Sh DESCRIPTION This is a filter to be placed on a socket that will be using .Fn accept to receive incoming connections. .Pp It prevents the application from receiving the connected descriptor via .Fn accept until data arrives on the connection. .Pp The .Fa ACCEPT_FILTER_DATA kernel option is also a module that can be enabled at runtime via .Xr kldload 8 if the INET option has been compiled into the kernel. .Sh EXAMPLES Assuming ACCEPT_FILTER_DATA has been included in the kernel config file or the .Nm module has been loaded, this will enable the data accept filter on the socket .Fa sok . .Bd -literal -offset 0i struct accept_filter_arg afa; bzero(&afa, sizeof(afa)); strcpy(afa.af_name, "dataready"); setsockopt(sok, SOL_SOCKET, SO_ACCEPTFILTER, &afa, sizeof(afa)); .Ed .Sh SEE ALSO .Xr setsockopt 2 , .Xr accept_filter 9 , -.Xr accf_dns 9 +.Xr accf_dns 9 , .Xr accf_http 9 .Sh HISTORY The accept filter mechanism and the accf_data filter were introduced in .Fx 4.0 . .Sh AUTHORS This manual page and the filter were written by .An Alfred Perlstein . 
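The EXAMPLES section of accf_data.9 above can be fleshed out into a slightly more defensive sketch. The helper names copy_filter_name() and enable_accept_filter() are hypothetical and do not come from the manual page. The bounded copy stands in for the strcpy() call shown in the page, and the __FreeBSD__ guard is an assumption so that the fragment still compiles on systems without SO_ACCEPTFILTER.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of any FreeBSD API): copy a filter name
 * into a fixed-size field. Returns 0 on success, -1 if the name plus its
 * terminating NUL does not fit.
 */
static int
copy_filter_name(char *dst, size_t dstlen, const char *name)
{
	size_t n;

	assert(dst != NULL && name != NULL);
	n = strlen(name);
	if (n >= dstlen)
		return (-1);
	memcpy(dst, name, n + 1);	/* includes the terminating NUL */
	return (0);
}

/*
 * Hypothetical wrapper: attach the named accept filter (for example
 * "dataready") to the socket sok. Returns the setsockopt() result, or
 * -1 with errno set when the name is too long or the platform has no
 * accept filters.
 */
static int
enable_accept_filter(int sok, const char *name)
{
#if defined(__FreeBSD__) && defined(SO_ACCEPTFILTER)
	struct accept_filter_arg afa;

	memset(&afa, 0, sizeof(afa));
	if (copy_filter_name(afa.af_name, sizeof(afa.af_name), name) != 0) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	return (setsockopt(sok, SOL_SOCKET, SO_ACCEPTFILTER, &afa,
	    sizeof(afa)));
#else
	/* Assumption: treat a missing SO_ACCEPTFILTER as "not supported". */
	(void)sok;
	(void)name;
	errno = ENOPROTOOPT;
	return (-1);
#endif
}
```

A caller would pass the filter name used in the page, for instance enable_accept_filter(sok, "dataready"), and check the return value instead of ignoring it as the short example does.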
Index: head/share/man/man9/accf_dns.9 =================================================================== --- head/share/man/man9/accf_dns.9 (revision 275992) +++ head/share/man/man9/accf_dns.9 (revision 275993) @@ -1,79 +1,79 @@ .\" .\" Copyright (c) 2008 David Malone .\" .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" " .Dd July 16, 2008 .Dt ACCF_DNS 9 .Os .Sh NAME .Nm accf_dns .Nd buffer incoming DNS requests until the whole first request is present .Sh SYNOPSIS .Nm options INET .Nm options ACCEPT_FILTER_DNS .Nm kldload accf_dns .Sh DESCRIPTION This is a filter to be placed on a socket that will be using .Fn accept to receive incoming connections. 
.Pp It prevents the application from receiving the connected descriptor via .Fn accept until a whole DNS request is available on the socket. It does this by reading the first two bytes of the request, to determine its size, and waiting until the required amount of data is available to be read. .Pp The .Fa ACCEPT_FILTER_DNS kernel option is also a module that can be enabled at runtime via .Xr kldload 8 if the INET option has been compiled into the kernel. .Sh EXAMPLES If the .Nm module is available in the kernel, the following code will enable the DNS accept filter on a socket .Fa sok . .Bd -literal -offset 0i struct accept_filter_arg afa; bzero(&afa, sizeof(afa)); strcpy(afa.af_name, "dnsready"); setsockopt(sok, SOL_SOCKET, SO_ACCEPTFILTER, &afa, sizeof(afa)); .Ed .Sh SEE ALSO .Xr setsockopt 2 , .Xr accept_filter 9 , +.Xr accf_data 9 , .Xr accf_http 9 -.Xr accf_data 9 .Sh HISTORY The accept filter mechanism was introduced in .Fx 4.0 . .Sh AUTHORS This manual page and the filter were written by .An David Malone . Index: head/share/man/man9/acl.9 =================================================================== --- head/share/man/man9/acl.9 (revision 275992) +++ head/share/man/man9/acl.9 (revision 275993) @@ -1,219 +1,219 @@ .\"- .\" Copyright (c) 1999-2001 Robert N. M. Watson .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd September 18, 2009 .Dt ACL 9 .Os .Sh NAME .Nm acl .Nd virtual file system access control lists .Sh SYNOPSIS .In sys/param.h .In sys/vnode.h .In sys/acl.h .Pp In the kernel configuration file: .Cd "options UFS_ACL" .Sh DESCRIPTION Access control lists, or ACLs, allow fine-grained specification of rights for vnodes representing files and directories. However, as there are a plethora of file systems with differing ACL semantics, the vnode interface is aware only of the syntax of ACLs, relying on the underlying file system to implement the details. Depending on the underlying file system, each file or directory may have zero or more ACLs associated with it, named using the .Fa type field of the appropriate vnode ACL calls: .Xr VOP_ACLCHECK 9 , .Xr VOP_GETACL 9 , and .Xr VOP_SETACL 9 . 
.Pp Currently, each ACL is represented in-kernel by a fixed-size .Vt acl structure, defined as follows: .Bd -literal -offset indent struct acl { unsigned int acl_maxcnt; unsigned int acl_cnt; int acl_spare[4]; struct acl_entry acl_entry[ACL_MAX_ENTRIES]; }; .Ed .Pp An ACL is constructed from a fixed size array of ACL entries, each of which consists of a set of permissions, principal namespace, and principal identifier. In this implementation, the .Vt acl_maxcnt field is always set to .Dv ACL_MAX_ENTRIES . .Pp Each individual ACL entry is of the type .Vt acl_entry_t , which is a structure with the following members: .Bl -tag -width 2n .It Vt acl_tag_t Va ae_tag The following is a list of definitions of ACL types to be set in .Va ae_tag : .Pp .Bl -tag -width ".Dv ACL_UNDEFINED_FIELD" -offset indent -compact .It Dv ACL_UNDEFINED_FIELD Undefined ACL type. .It Dv ACL_USER_OBJ Discretionary access rights for processes whose effective user ID matches the user ID of the file's owner. .It Dv ACL_USER Discretionary access rights for processes whose effective user ID matches the ACL entry qualifier. .It Dv ACL_GROUP_OBJ Discretionary access rights for processes whose effective group ID or any supplemental groups match the group ID of the file's owner. .It Dv ACL_GROUP Discretionary access rights for processes whose effective group ID or any supplemental groups match the ACL entry qualifier. .It Dv ACL_MASK The maximum discretionary access rights that can be granted to a process in the file group class. This is only valid for POSIX.1e ACLs. .It Dv ACL_OTHER Discretionary access rights for processes not covered by any other ACL entry. This is only valid for POSIX.1e ACLs. .It Dv ACL_OTHER_OBJ Same as .Dv ACL_OTHER . .It Dv ACL_EVERYONE Discretionary access rights for all users. This is only valid for NFSv4 ACLs. .El .Pp Each POSIX.1e ACL must contain exactly one .Dv ACL_USER_OBJ , one .Dv ACL_GROUP_OBJ , and one .Dv ACL_OTHER . 
If any of .Dv ACL_USER , .Dv ACL_GROUP , or .Dv ACL_OTHER are present, then exactly one .Dv ACL_MASK entry should be present. .It Vt uid_t Va ae_id The ID of user for whom this ACL describes access permissions. For entries other than .Dv ACL_USER and .Dv ACL_GROUP , this field should be set to .Dv ACL_UNDEFINED_ID . .It Vt acl_perm_t Va ae_perm This field defines what kind of access the process matching this ACL has for accessing the associated file. For POSIX.1e ACLs, the following are valid: .Bl -tag -width ".Dv ACL_WRITE_NAMED_ATTRS" .It Dv ACL_EXECUTE The process may execute the associated file. .It Dv ACL_WRITE The process may write to the associated file. .It Dv ACL_READ The process may read from the associated file. .It Dv ACL_PERM_NONE The process has no read, write or execute permissions to the associated file. .El .Pp For NFSv4 ACLs, the following are valid: .Bl -tag -width ".Dv ACL_WRITE_NAMED_ATTRS" .It Dv ACL_READ_DATA The process may read from the associated file. .It Dv ACL_LIST_DIRECTORY Same as .Dv ACL_READ_DATA . .It Dv ACL_WRITE_DATA The process may write to the associated file. .It Dv ACL_ADD_FILE Same as .Dv ACL_ACL_WRITE_DATA . .It Dv ACL_APPEND_DATA .It Dv ACL_ADD_SUBDIRECTORY Same as .Dv ACL_APPEND_DATA . .It Dv ACL_READ_NAMED_ATTRS Ignored. .It Dv ACL_WRITE_NAMED_ATTRS Ignored. .It Dv ACL_EXECUTE The process may execute the associated file. .It Dv ACL_DELETE_CHILD .It Dv ACL_READ_ATTRIBUTES .It Dv ACL_WRITE_ATTRIBUTES .It Dv ACL_DELETE .It Dv ACL_READ_ACL .It Dv ACL_WRITE_ACL .It Dv ACL_WRITE_OWNER .It Dv ACL_SYNCHRONIZE Ignored. .El .It Vt acl_entry_type_t Va ae_entry_type This field defines the type of NFSv4 ACL entry. It is not used with POSIX.1e ACLs. The following values are valid: .Bl -tag -width ".Dv ACL_WRITE_NAMED_ATTRS" .It Dv ACL_ENTRY_TYPE_ALLOW .It Dv ACL_ENTRY_TYPE_DENY .El .It Vt acl_flag_t Va ae_flags This field defines the inheritance flags of NFSv4 ACL entry. It is not used with POSIX.1e ACLs. 
The following values are valid: .Bl -tag -width ".Dv ACL_ENTRY_DIRECTORY_INHERIT" .It Dv ACL_ENTRY_FILE_INHERIT .It Dv ACL_ENTRY_DIRECTORY_INHERIT .It Dv ACL_ENTRY_NO_PROPAGATE_INHERIT .It Dv ACL_ENTRY_INHERIT_ONLY .El .El .Sh SEE ALSO .Xr acl 3 , +.Xr vaccess 9 , .Xr vaccess_acl_nfs4 9 , .Xr vaccess_acl_posix1e 9 , .Xr VFS 9 , -.Xr vaccess 9 , .Xr VOP_ACLCHECK 9 , .Xr VOP_GETACL 9 , .Xr VOP_SETACL 9 .Sh AUTHORS This manual page was written by .An Robert Watson . Index: head/share/man/man9/alq.9 =================================================================== --- head/share/man/man9/alq.9 (revision 275992) +++ head/share/man/man9/alq.9 (revision 275993) @@ -1,441 +1,441 @@ .\" .\" Copyright (c) 2003 Hiten Pandya .\" Copyright (c) 2009-2010 The FreeBSD Foundation .\" All rights reserved. .\" .\" Portions of this software were developed at the Centre for Advanced .\" Internet Architectures, Swinburne University of Technology, Melbourne, .\" Australia by Lawrence Stewart under sponsorship from the FreeBSD .\" Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions, and the following disclaimer, .\" without modification, immediately at the beginning of the file. .\" 2. The name of the author may not be used to endorse or promote products .\" derived from this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR .\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd April 26, 2010 .Dt ALQ 9 .Os .Sh NAME .Nm alq , .Nm alq_open_flags , .Nm alq_open , .Nm alq_writen , .Nm alq_write , .Nm alq_flush , .Nm alq_close , .Nm alq_getn , .Nm alq_get , .Nm alq_post_flags , .Nm alq_post .Nd Asynchronous Logging Queues .Sh SYNOPSIS .In sys/alq.h .Ft int .Fo alq_open_flags .Fa "struct alq **app" .Fa "const char *file" .Fa "struct ucred *cred" .Fa "int cmode" .Fa "int size" .Fa "int flags" .Fc .Ft int .Fo alq_open .Fa "struct alq **app" .Fa "const char *file" .Fa "struct ucred *cred" .Fa "int cmode" .Fa "int size" .Fa "int count" .Fc .Ft int .Fn alq_writen "struct alq *alq" "void *data" "int len" "int flags" .Ft int .Fn alq_write "struct alq *alq" "void *data" "int flags" .Ft void .Fn alq_flush "struct alq *alq" .Ft void .Fn alq_close "struct alq *alq" .Ft struct ale * .Fn alq_getn "struct alq *alq" "int len" "int flags" .Ft struct ale * .Fn alq_get "struct alq *alq" "int flags" .Ft void .Fn alq_post_flags "struct alq *alq" "struct ale *ale" "int flags" .Ft void .Fn alq_post "struct alq *alq" "struct ale *ale" .Sh DESCRIPTION The .Nm facility provides an asynchronous fixed or variable length recording mechanism, known as Asynchronous Logging Queues. It can record to any .Xr vnode 9 , thus providing the ability to journal logs to character devices as well as regular files. 
All functions accept a .Vt "struct alq" argument, which is an opaque type that maintains state information for an Asynchronous Logging Queue. The logging facility runs in a separate kernel thread, which services all log entry requests. .Pp An .Dq asynchronous log entry is defined as .Vt "struct ale" , which has the following members: .Bd -literal -offset indent struct ale { intptr_t ae_bytesused; /* # bytes written to ALE. */ char *ae_data; /* Write ptr. */ int ae_pad; /* Unused, compat. */ }; .Ed .Pp An .Nm can be created in either fixed or variable length mode. A variable length .Nm accommodates writes of varying length using .Fn alq_writen and .Fn alq_getn . A fixed length .Nm accommodates a fixed number of writes using .Fn alq_write and .Fn alq_get , each of fixed size (set at queue creation time). Fixed length mode is deprecated in favour of variable length mode. .Sh FUNCTIONS The .Fn alq_open_flags function creates a new variable length asynchronous logging queue. The .Fa file argument is the name of the file to open for logging. If the file does not yet exist, .Fn alq_open will attempt to create it. The .Fa cmode argument will be passed to .Fn vn_open as the requested creation mode, to be used if the file will be created by .Fn alq_open . Consumers of this API may wish to pass .Dv ALQ_DEFAULT_CMODE , a default creation mode suitable for most applications. The .Fa cred argument specifies the credentials to use when opening and performing I/O on the file. The .Fa size argument sets the size (in bytes) of the underlying queue. The ALQ_ORDERED flag may be passed in via .Fa flags to indicate that the ordering of writer threads waiting for a busy .Nm to free up resources should be preserved. .Pp The deprecated .Fn alq_open function is implemented as a wrapper around .Fn alq_open_flags to provide backwards compatibility to consumers that have not been updated to utilise the newer .Fn alq_open_flags function. 
It passes all arguments through to .Fn alq_open_flags untouched except for .Fa size and .Fa count , and sets .Fa flags to 0. To create a variable length mode .Nm , the .Fa size argument should be set to the size (in bytes) of the underlying queue and the .Fa count argument should be set to 0. To create a fixed length mode .Nm , the .Fa size argument should be set to the size (in bytes) of each write and the .Fa count argument should be set to the number of .Fa size byte chunks to reserve capacity for. .Pp The .Fn alq_writen function writes .Fa len bytes from .Fa data to the designated variable length mode queue .Fa alq . If .Fn alq_writen could not write the entry immediately and .Dv ALQ_WAITOK is set in .Fa flags , the function will be allowed to .Xr msleep_spin 9 with the .Dq Li alqwnord or .Dq Li alqwnres wait message. A write will automatically schedule the queue .Fa alq to be flushed to disk. This behaviour can be controlled by passing ALQ_NOACTIVATE via .Fa flags to indicate that the write should not schedule .Fa alq to be flushed to disk. .Pp The deprecated .Fn alq_write function is implemented as a wrapper around .Fn alq_writen to provide backwards compatibility to consumers that have not been updated to utilise variable length mode queues. The function will write .Fa size bytes of data (where .Fa size was specified at queue creation time) from the .Fa data buffer to the .Fa alq . Note that it is an error to call .Fn alq_write on a variable length mode queue. .Pp The .Fn alq_flush function is used for flushing .Fa alq to the log medium that was passed to .Fn alq_open . If .Fa alq has data to flush and is not already in the process of being flushed, the function will block doing IO. Otherwise, the function will return immediately. .Pp The .Fn alq_close function will close the asynchronous logging queue .Fa alq and flush all pending write requests to the log medium. It will free all resources that were previously allocated. 
.Pp The .Fn alq_getn function returns an asynchronous log entry from .Fa alq , initialised to point at a buffer capable of receiving .Fa len bytes of data. This function leaves .Fa alq in a locked state, until a subsequent .Fn alq_post or .Fn alq_post_flags call is made. If .Fn alq_getn could not obtain .Fa len bytes of buffer immediately and .Dv ALQ_WAITOK is set in .Fa flags , the function will be allowed to .Xr msleep_spin 9 with the .Dq Li alqgnord or .Dq Li alqgnres wait message. The caller can choose to write less than .Fa len bytes of data to the returned asynchronous log entry by setting the entry's ae_bytesused field to the number of bytes actually written. This must be done prior to calling .Fn alq_post . .Pp The deprecated .Fn alq_get function is implemented as a wrapper around .Fn alq_getn to provide backwards compatibility to consumers that have not been updated to utilise variable length mode queues. The asynchronous log entry returned will be initialised to point at a buffer capable of receiving .Fa size bytes of data (where .Fa size was specified at queue creation time). Note that it is an error to call .Fn alq_get on a variable length mode queue. .Pp The .Fn alq_post_flags function schedules the asynchronous log entry .Fa ale (obtained from .Fn alq_getn or .Fn alq_get ) for writing to .Fa alq . The ALQ_NOACTIVATE flag may be passed in via .Fa flags to indicate that the queue should not be immediately scheduled to be flushed to disk. This function leaves .Fa alq in an unlocked state. .Pp The .Fn alq_post function is implemented as a wrapper around .Fn alq_post_flags to provide backwards compatibility to consumers that have not been updated to utilise the newer .Fn alq_post_flags function. It simply passes all arguments through to .Fn alq_post_flags untouched, and sets .Fa flags to 0. .Sh IMPLEMENTATION NOTES The .Fn alq_writen and .Fn alq_write functions both perform a .Xr bcopy 3 from the supplied .Fa data buffer into the underlying .Nm buffer. 
Performance critical code paths may wish to consider using .Fn alq_getn (variable length queues) or .Fn alq_get (fixed length queues) to avoid the extra memory copy. Note that a queue remains locked between calls to .Fn alq_getn or .Fn alq_get and .Fn alq_post or .Fn alq_post_flags , so this method of writing to a queue is unsuitable for situations where the time between calls may be substantial. .Sh LOCKING Each asynchronous logging queue is protected by a spin mutex. .Pp Functions .Fn alq_flush and .Fn alq_open may attempt to acquire an internal sleep mutex, and should consequently not be used in contexts where sleeping is not allowed. .Sh RETURN VALUES The .Fn alq_open function returns one of the error codes listed in .Xr open 2 , if it fails to open .Fa file , or else it returns 0. .Pp The .Fn alq_writen and .Fn alq_write functions return .Er EWOULDBLOCK if .Dv ALQ_NOWAIT was set in .Fa flags and either the queue is full or the system is shutting down. .Pp The .Fn alq_getn and .Fn alq_get functions return .Dv NULL if .Dv ALQ_NOWAIT was set in .Fa flags and either the queue is full or the system is shutting down. .Pp NOTE: invalid arguments to non-void functions will result in undefined behaviour. .Sh SEE ALSO +.Xr syslog 3 , .Xr kproc 9 , .Xr ktr 9 , .Xr msleep_spin 9 , -.Xr syslog 3 , .Xr vnode 9 .Sh HISTORY The Asynchronous Logging Queues (ALQ) facility first appeared in .Fx 5.0 . .Sh AUTHORS .An -nosplit The .Nm facility was written by .An Jeffrey Roberson Aq Mt jeff@FreeBSD.org and extended by .An Lawrence Stewart Aq Mt lstewart@freebsd.org . .Pp This manual page was written by .An Hiten Pandya Aq Mt hmp@FreeBSD.org and revised by .An Lawrence Stewart Aq Mt lstewart@freebsd.org . 
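The IMPLEMENTATION NOTES of alq.9 above contrast the copying write path (alq_writen) with the reserve, fill, and post path (alq_getn followed by alq_post). The following is a toy userland model of that contrast and nothing more: every toy_* name is hypothetical, the fixed 64-byte buffer is an arbitrary assumption, and none of this is the kernel implementation. It only illustrates why the queue stays locked between the reserve and the post, and how a caller may shrink ae_bytesused before posting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct ale; field names follow the manual page. */
struct toy_ale {
	size_t	 ae_bytesused;	/* bytes the caller actually wrote */
	char	*ae_data;	/* write pointer into the queue buffer */
};

/* Toy queue: one flat buffer, never drained (assumption for brevity). */
struct toy_alq {
	char		buf[64];
	size_t		used;	/* bytes committed so far */
	int		locked;	/* 1 between toy_getn() and toy_post() */
	struct toy_ale	ale;
};

/* Reserve len bytes in place; NULL if busy or full (models ALQ_NOWAIT). */
static struct toy_ale *
toy_getn(struct toy_alq *q, size_t len)
{
	if (q->locked || len > sizeof(q->buf) - q->used)
		return (NULL);
	q->locked = 1;
	q->ale.ae_data = q->buf + q->used;
	q->ale.ae_bytesused = len;
	return (&q->ale);
}

/* Commit the entry and release the queue. */
static void
toy_post(struct toy_alq *q, struct toy_ale *ale)
{
	assert(q->locked && ale == &q->ale);
	q->used += ale->ae_bytesused;
	q->locked = 0;
}

/* Copying write modelled as reserve + memcpy + post. */
static int
toy_writen(struct toy_alq *q, const void *data, size_t len)
{
	struct toy_ale *ale = toy_getn(q, len);

	if (ale == NULL)
		return (EWOULDBLOCK);
	memcpy(ale->ae_data, data, len);	/* the extra copy */
	toy_post(q, ale);
	return (0);
}
```

In this model a producer that can format its record directly into ae_data skips the memcpy() in toy_writen(), at the price of holding the queue locked until it posts. That is the trade-off the notes describe.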
Index: head/share/man/man9/devfs_set_cdevpriv.9 =================================================================== --- head/share/man/man9/devfs_set_cdevpriv.9 (revision 275992) +++ head/share/man/man9/devfs_set_cdevpriv.9 (revision 275993) @@ -1,125 +1,125 @@ .\" Copyright (c) 2008 Konstantin Belousov .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd August 15, 2012 .Dt DEVFS_CDEVPRIV 9 .Os .Sh NAME .Nm devfs_set_cdevpriv , .Nm devfs_get_cdevpriv , .Nm devfs_clear_cdevpriv .Nd manage per-open filedescriptor data for devices .Sh SYNOPSIS .In sys/param.h .In sys/conf.h .Bd -literal typedef void (*cdevpriv_dtr_t)(void *data); .Ed .Ft int .Fn devfs_get_cdevpriv "void **datap" .Ft int .Fn devfs_set_cdevpriv "void *priv" "cdevpriv_dtr_t dtr" .Ft void .Fn devfs_clear_cdevpriv "void" .Sh DESCRIPTION The .Fn devfs_xxx_cdevpriv family of functions allows the .Fa cdev driver methods to associate some driver-specific data with each user process .Xr open 2 of the device special file. Currently, functioning of these functions is restricted to the context of the .Fa cdevsw switch method calls performed as .Xr devfs 5 operations in response to system calls that use filedescriptors. .Pp The .Fn devfs_set_cdevpriv function associates a data pointed by .Va priv with current calling context (filedescriptor). The data may be retrieved later, possibly from another call performed on this filedescriptor, by the .Fn devfs_get_cdevpriv function. The .Fn devfs_clear_cdevpriv disassociates previously attached data from context. Immediately after .Fn devfs_clear_cdevpriv finished operating, the .Va dtr callback is called, with private data supplied .Va data argument. The .Fn devfs_clear_cdevpriv function will be also be called if the open callback function returns an error code. .Pp On the last filedescriptor close, system automatically arranges .Fn devfs_clear_cdevpriv call. .Pp If successful, the functions return 0. .Pp The function .Fn devfs_set_cdevpriv returns the following values on error: .Bl -tag -width Er .It Bq Er ENOENT The current call is not associated with some filedescriptor. .It Bq Er EBUSY The private driver data is already associated with current filedescriptor. 
.El .Pp The function .Fn devfs_get_cdevpriv returns the following values on error: .Bl -tag -width Er .It Bq Er EBADF The current call is not associated with some filedescriptor. .It Bq Er ENOENT The private driver data was not associated with current filedescriptor, or .Fn devfs_clear_cdevpriv was called. .El .Sh SEE ALSO -.Xr open 2 , .Xr close 2 , +.Xr open 2 , .Xr devfs 5 , .Xr kern_openat 9 .Sh HISTORY The .Fn devfs_cdevpriv family of functions first appeared in .Fx 7.1 . Index: head/share/man/man9/eventtimers.9 =================================================================== --- head/share/man/man9/eventtimers.9 (revision 275992) +++ head/share/man/man9/eventtimers.9 (revision 275993) @@ -1,253 +1,253 @@ .\" .\" Copyright (c) 2011-2013 Alexander Motin .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
.\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd April 2, 2014 .Dt EVENTTIMERS 9 .Os .Sh NAME .Nm eventtimers .Nd kernel event timers subsystem .Sh SYNOPSIS .In sys/timeet.h .Bd -literal struct eventtimer; typedef int et_start_t(struct eventtimer *et, sbintime_t first, sbintime_t period); typedef int et_stop_t(struct eventtimer *et); typedef void et_event_cb_t(struct eventtimer *et, void *arg); typedef int et_deregister_cb_t(struct eventtimer *et, void *arg); struct eventtimer { SLIST_ENTRY(eventtimer) et_all; char *et_name; int et_flags; #define ET_FLAGS_PERIODIC 1 #define ET_FLAGS_ONESHOT 2 #define ET_FLAGS_PERCPU 4 #define ET_FLAGS_C3STOP 8 #define ET_FLAGS_POW2DIV 16 int et_quality; int et_active; uint64_t et_frequency; sbintime_t et_min_period; sbintime_t et_max_period; et_start_t *et_start; et_stop_t *et_stop; et_event_cb_t *et_event_cb; et_deregister_cb_t *et_deregister_cb; void *et_arg; void *et_priv; struct sysctl_oid *et_sysctl; }; .Ed .Ft int .Fn et_register "struct eventtimer *et" .Ft int .Fn et_deregister "struct eventtimer *et" .Ft void .Fn et_change_frequency "struct eventtimer *et" "uint64_t newfreq" .Fn ET_LOCK .Fn ET_UNLOCK .Ft struct eventtimer * .Fn et_find "const char *name" "int check" "int want" .Ft int .Fn et_init "struct eventtimer *et" "et_event_cb_t *event" "et_deregister_cb_t *deregister" "void *arg" .Ft int .Fn et_start "struct eventtimer *et" "sbintime_t first" "sbintime_t period" .Ft int .Fn et_stop "struct eventtimer *et" .Ft int .Fn et_ban 
"struct eventtimer *et" .Ft int .Fn et_free "struct eventtimer *et" .Sh DESCRIPTION Event timers are responsible for generating interrupts at specified time or periodically, to run different time-based events. Subsystem consists of three main parts: .Bl -tag -width "Consumers" .It Drivers Manage hardware to generate requested time events. .It Consumers .Pa sys/kern/kern_clocksource.c uses event timers to supply kernel with .Fn hardclock , .Fn statclock and .Fn profclock time events. .It Glue code .Pa sys/sys/timeet.h , .Pa sys/kern/kern_et.c provide APIs for event timer drivers and consumers. .El .Sh DRIVER API Driver API is built around eventtimer structure. To register its functionality driver allocates that structure and calls .Fn et_register . Driver should fill following fields there: .Bl -tag -width Va .It Va et_name Unique name of the event timer for management purposes. .It Va et_flags Set of flags, describing timer capabilities: .Bl -tag -width "ET_FLAGS_PERIODIC" -compact .It ET_FLAGS_PERIODIC Periodic mode supported. .It ET_FLAGS_ONESHOT One-shot mode supported. .It ET_FLAGS_PERCPU Timer is per-CPU. .It ET_FLAGS_C3STOP Timer may stop in CPU sleep state. .It ET_FLAGS_POW2DIV Timer supports only 2^n divisors. .El .It Va et_quality Abstract value to certify whether this timecounter is better than the others. Higher value means better. .It Va et_frequency Timer oscillator's base frequency, if applicable and known. Used by consumers to predict set of possible frequencies that could be obtained by dividing it. Should be zero if not applicable or unknown. .It Va et_min_period , et_max_period Minimal and maximal reliably programmable time periods. .It Va et_start Driver's timer start function pointer. .It Va et_stop Driver's timer stop function pointer. .It Va et_priv Driver's private data storage. .El .Pp After the event timer functionality is registered, it is controlled via .Va et_start and .Va et_stop methods. 
.Va et_start method is called to start the specified event timer. The last two arguments are used to specify time when events should be generated. .Va first argument specifies the time period before the first event is generated. In periodic mode NULL value specifies that first period is equal to the .Va period argument value. .Va period argument specifies the time period between following events for the periodic mode. The NULL value there specifies the one-shot mode. At least one of these two arguments should not be NULL. When the event time arrives, the driver should call .Va et_event_cb callback function, passing .Va et_arg as the second argument. .Va et_stop method is called to stop the specified event timer. For the per-CPU event timers .Va et_start and .Va et_stop methods control timers associated with the current CPU. .Pp Driver may deregister its functionality by calling .Fn et_deregister . .Pp -If the frequency of the clock hardware can change while it is +If the frequency of the clock hardware can change while it is running (for example, during power-saving modes), the driver must call .Fn et_change_frequency on each change. If the given event timer is the active timer, .Fn et_change_frequency -stops the timer on all CPUs, updates +stops the timer on all CPUs, updates .Va et->frequency , then restarts the timer on all CPUs so that all current events are rescheduled using the new frequency. If the given timer is not currently active, .Fn et_change_frequency simply updates .Va et->frequency . .Sh CONSUMER API .Fn et_find allows consumer to find available event timer, optionally matching specific name and/or capability flags. Consumer may read returned eventtimer structure, but should not modify it. When wanted event timer is found, .Fn et_init should be called for it, submitting .Va event and optionally .Va deregister callback functions, and the opaque argument .Va arg . That argument will be passed as argument to the callbacks.
Event callback function will be called on scheduled time events. It is called from the hardware interrupt context, so no sleep is permitted there. Deregister callback function may be called to report consumer that the event timer functionality is no longer available. On this call, consumer should stop using event timer before the return. .Pp After the timer is found and initialized, it can be controlled via .Fn et_start and .Fn et_stop . The arguments are the same as described in driver API. Per-CPU event timers can be controlled only from specific CPUs. .Pp .Fn et_ban allows consumer to mark event timer as broken via clearing both one-shot and periodic capability flags, if it was somehow detected. .Fn et_free is the opposite to .Fn et_init . It releases the event timer for other consumers use. .Pp .Fn ET_LOCK and .Fn ET_UNLOCK macros should be used to manage .Xr mutex 9 lock around .Fn et_find , .Fn et_init and .Fn et_free calls to serialize access to the list of the registered event timers and the pointers returned by .Fn et_find . .Fn et_start and .Fn et_stop calls should be serialized in consumer's internal way to avoid concurrent timer hardware access. .Sh SEE ALSO .Xr eventtimers 4 .Sh AUTHORS .An Alexander Motin Aq Mt mav@FreeBSD.org Index: head/share/man/man9/ieee80211_crypto.9 =================================================================== --- head/share/man/man9/ieee80211_crypto.9 (revision 275992) +++ head/share/man/man9/ieee80211_crypto.9 (revision 275993) @@ -1,260 +1,260 @@ .\" .\" Copyright (c) 2004 Bruce M. Simpson .\" Copyright (c) 2004 Darron Broad .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. 
Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" $Id: ieee80211_crypto.9,v 1.3 2004/03/04 10:42:56 bruce Exp $ .\" .Dd March 29, 2010 .Dt IEEE80211_CRYPTO 9 .Os .Sh NAME .Nm ieee80211_crypto .Nd 802.11 cryptographic support .Sh SYNOPSIS .In net80211/ieee80211_var.h .\" .Pp .Ft void .Fn ieee80211_crypto_register "const struct ieee80211_cipher *" .\" .Ft void .Fn ieee80211_crypto_unregister "const struct ieee80211_cipher *" .\" .Ft int .Fn ieee80211_crypto_available "int cipher" .\" .Pp .Ft void .Fo ieee80211_notify_replay_failure .Fa "struct ieee80211vap *" .Fa "const struct ieee80211_frame *" .Fa "const struct ieee80211_key *" .Fa "uint64_t rsc" .Fa "int tid" .Fc .\" .Ft void .Fo ieee80211_notify_michael_failure .Fa "struct ieee80211vap *" .Fa "const struct ieee80211_frame *" .Fa "u_int keyix" .Fc .\" .Ft int .Fo ieee80211_crypto_newkey .Fa "struct ieee80211vap *" .Fa "int cipher" .Fa "int flags" .Fa "struct ieee80211_key *" .Fc .\" .Ft int .Fn ieee80211_crypto_setkey "struct ieee80211vap *" "struct ieee80211_key 
*" .\" .Ft int .Fn ieee80211_crypto_delkey "struct ieee80211vap *" "struct ieee80211_key *" .\" .Ft void .Fn ieee80211_key_update_begin "struct ieee80211vap *" .\" .Ft void .Fn ieee80211_key_update_end "struct ieee80211vap *" .\" .Ft void .Fn ieee80211_crypto_delglobalkeys "struct ieee80211vap *" .\" .Ft void .Fn ieee80211_crypto_reload_keys "struct ieee80211com *" .\" .Pp .Ft struct ieee80211_key * .Fn ieee80211_crypto_encap "struct ieee80211_node *" "struct mbuf *" .\" .Ft struct ieee80211_key * .Fn ieee80211_crypto_decap "struct ieee80211_node *" "struct mbuf *" "int flags" .\" .Ft int .Fo ieee80211_crypto_demic .Fa "struct ieee80211vap *" .Fa "struct ieee80211_key *" .Fa "struct mbuf *" .Fa "int force" .Fc .\" .Ft int .Fo ieee80211_crypto_enmic .Fa "struct ieee80211vap *" .Fa "struct ieee80211_key *" .Fa "struct mbuf *" .Fa "int force" .Fc .Sh DESCRIPTION The .Nm net80211 layer includes comprehensive cryptographic support for 802.11 protocols. Software implementations of ciphers required by WPA and 802.11i are provided as well as encap/decap processing of 802.11 frames. Software ciphers are written as kernel modules and register with the core crypto support. The cryptographic framework supports hardware acceleration of ciphers by drivers with automatic fall-back to software implementations when a driver is unable to provide necessary hardware services. .Sh CRYPTO CIPHER MODULES .Nm net80211 cipher modules register their services using .Fn ieee80211_crypto_register and supply a template that describes their operation. This .Vt ieee80211_cipher structure defines protocol-related state such as the number of bytes of space in the 802.11 header to reserve/remove during encap/decap and entry points for setting up keys and doing cryptographic operations. .Pp Cipher modules can associate private state to each key through the .Vt wk_private structure member. If state is setup by the module it will be called before a key is destroyed so it can reclaim resources. 
.Pp Crypto modules can notify the system of two events. When a packet replay event is recognized .Fn ieee80211_notify_replay_failure can be used to signal the event. When a .Dv TKIP Michael failure is detected .Fn ieee80211_notify_michael_failure can be invoked. Drivers may also use these routines to signal events detected by the hardware. .Sh CRYPTO KEY MANAGEMENT The .Nm net80211 layer implements a per-vap 4-element .Dq global key table and a per-station .Dq unicast key for protocols such as WPA, 802.1x, and 802.11i. The global key table is designed to support legacy WEP operation and Multicast/Group keys, though some applications also use it to implement WPA in station mode. Keys in the global table are identified by a key index in the range 0-3. Per-station keys are identified by the MAC address of the station and are typically used for unicast PTK bindings. .Pp .Nm net80211 provides .Xr ioctl 2 operations for managing both global and per-station keys. Drivers typically do not participate in software key management; they are involved only when providing hardware acceleration of cryptographic operations. .Pp .Fn ieee80211_crypto_newkey is used to allocate a new .Nm net80211 key or reconfigure an existing key. The cipher must be specified along with any fixed key index. The .Nm net80211 layer will handle allocating cipher and driver resources to support the key. .Pp Once a key is allocated its contents can be set using .Fn ieee80211_crypto_setkey and deleted with .Fn ieee80211_crypto_delkey (with any cipher and driver resources reclaimed). .Pp .Fn ieee80211_crypto_delglobalkeys is used to reclaim all keys in the global key table for a vap; it typically is used only within the .Nm net80211 layer. .Pp .Fn ieee80211_crypto_reload_keys handles hardware key state reloading from software key state, such as required after a suspend/resume cycle.
.Sh DRIVER CRYPTO SUPPORT Drivers identify ciphers they have hardware support for through the .Vt ic_cryptocaps field of the .Vt ieee80211com structure. If hardware support is available then a driver should also fill in the .Dv iv_key_alloc , .Dv iv_key_set , and .Dv iv_key_delete methods of each .Vt ieee80211vap created for use with the device. In addition the methods .Dv iv_key_update_begin and .Dv iv_key_update_end can be setup to handle synchronization requirements for updating hardware key state. .Pp When .Nm net80211 allocates a software key and the driver can accelerate the cipher operations the .Dv iv_key_alloc method will be invoked. Drivers may return a token that is associated with outbound traffic (for use in encrypting frames). Otherwise, e.g. if hardware resources are not available, the driver will not return a token and .Nm net80211 will arrange to do the work in software and pass frames to the driver that are already prepared for transmission. .Pp For receive, drivers mark frames with the .Dv M_WEP mbuf flag to indicate the hardware has decrypted the payload. If frames have the .Dv IEEE80211_FC1_PROTECTED bit marked in their 802.11 header and are not tagged with .Dv M_WEP then decryption is done in software. For more complicated scenarios the software key state is consulted; e.g. to decide if Michael verification needs to be done in software after the hardware has handled TKIP decryption. .Pp Drivers that manage complicated key data structures, e.g. faulting software keys into a hardware key cache, can safely manipulate software key state by bracketing their work with calls to .Fn ieee80211_key_update_begin and .Fn ieee80211_key_update_end . These calls also synchronize hardware key state update when receive traffic is active. 
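The receive-side rule described above (an M_WEP tag means the hardware already decrypted the payload; a protected frame without that tag is decrypted in software) can be sketched as a small decision function. This is a simplified userland illustration only: the enum, the function name classify_rx_frame() and the two boolean parameters are hypothetical and stand in for testing the M_WEP mbuf flag and the IEEE80211_FC1_PROTECTED header bit in the kernel.

```c
#include <stdbool.h>

/* Hypothetical result codes for this sketch; not net80211 identifiers. */
enum rx_crypto_action {
	RX_NO_CRYPTO,	/* frame is not protected; nothing to decrypt */
	RX_HW_DONE,	/* driver tagged the frame; hardware decrypted it */
	RX_SW_DECRYPT	/* protected and untagged; decrypt in software */
};

/*
 * hdr_protected models the IEEE80211_FC1_PROTECTED bit in the 802.11 header,
 * tagged_m_wep models the M_WEP flag set by the driver on the mbuf.
 */
static enum rx_crypto_action
classify_rx_frame(bool hdr_protected, bool tagged_m_wep)
{
	if (tagged_m_wep)
		return (RX_HW_DONE);
	if (hdr_protected)
		return (RX_SW_DECRYPT);
	return (RX_NO_CRYPTO);
}
```

The sketch deliberately ignores the more complicated cases mentioned in the text, such as Michael verification that may still be needed in software after the hardware has handled TKIP decryption; those depend on the software key state.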
.Sh SEE ALSO -.Xr ieee80211 9 , .Xr ioctl 2 , .Xr wlan_ccmp 4 , .Xr wlan_tkip 4 , -.Xr wlan_wep 4 +.Xr wlan_wep 4 , +.Xr ieee80211 9 Index: head/share/man/man9/ifnet.9 =================================================================== --- head/share/man/man9/ifnet.9 (revision 275992) +++ head/share/man/man9/ifnet.9 (revision 275993) @@ -1,1529 +1,1530 @@ .\" -*- Nroff -*- .\" Copyright 1996, 1997 Massachusetts Institute of Technology .\" .\" Permission to use, copy, modify, and distribute this software and .\" its documentation for any purpose and without fee is hereby .\" granted, provided that both the above copyright notice and this .\" permission notice appear in all copies, that both the above .\" copyright notice and this permission notice appear in all .\" supporting documentation, and that the name of M.I.T. not be used .\" in advertising or publicity pertaining to distribution of the .\" software without specific, written prior permission. M.I.T. makes .\" no representations about the suitability of this software for any .\" purpose. It is provided "as is" without express or implied .\" warranty. .\" .\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''. M.I.T. DISCLAIMS .\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE, .\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF .\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT .\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, .\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT .\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF .\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND .\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd July 29, 2014 .Dt IFNET 9 .Os .Sh NAME .Nm ifnet , .Nm ifaddr , .Nm ifqueue , .Nm if_data .Nd kernel interfaces for manipulating network interfaces .Sh SYNOPSIS .In sys/param.h .In sys/time.h .In sys/socket.h .In net/if.h .In net/if_var.h .In net/if_types.h .\" .Ss "Interface Manipulation Functions" .Ft "struct ifnet *" .Fn if_alloc "u_char type" .Ft void .Fn if_attach "struct ifnet *ifp" .Ft void .Fn if_detach "struct ifnet *ifp" .Ft void .Fn if_free "struct ifnet *ifp" .Ft void .Fn if_free_type "struct ifnet *ifp" "u_char type" .Ft void .Fn if_down "struct ifnet *ifp" .Ft int .Fn ifioctl "struct socket *so" "u_long cmd" "caddr_t data" "struct thread *td" .Ft int .Fn ifpromisc "struct ifnet *ifp" "int pswitch" .Ft int .Fn if_allmulti "struct ifnet *ifp" "int amswitch" .Ft "struct ifnet *" .Fn ifunit "const char *name" .Ft "struct ifnet *" .Fn ifunit_ref "const char *name" .Ft void .Fn if_up "struct ifnet *ifp" .\" .Ss "Interface Address Functions" .Ft "struct ifaddr *" .Fn ifaddr_byindex "u_short idx" .Ft "struct ifaddr *" .Fn ifa_ifwithaddr "struct sockaddr *addr" .Ft "struct ifaddr *" .Fn ifa_ifwithdstaddr "struct sockaddr *addr" "int fib" .Ft "struct ifaddr *" .Fn ifa_ifwithnet "struct sockaddr *addr" "int ignore_ptp" "int fib" .Ft "struct ifaddr *" .Fn ifaof_ifpforaddr "struct sockaddr *addr" "struct ifnet *ifp" .Ft void .Fn ifa_ref "struct ifaddr *ifa" .Ft void .Fn ifa_free "struct ifaddr *ifa" .\" .Ss "Interface Multicast Address Functions" .Ft int .Fn if_addmulti "struct ifnet *ifp" "struct sockaddr *sa" "struct ifmultiaddr **ifmap" .Ft int .Fn if_delmulti "struct ifnet *ifp" "struct sockaddr *sa" .Ft "struct ifmultiaddr *" .Fn if_findmulti "struct ifnet *ifp" "struct sockaddr *sa" .Ss "Output queue macros" .Fn IF_DEQUEUE "struct ifqueue *ifq" "struct mbuf *m" .\" .Ss "struct ifnet Member Functions" .Ft void .Fn \*(lp*if_input\*(rp "struct ifnet *ifp" "struct mbuf *m" .Ft int .Fo \*(lp*if_output\*(rp .Fa "struct ifnet *ifp" "struct 
mbuf *m" .Fa "const struct sockaddr *dst" "struct route *ro" .Fc .Ft void .Fn \*(lp*if_start\*(rp "struct ifnet *ifp" .Ft int .Fn \*(lp*if_transmit\*(rp "struct ifnet *ifp" "struct mbuf *m" .Ft void .Fn \*(lp*if_qflush\*(rp "struct ifnet *ifp" .Ft int .Fn \*(lp*if_ioctl\*(rp "struct ifnet *ifp" "u_long cmd" "caddr_t data" .Ft void .Fn \*(lp*if_init\*(rp "void *if_softc" .Ft int .Fo \*(lp*if_resolvemulti\*(rp .Fa "struct ifnet *ifp" "struct sockaddr **retsa" "struct sockaddr *addr" .Fc .Ss "struct ifaddr member function" .Ft void .Fo \*(lp*ifa_rtrequest\*(rp .Fa "int cmd" "struct rtentry *rt" "struct rt_addrinfo *info" .Fc .\" .Ss "Global Variables" .Vt extern struct ifnethead ifnet ; .\" extern struct ifindex_entry *ifindex_table ; .Vt extern int if_index ; .Vt extern int ifqmaxlen ; .Sh DATA STRUCTURES The kernel mechanisms for handling network interfaces reside primarily in the .Vt ifnet , if_data , ifaddr , and .Vt ifmultiaddr structures in .In net/if.h and .In net/if_var.h and the functions named above and defined in .Pa /sys/net/if.c . Those interfaces which are intended to be used by user programs are defined in .In net/if.h ; these include the interface flags, the .Vt if_data structure, and the structures defining the appearance of interface-related messages on the .Xr route 4 routing socket and in .Xr sysctl 3 . The header file .In net/if_var.h defines the kernel-internal interfaces, including the .Vt ifnet , ifaddr , and .Vt ifmultiaddr structures and the functions which manipulate them. (A few user programs will need .In net/if_var.h because it is the prerequisite of some other header file like .In netinet/if_ether.h . Most references to those two files in particular can be replaced by .In net/ethernet.h . ) .Pp The system keeps a linked list of interfaces using the .Li TAILQ macros defined in .Xr queue 3 ; this list is headed by a .Vt "struct ifnethead" called .Va ifnet . 
The elements of this list are of type .Vt "struct ifnet" , and most kernel routines which manipulate interface as such accept or return pointers to these structures. Each interface structure contains an .Vt if_data structure used for statistics and information. Each interface also has a .Li TAILQ of interface addresses, described by .Vt ifaddr structures. An .Dv AF_LINK address (see .Xr link_addr 3 ) describing the link layer implemented by the interface (if any) is accessed by the .Fn ifaddr_byindex function or .Va if_addr structure. (Some trivial interfaces do not provide any link layer addresses; this structure, while still present, serves only to identify the interface name and index.) .Pp Finally, those interfaces supporting reception of multicast datagrams have a .Li TAILQ of multicast group memberships, described by .Vt ifmultiaddr structures. These memberships are reference-counted. .Pp Interfaces are also associated with an output queue, defined as a .Vt "struct ifqueue" ; this structure is used to hold packets while the interface is in the process of sending another. .Pp .Ss The Vt ifnet Ss structure The fields of .Vt "struct ifnet" are as follows: .Bl -tag -width ".Va if_capabilities" -offset indent .It Va if_softc .Pq Vt "void *" A pointer to the driver's private state block. (Initialized by driver.) .It Va if_l2com .Pq Vt "void *" A pointer to the common data for the interface's layer 2 protocol. (Initialized by .Fn if_alloc . ) .It Va if_vnet .Pq Vt "struct vnet *" A pointer to the virtual network stack instance. (Initialized by .Fn if_attach . ) .It Va if_home_vnet .Pq Vt "struct vnet *" A pointer to the parent virtual network stack, where this .Vt "struct ifnet" originates from. (Initialized by .Fn if_attach . ) .It Va if_link .Pq Fn TAILQ_ENTRY ifnet .Xr queue 3 macro glue. .It Va if_xname .Pq Vt "char *" The name of the interface, (e.g., .Dq Li fxp0 or .Dq Li lo0 ) . (Initialized by driver (usually via .Fn if_initname ) . 
) .It Va if_dname .Pq Vt "const char *" The name of the driver. (Initialized by driver (usually via .Fn if_initname ) . ) .It Va if_dunit .Pq Vt int A unique number assigned to each interface managed by a particular driver. Drivers may choose to set this to .Dv IF_DUNIT_NONE if a unit number is not associated with the device. (Initialized by driver (usually via .Fn if_initname ) . ) .It Va if_refcount .Pq Vt u_int The reference count. (Initialized by .Fn if_alloc . ) .It Va if_addrhead .Pq Vt "struct ifaddrhead" The head of the .Xr queue 3 .Li TAILQ containing the list of addresses assigned to this interface. .It Va if_pcount .Pq Vt int A count of promiscuous listeners on this interface, used to reference-count the .Dv IFF_PROMISC flag. .It Va if_carp .Pq Vt "struct carp_if *" A pointer to the CARP interface structure, .Xr carp 4 . (Initialized by the driver-specific .Fn if_ioctl routine.) .It Va if_bpf .Pq Vt "struct bpf_if *" Opaque per-interface data for the packet filter, .Xr bpf 4 . (Initialized by .Fn bpf_attach . ) .It Va if_index .Pq Vt u_short A unique number assigned to each interface in sequence as it is attached. This number can be used in a .Vt "struct sockaddr_dl" to refer to a particular interface by index (see .Xr link_addr 3 ) . (Initialized by .Fn if_alloc . ) .It Va if_vlantrunk .Pq Vt struct ifvlantrunk * A pointer to 802.1Q trunk structure, .Xr vlan 4 . (Initialized by the driver-specific .Fn if_ioctl routine.) .It Va if_flags .Pq Vt int Flags describing operational parameters of this interface (see below). (Manipulated by generic code.) .It Va if_drv_flags .Pq Vt int Flags describing operational status of this interface (see below). (Manipulated by driver.) .It Va if_capabilities .Pq Vt int Flags describing the capabilities the interface supports (see below). .It Va if_capenable .Pq Vt int Flags describing the enabled capabilities of the interface (see below). 
.It Va if_linkmib .Pq Vt "void *" A pointer to an interface-specific MIB structure exported by .Xr ifmib 4 . (Initialized by driver.) .It Va if_linkmiblen .Pq Vt size_t The size of said structure. (Initialized by driver.) .It Va if_data .Pq Vt "struct if_data" More statistics and information; see .Sx "The if_data structure" , below. (Initialized by driver, manipulated by both driver and generic code.) .It Va if_multiaddrs .Pq Vt struct ifmultihead The head of the .Xr queue 3 .Li TAILQ containing the list of multicast addresses assigned to this interface. .It Va if_amcount .Pq Vt int A number of multicast requests on this interface, used to reference-count the .Dv IFF_ALLMULTI flag. .It Va if_addr .Pq Vt "struct ifaddr *" A pointer to the link-level interface address. (Initialized by .Fn if_alloc . ) .\" .It Va if_llsoftc .\" .Pq Vt "void *" .\" The purpose of the field is unclear. .It Va if_snd .Pq Vt "struct ifaltq" The output queue. (Manipulated by driver.) .It Va if_broadcastaddr .Pq Vt "const u_int8_t *" A link-level broadcast bytestring for protocols with variable address length. .It Va if_bridge .Pq Vt "void *" A pointer to the bridge interface structure, .Xr if_bridge 4 . (Initialized by the driver-specific .Fn if_ioctl routine.) .It Va if_label .Pq Vt "struct label *" A pointer to the MAC Framework label structure, .Xr mac 4 . (Initialized by .Fn if_alloc . ) .It Va if_afdata .Pq Vt "void *" An address family dependent data region. .It Va if_afdata_initialized .Pq Vt int Used to track the current state of address family initialization. .It Va if_afdata_lock .Pq Vt "struct rwlock" An .Xr rwlock 9 lock used to protect .Va if_afdata internals. .It Va if_linktask .Pq Vt "struct task" A .Xr taskqueue 9 task scheduled for link state change events of the interface. .It Va if_addr_lock .Pq Vt "struct rwlock" An .Xr rwlock 9 lock used to protect interface-related address lists. 
.It Va if_clones .Pq Fn LIST_ENTRY ifnet .Xr queue 3 macro glue for the list of clonable network interfaces. .It Va if_groups -.Pq Fn TAILQ_HEAD ", ifg_list" +.Pq Fn TAILQ_HEAD "" "ifg_list" The head of the .Xr queue 3 .Li TAILQ containing the list of groups per interface. .It Va if_pf_kif .Pq Vt "void *" A pointer to the structure used for interface abstraction by .Xr pf 4 . .It Va if_lagg .Pq Vt "void *" A pointer to the .Xr lagg 4 interface structure. .It Va if_alloctype .Pq Vt u_char The type of the interface as it was at the time of its allocation. It is used to cache the type passed to .Fn if_alloc , but unlike .Va if_type , it would not be changed by drivers. .El .Pp References to .Vt ifnet structures are gained by calling the .Fn if_ref function and released by calling the .Fn if_rele function. They are used to allow kernel code walking global interface lists to release the .Vt ifnet lock yet keep the .Vt ifnet structure stable. .Pp There are in addition a number of function pointers which the driver must initialize to complete its interface with the generic interface layer: .Bl -ohang -offset indent .It Fn if_input Pass a packet to an appropriate upper layer as determined from the link-layer header of the packet. This routine is to be called from an interrupt handler or used to emulate reception of a packet on this interface. A single function implementing .Fn if_input can be shared among multiple drivers utilizing the same link-layer framing, e.g., Ethernet. .It Fn if_output Output a packet on interface .Fa ifp , or queue it on the output queue if the interface is already active. .It Fn if_transmit Transmit a packet on an interface or queue it if the interface is in use. This function will return .Dv ENOBUFS if the device's software and hardware queues are both full. This function must be installed after .Fn if_attach to override the default implementation.
This function is exposed in order to allow drivers to manage their own queues and to reduce the latency caused by a frequently gratuitous enqueue / dequeue pair to ifq. The suggested internal software queueing mechanism is buf_ring. .It Fn if_qflush Free mbufs in internally managed queues when the interface is marked down. This function must be installed after .Fn if_attach to override the default implementation. This function is exposed in order to allow drivers to manage their own queues and to reduce the latency caused by a frequently gratuitous enqueue / dequeue pair to ifq. The suggested internal software queueing mechanism is buf_ring. .It Fn if_start Start queued output on an interface. This function is exposed in order to provide for some interface classes to share a .Fn if_output among all drivers. .Fn if_start may only be called when the .Dv IFF_DRV_OACTIVE flag is not set. (Thus, .Dv IFF_DRV_OACTIVE does not literally mean that output is active, but rather that the device's internal output queue is full.) Please note that this function will soon be deprecated. .It Fn if_ioctl Process interface-related .Xr ioctl 2 requests (defined in .In sys/sockio.h ) . Preliminary processing is done by the generic routine .Fn ifioctl to check for appropriate privileges, locate the interface being manipulated, and perform certain generic operations like twiddling flags and flushing queues. See the description of .Fn ifioctl below for more information. .It Fn if_init Initialize and bring up the hardware, e.g., reset the chip and enable the receiver unit. Should mark the interface running, but not active .Dv ( IFF_DRV_RUNNING , ~IFF_DRV_OACTIVE ) . .It Fn if_resolvemulti Check the requested multicast group membership, .Fa addr , for validity, and if necessary compute a link-layer group which corresponds to that address which is returned in .Fa *retsa . Returns zero on success, or an error code on failure.
.El .Ss "Interface Flags" Interface flags are used for a number of different purposes. Some flags simply indicate information about the type of interface and its capabilities; others are dynamically manipulated to reflect the current state of the interface. Flags of the former kind are marked .Aq S in this table; the latter are marked .Aq D . Flags which begin with .Dq IFF_DRV_ are stored in .Va if_drv_flags ; all other flags are stored in .Va if_flags . .Pp The macro .Dv IFF_CANTCHANGE defines the bits which cannot be set by a user program using the .Dv SIOCSIFFLAGS command to .Xr ioctl 2 ; these are indicated by an asterisk .Pq Ql * in the following listing. .Pp .Bl -tag -width ".Dv IFF_POINTOPOINT" -offset indent -compact .It Dv IFF_UP .Aq D The interface has been configured up by the user-level code. .It Dv IFF_BROADCAST .Aq S* The interface supports broadcast. .It Dv IFF_DEBUG .Aq D Used to enable/disable driver debugging code. .It Dv IFF_LOOPBACK .Aq S The interface is a loopback device. .It Dv IFF_POINTOPOINT .Aq S* The interface is point-to-point; .Dq broadcast address is actually the address of the other end. .It Dv IFF_DRV_RUNNING .Aq D* The interface has been configured and dynamic resources were successfully allocated. Probably only useful internal to the interface. .It Dv IFF_NOARP .Aq D Disable network address resolution on this interface. .It Dv IFF_PROMISC .Aq D* This interface is in promiscuous mode. .It Dv IFF_PPROMISC .Aq D This interface is in the permanently promiscuous mode (implies .Dv IFF_PROMISC ) . .It Dv IFF_ALLMULTI .Aq D* This interface is in all-multicasts mode (used by multicast routers). .It Dv IFF_DRV_OACTIVE .Aq D* The interface's hardware output queue (if any) is full; output packets are to be queued. .It Dv IFF_SIMPLEX .Aq S* The interface cannot hear its own transmissions. .It Dv IFF_LINK0 .It Dv IFF_LINK1 .It Dv IFF_LINK2 .Aq D Control flags for the link layer. 
(Currently abused to select among multiple physical layers on some devices.) .It Dv IFF_MULTICAST .Aq S* This interface supports multicast. .It Dv IFF_CANTCONFIG .Aq S* The interface is not configurable in a meaningful way. Primarily useful for .Dv IFT_USB interfaces registered at the interface list. .It Dv IFF_MONITOR .Aq D This interface blocks transmission of packets and discards incoming packets after BPF processing. Used to monitor network traffic but not interact with the network in question. .It Dv IFF_STATICARP .Aq D Used to enable/disable ARP requests on this interface. .It Dv IFF_DYING .Aq D* Set when the .Vt ifnet structure of this interface is being released and still has .Va if_refcount references. .It Dv IFF_RENAMING .Aq D* Set when this interface is being renamed. .El .Ss "Interface Capabilities Flags" Interface capabilities are specialized features an interface may or may not support. These capabilities are very hardware-specific and allow, when enabled, to offload specific network processing to the interface or to offer a particular feature for use by other kernel parts. .Pp It should be stressed that a capability can be completely uncontrolled (i.e., stay always enabled with no way to disable it) or allow limited control over itself (e.g., depend on another capability's state.) Such peculiarities are determined solely by the hardware and driver of a particular interface. Only the driver possesses the knowledge on whether and how the interface capabilities can be controlled. Consequently, capabilities flags in .Va if_capenable should never be modified directly by kernel code other than the interface driver. The command .Dv SIOCSIFCAP to .Fn ifioctl is the dedicated means to attempt altering .Va if_capenable on an interface. Userland code shall use .Xr ioctl 2 . 
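The rule above (only the driver updates the enabled set, and only for capabilities the interface actually supports) can be illustrated with a small userland model. This is a minimal sketch, not kernel code: the names toy_if, toy_set_capenable and the TOY_CAP_* bit values are hypothetical stand-ins for struct ifnet, the SIOCSIFCAP path and the IFCAP_* flags.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical capability bits; the real IFCAP_* values are defined by the system. */
#define TOY_CAP_RXCSUM	0x1
#define TOY_CAP_TXCSUM	0x2
#define TOY_CAP_TSO4	0x4

/* Toy stand-in for the two capability fields of struct ifnet. */
struct toy_if {
	int capabilities;	/* what the hardware and driver support */
	int capenable;		/* what is currently enabled */
};

/*
 * Model of a capability-change request: a request that names an
 * unsupported capability is rejected and leaves the enabled set
 * untouched; otherwise the "driver" records the new enabled set.
 */
static int
toy_set_capenable(struct toy_if *ifp, int reqcap)
{
	assert(ifp != NULL);
	if ((reqcap & ~ifp->capabilities) != 0)
		return (EINVAL);
	ifp->capenable = reqcap;
	return (0);
}
```

A real driver may refuse or adjust a request for further hardware-specific reasons; the model only captures the supported-mask check.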
.Pp The following capabilities are currently supported by the system: .Bl -tag -width ".Dv IFCAP_POLLING_NOCOUNT" -offset indent .It Dv IFCAP_RXCSUM This interface can do checksum validation on receiving data. Some interfaces do not have sufficient buffer storage to store frames above a certain MTU-size completely. The driver for the interface might disable hardware checksum validation if the MTU is set above the hardcoded limit. .It Dv IFCAP_TXCSUM This interface can do checksum calculation on transmitting data. .It Dv IFCAP_HWCSUM A shorthand for .Pq Dv IFCAP_RXCSUM | IFCAP_TXCSUM . .It Dv IFCAP_NETCONS This interface can be a network console. .It Dv IFCAP_VLAN_MTU The .Xr vlan 4 driver can operate over this interface in software tagging mode without having to decrease MTU on .Xr vlan 4 interfaces below 1500 bytes. This implies the ability of this interface to cope with frames somewhat longer than permitted by the Ethernet specification. .It Dv IFCAP_VLAN_HWTAGGING This interface can do VLAN tagging on output and demultiplex frames by their VLAN tag on input. .It Dv IFCAP_JUMBO_MTU This Ethernet interface can transmit and receive frames up to 9000 bytes long. .It Dv IFCAP_POLLING This interface supports .Xr polling 4 . See below for details. .It Dv IFCAP_VLAN_HWCSUM This interface can do checksum calculation on both transmitting and receiving data on .Xr vlan 4 interfaces (implies .Dv IFCAP_HWCSUM ) . .It Dv IFCAP_TSO4 This Ethernet interface supports TCP4 Segmentation offloading. .It Dv IFCAP_TSO6 This Ethernet interface supports TCP6 Segmentation offloading. .It Dv IFCAP_TSO A shorthand for .Pq Dv IFCAP_TSO4 | IFCAP_TSO6 . .It Dv IFCAP_TOE4 This Ethernet interface supports TCP offloading. .It Dv IFCAP_TOE6 This Ethernet interface supports TCP6 offloading. .It Dv IFCAP_TOE A shorthand for .Pq Dv IFCAP_TOE4 | IFCAP_TOE6 . .It Dv IFCAP_WOL_UCAST This Ethernet interface supports waking up on any Unicast packet. 
.It Dv IFCAP_WOL_MCAST This Ethernet interface supports waking up on any Multicast packet. .It Dv IFCAP_WOL_MAGIC This Ethernet interface supports waking up on any Magic packet such as those sent by .Xr wake 8 . .It Dv IFCAP_WOL A shorthand for .Pq Dv IFCAP_WOL_UCAST | IFCAP_WOL_MCAST | IFCAP_WOL_MAGIC . .It Dv IFCAP_TOE4 This Ethernet interface supports TCP4 Offload Engine. .It Dv IFCAP_TOE6 This Ethernet interface supports TCP6 Offload Engine. .It Dv IFCAP_TOE A shorthand for .Pq Dv IFCAP_TOE4 | IFCAP_TOE6 . .It Dv IFCAP_VLAN_HWFILTER This interface supports frame filtering in hardware on .Xr vlan 4 interfaces. .It Dv IFCAP_POLLING_NOCOUNT The return value for the number of processed packets should be skipped for this interface. .It Dv IFCAP_VLAN_HWTSO This interface supports TCP Segmentation offloading on .Xr vlan 4 interfaces (implies .Dv IFCAP_TSO ) . .It Dv IFCAP_LINKSTATE This Ethernet interface supports dynamic link state changes. .El .Pp The ability of advanced network interfaces to offload certain computational tasks from the host CPU to the board is limited mostly to TCP/IP. Therefore a separate field associated with an interface (see .Va ifnet.if_data.ifi_hwassist below) keeps a detailed description of its enabled capabilities specific to TCP/IP processing. The TCP/IP module consults the field to see which tasks can be done on an .Em outgoing packet by the interface. The flags defined for that field are a superset of those for .Va mbuf.m_pkthdr.csum_flags , namely: .Bl -tag -width ".Dv CSUM_FRAGMENT" -offset indent .It Dv CSUM_IP The interface will compute IP checksums. .It Dv CSUM_TCP The interface will compute TCP checksums. .It Dv CSUM_UDP The interface will compute UDP checksums. .It Dv CSUM_IP_FRAGS The interface can compute a TCP or UDP checksum for a packet fragmented by the host CPU. Makes sense only along with .Dv CSUM_TCP or .Dv CSUM_UDP . .It Dv CSUM_FRAGMENT The interface will do the fragmentation of IP packets if necessary. 
The host CPU does not need to care about MTU on this interface as long as a packet to transmit through it is an IP one and it does not exceed the size of the hardware buffer. .El .Pp An interface notifies the TCP/IP module about the tasks the former has performed on an .Em incoming packet by setting the corresponding flags in the field .Va mbuf.m_pkthdr.csum_flags of the .Vt mbuf chain containing the packet. See .Xr mbuf 9 for details. .Pp The capability of a network interface to operate in .Xr polling 4 mode involves several flags in different global variables and per-interface fields. The capability flag .Dv IFCAP_POLLING set in the interface's .Va if_capabilities indicates support for .Xr polling 4 on the particular interface. If set in .Va if_capabilities , the same flag can be set or cleared in the interface's .Va if_capenable within .Fn ifioctl , thus initiating a switch of the interface to .Xr polling 4 mode or interrupt mode, respectively. The actual mode change is managed by the driver-specific .Fn if_ioctl routine. The .Xr polling 4 handler returns the number of packets processed. .Ss The Vt if_data Ss Structure The .Vt if_data structure contains statistics and identifying information used by management programs, and which is exported to user programs by way of the .Xr ifmib 4 branch of the .Xr sysctl 3 MIB. The following elements of the .Vt if_data structure are initialized by the interface and are not expected to change significantly over the course of normal operation: .Bl -tag -width ".Va ifi_lastchange" -offset indent .It Va ifi_type .Pq Vt u_char The type of the interface, as defined in .In net/if_types.h and described below in the .Sx "Interface Types" section. .It Va ifi_physical .Pq Vt u_char Intended to represent a selection of physical layers on devices which support more than one; never implemented. .It Va ifi_addrlen .Pq Vt u_char Length of a link-layer address on this device, or zero if there are none.
Used to initialize the address length field in .Vt sockaddr_dl structures referring to this interface. .It Va ifi_hdrlen .Pq Vt u_char Maximum length of any link-layer header which might be prepended by the driver to a packet before transmission. The generic code computes the maximum over all interfaces and uses that value to influence the placement of data in .Vt mbuf Ns s to attempt to ensure that there is always sufficient space to prepend a link-layer header without allocating an additional .Vt mbuf . .It Va ifi_datalen .Pq Vt u_char Length of the .Vt if_data structure. Allows some stabilization of the routing socket ABI in the face of increases in the length of .Vt struct if_data . .It Va ifi_mtu .Pq Vt u_long The maximum transmission unit of the medium, exclusive of any link-layer overhead. .It Va ifi_metric .Pq Vt u_long A dimensionless metric interpreted by a user-mode routing process. .It Va ifi_baudrate .Pq Vt u_long The line rate of the interface, in bits per second. .It Va ifi_hwassist .Pq Vt u_long A detailed interpretation of the capabilities to offload computational tasks for .Em outgoing packets. The interface driver must keep this field in accord with the current value of .Va if_capenable . .It Va ifi_epoch .Pq Vt time_t The system uptime when the interface was attached or the statistics below were reset. This is intended to be used to set the SNMP variable .Va ifCounterDiscontinuityTime . It may also be used to determine if two successive queries for an interface of the same index have returned results for the same interface. .El .Pp The structure additionally contains generic statistics applicable to a variety of different interface types (except as noted, all members are of type .Vt u_long ) : .Bl -tag -width ".Va ifi_lastchange" -offset indent .It Va ifi_link_state .Pq Vt u_char The current link state of Ethernet interfaces. See the .Sx Interface Link States section for possible values. .It Va ifi_ipackets Number of packets received.
.It Va ifi_ierrors Number of receive errors detected (e.g., FCS errors, DMA overruns, etc.). More detailed breakdowns can often be had by way of a link-specific MIB. .It Va ifi_opackets Number of packets transmitted. .It Va ifi_oerrors Number of output errors detected (e.g., late collisions, DMA overruns, etc.). More detailed breakdowns can often be had by way of a link-specific MIB. .It Va ifi_collisions Total number of collisions detected on output for CSMA interfaces. (This member is sometimes [ab]used by other types of interfaces for other output error counts.) .It Va ifi_ibytes Total traffic received, in bytes. .It Va ifi_obytes Total traffic transmitted, in bytes. .It Va ifi_imcasts Number of packets received which were sent by link-layer multicast. .It Va ifi_omcasts Number of packets sent by link-layer multicast. .It Va ifi_iqdrops Number of packets dropped on input. Rarely implemented. .It Va ifi_noproto Number of packets received for unknown network-layer protocol. .It Va ifi_lastchange .Pq Vt "struct timeval" The time of the last administrative change to the interface (as required for .Tn SNMP ) . .El .Ss Interface Types The header file .In net/if_types.h defines symbolic constants for a number of different types of interfaces. 
The most common are: .Pp .Bl -tag -offset indent -width ".Dv IFT_PROPVIRTUAL" -compact .It Dv IFT_OTHER none of the following .It Dv IFT_ETHER Ethernet .It Dv IFT_ISO88023 ISO 8802-3 CSMA/CD .It Dv IFT_ISO88024 ISO 8802-4 Token Bus .It Dv IFT_ISO88025 ISO 8802-5 Token Ring .It Dv IFT_ISO88026 ISO 8802-6 DQDB MAN .It Dv IFT_FDDI FDDI .It Dv IFT_PPP Internet Point-to-Point Protocol .Pq Xr ppp 8 .It Dv IFT_LOOP The loopback .Pq Xr lo 4 interface .It Dv IFT_SLIP Serial Line IP .It Dv IFT_PARA Parallel-port IP .Pq Dq Tn PLIP .It Dv IFT_ATM Asynchronous Transfer Mode .It Dv IFT_USB USB Interface .El .Ss Interface Link States The following link states are currently defined: .Pp .Bl -tag -offset indent -width ".Dv LINK_STATE_UNKNOWN" -compact .It Dv LINK_STATE_UNKNOWN The link is in an invalid or unknown state. .It Dv LINK_STATE_DOWN The link is down. .It Dv LINK_STATE_UP The link is up. .El .Ss The Vt ifaddr Ss Structure Every interface is associated with a list (or, rather, a .Li TAILQ ) of addresses, rooted at the interface structure's .Va if_addrlist member. The first element in this list is always an .Dv AF_LINK address representing the interface itself; multi-access network drivers should complete this structure by filling in their link-layer addresses after calling .Fn if_attach . Other members of the structure represent network-layer addresses which have been configured by means of the .Dv SIOCAIFADDR command to .Xr ioctl 2 , called on a socket of the appropriate protocol family. The elements of this list consist of .Vt ifaddr structures. Most protocols will declare their own protocol-specific interface address structures, but all begin with a .Vt "struct ifaddr" which provides the most-commonly-needed functionality across all protocols. Interface addresses are reference-counted. .Pp The members of .Vt "struct ifaddr" are as follows: .Bl -tag -width ".Va ifa_rtrequest" -offset indent .It Va ifa_addr .Pq Vt "struct sockaddr *" The local address of the interface. 
.It Va ifa_dstaddr .Pq Vt "struct sockaddr *" The remote address of point-to-point interfaces, and the broadcast address of broadcast interfaces. .Va ( ifa_broadaddr is a macro for .Va ifa_dstaddr . ) .It Va ifa_netmask .Pq Vt "struct sockaddr *" The network mask for multi-access interfaces, and the confusion generator for point-to-point interfaces. .It Va ifa_ifp .Pq Vt "struct ifnet *" A link back to the interface structure. .It Va ifa_link .Pq Fn TAILQ_ENTRY ifaddr .Xr queue 3 glue for list of addresses on each interface. .It Va ifa_rtrequest See below. .It Va ifa_flags .Pq Vt u_short Some of the flags which would be used for a route representing this address in the route table. .It Va ifa_refcnt .Pq Vt short The reference count. .El .Pp References to .Vt ifaddr structures are gained by calling the .Fn ifa_ref function and released by calling the .Fn ifa_free function. .Pp .Fn ifa_rtrequest is a pointer to a function which receives callouts from the routing code .Pq Fn rtrequest to perform link-layer-specific actions upon requests to add, or delete routes. The .Fa cmd argument indicates the request in question: .Dv RTM_ADD , or .Dv RTM_DELETE . The .Fa rt argument is the route in question; the .Fa info argument contains the specific destination being manipulated. .Sh FUNCTIONS The functions provided by the generic interface code can be divided into two groups: those which manipulate interfaces, and those which manipulate interface addresses. In addition to these functions, there may also be link-layer support routines which are used by a number of drivers implementing a specific link layer over different hardware; see the documentation for that link layer for more details. 
.Ss The Vt ifmultiaddr Ss Structure Every multicast-capable interface is associated with a list of multicast group memberships, which indicate at a low level which link-layer multicast addresses (if any) should be accepted, and at a high level, in which network-layer multicast groups a user process has expressed interest. .Pp The elements of the structure are as follows: .Bl -tag -width ".Va ifma_refcount" -offset indent .It Va ifma_link .Pq Fn LIST_ENTRY ifmultiaddr .Xr queue 3 macro glue. .It Va ifma_addr .Pq Vt "struct sockaddr *" A pointer to the address which this record represents. The memberships for various address families are stored in arbitrary order. .It Va ifma_lladdr .Pq Vt "struct sockaddr *" A pointer to the link-layer multicast address, if any, to which the network-layer multicast address in .Va ifma_addr is mapped, else a null pointer. If this element is non-nil, this membership also holds an invisible reference to another membership for that link-layer address. .It Va ifma_refcount .Pq Vt u_int A reference count of requests for this particular membership. .El .Ss Interface Manipulation Functions .Bl -ohang -offset indent .It Fn if_alloc Allocate and initialize .Vt "struct ifnet" . Initialization includes the allocation of an interface index and may include the allocation of a .Fa type specific structure in .Va if_l2com . .It Fn if_attach Link the specified interface .Fa ifp into the list of network interfaces. Also initialize the list of addresses on that interface, and create a link-layer .Vt ifaddr structure to be the first element in that list. (A pointer to this address structure is saved in the .Vt ifnet structure and shall be accessed by the .Fn ifaddr_byindex function.) The .Fa ifp must have been allocated by .Fn if_alloc . .It Fn if_detach Shut down and unlink the specified .Fa ifp from the interface list. .It Fn if_free Free the given .Fa ifp back to the system. The interface must have been previously detached if it was ever attached. 
.It Fn if_free_type Identical to .Fn if_free except that the given .Fa type is used to free .Va if_l2com instead of the type in .Va if_type . This is intended for use with drivers that change their interface type. .It Fn if_down Mark the interface .Fa ifp as down (i.e., .Dv IFF_UP is not set), flush its output queue, notify protocols of the transition, and generate a message from the .Xr route 4 routing socket. .It Fn if_up Mark the interface .Fa ifp as up, notify protocols of the transition, and generate a message from the .Xr route 4 routing socket. .It Fn ifpromisc Add or remove a promiscuous reference to .Fa ifp . If .Fa pswitch is true, add a reference; if it is false, remove a reference. On reference count transitions from zero to one and one to zero, set the .Dv IFF_PROMISC flag appropriately and call .Fn if_ioctl to set up the interface in the desired mode. .It Fn if_allmulti As .Fn ifpromisc , but for the all-multicasts .Pq Dv IFF_ALLMULTI flag instead of the promiscuous flag. .It Fn ifunit Return an .Vt ifnet pointer for the interface named .Fa name . .It Fn ifunit_ref Return a reference-counted (via .Fn if_ref ) .Vt ifnet pointer for the interface named .Fa name . This function is preferred over .Fn ifunit . The caller is responsible for releasing the reference with .Fn if_rele when it is finished with the ifnet. .It Fn ifioctl Process the ioctl request .Fa cmd , issued on socket .Fa so by thread .Fa td , with data parameter .Fa data . This is the main routine for handling all interface configuration requests from user mode. It is ordinarily only called from the socket-layer .Xr ioctl 2 handler, and only for commands with class .Sq Li i . Any unrecognized commands will be passed down to socket .Fa so Ns 's protocol for further interpretation. The following commands are handled by .Fn ifioctl : .Pp .Bl -tag -width ".Dv SIOCGIFNETMASK" -offset indent -compact .It Dv SIOCGIFCONF Get interface configuration. (No call-down to driver.)
.Pp .It Dv SIOCSIFNAME Set the interface name. .Dv RTM_IFANNOUNCE departure and arrival messages are sent so that routing code that relies on the interface name will update its interface list. Caller must have appropriate privilege. (No call-down to driver.) .It Dv SIOCGIFCAP .It Dv SIOCGIFFIB .It Dv SIOCGIFFLAGS .It Dv SIOCGIFMETRIC .It Dv SIOCGIFMTU .It Dv SIOCGIFPHYS Get interface capabilities, FIB, flags, metric, MTU, or medium selection. (No call-down to driver.) .Pp .It Dv SIOCSIFCAP Enable or disable interface capabilities. Caller must have appropriate privilege. Before a call to the driver-specific .Fn if_ioctl routine, the requested mask for enabled capabilities is checked against the mask of capabilities supported by the interface, .Va if_capabilities . Requesting to enable an unsupported capability is invalid. The rest is supposed to be done by the driver, which includes updating .Va if_capenable and .Va if_data.ifi_hwassist appropriately. .Pp .It Dv SIOCSIFFIB Set the interface FIB. Caller must have appropriate privilege. FIB values start at 0, and values greater than or equal to .Va net.fibs are considered invalid. .It Dv SIOCSIFFLAGS Change interface flags. Caller must have appropriate privilege. If a change to the .Dv IFF_UP flag is requested, .Fn if_up or .Fn if_down is called as appropriate. Flags listed in .Dv IFF_CANTCHANGE are masked off, and the field .Va if_flags in the interface structure is updated. Finally, the driver .Fn if_ioctl routine is called to perform any setup requested. .Pp .It Dv SIOCSIFMETRIC .It Dv SIOCSIFPHYS Change interface metric or medium. Caller must have appropriate privilege. .Pp .It Dv SIOCSIFMTU Change interface MTU. Caller must have appropriate privilege. MTU values less than 72 or greater than 65535 are considered invalid. The driver .Fn if_ioctl routine is called to implement the change; it is responsible for any additional sanity checking and for actually modifying the MTU in the interface structure.
.Pp .It Dv SIOCADDMULTI .It Dv SIOCDELMULTI Add or delete permanent multicast group memberships on the interface. Caller must have appropriate privilege. The .Fn if_addmulti or .Fn if_delmulti function is called to perform the operation; qq.v. .Pp .It Dv SIOCAIFADDR .It Dv SIOCDIFADDR The socket's protocol control routine is called to implement the requested action. .El .El .Pp .Fn if_down , .Fn ifioctl , .Fn ifpromisc , and .Fn if_up must be called at .Fn splnet or higher. .Ss "Interface Address Functions" Several functions exist to look up an interface address structure given an address. .Fn ifa_ifwithaddr returns an interface address with either a local address or a broadcast address precisely matching the parameter .Fa addr . .Fn ifa_ifwithdstaddr returns an interface address for a point-to-point interface whose remote .Pq Dq destination address is .Fa addr and whose FIB is .Fa fib . If .Fa fib is .Dv RT_ALL_FIBS , then the first interface address matching .Fa addr will be returned. .Pp .Fn ifa_ifwithnet returns the most specific interface address which matches the specified address, .Fa addr , subject to its configured netmask, or a point-to-point interface address whose remote address is .Fa addr if one is found. If .Fa ignore_ptp -is true, skip point-to-point interface addresses. The +is true, skip point-to-point interface addresses. +The .Fa fib parameter is handled the same way as by .Fn ifa_ifwithdstaddr . .Pp .Fn ifaof_ifpforaddr returns the most specific address configured on interface .Fa ifp which matches address .Fa addr , subject to its configured netmask. If the interface is point-to-point, only an interface address whose remote address is precisely .Fa addr will be returned. .Pp .Fn ifaddr_byindex returns the link-level address of the interface with the given index .Fa idx . .Pp All of these functions return a null pointer if no such address can be found.
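The "most specific match subject to the netmask" rule described for ifa_ifwithnet can be sketched as a longest-prefix search. The following is a simplified userland model under stated assumptions, not the kernel implementation: toy_ifaddr and toy_ifwithnet are hypothetical names, addresses are IPv4 values in host byte order, netmasks are assumed contiguous, and point-to-point handling and the fib argument are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an interface address entry (hypothetical type). */
struct toy_ifaddr {
	uint32_t addr;	/* local address, host byte order */
	uint32_t mask;	/* contiguous netmask, host byte order */
};

/*
 * Return the entry whose network contains dst and whose netmask is
 * the longest, or NULL if no entry matches.
 */
static const struct toy_ifaddr *
toy_ifwithnet(const struct toy_ifaddr *tab, size_t n, uint32_t dst)
{
	const struct toy_ifaddr *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if ((dst & tab[i].mask) != (tab[i].addr & tab[i].mask))
			continue;
		/* For contiguous masks, a numerically larger mask is longer. */
		if (best == NULL || tab[i].mask > best->mask)
			best = &tab[i];
	}
	return (best);
}
```

With entries for 10.0.0.1/8 and 10.1.0.1/16, a lookup of 10.1.2.3 selects the /16 entry, a lookup of 10.2.3.4 falls back to the /8 entry, and an address outside both networks yields a null pointer, matching the convention stated above.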
.Ss "Interface Multicast Address Functions" The .Fn if_addmulti , .Fn if_delmulti , and .Fn if_findmulti functions provide support for requesting and relinquishing multicast group memberships, and for querying an interface's membership list, respectively. The .Fn if_addmulti function takes a pointer to an interface, .Fa ifp , and a generic address, .Fa sa . It also takes a pointer to a .Vt "struct ifmultiaddr *" which is filled in on successful return with the address of the group membership control block. The .Fn if_addmulti function performs the following four-step process: .Bl -enum -offset indent .It Call the interface's .Fn if_resolvemulti entry point to determine the link-layer address, if any, corresponding to this membership request, and also to give the link layer an opportunity to veto this membership request should it so desire. .It Check the interface's group membership list for a pre-existing membership for this group. If one is not found, allocate a new one; if one is, increment its reference count. .It If the .Fn if_resolvemulti routine returned a link-layer address corresponding to the group, repeat the previous step for that address as well. .It If the interface's multicast address filter needs to be changed because a new membership was added, call the interface's .Fn if_ioctl routine (with a .Fa cmd argument of .Dv SIOCADDMULTI ) to request that it do so. .El .Pp The .Fn if_delmulti function, given an interface .Fa ifp and an address, .Fa sa , reverses this process. Both functions return zero on success, or a standard error number on failure. .Pp The .Fn if_findmulti function examines the membership list of interface .Fa ifp for an address matching .Fa sa , and returns a pointer to that .Vt "struct ifmultiaddr" if one is found, else it returns a null pointer. 
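The reference-counting part of the process above (the membership-list check, and its reversal on deletion) can be sketched in userland C. This is a minimal model with hypothetical names (toy_ifma, toy_addmulti, toy_delmulti, toy_findmulti); an unsigned integer stands in for the socket address, and the calls to if_resolvemulti and if_ioctl, as well as all locking, are deliberately omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for struct ifmultiaddr (hypothetical type). */
struct toy_ifma {
	struct toy_ifma *next;
	unsigned addr;		/* stands in for the group address */
	unsigned refcount;	/* number of outstanding requests */
};

/* Return the membership for addr, or NULL if there is none. */
static struct toy_ifma *
toy_findmulti(struct toy_ifma *head, unsigned addr)
{
	for (; head != NULL; head = head->next)
		if (head->addr == addr)
			return (head);
	return (NULL);
}

/* Bump the count of an existing membership, or allocate a new one. */
static int
toy_addmulti(struct toy_ifma **headp, unsigned addr)
{
	struct toy_ifma *ifma;

	assert(headp != NULL);
	ifma = toy_findmulti(*headp, addr);
	if (ifma != NULL) {
		ifma->refcount++;
		return (0);
	}
	ifma = malloc(sizeof(*ifma));
	if (ifma == NULL)
		return (ENOMEM);
	ifma->addr = addr;
	ifma->refcount = 1;
	ifma->next = *headp;
	*headp = ifma;
	return (0);
}

/* Drop one reference; unlink and free the record when it reaches zero. */
static int
toy_delmulti(struct toy_ifma **headp, unsigned addr)
{
	struct toy_ifma **pp, *dead;

	assert(headp != NULL);
	for (pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->addr != addr)
			continue;
		if (--(*pp)->refcount == 0) {
			dead = *pp;
			*pp = dead->next;
			free(dead);
		}
		return (0);
	}
	return (ENOENT);
}
```

In this model two add requests for the same group share one record, and the record disappears only after the second delete. The ENOENT return for an unknown group is a choice made for the sketch, not a statement about the kernel's error codes.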
.Sh SEE ALSO .Xr ioctl 2 , .Xr link_addr 3 , .Xr queue 3 , .Xr sysctl 3 , .Xr bpf 4 , .Xr ifmib 4 , .Xr lo 4 , .Xr netintro 4 , .Xr polling 4 , .Xr config 8 , .Xr ppp 8 , .Xr mbuf 9 , .Xr rtentry 9 .Rs .%A Gary R. Wright .%A W. Richard Stevens .%B TCP/IP Illustrated .%V Vol. 2 .%O Addison-Wesley, ISBN 0-201-63354-X .Re .Sh AUTHORS This manual page was written by .An Garrett A. Wollman . Index: head/share/man/man9/kqueue.9 =================================================================== --- head/share/man/man9/kqueue.9 (revision 275992) +++ head/share/man/man9/kqueue.9 (revision 275993) @@ -1,466 +1,467 @@ .\" Copyright 2006,2011 John-Mark Gurney .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd March 26, 2012 .Dt KQUEUE 9 .Os .Sh NAME .Nm kqueue_add_filteropts , kqueue_del_filteropts , .Nm kqfd_register , .Nm knote_fdclose , .Nm knlist_init , knlist_init_mtx , knlist_init_rw_reader , .Nm knlist_add , knlist_remove , knlist_remove_inevent , knlist_empty , .Nm knlist_clear , knlist_delete , knlist_destroy , .Nm KNOTE_LOCKED , KNOTE_UNLOCKED .Nd "event delivery subsystem" .Sh SYNOPSIS .In sys/event.h .Ft int .Fn kqueue_add_filteropts "int filt" "struct filterops *filtops" .Ft int .Fn kqueue_del_filteropts "int filt" .Ft int .Fn kqfd_register "int fd" "struct kevent *kev" "struct thread *td" "int waitok" .Ft void .Fn knote_fdclose "struct thread *td" "int fd" .Ft void .Fo knlist_init .Fa "struct knlist *knl" .Fa "void *lock" .Fa "void \*[lp]*kl_lock\*[rp]\*[lp]void *\*[rp]" .Fa "void \*[lp]*kl_unlock\*[rp]\*[lp]void *\*[rp]" .Fa "int \*[lp]*kl_locked\*[rp]\*[lp]void *\*[rp]" .Fc .Ft void .Fn knlist_init_mtx "struct knlist *knl" "struct mtx *lock" .Ft void .Fn knlist_init_rw_reader "struct knlist *knl" "struct rwlock *lock" .Ft void .Fn knlist_add "struct knlist *knl" "struct knote *kn" "int islocked" .Ft void .Fn knlist_remove "struct knlist *knl" "struct knote *kn" "int islocked" .Ft void .Fn knlist_remove_inevent "struct knlist *knl" "struct knote *kn" .Ft int .Fn knlist_empty "struct knlist *knl" .Ft void .Fn knlist_clear "struct knlist *knl" "int islocked" .Ft void .Fn knlist_delete "struct knlist *knl" "struct thread *td" "int islocked" .Ft void .Fn knlist_destroy "struct knlist *knl" .Ft void .Fn KNOTE_LOCKED "struct knlist *knl" "long hint" .Ft void .Fn KNOTE_UNLOCKED "struct knlist *knl" "long hint" .Sh DESCRIPTION The functions .Fn kqueue_add_filteropts and .Fn kqueue_del_filteropts allow for the addition and removal of a filter type. The filter is statically defined by the .Dv EVFILT_* macros. The function .Fn kqueue_add_filteropts will make .Fa filt available. 
The .Vt "struct filterops" has the following members: .Bl -tag -width ".Va f_attach" .It Va f_isfd If .Va f_isfd is set, .Va ident in .Vt "struct kevent" is taken to be a file descriptor. In this case, the .Vt knote passed into .Va f_attach will have the .Va kn_fp member initialized to the .Vt "struct file *" that represents the file descriptor. .It Va f_attach The .Va f_attach function will be called when attaching a .Vt knote to the object. The method should call .Fn knlist_add to add the .Vt knote to the list that was initialized with .Fn knlist_init . The call to .Fn knlist_add is only necessary if the object can have multiple .Vt knotes associated with it. If there is no .Vt knlist to call .Fn knlist_add with, the function .Va f_attach must clear the .Dv KN_DETACHED bit of .Va kn_status in the .Vt knote . The function shall return 0 on success, or appropriate error for the failure, such as when the object is being destroyed, or does not exist. During .Va f_attach , it is valid to change the .Va kn_fops pointer to a different pointer. This will change the .Va f_event and .Va f_detach functions called when processing the .Vt knote . .It Va f_detach The .Va f_detach function will be called to detach the .Vt knote if the .Vt knote has not already been detached by a call to .Fn knlist_remove , .Fn knlist_remove_inevent or .Fn knlist_delete . The list .Fa lock will not be held when this function is called. .It Va f_event The .Va f_event function will be called to update the status of the .Vt knote . If the function returns 0, it will be assumed that the object is not ready (or no longer ready) to be woken up. The .Fa hint argument will be 0 when scanning .Vt knotes to see which are triggered. Otherwise, the .Fa hint argument will be the value passed to either .Dv KNOTE_LOCKED or .Dv KNOTE_UNLOCKED . The .Va kn_data value should be updated as necessary to reflect the current value, such as number of bytes available for reading, or buffer space available for writing. 
If the note needs to be removed, .Fn knlist_remove_inevent must be called. The function .Fn knlist_remove_inevent will remove the note from the list, the .Va f_detach function will not be called and the .Vt knote will not be returned as an event. .Pp Locks .Em must not be acquired in .Va f_event . If a lock is required in .Va f_event , it must be obtained in the .Fa kl_lock function of the .Vt knlist that the .Va knote was added to. .El .Pp The function .Fn kqfd_register will register the .Vt kevent on the kqueue file descriptor .Fa fd . If it is safe to sleep, .Fa waitok should be set. .Pp The function .Fn knote_fdclose is used to delete all .Vt knotes associated with .Fa fd . Once returned, there will no longer be any .Vt knotes associated with the .Fa fd . The .Vt knotes removed will never be returned from a .Xr kevent 2 call, so if userland uses the .Vt knote to track resources, they will be leaked. The .Fn FILEDESC_LOCK lock must be held over the call to .Fn knote_fdclose so that file descriptors cannot be added or removed. .Pp The .Fn knlist_* family of functions are for managing .Vt knotes associated with an object. A .Vt knlist is not required, but is commonly used. If used, the .Vt knlist must be initialized with either .Fn knlist_init , .Fn knlist_init_mtx or .Fn knlist_init_rw_reader . The .Vt knlist structure may be embedded into the object structure. The .Fa lock will be held over .Va f_event calls. .Pp For the .Fn knlist_init function, if .Fa lock is .Dv NULL , a shared global lock will be used and the remaining arguments must be .Dv NULL . The function pointers .Fa kl_lock , kl_unlock and .Fa kl_locked will be used to manipulate the argument .Fa lock . If any of the function pointers are .Dv NULL , a function operating on .Dv MTX_DEF style .Xr mutex 9 locks will be used instead. .Pp The function .Fn knlist_init_mtx may be used to initialize a .Vt knlist when .Fa lock is a .Dv MTX_DEF style .Xr mutex 9 lock. 
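.Pp
The fallback rules for the lock argument and the three function pointers can be
sketched in userland as follows.
This is a toy model and not the kernel implementation: struct toy_lock stands in
for a mutex, toy_global_lock stands in for the shared global lock, and all toy_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a mutex; a real knlist would use a kernel lock. */
struct toy_lock {
	int held;
};

/* Stands in for the shared global lock used when no lock is supplied. */
struct toy_lock toy_global_lock;

/* Default callbacks, playing the role of the mutex-based defaults. */
void
toy_default_lock(void *arg)
{
	struct toy_lock *l = arg;

	assert(!l->held);
	l->held = 1;
}

void
toy_default_unlock(void *arg)
{
	struct toy_lock *l = arg;

	assert(l->held);
	l->held = 0;
}

int
toy_default_locked(void *arg)
{
	struct toy_lock *l = arg;

	return (l->held);
}

struct toy_knlist {
	void *kl_lockarg;
	void (*kl_lock)(void *);
	void (*kl_unlock)(void *);
	int (*kl_locked)(void *);
};

/*
 * Model of the initialization rules: a NULL lock selects the global lock
 * (and then every callback must be NULL); a NULL callback selects the
 * default that operates on the toy mutex.
 */
void
toy_knlist_init(struct toy_knlist *knl, void *lock,
    void (*kl_lock)(void *), void (*kl_unlock)(void *),
    int (*kl_locked)(void *))
{
	if (lock == NULL) {
		assert(kl_lock == NULL && kl_unlock == NULL &&
		    kl_locked == NULL);
		knl->kl_lockarg = &toy_global_lock;
	} else {
		knl->kl_lockarg = lock;
	}
	knl->kl_lock = (kl_lock != NULL) ? kl_lock : toy_default_lock;
	knl->kl_unlock = (kl_unlock != NULL) ? kl_unlock : toy_default_unlock;
	knl->kl_locked = (kl_locked != NULL) ? kl_locked : toy_default_locked;
}
```

Storing the lock argument next to its callbacks lets list code take and release
the lock without knowing which kind of lock the object uses.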
.Pp The function .Fn knlist_init_rw_reader may be used to initialize a .Vt knlist when .Fa lock is a .Xr rwlock 9 read lock. The lock is acquired via the .Fn rw_rlock function. .Pp The function .Fn knlist_empty returns true when there are no .Vt knotes on the list. The function requires that the .Fa lock be held when called. .Pp The function .Fn knlist_clear removes all .Vt knotes from the list. The .Fa islocked argument declares if the .Fa lock has been acquired. All .Vt knotes will have .Dv EV_ONESHOT set so that the .Vt knote will be returned and removed during the next scan. The .Va f_detach function will be called when the .Vt knote is deleted during the next scan. This function must not be used when .Va f_isfd is set in .Vt "struct filterops" , as the .Fa td argument of .Fn fdrop will be .Dv NULL . .Pp The function .Fn knlist_delete removes and deletes all .Vt knotes on the list. The function .Va f_detach will not be called, and the .Vt knote will not be returned on the next scan. Using this function could leak userland resources if a process uses the .Vt knote to track resources. .Pp Both the .Fn knlist_clear and .Fn knlist_delete functions may sleep. They also may release the .Fa lock to wait for other .Vt knotes to drain. .Pp The .Fn knlist_destroy function is used to destroy a .Vt knlist . There must be no .Vt knotes associated with the .Vt knlist -.Fn ( knlist_empty -returns true) +.Po Fn knlist_empty +returns true +.Pc and no more .Vt knotes may be attached to the object. A .Vt knlist may be emptied by calling .Fn knlist_clear or .Fn knlist_delete . .Pp The macros .Fn KNOTE_LOCKED and .Fn KNOTE_UNLOCKED are used to notify .Vt knotes about events associated with the object. They iterate over all .Vt knotes on the list, calling the .Va f_event function associated with each .Vt knote . The macro .Fn KNOTE_LOCKED must be used if the lock associated with the .Fa knl is held.
The function .Fn KNOTE_UNLOCKED will acquire the lock before iterating over the list of .Vt knotes . .Sh RETURN VALUES The function .Fn kqueue_add_filteropts will return zero on success, .Er EINVAL in the case of an invalid .Fa filt , or .Er EEXIST if the filter has already been installed. .Pp The function .Fn kqueue_del_filteropts will return zero on success, .Er EINVAL in the case of an invalid .Fa filt , or .Er EBUSY if the filter is still in use. .Pp The function .Fn kqfd_register will return zero on success, .Er EBADF if the file descriptor is not a kqueue, or any of the possible values returned by .Xr kevent 2 . .Sh SEE ALSO .Xr kevent 2 , .Xr kqueue 2 .Sh AUTHORS This manual page was written by .An John-Mark Gurney Aq Mt jmg@FreeBSD.org . Index: head/share/man/man9/lock.9 =================================================================== --- head/share/man/man9/lock.9 (revision 275992) +++ head/share/man/man9/lock.9 (revision 275993) @@ -1,423 +1,423 @@ .\" .\" Copyright (C) 2002 Chad David . All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice(s), this list of conditions and the following disclaimer as .\" the first lines of this file unmodified other than the possible .\" addition of one or more copyright notices. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice(s), this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY .\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE .\" DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY .\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES .\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH .\" DAMAGE. .\" .\" $FreeBSD$ .\" .Dd November 2, 2014 .Dt LOCK 9 .Os .Sh NAME .Nm lockinit , .Nm lockdestroy , .Nm lockmgr , .Nm lockmgr_args , .Nm lockmgr_args_rw , .Nm lockmgr_disown , .Nm lockmgr_printinfo , .Nm lockmgr_recursed , .Nm lockmgr_rw , .Nm lockmgr_waiters , .Nm lockstatus , .Nm lockmgr_assert .Nd "lockmgr family of functions" .Sh SYNOPSIS .In sys/types.h .In sys/lock.h .In sys/lockmgr.h .Ft void .Fn lockinit "struct lock *lkp" "int prio" "const char *wmesg" "int timo" "int flags" .Ft void .Fn lockdestroy "struct lock *lkp" .Ft int .Fn lockmgr "struct lock *lkp" "u_int flags" "struct mtx *ilk" .Ft int .Fn lockmgr_args "struct lock *lkp" "u_int flags" "struct mtx *ilk" "const char *wmesg" "int prio" "int timo" .Ft int .Fn lockmgr_args_rw "struct lock *lkp" "u_int flags" "struct rwlock *ilk" "const char *wmesg" "int prio" "int timo" .Ft void .Fn lockmgr_disown "struct lock *lkp" .Ft void .Fn lockmgr_printinfo "const struct lock *lkp" .Ft int .Fn lockmgr_recursed "const struct lock *lkp" .Ft int .Fn lockmgr_rw "struct lock *lkp" "u_int flags" "struct rwlock *ilk" .Ft int .Fn lockmgr_waiters "const struct lock *lkp" .Ft int .Fn lockstatus "const struct lock *lkp" .Pp .Cd "options INVARIANTS" .Cd "options INVARIANT_SUPPORT" .Ft void .Fn lockmgr_assert "const struct lock *lkp" "int what" .Sh DESCRIPTION The .Fn lockinit function is used to initialize a lock. It must be called before any operation can be performed on a lock. 
Its arguments are: .Bl -tag -width ".Fa wmesg" .It Fa lkp A pointer to the lock to initialize. .It Fa prio The priority passed to .Xr sleep 9 . .It Fa wmesg The lock message. This is used for both debugging output and .Xr sleep 9 . .It Fa timo The timeout value passed to .Xr sleep 9 . .It Fa flags The flags the lock is to be initialized with: .Bl -tag -width ".Dv LK_CANRECURSE" .It Dv LK_ADAPTIVE Enable adaptive spinning for this lock if the kernel is compiled with the ADAPTIVE_LOCKMGRS option. .It Dv LK_CANRECURSE Allow recursive exclusive locks. .It Dv LK_NOPROFILE Disable lock profiling for this lock. .It Dv LK_NOSHARE Allow exclusive locks only. .It Dv LK_NOWITNESS Instruct .Xr witness 4 to ignore this lock. .It Dv LK_NODUP .Xr witness 4 should log messages about duplicate locks being acquired. .It Dv LK_QUIET Disable .Xr ktr 4 logging for this lock. .It Dv LK_TIMELOCK Use .Fa timo during a sleep; otherwise, 0 is used. .El .El .Pp The .Fn lockdestroy function is used to destroy a lock, and while it is called in a number of places in the kernel, it currently does nothing. .Pp The .Fn lockmgr and .Fn lockmgr_rw functions handle general locking functionality within the kernel, including support for shared and exclusive locks, and recursion. .Fn lockmgr and .Fn lockmgr_rw are also able to upgrade and downgrade locks. .Pp Their arguments are: .Bl -tag -width ".Fa flags" .It Fa lkp A pointer to the lock to manipulate. .It Fa flags Flags indicating what action is to be taken. .Bl -tag -width ".Dv LK_NODDLKTREAT" .It Dv LK_SHARED Acquire a shared lock. If an exclusive lock is currently held, .Dv EDEADLK will be returned. .It Dv LK_EXCLUSIVE Acquire an exclusive lock. If an exclusive lock is already held, and .Dv LK_CANRECURSE is not set, the system will .Xr panic 9 . .It Dv LK_DOWNGRADE Downgrade exclusive lock to a shared lock. Downgrading a shared lock is not permitted. If an exclusive lock has been recursed, the system will .Xr panic 9 . 
.It Dv LK_UPGRADE Upgrade a shared lock to an exclusive lock. If this call fails, the shared lock is lost, even if the .Dv LK_NOWAIT flag is specified. During the upgrade, the shared lock could be temporarily dropped. Attempts to upgrade an exclusive lock will cause a .Xr panic 9 . .It Dv LK_TRYUPGRADE Try to upgrade a shared lock to an exclusive lock. The failure to upgrade does not result in the dropping of the shared lock ownership. .It Dv LK_RELEASE Release the lock. Releasing a lock that is not held can cause a .Xr panic 9 . .It Dv LK_DRAIN Wait for all activity on the lock to end, then mark it decommissioned. This is used before freeing a lock that is part of a piece of memory that is about to be freed. (As documented in .In sys/lockmgr.h . ) .It Dv LK_SLEEPFAIL Fail if the operation has slept. .It Dv LK_NOWAIT Do not allow the call to sleep. This can be used to test the lock. .It Dv LK_NOWITNESS Skip the .Xr witness 4 checks for this instance. .It Dv LK_CANRECURSE Allow recursion on an exclusive lock. For every lock there must be a release. .It Dv LK_INTERLOCK Unlock the interlock (which should be locked already). .It Dv LK_NODDLKTREAT Normally, .Fn lockmgr postpones serving further shared requests for a shared-locked lock if there is an exclusive waiter, to avoid exclusive lock starvation. But, if the thread requesting the shared lock already owns a shared lockmgr lock, the request is granted even in the presence of a parallel exclusive lock request, which is done to avoid deadlocks with recursive shared acquisition. .Pp The .Dv LK_NODDLKTREAT flag can only be used by code which requests a shared non-recursive lock. The flag allows exclusive requests to preempt the current shared request even if the current thread owns shared locks. This is safe since the shared lock is guaranteed not to recurse, and is used when the thread is known to hold unrelated shared locks, to avoid unnecessary starvation.
An example is .Dv vp locking in VFS .Xr lookup 9 , when .Dv dvp is already locked. .El .It Fa ilk An interlock mutex for controlling group access to the lock. If .Dv LK_INTERLOCK is specified, .Fn lockmgr and .Fn lockmgr_rw assume .Fa ilk is currently owned and not recursed, and will return it unlocked. See .Xr mtx_assert 9 . .El .Pp The .Fn lockmgr_args and .Fn lockmgr_args_rw functions work like .Fn lockmgr and .Fn lockmgr_rw but accept .Fa wmesg , .Fa prio and .Fa timo arguments on a per-instance basis. The specified values override the default ones; the defaults can still be selected by passing, respectively, .Dv LK_WMESG_DEFAULT , .Dv LK_PRIO_DEFAULT and .Dv LK_TIMO_DEFAULT . .Pp The .Fn lockmgr_disown function switches the owner from the current thread to .Dv LK_KERNPROC , if the lock is already held. .Pp The .Fn lockmgr_printinfo function prints debugging information about the lock. It is used primarily by .Xr VOP_PRINT 9 functions. .Pp The .Fn lockmgr_recursed function returns true if the lock is recursed, 0 otherwise. .Pp The .Fn lockmgr_waiters function returns true if the lock has waiters, 0 otherwise. .Pp The .Fn lockstatus function returns the status of the lock in relation to the current thread. .Pp When compiled with .Cd "options INVARIANTS" and .Cd "options INVARIANT_SUPPORT" , the .Fn lockmgr_assert function tests .Fa lkp for the assertions specified in .Fa what , and panics if they are not met. One of the following assertions must be specified: .Bl -tag -width ".Dv KA_UNLOCKED" .It Dv KA_LOCKED Assert that the current thread has either a shared or an exclusive lock on the .Vt lkp lock pointed to by the first argument. .It Dv KA_SLOCKED Assert that the current thread has a shared lock on the .Vt lkp lock pointed to by the first argument. .It Dv KA_XLOCKED Assert that the current thread has an exclusive lock on the .Vt lkp lock pointed to by the first argument.
.It Dv KA_UNLOCKED Assert that the current thread has no lock on the .Vt lkp lock pointed to by the first argument. .El .Pp In addition, one of the following optional assertions can be used with either an .Dv KA_LOCKED , .Dv KA_SLOCKED , or .Dv KA_XLOCKED assertion: .Bl -tag -width ".Dv KA_NOTRECURSED" .It Dv KA_RECURSED Assert that the current thread has a recursed lock on .Fa lkp . .It Dv KA_NOTRECURSED Assert that the current thread does not have a recursed lock on .Fa lkp . .El .Sh RETURN VALUES The .Fn lockmgr and .Fn lockmgr_rw functions return 0 on success and non-zero on failure. .Pp The .Fn lockstatus function returns: .Bl -tag -width ".Dv LK_EXCLUSIVE" .It Dv LK_EXCLUSIVE An exclusive lock is held by the current thread. .It Dv LK_EXCLOTHER An exclusive lock is held by someone other than the current thread. .It Dv LK_SHARED A shared lock is held. .It Li 0 The lock is not held by anyone. .El .Sh ERRORS .Fn lockmgr and .Fn lockmgr_rw fail if: .Bl -tag -width Er .It Bq Er EBUSY .Dv LK_FORCEUPGRADE was requested and another thread had already requested a lock upgrade. .It Bq Er EBUSY .Dv LK_NOWAIT was set, and a sleep would have been required, or .Dv LK_TRYUPGRADE operation was not able to upgrade the lock. .It Bq Er ENOLCK .Dv LK_SLEEPFAIL was set and .Fn lockmgr or .Fn lockmgr_rw did sleep. .It Bq Er EINTR .Dv PCATCH was set in the lock priority, and a signal was delivered during a sleep. Note the .Er ERESTART error below. .It Bq Er ERESTART .Dv PCATCH was set in the lock priority, a signal was delivered during a sleep, and the system call is to be restarted. .It Bq Er EWOULDBLOCK a non-zero timeout was given, and the timeout expired. .El .Sh LOCKS If .Dv LK_INTERLOCK is passed in the .Fa flags argument to .Fn lockmgr or .Fn lockmgr_rw , the .Fa ilk must be held prior to calling .Fn lockmgr or .Fn lockmgr_rw , and will be returned unlocked. .Pp Upgrade attempts that fail result in the loss of the lock that is currently held. 
Also, it is invalid to upgrade an exclusive lock, and a .Xr panic 9 will be the result of trying. .Sh SEE ALSO .Xr condvar 9 , .Xr locking 9 , +.Xr mtx_assert 9 , .Xr mutex 9 , +.Xr panic 9 , .Xr rwlock 9 , .Xr sleep 9 , .Xr sx 9 , -.Xr mtx_assert 9 , -.Xr panic 9 , .Xr VOP_PRINT 9 .Sh AUTHORS This manual page was written by .An Chad David Aq Mt davidc@acns.ab.ca . Index: head/share/man/man9/locking.9 =================================================================== --- head/share/man/man9/locking.9 (revision 275992) +++ head/share/man/man9/locking.9 (revision 275993) @@ -1,412 +1,412 @@ .\" Copyright (c) 2007 Julian Elischer (julian - freebsd org ) .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd June 30, 2013 .Dt LOCKING 9 .Os .Sh NAME .Nm locking .Nd kernel synchronization primitives .Sh DESCRIPTION The .Em FreeBSD kernel is written to run across multiple CPUs and as such provides several different synchronization primitives to allow developers to safely access and manipulate many data types. .Ss Mutexes Mutexes (also called "blocking mutexes") are the most commonly used synchronization primitive in the kernel. A thread acquires (locks) a mutex before accessing data shared with other threads (including interrupt threads), and releases (unlocks) it afterwards. If the mutex cannot be acquired, the thread requesting it will wait. Mutexes are adaptive by default, meaning that if the owner of a contended mutex is currently running on another CPU, then a thread attempting to acquire the mutex will spin rather than yielding the processor. Mutexes fully support priority propagation. .Pp See .Xr mutex 9 for details. .Ss Spin Mutexes Spin mutexes are a variation of basic mutexes; the main difference between the two is that spin mutexes never block. Instead, they spin while waiting for the lock to be released. To avoid deadlock, a thread that holds a spin mutex must never yield its CPU. Unlike ordinary mutexes, spin mutexes disable interrupts when acquired. Since disabling interrupts can be expensive, they are generally slower to acquire and release. Spin mutexes should be used only when absolutely necessary, e.g. to protect data shared with interrupt filter code (see .Xr bus_setup_intr 9 for details), or for scheduler internals. .Ss Mutex Pools With most synchronization primitives, such as mutexes, the programmer must provide memory to hold the primitive. For example, a mutex may be embedded inside the structure it protects. Mutex pools provide a preallocated set of mutexes to avoid this requirement. Note that mutexes from a pool may only be used as leaf locks. .Pp See .Xr mtx_pool 9 for details. 
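.Pp
One way to picture a mutex pool is as a fixed array of locks indexed by a hash
of the address of the object being protected.
The sketch below is a userland illustration under that assumption; the names
toy_pool, toy_mutex, toy_pool_find and TOY_POOL_SIZE are hypothetical, and the
mixing step is arbitrary rather than the hash used by the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_POOL_SIZE 8		/* must be a power of two for the mask below */

/* Toy stand-in for a mutex. */
struct toy_mutex {
	int held;
};

/* A preallocated set of locks, so callers need not embed their own. */
struct toy_pool {
	struct toy_mutex slots[TOY_POOL_SIZE];
};

/* Map an object address to one of the pool's locks. */
struct toy_mutex *
toy_pool_find(struct toy_pool *pool, const void *obj)
{
	uintptr_t p = (uintptr_t)obj;
	size_t idx;

	assert(pool != NULL);
	p ^= p >> 6;		/* arbitrary mixing of higher address bits */
	idx = (size_t)(p & (TOY_POOL_SIZE - 1));
	return (&pool->slots[idx]);
}
```

Because the pool is smaller than the set of objects, two unrelated objects can
map to the same lock.
That sharing is a plausible reason for the leaf-lock restriction: a thread that
holds one pool lock and then tries to take another might be asking for the lock
it already owns.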
.Ss Reader/Writer Locks Reader/writer locks allow shared access to protected data by multiple threads or exclusive access by a single thread. The threads with shared access are known as .Em readers since they should only read the protected data. A thread with exclusive access is known as a .Em writer since it may modify protected data. .Pp Reader/writer locks can be treated as mutexes (see above and .Xr mutex 9 ) with shared/exclusive semantics. Reader/writer locks support priority propagation like mutexes, but priority is propagated only to an exclusive holder. This limitation comes from the fact that shared owners are anonymous. .Pp See .Xr rwlock 9 for details. .Ss Read-Mostly Locks Read-mostly locks are similar to .Em reader/writer locks but optimized for very infrequent write locking. .Em Read-mostly locks implement full priority propagation by tracking shared owners using a caller-supplied .Em tracker data structure. .Pp See .Xr rmlock 9 for details. .Ss Sleepable Read-Mostly Locks Sleepable read-mostly locks are a variation on read-mostly locks. Threads holding an exclusive lock may sleep, but threads holding a shared lock may not. Priority is propagated to shared owners but not to exclusive owners. .Ss Shared/exclusive locks Shared/exclusive locks are similar to reader/writer locks; the main difference between them is that shared/exclusive locks may be held during unbounded sleep. Acquiring a contested shared/exclusive lock can perform an unbounded sleep. These locks do not support priority propagation. .Pp See .Xr sx 9 for details. .Ss Lockmanager locks Lockmanager locks are sleepable shared/exclusive locks used mostly in .Xr VFS 9 .Po as a .Xr vnode 9 lock .Pc and in the buffer cache .Po .Xr BUF_LOCK 9 .Pc . 
They have features that other lock types do not have, such as sleep timeouts, blocking upgrades, writer starvation avoidance, draining, and an interlock mutex, but this makes them complicated both to use and to implement; for this reason, they should be avoided. .Pp See .Xr lock 9 for details. .Ss Counting semaphores Counting semaphores provide a mechanism for synchronizing access to a pool of resources. Unlike mutexes, semaphores do not have the concept of an owner, so they can be useful in situations where one thread needs to acquire a resource, and another thread needs to release it. They are largely deprecated. .Pp See .Xr sema 9 for details. .Ss Condition variables Condition variables are used in conjunction with locks to wait for a condition to become true. A thread must hold the associated lock before calling one of the .Fn cv_wait functions. When a thread waits on a condition, the lock is atomically released before the thread yields the processor and reacquired before the function call returns. Condition variables may be used with blocking mutexes, reader/writer locks, read-mostly locks, and shared/exclusive locks. .Pp See .Xr condvar 9 for details. .Ss Sleep/Wakeup The functions .Fn tsleep , .Fn msleep , .Fn msleep_spin , .Fn pause , .Fn wakeup , and .Fn wakeup_one also handle event-based thread blocking. Unlike condition variables, arbitrary addresses may be used as wait channels and a dedicated structure does not need to be allocated. However, care must be taken to ensure that wait channel addresses are unique to an event. If a thread must wait for an external event, it is put to sleep by .Fn tsleep , .Fn msleep , .Fn msleep_spin , or .Fn pause . Threads may also wait using one of the locking primitive sleep routines .Xr mtx_sleep 9 , .Xr rw_sleep 9 , or .Xr sx_sleep 9 . .Pp The parameter .Fa chan is an arbitrary address that uniquely identifies the event on which the thread is being put to sleep.
All threads sleeping on a single .Fa chan are woken up later by .Fn wakeup .Pq often called from inside an interrupt routine to indicate that the event the thread was blocking on has occurred. .Pp Several of the sleep functions including .Fn msleep , .Fn msleep_spin , and the locking primitive sleep routines specify an additional lock parameter. The lock will be released before sleeping and reacquired before the sleep routine returns. If .Fa priority includes the .Dv PDROP flag, then the lock will not be reacquired before returning. The lock is used to ensure that a condition can be checked atomically, and that the current thread can be suspended without missing a change to the condition or an associated wakeup. In addition, all of the sleep routines will fully drop the .Va Giant mutex .Pq even if recursed while the thread is suspended and will reacquire the .Va Giant mutex .Pq restoring any recursion before the function returns. .Pp The .Fn pause function is a special sleep function that waits for a specified amount of time to pass before the thread resumes execution. This sleep cannot be terminated early by either an explicit .Fn wakeup or a signal. .Pp See .Xr sleep 9 for details. .Ss Giant Giant is a special mutex used to protect data structures that do not yet have their own locks. Since it provides semantics akin to the old .Xr spl 9 interface, Giant has special characteristics: .Bl -enum .It It is recursive. .It Drivers can request that Giant be locked around them by not marking themselves MPSAFE. Note that infrastructure to do this is slowly going away as non-MPSAFE drivers either became properly locked or disappear. .It Giant must be locked before other non-sleepable locks. .It Giant is dropped during unbounded sleeps and reacquired after wakeup. .It There are places in the kernel that drop Giant and pick it back up again. Sleep locks will do this before sleeping. Parts of the network or VM code may do this as well. 
This means that you cannot count on Giant keeping other code from running if your code sleeps, even if you want it to. .El .Sh INTERACTIONS The primitives can interact and have a number of rules regarding how they can and can not be combined. Many of these rules are checked by .Xr witness 4 . .Ss Bounded vs. Unbounded Sleep In a bounded sleep .Po also referred to as .Dq blocking .Pc the only resource needed to resume execution of a thread is CPU time for the owner of a lock that the thread is waiting to acquire. In an unbounded sleep .Po often referred to as simply .Dq sleeping .Pc a thread waits for an external event or for a condition to become true. In particular, a dependency chain of threads in bounded sleeps should always make forward progress, since there is always CPU time available. This requires that no thread in a bounded sleep is waiting for a lock held by a thread in an unbounded sleep. To avoid priority inversions, a thread in a bounded sleep lends its priority to the owner of the lock that it is waiting for. .Pp The following primitives perform bounded sleeps: mutexes, reader/writer locks and read-mostly locks. .Pp The following primitives perform unbounded sleeps: sleepable read-mostly locks, shared/exclusive locks, lockmanager locks, counting semaphores, condition variables, and sleep/wakeup. .Ss General Principles .Bl -bullet .It It is an error to do any operation that could result in yielding the processor while holding a spin mutex. .It It is an error to do any operation that could result in unbounded sleep while holding any primitive from the 'bounded sleep' group. For example, it is an error to try to acquire a shared/exclusive lock while holding a mutex, or to try to allocate memory with M_WAITOK while holding a reader/writer lock. .Pp Note that the lock passed to one of the .Fn sleep or .Fn cv_wait functions is dropped before the thread enters the unbounded sleep and does not violate this rule. 
.It It is an error to do any operation that could result in yielding of the processor when running inside an interrupt filter. .It It is an error to do any operation that could result in unbounded sleep when running inside an interrupt thread. .El .Ss Interaction table The following table shows what you can and can not do while holding one of the locking primitives discussed. Note that .Dq sleep includes .Fn sema_wait , .Fn sema_timedwait , any of the .Fn cv_wait functions, and any of the .Fn sleep functions. .Bl -column ".Ic xxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXX" -offset 3n .It Em " You want:" Ta spin mtx Ta mutex/rw Ta rmlock Ta sleep rm Ta sx/lk Ta sleep .It Em "You have: " Ta -------- Ta -------- Ta ------ Ta -------- Ta ------ Ta ------ .It spin mtx Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no-1 .It mutex/rw Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no-1 .It rmlock Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no-1 .It sleep rm Ta \&ok Ta \&ok Ta \&ok Ta \&ok-2 Ta \&ok-2 Ta \&ok-2/3 .It sx Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok-3 .It lockmgr Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok .El .Pp .Em *1 There are calls that atomically release this primitive when going to sleep and reacquire it on wakeup .Po .Fn mtx_sleep , .Fn rw_sleep , .Fn msleep_spin , etc. .Pc . .Pp .Em *2 These cases are only allowed while holding a write lock on a sleepable read-mostly lock. .Pp .Em *3 Though one can sleep while holding this lock, one can also use a .Fn sleep function to atomically release this primitive when going to sleep and reacquire it on wakeup. .Pp Note that non-blocking try operations on locks are always permitted. .Ss Context mode table The next table shows what can be used in different contexts. At this time this is a rather easy to remember table. 
.Bl -column ".Ic Xxxxxxxxxxxxxxxxxxx" ".Xr XXXXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXXX" ".Xr XXXXXXXXX" ".Xr XXXXXX" -offset 3n .It Em "Context:" Ta spin mtx Ta mutex/rw Ta rmlock Ta sleep rm Ta sx/lk Ta sleep .It interrupt filter: Ta \&ok Ta \&no Ta \&no Ta \&no Ta \&no Ta \&no .It interrupt thread: Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no .It callout: Ta \&ok Ta \&ok Ta \&ok Ta \&no Ta \&no Ta \&no .It system call: Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok Ta \&ok .El .Sh SEE ALSO .Xr witness 4 , +.Xr BUS_SETUP_INTR 9 , .Xr condvar 9 , .Xr lock 9 , +.Xr LOCK_PROFILING 9 , .Xr mtx_pool 9 , .Xr mutex 9 , .Xr rmlock 9 , .Xr rwlock 9 , .Xr sema 9 , .Xr sleep 9 , -.Xr sx 9 , -.Xr BUS_SETUP_INTR 9 , -.Xr LOCK_PROFILING 9 +.Xr sx 9 .Sh HISTORY These functions appeared in .Bsx 4.1 through .Fx 7.0 . .Sh BUGS There are too many locking primitives to choose from. Index: head/share/man/man9/mbuf.9 =================================================================== --- head/share/man/man9/mbuf.9 (revision 275992) +++ head/share/man/man9/mbuf.9 (revision 275993) @@ -1,1190 +1,1192 @@ .\" Copyright (c) 2000 FreeBSD Inc. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL [your name] OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd October 21, 2014 .Dt MBUF 9 .Os .\" .Sh NAME .Nm mbuf .Nd "memory management in the kernel IPC subsystem" .\" .Sh SYNOPSIS .In sys/param.h .In sys/systm.h .In sys/mbuf.h .\" .Ss Mbuf allocation macros .Fn MGET "struct mbuf *mbuf" "int how" "short type" .Fn MGETHDR "struct mbuf *mbuf" "int how" "short type" .Fn MCLGET "struct mbuf *mbuf" "int how" .Fo MEXTADD .Fa "struct mbuf *mbuf" .Fa "caddr_t buf" .Fa "u_int size" .Fa "void (*free)(void *opt_arg1, void *opt_arg2)" .Fa "void *opt_arg1" .Fa "void *opt_arg2" .Fa "short flags" .Fa "int type" .Fc .Fn MEXTFREE "struct mbuf *mbuf" .Fn MFREE "struct mbuf *mbuf" "struct mbuf *successor" .\" .Ss Mbuf utility macros .Fn mtod "struct mbuf *mbuf" "type" .Fn M_ALIGN "struct mbuf *mbuf" "u_int len" .Fn MH_ALIGN "struct mbuf *mbuf" "u_int len" .Ft int .Fn M_LEADINGSPACE "struct mbuf *mbuf" .Ft int .Fn M_TRAILINGSPACE "struct mbuf *mbuf" .Fn M_MOVE_PKTHDR "struct mbuf *to" "struct mbuf *from" .Fn M_PREPEND "struct mbuf *mbuf" "int len" "int how" .Fn MCHTYPE "struct mbuf *mbuf" "short type" .Ft int .Fn M_WRITABLE "struct mbuf *mbuf" .\" .Ss Mbuf allocation functions .Ft struct mbuf * .Fn m_get "int how" "short type" .Ft struct mbuf * .Fn m_get2 "int size" "int how" "short type" "int flags" .Ft struct mbuf * .Fn m_getm "struct mbuf *orig" "int len" "int how" "short type" .Ft struct mbuf * .Fn m_getjcl "int how" "short type" "int flags" "int size" .Ft struct mbuf * .Fn m_getcl "int 
how" "short type" "int flags" .Ft struct mbuf * .Fn m_getclr "int how" "short type" .Ft struct mbuf * .Fn m_gethdr "int how" "short type" .Ft struct mbuf * .Fn m_free "struct mbuf *mbuf" .Ft void .Fn m_freem "struct mbuf *mbuf" .\" .Ss Mbuf utility functions .Ft void .Fn m_adj "struct mbuf *mbuf" "int len" .Ft void .Fn m_align "struct mbuf *mbuf" "int len" .Ft int .Fn m_append "struct mbuf *mbuf" "int len" "c_caddr_t cp" .Ft struct mbuf * .Fn m_prepend "struct mbuf *mbuf" "int len" "int how" .Ft struct mbuf * .Fn m_copyup "struct mbuf *mbuf" "int len" "int dstoff" .Ft struct mbuf * .Fn m_pullup "struct mbuf *mbuf" "int len" .Ft struct mbuf * .Fn m_pulldown "struct mbuf *mbuf" "int offset" "int len" "int *offsetp" .Ft struct mbuf * .Fn m_copym "struct mbuf *mbuf" "int offset" "int len" "int how" .Ft struct mbuf * .Fn m_copypacket "struct mbuf *mbuf" "int how" .Ft struct mbuf * .Fn m_dup "struct mbuf *mbuf" "int how" .Ft void .Fn m_copydata "const struct mbuf *mbuf" "int offset" "int len" "caddr_t buf" .Ft void .Fn m_copyback "struct mbuf *mbuf" "int offset" "int len" "caddr_t buf" .Ft struct mbuf * .Fo m_devget .Fa "char *buf" .Fa "int len" .Fa "int offset" .Fa "struct ifnet *ifp" .Fa "void (*copy)(char *from, caddr_t to, u_int len)" .Fc .Ft void .Fn m_cat "struct mbuf *m" "struct mbuf *n" .Ft u_int .Fn m_fixhdr "struct mbuf *mbuf" .Ft void .Fn m_dup_pkthdr "struct mbuf *to" "struct mbuf *from" .Ft void .Fn m_move_pkthdr "struct mbuf *to" "struct mbuf *from" .Ft u_int .Fn m_length "struct mbuf *mbuf" "struct mbuf **last" .Ft struct mbuf * .Fn m_split "struct mbuf *mbuf" "int len" "int how" .Ft int .Fn m_apply "struct mbuf *mbuf" "int off" "int len" "int (*f)(void *arg, void *data, u_int len)" "void *arg" .Ft struct mbuf * .Fn m_getptr "struct mbuf *mbuf" "int loc" "int *off" .Ft struct mbuf * .Fn m_defrag "struct mbuf *m0" "int how" .Ft struct mbuf * .Fn m_unshare "struct mbuf *m0" "int how" .\" .Sh DESCRIPTION An .Vt mbuf is a basic unit of memory management in the 
kernel IPC subsystem. Network packets and socket buffers are stored in .Vt mbufs . A network packet may span multiple .Vt mbufs arranged into a .Vt mbuf chain (linked list), which allows adding or trimming network headers with little overhead. .Pp While a developer should not bother with .Vt mbuf internals without serious reason in order to avoid incompatibilities with future changes, it is useful to understand the general structure of an .Vt mbuf . .Pp An .Vt mbuf consists of a variable-sized header and a small internal buffer for data. The total size of an .Vt mbuf , .Dv MSIZE , is a constant defined in .In sys/param.h . The .Vt mbuf header includes: .Bl -tag -width "m_nextpkt" -offset indent .It Va m_next .Pq Vt struct mbuf * A pointer to the next .Vt mbuf in the .Vt mbuf chain . .It Va m_nextpkt .Pq Vt struct mbuf * A pointer to the next .Vt mbuf chain in the queue. .It Va m_data .Pq Vt caddr_t A pointer to data attached to this .Vt mbuf . .It Va m_len .Pq Vt int The length of the data. .It Va m_type .Pq Vt short The type of the data. .It Va m_flags .Pq Vt int The .Vt mbuf flags. 
.El .Pp The .Vt mbuf flag bits are defined as follows: .Bd -literal /* mbuf flags */ #define M_EXT 0x00000001 /* has associated external storage */ #define M_PKTHDR 0x00000002 /* start of record */ #define M_EOR 0x00000004 /* end of record */ #define M_RDONLY 0x00000008 /* associated data marked read-only */ #define M_PROTO1 0x00001000 /* protocol-specific */ #define M_PROTO2 0x00002000 /* protocol-specific */ #define M_PROTO3 0x00004000 /* protocol-specific */ #define M_PROTO4 0x00008000 /* protocol-specific */ #define M_PROTO5 0x00010000 /* protocol-specific */ #define M_PROTO6 0x00020000 /* protocol-specific */ #define M_PROTO7 0x00040000 /* protocol-specific */ #define M_PROTO8 0x00080000 /* protocol-specific */ #define M_PROTO9 0x00100000 /* protocol-specific */ #define M_PROTO10 0x00200000 /* protocol-specific */ #define M_PROTO11 0x00400000 /* protocol-specific */ #define M_PROTO12 0x00800000 /* protocol-specific */ /* mbuf pkthdr flags (also stored in m_flags) */ #define M_BCAST 0x00000010 /* send/received as link-level broadcast */ #define M_MCAST 0x00000020 /* send/received as link-level multicast */ .Ed .Pp The available .Vt mbuf types are defined as follows: .Bd -literal /* mbuf types */ #define MT_DATA 1 /* dynamic (data) allocation */ #define MT_HEADER MT_DATA /* packet header */ #define MT_SONAME 8 /* socket name */ #define MT_CONTROL 14 /* extra-data protocol message */ #define MT_OOBDATA 15 /* expedited data */ .Ed .Pp The available external buffer types are defined as follows: .Bd -literal /* external buffer types */ #define EXT_CLUSTER 1 /* mbuf cluster */ #define EXT_SFBUF 2 /* sendfile(2)'s sf_bufs */ #define EXT_JUMBOP 3 /* jumbo cluster 4096 bytes */ #define EXT_JUMBO9 4 /* jumbo cluster 9216 bytes */ #define EXT_JUMBO16 5 /* jumbo cluster 16384 bytes */ #define EXT_PACKET 6 /* mbuf+cluster from packet zone */ #define EXT_MBUF 7 /* external mbuf reference (M_IOVEC) */ #define EXT_NET_DRV 252 /* custom ext_buf provided by net driver(s) */ 
#define EXT_MOD_TYPE 253 /* custom module's ext_buf type */ #define EXT_DISPOSABLE 254 /* can throw this buffer away w/page flipping */ #define EXT_EXTREF 255 /* has externally maintained ref_cnt ptr */ .Ed .Pp If the .Dv M_PKTHDR flag is set, a .Vt struct pkthdr Va m_pkthdr is added to the .Vt mbuf header. It contains a pointer to the interface the packet has been received from .Pq Vt struct ifnet Va *rcvif , and the total packet length .Pq Vt int Va len . Optionally, it may also contain an attached list of packet tags .Pq Vt "struct m_tag" . See .Xr mbuf_tags 9 for details. Fields used in offloading checksum calculation to the hardware are kept in .Va m_pkthdr as well. See .Sx HARDWARE-ASSISTED CHECKSUM CALCULATION for details. .Pp If small enough, data is stored in the internal data buffer of an .Vt mbuf . If the data is sufficiently large, another .Vt mbuf may be added to the .Vt mbuf chain , or external storage may be associated with the .Vt mbuf . .Dv MHLEN bytes of data can fit into an .Vt mbuf with the .Dv M_PKTHDR flag set, .Dv MLEN bytes can otherwise. .Pp If external storage is being associated with an .Vt mbuf , the .Va m_ext header is added at the cost of losing the internal data buffer. It includes a pointer to external storage, the size of the storage, a pointer to a function used for freeing the storage, a pointer to an optional argument that can be passed to the function, and a pointer to a reference counter. An .Vt mbuf using external storage has the .Dv M_EXT flag set. .Pp The system supplies a macro for allocating the desired external storage buffer, .Dv MEXTADD . .Pp The allocation and management of the reference counter is handled by the subsystem. .Pp The system also supplies a default type of external storage buffer called an .Vt mbuf cluster . .Vt Mbuf clusters can be allocated and configured with the use of the .Dv MCLGET macro. Each .Vt mbuf cluster is .Dv MCLBYTES in size, where MCLBYTES is a machine-dependent constant. 
The system defines an advisory macro .Dv MINCLSIZE , which is the smallest amount of data to put into an .Vt mbuf cluster . It is equal to .Dv MHLEN plus one. It is typically preferable to store data into the data region of an .Vt mbuf , if size permits, as opposed to allocating a separate .Vt mbuf cluster to hold the same data. .\" .Ss Macros and Functions There are numerous predefined macros and functions that provide the developer with common utilities. .\" .Bl -ohang -offset indent .It Fn mtod mbuf type Convert an .Fa mbuf pointer to a data pointer. The macro expands to the data pointer cast to the specified .Fa type . .Sy Note : It is advisable to ensure that there is enough contiguous data in .Fa mbuf . See .Fn m_pullup for details. .It Fn MGET mbuf how type Allocate an .Vt mbuf and initialize it to contain internal data. .Fa mbuf will point to the allocated .Vt mbuf on success, or be set to .Dv NULL on failure. The .Fa how argument is to be set to .Dv M_WAITOK or .Dv M_NOWAIT . It specifies whether the caller is willing to block if necessary. A number of other functions and macros related to .Vt mbufs have the same argument because they may at some point need to allocate new .Vt mbufs . .It Fn MGETHDR mbuf how type Allocate an .Vt mbuf and initialize it to contain a packet header and internal data. See .Fn MGET for details. .It Fn MEXTADD mbuf buf size free opt_arg1 opt_arg2 flags type Associate externally managed data with .Fa mbuf . Any internal data contained in the mbuf will be discarded, and the .Dv M_EXT flag will be set. The .Fa buf and .Fa size arguments are the address and length, respectively, of the data. The .Fa free argument points to a function which will be called to free the data when the mbuf is freed; it is only used if .Fa type is .Dv EXT_EXTREF . The .Fa opt_arg1 and .Fa opt_arg2 arguments will be passed unmodified to .Fa free . The .Fa flags argument specifies additional .Vt mbuf flags; it is not necessary to specify .Dv M_EXT . 
Finally, the .Fa type argument specifies the type of external data, which controls how it will be disposed of when the .Vt mbuf is freed. In most cases, the correct value is .Dv EXT_EXTREF . .It Fn MCLGET mbuf how Allocate and attach an .Vt mbuf cluster to .Fa mbuf . If the macro fails, the .Dv M_EXT flag will not be set in .Fa mbuf . .It Fn M_ALIGN mbuf len Set the pointer .Fa mbuf->m_data to place an object of the size .Fa len at the end of the internal data area of .Fa mbuf , long word aligned. Applicable only if .Fa mbuf is newly allocated with .Fn MGET or .Fn m_get . .It Fn MH_ALIGN mbuf len Serves the same purpose as .Fn M_ALIGN does, but only for .Fa mbuf newly allocated with .Fn MGETHDR or .Fn m_gethdr , or initialized by .Fn m_dup_pkthdr or .Fn m_move_pkthdr . .It Fn m_align mbuf len Serves the same purpose as .Fn M_ALIGN but handles any type of mbuf. .It Fn M_LEADINGSPACE mbuf Returns the number of bytes available before the beginning of data in .Fa mbuf . .It Fn M_TRAILINGSPACE mbuf Returns the number of bytes available after the end of data in .Fa mbuf . .It Fn M_PREPEND mbuf len how This macro operates on an .Vt mbuf chain . It is an optimized wrapper for .Fn m_prepend that can make use of possible empty space before data (e.g.\& left after trimming of a link-layer header). The new .Vt mbuf chain pointer or .Dv NULL is in .Fa mbuf after the call. .It Fn M_MOVE_PKTHDR to from Using this macro is equivalent to calling .Fn m_move_pkthdr to from . .It Fn M_WRITABLE mbuf This macro will evaluate true if .Fa mbuf is not marked .Dv M_RDONLY and either .Fa mbuf has no external storage or the reference count of that storage is not greater than 1. The .Dv M_RDONLY flag can be set in .Fa mbuf->m_flags . This can be achieved during setup of the external storage, by passing the .Dv M_RDONLY bit as a .Fa flags argument to the .Fn MEXTADD macro, or by setting it directly in individual .Vt mbufs . 
.It Fn MCHTYPE mbuf type Change the type of .Fa mbuf to .Fa type . This is a relatively expensive operation and should be avoided. .El .Pp The functions are: .Bl -ohang -offset indent .It Fn m_get how type A function version of .Fn MGET for non-critical paths. .It Fn m_get2 size how type flags Allocate an .Vt mbuf with enough space to hold the specified amount of data. .It Fn m_getm orig len how type Allocate .Fa len bytes worth of .Vt mbufs and .Vt mbuf clusters if necessary and append the resulting allocated .Vt mbuf chain to the .Vt mbuf chain .Fa orig , if it is .No non- Ns Dv NULL . If the allocation fails at any point, free whatever was allocated and return .Dv NULL . If .Fa orig is .No non- Ns Dv NULL , it will not be freed. It is possible to use .Fn m_getm to either append .Fa len bytes to an existing .Vt mbuf or .Vt mbuf chain (for example, one which may be sitting in a pre-allocated ring) or to simply perform an all-or-nothing .Vt mbuf and .Vt mbuf cluster allocation. .It Fn m_gethdr how type A function version of .Fn MGETHDR for non-critical paths. .It Fn m_getcl how type flags Fetch an .Vt mbuf with a .Vt mbuf cluster attached to it. If one of the allocations fails, the entire allocation fails. This routine is the preferred way of fetching both the .Vt mbuf and .Vt mbuf cluster together, as it avoids having to unlock/relock between allocations. Returns .Dv NULL on failure. .It Fn m_getjcl how type flags size This is like .Fn m_getcl but the cluster allocated will be large enough to hold .Fa size bytes. .It Fn m_getclr how type Allocate an .Vt mbuf and zero out the data region. .It Fn m_free mbuf Frees .Vt mbuf . Returns .Va m_next of the freed .Vt mbuf . .El .Pp The functions below operate on .Vt mbuf chains . .Bl -ohang -offset indent .It Fn m_freem mbuf Free an entire .Vt mbuf chain , including any external storage. .\" .It Fn m_adj mbuf len Trim .Fa len bytes from the head of an .Vt mbuf chain if .Fa len is positive, from the tail otherwise. 
.\" .It Fn m_append mbuf len cp Append .Vt len bytes of data .Vt cp to the .Vt mbuf chain . Extend the mbuf chain if the new data does not fit in existing space. .\" .It Fn m_prepend mbuf len how Allocate a new .Vt mbuf and prepend it to the .Vt mbuf chain , handle .Dv M_PKTHDR properly. .Sy Note : It does not allocate any .Vt mbuf clusters , so .Fa len must be less than .Dv MLEN or .Dv MHLEN , depending on the .Dv M_PKTHDR flag setting. .\" .It Fn m_copyup mbuf len dstoff Similar to .Fn m_pullup but copies .Fa len bytes of data into a new mbuf at .Fa dstoff bytes into the mbuf. The .Fa dstoff argument aligns the data and leaves room for a link layer header. Returns the new .Vt mbuf chain on success, and frees the .Vt mbuf chain and returns .Dv NULL on failure. .Sy Note : The function does not allocate .Vt mbuf clusters , so .Fa len + dstoff must be less than .Dv MHLEN . .\" .It Fn m_pullup mbuf len Arrange that the first .Fa len bytes of an .Vt mbuf chain are contiguous and lay in the data area of .Fa mbuf , so they are accessible with .Fn mtod mbuf type . It is important to remember that this may involve reallocating some mbufs and moving data so all pointers referencing data within the old mbuf chain must be recalculated or made invalid. Return the new .Vt mbuf chain on success, .Dv NULL on failure (the .Vt mbuf chain is freed in this case). .Sy Note : It does not allocate any .Vt mbuf clusters , so .Fa len must be less than or equal to .Dv MHLEN . .\" .It Fn m_pulldown mbuf offset len offsetp Arrange that .Fa len bytes between .Fa offset and .Fa offset + len in the .Vt mbuf chain are contiguous and lay in the data area of .Fa mbuf , so they are accessible with .Fn mtod mbuf type . .Fa len must be smaller than, or equal to, the size of an .Vt mbuf cluster . 
Return a pointer to an intermediate .Vt mbuf in the chain containing the requested region; the offset in the data region of the .Vt mbuf chain to the data contained in the returned mbuf is stored in .Fa *offsetp . If .Fa offsetp is NULL, the region may be accessed using .Fn mtod mbuf type . If .Fa offsetp is non-NULL, the region may be accessed using .Fn mtod mbuf uint8_t + *offsetp. The region of the mbuf chain between its beginning and .Fa offset is not modified, therefore it is safe to hold pointers to data within this region before calling .Fn m_pulldown . .\" .It Fn m_copym mbuf offset len how Make a copy of an .Vt mbuf chain starting .Fa offset bytes from the beginning, continuing for .Fa len bytes. If .Fa len is .Dv M_COPYALL , copy to the end of the .Vt mbuf chain . .Sy Note : The copy is read-only, because the .Vt mbuf clusters are not copied, only their reference counts are incremented. .\" .It Fn m_copypacket mbuf how Copy an entire packet including header, which must be present. This is an optimized version of the common case .Fn m_copym mbuf 0 M_COPYALL how . .Sy Note : the copy is read-only, because the .Vt mbuf clusters are not copied, only their reference counts are incremented. .\" .It Fn m_dup mbuf how Copy a packet header .Vt mbuf chain into a completely new .Vt mbuf chain , including copying any .Vt mbuf clusters . Use this instead of .Fn m_copypacket when you need a writable copy of an .Vt mbuf chain . .\" .It Fn m_copydata mbuf offset len buf Copy data from an .Vt mbuf chain starting .Fa off bytes from the beginning, continuing for .Fa len bytes, into the indicated buffer .Fa buf . .\" .It Fn m_copyback mbuf offset len buf Copy .Fa len bytes from the buffer .Fa buf back into the indicated .Vt mbuf chain , starting at .Fa offset bytes from the beginning of the .Vt mbuf chain , extending the .Vt mbuf chain if necessary. .Sy Note : It does not allocate any .Vt mbuf clusters , just adds .Vt mbufs to the .Vt mbuf chain . 
It is safe to set .Fa offset beyond the current .Vt mbuf chain end: zeroed .Vt mbufs will be allocated to fill the space. .\" .It Fn m_length mbuf last Return the length of the .Vt mbuf chain , and optionally a pointer to the last .Vt mbuf . .\" .It Fn m_dup_pkthdr to from how Upon the function's completion, the .Vt mbuf .Fa to will contain an identical copy of .Fa from->m_pkthdr and the per-packet attributes found in the .Vt mbuf chain .Fa from . The .Vt mbuf .Fa from must have the flag .Dv M_PKTHDR initially set, and .Fa to must be empty on entry. .\" .It Fn m_move_pkthdr to from Move .Va m_pkthdr and the per-packet attributes from the .Vt mbuf chain .Fa from to the .Vt mbuf .Fa to . The .Vt mbuf .Fa from must have the flag .Dv M_PKTHDR initially set, and .Fa to must be empty on entry. Upon the function's completion, .Fa from will have the flag .Dv M_PKTHDR and the per-packet attributes cleared. .\" .It Fn m_fixhdr mbuf Set the packet-header length to the length of the .Vt mbuf chain . .\" .It Fn m_devget buf len offset ifp copy Copy data from a device local memory pointed to by .Fa buf to an .Vt mbuf chain . The copy is done using a specified copy routine .Fa copy , or .Fn bcopy if .Fa copy is .Dv NULL . .\" .It Fn m_cat m n Concatenate .Fa n to .Fa m . Both .Vt mbuf chains must be of the same type. .Fa N is still valid after the function returned. .Sy Note : It does not handle .Dv M_PKTHDR and friends. .\" .It Fn m_split mbuf len how Partition an .Vt mbuf chain in two pieces, returning the tail: all but the first .Fa len bytes. In case of failure, it returns .Dv NULL and attempts to restore the .Vt mbuf chain to its original state. .\" .It Fn m_apply mbuf off len f arg Apply a function to an .Vt mbuf chain , at offset .Fa off , for length .Fa len bytes. Typically used to avoid calls to .Fn m_pullup which would otherwise be unnecessary or undesirable. .Fa arg is a convenience argument which is passed to the callback function .Fa f . 
.Pp Each time .Fn f is called, it will be passed .Fa arg , a pointer to the .Fa data in the current mbuf, and the length .Fa len of the data in this mbuf to which the function should be applied. .Pp The function should return zero to indicate success; otherwise, if an error is indicated, then .Fn m_apply will return the error and stop iterating through the .Vt mbuf chain . .\" .It Fn m_getptr mbuf loc off Return a pointer to the mbuf containing the data located at .Fa loc bytes from the beginning of the .Vt mbuf chain . The corresponding offset into the mbuf will be stored in .Fa *off . .It Fn m_defrag m0 how Defragment an mbuf chain, returning the shortest possible chain of mbufs and clusters. If allocation fails and this can not be completed, .Dv NULL will be returned and the original chain will be unchanged. Upon success, the original chain will be freed and the new chain will be returned. .Fa how should be either .Dv M_WAITOK or .Dv M_NOWAIT , depending on the caller's preference. .Pp This function is especially useful in network drivers, where certain long mbuf chains must be shortened before being added to TX descriptor lists. .It Fn m_unshare m0 how Create a version of the specified mbuf chain whose contents can be safely modified without affecting other users. If allocation fails and this operation can not be completed, .Dv NULL will be returned. The original mbuf chain is always reclaimed and the reference count of any shared mbuf clusters is decremented. .Fa how should be either .Dv M_WAITOK or .Dv M_NOWAIT , depending on the caller's preference. As a side-effect of this process the returned mbuf chain may be compacted. .Pp This function is especially useful in the transmit path of network code, when data must be encrypted or otherwise altered prior to transmission. .El .Sh HARDWARE-ASSISTED CHECKSUM CALCULATION This section currently applies to TCP/IP only. 
In order to save the host CPU resources, computing checksums is offloaded to the network interface hardware if possible. The .Va m_pkthdr member of the leading .Vt mbuf of a packet contains two fields used for that purpose, .Vt int Va csum_flags and .Vt int Va csum_data . The meaning of those fields depends on the direction a packet flows in, and on whether the packet is fragmented. Henceforth, .Va csum_flags or .Va csum_data of a packet will denote the corresponding field of the .Va m_pkthdr member of the leading .Vt mbuf in the .Vt mbuf chain containing the packet. .Pp On output, checksum offloading is attempted after the outgoing interface has been determined for a packet. The interface-specific field .Va ifnet.if_data.ifi_hwassist (see .Xr ifnet 9 ) is consulted for the capabilities of the interface to assist in computing checksums. The .Va csum_flags field of the packet header is set to indicate which actions the interface is supposed to perform on it. The actions unsupported by the network interface are done in the software prior to passing the packet down to the interface driver; such actions will never be requested through .Va csum_flags . .Pp The flags demanding a particular action from an interface are as follows: .Bl -tag -width ".Dv CSUM_TCP" -offset indent .It Dv CSUM_IP The IP header checksum is to be computed and stored in the corresponding field of the packet. The hardware is expected to know the format of an IP header to determine the offset of the IP checksum field. .It Dv CSUM_TCP The TCP checksum is to be computed. (See below.) .It Dv CSUM_UDP The UDP checksum is to be computed. (See below.) .El .Pp Should a TCP or UDP checksum be offloaded to the hardware, the field .Va csum_data will contain the byte offset of the checksum field relative to the end of the IP header. In this case, the checksum field will be initially set by the TCP/IP module to the checksum of the pseudo header defined by the TCP and UDP specifications. 
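As a rough illustration of the pseudo header value mentioned above, the following userland sketch computes a one's-complement sum over the IPv4 pseudo header. The names fold_sum() and pseudo_hdr_sum() are hypothetical and are not kernel interfaces. The sketch takes its arguments in host byte order for readability, and it assumes that the checksum of the pseudo header means the folded, uninverted sum over the source address, destination address, protocol number, and TCP or UDP length.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t
fold_sum(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/*
 * Hypothetical helper: one's-complement sum (not inverted) of the
 * IPv4 pseudo header.  All arguments are in host byte order, which
 * is an assumption made for this sketch only.
 */
static uint16_t
pseudo_hdr_sum(uint32_t src, uint32_t dst, uint8_t proto, uint16_t len)
{
	uint32_t sum = 0;

	sum += (src >> 16) + (src & 0xffff);	/* source address */
	sum += (dst >> 16) + (dst & 0xffff);	/* destination address */
	sum += proto;				/* zero byte + protocol */
	sum += len;				/* TCP or UDP length */
	return (fold_sum(sum));
}
```

For a 20-byte TCP segment from 192.168.0.1 to 192.168.0.2, pseudo_hdr_sum(0xC0A80001, 0xC0A80002, 6, 20) yields 0x816e. Since the checksum field lies 16 bytes into a TCP header and 6 bytes into a UDP header, those are the offsets one would expect to see in csum_data on output.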
.Pp On input, an interface indicates the actions it has performed on a packet by setting one or more of the following flags in .Va csum_flags associated with the packet: .Bl -tag -width ".Dv CSUM_IP_CHECKED" -offset indent .It Dv CSUM_IP_CHECKED The IP header checksum has been computed. .It Dv CSUM_IP_VALID The IP header has a valid checksum. This flag can appear only in combination with .Dv CSUM_IP_CHECKED . .It Dv CSUM_DATA_VALID The checksum of the data portion of the IP packet has been computed and stored in the field .Va csum_data in network byte order. .It Dv CSUM_PSEUDO_HDR Can be set only along with .Dv CSUM_DATA_VALID to indicate that the IP data checksum found in .Va csum_data allows for the pseudo header defined by the TCP and UDP specifications. Otherwise the checksum of the pseudo header must be calculated by the host CPU and added to .Va csum_data to obtain the final checksum to be used for TCP or UDP validation purposes. .El .Pp If a particular network interface just indicates success or failure of TCP or UDP checksum validation without returning the exact value of the checksum to the host CPU, its driver can mark .Dv CSUM_DATA_VALID and .Dv CSUM_PSEUDO_HDR in .Va csum_flags , and set .Va csum_data to .Li 0xFFFF hexadecimal to indicate a valid checksum. It is a peculiarity of the algorithm used that the Internet checksum calculated over any valid packet will be .Li 0xFFFF as long as the original checksum field is included. .Sh STRESS TESTING When running a kernel compiled with the option .Dv MBUF_STRESS_TEST , the following .Xr sysctl 8 Ns -controlled options may be used to create various failure/extreme cases for testing of network drivers and other parts of the kernel that rely on .Vt mbufs . .Bl -tag -width ident .It Va net.inet.ip.mbuf_frag_size Causes .Fn ip_output to fragment outgoing .Vt mbuf chains into fragments of the specified size. 
Setting this variable to 1 is an excellent way to test the long .Vt mbuf chain handling ability of network drivers. .It Va kern.ipc.m_defragrandomfailures Causes the function .Fn m_defrag to randomly fail, returning .Dv NULL . Any piece of code which uses .Fn m_defrag should be tested with this feature. .El .Sh RETURN VALUES See above. .Sh SEE ALSO .Xr ifnet 9 , .Xr mbuf_tags 9 .Sh HISTORY .\" Please correct me if I'm wrong .Vt Mbufs appeared in an early version of .Bx . Besides being used for network packets, they were used to store various dynamic structures, such as routing table entries, interface addresses, protocol control blocks, etc. In more recent .Fx use of .Vt mbufs is almost entirely limited to packet storage, with .Xr uma 9 zones being used directly to store other network-related memory. .Pp Historically, the .Vt mbuf allocator has been a special-purpose memory allocator able to run in interrupt contexts and allocating from a special kernel address space map. As of .Fx 5.3 , the .Vt mbuf allocator is a wrapper around .Xr uma 9 , allowing caching of .Vt mbufs , clusters, and .Vt mbuf + cluster pairs in per-CPU caches, as well as bringing other benefits of slab allocation. .Sh AUTHORS The original .Nm -manual page was written by Yar Tikhiy. +manual page was written by +.An Yar Tikhiy . The .Xr uma 9 .Vt mbuf -allocator was written by Bosko Milekic. +allocator was written by +.An Bosko Milekic . Index: head/share/man/man9/refcount.9 =================================================================== --- head/share/man/man9/refcount.9 (revision 275992) +++ head/share/man/man9/refcount.9 (revision 275993) @@ -1,96 +1,96 @@ .\" .\" Copyright (c) 2009 Advanced Computing Technologies LLC .\" Written by: John H. Baldwin .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. 
Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd January 20, 2009 .Dt REFCOUNT 9 .Os .Sh NAME .Nm refcount , .Nm refcount_init , .Nm refcount_acquire , .Nm refcount_release .Nd manage a simple reference counter .Sh SYNOPSIS .In sys/param.h .In sys/refcount.h .Ft void -.Fn refcount_init "volatile u_int *count, u_int value" +.Fn refcount_init "volatile u_int *count" "u_int value" .Ft void .Fn refcount_acquire "volatile u_int *count" .Ft int .Fn refcount_release "volatile u_int *count" .Sh DESCRIPTION The .Nm functions provide an API to manage a simple reference counter. The caller provides the storage for the counter in an unsigned integer. A pointer to this integer is passed via .Fa count . Usually the counter is used to manage the lifetime of an object and is stored as a member of the object. 
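The pattern described above, a counter stored as a member of the object whose lifetime it manages, might look roughly like the following userland sketch. It is built on C11 atomics rather than on sys/refcount.h, and every name carrying the ex_ prefix is hypothetical: these are stand-ins that mimic the documented behaviour, not the kernel implementations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins for the kernel routines. */
static void
ex_refcount_init(atomic_uint *count, unsigned int value)
{
	atomic_init(count, value);
}

static void
ex_refcount_acquire(atomic_uint *count)
{
	atomic_fetch_add(count, 1);
}

/* Returns non-zero if the reference released was the last one. */
static int
ex_refcount_release(atomic_uint *count)
{
	unsigned int old;

	old = atomic_fetch_sub(count, 1);
	assert(old > 0);	/* releasing with no references is a bug */
	return (old == 1);
}

/* Hypothetical object that embeds its own counter. */
struct ex_obj {
	atomic_uint	refs;
	int		payload;
};

static struct ex_obj *
ex_obj_alloc(int payload)
{
	struct ex_obj *o;

	o = malloc(sizeof(*o));
	if (o == NULL)
		return (NULL);
	ex_refcount_init(&o->refs, 1);	/* the creator's reference */
	o->payload = payload;
	return (o);
}

/* Drop one reference; free the object on the last one.  Returns 1 if freed. */
static int
ex_obj_rele(struct ex_obj *o)
{
	if (ex_refcount_release(&o->refs)) {
		free(o);
		return (1);
	}
	return (0);
}
```

A caller would obtain an object with ex_obj_alloc(), take further references with ex_refcount_acquire() while already holding one, and pair every reference with a call to ex_obj_rele(); only the final call frees the memory.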
.Pp The .Fn refcount_init function is used to set the initial value of the counter to .Fa value . It is normally used when creating a reference-counted object. .Pp The .Fn refcount_acquire function is used to acquire a new reference. The caller is responsible for ensuring that it holds a valid reference while obtaining a new reference. For example, if an object is stored on a list and the list holds a reference on the object, then holding a lock that protects the list provides sufficient protection for acquiring a new reference. .Pp The .Fn refcount_release function is used to release an existing reference. The function returns a non-zero value if the reference being released was the last reference; otherwise, it returns zero. .Pp Note that these routines do not provide any inter-CPU synchronization, data protection, or memory ordering guarantees except for managing the counter. The caller is responsible for any additional synchronization needed by consumers of any containing objects. In addition, the caller is also responsible for managing the life cycle of any containing objects including explicitly releasing any resources when the last reference is released. .Sh RETURN VALUES The .Nm refcount_release function returns non-zero when releasing the last reference and zero when releasing any other reference. .Sh HISTORY These functions were introduced in .Fx 6.0 . Index: head/share/man/man9/usbdi.9 =================================================================== --- head/share/man/man9/usbdi.9 (revision 275992) +++ head/share/man/man9/usbdi.9 (revision 275993) @@ -1,641 +1,641 @@ .\" .\" Copyright (c) 2005 Ian Dowse .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. 
Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .Dd June 24, 2009 .Dt USBDI 9 .Os .Sh NAME .Nm usb_fifo_alloc_buffer , .Nm usb_fifo_attach , .Nm usb_fifo_detach , .Nm usb_fifo_free_buffer , .Nm usb_fifo_get_data , .Nm usb_fifo_get_data_buffer , .Nm usb_fifo_get_data_error , .Nm usb_fifo_get_data_linear , .Nm usb_fifo_put_bytes_max , .Nm usb_fifo_put_data , .Nm usb_fifo_put_data_buffer , .Nm usb_fifo_put_data_error , .Nm usb_fifo_put_data_linear , .Nm usb_fifo_reset , .Nm usb_fifo_softc , .Nm usb_fifo_wakeup , .Nm usbd_do_request , .Nm usbd_do_request_flags , .Nm usbd_errstr , .Nm usbd_lookup_id_by_info , .Nm usbd_lookup_id_by_uaa , .Nm usbd_transfer_clear_stall , .Nm usbd_transfer_drain , .Nm usbd_transfer_pending , .Nm usbd_transfer_poll , .Nm usbd_transfer_setup , .Nm usbd_transfer_start , .Nm usbd_transfer_stop , .Nm usbd_transfer_submit , .Nm usbd_transfer_unsetup , .Nm usbd_xfer_clr_flag , .Nm usbd_xfer_frame_data , .Nm usbd_xfer_frame_len , .Nm usbd_xfer_get_frame , .Nm usbd_xfer_get_priv , .Nm 
usbd_xfer_is_stalled , .Nm usbd_xfer_max_framelen , .Nm usbd_xfer_max_frames , .Nm usbd_xfer_max_len , .Nm usbd_xfer_set_flag , .Nm usbd_xfer_set_frame_data , .Nm usbd_xfer_set_frame_len , .Nm usbd_xfer_set_frame_offset , .Nm usbd_xfer_set_frames , .Nm usbd_xfer_set_interval , .Nm usbd_xfer_set_priv , .Nm usbd_xfer_set_stall , .Nm usbd_xfer_set_timeout , .Nm usbd_xfer_softc , .Nm usbd_xfer_state , .Nm usbd_xfer_status .Nd Universal Serial Bus driver programming interface .Sh SYNOPSIS .In dev/usb/usb.h .In dev/usb/usbdi.h .In dev/usb/usbdi_util.h .Sh DESCRIPTION The Universal Serial Bus (USB) driver programming interface provides USB peripheral drivers with a host controller independent API for controlling and communicating with USB peripherals. The .Nm usb module supports both USB Host and USB Device side mode. . .Sh USB KERNEL PROGRAMMING Here is a list of commonly used functions: .Pp . .Ft "usb_error_t" .Fo "usbd_transfer_setup" .Fa "udev" .Fa "ifaces" .Fa "pxfer" .Fa "setup_start" .Fa "n_setup" .Fa "priv_sc" .Fa "priv_mtx" .Fc . .Pp . .Ft "void" .Fo "usbd_transfer_unsetup" .Fa "pxfer" .Fa "n_setup" .Fc . .Pp . .Ft "void" .Fo "usbd_transfer_start" .Fa "xfer" .Fc . .Pp . .Ft "void" .Fo "usbd_transfer_stop" .Fa "xfer" .Fc . .Pp . .Ft "void" .Fo "usbd_transfer_drain" .Fa "xfer" .Fc . . . .Sh USB TRANSFER MANAGEMENT FUNCTIONS The USB standard defines four types of USB transfers. . Control transfers, Bulk transfers, Interrupt transfers and Isochronous transfers. . All the transfer types are managed using the following five functions: . .Pp . .Fn usbd_transfer_setup This function will allocate memory for and initialise an array of USB transfers and all required DMA memory. . This function can sleep or block waiting for resources to become available. .Fa udev is a pointer to "struct usb_device". .Fa ifaces is an array of interface index numbers to use. See "if_index". 
.Fa pxfer is a pointer to an array of USB transfer pointers that are initialized to NULL, and then pointed to allocated USB transfers. .Fa setup_start is a pointer to an array of USB config structures. .Fa n_setup is a number telling the USB system how many USB transfers should be set up. .Fa priv_sc is the private softc pointer, which will be used to initialize "xfer->priv_sc". .Fa priv_mtx is the private mutex protecting the transfer structure and the softc. This pointer is used to initialize "xfer->priv_mtx". This function returns zero upon success. A non-zero return value indicates failure. . .Pp . .Fn usbd_transfer_unsetup This function will release the given USB transfers and all allocated resources associated with these USB transfers. .Fa pxfer is a pointer to an array of USB transfer pointers, that may be NULL, that should be freed by the USB system. .Fa n_setup is a number telling the USB system how many USB transfers should be unsetup. . This function can sleep waiting for USB transfers to complete. . This function is NULL safe with regard to the USB transfer structure pointer. . It is not allowed to call this function from the USB transfer callback. . .Pp . .Fn usbd_transfer_start This function will start the USB transfer pointed to by -.Fa xfer, +.Fa xfer , if not already started. . This function is always non-blocking and must be called with the so-called private USB mutex locked. . This function is NULL safe with regard to the USB transfer structure pointer. . .Pp . .Fn usbd_transfer_stop This function will stop the USB transfer pointed to by -.Fa xfer, +.Fa xfer , if not already stopped. . This function is always non-blocking and must be called with the so-called private USB mutex locked. . This function can return before the USB callback has been called. . This function is NULL safe with regard to the USB transfer structure pointer. . If the transfer was in progress, the callback will be called with "USB_ST_ERROR" and "error = USB_ERR_CANCELLED". . 
.Pp . .Fn usbd_transfer_drain This function will stop an USB transfer, if not already stopped, and wait for any additional USB hardware operations to complete. . Buffers that are loaded into DMA using "usbd_xfer_set_frame_data()" can safely be freed after this function has returned. . This function can block the caller and will not return before the USB callback has been called. . This function is NULL safe with regard to the USB transfer structure pointer. . .Sh USB TRANSFER CALLBACK . The USB callback has three states. . USB_ST_SETUP, USB_ST_TRANSFERRED and USB_ST_ERROR. USB_ST_SETUP is the initial state. . After the callback has been called with this state it will always be called back at a later stage in one of the other two states. . The USB callback should not restart the USB transfer in case the error cause is USB_ERR_CANCELLED. . The USB callback is protected from recursion. . That means one can start and stop any transfer from the callback of another transfer, including the transfer that is currently being called back. . Recursion is handled as follows: when the callback that wants to recurse returns, it is called one more time. . . .Pp . .Fn usbd_transfer_submit This function should only be called from within the USB callback and is used to start the USB hardware. . An USB transfer can have multiple frames consisting of one or more USB packets making up an I/O vector for all USB transfer types. . .Bd -literal -offset indent void usb_default_callback(struct usb_xfer *xfer, usb_error_t error) { int actlen; usbd_xfer_status(xfer, &actlen, NULL, NULL, NULL); switch (USB_GET_STATE(xfer)) { case USB_ST_SETUP: /* * Setup xfer frame lengths/count and data */ usbd_transfer_submit(xfer); break; case USB_ST_TRANSFERRED: /* * Read usb frame data, if any. * "actlen" has the total length for all frames * transferred. */ break; default: /* Error */ /* * Print error message and clear stall * for example. 
*/ break; } /* * Here it is safe to do something without the private * USB mutex locked. */ return; } .Ed . .Sh USB CONTROL TRANSFERS An USB control transfer has three parts. . First the SETUP packet, then DATA packet(s) and then a STATUS packet. . The SETUP packet is always pointed to by frame 0 and the length is set by .Fn usbd_xfer_frame_len even if no SETUP packet should be sent! If an USB control transfer has no DATA stage, then the number of frames should be set to 1. . Otherwise the default number of frames is 2. . .Bd -literal -offset indent Example1: SETUP + STATUS usbd_xfer_set_frames(xfer, 1); usbd_xfer_set_frame_len(xfer, 0, 8); usbd_transfer_submit(xfer); Example2: SETUP + DATA + STATUS usbd_xfer_set_frames(xfer, 2); usbd_xfer_set_frame_len(xfer, 0, 8); usbd_xfer_set_frame_len(xfer, 1, 1); usbd_transfer_submit(xfer); Example3: SETUP + DATA + STATUS - split 1st callback: usbd_xfer_set_frames(xfer, 1); usbd_xfer_set_frame_len(xfer, 0, 8); usbd_transfer_submit(xfer); 2nd callback: /* IMPORTANT: frbuffers[0] must still point at the setup packet! */ usbd_xfer_set_frames(xfer, 2); usbd_xfer_set_frame_len(xfer, 0, 0); usbd_xfer_set_frame_len(xfer, 1, 1); usbd_transfer_submit(xfer); Example4: SETUP + STATUS - split 1st callback: usbd_xfer_set_frames(xfer, 1); usbd_xfer_set_frame_len(xfer, 0, 8); usbd_xfer_set_flag(xfer, USB_MANUAL_STATUS); usbd_transfer_submit(xfer); 2nd callback: usbd_xfer_set_frames(xfer, 1); usbd_xfer_set_frame_len(xfer, 0, 0); usbd_xfer_clr_flag(xfer, USB_MANUAL_STATUS); usbd_transfer_submit(xfer); .Ed .Sh USB TRANSFER CONFIG To simplify the search for endpoints the .Nm usb module defines a USB config structure where it is possible to specify the characteristics of the wanted endpoint. .Bd -literal -offset indent struct usb_config { bufsize, callback direction, endpoint, frames, index flags, interval, timeout, type, }; .Ed . .Pp .Fa type field selects the USB pipe type. . 
Valid values are: UE_INTERRUPT, UE_CONTROL, UE_BULK, UE_ISOCHRONOUS. . The special value UE_BULK_INTR will select BULK and INTERRUPT pipes. . This field is mandatory. . .Pp .Fa endpoint field selects the USB endpoint number. . A value of 0xFF, "-1" or "UE_ADDR_ANY" will select the first matching endpoint. . This field is mandatory. . .Pp .Fa direction field selects the USB endpoint direction. . A value of "UE_DIR_ANY" will select the first matching endpoint. . Else valid values are: "UE_DIR_IN" and "UE_DIR_OUT". . "UE_DIR_IN" and "UE_DIR_OUT" can be binary OR'ed by "UE_DIR_SID" which means that the direction will be swapped in case of USB_MODE_DEVICE. . Note that "UE_DIR_IN" refers to the data transfer direction of the "IN" tokens and "UE_DIR_OUT" refers to the data transfer direction of the "OUT" tokens. . This field is mandatory. . .Pp .Fa interval field selects the interrupt interval. . The value of this field is given in milliseconds and is independent of device speed. . Depending on the endpoint type, this field has different meaning: .Bl -tag -width "UE_ISOCHRONOUS" .It UE_INTERRUPT "0" use the default interrupt interval based on endpoint descriptor. "Else" use the given value for polling rate. .It UE_ISOCHRONOUS "0" use default. "Else" the value is ignored. .It UE_BULK .It UE_CONTROL "0" no transfer pre-delay. "Else" a delay as given by this field in milliseconds is inserted before the hardware is started when "usbd_transfer_submit()" is called. .Pp NOTE: The transfer timeout, if any, is started after that the pre-delay has elapsed! .El . .Pp .Fa timeout field, if non-zero, will set the transfer timeout in milliseconds. If the "timeout" field is zero and the transfer type is ISOCHRONOUS a timeout of 250ms will be used. . .Pp .Fa frames field sets the maximum number of frames. 
If zero is specified it will yield the following results: .Bl -tag -width "UE_INTERRUPT" .It UE_BULK xfer->nframes = 1; .It UE_INTERRUPT xfer->nframes = 1; .It UE_CONTROL xfer->nframes = 2; .It UE_ISOCHRONOUS Not allowed. Will cause an error. .El . .Pp .Fa ep_index field allows you to give a number, in case more endpoints match the description, that selects which matching "ep_index" should be used. . .Pp .Fa if_index field allows you to select which of the interface numbers in the "ifaces" array parameter passed to "usbd_transfer_setup" that should be used when setting up the given USB transfer. . .Pp .Fa flags field has type "struct usb_xfer_flags" and allows one to set initial flags on an USB transfer. Valid flags are: .Bl -tag -width "force_short_xfer" .It force_short_xfer This flag forces the last transmitted USB packet to be short. A short packet has a length of less than "xfer->max_packet_size", which derives from "wMaxPacketSize". This flag can be changed during operation. .It short_xfer_ok This flag allows the received transfer length, "xfer->actlen" to be less than "xfer->sumlen" upon completion of a transfer. This flag can be changed during operation. .It short_frames_ok This flag allows the reception of multiple short USB frames. This flag only has effect for BULK and INTERRUPT endpoints and if the number of frames received is greater than 1. This flag can be changed during operation. .It pipe_bof This flag causes a failing USB transfer to remain first in the PIPE queue except in the case of "xfer->error" equal to "USB_ERR_CANCELLED". No other USB transfers in the affected PIPE queue will be started until either: .Bl -tag -width "X" .It 1 The failing USB transfer is stopped using "usbd_transfer_stop()". .It 2 The failing USB transfer performs a successful transfer. 
.El The purpose of this flag is to avoid races when multiple transfers are queued for execution on an USB endpoint, and the first executing transfer fails leading to the need for clearing of stall for example. . In this case this flag is used to prevent the following USB transfers from being executed at the same time the clear-stall command is executed on the USB control endpoint. . This flag can be changed during operation. .Pp "BOF" is short for "Block On Failure". .Pp NOTE: This flag should be set on all BULK and INTERRUPT USB transfers which use an endpoint that can be shared between userland and kernel. . . .It proxy_buffer Setting this flag will cause that the total buffer size will be rounded up to the nearest atomic hardware transfer size. . The maximum data length of any USB transfer is always stored in the "xfer->max_data_length". . For control transfers the USB kernel will allocate additional space for the 8-bytes of SETUP header. . These 8-bytes are not counted by the "xfer->max_data_length" variable. . This flag can not be changed during operation. . . .It ext_buffer Setting this flag will cause that no data buffer will be allocated. . Instead the USB client must supply a data buffer. . This flag can not be changed during operation. . . .It manual_status Setting this flag prevents an USB STATUS stage to be appended to the end of the USB control transfer. . If no control data is transferred this flag must be cleared. . Else an error will be returned to the USB callback. . This flag is mostly useful for the USB device side. . This flag can be changed during operation. . . .It no_pipe_ok Setting this flag causes the USB_ERR_NO_PIPE error to be ignored. This flag can not be changed during operation. . . .It stall_pipe .Bl -tag -width "Device Side Mode" .It Device Side Mode Setting this flag will cause STALL pids to be sent to the endpoint belonging to this transfer before the transfer is started. . 
The transfer is started at the moment the host issues a clear-stall command on the STALL'ed endpoint. . This flag can be changed during operation. .It Host Side Mode Setting this flag will cause a clear-stall control request to be executed on the endpoint before the USB transfer is started. .El .Pp If this flag is changed outside the USB callback function you have to use the "usbd_xfer_set_stall()" and "usbd_transfer_clear_stall()" functions! This flag is automatically cleared after that the stall or clear stall has been executed. . .It pre_scale_frames If this flag is set the number of frames specified is assumed to give the buffering time in milliseconds instead of frames. During transfer setup the frames field is pre scaled with the corresponding value for the endpoint and rounded to the nearest number of frames greater than zero. This option only has effect for ISOCHRONOUS transfers. .El .Pp .Fa bufsize field sets the total buffer size in bytes. . If this field is zero, "wMaxPacketSize" will be used, multiplied by the "frames" field if the transfer type is ISOCHRONOUS. . This is useful for setting up interrupt pipes. . This field is mandatory. .Pp NOTE: For control transfers "bufsize" includes the length of the request structure. . .Pp .Fa callback pointer sets the USB callback. This field is mandatory. . . .Sh USB LINUX COMPAT LAYER The .Nm usb module supports the Linux USB API. . . .Sh SEE ALSO .Xr libusb 3 , .Xr usb 4 , .Xr usbconfig 8 .Sh STANDARDS The .Nm usb module complies with the USB 2.0 standard. .Sh HISTORY The .Nm usb module has been inspired by the NetBSD USB stack initially written by Lennart Augustsson. The .Nm usb module was written by .An Hans Petter Selasky Aq Mt hselasky@FreeBSD.org . 
Index: head/share/man/man9/vm_page_busy.9 =================================================================== --- head/share/man/man9/vm_page_busy.9 (revision 275992) +++ head/share/man/man9/vm_page_busy.9 (revision 275993) @@ -1,216 +1,216 @@ .\" .\" Copyright (c) 2013 EMC Corp. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR .\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER .\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH .\" DAMAGE. 
.\" .\" $FreeBSD$ .Dd August 07, 2013 .Dt VM_PAGE_BUSY 9 .Os .Sh NAME .Nm vm_page_busied , .Nm vm_page_busy_downgrade , .Nm vm_page_busy_sleep , .Nm vm_page_sbusied , .Nm vm_page_sbusy , .Nm vm_page_sleep_if_busy , .Nm vm_page_sunbusy , .Nm vm_page_trysbusy , .Nm vm_page_tryxbusy , .Nm vm_page_xbusied , .Nm vm_page_xbusy , .Nm vm_page_xunbusy , .Nm vm_page_assert_sbusied , .Nm vm_page_assert_unbusied , .Nm vm_page_assert_xbusied .Nd protect page identity changes and page content references .Sh SYNOPSIS .In sys/param.h .In vm/vm.h .In vm/vm_page.h .Ft int .Fn vm_page_busied "vm_page_t m" .Ft void .Fn vm_page_busy_downgrade "vm_page_t m" .Ft void .Fn vm_page_busy_sleep "vm_page_t m" "const char *msg" .Ft int .Fn vm_page_sbusied "vm_page_t m" .Ft void .Fn vm_page_sbusy "vm_page_t m" .Ft int .Fn vm_page_sleep_if_busy "vm_page_t m" "const char *msg" .Ft void .Fn vm_page_sunbusy "vm_page_t m" .Ft int .Fn vm_page_trysbusy "vm_page_t m" .Ft int .Fn vm_page_tryxbusy "vm_page_t m" .Ft int .Fn vm_page_xbusied "vm_page_t m" .Ft void .Fn vm_page_xbusy "vm_page_t m" .Ft void .Fn vm_page_xunbusy "vm_page_t m" .Pp .Cd "options INVARIANTS" .Cd "options INVARIANT_SUPPORT" .Ft void .Fn vm_page_assert_sbusied "vm_page_t m" .Ft void .Fn vm_page_assert_unbusied "vm_page_t m" .Ft void .Fn vm_page_assert_xbusied "vm_page_t m" .Sh DESCRIPTION Page identity is usually protected by higher level locks like vm_object locks and vm page locks. However, sometimes it is not possible to hold such locks for the time necessary to complete the identity change. In such case the page can be exclusively busied by a thread which needs to own the identity for a certain amount of time. .Pp In other situations, threads do not need to change the identity of the page but they want to prevent other threads from changing the identity themselves. For example, when a thread wants to access or update page contents without a lock held the page is shared busied. 
.Pp Before busying a page the vm_object lock must be held. The same rule applies when a page is unbusied. This makes the vm_object lock a real busy interlock. .Pp The .Fn vm_page_busied function returns non-zero if the current thread busied .Fa m in either exclusive or shared mode. Returns zero otherwise. .Pp The .Fn vm_page_busy_downgrade function must be used to downgrade .Fa m from an exclusive busy state to a shared busy state. .Pp The .Fn vm_page_busy_sleep function puts the invoking thread to sleep using the appropriate waitchannels for the busy mechanism. The parameter .Fa msg is a string describing the sleep condition for userland tools. .Pp The .Fn vm_page_sbusied function returns non-zero if the current thread busied .Fa m in shared mode. Returns zero otherwise. .Pp The .Fn vm_page_sbusy function shared busies .Fa m . .Pp The .Fn vm_page_sleep_if_busy function puts the invoking thread to sleep, using the appropriate waitchannels for the busy mechanism, if .Fa m is busied in either exclusive or shared mode. If the invoking thread slept a non-zero value is returned, otherwise 0 is returned. The parameter .Fa msg is a string describing the sleep condition for userland tools. .Pp The .Fn vm_page_sunbusy function shared unbusies .Fa m . .Pp The .Fn vm_page_trysbusy attempts to shared busy .Fa m . If the operation cannot immediately succeed .Fn vm_page_trysbusy returns 0, otherwise a non-zero value is returned. .Pp The .Fn vm_page_tryxbusy attempts to exclusive busy .Fa m . If the operation cannot immediately succeed .Fn vm_page_tryxbusy returns 0, otherwise a non-zero value is returned. .Pp The .Fn vm_page_xbusied function returns non-zero if the current thread busied .Fa m in exclusive mode. Returns zero otherwise. .Pp The .Fn vm_page_xbusy function exclusive busies .Fa m . .Pp The .Fn vm_page_xunbusy function exclusive unbusies .Fa m . 
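.Pp
The try, unbusy and downgrade semantics described above can be sketched with a small single-threaded userland model.
This is only an assumption-laden illustration: the
model_*
names, the
struct model_page
type and the encoding of the busy word are hypothetical and do not reflect how the kernel stores the busy state, and the model ignores the vm_object interlock and sleeping entirely.

```c
#include <assert.h>

/*
 * Hypothetical busy word: 0 means unbusied, a positive value counts
 * shared holders, and -1 means exclusively busied.
 */
struct model_page {
	int	busy;
};

/* Returns 0 if the page cannot be shared busied right now. */
static int
model_trysbusy(struct model_page *m)
{
	if (m->busy < 0)
		return (0);
	m->busy++;
	return (1);
}

/* Returns 0 if the page is busied in any mode. */
static int
model_tryxbusy(struct model_page *m)
{
	if (m->busy != 0)
		return (0);
	m->busy = -1;
	return (1);
}

static void
model_sunbusy(struct model_page *m)
{
	assert(m->busy > 0);
	m->busy--;
}

static void
model_xunbusy(struct model_page *m)
{
	assert(m->busy == -1);
	m->busy = 0;
}

/* Exclusive holder becomes the single shared holder. */
static void
model_busy_downgrade(struct model_page *m)
{
	assert(m->busy == -1);
	m->busy = 1;
}
```

After a downgrade in this model, further shared attempts succeed while exclusive attempts fail until every shared holder has unbusied the page.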
Assertions on the busy state allow kernels compiled with .Cd "options INVARIANTS" and .Cd "options INVARIANT_SUPPORT" to panic if they are not respected. .Pp The .Fn vm_page_assert_sbusied function panics if .Fa m is not shared busied. .Pp The .Fn vm_page_assert_unbusied function panics if .Fa m is not unbusied. .Pp The .Fn vm_page_assert_xbusied function panics if .Fa m is not exclusive busied. .Sh SEE ALSO -.Xr VOP_GETPAGES 9 , .Xr vm_page_aflag 9 , .Xr vm_page_alloc 9 , .Xr vm_page_deactivate 9 , .Xr vm_page_free 9 , .Xr vm_page_grab 9 , .Xr vm_page_insert 9 , .Xr vm_page_lookup 9 , -.Xr vm_page_rename 9 +.Xr vm_page_rename 9 , +.Xr VOP_GETPAGES 9 Index: head/share/man/man9/vnet.9 =================================================================== --- head/share/man/man9/vnet.9 (revision 275992) +++ head/share/man/man9/vnet.9 (revision 275993) @@ -1,502 +1,497 @@ .\"- .\" Copyright (c) 2010 The FreeBSD Foundation .\" All rights reserved. .\" .\" This documentation was written by CK Software GmbH under sponsorship from .\" the FreeBSD Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd November 20, 2014 .Dt VNET 9 .Os .Sh NAME .Nm VNET .Nd "network subsystem virtualization infrastructure" .Sh SYNOPSIS .Cd "options VIMAGE" .Cd "options VNET_DEBUG" .Pp .In sys/vnet.h .Pp .\"------------------------------------------------------------ .Ss "Constants and Global Variables" .\" .Dv VNET_SETNAME .\" "set_vnet" .Dv VNET_SYMPREFIX .\" "vnet_entry_" .Vt extern struct vnet *vnet0; .\"------------------------------------------------------------ .Ss "Variable Declaration" .Fo VNET .Fa "name" .Fc .\" .Fo VNET_NAME .Fa "name" .Fc .\" .Fo VNET_DECLARE .Fa "type" "name" .Fc .\" .Fo VNET_DEFINE .Fa "type" "name" .Fc .\" .Bd -literal #define V_name VNET(name) .Ed .\" ------------------------------------------------------------ .Ss "Virtual Instance Selection" .\" .Fo CRED_TO_VNET .Fa "struct ucred *" .Fc .\" .Fo TD_TO_VNET .Fa "struct thread *" .Fc .\" .Fo P_TO_VNET .Fa "struct proc *" .Fc .\" .Fo IS_DEFAULT_VNET .Fa "struct vnet *" .Fc .\" .Fo VNET_ASSERT .Fa exp msg .Fc .\" .Fo CURVNET_SET .Fa "struct vnet *" .Fc .\" .Fo CURVNET_SET_QUIET .Fa "struct vnet *" .Fc .\" -.Fo CURVNET_RESTORE -.Fc +.Fn CURVNET_RESTORE .\" .Fo VNET_ITERATOR_DECL .Fa "struct vnet *" .Fc .\" .Fo VNET_FOREACH .Fa "struct vnet *" .Fc .\" ------------------------------------------------------------ .Ss "Locking" .\" -.Fo VNET_LIST_RLOCK -.Fc -.Fo VNET_LIST_RUNLOCK -.Fc -.Fo VNET_LIST_RLOCK_NOSLEEP -.Fc -.Fo VNET_LIST_RUNLOCK_NOSLEEP -.Fc +.Fn 
VNET_LIST_RLOCK +.Fn VNET_LIST_RUNLOCK +.Fn VNET_LIST_RLOCK_NOSLEEP +.Fn VNET_LIST_RUNLOCK_NOSLEEP .\" ------------------------------------------------------------ .Ss "Startup and Teardown Functions" .\" .Ft "struct vnet *" .Fo vnet_alloc .Fa void .Fc .\" .Ft void .Fo vnet_destroy .Fa "struct vnet *" .Fc .\" .Fo VNET_SYSINIT .Fa ident .Fa "enum sysinit_sub_id subsystem" .Fa "enum sysinit_elem_order order" .Fa "sysinit_cfunc_t func" .Fa "const void *arg" .Fc .\" .Fo VNET_SYSUNINIT .Fa ident .Fa "enum sysinit_sub_id subsystem" .Fa "enum sysinit_elem_order order" .Fa "sysinit_cfunc_t func" .Fa "const void *arg" .Fc .\" ------------------------------------------------------------ .Ss "Eventhandlers" .\" .Fo VNET_GLOBAL_EVENTHANDLER_REGISTER .Fa "const char *name" .Fa "void *func" .Fa "void *arg" .Fa "int priority" .Fc .\" .Fo VNET_GLOBAL_EVENTHANDLER_REGISTER_TAG .Fa "eventhandler_tag tag" .Fa "const char *name" .Fa "void *func" .Fa "void *arg" .Fa "int priority" .Fc .\" ------------------------------------------------------------ .Ss "Sysctl Handling" .Fo SYSCTL_VNET_INT .Fa parent nbr name access ptr val descr .Fc .Fo SYSCTL_VNET_PROC .Fa parent nbr name access ptr arg handler fmt descr .Fc .Fo SYSCTL_VNET_STRING .Fa parent nbr name access arg len descr .Fc .Fo SYSCTL_VNET_STRUCT .Fa parent nbr name access ptr type descr .Fc .Fo SYSCTL_VNET_UINT .Fa parent nbr name access ptr val descr .Fc .Fo VNET_SYSCTL_ARG .Fa req arg1 .Fc .\" ------------------------------------------------------------ .Sh DESCRIPTION .Nm is the name of a technique to virtualize the network stack. The basic idea is to change global resources most notably variables into per network stack resources and have functions, sysctls, eventhandlers, etc. access and handle them in the context of the correct instance. Each (virtual) network stack is attached to a .Em prison , with .Vt vnet0 being the unrestricted default network stack of the base system. 
.Pp The global defines for .Dv VNET_SETNAME and .Dv VNET_SYMPREFIX are shared with .Xr kvm 3 to access internals for debugging reasons. .\" ------------------------------------------------------------ .Ss "Variable Declaration" .\" Variables are virtualized by using the .Fn VNET_DEFINE macro rather than writing them out as .Em type name . One can still use static initialization or storage class specifiers, e.g., .Pp .Dl Li static VNET_DEFINE(int, foo) = 1; or .Dl Li static VNET_DEFINE(SLIST_HEAD(, bar), bars); .Pp Static initialization is not possible when the virtualized variable would need to be referenced, e.g., with .Dq TAILQ_HEAD_INITIALIZER() . In that case a .Fn VNET_SYSINIT based initialization function must be used. .Pp External variables have to be declared using the .Fn VNET_DECLARE macro. In either case the convention is to define another macro, that is then used throughout the implementation to access that variable. The variable name is usually prefixed by .Em V_ to express that it is virtualized. The .Fn VNET macro will then translate accesses to that variable to the copy of the currently selected instance (see the .Sx "Virtual instance selection" section): .Pp .Dl Li #define V_name VNET(name) .Pp .Em NOTE: Do not confuse this with the convention used by .Xr VFS 9 . .Pp The .Fn VNET_NAME macro returns the offset within the memory region of the virtual network stack instance. It is usually only used with .Fn SYSCTL_VNET_* macros. 
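.Pp
The
.Em V_
naming convention can be illustrated with a small userland sketch.
It is an assumption-based model only:
struct model_vnet ,
model_curvnet ,
MODEL_VNET
and
bump_foo
are hypothetical names, and the kernel's per-instance storage layout and instance selection macros work differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance storage holding one virtualized variable. */
struct model_vnet {
	int	foo;
};

/* Stand-in for the currently selected instance. */
static struct model_vnet *model_curvnet;

/* Resolve a name to the copy owned by the current instance. */
#define	MODEL_VNET(name)	(model_curvnet->name)
#define	V_foo			MODEL_VNET(foo)

/*
 * Select an instance, touch its copy of the variable through the
 * V_ accessor, then restore the previously selected instance.
 */
static int
bump_foo(struct model_vnet *vnet)
{
	struct model_vnet *saved = model_curvnet;
	int result;

	assert(vnet != NULL);
	model_curvnet = vnet;
	result = ++V_foo;
	model_curvnet = saved;
	return (result);
}
```

Code written against
V_foo
never names an instance explicitly, so the same statement updates a different copy depending on which instance is selected at the time.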
.\" ------------------------------------------------------------ .Ss "Virtual Instance Selection" .\" There are three different places where the current virtual network stack pointer is stored and can be taken from: .Bl -enum -offset indent .It a .Em prison : .Dl "(struct prison *)->pr_vnet" .Pp For convenience the following macros are provided: .Bd -literal -compact -offset indent .Fn CRED_TO_VNET "struct ucred *" .Fn TD_TO_VNET "struct thread *" .Fn P_TO_VNET "struct proc *" .Ed .It a .Em socket : .Dl "(struct socket *)->so_vnet" .It an .Em interface : .Dl "(struct ifnet *)->if_vnet" .El .Pp .\" In addition the currently active instance is cached in .Dq "curthread->td_vnet" which is usually only accessed through the .Dv curvnet macro. .Pp .\" To set the correct context of the current virtual network instance, use the .Fn CURVNET_SET or .Fn CURVNET_SET_QUIET macros. The .Fn CURVNET_SET_QUIET version will not record vnet recursions in case the kernel was compiled with .Cd "options VNET_DEBUG" and should thus only be used in well known cases, where recursion is unavoidable. Both macros will save the previous state on the stack and it must be restored with the .Fn CURVNET_RESTORE macro. .Pp .Em NOTE: As the previous state is saved on the stack, you cannot have multiple .Fn CURVNET_SET calls in the same block. .Pp .Em NOTE: As the previous state is saved on the stack, a .Fn CURVNET_RESTORE call has to be in the same block as the .Fn CURVNET_SET call or in a subblock with the same idea of the saved instances as the outer block. .Pp .Em NOTE: As each macro is a set of operations and, as previously explained, cannot be put into its own block when defined, one cannot conditionally set the current vnet context. 
The following will .Em not work: .Bd -literal -offset indent if (condition) CURVNET_SET(vnet); .Ed .Pp nor would this work: .Bd -literal -offset indent if (condition) { CURVNET_SET(vnet); } CURVNET_RESTORE(); .Ed .Pp .\" Sometimes one needs to loop over all virtual instances, for example to update virtual from global state, to run a function from a .Xr callout 9 for each instance, etc. For those cases the .Fn VNET_ITERATOR_DECL and .Fn VNET_FOREACH macros are provided. The former macro defines the variable that iterates over the loop, and the latter loops over all of the virtual network stack instances. See .Sx "Locking" for how to safely traverse the list of all virtual instances. .Pp .\" The .Fn IS_DEFAULT_VNET macro provides a safe way to check whether the currently active instance is the unrestricted default network stack of the base system .Pq Vt vnet0 . .Pp .\" The .Fn VNET_ASSERT macro provides a way to conditionally add assertions that are only active with .Cd "options VIMAGE" compiled in and either .Cd "options VNET_DEBUG" or .Cd "options INVARIANTS" enabled as well. It uses the same semantics as .Xr KASSERT 9 . .\" ------------------------------------------------------------ .Ss "Locking" .\" For public access to the list of virtual network stack instances e.g., by the .Fn VNET_FOREACH macro, read locks are provided. Macros are used to abstract from the actual type of the locks. If a caller may sleep while traversing the list, it must use the .Fn VNET_LIST_RLOCK and .Fn VNET_LIST_RUNLOCK macros. Otherwise, the caller can use .Fn VNET_LIST_RLOCK_NOSLEEP and .Fn VNET_LIST_RUNLOCK_NOSLEEP . .\" ------------------------------------------------------------ .Ss "Startup and Teardown Functions" .\" To start or tear down a virtual network stack instance the internal functions .Fn vnet_alloc and .Fn vnet_destroy are provided and called from the jail framework. They run the publicly provided methods to handle network stack startup and teardown.
.Pp For public control, the system startup interface has been enhanced to not only handle a system boot but to also handle a virtual network stack startup and teardown. To the base system the .Fn VNET_SYSINIT and .Fn VNET_SYSUNINIT macros look exactly as if there were no virtual network stack. In fact, if .Cd "options VIMAGE" is not compiled in they are compiled to the standard .Fn SYSINIT macros. In addition to that they are run for each virtual network stack when starting or, in reverse order, when shutting down. .\" ------------------------------------------------------------ .Ss "Eventhandlers" .\" Eventhandlers can be handled in two ways: .Pp .Bl -enum -offset indent -compact .It save the .Em tags returned in each virtual instance and properly free the eventhandlers on teardown using those, or .It use one eventhandler that will iterate over all virtual network stack instances. .El .Pp For the first case one can just use the normal .Xr EVENTHANDLER 9 functions, while for the second case the .Fn VNET_GLOBAL_EVENTHANDLER_REGISTER and .Fn VNET_GLOBAL_EVENTHANDLER_REGISTER_TAG macros are provided. These differ in that .Fn VNET_GLOBAL_EVENTHANDLER_REGISTER_TAG takes an extra first argument that will carry the .Fa "tag" upon return. Eventhandlers registered with either of these will not run .Fa func directly but .Fa func will be called from an internal iterator function for each vnet. Both macros can only be used for eventhandlers that do not take additional arguments, as the variadic arguments from an .Xr EVENTHANDLER_INVOKE 9 call will be ignored. .\" ------------------------------------------------------------ .Ss "Sysctl Handling" .\" A .Xr sysctl 9 can be virtualized by using one of the .Fn SYSCTL_VNET_* macros. 
.Pp They take the same arguments as the standard .Xr sysctl 9 functions, with the only difference, that the .Fa ptr argument has to be passed as .Ql &VNET_NAME(foo) instead of .Ql &foo so that the variable can be selected from the correct memory region of the virtual network stack instance of the caller. .Pp For the very rare case a sysctl handler function would want to handle .Fa arg1 itself the .Fn VNET_SYSCTL_ARG req arg1 is provided that will translate the .Fa arg1 argument to the correct memory address in the virtual network stack context of the caller. .\" ------------------------------------------------------------ .Sh SEE ALSO .Xr jail 2 , .Xr kvm 3 , .Xr EVENTHANDLER 9 , .\" .Xr pcpu 9 , .Xr KASSERT 9 , .Xr sysctl 9 .\" .Xr SYSINIT 9 .Sh HISTORY The virtual network stack implementation first appeared in .Fx 8.0 . .Sh AUTHORS This manual page was written by .An Bjoern A. Zeeb, CK Software GmbH, under sponsorship from the FreeBSD Foundation. Index: head/share/man/man9/vnode.9 =================================================================== --- head/share/man/man9/vnode.9 (revision 275992) +++ head/share/man/man9/vnode.9 (revision 275993) @@ -1,197 +1,197 @@ .\" Copyright (c) 1996 Doug Rabson .\" .\" All rights reserved. .\" .\" This program is free software. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE DEVELOPERS ``AS IS'' AND ANY EXPRESS OR .\" IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES .\" OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. .\" IN NO EVENT SHALL THE DEVELOPERS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT .\" NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, .\" DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY .\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT .\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF .\" THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd February 12, 2014 .Dt VNODE 9 .Os .Sh NAME .Nm vnode .Nd internal representation of a file or directory .Sh SYNOPSIS .In sys/param.h .In sys/vnode.h .Sh DESCRIPTION The vnode is the focus of all file activity in .Ux . A vnode is described by .Vt "struct vnode" . There is a unique vnode allocated for each active file, each current directory, each mounted-on file, text file, and the root. .Pp Each vnode has three reference counts, .Va v_usecount , .Va v_holdcnt and .Va v_writecount . The first is the number of clients within the kernel which are using this vnode. This count is maintained by .Xr vref 9 , .Xr vrele 9 and .Xr vput 9 . The second is the number of clients within the kernel who veto the recycling of this vnode. This count is maintained by .Xr vhold 9 and .Xr vdrop 9 . When both the .Va v_usecount and the .Va v_holdcnt of a vnode reaches zero then the vnode will be put on the freelist and may be reused for another file, possibly in another file system. The transition from the freelist is handled by .Xr getnewvnode 9 . The third is a count of the number of clients which are writing into the file. It is maintained by the .Xr open 2 and .Xr close 2 system calls. 
.Pp Any call which returns a vnode (e.g.,\& .Xr vget 9 , .Xr VOP_LOOKUP 9 , etc.) will increase the .Va v_usecount of the vnode by one. When the caller is finished with the vnode, it should release this reference by calling .Xr vrele 9 (or .Xr vput 9 if the vnode is locked). .Pp Other commonly used members of the vnode structure are .Va v_id which is used to maintain consistency in the name cache, .Va v_mount which points at the file system which owns the vnode, .Va v_type which contains the type of object the vnode represents and .Va v_data which is used by file systems to store file system specific data with the vnode. The .Va v_op field is used by the .Dv VOP_* macros to call functions in the file system which implement the vnode's functionality. .Sh VNODE TYPES .Bl -tag -width VSOCK .It Dv VNON No type. .It Dv VREG A regular file; may be with or without VM object backing. If you want to make sure this gets a backing object, call .Fn vnode_create_vobject . .It Dv VDIR A directory. .It Dv VBLK A block device; may be with or without VM object backing. If you want to make sure this gets a backing object, call .Fn vnode_create_vobject . .It Dv VCHR A character device. .It Dv VLNK A symbolic link. .It Dv VSOCK A socket. Advisory locking will not work on this. .It Dv VFIFO A FIFO (named pipe). Advisory locking will not work on this. .It Dv VBAD Indicates that the vnode has been reclaimed. .El .Sh IMPLEMENTATION NOTES VFIFO uses the "struct fileops" from .Pa /sys/kern/sys_pipe.c . VSOCK uses the "struct fileops" from .Pa /sys/kern/sys_socket.c . Everything else uses the one from .Pa /sys/kern/vfs_vnops.c . .Pp The VFIFO/VSOCK code, which is why "struct fileops" is used at all, is an artifact of an incomplete integration of the VFS code into the kernel. .Pp Calls to .Xr malloc 9 or .Xr free 9 when holding a .Nm interlock, will cause a LOR (Lock Order Reversal) due to the intertwining of VM Objects and Vnodes.
.Sh SEE ALSO .Xr malloc 9 , +.Xr VFS 9 , .Xr VOP_ACCESS 9 , .Xr VOP_ACLCHECK 9 , .Xr VOP_ADVISE 9 , .Xr VOP_ADVLOCK 9 , .Xr VOP_ALLOCATE 9 , .Xr VOP_ATTRIB 9 , .Xr VOP_BWRITE 9 , .Xr VOP_CREATE 9 , .Xr VOP_FSYNC 9 , .Xr VOP_GETACL 9 , .Xr VOP_GETEXTATTR 9 , .Xr VOP_GETPAGES 9 , .Xr VOP_INACTIVE 9 , .Xr VOP_IOCTL 9 , .Xr VOP_LINK 9 , .Xr VOP_LISTEXTATTR 9 , .Xr VOP_LOCK 9 , .Xr VOP_LOOKUP 9 , .Xr VOP_OPENCLOSE 9 , .Xr VOP_PATHCONF 9 , .Xr VOP_PRINT 9 , .Xr VOP_RDWR 9 , .Xr VOP_READDIR 9 , .Xr VOP_READLINK 9 , .Xr VOP_REALLOCBLKS 9 , .Xr VOP_REMOVE 9 , .Xr VOP_RENAME 9 , .Xr VOP_REVOKE 9 , .Xr VOP_SETACL 9 , .Xr VOP_SETEXTATTR 9 , .Xr VOP_STRATEGY 9 , .Xr VOP_VPTOCNP 9 , -.Xr VOP_VPTOFH 9 , -.Xr VFS 9 +.Xr VOP_VPTOFH 9 .Sh AUTHORS This manual page was written by .An Doug Rabson . Index: head/share/man/man9/zone.9 =================================================================== --- head/share/man/man9/zone.9 (revision 275992) +++ head/share/man/man9/zone.9 (revision 275993) @@ -1,375 +1,375 @@ .\"- .\" Copyright (c) 2001 Dag-Erling Coïdan Smørgrav .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd February 7, 2014 .Dt ZONE 9 .Os .Sh NAME .Nm uma_zcreate , .Nm uma_zalloc , .Nm uma_zalloc_arg , .Nm uma_zfree , .Nm uma_zfree_arg , .Nm uma_find_refcnt , .Nm uma_zdestroy , .Nm uma_zone_set_max, .Nm uma_zone_get_max, .Nm uma_zone_get_cur, .Nm uma_zone_set_warning .Nd zone allocator .Sh SYNOPSIS .In sys/param.h .In sys/queue.h .In vm/uma.h .Ft uma_zone_t .Fo uma_zcreate .Fa "char *name" "int size" .Fa "uma_ctor ctor" "uma_dtor dtor" "uma_init uminit" "uma_fini fini" .Fa "int align" "uint16_t flags" .Fc .Ft "void *" .Fn uma_zalloc "uma_zone_t zone" "int flags" .Ft "void *" .Fn uma_zalloc_arg "uma_zone_t zone" "void *arg" "int flags" .Ft void .Fn uma_zfree "uma_zone_t zone" "void *item" .Ft void .Fn uma_zfree_arg "uma_zone_t zone" "void *item" "void *arg" .Ft "uint32_t *" .Fn uma_find_refcnt "uma_zone_t zone" "void *item" .Ft void .Fn uma_zdestroy "uma_zone_t zone" .Ft int .Fn uma_zone_set_max "uma_zone_t zone" "int nitems" .Ft int .Fn uma_zone_get_max "uma_zone_t zone" .Ft int .Fn uma_zone_get_cur "uma_zone_t zone" .Ft void .Fn uma_zone_set_warning "uma_zone_t zone" "const char *warning" .In sys/sysctl.h .Fn SYSCTL_UMA_MAX parent nbr name access zone descr .Fn SYSCTL_ADD_UMA_MAX ctx parent nbr name access zone descr .Fn SYSCTL_UMA_CUR parent nbr name access zone descr .Fn SYSCTL_ADD_UMA_CUR ctx parent nbr name access zone descr .Sh DESCRIPTION The zone allocator provides an efficient interface for managing 
dynamically-sized collections of items of similar size. The zone allocator can work with preallocated zones as well as with runtime-allocated ones, and is therefore available much earlier in the boot process than other memory management routines. .Pp A zone is an extensible collection of items of identical size. The zone allocator keeps track of which items are in use and which are not, and provides functions for allocating items from the zone and for releasing them back (which makes them available for later use). .Pp After the first allocation of an item, it will have been cleared to zeroes; however, subsequent allocations will retain the contents as of the last free. .Pp The .Fn uma_zcreate function creates a new zone from which items may then be allocated. The .Fa name argument is a text name of the zone for debugging and stats; this memory should not be freed until the zone has been deallocated. .Pp The .Fa ctor and .Fa dtor arguments are callback functions that are called by the uma subsystem at the time of the call to .Fn uma_zalloc and .Fn uma_zfree respectively. Their purpose is to provide hooks for initializing or destroying things that need to be done at the time of the allocation or release of a resource. A good usage for the .Fa ctor and .Fa dtor callbacks might be to adjust a global count of the number of objects allocated. .Pp The .Fa uminit and .Fa fini arguments are used to optimize the allocation of objects from the zone. They are called by the uma subsystem whenever it needs to allocate or free several items to satisfy requests or memory pressure. A good use for the .Fa uminit and .Fa fini callbacks might be to initialize and destroy mutexes contained within the object. This would allow one to re-use already initialized mutexes when an object is returned from the uma subsystem's object cache. They are not called on each call to .Fn uma_zalloc and .Fn uma_zfree but rather in a batch mode on several objects.
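The difference between the per-call hooks and the batch hooks described above can be sketched with a small user-space model. This is not the uma implementation: all model_ names are hypothetical, the zone has a fixed capacity, and the hooks are reduced to counters. It shows only the calling pattern, where the init step runs once per item when backing storage is set up, while the ctor and dtor steps run on every allocation and release.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical user-space model, not the uma implementation: all
 * model_ names are invented for this illustration.
 */
#define	MODEL_NITEMS	4

struct model_item {
	int in_use;
	int lock_ready;		/* stands in for an embedded mutex */
};

struct model_zone {
	struct model_item items[MODEL_NITEMS];
	int populated;		/* backing storage has been set up */
	int init_calls;		/* times the init hook ran */
	int ctor_calls;		/* times the ctor hook ran */
	int dtor_calls;		/* times the dtor hook ran */
};

/* Batch step: runs the init hook once per item when storage is set up. */
static void
model_zone_populate(struct model_zone *z)
{
	for (size_t i = 0; i < MODEL_NITEMS; i++) {
		z->items[i].in_use = 0;
		z->items[i].lock_ready = 1;	/* init hook */
		z->init_calls++;
	}
	z->populated = 1;
}

/* Per-allocation step: runs the ctor hook on every call. */
static struct model_item *
model_zalloc(struct model_zone *z)
{
	if (!z->populated)
		model_zone_populate(z);
	for (size_t i = 0; i < MODEL_NITEMS; i++) {
		if (!z->items[i].in_use) {
			z->items[i].in_use = 1;
			z->ctor_calls++;	/* ctor hook */
			return (&z->items[i]);
		}
	}
	return (NULL);		/* zone is full */
}

/* Per-release step: runs the dtor hook; the item keeps its init state. */
static void
model_zfree(struct model_zone *z, struct model_item *item)
{
	if (item == NULL)
		return;
	z->dtor_calls++;	/* dtor hook */
	item->in_use = 0;
}
```

An item that is freed and allocated again still has lock_ready set without a second init call, which mirrors the re-use of already initialized mutexes mentioned above.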
.Pp The .Fa flags argument of the .Fn uma_zcreate is a subset of the following flags: .Bl -tag -width "foo" .It Dv UMA_ZONE_NOFREE Slabs of the zone are never returned back to VM. .It Dv UMA_ZONE_REFCNT Each item in the zone has an internal reference counter associated with it. See .Fn uma_find_refcnt . .It Dv UMA_ZONE_NODUMP Pages belonging to the zone will not be included into mini-dumps. .It Dv UMA_ZONE_PCPU An allocation from the zone would have .Va mp_ncpu shadow copies, that are privately assigned to CPUs. A CPU can address its private copy using the base allocation address plus a multiple of the current CPU id and .Fn sizeof "struct pcpu" : .Bd -literal -offset indent foo_zone = uma_zcreate(..., UMA_ZONE_PCPU); ... foo_base = uma_zalloc(foo_zone, ...); ... critical_enter(); foo_pcpu = (foo_t *)zpcpu_get(foo_base); /* do something with foo_pcpu */ critical_exit(); .Ed .It Dv UMA_ZONE_OFFPAGE By default book-keeping of items within a slab is done in the slab page itself. This flag explicitly tells the subsystem that the book-keeping structure should be allocated separately from a special internal zone. This flag requires either .Dv UMA_ZONE_VTOSLAB or .Dv UMA_ZONE_HASH , since the subsystem requires a mechanism to find the book-keeping structure for an item being freed. The subsystem may choose to prefer offpage book-keeping for certain zones implicitly. .It Dv UMA_ZONE_ZINIT The zone will have its .Ft uma_init method set to an internal method that initializes a newly allocated slab to all zeros. Do not mistake the .Ft uma_init method with .Ft uma_ctor . A zone with the .Dv UMA_ZONE_ZINIT flag would not return zeroed memory on every .Fn uma_zalloc . .It Dv UMA_ZONE_HASH The zone should use an internal hash table to find the slab book-keeping structure that an allocation being freed belongs to. .It Dv UMA_ZONE_VTOSLAB The zone should use a special field of .Vt vm_page_t to find the slab book-keeping structure that an allocation being freed belongs to.
.It Dv UMA_ZONE_MALLOC The zone is for the .Xr malloc 9 subsystem. .It Dv UMA_ZONE_VM The zone is for the VM subsystem. .El .Pp To allocate an item from a zone, simply call .Fn uma_zalloc with a pointer to that zone and set the .Fa flags argument to selected flags as documented in .Xr malloc 9 . It will return a pointer to an item if successful, or .Dv NULL in the rare case where all items in the zone are in use and the allocator is unable to grow the zone and .Dv M_NOWAIT is specified. .Pp Items are released back to the zone from which they were allocated by calling .Fn uma_zfree with a pointer to the zone and a pointer to the item. If .Fa item is .Dv NULL , then .Fn uma_zfree does nothing. .Pp The variations .Fn uma_zalloc_arg and .Fn uma_zfree_arg allow an argument to be specified for the .Dv ctor and .Dv dtor functions, respectively. .Pp If the zone was created with the .Dv UMA_ZONE_REFCNT flag, then a pointer to the reference counter for an item can be retrieved with the help of the .Fn uma_find_refcnt function. .Pp Created zones, which are empty, can be destroyed using .Fn uma_zdestroy , freeing all memory that was allocated for the zone. All items allocated from the zone with .Fn uma_zalloc must have been freed with .Fn uma_zfree before. .Pp The .Fn uma_zone_set_max function limits the number of items .Pq and therefore memory that can be allocated to .Fa zone . The .Fa nitems argument specifies the requested upper limit number of items. The effective limit is returned to the caller, as it may end up being higher than requested due to the implementation rounding up to ensure all memory pages allocated to the zone are utilised to capacity. The limit applies to the total number of items in the zone, which includes allocated items, free items and free items in the per-cpu caches.
On systems with more than one CPU it may not be possible to allocate the specified number of items even when there is no shortage of memory, because all of the remaining free items may be in the caches of the other CPUs when the limit is hit. .Pp The .Fn uma_zone_get_max function returns the effective upper limit number of items for a zone. .Pp The .Fn uma_zone_get_cur function returns the approximate current occupancy of the zone. The returned value is approximate because appropriate synchronisation to determine an exact value is not performed by the implementation. This ensures low overhead at the expense of potentially stale data being used in the calculation. .Pp The .Fn uma_zone_set_warning function sets a warning that will be printed on the system console when the given zone becomes full and fails to allocate an item. The warning will be printed no more often than every five minutes. Warnings can be turned off globally by setting the .Va vm.zone_warnings sysctl tunable to .Va 0 . .Pp The .Fn SYSCTL_UMA_MAX parent nbr name access zone descr macro declares a static .Xr sysctl oid that exports the effective upper limit number of items for a zone. The .Fa zone argument should be a pointer to .Vt uma_zone_t . A read of the oid returns the value obtained through .Fn uma_zone_get_max . A write to the oid sets a new value via .Fn uma_zone_set_max . The .Fn SYSCTL_ADD_UMA_MAX ctx parent nbr name access zone descr macro is provided to create this type of oid dynamically. .Pp The .Fn SYSCTL_UMA_CUR parent nbr name access zone descr macro declares a static read only .Xr sysctl oid that exports the approximate current occupancy of the zone. The .Fa zone -argument should be a pointer to +argument should be a pointer to .Vt uma_zone_t . A read of the oid returns the value obtained through .Fn uma_zone_get_cur . The .Fn SYSCTL_ADD_UMA_CUR ctx parent nbr name zone descr macro is provided to create this type of oid dynamically.
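The rounding of the limit by uma_zone_set_max, described earlier in this section, can be illustrated with a short worked calculation. This is a sketch under an assumption the text implies but does not spell out: that the requested count is rounded up to a whole number of slabs. The helper model_effective_max and its ipers parameter (items that fit in one slab) are hypothetical and are not part of the uma interface.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the uma interface.  Assumes the
 * limit is rounded up to a whole number of slabs, where ipers is the
 * number of items that fit in one slab (must be greater than zero).
 */
static int
model_effective_max(int nitems, int ipers)
{
	int slabs;

	assert(ipers > 0);
	slabs = nitems / ipers;
	/* A partially used slab still counts as a whole slab. */
	if (slabs * ipers < nitems)
		slabs++;
	return (slabs * ipers);
}
```

Under this assumption, a request for 100 items with 16 items per slab yields an effective limit of 112, while a request that is already a multiple of the slab capacity is returned unchanged. Callers that care about the real limit should use the value returned by uma_zone_set_max or uma_zone_get_max rather than the value they requested.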
.Sh RETURN VALUES The .Fn uma_zalloc function returns a pointer to an item, or .Dv NULL if the zone ran out of unused items and .Dv M_NOWAIT was specified. .Sh SEE ALSO .Xr malloc 9 .Sh HISTORY The zone allocator first appeared in .Fx 3.0 . It was radically changed in .Fx 5.0 to function as a slab allocator. .Sh AUTHORS .An -nosplit The zone allocator was written by .An John S. Dyson . The zone allocator was rewritten in large parts by .An Jeff Roberson Aq Mt jeff@FreeBSD.org to function as a slab allocator. .Pp This manual page was written by .An Dag-Erling Sm\(/orgrav Aq Mt des@FreeBSD.org . Changes for UMA by .An Jeroen Ruigrok van der Werven Aq Mt asmodai@FreeBSD.org .