Index: head/lib/libc/sys/kqueue.2 =================================================================== --- head/lib/libc/sys/kqueue.2 (revision 327258) +++ head/lib/libc/sys/kqueue.2 (revision 327259) @@ -1,790 +1,790 @@ .\" Copyright (c) 2000 Jonathan Lemon .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $FreeBSD$ .\" .Dd June 22, 2017 .Dt KQUEUE 2 .Os .Sh NAME .Nm kqueue , .Nm kevent .Nd kernel event notification mechanism .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In sys/event.h .Ft int .Fn kqueue "void" .Ft int .Fn kevent "int kq" "const struct kevent *changelist" "int nchanges" "struct kevent *eventlist" "int nevents" "const struct timespec *timeout" .Fn EV_SET "kev" ident filter flags fflags data udata .Sh DESCRIPTION The .Fn kqueue system call provides a generic method of notifying the user when an event happens or a condition holds, based on the results of small pieces of kernel code termed filters. A kevent is identified by the (ident, filter) pair; there may only be one unique kevent per kqueue. .Pp The filter is executed upon the initial registration of a kevent in order to detect whether a preexisting condition is present, and is also executed whenever an event is passed to the filter for evaluation. If the filter determines that the condition should be reported, then the kevent is placed on the kqueue for the user to retrieve. .Pp The filter is also run when the user attempts to retrieve the kevent from the kqueue. If the filter indicates that the condition that triggered the event no longer holds, the kevent is removed from the kqueue and is not returned. .Pp Multiple events which trigger the filter do not result in multiple kevents being placed on the kqueue; instead, the filter will aggregate the events into a single struct kevent. Calling .Fn close on a file descriptor will remove any kevents that reference the descriptor. .Pp The .Fn kqueue system call creates a new kernel event queue and returns a descriptor. The queue is not inherited by a child created with .Xr fork 2 . However, if .Xr rfork 2 is called without the .Dv RFFDG flag, then the descriptor table is shared, which will allow sharing of the kqueue between two processes. .Pp The .Fn kevent system call is used to register events with the queue, and return any pending events to the user. 
The .Fa changelist argument is a pointer to an array of .Va kevent structures, as defined in .In sys/event.h . All changes contained in the .Fa changelist are applied before any pending events are read from the queue. The .Fa nchanges argument gives the size of .Fa changelist . The .Fa eventlist argument is a pointer to an array of kevent structures. The .Fa nevents argument determines the size of .Fa eventlist . When .Fa nevents is zero, .Fn kevent will return immediately even if there is a .Fa timeout specified unlike .Xr select 2 . If .Fa timeout is a non-NULL pointer, it specifies a maximum interval to wait for an event, which will be interpreted as a struct timespec. If .Fa timeout is a NULL pointer, .Fn kevent waits indefinitely. To effect a poll, the .Fa timeout argument should be non-NULL, pointing to a zero-valued .Va timespec structure. The same array may be used for the .Fa changelist and .Fa eventlist . .Pp The .Fn EV_SET macro is provided for ease of initializing a kevent structure. .Pp The .Va kevent structure is defined as: .Bd -literal struct kevent { uintptr_t ident; /* identifier for this event */ short filter; /* filter for event */ u_short flags; /* action flags for kqueue */ u_int fflags; /* filter flag value */ int64_t data; /* filter data value */ void *udata; /* opaque user data identifier */ uint64_t ext[4]; /* extentions */ }; .Ed .Pp The fields of .Fa struct kevent are: .Bl -tag -width "Fa filter" .It Fa ident Value used to identify this event. The exact interpretation is determined by the attached filter, but often is a file descriptor. .It Fa filter Identifies the kernel filter used to process this event. The pre-defined system filters are described below. .It Fa flags Actions to perform on the event. .It Fa fflags Filter-specific flags. .It Fa data Filter-specific data value. .It Fa udata Opaque user-defined value passed through the kernel unchanged. .It Fa ext Extended data passed to and from kernel. 
The .Fa ext[0] and .Fa ext[1] members use is defined by the filter. If the filter does not use them, the members are copied unchanged. The .Fa ext[2] and .Fa ext[3] -members are always passed throught the kernel as-is, +members are always passed through the kernel as-is, making additional context available to application. .El .Pp The .Va flags field can contain the following values: .Bl -tag -width EV_DISPATCH .It Dv EV_ADD Adds the event to the kqueue. Re-adding an existing event will modify the parameters of the original event, and not result in a duplicate entry. Adding an event automatically enables it, unless overridden by the EV_DISABLE flag. .It Dv EV_ENABLE Permit .Fn kevent to return the event if it is triggered. .It Dv EV_DISABLE Disable the event so .Fn kevent will not return it. The filter itself is not disabled. .It Dv EV_DISPATCH Disable the event source immediately after delivery of an event. See .Dv EV_DISABLE above. .It Dv EV_DELETE Removes the event from the kqueue. Events which are attached to file descriptors are automatically deleted on the last close of the descriptor. .It Dv EV_RECEIPT This flag is useful for making bulk changes to a kqueue without draining any pending events. When passed as input, it forces .Dv EV_ERROR to always be returned. When a filter is successfully added the .Va data field will be zero. .It Dv EV_ONESHOT Causes the event to return only the first occurrence of the filter being triggered. After the user retrieves the event from the kqueue, it is deleted. .It Dv EV_CLEAR After the event is retrieved by the user, its state is reset. This is useful for filters which report state transitions instead of the current state. Note that some filters may automatically set this flag internally. .It Dv EV_EOF Filters may set this flag to indicate filter-specific EOF condition. .It Dv EV_ERROR See .Sx RETURN VALUES below. .El .Pp The predefined system filters are listed below. 
Arguments may be passed to and from the filter via the .Va fflags and .Va data fields in the kevent structure. .Bl -tag -width "Dv EVFILT_PROCDESC" .It Dv EVFILT_READ Takes a descriptor as the identifier, and returns whenever there is data available to read. The behavior of the filter is slightly different depending on the descriptor type. .Bl -tag -width 2n .It Sockets Sockets which have previously been passed to .Fn listen return when there is an incoming connection pending. .Va data contains the size of the listen backlog. .Pp Other socket descriptors return when there is data to be read, subject to the .Dv SO_RCVLOWAT value of the socket buffer. This may be overridden with a per-filter low water mark at the time the filter is added by setting the .Dv NOTE_LOWAT flag in .Va fflags , and specifying the new low water mark in .Va data . On return, .Va data contains the number of bytes of protocol data available to read. .Pp If the read direction of the socket has been shut down, then the filter also sets .Dv EV_EOF in .Va flags , and returns the socket error (if any) in .Va fflags . It is possible for EOF to be returned (indicating the connection is gone) while there is still data pending in the socket buffer. .It Vnodes Returns when the file pointer is not at the end of file. .Va data contains the offset from current position to end of file, and may be negative. .Pp This behavior is different from .Xr poll 2 , where read events are triggered for regular files unconditionally. This event can be triggered unconditionally by setting the .Dv NOTE_FILE_POLL flag in .Va fflags . .It "Fifos, Pipes" Returns when there is data to read; .Va data contains the number of bytes available. .Pp When the last writer disconnects, the filter will set .Dv EV_EOF in .Va flags . This may be cleared by passing in .Dv EV_CLEAR , at which point the filter will resume waiting for data to become available before returning. 
.It "BPF devices" Returns when the BPF buffer is full, the BPF timeout has expired, or when the BPF has .Dq immediate mode enabled and there is any data to read; .Va data contains the number of bytes available. .El .It Dv EVFILT_WRITE Takes a descriptor as the identifier, and returns whenever it is possible to write to the descriptor. For sockets, pipes and fifos, .Va data will contain the amount of space remaining in the write buffer. The filter will set EV_EOF when the reader disconnects, and for the fifo case, this may be cleared by use of .Dv EV_CLEAR . Note that this filter is not supported for vnodes or BPF devices. .Pp For sockets, the low water mark and socket error handling are identical to the .Dv EVFILT_READ case. .It Dv EVFILT_EMPTY Takes a descriptor as the identifier, and returns whenever there is no remaining data in the write buffer. .It Dv EVFILT_AIO Events for this filter are not registered with .Fn kevent directly but are registered via the .Va aio_sigevent member of an asynchronous I/O request when it is scheduled via an asynchronous I/O system call such as .Fn aio_read . The filter returns under the same conditions as .Fn aio_error . For more details on this filter see .Xr sigevent 3 and .Xr aio 4 . .It Dv EVFILT_VNODE Takes a file descriptor as the identifier and the events to watch for in .Va fflags , and returns when one or more of the requested events occurs on the descriptor. The events to monitor are: .Bl -tag -width "Dv NOTE_CLOSE_WRITE" .It Dv NOTE_ATTRIB The file referenced by the descriptor had its attributes changed. .It Dv NOTE_CLOSE A file descriptor referencing the monitored file was closed. The closed file descriptor did not have write access. .It Dv NOTE_CLOSE_WRITE A file descriptor referencing the monitored file was closed. The closed file descriptor had write access. .Pp This note, as well as .Dv NOTE_CLOSE , are not activated when files are closed forcibly by .Xr unmount 2 or .Xr revoke 2 . 
Instead, .Dv NOTE_REVOKE is sent for such events. .It Dv NOTE_DELETE The .Fn unlink system call was called on the file referenced by the descriptor. .It Dv NOTE_EXTEND For regular file, the file referenced by the descriptor was extended. .Pp For directory, reports that a directory entry was added or removed, as the result of rename operation. The .Dv NOTE_EXTEND event is not reported when a name is changed inside the directory. .It Dv NOTE_LINK The link count on the file changed. In particular, the .Dv NOTE_LINK event is reported if a subdirectory was created or deleted inside the directory referenced by the descriptor. .It Dv NOTE_OPEN The file referenced by the descriptor was opened. .It Dv NOTE_READ A read occurred on the file referenced by the descriptor. .It Dv NOTE_RENAME The file referenced by the descriptor was renamed. .It Dv NOTE_REVOKE Access to the file was revoked via .Xr revoke 2 or the underlying file system was unmounted. .It Dv NOTE_WRITE A write occurred on the file referenced by the descriptor. .El .Pp On return, .Va fflags contains the events which triggered the filter. .It Dv EVFILT_PROC Takes the process ID to monitor as the identifier and the events to watch for in .Va fflags , and returns when the process performs one or more of the requested events. If a process can normally see another process, it can attach an event to it. The events to monitor are: .Bl -tag -width "Dv NOTE_TRACKERR" .It Dv NOTE_EXIT The process has exited. The exit status will be stored in .Va data . .It Dv NOTE_FORK The process has called .Fn fork . .It Dv NOTE_EXEC The process has executed a new process via .Xr execve 2 or a similar call. .It Dv NOTE_TRACK Follow a process across .Fn fork calls. The parent process registers a new kevent to monitor the child process using the same .Va fflags as the original event. The child process will signal an event with .Dv NOTE_CHILD set in .Va fflags and the parent PID in .Va data . 
.Pp If the parent process fails to register a new kevent .Pq usually due to resource limitations , it will signal an event with .Dv NOTE_TRACKERR set in .Va fflags , and the child process will not signal a .Dv NOTE_CHILD event. .El .Pp On return, .Va fflags contains the events which triggered the filter. .It Dv EVFILT_PROCDESC Takes the process descriptor created by .Xr pdfork 2 to monitor as the identifier and the events to watch for in .Va fflags , and returns when the associated process performs one or more of the requested events. The events to monitor are: .Bl -tag -width "Dv NOTE_EXIT" .It Dv NOTE_EXIT The process has exited. The exit status will be stored in .Va data . .El .Pp On return, .Va fflags contains the events which triggered the filter. .It Dv EVFILT_SIGNAL Takes the signal number to monitor as the identifier and returns when the given signal is delivered to the process. This coexists with the .Fn signal and .Fn sigaction facilities, and has a lower precedence. The filter will record all attempts to deliver a signal to a process, even if the signal has been marked as .Dv SIG_IGN , except for the .Dv SIGCHLD signal, which, if ignored, won't be recorded by the filter. Event notification happens after normal signal delivery processing. .Va data returns the number of times the signal has occurred since the last call to .Fn kevent . This filter automatically sets the .Dv EV_CLEAR flag internally. .It Dv EVFILT_TIMER Establishes an arbitrary timer identified by .Va ident . When adding a timer, .Va data specifies the moment to fire the timer (for .Dv NOTE_ABSTIME ) or the timeout period. The timer will be periodic unless .Dv EV_ONESHOT or .Dv NOTE_ABSTIME is specified. On return, .Va data contains the number of times the timeout has expired since the last call to .Fn kevent . For non-monotonic timers, this filter automatically sets the .Dv EV_CLEAR flag internally. 
.Pp The filter accepts the following flags in the .Va fflags argument: .Bl -tag -width "Dv NOTE_MSECONDS" .It Dv NOTE_SECONDS .Va data is in seconds. .It Dv NOTE_MSECONDS .Va data is in milliseconds. .It Dv NOTE_USECONDS .Va data is in microseconds. .It Dv NOTE_NSECONDS .Va data is in nanoseconds. .It Dv NOTE_ABSTIME The specified expiration time is absolute. .El .Pp If .Va fflags is not set, the default is milliseconds. On return, .Va fflags contains the events which triggered the filter. .Pp There is a system wide limit on the number of timers which is controlled by the .Va kern.kq_calloutmax sysctl. .It Dv EVFILT_USER Establishes a user event identified by .Va ident which is not associated with any kernel mechanism but is triggered by user level code. The lower 24 bits of the .Va fflags may be used for user defined flags and manipulated using the following: .Bl -tag -width "Dv NOTE_FFLAGSMASK" .It Dv NOTE_FFNOP Ignore the input .Va fflags . .It Dv NOTE_FFAND Bitwise AND .Va fflags . .It Dv NOTE_FFOR Bitwise OR .Va fflags . .It Dv NOTE_FFCOPY Copy .Va fflags . .It Dv NOTE_FFCTRLMASK Control mask for .Va fflags . .It Dv NOTE_FFLAGSMASK User defined flag mask for .Va fflags . .El .Pp A user event is triggered for output with the following: .Bl -tag -width "Dv NOTE_FFLAGSMASK" .It Dv NOTE_TRIGGER Cause the event to be triggered. .El .Pp On return, .Va fflags contains the users defined flags in the lower 24 bits. .El .Sh CANCELLATION BEHAVIOUR If .Fa nevents is non-zero, i.e. the function is potentially blocking, the call is a cancellation point. Otherwise, i.e. if .Fa nevents is zero, the call is not cancellable. Cancellation can only occur before any changes are made to the kqueue, or when the call was blocked and no changes to the queue were requested. .Sh RETURN VALUES The .Fn kqueue system call creates a new kernel event queue and returns a file descriptor. If there was an error creating the kernel event queue, a value of -1 is returned and errno set. 
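As an illustration of the EVFILT_USER flag operations described earlier in this page (NOTE_FFNOP, NOTE_FFAND, NOTE_FFOR, NOTE_FFCOPY), the following is a minimal portable model rather than kernel code. It does not call kevent(). The SK_-prefixed names and the sk_apply_fflags() helper are hypothetical, and the constant values are assumed to mirror the NOTE_FF* definitions in sys/event.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; values are assumed to match NOTE_FF* in <sys/event.h>. */
#define SK_FFNOP      0x00000000u /* ignore the input fflags */
#define SK_FFAND      0x40000000u /* bitwise AND with the stored flags */
#define SK_FFOR       0x80000000u /* bitwise OR with the stored flags */
#define SK_FFCOPY     0xc0000000u /* replace the stored flags */
#define SK_FFCTRLMASK 0xc0000000u /* selects the operation */
#define SK_FFLAGSMASK 0x00ffffffu /* lower 24 bits: user defined flags */

/*
 * Model how the fflags word of an update changes the user flags
 * already stored for an event.  Bits outside both masks (for
 * example a trigger request) do not affect the stored flags.
 */
static uint32_t
sk_apply_fflags(uint32_t stored, uint32_t input)
{
    uint32_t ctrl = input & SK_FFCTRLMASK;
    uint32_t user = input & SK_FFLAGSMASK;

    /* Stored flags only ever occupy the lower 24 bits. */
    assert((stored & ~SK_FFLAGSMASK) == 0);

    switch (ctrl) {
    case SK_FFAND:
        stored &= user;
        break;
    case SK_FFOR:
        stored |= user;
        break;
    case SK_FFCOPY:
        stored = user;
        break;
    default: /* SK_FFNOP: leave the stored flags alone */
        break;
    }
    return (stored);
}
```

Under these assumptions, an update carrying the OR operation with user bits 0x30 applied to stored flags 0x0f would leave 0x3f, which is the value a later retrieval would be expected to report in the lower 24 bits of fflags.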
.Pp
The
.Fn kevent
system call returns the number of events placed in the
.Fa eventlist ,
up to the value given by
.Fa nevents .
If an error occurs while processing an element of the
.Fa changelist
and there is enough room in the
.Fa eventlist ,
then the event will be placed in the
.Fa eventlist
with
.Dv EV_ERROR
set in
.Va flags
and the system error in
.Va data .
Otherwise,
.Dv -1
will be returned, and
.Dv errno
will be set to indicate the error condition.
If the time limit expires, then
.Fn kevent
returns 0.
.Sh EXAMPLES
.Bd -literal -compact
#include <sys/event.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
    struct kevent event;    /* Event we want to monitor */
    struct kevent tevent;   /* Event triggered */
    int kq, fd, ret;

    /* errx() is used here because errno is not meaningful. */
    if (argc != 2)
        errx(EXIT_FAILURE, "Usage: %s path", argv[0]);
    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        err(EXIT_FAILURE, "Failed to open '%s'", argv[1]);

    /* Create kqueue. */
    kq = kqueue();
    if (kq == -1)
        err(EXIT_FAILURE, "kqueue() failed");

    /* Initialize kevent structure. */
    EV_SET(&event, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR, NOTE_WRITE,
        0, NULL);
    /* Attach event to the kqueue. */
    ret = kevent(kq, &event, 1, NULL, 0, NULL);
    if (ret == -1)
        err(EXIT_FAILURE, "kevent register");
    if (event.flags & EV_ERROR)
        errx(EXIT_FAILURE, "Event error: %s", strerror(event.data));

    for (;;) {
        /* Sleep until something happens. */
        ret = kevent(kq, NULL, 0, &tevent, 1, NULL);
        if (ret == -1) {
            err(EXIT_FAILURE, "kevent wait");
        } else if (ret > 0) {
            printf("Something was written in '%s'\en", argv[1]);
        }
    }
}
.Ed
.Sh ERRORS
The
.Fn kqueue
system call fails if:
.Bl -tag -width Er
.It Bq Er ENOMEM
The kernel failed to allocate enough memory for the kernel queue.
.It Bq Er ENOMEM
The
.Dv RLIMIT_KQUEUES
rlimit
(see
.Xr getrlimit 2 )
for the current user would be exceeded.
.It Bq Er EMFILE
The per-process descriptor table is full.
.It Bq Er ENFILE
The system file table is full.
.El .Pp The .Fn kevent system call fails if: .Bl -tag -width Er .It Bq Er EACCES The process does not have permission to register a filter. .It Bq Er EFAULT There was an error reading or writing the .Va kevent structure. .It Bq Er EBADF The specified descriptor is invalid. .It Bq Er EINTR A signal was delivered before the timeout expired and before any events were placed on the kqueue for return. .It Bq Er EINTR A cancellation request was delivered to the thread, but not yet handled. .It Bq Er EINVAL The specified time limit or filter is invalid. .It Bq Er ENOENT The event could not be found to be modified or deleted. .It Bq Er ENOMEM No memory was available to register the event or, in the special case of a timer, the maximum number of timers has been exceeded. This maximum is configurable via the .Va kern.kq_calloutmax sysctl. .It Bq Er ESRCH The specified process to attach to does not exist. .El .Pp When .Fn kevent call fails with .Er EINTR error, all changes in the .Fa changelist have been applied. .Sh SEE ALSO .Xr aio_error 2 , .Xr aio_read 2 , .Xr aio_return 2 , .Xr poll 2 , .Xr read 2 , .Xr select 2 , .Xr sigaction 2 , .Xr write 2 , .Xr pthread_setcancelstate 3 , .Xr signal 3 .Sh HISTORY The .Fn kqueue and .Fn kevent system calls first appeared in .Fx 4.1 . .Sh AUTHORS The .Fn kqueue system and this manual page were written by .An Jonathan Lemon Aq Mt jlemon@FreeBSD.org . .Sh BUGS The .Fa timeout value is limited to 24 hours; longer timeouts will be silently reinterpreted as 24 hours. .Pp In versions older than .Fx 12.0 , .In sys/event.h failed to parse without including .In sys/types.h manually. Index: head/lib/libc/sys/sendfile.2 =================================================================== --- head/lib/libc/sys/sendfile.2 (revision 327258) +++ head/lib/libc/sys/sendfile.2 (revision 327259) @@ -1,399 +1,399 @@ .\" Copyright (c) 2003, David G. Lawrence .\" All rights reserved. 
.\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice unmodified, this list of conditions, and the following .\" disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" .Dd November 17, 2016 .Dt SENDFILE 2 .Os .Sh NAME .Nm sendfile .Nd send a file to a socket .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In sys/types.h .In sys/socket.h .In sys/uio.h .Ft int .Fo sendfile .Fa "int fd" "int s" "off_t offset" "size_t nbytes" .Fa "struct sf_hdtr *hdtr" "off_t *sbytes" "int flags" .Fc .Sh DESCRIPTION The .Fn sendfile system call sends a regular file or shared memory object specified by descriptor .Fa fd out a stream socket specified by descriptor .Fa s . .Pp The .Fa offset argument specifies where to begin in the file. 
Should .Fa offset fall beyond the end of file, the system will return success and report 0 bytes sent as described below. The .Fa nbytes argument specifies how many bytes of the file should be sent, with 0 having the special meaning of send until the end of file has been reached. .Pp An optional header and/or trailer can be sent before and after the file data by specifying a pointer to a .Vt "struct sf_hdtr" , which has the following structure: .Pp .Bd -literal -offset indent -compact struct sf_hdtr { struct iovec *headers; /* pointer to header iovecs */ int hdr_cnt; /* number of header iovecs */ struct iovec *trailers; /* pointer to trailer iovecs */ int trl_cnt; /* number of trailer iovecs */ }; .Ed .Pp The .Fa headers and .Fa trailers pointers, if .Pf non- Dv NULL , point to arrays of .Vt "struct iovec" structures. See the .Fn writev system call for information on the iovec structure. The number of iovecs in these arrays is specified by .Fa hdr_cnt and .Fa trl_cnt . .Pp If .Pf non- Dv NULL , the system will write the total number of bytes sent on the socket to the variable pointed to by .Fa sbytes . .Pp The least significant 16 bits of .Fa flags argument is a bitmap of these values: .Bl -tag -offset indent .It Dv SF_NODISKIO This flag causes .Nm to return .Er EBUSY instead of blocking when a busy page is encountered. This rare situation can happen if some other process is now working with the same region of the file. It is advised to retry the operation after a short period. .Pp Note that in older .Fx versions the .Dv SF_NODISKIO had slightly different notion. The flag prevented .Nm to run I/O operations in case if an invalid (not cached) page is encountered, thus avoiding blocking on I/O. Starting with .Fx 11 .Nm sending files off the .Xr ffs 7 filesystem doesn't block on I/O (see .Sx IMPLEMENTATION NOTES ), so the condition no longer applies. 
However, it is safe if an application utilizes .Dv SF_NODISKIO and on .Er EBUSY performs the same action as it did in older .Fx versions, e.g. .Xr aio_read 2 , .Xr read 2 or .Nm in a different context. .It Dv SF_NOCACHE The data sent to the socket will not be cached by the virtual memory system, and will be freed directly to the pool of free pages. .It Dv SF_SYNC .Nm sleeps until the network stack no longer references the VM pages of the file, making subsequent modifications to it safe. Please note that this is not a guarantee that the data has actually been sent. .It Dv SF_USER_READAHEAD .Nm has some internal heuristics to do readahead when sending data. This flag forces .Nm to override any heuristically calculated readahead and use exactly the application specified readahead. See .Sx SETTING READAHEAD for more details on readahead. .El .Pp When using a socket marked for non-blocking I/O, .Fn sendfile may send fewer bytes than requested. In this case, the number of bytes successfully written is returned in .Fa *sbytes (if specified), and the error .Er EAGAIN is returned. .Sh SETTING READAHEAD .Nm uses internal heuristics based on request size and file system layout to do readahead. Additionally, an application may request extra readahead. The most significant 16 bits of .Fa flags specify the number of pages that .Nm may read ahead when reading the file. A macro .Fn SF_FLAGS is provided to combine readahead amount and flags. -Example shows specifing readahead of 16 pages and +An example showing specifying readahead of 16 pages and .Dv SF_NOCACHE flag: .Pp .Bd -literal -offset indent -compact SF_FLAGS(16, SF_NOCACHE) .Ed .Pp .Nm will use either the application specified readahead or the internally calculated one, whichever is larger. Setting the .Dv SF_USER_READAHEAD flag turns off any heuristics and sets the maximum possible readahead length to the number of pages specified via flags. 
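The layout of the flags argument described in SETTING READAHEAD (readahead page count in the most significant 16 bits, flag bitmap in the least significant 16 bits) can be sketched in portable C. This is an assumption-laden model, not the system header: the sk_-prefixed helpers are hypothetical, SK_SF_NOCACHE is a stand-in whose value is assumed, and the sketch assumes the combining macro is a plain shift and OR, which matches the description but is not confirmed by this page.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for SF_NOCACHE; the numeric value is assumed. */
#define SK_SF_NOCACHE 0x0010u

/* Pack a readahead page count and a flag bitmap into one 32-bit word. */
static uint32_t
sk_sf_flags(uint32_t readahead_pages, uint32_t flags)
{
    /* Each half of the word holds at most 16 bits. */
    assert(readahead_pages <= 0xffffu);
    assert(flags <= 0xffffu);
    return ((readahead_pages << 16) | flags);
}

/* Recover the readahead page count from the upper 16 bits. */
static uint32_t
sk_sf_readahead(uint32_t packed)
{
    return (packed >> 16);
}

/* Recover the flag bitmap from the lower 16 bits. */
static uint32_t
sk_sf_flag_bits(uint32_t packed)
{
    return (packed & 0xffffu);
}
```

With these helpers, sk_sf_flags(16, SK_SF_NOCACHE) corresponds to the SF_FLAGS(16, SF_NOCACHE) example above: the readahead of 16 pages lands in the upper half and the flag bit stays in the lower half. Real code should use the SF_FLAGS macro from the system headers instead.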
.Sh IMPLEMENTATION NOTES The .Fx implementation of .Fn sendfile doesn't block on disk I/O when it sends a file off the .Xr ffs 7 filesystem. The syscall returns success before the actual I/O completes, and data is put into the socket later unattended. However, the order of data in the socket is preserved, so it is safe to do further writes to the socket. .Pp The .Fx implementation of .Fn sendfile is "zero-copy", meaning that it has been optimized so that copying of the file data is avoided. .Sh TUNING On some architectures, this system call internally uses a special .Fn sendfile buffer .Pq Vt "struct sf_buf" to handle sending file data to the client. If the sending socket is blocking, and there are not enough .Fn sendfile buffers available, .Fn sendfile will block and report a state of .Dq Li sfbufa . If the sending socket is non-blocking and there are not enough .Fn sendfile buffers available, the call will block and wait for the necessary buffers to become available before finishing the call. .Pp The number of .Vt sf_buf Ns 's allocated should be proportional to the number of nmbclusters used to send data to a client via .Fn sendfile . Tune accordingly to avoid blocking! Busy installations that make extensive use of .Fn sendfile may want to increase these values to be inline with their .Va kern.ipc.nmbclusters (see .Xr tuning 7 for details). .Pp The number of .Fn sendfile buffers available is determined at boot time by either the .Va kern.ipc.nsfbufs .Xr loader.conf 5 variable or the .Dv NSFBUFS kernel configuration tunable. The number of .Fn sendfile buffers scales with .Va kern.maxusers . The .Va kern.ipc.nsfbufsused and .Va kern.ipc.nsfbufspeak read-only .Xr sysctl 8 variables show current and peak .Fn sendfile buffers usage respectively. These values may also be viewed through .Nm netstat Fl m . 
.Pp If a value of zero is reported for .Va kern.ipc.nsfbufs , your architecture does not need to use .Fn sendfile buffers because their task can be efficiently performed by the generic virtual memory structures. .Sh RETURN VALUES .Rv -std sendfile .Sh ERRORS .Bl -tag -width Er .It Bq Er EAGAIN The socket is marked for non-blocking I/O and not all data was sent due to the socket buffer being filled. If specified, the number of bytes successfully sent will be returned in .Fa *sbytes . .It Bq Er EBADF The .Fa fd argument is not a valid file descriptor. .It Bq Er EBADF The .Fa s argument is not a valid socket descriptor. .It Bq Er EBUSY A busy page was encountered and .Dv SF_NODISKIO had been specified. Partial data may have been sent. .It Bq Er EFAULT An invalid address was specified for an argument. .It Bq Er EINTR A signal interrupted .Fn sendfile before it could be completed. If specified, the number of bytes successfully sent will be returned in .Fa *sbytes . .It Bq Er EINVAL The .Fa fd argument is not a regular file. .It Bq Er EINVAL The .Fa s argument is not a SOCK_STREAM type socket. .It Bq Er EINVAL The .Fa offset argument is negative. .It Bq Er EIO An error occurred while reading from .Fa fd . .It Bq Er ENOBUFS The system was unable to allocate an internal buffer. .It Bq Er ENOTCONN The .Fa s argument points to an unconnected socket. .It Bq Er ENOTSOCK The .Fa s argument is not a socket. .It Bq Er EOPNOTSUPP The file system for descriptor .Fa fd does not support .Fn sendfile . .It Bq Er EPIPE The socket peer has closed the connection. .El .Sh SEE ALSO .Xr netstat 1 , .Xr open 2 , .Xr send 2 , .Xr socket 2 , .Xr writev 2 , .Xr tuning 7 .Rs .%A K. Elmeleegy .%A A. Chanda .%A A. L. Cox .%A W. Zwaenepoel .%T A Portable Kernel Abstraction for Low-Overhead Ephemeral Mapping Management .%J The Proceedings of the 2005 USENIX Annual Technical Conference .%P pp 223-236 .%D 2005 .Re .Sh HISTORY The .Fn sendfile system call first appeared in .Fx 3.0 . 
This manual page first appeared in .Fx 3.1 . In .Fx 10 support for sending shared memory descriptors had been introduced. In .Fx 11 a non-blocking implementation had been introduced. .Sh AUTHORS The initial implementation of .Fn sendfile system call and this manual page were written by .An David G. Lawrence Aq Mt dg@dglawrence.com . The .Fx 11 implementation was written by .An Gleb Smirnoff Aq Mt glebius@FreeBSD.org . Index: head/lib/libopenbsd/imsg_init.3 =================================================================== --- head/lib/libopenbsd/imsg_init.3 (revision 327258) +++ head/lib/libopenbsd/imsg_init.3 (revision 327259) @@ -1,549 +1,549 @@ .\" $OpenBSD: imsg_init.3,v 1.13 2015/07/11 16:23:59 deraadt Exp $ .\" .\" Copyright (c) 2010 Nicholas Marriott .\" .\" Permission to use, copy, modify, and distribute this software for any .\" purpose with or without fee is hereby granted, provided that the above .\" copyright notice and this permission notice appear in all copies. .\" .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES .\" WHATSOEVER RESULTING FROM LOSS OF MIND, USE, DATA OR PROFITS, WHETHER .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING .\" OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
.\" .\" $FreeBSD$ .\" .Dd $Mdocdate: July 11 2015 $ .Dt IMSG_INIT 3 .Os .Sh NAME .Nm imsg_init , .Nm imsg_read , .Nm imsg_get , .Nm imsg_compose , .Nm imsg_composev , .Nm imsg_create , .Nm imsg_add , .Nm imsg_close , .Nm imsg_free , .Nm imsg_flush , .Nm imsg_clear , .Nm ibuf_open , .Nm ibuf_dynamic , .Nm ibuf_add , .Nm ibuf_reserve , .Nm ibuf_seek , .Nm ibuf_size , .Nm ibuf_left , .Nm ibuf_close , .Nm ibuf_write , .Nm ibuf_free , .Nm msgbuf_init , .Nm msgbuf_clear , .Nm msgbuf_write , .Nm msgbuf_drain .Nd IPC messaging functions .Sh SYNOPSIS .In sys/types.h .In sys/queue.h .In sys/uio.h .In imsg.h .Ft void .Fn imsg_init "struct imsgbuf *ibuf" "int fd" .Ft ssize_t .Fn imsg_read "struct imsgbuf *ibuf" .Ft ssize_t .Fn imsg_get "struct imsgbuf *ibuf" "struct imsg *imsg" .Ft int .Fn imsg_compose "struct imsgbuf *ibuf" "u_int32_t type" "uint32_t peerid" \ "pid_t pid" "int fd" "const void *data" "u_int16_t datalen" .Ft int .Fn imsg_composev "struct imsgbuf *ibuf" "u_int32_t type" "u_int32_t peerid" \ "pid_t pid" "int fd" "const struct iovec *iov" "int iovcnt" .Ft "struct ibuf *" .Fn imsg_create "struct imsgbuf *ibuf" "u_int32_t type" "u_int32_t peerid" \ "pid_t pid" "u_int16_t datalen" .Ft int .Fn imsg_add "struct ibuf *buf" "const void *data" "u_int16_t datalen" .Ft void .Fn imsg_close "struct imsgbuf *ibuf" "struct ibuf *msg" .Ft void .Fn imsg_free "struct imsg *imsg" .Ft int .Fn imsg_flush "struct imsgbuf *ibuf" .Ft void .Fn imsg_clear "struct imsgbuf *ibuf" .Ft "struct ibuf *" .Fn ibuf_open "size_t len" .Ft "struct ibuf *" .Fn ibuf_dynamic "size_t len" "size_t max" .Ft int .Fn ibuf_add "struct ibuf *buf" "const void *data" "size_t len" .Ft "void *" .Fn ibuf_reserve "struct ibuf *buf" "size_t len" .Ft "void *" .Fn ibuf_seek "struct ibuf *buf" "size_t pos" "size_t len" .Ft size_t .Fn ibuf_size "struct ibuf *buf" .Ft size_t .Fn ibuf_left "struct ibuf *buf" .Ft void .Fn ibuf_close "struct msgbuf *msgbuf" "struct ibuf *buf" .Ft int .Fn ibuf_write "struct msgbuf *msgbuf" 
.Ft void .Fn ibuf_free "struct ibuf *buf" .Ft void .Fn msgbuf_init "struct msgbuf *msgbuf" .Ft void .Fn msgbuf_clear "struct msgbuf *msgbuf" .Ft int .Fn msgbuf_write "struct msgbuf *msgbuf" .Ft void .Fn msgbuf_drain "struct msgbuf *msgbuf" "size_t n" .Sh DESCRIPTION The .Nm imsg functions provide a simple mechanism for communication between processes using sockets. Each transmitted message is guaranteed to be presented to the receiving program whole. They are commonly used in privilege separated processes, where processes with different rights are required to cooperate. .Pp A program using these functions should be linked with .Em -lutil . .Pp The basic .Nm structure is the .Em imsgbuf , which wraps a file descriptor and represents one side of a channel on which messages are sent and received: .Bd -literal -offset indent struct imsgbuf { TAILQ_HEAD(, imsg_fd) fds; struct ibuf_read r; struct msgbuf w; int fd; pid_t pid; }; .Ed .Pp .Fn imsg_init is a routine which initializes .Fa ibuf as one side of a channel associated with .Fa fd . The file descriptor is used to send and receive messages, but is not closed by any of the imsg functions. An imsgbuf is initialized with the .Em w member as the output buffer queue, .Em fd with the file descriptor passed to .Fn imsg_init and the other members for internal use only. .Pp The .Fn imsg_clear function frees any data allocated as part of an imsgbuf. .Pp .Fn imsg_create , .Fn imsg_add and .Fn imsg_close are generic construction routines for messages that are to be sent using an imsgbuf. .Pp .Fn imsg_create creates a new message with header specified by .Fa type , .Fa peerid and .Fa pid . A .Fa pid of zero uses the process ID returned by .Xr getpid 2 when .Fa ibuf was initialized. In addition to this common imsg header, .Fa datalen bytes of space may be reserved for attaching to this imsg. This space is populated using .Fn imsg_add . Additionally, the file descriptor .Fa fd may be passed over the socket to the other process. 
If .Fa fd is given, it is closed in the sending program after the message is sent. A value of \-1 indicates no file descriptor should be passed. .Fn imsg_create returns a pointer to a new message if it succeeds, NULL otherwise. .Pp .Fn imsg_add appends to .Fa imsg .Fa len bytes of ancillary data pointed to by .Fa buf . It returns .Fa len if it succeeds, \-1 otherwise. .Pp .Fn imsg_close completes creation of .Fa imsg by adding it to .Fa imsgbuf output buffer. .Pp .Fn imsg_compose is a routine which is used to quickly create and queue an imsg. It takes the same parameters as the .Fn imsg_create , .Fn imsg_add and .Fn imsg_close routines, except that only one ancillary data buffer can be provided. This routine returns 1 if it succeeds, \-1 otherwise. .Pp .Fn imsg_composev is similar to .Fn imsg_compose . It takes the same parameters, except that the ancillary data buffer is specified by .Fa iovec . .Pp .Fn imsg_flush is a function which calls .Fn msgbuf_write in a loop until all imsgs in the output buffer are sent. It returns 0 if it succeeds, \-1 otherwise. .Pp The .Fn imsg_read routine reads pending data with .Xr recvmsg 2 and queues it as individual messages on .Fa imsgbuf . It returns the number of bytes read on success, or \-1 on error. A return value of \-1 from .Fn imsg_read invalidates .Fa imsgbuf , and renders it suitable only for passing to .Fn imsg_clear . .Pp .Fn imsg_get fills in an individual imsg pending on .Fa imsgbuf into the structure pointed to by .Fa imsg . It returns the total size of the message, 0 if no messages are ready, or \-1 for an error. Received messages are returned as a .Em struct imsg , which must be freed by .Fn imsg_free when no longer required. 
.Em struct imsg has this form: .Bd -literal -offset indent struct imsg { struct imsg_hdr hdr; int fd; void *data; }; struct imsg_hdr { u_int32_t type; u_int16_t len; u_int16_t flags; u_int32_t peerid; u_int32_t pid; }; .Ed .Pp The header members are: .Bl -tag -width Ds -offset indent .It type An integer identifier, typically used to express the meaning of the message. .It len The total length of the imsg, including the header and any ancillary data transmitted with the message (pointed to by the .Em data member of the message itself). .It flags Flags used internally by the imsg functions; they should not be used by application programs. .It peerid, pid 32-bit values specified on message creation and free for any use by the caller, normally used to identify the message sender. .El .Pp In addition, .Em struct imsg has the following: .Bl -tag -width Ds -offset indent .It fd The file descriptor specified when the message was created and passed using the socket control message API, or \-1 if no file descriptor was sent. .It data A pointer to the ancillary data transmitted with the imsg. .El .Pp The IMSG_HEADER_SIZE define is the size of the imsg message header, which may be subtracted from the .Fa len member of .Em struct imsg_hdr to obtain the length of any additional data passed with the message. .Pp MAX_IMSGSIZE is defined as the maximum size of a single imsg, currently 16384 bytes. .Sh BUFFERS The imsg API defines functions to manipulate buffers, used internally and during construction of imsgs with .Fn imsg_create . A .Em struct ibuf is a single buffer and a .Em struct msgbuf a queue of output buffers for transmission: .Bd -literal -offset indent struct ibuf { TAILQ_ENTRY(ibuf) entry; u_char *buf; size_t size; size_t max; size_t wpos; size_t rpos; int fd; }; struct msgbuf { TAILQ_HEAD(, ibuf) bufs; u_int32_t queued; int fd; }; .Ed .Pp The .Fn ibuf_open function allocates a fixed-length buffer. The buffer may not be resized and may contain a maximum of .Fa len bytes. 
On success .Fn ibuf_open returns a pointer to the buffer; on failure it returns NULL. .Pp .Fn ibuf_dynamic -allocates a resizeable buffer of initial length +allocates a resizable buffer of initial length .Fa len and maximum size .Fa max . Buffers allocated with .Fn ibuf_dynamic are automatically grown if necessary when data is added. .Pp .Fn ibuf_add is a routine which appends a block of data to .Fa buf . 0 is returned on success and \-1 on failure. .Pp .Fn ibuf_reserve is used to reserve .Fa len bytes in .Fa buf . A pointer to the start of the reserved space is returned, or NULL on error. .Pp .Fn ibuf_seek is a function which returns a pointer to the part of the buffer at offset .Fa pos and of extent .Fa len . NULL is returned if the requested range is outside the part of the buffer in use. .Pp .Fn ibuf_size and .Fn ibuf_left are functions which return the total bytes used and available in .Fa buf respectively. .Pp .Fn ibuf_close appends .Fa buf to .Fa msgbuf ready to be sent. .Pp The .Fn ibuf_write routine transmits as many pending buffers as possible from .Fn msgbuf using .Xr writev 2 . It returns 1 if it succeeds, \-1 on error and 0 when no buffers were pending or an EOF condition on the socket is detected. Temporary resource shortages are returned with errno .Er EAGAIN and require the application to retry again in the future. .Pp .Fn ibuf_free frees .Fa buf and any associated storage. .Pp The .Fn msgbuf_init function initializes .Fa msgbuf so that buffers may be appended to it. The .Em fd member should also be set directly before .Fn msgbuf_write is used. .Pp .Fn msgbuf_clear empties a msgbuf, removing and discarding any queued buffers. .Pp The .Fn msgbuf_write routine calls .Xr sendmsg 2 to transmit buffers queued in .Fa msgbuf . It returns 1 if it succeeds, \-1 on error, and 0 when the queue was empty or an EOF condition on the socket is detected. 
Temporary resource shortages are returned with errno .Er EAGAIN and require the application to retry again in the future. .Pp .Fn msgbuf_drain discards data from buffers queued in .Fa msgbuf until .Fa n bytes have been removed or .Fa msgbuf is empty. .Sh EXAMPLES In a typical program, a channel between two processes is created with .Xr socketpair 2 , and an .Em imsgbuf created around one file descriptor in each process: .Bd -literal -offset indent struct imsgbuf parent_ibuf, child_ibuf; int imsg_fds[2]; if (socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC, imsg_fds) == -1) err(1, "socketpair"); switch (fork()) { case -1: err(1, "fork"); case 0: /* child */ close(imsg_fds[0]); imsg_init(&child_ibuf, imsg_fds[1]); exit(child_main(&child_ibuf)); } /* parent */ close(imsg_fds[1]); imsg_init(&parent_ibuf, imsg_fds[0]); exit(parent_main(&parent_ibuf)); .Ed .Pp Messages may then be composed and queued on the .Em imsgbuf , for example using the .Fn imsg_compose function: .Bd -literal -offset indent enum imsg_type { IMSG_A_MESSAGE, IMSG_MESSAGE2 }; int child_main(struct imsgbuf *ibuf) { int idata; ... idata = 42; imsg_compose(ibuf, IMSG_A_MESSAGE, 0, 0, -1, &idata, sizeof idata); ... } .Ed .Pp A mechanism such as .Xr poll 2 or the .Xr event 3 library is used to monitor the socket file descriptor. 
When the socket is ready for writing, queued messages are transmitted with .Fn msgbuf_write : .Bd -literal -offset indent if (msgbuf_write(&ibuf-\*(Gtw) \*(Lt= 0 && errno != EAGAIN) { /* handle write failure */ } .Ed .Pp And when ready for reading, messages are first received using .Fn imsg_read and then extracted with .Fn imsg_get : .Bd -literal -offset indent void dispatch_imsg(struct imsgbuf *ibuf) { struct imsg imsg; ssize_t n, datalen; int idata; if ((n = imsg_read(ibuf)) == -1 || n == 0) { /* handle socket error */ } for (;;) { if ((n = imsg_get(ibuf, &imsg)) == -1) { /* handle read error */ } if (n == 0) /* no more messages */ return; datalen = imsg.hdr.len - IMSG_HEADER_SIZE; switch (imsg.hdr.type) { case IMSG_A_MESSAGE: if (datalen \*(Lt sizeof idata) { /* handle corrupt message */ } memcpy(&idata, imsg.data, sizeof idata); /* handle message received */ break; ... } imsg_free(&imsg); } } .Ed .Sh SEE ALSO .Xr socketpair 2 , .Xr unix 4 Index: head/sbin/ipfw/ipfw.8 =================================================================== --- head/sbin/ipfw/ipfw.8 (revision 327258) +++ head/sbin/ipfw/ipfw.8 (revision 327259) @@ -1,4075 +1,4075 @@ .\" .\" $FreeBSD$ .\" .Dd November 26, 2017 .Dt IPFW 8 .Os .Sh NAME .Nm ipfw .Nd User interface for firewall, traffic shaper, packet scheduler, in-kernel NAT. .Sh SYNOPSIS .Ss FIREWALL CONFIGURATION .Nm .Op Fl cq .Cm add .Ar rule .Nm .Op Fl acdefnNStT .Op Cm set Ar N .Brq Cm list | show .Op Ar rule | first-last ... .Nm .Op Fl f | q .Op Cm set Ar N .Cm flush .Nm .Op Fl q .Op Cm set Ar N .Brq Cm delete | zero | resetlog .Op Ar number ... .Pp .Nm .Cm set Oo Cm disable Ar number ... Oc Op Cm enable Ar number ... 
.Nm .Cm set move .Op Cm rule .Ar number Cm to Ar number .Nm .Cm set swap Ar number number .Nm .Cm set show .Ss SYSCTL SHORTCUTS .Nm .Cm enable .Brq Cm firewall | altq | one_pass | debug | verbose | dyn_keepalive .Nm .Cm disable .Brq Cm firewall | altq | one_pass | debug | verbose | dyn_keepalive .Ss LOOKUP TABLES .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm modify Ar modify-options .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm swap Ar name .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm add Ar table-key Op Ar value .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm add Op Ar table-key Ar value ... .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm atomic add Op Ar table-key Ar value ... .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm delete Op Ar table-key ... .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm lookup Ar addr .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm lock .Nm .Oo Cm set Ar N Oc Cm table Ar name Cm unlock .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm list .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm info .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm detail .Nm .Oo Cm set Ar N Oc Cm table .Brq Ar name | all .Cm flush .Ss DUMMYNET CONFIGURATION (TRAFFIC SHAPER AND PACKET SCHEDULER) .Nm .Brq Cm pipe | queue | sched .Ar number .Cm config .Ar config-options .Nm .Op Fl s Op Ar field .Brq Cm pipe | queue | sched .Brq Cm delete | list | show .Op Ar number ... 
.Ss IN-KERNEL NAT .Nm .Op Fl q .Cm nat .Ar number .Cm config .Ar config-options .Pp .Nm .Op Fl cfnNqS .Oo .Fl p Ar preproc .Oo .Ar preproc-flags .Oc .Oc .Ar pathname .Ss STATEFUL IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nat64lsn Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nat64lsn Ar name Cm config Ar config-options .Nm .Oo Cm set Ar N Oc Cm nat64lsn .Brq Ar name | all .Brq Cm list | show .Op Cm states .Nm .Oo Cm set Ar N Oc Cm nat64lsn .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nat64lsn Ar name Cm stats Op Cm reset .Ss STATELESS IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nat64stl Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nat64stl Ar name Cm config Ar config-options .Nm .Oo Cm set Ar N Oc Cm nat64stl .Brq Ar name | all .Brq Cm list | show .Nm .Oo Cm set Ar N Oc Cm nat64stl .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nat64stl Ar name Cm stats Op Cm reset .Ss IPv6-to-IPv6 NETWORK PREFIX TRANSLATION .Nm .Oo Cm set Ar N Oc Cm nptv6 Ar name Cm create Ar create-options .Nm .Oo Cm set Ar N Oc Cm nptv6 .Brq Ar name | all .Brq Cm list | show .Nm .Oo Cm set Ar N Oc Cm nptv6 .Brq Ar name | all .Cm destroy .Nm .Oo Cm set Ar N Oc Cm nptv6 Ar name Cm stats Op Cm reset .Ss INTERNAL DIAGNOSTICS .Nm .Cm internal iflist .Nm .Cm internal talist .Nm .Cm internal vlist .Sh DESCRIPTION The .Nm utility is the user interface for controlling the .Xr ipfw 4 firewall, the .Xr dummynet 4 traffic shaper/packet scheduler, and the in-kernel NAT services. .Pp A firewall configuration, or .Em ruleset , is made of a list of .Em rules numbered from 1 to 65535. Packets are passed to the firewall from a number of different places in the protocol stack (depending on the source and destination of the packet, it is possible for the firewall to be invoked multiple times on the same packet). 
The packet passed to the firewall is compared against each of the rules in the .Em ruleset , in rule-number order (multiple rules with the same number are permitted, in which case they are processed in order of insertion). When a match is found, the action corresponding to the matching rule is performed. .Pp Depending on the action and certain system settings, packets can be reinjected into the firewall at some rule after the matching one for further processing. .Pp A ruleset always includes a .Em default rule (numbered 65535) which cannot be modified or deleted, and matches all packets. The action associated with the .Em default rule can be either .Cm deny or .Cm allow depending on how the kernel is configured. .Pp If the ruleset includes one or more rules with the .Cm keep-state or .Cm limit option, the firewall will have a .Em stateful behaviour, i.e., upon a match it will create .Em dynamic rules , i.e., rules that match packets with the same 5-tuple (protocol, source and destination addresses and ports) as the packet which caused their creation. Dynamic rules, which have a limited lifetime, are checked at the first occurrence of a .Cm check-state , .Cm keep-state or .Cm limit rule, and are typically used to open the firewall on-demand to legitimate traffic only. See the .Sx STATEFUL FIREWALL and .Sx EXAMPLES Sections below for more information on the stateful behaviour of .Nm . .Pp All rules (including dynamic ones) have a few associated counters: a packet count, a byte count, a log count and a timestamp indicating the time of the last match. Counters can be displayed or reset with .Nm commands. .Pp Each rule belongs to one of 32 different .Em sets , and there are .Nm commands to atomically manipulate sets, such as enable, disable, swap sets, move all rules in a set to another one, delete all rules in a set. These can be useful to install temporary configurations, or to test them. See Section .Sx SETS OF RULES for more information on .Em sets . 
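.Pp
As a rough sketch of how sets might be used to try out a temporary
configuration, using only the
.Cm set
and
.Cm delete
forms listed in the synopsis; the set number 18, the rule number 1000 and the
rule body are arbitrary choices made for illustration, not recommendations:
.Bd -literal -offset indent
# keep set 18 disabled while it is being populated
ipfw set disable 18
# add a candidate rule to set 18 (hypothetical example rule)
ipfw add 1000 set 18 deny tcp from any to any 23
# enable the set so that its rules take effect together
ipfw set enable 18
# show which sets are currently enabled or disabled
ipfw set show
# disable the set again and remove its rules once testing is done
ipfw set disable 18
ipfw delete set 18
.Ed
.Pp
Because the candidate rules are added while the set is disabled, they only
start matching packets once the set is enabled as a whole.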
.Pp Rules can be added with the .Cm add command; deleted individually or in groups with the .Cm delete command, and globally (except those in set 31) with the .Cm flush command; displayed, optionally with the content of the counters, using the .Cm show and .Cm list commands. Finally, counters can be reset with the .Cm zero and .Cm resetlog commands. .Pp .Ss COMMAND OPTIONS The following general options are available when invoking .Nm : .Bl -tag -width indent .It Fl a Show counter values when listing rules. The .Cm show command implies this option. .It Fl b Only show the action and the comment, not the body of a rule. Implies .Fl c . .It Fl c When entering or showing rules, print them in compact form, i.e., omitting the "ip from any to any" string when this does not carry any additional information. .It Fl d When listing, show dynamic rules in addition to static ones. .It Fl e When listing and .Fl d is specified, also show expired dynamic rules. .It Fl f Do not ask for confirmation for commands that can cause problems if misused, i.e., .Cm flush . If there is no tty associated with the process, this is implied. .It Fl i When listing a table (see the .Sx LOOKUP TABLES section below for more information on lookup tables), format values as IP addresses. By default, values are shown as integers. .It Fl n Only check syntax of the command strings, without actually passing them to the kernel. .It Fl N Try to resolve addresses and service names in output. .It Fl q Be quiet when executing the .Cm add , .Cm nat , .Cm zero , .Cm resetlog or .Cm flush commands; (implies .Fl f ) . This is useful when updating rulesets by executing multiple .Nm commands in a script (e.g., .Ql sh\ /etc/rc.firewall ) , or by processing a file with many .Nm rules across a remote login session. It also stops a table add or delete from failing if the entry already exists or is not present. 
.Pp The reason why this option may be important is that for some of these actions, .Nm may print a message; if the action results in blocking the traffic to the remote client, the remote login session will be closed and the rest of the ruleset will not be processed. Access to the console would then be required to recover. .It Fl S When listing rules, show the .Em set each rule belongs to. If this flag is not specified, disabled rules will not be listed. .It Fl s Op Ar field When listing pipes, sort according to one of the four counters (total or current packets or bytes). .It Fl t When listing, show last match timestamp converted with ctime(). .It Fl T When listing, show last match timestamp as seconds from the epoch. This form can be more convenient for postprocessing by scripts. .El .Ss LIST OF RULES AND PREPROCESSING To ease configuration, rules can be put into a file which is processed using .Nm as shown in the last synopsis line. An absolute .Ar pathname must be used. The file will be read line by line and applied as arguments to the .Nm utility. .Pp Optionally, a preprocessor can be specified using .Fl p Ar preproc where .Ar pathname is to be piped through. Useful preprocessors include .Xr cpp 1 and .Xr m4 1 . If .Ar preproc does not start with a slash .Pq Ql / as its first character, the usual .Ev PATH name search is performed. Care should be taken with this in environments where not all file systems are mounted (yet) by the time .Nm is being run (e.g.\& when they are mounted over NFS). Once .Fl p has been specified, any additional arguments are passed on to the preprocessor for interpretation. This allows for flexible configuration files (like conditionalizing them on the local hostname) and the use of macros to centralize frequently required arguments like IP addresses. .Ss TRAFFIC SHAPER CONFIGURATION The .Nm .Cm pipe , queue and .Cm sched commands are used to configure the traffic shaper and packet scheduler. 
See the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section below for details. .Pp If the world and the kernel get out of sync the .Nm ABI may break, preventing you from being able to add any rules. This can adversely affect the booting process. You can use .Nm .Cm disable .Cm firewall to temporarily disable the firewall to regain access to the network, allowing you to fix the problem. .Sh PACKET FLOW A packet is checked against the active ruleset in multiple places in the protocol stack, under control of several sysctl variables. These places and variables are shown below, and it is important to have this picture in mind in order to design a correct ruleset. .Bd -literal -offset indent ^ to upper layers V | | +----------->-----------+ ^ V [ip(6)_input] [ip(6)_output] net.inet(6).ip(6).fw.enable=1 | | ^ V [ether_demux] [ether_output_frame] net.link.ether.ipfw=1 | | +-->--[bdg_forward]-->--+ net.link.bridge.ipfw=1 ^ V | to devices | .Ed .Pp The number of times the same packet goes through the firewall can vary between 0 and 4 depending on packet source and destination, and system configuration. .Pp Note that as packets flow through the stack, headers can be stripped or added to it, and so they may or may not be available for inspection. E.g., incoming packets will include the MAC header when .Nm is invoked from .Cm ether_demux() , but the same packets will have the MAC header stripped off when .Nm is invoked from .Cm ip_input() or .Cm ip6_input() . .Pp Also note that each packet is always checked against the complete ruleset, irrespective of the place where the check occurs, or the source of the packet. If a rule contains some match patterns or actions which are not valid for the place of invocation (e.g.\& trying to match a MAC header within .Cm ip_input or .Cm ip6_input ), the match pattern will not match, but a .Cm not operator in front of such patterns .Em will cause the pattern to .Em always match on those packets. 
It is thus the responsibility of the programmer, if necessary, to write a suitable ruleset to differentiate among the possible places. .Cm skipto rules can be useful here, as an example: .Bd -literal -offset indent # packets from ether_demux or bdg_forward ipfw add 10 skipto 1000 all from any to any layer2 in # packets from ip_input ipfw add 10 skipto 2000 all from any to any not layer2 in # packets from ip_output ipfw add 10 skipto 3000 all from any to any not layer2 out # packets from ether_output_frame ipfw add 10 skipto 4000 all from any to any layer2 out .Ed .Pp (yes, at the moment there is no way to differentiate between ether_demux and bdg_forward). .Sh SYNTAX In general, each keyword or argument must be provided as a separate command line argument, with no leading or trailing spaces. Keywords are case-sensitive, whereas arguments may or may not be case-sensitive depending on their nature (e.g.\& uid's are, hostnames are not). .Pp Some arguments (e.g., port or address lists) are comma-separated lists of values. In this case, spaces after commas ',' are allowed to make the line more readable. You can also put the entire command (including flags) into a single argument. E.g., the following forms are equivalent: .Bd -literal -offset indent ipfw -q add deny src-ip 10.0.0.0/24,127.0.0.1/8 ipfw -q add deny src-ip 10.0.0.0/24, 127.0.0.1/8 ipfw "-q add deny src-ip 10.0.0.0/24, 127.0.0.1/8" .Ed .Sh RULE FORMAT The format of firewall rules is the following: .Bd -ragged -offset indent .Bk -words .Op Ar rule_number .Op Cm set Ar set_number .Op Cm prob Ar match_probability .Ar action .Op Cm log Op Cm logamount Ar number .Op Cm altq Ar queue .Oo .Bro Cm tag | untag .Brc Ar number .Oc .Ar body .Ek .Ed .Pp where the body of the rule specifies which information is used for filtering packets, among the following: .Pp .Bl -tag -width "Source and dest. 
addresses and ports" -offset XXX -compact .It Layer-2 header fields When available .It IPv4 and IPv6 Protocol SCTP, TCP, UDP, ICMP, etc. .It Source and dest. addresses and ports .It Direction See Section .Sx PACKET FLOW .It Transmit and receive interface By name or address .It Misc. IP header fields Version, type of service, datagram length, identification, fragment flag (non-zero IP offset), Time To Live .It IP options .It IPv6 Extension headers Fragmentation, Hop-by-Hop options, Routing Headers, Source routing rthdr0, Mobile IPv6 rthdr2, IPSec options. .It IPv6 Flow-ID .It Misc. TCP header fields TCP flags (SYN, FIN, ACK, RST, etc.), sequence number, acknowledgment number, window .It TCP options .It ICMP types for ICMP packets .It ICMP6 types for ICMP6 packets .It User/group ID When the packet can be associated with a local socket. .It Divert status Whether a packet came from a divert socket (e.g., .Xr natd 8 ) . .It Fib annotation state Whether a packet has been tagged for using a specific FIB (routing table) in future forwarding decisions. .El .Pp Note that some of the above information, e.g.\& source MAC or IP addresses and TCP/UDP ports, can be easily spoofed, so filtering on those fields alone might not guarantee the desired results. .Bl -tag -width indent .It Ar rule_number Each rule is associated with a .Ar rule_number in the range 1..65535, with the latter reserved for the .Em default rule. Rules are checked sequentially by rule number. Multiple rules can have the same number, in which case they are checked (and listed) according to the order in which they have been added. If a rule is entered without specifying a number, the kernel will assign one in such a way that the rule becomes the last one before the .Em default rule. Automatic rule numbers are assigned by incrementing the last non-default rule number by the value of the sysctl variable .Ar net.inet.ip.fw.autoinc_step which defaults to 100. 
If this is not possible (e.g.\& because we would go beyond the maximum allowed rule number), the number of the last non-default value is used instead. .It Cm set Ar set_number Each rule is associated with a .Ar set_number in the range 0..31. Sets can be individually disabled and enabled, so this parameter is of fundamental importance for atomic ruleset manipulation. It can be also used to simplify deletion of groups of rules. If a rule is entered without specifying a set number, set 0 will be used. .br Set 31 is special in that it cannot be disabled, and rules in set 31 are not deleted by the .Nm ipfw flush command (but you can delete them with the .Nm ipfw delete set 31 command). Set 31 is also used for the .Em default rule. .It Cm prob Ar match_probability A match is only declared with the specified probability (floating point number between 0 and 1). This can be useful for a number of applications such as random packet drop or (in conjunction with .Nm dummynet ) to simulate the effect of multiple paths leading to out-of-order packet delivery. .Pp Note: this condition is checked before any other condition, including ones such as keep-state or check-state which might have side effects. .It Cm log Op Cm logamount Ar number Packets matching a rule with the .Cm log keyword will be made available for logging in two ways: if the sysctl variable .Va net.inet.ip.fw.verbose is set to 0 (default), one can use .Xr bpf 4 attached to the .Li ipfw0 pseudo interface. This pseudo interface can be created after a boot manually by using the following command: .Bd -literal -offset indent # ifconfig ipfw0 create .Ed .Pp Or, automatically at boot time by adding the following line to the .Xr rc.conf 5 file: .Bd -literal -offset indent firewall_logif="YES" .Ed .Pp There is no overhead if no .Xr bpf 4 is attached to the pseudo interface. 
.Pp If .Va net.inet.ip.fw.verbose is set to 1, packets will be logged to .Xr syslogd 8 with a .Dv LOG_SECURITY facility up to a maximum of .Cm logamount packets. If no .Cm logamount is specified, the limit is taken from the sysctl variable .Va net.inet.ip.fw.verbose_limit . In both cases, a value of 0 means unlimited logging. .Pp Once the limit is reached, logging can be re-enabled by clearing the logging counter or the packet counter for that entry; see the .Cm resetlog command. .Pp Note: logging is done after all other packet matching conditions have been successfully verified, and before performing the final action (accept, deny, etc.) on the packet. .It Cm tag Ar number When a packet matches a rule with the .Cm tag keyword, the numeric tag for the given .Ar number in the range 1..65534 will be attached to the packet. The tag acts as an internal marker (it is not sent out over the wire) that can be used to identify these packets later on. This can be used, for example, to provide trust between interfaces and to start doing policy-based filtering. A packet can have multiple tags at the same time. Tags are "sticky", meaning once a tag is applied to a packet by a matching rule it exists until explicit removal. Tags are kept with the packet everywhere within the kernel, but are lost when the packet leaves the kernel, for example, when transmitting the packet out to the network or sending it to a .Xr divert 4 socket. .Pp To check for previously applied tags, use the .Cm tagged rule option. To delete a previously applied tag, use the .Cm untag keyword. .Pp Note: since tags are kept with the packet everywhere in kernelspace, they can be set and unset anywhere in the kernel network subsystem (using the .Xr mbuf_tags 9 facility), not only by means of the .Xr ipfw 4 .Cm tag and .Cm untag keywords. For example, there can be a specialized .Xr netgraph 4 node doing traffic analysis and tagging for later inspection in the firewall. 
.It Cm untag Ar number When a packet matches a rule with the .Cm untag keyword, the tag with the number .Ar number is searched among the tags attached to this packet and, if found, removed from it. Other tags bound to the packet, if present, are left untouched. .It Cm altq Ar queue When a packet matches a rule with the .Cm altq keyword, the ALTQ identifier for the given .Ar queue (see .Xr altq 4 ) will be attached. Note that this ALTQ tag is only meaningful for packets going "out" of IPFW, and not being rejected or going to divert sockets. Note that if there is insufficient memory at the time the packet is processed, it will not be tagged, so it is wise to make your ALTQ "default" queue policy account for this. If multiple .Cm altq rules match a single packet, only the first one adds the ALTQ classification tag. In doing so, traffic may be shaped by using .Cm count Cm altq Ar queue rules for classification early in the ruleset, then later applying the filtering decision. For example, .Cm check-state and .Cm keep-state rules may come later and provide the actual filtering decisions in addition to the fallback ALTQ tag. .Pp You must run .Xr pfctl 8 to set up the queues before IPFW will be able to look them up by name, and if the ALTQ disciplines are rearranged, the rules containing the queue identifiers in the kernel will likely have gone stale and need to be reloaded. Stale queue identifiers will probably result in misclassification. .Pp All system ALTQ processing can be turned on or off via .Nm .Cm enable Ar altq and .Nm .Cm disable Ar altq . The usage of .Va net.inet.ip.fw.one_pass is irrelevant to ALTQ traffic shaping, as the actual rule action is always followed after adding an ALTQ tag. .El .Ss RULE ACTIONS A rule can be associated with one of the following actions, which will be executed when the packet matches the body of the rule. .Bl -tag -width indent .It Cm allow | accept | pass | permit Allow packets that match the rule. The search terminates. 
.It Cm check-state Op Ar :flowname | Cm :any Checks the packet against the dynamic ruleset. If a match is found, execute the action associated with the rule which generated this dynamic rule, otherwise move to the next rule. .br .Cm Check-state rules do not have a body. If no .Cm check-state rule is found, the dynamic ruleset is checked at the first .Cm keep-state or .Cm limit rule. The .Ar :flowname is symbolic name assigned to dynamic rule by .Cm keep-state opcode. The special flowname .Cm :any can be used to ignore states flowname when matching. The .Cm :default keyword is special name used for compatibility with old rulesets. .It Cm count Update counters for all packets that match rule. The search continues with the next rule. .It Cm deny | drop Discard packets that match this rule. The search terminates. .It Cm divert Ar port Divert packets that match this rule to the .Xr divert 4 socket bound to port .Ar port . The search terminates. .It Cm fwd | forward Ar ipaddr | tablearg Ns Op , Ns Ar port Change the next-hop on matching packets to .Ar ipaddr , which can be an IP address or a host name. For IPv4, the next hop can also be supplied by the last table looked up for the packet by using the .Cm tablearg keyword instead of an explicit address. The search terminates if this rule matches. .Pp If .Ar ipaddr is a local address, then matching packets will be forwarded to .Ar port (or the port number in the packet if one is not specified in the rule) on the local machine. .br If .Ar ipaddr is not a local address, then the port number (if specified) is ignored, and the packet will be forwarded to the remote address, using the route as found in the local routing table for that IP. .br A .Ar fwd rule will not match layer-2 packets (those received on ether_input, ether_output, or bridged). .br The .Cm fwd action does not change the contents of the packet at all. 
In particular, the destination address remains unmodified, so packets forwarded to another system will usually be rejected by that system unless there is a matching rule on that system to capture them. For packets forwarded locally, the local address of the socket will be set to the original destination address of the packet. This makes the .Xr netstat 1 entry look rather weird but is intended for use with transparent proxy servers. .It Cm nat Ar nat_nr | tablearg Pass packet to a nat instance (for network address translation, address redirect, etc.): see the .Sx NETWORK ADDRESS TRANSLATION (NAT) Section for further information. .It Cm nat64lsn Ar name Pass packet to a stateful NAT64 instance (for IPv6/IPv4 network address and protocol translation): see the .Sx IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION Section for further information. .It Cm nat64stl Ar name Pass packet to a stateless NAT64 instance (for IPv6/IPv4 network address and protocol translation): see the .Sx IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION Section for further information. .It Cm nptv6 Ar name Pass packet to a NPTv6 instance (for IPv6-to-IPv6 network prefix translation): see the .Sx IPv6-to-IPv6 NETWORK PREFIX TRANSLATION (NPTv6) Section for further information. .It Cm pipe Ar pipe_nr Pass packet to a .Nm dummynet .Dq pipe (for bandwidth limitation, delay, etc.). See the .Sx TRAFFIC SHAPER (DUMMYNET) CONFIGURATION Section for further information. The search terminates; however, on exit from the pipe and if the .Xr sysctl 8 variable .Va net.inet.ip.fw.one_pass is not set, the packet is passed again to the firewall code starting from the next rule. .It Cm queue Ar queue_nr Pass packet to a .Nm dummynet .Dq queue (for bandwidth limitation using WF2Q+). .It Cm reject (Deprecated). Synonym for .Cm unreach host . .It Cm reset Discard packets that match this rule, and if the packet is a TCP packet, try to send a TCP reset (RST) notice. The search terminates. 
.It Cm reset6 Discard packets that match this rule, and if the packet is a TCP packet, try to send a TCP reset (RST) notice. The search terminates. .It Cm skipto Ar number | tablearg Skip all subsequent rules numbered less than .Ar number . The search continues with the first rule numbered .Ar number or higher. It is possible to use the .Cm tablearg keyword with a skipto for a .Em computed skipto. Skipto may work either in O(log(N)) or in O(1) depending on the amount of memory and/or sysctl variables. See the .Sx SYSCTL VARIABLES section for more details. .It Cm call Ar number | tablearg The current rule number is saved in the internal stack and ruleset processing continues with the first rule numbered .Ar number or higher. If a rule with the .Cm return action is encountered later, processing returns to the first rule whose number is that of this .Cm call rule plus one or higher (the same behaviour as with packets returning from a .Xr divert 4 socket after a .Cm divert action). This can be used to make something like assembly-language .Dq subroutine calls to rules with common checks for different interfaces, etc. .Pp A rule with any number can be called, not just forward jumps as with .Cm skipto . So, to prevent endless loops in case of mistakes, both .Cm call and .Cm return actions do not jump and simply go to the next rule if memory cannot be allocated or the stack has overflowed or underflowed. .Pp Internally, the stack for rule numbers is implemented using the .Xr mbuf_tags 9 facility and currently has a size of 16 entries. As mbuf tags are lost when a packet leaves the kernel, .Cm divert should not be used in subroutines to avoid endless loops and other undesired effects. .It Cm return Takes the rule number saved to the internal stack by the last .Cm call action and returns ruleset processing to the first rule with a number greater than the number of the corresponding .Cm call rule. See the description of the .Cm call action for more details.
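.Pp The control flow of call and return can be sketched as a small interpreter. This is a minimal model under stated assumptions, not the kernel code: the function name run, the tuple encoding of rules, and the max_steps guard are all hypothetical, and only the 16-entry stack limit and the jump targets are taken from the description above.

```python
from bisect import bisect_left

STACK_LIMIT = 16  # stack size stated in the text


def run(rules, max_steps=1000):
    """rules maps rule number -> (action, arg); returns the rule numbers visited."""
    numbers = sorted(rules)
    stack, visited = [], []
    idx = 0
    while idx < len(numbers) and len(visited) < max_steps:
        num = numbers[idx]
        action, arg = rules[num]
        visited.append(num)
        if action == "call" and len(stack) < STACK_LIMIT:
            stack.append(num)                 # remember the calling rule
            idx = bisect_left(numbers, arg)   # first rule numbered arg or higher
        elif action == "return" and stack:
            # First rule numbered (call rule + 1) or higher.
            idx = bisect_left(numbers, stack.pop() + 1)
        elif action in ("allow", "deny"):
            break                             # the search terminates
        else:
            idx += 1  # count, or a call/return that could not jump
    return visited


ruleset = {
    100: ("call", 1000),
    200: ("allow", None),
    1000: ("count", None),
    1010: ("return", None),
}
print(run(ruleset))  # [100, 1000, 1010, 200]
```

A return reached with an empty stack simply falls through to the next rule in this model, matching the underflow behaviour described for the call action.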
.Pp Note that .Cm return rules usually end a .Dq subroutine and thus are unconditional, but .Nm command-line utility currently requires every action except .Cm check-state to have body. While it is sometimes useful to return only on some packets, usually you want to print just .Dq return for readability. A workaround for this is to use new syntax and .Fl c switch: .Bd -literal -offset indent # Add a rule without actual body ipfw add 2999 return via any # List rules without "from any to any" part ipfw -c list .Ed .Pp This cosmetic annoyance may be fixed in future releases. .It Cm tee Ar port Send a copy of packets matching this rule to the .Xr divert 4 socket bound to port .Ar port . The search continues with the next rule. .It Cm unreach Ar code Discard packets that match this rule, and try to send an ICMP unreachable notice with code .Ar code , where .Ar code is a number from 0 to 255, or one of these aliases: .Cm net , host , protocol , port , .Cm needfrag , srcfail , net-unknown , host-unknown , .Cm isolated , net-prohib , host-prohib , tosnet , .Cm toshost , filter-prohib , host-precedence or .Cm precedence-cutoff . The search terminates. .It Cm unreach6 Ar code Discard packets that match this rule, and try to send an ICMPv6 unreachable notice with code .Ar code , where .Ar code is a number from 0, 1, 3 or 4, or one of these aliases: .Cm no-route, admin-prohib, address or .Cm port . The search terminates. .It Cm netgraph Ar cookie Divert packet into netgraph with given .Ar cookie . The search terminates. If packet is later returned from netgraph it is either accepted or continues with the next rule, depending on .Va net.inet.ip.fw.one_pass sysctl variable. .It Cm ngtee Ar cookie A copy of packet is diverted into netgraph, original packet continues with the next rule. See .Xr ng_ipfw 4 for more information on .Cm netgraph and .Cm ngtee actions. 
.It Cm setfib Ar fibnum | tablearg The packet is tagged so as to use the FIB (routing table) .Ar fibnum in any subsequent forwarding decisions. In the current implementation, this is limited to the values 0 through 15, see .Xr setfib 2 . Processing continues at the next rule. It is possible to use the .Cm tablearg keyword with setfib. If the tablearg value is not within the compiled range of fibs, the packet's fib is set to 0. .It Cm setdscp Ar DSCP | number | tablearg Set the specified DiffServ codepoint for an IPv4/IPv6 packet. Processing continues at the next rule. Supported values are: .Pp .Cm CS0 .Pq Dv 000000 , .Cm CS1 .Pq Dv 001000 , .Cm CS2 .Pq Dv 010000 , .Cm CS3 .Pq Dv 011000 , .Cm CS4 .Pq Dv 100000 , .Cm CS5 .Pq Dv 101000 , .Cm CS6 .Pq Dv 110000 , .Cm CS7 .Pq Dv 111000 , .Cm AF11 .Pq Dv 001010 , .Cm AF12 .Pq Dv 001100 , .Cm AF13 .Pq Dv 001110 , .Cm AF21 .Pq Dv 010010 , .Cm AF22 .Pq Dv 010100 , .Cm AF23 .Pq Dv 010110 , .Cm AF31 .Pq Dv 011010 , .Cm AF32 .Pq Dv 011100 , .Cm AF33 .Pq Dv 011110 , .Cm AF41 .Pq Dv 100010 , .Cm AF42 .Pq Dv 100100 , .Cm AF43 .Pq Dv 100110 , .Cm EF .Pq Dv 101110 , .Cm BE .Pq Dv 000000 . Additionally, the DSCP value can be specified by number (0..63). It is also possible to use the .Cm tablearg keyword with setdscp. If the tablearg value is not within the 0..63 range, the lower 6 bits of the supplied value are used. .It Cm tcp-setmss Ar mss Set the Maximum Segment Size (MSS) in the TCP segment to value .Ar mss . The kernel module .Cm ipfw_pmod should be loaded or the kernel should have .Cm options IPFIREWALL_PMOD to be able to use this action. This command does not change a packet if the original MSS value is lower than the specified value. Both TCP over IPv4 and over IPv6 are supported. Regardless of whether a packet is matched by the .Cm tcp-setmss rule, the search continues with the next rule. .It Cm reass Queue and reassemble IP fragments. If the packet is not fragmented, counters are updated and processing continues with the next rule.
If the packet is the last logical fragment, the packet is reassembled and, if .Va net.inet.ip.fw.one_pass is set to 0, processing continues with the next rule. Otherwise, the packet is allowed to pass and the search terminates. If the packet is a fragment in the middle of a logical group of fragments, it is consumed and processing stops immediately. .Pp Fragment handling can be tuned via .Va net.inet.ip.maxfragpackets and .Va net.inet.ip.maxfragsperpacket which limit, respectively, the maximum number of processable fragments (default: 800) and the maximum number of fragments per packet (default: 16). .Pp NOTA BENE: since fragments do not contain port numbers, they should be avoided with the .Nm reass rule. Alternatively, direction-based (like .Nm in / .Nm out ) and source-based (like .Nm via ) match patterns can be used to select fragments. .Pp Usually a simple rule like: .Bd -literal -offset indent # reassemble incoming fragments ipfw add reass all from any to any in .Ed .Pp is all you need at the beginning of your ruleset. .It Cm abort Discard packets that match this rule, and if the packet is an SCTP packet, try to send an SCTP packet containing an ABORT chunk. The search terminates. .It Cm abort6 Discard packets that match this rule, and if the packet is an SCTP packet, try to send an SCTP packet containing an ABORT chunk. The search terminates. .El .Ss RULE BODY The body of a rule contains zero or more patterns (such as specific source and destination addresses or ports, protocol options, incoming or outgoing interfaces, etc.) that the packet must match in order to be recognised. In general, the patterns are connected by (implicit) .Cm and operators -- i.e., all must match in order for the rule to match. 
Individual patterns can be prefixed by the .Cm not operator to reverse the result of the match, as in .Pp .Dl "ipfw add 100 allow ip from not 1.2.3.4 to any" .Pp Additionally, sets of alternative match patterns .Pq Em or-blocks can be constructed by putting the patterns in lists enclosed between parentheses ( ) or braces { }, and using the .Cm or operator as follows: .Pp .Dl "ipfw add 100 allow ip from { x or not y or z } to any" .Pp Only one level of parentheses is allowed. Beware that most shells have special meanings for parentheses or braces, so it is advisable to put a backslash \\ in front of them to prevent such interpretations. .Pp The body of a rule must in general include a source and destination address specifier. The keyword .Ar any can be used in various places to specify that the content of a required field is irrelevant. .Pp The rule body has the following format: .Bd -ragged -offset indent .Op Ar proto Cm from Ar src Cm to Ar dst .Op Ar options .Ed .Pp The first part (proto from src to dst) is for backward compatibility with earlier versions of .Fx . In modern .Fx any match pattern (including MAC headers, IP protocols, addresses and ports) can be specified in the .Ar options section. .Pp Rule fields have the following meaning: .Bl -tag -width indent .It Ar proto : protocol | Cm { Ar protocol Cm or ... } .It Ar protocol : Oo Cm not Oc Ar protocol-name | protocol-number An IP protocol specified by number or name (for a complete list see .Pa /etc/protocols ) , or one of the following keywords: .Bl -tag -width indent .It Cm ip4 | ipv4 Matches IPv4 packets. .It Cm ip6 | ipv6 Matches IPv6 packets. .It Cm ip | all Matches any packet. .El .Pp The .Cm ipv6 in .Cm proto option will be treated as inner protocol. And, the .Cm ipv4 is not available in .Cm proto option. .Pp The .Cm { Ar protocol Cm or ... } format (an .Em or-block ) is provided for convenience only but its use is deprecated. .It Ar src No and Ar dst : Bro Cm addr | Cm { Ar addr Cm or ... 
} Brc Op Oo Cm not Oc Ar ports An address (or a list, see below) optionally followed by .Ar ports specifiers. .Pp The second format .Em ( or-block with multiple addresses) is provided for convenience only and its use is discouraged. .It Ar addr : Oo Cm not Oc Bro .Cm any | me | me6 | .Cm table Ns Pq Ar name Ns Op , Ns Ar value .Ar | addr-list | addr-set .Brc .Bl -tag -width indent .It Cm any Matches any IP address. .It Cm me Matches any IP address configured on an interface in the system. .It Cm me6 Matches any IPv6 address configured on an interface in the system. The address list is evaluated at the time the packet is analysed. .It Cm table Ns Pq Ar name Ns Op , Ns Ar value Matches any IPv4 or IPv6 address for which an entry exists in the lookup table .Ar name . If an optional 32-bit unsigned .Ar value is also specified, an entry will match only if it has this value. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .El .It Ar addr-list : ip-addr Ns Op Ns , Ns Ar addr-list .It Ar ip-addr : A host or subnet address specified in one of the following ways: .Bl -tag -width indent .It Ar numeric-ip | hostname Matches a single IPv4 address, specified as dotted-quad or a hostname. Hostnames are resolved at the time the rule is added to the firewall list. .It Ar addr Ns / Ns Ar masklen Matches all addresses with base .Ar addr (specified as an IP address, a network number, or a hostname) and mask width of .Cm masklen bits. As an example, 1.2.3.4/25 or 1.2.3.0/25 will match all IP numbers from 1.2.3.0 to 1.2.3.127 . .It Ar addr Ns : Ns Ar mask Matches all addresses with base .Ar addr (specified as an IP address, a network number, or a hostname) and the mask of .Ar mask , specified as a dotted quad. As an example, 1.2.3.4:255.0.255.0 or 1.0.3.0:255.0.255.0 will match 1.*.3.*. This form is advised only for non-contiguous masks.
It is better to resort to the .Ar addr Ns / Ns Ar masklen format for contiguous masks, which is more compact and less error-prone. .El .It Ar addr-set : addr Ns Oo Ns / Ns Ar masklen Oc Ns Cm { Ns Ar list Ns Cm } .It Ar list : Bro Ar num | num-num Brc Ns Op Ns , Ns Ar list Matches all addresses with base address .Ar addr (specified as an IP address, a network number, or a hostname) and whose last byte is in the list between braces { } . Note that there must be no spaces between braces and numbers (spaces after commas are allowed). Elements of the list can be specified as single entries or ranges. The .Ar masklen field is used to limit the size of the set of addresses, and can have any value between 24 and 32. If not specified, it is assumed to be 24. .br This format is particularly useful to handle sparse address sets within a single rule. Because the matching occurs using a bitmask, it takes constant time and dramatically reduces the complexity of rulesets. .br As an example, an address specified as 1.2.3.4/24{128,35-55,89} or 1.2.3.0/24{128,35-55,89} will match the following IP addresses: .br 1.2.3.128, 1.2.3.35 to 1.2.3.55, 1.2.3.89 . .It Ar addr6-list : ip6-addr Ns Op Ns , Ns Ar addr6-list .It Ar ip6-addr : A host or subnet specified in one of the following ways: .Bl -tag -width indent .It Ar numeric-ip | hostname Matches a single IPv6 address as allowed by .Xr inet_pton 3 or a hostname. Hostnames are resolved at the time the rule is added to the firewall list. .It Ar addr Ns / Ns Ar masklen Matches all IPv6 addresses with base .Ar addr (specified as allowed by .Xr inet_pton 3 or a hostname) and mask width of .Cm masklen bits. .It Ar addr Ns / Ns Ar mask Matches all IPv6 addresses with base .Ar addr (specified as allowed by .Xr inet_pton 3 or a hostname) and the mask of .Ar mask , specified as allowed by .Xr inet_pton 3 . As an example, fe::640:0:0/ffff::ffff:ffff:0:0 will match fe:*:*:*:0:640:*:*. This form is advised only for non-contiguous masks.
It is better to resort to the .Ar addr Ns / Ns Ar masklen format for contiguous masks, which is more compact and less error-prone. .El .Pp No support for sets of IPv6 addresses is provided because IPv6 addresses are typically random past the initial prefix. .It Ar ports : Bro Ar port | port Ns \&- Ns Ar port Ns Brc Ns Op , Ns Ar ports For protocols which support port numbers (such as SCTP, TCP and UDP), optional .Cm ports may be specified as one or more ports or port ranges, separated by commas but no spaces, and an optional .Cm not operator. The .Ql \&- notation specifies a range of ports (including boundaries). .Pp Service names (from .Pa /etc/services ) may be used instead of numeric port values. The length of the port list is limited to 30 ports or ranges, though one can specify larger ranges by using an .Em or-block in the .Cm options section of the rule. .Pp A backslash .Pq Ql \e can be used to escape the dash .Pq Ql - character in a service name (from a shell, the backslash must be typed twice to avoid the shell itself interpreting it as an escape character). .Pp .Dl "ipfw add count tcp from any ftp\e\e-data-ftp to any" .Pp Fragmented packets which have a non-zero offset (i.e., not the first fragment) will never match a rule which has one or more port specifications. See the .Cm frag option for details on matching fragmented packets. .El .Ss RULE OPTIONS (MATCH PATTERNS) Additional match patterns can be used within rules. Zero or more of these so-called .Em options can be present in a rule, optionally prefixed by the .Cm not operand, and possibly grouped into .Em or-blocks . .Pp The following match patterns can be used (listed in alphabetical order): .Bl -tag -width indent .It Cm // this is a comment. Inserts the specified text as a comment in the rule. Everything following // is considered as a comment and stored in the rule. You can have comment-only rules, which are listed as having a .Cm count action followed by the comment. 
.It Cm bridged Alias for .Cm layer2 . .It Cm diverted Matches only packets generated by a divert socket. .It Cm diverted-loopback Matches only packets coming from a divert socket back into the IP stack input for delivery. .It Cm diverted-output Matches only packets going from a divert socket back outward to the IP stack output for delivery. .It Cm dst-ip Ar ip-address Matches IPv4 packets whose destination IP is one of the address(es) specified as argument. .It Bro Cm dst-ip6 | dst-ipv6 Brc Ar ip6-address Matches IPv6 packets whose destination IP is one of the address(es) specified as argument. .It Cm dst-port Ar ports Matches IP packets whose destination port is one of the port(s) specified as argument. .It Cm established Matches TCP packets that have the RST or ACK bits set. .It Cm ext6hdr Ar header Matches IPv6 packets containing the extended header given by .Ar header . Supported headers are: .Pp Fragment .Pq Cm frag , Hop-by-hop options .Pq Cm hopopt , any type of Routing Header .Pq Cm route , Source routing Routing Header Type 0 .Pq Cm rthdr0 , Mobile IPv6 Routing Header Type 2 .Pq Cm rthdr2 , Destination options .Pq Cm dstopt , IPsec authentication headers .Pq Cm ah , and IPsec encapsulated security payload headers .Pq Cm esp . .It Cm fib Ar fibnum Matches a packet that has been tagged to use the given FIB (routing table) number. .It Cm flow Ar table Ns Pq Ar name Ns Op , Ns Ar value Search for the flow entry in lookup table .Ar name . If not found, the match fails. Otherwise, the match succeeds and .Cm tablearg is set to the value extracted from the table. .Pp This option can be useful to quickly dispatch traffic based on certain packet fields. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .It Cm flow-id Ar labels Matches IPv6 packets containing any of the flow labels given in .Ar labels . .Ar labels is a comma separated list of numeric flow labels.
.It Cm frag Matches packets that are fragments and not the first fragment of an IP datagram. Note that these packets will not have the next protocol header (e.g.\& TCP, UDP) so options that look into these headers cannot match. .It Cm gid Ar group Matches all TCP or UDP packets sent by or received for a .Ar group . A .Ar group may be specified by name or number. .It Cm jail Ar prisonID Matches all TCP or UDP packets sent by or received for the jail whose prison ID is .Ar prisonID . .It Cm icmptypes Ar types Matches ICMP packets whose ICMP type is in the list .Ar types . The list may be specified as any combination of individual types (numeric) separated by commas. .Em Ranges are not allowed . The supported ICMP types are: .Pp echo reply .Pq Cm 0 , destination unreachable .Pq Cm 3 , source quench .Pq Cm 4 , redirect .Pq Cm 5 , echo request .Pq Cm 8 , router advertisement .Pq Cm 9 , router solicitation .Pq Cm 10 , time-to-live exceeded .Pq Cm 11 , IP header bad .Pq Cm 12 , timestamp request .Pq Cm 13 , timestamp reply .Pq Cm 14 , information request .Pq Cm 15 , information reply .Pq Cm 16 , address mask request .Pq Cm 17 and address mask reply .Pq Cm 18 . .It Cm icmp6types Ar types Matches ICMP6 packets whose ICMP6 type is in the list of .Ar types . The list may be specified as any combination of individual types (numeric) separated by commas. .Em Ranges are not allowed . .It Cm in | out Matches incoming or outgoing packets, respectively. .Cm in and .Cm out are mutually exclusive (in fact, .Cm out is implemented as .Cm not in Ns No ). .It Cm ipid Ar id-list Matches IPv4 packets whose .Cm ip_id field has value included in .Ar id-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm iplen Ar len-list Matches IP packets whose total length, including header and data, is in the set .Ar len-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports .
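.Pp The list syntax shared by id-list, len-list and ports (comma-separated single values or inclusive ranges written with a dash) can be illustrated with a short matcher. This is a sketch for numeric values only; it does not handle service names or the not operator, and the function names are hypothetical rather than anything ipfw provides.

```python
def parse_list(spec):
    """Turn a spec such as '40-60,1500' into a list of inclusive (low, high) pairs."""
    ranges = []
    for item in spec.split(","):
        low, dash, high = item.partition("-")
        # A single value N is treated as the range N-N.
        ranges.append((int(low), int(high) if dash else int(low)))
    return ranges


def in_list(value, spec):
    """True if value falls in any entry of spec; range boundaries are included."""
    return any(low <= value <= high for low, high in parse_list(spec))


print(in_list(1500, "40-60,1500"))
print(in_list(61, "40-60,1500"))
```

Under this model a rule using iplen 40-60,1500 would match packets whose total length is 40 through 60 bytes, or exactly 1500 bytes.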
.It Cm ipoptions Ar spec Matches packets whose IPv4 header contains the comma separated list of options specified in .Ar spec . The supported IP options are: .Pp .Cm ssrr (strict source route), .Cm lsrr (loose source route), .Cm rr (record packet route) and .Cm ts (timestamp). The absence of a particular option may be denoted with a .Ql \&! . .It Cm ipprecedence Ar precedence Matches IPv4 packets whose precedence field is equal to .Ar precedence . .It Cm ipsec Matches packets that have IPSEC history associated with them (i.e., the packet comes encapsulated in IPSEC, the kernel has IPSEC support, and can correctly decapsulate it). .Pp Note that specifying .Cm ipsec is different from specifying .Cm proto Ar ipsec as the latter will only look at the specific IP protocol field, irrespective of IPSEC kernel support and the validity of the IPSEC data. .Pp Further note that this flag is silently ignored in kernels without IPSEC support. It does not affect rule processing when given and the rules are handled as if with no .Cm ipsec flag. .It Cm iptos Ar spec Matches IPv4 packets whose .Cm tos field contains the comma separated list of service types specified in .Ar spec . The supported IP types of service are: .Pp .Cm lowdelay .Pq Dv IPTOS_LOWDELAY , .Cm throughput .Pq Dv IPTOS_THROUGHPUT , .Cm reliability .Pq Dv IPTOS_RELIABILITY , .Cm mincost .Pq Dv IPTOS_MINCOST , .Cm congestion .Pq Dv IPTOS_ECN_CE . The absence of a particular type may be denoted with a .Ql \&! . .It Cm dscp spec Ns Op , Ns Ar spec Matches IPv4/IPv6 packets whose .Cm DS field value is contained in .Ar spec mask. Multiple values can be specified via the comma separated list. Value can be one of keywords used in .Cm setdscp action or exact number. .It Cm ipttl Ar ttl-list Matches IPv4 packets whose time to live is included in .Ar ttl-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . 
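.Pp The keywords accepted by the dscp option follow a regular bit pattern, which the following sketch reproduces: class selectors CSn place n in the upper three bits, and AFxy combines class x with drop precedence y. The function name dscp_value is hypothetical; the bit layout is taken from the values listed for the setdscp action.

```python
def dscp_value(name):
    """Return the 6-bit codepoint for a DSCP keyword such as 'AF21' or 'CS3'."""
    name = name.upper()
    if name == "BE":            # best effort, same codepoint as CS0
        return 0
    if name == "EF":            # expedited forwarding
        return 0b101110
    if name.startswith("CS") and len(name) == 3 and name[2] in "01234567":
        return int(name[2]) << 3
    if (name.startswith("AF") and len(name) == 4
            and name[2] in "1234" and name[3] in "123"):
        return (int(name[2]) << 3) | (int(name[3]) << 1)
    raise ValueError("unknown DSCP keyword: " + name)


for keyword in ("CS5", "AF11", "AF43", "EF"):
    # Printed in the same six-digit binary form used in the keyword list.
    print(keyword, format(dscp_value(keyword), "06b"))
```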
.It Cm ipversion Ar ver Matches IP packets whose IP version field is .Ar ver . .It Cm keep-state Op Ar :flowname Upon a match, the firewall will create a dynamic rule, whose default behaviour is to match bidirectional traffic between source and destination IP/port using the same protocol. The rule has a limited lifetime (controlled by a set of .Xr sysctl 8 variables), and the lifetime is refreshed every time a matching packet is found. The .Ar :flowname is used to assign a parameter to the dynamic rule in addition to the addresses, ports and protocol. It can be used for more accurate matching by the .Cm check-state rule. The .Cm :default keyword is a special name used for compatibility with old rulesets. .It Cm layer2 Matches only layer2 packets, i.e., those passed to .Nm from ether_demux() and ether_output_frame(). .It Cm limit Bro Cm src-addr | src-port | dst-addr | dst-port Brc Ar N Op Ar :flowname The firewall will only allow .Ar N connections with the same set of parameters as specified in the rule. One or more of source and destination addresses and ports can be specified. .It Cm lookup Bro Cm dst-ip | dst-port | src-ip | src-port | uid | jail Brc Ar name Search for an entry in lookup table .Ar name that matches the field specified as argument. If not found, the match fails. Otherwise, the match succeeds and .Cm tablearg is set to the value extracted from the table. .Pp This option can be useful to quickly dispatch traffic based on certain packet fields. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .It Cm { MAC | mac } Ar dst-mac src-mac Match packets with the given .Ar dst-mac and .Ar src-mac addresses, specified as the .Cm any keyword (matching any MAC address), or six groups of hex digits separated by colons, and optionally followed by a mask indicating the significant bits. The mask may be specified using either of the following methods: .Bl -enum -width indent .It A slash .Pq / followed by the number of significant bits.
For example, an address with 33 significant bits could be specified as: .Pp .Dl "MAC 10:20:30:40:50:60/33 any" .It An ampersand .Pq & followed by a bitmask specified as six groups of hex digits separated by colons. For example, an address in which the last 16 bits are significant could be specified as: .Pp .Dl "MAC 10:20:30:40:50:60&00:00:00:00:ff:ff any" .Pp Note that the ampersand character has a special meaning in many shells and should generally be escaped. .El Note that the order of MAC addresses (destination first, source second) is the same as on the wire, but the opposite of the one used for IP addresses. .It Cm mac-type Ar mac-type Matches packets whose Ethernet Type field corresponds to one of those specified as argument. .Ar mac-type is specified in the same way as .Cm port numbers (i.e., one or more comma-separated single values or ranges). You can use symbolic names for known values such as .Em vlan , ipv4, ipv6 . Values can be entered as decimal or hexadecimal (if prefixed by 0x), and they are always printed as hexadecimal (unless the .Cm -N option is used, in which case symbolic resolution will be attempted). .It Cm proto Ar protocol Matches packets with the corresponding IP protocol. .It Cm recv | xmit | via Brq Ar ifX | Ar if Ns Cm * | Ar table Ns Po Ar name Ns Oo , Ns Ar value Oc Pc | Ar ipno | Ar any Matches packets received, transmitted or going through, respectively, the interface specified by exact name .Po Ar ifX Pc , by device name .Po Ar if* Pc , by IP address, or through some interface. Table .Ar name may be used to match interface by its kernel ifindex. See the .Sx LOOKUP TABLES section below for more information on lookup tables. .Pp The .Cm via keyword causes the interface to always be checked. If .Cm recv or .Cm xmit is used instead of .Cm via , then only the receive or transmit interface (respectively) is checked. 
By specifying both, it is possible to match packets based on both receive and transmit interface, e.g.: .Pp .Dl "ipfw add deny ip from any to any out recv ed0 xmit ed1" .Pp The .Cm recv interface can be tested on either incoming or outgoing packets, while the .Cm xmit interface can only be tested on outgoing packets. So .Cm out is required (and .Cm in is invalid) whenever .Cm xmit is used. .Pp A packet might not have a receive or transmit interface: packets originating from the local host have no receive interface, while packets destined for the local host have no transmit interface. .It Cm setup Matches TCP packets that have the SYN bit set but no ACK bit. This is the short form of .Dq Li tcpflags\ syn,!ack . .It Cm sockarg Matches packets that are associated with a local socket and for which the SO_USER_COOKIE socket option has been set to a non-zero value. As a side effect, the value of the option is made available as .Cm tablearg value, which in turn can be used as .Cm skipto or .Cm pipe number. .It Cm src-ip Ar ip-address Matches IPv4 packets whose source IP is one of the address(es) specified as an argument. .It Cm src-ip6 Ar ip6-address Matches IPv6 packets whose source IP is one of the address(es) specified as an argument. .It Cm src-port Ar ports Matches IP packets whose source port is one of the port(s) specified as argument. .It Cm tagged Ar tag-list Matches packets whose tags are included in .Ar tag-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . Tags can be applied to the packet using the .Cm tag rule action parameter (see its description for details on tags). .It Cm tcpack Ar ack TCP packets only. Match if the TCP header acknowledgment number field is set to .Ar ack . .It Cm tcpdatalen Ar tcpdatalen-list Matches TCP packets whose length of TCP data is .Ar tcpdatalen-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports .
.It Cm tcpflags Ar spec TCP packets only. Match if the TCP header contains the comma separated list of flags specified in .Ar spec . The supported TCP flags are: .Pp .Cm fin , .Cm syn , .Cm rst , .Cm psh , .Cm ack and .Cm urg . The absence of a particular flag may be denoted with a .Ql \&! . A rule which contains a .Cm tcpflags specification can never match a fragmented packet which has a non-zero offset. See the .Cm frag option for details on matching fragmented packets. .It Cm tcpseq Ar seq TCP packets only. Match if the TCP header sequence number field is set to .Ar seq . .It Cm tcpwin Ar tcpwin-list Matches TCP packets whose header window field is set to .Ar tcpwin-list , which is either a single value or a list of values or ranges specified in the same way as .Ar ports . .It Cm tcpoptions Ar spec TCP packets only. Match if the TCP header contains the comma separated list of options specified in .Ar spec . The supported TCP options are: .Pp .Cm mss (maximum segment size), .Cm window (tcp window advertisement), .Cm sack (selective ack), .Cm ts (rfc1323 timestamp) and .Cm cc (rfc1644 t/tcp connection count). The absence of a particular option may be denoted with a .Ql \&! . .It Cm uid Ar user Match all TCP or UDP packets sent by or received for a .Ar user . A .Ar user may be matched by name or identification number. .It Cm verrevpath For incoming packets, a routing table lookup is done on the packet's source address. If the interface on which the packet entered the system matches the outgoing interface for the route, the packet matches. If the interfaces do not match up, the packet does not match. All outgoing packets or packets with no incoming interface match. .Pp The name and functionality of the option is intentionally similar to the Cisco IOS command: .Pp .Dl ip verify unicast reverse-path .Pp This option can be used to make anti-spoofing rules to reject all packets with source addresses not from this interface. See also the option .Cm antispoof . 
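.Pp The reverse-path check performed by verrevpath can be sketched as a longest-prefix route lookup on the source address followed by an interface comparison. This is a simplified model, not the kernel routine: the route list format and the function names are hypothetical, and treating a missing route as a non-match is an assumption of the sketch.

```python
import ipaddress


def lookup_route(routes, addr):
    """Longest-prefix match; routes is a list of (prefix, interface) pairs."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, ifname in routes:
        network = ipaddress.ip_network(prefix)
        if ip.version == network.version and ip in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, ifname)
    return best[1] if best else None


def verrevpath(routes, src, recv_if, direction="in"):
    """Model of the match result for one packet."""
    if direction == "out" or recv_if is None:
        return True  # outgoing packets and packets with no incoming interface match
    # Assumption: a source with no route at all is treated as a non-match.
    return lookup_route(routes, src) == recv_if


routes = [("0.0.0.0/0", "em0"), ("192.168.1.0/24", "em1")]
print(verrevpath(routes, "192.168.1.5", "em1"))
print(verrevpath(routes, "192.168.1.5", "em0"))
```

In this example a packet claiming an inside source address but arriving on the outside interface em0 fails the check, which is the spoofing case the option is meant to catch.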
.It Cm versrcreach For incoming packets, a routing table lookup is done on the packet's source address. If a route to the source address exists, but not the default route or a blackhole/reject route, the packet matches. Otherwise, the packet does not match. All outgoing packets match. .Pp The name and functionality of the option is intentionally similar to the Cisco IOS command: .Pp .Dl ip verify unicast source reachable-via any .Pp This option can be used to make anti-spoofing rules to reject all packets whose source address is unreachable. .It Cm antispoof For incoming packets, the packet's source address is checked if it belongs to a directly connected network. If the network is directly connected, then the interface the packet came on in is compared to the interface the network is connected to. When incoming interface and directly connected interface are not the same, the packet does not match. Otherwise, the packet does match. All outgoing packets match. .Pp This option can be used to make anti-spoofing rules to reject all packets that pretend to be from a directly connected network but do not come in through that interface. This option is similar to but more restricted than .Cm verrevpath because it engages only on packets with source addresses of directly connected networks instead of all source addresses. .El .Sh LOOKUP TABLES Lookup tables are useful to handle large sparse sets of addresses or other search keys (e.g., ports, jail IDs, interface names). In the rest of this section we will use the term ``key''. Table name needs to match the following spec: .Ar table-name . Tables with the same name can be created in different .Ar sets . However, rule links to the tables in .Ar set 0 by default. This behavior can be controlled by .Va net.inet.ip.fw.tables_sets variable. See the .Sx SETS OF RULES section for more information. There may be up to 65535 different lookup tables. 
.Pp The following table types are supported: .Bl -tag -width indent .It Ar table-type : Ar addr | iface | number | flow .It Ar table-key : Ar addr Ns Oo / Ns Ar masklen Oc | iface-name | number | flow-spec .It Ar flow-spec : Ar flow-field Ns Op , Ns Ar flow-spec .It Ar flow-field : src-ip | proto | src-port | dst-ip | dst-port .It Cm addr matches IPv4 or IPv6 addresses. Each entry is represented by an .Ar addr Ns Op / Ns Ar masklen and will match all addresses with base .Ar addr (specified as an IPv4/IPv6 address, or a hostname) and mask width of .Ar masklen bits. If .Ar masklen is not specified, it defaults to 32 for IPv4 and 128 for IPv6. When looking up an IP address in a table, the most specific entry will match. .It Cm iface matches interface names. Each entry is represented by a string treated as an interface name. Wildcards are not supported. .It Cm number matches protocol ports, uids/gids or jail IDs. Each entry is represented by a 32-bit unsigned integer. Ranges are not supported. .It Cm flow Matches packet fields specified by .Ar flow type suboptions with table entries. .El .Pp Tables require explicit creation via .Cm create before use. .Pp The following creation options are supported: .Bl -tag -width indent .It Ar create-options : Ar create-option | create-options .It Ar create-option : Cm type Ar table-type | Cm valtype Ar value-mask | Cm algo Ar algo-desc | .Cm limit Ar number | Cm locked .It Cm type Table key type. .It Cm valtype Table value mask. .It Cm algo Table algorithm to use (see below). .It Cm limit Maximum number of items that may be inserted into the table. .It Cm locked Restrict any table modifications. .El .Pp Some of these options may be modified later via the .Cm modify keyword. The following options can be changed: .Bl -tag -width indent .It Ar modify-options : Ar modify-option | modify-options .It Ar modify-option : Cm limit Ar number .It Cm limit Alter maximum number of items that may be inserted into the table.
.El .Pp Additionally, a table can be locked or unlocked using the .Cm lock or .Cm unlock commands. .Pp Tables of the same .Ar type can be swapped with each other using the .Cm swap Ar name command. Swap may fail if table limits are set and the data exchange would exceed them. The operation is performed atomically. .Pp One or more entries can be added to a table at once using the .Cm add command. Addition of all items is performed atomically. By default, an error in the addition of one entry does not influence the addition of other entries. However, a non-zero error code is returned in that case. The special .Cm atomic keyword may be specified before .Cm add to indicate an all-or-none add request. .Pp One or more entries can be removed from a table at once using the .Cm delete command. By default, an error in the removal of one entry does not influence the removal of other entries. However, a non-zero error code is returned in that case. .Pp It may be possible to check what entry will be found for a particular .Ar table-key using the .Cm lookup .Ar table-key command. This functionality is optional and may be unsupported in some algorithms. .Pp The following operations can be performed on .Ar one or .Cm all tables: .Bl -tag -width indent .It Cm list List all entries. .It Cm flush Removes all entries. .It Cm info Shows generic table information. .It Cm detail Shows generic table information and algo-specific data. .El .Pp The following lookup algorithms are supported: .Bl -tag -width indent .It Ar algo-desc : algo-name | "algo-name algo-data" .It Ar algo-name: Ar addr:radix | addr:hash | iface:array | number:array | flow:hash .It Cm addr:radix Separate Radix trees for IPv4 and IPv6, the same way as the routing table (see .Xr route 4 ) . Default choice for .Ar addr type. .It Cm addr:hash Separate auto-growing hashes for IPv4 and IPv6. Accepts entries with the same mask length specified initially via .Cm "addr:hash masks=/v4,/v6" algorithm creation options. Assumes /32 and /128 masks by default.
Search removes host bits (according to mask) from supplied address and checks resulting key in appropriate hash. Mostly optimized for /64 and byte-ranged IPv6 masks. .It Cm iface:array Array storing sorted indexes for entries which are presented in the system. Optimized for very fast lookup. .It Cm number:array Array storing sorted u32 numbers. .It Cm flow:hash Auto-growing hash storing flow entries. Search calculates hash on required packet fields and searches for matching entries in selected bucket. .El .Pp The .Cm tablearg feature provides the ability to use a value, looked up in the table, as the argument for a rule action, action parameter or rule option. This can significantly reduce number of rules in some configurations. If two tables are used in a rule, the result of the second (destination) is used. .Pp Each record may hold one or more values according to .Ar value-mask . This mask is set on table creation via .Cm valtype option. The following value types are supported: .Bl -tag -width indent .It Ar value-mask : Ar value-type Ns Op , Ns Ar value-mask .It Ar value-type : Ar skipto | pipe | fib | nat | dscp | tag | divert | .Ar netgraph | limit | ipv4 .It Cm skipto rule number to jump to. .It Cm pipe Pipe number to use. .It Cm fib fib number to match/set. .It Cm nat nat number to jump to. .It Cm dscp dscp value to match/set. .It Cm tag tag number to match/set. .It Cm divert port number to divert traffic to. .It Cm netgraph hook number to move packet to. .It Cm limit maximum number of connections. .It Cm ipv4 IPv4 nexthop to fwd packets to. .It Cm ipv6 IPv6 nexthop to fwd packets to. .El .Pp The .Cm tablearg argument can be used with the following actions: .Cm nat, pipe , queue, divert, tee, netgraph, ngtee, fwd, skipto, setfib, action parameters: .Cm tag, untag, rule options: .Cm limit, tagged. .Pp When used with the .Cm skipto action, the user should be aware that the code will walk the ruleset up to a rule equal to, or past, the given number. 
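.Pp
As a minimal sketch of the table and
.Cm tablearg
features described above, a single rule can send each subnet to its own pipe.
The table name T1, the addresses, the bandwidths and the pipe numbers are
illustrative assumptions, not values prescribed by this manual:
.Bd -literal -offset indent
# one pipe per traffic class
ipfw pipe 1 config bw 1000Kbyte/s
ipfw pipe 4 config bw 4000Kbyte/s
# address table whose values are pipe numbers
ipfw table T1 create type addr valtype pipe
ipfw table T1 add 192.168.2.0/24 1
ipfw table T1 add 192.168.0.0/27 4
# the value found in T1 becomes the pipe number
ipfw add pipe tablearg ip from 'table(T1)' to any
# check which entry a given key resolves to
ipfw table T1 lookup 192.168.2.5
.Ed
.Pp
With this layout, moving a subnet to another class only requires changing
its table entry; the rule itself stays untouched.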
.Pp See the .Sx EXAMPLES Section for example usage of tables and the tablearg keyword. .Sh SETS OF RULES Each rule or table belongs to one of 32 different .Em sets , numbered 0 to 31. Set 31 is reserved for the default rule. .Pp By default, rules or tables are put in set 0, unless you use the .Cm set N attribute when adding a new rule or table. Sets can be individually and atomically enabled or disabled, so this mechanism permits an easy way to store multiple configurations of the firewall and quickly (and atomically) switch between them. .Pp By default, tables from set 0 are referenced when adding a rule with table opcodes, regardless of the rule's set. This behavior can be changed by setting the .Va net.inet.ip.fw.tables_sets variable to 1. The rule's set will then be used for table references. .Pp The command to enable/disable sets is .Bd -ragged -offset indent .Nm .Cm set Oo Cm disable Ar number ... Oc Op Cm enable Ar number ... .Ed .Pp where multiple .Cm enable or .Cm disable sections can be specified. Command execution is atomic on all the sets specified in the command. By default, all sets are enabled. .Pp When you disable a set, its rules behave as if they do not exist in the firewall configuration, with only one exception: .Bd -ragged -offset indent dynamic rules created from a rule before it had been disabled will still be active until they expire. In order to delete dynamic rules you have to explicitly delete the parent rule which generated them. .Ed .Pp The set number of rules can be changed with the command .Bd -ragged -offset indent .Nm .Cm set move .Brq Cm rule Ar rule-number | old-set .Cm to Ar new-set .Ed .Pp Also, you can atomically swap two rulesets with the command .Bd -ragged -offset indent .Nm .Cm set swap Ar first-set second-set .Ed .Pp See the .Sx EXAMPLES Section for some possible uses of sets of rules.
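.Pp
A short sketch of the set commands described in this section follows.
The set numbers (18 and 19), the rule number and the rule body are
arbitrary choices made for illustration only:
.Bd -literal -offset indent
# stage a rule in a disabled set, then activate it atomically
ipfw set disable 18
ipfw add 1000 set 18 allow tcp from any to any 22 setup keep-state
ipfw set enable 18
# move every rule of set 18 into set 19
ipfw set move 18 to 19
# atomically exchange the contents of two sets
ipfw set swap 18 19
.Ed
.Pp
Because a disabled set behaves as if its rules do not exist, a larger
configuration can be assembled rule by rule without affecting live traffic
until the final
.Cm set enable .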
.Sh STATEFUL FIREWALL Stateful operation is a way for the firewall to dynamically create rules for specific flows when packets that match a given pattern are detected. Support for stateful operation comes through the .Cm check-state , keep-state and .Cm limit options of .Nm rules . .Pp Dynamic rules are created when a packet matches a .Cm keep-state or .Cm limit rule, causing the creation of a .Em dynamic rule which will match all and only packets with a given .Em protocol between a .Em src-ip/src-port dst-ip/dst-port pair of addresses .Em ( src and .Em dst are used here only to denote the initial match addresses, but they are completely equivalent afterwards). Rules created by .Cm keep-state option also have a .Ar :flowname taken from it. This name is used in matching together with addresses, ports and protocol. Dynamic rules will be checked at the first .Cm check-state, keep-state or .Cm limit occurrence, and the action performed upon a match will be the same as in the parent rule. .Pp Note that no additional attributes other than protocol and IP addresses and ports and :flowname are checked on dynamic rules. .Pp The typical use of dynamic rules is to keep a closed firewall configuration, but let the first TCP SYN packet from the inside network install a dynamic rule for the flow so that packets belonging to that session will be allowed through the firewall: .Pp .Dl "ipfw add check-state :OUTBOUND" .Dl "ipfw add allow tcp from my-subnet to any setup keep-state :OUTBOUND" .Dl "ipfw add deny tcp from any to any" .Pp A similar approach can be used for UDP, where an UDP packet coming from the inside will install a dynamic rule to let the response through the firewall: .Pp .Dl "ipfw add check-state :OUTBOUND" .Dl "ipfw add allow udp from my-subnet to any keep-state :OUTBOUND" .Dl "ipfw add deny udp from any to any" .Pp Dynamic rules expire after some time, which depends on the status of the flow and the setting of some .Cm sysctl variables. 
See Section .Sx SYSCTL VARIABLES for more details. For TCP sessions, dynamic rules can be instructed to periodically send keepalive packets to refresh the state of the rule when it is about to expire. .Pp See Section .Sx EXAMPLES for more examples on how to use dynamic rules. .Sh TRAFFIC SHAPER (DUMMYNET) CONFIGURATION .Nm is also the user interface for the .Nm dummynet traffic shaper, packet scheduler and network emulator, a subsystem that can artificially queue, delay or drop packets emulating the behaviour of certain network links or queueing systems. .Pp .Nm dummynet operates by first using the firewall to select packets using any match pattern that can be used in .Nm rules. Matching packets are then passed to either of two different objects, which implement the traffic regulation: .Bl -hang -offset XXXX .It Em pipe A .Em pipe emulates a .Em link with given bandwidth and propagation delay, driven by a FIFO scheduler and a single queue with programmable queue size and packet loss rate. Packets are appended to the queue as they come out from .Nm ipfw , and then transferred in FIFO order to the link at the desired rate. .It Em queue A .Em queue is an abstraction used to implement packet scheduling using one of several packet scheduling algorithms. Packets sent to a .Em queue are first grouped into flows according to a mask on the 5-tuple. Flows are then passed to the scheduler associated to the .Em queue , and each flow uses scheduling parameters (weight and others) as configured in the .Em queue itself. A scheduler in turn is connected to an emulated link, and arbitrates the link's bandwidth among backlogged flows according to weights and to the features of the scheduling algorithm in use. .El .Pp In practice, .Em pipes can be used to set hard limits to the bandwidth that a flow can use, whereas .Em queues can be used to determine how different flows share the available bandwidth. 
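.Pp
The difference can be sketched with two alternative configurations.
All pipe, queue and rule parameters below (numbers, bandwidths, weights
and the port used for matching) are assumptions chosen for illustration:
.Bd -literal -offset indent
# alternative 1: hard limit, all outgoing traffic is capped at 300Kbit/s
ipfw pipe 1 config bw 300Kbit/s
ipfw add pipe 1 ip from any to any out

# alternative 2: sharing, two queues divide pipe 2 in a 70:30 ratio
ipfw pipe 2 config bw 1Mbit/s
ipfw queue 1 config pipe 2 weight 70
ipfw queue 2 config pipe 2 weight 30
ipfw add queue 1 tcp from any to any 80 out
ipfw add queue 2 ip from any to any out
.Ed
.Pp
In the second case the weights only matter while both queues are backlogged;
when one of them is idle, the other may use the whole bandwidth of the pipe.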
.Pp A graphical representation of the binding of queues, flows, schedulers and links is below.
.Bd -literal -offset indent
                  (flow_mask|sched_mask)  sched_mask
        +---------+   weight Wx  +-------------+
        |         |->-[flow]-->--|             |-+
   -->--| QUEUE x |   ...        |             | |
        |         |->-[flow]-->--| SCHEDuler N | |
        +---------+              |             | |
            ...                  |             +--[LINK N]-->--
        +---------+   weight Wy  |             | +--[LINK N]-->--
        |         |->-[flow]-->--|             | |
   -->--| QUEUE y |   ...        |             | |
        |         |->-[flow]-->--|             | |
        +---------+              +-------------+ |
                                   +-------------+
.Ed
It is important to understand the role of the SCHED_MASK and FLOW_MASK, which are configured through the commands .Dl "ipfw sched N config mask SCHED_MASK ..." and .Dl "ipfw queue X config mask FLOW_MASK ..." . .Pp The SCHED_MASK is used to assign flows to one or more scheduler instances, one for each value of the packet's 5-tuple after applying SCHED_MASK. As an example, using ``src-ip 0xffffff00'' creates one instance for each /24 source subnet. .Pp The FLOW_MASK, together with the SCHED_MASK, is used to split packets into flows. As an example, using ``src-ip 0x000000ff'' together with the previous SCHED_MASK makes a flow for each individual source address. In turn, flows for each /24 subnet will be sent to the same scheduler instance. .Pp The above diagram holds even for the .Em pipe case, with the only restriction that a .Em pipe only supports a SCHED_MASK, and forces the use of a FIFO scheduler (these are for backward compatibility reasons; in fact, internally, a .Nm dummynet's pipe is implemented exactly as above). .Pp There are two modes of .Nm dummynet operation: .Dq normal and .Dq fast . The .Dq normal mode tries to emulate a real link: the .Nm dummynet scheduler ensures that the packet will not leave the pipe faster than it would on the real link with a given bandwidth. The .Dq fast mode allows certain packets to bypass the .Nm dummynet scheduler (if packet flow does not exceed pipe's bandwidth).
This is the reason why the .Dq fast mode requires less CPU cycles per packet (on average) and packet latency can be significantly lower in comparison to a real link with the same bandwidth. The default mode is .Dq normal . The .Dq fast mode can be enabled by setting the .Va net.inet.ip.dummynet.io_fast .Xr sysctl 8 variable to a non-zero value. .Pp .Ss PIPE, QUEUE AND SCHEDULER CONFIGURATION The .Em pipe , .Em queue and .Em scheduler configuration commands are the following: .Bd -ragged -offset indent .Cm pipe Ar number Cm config Ar pipe-configuration .Pp .Cm queue Ar number Cm config Ar queue-configuration .Pp .Cm sched Ar number Cm config Ar sched-configuration .Ed .Pp The following parameters can be configured for a pipe: .Pp .Bl -tag -width indent -compact .It Cm bw Ar bandwidth | device Bandwidth, measured in .Sm off .Op Cm K | M | G .Brq Cm bit/s | Byte/s . .Sm on .Pp A value of 0 (default) means unlimited bandwidth. The unit must immediately follow the number, as in .Pp .Dl "ipfw pipe 1 config bw 300Kbit/s" .Pp If a device name is specified instead of a numeric value, as in .Pp .Dl "ipfw pipe 1 config bw tun0" .Pp then the transmit clock is supplied by the specified device. At the moment only the .Xr tun 4 device supports this functionality, for use in conjunction with .Xr ppp 8 . .Pp .It Cm delay Ar ms-delay Propagation delay, measured in milliseconds. The value is rounded to the next multiple of the clock tick (typically 10ms, but it is a good practice to run kernels with .Dq "options HZ=1000" to reduce the granularity to 1ms or less). The default value is 0, meaning no delay. .Pp .It Cm burst Ar size If the data to be sent exceeds the pipe's bandwidth limit (and the pipe was previously idle), up to .Ar size bytes of data are allowed to bypass the .Nm dummynet scheduler, and will be sent as fast as the physical link allows. Any additional data will be transmitted at the rate specified by the .Nm pipe bandwidth. 
The burst size depends on how long the pipe has been idle; the effective burst size is calculated as follows: MAX( .Ar size , .Nm bw * pipe_idle_time). .Pp .It Cm profile Ar filename A file specifying the additional overhead incurred in the transmission of a packet on the link. .Pp Some link types introduce extra delays in the transmission of a packet, e.g., because of MAC level framing, contention on the use of the channel, MAC level retransmissions and so on. From our point of view, the channel is effectively unavailable for this extra time, which is constant or variable depending on the link type. Additionally, packets may be dropped after this time (e.g., on a wireless link after too many retransmissions). We can model the additional delay with an empirical curve that represents its distribution.
.Bd -literal -offset indent
      cumulative probability
      1.0 ^
          |
      L   +-- loss-level          x
          |                 ******
          |                *
          |           *****
          |          *
          |        **
          |       *
          +-------*------------------->
                      delay
.Ed
The empirical curve may have both vertical and horizontal lines. Vertical lines represent constant delay for a range of probabilities. Horizontal lines correspond to a discontinuity in the delay distribution: the pipe will use the largest delay for a given probability. .Pp The file format is the following, with whitespace acting as a separator and '#' indicating the beginning of a comment: .Bl -tag -width indent .It Cm name Ar identifier optional name (listed by "ipfw pipe show") to identify the delay distribution; .It Cm bw Ar value the bandwidth used for the pipe. If not specified here, it must be present explicitly as a configuration parameter for the pipe; .It Cm loss-level Ar L the probability above which packets are lost. (0.0 <= L <= 1.0, default 1.0 i.e., no loss); .It Cm samples Ar N the number of samples used in the internal representation of the curve (2..1024; default 100); .It Cm "delay prob" | "prob delay" One of these two lines is mandatory and defines the format of the following lines with data points.
.It Ar XXX Ar YYY 2 or more lines representing points in the curve, with either delay or probability first, according to the chosen format. The unit for delay is milliseconds. Data points do not need to be sorted. Also, the number of actual lines can be different from the value of the "samples" parameter: the .Nm utility will sort and interpolate the curve as needed. .El .Pp Example of a profile file:
.Bd -literal -offset indent
name    bla_bla_bla
samples 100
loss-level    0.86
prob    delay
0       200	# minimum overhead is 200ms
0.5     200
0.5     300
0.8     1000
0.9     1300
1       1300
#configuration file end
.Ed
.El .Pp The following parameters can be configured for a queue: .Pp .Bl -tag -width indent -compact .It Cm pipe Ar pipe_nr Connects a queue to the specified pipe. Multiple queues (with the same or different weights) can be connected to the same pipe, which specifies the aggregate rate for the set of queues. .Pp .It Cm weight Ar weight Specifies the weight to be used for flows matching this queue. The weight must be in the range 1..100, and defaults to 1. .El .Pp The following case-insensitive parameters can be configured for a scheduler: .Pp .Bl -tag -width indent -compact .It Cm type Ar {fifo | wf2q+ | rr | qfq} specifies the scheduling algorithm to use. .Bl -tag -width indent -compact .It Cm fifo is just a FIFO scheduler (which means that all packets are stored in the same queue as they arrive at the scheduler). FIFO has O(1) per-packet time complexity, with very low constants (estimate 60-80ns on a 2GHz desktop machine) but gives no service guarantees. .It Cm wf2q+ implements the WF2Q+ algorithm, which is a Weighted Fair Queueing algorithm which permits flows to share bandwidth according to their weights. Note that weights are not priorities; even a flow with a minuscule weight will never starve. WF2Q+ has O(log N) per-packet processing cost, where N is the number of flows, and is the default algorithm used by previous versions of dummynet's queues.
.It Cm rr implements the Deficit Round Robin algorithm, which has O(1) processing costs (roughly, 100-150ns per packet) and permits bandwidth allocation according to weights, but with poor service guarantees. .It Cm qfq implements the QFQ algorithm, which is a very fast variant of WF2Q+, with similar service guarantees and O(1) processing costs (roughly, 200-250ns per packet). .El .El .Pp In addition to the type, all parameters allowed for a pipe can also be specified for a scheduler. .Pp Finally, the following parameters can be configured for both pipes and queues: .Pp .Bl -tag -width XXXX -compact .It Cm buckets Ar hash-table-size Specifies the size of the hash table used for storing the various queues. Default value is 64 controlled by the .Xr sysctl 8 variable .Va net.inet.ip.dummynet.hash_size , allowed range is 16 to 65536. .Pp .It Cm mask Ar mask-specifier Packets sent to a given pipe or queue by an .Nm rule can be further classified into multiple flows, each of which is then sent to a different .Em dynamic pipe or queue. A flow identifier is constructed by masking the IP addresses, ports and protocol types as specified with the .Cm mask options in the configuration of the pipe or queue. For each different flow identifier, a new pipe or queue is created with the same parameters as the original object, and matching packets are sent to it. .Pp Thus, when .Em dynamic pipes are used, each flow will get the same bandwidth as defined by the pipe, whereas when .Em dynamic queues are used, each flow will share the parent's pipe bandwidth evenly with other flows generated by the same queue (note that other queues with different weights might be connected to the same pipe). 
.br Available mask specifiers are a combination of one or more of the following: .Pp .Cm dst-ip Ar mask , .Cm dst-ip6 Ar mask , .Cm src-ip Ar mask , .Cm src-ip6 Ar mask , .Cm dst-port Ar mask , .Cm src-port Ar mask , .Cm flow-id Ar mask , .Cm proto Ar mask or .Cm all , .Pp where the latter means all bits in all fields are significant. .Pp .It Cm noerror When a packet is dropped by a .Nm dummynet queue or pipe, the error is normally reported to the caller routine in the kernel, in the same way as it happens when a device queue fills up. Setting this option reports the packet as successfully delivered, which can be needed for some experimental setups where you want to simulate loss or congestion at a remote router. .Pp .It Cm plr Ar packet-loss-rate Packet loss rate. Argument .Ar packet-loss-rate is a floating-point number between 0 and 1, with 0 meaning no loss, 1 meaning 100% loss. The loss rate is internally represented on 31 bits. .Pp .It Cm queue Brq Ar slots | size Ns Cm Kbytes Queue size, in .Ar slots or .Cm KBytes . Default value is 50 slots, which is the typical queue size for Ethernet devices. Note that for slow speed links you should keep the queue size short or your traffic might be affected by a significant queueing delay. E.g., 50 max-sized ethernet packets (1500 bytes) mean 600Kbit or 20s of queue on a 30Kbit/s pipe. Even worse effects can result if you get packets from an interface with a much larger MTU, e.g.\& the loopback interface with its 16KB packets. The .Xr sysctl 8 variables .Em net.inet.ip.dummynet.pipe_byte_limit and .Em net.inet.ip.dummynet.pipe_slot_limit control the maximum lengths that can be specified. .Pp .It Cm red | gred Ar w_q Ns / Ns Ar min_th Ns / Ns Ar max_th Ns / Ns Ar max_p [ecn] Make use of the RED (Random Early Detection) queue management algorithm. 
.Ar w_q and .Ar max_p are floating point numbers between 0 and 1 (inclusive), while .Ar min_th and .Ar max_th are integer numbers specifying thresholds for queue management (thresholds are computed in bytes if the queue has been defined in bytes, in slots otherwise). The two parameters can also be of the same value if needed. The .Nm dummynet also supports the gentle RED variant (gred) and ECN (Explicit Congestion Notification) as optional. Three .Xr sysctl 8 variables can be used to control the RED behaviour: .Bl -tag -width indent .It Va net.inet.ip.dummynet.red_lookup_depth specifies the accuracy in computing the average queue when the link is idle (defaults to 256, must be greater than zero) .It Va net.inet.ip.dummynet.red_avg_pkt_size specifies the expected average packet size (defaults to 512, must be greater than zero) .It Va net.inet.ip.dummynet.red_max_pkt_size specifies the expected maximum packet size, only used when queue thresholds are in bytes (defaults to 1500, must be greater than zero). .El .El .Pp When used with IPv6 data, .Nm dummynet currently has several limitations. Information necessary to route link-local packets to an interface is not available after processing by .Nm dummynet so those packets are dropped in the output path. Care should be taken to ensure that link-local packets are not passed to .Nm dummynet . .Sh CHECKLIST Here are some important points to consider when designing your rules: .Bl -bullet .It Remember that you filter both packets going .Cm in and .Cm out . Most connections need packets going in both directions. .It Remember to test very carefully. It is a good idea to be near the console when doing this. If you cannot be near the console, use an auto-recovery script such as the one in .Pa /usr/share/examples/ipfw/change_rules.sh . .It Do not forget the loopback interface. .El .Sh FINE POINTS .Bl -bullet .It There are circumstances where fragmented datagrams are unconditionally dropped. 
TCP packets are dropped if they do not contain at least 20 bytes of TCP header, UDP packets are dropped if they do not contain a full 8 byte UDP header, and ICMP packets are dropped if they do not contain 4 bytes of ICMP header, enough to specify the ICMP type, code, and checksum. These packets are simply logged as .Dq pullup failed since there may not be enough good data in the packet to produce a meaningful log entry. .It Another type of packet is unconditionally dropped, a TCP packet with a fragment offset of one. This is a valid packet, but it only has one use, to try to circumvent firewalls. When logging is enabled, these packets are reported as being dropped by rule -1. .It If you are logged in over a network, loading the .Xr kld 4 version of .Nm is probably not as straightforward as you would think. The following command line is recommended:
.Bd -literal -offset indent
kldload ipfw && \e
ipfw add 32000 allow ip from any to any
.Ed
.Pp Along the same lines, doing an .Bd -literal -offset indent ipfw flush .Ed .Pp in similar surroundings is also a bad idea. .It The .Nm filter list may not be modified if the system security level is set to 3 or higher (see .Xr init 8 for information on system security levels). .El .Sh PACKET DIVERSION A .Xr divert 4 socket bound to the specified port will receive all packets diverted to that port. If no socket is bound to the destination port, or if the divert module is not loaded, or if the kernel was not compiled with divert socket support, the packets are dropped. .Sh NETWORK ADDRESS TRANSLATION (NAT) .Nm supports in-kernel NAT using the kernel version of .Xr libalias 3 . The kernel module .Cm ipfw_nat should be loaded or the kernel should have .Cm options IPFIREWALL_NAT to be able to use NAT.
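.Pp
As a sketch of the module-based option, the NAT module can be loaded at
run time with
.Xr kldload 8 .
The
.Pa /boot/loader.conf
line shown in the comment follows the usual module_load naming convention
and is given here as an assumption rather than as something stated in this
manual:
.Bd -literal -offset indent
# load in-kernel NAT support on a running system
kldload ipfw_nat
# to load it at boot instead, /boot/loader.conf would contain:
#   ipfw_nat_load="YES"
.Ed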
.Pp The nat configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat .Ar nat_number .Cm config .Ar nat-configuration .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm ip Ar ip_address Define an ip address to use for aliasing. .It Cm if Ar nic Use ip address of NIC for aliasing, dynamically changing it if NIC's ip address changes. .It Cm log Enable logging on this nat instance. .It Cm deny_in Deny any incoming connection from the outside world. .It Cm same_ports Try to leave the alias port numbers unchanged from the actual local port numbers. .It Cm unreg_only Traffic on the local network not originating from an unregistered address space will be ignored. .It Cm reset Reset table of the packet aliasing engine on address change. .It Cm reverse Reverse the way libalias handles aliasing. .It Cm proxy_only Obey transparent proxy rules only, packet aliasing is not performed. .It Cm skip_global Skip instance in case of global state lookup (see below). .El .Pp Some special values can be supplied instead of .Va nat_number : .Bl -tag -width indent .It Cm global Looks up translation state in all configured nat instances. If an entry is found, the packet is aliased according to that entry. If no entry was found in any of the instances, the packet is passed unchanged, and no new entry will be created. See section .Sx MULTIPLE INSTANCES in .Xr natd 8 for more information. .It Cm tablearg Uses argument supplied in lookup table. See .Sx LOOKUP TABLES section above for more information on lookup tables. .El .Pp To let the packet continue after being (de)aliased, set the sysctl variable .Va net.inet.ip.fw.one_pass to 0. For more information about aliasing modes, refer to .Xr libalias 3 . See Section .Sx EXAMPLES for some examples about nat usage. .Ss REDIRECT AND LSNAT SUPPORT IN IPFW Redirect and LSNAT support follow closely the syntax used in .Xr natd 8 .
See Section .Sx EXAMPLES for some examples on how to do redirect and lsnat. .Ss SCTP NAT SUPPORT SCTP nat can be configured in a similar manner to TCP through the .Nm command line tool. The main difference is that .Nm sctp nat does not do port translation. Since the local and global side ports will be the same, there is no need to specify both. Ports are redirected as follows: .Bd -ragged -offset indent .Bk -words .Cm nat .Ar nat_number .Cm config if .Ar nic .Cm redirect_port sctp .Ar ip_address [,addr_list] {[port | port-port] [,ports]} .Ek .Ed .Pp Most .Nm sctp nat configuration can be done in real-time through the .Xr sysctl 8 interface. All may be changed dynamically, though the hash_table size will only change for new .Nm nat instances. See .Sx SYSCTL VARIABLES for more info. .Sh IPv6/IPv4 NETWORK ADDRESS AND PROTOCOL TRANSLATION .Nm supports in-kernel IPv6/IPv4 network address and protocol translation. Stateful NAT64 translation allows IPv6-only clients to contact IPv4 servers using unicast TCP, UDP or ICMP protocols. One or more IPv4 addresses assigned to a stateful NAT64 translator are shared -among serveral IPv6-only clients. +among several IPv6-only clients. When stateful NAT64 is used in conjunction with DNS64, no changes are usually required in the IPv6 client or the IPv4 server. The kernel module .Cm ipfw_nat64 should be loaded or kernel should have .Cm options IPFIREWALL_NAT64 to be able use stateful NAT64 translator. .Pp Stateful NAT64 uses a bunch of memory for several types of objects. When IPv6 client initiates connection, NAT64 translator creates a host entry in the states table. Each host entry has a number of ports group entries allocated on demand. Ports group entries contains connection state entries. There are several options to control limits and lifetime for these objects. .Pp NAT64 translator follows RFC7915 when does ICMPv6/ICMP translation, unsupported message types will be silently dropped. 
IPv6 needs several ICMPv6 message types to be explicitly allowed for correct operation. Make sure that ND6 neighbor solicitation (ICMPv6 type 135) and neighbor advertisement (ICMPv6 type 136) messages will not be handled by translation rules. .Pp After translation the NAT64 translator sends packets through the corresponding netisr queue. Thus the translator host should be configured as an IPv4 and IPv6 router. .Pp Currently both stateful and stateless NAT64 translators use Well-Known IPv6 Prefix .Ar 64:ff9b::/96 to represent IPv4 addresses in the IPv6 address. Thus DNS64 service and routing should be configured to use Well-Known IPv6 Prefix. .Pp The stateful NAT64 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat64lsn .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm prefix4 Ar ipv4_prefix/mask The IPv4 prefix with mask defines the pool of IPv4 addresses used as source address after translation. Stateful NAT64 module translates IPv6 source address of client to one IPv4 address from this pool. Note that incoming IPv4 packets that don't have a corresponding state entry in the states table will be dropped by the translator. Make sure that translation rules handle packets destined to the configured prefix. .It Cm max_ports Ar number Maximum number of ports reserved for upper level protocols to one IPv6 client. All reserved ports are divided into chunks between supported protocols. The number of connections from one IPv6 client is limited by this option. Note that closed TCP connections still remain in the list of connections until the .Cm tcp_close_age interval expires. Default value is .Ar 2048 . .It Cm host_del_age Ar seconds The number of seconds until the host entry for an IPv6 client will be deleted and all its resources will be released due to inactivity. Default value is .Ar 3600 .
.It Cm pg_del_age Ar seconds The number of seconds until a ports group with unused state entries will be released. Default value is .Ar 900 . .It Cm tcp_syn_age Ar seconds The number of seconds that a state entry for a TCP connection with only a SYN sent will be kept. If the TCP connection is not established within this time, the state entry will be deleted. Default value is .Ar 10 . .It Cm tcp_est_age Ar seconds The number of seconds that a state entry for an established TCP connection will be kept. Default value is .Ar 7200 . .It Cm tcp_close_age Ar seconds The number of seconds that a state entry for a closed TCP connection will be kept. Keeping state entries for closed connections is needed, because IPv4 servers typically keep closed connections in a TIME_WAIT state for several minutes. Since the translator's IPv4 addresses are shared among all IPv6 clients, new connections from the same addresses and ports may be rejected by the server, because these connections are still in a TIME_WAIT state. Keeping them in the translator's state table protects against such rejections. Default value is .Ar 180 . .It Cm udp_age Ar seconds The number of seconds that the translator keeps a state entry while waiting for a reply to the sent UDP datagram. Default value is .Ar 120 . .It Cm icmp_age Ar seconds The number of seconds that the translator keeps a state entry while waiting for a reply to the sent ICMP message. Default value is .Ar 60 . .It Cm log Turn on logging of all handled packets via BPF through .Ar ipfwlog0 interface. .Ar ipfwlog0 is a pseudo interface and can be created after a boot manually with .Cm ifconfig command. Note that it has a different purpose than .Ar ipfw0 interface. The translator sends additional information to BPF with each packet. With .Cm tcpdump you are able to see each handled packet before and after translation. .It Cm -log Turn off logging of all handled packets via BPF.
.El .Pp To inspect a states table of stateful NAT64 the following command can be used: .Bd -ragged -offset indent .Bk -words .Cm nat64lsn .Ar name .Cm show Cm states .Ek .Ed .Pp Stateless NAT64 translator doesn't use a states table for translation and converts IPv4 addresses to IPv6 and vice versa solely based on the mappings taken from configured lookup tables. Since a states table is not used by the stateless translator, it can be configured to pass IPv4 clients to IPv6-only servers. .Pp The stateless NAT64 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nat64stl .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm table4 Ar table46 The lookup table .Ar table46 contains the mapping of how IPv4 addresses should be translated to IPv6 addresses. .It Cm table6 Ar table64 The lookup table .Ar table64 contains the mapping of how IPv6 addresses should be translated to IPv4 addresses. .It Cm log Turn on logging of all handled packets via BPF through .Ar ipfwlog0 interface. .It Cm -log Turn off logging of all handled packets via BPF. .El .Pp Note that the behavior of the stateless translator with respect to unmatched packets differs from that of the stateful translator. If the corresponding addresses were not found in the lookup tables, the packet will not be dropped and the search continues. .Sh IPv6-to-IPv6 NETWORK PREFIX TRANSLATION (NPTv6) .Nm supports in-kernel IPv6-to-IPv6 network prefix translation as described in RFC6296. The kernel module .Cm ipfw_nptv6 should be loaded or kernel should have .Cm options IPFIREWALL_NPTV6 to be able to use NPTv6 translator. .Pp The NPTv6 configuration command is the following: .Bd -ragged -offset indent .Bk -words .Cm nptv6 .Ar name .Cm create .Ar create-options .Ek .Ed .Pp The following parameters can be configured: .Bl -tag -width indent .It Cm int_prefix Ar ipv6_prefix IPv6 prefix used in internal network.
NPTv6 module translates source address when it matches this prefix. .It Cm ext_prefix Ar ipv6_prefix IPv6 prefix used in external network. NPTv6 module translates destination address when it matches this prefix. .It Cm prefixlen Ar length The length of specified IPv6 prefixes. It must be in range from 8 to 64. .El .Pp Note that the prefix translation rules are silently ignored when IPv6 packet forwarding is disabled. To enable the packet forwarding, set the sysctl variable .Va net.inet6.ip6.forwarding to 1. .Pp To let the packet continue after being translated, set the sysctl variable .Va net.inet.ip.fw.one_pass to 0. .Sh LOADER TUNABLES Tunables can be set in .Xr loader 8 prompt, .Xr loader.conf 5 or .Xr kenv 1 before ipfw module gets loaded. .Bl -tag -width indent .It Va net.inet.ip.fw.default_to_accept: No 0 Defines ipfw last rule behavior. This value overrides .Cd "options IPFW_DEFAULT_TO_(ACCEPT|DENY)" from kernel configuration file. .It Va net.inet.ip.fw.tables_max: No 128 Defines number of tables available in ipfw. Number cannot exceed 65534. .El .Sh SYSCTL VARIABLES A set of .Xr sysctl 8 variables controls the behaviour of the firewall and associated modules .Pq Nm dummynet , bridge , sctp nat . These are shown below together with their default value (but always check with the .Xr sysctl 8 command what value is actually in use) and meaning: .Bl -tag -width indent .It Va net.inet.ip.alias.sctp.accept_global_ootb_addip: No 0 Defines how the .Nm nat responds to receipt of global OOTB ASCONF-AddIP: .Bl -tag -width indent .It Cm 0 No response (unless a partially matching association exists - ports and vtags match but global address does not) .It Cm 1 .Nm nat will accept and process all OOTB global AddIP messages. .El .Pp Option 1 should never be selected as this forms a security risk. An attacker can establish multiple fake associations by sending AddIP messages. 
.It Va net.inet.ip.alias.sctp.chunk_proc_limit: No 5 Defines the maximum number of chunks in an SCTP packet that will be parsed for a packet that matches an existing association. This value is enforced to be greater than or equal to .Cm net.inet.ip.alias.sctp.initialising_chunk_proc_limit . A high value is a DoS risk yet setting too low a value may result in important control chunks in the packet not being located and parsed. .It Va net.inet.ip.alias.sctp.error_on_ootb: No 1 Defines when the .Nm nat responds to any Out-of-the-Blue (OOTB) packets with ErrorM packets. An OOTB packet is a packet that arrives with no existing association registered in the .Nm nat and is not an INIT or ASCONF-AddIP packet: .Bl -tag -width indent .It Cm 0 ErrorM is never sent in response to OOTB packets. .It Cm 1 ErrorM is only sent to OOTB packets received on the local side. .It Cm 2 ErrorM is sent to the local side and on the global side ONLY if there is a partial match (ports and vtags match but the source global IP does not). This value is only useful if the .Nm nat is tracking global IP addresses. .It Cm 3 ErrorM is sent in response to all OOTB packets on both the local and global side (DoS risk). .El .Pp At the moment the default is 0, since the ErrorM packet is not yet supported by most SCTP stacks. When it is supported, and if not tracking global addresses, we recommend setting this value to 1 to allow multi-homed local hosts to function with the .Nm nat . To track global addresses, we recommend setting this value to 2 to allow global hosts to be informed when they need to (re)send an ASCONF-AddIP. Value 3 should never be chosen (except for debugging) as the .Nm nat will respond to all OOTB global packets (a DoS risk). .It Va net.inet.ip.alias.sctp.hashtable_size: No 2003 Size of hash tables used for .Nm nat lookups (100 < prime_number < 1000001).
This value sets the .Nm hash table size for any future created .Nm nat instance and therefore must be set prior to creating a .Nm nat instance. The table sizes may be changed to suit specific needs. If there will be few concurrent associations, and memory is scarce, you may make these smaller. If there will be many thousands (or millions) of concurrent associations, you should make these larger. A prime number is best for the table size. The sysctl update function will adjust your input value to the next highest prime number. .It Va net.inet.ip.alias.sctp.holddown_time: No 0 Hold association in table for this many seconds after receiving a SHUTDOWN-COMPLETE. This allows endpoints to correct shutdown gracefully if a shutdown_complete is lost and retransmissions are required. .It Va net.inet.ip.alias.sctp.init_timer: No 15 Timeout value while waiting for (INIT-ACK|AddIP-ACK). This value cannot be 0. .It Va net.inet.ip.alias.sctp.initialising_chunk_proc_limit: No 2 Defines the maximum number of chunks in an SCTP packet that will be parsed when no existing association exists that matches that packet. Ideally this packet will only be an INIT or ASCONF-AddIP packet. A higher value may become a DoS risk as malformed packets can consume processing resources. .It Va net.inet.ip.alias.sctp.param_proc_limit: No 25 Defines the maximum number of parameters within a chunk that will be parsed in a packet. As for other similar sysctl variables, larger values pose a DoS risk. .It Va net.inet.ip.alias.sctp.log_level: No 0 Level of detail in the system log messages (0 \- minimal, 1 \- event, 2 \- info, 3 \- detail, 4 \- debug, 5 \- max debug). May be a good option in high loss environments. .It Va net.inet.ip.alias.sctp.shutdown_time: No 15 Timeout value while waiting for SHUTDOWN-COMPLETE. This value cannot be 0. 
.It Va net.inet.ip.alias.sctp.track_global_addresses: No 0 Enables/disables global IP address tracking within the .Nm nat and places an upper limit on the number of addresses tracked for each association: .Bl -tag -width indent .It Cm 0 Global tracking is disabled .It Cm >1 Enables tracking, the maximum number of addresses tracked for each association is limited to this value .El .Pp This variable is fully dynamic, the new value will be adopted for all newly arriving associations, existing associations are treated as they were previously. Global tracking will decrease the number of collisions within the .Nm nat at a cost of increased processing load, memory usage, complexity, and possible .Nm nat state problems in complex networks with multiple .Nm nats . We recommend not tracking global IP addresses, this will still result in a fully functional .Nm nat . .It Va net.inet.ip.alias.sctp.up_timer: No 300 Timeout value to keep an association up with no traffic. This value cannot be 0. .It Va net.inet.ip.dummynet.expire : No 1 Lazily delete dynamic pipes/queue once they have no pending traffic. You can disable this by setting the variable to 0, in which case the pipes/queues will only be deleted when the threshold is reached. .It Va net.inet.ip.dummynet.hash_size : No 64 Default size of the hash table used for dynamic pipes/queues. This value is used when no .Cm buckets option is specified when configuring a pipe/queue. .It Va net.inet.ip.dummynet.io_fast : No 0 If set to a non-zero value, the .Dq fast mode of .Nm dummynet operation (see above) is enabled. .It Va net.inet.ip.dummynet.io_pkt Number of packets passed to .Nm dummynet . .It Va net.inet.ip.dummynet.io_pkt_drop Number of packets dropped by .Nm dummynet . .It Va net.inet.ip.dummynet.io_pkt_fast Number of packets bypassed by the .Nm dummynet scheduler. .It Va net.inet.ip.dummynet.max_chain_len : No 16 Target value for the maximum number of pipes/queues in a hash bucket. 
The product .Cm max_chain_len*hash_size is used to determine the threshold over which empty pipes/queues will be expired even when .Cm net.inet.ip.dummynet.expire=0 . .It Va net.inet.ip.dummynet.red_lookup_depth : No 256 .It Va net.inet.ip.dummynet.red_avg_pkt_size : No 512 .It Va net.inet.ip.dummynet.red_max_pkt_size : No 1500 Parameters used in the computations of the drop probability for the RED algorithm. .It Va net.inet.ip.dummynet.pipe_byte_limit : No 1048576 .It Va net.inet.ip.dummynet.pipe_slot_limit : No 100 The maximum queue size that can be specified in bytes or packets. These limits prevent accidental exhaustion of resources such as mbufs. If you raise these limits, you should make sure the system is configured so that sufficient resources are available. .It Va net.inet.ip.fw.autoinc_step : No 100 Delta between rule numbers when auto-generating them. The value must be in the range 1..1000. .It Va net.inet.ip.fw.curr_dyn_buckets : Va net.inet.ip.fw.dyn_buckets The current number of buckets in the hash table for dynamic rules (readonly). .It Va net.inet.ip.fw.debug : No 1 Controls debugging messages produced by .Nm . .It Va net.inet.ip.fw.default_rule : No 65535 The default rule number (read-only). By the design of .Nm , the default rule is the last one, so its number can also serve as the highest number allowed for a rule. .It Va net.inet.ip.fw.dyn_buckets : No 256 The number of buckets in the hash table for dynamic rules. Must be a power of 2, up to 65536. It only takes effect when all dynamic rules have expired, so you are advised to use a .Cm flush command to make sure that the hash table is resized. .It Va net.inet.ip.fw.dyn_count : No 3 Current number of dynamic rules (read-only). .It Va net.inet.ip.fw.dyn_keepalive : No 1 Enables generation of keepalive packets for .Cm keep-state rules on TCP sessions. A keepalive is generated to both sides of the connection every 5 seconds for the last 20 seconds of the lifetime of the rule. 
.It Va net.inet.ip.fw.dyn_max : No 8192 Maximum number of dynamic rules. When you hit this limit, no more dynamic rules can be installed until old ones expire. .It Va net.inet.ip.fw.dyn_ack_lifetime : No 300 .It Va net.inet.ip.fw.dyn_syn_lifetime : No 20 .It Va net.inet.ip.fw.dyn_fin_lifetime : No 1 .It Va net.inet.ip.fw.dyn_rst_lifetime : No 1 .It Va net.inet.ip.fw.dyn_udp_lifetime : No 5 .It Va net.inet.ip.fw.dyn_short_lifetime : No 30 These variables control the lifetime, in seconds, of dynamic rules. Upon the initial SYN exchange the lifetime is kept short, then increased after both SYN have been seen, then decreased again during the final FIN exchange or when a RST is received. Both .Em dyn_fin_lifetime and .Em dyn_rst_lifetime must be strictly lower than 5 seconds, the period of repetition of keepalives. The firewall enforces that. .It Va net.inet.ip.fw.dyn_keep_states: No 0 Keep dynamic states on rule/set deletion. States are relinked to default rule (65535). This can be handy for ruleset reload. Turned off by default. .It Va net.inet.ip.fw.enable : No 1 Enables the firewall. Setting this variable to 0 lets you run your machine without a firewall even if compiled in. .It Va net.inet6.ip6.fw.enable : No 1 Provides the same functionality as above for the IPv6 case. .It Va net.inet.ip.fw.one_pass : No 1 When set, the packet exiting from the .Nm dummynet pipe or from .Xr ng_ipfw 4 node is not passed through the firewall again. Otherwise, after an action, the packet is reinjected into the firewall at the next rule. .It Va net.inet.ip.fw.tables_max : No 128 Maximum number of tables. .It Va net.inet.ip.fw.verbose : No 1 Enables verbose messages. .It Va net.inet.ip.fw.verbose_limit : No 0 Limits the number of messages produced by a verbose firewall. .It Va net.inet6.ip6.fw.deny_unknown_exthdrs : No 1 If enabled, packets with unknown IPv6 Extension Headers will be denied. .It Va net.link.ether.ipfw : No 0 Controls whether layer-2 packets are passed to .Nm .
Default is no. .It Va net.link.bridge.ipfw : No 0 Controls whether bridged packets are passed to .Nm . Default is no. .El .Sh INTERNAL DIAGNOSTICS There are some commands that may be useful to understand current state of certain subsystems inside kernel module. These commands provide debugging output which may change without notice. .Pp Currently the following commands are available as .Cm internal sub-options: .Bl -tag -width indent .It Cm iflist Lists all interfaces which are currently tracked by .Nm with their in-kernel status. .It Cm talist List all table lookup algorithms currently available. .El .Sh EXAMPLES There are far too many possible uses of .Nm so this Section will only give a small set of examples. .Pp .Ss BASIC PACKET FILTERING This command adds an entry which denies all tcp packets from .Em cracker.evil.org to the telnet port of .Em wolf.tambov.su from being forwarded by the host: .Pp .Dl "ipfw add deny tcp from cracker.evil.org to wolf.tambov.su telnet" .Pp This one disallows any connection from the entire cracker's network to my host: .Pp .Dl "ipfw add deny ip from 123.45.67.0/24 to my.host.org" .Pp A first and efficient way to limit access (not using dynamic rules) is the use of the following rules: .Pp .Dl "ipfw add allow tcp from any to any established" .Dl "ipfw add allow tcp from net1 portlist1 to net2 portlist2 setup" .Dl "ipfw add allow tcp from net3 portlist3 to net3 portlist3 setup" .Dl "..." .Dl "ipfw add deny tcp from any to any" .Pp The first rule will be a quick match for normal TCP packets, but it will not match the initial SYN packet, which will be matched by the .Cm setup rules only for selected source/destination pairs. All other SYN packets will be rejected by the final .Cm deny rule.
.Pp If you administer one or more subnets, you can take advantage of the address sets and or-blocks and write extremely compact rulesets which selectively enable services to blocks of clients, as below: .Pp .Dl "goodguys=\*q{ 10.1.2.0/24{20,35,66,18} or 10.2.3.0/28{6,3,11} }\*q" .Dl "badguys=\*q10.1.2.0/24{8,38,60}\*q" .Dl "" .Dl "ipfw add allow ip from ${goodguys} to any" .Dl "ipfw add deny ip from ${badguys} to any" .Dl "... normal policies ..." .Pp The .Cm verrevpath option could be used to do automated anti-spoofing by adding the following to the top of a ruleset: .Pp .Dl "ipfw add deny ip from any to any not verrevpath in" .Pp This rule drops all incoming packets that appear to be coming to the system on the wrong interface. For example, a packet with a source address belonging to a host on a protected internal network would be dropped if it tried to enter the system from an external interface. .Pp The .Cm antispoof option could be used to do similar but more restricted anti-spoofing by adding the following to the top of a ruleset: .Pp .Dl "ipfw add deny ip from any to any not antispoof in" .Pp This rule drops all incoming packets that appear to be coming from another directly connected system but on the wrong interface. For example, a packet with a source address of .Li 192.168.0.0/24 , configured on .Li fxp0 , but coming in on .Li fxp1 would be dropped. 
.Pp The .Cm setdscp option could be used to (re)mark user traffic, by adding the following to the appropriate place in ruleset: .Pp .Dl "ipfw add setdscp be ip from any to any dscp af11,af21" .Ss DYNAMIC RULES In order to protect a site from flood attacks involving fake TCP packets, it is safer to use dynamic rules: .Pp .Dl "ipfw add check-state" .Dl "ipfw add deny tcp from any to any established" .Dl "ipfw add allow tcp from my-net to any setup keep-state" .Pp This will let the firewall install dynamic rules only for those connections which start with a regular SYN packet coming from the inside of our network. Dynamic rules are checked when encountering the first occurrence of a .Cm check-state , .Cm keep-state or .Cm limit rule. A .Cm check-state rule should usually be placed near the beginning of the ruleset to minimize the amount of work scanning the ruleset. Your mileage may vary. .Pp To limit the number of connections a user can open you can use the following type of rules: .Pp .Dl "ipfw add allow tcp from my-net/24 to any setup limit src-addr 10" .Dl "ipfw add allow tcp from any to me setup limit src-addr 4" .Pp The former (assuming it runs on a gateway) will allow each host on a /24 network to open at most 10 TCP connections. The latter can be placed on a server to make sure that a single client does not use more than 4 simultaneous connections. .Pp .Em BEWARE : stateful rules can be subject to denial-of-service attacks by a SYN-flood which opens a huge number of dynamic rules. The effects of such attacks can be partially limited by acting on a set of .Xr sysctl 8 variables which control the operation of the firewall.
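.Pp
As an illustration of watching those variables, the following shell sketch
reports how full the dynamic rule table is.
The helper name
.Li dyn_usage_pct
is hypothetical and not part of
.Nm ;
the sketch only assumes that
.Va net.inet.ip.fw.dyn_count
and
.Va net.inet.ip.fw.dyn_max
can be read with
.Xr sysctl 8 .

```shell
# Hypothetical helper: print dynamic-rule table usage as a whole percentage.
# $1 = current number of dynamic rules, $2 = configured maximum.
dyn_usage_pct() {
    if [ "$2" -le 0 ]; then
        echo "maximum must be positive" >&2
        return 1
    fi
    echo $(( $1 * 100 / $2 ))
}

# Stand-alone example with made-up numbers (half of an 8192-entry table):
dyn_usage_pct 4096 8192

# On a live system the inputs would come from sysctl(8), for example:
#   dyn_usage_pct "$(sysctl -n net.inet.ip.fw.dyn_count)" \
#                 "$(sysctl -n net.inet.ip.fw.dyn_max)"
```

.Pp
A value that stays close to 100 suggests that
.Va net.inet.ip.fw.dyn_max
is about to be reached, after which no more dynamic rules can be installed
until old ones expire.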
.Pp Here is a good usage of the .Cm list command to see accounting records and timestamp information: .Pp .Dl ipfw -at list .Pp or in short form without timestamps: .Pp .Dl ipfw -a list .Pp which is equivalent to: .Pp .Dl ipfw show .Pp The next rule diverts all incoming packets from 192.168.2.0/24 to divert port 5000: .Pp .Dl ipfw add divert 5000 ip from 192.168.2.0/24 to any in .Ss TRAFFIC SHAPING The following rules show some of the applications of .Nm and .Nm dummynet for simulations and the like. .Pp This rule drops random incoming packets with a probability of 5%: .Pp .Dl "ipfw add prob 0.05 deny ip from any to any in" .Pp A similar effect can be achieved making use of .Nm dummynet pipes: .Pp .Dl "ipfw add pipe 10 ip from any to any" .Dl "ipfw pipe 10 config plr 0.05" .Pp We can use pipes to artificially limit bandwidth, e.g.\& on a machine acting as a router, if we want to limit traffic from local clients on 192.168.2.0/24 we do: .Pp .Dl "ipfw add pipe 1 ip from 192.168.2.0/24 to any out" .Dl "ipfw pipe 1 config bw 300Kbit/s queue 50KBytes" .Pp Note that we use the .Cm out modifier so that the rule is not used twice. Remember in fact that .Nm rules are checked both on incoming and outgoing packets. .Pp Should we want to simulate a bidirectional link with bandwidth limitations, the correct way is the following: .Pp .Dl "ipfw add pipe 1 ip from any to any out" .Dl "ipfw add pipe 2 ip from any to any in" .Dl "ipfw pipe 1 config bw 64Kbit/s queue 10Kbytes" .Dl "ipfw pipe 2 config bw 64Kbit/s queue 10Kbytes" .Pp The above can be very useful, e.g.\& if you want to see how your fancy Web page will look for a residential user who is connected only through a slow link. You should not use only one pipe for both directions, unless you want to simulate a half-duplex medium (e.g.\& AppleTalk, Ethernet, IRDA). It is not necessary that both pipes have the same configuration, so we can also simulate asymmetric links.
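.Pp
As a rough worked example of what such settings imply, a full queue adds a
worst-case queueing delay of about the queue size divided by the bandwidth.
The shell sketch below does that arithmetic.
The function name
.Li queue_delay_ms
is hypothetical, and the sketch assumes 1 Kbyte = 1024 bytes and
1 Kbit/s = 1000 bit/s, which this page does not state.

```shell
# Hypothetical helper: worst-case delay, in milliseconds, added by a full queue.
# $1 = queue size in bytes, $2 = bandwidth in bit/s.
queue_delay_ms() {
    qbytes=$1
    bw_bps=$2
    # bytes -> bits (x8), seconds -> milliseconds (x1000), integer division
    echo $(( qbytes * 8 * 1000 / bw_bps ))
}

# "bw 64Kbit/s queue 10Kbytes", under the unit assumptions stated above:
queue_delay_ms $(( 10 * 1024 )) 64000    # prints 1280
```

.Pp
Under those assumptions, each direction of the 64Kbit/s example above can add
a little over one second of delay when its queue is full.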
.Pp Should we want to verify network performance with the RED queue management algorithm: .Pp .Dl "ipfw add pipe 1 ip from any to any" .Dl "ipfw pipe 1 config bw 500Kbit/s queue 100 red 0.002/30/80/0.1" .Pp Another typical application of the traffic shaper is to introduce some delay in the communication. This can significantly affect applications which do a lot of Remote Procedure Calls, and where the round-trip-time of the connection often becomes a limiting factor much more than bandwidth: .Pp .Dl "ipfw add pipe 1 ip from any to any out" .Dl "ipfw add pipe 2 ip from any to any in" .Dl "ipfw pipe 1 config delay 250ms bw 1Mbit/s" .Dl "ipfw pipe 2 config delay 250ms bw 1Mbit/s" .Pp Per-flow queueing can be useful for a variety of purposes. A very simple one is counting traffic: .Pp .Dl "ipfw add pipe 1 tcp from any to any" .Dl "ipfw add pipe 1 udp from any to any" .Dl "ipfw add pipe 1 ip from any to any" .Dl "ipfw pipe 1 config mask all" .Pp The above set of rules will create queues (and collect statistics) for all traffic. Because the pipes have no limitations, the only effect is collecting statistics. Note that we need 3 rules, not just the last one, because when .Nm tries to match IP packets it will not consider ports, so we would not see connections on separate ports as different ones. .Pp A more sophisticated example is limiting the outbound traffic on a net with per-host limits, rather than per-network limits: .Pp .Dl "ipfw add pipe 1 ip from 192.168.2.0/24 to any out" .Dl "ipfw add pipe 2 ip from any to 192.168.2.0/24 in" .Dl "ipfw pipe 1 config mask src-ip 0x000000ff bw 200Kbit/s queue 20Kbytes" .Dl "ipfw pipe 2 config mask dst-ip 0x000000ff bw 200Kbit/s queue 20Kbytes" .Ss LOOKUP TABLES In the following example, we need to create several traffic bandwidth classes and we need different hosts/networks to fall into different classes. We create one pipe for each class and configure them accordingly. 
Then we create a single table and fill it with IP subnets and addresses. For each subnet/host we set the argument equal to the number of the pipe that it should use. Then we classify traffic using a single rule: .Pp .Dl "ipfw pipe 1 config bw 1000Kbyte/s" .Dl "ipfw pipe 4 config bw 4000Kbyte/s" .Dl "..." .Dl "ipfw table T1 create type addr" .Dl "ipfw table T1 add 192.168.2.0/24 1" .Dl "ipfw table T1 add 192.168.0.0/27 4" .Dl "ipfw table T1 add 192.168.0.2 1" .Dl "..." .Dl "ipfw add pipe tablearg ip from 'table(T1)' to any" .Pp Using the .Cm fwd action, the table entries may include hostnames and IP addresses. .Pp .Dl "ipfw table T2 create type addr ftype ip" .Dl "ipfw table T2 add 192.168.2.0/24 10.23.2.1" .Dl "ipfw table T2 add 192.168.0.0/27 router1.dmz" .Dl "..." .Dl "ipfw add 100 fwd tablearg ip from any to 'table(T2)'" .Pp In the following example a per-interface firewall is created: .Pp .Dl "ipfw table IN create type iface valtype skipto,fib" .Dl "ipfw table IN add vlan20 12000,12" .Dl "ipfw table IN add vlan30 13000,13" .Dl "ipfw table OUT create type iface valtype skipto" .Dl "ipfw table OUT add vlan20 22000" .Dl "ipfw table OUT add vlan30 23000" .Dl ".." .Dl "ipfw add 100 setfib tablearg ip from any to any recv 'table(IN)' in" .Dl "ipfw add 200 skipto tablearg ip from any to any recv 'table(IN)' in" .Dl "ipfw add 300 skipto tablearg ip from any to any xmit 'table(OUT)' out" .Pp The following example illustrates usage of flow tables: .Pp .Dl "ipfw table fl create type flow:flow:src-ip,proto,dst-ip,dst-port" .Dl "ipfw table fl add 2a02:6b8:77::88,tcp,2a02:6b8:77::99,80 11" .Dl "ipfw table fl add 10.0.0.1,udp,10.0.0.2,53 12" .Dl ".." .Dl "ipfw add 100 allow ip from any to any flow 'table(fl,11)' recv ix0" .Ss SETS OF RULES To add a set of rules atomically, e.g.\& set 18: .Pp .Dl "ipfw set disable 18" .Dl "ipfw add NN set 18 ... 
# repeat as needed" .Dl "ipfw set enable 18" .Pp To delete a set of rules atomically the command is simply: .Pp .Dl "ipfw delete set 18" .Pp To test a ruleset and disable it and regain control if something goes wrong: .Pp .Dl "ipfw set disable 18" .Dl "ipfw add NN set 18 ... # repeat as needed" .Dl "ipfw set enable 18; echo done; sleep 30 && ipfw set disable 18" .Pp Here if everything goes well, you press control-C before the "sleep" terminates, and your ruleset will be left active. Otherwise, e.g.\& if you cannot access your box, the ruleset will be disabled after the sleep terminates thus restoring the previous situation. .Pp To show rules of the specific set: .Pp .Dl "ipfw set 18 show" .Pp To show rules of the disabled set: .Pp .Dl "ipfw -S set 18 show" .Pp To clear the counters of a specific rule in the specific set: .Pp .Dl "ipfw set 18 zero NN" .Pp To delete a specific rule of the specific set: .Pp .Dl "ipfw set 18 delete NN" .Ss NAT, REDIRECT AND LSNAT First redirect all the traffic to nat instance 123: .Pp .Dl "ipfw add nat 123 all from any to any" .Pp Then to configure nat instance 123 to alias all the outgoing traffic with ip 192.168.0.123, blocking all incoming connections, trying to keep same ports on both sides, clearing aliasing table on address change and keeping a log of traffic/link statistics: .Pp .Dl "ipfw nat 123 config ip 192.168.0.123 log deny_in reset same_ports" .Pp Or to change address of instance 123, aliasing table will be cleared (see reset option): .Pp .Dl "ipfw nat 123 config ip 10.0.0.1" .Pp To see configuration of nat instance 123: .Pp .Dl "ipfw nat 123 show config" .Pp To show logs of all the instances in range 111-999: .Pp .Dl "ipfw nat 111-999 show" .Pp To see configurations of all instances: .Pp .Dl "ipfw nat show config" .Pp Or a redirect rule with mixed modes could look like: .Pp .Dl "ipfw nat 123 config redirect_addr 10.0.0.1 10.0.0.66" .Dl " redirect_port tcp 192.168.0.1:80 500" .Dl " redirect_proto udp 192.168.1.43 192.168.1.1" 
.Dl " redirect_addr 192.168.0.10,192.168.0.11" .Dl " 10.0.0.100 # LSNAT" .Dl " redirect_port tcp 192.168.0.1:80,192.168.0.10:22" .Dl " 500 # LSNAT" .Pp or it could be split in: .Pp .Dl "ipfw nat 1 config redirect_addr 10.0.0.1 10.0.0.66" .Dl "ipfw nat 2 config redirect_port tcp 192.168.0.1:80 500" .Dl "ipfw nat 3 config redirect_proto udp 192.168.1.43 192.168.1.1" .Dl "ipfw nat 4 config redirect_addr 192.168.0.10,192.168.0.11,192.168.0.12" .Dl " 10.0.0.100" .Dl "ipfw nat 5 config redirect_port tcp" .Dl " 192.168.0.1:80,192.168.0.10:22,192.168.0.20:25 500" .Sh SEE ALSO .Xr cpp 1 , .Xr m4 1 , .Xr altq 4 , .Xr divert 4 , .Xr dummynet 4 , .Xr if_bridge 4 , .Xr ip 4 , .Xr ipfirewall 4 , .Xr ng_ipfw 4 , .Xr protocols 5 , .Xr services 5 , .Xr init 8 , .Xr kldload 8 , .Xr reboot 8 , .Xr sysctl 8 , .Xr syslogd 8 .Sh HISTORY The .Nm utility first appeared in .Fx 2.0 . .Nm dummynet was introduced in .Fx 2.2.8 . Stateful extensions were introduced in .Fx 4.0 . .Nm ipfw2 was introduced in Summer 2002. .Sh AUTHORS .An Ugen J. S. Antsilevich , .An Poul-Henning Kamp , .An Alex Nash , .An Archie Cobbs , .An Luigi Rizzo . .Pp .An -nosplit API based upon code written by .An Daniel Boulet for BSDI. .Pp Dummynet has been introduced by Luigi Rizzo in 1997-1998. .Pp Some early work (1999-2000) on the .Nm dummynet traffic shaper was supported by Akamba Corp. .Pp The ipfw core (ipfw2) has been completely redesigned and reimplemented by Luigi Rizzo in summer 2002. Further actions and options have been added by various developers over the years. .Pp .An -nosplit In-kernel NAT support written by .An Paolo Pisati Aq Mt piso@FreeBSD.org as part of a Summer of Code 2005 project. .Pp SCTP .Nm nat support has been developed by .An The Centre for Advanced Internet Architectures (CAIA) Aq http://www.caia.swin.edu.au . The primary developers and maintainers are David Hayes and Jason But.
For further information visit: .Aq http://www.caia.swin.edu.au/urp/SONATA .Pp Delay profiles have been developed by Alessandro Cerri and Luigi Rizzo, supported by the European Commission within Projects Onelab and Onelab2. .Sh BUGS The syntax has grown over the years and sometimes it might be confusing. Unfortunately, backward compatibility prevents cleaning up mistakes made in the definition of the syntax. .Pp .Em !!! WARNING !!! .Pp Misconfiguring the firewall can put your computer in an unusable state, possibly shutting down network services and requiring console access to regain control of it. .Pp Incoming packet fragments diverted by .Cm divert are reassembled before delivery to the socket. The action used on those packets is the one from the rule which matches the first fragment of the packet. .Pp Packets diverted to userland, and then reinserted by a userland process may lose various packet attributes. The packet source interface name will be preserved if it is shorter than 8 bytes and the userland process saves and reuses the sockaddr_in (as does .Xr natd 8 ) ; otherwise, it may be lost. If a packet is reinserted in this manner, later rules may be incorrectly applied, making the order of .Cm divert rules in the rule sequence very important. .Pp Dummynet drops all packets with IPv6 link-local addresses. .Pp Rules using .Cm uid or .Cm gid may not behave as expected. In particular, incoming SYN packets may have no uid or gid associated with them since they do not yet belong to a TCP connection, and the uid/gid associated with a packet may not be as expected if the associated process calls .Xr setuid 2 or similar system calls. .Pp Rule syntax is subject to the command line environment and some patterns may need to be escaped with the backslash character or quoted appropriately. .Pp Due to the architecture of .Xr libalias 3 , ipfw nat is not compatible with the TCP segmentation offloading (TSO).
Thus, to reliably nat your network traffic, please disable TSO on your NICs using .Xr ifconfig 8 . .Pp ICMP error messages are not implicitly matched by dynamic rules for the respective conversations. To avoid failures of network error detection and path MTU discovery, ICMP error messages may need to be allowed explicitly through static rules. .Pp Rules using .Cm call and .Cm return actions may lead to confusing behaviour if ruleset has mistakes, and/or interaction with other subsystems (netgraph, dummynet, etc.) is used. One possible case for this is packet leaving .Nm in subroutine on the input pass, while later on output encountering unpaired .Cm return first. As the call stack is kept intact after input pass, packet will suddenly return to the rule number used on input pass, not on output one. Order of processing should be checked carefully to avoid such mistakes. Index: head/usr.sbin/ntp/doc/ntp.conf.5 =================================================================== --- head/usr.sbin/ntp/doc/ntp.conf.5 (revision 327258) +++ head/usr.sbin/ntp/doc/ntp.conf.5 (revision 327259) @@ -1,3011 +1,3011 @@ .Dd March 21 2017 .Dt NTP_CONF 5 File Formats .Os .\" EDIT THIS FILE WITH CAUTION (ntp.mdoc) .\" .\" $FreeBSD$ .\" .\" It has been AutoGen-ed March 21, 2017 at 10:31:09 AM by AutoGen 5.18.5 .\" From the definitions ntp.conf.def .\" and the template file agmdoc-cmd.tpl .Sh NAME .Nm ntp.conf .Nd Network Time Protocol (NTP) daemon configuration file format .Sh SYNOPSIS .Nm .Op Fl \-option\-name .Op Fl \-option\-name Ar value .Pp All arguments must be options. .Pp .Sh DESCRIPTION The .Nm configuration file is read at initial startup by the .Xr ntpd 8 daemon in order to specify the synchronization sources, modes and other related information. Usually, it is installed in the .Pa /etc directory, but could be installed elsewhere (see the daemon's .Fl c command line option). .Pp The file format is similar to other .Ux configuration files. 
Comments begin with a .Ql # character and extend to the end of the line; blank lines are ignored. Configuration commands consist of an initial keyword followed by a list of arguments, some of which may be optional, separated by whitespace. Commands may not be continued over multiple lines. Arguments may be host names, host addresses written in numeric, dotted\-quad form, integers, floating point numbers (when specifying times in seconds) and text strings. .Pp The rest of this page describes the configuration and control options. The .Qq Notes on Configuring NTP and Setting up an NTP Subnet page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) contains an extended discussion of these options. In addition to the discussion of general .Sx Configuration Options , there are sections describing the following supported functionality and the options used to control it: .Bl -bullet -offset indent .It .Sx Authentication Support .It .Sx Monitoring Support .It .Sx Access Control Support .It .Sx Automatic NTP Configuration Options .It .Sx Reference Clock Support .It .Sx Miscellaneous Options .El .Pp Following these is a section describing .Sx Miscellaneous Options . While there is a rich set of options available, the only required option is one or more .Ic pool , .Ic server , .Ic peer , .Ic broadcast or .Ic manycastclient commands. .Sh Configuration Support Following is a description of the configuration commands in NTPv4. These commands have the same basic functions as in NTPv3 and in some cases new functions and new arguments. There are two classes of commands, configuration commands that configure a persistent association with a remote server or peer or reference clock, and auxiliary commands that specify environmental variables that control various related operations. .Ss Configuration Commands The various modes are determined by the command keyword and the type of the required IP address. 
Addresses are classed by type as (s) a remote server or peer (IPv4 class A, B and C), (b) the broadcast address of a local interface, (m) a multicast address (IPv4 class D), or (r) a reference clock address (127.127.x.x). Note that only those options applicable to each command are listed below. Use of options not listed may not be caught as an error, but may result in some weird and even destructive behavior. .Pp If the Basic Socket Interface Extensions for IPv6 (RFC\-2553) is detected, support for the IPv6 address family is generated in addition to the default support of the IPv4 address family. In a few cases, including the .Cm reslist billboard generated by .Xr ntpq 8 or .Xr ntpdc 8 , IPv6 addresses are automatically generated. IPv6 addresses can be identified by the presence of colons .Dq \&: in the address field. IPv6 addresses can be used almost everywhere where IPv4 addresses can be used, with the exception of reference clock addresses, which are always IPv4. .Pp Note that in contexts where a host name is expected, a .Fl 4 qualifier preceding the host name forces DNS resolution to the IPv4 namespace, while a .Fl 6 qualifier forces DNS resolution to the IPv6 namespace. See IPv6 references for the equivalent classes for that address family. 
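As a minimal sketch of the address-family qualifiers described above, the following ntp.conf fragment forces IPv4 resolution for one source and IPv6 resolution for another. The host names are hypothetical placeholders, not real servers; the server command itself is documented with the other configuration commands in this page.

```
# Hypothetical host names, for illustration only.
server -4 ntp1.example.org    # resolve this name in the IPv4 namespace
server -6 ntp2.example.org    # resolve this name in the IPv6 namespace
```

Without a qualifier, the address family used for a host name depends on what DNS resolution returns.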
.Bl -tag -width indent .It Xo Ic pool Ar address .Op Cm burst .Op Cm iburst .Op Cm version Ar version .Op Cm prefer .Op Cm minpoll Ar minpoll .Op Cm maxpoll Ar maxpoll .Xc .It Xo Ic server Ar address .Op Cm key Ar key \&| Cm autokey .Op Cm burst .Op Cm iburst .Op Cm version Ar version .Op Cm prefer .Op Cm minpoll Ar minpoll .Op Cm maxpoll Ar maxpoll .Op Cm true .Xc .It Xo Ic peer Ar address .Op Cm key Ar key \&| Cm autokey .Op Cm version Ar version .Op Cm prefer .Op Cm minpoll Ar minpoll .Op Cm maxpoll Ar maxpoll .Op Cm true .Op Cm xleave .Xc .It Xo Ic broadcast Ar address .Op Cm key Ar key \&| Cm autokey .Op Cm version Ar version .Op Cm prefer .Op Cm minpoll Ar minpoll .Op Cm ttl Ar ttl .Op Cm xleave .Xc .It Xo Ic manycastclient Ar address .Op Cm key Ar key \&| Cm autokey .Op Cm version Ar version .Op Cm prefer .Op Cm minpoll Ar minpoll .Op Cm maxpoll Ar maxpoll .Op Cm ttl Ar ttl .Xc .El .Pp These five commands specify the time server name or address to be used and the mode in which to operate. The .Ar address can be either a DNS name or an IP address in dotted\-quad notation. Additional information on association behavior can be found in the .Qq Association Management page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . .Bl -tag -width indent .It Ic pool For type s addresses, this command mobilizes a persistent client mode association with a number of remote servers. In this mode the local clock can be synchronized to the remote server, but the remote server can never be synchronized to the local clock. .It Ic server For type s and r addresses, this command mobilizes a persistent client mode association with the specified remote server or local radio clock. In this mode the local clock can be synchronized to the remote server, but the remote server can never be synchronized to the local clock. This command should .Em not be used for type b or m addresses.
.It Ic peer For type s addresses (only), this command mobilizes a persistent symmetric\-active mode association with the specified remote peer. In this mode the local clock can be synchronized to the remote peer or the remote peer can be synchronized to the local clock. This is useful in a network of servers where, depending on various failure scenarios, either the local or remote peer may be the better source of time. This command should NOT be used for type b, m or r addresses. .It Ic broadcast For type b and m addresses (only), this command mobilizes a persistent broadcast mode association. Multiple commands can be used to specify multiple local broadcast interfaces (subnets) and/or multiple multicast groups. Note that local broadcast messages go only to the interface associated with the subnet specified, but multicast messages go to all interfaces. In broadcast mode the local server sends periodic broadcast messages to a client population at the .Ar address specified, which is usually the broadcast address on (one of) the local network(s) or a multicast address assigned to NTP. The IANA has assigned the multicast group address IPv4 224.0.1.1 and IPv6 ff05::101 (site local) exclusively to NTP, but other nonconflicting addresses can be used to contain the messages within administrative boundaries. Ordinarily, this specification applies only to the local server operating as a sender; for operation as a broadcast client, see the .Ic broadcastclient or .Ic multicastclient commands below. .It Ic manycastclient For type m addresses (only), this command mobilizes a manycast client mode association for the multicast address specified. In this case a specific address must be supplied which matches the address used on the .Ic manycastserver command for the designated manycast servers. 
The NTP multicast address 224.0.1.1 assigned by the IANA should NOT be used, unless specific means are taken to avoid spraying large areas of the Internet with these messages and causing a possibly massive implosion of replies at the sender. The .Ic manycastclient command specifies that the local server is to operate in client mode with the remote servers that are discovered as the result of broadcast/multicast messages. The client broadcasts a request message to the group address associated with the specified .Ar address and specifically enabled servers respond to these messages. The client selects the servers providing the best time and continues as with the .Ic server command. The remaining servers are discarded as if never heard. .El .Pp Options: .Bl -tag -width indent .It Cm autokey All packets sent to and received from the server or peer are to include authentication fields encrypted using the autokey scheme described in .Sx Authentication Options . .It Cm burst When the server is reachable, send a burst of eight packets instead of the usual one. The packet spacing is normally 2 s; however, the spacing between the first and second packets can be changed with the .Ic calldelay command to allow additional time for a modem or ISDN call to complete. This is designed to improve timekeeping quality with the .Ic server command and s addresses. .It Cm iburst When the server is unreachable, send a burst of eight packets instead of the usual one. The packet spacing is normally 2 s; however, the spacing between the first two packets can be changed with the .Ic calldelay command to allow additional time for a modem or ISDN call to complete. This is designed to speed the initial synchronization acquisition with the .Ic server command and s addresses and when .Xr ntpd 8 is started with the .Fl q option.
.It Cm key Ar key All packets sent to and received from the server or peer are to include authentication fields encrypted using the specified .Ar key identifier with values from 1 to 65534, inclusive. The default is to include no encryption field. .It Cm minpoll Ar minpoll .It Cm maxpoll Ar maxpoll These options specify the minimum and maximum poll intervals for NTP messages, as a power of 2 in seconds. The maximum poll interval defaults to 10 (1,024 s), but can be increased by the .Cm maxpoll option to an upper limit of 17 (36.4 h). The minimum poll interval defaults to 6 (64 s), but can be decreased by the .Cm minpoll option to a lower limit of 4 (16 s). .It Cm noselect Marks the server as unused, except for display purposes. The server is discarded by the selection algorithm. .It Cm preempt Says the association can be preempted. .It Cm true Marks the server as a truechimer. Use this option only for testing. .It Cm prefer Marks the server as preferred. All other things being equal, this host will be chosen for synchronization among a set of correctly operating hosts. See the .Qq Mitigation Rules and the prefer Keyword page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) for further information. .It Cm true Forces the association to always survive the selection and clustering algorithms. This option should almost certainly .Em only be used while testing an association. .It Cm ttl Ar ttl This option is used only with broadcast server and manycast client modes. It specifies the time\-to\-live .Ar ttl to use on broadcast server and multicast server and the maximum .Ar ttl for the expanding ring search with manycast client packets. Selection of the proper value, which defaults to 127, is something of a black art and should be coordinated with the network administrator. .It Cm version Ar version Specifies the version number to be used for outgoing NTP packets. Versions 1\-4 are the choices, with version 4 the default.
.It Cm xleave Valid in .Cm peer and .Cm broadcast modes only, this flag enables interleave mode. .El .Ss Auxiliary Commands .Bl -tag -width indent .It Ic broadcastclient This command enables reception of broadcast server messages to any local interface (type b) address. Upon receiving a message for the first time, the broadcast client measures the nominal server propagation delay using a brief client/server exchange with the server, then enters the broadcast client mode, in which it synchronizes to succeeding broadcast messages. Note that, in order to avoid accidental or malicious disruption in this mode, both the server and client should operate using symmetric\-key or public\-key authentication as described in .Sx Authentication Options . .It Ic manycastserver Ar address ... This command enables reception of manycast client messages to the multicast group address(es) (type m) specified. At least one address is required, but the NTP multicast address 224.0.1.1 assigned by the IANA should NOT be used, unless specific means are taken to limit the span of the reply and avoid a possibly massive implosion at the original sender. Note that, in order to avoid accidental or malicious disruption in this mode, both the server and client should operate using symmetric\-key or public\-key authentication as described in .Sx Authentication Options . .It Ic multicastclient Ar address ... This command enables reception of multicast server messages to the multicast group address(es) (type m) specified. Upon receiving a message for the first time, the multicast client measures the nominal server propagation delay using a brief client/server exchange with the server, then enters the broadcast client mode, in which it synchronizes to succeeding multicast messages. Note that, in order to avoid accidental or malicious disruption in this mode, both the server and client should operate using symmetric\-key or public\-key authentication as described in .Sx Authentication Options . 
.It Ic mdnstries Ar number If we are participating in mDNS, after we have synched for the first time we attempt to register with the mDNS system. If that registration attempt fails, we try again at one minute intervals for up to .Ic mdnstries times. After all, .Ic ntpd may be starting before mDNS. The default value for .Ic mdnstries is 5. .El .Sh Authentication Support Authentication support allows the NTP client to verify that the server is in fact known and trusted and not an intruder intending accidentally or on purpose to masquerade as that server. The NTPv3 specification RFC\-1305 defines a scheme which provides cryptographic authentication of received NTP packets. Originally, this was done using the Data Encryption Standard (DES) algorithm operating in Cipher Block Chaining (CBC) mode, commonly called DES\-CBC. Subsequently, this was replaced by the RSA Message Digest 5 (MD5) algorithm using a private key, commonly called keyed\-MD5. Either algorithm computes a message digest, or one\-way hash, which can be used to verify the server has the correct private key and key identifier. .Pp NTPv4 retains the NTPv3 scheme, properly described as symmetric key cryptography and, in addition, provides a new Autokey scheme based on public key cryptography. Public key cryptography is generally considered more secure than symmetric key cryptography, since the security is based on a private value which is generated by each server and never revealed. With Autokey all key distribution and management functions involve only public values, which considerably simplifies key distribution and storage. Public key management is based on X.509 certificates, which can be provided by commercial services or produced by utility programs in the OpenSSL software library or the NTPv4 distribution. 
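For the symmetric key scheme, a configuration might look like the following sketch. It uses the keys and trustedkey commands and the key option documented in this page; the key file path, key identifier and host name are assumptions chosen for illustration.

```
# Assumed path; the key file must be distributed by secure means.
keys /etc/ntp.keys
# Only key identifiers listed here are accepted for authentication.
trustedkey 4
# Hypothetical server; packets are authenticated with key identifier 4.
server ntp1.example.org key 4
```

The remote server must hold the same key under the same key identifier for the association to authenticate.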
.Pp While the algorithms for symmetric key cryptography are included in the NTPv4 distribution, public key cryptography requires the OpenSSL software library to be installed before building the NTP distribution. Directions for doing that are on the Building and Installing the Distribution page. .Pp Authentication is configured separately for each association using the .Cm key or .Cm autokey subcommand on the .Ic peer , .Ic server , .Ic broadcast and .Ic manycastclient configuration commands as described in .Sx Configuration Options page. The authentication options described below specify the locations of the key files, if other than default, which symmetric keys are trusted and the interval between various operations, if other than default. .Pp Authentication is always enabled, although ineffective if not configured as described below. If an NTP packet arrives including a message authentication code (MAC), it is accepted only if it passes all cryptographic checks. The checks require correct key ID, key value and message digest. If the packet has been modified in any way or replayed by an intruder, it will fail one or more of these checks and be discarded. Furthermore, the Autokey scheme requires a preliminary protocol exchange to obtain the server certificate, verify its credentials and initialize the protocol. .Pp The .Cm auth flag controls whether new associations or remote configuration commands require cryptographic authentication. This flag can be set or reset by the .Ic enable and .Ic disable commands and also by remote configuration commands sent by a .Xr ntpdc 8 program running on another machine. If this flag is enabled, which is the default case, new broadcast client and symmetric passive associations and remote configuration commands must be cryptographically authenticated using either symmetric key or public key cryptography. If this flag is disabled, these operations are effective even if not cryptographically authenticated.
It should be understood that operating with the .Ic auth flag disabled invites a significant vulnerability where a rogue hacker can masquerade as a falseticker and seriously disrupt system timekeeping. It is important to note that this flag has no purpose other than to allow or disallow a new association in response to new broadcast and symmetric active messages and remote configuration commands and, in particular, the flag has no effect on the authentication process itself. .Pp An attractive alternative where multicast support is available is manycast mode, in which clients periodically troll for servers as described in the .Sx Automatic NTP Configuration Options page. Either symmetric key or public key cryptographic authentication can be used in this mode. The principal advantage of manycast mode is that potential servers need not be configured in advance, since the client finds them during regular operation, and the configuration files for all clients can be identical. .Pp The security model and protocol schemes for both symmetric key and public key cryptography are summarized below; further details are in the briefings, papers and reports at the NTP project page linked from .Li http://www.ntp.org/ . .Ss Symmetric\-Key Cryptography The original RFC\-1305 specification allows any one of possibly 65,534 keys, each distinguished by a 32\-bit key identifier, to authenticate an association. The servers and clients involved must agree on the key and key identifier to authenticate NTP packets. Keys and related information are specified in a key file, usually called .Pa ntp.keys , which must be distributed and stored using secure means beyond the scope of the NTP protocol itself. Besides the keys used for ordinary NTP associations, additional keys can be used as passwords for the .Xr ntpq 8 and .Xr ntpdc 8 utility programs. .Pp When .Xr ntpd 8 is first started, it reads the key file specified in the .Ic keys configuration command and installs the keys in the key cache.
However, individual keys must be activated with the .Ic trustedkey command before use. This allows, for instance, the installation of possibly several batches of keys and then activating or deactivating each batch remotely using .Xr ntpdc 8 . This also provides a revocation capability that can be used if a key becomes compromised. The .Ic requestkey command selects the key used as the password for the .Xr ntpdc 8 utility, while the .Ic controlkey command selects the key used as the password for the .Xr ntpq 8 utility. .Ss Public Key Cryptography NTPv4 supports the original NTPv3 symmetric key scheme described in RFC\-1305 and in addition the Autokey protocol, which is based on public key cryptography. The Autokey Version 2 protocol described on the Autokey Protocol page verifies packet integrity using MD5 message digests and verifies the source with digital signatures and any of several digest/signature schemes. Optional identity schemes described on the Identity Schemes page and based on cryptographic challenge/response algorithms are also available. Using all of these schemes provides strong security against replay with or without modification, spoofing, masquerade and most forms of clogging attacks. .\" .Pp .\" The cryptographic means necessary for all Autokey operations .\" is provided by the OpenSSL software library. .\" This library is available from http://www.openssl.org/ .\" and can be installed using the procedures outlined .\" in the Building and Installing the Distribution page. .\" Once installed, .\" the configure and build .\" process automatically detects the library and links .\" the library routines required. .Pp The Autokey protocol has several modes of operation corresponding to the various NTP modes supported. Most modes use a special cookie which can be computed independently by the client and server, but encrypted in transmission.
All modes use in addition a variant of the S\-KEY scheme, in which a pseudo\-random key list is generated and used in reverse order. These schemes are described along with an executive summary, current status, briefing slides and reading list on the .Sx Autonomous Authentication page. .Pp The specific cryptographic environment used by Autokey servers and clients is determined by a set of files and soft links generated by the .Xr ntp\-keygen 1 program. This includes a required host key file, required certificate file and optional sign key file, leapsecond file and identity scheme files. The digest/signature scheme is specified in the X.509 certificate along with the matching sign key. There are several schemes available in the OpenSSL software library, each identified by a specific string such as .Cm md5WithRSAEncryption , which stands for the MD5 message digest with RSA encryption scheme. The current NTP distribution supports all the schemes in the OpenSSL library, including those based on RSA and DSA digital signatures. .Pp NTP secure groups can be used to define cryptographic compartments and security hierarchies. It is important that every host in the group be able to construct a certificate trail to one or more trusted hosts in the same group. Each group host runs the Autokey protocol to obtain the certificates for all hosts along the trail to one or more trusted hosts. This requires the configuration file in all hosts to be engineered so that, even under anticipated failure conditions, the NTP subnet will form such that every group host can find a trail to at least one trusted host. .Ss Naming and Addressing It is important to note that Autokey does not use DNS to resolve addresses, since DNS can't be completely trusted until the name servers have synchronized clocks. The cryptographic name used by Autokey to bind the host identity credentials and cryptographic values must be independent of interface, network and any other naming convention.
The name appears in the host certificate in either or both the subject and issuer fields, so protection against DNS compromise is essential. .Pp By convention, the name of an Autokey host is the name returned by the Unix .Xr gethostname 2 system call or equivalent in other systems. By the system design model, there are no provisions to allow alternate names or aliases. However, this is not to say that DNS aliases, different names for each interface, etc., are constrained in any way. .Pp It is also important to note that Autokey verifies authenticity using the host name, network address and public keys, all of which are bound together by the protocol specifically to deflect masquerade attacks. For this reason Autokey includes the source and destination IP addresses in message digest computations and so the same addresses must be available at both the server and client. For this reason operation with network address translation schemes is not possible. This reflects the intended robust security model where government and corporate NTP servers are operated outside firewall perimeters. .Ss Operation A specific combination of authentication scheme (none, symmetric key, public key) and identity scheme is called a cryptotype, although not all combinations are compatible. There may be management configurations where the clients, servers and peers may not all support the same cryptotypes. A secure NTPv4 subnet can be configured in many ways while keeping in mind the principles explained above and in this section. Note however that some cryptotype combinations may successfully interoperate with each other, but may not represent good security practice. .Pp The cryptotype of an association is determined at the time of mobilization, either at configuration time or some time later when a message of appropriate cryptotype arrives. 
When mobilized by a .Ic server or .Ic peer configuration command and no .Ic key or .Ic autokey subcommands are present, the association is not authenticated; if the .Ic key subcommand is present, the association is authenticated using the symmetric key ID specified; if the .Ic autokey subcommand is present, the association is authenticated using Autokey. .Pp When multiple identity schemes are supported in the Autokey protocol, the first message exchange determines which one is used. The client request message contains bits corresponding to which schemes it has available. The server response message contains bits corresponding to which schemes it has available. Both server and client match the received bits with their own and select a common scheme. .Pp Following the principle that time is a public value, a server responds to any client packet that matches its cryptotype capabilities. Thus, a server receiving an unauthenticated packet will respond with an unauthenticated packet, while the same server receiving a packet of a cryptotype it supports will respond with packets of that cryptotype. However, unconfigured broadcast or manycast client associations or symmetric passive associations will not be mobilized unless the server supports a cryptotype compatible with the first packet received. By default, unauthenticated associations will not be mobilized unless overridden in a decidedly dangerous way. .Pp Some examples may help to reduce confusion. Client Alice has no specific cryptotype selected. Server Bob has both a symmetric key file and minimal Autokey files. Alice's unauthenticated messages arrive at Bob, who replies with unauthenticated messages. Cathy has a copy of Bob's symmetric key file and has selected key ID 4 in messages to Bob. Bob verifies the message with his key ID 4. If it's the same key and the message is verified, Bob sends Cathy a reply authenticated with that key. 
If verification fails, Bob sends Cathy a thing called a crypto\-NAK, which tells her something broke. She can see the evidence using the .Xr ntpq 8 program. .Pp Denise has rolled her own host key and certificate. She also uses one of the same identity schemes as Bob. She sends the first Autokey message to Bob and they both dance the protocol authentication and identity steps. If all comes out okay, Denise and Bob continue as described above. .Pp It should be clear from the above that Bob can support all the girls at the same time, as long as he has compatible authentication and identity credentials. Now, Bob can act just like the girls in his own choice of servers; he can run multiple configured associations with multiple different servers (or the same server, although that might not be useful). But, wise security policy might preclude some cryptotype combinations; for instance, running an identity scheme with one server and no authentication with another might not be wise. .Ss Key Management The cryptographic values used by the Autokey protocol are incorporated as a set of files generated by the .Xr ntp\-keygen 1 utility program, including symmetric key, host key and public certificate files, as well as sign key, identity parameters and leapseconds files. Alternatively, host and sign keys and certificate files can be generated by the OpenSSL utilities and certificates can be imported from public certificate authorities. Note that symmetric keys are necessary for the .Xr ntpq 8 and .Xr ntpdc 8 utility programs. The remaining files are necessary only for the Autokey protocol. .Pp Certificates imported from OpenSSL or public certificate authorities have certain limitations. The certificate should be in ASN.1 syntax, X.509 Version 3 format and encoded in PEM, which is the same format used by OpenSSL. The overall length of the certificate encoded in ASN.1 must not exceed 1024 bytes.
The subject distinguished name field (CN) is the fully qualified name of the host on which it is used; the remaining subject fields are ignored. The certificate extension fields must not contain either a subject key identifier or an issuer key identifier field; however, an extended key usage field for a trusted host must contain the value .Cm trustRoot . Other extension fields are ignored. .Ss Authentication Commands .Bl -tag -width indent .It Ic autokey Op Ar logsec Specifies the interval between regenerations of the session key list used with the Autokey protocol. Note that the size of the key list for each association depends on this interval and the current poll interval. The default value is 12 (4096 s or about 1.1 hours). For poll intervals above the specified interval, a session key list with a single entry will be regenerated for every message sent. .It Ic controlkey Ar key Specifies the key identifier to use with the .Xr ntpq 8 utility, which uses the standard protocol defined in RFC\-1305. The .Ar key argument is the key identifier for a trusted key, where the value can be in the range 1 to 65,534, inclusive. .It Xo Ic crypto .Op Cm cert Ar file .Op Cm leap Ar file .Op Cm randfile Ar file .Op Cm host Ar file .Op Cm sign Ar file .Op Cm gq Ar file .Op Cm gqpar Ar file .Op Cm iffpar Ar file .Op Cm mvpar Ar file .Op Cm pw Ar password .Xc This command requires the OpenSSL library. It activates public key cryptography, selects the message digest and signature encryption scheme and loads the required private and public values described above. If one or more files are left unspecified, the default names are used as described above. Unless the complete path and name of the file are specified, the location of a file is relative to the keys directory specified in the .Ic keysdir command or default .Pa /usr/local/etc . Following are the subcommands: .Bl -tag -width indent .It Cm cert Ar file Specifies the location of the required host public certificate file.
This overrides the link .Pa ntpkey_cert_ Ns Ar hostname in the keys directory. .It Cm gqpar Ar file Specifies the location of the optional GQ parameters file. This overrides the link .Pa ntpkey_gq_ Ns Ar hostname in the keys directory. .It Cm host Ar file Specifies the location of the required host key file. This overrides the link .Pa ntpkey_key_ Ns Ar hostname in the keys directory. .It Cm iffpar Ar file Specifies the location of the optional IFF parameters file. This overrides the link .Pa ntpkey_iff_ Ns Ar hostname in the keys directory. .It Cm leap Ar file Specifies the location of the optional leapsecond file. This overrides the link .Pa ntpkey_leap in the keys directory. .It Cm mvpar Ar file Specifies the location of the optional MV parameters file. This overrides the link .Pa ntpkey_mv_ Ns Ar hostname in the keys directory. .It Cm pw Ar password Specifies the password to decrypt files containing private keys and identity parameters. This is required only if these files have been encrypted. .It Cm randfile Ar file Specifies the location of the random seed file used by the OpenSSL library. The defaults are described in the main text above. .It Cm sign Ar file Specifies the location of the optional sign key file. This overrides the link .Pa ntpkey_sign_ Ns Ar hostname in the keys directory. If this file is not found, the host key is also the sign key. .El .It Ic keys Ar keyfile Specifies the complete path and location of the MD5 key file containing the keys and key identifiers used by .Xr ntpd 8 , .Xr ntpq 8 and .Xr ntpdc 8 when operating with symmetric key cryptography. This is the same operation as the .Fl k command line option. .It Ic keysdir Ar path This command specifies the default directory path for cryptographic keys, parameters and certificates. The default is .Pa /usr/local/etc/ . 
.It Ic requestkey Ar key Specifies the key identifier to use with the .Xr ntpdc 8 utility program, which uses a proprietary protocol specific to this implementation of .Xr ntpd 8 . The .Ar key argument is a key identifier for the trusted key, where the value can be in the range 1 to 65,534, inclusive. .It Ic revoke Ar logsec Specifies the interval between re\-randomization of certain cryptographic values used by the Autokey scheme, as a power of 2 in seconds. These values need to be updated frequently in order to deflect brute\-force attacks on the algorithms of the scheme; however, updating some values is a relatively expensive operation. The default interval is 16 (65,536 s or about 18 hours). For poll intervals above the specified interval, the values will be updated for every message sent. .It Ic trustedkey Ar key ... Specifies the key identifiers which are trusted for the purposes of authenticating peers with symmetric key cryptography, as well as keys used by the .Xr ntpq 8 and .Xr ntpdc 8 programs. The authentication procedures require that both the local and remote servers share the same key and key identifier for this purpose, although different keys can be used with different servers. The .Ar key arguments are 32\-bit unsigned integers with values from 1 to 65,534. .El .Ss Error Codes The following error codes are reported via the NTP control and monitoring protocol trap mechanism. .Bl -tag -width indent .It 101 .Pq bad field format or length The packet has invalid version, length or format. .It 102 .Pq bad timestamp The packet timestamp is the same or older than the most recent received. This could be due to a replay or a server clock time step. .It 103 .Pq bad filestamp The packet filestamp is the same or older than the most recent received. This could be due to a replay or a key file generation error. .It 104 .Pq bad or missing public key The public key is missing, has incorrect format or is an unsupported type. 
.It 105 .Pq unsupported digest type The server requires an unsupported digest/signature scheme. .It 106 .Pq mismatched digest types Not used. .It 107 .Pq bad signature length The signature length does not match the current public key. .It 108 .Pq signature not verified The message fails the signature check. It could be bogus or signed by a different private key. .It 109 .Pq certificate not verified The certificate is invalid or signed with the wrong key. .It 110 .Pq certificate not verified The certificate is not yet valid or has expired or the signature could not be verified. .It 111 .Pq bad or missing cookie The cookie is missing, corrupted or bogus. .It 112 .Pq bad or missing leapseconds table The leapseconds table is missing, corrupted or bogus. .It 113 .Pq bad or missing certificate The certificate is missing, corrupted or bogus. .It 114 .Pq bad or missing identity The identity key is missing, corrupt or bogus. .El .Sh Monitoring Support .Xr ntpd 8 includes a comprehensive monitoring facility suitable for continuous, long term recording of server and client timekeeping performance. See the .Ic statistics command below for a listing and example of each type of statistics currently supported. Statistic files are managed using file generation sets and scripts in the .Pa ./scripts directory of the source code distribution. Using these facilities and .Ux .Xr cron 8 jobs, the data can be automatically summarized and archived for retrospective analysis. .Ss Monitoring Commands .Bl -tag -width indent .It Ic statistics Ar name ... Enables writing of statistics records. Currently, eight kinds of .Ar name statistics are supported. .Bl -tag -width indent .It Cm clockstats Enables recording of clock driver statistics information. 
Each update received from a clock driver appends a line of the following form to the file generation set named .Cm clockstats : .Bd -literal 49213 525.624 127.127.4.1 93 226 00:08:29.606 D .Ed .Pp The first two fields show the date (Modified Julian Day) and time (seconds and fraction past UTC midnight). The next field shows the clock address in dotted\-quad notation. The final field shows the last timecode received from the clock in decoded ASCII format, where meaningful. In some clock drivers a good deal of additional information can be gathered and displayed as well. See information specific to each clock for further details. .It Cm cryptostats This option requires the OpenSSL cryptographic software library. It enables recording of cryptographic public key protocol information. Each message received by the protocol module appends a line of the following form to the file generation set named .Cm cryptostats : .Bd -literal 49213 525.624 127.127.4.1 message .Ed .Pp The first two fields show the date (Modified Julian Day) and time (seconds and fraction past UTC midnight). The next field shows the peer address in dotted\-quad notation. The final message field includes the message type and certain ancillary information. See the .Sx Authentication Options section for further information. .It Cm loopstats Enables recording of loop filter statistics information. Each update of the local clock outputs a line of the following form to the file generation set named .Cm loopstats : .Bd -literal 50935 75440.031 0.000006019 13.778190 0.000351733 0.0133806 .Ed .Pp The first two fields show the date (Modified Julian Day) and time (seconds and fraction past UTC midnight). The next five fields show time offset (seconds), frequency offset (parts per million \- PPM), RMS jitter (seconds), Allan deviation (PPM) and clock discipline time constant. .It Cm peerstats Enables recording of peer statistics information.
This includes statistics records of all peers of an NTP server and of special signals, where present and configured. Each valid update appends a line of the following form to the current element of a file generation set named .Cm peerstats : .Bd -literal 48773 10847.650 127.127.4.1 9714 \-0.001605376 0.000000000 0.001424877 0.000958674 .Ed .Pp The first two fields show the date (Modified Julian Day) and time (seconds and fraction past UTC midnight). The next two fields show the peer address in dotted\-quad notation and status, respectively. The status field is encoded in hex in the format described in Appendix A of the NTP specification RFC 1305. The final four fields show the offset, delay, dispersion and RMS jitter, all in seconds. .It Cm rawstats Enables recording of raw\-timestamp statistics information. This includes statistics records of all peers of an NTP server and of special signals, where present and configured. Each NTP message received from a peer or clock driver appends a line of the following form to the file generation set named .Cm rawstats : .Bd -literal 50928 2132.543 128.4.1.1 128.4.1.20 3102453281.584327000 3102453281.586228000 3102453332.540806000 3102453332.541458000 .Ed .Pp The first two fields show the date (Modified Julian Day) and time (seconds and fraction past UTC midnight). The next two fields show the remote peer or clock address followed by the local address in dotted\-quad notation. The final four fields show the originate, receive, transmit and final NTP timestamps in order. The timestamp values are as received and before processing by the various data smoothing and mitigation algorithms. .It Cm sysstats Enables recording of ntpd statistics counters on a periodic basis.
Each hour a line of the following form is appended to the file generation set named .Cm sysstats : .Bd -literal 50928 2132.543 36000 81965 0 9546 56 71793 512 540 10 147 .Ed .Pp The first two fields show the date (Modified Julian Day) and time (seconds and fraction past UTC midnight). The remaining ten fields show the statistics counter values accumulated since the last generated line. .Bl -tag -width indent .It Time since restart Cm 36000 Time in hours since the system was last rebooted. .It Packets received Cm 81965 Total number of packets received. .It Packets processed Cm 0 Number of packets received in response to previous packets sent .It Current version Cm 9546 Number of packets matching the current NTP version. .It Previous version Cm 56 Number of packets matching the previous NTP version. .It Bad version Cm 71793 Number of packets matching neither NTP version. .It Access denied Cm 512 Number of packets denied access for any reason. .It Bad length or format Cm 540 Number of packets with invalid length, format or port number. .It Bad authentication Cm 10 Number of packets not verified as authentic. .It Rate exceeded Cm 147 Number of packets discarded due to rate limitation. .El .It Cm statsdir Ar directory_path Indicates the full path of a directory where statistics files should be created (see below). This keyword allows the (otherwise constant) .Cm filegen filename prefix to be modified for file generation sets, which is useful for handling statistics logs. .It Cm filegen Ar name Xo .Op Cm file Ar filename .Op Cm type Ar typename .Op Cm link | nolink .Op Cm enable | disable .Xc Configures setting of generation file set name. Generation file sets provide a means for handling files that are continuously growing during the lifetime of a server. Server statistics are a typical example for such files. Generation file sets provide access to a set of files used to store the actual data. At any time at most one element of the set is being written to. 
The type given specifies when and how data will be directed to a new element of the set. This way, information stored in elements of a file set that are currently unused is available for administrative operations without the risk of disturbing the operation of ntpd. (Most importantly, they can be removed to free space for new data produced.) .Pp Note that this command can be sent from the .Xr ntpdc 8 program running at a remote location. .Bl -tag -width indent .It Cm name This is the type of the statistics records, as shown in the .Cm statistics command. .It Cm file Ar filename This is the file name for the statistics records. Filenames of set members are built from three concatenated elements .Cm prefix , .Cm filename and .Cm suffix : .Bl -tag -width indent .It Cm prefix This is a constant filename path. It is not subject to modifications via the .Ar filegen option. It is defined by the server, usually specified as a compile\-time constant. It may, however, be configurable for individual file generation sets via other commands. For example, the prefix used with .Ar loopstats and .Ar peerstats generation can be configured using the .Ar statsdir option explained above. .It Cm filename This string is directly concatenated to the prefix mentioned above (no intervening .Ql / ) . This can be modified using the file argument to the .Ar filegen statement. No .Pa .. elements are allowed in this component to prevent filenames referring to parts outside the filesystem hierarchy denoted by .Ar prefix . .It Cm suffix This part reflects individual elements of a file set. It is generated according to the type of a file set. .El .It Cm type Ar typename A file generation set is characterized by its type. The following types are supported: .Bl -tag -width indent .It Cm none The file set is actually a single plain file. .It Cm pid One element of file set is used per incarnation of an ntpd server.
This type does not perform any changes to file set members during runtime; however, it provides an easy way of separating files belonging to different .Xr ntpd 8 server incarnations. The set member filename is built by appending a .Ql \&. to concatenated .Ar prefix and .Ar filename strings, and appending the decimal representation of the process ID of the .Xr ntpd 8 server process. .It Cm day One file generation set element is created per day. A day is defined as the period between 00:00 and 24:00 UTC. The file set member suffix consists of a .Ql \&. and a day specification in the form .Cm YYYYMMdd . .Cm YYYY is a 4\-digit year number (e.g., 1992). .Cm MM is a two digit month number. .Cm dd is a two digit day number. Thus, all information written on 10 December 1992 would end up in a file named .Ar prefix .Ar filename Ns .19921210 . .It Cm week Any file set member contains data related to a certain week of a year. The term week is defined by computing day\-of\-year modulo 7. Elements of such a file generation set are distinguished by appending the following suffix to the file set filename base: A dot, a 4\-digit year number, the letter .Cm W , and a 2\-digit week number. For example, information from January 10th, 1992 would end up in a file with suffix .No . Ns Ar 1992W1 . .It Cm month One generation file set element is generated per month. The file name suffix consists of a dot, a 4\-digit year number, and a 2\-digit month. .It Cm year One generation file element is generated per year. The filename suffix consists of a dot and a 4 digit year number. .It Cm age This type of file generation sets changes to a new element of the file set every 24 hours of server operation. The filename suffix consists of a dot, the letter .Cm a , and an 8\-digit number. This number is taken to be the number of seconds the server has been running at the start of the corresponding 24\-hour period.
Information is only written to a file generation by specifying .Cm enable ; output is prevented by specifying .Cm disable . .El .It Cm link | nolink It is convenient to be able to access the current element of a file generation set by a fixed name. This feature is enabled by specifying .Cm link and disabled using .Cm nolink . If link is specified, a hard link from the current file set element to a file without suffix is created. When there is already a file with this name and the number of links of this file is one, it is renamed appending a dot, the letter .Cm C , and the pid of the .Xr ntpd 8 server process. When the number of links is greater than one, the file is unlinked. This allows the current file to be accessed by a constant name. .It Cm enable \&| Cm disable Enables or disables the recording function. .El .El .El .Sh Access Control Support The .Xr ntpd 8 daemon implements a general purpose address/mask based restriction list. The list contains address/match entries sorted first by increasing address values and then by increasing mask values. A match occurs when the bitwise AND of the mask and the packet source address is equal to the bitwise AND of the mask and address in the list. The list is searched in order with the last match found defining the restriction flags associated with the entry. Additional information and examples can be found in the .Qq Notes on Configuring NTP and Setting up a NTP Subnet page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . .Pp The restriction facility was implemented in conformance with the access policies for the original NSFnet backbone time servers. Later the facility was expanded to deflect cryptographic and clogging attacks. While this facility may be useful for keeping unwanted or broken or malicious clients from congesting innocent servers, it should not be considered an alternative to the NTP authentication facilities.
Source address based restrictions are easily circumvented by a determined cracker. .Pp Clients can be denied service because they are explicitly included in the restrict list created by the .Ic restrict command or implicitly as the result of cryptographic or rate limit violations. Cryptographic violations include certificate or identity verification failure; rate limit violations generally result from defective NTP implementations that send packets at abusive rates. Some violations cause denied service only for the offending packet, others cause denied service for a timed period and others cause denied service for an indefinite period. When a client or network is denied access for an indefinite period, the only way at present to remove the restrictions is by restarting the server. .Ss The Kiss\-of\-Death Packet Ordinarily, packets denied service are simply dropped with no further action except incrementing statistics counters. Sometimes a more proactive response is needed, such as a server message that explicitly requests the client to stop sending and leave a message for the system operator. A special packet format has been created for this purpose called the "kiss\-of\-death" (KoD) packet. KoD packets have the leap bits set unsynchronized and stratum set to zero and the reference identifier field set to a four\-byte ASCII code. If the .Cm noserve or .Cm notrust flag of the matching restrict list entry is set, the code is "DENY"; if the .Cm limited flag is set and the rate limit is exceeded, the code is "RATE". Finally, if a cryptographic violation occurs, the code is "CRYP". .Pp A client receiving a KoD performs a set of sanity checks to minimize security exposure, then updates the stratum and reference identifier peer variables, sets the access denied (TEST4) bit in the peer flash variable and sends a message to the log. As long as the TEST4 bit is set, the client will send no further packets to the server.
The only way at present to recover from this condition is to restart the protocol at both the client and server. This happens automatically at the client when the association times out. It will happen at the server only if the server operator cooperates. .Ss Access Control Commands .Bl -tag -width indent .It Xo Ic discard .Op Cm average Ar avg .Op Cm minimum Ar min .Op Cm monitor Ar prob .Xc Set the parameters of the .Cm limited facility which protects the server from client abuse. The .Cm average subcommand specifies the minimum average packet spacing, while the .Cm minimum subcommand specifies the minimum packet spacing. Packets that violate these minima are discarded and a kiss\-o'\-death packet returned if enabled. The default minimum average and minimum are 5 and 2, respectively. The .Ic monitor subcommand specifies the probability of discard for packets that overflow the rate\-control window. .It Xo Ic restrict address .Op Cm mask Ar mask .Op Ar flag ... .Xc The .Ar address argument expressed in dotted\-quad form is the address of a host or network. Alternatively, the .Ar address argument can be a valid host DNS name. The .Ar mask argument expressed in dotted\-quad form defaults to .Cm 255.255.255.255 , meaning that the .Ar address is treated as the address of an individual host. A default entry (address .Cm 0.0.0.0 , mask .Cm 0.0.0.0 ) is always included and is always the first entry in the list. Note that text string .Cm default , with no mask option, may be used to indicate the default entry. In the current implementation, .Cm flag always restricts access, i.e., an entry with no flags indicates that free access to the server is to be given. The flags are not orthogonal, in that more restrictive flags will often make less restrictive ones redundant. The flags can generally be classed into two categories, those which restrict time service and those which restrict informational queries and attempts to do run\-time reconfiguration of the server. 
One or more of the following flags may be specified: .Bl -tag -width indent .It Cm ignore Deny packets of all kinds, including .Xr ntpq 8 and .Xr ntpdc 8 queries. .It Cm kod If this flag is set when an access violation occurs, a kiss\-o'\-death (KoD) packet is sent. KoD packets are rate limited to no more than one per second. If another KoD packet occurs within one second after the last one, the packet is dropped. .It Cm limited Deny service if the packet spacing violates the lower limits specified in the .Ic discard command. A history of clients is kept using the monitoring capability of .Xr ntpd 8 . Thus, monitoring is always active as long as there is a restriction entry with the .Cm limited flag. .It Cm lowpriotrap Declare traps set by matching hosts to be low priority. The number of traps a server can maintain is limited (the current limit is 3). Traps are usually assigned on a first come, first served basis, with later trap requestors being denied service. This flag modifies the assignment algorithm by allowing low priority traps to be overridden by later requests for normal priority traps. .It Cm nomodify Deny .Xr ntpq 8 and .Xr ntpdc 8 queries which attempt to modify the state of the server (i.e., run time reconfiguration). Queries which return information are permitted. .It Cm noquery Deny .Xr ntpq 8 and .Xr ntpdc 8 queries. Time service is not affected. .It Cm nopeer Deny packets which would result in mobilizing a new association. This includes broadcast and symmetric active packets when a configured association does not exist. It also includes .Cm pool associations, so if you want to use servers from a .Cm pool directive and also want to use .Cm nopeer by default, you'll want a .Cm "restrict source ..." line as well that does .Em not include the .Cm nopeer directive. .It Cm noserve Deny all packets except .Xr ntpq 8 and .Xr ntpdc 8 queries. .It Cm notrap Decline to provide mode 6 control message trap service to matching hosts.
The trap service is a subsystem of the .Xr ntpq 8 control message protocol which is intended for use by remote event logging programs. .It Cm notrust Deny service unless the packet is cryptographically authenticated. .It Cm ntpport This is actually a match algorithm modifier, rather than a restriction flag. Its presence causes the restriction entry to be matched only if the source port in the packet is the standard NTP UDP port (123). Both .Cm ntpport and .Cm non\-ntpport may be specified. The .Cm ntpport is considered more specific and is sorted later in the list. .It Cm version Deny packets that do not match the current NTP version. .El .Pp Default restriction list entries with the flags ignore, interface, ntpport, for each of the local host's interface addresses are inserted into the table at startup to prevent the server from attempting to synchronize to its own time. A default entry is also always present, though if it is otherwise unconfigured, no flags are associated with the default entry (i.e., everything besides your own NTP server is unrestricted). .El .Sh Automatic NTP Configuration Options .Ss Manycasting Manycasting is an automatic discovery and configuration paradigm new to NTPv4. It is intended as a means for a multicast client to troll the nearby network neighborhood to find cooperating manycast servers, validate them using cryptographic means and evaluate their time values with respect to other servers that might be lurking in the vicinity. The intended result is that each manycast client mobilizes client associations with some number of the "best" of the nearby manycast servers, yet automatically reconfigures to sustain this number of servers should one or another fail. .Pp Note that the manycasting paradigm does not coincide with the anycast paradigm described in RFC\-1546, which is designed to find a single server from a clique of servers providing the same service.
The manycast paradigm is designed to find a plurality of redundant servers satisfying defined optimality criteria. .Pp Manycasting can be used with either symmetric key or public key cryptography. The public key infrastructure (PKI) offers the best protection against compromised keys and is generally considered stronger, at least with relatively large key sizes. It is implemented using the Autokey protocol and the OpenSSL cryptographic library available from .Li http://www.openssl.org/ . The library can be used with other NTPv4 modes as well and is highly recommended, especially for broadcast modes. .Pp A persistent manycast client association is configured using the .Ic manycastclient command, which is similar to the .Ic server command but with a multicast (IPv4 class .Cm D or IPv6 prefix .Cm FF ) group address. The IANA has designated IPv4 address 224.0.1.1 and IPv6 address FF05::101 (site local) for NTP. When more servers are needed, the client broadcasts manycast client messages to this address at the minimum feasible rate and minimum feasible time\-to\-live (TTL) hops, depending on how many servers have already been found. There can be as many manycast client associations as different group addresses, each one serving as a template for a future ephemeral unicast client/server association. .Pp Manycast servers configured with the .Ic manycastserver command listen on the specified group address for manycast client messages. Note the distinction between manycast client, which actively broadcasts messages, and manycast server, which passively responds to them. If a manycast server is in scope of the current TTL and is itself synchronized to a valid source and operating at a stratum level equal to or lower than the manycast client, it replies to the manycast client message with an ordinary unicast server message.
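.Pp
As a hedged illustration of the two commands just described, the fragments below sketch a matching client and server pair using symmetric key authentication.
The group address 239.1.1.1, the key file path and the key identifier 1 are assumptions chosen for the example, not defaults.
.Bd -literal
# Manycast client (assumed group address and key identifier)
keys /usr/local/etc/ntp.keys
trustedkey 1
manycastclient 239.1.1.1 key 1

# Manycast server listening on the same assumed group
keys /usr/local/etc/ntp.keys
trustedkey 1
manycastserver 239.1.1.1
.Ed
.Pp
With lines like these in place, and the same key and key identifier shared by both hosts, a server in scope answers each manycast client message with the ordinary unicast server message described above.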
.Pp The manycast client receiving this message mobilizes an ephemeral client/server association according to the matching manycast client template, but only if cryptographically authenticated and the server stratum is less than or equal to the client stratum. Authentication is explicitly required and either symmetric key or public key (Autokey) can be used. Then, the client polls the server at its unicast address in burst mode in order to reliably set the host clock and validate the source. This normally results in a volley of eight client/server messages at 2\-s intervals during which both the synchronization and cryptographic protocols run concurrently. Following the volley, the client runs the NTP intersection and clustering algorithms, which act to discard all but the "best" associations according to stratum and synchronization distance. The surviving associations then continue in ordinary client/server mode. .Pp The manycast client polling strategy is designed to reduce as much as possible the volume of manycast client messages and the effects of implosion due to near\-simultaneous arrival of manycast server messages. The strategy is determined by the .Ic manycastclient , .Ic tos and .Ic ttl configuration commands. The manycast poll interval is normally eight times the system poll interval, which starts out at the .Cm minpoll value specified in the .Ic manycastclient command and, under normal circumstances, increments to the .Cm maxpoll value specified in this command. Initially, the TTL is set at the minimum hops specified by the .Ic ttl command. At each retransmission the TTL is increased until reaching the maximum hops specified by this command or a sufficient number of client associations have been found. Further retransmissions use the same TTL.
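.Pp
The polling parameters above might be set as in the following sketch.
The group address, key identifier, poll exponents and hop values are assumptions for illustration only; suitable values depend on the local network scope rules.
.Bd -literal
# Poll exponents are powers of 2 in seconds: 6 is 64 s, 12 is 4096 s.
# The manycast poll interval is normally eight times the system
# poll interval, so it ranges here from 512 s up to 32768 s.
manycastclient 239.1.1.1 key 1 minpoll 6 maxpoll 12

# Expanding-ring search: TTL hop values tried in increasing order.
ttl 31 63 95 127
.Ed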
.Pp The quality and reliability of the suite of associations discovered by the manycast client is determined by the NTP mitigation algorithms and the .Cm minclock and .Cm minsane values specified in the .Ic tos configuration command. At least .Cm minsane candidate servers must be available and the mitigation algorithms produce at least .Cm minclock survivors in order to synchronize the clock. Byzantine agreement principles require at least four candidates in order to correctly discard a single falseticker. For legacy purposes, .Cm minsane defaults to 1 and .Cm minclock defaults to 3. For manycast service .Cm minsane should be explicitly set to 4, assuming at least that number of servers are available. .Pp If at least .Cm minclock servers are found, the manycast poll interval is immediately set to eight times .Cm maxpoll . If fewer than .Cm minclock servers are found when the TTL has reached the maximum hops, the manycast poll interval is doubled. For each transmission after that, the poll interval is doubled again until reaching the maximum of eight times .Cm maxpoll . Further transmissions use the same poll interval and TTL values. Note that while all this is going on, each client/server association found is operating normally at the system poll interval. .Pp Administratively scoped multicast boundaries are normally specified by the network router configuration and, in the case of IPv6, the link/site scope prefix. By default, the increment for TTL hops is 32 starting from 31; however, the .Ic ttl configuration command can be used to modify the values to match the scope rules. .Pp It is often useful to narrow the range of acceptable servers which can be found by manycast client associations. Because manycast servers respond only when the client stratum is equal to or greater than the server stratum, primary (stratum 1) servers will find only primary servers in TTL range, which is probably the most common objective.
However, unless configured otherwise, all manycast clients in TTL range will eventually find all primary servers in TTL range, which is probably not the most common objective in large networks. The .Ic tos command can be used to modify this behavior. Servers with stratum below .Cm floor or above .Cm ceiling specified in the .Ic tos command are strongly discouraged during the selection process; however, these servers may be temporarily accepted if the number of servers within TTL range is less than .Cm minclock . .Pp The above actions occur for each manycast client message, which repeats at the designated poll interval. However, once the ephemeral client association is mobilized, subsequent manycast server replies are discarded, since that would result in a duplicate association. If during a poll interval the number of client associations falls below .Cm minclock , all manycast client prototype associations are reset to the initial poll interval and TTL hops and operation resumes from the beginning. It is important to avoid frequent manycast client messages, since each one requires all manycast servers in TTL range to respond. The result could well be an implosion, either minor or major, depending on the number of servers in range. The recommended value for .Cm maxpoll is 12 (4,096 s). .Pp It is possible and frequently useful to configure a host as both manycast client and manycast server. A number of hosts configured this way and sharing a common group address will automatically organize themselves in an optimum configuration based on stratum and synchronization distance. For example, consider an NTP subnet of two primary servers and a hundred or more dependent clients. With two exceptions, all servers and clients have identical configuration files including both .Ic manycastclient and .Ic manycastserver commands using, for instance, multicast group address 239.1.1.1.
The only exception is that each primary server configuration file must include commands for the primary reference source such as a GPS receiver. .Pp The remaining configuration files for all secondary servers and clients have the same contents, except for the .Ic tos command, which is specific for each stratum level. For stratum 1 and stratum 2 servers, that command is not necessary. For stratum 3 and above servers the .Cm floor value is set to the intended stratum number. Thus, all stratum 3 configuration files are identical, all stratum 4 files are identical and so forth. .Pp Once operations have stabilized in this scenario, the primary servers will find the primary reference source and each other, since they both operate at the same stratum (1), but not with any secondary server or client, since these operate at a higher stratum. The secondary servers will find the servers at the same stratum level. If one of the primary servers loses its GPS receiver, it will continue to operate as a client and other clients will time out the corresponding association and re\-associate accordingly. .Pp Some administrators prefer to avoid running .Xr ntpd 8 continuously and run either .Xr sntp 8 or .Xr ntpd 8 .Fl q as a cron job. In either case the servers must be configured in advance and the program fails if none are available when the cron job runs. A really slick application of manycast is with .Xr ntpd 8 .Fl q . The program wakes up, scans the local landscape looking for the usual suspects, selects the best from among the rascals, sets the clock and then departs. Servers do not have to be configured in advance and all clients throughout the network can have the same configuration file. .Ss Manycast Interactions with Autokey Each time a manycast client sends a client mode packet to a multicast group address, all manycast servers in scope generate a reply including the host name and status word. 
The manycast clients then run the Autokey protocol, which collects and verifies all certificates involved. Following the burst interval all but three survivors are cast off, but the certificates remain in the local cache. It often happens that several complete signing trails from the client to the primary servers are collected in this way. .Pp About once an hour or less often if the poll interval exceeds this, the client regenerates the Autokey key list. This is in general transparent in client/server mode. However, about once per day the server private value used to generate cookies is refreshed along with all manycast client associations. In this case all cryptographic values including certificates are refreshed. If a new certificate has been generated since the last refresh epoch, it will automatically revoke all prior certificates that happen to be in the certificate cache. At the same time, the manycast scheme starts all over from the beginning and the expanding ring shrinks to the minimum and increments from there while collecting all servers in scope. .Ss Broadcast Options .Bl -tag -width indent .It Xo Ic tos .Oo .Cm bcpollbstep Ar gate .Oc .Xc This command provides a way to delay, by the specified number of broadcast poll intervals, believing backward time steps from a broadcast server. Broadcast time networks are expected to be trusted. In the event a broadcast server's time is stepped backwards, there is clear benefit to having the clients notice this change as soon as possible. Attacks such as replay attacks can happen, however, and even though there are a number of protections built in to broadcast mode, attempts to perform a replay attack are possible. This value defaults to 0, but can be changed to any number of poll intervals between 0 and 4.
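.Pp
As an illustrative sketch only, a broadcast client could be told to wait two broadcast poll intervals before believing a backward time step.
The value 2 is an arbitrary choice within the documented range of 0 to 4, not a recommendation:
.Bd -literal
# example value only; the documented default is 0
tos bcpollbstep 2
.Ed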
.Ss Manycast Options .Bl -tag -width indent .It Xo Ic tos .Oo .Cm ceiling Ar ceiling | .Cm cohort { 0 | 1 } | .Cm floor Ar floor | .Cm minclock Ar minclock | .Cm minsane Ar minsane .Oc .Xc This command affects the clock selection and clustering algorithms. It can be used to select the quality and quantity of peers used to synchronize the system clock and is most useful in manycast mode. The variables operate as follows: .Bl -tag -width indent .It Cm ceiling Ar ceiling Peers with strata above .Cm ceiling will be discarded if there are at least .Cm minclock peers remaining. This value defaults to 15, but can be changed to any number from 1 to 15. .It Cm cohort Bro 0 | 1 Brc This is a binary flag which enables (0) or disables (1) manycast server replies to manycast clients with the same stratum level. This is useful to reduce implosions where large numbers of clients with the same stratum level are present. The default is to enable these replies. .It Cm floor Ar floor Peers with strata below .Cm floor will be discarded if there are at least .Cm minclock peers remaining. This value defaults to 1, but can be changed to any number from 1 to 15. .It Cm minclock Ar minclock The clustering algorithm repeatedly casts out outlier associations until no more than .Cm minclock associations remain. This value defaults to 3, but can be changed to any number from 1 to the number of configured sources. .It Cm minsane Ar minsane This is the minimum number of candidates available to the clock selection algorithm in order to produce one or more truechimers for the clustering algorithm. If fewer than this number are available, the clock is undisciplined and allowed to run free. The default is 1 for legacy purposes. However, according to principles of Byzantine agreement, .Cm minsane should be at least 4 in order to detect and discard a single falseticker. .El .It Cm ttl Ar hop ... This command specifies a list of TTL values in increasing order, up to 8 values can be specified. 
In manycast mode these values are used in turn in an expanding\-ring search. The default is eight multiples of 32 starting at 31. .El .Sh Reference Clock Support The NTP Version 4 daemon supports some three dozen different radio, satellite and modem reference clocks plus a special pseudo\-clock used for backup or when no other clock source is available. Detailed descriptions of individual device drivers and options can be found in the .Qq Reference Clock Drivers page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . Additional information can be found in the pages linked there, including the .Qq Debugging Hints for Reference Clock Drivers and .Qq How To Write a Reference Clock Driver pages (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . In addition, support for a PPS signal is available as described in the .Qq Pulse\-per\-second (PPS) Signal Interfacing page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . Many drivers support special line discipline/streams modules which can significantly improve the accuracy using the driver. These are described in the .Qq Line Disciplines and Streams Drivers page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . .Pp A reference clock will generally (though not always) be a radio timecode receiver which is synchronized to a source of standard time such as the services offered by the NRC in Canada and NIST and USNO in the US. The interface between the computer and the timecode receiver is device dependent, but is usually a serial port. A device driver specific to each reference clock must be selected and compiled in the distribution; however, most common radio, satellite and modem clocks are included by default. 
Note that an attempt to configure a reference clock when the driver has not been compiled or the hardware port has not been appropriately configured results in a scalding remark to the system log file, but is otherwise non hazardous. .Pp For the purposes of configuration, .Xr ntpd 8 treats reference clocks in a manner analogous to normal NTP peers as much as possible. Reference clocks are identified by a syntactically correct but invalid IP address, in order to distinguish them from normal NTP peers. Reference clock addresses are of the form .Sm off .Li 127.127. Ar t . Ar u , .Sm on where .Ar t is an integer denoting the clock type and .Ar u indicates the unit number in the range 0\-3. While it may seem overkill, it is in fact sometimes useful to configure multiple reference clocks of the same type, in which case the unit numbers must be unique. .Pp The .Ic server command is used to configure a reference clock, where the .Ar address argument in that command is the clock address. The .Cm key , .Cm version and .Cm ttl options are not used for reference clock support. The .Cm mode option is added for reference clock support, as described below. The .Cm prefer option can be useful to persuade the server to cherish a reference clock with somewhat more enthusiasm than other reference clocks or peers. Further information on this option can be found in the .Qq Mitigation Rules and the prefer Keyword (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) page. The .Cm minpoll and .Cm maxpoll options have meaning only for selected clock drivers. See the individual clock driver document pages for additional information. .Pp The .Ic fudge command is used to provide additional information for individual clock drivers and normally follows immediately after the .Ic server command. The .Ar address argument specifies the clock address. The .Cm refid and .Cm stratum options can be used to override the defaults for the device. 
There are two optional device\-dependent time offsets and four flags that can be included in the .Ic fudge command as well. .Pp The stratum number of a reference clock is by default zero. Since the .Xr ntpd 8 daemon adds one to the stratum of each peer, a primary server ordinarily displays an external stratum of one. In order to provide engineered backups, it is often useful to specify the reference clock stratum as greater than zero. The .Cm stratum option is used for this purpose. Also, in cases involving both a reference clock and a pulse\-per\-second (PPS) discipline signal, it is useful to specify the reference clock identifier as other than the default, depending on the driver. The .Cm refid option is used for this purpose. Except where noted, these options apply to all clock drivers. .Ss Reference Clock Commands .Bl -tag -width indent .It Xo Ic server .Sm off .Li 127.127. Ar t . Ar u .Sm on .Op Cm prefer .Op Cm mode Ar int .Op Cm minpoll Ar int .Op Cm maxpoll Ar int .Xc This command can be used to configure reference clocks in special ways. The options are interpreted as follows: .Bl -tag -width indent .It Cm prefer Marks the reference clock as preferred. All other things being equal, this host will be chosen for synchronization among a set of correctly operating hosts. See the .Qq Mitigation Rules and the prefer Keyword page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) for further information. .It Cm mode Ar int Specifies a mode number which is interpreted in a device\-specific fashion. For instance, it selects a dialing protocol in the ACTS driver and a device subtype in the parse drivers. .It Cm minpoll Ar int .It Cm maxpoll Ar int These options specify the minimum and maximum polling interval for reference clock messages, as a power of 2 in seconds. For most directly connected reference clocks, both .Cm minpoll and .Cm maxpoll default to 6 (64 s).
For modem reference clocks, .Cm minpoll defaults to 10 (17.1 m) and .Cm maxpoll defaults to 14 (4.5 h). The allowable range is 4 (16 s) to 17 (36.4 h) inclusive. .El .It Xo Ic fudge .Sm off .Li 127.127. Ar t . Ar u .Sm on .Op Cm time1 Ar sec .Op Cm time2 Ar sec .Op Cm stratum Ar int .Op Cm refid Ar string .Op Cm mode Ar int .Op Cm flag1 Cm 0 \&| Cm 1 .Op Cm flag2 Cm 0 \&| Cm 1 .Op Cm flag3 Cm 0 \&| Cm 1 .Op Cm flag4 Cm 0 \&| Cm 1 .Xc This command can be used to configure reference clocks in special ways. It must immediately follow the .Ic server command which configures the driver. Note that the same capability is possible at run time using the .Xr ntpdc 8 program. The options are interpreted as follows: .Bl -tag -width indent .It Cm time1 Ar sec Specifies a constant to be added to the time offset produced by the driver, a fixed\-point decimal number in seconds. This is used as a calibration constant to adjust the nominal time offset of a particular clock to agree with an external standard, such as a precision PPS signal. It also provides a way to correct a systematic error or bias due to serial port or operating system latencies, different cable lengths or receiver internal delay. The specified offset is in addition to the propagation delay provided by other means, such as internal DIPswitches. Where a calibration for an individual system and driver is available, an approximate correction is noted in the driver documentation pages. Note: in order to facilitate calibration when more than one radio clock or PPS signal is supported, a special calibration feature is available. It takes the form of an argument to the .Ic enable command described in .Sx Miscellaneous Options page and operates as described in the .Qq Reference Clock Drivers page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . .It Cm time2 Ar secs Specifies a fixed\-point decimal number in seconds, which is interpreted in a driver\-dependent way. 
See the descriptions of specific drivers in the .Qq Reference Clock Drivers page (available as part of the HTML documentation provided in .Pa /usr/share/doc/ntp ) . .It Cm stratum Ar int Specifies the stratum number assigned to the driver, an integer between 0 and 15. This number overrides the default stratum number ordinarily assigned by the driver itself, usually zero. .It Cm refid Ar string Specifies an ASCII string of from one to four characters which defines the reference identifier used by the driver. This string overrides the default identifier ordinarily assigned by the driver itself. .It Cm mode Ar int Specifies a mode number which is interpreted in a device\-specific fashion. For instance, it selects a dialing protocol in the ACTS driver and a device subtype in the parse drivers. .It Cm flag1 Cm 0 \&| Cm 1 .It Cm flag2 Cm 0 \&| Cm 1 .It Cm flag3 Cm 0 \&| Cm 1 .It Cm flag4 Cm 0 \&| Cm 1 These four flags are used for customizing the clock driver. The interpretation of these values, and whether they are used at all, is a function of the particular clock driver. However, by convention .Cm flag4 is used to enable recording monitoring data to the .Cm clockstats file configured with the .Ic filegen command. Further information on the .Ic filegen command can be found in .Sx Monitoring Options . .El .El .Sh Miscellaneous Options .Bl -tag -width indent .It Ic broadcastdelay Ar seconds The broadcast and multicast modes require a special calibration to determine the network delay between the local and remote servers. Ordinarily, this is done automatically by the initial protocol exchanges between the client and server. In some cases, the calibration procedure may fail due to network or server access controls, for example. This command specifies the default delay to be used under these circumstances. Typically (for Ethernet), a number between 0.003 and 0.007 seconds is appropriate. The default when this command is not used is 0.004 seconds. 
.It Ic calldelay Ar delay This option controls the delay in seconds between the first and second packets sent in burst or iburst mode to allow additional time for a modem or ISDN call to complete. .It Ic driftfile Ar driftfile This command specifies the complete path and name of the file used to record the frequency of the local clock oscillator. This is the same operation as the .Fl f command line option. If the file exists, it is read at startup in order to set the initial frequency and then updated once per hour with the current frequency computed by the daemon. If the file name is specified, but the file itself does not exist, the daemon starts with an initial frequency of zero and creates the file when writing it for the first time. If this command is not given, the daemon will always start with an initial frequency of zero. .Pp The file format consists of a single line containing a single floating point number, which records the frequency offset measured in parts\-per\-million (PPM). The file is updated by first writing the current drift value into a temporary file and then renaming this file to replace the old version. This implies that .Xr ntpd 8 must have write permission for the directory the drift file is located in, and that file system links, symbolic or otherwise, should be avoided. .It Ic dscp Ar value This option specifies the Differentiated Services Code Point (DSCP) value, a 6\-bit code. The default value is 46, signifying Expedited Forwarding.
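.Pp
As a sketch of how the
.Ic driftfile
and
.Ic dscp
commands might appear together in a configuration file, consider the fragment below.
The drift file path is a hypothetical example chosen for illustration, and 46 simply restates the documented default:
.Bd -literal
# hypothetical path; ntpd needs write permission on the directory
driftfile /var/db/ntp.drift
# 46 is the documented default (Expedited Forwarding)
dscp 46
.Ed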
.It Xo Ic enable .Oo .Cm auth | Cm bclient | .Cm calibrate | Cm kernel | .Cm mode7 | Cm monitor | .Cm ntp | Cm stats | .Cm peer_clear_digest_early | .Cm unpeer_crypto_early | Cm unpeer_crypto_nak_early | Cm unpeer_digest_early .Oc .Xc .It Xo Ic disable .Oo .Cm auth | Cm bclient | .Cm calibrate | Cm kernel | .Cm mode7 | Cm monitor | .Cm ntp | Cm stats | .Cm peer_clear_digest_early | .Cm unpeer_crypto_early | Cm unpeer_crypto_nak_early | Cm unpeer_digest_early .Oc .Xc Provides a way to enable or disable various server options. Flags not mentioned are unaffected. Note that all of these flags can be controlled remotely using the .Xr ntpdc 8 utility program. .Bl -tag -width indent .It Cm auth Enables the server to synchronize with unconfigured peers only if the peer has been correctly authenticated using either public key or private key cryptography. The default for this flag is .Ic enable . .It Cm bclient Enables the server to listen for a message from a broadcast or multicast server, as in the .Ic multicastclient command with default address. The default for this flag is .Ic disable . .It Cm calibrate Enables the calibrate feature for reference clocks. The default for this flag is .Ic disable . .It Cm kernel Enables the kernel time discipline, if available. The default for this flag is .Ic enable if support is available, otherwise .Ic disable . .It Cm mode7 Enables processing of NTP mode 7 implementation\-specific requests which are used by the deprecated .Xr ntpdc 8 program. The default for this flag is disable. This flag is excluded from runtime configuration using .Xr ntpq 8 . The .Xr ntpq 8 program provides the same capabilities as .Xr ntpdc 8 using standard mode 6 requests. .It Cm monitor Enables the monitoring facility. See the .Xr ntpdc 8 program and the .Ic monlist command for further information. The default for this flag is .Ic enable . .It Cm ntp Enables time and frequency discipline.
In effect, this switch opens and closes the feedback loop, which is useful for testing. The default for this flag is .Ic enable . .It Cm peer_clear_digest_early By default, if .Xr ntpd 8 is using autokey and it receives a crypto\-NAK packet that passes the duplicate packet and origin timestamp checks the peer variables are immediately cleared. While this is generally a feature as it allows for quick recovery if a server key has changed, a properly forged and appropriately delivered crypto\-NAK packet can be used in a DoS attack. -If you have active noticable problems with this type of DoS attack +If you have active noticeable problems with this type of DoS attack then you should consider disabling this option. You can check your .Cm peerstats file for evidence of any of these attacks. The default for this flag is .Ic enable . .It Cm stats Enables the statistics facility. See the .Sx Monitoring Options section for further information. The default for this flag is .Ic disable . .It Cm unpeer_crypto_early By default, if .Xr ntpd 8 receives an autokey packet that fails TEST9, a crypto failure, the association is immediately cleared. This is almost certainly a feature, but if, in spite of the current recommendation of not using autokey, you are .B still using autokey .B and you are seeing this sort of DoS attack disabling this flag will delay tearing down the association until the reachability counter becomes zero. You can check your .Cm peerstats file for evidence of any of these attacks. The default for this flag is .Ic enable . .It Cm unpeer_crypto_nak_early By default, if .Xr ntpd 8 receives a crypto\-NAK packet that passes the duplicate packet and origin timestamp checks the association is immediately cleared. While this is generally a feature as it allows for quick recovery if a server key has changed, a properly forged and appropriately delivered crypto\-NAK packet can be used in a DoS attack. 
-If you have active noticable problems with this type of DoS attack +If you have active noticeable problems with this type of DoS attack then you should consider disabling this option. You can check your .Cm peerstats file for evidence of any of these attacks. The default for this flag is .Ic enable . .It Cm unpeer_digest_early By default, if .Xr ntpd 8 receives what should be an authenticated packet that passes other packet sanity checks but contains an invalid digest the association is immediately cleared. While this is generally a feature as it allows for quick recovery, if this type of packet is carefully forged and sent during an appropriate window it can be used for a DoS attack. -If you have active noticable problems with this type of DoS attack +If you have active noticeable problems with this type of DoS attack then you should consider disabling this option. You can check your .Cm peerstats file for evidence of any of these attacks. The default for this flag is .Ic enable . .El .It Ic includefile Ar includefile This command allows additional configuration commands to be included from a separate file. Include files may be nested to a depth of five; upon reaching the end of any include file, command processing resumes in the previous configuration file. This option is useful for sites that run .Xr ntpd 8 on multiple hosts, with (mostly) common options (e.g., a restriction list). .It Ic leapsmearinterval Ar seconds This EXPERIMENTAL option is only available if .Xr ntpd 8 was built with the .Cm \-\-enable\-leap\-smear option to the .Cm configure script. It specifies the interval over which a leap second correction will be applied. Recommended values for this option are between 7200 (2 hours) and 86400 (24 hours). .Sy DO NOT USE THIS OPTION ON PUBLIC\-ACCESS SERVERS! See http://bugs.ntp.org/2855 for more information. 
.It Ic logconfig Ar configkeyword This command controls the amount and type of output written to the system .Xr syslog 3 facility or the alternate .Ic logfile log file. By default, all output is turned on. All .Ar configkeyword keywords can be prefixed with .Ql = , .Ql + and .Ql \- , where .Ql = sets the .Xr syslog 3 priority mask, .Ql + adds and .Ql \- removes messages. .Xr syslog 3 messages can be controlled in four classes .Po .Cm clock , .Cm peer , .Cm sys and .Cm sync .Pc . Within these classes four types of messages can be controlled: informational messages .Po .Cm info .Pc , event messages .Po .Cm events .Pc , statistics messages .Po .Cm statistics .Pc and status messages .Po .Cm status .Pc . .Pp Configuration keywords are formed by concatenating the message class with the event class. The .Cm all prefix can be used instead of a message class. A message class may also be followed by the .Cm all keyword to enable/disable all messages of the respective message class. Thus, a minimal log configuration could look like this: .Bd -literal logconfig =syncstatus +sysevents .Ed .Pp This would just list the synchronization state of .Xr ntpd 8 and the major system events. For a simple reference server, the following minimum message configuration could be useful: .Bd -literal logconfig =syncall +clockall .Ed .Pp This configuration will list all clock information and synchronization information. All other events and messages about peers, system events and so on are suppressed. .It Ic logfile Ar logfile This command specifies the location of an alternate log file to be used instead of the default system .Xr syslog 3 facility. This is the same operation as the .Fl l command line option. .It Ic setvar Ar variable Op Cm default This command adds an additional system variable. These variables can be used to distribute additional information such as the access policy.
If the variable of the form .Sm off .Va name = Ar value .Sm on is followed by the .Cm default keyword, the variable will be listed as part of the default system variables .Po .Xr ntpq 8 .Ic rv command .Pc . These additional variables serve informational purposes only. They are not related to the protocol other than that they can be listed. The known protocol variables will always override any variables defined via the .Ic setvar mechanism. There are three special variables that contain the names of all variables of the same group. The .Va sys_var_list holds the names of all system variables. The .Va peer_var_list holds the names of all peer variables and the .Va clock_var_list holds the names of the reference clock variables. .It Xo Ic tinker .Oo .Cm allan Ar allan | .Cm dispersion Ar dispersion | .Cm freq Ar freq | .Cm huffpuff Ar huffpuff | .Cm panic Ar panic | .Cm step Ar step | .Cm stepback Ar stepback | .Cm stepfwd Ar stepfwd | .Cm stepout Ar stepout .Oc .Xc This command can be used to alter several system variables in very exceptional circumstances. It should occur in the configuration file before any other configuration options. The default values of these variables have been carefully optimized for a wide range of network speeds and reliability expectations. In general, they interact in intricate ways that are hard to predict and some combinations can result in some very nasty behavior. Very rarely is it necessary to change the default values; but, some folks cannot resist twisting the knobs anyway and this command is for them. Emphasis added: twisters are on their own and can expect no help from the support group. .Pp The variables operate as follows: .Bl -tag -width indent .It Cm allan Ar allan The argument becomes the new value for the minimum Allan intercept, which is a parameter of the PLL/FLL clock discipline algorithm. The value in log2 seconds defaults to 7 (1024 s), which is also the lower limit.
.It Cm dispersion Ar dispersion The argument becomes the new value for the dispersion increase rate, normally .000015 s/s. .It Cm freq Ar freq The argument becomes the initial value of the frequency offset in parts\-per\-million. This overrides the value in the frequency file, if present, and avoids the initial training state if it is not. .It Cm huffpuff Ar huffpuff The argument becomes the new value for the experimental huff\-n'\-puff filter span, which determines the most recent interval the algorithm will search for a minimum delay. The lower limit is 900 s (15 m), but a more reasonable value is 7200 (2 hours). There is no default, since the filter is not enabled unless this command is given. .It Cm panic Ar panic The argument is the panic threshold, normally 1000 s. If set to zero, the panic sanity check is disabled and a clock offset of any value will be accepted. .It Cm step Ar step The argument is the step threshold, which by default is 0.128 s. It can be set to any positive number in seconds. If set to zero, step adjustments will never occur. Note: The kernel time discipline is disabled if the step threshold is set to zero or greater than the default. .It Cm stepback Ar stepback The argument is the step threshold for the backward direction, which by default is 0.128 s. It can be set to any positive number in seconds. If both the forward and backward step thresholds are set to zero, step adjustments will never occur. Note: The kernel time discipline is disabled if the step threshold in each direction is either set to zero or greater than 0.5 second. .It Cm stepfwd Ar stepfwd As for stepback, but for the forward direction. .It Cm stepout Ar stepout The argument is the stepout timeout, which by default is 900 s. It can be set to any positive number in seconds. If set to zero, the stepout pulses will not be suppressed.
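.Pp
For illustration only, the
.Ic tinker
variables described in this list could be used as in the sketch below, which disables the panic sanity check and raises the step threshold.
The values are arbitrary examples rather than recommendations; as noted for
.Cm step ,
a threshold greater than the default disables the kernel time discipline:
.Bd -literal
# example values only; tinker lines belong before other options
tinker panic 0
tinker step 0.5
.Ed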
.El .It Xo Ic rlimit .Oo .Cm memlock Ar Nmegabytes | .Cm stacksize Ar N4kPages | .Cm filenum Ar Nfiledescriptors .Oc .Xc .Bl -tag -width indent .It Cm memlock Ar Nmegabytes Specify the number of megabytes of memory that should be allocated and locked. Probably only available under Linux, this option may be useful when dropping root (the .Fl i option). The default is 32 megabytes on non\-Linux machines, and \-1 under Linux. \-1 means "do not lock the process into memory". 0 means "lock whatever memory the process wants into memory". .It Cm stacksize Ar N4kPages Specifies the maximum size of the process stack on systems with the .Fn mlockall function. Defaults to 50 4k pages (200 4k pages in OpenBSD). .It Cm filenum Ar Nfiledescriptors Specifies the maximum number of file descriptors ntpd may have open at once. Defaults to the system default. .El .It Xo Ic trap Ar host_address .Op Cm port Ar port_number .Op Cm interface Ar interface_address .Xc This command configures a trap receiver at the given host address and port number for sending messages with the specified local interface address. If the port number is unspecified, a value of 18447 is used. If the interface address is not specified, the message is sent with a source address of the local interface the message is sent through. Note that on a multihomed host the interface used may vary from time to time with routing changes. .Pp The trap receiver will generally log event messages and other information from the server in a log file. While such monitor programs may also request their own trap dynamically, configuring a trap receiver will ensure that no messages are lost when the server is started. .It Cm hop Ar ... This command specifies a list of TTL values in increasing order; up to 8 values can be specified. In manycast mode these values are used in turn in an expanding\-ring search. The default is eight multiples of 32 starting at 31. .El .Sh "OPTIONS" .Bl -tag .It Fl \-help Display usage information and exit.
.It Fl \-more\-help Pass the extended usage information through a pager. .It Fl \-version Op Brq Ar v|c|n Output version of program and exit. The default mode is `v', a simple version. The `c' mode will print copyright information and `n' will print the full copyright notice. .El .Sh "OPTION PRESETS" Any option that is not marked as \fInot presettable\fP may be preset by loading values from environment variables named: .nf \fBNTP_CONF_\fP or \fBNTP_CONF\fP .fi .ad .Sh "ENVIRONMENT" See \fBOPTION PRESETS\fP for configuration environment variables. .Sh FILES .Bl -tag -width /etc/ntp.drift -compact .It Pa /etc/ntp.conf the default name of the configuration file .It Pa ntp.keys private MD5 keys .It Pa ntpkey RSA private key .It Pa ntpkey_ Ns Ar host RSA public key .It Pa ntp_dh Diffie\-Hellman agreement parameters .El .Sh "EXIT STATUS" One of the following exit values will be returned: .Bl -tag .It 0 " (EXIT_SUCCESS)" Successful program execution. .It 1 " (EXIT_FAILURE)" The operation failed or the command syntax was not valid. .It 70 " (EX_SOFTWARE)" libopts had an internal operational error. Please report it to autogen\-users@lists.sourceforge.net. Thank you. .El .Sh "SEE ALSO" .Xr ntpd 8 , .Xr ntpdc 8 , .Xr ntpq 8 .Pp In addition to the manual pages provided, comprehensive documentation is available on the world wide web at .Li http://www.ntp.org/ . A snapshot of this documentation is available in HTML format in .Pa /usr/share/doc/ntp . .Rs .%A David L. Mills .%T Network Time Protocol (Version 4) .%O RFC5905 .Re .Sh "AUTHORS" The University of Delaware and Network Time Foundation .Sh "COPYRIGHT" Copyright (C) 1992\-2017 The University of Delaware and Network Time Foundation all rights reserved. This program is released under the terms of the NTP license, . .Sh BUGS The syntax checking is not picky; some combinations of ridiculous and even hilarious options and modes may not be detected. .Pp The .Pa ntpkey_ Ns Ar host files are really digital certificates. 
These should be obtained via secure directory services when they become universally available. .Pp Please send bug reports to: http://bugs.ntp.org, bugs@ntp.org .Sh NOTES This document was derived from FreeBSD. .Pp This manual page was \fIAutoGen\fP\-erated from the \fBntp.conf\fP option definitions.