Index: head/lib/libc/sys/recv.2 =================================================================== --- head/lib/libc/sys/recv.2 (revision 338059) +++ head/lib/libc/sys/recv.2 (revision 338060) @@ -1,378 +1,379 @@ .\" Copyright (c) 1983, 1990, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" @(#)recv.2 8.3 (Berkeley) 2/21/94 .\" $FreeBSD$ .\" -.Dd May 20, 2018 +.Dd August 19, 2018 .Dt RECV 2 .Os .Sh NAME .Nm recv , .Nm recvfrom , .Nm recvmsg , .Nm recvmmsg .Nd receive message(s) from a socket .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In sys/socket.h .Ft ssize_t .Fn recv "int s" "void *buf" "size_t len" "int flags" .Ft ssize_t .Fn recvfrom "int s" "void *buf" "size_t len" "int flags" "struct sockaddr * restrict from" "socklen_t * restrict fromlen" .Ft ssize_t .Fn recvmsg "int s" "struct msghdr *msg" "int flags" .Ft ssize_t .Fn recvmmsg "int s" "struct mmsghdr * restrict msgvec" "size_t vlen" "int flags" "const struct timespec * restrict timeout" .Sh DESCRIPTION The .Fn recvfrom , .Fn recvmsg , and .Fn recvmmsg system calls are used to receive messages from a socket, and may be used to receive data on a socket whether or not it is connection-oriented. .Pp If .Fa from is not a null pointer and the socket is not connection-oriented, the source address of the message is filled in. The .Fa fromlen argument is a value-result argument, initialized to the size of the buffer associated with .Fa from , and modified on return to indicate the actual size of the address stored there. .Pp The .Fn recv function is normally used only on a .Em connected socket (see .Xr connect 2 ) and is identical to .Fn recvfrom with a null pointer passed as its .Fa from argument. .Pp The .Fn recvmmsg function is used to receive multiple messages at a call. Their number is supplied by .Fa vlen . The messages are placed in the buffers described by .Fa msgvec vector, after reception. The size of each received message is placed in the .Fa msg_len field of each element of the vector. If .Fa timeout is NULL the call blocks until the data is available for each supplied message buffer. Otherwise it waits for data for the specified amount of time. If the timeout expired and there is no data received, a value 0 is returned. 
The .Xr ppoll 2 system call is used to implement the timeout mechanism, before first receive is performed. .Pp The .Fn recv , .Fn recvfrom and .Fn recvmsg return the length of the message on successful completion, whereas .Fn recvmmsg returns the number of received messages. If a message is too long to fit in the supplied buffer, excess bytes may be discarded depending on the type of socket the message is received from (see .Xr socket 2 ) . .Pp If no messages are available at the socket, the receive call waits for a message to arrive, unless the socket is non-blocking (see .Xr fcntl 2 ) in which case the value \-1 is returned and the global variable .Va errno is set to .Er EAGAIN . The receive calls except .Fn recvmmsg normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested; this behavior is affected by the socket-level options .Dv SO_RCVLOWAT and .Dv SO_RCVTIMEO described in .Xr getsockopt 2 . The .Fn recvmmsg function implements this behaviour for each message in the vector. .Pp The .Xr select 2 system call may be used to determine when more data arrives. .Pp The .Fa flags argument to a .Fn recv function is formed by .Em or Ap ing one or more of the values: .Bl -column ".Dv MSG_CMSG_CLOEXEC" -offset indent .It Dv MSG_OOB Ta process out-of-band data .It Dv MSG_PEEK Ta peek at incoming message .It Dv MSG_WAITALL Ta wait for full request or error .It Dv MSG_DONTWAIT Ta do not block .It Dv MSG_CMSG_CLOEXEC Ta set received fds close-on-exec .It Dv MSG_WAITFORONE Ta do not block after receiving the first message (only for .Fn recvmmsg ) .El .Pp The .Dv MSG_OOB flag requests receipt of out-of-band data that would not be received in the normal data stream. Some protocols place expedited data at the head of the normal data queue, and thus this flag cannot be used with such protocols. 
The .Dv MSG_PEEK flag causes the receive operation to return data from the beginning of the receive queue without removing that data from the queue. Thus, a subsequent receive call will return the same data. The .Dv MSG_WAITALL flag requests that the operation block until the full request is satisfied. However, the call may still return less data than requested if a signal is caught, an error or disconnect occurs, or the next data to be received is of a different type than that returned. The .Dv MSG_DONTWAIT flag requests the call to return when it would block otherwise. If no data is available, .Va errno is set to .Er EAGAIN . This flag is not available in .St -ansiC or .St -isoC-99 compilation mode. The .Dv MSG_WAITFORONE flag sets MSG_DONTWAIT after the first message has been received. This flag is only relevant for .Fn recvmmsg . .Pp The .Fn recvmsg system call uses a .Fa msghdr structure to minimize the number of directly supplied arguments. This structure has the following form, as defined in .In sys/socket.h : .Bd -literal struct msghdr { void *msg_name; /* optional address */ socklen_t msg_namelen; /* size of address */ struct iovec *msg_iov; /* scatter/gather array */ int msg_iovlen; /* # elements in msg_iov */ void *msg_control; /* ancillary data, see below */ socklen_t msg_controllen;/* ancillary data buffer len */ int msg_flags; /* flags on received message */ }; .Ed .Pp Here .Fa msg_name and .Fa msg_namelen specify the source address if the socket is unconnected; .Fa msg_name may be given as a null pointer if no names are desired or required. The .Fa msg_iov and .Fa msg_iovlen arguments describe scatter gather locations, as discussed in .Xr read 2 . The .Fa msg_control argument, which has length .Fa msg_controllen , points to a buffer for other protocol control related messages or other miscellaneous ancillary data. 
The messages are of the form: .Bd -literal struct cmsghdr { socklen_t cmsg_len; /* data byte count, including hdr */ int cmsg_level; /* originating protocol */ int cmsg_type; /* protocol-specific type */ /* followed by u_char cmsg_data[]; */ }; .Ed .Pp As an example, one could use this to learn of changes in the data-stream in XNS/SPP, or in ISO, to obtain user-connection-request data by requesting a .Fn recvmsg with no data buffer provided immediately after an .Fn accept system call. .Pp With .Dv AF_UNIX domain sockets, ancillary data can be used to pass file descriptors and process credentials. See .Xr unix 4 for details. .Pp The .Fa msg_flags field is set on return according to the message received. .Dv MSG_EOR indicates end-of-record; the data returned completed a record (generally used with sockets of type .Dv SOCK_SEQPACKET ) . .Dv MSG_TRUNC indicates that the trailing portion of a datagram was discarded because the datagram was larger than the buffer supplied. .Dv MSG_CTRUNC indicates that some control data were discarded due to lack of space in the buffer for ancillary data. .Dv MSG_OOB is returned to indicate that expedited or out-of-band data were received. .Pp The .Fn recvmmsg system call uses the .Fa mmsghdr structure, defined as follows in the .In sys/socket.h header: .Bd -literal struct mmsghdr { struct msghdr msg_hdr; /* message header */ ssize_t msg_len; /* message length */ }; .Ed .Pp On data reception the .Fa msg_len field is updated to the length of the received message. .Sh RETURN VALUES These calls except .Fn recvmmsg return the number of bytes received. .Fn recvmmsg returns the number of messages received. A value of -1 is returned if an error occurred. .Sh ERRORS The calls fail if: .Bl -tag -width Er .It Bq Er EBADF The argument .Fa s is an invalid descriptor. .It Bq Er ECONNRESET The remote socket end is forcibly closed. 
.It Bq Er ENOTCONN The socket is associated with a connection-oriented protocol and has not been connected (see .Xr connect 2 and .Xr accept 2 ) . .It Bq Er ENOTSOCK The argument .Fa s does not refer to a socket. .It Bq Er EMSGSIZE The .Fn recvmsg system call was used to receive rights (file descriptors) that were in flight on the connection. However, the receiving program did not have enough free file descriptor slots to accept them. In this case the descriptors are closed, any pending data can be returned by another call to .Fn recvmsg . .It Bq Er EAGAIN The socket is marked non-blocking and the receive operation would block, or a receive timeout had been set and the timeout expired before data were received. .It Bq Er EINTR The receive was interrupted by delivery of a signal before any data were available. .It Bq Er EFAULT The receive buffer pointer(s) point outside the process's address space. .El .Sh SEE ALSO .Xr fcntl 2 , .Xr getsockopt 2 , .Xr read 2 , .Xr select 2 , .Xr socket 2 , +.Xr CMSG_DATA 3 , .Xr unix 4 .Sh HISTORY The .Fn recv function appeared in .Bx 4.2 . The .Fn recvmmsg function appeared in .Fx 11.0 . Index: head/lib/libc/sys/send.2 =================================================================== --- head/lib/libc/sys/send.2 (revision 338059) +++ head/lib/libc/sys/send.2 (revision 338060) @@ -1,269 +1,270 @@ .\" Copyright (c) 1983, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. 
Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" From: @(#)send.2 8.2 (Berkeley) 2/21/94 .\" $FreeBSD$ .\" -.Dd August 18, 2016 +.Dd August 19, 2018 .Dt SEND 2 .Os .Sh NAME .Nm send , .Nm sendto , .Nm sendmsg , .Nm sendmmsg .Nd send message(s) from a socket .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In sys/socket.h .Ft ssize_t .Fn send "int s" "const void *msg" "size_t len" "int flags" .Ft ssize_t .Fn sendto "int s" "const void *msg" "size_t len" "int flags" "const struct sockaddr *to" "socklen_t tolen" .Ft ssize_t .Fn sendmsg "int s" "const struct msghdr *msg" "int flags" .Ft ssize_t .Fn sendmmsg "int s" "struct mmsghdr * restrict msgvec" "size_t vlen" "int flags" .Sh DESCRIPTION The .Fn send and .Fn sendmmsg functions, and .Fn sendto and .Fn sendmsg system calls are used to transmit one or more messages (with the .Fn sendmmsg call) to another socket. The .Fn send function may be used only when the socket is in a .Em connected state, while .Fn sendto , .Fn sendmsg and .Fn sendmmsg may be used at any time. 
.Pp The address of the target is given by .Fa to with .Fa tolen specifying its size. The length of the message is given by .Fa len . If the message is too long to pass atomically through the underlying protocol, the error .Er EMSGSIZE is returned, and the message is not transmitted. .Pp The .Fn sendmmsg function sends multiple messages at a call. They are given by the .Fa msgvec vector along with .Fa vlen specifying the vector size. The number of octets sent per each message is placed in the .Fa msg_len field of each processed element of the vector after transmission. .Pp No indication of failure to deliver is implicit in a .Fn send . Locally detected errors are indicated by a return value of -1. .Pp If no messages space is available at the socket to hold the message to be transmitted, then .Fn send normally blocks, unless the socket has been placed in non-blocking I/O mode. The .Xr select 2 system call may be used to determine when it is possible to send more data. .Pp The .Fa flags argument may include one or more of the following: .Bd -literal #define MSG_OOB 0x00001 /* process out-of-band data */ #define MSG_DONTROUTE 0x00004 /* bypass routing, use direct interface */ #define MSG_EOR 0x00008 /* data completes record */ #define MSG_EOF 0x00100 /* data completes transaction */ #define MSG_NOSIGNAL 0x20000 /* do not generate SIGPIPE on EOF */ .Ed .Pp The flag .Dv MSG_OOB is used to send .Dq out-of-band data on sockets that support this notion (e.g.\& .Dv SOCK_STREAM ) ; the underlying protocol must also support .Dq out-of-band data. .Dv MSG_EOR is used to indicate a record mark for protocols which support the concept. .Dv MSG_EOF requests that the sender side of a socket be shut down, and that an appropriate indication be sent at the end of the specified data; this flag is only implemented for .Dv SOCK_STREAM sockets in the .Dv PF_INET protocol family. .Dv MSG_DONTROUTE is usually used only by diagnostic or routing programs. 
.Dv MSG_NOSIGNAL is used to prevent .Dv SIGPIPE generation when writing a socket that may be closed. .Pp See .Xr recv 2 for a description of the .Fa msghdr structure and the .Fa mmsghdr structure. .Sh RETURN VALUES The .Fn send , .Fn sendto and .Fn sendmsg calls return the number of octets sent. The .Fn sendmmsg call returns the number of messages sent. If an error occurred a value of -1 is returned. .Sh ERRORS The .Fn send and .Fn sendmmsg functions and .Fn sendto and .Fn sendmsg system calls fail if: .Bl -tag -width Er .It Bq Er EBADF An invalid descriptor was specified. .It Bq Er EACCES The destination address is a broadcast address, and .Dv SO_BROADCAST has not been set on the socket. .It Bq Er ENOTSOCK The argument .Fa s is not a socket. .It Bq Er EFAULT An invalid user space address was specified for an argument. .It Bq Er EMSGSIZE The socket requires that message be sent atomically, and the size of the message to be sent made this impossible. .It Bq Er EAGAIN The socket is marked non-blocking and the requested operation would block. .It Bq Er ENOBUFS The system was unable to allocate an internal buffer. The operation may succeed when buffers become available. .It Bq Er ENOBUFS The output queue for a network interface was full. This generally indicates that the interface has stopped sending, but may be caused by transient congestion. .It Bq Er EHOSTUNREACH The remote host was unreachable. .It Bq Er EISCONN A destination address was specified and the socket is already connected. .It Bq Er ECONNREFUSED The socket received an ICMP destination unreachable message from the last message sent. This typically means that the receiver is not listening on the remote port. .It Bq Er EHOSTDOWN The remote host was down. .It Bq Er ENETDOWN The remote network was down. .It Bq Er EADDRNOTAVAIL The process using a .Dv SOCK_RAW socket was jailed and the source address specified in the IP header did not match the IP address bound to the prison. 
.It Bq Er EPIPE The socket is unable to send anymore data .Dv ( SBS_CANTSENDMORE has been set on the socket). This typically means that the socket is not connected. .El .Sh SEE ALSO .Xr fcntl 2 , .Xr getsockopt 2 , .Xr recv 2 , .Xr select 2 , .Xr socket 2 , -.Xr write 2 +.Xr write 2 , +.Xr CMSG_DATA 3 .Sh HISTORY The .Fn send function appeared in .Bx 4.2 . The .Fn sendmmsg function appeared in .Fx 11.0 . .Sh BUGS Because .Fn sendmsg does not necessarily block until the data has been transferred, it is possible to transfer an open file descriptor across an .Dv AF_UNIX domain socket (see .Xr recv 2 ) , then .Fn close it before it has actually been sent, the result being that the receiver gets a closed file descriptor. It is left to the application to implement an acknowledgment mechanism to prevent this from happening. Index: head/lib/libc/sys/socket.2 =================================================================== --- head/lib/libc/sys/socket.2 (revision 338059) +++ head/lib/libc/sys/socket.2 (revision 338060) @@ -1,350 +1,351 @@ .\" Copyright (c) 1983, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" From: @(#)socket.2 8.1 (Berkeley) 6/4/93 .\" $FreeBSD$ .\" -.Dd June 10, 2017 +.Dd August 19, 2018 .Dt SOCKET 2 .Os .Sh NAME .Nm socket .Nd create an endpoint for communication .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In sys/socket.h .Ft int .Fn socket "int domain" "int type" "int protocol" .Sh DESCRIPTION The .Fn socket system call creates an endpoint for communication and returns a descriptor. .Pp The .Fa domain argument specifies a communications domain within which communication will take place; this selects the protocol family which should be used. These families are defined in the include file .In sys/socket.h . 
The currently understood formats are: .Pp .Bd -literal -offset indent -compact PF_LOCAL Host-internal protocols (alias for PF_UNIX), PF_UNIX Host-internal protocols, PF_INET Internet version 4 protocols, PF_INET6 Internet version 6 protocols, PF_ROUTE Internal routing protocol, PF_LINK Link layer interface, PF_KEY Internal key-management function, PF_NATM Asynchronous transfer mode protocols, PF_NETGRAPH Netgraph sockets, PF_IEEE80211 IEEE 802.11 wireless link-layer protocols (WiFi), PF_BLUETOOTH Bluetooth protocols, PF_INET_SDP OFED socket direct protocol (IPv4), PF_INET6_SDP OFED socket direct protocol (IPv6) .Ed .Pp Each protocol family is connected to an address family, which has the same name except that the prefix is .Dq Dv AF_ in place of .Dq Dv PF_ . Other protocol families may be also defined, beginning with .Dq Dv PF_ , with corresponding address families. .Pp The socket has the indicated .Fa type , which specifies the semantics of communication. Currently defined types are: .Pp .Bd -literal -offset indent -compact SOCK_STREAM Stream socket, SOCK_DGRAM Datagram socket, SOCK_RAW Raw-protocol interface, SOCK_RDM Reliably-delivered packet, SOCK_SEQPACKET Sequenced packet stream .Ed .Pp A .Dv SOCK_STREAM type provides sequenced, reliable, two-way connection based byte streams. An out-of-band data transmission mechanism may be supported. A .Dv SOCK_DGRAM socket supports datagrams (connectionless, unreliable messages of a fixed (typically small) maximum length). A .Dv SOCK_SEQPACKET socket may provide a sequenced, reliable, two-way connection-based data transmission path for datagrams of fixed maximum length; a consumer may be required to read an entire packet with each read system call. This facility may have protocol-specific properties. .Dv SOCK_RAW sockets provide access to internal network protocols and interfaces. 
The types .Dv SOCK_RAW , which is available only to the super-user, and .Dv SOCK_RDM , which is planned, but not yet implemented, are not described here. .Pp Additionally, the following flags are allowed in the .Fa type argument: .Pp .Bd -literal -offset indent -compact SOCK_CLOEXEC Set close-on-exec on the new descriptor, SOCK_NONBLOCK Set non-blocking mode on the new socket .Ed .Pp The .Fa protocol argument specifies a particular protocol to be used with the socket. Normally only a single protocol exists to support a particular socket type within a given protocol family. However, it is possible that many protocols may exist, in which case a particular protocol must be specified in this manner. The protocol number to use is particular to the .Dq "communication domain" in which communication is to take place; see .Xr protocols 5 . .Pp The .Fa protocol argument may be set to zero (0) to request the default implementation of a socket type for the protocol, if any. .Pp Sockets of type .Dv SOCK_STREAM are full-duplex byte streams, similar to pipes. A stream socket must be in a .Em connected state before any data may be sent or received on it. A connection to another socket is created with a .Xr connect 2 system call. Once connected, data may be transferred using .Xr read 2 and .Xr write 2 calls or some variant of the .Xr send 2 and .Xr recv 2 functions. (Some protocol families, such as the Internet family, support the notion of an .Dq implied connect , which permits data to be sent piggybacked onto a connect operation by using the .Xr sendto 2 system call.) When a session has been completed a .Xr close 2 may be performed. Out-of-band data may also be transmitted as described in .Xr send 2 and received as described in .Xr recv 2 . .Pp The communications protocols used to implement a .Dv SOCK_STREAM ensure that data is not lost or duplicated. 
If a piece of data for which the peer protocol has buffer space cannot be successfully transmitted within a reasonable length of time, then the connection is considered broken and calls will indicate an error with -1 returns and with .Er ETIMEDOUT as the specific code in the global variable .Va errno . The protocols optionally keep sockets .Dq warm by forcing transmissions roughly every minute in the absence of other activity. An error is then indicated if no response can be elicited on an otherwise idle connection for an extended period (e.g.\& 5 minutes). By default, a .Dv SIGPIPE signal is raised if a process sends on a broken stream, but this behavior may be inhibited via .Xr setsockopt 2 . .Pp .Dv SOCK_SEQPACKET sockets employ the same system calls as .Dv SOCK_STREAM sockets. The only difference is that .Xr read 2 calls will return only the amount of data requested, and any remaining in the arriving packet will be discarded. .Pp .Dv SOCK_DGRAM and .Dv SOCK_RAW sockets allow sending of datagrams to correspondents named in .Xr send 2 calls. Datagrams are generally received with .Xr recvfrom 2 , which returns the next datagram with its return address. .Pp An .Xr fcntl 2 system call can be used to specify a process group to receive a .Dv SIGURG signal when the out-of-band data arrives. It may also enable non-blocking I/O and asynchronous notification of I/O events via .Dv SIGIO . .Pp The operation of sockets is controlled by socket level .Em options . These options are defined in the file .In sys/socket.h . The .Xr setsockopt 2 and .Xr getsockopt 2 system calls are used to set and get options, respectively. .Sh RETURN VALUES A -1 is returned if an error occurs, otherwise the return value is a descriptor referencing the socket. .Sh ERRORS The .Fn socket system call fails if: .Bl -tag -width Er .It Bq Er EACCES Permission to create a socket of the specified type and/or protocol is denied. 
.It Bq Er EAFNOSUPPORT The address family (domain) is not supported or the specified domain is not supported by this protocol family. .It Bq Er EMFILE The per-process descriptor table is full. .It Bq Er ENFILE The system file table is full. .It Bq Er ENOBUFS Insufficient buffer space is available. The socket cannot be created until sufficient resources are freed. .It Bq Er EPERM User has insufficient privileges to carry out the requested operation. .It Bq Er EPROTONOSUPPORT The protocol type or the specified protocol is not supported within this domain. .It Bq Er EPROTOTYPE The socket type is not supported by the protocol. .El .Sh SEE ALSO .Xr accept 2 , .Xr bind 2 , .Xr connect 2 , .Xr getpeername 2 , .Xr getsockname 2 , .Xr getsockopt 2 , .Xr ioctl 2 , .Xr listen 2 , .Xr read 2 , .Xr recv 2 , .Xr select 2 , .Xr send 2 , .Xr shutdown 2 , .Xr socketpair 2 , .Xr write 2 , +.Xr CMSG_DATA 3 , .Xr getprotoent 3 , .Xr netgraph 4 , .Xr protocols 5 .Rs .%T "An Introductory 4.3 BSD Interprocess Communication Tutorial" .%B PS1 .%N 7 .Re .Rs .%T "BSD Interprocess Communication Tutorial" .%B PS1 .%N 8 .Re .Sh STANDARDS The .Fn socket function conforms to .St -p1003.1-2008 . The .Tn POSIX standard specifies only the .Dv AF_INET , .Dv AF_INET6 , and .Dv AF_UNIX constants for address families, and requires the use of .Dv AF_* constants for the .Fa domain argument of .Fn socket . The .Dv SOCK_CLOEXEC flag is expected to conform to the next revision of the .Tn POSIX standard. The .Dv SOCK_RDM .Fa type , the .Dv PF_* constants, and other address families are .Fx extensions. .Sh HISTORY The .Fn socket system call appeared in .Bx 4.2 . 
Index: head/share/man/man3/CMSG_DATA.3
===================================================================
--- head/share/man/man3/CMSG_DATA.3	(nonexistent)
+++ head/share/man/man3/CMSG_DATA.3	(revision 338060)
@@ -0,0 +1,214 @@
+.\" Written by Jared Yanovich
+.\" Public domain, July 3, 2005
+.\"
+.\" $FreeBSD$
+.Dd August 19, 2018
+.Dt CMSG_DATA 3
+.Os
+.Sh NAME
+.Nm CMSG_DATA ,
+.Nm CMSG_FIRSTHDR ,
+.Nm CMSG_LEN ,
+.Nm CMSG_NXTHDR ,
+.Nm CMSG_SPACE
+.Nd socket control message routines for ancillary data access
+.Sh SYNOPSIS
+.In sys/socket.h
+.Ft unsigned char *
+.Fn CMSG_DATA "struct cmsghdr *"
+.Ft struct cmsghdr *
+.Fn CMSG_FIRSTHDR "struct msghdr *"
+.Ft size_t
+.Fn CMSG_LEN "size_t"
+.Ft struct cmsghdr *
+.Fn CMSG_NXTHDR "struct msghdr *" "struct cmsghdr *"
+.Ft size_t
+.Fn CMSG_SPACE "size_t"
+.Sh DESCRIPTION
+The control message API is used to construct ancillary data objects for
+use in control messages sent and received across sockets.
+.Pp
+Control messages are passed around by the
+.Xr recvmsg 2
+and
+.Xr sendmsg 2
+system calls.
+The
+.Vt cmsghdr
+structure, described in
+.Xr recvmsg 2 ,
+is used to specify a chain of control messages.
+.Pp
+These routines should be used instead of directly accessing the control
+message header members and data buffers as they ensure that necessary
+alignment constraints are met.
+.Pp
+The following routines are provided:
+.Bl -tag -width Ds
+.It Fn CMSG_DATA cmsg
+This routine accesses the data portion of the control message header
+.Fa cmsg .
+It ensures proper alignment constraints on the beginning of ancillary
+data are met.
+.It Fn CMSG_FIRSTHDR mhdr
+This routine accesses the first control message attached to the
+message
+.Fa mhdr .
+If no control messages are attached to the message, this routine
+returns
+.Dv NULL .
+.It Fn CMSG_LEN len
+This routine determines the size in bytes of a control message,
+which includes the control message header.
+.Fa len
+specifies the length of the data held by the control message.
+This value is what is normally stored in the
+.Fa cmsg_len
+field of each control message.
+This routine accounts for any alignment constraints on the beginning of
+ancillary data.
+.It Fn CMSG_NXTHDR mhdr cmsg
+This routine returns the location of the control message following
+.Fa cmsg
+in the message
+.Fa mhdr .
+If
+.Fa cmsg
+is the last control message in the chain, this routine returns
+.Dv NULL .
+.It Fn CMSG_SPACE len
+This routine determines the size in bytes needed to hold a control
+message and its contents of length
+.Fa len ,
+which includes the control message header.
+This value is what is normally stored in
+.Fa msg_controllen .
+This routine accounts for any alignment constraints on the beginning of
+ancillary data as well as any needed to pad the next control message.
+.El
+.Sh EXAMPLES
+The following example constructs a control message containing a file descriptor
+in the parent process and passes it over a pre-shared socket to the child
+process.
+Then the child process sends a "hello" string to the parent process using the
+received file descriptor.
+.Bd -literal
+#include <sys/socket.h>
+
+#include <err.h>
+#include <stdio.h>
+#include <string.h>
+#include <sysexits.h>
+#include <unistd.h>
+
+#define HELLOLEN	sizeof("hello")
+
+int
+main()
+{
+	struct msghdr msg;
+	union {
+		struct cmsghdr hdr;
+		unsigned char buf[CMSG_SPACE(sizeof(int))];
+	} cmsgbuf;
+	char buf[HELLOLEN];
+	int hellofd[2];
+	int presharedfd[2];
+	struct cmsghdr *cmsg;
+
+	if (socketpair(PF_LOCAL, SOCK_STREAM, 0, presharedfd) == -1)
+		err(EX_OSERR, "failed to create a pre-shared socket pair");
+
+	memset(&msg, 0, sizeof(msg));
+	msg.msg_control = &cmsgbuf.buf;
+	msg.msg_controllen = sizeof(cmsgbuf.buf);
+	msg.msg_iov = NULL;
+	msg.msg_iovlen = 0;
+
+	switch (fork()) {
+	case -1:
+		err(EX_OSERR, "fork");
+	case 0:
+		close(presharedfd[0]);
+		strlcpy(buf, "hello", HELLOLEN);
+
+		if (recvmsg(presharedfd[1], &msg, 0) == -1)
+			err(EX_IOERR, "failed to receive a message");
+		if (msg.msg_flags & (MSG_CTRUNC | MSG_TRUNC))
+			errx(EX_IOERR, "control message truncated");
+		for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
+		    cmsg = CMSG_NXTHDR(&msg, cmsg)) {
+			if (cmsg->cmsg_len == CMSG_LEN(sizeof(int)) &&
+			    cmsg->cmsg_level == SOL_SOCKET &&
+			    cmsg->cmsg_type == SCM_RIGHTS) {
+				hellofd[1] = *(int *)CMSG_DATA(cmsg);
+				printf("child: sending '%s'\en", buf);
+				if (write(hellofd[1], buf, HELLOLEN) == -1)
+					err(EX_IOERR, "failed to send 'hello'");
+			}
+		}
+		break;
+	default:
+		close(presharedfd[1]);
+
+		if (socketpair(PF_LOCAL, SOCK_STREAM, 0, hellofd) == -1)
+			err(EX_OSERR, "failed to create a 'hello' socket pair");
+
+		cmsg = CMSG_FIRSTHDR(&msg);
+		cmsg->cmsg_len = CMSG_LEN(sizeof(int));
+		cmsg->cmsg_level = SOL_SOCKET;
+		cmsg->cmsg_type = SCM_RIGHTS;
+		*(int *)CMSG_DATA(cmsg) = hellofd[1];
+
+		if (sendmsg(presharedfd[0], &msg, 0) == -1)
+			err(EX_IOERR, "sendmsg");
+		close(hellofd[1]);
+
+		if (read(hellofd[0], buf, HELLOLEN) == -1)
+			err(EX_IOERR, "failed to receive 'hello'");
+		printf("parent: received '%s'\en", buf);
+		break;
+	}
+
+	return (0);
+}
+.Ed
+.Sh SEE ALSO
+.Xr recvmsg 2 ,
+.Xr sendmsg 2 ,
+.Xr socket 2 ,
+.Xr ip 4 ,
+.Xr ip6 4 , +.Xr unix 4 +.Sh STANDARDS +.Bl -item +.It +.Rs +.%A W. Stevens +.%A M. Thomas +.%T "Advanced Sockets API for IPv6" +.%R RFC 2292 +.%D February 1998 +.Re +.It +.Rs +.%A W. Stevens +.%A M. Thomas +.%A E. Nordmark +.%A T. Jinmei +.%T "Advanced Sockets Application Program Interface (API) for IPv6" +.%R RFC 3542 +.%D May 2003 +.Re +.El +.Sh HISTORY +The control message API first appeared in +.Bx 4.2 . +This manual page was originally written by +.An Jared Yanovich Aq Mt jaredy@OpenBSD.org +for +.Ox 3.8 +and eventually brought to +.Fx 12.0 +by +.An Mateusz Piotrowski Aq Mt 0mp@FreeBSD.org . Property changes on: head/share/man/man3/CMSG_DATA.3 ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/share/man/man3/Makefile =================================================================== --- head/share/man/man3/Makefile (revision 338059) +++ head/share/man/man3/Makefile (revision 338060) @@ -1,347 +1,352 @@ # @(#)Makefile 8.2 (Berkeley) 12/13/93 # $FreeBSD$ .include PACKAGE=runtime-manuals MAN= assert.3 \ ATOMIC_VAR_INIT.3 \ bitstring.3 \ + CMSG_DATA.3 \ end.3 \ fpgetround.3 \ intro.3 \ makedev.3 \ offsetof.3 \ ${PTHREAD_MAN} \ queue.3 \ sigevent.3 \ siginfo.3 \ stdarg.3 \ sysexits.3 \ tgmath.3 \ timeradd.3 \ tree.3 MLINKS= ATOMIC_VAR_INIT.3 atomic_compare_exchange_strong.3 \ ATOMIC_VAR_INIT.3 atomic_compare_exchange_strong_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_compare_exchange_weak.3 \ ATOMIC_VAR_INIT.3 atomic_compare_exchange_weak_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_exchange.3 \ ATOMIC_VAR_INIT.3 atomic_exchange_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_add.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_add_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_and.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_and_explicit.3 \ 
ATOMIC_VAR_INIT.3 atomic_fetch_or.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_or_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_sub.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_sub_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_xor.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_xor_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_init.3 \ ATOMIC_VAR_INIT.3 atomic_is_lock_free.3 \ ATOMIC_VAR_INIT.3 atomic_load.3 \ ATOMIC_VAR_INIT.3 atomic_load_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_store.3 \ ATOMIC_VAR_INIT.3 atomic_store_explicit.3 MLINKS+= bitstring.3 bit_alloc.3 \ bitstring.3 bit_clear.3 \ bitstring.3 bit_decl.3 \ bitstring.3 bit_ffc.3 \ bitstring.3 bit_ffc_at.3 \ bitstring.3 bit_ffs.3 \ bitstring.3 bit_ffs_at.3 \ bitstring.3 bit_nclear.3 \ bitstring.3 bit_nset.3 \ bitstring.3 bit_set.3 \ bitstring.3 bitstr_size.3 \ bitstring.3 bit_test.3 +MLINKS+= CMSG_DATA.3 CMSG_FIRSTHDR.3 \ + CMSG_DATA.3 CMSG_LEN.3 \ + CMSG_DATA.3 CMSG_NXTHDR.3 \ + CMSG_DATA.3 CMSG_SPACE.3 MLINKS+= end.3 edata.3 \ end.3 etext.3 MLINKS+= fpgetround.3 fpgetmask.3 \ fpgetround.3 fpgetprec.3 \ fpgetround.3 fpgetsticky.3 \ fpgetround.3 fpresetsticky.3 \ fpgetround.3 fpsetmask.3 \ fpgetround.3 fpsetprec.3 \ fpgetround.3 fpsetround.3 MLINKS+= makedev.3 major.3 \ makedev.3 minor.3 MLINKS+= ${PTHREAD_MLINKS} MLINKS+= queue.3 LIST_CLASS_ENTRY.3 \ queue.3 LIST_CLASS_HEAD.3 \ queue.3 LIST_EMPTY.3 \ queue.3 LIST_ENTRY.3 \ queue.3 LIST_FIRST.3 \ queue.3 LIST_FOREACH.3 \ queue.3 LIST_FOREACH_FROM.3 \ queue.3 LIST_FOREACH_FROM_SAFE.3 \ queue.3 LIST_FOREACH_SAFE.3 \ queue.3 LIST_HEAD.3 \ queue.3 LIST_HEAD_INITIALIZER.3 \ queue.3 LIST_INIT.3 \ queue.3 LIST_INSERT_AFTER.3 \ queue.3 LIST_INSERT_BEFORE.3 \ queue.3 LIST_INSERT_HEAD.3 \ queue.3 LIST_NEXT.3 \ queue.3 LIST_PREV.3 \ queue.3 LIST_REMOVE.3 \ queue.3 LIST_SWAP.3 \ queue.3 SLIST_CLASS_ENTRY.3 \ queue.3 SLIST_CLASS_HEAD.3 \ queue.3 SLIST_EMPTY.3 \ queue.3 SLIST_ENTRY.3 \ queue.3 SLIST_FIRST.3 \ queue.3 SLIST_FOREACH.3 \ queue.3 SLIST_FOREACH_FROM.3 \ queue.3 SLIST_FOREACH_FROM_SAFE.3 \ queue.3
SLIST_FOREACH_SAFE.3 \ queue.3 SLIST_HEAD.3 \ queue.3 SLIST_HEAD_INITIALIZER.3 \ queue.3 SLIST_INIT.3 \ queue.3 SLIST_INSERT_AFTER.3 \ queue.3 SLIST_INSERT_HEAD.3 \ queue.3 SLIST_NEXT.3 \ queue.3 SLIST_REMOVE.3 \ queue.3 SLIST_REMOVE_AFTER.3 \ queue.3 SLIST_REMOVE_HEAD.3 \ queue.3 SLIST_REMOVE_PREVPTR.3 \ queue.3 SLIST_SWAP.3 \ queue.3 STAILQ_CLASS_ENTRY.3 \ queue.3 STAILQ_CLASS_HEAD.3 \ queue.3 STAILQ_CONCAT.3 \ queue.3 STAILQ_EMPTY.3 \ queue.3 STAILQ_ENTRY.3 \ queue.3 STAILQ_FIRST.3 \ queue.3 STAILQ_FOREACH.3 \ queue.3 STAILQ_FOREACH_FROM.3 \ queue.3 STAILQ_FOREACH_FROM_SAFE.3 \ queue.3 STAILQ_FOREACH_SAFE.3 \ queue.3 STAILQ_HEAD.3 \ queue.3 STAILQ_HEAD_INITIALIZER.3 \ queue.3 STAILQ_INIT.3 \ queue.3 STAILQ_INSERT_AFTER.3 \ queue.3 STAILQ_INSERT_HEAD.3 \ queue.3 STAILQ_INSERT_TAIL.3 \ queue.3 STAILQ_LAST.3 \ queue.3 STAILQ_NEXT.3 \ queue.3 STAILQ_REMOVE.3 \ queue.3 STAILQ_REMOVE_AFTER.3 \ queue.3 STAILQ_REMOVE_HEAD.3 \ queue.3 STAILQ_SWAP.3 \ queue.3 TAILQ_CLASS_ENTRY.3 \ queue.3 TAILQ_CLASS_HEAD.3 \ queue.3 TAILQ_CONCAT.3 \ queue.3 TAILQ_EMPTY.3 \ queue.3 TAILQ_ENTRY.3 \ queue.3 TAILQ_FIRST.3 \ queue.3 TAILQ_FOREACH.3 \ queue.3 TAILQ_FOREACH_FROM.3 \ queue.3 TAILQ_FOREACH_FROM_SAFE.3 \ queue.3 TAILQ_FOREACH_REVERSE.3 \ queue.3 TAILQ_FOREACH_REVERSE_FROM.3 \ queue.3 TAILQ_FOREACH_REVERSE_FROM_SAFE.3 \ queue.3 TAILQ_FOREACH_REVERSE_SAFE.3 \ queue.3 TAILQ_FOREACH_SAFE.3 \ queue.3 TAILQ_HEAD.3 \ queue.3 TAILQ_HEAD_INITIALIZER.3 \ queue.3 TAILQ_INIT.3 \ queue.3 TAILQ_INSERT_AFTER.3 \ queue.3 TAILQ_INSERT_BEFORE.3 \ queue.3 TAILQ_INSERT_HEAD.3 \ queue.3 TAILQ_INSERT_TAIL.3 \ queue.3 TAILQ_LAST.3 \ queue.3 TAILQ_NEXT.3 \ queue.3 TAILQ_PREV.3 \ queue.3 TAILQ_REMOVE.3 \ queue.3 TAILQ_SWAP.3 MLINKS+= stdarg.3 va_arg.3 \ stdarg.3 va_copy.3 \ stdarg.3 va_end.3 \ stdarg.3 varargs.3 \ stdarg.3 va_start.3 MLINKS+= timeradd.3 timerclear.3 \ timeradd.3 timercmp.3 \ timeradd.3 timerisset.3 \ timeradd.3 timersub.3 \ timeradd.3 timespecadd.3 \ timeradd.3 timespecsub.3 \ timeradd.3 
timespecclear.3 \ timeradd.3 timespecisset.3 \ timeradd.3 timespeccmp.3 MLINKS+= tree.3 RB_EMPTY.3 \ tree.3 RB_ENTRY.3 \ tree.3 RB_FIND.3 \ tree.3 RB_FOREACH.3 \ tree.3 RB_FOREACH_REVERSE.3 \ tree.3 RB_GENERATE.3 \ tree.3 RB_GENERATE_STATIC.3 \ tree.3 RB_HEAD.3 \ tree.3 RB_INIT.3 \ tree.3 RB_INITIALIZER.3 \ tree.3 RB_INSERT.3 \ tree.3 RB_LEFT.3 \ tree.3 RB_MAX.3 \ tree.3 RB_MIN.3 \ tree.3 RB_NEXT.3 \ tree.3 RB_NFIND.3 \ tree.3 RB_PARENT.3 \ tree.3 RB_PREV.3 \ tree.3 RB_PROTOTYPE.3 \ tree.3 RB_PROTOTYPE_STATIC.3 \ tree.3 RB_REMOVE.3 \ tree.3 RB_RIGHT.3 \ tree.3 RB_ROOT.3 \ tree.3 SPLAY_EMPTY.3 \ tree.3 SPLAY_ENTRY.3 \ tree.3 SPLAY_FIND.3 \ tree.3 SPLAY_FOREACH.3 \ tree.3 SPLAY_GENERATE.3 \ tree.3 SPLAY_HEAD.3 \ tree.3 SPLAY_INIT.3 \ tree.3 SPLAY_INITIALIZER.3 \ tree.3 SPLAY_INSERT.3 \ tree.3 SPLAY_LEFT.3 \ tree.3 SPLAY_MAX.3 \ tree.3 SPLAY_MIN.3 \ tree.3 SPLAY_NEXT.3 \ tree.3 SPLAY_PROTOTYPE.3 \ tree.3 SPLAY_REMOVE.3 \ tree.3 SPLAY_RIGHT.3 \ tree.3 SPLAY_ROOT.3 .if ${MK_LIBTHR} != "no" PTHREAD_MAN= pthread.3 \ pthread_affinity_np.3 \ pthread_atfork.3 \ pthread_attr.3 \ pthread_attr_affinity_np.3 \ pthread_attr_get_np.3 \ pthread_attr_setcreatesuspend_np.3 \ pthread_barrierattr.3 \ pthread_barrier_destroy.3 \ pthread_cancel.3 \ pthread_cleanup_pop.3 \ pthread_cleanup_push.3 \ pthread_condattr.3 \ pthread_cond_broadcast.3 \ pthread_cond_destroy.3 \ pthread_cond_init.3 \ pthread_cond_signal.3 \ pthread_cond_timedwait.3 \ pthread_cond_wait.3 \ pthread_create.3 \ pthread_detach.3 \ pthread_equal.3 \ pthread_exit.3 \ pthread_getconcurrency.3 \ pthread_getcpuclockid.3 \ pthread_getspecific.3 \ pthread_getthreadid_np.3 \ pthread_join.3 \ pthread_key_create.3 \ pthread_key_delete.3 \ pthread_kill.3 \ pthread_main_np.3 \ pthread_multi_np.3 \ pthread_mutexattr.3 \ pthread_mutexattr_getkind_np.3 \ pthread_mutex_consistent.3 \ pthread_mutex_destroy.3 \ pthread_mutex_init.3 \ pthread_mutex_lock.3 \ pthread_mutex_timedlock.3 \ pthread_mutex_trylock.3 \ pthread_mutex_unlock.3 \ 
pthread_once.3 \ pthread_resume_all_np.3 \ pthread_resume_np.3 \ pthread_rwlockattr_destroy.3 \ pthread_rwlockattr_getpshared.3 \ pthread_rwlockattr_init.3 \ pthread_rwlockattr_setpshared.3 \ pthread_rwlock_destroy.3 \ pthread_rwlock_init.3 \ pthread_rwlock_rdlock.3 \ pthread_rwlock_timedrdlock.3 \ pthread_rwlock_timedwrlock.3 \ pthread_rwlock_unlock.3 \ pthread_rwlock_wrlock.3 \ pthread_schedparam.3 \ pthread_self.3 \ pthread_set_name_np.3 \ pthread_setspecific.3 \ pthread_sigmask.3 \ pthread_spin_init.3 \ pthread_spin_lock.3 \ pthread_suspend_all_np.3 \ pthread_suspend_np.3 \ pthread_switch_add_np.3 \ pthread_testcancel.3 \ pthread_yield.3 PTHREAD_MLINKS= pthread_affinity_np.3 pthread_getaffinity_np.3 \ pthread_affinity_np.3 pthread_setaffinity_np.3 PTHREAD_MLINKS+=pthread_attr.3 pthread_attr_destroy.3 \ pthread_attr.3 pthread_attr_getdetachstate.3 \ pthread_attr.3 pthread_attr_getguardsize.3 \ pthread_attr.3 pthread_attr_getinheritsched.3 \ pthread_attr.3 pthread_attr_getschedparam.3 \ pthread_attr.3 pthread_attr_getschedpolicy.3 \ pthread_attr.3 pthread_attr_getscope.3 \ pthread_attr.3 pthread_attr_getstack.3 \ pthread_attr.3 pthread_attr_getstackaddr.3 \ pthread_attr.3 pthread_attr_getstacksize.3 \ pthread_attr.3 pthread_attr_init.3 \ pthread_attr.3 pthread_attr_setdetachstate.3 \ pthread_attr.3 pthread_attr_setguardsize.3 \ pthread_attr.3 pthread_attr_setinheritsched.3 \ pthread_attr.3 pthread_attr_setschedparam.3 \ pthread_attr.3 pthread_attr_setschedpolicy.3 \ pthread_attr.3 pthread_attr_setscope.3 \ pthread_attr.3 pthread_attr_setstack.3 \ pthread_attr.3 pthread_attr_setstackaddr.3 \ pthread_attr.3 pthread_attr_setstacksize.3 PTHREAD_MLINKS+=pthread_attr_affinity_np.3 pthread_attr_getaffinity_np.3 \ pthread_attr_affinity_np.3 pthread_attr_setaffinity_np.3 PTHREAD_MLINKS+=pthread_barrierattr.3 pthread_barrierattr_destroy.3 \ pthread_barrierattr.3 pthread_barrierattr_getpshared.3 \ pthread_barrierattr.3 pthread_barrierattr_init.3 \ pthread_barrierattr.3 
pthread_barrierattr_setpshared.3 PTHREAD_MLINKS+=pthread_barrier_destroy.3 pthread_barrier_init.3 \ pthread_barrier_destroy.3 pthread_barrier_wait.3 PTHREAD_MLINKS+=pthread_condattr.3 pthread_condattr_destroy.3 \ pthread_condattr.3 pthread_condattr_init.3 \ pthread_condattr.3 pthread_condattr_getclock.3 \ pthread_condattr.3 pthread_condattr_setclock.3 \ pthread_condattr.3 pthread_condattr_getpshared.3 \ pthread_condattr.3 pthread_condattr_setpshared.3 PTHREAD_MLINKS+=pthread_getconcurrency.3 pthread_setconcurrency.3 PTHREAD_MLINKS+=pthread_multi_np.3 pthread_single_np.3 PTHREAD_MLINKS+=pthread_mutexattr.3 pthread_mutexattr_destroy.3 \ pthread_mutexattr.3 pthread_mutexattr_getprioceiling.3 \ pthread_mutexattr.3 pthread_mutexattr_getprotocol.3 \ pthread_mutexattr.3 pthread_mutexattr_getrobust.3 \ pthread_mutexattr.3 pthread_mutexattr_gettype.3 \ pthread_mutexattr.3 pthread_mutexattr_init.3 \ pthread_mutexattr.3 pthread_mutexattr_setprioceiling.3 \ pthread_mutexattr.3 pthread_mutexattr_setprotocol.3 \ pthread_mutexattr.3 pthread_mutexattr_setrobust.3 \ pthread_mutexattr.3 pthread_mutexattr_settype.3 PTHREAD_MLINKS+=pthread_mutexattr_getkind_np.3 pthread_mutexattr_setkind_np.3 PTHREAD_MLINKS+=pthread_rwlock_rdlock.3 pthread_rwlock_tryrdlock.3 PTHREAD_MLINKS+=pthread_rwlock_wrlock.3 pthread_rwlock_trywrlock.3 PTHREAD_MLINKS+=pthread_schedparam.3 pthread_getschedparam.3 \ pthread_schedparam.3 pthread_setschedparam.3 PTHREAD_MLINKS+=pthread_set_name_np.3 pthread_get_name_np.3 PTHREAD_MLINKS+=pthread_spin_init.3 pthread_spin_destroy.3 \ pthread_spin_lock.3 pthread_spin_trylock.3 \ pthread_spin_lock.3 pthread_spin_unlock.3 PTHREAD_MLINKS+=pthread_switch_add_np.3 pthread_switch_delete_np.3 PTHREAD_MLINKS+=pthread_testcancel.3 pthread_setcancelstate.3 \ pthread_testcancel.3 pthread_setcanceltype.3 PTHREAD_MLINKS+=pthread_join.3 pthread_timedjoin_np.3 .endif .include Index: head/share/man/man4/ip.4 =================================================================== --- 
head/share/man/man4/ip.4 (revision 338059) +++ head/share/man/man4/ip.4 (revision 338060) @@ -1,934 +1,935 @@ .\" Copyright (c) 1983, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" @(#)ip.4 8.2 (Berkeley) 11/30/93 .\" $FreeBSD$ .\" -.Dd September 1, 2014 +.Dd August 19, 2018 .Dt IP 4 .Os .Sh NAME .Nm ip .Nd Internet Protocol .Sh SYNOPSIS .In sys/types.h .In sys/socket.h .In netinet/in.h .Ft int .Fn socket AF_INET SOCK_RAW proto .Sh DESCRIPTION .Tn IP is the transport layer protocol used by the Internet protocol family. Options may be set at the .Tn IP level when using higher-level protocols that are based on .Tn IP (such as .Tn TCP and .Tn UDP ) . It may also be accessed through a .Dq raw socket when developing new protocols, or special-purpose applications. .Pp There are several .Tn IP-level .Xr setsockopt 2 and .Xr getsockopt 2 options. .Dv IP_OPTIONS may be used to provide .Tn IP options to be transmitted in the .Tn IP header of each outgoing packet or to examine the header options on incoming packets. .Tn IP options may be used with any socket type in the Internet family. The format of .Tn IP options to be sent is that specified by the .Tn IP protocol specification (RFC-791), with one exception: the list of addresses for Source Route options must include the first-hop gateway at the beginning of the list of gateways. The first-hop gateway address will be extracted from the option list and the size adjusted accordingly before use. To disable previously specified options, use a zero-length buffer: .Bd -literal setsockopt(s, IPPROTO_IP, IP_OPTIONS, NULL, 0); .Ed .Pp .Dv IP_TOS and .Dv IP_TTL may be used to set the type-of-service and time-to-live fields in the .Tn IP header for .Dv SOCK_STREAM , SOCK_DGRAM , and certain types of .Dv SOCK_RAW sockets. For example, .Bd -literal int tos = IPTOS_LOWDELAY; /* see */ setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)); int ttl = 60; /* max = 255 */ setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)); .Ed .Pp .Dv IP_MINTTL may be used to set the minimum acceptable TTL a packet must have when received on a socket. All packets with a lower TTL are silently dropped. 
This option is only really useful when set to 255, preventing packets from outside the directly connected networks reaching local listeners on sockets. .Pp .Dv IP_DONTFRAG may be used to set the Don't Fragment flag on IP packets. Currently this option is respected only on .Xr udp 4 and raw .Nm sockets, unless the .Dv IP_HDRINCL option has been set. On .Xr tcp 4 sockets, the Don't Fragment flag is controlled by the Path MTU Discovery option. Sending a packet larger than the MTU size of the egress interface, determined by the destination address, returns an .Er EMSGSIZE error. .Pp If the .Dv IP_ORIGDSTADDR option is enabled on a .Dv SOCK_DGRAM socket, the .Xr recvmsg 2 call will return the destination .Tn IP address and destination port for a .Tn UDP datagram. The .Vt msg_control field in the .Vt msghdr structure points to a buffer that contains a .Vt cmsghdr structure followed by the .Vt sockaddr_in structure. The .Vt cmsghdr fields have the following values: .Bd -literal cmsg_len = CMSG_LEN(sizeof(struct sockaddr_in)) cmsg_level = IPPROTO_IP cmsg_type = IP_ORIGDSTADDR .Ed .Pp If the .Dv IP_RECVDSTADDR option is enabled on a .Dv SOCK_DGRAM socket, the .Xr recvmsg 2 call will return the destination .Tn IP address for a .Tn UDP datagram. The .Vt msg_control field in the .Vt msghdr structure points to a buffer that contains a .Vt cmsghdr structure followed by the .Tn IP address. The .Vt cmsghdr fields have the following values: .Bd -literal cmsg_len = CMSG_LEN(sizeof(struct in_addr)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVDSTADDR .Ed .Pp The source address to be used for outgoing .Tn UDP datagrams on a socket can be specified as ancillary data with a type code of .Dv IP_SENDSRCADDR . The msg_control field in the msghdr structure should point to a buffer that contains a .Vt cmsghdr structure followed by the .Tn IP address.
The cmsghdr fields should have the following values: .Bd -literal cmsg_len = CMSG_LEN(sizeof(struct in_addr)) cmsg_level = IPPROTO_IP cmsg_type = IP_SENDSRCADDR .Ed .Pp The socket should be either bound to .Dv INADDR_ANY and a local port, and the address supplied with .Dv IP_SENDSRCADDR should not be .Dv INADDR_ANY , or the socket should be bound to a local address and the address supplied with .Dv IP_SENDSRCADDR should be .Dv INADDR_ANY . In the latter case the bound address is overridden by the generic source address selection logic, which chooses the IP address of the interface closest to the destination. .Pp For convenience, .Dv IP_SENDSRCADDR is defined to have the same value as .Dv IP_RECVDSTADDR , so the .Dv IP_RECVDSTADDR control message from .Xr recvmsg 2 can be used directly as a control message for .Xr sendmsg 2 . .\" .Pp If the .Dv IP_ONESBCAST option is enabled on a .Dv SOCK_DGRAM or a .Dv SOCK_RAW socket, the destination address of outgoing broadcast datagrams on that socket will be forced to the undirected broadcast address, .Dv INADDR_BROADCAST , before transmission. This is in contrast to the default behavior of the system, which is to transmit undirected broadcasts via the first network interface with the .Dv IFF_BROADCAST flag set. .Pp This option allows applications to choose which interface is used to transmit an undirected broadcast datagram. For example, the following code would force an undirected broadcast to be transmitted via the interface configured with the broadcast address 192.168.2.255: .Bd -literal char msg[512]; struct sockaddr_in sin; int onesbcast = 1; /* 0 = disable (default), 1 = enable */ setsockopt(s, IPPROTO_IP, IP_ONESBCAST, &onesbcast, sizeof(onesbcast)); sin.sin_addr.s_addr = inet_addr("192.168.2.255"); sin.sin_port = htons(1234); sendto(s, msg, sizeof(msg), 0, (struct sockaddr *)&sin, sizeof(sin)); .Ed .Pp It is the application's responsibility to set the .Dv IP_TTL option to an appropriate value in order to prevent broadcast storms.
The application must have sufficient credentials to set the .Dv SO_BROADCAST socket level option, otherwise the .Dv IP_ONESBCAST option has no effect. .Pp If the .Dv IP_BINDANY option is enabled on a .Dv SOCK_STREAM , .Dv SOCK_DGRAM or a .Dv SOCK_RAW socket, one can .Xr bind 2 to any address, even one not bound to any available network interface in the system. This functionality (in conjunction with special firewall rules) can be used for implementing a transparent proxy. The .Dv PRIV_NETINET_BINDANY privilege is needed to set this option. .Pp If the .Dv IP_RECVTTL option is enabled on a .Dv SOCK_DGRAM socket, the .Xr recvmsg 2 call will return the .Tn IP .Tn TTL (time to live) field for a .Tn UDP datagram. The msg_control field in the msghdr structure points to a buffer that contains a cmsghdr structure followed by the .Tn TTL . The cmsghdr fields have the following values: .Bd -literal cmsg_len = CMSG_LEN(sizeof(u_char)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVTTL .Ed .\" .Pp If the .Dv IP_RECVTOS option is enabled on a .Dv SOCK_DGRAM socket, the .Xr recvmsg 2 call will return the .Tn IP .Tn TOS (type of service) field for a .Tn UDP datagram. The msg_control field in the msghdr structure points to a buffer that contains a cmsghdr structure followed by the .Tn TOS . The cmsghdr fields have the following values: .Bd -literal cmsg_len = CMSG_LEN(sizeof(u_char)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVTOS .Ed .\" .Pp If the .Dv IP_RECVIF option is enabled on a .Dv SOCK_DGRAM socket, the .Xr recvmsg 2 call returns a .Vt "struct sockaddr_dl" corresponding to the interface on which the packet was received. The .Va msg_control field in the .Vt msghdr structure points to a buffer that contains a .Vt cmsghdr structure followed by the .Vt "struct sockaddr_dl" . 
The .Vt cmsghdr fields have the following values: .Bd -literal cmsg_len = CMSG_LEN(sizeof(struct sockaddr_dl)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVIF .Ed .Pp .Dv IP_PORTRANGE may be used to set the port range used for selecting a local port number on a socket with an unspecified (zero) port number. It has the following possible values: .Bl -tag -width IP_PORTRANGE_DEFAULT .It Dv IP_PORTRANGE_DEFAULT use the default range of values, normally .Dv IPPORT_HIFIRSTAUTO through .Dv IPPORT_HILASTAUTO . This is adjustable through the sysctl setting: .Va net.inet.ip.portrange.first and .Va net.inet.ip.portrange.last . .It Dv IP_PORTRANGE_HIGH use a high range of values, normally .Dv IPPORT_HIFIRSTAUTO and .Dv IPPORT_HILASTAUTO . This is adjustable through the sysctl setting: .Va net.inet.ip.portrange.hifirst and .Va net.inet.ip.portrange.hilast . .It Dv IP_PORTRANGE_LOW use a low range of ports, which are normally restricted to privileged processes on .Ux systems. The range is normally from .Dv IPPORT_RESERVED \- 1 down to .Li IPPORT_RESERVEDSTART in descending order. This is adjustable through the sysctl setting: .Va net.inet.ip.portrange.lowfirst and .Va net.inet.ip.portrange.lowlast . .El .Pp The range of privileged ports which only may be opened by root-owned processes may be modified by the .Va net.inet.ip.portrange.reservedlow and .Va net.inet.ip.portrange.reservedhigh sysctl settings. The values default to the traditional range, 0 through .Dv IPPORT_RESERVED \- 1 (0 through 1023), respectively. Note that these settings do not affect and are not accounted for in the use or calculation of the other .Va net.inet.ip.portrange values above. Changing these values departs from .Ux tradition and has security consequences that the administrator should carefully evaluate before modifying these settings. .Pp Ports are allocated at random within the specified port range in order to increase the difficulty of random spoofing attacks. 
In scenarios such as benchmarking, this behavior may be undesirable. In these cases, .Va net.inet.ip.portrange.randomized can be used to toggle randomization off. If more than .Va net.inet.ip.portrange.randomcps ports have been allocated in the last second, then return to sequential port allocation. Return to random allocation only once the current port allocation rate drops below .Va net.inet.ip.portrange.randomcps for at least .Va net.inet.ip.portrange.randomtime seconds. The default values for .Va net.inet.ip.portrange.randomcps and .Va net.inet.ip.portrange.randomtime are 10 port allocations per second and 45 seconds correspondingly. .Ss "Multicast Options" .Tn IP multicasting is supported only on .Dv AF_INET sockets of type .Dv SOCK_DGRAM and .Dv SOCK_RAW , and only on networks where the interface driver supports multicasting. .Pp The .Dv IP_MULTICAST_TTL option changes the time-to-live (TTL) for outgoing multicast datagrams in order to control the scope of the multicasts: .Bd -literal u_char ttl; /* range: 0 to 255, default = 1 */ setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)); .Ed .Pp Datagrams with a TTL of 1 are not forwarded beyond the local network. Multicast datagrams with a TTL of 0 will not be transmitted on any network, but may be delivered locally if the sending host belongs to the destination group and if multicast loopback has not been disabled on the sending socket (see below). Multicast datagrams with TTL greater than 1 may be forwarded to other networks if a multicast router is attached to the local network. .Pp For hosts with multiple interfaces, where an interface has not been specified for a multicast group membership, each multicast transmission is sent from the primary network interface. 
The .Dv IP_MULTICAST_IF option overrides the default for subsequent transmissions from a given socket: .Bd -literal struct in_addr addr; setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr)); .Ed .Pp where "addr" is the local .Tn IP address of the desired interface or .Dv INADDR_ANY to specify the default interface. .Pp To specify an interface by index, an instance of .Vt ip_mreqn may be passed instead. The .Vt imr_ifindex member should be set to the index of the desired interface, or 0 to specify the default interface. The kernel differentiates between these two structures by their size. .Pp The use of .Vt IP_MULTICAST_IF is .Em not recommended , as multicast memberships are scoped to each individual interface. It is supported for legacy use only by applications, such as routing daemons, which expect to be able to transmit link-local IPv4 multicast datagrams (224.0.0.0/24) on multiple interfaces, without requesting an individual membership for each interface. .Pp .\" An interface's local IP address and multicast capability can be obtained via the .Dv SIOCGIFCONF and .Dv SIOCGIFFLAGS ioctls. Normal applications should not need to use this option. .Pp If a multicast datagram is sent to a group to which the sending host itself belongs (on the outgoing interface), a copy of the datagram is, by default, looped back by the IP layer for local delivery. The .Dv IP_MULTICAST_LOOP option gives the sender explicit control over whether or not subsequent datagrams are looped back: .Bd -literal u_char loop; /* 0 = disable, 1 = enable (default) */ setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)); .Ed .Pp This option improves performance for applications that may have no more than one instance on a single host (such as a routing daemon), by eliminating the overhead of receiving their own transmissions. 
It should generally not be used by applications for which there may be more than one instance on a single host (such as a conferencing program) or for which the sender does not belong to the destination group (such as a time querying program). .Pp The sysctl setting .Va net.inet.ip.mcast.loop controls the default setting of the .Dv IP_MULTICAST_LOOP socket option for new sockets. .Pp A multicast datagram sent with an initial TTL greater than 1 may be delivered to the sending host on a different interface from that on which it was sent, if the host belongs to the destination group on that other interface. The loopback control option has no effect on such delivery. .Pp A host must become a member of a multicast group before it can receive datagrams sent to the group. To join a multicast group, use the .Dv IP_ADD_MEMBERSHIP option: .Bd -literal struct ip_mreq mreq; setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)); .Ed .Pp where .Fa mreq is the following structure: .Bd -literal struct ip_mreq { struct in_addr imr_multiaddr; /* IP multicast address of group */ struct in_addr imr_interface; /* local IP address of interface */ } .Ed .Pp .Va imr_interface should be set to the .Tn IP address of a particular multicast-capable interface if the host is multihomed. It may be set to .Dv INADDR_ANY to choose the default interface, although this is not recommended; this is considered to be the first interface corresponding to the default route. Otherwise, the first multicast-capable interface configured in the system will be used. .Pp Prior to .Fx 7.0 , if the .Va imr_interface member is within the network range .Li 0.0.0.0/8 , it is treated as an interface index in the system interface MIB, as per the RIP Version 2 MIB Extension (RFC-1724). In versions of .Fx since 7.0, this behavior is no longer supported. Developers should instead use the RFC 3678 multicast source filter APIs; in particular, .Dv MCAST_JOIN_GROUP . 
.Pp Up to .Dv IP_MAX_MEMBERSHIPS memberships may be added on a single socket. Membership is associated with a single interface; programs running on multihomed hosts may need to join the same group on more than one interface. .Pp To drop a membership, use: .Bd -literal struct ip_mreq mreq; setsockopt(s, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq)); .Ed .Pp where .Fa mreq contains the same values as used to add the membership. Memberships are dropped when the socket is closed or the process exits. .\" TODO: Update this piece when IPv4 source-address selection is implemented. .Pp The IGMP protocol uses the primary IP address of the interface as its identifier for group membership. This is the first IP address configured on the interface. If this address is removed or changed, the results are undefined, as the IGMP membership state will then be inconsistent. If multiple IP aliases are configured on the same interface, they will be ignored. .Pp This shortcoming was addressed in IPv6; MLDv2 requires that the unique link-local address for an interface is used to identify an MLDv2 listener. .Ss "Source-Specific Multicast Options" Since .Fx 8.0 , the use of Source-Specific Multicast (SSM) is supported. These extensions require an IGMPv3 multicast router in order to make best use of them. If a legacy multicast router is present on the link, .Fx will simply downgrade to the version of IGMP spoken by the router, and the benefits of source filtering on the upstream link will not be present, although the kernel will continue to squelch transmissions from blocked sources. .Pp Each group membership on a socket now has a filter mode: .Bl -tag -width MCAST_EXCLUDE .It Dv MCAST_EXCLUDE Datagrams sent to this group are accepted, unless the source is in a list of blocked source addresses. .It Dv MCAST_INCLUDE Datagrams sent to this group are accepted only if the source is in a list of accepted source addresses. 
.El .Pp Groups joined using the legacy .Dv IP_ADD_MEMBERSHIP option are placed in exclusive-mode, and are able to request that certain sources are blocked or allowed. This is known as the .Em delta-based API . .Pp To block a multicast source on an existing group membership: .Bd -literal struct ip_mreq_source mreqs; setsockopt(s, IPPROTO_IP, IP_BLOCK_SOURCE, &mreqs, sizeof(mreqs)); .Ed .Pp where .Fa mreqs is the following structure: .Bd -literal struct ip_mreq_source { struct in_addr imr_multiaddr; /* IP multicast address of group */ struct in_addr imr_sourceaddr; /* IP address of source */ struct in_addr imr_interface; /* local IP address of interface */ } .Ed .Va imr_sourceaddr should be set to the address of the source to be blocked. .Pp To unblock a multicast source on an existing group: .Bd -literal struct ip_mreq_source mreqs; setsockopt(s, IPPROTO_IP, IP_UNBLOCK_SOURCE, &mreqs, sizeof(mreqs)); .Ed .Pp The .Dv IP_BLOCK_SOURCE and .Dv IP_UNBLOCK_SOURCE options are .Em not permitted for inclusive-mode group memberships. .Pp To join a multicast group in .Dv MCAST_INCLUDE mode with a single source, or add another source to an existing inclusive-mode membership: .Bd -literal struct ip_mreq_source mreqs; setsockopt(s, IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs)); .Ed .Pp To leave a single source from an existing group in inclusive mode: .Bd -literal struct ip_mreq_source mreqs; setsockopt(s, IPPROTO_IP, IP_DROP_SOURCE_MEMBERSHIP, &mreqs, sizeof(mreqs)); .Ed If this is the last accepted source for the group, the membership will be dropped. .Pp The .Dv IP_ADD_SOURCE_MEMBERSHIP and .Dv IP_DROP_SOURCE_MEMBERSHIP options are .Em not accepted for exclusive-mode group memberships. However, both exclusive and inclusive mode memberships support the use of the .Em full-state API documented in RFC 3678. For management of source filter lists using this API, please refer to .Xr sourcefilter 3 . 
.Pp
The sysctl settings
.Va net.inet.ip.mcast.maxsocksrc
and
.Va net.inet.ip.mcast.maxgrpsrc
are used to specify an upper limit on the number of per-socket and per-group
source filter entries which the kernel may allocate.
.\"-----------------------
.Ss "Raw IP Sockets"
Raw
.Tn IP
sockets are connectionless,
and are normally used with the
.Xr sendto 2
and
.Xr recvfrom 2
calls, though the
.Xr connect 2
call may also be used to fix the destination for future
packets (in which case the
.Xr read 2
or
.Xr recv 2
and
.Xr write 2
or
.Xr send 2
system calls may be used).
.Pp
If
.Fa proto
is 0, the default protocol
.Dv IPPROTO_RAW
is used for outgoing
packets, and only incoming packets destined for that protocol
are received.
If
.Fa proto
is non-zero, that protocol number will be used on outgoing packets
and to filter incoming packets.
.Pp
Outgoing packets automatically have an
.Tn IP
header prepended to
them (based on the destination address and the protocol
number the socket is created with),
unless the
.Dv IP_HDRINCL
option has been set.
Unlike in previous
.Bx
releases, incoming packets are received with
.Tn IP
header and options intact, leaving all fields in network byte order.
.Pp
.Dv IP_HDRINCL
indicates the complete IP header is included with the data
and may be used only with the
.Dv SOCK_RAW
type.
.Bd -literal
#include <netinet/in_systm.h>
#include <netinet/ip.h>

int hincl = 1;                  /* 1 = on, 0 = off */
setsockopt(s, IPPROTO_IP, IP_HDRINCL, &hincl, sizeof(hincl));
.Ed
.Pp
Unlike previous
.Bx
releases, the program must set all the fields of the IP header,
including the following:
.Bd -literal
ip->ip_v = IPVERSION;
ip->ip_hl = hlen >> 2;
ip->ip_id = 0;  /* 0 means the kernel sets an appropriate value */
ip->ip_off = htons(offset);
ip->ip_len = htons(len);
.Ed
.Pp
The packet should be provided exactly as it is to be sent over the wire.
This implies that all fields, including
.Va ip_len
and
.Va ip_off ,
must be in network byte order.
See
.Xr byteorder 3
for more information on network byte order.
If the .Va ip_id field is set to 0 then the kernel will choose an appropriate value. If the header source address is set to .Dv INADDR_ANY , the kernel will choose an appropriate address. .Sh ERRORS A socket operation may fail with one of the following errors returned: .Bl -tag -width Er .It Bq Er EISCONN when trying to establish a connection on a socket which already has one, or when trying to send a datagram with the destination address specified and the socket is already connected; .It Bq Er ENOTCONN when trying to send a datagram, but no destination address is specified, and the socket has not been connected; .It Bq Er ENOBUFS when the system runs out of memory for an internal data structure; .It Bq Er EADDRNOTAVAIL when an attempt is made to create a socket with a network address for which no network interface exists. .It Bq Er EACCES when an attempt is made to create a raw IP socket by a non-privileged process. .El .Pp The following errors specific to .Tn IP may occur when setting or getting .Tn IP options: .Bl -tag -width Er .It Bq Er EINVAL An unknown socket option name was given. .It Bq Er EINVAL The IP option field was improperly formed; an option field was shorter than the minimum value or longer than the option buffer provided. .El .Pp The following errors may occur when attempting to send .Tn IP datagrams via a .Dq raw socket with the .Dv IP_HDRINCL option set: .Bl -tag -width Er .It Bq Er EINVAL The user-supplied .Va ip_len field was not equal to the length of the datagram written to the socket. .El .Sh SEE ALSO .Xr getsockopt 2 , .Xr recv 2 , .Xr send 2 , .Xr byteorder 3 , +.Xr CMSG_DATA 3 , .Xr sourcefilter 3 , .Xr icmp 4 , .Xr igmp 4 , .Xr inet 4 , .Xr intro 4 , .Xr multicast 4 .Rs .%A D. Thaler .%A B. Fenner .%A B. Quinn .%T "Socket Interface Extensions for Multicast Source Filters" .%N RFC 3678 .%D Jan 2004 .Re .Sh HISTORY The .Nm protocol appeared in .Bx 4.2 . The .Vt ip_mreqn structure appeared in .Tn Linux 2.4 . 
.Sh BUGS Before .Fx 10.0 packets received on raw IP sockets had the .Va ip_hl subtracted from the .Va ip_len field. .Pp Before .Fx 11.0 packets received on raw IP sockets had the .Va ip_len and .Va ip_off fields converted to host byte order. Packets written to raw IP sockets were expected to have .Va ip_len and .Va ip_off in host byte order. Index: head/share/man/man4/ip6.4 =================================================================== --- head/share/man/man4/ip6.4 (revision 338059) +++ head/share/man/man4/ip6.4 (revision 338060) @@ -1,701 +1,702 @@ .\" $KAME: ip6.4,v 1.23 2005/01/11 05:56:25 itojun Exp $ .\" $OpenBSD: ip6.4,v 1.21 2005/01/06 03:50:46 itojun Exp $ .\" .\" Copyright (c) 1983, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd March 13, 2011 +.Dd August 19, 2018 .Dt IP6 4 .Os .Sh NAME .Nm ip6 .Nd Internet Protocol version 6 (IPv6) network layer .Sh SYNOPSIS .In sys/socket.h .In netinet/in.h .Ft int .Fn socket AF_INET6 SOCK_RAW proto .Sh DESCRIPTION The IPv6 network layer is used by the IPv6 protocol family for transporting data. IPv6 packets contain an IPv6 header that is not provided as part of the payload contents when passed to an application. IPv6 header options affect the behavior of this protocol and may be used by high-level protocols (such as the .Xr tcp 4 and .Xr udp 4 protocols) as well as directly by .Dq raw sockets , which process IPv6 messages at a lower-level and may be useful for developing new protocols and special-purpose applications. .Ss Header All IPv6 packets begin with an IPv6 header. When data received by the kernel are passed to the application, this header is not included in buffer, even when raw sockets are being used. Likewise, when data are sent to the kernel for transmit from the application, the buffer is not examined for an IPv6 header: the kernel always constructs the header. To directly access IPv6 headers from received packets and specify them as part of the buffer passed to the kernel, link-level access .Po .Xr bpf 4 , for example .Pc must instead be utilized. 
.Pp
The header has the following definition:
.Bd -literal -offset indent
struct ip6_hdr {
     union {
          struct ip6_hdrctl {
               uint32_t ip6_un1_flow;  /* 20 bits of flow ID */
               uint16_t ip6_un1_plen;  /* payload length */
               uint8_t  ip6_un1_nxt;   /* next header */
               uint8_t  ip6_un1_hlim;  /* hop limit */
          } ip6_un1;
          uint8_t ip6_un2_vfc;   /* version and class */
     } ip6_ctlun;
     struct in6_addr ip6_src;    /* source address */
     struct in6_addr ip6_dst;    /* destination address */
} __packed;

#define ip6_vfc		ip6_ctlun.ip6_un2_vfc
#define ip6_flow	ip6_ctlun.ip6_un1.ip6_un1_flow
#define ip6_plen	ip6_ctlun.ip6_un1.ip6_un1_plen
#define ip6_nxt		ip6_ctlun.ip6_un1.ip6_un1_nxt
#define ip6_hlim	ip6_ctlun.ip6_un1.ip6_un1_hlim
#define ip6_hops	ip6_ctlun.ip6_un1.ip6_un1_hlim
.Ed
.Pp
All fields are in network-byte order.
Any options specified (see
.Sx Options
below) must also be specified in network-byte order.
.Pp
.Va ip6_flow
specifies the flow ID.
.Va ip6_plen
specifies the payload length.
.Va ip6_nxt
specifies the type of the next header.
.Va ip6_hlim
specifies the hop limit.
.Pp
The top 4 bits of
.Va ip6_vfc
specify the version and the bottom 4 bits specify the upper 4 bits of the
traffic class.
.Pp
.Va ip6_src
and
.Va ip6_dst
specify the source and destination addresses.
.Pp
The IPv6 header may be followed by any number of extension headers that start
with the following generic definition:
.Bd -literal -offset indent
struct ip6_ext {
     uint8_t ip6e_nxt;
     uint8_t ip6e_len;
} __packed;
.Ed
.Ss Options
IPv6 allows header options on packets to manipulate the behavior of the
protocol.
These options and other control requests are accessed with the
.Xr getsockopt 2
and
.Xr setsockopt 2
system calls at level
.Dv IPPROTO_IPV6
and by using ancillary data in
.Xr recvmsg 2
and
.Xr sendmsg 2 .
They can be used to access most of the fields in the IPv6 header and
extension headers.
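.Pp
As a minimal sketch of access at level IPPROTO_IPV6, the following sets and
reads back an integer-valued option, using IPV6_UNICAST_HOPS from the list of
supported options as the example.
The helper names
set_unicast_hops
and
get_unicast_hops
are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Set the default unicast hop limit on s; returns 0, or -1 on error. */
static int
set_unicast_hops(int s, int hops)
{
	return (setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
	    &hops, sizeof(hops)));
}

/* Read the unicast hop limit of s; returns it, or -1 on error. */
static int
get_unicast_hops(int s)
{
	int hops;
	socklen_t len = sizeof(hops);

	if (getsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS, &hops, &len) == -1)
		return (-1);
	return (hops);
}
```

The other options whose argument is an int pointer follow the same pattern,
with only the option name changing.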
.Pp The following socket options are supported: .Bl -tag -width Ds .\" .It Dv IPV6_OPTIONS .It Dv IPV6_UNICAST_HOPS Fa "int *" Get or set the default hop limit header field for outgoing unicast datagrams sent on this socket. .\" .It Dv IPV6_RECVOPTS Fa "int *" .\" Get or set the status of whether all header options will be .\" delivered along with the datagram when it is received. .\" .It Dv IPV6_RECVRETOPTS Fa "int *" .\" Get or set the status of whether header options will be delivered .\" for reply. .\" .It Dv IPV6_RECVDSTADDR Fa "int *" .\" Get or set the status of whether datagrams are received with .\" destination addresses. .\" .It Dv IPV6_ORIGDSTADDR Fa "int *" .\" Get or set the status of whether datagrams are received with .\" destination addresses and destination ports. .\" .It Dv IPV6_RETOPTS .\" Get or set IPv6 options. .It Dv IPV6_MULTICAST_IF Fa "u_int *" Get or set the interface from which multicast packets will be sent. For hosts with multiple interfaces, each multicast transmission is sent from the primary network interface. The interface is specified as its index as provided by .Xr if_nametoindex 3 . A value of zero specifies the default interface. .It Dv IPV6_MULTICAST_HOPS Fa "int *" Get or set the default hop limit header field for outgoing multicast datagrams sent on this socket. This option controls the scope of multicast datagram transmissions. .Pp Datagrams with a hop limit of 1 are not forwarded beyond the local network. Multicast datagrams with a hop limit of zero will not be transmitted on any network but may be delivered locally if the sending host belongs to the destination group and if multicast loopback (see below) has not been disabled on the sending socket. Multicast datagrams with a hop limit greater than 1 may be forwarded to the other networks if a multicast router (such as .Xr mrouted 8 Pq Pa ports/net/mrouted ) is attached to the local network. 
.It Dv IPV6_MULTICAST_LOOP Fa "u_int *" Get or set the status of whether multicast datagrams will be looped back for local delivery when a multicast datagram is sent to a group to which the sending host belongs. .Pp This option improves performance for applications that may have no more than one instance on a single host (such as a router daemon) by eliminating the overhead of receiving their own transmissions. It should generally not be used by applications for which there may be more than one instance on a single host (such as a conferencing program) or for which the sender does not belong to the destination group (such as a time-querying program). .Pp A multicast datagram sent with an initial hop limit greater than 1 may be delivered to the sending host on a different interface from that on which it was sent if the host belongs to the destination group on that other interface. The multicast loopback control option has no effect on such delivery. .It Dv IPV6_JOIN_GROUP Fa "struct ipv6_mreq *" Join a multicast group. A host must become a member of a multicast group before it can receive datagrams sent to the group. .Bd -literal struct ipv6_mreq { struct in6_addr ipv6mr_multiaddr; unsigned int ipv6mr_interface; }; .Ed .Pp .Va ipv6mr_interface may be set to zeroes to choose the default multicast interface or to the index of a particular multicast-capable interface if the host is multihomed. Membership is associated with a single interface; programs running on multihomed hosts may need to join the same group on more than one interface. .Pp If the multicast address is unspecified (i.e., all zeroes), messages from all multicast addresses will be accepted by this group. Note that setting to this value requires superuser privileges. .It Dv IPV6_LEAVE_GROUP Fa "struct ipv6_mreq *" Drop membership from the associated multicast group. Memberships are automatically dropped when the socket is closed or when the process exits. 
.It Dv IPV6_PORTRANGE Fa "int *" Get or set the allocation policy of ephemeral ports for when the kernel automatically binds a local address to this socket. The following values are available: .Pp .Bl -tag -width IPV6_PORTRANGE_DEFAULT -compact .It Dv IPV6_PORTRANGE_DEFAULT Use the regular range of non-reserved ports (varies, see .Xr ip 4 ) . .It Dv IPV6_PORTRANGE_HIGH Use a high range (varies, see .Xr ip 4 ) . .It Dv IPV6_PORTRANGE_LOW Use a low, reserved range (600\-1023, see .Xr ip 4 ) . .El .It Dv IPV6_PKTINFO Fa "int *" Get or set whether additional information about subsequent packets will be provided as ancillary data along with the payload in subsequent .Xr recvmsg 2 calls. The information is stored in the following structure in the ancillary data returned: .Bd -literal struct in6_pktinfo { struct in6_addr ipi6_addr; /* src/dst IPv6 address */ unsigned int ipi6_ifindex; /* send/recv if index */ }; .Ed .It Dv IPV6_HOPLIMIT Fa "int *" Get or set whether the hop limit header field from subsequent packets will be provided as ancillary data along with the payload in subsequent .Xr recvmsg 2 calls. The value is stored as an .Vt int in the ancillary data returned. .\" .It Dv IPV6_NEXTHOP Fa "int *" .\" Get or set whether the address of the next hop for subsequent .\" packets will be provided as ancillary data along with the payload in .\" subsequent .\" .Xr recvmsg 2 .\" calls. .\" The option is stored as a .\" .Vt sockaddr .\" structure in the ancillary data returned. .\" .Pp .\" This option requires superuser privileges. .It Dv IPV6_HOPOPTS Fa "int *" Get or set whether the hop-by-hop options from subsequent packets will be provided as ancillary data along with the payload in subsequent .Xr recvmsg 2 calls. 
The option is stored in the following structure in the ancillary data returned: .Bd -literal struct ip6_hbh { uint8_t ip6h_nxt; /* next header */ uint8_t ip6h_len; /* length in units of 8 octets */ /* followed by options */ } __packed; .Ed .Pp The .Fn inet6_option_space routine and family of routines may be used to manipulate this data. .Pp This option requires superuser privileges. .It Dv IPV6_DSTOPTS Fa "int *" Get or set whether the destination options from subsequent packets will be provided as ancillary data along with the payload in subsequent .Xr recvmsg 2 calls. The option is stored in the following structure in the ancillary data returned: .Bd -literal struct ip6_dest { uint8_t ip6d_nxt; /* next header */ uint8_t ip6d_len; /* length in units of 8 octets */ /* followed by options */ } __packed; .Ed .Pp The .Fn inet6_option_space routine and family of routines may be used to manipulate this data. .Pp This option requires superuser privileges. .It Dv IPV6_TCLASS Fa "int *" Get or set the value of the traffic class field used for outgoing datagrams on this socket. The value must be between \-1 and 255. A value of \-1 resets to the default value. .It Dv IPV6_RECVTCLASS Fa "int *" Get or set the status of whether the traffic class header field will be provided as ancillary data along with the payload in subsequent .Xr recvmsg 2 calls. The header field is stored as a single value of type .Vt int . .It Dv IPV6_RTHDR Fa "int *" Get or set whether the routing header from subsequent packets will be provided as ancillary data along with the payload in subsequent .Xr recvmsg 2 calls. 
The header is stored in the following structure in the ancillary data returned: .Bd -literal struct ip6_rthdr { uint8_t ip6r_nxt; /* next header */ uint8_t ip6r_len; /* length in units of 8 octets */ uint8_t ip6r_type; /* routing type */ uint8_t ip6r_segleft; /* segments left */ /* followed by routing-type-specific data */ } __packed; .Ed .Pp The .Fn inet6_option_space routine and family of routines may be used to manipulate this data. .Pp This option requires superuser privileges. .It Dv IPV6_PKTOPTIONS Fa "struct cmsghdr *" Get or set all header options and extension headers at one time on the last packet sent or received on the socket. All options must fit within the size of an mbuf (see .Xr mbuf 9 ) . Options are specified as a series of .Vt cmsghdr structures followed by corresponding values. .Va cmsg_level is set to .Dv IPPROTO_IPV6 , .Va cmsg_type to one of the other values in this list, and trailing data to the option value. When setting options, if the length .Va optlen to .Xr setsockopt 2 is zero, all header options will be reset to their default values. Otherwise, the length should specify the size the series of control messages consumes. .Pp Instead of using .Xr sendmsg 2 to specify option values, the ancillary data used in these calls that correspond to the desired header options may be directly specified as the control message in the series of control messages provided as the argument to .Xr setsockopt 2 . .It Dv IPV6_CHECKSUM Fa "int *" Get or set the byte offset into a packet where the 16-bit checksum is located. When set, this byte offset is where incoming packets will be expected to have checksums of their data stored and where outgoing packets will have checksums of their data computed and stored by the kernel. A value of \-1 specifies that no checksums will be checked on incoming packets and that no checksums will be computed or stored on outgoing packets. The offset of the checksum for ICMPv6 sockets cannot be relocated or turned off. 
.It Dv IPV6_V6ONLY Fa "int *"
Get or set whether only IPv6 connections can be made to this socket.
For wildcard sockets, this can restrict connections to IPv6 only.
.\"With
.\".Ox
.\"IPv6 sockets are always IPv6-only, so the socket option is read-only
.\"(not modifiable).
.It Dv IPV6_USE_MIN_MTU Fa "int *"
Get or set whether the minimal IPv6 maximum transmission unit (MTU) size
will be used to avoid fragmentation from occurring for subsequent
outgoing datagrams.
.It Dv IPV6_AUTH_LEVEL Fa "int *"
Get or set the
.Xr ipsec 4
authentication level.
.It Dv IPV6_ESP_TRANS_LEVEL Fa "int *"
Get or set the ESP transport level.
.It Dv IPV6_ESP_NETWORK_LEVEL Fa "int *"
Get or set the ESP encapsulation level.
.It Dv IPV6_IPCOMP_LEVEL Fa "int *"
Get or set the
.Xr ipcomp 4
level.
.El
.Pp
The
.Dv IPV6_PKTINFO ,
.\" .Dv IPV6_NEXTHOP ,
.Dv IPV6_HOPLIMIT ,
.Dv IPV6_HOPOPTS ,
.Dv IPV6_DSTOPTS ,
and
.Dv IPV6_RTHDR
options will return ancillary data along with payload contents in subsequent
.Xr recvmsg 2
calls with
.Va cmsg_level
set to
.Dv IPPROTO_IPV6
and
.Va cmsg_type
set to the respective option name value (e.g.,
.Dv IPV6_HOPLIMIT ) .
These options may also be used directly as ancillary
.Va cmsg_type
values in
.Xr sendmsg 2
to set options on the packet being transmitted by the call.
The
.Va cmsg_level
value must be
.Dv IPPROTO_IPV6 .
For these options, the ancillary data object value format is the same
as the value returned as explained for each when received with
.Xr recvmsg 2 .
.Pp
Note that using
.Xr sendmsg 2
to specify options on particular packets works only on UDP and raw sockets.
To manipulate header options for packets on TCP sockets, only the socket
options may be used.
.Pp
In some cases, there are multiple APIs defined for manipulating an IPv6
header field.
A good example is the outgoing interface for multicast datagrams, which can be set by the .Dv IPV6_MULTICAST_IF socket option, through the .Dv IPV6_PKTINFO option, and through the .Va sin6_scope_id field of the socket address passed to the .Xr sendto 2 system call. .Pp Resolving these conflicts is implementation dependent. This implementation determines the value in the following way: options specified by using ancillary data (i.e., .Xr sendmsg 2 ) are considered first, options specified by using .Dv IPV6_PKTOPTIONS to set .Dq sticky options are considered second, options specified by using the individual, basic, and direct socket options (e.g., .Dv IPV6_UNICAST_HOPS ) are considered third, and options specified in the socket address supplied to .Xr sendto 2 are the last choice. .Ss Multicasting IPv6 multicasting is supported only on .Dv AF_INET6 sockets of type .Dv SOCK_DGRAM and .Dv SOCK_RAW , and only on networks where the interface driver supports multicasting. Socket options (see above) that manipulate membership of multicast groups and other multicast options include .Dv IPV6_MULTICAST_IF , .Dv IPV6_MULTICAST_HOPS , .Dv IPV6_MULTICAST_LOOP , .Dv IPV6_LEAVE_GROUP , and .Dv IPV6_JOIN_GROUP . .Ss Raw Sockets Raw IPv6 sockets are connectionless and are normally used with the .Xr sendto 2 and .Xr recvfrom 2 calls, although the .Xr connect 2 call may be used to fix the destination address for future outgoing packets so that .Xr send 2 may instead be used and the .Xr bind 2 call may be used to fix the source address for future outgoing packets instead of having the kernel choose a source address. .Pp By using .Xr connect 2 or .Xr bind 2 , raw socket input is constrained to only packets with their source address matching the socket destination address if .Xr connect 2 was used and to packets with their destination address matching the socket source address if .Xr bind 2 was used. 
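.Pp
The use of
connect(2)
on a raw IPv6 socket described above might look like the following sketch.
The helper name
raw6_connect
is hypothetical, creating the socket normally requires privileges, and the
SIN6_LEN test is an assumption used to set the BSD-style length field only
where it exists.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a raw IPv6 socket for protocol proto and fix
 * its destination to the textual address peer, so that send(2) may be used
 * and input is limited to packets whose source address is peer.
 * Returns the socket, or -1 on error.
 */
static int
raw6_connect(int proto, const char *peer)
{
	struct sockaddr_in6 sin6;
	int s;

	memset(&sin6, 0, sizeof(sin6));
	sin6.sin6_family = AF_INET6;
#ifdef SIN6_LEN
	sin6.sin6_len = sizeof(sin6);	/* BSD-style length field */
#endif
	if (inet_pton(AF_INET6, peer, &sin6.sin6_addr) != 1)
		return (-1);		/* not a valid IPv6 address */
	if ((s = socket(AF_INET6, SOCK_RAW, proto)) == -1)
		return (-1);		/* typically lacks privilege */
	if (connect(s, (struct sockaddr *)&sin6, sizeof(sin6)) == -1) {
		close(s);
		return (-1);
	}
	return (s);
}
```

A call to
bind(2)
with a local address could be added in the same way to fix the source address.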
.Pp If the .Ar proto argument to .Xr socket 2 is zero, the default protocol .Pq Dv IPPROTO_RAW is used for outgoing packets. For incoming packets, protocols recognized by kernel are .Sy not passed to the application socket (e.g., .Xr tcp 4 and .Xr udp 4 ) except for some ICMPv6 messages. The ICMPv6 messages not passed to raw sockets include echo, timestamp, and address mask requests. If .Ar proto is non-zero, only packets with this protocol will be passed to the socket. .Pp IPv6 fragments are also not passed to application sockets until they have been reassembled. If reception of all packets is desired, link-level access (such as .Xr bpf 4 ) must be used instead. .Pp Outgoing packets automatically have an IPv6 header prepended to them (based on the destination address and the protocol number the socket was created with). Incoming packets are received by an application without the IPv6 header or any extension headers. .Pp Outgoing packets will be fragmented automatically by the kernel if they are too large. Incoming packets will be reassembled before being sent to the raw socket, so packet fragments or fragment headers will never be seen on a raw socket. .Sh EXAMPLES The following determines the hop limit on the next packet received: .Bd -literal struct iovec iov[2]; u_char buf[BUFSIZ]; struct cmsghdr *cm; struct msghdr m; int optval; bool found; u_char data[2048]; /* Create socket. */ (void)memset(&m, 0, sizeof(m)); (void)memset(&iov, 0, sizeof(iov)); iov[0].iov_base = data; /* buffer for packet payload */ iov[0].iov_len = sizeof(data); /* expected packet length */ m.msg_name = &from; /* sockaddr_in6 of peer */ m.msg_namelen = sizeof(from); m.msg_iov = iov; m.msg_iovlen = 1; m.msg_control = (caddr_t)buf; /* buffer for control messages */ m.msg_controllen = sizeof(buf); /* * Enable the hop limit value from received packets to be * returned along with the payload. 
*/ optval = 1; if (setsockopt(s, IPPROTO_IPV6, IPV6_HOPLIMIT, &optval, sizeof(optval)) == -1) err(1, "setsockopt"); found = false; do { if (recvmsg(s, &m, 0) == -1) err(1, "recvmsg"); for (cm = CMSG_FIRSTHDR(&m); cm != NULL; cm = CMSG_NXTHDR(&m, cm)) { if (cm->cmsg_level == IPPROTO_IPV6 && cm->cmsg_type == IPV6_HOPLIMIT && cm->cmsg_len == CMSG_LEN(sizeof(int))) { found = true; (void)printf("hop limit: %d\en", *(int *)CMSG_DATA(cm)); break; } } } while (!found); .Ed .Sh DIAGNOSTICS A socket operation may fail with one of the following errors returned: .Bl -tag -width EADDRNOTAVAILxx .It Bq Er EISCONN when trying to establish a connection on a socket which already has one or when trying to send a datagram with the destination address specified and the socket is already connected. .It Bq Er ENOTCONN when trying to send a datagram, but no destination address is specified, and the socket has not been connected. .It Bq Er ENOBUFS when the system runs out of memory for an internal data structure. .It Bq Er EADDRNOTAVAIL when an attempt is made to create a socket with a network address for which no network interface exists. .It Bq Er EACCES when an attempt is made to create a raw IPv6 socket by a non-privileged process. .El .Pp The following errors specific to IPv6 may occur when setting or getting header options: .Bl -tag -width EADDRNOTAVAILxx .It Bq Er EINVAL An unknown socket option name was given. .It Bq Er EINVAL An ancillary data object was improperly formed. .El .Sh SEE ALSO .Xr getsockopt 2 , .Xr recv 2 , .Xr send 2 , .Xr setsockopt 2 , .Xr socket 2 , +.Xr CMSG_DATA 3 , .\" .Xr inet6_option_space 3 , .\" .Xr inet6_rthdr_space 3 , .Xr if_nametoindex 3 , .Xr bpf 4 , .Xr icmp6 4 , .Xr inet6 4 , .Xr ip 4 , .Xr netintro 4 , .Xr tcp 4 , .Xr udp 4 .Rs .%A W. Stevens .%A M. Thomas .%T Advanced Sockets API for IPv6 .%R RFC 2292 .%D February 1998 .Re .Rs .%A S. Deering .%A R. 
Hinden .%T Internet Protocol, Version 6 (IPv6) Specification .%R RFC 2460 .%D December 1998 .Re .Rs .%A R. Gilligan .%A S. Thomson .%A J. Bound .%A W. Stevens .%T Basic Socket Interface Extensions for IPv6 .%R RFC 2553 .%D March 1999 .Re .Rs .%A W. Stevens .%A B. Fenner .%A A. Rudoff .%T UNIX Network Programming, third edition .Re .Sh STANDARDS Most of the socket options are defined in RFC 2292 or RFC 2553. The .Dv IPV6_V6ONLY socket option is defined in RFC 3493 Section 5.3. The .Dv IPV6_PORTRANGE socket option and the conflict resolution rule are not defined in the RFCs and should be considered implementation dependent. Index: head/share/man/man4/unix.4 =================================================================== --- head/share/man/man4/unix.4 (revision 338059) +++ head/share/man/man4/unix.4 (revision 338060) @@ -1,363 +1,364 @@ .\" Copyright (c) 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)unix.4 8.1 (Berkeley) 6/9/93 .\" $FreeBSD$ .\" -.Dd February 3, 2017 +.Dd August 19, 2018 .Dt UNIX 4 .Os .Sh NAME .Nm unix .Nd UNIX-domain protocol family .Sh SYNOPSIS .In sys/types.h .In sys/un.h .Sh DESCRIPTION The .Ux Ns -domain protocol family is a collection of protocols that provides local (on-machine) interprocess communication through the normal .Xr socket 2 mechanisms. The .Ux Ns -domain family supports the .Dv SOCK_STREAM , .Dv SOCK_SEQPACKET , and .Dv SOCK_DGRAM socket types and uses file system pathnames for addressing. .Sh ADDRESSING .Ux Ns -domain addresses are variable-length file system pathnames of at most 104 characters. The include file .In sys/un.h defines this address: .Bd -literal -offset indent struct sockaddr_un { u_char sun_len; u_char sun_family; char sun_path[104]; }; .Ed .Pp Binding a name to a .Ux Ns -domain socket with .Xr bind 2 causes a socket file to be created in the file system. This file is .Em not removed when the socket is closed \(em .Xr unlink 2 must be used to remove the file. .Pp The length of .Ux Ns -domain address, required by .Xr bind 2 and .Xr connect 2 , can be calculated by the macro .Fn SUN_LEN defined in .In sys/un.h . The .Va sun_path field must be terminated by a .Dv NUL character to be used with .Fn SUN_LEN , but the terminating .Dv NUL is .Em not part of the address. 
.Pp The .Ux Ns -domain protocol family does not support broadcast addressing or any form of .Dq wildcard matching on incoming messages. All addresses are absolute- or relative-pathnames of other .Ux Ns -domain sockets. Normal file system access-control mechanisms are also applied when referencing pathnames; e.g., the destination of a .Xr connect 2 or .Xr sendto 2 must be writable. .Sh CONTROL MESSAGES The .Ux Ns -domain sockets support the communication of .Ux file descriptors and process credentials through the use of the .Va msg_control field in the .Fa msg argument to .Xr sendmsg 2 and .Xr recvmsg 2 . The items to be passed are described using a .Vt "struct cmsghdr" that is defined in the include file .In sys/socket.h . .Pp To send file descriptors, the type of the message is .Dv SCM_RIGHTS , and the data portion of the messages is an array of integers representing the file descriptors to be passed. The number of descriptors being passed is defined by the length field of the message; the length field is the sum of the size of the header plus the size of the array of file descriptors. .Pp The received descriptor is a .Em duplicate of the sender's descriptor, as if it were created via .Li dup(fd) or .Li fcntl(fd, F_DUPFD_CLOEXEC, 0) depending on whether .Dv MSG_CMSG_CLOEXEC is passed in the .Xr recvmsg 2 call. Descriptors that are awaiting delivery, or that are purposely not received, are automatically closed by the system when the destination socket is closed. 
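.Pp
The descriptor passing described above can be sketched as follows.
The helper names
send_fd
and
recv_fd
are hypothetical; each message carries exactly one descriptor together with
one byte of ordinary data, which is a simplification chosen for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Control buffer sized and aligned for one descriptor. */
union fd_ctl {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send descriptor fd over connected socket s. */
static int
send_fd(int s, int fd)
{
	union fd_ctl ctl;
	struct msghdr m;
	struct cmsghdr *cm;
	struct iovec iov;
	char byte = 0;

	memset(&m, 0, sizeof(m));
	memset(&ctl, 0, sizeof(ctl));
	iov.iov_base = &byte;		/* one byte of ordinary data */
	iov.iov_len = 1;
	m.msg_iov = &iov;
	m.msg_iovlen = 1;
	m.msg_control = ctl.buf;
	m.msg_controllen = sizeof(ctl.buf);

	cm = CMSG_FIRSTHDR(&m);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	cm->cmsg_len = CMSG_LEN(sizeof(int));	/* header plus one int */
	memcpy(CMSG_DATA(cm), &fd, sizeof(int));

	return (sendmsg(s, &m, 0) == 1 ? 0 : -1);
}

/* Hypothetical helper: receive one descriptor from s, or -1 on failure. */
static int
recv_fd(int s)
{
	union fd_ctl ctl;
	struct msghdr m;
	struct cmsghdr *cm;
	struct iovec iov;
	char byte;
	int fd;

	memset(&m, 0, sizeof(m));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	m.msg_iov = &iov;
	m.msg_iovlen = 1;
	m.msg_control = ctl.buf;
	m.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(s, &m, 0) <= 0)
		return (-1);
	for (cm = CMSG_FIRSTHDR(&m); cm != NULL; cm = CMSG_NXTHDR(&m, cm)) {
		if (cm->cmsg_level == SOL_SOCKET &&
		    cm->cmsg_type == SCM_RIGHTS &&
		    cm->cmsg_len == CMSG_LEN(sizeof(int))) {
			memcpy(&fd, CMSG_DATA(cm), sizeof(int));
			return (fd);
		}
	}
	return (-1);
}
```

The descriptor returned by recv_fd is a duplicate, so the sender may close
its own copy once the message has been sent.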
.Pp Credentials of the sending process can be transmitted explicitly using a control message of type .Dv SCM_CREDS with a data portion of type .Vt "struct cmsgcred" , defined in .In sys/socket.h as follows: .Bd -literal struct cmsgcred { pid_t cmcred_pid; /* PID of sending process */ uid_t cmcred_uid; /* real UID of sending process */ uid_t cmcred_euid; /* effective UID of sending process */ gid_t cmcred_gid; /* real GID of sending process */ short cmcred_ngroups; /* number of groups */ gid_t cmcred_groups[CMGROUP_MAX]; /* groups */ }; .Ed .Pp The sender should pass a zeroed buffer which will be filled in by the system. .Pp The group list is truncated to at most .Dv CMGROUP_MAX GIDs. .Pp The process ID .Fa cmcred_pid should not be looked up (such as via the .Dv KERN_PROC_PID sysctl) for making security decisions. The sending process could have exited and its process ID already been reused for a new process. .Sh SOCKET OPTIONS .Tn UNIX domain sockets support a number of socket options which can be set with .Xr setsockopt 2 and tested with .Xr getsockopt 2 : .Bl -tag -width ".Dv LOCAL_CONNWAIT" .It Dv LOCAL_CREDS This option may be enabled on .Dv SOCK_DGRAM , .Dv SOCK_SEQPACKET , or a .Dv SOCK_STREAM socket. This option provides a mechanism for the receiver to receive the credentials of the process calling .Xr write 2 , .Xr send 2 , .Xr sendto 2 or .Xr sendmsg 2 as a .Xr recvmsg 2 control message. The .Va msg_control field in the .Vt msghdr structure points to a buffer that contains a .Vt cmsghdr structure followed by a variable length .Vt sockcred structure, defined in .In sys/socket.h as follows: .Bd -literal struct sockcred { uid_t sc_uid; /* real user id */ uid_t sc_euid; /* effective user id */ gid_t sc_gid; /* real group id */ gid_t sc_egid; /* effective group id */ int sc_ngroups; /* number of supplemental groups */ gid_t sc_groups[1]; /* variable length */ }; .Ed .Pp The current implementation truncates the group list to at most .Dv CMGROUP_MAX groups. 
.Pp The .Fn SOCKCREDSIZE macro computes the size of the .Vt sockcred structure for a specified number of groups. The .Vt cmsghdr fields have the following values: .Bd -literal cmsg_len = CMSG_LEN(SOCKCREDSIZE(ngroups)) cmsg_level = SOL_SOCKET cmsg_type = SCM_CREDS .Ed .Pp On .Dv SOCK_STREAM and .Dv SOCK_SEQPACKET sockets, credentials are passed only on the first read from a socket; the system then clears the option on the socket. .Pp This option and the above explicit .Vt "struct cmsgcred" both use the same value .Dv SCM_CREDS but carry incompatible control messages. If this option is enabled and the sender attached a .Dv SCM_CREDS control message with a .Vt "struct cmsgcred" , it will be discarded and a .Vt "struct sockcred" will be included. .Pp Many setuid programs will .Xr write 2 data at least partially controlled by the invoker, such as error messages. Therefore, a message accompanied by a particular .Fa sc_euid value should not be trusted as being from that user. .It Dv LOCAL_CONNWAIT Used with .Dv SOCK_STREAM sockets, this option causes the .Xr connect 2 function to block until .Xr accept 2 has been called on the listening socket. .It Dv LOCAL_PEERCRED When requested via .Xr getsockopt 2 on a .Dv SOCK_STREAM socket, this option returns the credentials of the remote side. These will arrive in the form of a filled-in .Vt xucred structure, defined in .In sys/ucred.h as follows: .Bd -literal struct xucred { u_int cr_version; /* structure layout version */ uid_t cr_uid; /* effective user id */ short cr_ngroups; /* number of groups */ gid_t cr_groups[XU_NGROUPS]; /* groups */ }; .Ed The .Vt cr_version field should be checked against the .Dv XUCRED_VERSION define. .Pp The credentials presented to the server (the .Xr listen 2 caller) are those of the client when it called .Xr connect 2 ; the credentials presented to the client (the .Xr connect 2 caller) are those of the server when it called .Xr listen 2 .
This mechanism is reliable; there is no way for either party to influence the credentials presented to its peer except by calling the appropriate system call (e.g., .Xr connect 2 or .Xr listen 2 ) under different effective credentials. .Pp To reliably obtain peer credentials on a .Dv SOCK_DGRAM socket refer to the .Dv LOCAL_CREDS socket option. .El .Sh SEE ALSO .Xr connect 2 , .Xr dup 2 , .Xr fcntl 2 , .Xr getsockopt 2 , .Xr listen 2 , .Xr recvmsg 2 , .Xr sendto 2 , .Xr setsockopt 2 , .Xr socket 2 , +.Xr CMSG_DATA 3 , .Xr intro 4 .Rs .%T "An Introductory 4.3 BSD Interprocess Communication Tutorial" .%B PS1 .%N 7 .Re .Rs .%T "An Advanced 4.3 BSD Interprocess Communication Tutorial" .%B PS1 .%N 8 .Re