Certain socket-based workloads in the system may require significantly
larger socket buffers than the rest of the sockets. To accommodate these
workloads without raising the system-wide limit, which would weaken system
protection, allow an easy override. A special socket option,
SO_RCVBUFFORCE, requiring the PRIV_IPC_MSGSIZE privilege, sets the maximum
socket receive buffer size, ignoring the sb_max (kern.ipc.maxsockbuf) value.
This option has existed on Linux since 2.6.14.
The primary use case (for us) is netlink sockets, which can receive large dumps from the kernel. A current full-view IPv4 dump for a single FIB is about 40 megabytes, but it can grow notably when wide multipath is in use. Large-scale customers set the buffer to ~128M.
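
For illustration, a userland consumer might request the larger buffer roughly as follows. This is a minimal sketch, not code from this change: the helper names `set_rcvbuf()` and `get_rcvbuf()` are hypothetical, and the `#ifdef` guard assumes `SO_RCVBUFFORCE` is only visible in `<sys/socket.h>` on systems that provide it (Linux today, FreeBSD once this lands). When the forced option is missing or the caller lacks the privilege, the sketch falls back to the ordinary, limit-checked `SO_RCVBUF`.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: try the privileged override first, then fall
 * back to plain SO_RCVBUF, which is still subject to the system-wide
 * limit. Returns 0 on success, -1 with errno set on failure.
 */
static int
set_rcvbuf(int fd, int bytes)
{
#ifdef SO_RCVBUFFORCE
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE,
	    &bytes, sizeof(bytes)) == 0)
		return (0);
	/* Missing privilege or unsupported option: use the fallback. */
#endif
	return (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
	    &bytes, sizeof(bytes)));
}

/*
 * Hypothetical helper: read back the size the kernel actually applied.
 * Returns the size in bytes, or -1 with errno set on failure.
 */
static int
get_rcvbuf(int fd)
{
	int bytes = 0;
	socklen_t len = sizeof(bytes);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, &len) != 0)
		return (-1);
	return (bytes);
}
```

A netlink consumer would call something like `set_rcvbuf(fd, 128 * 1024 * 1024)` before requesting a dump, then check `get_rcvbuf(fd)` to see what was granted. The reported value can differ from the request: Linux, for example, doubles the requested size to account for bookkeeping overhead.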
Questions:
* Should we use the existing `PRIV_IPC_MSGSIZE` privilege or create a new one?
* `PRIV_IPC_MSGSIZE` is allowed to be set in jails. Are we okay with that?
* Should there be an absolute maximum on the buffer size (e.g. another sysctl set to, say, 1 gigabyte, or a number correlated with the total amount of mbufs)?