- Rearrange struct inpcb fields to optimize the TCP output code path
with respect to cache line hits and misses. Put the lock and hash list
glue into the first cache line, and put inp_refcount, inp_flags and
inp_socket into the second cache line.
This has been tested at Netflix.
- When zeroing inpcb on allocation, zero everything except the lock.
Before this change inp_gencnt, inp_lle and inp_rtu were not zeroed. Not
zeroing inp_gencnt doesn't seem to make any sense, and not zeroing the
lle/rtu data was definitely a bug.
- Reduce in_pcbinfo_init() by two parameters. No users supply any flags
to this function (they used to pass UMA_ZONE_NOFREE), so the flags
parameter goes away. The zone_fini parameter also goes away. Previously
no protocols (except divert) supplied a zone_fini function, so inpcb
locks were leaked with slabs. This was okay while zones were allocated
with the UMA_ZONE_NOFREE flag, but now it is a leak. Fix that by
supplying the inpcb_fini() function as the fini method for all inpcb zones.
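
The "zero all except lock" rule in the second item can be illustrated
with a minimal userland sketch. It is not the kernel code: struct
pcb_sketch, its field names and pcb_sketch_zero() are hypothetical
stand-ins chosen for illustration, and the only assumption is that the
lock is placed ahead of the first field that has to be cleared.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct inpcb. Names and types are
 * illustrative only and do not match the kernel definition.
 */
struct pcb_sketch {
	long		lock_word;	/* stands in for the lock; must survive reuse */
	/* Everything from 'refcount' to the end is cleared on allocation. */
	unsigned int	refcount;
	int		flags;
	void		*socket;
	unsigned long	gencnt;
};

/*
 * Clear every field that follows the lock and leave the lock untouched.
 * The range starts at the first field to be zeroed and runs to the end
 * of the structure, so fields added later are covered automatically.
 */
static void
pcb_sketch_zero(struct pcb_sketch *p)
{
	const size_t off = offsetof(struct pcb_sketch, refcount);

	memset((char *)p + off, 0, sizeof(*p) - off);
}
```

Deriving the range from the structure layout, rather than clearing
fields one by one, is what keeps a newly added field from being left
out of the zeroing by accident.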