HomeFreeBSD

ip6_output: Reduce cache misses on pktopts

Description

ip6_output: Reduce cache misses on pktopts

When profiling an IP6 heavy workload, I noticed that we were
getting a lot of cache misses in ip6_output() around
ip6_pktopts. This was happening because the TCP stack passes
inp->in6p_outputopts even if all options are unused. So in the
common case of no options present, pkt_opts is not null, and is
checked repeatedly for different options. Since ip6_pktopts is
large (4 cachelines), and every field is checked, we take 4
cache misses (2 of which tend to be hidden by the adjacent line
prefetcher).

To fix this common case, I introduced a new flag in ip6_pktopts
(ip6po_valid) which tracks which options have been set. In the
common case where nothing is set, this causes just a single
cache miss to load. It also eliminates a test for some options
(if (opt != NULL && opt->val >= const) vs if ((optvalid & flag) !=0 )

To keep the struct the same size in 64-bit kernels, and to keep
the integer values (like ip6po_hlim, ip6po_tclass, etc) on the
same cacheline, I moved them to the top.

As suggested by zlei, the null check in MAKE_EXTHDR() becomes
redundant, and can be removed.

For our web server workload (with the ip6po_tclass option set),
this drops the CPI from 2.9 to 2.4 for ip6_output

Differential Revision: https://reviews.freebsd.org/D44204
Reviewed by: bz, glebius, zlei
No Objection from: melifaro
Sponsored by: Netflix Inc.

Details

Provenance
gallatinAuthored on Mar 20 2024, 7:46 PM
Reviewer
bz
Differential Revision
D44204: ip6_output: Reduce cache misses on pktopts
Parents
rG3a9ddfa1ab46: pkgbase: install all libc test files into the tests package
Branches
Unknown
Tags
Unknown