Page MenuHomeFreeBSD

Avoid /sbin/sysctl net.inet.tcp.hostcache.list from stalling when under memory pressure
ClosedPublic

Authored by rscheff on Mar 28 2021, 4:06 PM.

Details

Summary

While looking how to gracefully deprecate a sysctl (to return a hint,
what the new sysctl is called), I found that /sbin/sysctl will call a
SYSCTL_PROC twice for reads, or five times for changing a value.

The 1st and 4th call are to check what buffersize to allocate, before
the actual content is retrieved (and shown).

In tcp_hostcache_list, sbuf was used in a way, which would need
a large (~2MB) blocking allocation of memory (M_WAITOK), which
may not happen for some time on systems having a full hostcache
(>15k entries), and/or kernel memory fragementation (uptime).

Replacing the output there with the ready-made functions which
allocate much less kernel sbuf memory, and repeatedly drain this
to userspace if need be, to prevent any apparent "stuck" sbin/sysctl
calls.

PR: 254333
MFC after: 2 weeks (stable/13, stable/12)

Test Plan

Populated the hostcache with about 450 entries, so that
a hostcache.list will return around 41 kB.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

rscheff retitled this revision from prevent tcp_hostcache_list from stalling, when insufficient kernel heap memory is available. to Avoid /sbin/sysctl net.inet.tcp.hostcache.list from stalling when under memory pressure.Mar 28 2021, 4:17 PM
rscheff edited the summary of this revision. (Show Details)
rscheff edited the test plan for this revision. (Show Details)
sys/netinet/tcp_hostcache.c
631

Setting error to 0 is superfluous from what I can see.

I don't see how this can help. Instead of allocating in kernel a buffer large enough, you allocate a small one and then for almost every output line you allocate a larger one, copy over then contents and free the smaller one. At the end, you have to allocate the same large buffer as in the old solution. You just replace one allocation by n allocation, were n is in the order of the number of lines.

I don't think this helps in allocating, but it definitely hurst in performance, I think.

Unless I am mistaken, sbuf_new_for_sysctl does NOT set SBUF_AUTOEXTEND. It rather sets the drain function (SYSCTL_OUT), and sbuf_put_bytes will call that function rather than extending the buffer (even if SBUF_AUTOEXTEND would be set).

So, at least from the sbuf infrastructure, this seems to be the right call to make. If the SYSCTL_OUT is doing re-allocs and copy-overs (especially when the caller has not provided any buffer, but I believe in that instance, it will actually only count up the bytes needed, but never actually copy memory), i think thats needed to be fixed there. If the buffer provided to SYSCTL_OUT is large enough to hold the data, only small chunks should be copied over repeatedly, I think.

rscheff edited the test plan for this revision. (Show Details)
  • remove superfluous initialization.

You are absolutely right. Using sbuf_new_for_sysctl() allows the operation to succeed with using only a small buffer. I sent a patch via e-mail, which I used for testing and which reduces the number of whitespace changes, add a simplified way of calculating the size. Up to you what you take from it.

  • improve buffersize query from sbin/sysctl
  • reduce whitespace churn (Thanks to @tuexen for improving the initial patch)
This revision is now accepted and ready to land.Mar 28 2021, 9:06 PM