Fix hangs with processes stuck sleeping on btalloc on i386.

Description

r358097 introduced a problem on i386 where kernel builds intermittently
hang, typically with many processes sleeping on "btalloc".
I know nothing about VM, but received assistance from rlibby@ and markj@.

rlibby@ stated the following:

It looks like the problem is that for systems that do not have
UMA_MD_SMALL_ALLOC, we do
        uma_zone_set_allocf(vmem_bt_zone, vmem_bt_alloc);
but we haven't set an appropriate free function.  This is probably why
UMA_ZONE_NOFREE was originally there.  When NOFREE was removed, it was
appropriate for systems with uma_small_alloc.

So by default we get page_free as our free function.  That calls
kmem_free, which calls vmem_free ... but we do our allocs with
vmem_xalloc.  I'm not positive, but I think the problem is that in
effect we vmem_xalloc -> vmem_free, not vmem_xfree.

Three possible fixes:
 1: The one you tested, but this is not best for systems with
    uma_small_alloc.
 2: Pass UMA_ZONE_NOFREE conditional on UMA_MD_SMALL_ALLOC.
 3: Actually provide an appropriate vmem_bt_free function.

I think we should just do option 2 with a comment, it's simple and it's
what we used to do.  I'm not sure how much benefit we would see from
option 3, but it's more work.

This patch implements option 2. I have not added a comment, since I do
not know what the underlying problem is.
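
As a rough illustration of what option 2 amounts to, the following
user-space sketch selects the zone flags depending on whether
UMA_MD_SMALL_ALLOC is defined. It is not the committed diff: the helper
vmem_bt_zone_flags() is hypothetical, the flag values are stand-ins, and
the use of UMA_ZONE_VM as the base flag is an assumption.

```c
/*
 * Stand-in flag values for a user-space sketch; the kernel defines the
 * real ones in its UMA headers.
 */
#define	UMA_ZONE_VM	0x1u
#define	UMA_ZONE_NOFREE	0x2u

/* Hypothetical helper: compute the flags for the boundary tag zone. */
static unsigned int
vmem_bt_zone_flags(void)
{
	unsigned int flags = UMA_ZONE_VM;	/* assumed base flag */

#ifndef UMA_MD_SMALL_ALLOC
	/* No small allocator on this platform: never release btag slabs. */
	flags |= UMA_ZONE_NOFREE;
#endif
	return (flags);
}
```

When built without UMA_MD_SMALL_ALLOC defined, as is the case on i386, the
returned flags include UMA_ZONE_NOFREE; otherwise they do not.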

markj@ noted the following:

I think the suggested patch is ok, but not for the reason stated.
On platforms without a direct map the problem is: to allocate btags we
need a slab, and to allocate a slab we need to map a page, and to map a
page we need to allocate btags.

We handle this recursion using a custom slab allocator which specifies
M_USE_RESERVE, allowing it to dip into a reserve of free btags.
Because the returned slab can be used to keep the reserve populated,
this ensures that there are always enough free btags available to
handle the recursion.
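
The recursion and the role of the reserve can be modelled with a small
user-space toy. This is only a sketch of the idea described above, not the
kernel code: struct bt_pool, the three functions, and the constants
BT_PER_SLAB and BT_PER_MAP are all hypothetical.

```c
#include <stdbool.h>

#define	BT_PER_SLAB	8	/* hypothetical: btags carved from one slab */
#define	BT_PER_MAP	2	/* hypothetical: btags used to map one page */

struct bt_pool {
	int nfree;	/* free btags currently available */
	int reserve;	/* btags only reserve-using callers may take */
};

/* Take n btags; ordinary callers must leave the reserve untouched. */
static bool
bt_take(struct bt_pool *p, int n, bool use_reserve)
{
	int floor = use_reserve ? 0 : p->reserve;

	if (p->nfree - n < floor)
		return (false);
	p->nfree -= n;
	return (true);
}

/* Refill path: mapping the new slab's page itself consumes btags. */
static bool
bt_slab_alloc(struct bt_pool *p)
{
	if (!bt_take(p, BT_PER_MAP, true))
		return (false);		/* reserve gone: recursion cannot end */
	p->nfree += BT_PER_SLAB;	/* the new slab repopulates the pool */
	return (true);
}

/* Ordinary allocation: fall back to a new slab when the pool runs low. */
static bool
bt_alloc(struct bt_pool *p)
{
	if (bt_take(p, 1, false))
		return (true);
	return (bt_slab_alloc(p) && bt_take(p, 1, false));
}
```

In this model a pool that still holds its reserve can always refill
itself, while a pool whose reserve has been taken away fails in
bt_slab_alloc(), which corresponds to the hang.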

UMA_ZONE_NOFREE ensures that we never reclaim free slabs from the zone.
However, when it was removed, an apparent bug in UMA was exposed:
keg_drain() ignores the reservation set by uma_zone_reserve()
in vmem_startup().
So under memory pressure we reclaim the free btags that are needed to
break the recursion.
That's why adding _NOFREE back fixes the problem: it disables the
reclamation.

We could perhaps fix it more cleverly, by modifying keg_drain() to always
leave uk_reserve slabs available.
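
The difference between the two draining policies can be sketched in a
user-space toy as follows. This is not the real keg_drain(): struct
toy_keg and both functions are hypothetical, and the model counts free
items rather than slabs to keep it short.

```c
struct toy_keg {
	int free_items;	/* items sitting in otherwise unused slabs */
	int reserve;	/* stand-in for the keg's uk_reserve */
};

/* Models the behaviour described above: reclaim everything that is free. */
static int
toy_drain_all(struct toy_keg *k)
{
	int released = k->free_items;

	k->free_items = 0;
	return (released);
}

/* Models the suggested alternative: never drain below the reserve. */
static int
toy_drain_keep_reserve(struct toy_keg *k)
{
	int released = 0;

	if (k->free_items > k->reserve) {
		released = k->free_items - k->reserve;
		k->free_items = k->reserve;
	}
	return (released);
}
```

With 10 free items and a reserve of 4, the first policy leaves nothing
behind, while the second releases 6 items and keeps the 4 that are needed
to break the recursion.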

markj@'s initial patch failed testing, so committing this patch was agreed
upon as the interim solution.
Either rlibby@ or markj@ might choose to add a comment to it.

PR: 248008
Reviewed by: rlibby, markj

Details

Provenance
rmacklem authored on Aug 25 2020, 12:58 AM
Parents
rGeda14cbc264d: Initial import from vendor-sys branch of openzfs