Page MenuHomeFreeBSD

This KASSERT is overzealous because of the following race condition:
ClosedPublic

Authored by mlaier on Feb 10 2021, 1:02 AM.

Details

Summary
 1) A managed page which is currently in PQ_LAUNDRY is freed.
    vm_page_free_prep calls vm_page_dequeue_deferred()

    The page state is:
       PQ_LAUNDRY, PGA_DEQUEUE|PGA_ENQUEUED

 2) The laundry worker comes around and pick up the page and calls
    vm_pageout_defer(m, PQ_LAUNDRY, true) to check if page is still in the
    queue.  We do a vm_page_astate_load and get
       PQ_LAUNDRY, PGA_DEQUEUE|PGA_ENQUEUED
    as per above.

 3) The laundry worker is pre-empted and another thread allocates our page
    from the free pool.  For example vm_page_alloc_domain_after calls
    vm_page_dequeue() and sets VPO_UNMANAGED because we are allocating for
    an OBJT_UNMANAGED object.

    The page state is:
       PQ_NONE, 0 - VPO_UNMANAGED

 4) The laundry worker resumes, and processes vm_pageout_defer based on the
    stale astate which leads to a call to vm_page_pqbatch_submit, which will
    trip on the KASSERT.

Sponsored by:	Dell EMC Isilon

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Forgot to add:

There is no negative fallout from this. When we eventually get into vm_pqbatch_process_page we reestablish the astate and atomically process any outstanding operations - if they still apply. In the case above, we would end in a no-op.

Thanks. This assertion is left over from when there was tighter synchronization. I guess there are not so many scenarios where we frequently allocate unmanaged pages from VM_FREEPOOL_DEFAULT, so this went unnoticed in stress testing.

This revision is now accepted and ready to land.Feb 10 2021, 1:36 AM
rlibby added inline comments.
sys/vm/vm_page.c
3547–3550

I think the assert would be valid here in vm_pqbatch_process_page()? Since here we know that we are (or were just) on a page queue, and we hold the pagequeue lock, and at least in vm_page_alloc_domain_after() we do the vm_page_dequeue() first and the assignment of oflags after that.

(Incidentally I think this assert here is also a little messed up, although harmless. The first condition is always true because we just established above that old.queue == queue, and we asserted at the top that queue < PQ_COUNT and we could _Static_assert(PQ_COUNT < PQ_NONE, ...).)

sys/vm/vm_page.c
3547–3550

I agree with both of these statements. In other words, we should

  • remove this assertion, because it's always true, and
  • assert that the page is managed since at this point the page queue lock ensures that the page will not be allocated as an unmanaged page
This revision now requires review to proceed.Feb 10 2021, 6:43 PM
This revision is now accepted and ready to land.Feb 10 2021, 6:47 PM

Should I commit this? No problem at all if one of you were planning to do it.