Page MenuHomeFreeBSD

pf: Fix possible shutdown race
ClosedPublic

Authored by kp on Mar 16 2017, 12:49 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Dec 20, 8:18 AM
Unknown Object (File)
Fri, Dec 20, 7:39 AM
Unknown Object (File)
Dec 20 2024, 1:59 AM
Unknown Object (File)
Nov 22 2024, 11:42 AM
Unknown Object (File)
Nov 4 2024, 1:45 PM
Unknown Object (File)
Sep 30 2024, 1:30 PM
Unknown Object (File)
Sep 16 2024, 1:43 AM
Unknown Object (File)
Sep 14 2024, 8:19 AM
Subscribers

Details

Summary

Prevent possible races in the pf_unload() / pf_purge_thread() shutdown
code. Lock the pf_purge_thread() with the new pf_end_lock to prevent
these races.

Use a shared/exclusive lock, as we need to also acquire another sx lock
(VNET_LIST_RLOCK). It's fine for both pf_purge_thread() and pf_unload()
to sleep,

Pointed out by: eri, glebius, jhb

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

You don't need sx here as you don't need to hold a lock around all cycle. You need to hold lock only when you update pf_end_threads and sleep/wakeup.

You don't need sx here as you don't need to hold a lock around all cycle. You need to hold lock only when you update pf_end_threads and sleep/wakeup.

Wouldn't that run the risk of sending the wakeup from pf_unload() while pf_purge_thread() is running, thus going back to sleep for 10 seconds before it terminates?

sys/netpfil/pf/pf.c
1433 ↗(On Diff #26313)

It would seem simpler to reuse the existing VNET_LIST lock. I would also probably restructure this like so:

    VNET_LIST_RLOCK();
    while (pf_end_threads == 0) {
        VNET_LIST_SLEEP(pf_purge_thread, ....);
        /* existing code, but don't need pf_end_threads check or RUNLOCK(). */
    }
    VNET_LIST_RUNLOCK();
    kproc_exit(0);
}

This does require adding a VNET_LIST_SLEEP() macro to vnet.h, but that seems cleaner than doing an explicit sx_sleep() on &vnet_sxlock given the VNET_LIST_RLOCK/RUNLOCK macros.

sys/netpfil/pf/pf_ioctl.c
3743 ↗(On Diff #26313)

You need to add a 'static struct proc *pf_purge_proc;' global to this file and change this line to:

error = kproc_create(pf_purge_thread, NULL, &pf_purge_proc, 0, 0, "pf purge");
3796 ↗(On Diff #26313)

Here you would use VNET_LIST_WLOCK() and sleep on the proc address:

VNET_LIST_WLOCK();
pf_end_threads = 1;
wakeup(pf_purge_thread);
VNET_LIST_SLEEP(pf_purge_proc, 0, "pftmo", 0);
VNET_LIST_WUNLOCK();

In particular, sleeping on pf_end_threads is still subject to a race that can lead to a panic. Namely, pf_purge_thread() might be preempted after it has incremented pf_end_threads but before it has finished exiting from the function. Then pf_unload() might resume and complete and the pf.ko module would be removed from the kernel (including freeing its memory, etc.). When the pf purge thread resumed at the tail end of pf_purge_thread() it would then panic because it would be trying to execute instructions at an invalid virtual address. This is why you have to sleep on the process or thread structure as that is only awoken from within kproc_exit or kthread_exit at which point you know that the associated kthread is no longer executing instructions in a module.

I see your point about the remaining race and will try to work up a fix according to your suggestion.

I don't think the VNET locks are suitable though.
They don't exist if VIMAGE is not set (which is still the default), so we'd end up not having a lock at all.

Ensure pf_unload() can't continue until pf_purge_thread() is fully done.

This revision is now accepted and ready to land.Mar 21 2017, 8:19 PM
This revision was automatically updated to reflect the committed changes.