lagg: Avoid adding a port to a lagg device being destroyed.
ClosedPublic
Actions

Authored by bdrewery on Jun 15 2020, 5:36 PM.

Details

Reviewers

markj
• hselasky
jhb
glebius

Commits

rS364220: lagg: Avoid adding a port to a lagg device being destroyed.

Summary

Also reenable tests that now pass.

The lagg_clone_destroy() handles detach and waiting for ifconfig callers
to drain already.

This fixes 2 panics that the tests triggered. Both were a consequence of
adding a port to the lagg device after it had already detached from all
of its ports. The link state task would run after lagg_clone_destroy()
free'd the lagg softc.

kernel:trap_fatal+0xa4
kernel:trap_pfault+0x61
kernel:trap+0x316
kernel:witness_checkorder+0x6d
kernel:_sx_xlock+0x72
if_lagg.ko:lagg_port_state+0x3b
kernel:if_down+0x144
kernel:if_detach+0x659
if_tap.ko:tap_destroy+0x46
kernel:if_clone_destroyif+0x1b7
kernel:if_clone_destroy+0x8d
kernel:ifioctl+0x29c
kernel:kern_ioctl+0x2bd
kernel:sys_ioctl+0x16d
kernel:amd64_syscall+0x337

kernel:trap_fatal+0xa4
kernel:trap_pfault+0x61
kernel:trap+0x316
kernel:witness_checkorder+0x6d
kernel:_sx_xlock+0x72
if_lagg.ko:lagg_port_state+0x3b
kernel:do_link_state_change+0x9b
kernel:taskqueue_run_locked+0x10b
kernel:taskqueue_run+0x49
kernel:ithread_loop+0x19c
kernel:fork_exit+0x83

PR: 244168
MFC after: 2 weeks

Test Plan

Ran if_lagg_test in a loop for several days.

Diff Detail

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 32723
Build 30167: arc lint + arc unit

Event Timeline

bdrewery created this revision.Jun 15 2020, 5:36 PM

Herald added subscribers: melifaro, ae. · View Herald TranscriptJun 15 2020, 5:36 PM

bdrewery requested review of this revision.Jun 15 2020, 5:36 PM

Harbormaster completed remote builds in B31721: Diff 73144.Jun 15 2020, 5:36 PM

bdrewery added a subscriber: asomers.Jun 15 2020, 5:37 PM

lwhsu added a subscriber: lwhsu.Jun 15 2020, 5:52 PM

I think this is a step in the right direction, but it does not completely fix the problem. sc_destroying is set by lagg_clone_destroy(), which later drops the lock and frees the lagg softc. If lagg_port_create() is dereferencing the softc pointer, then it may be dereferencing freed memory. You could probably trigger a panic by enabling memguard on "lagg" and running your repro.

I suspect that we need to use the net epoch to ensure that if_softc points to valid memory, but I don't have any detailed suggestions at the moment.

Is this change required for any other lagg ioctl handlers?

In D25284#557571, @markj wrote:

I think this is a step in the right direction, but it does not completely fix the problem. sc_destroying is set by lagg_clone_destroy(), which later drops the lock and frees the lagg softc. If lagg_port_create() is dereferencing the softc pointer, then it may be dereferencing freed memory. You could probably trigger a panic by enabling memguard on "lagg" and running your repro.

I suspect that we need to use the net epoch to ensure that if_softc points to valid memory, but I don't have any detailed suggestions at the moment.

I see what you're talking about. It's odd I haven't seen any panic in lagg_ioctl() or lagg_port_create(). memguard for a few hours hasn't helped. I also find it odd that this isn't just fixed globally somehow.

Is this change required for any other lagg ioctl handlers?

Probably.

The way I see other devices handling this is to use a single global sx like this (untested):

diff --git sys/net/if_lagg.c sys/net/if_lagg.c
index 01d1fba16b76..a6c4681b60bb 100644
--- sys/net/if_lagg.c
+++ sys/net/if_lagg.c
@@ -95,6 +95,9 @@ extern void   nd6_setmtu(struct ifnet *);
 #define        LAGG_SXLOCK_ASSERT(_sc) sx_assert(&(_sc)->sc_sx, SA_LOCKED)
 #define        LAGG_XLOCK_ASSERT(_sc)  sx_assert(&(_sc)->sc_sx, SA_XLOCKED)

+static struct sx lagg_ioctl_sx;
+SX_SYSINIT(lagg_ioctl_sx, &lagg_ioctl_sx, "lagg_ioctl");
+
 /* Special flags we should propagate to the lagg ports. */
 static struct {
        int flag;
@@ -597,6 +600,9 @@ lagg_clone_destroy(struct ifnet *ifp)

        ifmedia_removeall(&sc->sc_media);
        ether_ifdetach(ifp);
+       sx_xlock(&lagg_ioctl_sx);
+       ifp->if_softc = NULL;
+       sx_xunlock(&lagg_ioctl_sx);
        if_free(ifp);

        LAGG_LIST_LOCK();
@@ -1180,7 +1186,7 @@ lagg_stop(struct lagg_softc *sc)
 static int
 lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
 {
-       struct lagg_softc *sc = (struct lagg_softc *)ifp->if_softc;
+       struct lagg_softc *sc;
        struct lagg_reqall *ra = (struct lagg_reqall *)data;
        struct lagg_reqopts *ro = (struct lagg_reqopts *)data;
        struct lagg_reqport *rp = (struct lagg_reqport *)data, rpbuf;
@@ -1194,6 +1200,13 @@ lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)

        bzero(&rpbuf, sizeof(rpbuf));

+       sx_slock(&lagg_ioctl_sx);
+       sc = (struct lagg_softc *)ifp->if_softc;
+       if (sc == NULL) {
+               error = ENXIO;
+               goto out;
+       }
+
        switch (cmd) {
        case SIOCGLAGG:
                LAGG_XLOCK(sc);
@@ -1593,6 +1606,8 @@ lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t data)
                error = ether_ioctl(ifp, cmd, data);
                break;
        }
+out:
+       sx_sunlock(&lagg_ioctl_sx);
        return (error);
 }

Other examples use xlock in the ioctl handler but I think only shared is needed here.

In D25284#557606, @bdrewery wrote:

In D25284#557571, @markj wrote:

Is this change required for any other lagg ioctl handlers?

Probably.

The way I see other devices handling this is to use a single global sx like this (untested):

I think that could work. You might use a reference counter in the softc instead of holding the lock for the entire lagg_ioctl() body.

Anyway, I have no objection to the proposed patch if you include an XXX comment noting that a race still exists. Since the change narrows the race in a useful way, and isn't complicated, I think it is worth having.