sys/net/if.c
}
VNET_SYSUNINIT(vnet_if_uninit, SI_SUB_INIT_IF, SI_ORDER_FIRST,
    vnet_if_uninit, NULL);

static void
vnet_if_return(const void *unused __unused)
{
	struct ifnet *ifp, *nifp;
	/* We need to protect our access to the V_ifnet tailq. Ordinarily we'd

jmg (unsubmitted): first line of multiline comments should start on the line after the opening /*, so:
/*
 * We…
	 * enter NET_EPOCH, but that's not possible, because if_vmove() calls
	 * if_detach_internal(), which waits for NET_EPOCH callbacks to
	 * complete. We can't do that from within NET_EPOCH.
	 *
	 * However, we can also use the IFNET_xLOCK, which is the V_ifnet

allanjude: is this supposed to be 'x'lock? It would make sense if describing the set of locks, but you mention the write lock specifically

kp: Yeah, that needs to be read/write lock.

	 * read/write lock. We take this lock from within if_vmove() as well,
	 * but it's a recursive lock, so that's safe. We do have to take the
	 * write lock, because we need the write lock in if_vmove().
melifaro: Would it be possible to do it a bit differently?
1) Introduce `if_link_ifnet()` and…
This will solve LORs and avoid having another instance of recursed locks.
kp: That's a good notion, yes.
It doesn't fix the epair related panic, sadly. It's worth keeping in mind though, because we need something to protect the V_ifnet list access (and we don't have anything at all right now).
I'm experimenting with a similar approach, where if_vmove() now assumes it will be called with the IFNET_WLOCK held, and if_detach_internal releases it before in_ifdetach().
Alternatively, I wonder if a 'delete interface' lock would fix this issue as well.
kp: Doing as this suggests, combined with an explicit "destroy the interface" lock, as below, appears to fix the epair/vnet cleanup panic as well:

diff --git a/sys/net/if.c b/sys/net/if.c
index 481a3c23ef8..a5415b6e9a1 100644
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -314,6 +314,9 @@ VNET_DEFINE(struct ifnet **, ifindex_table);
 struct sx ifnet_sxlock;
 SX_SYSINIT_FLAGS(ifnet_sx, &ifnet_sxlock, "ifnet_sx", SX_RECURSE);
 
+struct sx ifnet_del_sxlock;
+SX_SYSINIT(ifnet_del, &ifnet_del_sxlock, "ifnet_del_sx");
+
 /*
  * The allocation of network interfaces is a rather non-atomic affair; we
  * need to select an index before we are ready to expose the interface for
@@ -536,9 +539,11 @@ vnet_if_return(const void *unused __unused)
 	}
 	IFNET_WUNLOCK();
 
+	sx_xlock(&ifnet_del_sxlock);
 	CK_STAILQ_FOREACH_SAFE(ifp, &ifnets, if_link, nifp) {
 		if_vmove(ifp, ifp->if_home_vnet);
 	}
+	sx_xunlock(&ifnet_del_sxlock);
 }
 VNET_SYSUNINIT(vnet_if_return, SI_SUB_VNET_DONE, SI_ORDER_ANY,
     vnet_if_return, NULL);
@@ -2996,8 +3001,11 @@ ifioctl(struct socket *so, u_long cmd, caddr_t data, struct thread *td)
 		goto out_noref;
 	case SIOCIFDESTROY:
 		error = priv_check(td, PRIV_NET_IFDESTROY);
-		if (error == 0)
+		if (error == 0) {
+			sx_xlock(&ifnet_del_sxlock);
 			error = if_clone_destroy(ifr->ifr_name);
+			sx_xunlock(&ifnet_del_sxlock);
+		}
 		goto out_noref;
 	case SIOCIFGCLONERS:

That serialises interface destruction, of course, but that seems like a small price to pay for not panicking half the time you destroy an epair and a vnet.
I'll try to clean up this change and push a separate review for this cleanup lock thing.
melifaro: Yay! It's awesome we have a solution for the panic.
Though, we still need some sort of safety protection for ifnet list traversal in vnet_if_return().
kp: Unfortunately that doesn't appear to be possible. Using IFNET_WLOCK to protect SIOCIFDESTROY produces the LOR we're trying to avoid:

login: lock order reversal:
 1st 0xffffffff81d9dad8 ifnet_sx (ifnet_sx, sx) @ /usr/src/sys/net/if.c:1625
 2nd 0xffffffff81d9fc08 in_multi_sx (in_multi_sx, sx) @ /usr/src/sys/netinet/in.c:1045
lock order ifnet_sx -> in_multi_sx attempted at:
#0 0xffffffff80c4838c at witness_checkorder+0xdcc
#1 0xffffffff80be3a47 at _sx_xlock+0x67
#2 0xffffffff80d7bf14 at in_ifdetach+0x24
#3 0xffffffff80cf03da at if_detach_internal+0x1ea
#4 0xffffffff80cf0031 at if_detach+0x51
#5 0xffffffff82b88d58 at epair_clone_destroy+0x98
#6 0xffffffff80cf759a at if_clone_destroyif+0x19a
#7 0xffffffff80cf7381 at if_clone_destroy+0x1f1
#8 0xffffffff80cf4237 at ifioctl+0x357
#9 0xffffffff80c4d829 at kern_ioctl+0x289
#10 0xffffffff80c4d4ea at sys_ioctl+0x12a
#11 0xffffffff8102715e at amd64_syscall+0x12e
#12 0xffffffff80ffa0ae at fast_syscall_common+0xf8

We do indeed also need to protect the V_ifnet list in vnet_if_return(). Your suggested approach will work for that.
I'll split this patch up in two: one to address the V_ifnet list protection in vnet_if_return() and one to introduce the new ifnet_del_sxlock to fix the epair/vnet cleanup panic.
melifaro: Nit: given both `vmove` and return value are, in fact, `bool`, maybe worth considering declaring them as `bool` explicitly?
	 */
	IFNET_WLOCK();
	/* Return all inherited interfaces to their parent vnets. */
	CK_STAILQ_FOREACH_SAFE(ifp, &V_ifnet, if_link, nifp) {
		if (ifp->if_home_vnet != ifp->if_vnet)
			if_vmove(ifp, ifp->if_home_vnet);
	}
	IFNET_WUNLOCK();
}
VNET_SYSUNINIT(vnet_if_return, SI_SUB_VNET_DONE, SI_ORDER_ANY,
melifaro: It would be better if we could avoid changing `if_link`.
The other callers that can potentially be iterating V_ifnet at the same time may switch to the ifnets list instead, due to the fact that we're changing if_link of the ifp entry.
kp: That's a good point.
I assumed this was safe because the CK_LISTs are safe to modify with concurrent readers and one writer, but that's only the case if you don't add the element to a different list, as you pointed out.
    vnet_if_return, NULL);
#endif
static void *
if_grow(void)
{
	int oldlim;
	u_int n;
melifaro: Sorry, this can potentially be a bit problematic - as we don't protect reading vnet_ifcnt here, we can end up having more interfaces once we get to the IFNET_WLOCK part, which may lead to stack memory corruption.
Also, as we don't know how big vnet_ifcnt will be, I'd be rather cautious and prefer avoiding allocating an unknown amount of space on the stack, as the stack size is limited.
Given IFNET_WLOCK is an sx lock, maybe it's worth allocating memory there explicitly, avoiding the potential vnet_ifcnt change problems.