Page MenuHomeFreeBSD

ifnet: if_detach(): Fix races with if_vmove()
ClosedPublic

Authored by zlei on Apr 13 2026, 10:00 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, May 12, 9:06 PM
Unknown Object (File)
Tue, May 12, 3:36 AM
Unknown Object (File)
Mon, May 11, 7:39 PM
Unknown Object (File)
Mon, May 11, 6:13 PM
Unknown Object (File)
Sun, May 10, 3:10 AM
Unknown Object (File)
Sun, May 10, 3:05 AM
Unknown Object (File)
Sat, May 9, 3:17 PM
Unknown Object (File)
Sat, May 9, 2:42 PM

Details

Summary

The rationality is that the driver private data holds a strong reference
to the interface, and the detach operation shall never fail. Given the
vmove operation is not atomic and spans multiple steps, acquire
ifnet_detach_sxlock only for if_detach_internal() and if_vmove() is not
sufficient. It is possible that the thread running if_detach() sees
stale vnet, or the vmoving is in progress, hence if_unlink_ifnet() can
fail.

Fix that by extending coverage of ifnet_detach_sxlock a bit to cover
the entire detach and vmove operations.

Since it is an error to fail to detach, and if_detach() is a public KPI,
prefer panic() instead of assertion, to indicate the error explicitly.
That shall prevent potential races with vmove operations ( concurrent
if_detach_internal() and if_attach_internal() ) and the status of the
interface get corrupted, which is harder to diagnose.

PR: 292993
MFC after: 2 weeks

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

zlei requested review of this revision.Apr 13 2026, 10:00 AM

This is still not sufficient to fix 292993. The thread running ether_ifdetach() / bpfdetach() is still racing with those running if_vmove() / bpf_vmove() . Ideally bpfdetach() shall run after if_detach(), but that is a behavior change, so I will later post a workaround for that.

I have WIP to overhaul the if_attach() and if_detach(). But that is a longer task and I think a workaround shall fit right now.

This is still not sufficient to fix 292993. The thread running ether_ifdetach() / bpfdetach() is still racing with those running if_vmove() / bpf_vmove() . Ideally bpfdetach() shall run after if_detach(), but that is a behavior change, so I will later post a workaround for that.

See D56375.

I have WIP to overhaul the if_attach() and if_detach(). But that is a longer task and I think a workaround shall fit right now.

I really don't want to see overhaul of if_attach() and if_detach() in favor of fixing if_vmove().

This is still not sufficient to fix 292993. The thread running ether_ifdetach() / bpfdetach() is still racing with those running if_vmove() / bpf_vmove() . Ideally bpfdetach() shall run after if_detach(), but that is a behavior change, so I will later post a workaround for that.

See D56375.

I have WIP to overhaul the if_attach() and if_detach(). But that is a longer task and I think a workaround shall fit right now.

I really don't want to see overhaul of if_attach() and if_detach() in favor of fixing if_vmove().

Those are WIP for quite a long time. They're not directly related to if_vmove(), but to the routines and bpf. For example there is still a little window for the data path to see a NULL if->if_bpf. The first report ( IIRC ) is https://lists.freebsd.org/archives/freebsd-net/2025-May/006817.html .

They're not directly related to if_vmove(), but to the routines and bpf. For example there is still a little window for the data path to see a NULL if->if_bpf.

Yep, that's not related to if_vmove. This is a generic problem with interfaces going away. The old mechanism was the if_dead, and it is still doing a large part of the work. The modern strategy is that everything that hangs off ifnet shall be freed via epoch_call(9) and can be safely dereferenced in the detached state. In such state we don't need to be able to do any actual work - just avoid panics and memory leaks. The epoch-delayed free is true for the struct ifnet itself in all branches. This is also true for if_inet and if_inet6 in CURRENT. Not yet true for if_bpf. So, the way to go is to work on proper epoch-protected free of if_bpf and other structures that hang directly off ifnet, or indirectly (via if_inet, if_inet6, etc). Eventually the if_dead can be safely removed.

Hi @kp @glebius , I added two tests , see D56606 and D56609 .

I have run them for days and I believe this patch ( along with some previous commits related to vmove operations ) is sufficient to fix PR 292993 on stable/15 and stable/14. I'd like to have this fix in stable/15 before 15.1-BETA, if no objections.

No objections! Good plan for the stable branches. Thanks!

This revision is now accepted and ready to land.Fri, Apr 24, 3:25 AM