Page MenuHomeFreeBSD

bpf: Fix incorrect cleanup
ClosedPublic

Authored by kp on Jul 30 2017, 12:01 PM.
Referenced Files
Unknown Object (File)
Aug 19 2025, 7:25 PM
Unknown Object (File)
Aug 10 2025, 9:35 PM
Unknown Object (File)
Aug 9 2025, 7:28 PM
Unknown Object (File)
Aug 8 2025, 3:40 PM
Unknown Object (File)
Aug 6 2025, 9:18 PM
Unknown Object (File)
Aug 6 2025, 5:58 PM
Unknown Object (File)
Jul 27 2025, 12:51 AM
Unknown Object (File)
Jul 26 2025, 11:49 AM
Subscribers

Details

Summary

Cleaning up a bpf_if is a two stage process. We first move it to the
bpf_freelist (in bpfdetach()) and only later do we actually free it (in
bpf_ifdetach()).

We cannot set the ifp->if_bpf to NULL from bpf_ifdetach() because it's
possible that the ifnet has already gone away, or that it has been assigned
a new bpf_if.
This can lead to a struct ifnet which is up, but has if_bpf set to NULL,
which will panic when we try to send the next packet.

Keep track of the pointer to the bpf_if (because it's not always
ifp->if_bpf), and NULL it immediately in bpfdetach().

PR: 213896

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

In D11782#244283, @bz wrote:

Is this a consequence of https://svnweb.freebsd.org/base?view=revision&revision=297816 or independent of it?

It's related in the sense that r297816 makes it easier to trigger this. I've not gone through all of the bpf users, but basically whenever an ifnet removes its bpf and gets a new one there's a chance that'll end up triggering this panic.
I think it's very much bpf doing the wrong thing here. It shouldn't asynchronously change things in the ifnet.

I think your description could be a bit more clear as I got confused. I think after re-reading it I now understand what you mean. Can you try to describe the sequence of events step by step here on how this gets triggered (especially how the new entry appears before we set it to NULL overwriting a new one?)?

Hmm, good question. I thought I understood the failure flow fully, but now I'm not so sure.

What I thought happened was this:
When we shut down a vnet jail with an epair in it it gets moved back to the host (through vnet_if_return() and if_vmove()).
if_vmove() detaches the if_bpf, which puts it on the bpf_freelist.
Later in if_vmove() we allocate a new if_bpf. When the kernel gets around to actually deleting it (in bpf_ifdetach()) it clobbers the if->if_bpf pointer, which in turn triggers a panic the next time the affected interface tries to transmit a packet (when it calls BPF_MTAP()).

The problem is that either I'm missing something, or the comment in if_vmove() is wrong, because I don't see how the if_bpf gets freed through if_detach_internal(). It should only end up in bpf_ifdetach() (and not in bpfdetach()).
The only other scenario I can picture is that, because the test scenario creates and destroys interfaces quickly we end up re-using a freed struct ifnet and the bpf_ifdetach() clobbers a pointer in freed memory that got re-used for another struct ifnet. I'm not sure how plausible that is.

I'm still convinced the 'ifp->if_bpf = NULL;' in bpf_ifdetach() is wrong, and that's backed up by the fact that with this patch we no longer panic as reported in PR 213896.
On the other hand, we should be sure we fully understand the problem too...

I've done a bit more testing, and these debug traces tell the story:

KP: if_alloc() = 0xfffff80016f22800
KP: bpf_if 0xfffff80016b6d280 freed. ifp 0xfffff80016f22800 ifp->if_bpf = 0xfffff80016b6d280
KP: if_free_internal(0xfffff80016f22800)
KP: if_alloc() = 0xfffff80016f22800
KP: if_free_internal(0xfffff80016f22800)
KP: if_alloc() = 0xfffff80016f22800
KP: if_free_internal(0xfffff80016f22800)
KP: if_alloc() = 0xfffff80016f22800
KP: if_free_internal(0xfffff80016f22800)
KP: if_alloc() = 0xfffff80016f22800
KP: bpf_if 0xfffff8005ebfc500 freed. ifp 0xfffff80016f22800 ifp->if_bpf = 0xfffff8005ebd3c80
KP: bpf_if 0xfffff8005ebfc800 freed. ifp 0xfffff80016f22800 ifp->if_bpf = 0xfffff8005ebd3c80
KP: bpf_if 0xfffff80016a72800 freed. ifp 0xfffff80016f22800 ifp->if_bpf = 0xfffff8005ebd3c80

We end up allocating and freeing multiple ifnets on the same address before the cleanup code in bpf_ifdetach actually runs, so we clobber a completely unrelated ifnet while freeing the bpf_if.
The proposed fix should be fine, because it moves the act of writing into the ifnet (to set the if_bpf pointer to NULL) out of the delayed section and into a method where the ifnet still exists.

I'd say go ahead for now; seems to make life better at least.

This revision is now accepted and ready to land.Aug 13 2017, 12:13 PM
This revision was automatically updated to reflect the committed changes.