Page MenuHomeFreeBSD

pflog: create bpf tapping points without ifnet(9)
AcceptedPublic

Authored by kp on Thu, Jun 25, 7:32 PM.

Details

Reviewers
glebius
Group Reviewers
pfsense
network
Summary

Just as was done for ipfw's log device stop creating entire struct
ifnet's for pflog devices. Create only a bpf_t.

This does mean we lose the create/destroy infrastructure provided by the
clone interface. Rather than implement this ourselves we allow users to
configure the number of pflog interfaces using the net.pflog.if_count
sysctl. We default to 8 devices, but allow up to 256.

While it was possible to rename pflog devices pfctl expected the pflogX
name, so it's safe to assume users never did this.

Requested by: gleb
Sponsored by: Rubicon Communications, LLC ("Netgate")

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 74317
Build 71200: arc lint + arc unit

Event Timeline

kp requested review of this revision.Thu, Jun 25, 7:32 PM
share/man/man4/pflog.4
89

My only concern is racyness of sysctl vs sysctl. The node and proc are marked CTLFLAG_MPSAFE, so they can execute in parallel. bpf_attach() doesn't have protection against duplicates, so we will leak one.

I don't know, should we make the bpf(4) smarter or this code. Could there be other tapping facilities that want to bpf(4) to handle name collisions? What's your opinion?

sys/netpfil/pf/if_pflog.c
113

CTLTYPE_U8 shall have format "CU"

In the commit message use glebius@, not gleb@. That's not me.

My only concern is racyness of sysctl vs sysctl. The node and proc are marked CTLFLAG_MPSAFE, so they can execute in parallel. bpf_attach() doesn't have protection against duplicates, so we will leak one.

Ah, yeah, that's an issue.
I could remove the MPSAFE, but we probably don't want to rely on Giant, so that's not great either.

I don't know, should we make the bpf(4) smarter or this code. Could there be other tapping facilities that want to bpf(4) to handle name collisions? What's your opinion?

Given that pflog also wants V_npflogifs to be correct and consistent I don't think we can fix this in bpf(4).

It's pretty trivial to add a sysctl mtx, and it doesn't really cost anything. Although that would leave a race where one thread could be logging to V_pflogifs[1].pflog_bpf, while a sysctl update removes it.
So perhaps we need an rmlock, but .. no, that could just do NET_EPOCH. With a lock in sysctl to prevent simultaneous updates.

  • locking fixes
  • man page improvement
  • fix pflogd to not look for a network interface
  • have /etc/rc.d/pflogd create extra log devices if required

bpf_attach() uses M_WAITOK (that's why it doesn't fail). This lock should be sx(9). IMHO, it should be asserted both in call to pflog_create() and pflog_destroy(). The startup calls to pflog_create() should also acquire the lock, just for consistency.

P.S. This problem gave me idea that we might want a generic locked sysctl oids. Basically a lock per oid, can be external or can be automatic. I guess many of those sysctl oids that are still under Giant can be easily converted, as they don't need a lock over all of the kernel, they just need protection against self.

Do you happen to know of a way to get an interface (struct ifnet) without AF_INET6? I used to use pflog0 for this, and now clearly can't any more. (The test for PR 288263 does that.)
Ideally I'd like to not remove the test, but without such an interface there's no point to it.

In D57851#1327098, @kp wrote:

Do you happen to know of a way to get an interface (struct ifnet) without AF_INET6? I used to use pflog0 for this, and now clearly can't any more. (The test for PR 288263 does that.)
Ideally I'd like to not remove the test, but without such an interface there's no point to it.

The whole point of the change, that now (after pfsync0 de-ifnetting) you can assert that any interface has IPv6 context. Lots of checks for that in netinet6 go away.

sys/netpfil/pf/if_pflog.c
108–111

You don't need this. bpf_detach() provides the deferred epoch free of bpf_if.

136–137

bpf_attach() doesn't fail.

155–185

bpf_detach() provides needed epoch deferral.

278–281

Why do you think this is needed?

pouria added inline comments.
share/man/man4/pflog.4
26
sys/netpfil/pf/if_pflog.c
288

Does this still need to be in SI_SUB_INIT_IF order? We may also want to use SI_SUB_PROTO_IF.

sys/netpfil/pf/if_pflog.c
288

Maybe just chain the init from the overall pf(4) init without a separate sysinit entry?

sys/netpfil/pf/if_pflog.c
288

+1

sys/netpfil/pf/if_pflog.c
106
194

Can we use bool here?

Remove epoch cleanup.
Update failing test case to use pfsync0 iso. pflog0 (for now).

In D57851#1327098, @kp wrote:

Do you happen to know of a way to get an interface (struct ifnet) without AF_INET6? I used to use pflog0 for this, and now clearly can't any more. (The test for PR 288263 does that.)
Ideally I'd like to not remove the test, but without such an interface there's no point to it.

The whole point of the change, that now (after pfsync0 de-ifnetting) you can assert that any interface has IPv6 context. Lots of checks for that in netinet6 go away.

Okay, great. I'll change it to use pfsync for now, and once that one's gone I'll delete the test and change the check for AF_INEt6 in in6_getlinkifnet() to an assertion.

sys/netpfil/pf/if_pflog.c
194

Potentially, maybe.

I don't want to change that in this commit though, it's not related to what we're doing here.

(And I'm not entirely sure I want to do refactoring like that as it moves us a bit further from OpenBSD. We already have to manually apply patches, so we could, but I'm not sure it's worth making it - a little - harder.)

278–281

I ran into a couple of crashes that seemed to be fixed by using epoch to delay the free of the bpf_if, and ensuring that it was really free by the time we went away.
Or at least, I thought I did, because they didn't re-appear when I just removed it.

288

We can't.

pflog is a separate kernel module and might not be loaded when we load pf. It can still be loaded later, so it needs its own VNET_SYSINIT.

sys/netpfil/pf/if_pflog.c
288

Then it should be something around SI_SUB_PROTO_FIREWALL. But better make that a separate change.

This revision is now accepted and ready to land.Fri, Jun 26, 10:25 PM