Page MenuHomeFreeBSD

pfsync: Prepare code to accommodate AF_INET6 family
ClosedPublic

Authored by email_luiz.eng.br on Aug 21 2022, 7:00 PM.

Details

Summary

I am currently working on bringing IPv6 support to pfsync. This required some changes to allow for differentiating between the two families in a more generic way.

As I am still ironing out the last issues with the IPv6 support, I figured it would be a nice idea to submit the non-IPv6 part of changes for review.

I am happy to get any suggestions on how to improve this, as this is my first big project in the FreeBSD kernel.

So far, I have tested it by just setting up pfsync between two hosts and ensuring states get synced between them.

Sponsored by: InnoGames GmbH

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sbin/ifconfig/ifpfsync.c
151

We prefer POSIX memset() over bzero() in new code.

sys/netpfil/pf/if_pfsync.c
92

What did bring in this requirement?

I still need to look at the rest in more detail, but we have to answer the ABI (to userspace) / protocol (for pfsync peers) change questions before we go any further here.

sys/net/if_pfsync.h
249

It looks like we're (going to be) changing the size of struct pfsyncreq here. Have you thought about how to cope with the ABI breakage that entails?

sys/netpfil/pf/if_pfsync.c
118

Same question here. If we change the size of pfsync_pkt we're going to be changing the on-wire protocol, and that's going to break sync between versions with and without IPv6 support.

sys/net/if_pfsync.h
249

I did not. Not on purpose, but because I made the change without being aware of the implications.

Should we resort to having one pfsyncreq struct per AF? We would then need to find another way to figure out the AF being used, as the current code relies on the sockaddr in pfsync_sockaddr.

sys/netpfil/pf/if_pfsync.c
92

What did bring in this requirement?

92

It doesn't belong on this diff, ended up here by accident.

With the IPv6 code in place, it's needed in order to have RT_DEFAULT_FIB for a in6_selectsrc_addr() call.

118

Same as the previous case, I did the change without being aware of the implications.

I honestly don't have an idea on how to deal with it. I welcome any suggestions.

Dealing with the struct pfsyncreq won't be too very painful. We'll have to introduce a new version that takes a struct pfsyncreq_v2 which can contain the IPv6 address.
Or, even better, we can do what we've been doing to a bunch of pf ioctl calls, and pass an nvlist instead. That'll make it much easier to extend in the future.

There are a lot of examples in sys/netpfil/pf/pf_ioctl.c. DIOCCLRSTATESNV is a decent place to start looking at that code.

The struct pfsync_pkt change.. well, I might actually be wrong here. It doesn't actually look like that hits the wire. Instead it looks to be a way to carry metadata (i.e. data about received pfsync packets) around. So we could change that easily. It does also appear as if we only ever use the flags field, so maybe we should also remove struct pfsync_pkt and replace it with a uint8_t flags or something.

OpenBSD did that, so it might be good to adopt their change. The patch won't apply cleanly, so it'll be a manual process, but in general I prefer not needlessly deviating from what they've done. Of course, if there's a good reason to do things differently we can.

commit 715f5bf131320c2aafedcfbf83423286535a1746
Author: dlg <dlg@openbsd.org>
Date: Mon Nov 29 05:31:38 2010 +0000

get rid of struct pfsync_pkt. it was used to store data on the stack to
pass to all the submessage handlers, but only the flags part of it was
ever used. just pass the flags directly instead.

Rebase
Replace bzero() with memset() in new code

In D36277#824065, @kp wrote:

Or, even better, we can do what we've been doing to a bunch of pf ioctl calls, and pass an nvlist instead. That'll make it much easier to extend in the future.

Does nvlist convert to net byte order internally?

In D36277#824065, @kp wrote:

Or, even better, we can do what we've been doing to a bunch of pf ioctl calls, and pass an nvlist instead. That'll make it much easier to extend in the future.

Does nvlist convert to net byte order internally?

It doesn't, but I'm also not sure why that'd matter. It should be safe enough to assume that kernel and user space use the same endianness (and old-style struct passing ioctls also have implicit endianness assumptions).
It would be relevant if we were going to use nvlists in the on-wire protocol, but I think we've concluded that we don't actually need to (and certainly don't want to) change that.

My bad, misunderstood that we are talking about ioctl struct, not on wire. Sorry.

I am currently working on the implementation using nvlists.

One of the struggles I'm having is deciding what to use to exchange data between pfsyncioctl() in if_pfsync.c and the functions in ifpfsync.c. Should I go for ifr->ifr_cap_nv with the nvlist packed straight into it or should I go for ifr->ifr_buffer holding a struct similar to pfioc_nv? What is your preference?

I am currently working on the implementation using nvlists.

One of the struggles I'm having is deciding what to use to exchange data between pfsyncioctl() in if_pfsync.c and the functions in ifpfsync.c. Should I go for ifr->ifr_cap_nv with the nvlist packed straight into it or should I go for ifr->ifr_buffer holding a struct similar to pfioc_nv? What is your preference?

cap_nv is used for a different purpose. It'll need to use ifr_buffer.

Rebase

Add new ioctls that use nvlists

I think we're moving in the right direction.

Let's fix these things and then I'll do a style(9) check too.

sbin/ifconfig/ifpfsync.c
63

Not blocking, but it might be good to think about how to move this into libpfctl.

69

There's not much point to checking for allocation failures in user space. The kernel will (by default, at least) overcommit memory. So the first thing a user space applications knows about a memory shortage is not when malloc fails, but when it touches previously unused memory and gets killed by the out-of-memory killer.

The ifconfig code is pretty inconsistent about checking for malloc() failure already. My personal view is that it's not worth writing error handling code that will never be exercised (and thus is almost certain to be buggy), but this isn't a blocking remark.

76

I'm not strictly happy with using ifr_cap_nv here, because that field is for "nv-based cap interface", and this isn't about interface capabilities.

There's no obvious alternative, so perhaps we should look at adding a new name in sys/net/if.h. Let's put that aside until we've got everything else done and tested.

81

We fail to free ifr.ifr_cap_nv.buffer here.

124

Is this worth considering? Do we every specify something that's not IPv4 or IPv6?

133

This might be something that can be usefully translated into a libpfctl library function.
Something like pfsyncctl_set_syncdev(int fd, const char *ifname);

147

This seems like pointless work. Can't we just nvlist_add_string(nvl, "synced", val);?

165

Is it worth doing this? All we want is to set an empty string for syncdev, so why not just do that unconditionally?

180

Also a good libpfctl candidate.

208

And we'd do case AF_INET6 here, right?
Which will come in a subsequent commit?

282

Yes. That's a status flag, and it'll get reset by the kernel on every read.
We can set or clear it on write, and it won't affect what the kernel does.

339

What's the purpose of pfsync_kstatus?

The 'k' prefix usual means this is a kernel-only structure (at least in the context of pf), so I'm a little confused at seeing that in user space.

This might be a structure definition that belongs in libpfctl.h

342

We should zero out the status struct so we don't end up with random garbage if the kernel doesn't supply a syncdev.

357

Should we be returning an error here?

It's arguable, because we're checking the result of pfsync_syncpeer_nvlist_to_sockaddr(). On the other hand, the kernel really shouldn't be sending us invalid data. Perhaps this needs an assertion.

373

We leak the nvlist if the ioctl fails.

sys/netpfil/pf/if_pfsync.c
1394

We're leaking packed and nvl here.

1520

We should probably separate the input parsing (struct vs nvlist) from the actual pfsync configuration so we can re-use the latter for both cases.

1539

We'd leak data here. Unless we failed to allocate data, in which case I'm not sure what copyin() will do.

1546

How so? It should be safe as soon as you've copied the strings out. (numbers are always copied, as are bools. Byte arrays and strings need to be copied, but always should be, because we can't reasonably rely on the lifetime of the nvlist structure).

1550

Leaks data.

sys/netpfil/pf/pfsync_nv.h
3

I don't know if it's worth putting this in a separate file. Maybe just in pf_nv?

(No strong views either way.)

I think I addressed all comments. I will submit the updated patch in a moment.

Thanks for pointing out style(9) as well. I will take a moment to go through it and keep it in mind for the future.

sbin/ifconfig/ifpfsync.c
124

The way pfsync was handling the case where no peer address was set (or was unset) was by leaving the address empty (that would later be filled by the multicast address in if_pfsync.c), which would result in a AF that is neither IPv4 nor IPv6.
I did not change that behavior, so the code here accommodated for it.

147

It was a complicated way of ensuring val was not bigger than IFNAMSIZ.

I added a check for the size, and adjusted the code per your suggestion.

	if (strlen(val) > IFNAMSIZ)
		errx(1, "interface name %s is too long", val);
165

I will simplify it, but we still need the nvlist_exists_string() check, otherwise I get a EEXIST error, as the syncdev is returned by the SIOCGETPFSYNCNV ioctl.

208

That is my current plan.

339

I tried to mimic what was done with pf_kstate in pf_ioctl.c but failed to understand it was a kernel-only structure.

I made this function considering that it would be useful for the other functions, but in the end it only got used once in pfsync_status. I'm considering getting rid of it and moving this code into pfsync_status directly.

342

Since we now zero out the status struct here, I removed the zeroing out that was done in line 367, before calling pfsync_nvstatus_to_kstatus

357

I looked around the ifconfig code and it seems they don't use assertions. I'm inclined towards returning an error.

sys/netpfil/pf/if_pfsync.c
1539

I am freeing data before returning now.

According to copy(9), "All but copystr() return EFAULT if a bad address is encountered". So if the allocation fails, copyin() will see the address NULL and return EFAULT, right?

1546

I wrote that when I still had a very poor understanding of nvlists. I removed the comment.

1550

I went with freeing data right after the nvlist_destroy(), which covers this case and prevents leaking on normal execution as well.

sys/netpfil/pf/pfsync_nv.h
3

I wasn't sure on how separate pf and pfsync are kept, so I made a new file.
I'm OK with moving this to pf_nv as well.

Remove another allocation failure check
Prevent leak of ifr.ifr_cap_nv.buffer

That's getting close.

We still need to rename pfsync_kstatus, and I think there's room for reducing the code duplication between SIOCSETPFSYNCNV and SIOCSETPFSYNC.

sbin/ifconfig/ifpfsync.c
202

We should try to be consistent about using {} for single line statements. At least within the same function.

email_luiz.eng.br marked an inline comment as done.

Rebase
Reduce duplication between SIOCSETPFSYNC and SIOCSETPFSYNCNV.

I looked into renaming kstatus on the ifpfsync.c file and now I'm considering removing it altogether from there, since it's only used once in pfsync_status(). What do you think?

I looked into renaming kstatus on the ifpfsync.c file and now I'm considering removing it altogether from there, since it's only used once in pfsync_status(). What do you think?

If you don't need it, yeah, just remove it. We may end up introducing it if/when we move some of the userspace code into libpfctl, but we can always worry about that later.

Removed pfsync_kstatus usage from ifpfsync.c.

Ping @kp
I think we are good for a style(9) check

Ping @kp
I think we are good for a style(9) check

Ack. It's near the top of my todo list, so any decade now ;)

Looking pretty good. A few minor things and we can land this.

sys/netpfil/pf/if_pfsync.c
1359

Is this comment still relevant?

1391

I'd just ENOMEM here, that's by far the most likely cause anyway, and nvlist_error() isn't going to tell us that.

sys/netpfil/pf/pfsync_nv.c
127

Needs an empty line before this.

141

I usually try to invert this. That is:

if (<something wrong)
   return (error);

do_good_thing();

It tends to reduce levels of indentation and make things slightly cleared. It's a very slight difference here, but this is the one I noticed.

144

return (error);

149

return (0);

email_luiz.eng.br marked 6 inline comments as done.

I rebased and addressed the last comments.

I will move the code implementing the IPv6 support to the new nvlists format and send another differential. I also plan to work on extracting some of those functions to libpfctl as you suggested, but would prefer to work on it after the IPv6 support is finished, if that's ok with everybody.

This revision was not accepted when it landed; it landed in state Needs Review.Wed, Nov 9, 9:07 PM
This revision was automatically updated to reflect the committed changes.