Page MenuHomeFreeBSD

pf: Implement the NAT source port selection of MAP-E Customer Edge
ClosedPublic

Authored by takahiro.kurosawa_gmail.com on Mar 28 2021, 11:39 AM.

Details

Summary

MAP-E (RFC 7597) requires special care for selecting source ports
in NAT operation on the Customer Edge because a part of bits of the port
numbers are used by the Border Relay to distinguish another side of the
IPv4-over-IPv6 tunnel.

PR: 254577
Submitted by: takahiro.kurosawa@gmail.com

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kp requested review of this revision.Mar 28 2021, 11:39 AM

So this looks pretty good. I've got a couple of minor remarks.

The main blocker for this patch is that it changes the interface between kernel and userspace. I'm correctly working on converting ADDRULE/GETRULE to use nvlists, which will ensure we don't break that ABI. See https://lists.freebsd.org/pipermail/freebsd-pf/2021-March/009419.html

I hope to post patches for that in the coming week.

sbin/pfctl/parse.y
4022

The warning doesn't match the check. 0 is allowed here.

4247

If I'm reading the parser code correctly this is impossible, so maybe this should be an assert() rather than input validation.

takahiro.kurosawa_gmail.com added a reviewer: kp.
takahiro.kurosawa_gmail.com added inline comments.
sbin/pfctl/parse.y
4022

I will fix $2 < 0 to $2 <= 0

4247

I have confirmed that the yyerror message below is actually displayed by running:

$ echo 'rdr on lo0 proto tcp from any to any -> 127.0.0.1 map-e-portset 6/8/0x34' | /sbin/pfctl -n -f -
stdin:1: the 'map-e-portset' option is only valid with nat rules

So I'm going to keep it unchanged.

sys/netpfil/pf/pf_lb.c
325

i < ahigh should be i <= ahigh.
The line 320 subtract 1 so ahigh should be inclusive.
I will fix the code.

  • rebased the patch from 12-STABLE to -current
  • added the test script
  • fixed 2 issues
    • the check for MAP-E PSID offset value in sbin/pfctl/parse.y was incorrect
    • source port selection code in pf_get_mape_sport() was incorrect

The nvlist ioctl commit series is:

That's not quite everything that's needed for this though. It occurred to me that we still don't have a good way of expressing the change in pf_pool_opts. I think I want to copy the definition into libpfctl so pfctl can use it to populate the nvlist. In main we can (once the relevant ioctls have changed over) remove pf_pool_opts.

I've posted the rest of the patches that should let us add this without ABI breakage. The last of them is D29644 (and it lists its required parent revisions).

With those patches it's safe to add fields to pfctl_rule, which is now an entirely userspace structure.

@kp thanks for notifying the changes.
I'd like to hear your idea for the following points to rebase the code to the nvlist-based implementation.

  • I plan to keep the struct pf_pool.mape member and the struct pf_kpool.mape member unchanged from the current implementation, while introducing the nvlist-based representation as an array of 3 uint16_t's with the name "mape" under "rpool". Is this design consistent with your implementation?
  • I plan to implement the code to allow the nvlist-based rules without the "mape" names, not handing this as an error. This is because there might be a timing that a newer kernel that knows "mape" with the older pfctl that does not know "mape", while I feel the care might be a little too much.

@kp thanks for notifying the changes.
I'd like to hear your idea for the following points to rebase the code to the nvlist-based implementation.

  • I plan to keep the struct pf_pool.mape member and the struct pf_kpool.mape member unchanged from the current implementation, while introducing the nvlist-based representation as an array of 3 uint16_t's with the name "mape" under "rpool". Is this design consistent with your implementation?

I think you want to keep pf_pool unchanged. Other than that, yes: add mape to pf_kpool (that's kernel-internal, so it's safe to modify).
We can also safely add new fields to the nvlist, so adding a "mape" nvlist child to "rpool" would be the way to go as well.
It may or may not make more sense to have three separate number entries in the "mape" nvlist rather than one three entry array.

  • I plan to implement the code to allow the nvlist-based rules without the "mape" names, not handing this as an error. This is because there might be a timing that a newer kernel that knows "mape" with the older pfctl that does not know "mape", while I feel the care might be a little too much.

Yes. The "mape" nvlist in "rpool" should be treated as optional. Well, maybe. Given that we've not released any version that has nvlist support but not MAP-E support it'd arguably be fine to just require it.
Some ABI/API breakage in current is to be expected.
I'd say that if it's harder than if (nvlist_exists_nvlist(nvl, "mape")) { ... } it might be fine to just make it mandatory.

D29669 may be a useful example.

tests/sys/netpfil/pf/map_e.sh
83

The timeout in the nat.sh test was to test for a very specific bug, where the kernel could get stuck. We don't expect that for this test, so it's fine to not explicitly check for timeout here.

In D29468#665329, @kp wrote:

@kp thanks for notifying the changes.
I'd like to hear your idea for the following points to rebase the code to the nvlist-based implementation.

  • I plan to keep the struct pf_pool.mape member and the struct pf_kpool.mape member unchanged from the current implementation, while introducing the nvlist-based representation as an array of 3 uint16_t's with the name "mape" under "rpool". Is this design consistent with your implementation?

I think you want to keep pf_pool unchanged.

Thinking about this some more, I think that what we want to do here is to introduce a pfctl_pool, for use in libpfctl / pfctl. Similar to how we already have struct pfctl_rule and struct pfctl_anchor.

In D29468#665332, @kp wrote:

Thinking about this some more, I think that what we want to do here is to introduce a pfctl_pool, for use in libpfctl / pfctl. Similar to how we already have struct pfctl_rule and struct pfctl_anchor.

That makes sense!
I'll make a patch that introduce pfctl_pool and remake this MAP-E patch based on that, keeping pf_pool unchanged.

  • moved to the new nvlist-based implementation
  • removed timeout in the test
  • changed the manual page that map-e-portset is for MAP-E CE (avoid confusion with MAP-E BR)
lib/libpfctl/libpfctl.c
255

I think I'd prefer this to be a sub-nvlist with three separate numbers, but that's a nitpick more than a genuine criticism.

It'd be easier to add extra settings that way, but perhaps that's not realistically going to be needed. You probably have a better view of that than I do.

tests/sys/netpfil/pf/map_e.sh
5

You'll want to claim copyright over this (and update the date). You wrote it, I didn't.

81

Do we need this sleep?

84

These two pfctl calls don't appear to do anything. We don't verify their output. I think they can just be removed.

lib/libpfctl/libpfctl.c
255

Indeed I feel it's ugly.
I'll add a "mape" nvlist node and add three integers under it as you described, instead of adding an array of 3 integers directly under "pool".

tests/sys/netpfil/pf/map_e.sh
5

I'll change the copyright, thanks for checking.

81

No, we don't need this. I'll remove the line.
Missed to remove on porting from nat.sh.

84

I'll remove these 2 pfctl -sa's.
They were used for the check by human eyes, so they should be removed.

  • changed the "mape" nvlist structure
  • updated the test for the review comments
This revision was not accepted when it landed; it landed in state Needs Review.Apr 13 2021, 10:55 AM
This revision was automatically updated to reflect the committed changes.