Page MenuHomeFreeBSD

pf: Add support for endpoint independent NAT bindings for UDP
ClosedPublic

Authored by thj on Jun 10 2017, 5:26 PM.
Tags
None
Referenced Files
F93569941: D11137.id143021.diff
Tue, Sep 10, 3:18 PM
F93567631: D11137.id142976.diff
Tue, Sep 10, 2:43 PM
F93559605: D11137.id142339.diff
Tue, Sep 10, 12:34 PM
F93549879: D11137.id142594.diff
Tue, Sep 10, 9:35 AM
F93546792: D11137.id142155.diff
Tue, Sep 10, 8:27 AM
F93527681: D11137.id29425.diff
Tue, Sep 10, 2:53 AM
F93517765: D11137.id142339.diff
Tue, Sep 10, 12:23 AM
F93514635: D11137.id142339.diff
Mon, Sep 9, 11:51 PM

Details

Summary

With Endpoint Independent NAT bindings for UDP flows from a NATed source
address are always mapped to the same ip:port pair on the NAT router.
This allows a client to connect to multiple external servers while
appearing as the same host and enables NAT traversal without requiring
the client to use a middlebox traversal protocol such as STUN or TURN.

Introduce the 'endpoint-independent' option to NAT rules to allow
configuration of endpoint independent without effecting existing
deployments.

This change satisfies REQ 1 and 3 of RFC 4787 also known as 'full cone'
NAT.

Using Endpoint Independent NAT changes NAT exhaustion behaviour it does
not introduce any additional security considerations compared to other
forms of NAT.

PR: 219803
Co-authored-by: Damjan Jovanovic <damjan.jov@gmail.com>
Co-authored-by: Naman Sood <mail@nsood.in>
Reviewed-by:
Sponsored-by: Tailscale
Sponsored-by: The FreeBSD Foundation

Test Plan

I still need to add test for pfctl syntax.
I would like to add a stressor for NAT bindings, but it might be larger than what is needed for this test

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 9783

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sys/net/pfvar.h
696

Please use new C standard uintXX_t types instead of historic u_intXX_t in new code.

sys/netpfil/pf/pf.c
442

Please use standard C keyword "inline" instead of "__inline" gcc-ism.

1402

Please follow style(9), do not initialize variables in declaration. They rule may be violated sometimes, but calling function in initializer is too much!

1414

Please put return values into braces, to follow style(9). This refers to all new returns in this patch.

1465

Please put empty line after declarations.

sys/netpfil/pf/pf_lb.c
224

Empty line below.

Damjan has updated the patch to fix the style issues and address a panic with non-udp traffic.

I've tested this patch on a setup where this feature is usefull: Using Mediation (NAT hole punching) feature of Strongswan for connecting 2 sites with IPSec, each site behind a NAT box (https://bsdrp.net/documentation/examples/strongswan_ipsec_mediation_feature).
Previously I had to use nat keyword "static-port" into pf line configuration for "helping" pf to keept original UDP source port, then allowing this feature to work.
With this patch, a standard pf nat (without any option) allow to create NAT hole: I don't think it's a "security downgrade" problem, because NAT was never desing for security.

For concluding: I like the feature brings by this patch :-)

Any news about pushing this change to head ?

Any news about pushing this change to head ?

I'm afraid not. I've not had the time or energy to review the patch, or examine the implications in sufficient detail. Considering all of the other issues I'm also struggling to find the time for I really can't promise anything.

TWIMC some users, some of whom seem willingly to invest time & work into this, are discussing this in the FreeBSD Forum; i.e. the pros & cons & whether it's worth it at all. Please chime in if you have some valuable arguments.

Tailscale has a good blog on NAT traversal that explains the value of allowing NAT hole punching https://tailscale.com/blog/how-nat-traversal-works/

Tailscale has a good blog on NAT traversal that explains the value of allowing NAT hole punching https://tailscale.com/blog/how-nat-traversal-works/

Is the assertion that this currently does not work with pf and will work with this patch?

I've been following discussions here and on the forums for literally years and I have yet to hear an explanation of why this is needed. Frankly, it's more than a little frustrating at this point.

My ask remains the same: explain why this is needed (and at this point: rebase the patch, add test cases and commit to supporting any fallout from it).

IMO RFC 4787 and the Tailscale blog provide reasonable justification for preferring endpoint-independent mapping. I guess one challenge in making the case is that functionality and failure modes fall on a spectrum, and applications implement fallbacks to work around the inability to transit NATs, to varying levels of success. One example is the inability of Nintendo Switch peer-to-peer communication to work between various "NAT types." (It doesn't help that Nintendo doesn't document what the "NAT Types" actually mean.) Various guides for Switch online gaming include suggestions like giving the Switch a static internal IP and forwarding ports 1-65535 to it to work around "NAT problems". Or, another example is Tailscale trying to use probabilistic port guessing to establish a connection or falling back to a TCP connection to a relay server if nothing else works.

I think it's uncontroversial that there are cases where EIM NAT provides a better user experience. Some applications can make use of UPnP, some users can research and configure port forwarding, but you're still left with cases where applications "just work" with "easy" NATs and do not with "hard" NATs. Look at the number of reddit posts, blogs etc. written about getting a Nintendo Switch to work with FreeBSD or pfSense, compared to the relative lack of issues with common consumer CPE.

All of that is separate from discussions of NAT security properties, opt-in/opt-out controls and default behaviors, tests, and the specific details of this patch. Those are all things that indeed need to be addressed as part of an effort to bring it in.

Rebased and added a test.

Since the work is already done and sitting on my computer, I thought it was worth pushing anyway.

I spent some time coming up with concrete use-cases to demonstrate this patch's usefulness:

  1. Here is a demonstration showing a 3x performance uplift in a file transfer over the Tailscale VPN from applying this patch on a router, enabling a peer-to-peer connection rather than using their relay servers: https://gist.github.com/tendstofortytwo/5bc7b158239b1216e338c45b56e6b9b1
  1. When I try to play Mario Kart 7 online on my Nintendo 2DS behind a FreeBSD router running pf NAT, I get error code 006-0612: https://files.catbox.moe/zt067x.jpg. While I can't test applying this patch to that router (I don't have admin privileges over that router), Nintendo's support page indicates that a less restrictive NAT type would solve this issue: https://en-americas-support.nintendo.com/app/answers/detail/a_id/25881/~/error-code%3A-006-0612
thj added a reviewer: kp.
  • Rebase change
  • add pf rule syntax for endpoint-independent
thj retitled this revision from PF: implement RFC 4787 REQ 1 and 3 (full cone NAT) to pf: Add support for endpoint independent NAT bindings for UDP.Mon, Aug 19, 11:52 AM
thj edited the summary of this revision. (Show Details)
thj edited the test plan for this revision. (Show Details)

This is just the first pass, I need to take a closer look at the core changes, but that won't be today.

It also looks like there are bits of the patch missing. I get this build error (which I've not investigated further):

/usr/src/sys/netpfil/pf/pf_lb.c:239:4: error: use of undeclared identifier 'udp_mapping'
  239 |                 *udp_mapping = pf_udp_mapping_find(&udp_source);
      |                  ^
/usr/src/sys/netpfil/pf/pf_lb.c:240:8: error: use of undeclared identifier 'udp_mapping'
  240 |                 if (*udp_mapping) {
      |                      ^
/usr/src/sys/netpfil/pf/pf_lb.c:241:22: error: use of undeclared identifier 'udp_mapping'
  241 |                         PF_ACPY(naddr, &(*udp_mapping)->endpoints[1].addr, af);
      |                                           ^
/usr/src/sys/netpfil/pf/pf_lb.c:242:15: error: use of undeclared identifier 'udp_mapping'
  242 |                         *nport = (*udp_mapping)->endpoints[1].port;
      |                                    ^
/usr/src/sys/netpfil/pf/pf_lb.c:249:5: error: use of undeclared identifier 'udp_mapping'
  249 |                         *udp_mapping = pf_udp_mapping_create(af, saddr, sport, &init_addr, 0);
      |                          ^
/usr/src/sys/netpfil/pf/pf_lb.c:250:9: error: use of undeclared identifier 'udp_mapping'
  250 |                         if (*udp_mapping == NULL)
      |                              ^
/usr/src/sys/netpfil/pf/pf_lb.c:312:8: error: use of undeclared identifier 'udp_mapping'
  312 |                                         (*udp_mapping)->endpoints[1].port = htons(low);
      |                                           ^
/usr/src/sys/netpfil/pf/pf_lb.c:313:33: error: use of undeclared identifier 'udp_mapping'
  313 |                                         if (pf_udp_mapping_insert(*udp_mapping) == 0) {
      |                                                                    ^
/usr/src/sys/netpfil/pf/pf_lb.c:337:8: error: use of undeclared identifier 'udp_mapping'
  337 |                                         (*udp_mapping)->endpoints[1].port = htons(tmp);
      |                                           ^
/usr/src/sys/netpfil/pf/pf_lb.c:338:33: error: use of undeclared identifier 'udp_mapping'
  338 |                                         if (pf_udp_mapping_insert(*udp_mapping) == 0) {
      |                                                                    ^
/usr/src/sys/netpfil/pf/pf_lb.c:354:8: error: use of undeclared identifier 'udp_mapping'
  354 |                                         (*udp_mapping)->endpoints[1].port = htons(tmp);
      |                                           ^
/usr/src/sys/netpfil/pf/pf_lb.c:355:33: error: use of undeclared identifier 'udp_mapping'
  355 |                                         if (pf_udp_mapping_insert(*udp_mapping) == 0) {
      |                                                                    ^
12 errors generated.
Building /usr/obj/usr/src/amd64.amd64/sys/GENERIC/modules/usr/src/sys/modules/plip/plip.ko.full
*** [pf_lb.o] Error code 1
sbin/pfctl/pfctl_parser.c
491 ↗(On Diff #142155)

That seems like the wrong place for this.

sys/net/pfvar.h
705

Why two endpoints?

sys/netpfil/pf/pf.c
357

I seem to be missing the allocation (and freeing) of this. Am I looking in the wrong place or is it just not part of the patch?

830

Oh dear god. Another lock for the data path?
This is worrying.

At the very least it needs thorough documentation w.r.t what it locks and what the lock order expectations are.

831

MTX_DUPOK

Ick. Probably required for the same reason we need it for pf_keyhash, but ick nonetheless.

1406

style(9)

1478

Is there a risk of races because we remove separate entries (why are there two?) non-atomically?

Add a sysctl to control size of the udp mapping allocations

Also, the test case fails for me with failed: server1 did not receive connection from client.

sbin/pfctl/parse.y
6311 ↗(On Diff #142594)

It'd be good to have a parser test too (see sbin/pfctl/tests for examples).

sys/netpfil/pf/pf.c
370

That'll want an entry in the pf(4) man page.

471

Shouldn't that be V_pf_udpendpointhashmask ?

802

Do we want to default to the same size as the state table, or maybe something smaller?

It's hardly scientific, but my local gateway has about twice as much TCP as UDP, and on my server box it's more like 4 to 1, so I'd go with half or even a quarter of PF_HASHSIZ as a default.

820

I'd leave this out.

857

And given that we have a hardcoded default for the udp hash size we probably want a define (at least, if we end up choosing a different default for it than PF_HASHSIZ).

863

<= V_pf_udpendpointhashmask ?

Arguably another reason for having a different default size for the udp hash table is that we'd encounter issues like this in default configurations. Now we're only going to see this when a user twiddles the setting.

879–880

Is that right? What happens if V_pf_hashmask != V_pf_udpendpointhashmask?

1424

I'd default to EEXIST, or maybe just switch to a bool return value.

1469

I like that that's documented, but wonder if it wouldn't be more useful to put that in the struct definition instead. Or possibly both here and there.

1506

style(9).

3644

I wonder if we shouldn't put the NULL check in pf_udp_mapping_release() instead.
Similar to free() and such, it'd be safe to do pf_udp_mapping_release(NULL), so we wouldn't have to do this check everywhere we call it.

It looks like there are at least three callers, so that'd save a couple of lines of code too.

sys/netpfil/pf/pf_lb.c
246

I'd MPASS(*udp_mapping == NULL) here.

259

Do we not need to release the mapping when we're done with it?

282–291

Can we not just if (*udp_mapping != NULL) here?

thj marked 14 inline comments as done.Wed, Sep 4, 2:42 PM
thj added inline comments.
sys/netpfil/pf/pf_lb.c
259

The mapping is cleaned up in pf_unlink_state, or do you mean something else I'm missing?

Address most of Kp's comments
This adds a pfctl test case

sys/netpfil/pf/pf_lb.c
259

I meant that pf_udp_mapping_find() increments the reference counter for the mapping, and I didn't see where we decrement it again. I suspect I missed that we return it to the caller, and rely on it to release the reference for us.

Have you tried unloading pf after a test run? If we forget to clean things up uma ought to warn.

The test case still fails for me, once we get that and some silly whitespace remarks fixed I think this will be ready.

sys/net/pfvar.h
688

Space out addr/port/... with tabs, not spaces.

695

Space out addr/port/... with tabs, not spaces.

sys/netpfil/pf/pf.c
1406

return (NULL); please.

1457

Seems to have wound up with spaces to rather than tabs here.

3273

Align *udp_mapping with the other variables with a tab.

3679

Doesn't need the NULL check any more.

sys/netpfil/pf/pf_lb.c
66–67

Line length.

218

Line length.

270

Accidentally indented with spaces, not tabs.

tests/sys/netpfil/pf/nat.sh
147 ↗(On Diff #142861)

Ideally use a documentation/test subnet here, so 198.51.100.0/24 or 203.0.113.0/24

163 ↗(On Diff #142861)

I've been poking the test failure a bit, and it looks like there's some buffering going on on the receive side, and that's causing the test to fail.

I wonder if it wouldn't be better to just use tcpdump on the receive side. That'll also show us the port numbers. There are a few other test cases that use tcpdump with --immediate-mode to get immediate output, rather than have to deal with buffering.

thj marked 10 inline comments as done.Thu, Sep 5, 3:41 PM
  • Fix white space issues
  • Fix line lengths
  • test: use documentation addresses
  • test: use tcpdump for verifying traffic

I'd like to see the TEST-NET-2 range fixed, but other than that I think it's ready to be inflicted on users.

sys/netpfil/pf/pf_lb.c
353

That doesn't need the NULL check. ima_zfree_pcpu(..., NULL) is safe.

tests/sys/netpfil/pf/nat.sh
148 ↗(On Diff #142976)

TEST-NET-2 is 198.51.100.0/24, not 192.51.100.0/24.

This revision is now accepted and ready to land.Fri, Sep 6, 7:56 AM

In the commit message, there is no info about MFC. Are there any plans to MFC this to stable/14?