
libalias: Allow setting alias port ranges
Accepted, Public

Authored by neel_neelc.org on Feb 1 2020, 3:31 AM.

Details

Summary

Allow setting alias port ranges in libalias and ipfw. Along with r357092, this allows a system to be a true RFC 6598 NAT444 setup, where each network segment (e.g. user, subnet) can have its own dedicated port aliasing range.

Submitted by: Neel Chauhan <neel AT neelc DOT org>

Test Plan

In short: test a network with an internal IP address in the 100.64.0.0/10 range, configure port_alias LOWER UPPER in IPFW, and check that NAT is performed.

Explained:

Compile a HEAD with this patch and reboot.

Add the following to /etc/rc.conf:

ifconfig_lan0="inet 100.64.0.1 netmask 255.255.255.0"
firewall_enable="YES"
firewall_nat_enable="YES"
firewall_script="/etc/ipfw.conf"

Add the following to /etc/ipfw.conf:

#!/bin/sh

ipfw -q flush

ipfw nat 1 config if wan0 unreg_cgn port_alias 2000-3000
ipfw add 100 nat 1 ip from any to me 2000-3000 in via wan0
ipfw add 200 nat 1 ip from 100.64.0.0/24 to any out via wan0
ipfw add allow ip from any to any

Replace 2000 and 3000 with the lower and upper bounds of your port range. Keep in mind that both have to be greater than 1024, and UPPER (obviously) has to be greater than LOWER.

Replace wan0 with your WAN (outside) interface, and lan0 with your LAN (inside) interface.

Then run

kldload ipfw ipfw_nat

and

service netif restart

Then, add clients on the 100.64.0.0/24 subnet with the 100.64.0.1 gateway and 255.255.255.0 subnet mask.

You could also use DHCP, or NAT from a loopback interface, but I won't cover that here.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

pi added a subscriber: pi.Feb 4 2020, 10:20 AM

The use case is a static mapping from behind-nat users to the outside world. If such a static mapping is possible, CGN-operators do not need to log every transaction.

pi added a comment.Feb 4 2020, 10:24 AM

What happens if some port mapping is defined and packets with unexpected source IPs are sent from behind the CGN? Will such a packet/port end up 'somewhere' in the reserved range, or will it map somewhere outside of all reserved ranges?

In D23450#515673, @pi wrote:

The use case is a static mapping from behind-nat users to the outside world. If such a static mapping is possible, CGN-operators do not need to log every transaction.

This change makes it easy to map a translated packet only to a NAT instance, not to a behind-NAT IP address. Do you mean a setup with one NAT instance per user? With many users, the port variance for a single user would be very poor.

To bring this up, you need a bunch of ipfw rules (one per customer) where you know the (internal) IP of the customer beforehand.

ipfw nat 1 config if wan0 unreg_cgn port_alias 2000 2999
ipfw nat 2 config if wan0 unreg_cgn port_alias 3000 3999
ipfw nat 3 config if wan0 unreg_cgn port_alias 4000 4999
ipfw nat 4 config if wan0 unreg_cgn port_alias 5000 5999
ipfw add 100 nat 1 ip from 100.64.0.1 to any out via wan0
ipfw add 101 nat 2 ip from 100.64.0.2 to any out via wan0
ipfw add 102 nat 3 ip from 100.64.0.3 to any out via wan0
ipfw add 103 nat 4 ip from 100.64.0.4 to any out via wan0

This way you can deduce from the externally visible port range, which IP was natted. Then you may deduce the customer from the IP if you log the assignment.
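The lookup described here can be sketched in C. This is only an illustration, not code from the patch: the table mirrors the four example instances above, the ranges are assumed to be inclusive, and `struct cgn_assignment` and `find_nat_instance` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one NAT instance and the alias ports it may use. */
struct cgn_assignment {
    uint16_t lower;        /* first alias port (assumed inclusive) */
    uint16_t upper;        /* last alias port (assumed inclusive) */
    int      nat_instance; /* "ipfw nat N" number from the example */
};

/* Mirrors the four example instances; 100.64.0.1 uses nat 1, and so on. */
const struct cgn_assignment assignments[] = {
    { 2000, 2999, 1 },
    { 3000, 3999, 2 },
    { 4000, 4999, 3 },
    { 5000, 5999, 4 },
};

/* Return the NAT instance owning an externally visible port, or 0 if none. */
int
find_nat_instance(uint16_t external_port)
{
    size_t i;

    for (i = 0; i < sizeof(assignments) / sizeof(assignments[0]); i++) {
        assert(assignments[i].lower <= assignments[i].upper);
        if (external_port >= assignments[i].lower &&
            external_port <= assignments[i].upper)
            return (assignments[i].nat_instance);
    }
    return (0);
}
```

With one instance per customer, the returned instance number identifies the internal IP directly, so only the static assignment needs to be recorded.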

I'd prefer an approach that limits the port range per source IP (e.g. "port_range 300" as an option to reserve 300 ports for this IP) and logs this assignment. This keeps the NAT setup simple while reducing the amount of logging for NAT.

Please note that for CGN, the NAT setup is far from that simple. In order to have happy eyeballs, the CGN range is typically segmented and each segment has a single, different external IP. This setup is lengthy but does not break the various cloud services, which insist that the IP does not change during the session. See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219316 for an example.

I'd prefer an approach that limits the port range per source IP (e.g. "port_range 300" as an option to reserve 300 ports for this IP) and logs this assignment. This keeps the NAT setup simple while reducing the amount of logging for NAT.

Let me explain this a bit more.

Given such an option exists, libalias would be expected to dynamically allocate a free range of this many ports per source IP. It should log this assignment as

libalias: assigned to port range 7514-7913 on 193.0.0.193 for 100.64.0.123

I do not insist on a log message on freeing the range later, but I do insist on the failure message if no range is available.
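A rough C sketch of such an allocator, under stated assumptions: nothing like this exists in the patch, the pool layout (a fixed number of equally sized blocks) is my own simplification, and `struct port_block_pool`, `reserve_block` and `MAX_BLOCKS` are hypothetical names. It logs the assignment and the failure case, and does not log when a range is freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_BLOCKS 3u   /* hypothetical pool size, kept tiny for the example */

struct port_block_pool {
    uint16_t first_port;          /* first port of block 0 */
    uint16_t block_size;          /* ports reserved per source IP */
    uint32_t owner[MAX_BLOCKS];   /* owning source IP, 0 = block is free */
};

/* Return the first port of the block reserved for src_ip, or 0 on failure. */
uint16_t
reserve_block(struct port_block_pool *pool, uint32_t src_ip)
{
    unsigned int i, lower;

    assert(src_ip != 0);          /* 0 is used as the "free" marker */

    /* A source IP that already owns a block keeps it. */
    for (i = 0; i < MAX_BLOCKS; i++)
        if (pool->owner[i] == src_ip)
            return ((uint16_t)(pool->first_port + i * pool->block_size));

    /* Otherwise take the first free block and log the assignment. */
    for (i = 0; i < MAX_BLOCKS; i++) {
        if (pool->owner[i] != 0)
            continue;
        pool->owner[i] = src_ip;
        lower = pool->first_port + i * pool->block_size;
        fprintf(stderr, "assigned port range %u-%u for source 0x%08x\n",
            lower, lower + pool->block_size - 1, (unsigned int)src_ip);
        return ((uint16_t)lower);
    }

    /* The failure message is the one that must not be skipped. */
    fprintf(stderr, "no free port range for source 0x%08x\n",
        (unsigned int)src_ip);
    return (0);
}
```

Returning 0 on failure works here because port 0 is never a usable alias port.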

In D23450#515674, @pi wrote:

What happens if some port mapping is defined and packets with unexpected source IPs are sent from behind the CGN? Will such a packet/port end up 'somewhere' in the reserved range, or will it map somewhere outside of all reserved ranges?

In fact, this change provides no guarantees. It just makes NAT try a random port from the configured range when looking for an unused alias port. If that random port happens to be occupied, other completely random ports outside of the range will be tried instead.

I could be wrong in the code, but I built this to search for another random port in the range. See lines 608-612 (initial port search) and then 666-670 (called when you need another port).
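To make the intended behaviour concrete, here is a small C sketch of a search that never leaves the configured range. It is not the code from alias_db.c: it swaps the repeated random draws for linear probing from a caller-supplied random start so that the result is deterministic, it assumes the half-open interval [lower, upper), and `find_port_in_range` and the `in_use` array are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Probe the half-open range [lower, upper), starting at a random offset and
 * wrapping around inside the range. in_use[i] is nonzero when port lower + i
 * is already taken. Returns a free port, or -1 if the range is exhausted.
 */
int
find_port_in_range(uint16_t lower, uint16_t upper,
    const unsigned char *in_use, uint32_t start)
{
    uint32_t span, i, idx;

    assert(lower < upper);
    span = (uint32_t)upper - lower;
    for (i = 0; i < span; i++) {
        idx = (start % span + i) % span;   /* always stays inside the range */
        if (!in_use[idx])
            return ((int)lower + (int)idx);
    }
    return (-1);
}
```

The point of the sketch is the failure mode: when every port in the range is occupied the caller gets an error instead of a port outside the range.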

Correct me, if I'm wrong, but how are packets dealiased if more than one instance is defined using this patch?

ipfw nat 1 config if wan0 unreg_cgn port_alias 2000 3000
ipfw add 101 nat 1 ip from any to me dst-port 2000-3000 in via wan0
ipfw add 201 nat 1 ip from 100.64.0.0/24 to any out via wan0

ipfw nat 2 config if wan0 unreg_cgn port_alias 4000 5000
ipfw add 102 nat 2 ip from any to me dst-port 4000-5000 in via wan0
ipfw add 202 nat 2 ip from 100.64.1.0/24 to any out via wan0

How does this scale?

Correct me, if I'm wrong, but how are packets dealiased if more than one instance is defined using this patch?

Packets are aliased and dealiased as they would without this patch, as this patch only impacts the selection of ports if port_alias is selected.

ipfw nat 1 config if wan0 unreg_cgn port_alias 2000 3000
ipfw add 101 nat 1 ip from any to me dst-port 2000-3000 in via wan0
ipfw add 201 nat 1 ip from 100.64.0.0/24 to any out via wan0

ipfw nat 2 config if wan0 unreg_cgn port_alias 4000 5000
ipfw add 102 nat 2 ip from any to me dst-port 4000-5000 in via wan0
ipfw add 202 nat 2 ip from 100.64.1.0/24 to any out via wan0

How does this scale?

This can scale by using, say, for or while loops in a shell script over a range of internal/external addresses and ports.

MikroTik needs something similar for CGN as well: https://wiki.mikrotik.com/wiki/Manual:IP/Firewall/NAT in the Carrier-Grade NAT (CGNAT) or NAT444 section.

lutz_donnerhacke.de requested changes to this revision.Feb 4 2020, 5:30 PM

Correct me, if I'm wrong, but how are packets dealiased if more than one instance is defined using this patch?

Packets are aliased and dealiased as they would without this patch, as this patch only impacts the selection of ports if port_alias is selected.

You have to direct the responses to the right NAT instance for dealiasing.
Libalias needs the "global" flag to search all NAT instances for all translations (which is a performance bummer).

Please add test cases for the scenario of at least two instances sharing the same public IP while using different port ranges.

ipfw nat 1 config if wan0 unreg_cgn port_alias 2000 3000
ipfw add 101 nat 1 ip from any to me dst-port 2000-3000 in via wan0
ipfw add 201 nat 1 ip from 100.64.0.0/24 to any out via wan0

ipfw nat 2 config if wan0 unreg_cgn port_alias 4000 5000
ipfw add 102 nat 2 ip from any to me dst-port 4000-5000 in via wan0
ipfw add 202 nat 2 ip from 100.64.1.0/24 to any out via wan0

How does this scale?

This can scale by using, say, for or while loops in a shell script over a range of internal/external addresses and ports.

I should offer my patch to look up the NAT table in a much more efficient way than sequencing serially through all instances.
The selection of NAT instances in ipfw is designed to work with only a handful of instances. So please avoid even considering scripted generation of NAT instances.

This revision now requires changes to proceed.Feb 4 2020, 5:30 PM

Thanks for clarifying. I'm still new to FreeBSD TCP/IP stack development.

I'm currently at work so I'll get back to working on this patch when I come home.

Also, may I please have your patch for NAT table lookup? It would work very well for this.

Thanks for clarifying. I'm still new to FreeBSD TCP/IP stack development.

So do I.

Also, may I please have your patch for NAT table lookup? It would work very well for this.

Yep, I have to adapt it to CURRENT; the structure for accessing the "nat" component in ipfw was changed (which makes it much easier to apply a different access scheme).

I have added a test.

I was not able to get the test to pass, but the included ipfw test did not pass for me either.

root@currentvm:~ # kyua --logfile ~/out test -k /usr/tests/Kyuafile sys/netpfil/common/nat
sys/netpfil/common/nat:ipfnat_basic  ->  skipped: This test requires ipf  [0.036s]
sys/netpfil/common/nat:ipfw_basic  ->  failed: atf-check failed; see the output of the test for details  [1.234s]
sys/netpfil/common/nat:ipfw_cgn  ->  failed: atf-check failed; see the output of the test for details  [1.257s]
sys/netpfil/common/nat:ipfw_portalias  ->  failed: atf-check failed; see the output of the test for details  [1.247s]
sys/netpfil/common/nat:ipfw_userspace_nat  ->  skipped: This test requires ipdivert module loaded  [0.039s]
sys/netpfil/common/nat:pf_basic  ->  skipped: This test requires pf  [0.036s]

Results file id is usr_tests.20200205-160052-215416
Results saved to /root/.kyua/store/results.usr_tests.20200205-160052-215416.db

3/6 passed (3 failed)
root@currentvm:~ #

ipfw_cgn and ipfw_portalias are my new tests, while ipfw_basic is the existing ipfw test.

Is there anything special I need in order to make the tests pass?

What still needs to be done with one test (port range) is a TCP client/server. It may or may not pass.

I added a TCP test.

sbin/ipfw/ipfw.8
3265

Please add a note, that the interval is half-open. The upper number is never used.

tests/sys/netpfil/common/nat.sh
185–186

Without knowing exactly the bounds of the interval (half open) the configuration raises concerns due to overlapping.

187–188

Where are the rules for dealiasing packets?

neel_neelc.org updated this revision to Diff 67943.
neel_neelc.org marked 2 inline comments as done.Feb 7 2020, 4:14 PM
neel_neelc.org added inline comments.
sbin/ipfw/ipfw.8
3265

Sure, fixed it.

tests/sys/netpfil/common/nat.sh
185–186

Fixed it.

187–188

I'm not sure if this would work, but here it is.

neel_neelc.org marked 2 inline comments as done.Feb 7 2020, 4:14 PM
tests/sys/netpfil/common/nat.sh
187–188

Lines 187 and 188 have exactly the same match, so only 187 is invoked; 188 will never be used. Furthermore, they only match outgoing packets (for aliasing); they do not match incoming packets (for dealiasing).

neel_neelc.org edited the test plan for this revision. Feb 8 2020, 7:30 PM

In the test, incoming connections are now based on their port, each NAT host gets their own port range.

Somewhat unrelated (unrelated?), I believe we should commit D23448 before this.

sbin/ipfw/nat.c
756–757

I'd prefer a function which returns the valid port number, or 0 in the case of failure. Writable pointers in arguments open a special class of worms in debugging.

759

Did you try "12345six" as a port number?

760

Did you try "7654321" as a port number?

957–958

Is this case already covered by the validation functions? If yes, remove these checks.

960

Using another type of function call, this would be

(0 < (hp = nat_port_...(av[1])))

which is easier to audit.

sys/netinet/libalias/alias_db.c
599

So PKT_ALIAS_SAME_PORT is incompatible with port ranges?
Can we prevent setting both flags with an error message in parsing the config?
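A possible shape for such a check, sketched in C. This is not the code in sbin/ipfw/nat.c: `check_nat_options` is a hypothetical helper, `SKETCH_SAME_PORT` merely stands in for PKT_ALIAS_SAME_PORT, and treating a 0/0 range as "no range configured" is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SAME_PORT 0x04u   /* stand-in for PKT_ALIAS_SAME_PORT */

/* Return 0 if the options are compatible, -1 if they conflict. */
int
check_nat_options(uint32_t mode, uint16_t lower, uint16_t upper)
{
    /* Assumption: both bounds zero means no port range was configured. */
    int has_range = (lower != 0 || upper != 0);

    /* The parser is assumed to have validated the ordering already. */
    assert(lower <= upper);

    if (has_range && (mode & SKETCH_SAME_PORT) != 0)
        return (-1);   /* same_ports cannot be honoured inside a fixed range */
    return (0);
}
```

In the config parser the -1 case would be turned into an error message before the configuration reaches the kernel.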

sys/netinet/libalias/alias_local.h
166–168

How about a struct and a single entry in the record?

struct nat_port_range {
  u_short lower, upper;
};

and

struct nat_port_range portRange;
sys/netpfil/ipfw/ip_fw_nat.c
96–97

Here the struct can be reused.

534–535

Using a struct, the lines become

ptr->portRange = ucfg->portRange;

Also, may I please have your patch for NAT table lookup? It would work very well for this.

Yep, I have to adapt it to CURRENT; the structure for accessing the "nat" component in ipfw was changed (which makes it much easier to apply a different access scheme).

Please have a look at D23586

neel_neelc.org updated this revision to Diff 67989.EditedFeb 8 2020, 11:40 PM

I made the requested changes.

UPDATE: I forgot the PKT_ALIAS_SAME_PORT change, I realized that after I posted this so I am working on that now.

Also Lutz, thank you so much for posting D23586.

neel_neelc.org marked 3 inline comments as done.

Made the change.

neel_neelc.org marked 7 inline comments as done.Feb 8 2020, 11:58 PM
neel_neelc.org marked an inline comment as done.

You do not have to follow my comments.
I just express my feelings, which may be wrong or misleading.
If your idea is different, please feel free to refuse the advice.

sbin/ipfw/nat.c
760–762

The common idiom is to provide a pointer as the second argument to strtol and inspect the (first unparsable) char it points to afterwards.

port = strtol(ptr, &ptr2, 10);
if (*ptr2 != '\0' || port < 1024)
  errx(EX_DATAERR, "invalid port: %s", ptr);
763

Comparing a u_short with MAX_USHORT is not recommended.

959

Given that the upper limit is one above the usable range, the highest port can't be specified, because the upper limit to configure would then be out of range. So it might be advisable to redefine the upper limit to be part of the interval (2000 2999 instead of 2000 3000).

sys/netpfil/ipfw/ip_fw_nat.c
539

Why not take the (pointer to the) struct as argument?

I'm going to revert the code to the version without the struct nat_port_range, since it caused more trouble than it's worth, especially with #include statements.

Your other requested changes, including those to the port parsing and options checking will be merged in the new patch.

Here's my updated patch.

sbin/ipfw/nat.c
761

The pointer end will always be != NULL, so you may even allow port numbers below 1024.

tests/sys/netpfil/common/nat.sh
185–186

No no. The concern is only valid as long as the documentation about the half open interval was missing. The configuration you need (using half open intervals) is 2000 3000 and 3000 4000 in order to match the port ranges in lines 187-188 below.

(Hopefully) fixed the interval and port parsing.

jilles added a subscriber: jilles.Feb 9 2020, 7:18 PM
jilles added inline comments.
sbin/ipfw/nat.c
760

It looks like that will still get through, since the cast will convert it to 52145 (the C standard defines this conversion, although it is not what is wanted here).

I suggest storing the result from strtol() in a variable of type long and doing range checks on that.
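A minimal C sketch of that suggestion, combined with the earlier "return the port or 0" idea. It is not the function from the patch: `parse_port` is a hypothetical name, and the accepted range 1024..65535 is an assumption based on the checks discussed in this review.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Return the parsed port (1024..65535), or 0 on any failure. */
unsigned short
parse_port(const char *s)
{
    char *end;
    long value;

    assert(s != NULL);
    errno = 0;
    value = strtol(s, &end, 10);

    /* Reject empty input and trailing junk such as "12345six". */
    if (end == s || *end != '\0')
        return (0);

    /* Range-check the long before narrowing; "7654321" fails here. */
    if (errno == ERANGE || value < 1024 || value > 65535)
        return (0);

    return ((unsigned short)value);
}
```

Checking the `long` first matters because the narrowing conversion is silent: with a 16-bit unsigned short, 7654321 becomes 52145, which would pass a later range check.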

Good catch. Fixed it.

I wonder how it is possible to configure the whole range 65000 to 65535 as usable ports.

sbin/ipfw/nat.c
760

long is indeed better than any unsigned type, because it would catch negative numbers easily.

Using long for the port parsing sounds good. Port numbers obviously can't be negative, nor can they be overflowed ints.

New patch does exactly this.

Can you please mark all the comments that are resolved as "Done"? Only the author of the patch can do this.

neel_neelc.org marked 8 inline comments as done.Feb 10 2020, 3:02 PM

Sure, done that.

I'm still not satisfied with the "upper bound", which is inconsistent between "config port range" and "matching port range" in the ipfw rule set. It does not allow specifying the highest port (but this is a minor issue).

Furthermore, I'm not familiar with the test framework, so somebody else should have a look at this part, especially because there is no success report here (only a failure). I don't want to stumble over an erroneous test ...

So most of my concerns are handled. I can use this patch and extend it to a dynamic port allocator (for less extensive lawful logging).

But I'm not in the position to give the final go.

Thanks for your feedback.

I'm thinking about switching the NAT port range to something like 2000-2999 instead of 2000 3000 for consistency with the rest of IPFW. Would this be okay?

I'd feel better with this, yes. It solves some corner cases, like storing the upper bound in ushort.

melifaro added inline comments.Feb 12 2020, 4:29 PM
sbin/ipfw/ipfw.8
3263

Given we're defining a port range here, wouldn't port_range be a more fitting name?

sys/netinet/libalias/alias_db.c
608

Why do we need to check for both lower and upper in fast path?

Here, I switch to the range separated by a - (e.g. 2000-2999 instead of 2000 3000), where the upper number is also included. On lines 611 and 668 in alias_db.c I added a " + 1" in order to account for the new range allocation mechanism.
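The effect of the " + 1" can be shown with a short C sketch. It is not the code from alias_db.c: `pick_port_closed` is a hypothetical name and the random value is passed in by the caller so that the arithmetic is easy to follow.

```c
#include <assert.h>
#include <stdint.h>

/* Map a random value onto the closed interval [lower, upper]. */
uint16_t
pick_port_closed(uint16_t lower, uint16_t upper, uint32_t random_value)
{
    uint32_t count;

    assert(lower <= upper);
    /* The "+ 1" makes the upper bound itself reachable. */
    count = (uint32_t)upper - lower + 1;
    return ((uint16_t)(lower + random_value % count));
}
```

With the closed interval a range such as 65000-65535 can be expressed, and both bounds still fit in a 16-bit value.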

I also switched the argument to port_range from port_alias.

In general, I'm pleased with the renaming from the generic "alias" to "range".

sbin/ipfw/ipfw.8
3265

That's not true any longer. The interval is closed now, it includes the end points.

sbin/ipfw/nat.c
769

Why parse again from the very beginning of the string?

sys/netinet/libalias/alias_db.c
611

In terms of performance, it would be interesting to replace "Upper" with a precomputed "Range" or "Length". This part is used for every new flow (very often).

For printing the configuration, the "Upper" value can be synthesized on demand.
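One way this could look, as a C sketch only: the struct and function names (`port_span`, `port_span_make`, `port_span_pick`, `port_span_upper`) are hypothetical and do not match the fields used in libalias.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation: lower bound plus a precomputed length. */
struct port_span {
    uint16_t lower;
    uint32_t length;   /* upper - lower + 1; 32 bits so 0-65535 fits */
};

/* Done once at configuration time. */
struct port_span
port_span_make(uint16_t lower, uint16_t upper)
{
    struct port_span span;

    assert(lower <= upper);
    span.lower = lower;
    span.length = (uint32_t)upper - lower + 1;
    return (span);
}

/* Per-flow path: no subtraction or "+ 1" is needed here any more. */
uint16_t
port_span_pick(const struct port_span *span, uint32_t random_value)
{
    assert(span->length > 0);
    return ((uint16_t)(span->lower + random_value % span->length));
}

/* Only needed when printing the configuration. */
uint16_t
port_span_upper(const struct port_span *span)
{
    return ((uint16_t)(span->lower + span->length - 1));
}
```

The saving per flow is tiny, but the per-flow path is the one that runs most often, which is the argument made above.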

bcr added a subscriber: bcr.Feb 13 2020, 8:09 AM

Minor man page change. You can run checks on man pages with "mandoc -Tlint" and textproc/igor.

sbin/ipfw/ipfw.8
3265

You also need to have a line break after a sentence stop here.

Here, I made changes to the:

  • man page
  • argument parsing
  • switch to a range, per Lutz's suggestion
neel_neelc.org marked 5 inline comments as done.Feb 13 2020, 4:44 PM
neel_neelc.org edited the test plan for this revision. Feb 26 2020, 5:26 PM
This revision is now accepted and ready to land.Feb 26 2020, 5:36 PM