Page MenuHomeFreeBSD

Implement NDP resource limits.
Needs ReviewPublic

Authored by bz on Nov 19 2019, 10:51 PM.



Implement limit checks for ND6 neighbour entries,
ND6 default routers, and ND6 prefixes learnt.
This avoids easy attacks exhausting resources, e.g.,
At the moment no further (expiration) mitigation is done
which means an attacker might still succeed to render the
system unusable (for IPv6) but not to crash it anymore
at least allowing an admin to react.

Limits are only setable on the base system and not on a
per-VNET base to avoid a VNET opening itself up as a target
for DoS taking the entire system down. Should there be a
need for per-VNET limits we can either do as frag6 and have
a system global maximum sum for all VNETs and a per-VNET
one, or we could implement a SYSCTL_PROC not allowing a VNET
to go beyond the global (per-VNET) limit.
For the neighbour table entires the global limit is not
per network stack but per interface. While a per-interface
counter would help to limit the impact of the attack for
the rtr/prefix once, the data structures are global and
we'd need a way more sophisticated logic, which should go
hand-in-hand with way more sophisticated expiry logic.

Add statistics for the three aforementioned cases and
for redirects (which are another can of worms of themselves
and could be mititgated by globally disabling them for now).
Teach netstat to print the overflow counters so an attack
can be detected more easily.

For default routers and prefixes we can just keep count
ourselves. For neighbour table entries we have to overload
the llatbl logic. For that export the link and unlink functions
and change their return types to allow us to know whether the
link or unlink actually happened to avoid double-counting.
Then do our checks in the extended implementation.
There is an idea of allowing the llatbls to manage the limits
themselves (not exporting this) by adding a "max" variable.
However updating the system limit would mean walking all tables,
checking the address family and dealing with the new limit.
We'll defer this to the future, once we do understand
expiration and effects on IPv4/ARP better.

In the end the goal here is to avoid a panic and not to provide
all possible counter measures. For some cases network
infrastructure could help a lot easier than we could with 100s
of lines of code.

PR: 157410

Test Plan

Tried (an adjusted) nmap script to also test the
default router limits as referenced in the PR.
I'd have a test case and there are other tests out there
but nothing yet I could easily include in the test suite.

Diff Detail

rS FreeBSD src repository
No Linters Available
No Unit Test Coverage
Build Status
Buildable 27889
Build 26060: arc lint + arc unit

Event Timeline added inline comments.

2048 is only suitable for small environments. Given privacy extension and crypto addresses (i.e. SeND), only a handful nodes are sufficient to reach this limit.

In larger environment (i.e. carrier networks or university lans) the limit is easily reached directly.

So I'd like to see a limit of 16k or more for the first try to limit it from the former "unlimited".


For and end-node? If you are a router you might want tune other things as well but then it pretty much does depend on your customer size. Remember this one is per interface. Say if you are a DC router and your average customer has 1 machine with a handful of static IP addresses, even 2k on a VLAN interface is probably way too big.


Yes, I use FreeBSD for routing, not as a desktop or end node.
YMMV, please feel free to ignore my comment.

kbowling added inline comments.

I would suggest doubling to 4k, I can think of a particular user that has some flat VLANs in a big chassis switch pair and will have a few entries per host.


And all these addresses on all these hosts always talk to all of the others?

kbowling added inline comments.

This is lossy memory for me, maybe @shurd can poke around at some machines in DAL


ND is multicast. You might receive ND messages from others and learn (like gratious arp). Simply assume a not so intelligent switch. Consider cryptographically generated addresses or privacy extensions (up to a new address per destination and or connection)


Ok, I guess the answer will be to scale this based on memory then. As your tiny IoT device is going to run the same code as your super-huge-server in your server-farm. And while the later probably has resources to deal with 4k or 16k or 128k, the former may still keel over in rows very quickly.

Ignoring the one maximum value, is anyone interest in actually reviewing this?


count is misleading, "added" seems enhance the casual reader's understanding. Count reads like a total value.


Either add unconditionally or only if > 0


May those lines stay together?

Address variable naming.
Address conditional checks.

Address the default (upper) limits now based on physmem / 256 but capped at 16k:
This gives you ~120 entries at 128MB of RAM, ~1k at 1G mem, 4k at 4G mem,
and 16k at 16G or more mem by default.

bz marked 2 inline comments as done.Dec 2 2019, 4:30 PM
bz added inline comments.

Not sure what you mean by that? Do you mean if we can link the new before unlinking the old? If that is the question the answer is yes as it is a simple list insert/remove under the same lock.


Nit: if_name(llt->llt_ifp) ?


Nit: worth adding contract on return value here instead of/in addition to v6 part, as we'll be changing it for v4 as well.


I was under impression that physmem reflects the amount of physical memory in bytes (so effectively we end up with 16k for >4M). Am I missing something?

Sorry for responding for that late, but would it be possible to outline the logic as a comment here?
Say, start with 4 entries anyway (as the safe bet), and do binary increase starting from 16M, while retaining 16k as maximum?


Give llt_entries has been added to the llt_table, would it be possible to move accounting to the core llt functions? Similar problem (though less severe) exists in IPv4 world as well.


Sorry, would it be possible to move this block under the lltable_unlink_entry() ?
If we have too many entries we may end up locking ln_tmp and then returning without deleting/unlocking it.


The ln_tmp value is extracted and immediately tested. Now new code is moved upwards between those lines. I do not grasp, why this code needs to stay between those lines of ln_tmp

Thanks a lot for the feedback. let me know what you think about the suggestion to move the basic functionality into llatbl?


OK. What if I update the patch, move the logic into the base functions (no longer needing to have IPv6 special functions), add the maximum where I put the llt_spare spare field and make the SYSCTL a proc function which on write loops over the tables of the given protocol family and updates the table limits (I'd love to think about an improved, pro-active expiry mechanism in the future but not complicate things now in the first pass). Does that sound better?


Oh I missed the LLE_EXCLUSIVE; I'll fix that. Thanks a lot for catching this!
The idea was not to change the current status if we cannot insert.


Certainly, sounds awesome!

In D22447#495146, @bz wrote:

Thanks a lot for the feedback. let me know what you think about the suggestion to move the basic functionality into llatbl?


I'l split this up into a couple of chunks so we can more easily do the base infrastructure for the llatbl changes for IPv6 and the IPv4.
The rtr and prefix parts are unrelated to that.


Nit: can we have it bool?


Nit: worth converting to bool this as well