User Details
- User Since
- Feb 4 2016, 4:45 PM (482 w, 4 d)
Yesterday
Sun, May 4
As far as I understand the main difference between RFC 4291 and its predecessors in relation to anycast addresses is that RFC 4291 allows them to be used as source addresses. Do you share this view?
The current behavior of FreeBSD is RFC 3513. Do you share this view?
FreeBSD should support RFC 4291. Do you share this view?
My understanding of rfc4291_anycast=0 was that RFC 3513, but not RFC 4291, is supported. Changing the sysctl value to rfc4291_anycast>0 changes this such that RFC 4291 is supported. Is that wrong? If I am wrong, please explain what behavior is expected with the setting rfc4291_anycast=0.
my intention is that in all cases, we comply with RFC4291 (and not obsolete RFCs/I-Ds) and this is not user-configurable; RFC4291 is the current RFC regarding this behaviour and there is no reason to support obsolete behaviour.
But what does rfc4291_anycast=0 then mean?
My understanding here is that rfc4291_anycast=0 keeps the old behavior: it is not allowed to use an anycast address as the source address.
If that is true, you cannot send a TCP RST segment, since the source address would be the anycast address. The alternative would be to silently discard the incoming packet, but sending something is better. In the FreeBSD case, the client would terminate the unsuccessful connection setup earlier due to the received ICMP6 messages.
the only reason we currently send an ICMP is due to the reasoning in draft-itojun-ipv6-tcp-to-anycast, but that reasoning is obsolete and we do not need to follow it anymore. we should just send a RST. note that sending a RST does not create state, which is what we're trying to avoid; we're not trying to avoid "sending traffic from an anycast address" in general because we're allowed to do that now.
My understanding is that you can send a RST only if rfc4291_anycast!=0.
- when anycast addresses are allowed as source addresses, we should send TCP RST segments using the anycast source address in response to incoming SYN segments.
i assume you mean if we don't have a listener on the port - otherwise we should just accept the connection, not send a RST.
In your terminology, I am focusing on rfc4291_anycast < 2.
Testing shows that your patch results in always sending TCP reset segments using the anycast source address. If you have a listener (IPv6 wildcard address as the local address) and BBLogging is enabled, the machine panics, since a KASSERT is hit.
oops, i will investigate the panic. however sending RST using the anycast source address is intentional.
I agree, if rfc4291_anycast > 0
I would add a TCP level sysctl variable to allow binding of a TCP endpoint to an anycast address. That would be used for example in tcp6_usr_bind() to disallow binding if the sysctl variable is not enabled.
this is what rfc4291_anycast is supposed to do. if it's < 2, we don't accept TCP anycast and you can't bind a TCP socket to an anycast address. i don't see any need to add two sysctls for this, it is already enough hassle having to set one on every system.
But how do you deal with SCTP in combination with TCP? I could envision a system where anycast addresses as source addresses are fine for TCP RST segments or SCTP ABORT packets (which corresponds in your terminology to rfc4291_anycast=1), but not for connections. Maybe you want to enable TCP connections, but not SCTP, or vice versa, or both.
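One possible reading of the rfc4291_anycast levels discussed in this thread can be written down as a small decision function for a SYN arriving on an anycast address. This is only a sketch: the enum, the function name, and the exact mapping of levels to actions are assumptions made for illustration and are not taken from the patch. In particular, treating level 0 as "reply with ICMPv6" reflects only one side of the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; none of these come from the patch under review. */
enum syn_action {
	SYN_REPLY_ICMP6,	/* old behavior: anycast address is never a source */
	SYN_REPLY_RST,		/* RST is sent from the anycast address */
	SYN_ACCEPT		/* the SYN is handed to the listener */
};

/*
 * Decide what to do with a SYN addressed to an anycast address.
 * 'level' stands in for the value of the rfc4291_anycast sysctl.
 */
static enum syn_action
anycast_syn_action(int level, bool have_listener)
{
	assert(level >= 0);	/* assumption: the knob is never negative */

	if (level == 0)
		return (SYN_REPLY_ICMP6);
	if (level >= 2 && have_listener)
		return (SYN_ACCEPT);
	/* Level 1, or level 2 without a listener: answer with a RST. */
	return (SYN_REPLY_RST);
}
```

With a single integer knob like this, TCP and SCTP necessarily share the same level, which is the limitation raised in the comment above.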
I did some testing using this patch with a focus on TCP and SCTP. I also read draft-itojun-ipv6-tcp-to-anycast and RFC 4291 and RFC 3513.
Thu, May 1
Sure, both are connection oriented. But if you run a simple query/response protocol on top of UDP, it will work.
UDP DNS on anycast might break if cookies are enabled. i think we're all agreed that there's no technical reason to restrict this, but rather the purpose is to avoid letting the user configure things that might break in unexpected ways. do we think users who aren't aware of the problems with TCP anycast are going to be aware of the problems with UDP DNS anycast? i feel like the answer will be 'no' in a non-trivial number of cases.
for me, the purpose of the sysctl is to give the user a nudge that anycast addresses are special and perhaps they don't want to do this, or at least they should look into it further, and this applies equally regardless of which layer the problem actually occurs on. i might change my mind if we could modify all userland DNS servers (for example) to refuse to run on UDP anycast addresses unless configured correctly, but that's clearly not feasible, so this seems like the least worst alternative.
if i can rephrase your position slightly (please correct me if i'm wrong here) i think your concern, as tcp maintainer, is that you don't want to ship a tcp stack that can break like this in the default configuration, because it's then "the fault" of the tcp stack for permitting the bad configuration. i'm not disagreeing with that, i only feel like it's useful to extend this to udp as well simply because that seems beneficial to users.
I still think that you might want to run a UDP-based service on an anycast address, but not a TCP one. Can we support this?
To be clear: I would like to have a setting, where UDP is allowed to be used, but TCP isn't.
Would it make sense to also have a setting where you allow anycast for UDP, but not for TCP?
Yes, TCP does its own error handling and does not use ICMP for it like UDP.
secondly, this problem isn't specific to TCP, there are also UDP protocols that don't always work well over anycast:
- QUIC (the protocol supports anycast, but server implementation may not or admin may not configure it)
- NTP (client may keep state with server)
- DNS with EDNS cookies (depending on how the server is configured)
- probably more i haven't thought of; any UDP service that keeps state and is not explicitly anycast-aware could be affected.
of course, it's possible to implement these services in such a way that they work with anycast, but the idea here is to protect users who don't understand that this might be required or don't even realise it's a problem.
so my suggestion, which solves both of these issues, is to remove the current code in tcp and instead place bind() behind a sysctl, i.e., you cannot bind a socket to an IFF_ANYCAST address unless the sysctl is enabled, regardless of protocol - this would apply to TCP, UDP, SCTP, raw sockets, whatever.
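The suggestion above, a single protocol-independent gate in front of bind(), could look roughly like the following. This is a minimal sketch under stated assumptions: struct local_addr, anycast_bind_check() and the allow_anycast_bind parameter are hypothetical stand-ins, and EADDRNOTAVAIL is just one plausible error to return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a local address entry and its anycast flag. */
struct local_addr {
	bool is_anycast;	/* assumed to mirror IFF_ANYCAST on the address */
};

/*
 * Return 0 if a socket may be bound to 'la', or an errno value otherwise.
 * 'allow_anycast_bind' stands in for the proposed sysctl.  The check does
 * not look at the protocol, so it applies to TCP, UDP, SCTP and raw sockets
 * alike.
 */
static int
anycast_bind_check(const struct local_addr *la, bool allow_anycast_bind)
{
	assert(la != NULL);

	if (la->is_anycast && !allow_anycast_bind)
		return (EADDRNOTAVAIL);	/* assumption: choice of errno */
	return (0);
}
```

Because the check never sees the protocol, it cannot express "UDP allowed, TCP not"; that would need a per-protocol knob instead of a single boolean.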
I do see a difference between UDP and raw sockets on the one side, and TCP and SCTP on the other side:
If you use anycast with UDP or raw sockets, the upper layer protocol might or might not break. The lower layer is fine.
If you use anycast with TCP or SCTP, it is the transport layer which might break or not.
Sat, Apr 26
Fri, Apr 25
I see. They are using short-lived TCP connections; on that time scale, the anycast routing seems to be stable.
my feeling here is that we shouldn't nanny the user; anyone configuring anycast addresses should be aware of the operational implications of doing this and make their own decision about it. currently, they're forced to use non-IFF_ANYCAST addresses to do this, which increases the risk of error from e.g. accidentally using an anycast address as the source of an outgoing connection due to the source address selection algorithm. that seems definitely worse than using an anycast address.
OK. But if TCP connections are failing, we can't do much about it.
the solution in RFC 7094 §4.2 is quite clever, but does anyone currently implement either the client or server side of this? it seems like it would have implications for the security of TCP connections.
In my view this breaks a couple of things. Therefore, I doubt that it will be implemented.
One could use a variant of MPTCP (which is not implemented in FreeBSD right now) for the TCP use case of the INIT forwarding for SCTP (which is easy to implement).
I am aware that you use DNS/TCP in addition to DNS/UDP and I am aware that for DNS/UDP anycast is used. But are you saying that people deploy TCP in combination with anycast at scale?
I do understand that anycast addresses can be used as source and destination addresses in combination with UDP, but I share the view stated in section 3.1 of RFC 7094:
Most stateful transport protocols (e.g., TCP), without modification, do not understand the properties of anycast; hence, they will fail probabilistically, but possibly catastrophically, when using anycast addresses in the presence of "normal" routing dynamics.
In section 4.2 a possible way of handling anycast addresses in TCP is described, which is not implemented as far as I know. For SCTP we can implement something like https://datatracker.ietf.org/doc/html/draft-tuexen-tsvwg-sctp-init-fwd-02.
Mon, Apr 21
Sun, Apr 20
Sat, Apr 19
Wed, Apr 16
Just tested the latest version. No problems observed.
Tue, Apr 15
Mon, Apr 14
Use correct argument as reported by markj.
Sat, Apr 12
Wed, Apr 9
Thanks for providing your view. It helps to understand why you are proposing it.
It should be noted that the old code does not initialize all fields when they are not used. I guess this was done intentionally due to performance considerations. This does not work with the style suggested. But I don't know how relevant this actually is.
First of all I would like to thank you for providing running code for discussing this change. This allows at least me to have a discussion based on concrete code.
Mon, Apr 7
Apr 6 2025
If I understand the patch correctly, it contains three components:
- the fix by replacing in_delayed_cksum(m) with in_delayed_cksum_o(m, ipofs) in ng_nat.c.
- adding protection code which logs a warning.
- refactoring the code to remove indentation by using goto send.
It might be useful to separate these three when committing the code.
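To illustrate why the first component matters, here is a minimal, self-contained sketch of an Internet checksum (RFC 1071) computed from a byte offset into a buffer. It is not the kernel code: cksum_from() is a hypothetical helper, and it assumes that ipofs in the patch denotes the offset of the IP header within the packet. If the checksum is computed from offset 0 while the header of interest actually starts later, the result is wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement checksum (RFC 1071) over buf[off .. len).
 * Hypothetical helper for illustration only.
 */
static uint16_t
cksum_from(const uint8_t *buf, size_t len, size_t off)
{
	uint32_t sum = 0;
	size_t i;

	assert(off <= len);

	/* Sum 16-bit big-endian words starting at the given offset. */
	for (i = off; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	if (i < len)			/* odd trailing byte, padded with zero */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16)		/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}
```

For a buffer with two leading bytes in front of the data of interest, cksum_from(buf, len, 2) and cksum_from(buf, len, 0) generally differ, which is the kind of mismatch the offset-aware call avoids.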
Apr 4 2025
Apr 3 2025
Apr 2 2025
Mar 31 2025
This was committed in 6e76489098c6.
Mar 30 2025
Mar 26 2025
Can you get that into the tree? Or are you waiting for something specific?
Mar 21 2025
Mar 20 2025
Use one big condition and try to use indentation to structure the boolean expression.
Mar 19 2025
Feb 27 2025
Tested again:
sudo kldload vmm
does NOT lock up the system anymore.
/var/log/messages
contains
vgic0: <Virtual GIC v3> on gic0
module_register_init: MOD_LOAD (vmm, 0xffff0002a5298574, 0) error 6
vgic0: detached
as expected.
What is unexpected for me is that after sudo kldload vmm,
kldstat reports vmm.ko as being loaded.
sudo kldunload vmm
succeeds. Is it expected, that vmm.ko is reported as loaded even though loading
of it fails?
After approving:
I applied the patch, compiled and installed the kernel, rebooted and tried sudo kldload vmm.
The system became unresponsive. Is a buildworld needed? I thought just building kernel
includes building the kernel modules including vmm.ko...
Any idea why the above fix did not work?
Is there anything I could do to help fixing the root cause?