netinet6: make RFC4291 anycast conformance a sysctl
Abandoned · Public

Authored by ivy on May 1 2025, 8:17 AM.

Details

Reviewers

kevans
des
rrs
adrian
tuexen
ziaee

Group Reviewers

network
transport
manpages

Summary

in releases prior to 15.0 we followed RFC3513, which did not allow
sending any packets from an anycast address, by forbidding any attempt
to bind to an anycast address and rejecting incoming TCP connections to
a local anycast address.

recently this was changed to allow binding to an anycast address so that
UDP services could use an anycast address. however, TCP anycast was
still forbidden.

D50019 proposed removing the TCP restriction to allow full use of
anycast addresses by applications, but the consensus was that this is
too permissive and introduces the risk of a user configuring a stateful
service over an anycast address without understanding the implications.

however, we do want to permit TCP anycast for e.g. DNS servers, which
require both UDP and TCP, and where TCP anycast is in wide use at
scale[0]. so, introduce a new sysctl net.inet6.ip6.ip6_rfc4291_anycast,
which controls the behaviour of anycast addresses.

if the sysctl is set to 0, which is the default:

binding a UDP socket to an anycast address is permitted; this allows UDP services to receive data sent to an anycast address (e.g., for anycast monitoring or data collection services).
sending UDP packets with an anycast source address is forbidden.
binding a TCP socket to an anycast address is forbidden.
incoming TCP connections to an anycast address are rejected.

except for the first point, this largely matches the pre-15.0 behaviour,
but we now reject TCP connections with a RST rather than an ICMP error
because according to RFC4291 we're allowed to do that.

if the sysctl is set to 1:

outgoing UDP packets from an anycast address are permitted.

if the sysctl is set to 2:

binding a TCP socket to an anycast address is permitted.
incoming TCP connections to an anycast address are permitted.

this allows the user to use anycast addresses for stateful protocols
when they understand the risks, but protects them from doing this
accidentally.

raw sockets are treated the same as UDP, i.e. by default
they can receive traffic but not send it.
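
The three levels described above can be read as a small decision table. The sketch below only illustrates the semantics stated in this summary and is not the code in the diff; the names `anycast_op` and `anycast_permitted` are hypothetical, and raw sockets are assumed to follow the UDP cases as the summary says.

```c
#include <stdbool.h>

/* Hypothetical names, used only to illustrate the summary above. */
enum anycast_op {
	OP_UDP_BIND,    /* bind a UDP or raw socket to an anycast address */
	OP_UDP_SEND,    /* send UDP or raw packets from an anycast source */
	OP_TCP_BIND,    /* bind a TCP socket to an anycast address */
	OP_TCP_ACCEPT   /* accept an incoming TCP connection to an anycast address */
};

/*
 * Return true if `op` is permitted when the sysctl is set to `level`
 * (0, 1 or 2). Each level also permits everything the lower levels do.
 */
static bool
anycast_permitted(int level, enum anycast_op op)
{
	switch (op) {
	case OP_UDP_BIND:
		return (true);          /* allowed at every level */
	case OP_UDP_SEND:
		return (level >= 1);
	case OP_TCP_BIND:
	case OP_TCP_ACCEPT:
		return (level >= 2);
	}
	return (false);
}
```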

this does not change the default source selection logic: an anycast
address will never be chosen as the default source address. the user
has to explicitly bind the socket regardless of the sysctl setting.
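
Because an anycast address is never selected automatically, an application has to name it in its own bind(2) call. A minimal userland sketch of preparing that address follows; the helper name `fill_sockaddr_in6` and the documentation-prefix address in the comment are assumptions for illustration, and the bind(2) call shown in the comment would still be subject to the sysctl policy described above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Fill `sin6` with the textual IPv6 address `addr` and `port`.
 * Returns 0 on success, -1 if `addr` is not a valid IPv6 address.
 * (Hypothetical helper; not part of the diff under review.)
 */
static int
fill_sockaddr_in6(struct sockaddr_in6 *sin6, const char *addr, uint16_t port)
{
	memset(sin6, 0, sizeof(*sin6));
	sin6->sin6_family = AF_INET6;
	sin6->sin6_port = htons(port);
	if (inet_pton(AF_INET6, addr, &sin6->sin6_addr) != 1)
		return (-1);
	return (0);
}

/*
 * Typical use, with error handling omitted:
 *
 *	struct sockaddr_in6 sin6;
 *	int s = socket(AF_INET6, SOCK_DGRAM, 0);
 *	fill_sockaddr_in6(&sin6, "2001:db8::53", 53);
 *	bind(s, (struct sockaddr *)&sin6, sizeof(sin6));
 */
```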

document this behaviour in inet6.4, and add a pointer there from the
'anycast' option in ifconfig.8.

add some basic tests for TCP and raw sockets.

[0] https://www.ripe.net/publications/docs/ripe-393/

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 63827
Build 60711: arc lint + arc unit

Event Timeline

ivy created this revision. May 1 2025, 8:17 AM

Herald added 1 blocking reviewer(s): transport. May 1 2025, 8:17 AM

Herald added subscribers: ziaee, glebius, melifaro and 2 others.

ivy requested review of this revision. May 1 2025, 8:17 AM

Harbormaster completed remote builds in B63817: Diff 154608. May 1 2025, 8:18 AM

ivy added a reviewer: manpages. May 1 2025, 8:18 AM

add a couple of missing parentheses

Harbormaster completed remote builds in B63818: Diff 154609. May 1 2025, 8:23 AM

ivy mentioned this in D50018: netinet6: allow binding a raw socket to any anycast address. May 1 2025, 8:24 AM

ivy mentioned this in D50019: tcp: allow connections to IPv6 anycast address. May 1 2025, 9:54 AM

rrs accepted this revision. May 1 2025, 12:09 PM

This revision is now accepted and ready to land. May 1 2025, 12:09 PM

Would it make sense to have also a setting where you allow anycast for UDP, but not for TCP?

tuexen added inline comments. May 1 2025, 1:06 PM

sys/netinet6/in6_pcb.c
215	Why do you assume that TCP is equivalent to SOCK_STREAM? This is not true. For example, SCTP supports SOCK_STREAM sockets...

Small nit

sbin/ifconfig/ifconfig.8
457	Then this link becomes clickable in man.freebsd.org or gnome help or whatever.

This revision now requires changes to proceed. May 1 2025, 1:57 PM

kevans added inline comments. May 1 2025, 2:36 PM

sbin/ifconfig/ifconfig.8
457	I would expect that link to be broken? It's cross-referencing a section from another manpage, not one within this manpage. I don't see how whatever's generating the page would actually get the correct address.

ivy added inline comments. May 1 2025, 2:42 PM

sys/netinet6/in6_pcb.c
215	the intent is to disallow anything that's not UDP or raw sockets. aiui, we should also prohibit SCTP here. so, does the code seem correct if i adjust the comment? it was not immediately clear to me how to get the actual protocol from an inpcb, since `inp_ip_p` doesn't seem to be set at this point.
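
To illustrate the point raised in this inline thread: a socket type alone does not identify the transport, because SCTP offers both SOCK_STREAM and SOCK_SEQPACKET sockets. The sketch below is a userland illustration only, with a hypothetical helper name; it classifies by IP protocol number rather than by socket type, and it is not the check used in in6_pcb.c.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: decide whether a socket is connection-oriented
 * from its IP protocol, falling back to the socket type only when the
 * protocol was left as 0 (the default for that type).
 */
static bool
is_connection_oriented(int type, int proto)
{
	switch (proto) {
	case IPPROTO_TCP:
	case IPPROTO_SCTP:      /* stateful for every socket type it supports */
		return (true);
	case IPPROTO_UDP:
		return (false);
	case 0:
		/* Default protocol: SOCK_STREAM means TCP, SOCK_DGRAM means UDP. */
		return (type == SOCK_STREAM);
	default:
		return (false);     /* e.g. raw sockets */
	}
}
```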

ziaee resigned from this revision. May 1 2025, 3:35 PM

ziaee added inline comments.

sbin/ifconfig/ifconfig.8
457	Excuse me, I was incorrect. Looking at mdoc(7) again it says clearly `Reference a section or subsection in the same manual page`.

(I also incorrectly thought that "Resign as Reviewer" would remove the "Needs Revision" without marking it to claim that I reviewed this, since I did not contribute anything to this one, my apologies)

ziaee accepted this revision as: manpages. May 1 2025, 3:36 PM

This revision is now accepted and ready to land. May 1 2025, 3:36 PM

p.mousavizadeh_protonmail.com mentioned this in D49782: Handbook: Improving IPv6 Documentation. Thu, May 1, 11:30 PM

p.mousavizadeh_protonmail.com added a subscriber: p.mousavizadeh_protonmail.com.

change ip6_rfc4291_anycast to have three values: no outgoing anycast, outgoing
udp only, or outgoing udp and tcp.

This revision now requires review to proceed. Fri, May 2, 4:24 AM

Harbormaster completed remote builds in B63827: Diff 154678. Fri, May 2, 4:24 AM

ivy edited the summary of this revision. Fri, May 2, 4:25 AM

In D50099#1142843, @tuexen wrote:

Would it make sense to have also a setting where you allow anycast for UDP, but not for TCP?

now possible by setting ip6_rfc4291_anycast=1. to enable tcp, it must be set to 2.

sbin/ifconfig/ifconfig.8
457	i changed this locally to test and forgot to revert before committing, but i'll undo this for the next update (or if this is approved as-is).

I did some testing using this patch with a focus on TCP and SCTP. I also read draft-itojun-ipv6-tcp-to-anycast and RFC 4291 and RFC 3513.

I am all in favor of allowing IPv6 anycast addresses as source addresses, since the restriction stated in RFC 3513 has been removed in RFC 4291 and that RFC is now 19 years old. The restriction in RFC 3513 is about using anycast addresses as source addresses.

Having a sysctl variable controlling if FreeBSD allows IPv6 anycast addresses as source addresses is fine. This way the initial commit sets it to false, keeping the old behavior. That can be MFCed to stable/14 and in a separate commit the default can be changed to true and FreeBSD 15 will use anycast addresses as source addresses.

For TCP (not supporting anycast) this would mean:

without allowing anycast addresses as source addresses, we should keep sending ICMP6 messages in response to SYN segments towards the anycast address.
with allowing anycast addresses as source addresses, we should send TCP RST segments using the anycast source address in response to incoming SYN segments.

Testing shows that your patch results in always sending TCP reset segments using the anycast source address. In case you have a listener (IPv6 wildcard address as the local address), the machine panics if you have BBLogging enabled, since a KASSERT is hit.

I would add a TCP level sysctl variable to allow binding of a TCP endpoint to an anycast address. That would be used for example in tcp6_usr_bind() to disallow binding if the sysctl variable is not enabled.

Not using a single sysctl variable makes it simpler to deal with the different protocols and the distinction between

using anycast addresses as source addresses in general
allowing an endpoint to bind against an anycast address in case of TCP and SCTP

A similar approach would be done for SCTP. Right now you can always bind against an anycast address.

If you agree with the above, I could do the TCP related changes in a separate review, which would build on this one. I can also take care of the SCTP related changes.

In D50099#1143785, @tuexen wrote:

Having a sysctl variable controlling if FreeBSD allows IPv6 anycast addresses as source addresses is fine. This way the initial commit sets it to false, keeping the old behavior. That can be MFCed to stable/14 and in a separate commit the default can be changed to true and FreeBSD 15 will use anycast addresses as source addresses.

i did not intend to MFC this because i think it might be too large of a change even with rfc4291_anycast=0, and especially with 15.0R just around the corner. for now, anyone using anycast on FreeBSD is simply not configuring the addresses as IFF_ANYCAST addresses and they can continue doing that, so it's not like you can't do anycast TCP without this patch.

however i'm not opposed to an MFC if you think it's worthwhile.

For TCP (not supporting anycast) this would mean:

without allowing anycast addresses as source addresses, we should keep sending ICMP6 messages in response to SYN segments towards the anycast address.

i strongly disagree with this, but perhaps you could elaborate on your reasoning here. we already have a standard way to indicate "i do not want to accept a TCP connection to this host/port combination", which is to send a RST; what do we gain by sending an ICMP here?

the only reason we currently send an ICMP is due to the reasoning in draft-itojun-ipv6-tcp-to-anycast, but that reasoning is obsolete and we do not need to follow it anymore. we should just send a RST. note that sending a RST does not create state, which is what we're trying to avoid; we're not trying to avoid "sending traffic from an anycast address" in general because we're allowed to do that now.

with allowing anycast addresses as source addresses, we should send TCP RST segments using the anycast source address in response to incoming SYN segments.

i assume you mean if we don't have a listener on the port - otherwise we should just accept the connection, not send a RST.

Testing shows that your patch results in always sending TCP reset segments using the anycast source address. In case you have a listener (IPv6 wildcard address as the local address), the machine panics if you have BBLogging enabled, since a KASSERT is hit.

oops, i will investigate the panic. however sending RST using the anycast source address is intentional.

I would add a TCP level sysctl variable to allow binding of a TCP endpoint to an anycast address. That would be used for example in tcp6_usr_bind() to disallow binding if the sysctl variable is not enabled.

this is what rfc4291_anycast is supposed to do. if it's < 2, we don't accept TCP anycast and you can't bind a TCP socket to an anycast address. i don't see any need to add two sysctls for this, it is already enough hassle having to set one on every system.

Not using a single sysctl variable makes it simpler to deal with the different protocols and the distinction between

using anycast addresses as source addresses in general

allowing an endpoint to bind against an anycast address in case of TCP and SCTP

i don't see a distinction here. what we're trying to avoid is stateful traffic over anycast. whether we accepted or originated the connection actually makes no difference.

A similar approach would be done for SCTP. Right now you can always bind against an anycast address.

i thought this would be disallowed if you tried to bind an SCTP stream socket. i will investigate this. btw, does connecting/accepting an SCTP datagram socket create state? if the answer is yes (which i think it is), then we should also disallow this when rfc4291_anycast < 2.

In D50099#1143905, @ivy wrote:

In D50099#1143785, @tuexen wrote:

Having a sysctl variable controlling if FreeBSD allows IPv6 anycast addresses as source addresses is fine. This way the initial commit sets it to false, keeping the old behavior. That can be MFCed to stable/14 and in a separate commit the default can be changed to true and FreeBSD 15 will use anycast addresses as source addresses.

i did not intend to MFC this because i think it might be too large of a change even with rfc4291_anycast=0, and especially with 15.0R just around the corner. for now, anyone using anycast on FreeBSD is simply not configuring the addresses as IFF_ANYCAST addresses and they can continue doing that, so it's not like you can't do anycast TCP without this patch.

however i'm not opposed to an MFC if you think it's worthwhile.

For TCP (not supporting anycast) this would mean:

without allowing anycast addresses as source addresses, we should keep sending ICMP6 messages in response to SYN segments towards the anycast address.

i strongly disagree with this, but perhaps you could elaborate on your reasoning here. we already have a standard way to indicate "i do not want to accept a TCP connection to this host/port combination", which is to send a RST; what do we gain by sending an ICMP here?

My understanding here is that rfc4291_anycast=0 keeps the old behavior: it is not allowed to use an anycast address as the source address.
If that is true, you cannot send a TCP RST segment, since the source address would be the anycast address. The alternative would be to silently discard the incoming packet, but sending something is better. In the FreeBSD case, the client would terminate the unsuccessful connection setup earlier due to the received ICMP6 messages.

the only reason we currently send an ICMP is due to the reasoning in draft-itojun-ipv6-tcp-to-anycast, but that reasoning is obsolete and we do not need to follow it anymore. we should just send a RST. note that sending a RST does not create state, which is what we're trying to avoid; we're not trying to avoid "sending traffic from an anycast address" in general because we're allowed to do that now.

My understanding is that you can send a RST only if rfc4291_anycast!=0.

with allowing anycast addresses as source addresses, we should send TCP RST segments using the anycast source address in response to incoming SYN segments.

i assume you mean if we don't have a listener on the port - otherwise we should just accept the connection, not send a RST.

I am focussing in your terminology on rfc4291_anycast < 2.

Testing shows that your patch results in always sending TCP reset segments using the anycast source address. In case you have a listener (IPv6 wildcard address as the local address), the machine panics if you have BBLogging enabled, since a KASSERT is hit.

oops, i will investigate the panic. however sending RST using the anycast source address is intentional.

I agree, if rfc4291_anycast > 0

I would add a TCP level sysctl variable to allow binding of a TCP endpoint to an anycast address. That would be used for example in tcp6_usr_bind() to disallow binding if the sysctl variable is not enabled.

this is what rfc4291_anycast is supposed to do. if it's < 2, we don't accept TCP anycast and you can't bind a TCP socket to an anycast address. i don't see any need to add two sysctls for this, it is already enough hassle having to set one on every system.

But how do you deal with SCTP in combination with TCP? I could envision a system where anycast addresses as source addresses are fine for TCP RST segments or SCTP ABORT packets (which corresponds in your terminology to rfc4291_anycast=1), but no connections. Maybe you want to enable TCP connections, but not SCTP or vice versa or both.

What I am proposing is to have rfc4291_anycast controlling the use of anycast addresses as a source address. I would argue we can make the default to 1 in FreeBSD 15, but make it switchable in FreeBSD 14.
Then have a TCP and a SCTP sysctl, which controls if you allow anycast addresses for local endpoints. Default would be off. So people wanting TCP would need to switch one sysctl for TCP.
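
One possible reading of this two-knob proposal, as it would apply to an incoming SYN for a local anycast address, is sketched below. All names are hypothetical and the sketch is not taken from the patch; in particular, treating the TCP knob as ineffective while anycast source addresses are disallowed is an assumption made here, since accepting a connection requires replying from the anycast address.

```c
#include <stdbool.h>

/* Hypothetical names for the possible reactions to an incoming SYN. */
enum syn_action {
	SYN_ACCEPT,         /* continue the handshake with the listener */
	SYN_SEND_RST,       /* refuse with a TCP RST from the anycast address */
	SYN_SEND_ICMP6      /* refuse with an ICMPv6 error message */
};

/*
 * src_allowed:          anycast addresses may be used as source addresses
 * tcp_anycast_allowed:  TCP endpoints may use anycast addresses
 * has_listener:         a listening socket matches the SYN
 */
static enum syn_action
anycast_syn_action(bool src_allowed, bool tcp_anycast_allowed,
    bool has_listener)
{
	if (!src_allowed)
		return (SYN_SEND_ICMP6);    /* cannot reply from the anycast address */
	if (tcp_anycast_allowed && has_listener)
		return (SYN_ACCEPT);
	return (SYN_SEND_RST);
}
```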

Not using a single sysctl variable makes it simpler to deal with the different protocols and the distinction between

using anycast addresses as source addresses in general

allowing an endpoint to bind against an anycast address in case of TCP and SCTP

i don't see a distinction here. what we're trying to avoid is stateful traffic over anycast. whether we accepted or originated the connection actually makes no difference.

The first allows RST segments and ABORT chunks to be sent. This is a layer 3 thing. Much better than ICMPv6, but not connections. The second allows connections. This is a layer 4 thing.

A similar approach would be done for SCTP. Right now you can always bind against an anycast address.

i thought this would be disallowed if you tried to bind an SCTP stream socket. i will investigate this. btw, does connecting/accepting an SCTP datagram socket create state? if the answer is yes (which i think it is), then we should also disallow this when rfc4291_anycast < 2.

SCTP does not use all the stuff from the infrastructure, since it doesn't really fit (all the PCB stuff is single homing, SCTP supports multihoming).
My commits from today ensure that you cannot bind to an anycast address for SCTP. This is done by marking the address as unusable.
However, the SCTP sends ABORT from anycast addresses in response to OOTB packets as allowed by RFC 4291, but does not conform to the older one. I'll fix this soon.

In D50099#1144036, @tuexen wrote:

My understanding here is that rfc4291_anycast=0 keeps the old behavior: it is not allowed to use an anycast address as the source address.

this is not my intent, but i appreciate the sysctl may be confusingly named. i am open to suggestions for other names.

my intention is that in all cases, we comply with RFC4291 (and not obsolete RFCs/I-Ds) and this is not user-configurable; RFC4291 is the current RFC regarding this behaviour and there is no reason to support obsolete behaviour.

the "rfc4291_anycast" sysctl is not actually directly related to RFC4291-conformance, but is rather intended to be the sysctl for "i know what anycast is and i really want to do this". so, i could have called it net.inet6.ip6.i_know_what_im_doing_let_me_use_anycast_addresses=1. but that's not a great name either, so like i say, i'm open to naming suggestions.

i definitely do not want to end up with a situation where we're sending ICMP errors in response to a TCP SYN.

But how do you deal with SCTP in combination with TCP? I could envision a system where anycast addresses as source addresses are fine for TCP RST segments or SCTP ABORT packets (which corresponds in your terminology to rfc4291_anycast=1), but no connections. Maybe you want to enable TCP connections, but not SCTP or vice versa or both.

i am not really familiar enough with SCTP to really answer this but my understanding is that SCTP is always stateful even when only sending datagrams. in that case, we should not allow SCTP over anycast addresses (incoming or outgoing) unless rfc4291_anycast=2.

i already compromised here on having separate knobs for "UDP anycast" and "TCP anycast" but i am not going to add a third knob for "i want SCTP anycast but not TCP anycast", this is getting ridiculous. at some point we just need to trust that the admin knows what they're doing.

remember that you can already do UDP, TCP and SCTP anycast in 14-STABLE by just not setting IFF_ANYCAST on the address. what we're trying to do here is to encourage people to set IFF_ANYCAST on their anycast addresses, not confuse/annoy them so much they just give up entirely.

What I am proposing is to have rfc4291_anycast controlling the use of anycast addresses as a source address. I would argue we can make the default to 1 in FreeBSD 15, but make it switchable in FreeBSD 14.

fine, i'm okay with this. i would prefer the default be either 0 or 2, but i will accept 1.

Then have a TCP and a SCTP sysctl, which controls if you allow anycast addresses for local endpoints. Default would be off. So people wanting TCP would need to switch one sysctl for TCP.

no, see above.

In D50099#1144051, @ivy wrote:

In D50099#1144036, @tuexen wrote:

My understanding here is that rfc4291_anycast=0 keeps the old behavior: it is not allowed to use an anycast address as the source address.

this is not my intent, but i appreciate the sysctl may be confusingly named. i am open to suggestions for other names.

As far as I understand the main difference between RFC 4291 and its predecessors in relation to anycast addresses is that RFC 4291 allows them to be used as source addresses. Do you share this view?
The current behavior of FreeBSD is RFC 3531. Do you share this view?
FreeBSD should support RFC 4291. Do you share this view?
My understanding of rfc4291_anycast=0 was that RFC 3531, but not RFC 4291 is supported. Changing the sysctl value to rfc4291_anycast>0 changes this such that RFC 4291 is supported. Is that wrong? If I am wrong, please explain what behavior is expected with the setting rfc4291_anycast=0.

my intention is that in all cases, we comply with RFC4291 (and not obsolete RFCs/I-Ds) and this is not user-configurable; RFC4291 is the current RFC regarding this behaviour and there is no reason to support obsolete behaviour.

But what does rfc4291_anycast=0 then mean?

the "rfc4291_anycast" sysctl is not actually directly related to RFC4291-conformance, but is rather intended to be the sysctl for "i know what anycast is and i really want to do this". so, i could have called it net.inet6.ip6.i_know_what_im_doing_let_me_use_anycast_addresses=1. but that's not a great name either, so like i say, i'm open to naming suggestions.

i definitely do not want to end up with a situation where we're sending ICMP errors in response to a TCP SYN.

This is what we are doing now and is the best we can do if we do not use anycast addresses as source addresses.
If we follow RFC 4291, we can send TCP RST segments, but we cannot if we follow RFC 3531.

But how do you deal with SCTP in combination with TCP? I could envision a system where anycast addresses as source addresses are fine for TCP RST segments or SCTP ABORT packets (which corresponds in your terminology to rfc4291_anycast=1), but no connections. Maybe you want to enable TCP connections, but not SCTP or vice versa or both.

i am not really familiar enough with SCTP to really answer this but my understanding is that SCTP is always stateful even when only sending datagrams. in that case, we should not allow SCTP over anycast addresses (incoming or outgoing) unless rfc4291_anycast=2.

With rfc4291_anycast=2 you mean allowing anycast addresses to be bound to an endpoint, right?

i already compromised here on having separate knobs for "UDP anycast" and "TCP anycast" but i am not going to add a third knob for "i want SCTP anycast but not TCP anycast", this is getting ridiculous. at some point we just need to trust that the admin knows what they're doing.

I don't understand the last sentence. I do trust the admin of the FreeBSD system. But the person controlling the host is not controlling the network.
I can always add an SCTP knob later, when needed.

In my understanding you need

one knob to allow anycast addresses as source addresses. This covers UDP and UDPLite. I guess you want to allow this always and therefore don't need this knob.
one knob for allowing anycast addresses bound to TCP endpoints.

If we always allow anycast addresses as source addresses, we only need a knob for TCP (and one for SCTP, which I can add when needed).

remember that you can already do UDP, TCP and SCTP anycast in 14-STABLE by just not setting IFF_ANYCAST on the address. what we're trying to do here is to encourage people to set IFF_ANYCAST on their anycast addresses, not confuse/annoy them so much they just give up entirely.

What I am proposing is to have rfc4291_anycast controlling the use of anycast addresses as a source address. I would argue we can make the default to 1 in FreeBSD 15, but make it switchable in FreeBSD 14.

fine, i'm okay with this. i would prefer the default be either 0 or 2, but i will accept 1.

Then have a TCP and a SCTP sysctl, which controls if you allow anycast addresses for local endpoints. Default would be off. So people wanting TCP would need to switch one sysctl for TCP.

no, see above.

In D50099#1144055, @tuexen wrote:

The current behavior of FreeBSD is RFC 3531. Do you share this view?

i believe you mean RFC 3513, in which case: yes.

FreeBSD should support RFC 4291. Do you share this view?

yes.

My understanding of rfc4291_anycast=0 was that RFC 3531, but not RFC 4291 is supported.

no.

we should always support RFC4291. the purpose of rfc4291_anycast is to protect the user from configuring stateful services over anycast when she doesn't understand anycast. as i said, the sysctl may be badly named.

we should never conform to RFC 3513. it is obsolete.

Changing the sysctl value to rfc4291_anycast>0 changes this such that RFC 4291 is supported. Is that wrong?

yes, this is wrong. changing the value of rfc4291_anycast means "i understand how anycast works and i want to use anycast addresses for stateful services, as permitted by RFC4291". it is not intended to reflect support for RFC4291 in the OS itself, which should always be supported.

This is what we are doing now

yes, and it's wrong and we should stop doing it.

and is the best we can do if we do not use anycast addresses as source addresses.

see above.

i am not really familiar enough with SCTP to really answer this but my understanding is that SCTP is always stateful even when only sending datagrams. in that case, we should not allow SCTP over anycast addresses (incoming or outgoing) unless rfc4291_anycast=2.

With rfc4291_anycast=2 you mean allowing anycast addresses to be bound to an endpoint, right?

i am talking about both making outgoing connections (using bind) and accepting incoming connections (e.g., to wildcard).

i already compromised here on having separate knobs for "UDP anycast" and "TCP anycast" but i am not going to add a third knob for "i want SCTP anycast but not TCP anycast", this is getting ridiculous. at some point we just need to trust that the admin knows what they're doing.

I don't understand the last sentence. I do trust the admin of the FreeBSD system. But the person controlling the host is not controlling the network.

i do not understand the last sentence. :-)

configuring an anycast address that provides a service requires the cooperation of both the host admin and the network admin (assuming they are not the same person). at some point we have to assume at least one of them knows what they're doing, since we cannot prevent them from configuring anycast addresses however they want - all we can control is whether the address is configured with IFF_ANYCAST.

I can always add an SCTP knob later, when needed.

please, do not. i am very strongly opposed to this. it is really very difficult to overstate how much i dislike this idea.

one knob to allow anycast addresses as source addresses. This covers UDP and UDPLite. I guess you want to allow this always and therefore don't need this knob.

no, what i want is to allow all anycast traffic always. i compromised on this diff which is to disallow all stateful services (UDP or TCP or SCTP) without setting a sysctl. i am opposed to allowing UDP by default but not allowing TCP, because this will trick people into configuring their services wrongly. (for example, they might assume that UDP DNS is a stateless service.)

I honestly don't think we need any sysctl at all. Not for TCP, not for UDP, not for SCTP.

People have to configure an address as ANYCAST for this logic to even apply, no? So why add it behind a sysctl? They already need to set the address family flag.

In D50099#1144071, @adrian wrote:

People have to configure an address as ANYCAST for this logic to even apply, no? So why add it behind a sysctl? They already need to set the address family flag.

essentially the problem is this:

people who are experts on anycast will configure their IP addresses without the anycast keyword and everything will work fine with UDP, TCP, SCTP, etc. (this is what they already do in 14.0 and all prior releases)

people who don't understand anycast will configure their IP addresses with the anycast keyword and we have to stop them from doing this because reasons, i am not sure what the reasons are

basically the idea is to discourage people from configuring anycast addresses correctly because we don't want them to do that.

In D50099#1144074, @ivy wrote:

In D50099#1144071, @adrian wrote:

People have to configure an address as ANYCAST for this logic to even apply, no? So why add it behind a sysctl? They already need to set the address family flag.

essentially the problem is this:

people who are experts on anycast will configure their IP addresses without the anycast keyword and everything will work fine with UDP, TCP, SCTP, etc. (this is what they already do in 14.0 and all prior releases)

people who don't understand anycast will configure their IP addresses with the anycast keyword and we have to stop them from doing this because reasons, i am not sure what the reasons are

basically the idea is to discourage people from configuring anycast addresses correctly because we don't want them to do that.

Ok, that's how I read the original (no sysctl) diff. The "knob" here is really the anycast keyword, and it's not implied by the address itself. If someone configures the same address with no anycast keyword, they get the normal system behaviour (everything works). This isn't like classful IPv4 address behaviour or anything.

Thanks! I still am totally OK with "no sysctl needed" here.

If I understand it correctly, the use of an anycast address as a source address will be allowed unconditionally. This is a change between FreeBSD 14 and FreeBSD 15 and I am fine with it. In particular this allows TCP RST and SCTP packets with an ABORT chunk to be sent from an anycast address. So no more ICMPv6 messages sent back.
It makes sense to not MFC this to stable/14.
I think we all agree up to here.

I also think that we do not need any sysctl knob for binding an anycast address to a UDP or UDP-Lite socket or for using it with raw sockets, since they are often used to send something special.
I am not sure, but I think we can also agree up to here. Am I wrong?

Now it comes to the question of binding an anycast address to a TCP endpoint. Here we do not agree yet, I think. But assuming that we agree on the above, here are some points:

If we want some knob, it would only affect binding anycast addresses to TCP endpoints, and therefore it would make sense to use a TCP level sysctl variable.
I am arguing for a TCP specific knob, since I think using anycast addresses for short term connections might just work, but I am not sure for long term connections. So I don't want to make life harder for admins, but I want to avoid a situation where a system is configured using anycast for a UDP based service and a TCP service binding to the wildcard address will now also use the anycast address.
My understanding, which could be wrong or outdated, is that ECMP routing is not guaranteed to be time invariant. So at a later time, it might route packets for the same TCP connection to a different destination, for example if the routing changes or some paths fail. In these cases long term TCP connections will just fail. But, as I said, I might be wrong and these problems do not exist anymore.
If my understanding is wrong and the ECMP problems mentioned above do not exist anymore, I am fine with no knob.
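
The ECMP concern can be illustrated with a toy model. The sketch below assumes a router that picks a next hop as `flow_hash % n_paths`, which is a deliberate simplification and not a description of any particular router; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Toy ECMP model: pick a path index for a flow by taking its hash
 * modulo the number of currently usable paths (n_paths must be > 0).
 * Real routers use other schemes; this only shows the failure mode.
 */
static unsigned int
ecmp_pick_path(uint32_t flow_hash, unsigned int n_paths)
{
	return (flow_hash % n_paths);
}
```

In this model a flow whose hash is 8 maps to path 2 while three paths are usable, and to path 0 once only two remain. If those paths lead to different anycast instances, packets of an established TCP connection may then arrive at a host that holds no state for it.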

In D50099#1145337, @tuexen wrote:

If I understand it correctly, the use of an anycast address as a source address will be allowed unconditionally. This is a change between FreeBSD 14 and FreeBSD 15 and I am fine with it. In particular this allows TCP RST and SCTP packets with an ABORT chunk to be sent from an anycast address. So no more ICMPv6 messages sent back.

Ok, just to be clear, there's no "anycast address", right? There's "an address that has also been configured as an anycast address."

So if my anycast address is 2601::1:

If I configure it without the anycast flag, I can do what I want with it, but it may be chosen as the source address for an outbound connection.
If I configure it with the anycast flag, it won't be chosen for outbound connections unless explicitly bind()'ed to; ivy's work here changes the current behaviour.

There's nothing about the address 2601::1 that says it's anycast.
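
As a rough illustration of the two cases above, here is a user-space sketch in which "anycast" is only a flag stored next to a configured address, and source selection skips flagged addresses unless the caller bound to one explicitly. EX_FLAG_ANYCAST, struct ex_addr and both functions are hypothetical names for this sketch, not the kernel's data structures or its actual source address selection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define EX_FLAG_ANYCAST 0x01	/* hypothetical flag for this sketch */

/* A configured address: the bits of the address carry no anycast marker. */
struct ex_addr {
	struct in6_addr addr;
	unsigned flags;
};

/* Parse a textual address into an entry; 0 on success, -1 on bad input. */
int
ex_addr_set(struct ex_addr *a, const char *text, unsigned flags)
{
	memset(a, 0, sizeof(*a));
	if (inet_pton(AF_INET6, text, &a->addr) != 1)
		return (-1);
	a->flags = flags;
	return (0);
}

/*
 * Pick a source address.  With an explicit bound address, use it if it is
 * configured, whatever its flags.  Without one, take the first address
 * that is not flagged as anycast.
 */
const struct ex_addr *
ex_pick_source(const struct ex_addr *tab, size_t n,
    const struct in6_addr *bound)
{
	for (size_t i = 0; i < n; i++) {
		if (bound != NULL) {
			if (memcmp(&tab[i].addr, bound, sizeof(*bound)) == 0)
				return (&tab[i]);
		} else if ((tab[i].flags & EX_FLAG_ANYCAST) == 0) {
			return (&tab[i]);
		}
	}
	return (NULL);
}
```

With 2601::1 flagged and 2601::2 unflagged, an unbound lookup returns 2601::2, while a lookup bound to 2601::1 returns 2601::1. Clearing the flag on 2601::1 makes it eligible for unbound selection again, which is the situation admins who omit the flag have to manage themselves.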

Right now admins configuring anycast services don't need to set the anycast flag on an IPv6 address: they configure it, announce it via some routing protocol somewhere else, and... it's an anycast address. But they then have to make sure /outbound/ connections don't choose the anycast address. If they don't set the anycast flag, they already have to make sure their outbound connections don't break upon a routing topology change.

That's why I'm confused. :-) The situation you're trying to guard against by adding sysctls is actually the situation people can create, and are likely creating, right now by not setting the anycast flag on the IPv6 address.

In D50099#1145337, @tuexen wrote:

I also think that we do not need any sysctl knob for binding an anycast address to a UDP or UDP-Lite socket or for using it with raw sockets, since they are often used to send something special.
I am not sure, but I think we can also agree up to here. Am I wrong?

i agree up to here but only if this also applies to TCP and SCTP. if it doesn't then i don't agree. that's why i included UDP support in this sysctl.

i think perhaps we disagree on the intended purpose of the sysctl knob. to me, it is intended to prevent admins from doing something dumb without realising, which they can do equally with both UDP and TCP, so it doesn't make sense to me to enable this for TCP(/SCTP) only. i think your position is more that if users do something dumb with UDP, that's happening in userland, but if they do something dumb with TCP or SCTP, that's in the kernel, so we need a kernel knob. is that more or less right?

if so, my position is that from a user's point of view, whether the dumb thing happens in userland or in the kernel doesn't actually matter, their network is broken either way.

(and to be clear, i am opposed to adding the sysctl to begin with, but if we do add the sysctl, i think we have to add it for UDP as well, for the reason i just articulated.)

I am arguing for a TCP-specific knob, since I think using anycast addresses for short-term connections might just work, but I am not sure about long-term connections. I don't want to make life harder for admins, but I want to avoid a situation where a system is configured to use anycast for a UDP-based service and a TCP service binding to the wildcard address will now also use the anycast address.

i think it's important to note that, as adrian already mentioned, everyone is already doing this. the IFF_ANYCAST flag in stable/14 (and earlier) is basically useless since it doesn't let you do anything with the anycast address, so in practice, everyone using either TCP or UDP over anycast is omitting this flag in their addresses. what i want to do with this change is to make anycast more useful so people start using it, and adding new restrictions on top is in opposition to that goal.

My understanding, which could be wrong or outdated, is that ECMP routing is not guaranteed to be time-invariant. At a later time, it might route packets for the same TCP connection to a different destination, for example if the routing changes or some paths fail. In these cases long-term TCP connections will just fail. But, as I said, I might be wrong and these problems might not exist anymore.

anycast does not imply ECMP, and in fact i believe ECMP with anycast is very unusual. if an admin wants to allow ECMP with anycast, they have a lot of problems to solve and we can't reasonably do anything about that.

it is possible that, without ECMP, an anycast address can become routed to a different node because of underlying IGP/EGP changes, but this applies equally to stateful UDP services.

regarding MFC, whichever solution we go with, i am fine with not MFCing this change. we're far too late for 14.3 and 15.0 is going to be out before 14.4, so we can just tell users to upgrade.

In D50099#1145567, @adrian wrote:

Ok, just to be clear, there's no "anycast address", right? There's "an address that has also been configured as an anycast address."

to be precise about terminology, according to RFC 4291, if the same IP address is configured on two different interfaces (perhaps on different nodes, but this is not a requirement) then it is an anycast address and technically, to conform to the RFC, you are required to configure such an address with the IFF_ANYCAST flag.

RFC 4291, § 2.6:

"Anycast addresses are allocated from the unicast address space, using any of the defined unicast address formats. Thus, anycast addresses are syntactically indistinguishable from unicast addresses. When a unicast address is assigned to more than one interface, thus turning it into an anycast address, the nodes to which the address is assigned must be explicitly configured to know that it is an anycast address."

in practice, no one actually does this (as we discussed on IRC) due to IFF_ANYCAST addresses being useless.

as an interesting aside, it is possible to configure link-local addresses as IFF_ANYCAST, and this conforms to RFC4291, but it's broken in FreeBSD since we never pass scope id to in6ifa_ifwithaddr(). we probably also do this wrong if the same GUA is assigned to multiple interfaces.

so we should probably fix that as well, but it's largely orthogonal to the current problem.
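
a rough user-space model of the scope id problem, assuming a lookup that matches on the address alone when no scope id is supplied. struct ll_entry, LL_FLAG_ANYCAST and ll_lookup() are hypothetical names for this sketch; this is not the kernel's in6ifa_ifwithaddr() and makes no claim about how that function is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define LL_FLAG_ANYCAST 0x01	/* hypothetical flag for this sketch */

/* One configured address, tied to the interface (scope) it lives on. */
struct ll_entry {
	struct in6_addr addr;
	uint32_t scope_id;
	unsigned flags;
};

/*
 * Find the entry for an address.  A scope_id of 0 is treated as "not
 * supplied", in which case the first entry with a matching address wins,
 * regardless of which interface it belongs to.
 */
const struct ll_entry *
ll_lookup(const struct ll_entry *tab, size_t n,
    const struct in6_addr *addr, uint32_t scope_id)
{
	for (size_t i = 0; i < n; i++) {
		if (memcmp(&tab[i].addr, addr, sizeof(*addr)) != 0)
			continue;
		if (scope_id != 0 && tab[i].scope_id != scope_id)
			continue;
		return (&tab[i]);
	}
	return (NULL);
}
```

if fe80::1 is configured as plain unicast on interface 1 and as anycast on interface 2, a lookup without a scope id returns the interface 1 entry even when the packet arrived on interface 2, so the anycast flag is missed. passing the scope id picks the right entry.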

In D50099#1145646, @ivy wrote:

as an interesting aside, it is possible to configure link-local addresses as IFF_ANYCAST, and this conforms to RFC4291, but it's broken in FreeBSD since we never pass scope id to in6ifa_ifwithaddr(). we probably also do this wrong if the same GUA is assigned to multiple interfaces.

I agree that, formally, an anycast address can be a link-local address. I am not sure if this would actually work and provide a time-constant node selection. This needs testing.

so we should probably fix that as well, but it's largely orthogonal to the current problem.

I have thought about the arguments I provided and the ones provided by Lexi and Adrian.

I still think that some applications might break if they use anycast by accident. For example, a TCP server binds to the wildcard address, and the admin configures an anycast address for running a UDP server, which is fine with anycast.
But this can happen all the time, and it really does not make sense to add a protocol-level switch for a problem that is actually endpoint-specific and might be there or not. And on the endpoint level, binding to specific local addresses is a perfect way to deal with anycast.
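
For illustration, a minimal sketch of that endpoint-level approach: the TCP server binds to one specific local address instead of the wildcard, so it cannot accept connections on an anycast address that was added for a UDP service. The helper names make_sin6() and listen_on() and the backlog of 16 are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fill in a sockaddr_in6 from text; 0 on success, -1 on a bad address. */
int
make_sin6(struct sockaddr_in6 *sin6, const char *text, uint16_t port)
{
	memset(sin6, 0, sizeof(*sin6));
	sin6->sin6_family = AF_INET6;
	sin6->sin6_port = htons(port);
	if (inet_pton(AF_INET6, text, &sin6->sin6_addr) != 1)
		return (-1);
	return (0);
}

/*
 * Create a TCP listener bound to exactly one local address.
 * Returns the socket, or -1 on any failure.
 */
int
listen_on(const char *text, uint16_t port)
{
	struct sockaddr_in6 sin6;
	int s;

	if (make_sin6(&sin6, text, port) != 0)
		return (-1);
	s = socket(AF_INET6, SOCK_STREAM, 0);
	if (s < 0)
		return (-1);
	if (bind(s, (struct sockaddr *)&sin6, sizeof(sin6)) != 0 ||
	    listen(s, 16) != 0) {
		close(s);
		return (-1);
	}
	return (s);
}
```

A server that calls listen_on() with its unicast address only accepts connections to that address. Passing "::" instead gives the wildcard behaviour described above, where every configured address, anycast or not, reaches the listener.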

In summary: I changed my mind and think we don't need a sysctl at all. Let us keep it simple and allow using anycast addresses as source addresses and allow binding anycast addresses to endpoints.
Sorry that it took so long for me to come to this conclusion. Thanks a lot for the discussion!

closing this in favour of D50019. thank you everyone for your input on a surprisingly complicated issue :-)