
tcp: allow connections to IPv6 anycast address
Needs RevisionPublic

Authored by ivy on Fri, Apr 25, 1:18 PM.

Details

Reviewers
kevans
des
rrs
adrian
tuexen
Group Reviewers
network
transport
Summary

currently, we reject incoming TCP connections to an IPv6 anycast address
based on IETF I-D "draft-itojun-ipv6-tcp-to-anycast-01"[0]. the
rationale is that since RFC2373 prohibits sending IPv6 packets with an
anycast address as the source address, it would be impossible to
establish a TCP connection to such an address since the destination host
could not send any replies.

however, this restriction was lifted in RFC4291 and it is no longer
forbidden to send packets from an anycast address; therefore, it's both
possible and permitted to establish a TCP connection using an anycast
address as src or dst address (or both).

based on the above, delete this restriction and allow people to do this.

while there are certain operational reasons to avoid TCP anycast (such
as the risk of the route changing while the connection is open), these
also apply to IPv4 anycast and are specific to the local environment;
for example, it's perfectly valid to have an anycast address which is
only ever assigned to one node.

[0] https://www.ietf.org/archive/id/draft-itojun-ipv6-tcp-to-anycast-01.txt
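As a rough illustration of the change described above, the old and new handling of an incoming SYN can be modelled in userland C. This is only a sketch under stated assumptions: `MODEL_IFF_ANYCAST`, `syn_verdict_old` and `syn_verdict_new` are names made up for this example, not the kernel's identifiers, and the real check operates on the interface address looked up for the packet's destination.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-address anycast flag. */
#define MODEL_IFF_ANYCAST 0x01u

enum syn_verdict { SYN_ACCEPT, SYN_REJECT_ICMP };

/*
 * Old behaviour (model): a SYN whose destination is a local address
 * marked anycast is refused with an ICMP error.
 */
enum syn_verdict
syn_verdict_old(bool dst_is_local, uint32_t dst_flags)
{
	if (dst_is_local && (dst_flags & MODEL_IFF_ANYCAST) != 0)
		return (SYN_REJECT_ICMP);
	return (SYN_ACCEPT);
}

/*
 * Proposed behaviour (model): the anycast flag no longer influences
 * whether the SYN is accepted.
 */
enum syn_verdict
syn_verdict_new(bool dst_is_local, uint32_t dst_flags)
{
	(void)dst_is_local;
	(void)dst_flags;
	return (SYN_ACCEPT);
}
```

In this model the only difference between the two functions is the anycast test, which mirrors the claim that the patch deletes a restriction rather than adding behaviour.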

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 63724
Build 60608: arc lint + arc unit

Event Timeline

ivy requested review of this revision. Fri, Apr 25, 1:18 PM
This revision is now accepted and ready to land. Fri, Apr 25, 1:46 PM

I do understand that anycast addresses can be used as source and destination addresses in combination with UDP, but I share the view stated in section 3.1 of RFC 7094:

Most stateful transport protocols (e.g., TCP), without modification,
do not understand the properties of anycast; hence, they will fail
probabilistically, but possibly catastrophically, when using anycast
addresses in the presence of "normal" routing dynamics.

In section 4.2 a possible way of handling anycast addresses in TCP is described, which is not implemented as far as I know. For SCTP we can implement something like https://datatracker.ietf.org/doc/html/draft-tuexen-tsvwg-sctp-init-fwd-02.

So I think we should not allow binding a TCP endpoint to an anycast address... Doing it for UDP is fine.

i agree there are reasons not to do this, but by forbidding it we're essentially saying there's never any reason to do this, which i don't think is true. for example, i have some anycast addresses in my network which are only on one node, and these work fine, although of course they are not currently configured as IFF_ANYCAST addresses due to this restriction.

my feeling here is that we shouldn't nanny the user; anyone configuring anycast addresses should be aware of the operational implications of doing this and make their own decision about it. currently, they're forced to use non-IFF_ANYCAST addresses to do this, which increases the risk of error from e.g. accidentally using an anycast address as the source of an outgoing connection due to the source address selection algorithm. that seems definitely worse than using an anycast address.

the solution in RFC 7094 §4.2 is quite clever, but does anyone currently implement either the client or server side of this? it seems like it would have implications for the security of TCP connections.

adrian added a subscriber: adrian.
In D50019#1140715, @ivy wrote:

i agree there are reasons not to do this, but by forbidding it we're essentially saying there's never any reason to do this, which i don't think is true. for example, i have some anycast addresses in my network which are only on one node, and these work fine, although of course they are not currently configured as IFF_ANYCAST addresses due to this restriction.

my feeling here is that we shouldn't nanny the user; anyone configuring anycast addresses should be aware of the operational implications of doing this and make their own decision about it. currently, they're forced to use non-IFF_ANYCAST addresses to do this, which increases the risk of error from e.g. accidentally using an anycast address as the source of an outgoing connection due to the source address selection algorithm. that seems definitely worse than using an anycast address.

+1, I think it's fine to let the user shoot their own foot so to speak. If they're configuring ANYCAST then they should know what that entails.

the solution in RFC 7094 §4.2 is quite clever, but does anyone currently implement either the client or server side of this? it seems like it would have implications for the security of TCP connections.

Maybe add a comment to replace the section you deleted covering that ANYCAST is fine now, see RFC xxxx ?

Oh that's a neat hack, but yeah, at that point you should be using anycast for service location, and then protocol level redirects (eg HTTP 3xx) to clean things up.

i discussed this a bit elsewhere and someone pointed out that if you want to use anycast for DNS then you have to enable TCP or your DNS server won't work properly. given how many people have successfully deployed anycast DNS, it seems like there's a strong operational argument to permit this.

Maybe add a comment to replace the section you deleted covering that ANYCAST is fine now, see RFC xxxx ?

i considered this but it seemed odd to have a comment with no associated code saying "don't do anything here since we don't disallow this". in exchange i tried to make the commit message as informative as possible for anyone who is curious :-)

In D50019#1140724, @ivy wrote:

i discussed this a bit elsewhere and someone pointed out that if you want to use anycast for DNS then you have to enable TCP or your DNS server won't work properly. given how many people have successfully deployed anycast DNS, it seems like there's a strong operational argument to permit this.

I am aware that you use DNS/TCP in addition to DNS/UDP and I am aware that for DNS/UDP anycast is used. But are you saying that people deploy TCP in combination with anycast at scale?

my feeling here is that we shouldn't nanny the user; anyone configuring anycast addresses should be aware of the operational implications of doing this and make their own decision about it. currently, they're forced to use non-IFF_ANYCAST addresses to do this, which increases the risk of error from e.g. accidentally using an anycast address as the source of an outgoing connection due to the source address selection algorithm. that seems definitely worse than using an anycast address.

OK. But if TCP connections are failing, we can't do much about it.

the solution in RFC 7094 §4.2 is quite clever, but does anyone currently implement either the client or server side of this? it seems like it would have implications for the security of TCP connections.

In my view this breaks a couple of things. Therefore, I doubt that it will be implemented.
One could use a variant of MPTCP (which is not implemented in FreeBSD right now) for the TCP use case of the INIT forwarding for SCTP (which is easy to implement).

are you saying that people deploy TCP in combination with anycast at scale?

yes, for example see: https://www.ripe.net/publications/docs/ripe-393/

The impact of this problem [i.e., stateful connections over anycast] is not clear: for example, in a study of J-root [5] the authors state that this is a serious problem and recommend that stateful services not be run on anycast at all. Other work has since concluded that the impact of node switches is not significant enough to be a concern [6, 12]. Our own results for K-root are presented in Section 4.3.

so, there are mixed results, but some people are using it successfully at scale.

In D50019#1140737, @ivy wrote:

are you saying that people deploy TCP in combination with anycast at scale?

yes, for example see: https://www.ripe.net/publications/docs/ripe-393/

The impact of this problem [i.e., stateful connections over anycast] is not clear: for example, in a study of J-root [5] the authors state that this is a serious problem and recommend that stateful services not be run on anycast at all. Other work has since concluded that the impact of node switches is not significant enough to be a concern [6, 12]. Our own results for K-root are presented in Section 4.3.

so, there are mixed results, but some people are using it successfully at scale.

I see. They are using short-lived TCP connections; on that time scale the anycast routing seems to be stable.

Thank you very much for sharing this document!

rrs requested changes to this revision. Sat, Apr 26, 2:07 PM

How about a compromise here... I do think Michael has a valid point...

Add in a sysctl that defaults to off. And add back the code you took out with a

if ((sysctl_var == true) && (ia6 && ...)) {
}

This way you have to set the sysctl to true (blatantly shooting yourself in the foot).

R
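The compromise suggested above could look roughly like the following userland model. It is a sketch only: it assumes the knob means "allow TCP to anycast addresses" and defaults to off, and the function name `syn_rejected_with_knob` and its parameters are hypothetical, not taken from the patch.

```c
#include <stdbool.h>

/*
 * Model of a sysctl-gated check. `allow_anycast` stands in for the
 * proposed sysctl (default false); `dst_is_anycast` stands in for the
 * result of looking up the destination address and testing its
 * anycast flag. Returns true when the SYN would be refused.
 */
bool
syn_rejected_with_knob(bool allow_anycast, bool dst_is_anycast)
{
	return (!allow_anycast && dst_is_anycast);
}
```

With the knob left at its default, the model behaves like the code being removed; only an administrator who flips it gets the new behaviour.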

This revision now requires changes to proceed. Sat, Apr 26, 2:07 PM
In D50019#1141053, @rrs wrote:

How about a compromise here... I do think Michael has a valid point...

Add in a sysctl that defaults to off. And add back the code you took out with a

if ((sysctl_var == true) && (ia6 && ...)) {
}

This way you have to set the sysctl to true (blatantly shooting yourself in the foot).

R

I agree with Randall. I think in other cases the default values are safe and you have to do something (change a sysctl variable, for example) to shoot yourself in the foot...

here's the thing though, you can already accept TCP connections to an anycast address by simply not marking the address as IFF_ANYCAST, and this is what everyone does today. if you force users to set a sysctl to do this, they will just not bother marking addresses as IFF_ANYCAST and will be more likely to run into problems from e.g. accidentally originating outgoing connections from an anycast address.

or in other words, the choice here isn't "should we let users accept TCP connections to anycast addresses?" because it's impossible to prevent that, it's "should we allow users to correctly mark their existing anycast addresses as anycast addresses?". i don't see any advantage to forcing users to set a sysctl in order to configure networking more correctly.

more sysctls is more confusing and creates more pain for users, removing special cases and unnecessary code is more beautiful. :-)

however, i would be willing to add some text to ifconfig.8 to mention the potential issues with TCP anycast and refer them to some appropriate documentation.
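To make the source-address concern concrete, here is a small userland sketch. It assumes, as the comment above implies, that source address selection skips addresses marked anycast; `struct model_addr` and `pick_source_model` are hypothetical names, and the addresses used with it are documentation-prefix examples.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_addr {
	const char *text;	/* printable address, illustration only */
	bool anycast;		/* administrator marked it as anycast */
};

/*
 * Return the first candidate that is not marked anycast, or NULL if
 * every candidate is anycast. A real selection algorithm applies many
 * more rules; only the anycast rule is modelled here.
 */
const struct model_addr *
pick_source_model(const struct model_addr *cand, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!cand[i].anycast)
			return (&cand[i]);
	}
	return (NULL);
}
```

If the service address is listed first and is not marked anycast, this model happily returns it as a source address, which is the accidental case described above; once it is marked, the next unicast address is chosen instead.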

In D50019#1141170, @ivy wrote:

here's the thing though, you can already accept TCP connections to an anycast address by simply not marking the address as IFF_ANYCAST, and this is what everyone does today. if you force users to set a sysctl to do this, they will just not bother marking addresses as IFF_ANYCAST and will be more likely to run into problems from e.g. accidentally originating outgoing connections from an anycast address.

or in other words, the choice here isn't "should we let users accept TCP connections to anycast addresses?" because it's impossible to prevent that, it's "should we allow users to correctly mark their existing anycast addresses as anycast addresses?". i don't see any advantage to forcing users to set a sysctl in order to configure networking more correctly.

more sysctls is more confusing and creates more pain for users, removing special cases and unnecessary code is more beautiful. :-)

however, i would be willing to add some text to ifconfig.8 to mention the potential issues with TCP anycast and refer them to some appropriate documentation.

And who exactly is "everyone"?

I know for a fact Netflix (one of the main users of FreeBSD) does not.

I strongly suggest you add the sysctl and suggested code. In general I think it best to make it so naive users will not "shoot themselves in the foot". And let only power users do that.

In D50019#1141201, @rrs wrote:

here's the thing though, you can already accept TCP connections to an anycast address by simply not marking the address as IFF_ANYCAST, and this is what everyone does today.

And who exactly is "everyone"?

I know for a fact Netflix (one of the main users of FreeBSD) does not.

by 'everyone' i mean 'everyone using TCP on an anycast address' - sorry if this was not clear.

I strongly suggest you add the sysctl and suggested code. In general I think it best to make it so naive users will not "shoot themselves in the foot". And let only power users do that.

i do not think simply wrapping the current code in a sysctl is the right approach, but let me explain why and suggest an alternative.

firstly, regardless of anything else, i think the current code should be removed as it's based on a 20-year-old expired I-D, and the only reason it's done this way is that at the time you couldn't send a RST packet from an anycast address. but now you can do that, so if we're going to forbid TCP connections to anycast addresses, i think we should do this with a RST, not an ICMP error. does that seem reasonable?

secondly, this problem isn't specific to TCP, there are also UDP protocols that don't always work well over anycast:

  • QUIC (the protocol supports anycast, but server implementation may not or admin may not configure it)
  • NTP (client may keep state with server)
  • DNS with EDNS cookies (depending on how the server is configured)
  • probably more i haven't thought of; any UDP service that keeps state and is not explicitly anycast-aware could be affected.

of course, it's possible to implement these services in such a way that they work with anycast, but the idea here is to protect users who don't understand that this might be required or don't even realise it's a problem.

so my suggestion, which solves both of these issues, is to remove the current code in tcp and instead place bind() behind a sysctl, i.e., you cannot bind a socket to an IFF_ANYCAST address unless the sysctl is enabled, regardless of protocol - this would apply to TCP, UDP, SCTP, raw sockets, whatever.

does this seem reasonable? i believe it's simpler for users because behaviour of IFF_ANYCAST is the same regardless of protocol, and it also more thoroughly addresses the current objection.
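The protocol-independent bind() gate proposed above can be sketched as a single predicate. This is a userland model, not the actual change: the function name is hypothetical, and returning `EADDRNOTAVAIL` is an assumption made for the example since the thread does not say which error would be used.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of a bind-time check that is the same for every protocol.
 * `addr_is_anycast` stands in for the address flag lookup and
 * `allow_anycast_bind` for the proposed sysctl. Returns 0 if the
 * bind may proceed, or an errno value (assumed) if it is refused.
 */
int
bind_check_anycast_model(bool addr_is_anycast, bool allow_anycast_bind)
{
	if (addr_is_anycast && !allow_anycast_bind)
		return (EADDRNOTAVAIL);
	return (0);
}
```

Because the protocol is not an input, TCP, UDP, SCTP and raw sockets all see the same answer, which is the simplification being argued for.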

In D50019#1141295, @ivy wrote:
In D50019#1141201, @rrs wrote:

here's the thing though, you can already accept TCP connections to an anycast address by simply not marking the address as IFF_ANYCAST, and this is what everyone does today.

And who exactly is "everyone"?

I know for a fact Netflix (one of the main users of FreeBSD) does not.

by 'everyone' i mean 'everyone using TCP on an anycast address' - sorry if this was not clear.

I strongly suggest you add the sysctl and suggested code. In general I think it best to make it so naive users will not "shoot themselves in the foot". And let only power users do that.

i do not think simply wrapping the current code in a sysctl is the right approach, but let me explain why and suggest an alternative.

firstly, regardless of anything else, i think the current code should be removed as it's based on a 20-year-old expired I-D, and the only reason it's done this way is that at the time you couldn't send a RST packet from an anycast address. but now you can do that, so if we're going to forbid TCP connections to anycast addresses, i think we should do this with a RST, not an ICMP error. does that seem reasonable?

Yes, TCP does its own error handling and does not use ICMP for it like UDP.

secondly, this problem isn't specific to TCP, there are also UDP protocols that don't always work well over anycast:

  • QUIC (the protocol supports anycast, but server implementation may not or admin may not configure it)
  • NTP (client may keep state with server)
  • DNS with EDNS cookies (depending on how the server is configured)
  • probably more i haven't thought of; any UDP service that keeps state and is not explicitly anycast-aware could be affected.

of course, it's possible to implement these services in such a way that they work with anycast, but the idea here is to protect users who don't understand that this might be required or don't even realise it's a problem.

so my suggestion, which solves both of these issues, is to remove the current code in tcp and instead place bind() behind a sysctl, i.e., you cannot bind a socket to an IFF_ANYCAST address unless the sysctl is enabled, regardless of protocol - this would apply to TCP, UDP, SCTP, raw sockets, whatever.

I do see a difference between UDP and raw sockets on the one side, and TCP and SCTP on the other side:
If you use anycast with UDP, or raw sockets, the upper layer protocol might or might not break. The lower layer is fine.
If you use anycast with TCP or SCTP, it is the transport layer which might break or not.

So maybe have a net.inet.tcp sysctl, which you need to enable to bind an anycast address to a TCP socket? For SCTP, anycast addresses are not allowed right now by the code, but one could relax that on a per-socket basis, if INIT forwarding is implemented and enabled on that socket.

Does that sound reasonable?

does this seem reasonable? i believe it's simpler for users because behaviour of IFF_ANYCAST is the same regardless of protocol, and it also more thoroughly addresses the current objection.

I am not sure if it is protocol agnostic as mentioned above.
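For comparison, the per-protocol alternative described in this comment might be modelled as below. Again this is only a sketch: the enum, the function name and the `tcp_knob` parameter are hypothetical, and the SCTP case simply reflects the statement that anycast addresses are not allowed there right now.

```c
#include <stdbool.h>

enum model_proto { MODEL_TCP, MODEL_UDP, MODEL_RAW, MODEL_SCTP };

/*
 * Model of a per-protocol policy: UDP and raw sockets may always bind
 * an anycast address, TCP only when a TCP-specific knob is enabled,
 * and SCTP never (no INIT forwarding in this model).
 */
bool
anycast_bind_allowed_model(enum model_proto proto, bool tcp_knob)
{
	switch (proto) {
	case MODEL_UDP:
	case MODEL_RAW:
		return (true);
	case MODEL_TCP:
		return (tcp_knob);
	case MODEL_SCTP:
		return (false);
	}
	return (false);
}
```

Unlike a protocol-independent gate, the answer here depends on the transport, which is the trade-off the two positions in the thread disagree about.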

Yes, TCP does its own error handling and does not use ICMP for it like UDP.

i've done this in D50099 (among a lot of other changes, but that part is fairly self-contained and could be split out).

I do see a difference between UDP and raw sockets on the one side, and TCP and SCTP on the other side:
If you use anycast with UDP, or raw sockets, the upper layer protocol might or might not break. The lower layer is fine.
If you use anycast with TCP or SCTP, it is the transport layer which might break or not.

i understand your point, but from the user's perspective, i'm not sure it makes a difference. if the user configures a TCP HTTP server on an anycast address, it might break, and if the user configures a UDP QUIC server on an anycast address, it might break. even deploying UDP DNS on anycast might break if cookies are enabled. i think we're all agreed that there's no technical reason to restrict this, but rather the purpose is to avoid letting the user configure things that might break in unexpected ways. do we think users who aren't aware of the problems with TCP anycast are going to be aware of the problems with UDP DNS anycast? i feel like the answer will be 'no' in a non-trivial number of cases.

for me, the purpose of the sysctl is to give the user a nudge that anycast addresses are special and perhaps they don't want to do this, or at least they should look into it further, and this applies equally regardless of which layer the problem actually occurs on. i might change my mind if we could modify all userland DNS servers (for example) to refuse to run on UDP anycast addresses unless configured correctly, but that's clearly not feasible, so this seems like the least worst alternative.

if i can rephrase your position slightly (please correct me if i'm wrong here) i think your concern, as tcp maintainer, is that you don't want to ship a tcp stack that can break like this in the default configuration, because it's then "the fault" of the tcp stack for permitting the bad configuration. i'm not disagreeing with that, i only feel like it's useful to extend this to udp as well simply because that seems beneficial to users.

In D50019#1142820, @ivy wrote:

Yes, TCP does its own error handling and does not use ICMP for it like UDP.

i've done this in D50099 (among a lot of other changes, but that part is fairly self-contained and could be split out).

I do see a difference between UDP and raw sockets on the one side, and TCP and SCTP on the other side:
If you use anycast with UDP, or raw sockets, the upper layer protocol might or might not break. The lower layer is fine.
If you use anycast with TCP or SCTP, it is the transport layer which might break or not.

i understand your point, but from the user's perspective, i'm not sure it makes a difference. if the user configures a TCP HTTP server on an anycast address, it might break, and if the user configures a UDP QUIC server on an anycast address, it might break. even deploying

Sure, both are connection oriented. But if you run a simple query/response protocol on top of UDP, it will work.

UDP DNS on anycast might break if cookies are enabled. i think we're all agreed that there's no technical reason to restrict this, but rather the purpose is to avoid letting the user configure things that might break in unexpected ways. do we think users who aren't aware of the problems with TCP anycast are going to be aware of the problems with UDP DNS anycast? i feel like the answer will be 'no' in a non-trivial number of cases.

for me, the purpose of the sysctl is to give the user a nudge that anycast addresses are special and perhaps they don't want to do this, or at least they should look into it further, and this applies equally regardless of which layer the problem actually occurs on. i might change my mind if we could modify all userland DNS servers (for example) to refuse to run on UDP anycast addresses unless configured correctly, but that's clearly not feasible, so this seems like the least worst alternative.

if i can rephrase your position slightly (please correct me if i'm wrong here) i think your concern, as tcp maintainer, is that you don't want to ship a tcp stack that can break like this in the default configuration, because it's then "the fault" of the tcp stack for permitting the bad configuration. i'm not disagreeing with that, i only feel like it's useful to extend this to udp as well simply because that seems beneficial to users.

I still think that you might want to run a UDP-based service on an anycast address, but not a TCP one. Can we support this?
To be clear: I would like to have a setting, where UDP is allowed to be used, but TCP isn't.

I still think that you might want to run a UDP-based service on an anycast address, but not a TCP one.

if you want that, can't you just... not run a TCP service?

i'm not trying to be flippant there but i don't quite understand the use-case for this. the sysctl is basically a knob that says "i understand what i'm doing"; as long as you understand what you're doing, you can run UDP over anycast, or TCP over anycast, or both, or you could just enable it and not run any services at all. i don't see the benefit of being able to say "i understand what i'm doing but i want to be prevented from doing this anyway, just in case".

Can we support this?

i'm already not convinced about the value of the sysctl, because i think it's going to cause more confusion/frustration than it avoids, but i can accept it if it's just making the address either usable or not usable. having different protocols behave differently (especially by default, which i can tell someone will suggest) would make that trade-off even less worth it. so... if doing it like this is the only way to let me configure my existing anycast addresses as IFF_ANYCAST, then i will do that, but i think this is a bad design.

Can we support this?

see the update to D50099.

In D50019#1143124, @ivy wrote:

Can we support this?

see the update to D50099.

I'm going to re-iterate on this review as well - the existence of the anycast flag is the configuration knob that behaviour can/should be gated behind. It's not a property of the address value, it's a configured flag by the administrator. I don't believe we need another sysctl on top of this.

I think now, this is OK. As a follow-up, we can do the m_pullup() dance in tcp_input_with_port() similar to the IPv4 code path.

i intend to land this on Saturday unless anyone objects before then.