Page MenuHomeFreeBSD

TCP Fast Open (TFO) [RFC7413] Client-side Implementation
AbandonedPublic

Authored by pkelsey on Dec 31 2017, 8:56 PM.

Details

Summary

This is primarily an implementation of the client side of RFC7413. It
also includes a pre-shared key mode of operation in which the server
requires the client to be in possession of a shared secret in order to
successfully open TFO connections with that server.

The TFO code is now unconditionally compiled into the kernel and the
TCP_RFC7413 option has been removed. The TCP_RFC7413_MAX_KEYS
option has been renamed to TCP_FASTOPEN_MAX_KEYS.

A bug in tcp_fastopen_check_cookie() where V_tcp_fastopen_numkeys was
being evaluated outside of the lock has been fixed.

The names of some existing fastopen sysctls have changed (e.g.,
net.inet.tcp.fastopen.enabled ->
net.inet.tcp.fastopen.server_enabled).

Test Plan

Tested using the quick-and-dirty tools at
https://people.freebsd.org/~pkelsey/tfo-tools/

I have been hacking on some packetdrill scripts that are not quite
ready for prime time, but should be soon.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint WarningsExcuse: Will address phabricator hyphenation opinions later
SeverityLocationCodeMessage
Warningsys/netinet/tcp_fastopen.c:882SPELL1Possible Spelling Mistake
Unit
No Unit Test Coverage
Build Status
Buildable 14012
Build 14204: arc lint + arc unit

Event Timeline

pkelsey created this revision.Dec 31 2017, 8:56 PM
hiren added a subscriber: hiren.Jan 1 2018, 7:31 AM

Awesome! Thank you for this work!

Now, sorry to be a PITA but you should separate out this into 3 reviews: 1) fast-open ON by default removing TCP_RFC7413 2) Bug fix for the issue in tcp_fastopen_check_cookie() 3) Client side TFO implementation.

tuexen added a subscriber: tuexen.Jan 1 2018, 8:24 AM

I agree with hiren... Splitting this up in three patch sets makes a lot of sense and makes the review much simpler. While doing that, you could also change the name of the sysctl variables net.inet.tcp.fastopen.server_enabled to net.inet.tcp.fastopen.server_enable and net.inet.tcp.fastopen.client_enabled to net.inet.tcp.fastopen.client_enable to be consistent with other sysctl variables controlling the usage of features.

Awesome! Thank you for this work!

Now, sorry to be a PITA but you should separate out this into 3 reviews: 1) fast-open ON by default removing TCP_RFC7413 2) Bug fix for the issue in tcp_fastopen_check_cookie() 3) Client side TFO implementation.

I can do this I suppose, although I have to say it's hard for me to see the value in doing it as (2) and (3) seem pretty easy to evaluate in the current patchset, as (2) consists of moving a lock acquisition earlier in a small function and ensuring release on all the exit paths, and (3) is the mechanical removal of #ifdefs and the related rename of a kernel option that is used in one place. Perhaps because I don't think I fully understand the practical motivation for this separation, I'm also not sure whether or not it matters how I synthesize the new versions of the code - should they all be independent diffs against -current (which seems weird as it can hide interactions between the changes from review), or is there some preferred sequence (should the locking bug be presented as happening before or after all the other changes to same function)?

I agree with hiren... Splitting this up in three patch sets makes a lot of sense and makes the review much simpler. While doing that, you could also change the name of the sysctl variables net.inet.tcp.fastopen.server_enabled to net.inet.tcp.fastopen.server_enable and net.inet.tcp.fastopen.client_enabled to net.inet.tcp.fastopen.client_enable to be consistent with other sysctl variables controlling the usage of features.

With respect to 'enable' vs 'enabled', I considered this issue, but I didn't find the consistency you are claiming. I saw that both are used, and I was unable to discern any rule as to when one was employed versus the other, so I picked the one that reads as a status indicator as that seemed like it would be the more frequent reason they would be read. As seen on an r327445 kernel (with my TFO patches applied, hence the grep -v), 'enabled' is used in 8 names (5 of them are loader tunables, and an overlapping 5 are runtime writeable):

$ sysctl -aN | grep -v fastopen | grep enabled | wc -l

8
hiren added a comment.Jan 2 2018, 6:16 PM

Awesome! Thank you for this work!

Now, sorry to be a PITA but you should separate out this into 3 reviews: 1) fast-open ON by default removing TCP_RFC7413 2) Bug fix for the issue in tcp_fastopen_check_cookie() 3) Client side TFO implementation.

I can do this I suppose, although I have to say it's hard for me to see the value in doing it as (2) and (3) seem pretty easy to evaluate in the current patchset, as (2) consists of moving a lock acquisition earlier in a small function and ensuring release on all the exit paths, and (3) is the mechanical removal of #ifdefs and the related rename of a kernel option that is used in one place. Perhaps because I don't think I fully understand the practical motivation for this separation, I'm also not sure whether or not it matters how I synthesize the new versions of the code - should they all be independent diffs against -current (which seems weird as it can hide interactions between the changes from review), or is there some preferred sequence (should the locking bug be presented as happening before or after all the other changes to same function)?

Motivation: easier on reviewers.
Also, I don't think you'd want to commit this as is, do you? These are 3 separate changes and I prefer 3 different commits. But as you are the one doing actual work, you can decide how to go about it. :-)
I'd get 2) and 3) committed first as they are small and then generate 1) for larger review. But again, your call. :-)

Irrespective of how we get there, it is awesome to have client side support in FreeBSD!

I agree with hiren... Splitting this up in three patch sets makes a lot of sense and makes the review much simpler. While doing that, you could also change the name of the sysctl variables net.inet.tcp.fastopen.server_enabled to net.inet.tcp.fastopen.server_enable and net.inet.tcp.fastopen.client_enabled to net.inet.tcp.fastopen.client_enable to be consistent with other sysctl variables controlling the usage of features.

With respect to 'enable' vs 'enabled', I considered this issue, but I didn't find the consistency you are claiming. I saw that both are used, and I was unable to discern any rule as to when one was employed versus the other, so I picked the one that reads as a status indicator as that seemed like it would be the more frequent reason they would be read. As seen on an r327445 kernel (with my TFO patches applied, hence the grep -v), 'enabled' is used in 8 names (5 of them are loader tunables, and an overlapping 5 are runtime writeable):

$ sysctl -aN | grep -v fastopen | grep enabled | wc -l

8

I should be clear, I'm not so much arguing that one is better than the other as I'm interested in how the decision could be seen as non-arbitrary - are you saying I might as well make it match other 'enables' in the network stack, or that systemwide, majority rules on this naming question, or...

tuexen added a comment.Jan 2 2018, 6:38 PM

Regarding enable versus enabled, here is what I was referring to:

> sysctl net.inet.tcp | grep enable
net.inet.tcp.sack.enable: 1
net.inet.tcp.ecn.enable: 2
net.inet.tcp.hostcache.enable: 1
net.inet.tcp.fastopen.enabled: 0
> sysctl net.inet.sctp | grep enable
net.inet.sctp.enable_sack_immediately: 1
net.inet.sctp.pktdrop_enable: 0
net.inet.sctp.nrsack_enable: 0
net.inet.sctp.reconfig_enable: 1
net.inet.sctp.asconf_enable: 1
net.inet.sctp.auth_enable: 1
net.inet.sctp.pr_enable: 1
net.inet.sctp.ecn_enable: 1
> sysctl net.inet.ip | grep enable
net.inet.ip.fw.enable: 1
> sysctl net.inet6.ip6 | grep enable
net.inet6.ip6.fw.enable: 1

As far as I can see, only TCP FO uses enabled... Focusing on transport/network layer related sysctl variables.

Awesome! Thank you for this work!

Now, sorry to be a PITA but you should separate out this into 3 reviews: 1) fast-open ON by default removing TCP_RFC7413 2) Bug fix for the issue in tcp_fastopen_check_cookie() 3) Client side TFO implementation.

I can do this I suppose, although I have to say it's hard for me to see the value in doing it as (2) and (3) seem pretty easy to evaluate in the current patchset, as (2) consists of moving a lock acquisition earlier in a small function and ensuring release on all the exit paths, and (3) is the mechanical removal of #ifdefs and the related rename of a kernel option that is used in one place. Perhaps because I don't think I fully understand the practical motivation for this separation, I'm also not sure whether or not it matters how I synthesize the new versions of the code - should they all be independent diffs against -current (which seems weird as it can hide interactions between the changes from review), or is there some preferred sequence (should the locking bug be presented as happening before or after all the other changes to same function)?

Motivation: easier on reviewers.
Also, I don't think you'd want to commit this as is, do you? These are 3 separate changes and I prefer 3 different commits. But as you are the one doing actual work, you can decide how to go about it. :-)
I'd get 2) and 3) committed first as they are small and then generate 1) for larger review. But again, your call. :-)

I was going to commit this as one commit, otherwise I would have split it up to begin with. The number of changes in here depends on how we segment patch lines into 'changes' (I mean, obviously, but I've got to start the rest with that recitation). We could, for example, break this into even more pieces by segmenting along the lines of required, optional, and non-RFC extensional TFO functionality. The reason I haven't is because I fail to see the point - I don't see it as likely we would want to treat any subsets of this patch separately from the whole with respect to commit and MFC, and it seems to me under any reasonable segmentation, we still wind up with the lion's share of the cognitive load in reviewing a single one of the segmented patches, so segmenting feels a bit hobgobliny.

I really just see this as 'completion of TFO support' without any practical reason to segment it any further. The bug I fixed has no significant consequence that I can identify - it makes it possible that some TFO requests would wind up unnecessarily falling back to normal 3WHS behavior while the entire facility is being enabled/disabled. The code change is trivial to evaluate and I don't see any need to MFC that fix separately from the MFC of the rest of TFO support, so it makes sense to me to just keep it rolled in with the rest. With respect to getting rid of the conditional compilation support (the existence of which in the first place was a *very* conservative decision that was itself controversial), I don't see that as something conceptually separable here in practical terms - this is the completion of TFO support and it is going be deployed on some pretty significant infrastructure. Either all of it will be ready to be always-on and MFC'd or none of it is. I was thinking something along the lines of MFC-in-three-months would be appropriate.

pkelsey added inline comments.Jan 2 2018, 9:44 PM
sys/netinet/tcp_input.c
2042

Typo: rightg -> right

sys/netinet/tcp_output.c
548

As TFO permits data to be sent in SYN and SYN|ACK packets, I think we need to add '&& !(flags & TH_SYN)' here to ensure tso is disabled when SYN or SYN|ACK packets are sent.

pkelsey added inline comments.Jan 4 2018, 3:57 AM
sys/netinet/tcp_fastopen.c
859

It's possible that cce->cookie_len is zero here - this can happen when a path is reënabled (and thus its prior cookie data is cleared) during a connection attempt and another connection attempt occurs on the same path before the current cookie is returned by the server. In this case, we really should send a cookie request instead of a PSK cookie computed from a null cookie, even though the result of sending an invalid PSK cookie will be the same as sending a cookie request - the server will send the current cookie value in response.

Setting tp->tfo_client_cookie_len to zero here when cce->cookie_len is zero, instead of computing a PSK cookie, will cause a cookie request to be sent.

922

I don't believe this is actually necessary as tcp_fastopen_disable_path() will create an entry for the path if one doesn't already exist. This call and the comment above should just be replaced with a comment that a cookie request will be sent and the cookie details are already zero from tcpcb init.

swills added a subscriber: swills.Jan 14 2018, 1:46 AM
pkelsey abandoned this revision.Jan 25 2018, 6:01 AM

Breaking into separate reviews.