kern: net: remove TCP_LINGERTIME
ClosedPublic
Actions

Authored by kevans on Jan 21 2021, 4:06 AM.

Details

Reviewers

rscheff
tuexen

Group Reviewers

network
transport

Commits

rG4c0bef07be07: kern: net: remove TCP_LINGERTIME

Summary

The only purpose of this that I can find is to cause surprising behavior on
accepted connections. Newly-created sockets will never hit these paths as
one cannot set SO_LINGER prior to socket(2). If SO_LINGER is set on a
listening socket and inherited, one would expect the timeout to be inherited
rather than changed arbitrarily like this.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

kevans created this revision.Jan 21 2021, 4:06 AM

Herald added a reviewer: transport. · View Herald TranscriptJan 21 2021, 4:06 AM

Herald added subscribers: melifaro, imp. · View Herald Transcript

kevans requested review of this revision.Jan 21 2021, 4:06 AM

Harbormaster completed remote builds in B36385: Diff 82676.Jan 21 2021, 4:07 AM

Can you describe how one can hit the code you are suggesting to be removed?

In D28265#632217, @tuexen wrote:

Can you describe how one can hit the code you are suggesting to be removed?

In theory, set SO_LINGER with l_onoff=1, l_linger=0 on a listening socket; sonewconn() will inherit the SO_LINGER and linger time but hit this path and reset it to TCP_LINGERTIME. My best guess is that this might've been intended at one point to apply across all TCP sockets, but ended up getting hammered around until we arrived at it only applying to accept()d sockets.

In practice, I found that this was subtly broken in r10:779f106aa169 which moved setting so->so_options in sonewconn() to *after* pru_attach(). This made this effectively dead code, and we've spent the last ~3 years with this being dead code.

I sent a separate e-mail to @glebius to gather his thoughts on it, because the only effect I can see from the move across all in-tree pru_attach() that may be called from sonewconn() is that these two bits became dead code. Nothing else in-tree relies on an accurate so->so_options in pru_attach.

How is the default 2 minute "linger" timer being implemented now if this is dead code? iirc this value use to be used when a socket was closed on the server side to control how long one waited in close_wait for the client to finish up the connection (server initiated close.)

sys/netinet/tcp_usrreq.c
2566	The fact that so_linger is NOT 0 becomes critical here, as it causes the else clause to trigger. This is the expected default behavior of TCP. If we do not initialize so_linger to TCP_LINGERTIME above this second condition, == 0, is always true, thus altering the expected default behavior of SO_LINGER if one does not also set the timer.

kevans added inline comments.Jan 21 2021, 3:06 PM

sys/netinet/tcp_usrreq.c
2566	I'm not sure what you're saying here. SO_LINGER cannot be set when the socket is initialized, and you can't set it post-creation without also specifying a timeout. If you have some out-of-band method to set it on a socket that doesn't pair those together, then that should be fixed to apply whatever TCP_LINGERTIME semantics may be needed; the bits I've removed here are not the droids you are looking for, though.

kevans mentioned this in rG504ebd612ec6: kern: sonewconn: set so_options before pru_attach().Feb 9 2021, 3:47 AM

Ping; we now again hit this path as of 504ebd612ec6 when we create a new connection from a listening socket with SO_LINGER set. I'm not aware of any other implementation that has this TCP_LINGERTIME type of concept that specifically applies to SO_LINGER, and especially not one that resets a 0 lingertime to any other value for those inherited from a listening socket -- but I only tested/inspected Illumos, and tested Linux.

Let me have a look into it tomorrow. I agree, it looks strange.

looked into uipc_socket.c, sonewconn() does as you mention, and yes, I would agree that the expected behavior should be to retain the settings of the listening socket rather than change this... But I leave it to @tuexen to give the verdict on this.

In D28265#642773, @tuexen wrote:

Let me have a look into it tomorrow. I agree, it looks strange.

Thanks; I think my description is clear enough, but I've also got this (suboptimal, but functional) demo of the issue (adapted from a previous reproducer I've sent, in case it looks familiar): https://people.freebsd.org/~kevans/list.c (cc -pthread)

I think clamping so_linger could make sense (my recollection is that somewhere around ~4 minutes is the most one should reasonably need to wait by spec once we've progressed to FIN_WAIT*), but it should be evenly enforced rather than just on new incoming connections.

This fixes actually a bug. With this patch the following packetdrill runs successfully:

 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.000 bind(3, ..., ...) = 0
+0.000 setsockopt(3, SOL_SOCKET, SO_LINGER, {onoff=1, linger=0}, 8) = 0
+0.000 listen(3, 1) = 0
+0.000 < S    0:0(0)       win 65535 <mss 1460,sackOK,eol,eol>
+0.000 > S.   0:0(0) ack 1 win 65535 <mss 1460,sackOK,eol,eol>
+0.000 <  .   1:1(0) ack 1 win 65535
+0.000 accept(3, ..., ...) = 4
+0.000 close(3) = 0
+0.000 getsockopt(4, SOL_SOCKET, SO_LINGER, {onoff=128, linger=0}, [8]) = 0
+0.000 close(4) = 0
+0.000 > R.   1:1(0) ack 1 win     0

Thanks for finding the bug and fixing it.

This revision is now accepted and ready to land.Feb 18 2021, 1:20 PM

Closed by commit rG4c0bef07be07: kern: net: remove TCP_LINGERTIME (authored by kevans). · Explain WhyFeb 19 2021, 4:38 AM

This revision was automatically updated to reflect the committed changes.

kevans added a commit: rG4c0bef07be07: kern: net: remove TCP_LINGERTIME.

kevans mentioned this in rG7f73db6d04d0: 504ebd612ec: kern: sonewconn: set so_options before pru_attach().Feb 24 2021, 4:49 AM

kevans mentioned this in rG5e57a3806ef3: 504ebd612ec: kern: sonewconn: set so_options before pru_attach().Feb 24 2021, 4:53 AM