Page MenuHomeFreeBSD

Fix spurious ENOTCONN from closed unix domain socket other' side.
ClosedPublic

Authored by kib on May 12 2020, 12:24 PM.

Details

Summary

Sometimes, when doing read(2) over unix domain socket, for which the other side socket was closed, read(2) returns -1/ENOTCONN instead of EOF AKA zero-size read. This is because soreceive_generic does not lock socket when testing the so_state SS_ISCONNECTED|SS_ISCONNECTING flags. It could end up that we do not observe so->so_rcv.sb_state bit SBS_CANTRCVMORE, and then miss SS_ flags.

Reverting the order of checks for SBS and SS would not help, I believe this would result in different problem, that we enter sbwait() too late, missing wakeups soisdisconnected(). Also, I think it is not desirable to add socket locking to soreceive_generic(), because socket read is a hot path, while soisdisconnected() is not so hot.

I propose the following fix for discussion. Its main uncertainty point is that I have to establish order between socket lock and sockbuf locks. I do not quite understand the note in r322856 about recursion.

Problem was reported by pho, and the patch is confirmed to fix the issue. Also it survived all stress2 socket tests.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kib created this revision.May 12 2020, 12:24 PM
kib requested review of this revision.May 12 2020, 12:24 PM
markj added a comment.May 12 2020, 2:21 PM

I thought the socket and sockbuf lock already had a well-defined lock order, the socket lock is acquired first. There are many examples in e.g., the poll and kevent code for sockets.

I can't understand the note about recursion either.

markj added inline comments.May 12 2020, 2:55 PM
sys/kern/uipc_socket.c
4027 ↗(On Diff #71682)

Why do we need to clear these bits for listening sockets at all?

kib added inline comments.May 12 2020, 6:35 PM
sys/kern/uipc_socket.c
4027 ↗(On Diff #71682)

It is probably a question not to me. But even if we can avoid this cleaning of the flag for listening sockets, it should be done in separate change. I do not see any assertions that would enforce split of so_state flags between listening/non-listening.

markj added a comment.May 13 2020, 1:03 PM

I see the recursion: socantrcvmore_locked() calls sorwakeup_locked(), which may call sowakeup(), which may call soisconnected() if an accept filter decided to accept the connection based on data resident in the receive buffer. In this case, the so_state update is rolled back...

What if soisdisconnected() is changed to update socket buffer state before so_state? soreceive_generic() checks for SBS_CANTRCVMORE first.

kib added a comment.May 13 2020, 1:09 PM

I see the recursion: socantrcvmore_locked() calls sorwakeup_locked(), which may call sowakeup(), which may call soisconnected() if an accept filter decided to accept the connection based on data resident in the receive buffer. In this case, the so_state update is rolled back...

What if soisdisconnected() is changed to update socket buffer state before so_state? soreceive_generic() checks for SBS_CANTRCVMORE first.

I believe this might cause hang if we did not noticed state change ? See the summary of the review.

kib updated this revision to Diff 71718.May 13 2020, 2:29 PM

Rework the patch together with markj.

Properly check that the socket was never connected before returning ENOTCONN.

markj accepted this revision.May 13 2020, 2:50 PM
This revision is now accepted and ready to land.May 13 2020, 2:50 PM
This revision was automatically updated to reflect the committed changes.