Page MenuHomeFreeBSD

sockets: re-check socket state after call to pr_rcvd()
ClosedPublic

Authored by becker.greg_att.net on Sep 8 2023, 3:13 PM.
Referenced Files
Unknown Object (File)
Fri, Nov 29, 5:55 AM
Unknown Object (File)
Fri, Nov 22, 6:19 AM
Unknown Object (File)
Thu, Nov 21, 11:17 PM
Unknown Object (File)
Wed, Nov 20, 11:32 AM
Unknown Object (File)
Wed, Nov 20, 11:21 AM
Unknown Object (File)
Wed, Nov 20, 11:21 AM
Unknown Object (File)
Wed, Nov 20, 9:30 AM
Unknown Object (File)
Oct 9 2024, 9:25 AM
Subscribers

Details

Summary

This patch addresses the close hang reported in bugzilla PR 212716 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212716)

Socket state may have changed after dropping the receive
buffer lock in order to call pr_rcvd(). If the buffer is
empty, re-check the state after reaquiring the lock and
skip calling sbwait() if the socket is in error or the
peer has closed.

Much thanks to markj for the regression test (which will be applied as a separate commit prior to the actual fix).

Test Plan

kyua test /usr/tests/Kyuafile

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/kern/uipc_socket.c
2432–2436

Can you please collapse to if-clauses into one and add prediction clause (acts like documentation) :)

Reformat to ahere to suggested style...

sys/kern/uipc_socket.c
2432–2436

No problem, done!

Looks good to me. Thanks for working on this.

Could you please mail me two patches generated by git format-patch with your desired commit messages, one with the assertion and one with the bug fix? Though, please give other reviewers a couple of days to look further at the patch.

sys/kern/uipc_sockbuf.c
460 ↗(On Diff #127176)

I like this change but would commit it separately.

This revision is now accepted and ready to land.Sep 9 2023, 3:50 PM

Thanks, Mark, I will send you two commits as requested. Thanks again for all your help!

I've been able to trigger an sbwait() assertion failure in the receive path when running KTLS tests, but only when running tests in parallel in QEMU. I believe the problem is the following: KTLS introduces a state for received data wherein it is being decrypted but is not yet queued in the socket buffer. In this case (sb_tlsdcc != 0), we can end up calling sbwait() with CANTRECVMORE set because that sleep/wakeup mechanism is used to signal completion of decryption operations.

So, at least for now, I'll commit the bugfix without the assertion.