Page MenuHomeFreeBSD

sockets: remove the socket-on-stack hack from sorflush()
ClosedPublic

Authored by glebius on Wed, May 4, 6:45 PM.

Details

Summary

The hack can be tracked down to 4.4BSD, where copy was performed
under splimp() and then after splx() dom_dispose was called.
Stevens has a chapter on this function, but he doesn't answer why
this trick is necessary. Why can't we call into dom_dispose under
splimp()? Anyway, with multithreaded kernel the hack seems to be
necessary to avoid LORs between socket buffer lock and different
filesystem locks, especially network file systems.

The new socket buffers KPI sbcut() from 1d2df300e9b allow us to get
rid of the hack.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Seems ok, but I'm still not sure about LORs. I would love to get rid of this hack though.

sys/kern/uipc_usrreq.c
2784

Isn't this the same as m = sbcut_locked(sb, sb->sb_ccc);?

2787

sbrelease_locked() asserts this already.

2791

So we still hold the IO lock here, and unp_dispose_mbuf() can close arbitrary descriptors, so a lot of code is reachable from this point (i.e., most implementations of fo_close). How can we be certain that there are no LORs possible here?

sys/kern/uipc_usrreq.c
2784

sbflush_internal() has a comment that suggests a problem exists here. It could be outdated, though.

2791

Good concern. I will update the revision.

sys/kern/uipc_usrreq.c
2787

sbrelease_locked() asserts this already.

2787

sbrelease_locked() asserts this already.

It does so after calling sbcut(). We want to make sure socket buffer is empty after sbcut done by us.

sys/kern/uipc_usrreq.c
2784

I suspect it is outdated. Right now the code doesn't even panic in that case, unless INVARIANTS is on.

2785
  • Drop the I/O lock earlier
  • Don't be afraid to call sbcut_internal() with 0 length
markj added inline comments.
sys/kern/uipc_usrreq.c
2791

Actually, I don't see why we need to hold the io lock here at all. Can't it be dropped before calling pru_dispose? We marked the buffer with CANTRCVMORE already.

Hmm, no, it seems that soreceive_generic() removes mbufs from the recv socket buffer after copying them out to userspace, and to do that it needs to drop the socket buffer mutex. That's surprising to me, to be honest.

This revision is now accepted and ready to land.Wed, May 4, 9:47 PM
sys/kern/uipc_usrreq.c
2791

Other reason is to protect against a race between shutdown(2) and listen(2). Not something to be expected from a sane application though.

So it seems the most likely reason is simply that there was no sbcut()-like operation available.

sys/kern/uipc_usrreq.c
2791

An application may be sane and may also have file descriptor reuse bugs. :)