
pipe: fix thundering herd problem in pipelock
ClosedPublic

Authored by mjg on Thu, Nov 19, 8:20 AM.

Details

Summary

All reads and writes are serialized with a hand-rolled lock, but unlocking it always wakes up all waiters. Existing flag fields are resized to make room for the introduction of a waiter counter without growing the struct.

I don't see an easy way to get rid of pipelock or convert it into an actual lock.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mjg requested review of this revision. Thu, Nov 19, 8:20 AM

In what situation is there a large number of threads blocked on the pipe lock?

sys/kern/sys_pipe.c
626 ↗(On Diff #79743)

No newlines in KASSERT messages; ditto below.

BTW, why is pipelock() not an sx?

In what situation is there a large number of threads blocked on the pipe lock?

For example make -j $BIGNUM. There are tons of processes all polling the same pipe for "tokens" and then reading.

BTW, why is pipelock() not an sx?

As in, why not convert pipelock to an sx as it is? There are lock ordering issues against PIPE_LOCK, a mutex that is held and dropped around calls to pipelock(). Converting PIPE_LOCK to an sx as well, so that the code can wait for pipelock with PIPE_LOCK held, runs into other problems: for example, right now there is a larger window where it is possible to poll the pipe despite other ops running on it, and the lock taken with PIPE_LOCK is heavily contested there.

So a straight-up conversion without reworking the locking scheme would regress things here. If going for a serious effort, someone should implement using the direct map to back pipe buffers and design new locking with that in mind.

I see that PIPE_LOCK is simply interlocked with PIPE_LOCKFL, so why can't we just replace msleep() with sx_sleep() if pipelock() becomes an sx?

This revision is now accepted and ready to land. Thu, Nov 19, 5:28 PM

Not sure I follow.

In pipe_read() the sequence is: take PIPE_LOCK, call pipelock() (which drops and reacquires PIPE_LOCK), and then run the read loop, which performs:

PIPE_UNLOCK(rpipe);
error = uiomove(
    &rpipe->pipe_buffer.buffer[rpipe->pipe_buffer.out],
    size, uio);
PIPE_LOCK(rpipe);

This allows pipe_poll() to proceed while data is moved around, despite pipelock being held. I don't think anything invasive locking-wise is warranted without implementing direct map support.

This revision was automatically updated to reflect the committed changes.