Page MenuHomeFreeBSD

Some fixes for FIFOs/pipes when used with kqueue/poll
ClosedPublic

Authored by jan.kokemueller_gmail.com on Apr 22 2020, 2:51 AM.
Tags
None
Referenced Files
F81428615: D24528.diff
Tue, Apr 16, 5:33 AM
Unknown Object (File)
Mon, Mar 18, 1:43 PM
Unknown Object (File)
Mar 7 2024, 8:32 PM
Unknown Object (File)
Mar 4 2024, 12:52 PM
Unknown Object (File)
Feb 23 2024, 11:32 PM
Unknown Object (File)
Feb 23 2024, 11:31 PM
Unknown Object (File)
Feb 23 2024, 11:35 AM
Unknown Object (File)
Feb 23 2024, 11:35 AM

Details

Summary

This should fix some issues when using kqueue/poll on FIFOs or pipes.

Related bug reports:

In detail this patch makes the following changes:

When new readers/writers connect to a FIFO, pipeselwakeup() is now called so that pollers are woken up and a kqueue event is triggered. I've also added pipeselwakeup() to the error path but I'm not exactly sure how to test the error path.

When reading from a pipe, pipeselwakeup() is now only called if there was actually some data read. This prevents the continuous re-triggering of a EVFILT_READ event on EOF when in edge-triggered mode (using EV_CLEAR).

It is possible to get POLLIN from a O_WRONLY fd on a closed FIFO. There should probably be a check for the FREAD fd flag.

When closing down a pipe, pipeselwakeup() was called before the PIPE_EOF flag is set. This could be moved down a bit so that pipe_poll() and filt_piperead() pick up the PIPE_EOF flag correctly. Similarly, the pipeselwakeup() when disconnecting from the peer can be moved a bit down so calling KNOTE_LOCKED again becomes unnecessary.

The filt_piperead() was modified so it can clear EV_EOF again. This happens when new readers/writers connect to a FIFO. This behavior was present in the old socket based FIFO implementation but lost when FIFOs/pipes where merged some years ago. The man page says that the EV_EOF flag can be cleared by specifying EV_CLEAR, but this seems to be inaccurate. The wording should probably be updated (as DragonFlyBSD did here https://github.com/DragonFlyBSD/DragonFlyBSD/commit/acb71b22f1443bd8a34f6e457cf57c4527d7ab52).

I've also added the check for the current FIFO writer generation to filt_piperead() so that new readers that connect to a previously closed FIFO don't get the "old" EV_EOF flag. This now matches the poll() behavior. I removed the checks on "wpipe": On FIFOs it does not matter because only one pipe of a pipepair is actually used (in this case the "rpipe"). On pipes, "pipeclose" sets both EV_EOF flags shortly after another anyway.

In filt_pipewrite() the EV_EOF flag is now also cleared on new readers. I've moved the assignments to "kn->kn_data" above the EOF checking. This now matches the behavior of the old socket based implementation where kn_data could be non-zero and EV_EOF be set at the same time.

I wrote some ATF tests for most of this (https://github.com/jiixyj/epoll-shim/blob/develop/test/pipe-test.c) and to compare with other OSs. I'd like to clean them up so they could be added as well.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

I think the fifofs changes are ok.

sys/kern/sys_pipe.c
830

Generally we prefer to have standalone explanations for the behaviour of kernel code, instead of indirect references like "PR 203366."

1745

The indentation here is wrong. I'd find the condition a bit easier to read if written as

if ((rpipe->pipe_state & PIPE_EOF) != 0 &&
    ((rpipe->pipe_state & PIPE_NAMED) == 0 ||
    fp->f_seqcount != rpipe->pipe_wgen))

Why are we comparing with f_seqcount instead of f_pipegen?

1762

The comment should be moved to just before the pipe_present == PIPE_ACTIVE check.

1775

Hmm, I don't really understand the PIPE_DIRECTW check. Note that for EVFILT_WRITE, the knote is attached to the opposite end of the pipe, i.e., the reader side. But PIPE_DIRECTW only gets set on the writing end. Anyway, that is a separate issue.

Why do we return a non-zero kn_data if pipe_present != PIPE_ACTIVE? In that case I think the reader is gone, no?

Thank you for the review! I've incorporated your comments and updated the diff.

I've also added the man page change regarding EV_EOF behavior on FIFOs, taken from DragonFlyBSD (https://github.com/DragonFlyBSD/DragonFlyBSD/commit/acb71b22f1443bd8a34f6e457cf57c4527d7ab52).

I also wrote some ATF tests for FIFOs and pipes that show the new behavior for FIFOs. The pipe tests also run successfully under FreeBSD 11.3, so no behavior changes there.

sys/kern/sys_pipe.c
830

Fixed!

1745

Why are we comparing with f_seqcount instead of f_pipegen?

Correct, it should be f_pipegen! Back then when I wrote this part, it was named f_seqcount.

1775

Regarding the PIPE_DIRECTW check I've looked at pipe_poll, and the logic there should match filt_pipewrite I think -- EVFILT_WRITE is triggered iff POLLOUT is set. Of course, they may both be buggy...

Why do we return a non-zero kn_data if pipe_present != PIPE_ACTIVE? In that case I think the reader is gone, no?

Yes, that's true. In case the pipe is destroyed, the last size of the write buffer is written into kn_data. Since a pipe cannot ever be reopened after it is closed, I guess it would be also valid to set kn_data to 0 if PIPE_NAMED is not set. Curiously, in FreeBSD 11.3 this part of the code is never exercised: If a pipe is closed, pipeselwakeup() is called *before* the EOF flag is set on the pipe. EV_EOF was then set in knlist_cleardel() in kern_event.c (https://svnweb.freebsd.org/base/head/sys/kern/kern_event.c?revision=360140&view=markup#l2529).

Thank you for the review! I've incorporated your comments and updated the diff.

I've also added the man page change regarding EV_EOF behavior on FIFOs, taken from DragonFlyBSD (https://github.com/DragonFlyBSD/DragonFlyBSD/commit/acb71b22f1443bd8a34f6e457cf57c4527d7ab52).

I also wrote some ATF tests for FIFOs and pipes that show the new behavior for FIFOs. The pipe tests also run successfully under FreeBSD 11.3, so no behavior changes there.

Nice! Thank you.

I am being a bit cautious with this review since the pipe code is easy to break and there are several distinct changes here. I will commit a few pieces today.

sys/fs/fifofs/fifo_vnops.c
179

One side effect of calling pipeselwakeup() is that SIGIO will be sent if the application requested it, but in these cases there is no I/O to be done. I'm not sure if this is really worth handling given that we also call pipeselwakeup() in pipe_close().

sys/kern/sys_pipe.c
1775

I guess PIPE_DIRECTW may be set if this is a fifo, though I don't think that was true when the check was added many years ago. (The pipe and fifo code were merged after.)

Regarding the PIPE_ACTIVE check, I am mostly worried about accessing wpipe fields when the lock is not held. If we are dealing with a pipe, then wpipe actually points to the reader side, and in that case if pipe_present != PIPE_ACTIVE, there is no reason I can see to return a non-zero kn_data.

So, I think you can write:

if (wpipe->pipe_present == PIPE_ACTIVE || (wpipe->pipe_flags & PIPE_NAMED) != 0) {
    PIPE_LOCK_ASSERT(...)
    <set kn_data>;
}
...
tests/sys/fifo/fifo_kqueue.c
1 ↗(On Diff #70939)

Could you add SPDX tags to the new files?

SPDX-License-Identifier: BSD-2-Clause-FreeBSD

52 ↗(On Diff #70939)

Indentation should be by four spaces.

57 ↗(On Diff #70939)

Why 32 and not 2?

I've uploaded a new version of the patch:

  • The update of kn_data in filt_pipewrite() is now done under the pipe lock (as it reads from wpipe->pipe_buffer). The tests still run fine. Is this what you had in mind?
  • I've fixed the indentation of the tests and added the SPDX line.
sys/fs/fifofs/fifo_vnops.c
179

I must admit that I haven't tested with SIGIO at all... In the old socket based FIFO implementation there was a "sowwakeup" call here (which sends a SIGIO I think). I don't know why it was removed.

Do you know if programs that use SIGIO are supposed to deal with those kinds of spurious signals? I would guess so, but I'm not super familiar with this programming model.

tests/sys/fifo/fifo_kqueue.c
57 ↗(On Diff #70939)

I used 32 out of habit. I'm reusing the array for output events, and 32 is a good enough size in practice.

Here in the tests it should just have strictly more space available that the number of expected events so that I know there are no more events in the kqueue.

I spent some more time reviewing and testing this. I'll get it committed.

This revision is now accepted and ready to land.Apr 27 2020, 3:39 PM