pwait: Add a -t flag to specify a timeout in seconds before exiting.
ClosedPublic
Actions

Authored by bdrewery on Feb 21 2017, 3:37 AM.

Details

Reviewers

jilles

Commits

rS314886: pwait: Add a -t flag to specify a timeout before exiting, and tests.

Summary

The exit status will be 124, as the timeout(1) utility uses.

I plan to use this in poudriere to replace a chained 'timeout $n pwait' usage
in some critical code. I could easily just commit this to poudriere only
but this seemed useful in the default utility as well.

MFC after: 2 weeks

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

bdrewery updated this revision to Diff 25455.Feb 21 2017, 3:37 AM

bdrewery retitled this revision from to pwait: Add a -t flag to specify a timeout in seconds before exiting..

bdrewery updated this object.

bdrewery edited the test plan for this revision. (Show Details)

bdrewery added a reviewer: jilles.

bdrewery added subscribers: mjg, bapt.

jilles requested changes to this revision.Feb 24 2017, 2:31 PM

jilles edited edge metadata.

jilles added inline comments.

bin/pwait/pwait.c
94 ↗	(On Diff #25455)	Hmm, given that you are discarding the fractional part anyway, why use doubles in `parse_duration()`?
144 ↗	(On Diff #25455)	The timeout applies to waiting for each process, so if N processes are waited for it may take N times the given timeout. This needs some more complicated code, either with `clock_gettime()` or with `SIGALRM`.

This revision now requires changes to proceed.Feb 24 2017, 2:31 PM

bdrewery added inline comments.Feb 24 2017, 5:08 PM

bin/pwait/pwait.c
94 ↗	(On Diff #25455)	Yes double makes no sense here. It was because I copied it from the timeout.c file and removed the support for suffixes.
144 ↗	(On Diff #25455)	Yeah I suspected that may be the case. It does what I expect with a simple case of all timing out which is a timeout for pwait, regardless of how many processes, but not if one exits: ~ # sleep 300 & [1] 28293 ~ # sleep 300 & [2] 28299 ~ # sleep 300 & [3] 28337 ~ # sleep 300 & [4] 28345 ~ # sleep 300 & [5] 28358 ~ # date Fri Feb 24 09:03:24 PST 2017 ~ # ./pwait -t 5 28293 28299 28337 28345 28358 ~ # date Fri Feb 24 09:03:31 PST 2017 But if one of them returns then the timeout is all wrong: ~ # date Fri Feb 24 09:05:04 PST 2017 ~ # ./pwait -t 5 28293 28299 28337 28345 28358 [5] + terminated sleep 300 ~ # date Fri Feb 24 09:05:34 PST 2017

Rework to use SIGALRM, support fractional seconds, and add some tests.

I like the tests, even though they take a fair bit of time to execute.

I would like to see the admittedly unlikely race fixed where SIGALRM occurs between checking got_alarm and calling kevent(). This could be either using EVFILT_SIGNAL or by writing the -v message (using write(2)) and exiting from the signal handler. I think the code will be simpler either way.

Also, signal() or sigaction() should be called before setitimer() to eliminate any window where you may get a SIGALRM exit.

jilles requested changes to this revision.Mar 6 2017, 10:59 PM

jilles edited edge metadata.

This revision now requires changes to proceed.Mar 6 2017, 10:59 PM

In D9697#204786, @jilles wrote:

I like the tests, even though they take a fair bit of time to execute.

I would like to see the admittedly unlikely race fixed where SIGALRM occurs between checking got_alarm and calling kevent(). This could be either using EVFILT_SIGNAL or by writing the -v message (using write(2)) and exiting from the signal handler. I think the code will be simpler either way.

Admittedly I wanted to avoid exiting in the signal handler since I have been making some utilities into builtins in poudriere and want to control the exit status. Exiting from and using write(2) in the handler definitely would be simpler. I am going to explore using EVFILT_SIGNAL first.

Also, signal() or sigaction() should be called before setitimer() to eliminate any window where you may get a SIGALRM exit.

Woops.

In D9697#204796, @bdrewery wrote:

In D9697#204786, @jilles wrote:

I like the tests, even though they take a fair bit of time to execute.

I would like to see the admittedly unlikely race fixed where SIGALRM occurs between checking got_alarm and calling kevent(). This could be either using EVFILT_SIGNAL or by writing the -v message (using write(2)) and exiting from the signal handler. I think the code will be simpler either way.

Admittedly I wanted to avoid exiting in the signal handler since I have been making some utilities into builtins in poudriere and want to control the exit status. Exiting from and using write(2) in the handler definitely would be simpler. I am going to explore using EVFILT_SIGNAL first.

Meant to link https://github.com/freebsd/poudriere/blob/master/src/libexec/poudriere/pwait/pwait.c

Also, signal() or sigaction() should be called before setitimer() to eliminate any window where you may get a SIGALRM exit.

Woops.

Use EVFILT_SIGNAL rather than a signal handler function.

Looks good to me as a separate process.

For usage as a builtin, you may want shell-specific handling to allow processing traps while waiting in some way.

This revision is now accepted and ready to land.Mar 7 2017, 9:49 PM

In D9697#205046, @jilles wrote:

Looks good to me as a separate process.

For usage as a builtin, you may want shell-specific handling to allow processing traps while waiting in some way.

Yeah, the only one I've cared to handle so far for poudriere is SIGINFO. The rest I am fine with causing an EINTR and ceasing the wait. For SIGINFO I just disable it while doing operations in builtins that may EINTR since poudriere's SIGINFO support is not worth all of the extra complexity to handle ambiguous EINTR, and still halt on SIGINT/SIGTERM/etc, in tools like this. That's what the 'siginfo_push' and 'siginfo_pop' calls are doing.

Closed by commit rS314886: pwait: Add a -t flag to specify a timeout before exiting, and tests. (authored by bdrewery). · Explain WhyMar 7 2017, 10:17 PM

This revision was automatically updated to reflect the committed changes.