Page MenuHomeFreeBSD

Make xargs -P0 create as many processes as possible
ClosedPublic

Authored by allanjude on May 22 2015, 3:25 AM.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
allanjude added a reviewer: eadler.
allanjude retitled this revision from Make xargs -P0 create as many processes as possible PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199976 Submitted By: Nikolai Lifanov to Make xargs -P0 create as many processes as possible.
allanjude edited edge metadata.
allanjude updated this object.

Fix manpage markup
Fix manpage date

Updated with feedback from davide@

bjk added inline comments.
usr.bin/xargs/xargs.c
170 ↗(On Diff #5545)

I do not see how nargs is the correct value to use here; it has a different meaning to the program, and can be overridden by other command-line arguments, giving a bizzare behavior change if -n 1 -P 0 are passed in different orders.

Good point. The -nX flag overrides nargs depending on the position.
Let me think of a different value to use.

usr.bin/xargs/xargs.c
170 ↗(On Diff #5545)

Good point. I didn't trip this up by pure coincidence.
Let me think of a different value to use.

I have a new patch. I'd like to commandeer this revision to apply it.

This replaces the use of nargs, which is arbitrary and can be
overridden by the "-n" flag by value of kern.maxprocs.

This was suggested by Mateusz Guzik on current@.

allanjude edited reviewers, added: lifanov_mail.lifanov.com; removed: allanjude.
allanjude added inline comments.
usr.bin/xargs/xargs.c
171 ↗(On Diff #5572)

We should probably use the same sysctl instead of nargs for the upper bounds as well

Updated to use kern.maxproc for both the upper bound, and the automatic limit of -P

This revision is now accepted and ready to land.May 23 2015, 12:52 AM
jbeich added inline comments.
usr.bin/xargs/xargs.c
169 ↗(On Diff #5574)

Why allow value larger than RLIMIT_NPROC or _SC_CHILD_MAX? Soft limit would cap number of processes sooner. And kern.maxproc can be larger than "infinity" hard limit.

$ limits -Su 50 sh -c "yes | xargs -n1 -P0 yes foo" >/dev/null &
$ pgrep -f foo | wc -l
       6

$ limits -Su
Resource limits (current):
  maxprocesses-cur          100
$ limits -Hu
Resource limits (current):
  maxprocesses-max         1458
$ sysctl kern.max{proc,users}
kern.maxproc: 1620 # (20 + 16 * maxusers)
kern.maxusers: 100

I don't understand: don't limits work correctly here if they allow for fewer jobs than maxproc?

I don't understand: don't limits work correctly here if they allow for fewer jobs than maxproc?

They do, so no need to even check with kern.maxproc for the upper bound. And value less than LONG_MAX can break compatibility for xargs -P<N> in scripts that stress-test system (many processes vs. low memory) - our xargs(1) (unlike GNU) would abort instead of hitting the limits.

usr.bin/xargs/xargs.c
51 ↗(On Diff #5574)

style(9) doesn't allow both <sys/types.h> and <sys/param.h> included at the same time.

68 ↗(On Diff #5574)

Why? Other uses of strtou*() in the file or across the base don't use macros for integer bases.

usr.bin/xargs/xargs.c:                  Rflag = strtol(optarg, &endptr, 10);
usr.bin/xargs/xargs.c:                  Sflag = strtoul(optarg, &endptr, 10);

http://bxr.su/search?refs=strtou*&project=FreeBSD

allanjude added inline comments.
usr.bin/xargs/xargs.c
68 ↗(On Diff #5574)

This was recommended by davide@
I can change all of the references, or remove the macro.

allanjude edited edge metadata.

Updated with feedback from jbeich@

This revision now requires review to proceed.May 24 2015, 5:58 PM
jbeich added inline comments.
usr.bin/xargs/xargs.c
68 ↗(On Diff #5666)

@davide has phabricator account. Let's ask him for the the rationale. I don't see his feedback here, in the bug or on freebsd-current maillist.

usr.bin/xargs/xargs.c
68 ↗(On Diff #5666)

Yeah, it was on IRC:
<davide> AllanJude: you don't do upper bound validation on the number of processes
<davide> AllanJude: '10' is hardcoded and makes hard to understand what that's actually about
<AllanJude> the 10 is the 'base'
<AllanJude> it is a standard argument to strtol
<davide> AllanJude: I can read the manpage
<davide> AllanJude: it should be a macro
<AllanJude> there is a macro for decimal?
<davide> #define BLAH 10 ?
<AllanJude> so, I should make one?
<davide> otherwise, lgtm

usr.bin/xargs/xargs.c
175 ↗(On Diff #5666)

RLIMIT_NPROC won't mean anything to the end user.
How about changing the line to just:
errx(1, "max processes must be less than %d", proclim);?

Is this acceptable as-is? 10.2 code slush is coming up.

Except for one small wording not, the man page changes look okay. Others have spoken for the code. What needs to be done to move this along?

usr.bin/xargs/xargs.1
214 ↗(On Diff #5666)

"at a time" is kind of weird at the end of this sentence. It could be removed without changing the meaning of the sentence, or maybe reword for better clarity.

lifanov_mail.lifanov.com edited reviewers, added: allanjude; removed: lifanov_mail.lifanov.com.

I'm commandeering this revision to address feedback from wblock@.

o Address feedback from wblock@ by simplifying wording.
For what it's worth, this was the wording similar to GNU info page.
o Replace RLIMIT_NPROC in a user-facing message with a user-friendly
description of what this value is.

mjg added inline comments.
usr.bin/xargs/xargs.1
214 ↗(On Diff #6188)

Why would a negative value be allowed? gnu xargs does not allow one. This sentence does not mention the upper limit present in the code (and erroring out if not met), which is actually a good thing since the limit should be removed.

usr.bin/xargs/xargs.c
68 ↗(On Diff #6188)

I also don't see what's the point of of this deifne.

116 ↗(On Diff #6188)

Why is this set here?

178 ↗(On Diff #6188)

The code form user perspective should behave just like gnu xargs, unless there is a good reason to deviate, which I don't see.

In this particular case, they have "unlimited" for an argument == 0, specified limit for > 0 and error out for non-numbers and negative values. Here negative values are accepted for some reason.

One big not-user-visible difference is the fact that gnu xargs dynamically resizes the process table, while our xargs allocates it upfront. This means we have to have a reasonable limit which can pose as "unlimited" for practical purposes.

getrlimit seen here is fine enough.

However, the code must not complain about passed limit being bigger. It should internally cap it to getrlimit so that user-visible behaviour is identical with gnu in this respect.

As a result the code should explicitly coompare maxprocs to 0 (as opposed to < 1), proclim variable has no use. Fetch rlimit /after/ maxprocs is validated. If (maxprocs == 0 || maxprocs > reslimit.rlim_cur) maxprocs = reslimit.rlim_cur;

getrlimit realistically should never fail, so a failure with err(1, "getrlimit"); should be fine enough for surprising cases.

Address comments by mjg:

This version tries to get user-facing behavior as close to GNU xargs
as possible. As for why BASE_DECIMAL is defined, I don't know...
The initial patch used a literal 10 here, as seen in other code.

Apart from 2 things the code looks good to me.

usr.bin/xargs/xargs.c
50 ↗(On Diff #6554)

sys/types.h must be included prior to sys/wait.h.

105 ↗(On Diff #6554)

I would rename this to rl or rlimit, as a nit.

Address comments by mjg:

  • include sys/types.h before sys/wait.h
  • rename reslimit to rl
mjg added a reviewer: mjg.
This revision is now accepted and ready to land.Jul 4 2015, 1:48 PM

Is this ready to commit? If so, can a committer grab it please?

bjk added a reviewer: bjk.

I'm happy to commit as soon as someone ACKs the manpage.

usr.bin/xargs/xargs.1
214 ↗(On Diff #6690)

The current text is good, but I wonder if this is better:

If
.Ar maxprocs
is set to 0,
.Nm
will run as many concurrent processes as possible.

lifanov_mail.lifanov.com edited edge metadata.

reroll patch after r285552 and bump date on manpage

This revision now requires review to proceed.Jul 29 2015, 5:16 PM

This doesn't compile without sys/param.h for some reason.

lifanov_mail.lifanov.com edited edge metadata.

Add <sys/limits.h>, which is needed for INT_MAX without param.h

bjk edited edge metadata.
This revision is now accepted and ready to land.Jul 30 2015, 9:57 PM
allanjude edited reviewers, added: lifanov_mail.lifanov.com; removed: allanjude.

Getting approval to commit this

bapt edited edge metadata.
This revision was automatically updated to reflect the committed changes.