Page MenuHomeFreeBSD

sh(1): Improve recommendation of use of -e
Needs RevisionPublic

Authored by michaelo on May 3 2024, 8:01 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jun 8, 2:29 PM
Unknown Object (File)
May 8 2024, 4:26 AM
Unknown Object (File)
May 8 2024, 4:26 AM
Unknown Object (File)
May 8 2024, 1:02 AM
Unknown Object (File)
May 5 2024, 6:12 PM
Subscribers

Details

Summary

This partially reverts b14cfdf665bb8b7b2898a4ee5b073ab87f8ea3d0 and has
been discussed in D42719.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 58081
Build 54969: arc lint + arc unit

Event Timeline

The immediately preceding sentence about shell functions is an example of unexpected behaviour of set -e. The part "all commands of the function are considered to be tested as well" may result in code unexpectedly executing in the function and the function returning a different return status.

Many more examples can be found at https://mywiki.wooledge.org/BashFAQ/105 . Bash does not work exactly the same as our sh, but most of the problems are the same.

Bash's behaviour related to set -e has changed over the versions, but it has not succeeded in fixing the confusing parts. I don't believe it's possible to fix, and it is best to leave it unchanged so it is at least consistent.

set -e works reasonably well for scripts that are plain lists of commands with possible case, for, if, while and until, and only use && and || in conditions, which probably explains its use in make. However, even then, command substitution might lead to confusing results, for example because expr returns status 1 if the result is zero or the empty string, and the exit status from the inner command is passed through only in particular cases (for the last command substitution if there is no utility name).

A linting tool will probably be more effective (I've heard of ShellCheck, but I don't know whether it can catch missing error checks). With a linting tool, it matters less how beautiful the code looks but whether the tool can analyze it. Not enabling set -e makes the code easier to analyze, for example because functions don't behave differently depending on whether their return status is tested.

Another option is to use a better programming language. For a long time, there did not use to be one in the base system, but perhaps Lua is now a suitable option.

Manual pages are reference material and not a place for advocacy. Proper use of the feature, which seems to document its drawbacks (checked function call etc.), is a duty of the one writing a shell script. We don't have Python in base, so we can't recommend that.

This revision is now accepted and ready to land.May 6 2024, 6:12 PM

@j

The immediately preceding sentence about shell functions is an example of unexpected behaviour of set -e. The part "all commands of the function are considered to be tested as well" may result in code unexpectedly executing in the function and the function returning a different return status.

Many more examples can be found at https://mywiki.wooledge.org/BashFAQ/105 . Bash does not work exactly the same as our sh, but most of the problems are the same.

Bash's behaviour related to set -e has changed over the versions, but it has not succeeded in fixing the confusing parts. I don't believe it's possible to fix, and it is best to leave it unchanged so it is at least consistent.

set -e works reasonably well for scripts that are plain lists of commands with possible case, for, if, while and until, and only use && and || in conditions, which probably explains its use in make. However, even then, command substitution might lead to confusing results, for example because expr returns status 1 if the result is zero or the empty string, and the exit status from the inner command is passed through only in particular cases (for the last command substitution if there is no utility name).

A linting tool will probably be more effective (I've heard of ShellCheck, but I don't know whether it can catch missing error checks). With a linting tool, it matters less how beautiful the code looks but whether the tool can analyze it. Not enabling set -e makes the code easier to analyze, for example because functions don't behave differently depending on whether their return status is tested.

Another option is to use a better programming language. For a long time, there did not use to be one in the base system, but perhaps Lua is now a suitable option.

It is not fully clear to me how to continue with this. @mandree accepted the change, you consider the statement should remain? What about the other folks?

Would you consider these tricky set -e cases?

#!/bin/sh

set -e

foo=$(expr 1 - 1)

echo "$foo"

Perhaps arithmetic expansion should be used instead of expr, but there are lots of cases of expr in the tree.

Here is another interesting example.

#!/bin/sh

set -e

f() {
        [ -d /notadir ] && echo "notadir exists"
}

f
echo "the end"

I personnally like this advice in manpage, as I think any big script should use set -e (see poudriere for example) but it clearly requires proper and explicit checks for failures.

I would love to see the recommendation of explicity check for failures to stay but the subjective "it tends to behave ... larger scripts" be removed.

Thank you Bapt. However, your recommendation is unclear (to me).

I personnally like this advice in manpage, as I think any big script should use set -e (see poudriere for example) but it clearly requires proper and explicit checks for failures.

The advice in sh(1) is somewhat ambiguous, but I think most would interpret it as recommending against set -e, and poudriere looks to only use it in specific, limited cases.

I would rephrase:

It is recommended to check for failures explicitly
instead of relying on
.Fl e
because it tends to behave in unexpected ways,
particularly in larger scripts.

into

It is recommended to check for failures explicitly
in combinaison with
.Fl e .

Incorporate Baptiste's proposal

This revision now requires review to proceed.Thu, Jun 6, 9:35 AM
michaelo retitled this revision from sh(1): Remove unsubstantiated discourage use of -e to sh(1): Improve recommendation of use of -e.Thu, Jun 6, 9:36 AM
michaelo edited the summary of this revision. (Show Details)
michaelo added a reviewer: bapt.
This revision is now accepted and ready to land.Thu, Jun 6, 12:28 PM

I lean towards something stronger.

Relying solely on
.Fl e
to check for failures is discouraged.
Many commands return non-zero values to convey information other than errors, which can cause unexpected program termination.
Thus, it is recommended to explicitly check for errors.

It hedges on outright discouraging set -e with "Relying *solely* on set -e..."

IRC conversation:

[2024-06-06 11:36] <jrm> bapt_: Sorry to drag out D45073, but I recommended something a little stronger and (I think) clearer.  Could you take one final look?
[2024-06-06 11:38] <bapt_> jrm: ok with your version as well!
In D45073#1038323, @jrm wrote:

I lean towards something stronger.

Relying solely on
.Fl e
to check for failures is discouraged.
Many commands return non-zero values to convey information other than errors, which can cause unexpected program termination.
Thus, it is recommended to explicitly check for errors.

It hedges on outright discouraging set e with "Relying *solely* on set -e..."

Can we *please* stop patronizing and intruding into programmers' judgement? Do you sell a hammer with an advice slip to not use it for screws?

What I am getting at is we can happily and objectively tell people that set -e isn't a cure-it-all solution, and it is DOCUMENTED corner cases (as in functions) - but commands with more than "success = 0, failure = 1" exit code value range are not an excuse. They call for special code anyways. Preventing that unintended exit under set -e is as simple as rc=0 ; my-command-with-complex-exit-code || rc=$? because pipelines connected with && or || explicitly prevent that "exit on nonzero status return" behavior - but let me tell you we've been hunting down unintended "success" Continuous Integration (build) pipelines in shell and "power"shell scripts that miss errors more often than I care to count. Even with Gitlab runners pampering it all up for what that Redmond Corp. believes to be a p....shell.

We are Unix and are really not writing reference manuals for idiots, so let's stop advocacy there. It's a matter of skill and telling the facts and not of opinionating what people should do. So feel free to refer to another man page section if you feel it's worth mentioning, that is objective information after all.

mandree requested changes to this revision.Thu, Jun 6, 8:24 PM
This revision now requires changes to proceed.Thu, Jun 6, 8:24 PM
In D45073#1038323, @jrm wrote:

I lean towards something stronger.

Relying solely on
.Fl e
to check for failures is discouraged.
Many commands return non-zero values to convey information other than errors, which can cause unexpected program termination.
Thus, it is recommended to explicitly check for errors.

It hedges on outright discouraging set e with "Relying *solely* on set -e..."

Can we *please* stop patronizing and intruding into programmers' judgement? Do you sell a hammer with an advice slip to not use it for screws?

I disagree this is patronizing or intrusive. We aren't even telling anyone not to use set -e, but just documenting a likely source of confusion.

What I am getting at is we can happily and objectively tell people that set -e isn't a cure-it-all solution, and it is DOCUMENTED corner cases (as in functions) - but commands with more than "success = 0, failure = 1" exit code value range are not an excuse. They call for special code anyways. Preventing that unintended exit under set -e is as simple as rc=0 ; my-command-with-complex-exit-code || rc=$? because pipelines connected with && or || explicitly prevent that "exit on nonzero status return" behavior - but let me tell you we've been hunting down unintended "success" Continuous Integration (build) pipelines in shell and "power"shell scripts that miss errors more often than I care to count. Even with Gitlab runners pampering it all up for what that Redmond Corp. believes to be a p....shell.

This is objective. It's simply warning people that if they *solely* put set -e in their scripts for error checking, they will run into the clear-cut issue described above.

We are Unix and are really not writing reference manuals for idiots, so let's stop advocacy there. It's a matter of skill and telling the facts and not of opinionating what people should do. So feel free to refer to another man page section if you feel it's worth mentioning, that is objective information after all.

I again disagree with your hyperbolic interpretation. Making these sorts of recommendations is common in our man pages.

% grep --include="*.[0-9]" -r "is discouraged" /usr/src/ | wc -l
      45

@mandree The text has been already reduced. Would you like to have it removed completely?

I'd say that we shouldn't make any recommendations on how to use parts of a language in our manual. We should instead focus on facts, list them all and describe as they are. If there is some kind of interaction between parts of the language that is possibly counter-intuitive then we can and probably should indicate directly or indirectly that something requires consideration, but at the same time we should keep the manual free of judgement.

Given that set -e is a controversial issue (see the disagreement here and in the "Conclusions" section of https://mywiki.wooledge.org/BashFAQ/105 ), I don't think the man page should recommend anything about it, although it should mention some pitfalls in a neutral manner.

Recommendations about non-controversial issues are acceptable to me.

I am fine with mentioning pitfalls and limitations objectively by pointing to the detail sections. I reject "recommendations", and if we have those in other manual pages, they need to be reviewed and reworded in a similar vein or removed.

I am fine with mentioning pitfalls and limitations objectively by pointing to the detail sections. I reject "recommendations", and if we have those in other manual pages, they need to be reviewed and reworded in a similar vein or removed.

It's not just other manual pages; sh(1) itself has other recommendations.

The “--” flag may be omitted when specifying arguments to be used as positional replacement parameters. This is not recommended, because the first argument may begin with a dash (‘-’) or a plus (‘+’), which the set command will interpret as a request to enable or disable options.

Using aliases in scripts is discouraged because the command that defines them must be executed before the code that uses them is parsed. This is fragile and not portable.

Here is another attempt to come up with something that everyone can accept.

Note that many commands return non-zero values to convey information other than errors, which can cause unexpected program termination with
.Fl e .