Page MenuHomeFreeBSD

sh: Restore $((x)) error checking after fix for $((-9223372036854775808))
ClosedPublic

Authored by jilles on Jan 22 2019, 10:49 PM.

Details

Summary

SVN r342880 was designed to fix $((-9223372036854775808)) and things like
$((0x8000000000000000)) but also broke error detection for values of
variables without dollar sign ($((x))).

For compatibility, overflow in plain literals continues to be ignored and
the value is clamped to the boundary (except 9223372036854775808 which is
changed to -9223372036854775808).

Test Plan

Kyua/ATF tests including the new expansion/arith16.0 and expansion/arith17.0

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 22084

Event Timeline

jilles created this revision.Jan 22 2019, 10:49 PM
se added a comment.Jan 27 2019, 9:16 PM

The behavior is not consistent for shell arithmetic operating on parameters with or without "$" (e.g. "$((X))" vs. "$(($X))") for variables not in the range of valid integers.
This may be allowed by the standard, but is still somewhat surprising:

# X=9223372036854775809
# echo $((X))
/usr/obj/usr/svn/base/head/amd64.amd64/bin/sh/sh: arithmetic expression: variable conversion error: "X"
# echo $(($X))
9223372036854775807

# echo $((X + 1))
/usr/obj/usr/svn/base/head/amd64.amd64/bin/sh/sh: arithmetic expression: variable conversion error: "X + 1"
# echo $(($X + 1))
-9223372036854775808

After commenting out the ERANGE assignments, the above expressions return identical results, but there is still a difference, e.g.:

 X=-9223372036854775817
# echo $((X))
/usr/obj/usr/svn/base/head/amd64.amd64/bin/sh/sh: arithmetic expression: variable conversion error: "X"
# echo $(($X))
-9223372036854775807

The reason is, that strtoimax() sets ERANGE in the case of $((X)), while $(($X)) is evaluated as -1 * the unsigned number to the right of the minus sign.

Arithmetic in $(( )) should be performed without checking for overflow, AFAIK.
This rule does not directly apply to out-of-range numeric arguments, but I'd think that the use of $X vs. X for variables should not change the outcome of the operation.

In D18926#405742, @se wrote:

The behavior is not consistent for shell arithmetic operating on parameters with or without "$" (e.g. "$((X))" vs. "$(($X))") for variables not in the range of valid integers.
This may be allowed by the standard, but is still somewhat surprising:

# X=9223372036854775809
# echo $((X))
/usr/obj/usr/svn/base/head/amd64.amd64/bin/sh/sh: arithmetic expression: variable conversion error: "X"
# echo $(($X))
9223372036854775807
# echo $((X + 1))
/usr/obj/usr/svn/base/head/amd64.amd64/bin/sh/sh: arithmetic expression: variable conversion error: "X + 1"
# echo $(($X + 1))
-9223372036854775808

Adding overflow checks for literals in expressions (i.e. $(($X)) case) would require an exp-run since this seems an uncommon choice among shells. Restoring the variable conversion error can be done without exp-run.

After commenting out the ERANGE assignments, the above expressions return identical results, but there is still a difference, e.g.:

 X=-9223372036854775817
# echo $((X))
/usr/obj/usr/svn/base/head/amd64.amd64/bin/sh/sh: arithmetic expression: variable conversion error: "X"
# echo $(($X))
-9223372036854775807

The reason is, that strtoimax() sets ERANGE in the case of $((X)), while $(($X)) is evaluated as -1 * the unsigned number to the right of the minus sign.

Arithmetic in $(( )) should be performed without checking for overflow, AFAIK.

POSIX defers to the C standard in these matters, which states that overflow in arithmetic with signed integers is undefined behaviour. Clarifying that, POSIX explicitly allows resolving overflow by converting to floating point (POSIX.1-2016 XCU 2.6.4 Arithmetic Expansion).

A few years ago, I removed various C undefined behaviour from arithmetic expansion. Most overflow wraps around, but overflow in expanding a variable without $ and division (dividing the most negative integer by -1) generates errors.

This rule does not directly apply to out-of-range numeric arguments, but I'd think that the use of $X vs. X for variables should not change the outcome of the operation.

If $X and X must match while still keeping error checking, that seems to require creating a separate token type for the most negative integer in an expression, so it can occur after a unary minus but not anywhere else.

If overflow checking is to be discarded, just using strtoumax will fix everything. However, I'd like to preserve the current overflow checking ($((X)) case only).

se added a comment.Jan 29 2019, 12:08 PM

I'd do not like the conversion to floating point for numbers outside the range of 64 bit integers. They violate assumptions I'd like to be able to make (e.g. that adding and then subtracting the same number leads back to the initial value in all cases).

If it was my decision, I'd opt for no range checks, but I think I understand your reasons to keep them in at least some cases. Only that I do not like the different semantics of $((X)) vs. $(($X)). The documentation says, that the dollar sign is stripped and the variable used as if there was no dollar sign, but the implementation expands $X and replaces it with the text representation of its value, and that is what leads to overflow being detected in one case and not the other ... (And this is also the cause of the difference in processing a negative number, since the minus sign is parsed separately in one case but not the other).

I could live quite well with the simple version that does no overflow checking, but if you think that checking for overflow only for $((X)) offers an advantage in some situations, it's up to you to decide.
Adding overflow tests for the $(($X)) case is not useful, IMHO, and not only because of the need for an exp-run. Such a change would cause a deviation from long established practices and could also break locally developed scripts not covered by any testing that we could perform.

One remark regarding the cases where ERANGE is set: AFAICT, the numeric value being returned is discarded, and I do not see a reason to return anything but "val".
This would allow to significantly simplify the test:

default:
          val = (arith_t)strtoumax(nptr, endptr, 0);
          if (val < 0)
                  errno = ERANGE;
          return (val);

This version passes all tests ("make tests"), as did your more complex version ...

jilles updated this revision to Diff 53382.Jan 29 2019, 7:12 PM
jilles edited the test plan for this revision. (Show Details)

Add test for [ERANGE] special case.

In D18926#406366, @se wrote:

I'd do not like the conversion to floating point for numbers outside the range of 64 bit integers. They violate assumptions I'd like to be able to make (e.g. that adding and then subtracting the same number leads back to the initial value in all cases).

I don't like that either; overflowing to floating point is not going to happen.

If it was my decision, I'd opt for no range checks, but I think I understand your reasons to keep them in at least some cases. Only that I do not like the different semantics of $((X)) vs. $(($X)). The documentation says, that the dollar sign is stripped and the variable used as if there was no dollar sign, but the implementation expands $X and replaces it with the text representation of its value, and that is what leads to overflow being detected in one case and not the other ... (And this is also the cause of the difference in processing a negative number, since the minus sign is parsed separately in one case but not the other).

Where does the documentation say that? Our man page (under "Arithmetic Expansion") describes what the implementation does and leaves differences between $(($X)) and $((X)) to the reader. POSIX's text is also pretty limited since it only says something about valid integer constants

I could live quite well with the simple version that does no overflow checking, but if you think that checking for overflow only for $((X)) offers an advantage in some situations, it's up to you to decide.
Adding overflow tests for the $(($X)) case is not useful, IMHO, and not only because of the need for an exp-run. Such a change would cause a deviation from long established practices and could also break locally developed scripts not covered by any testing that we could perform.

OK.

One remark regarding the cases where ERANGE is set: AFAICT, the numeric value being returned is discarded, and I do not see a reason to return anything but "val".
This would allow to significantly simplify the test:

default:
          val = (arith_t)strtoumax(nptr, endptr, 0);
          if (val < 0)
                  errno = ERANGE;
          return (val);

This version passes all tests ("make tests"), as did your more complex version ...

Good catch. If a variable is being read as in $((X)), errors cause the numeric value to be ignored, but if a literal in an expression is being read, errors are ignored and the value matters. I have added a test so the extra logic is clearly necessary.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 10 2019, 10:23 PM
This revision was automatically updated to reflect the committed changes.