diff --git a/NEWS.md b/NEWS.md index 98b52024b2e8..5251096d9f2a 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,1203 +1,1211 @@ # News +## 5.1.1 + +This is a production release that completes a bug fix from `5.1.0`. The bug +exists in all versions of `bc`. + +The bug was that `if` statements without `else` statements would not be handled +correctly at the end of files or right before a function definition. + ## 5.1.0 This is a production release with some fixes and new features. * Fixed a bug where an `if` statement without an `else` before defining a function caused an error. * Fixed a bug with the `bc` banner and `-q`. * Fixed a bug on Windows where files were not read correctly. * Added a command-line flag (`-z`) to make `bc` and `dc` print leading zeroes on numbers `-1 < x < 1`. * Added four functions to `lib2.bc` (`plz()`, `plznl()`, `pnlz()`, and `pnlznl()`) to allow printing numbers with or without leading zeros, despite the use of `-z` or not. * Added builtin functions to query global state like line length, global stacks, and leading zeroes. * Added a command-line flag (`-L`) to disable wrapping when printing numbers. * Improved builds on Windows. ## 5.0.2 This is a production release with one fix for a flaky test. If you have not experienced problems with the test suite, you do ***NOT*** need to upgrade. The test was one that tested whether `bc` fails gracefully when it can't allocate memory. Unfortunately, there are cases when Linux and FreeBSD lie and pretend to allocate the memory. The reason they do this is because a lot of programs don't use all of the memory they allocate, so those OS's usually get away with it. However, this `bc` uses all of the memory it allocates (at least at page granularity), so when it tries to use the memory, FreeBSD and Linux kill it. This only happens sometimes, however. Other times (on my machine), they do, in fact, refuse the request. 
So I changed the test to not test for that because I think the graceful failure code won't really change much. ## 5.0.1 This is a production release with two fixes: * Fix for the build on Mac OSX. * Fix for the build on Android. Users that do not use those platforms do ***NOT*** need to update. ## 5.0.0 This is a major production release with several changes: * Added support for OpenBSD's `pledge()` and `unveil()`. * Fixed print bug where a backslash newline combo was printed even if only one digit was left, something I blindly copied from GNU `bc`, like a fool. * Fixed bugs in the manuals. * Fixed a possible multiplication overflow in power. * Temporary numbers are garbage collected if allocation fails, and the allocation is retried. This is to make `bc` and `dc` more resilient to running out of memory. * Limited the number of temporary numbers and made the space for them static so that allocating more space for them cannot fail. * Allowed integers with non-zero `scale` to be used with power, places, and shift operators. * Added greatest common divisor and least common multiple to `lib2.bc`. * Added `SIGQUIT` handling to history. * Added a command to `dc` (`y`) to get the length of register stacks. * Fixed multi-digit bugs in `lib2.bc`. * Removed the no prompt build option. * Created settings that builders can set defaults for and users can set their preferences for. This includes the `bc` banner, resetting on `SIGINT`, TTY mode, and prompt. * Added history support to Windows. * Fixed bugs with the handling of register names in `dc`. * Fixed bugs with multi-line comments and strings in both calculators. * Added a new error type and message for `dc` when register stacks don't have enough items. * Optimized string allocation. * Made `bc` and `dc` UTF-8 capable. * Fixed a bug with `void` functions. * Fixed a misspelled symbol in `bcl`. This is technically a breaking change, which requires this to be `5.0.0`. 
* Added the ability for users to get the copyright banner back. * Added the ability for users to have `bc` and `dc` quit on `SIGINT`. * Added the ability for users to disable prompt and TTY mode by environment variables. * Added the ability for users to redefine keywords. This is another reason this is `5.0.0`. * Added `dc`'s modular exponentiation and divmod to `bc`. * Added the ability to assign strings to variables and array elements and pass them to functions in `bc`. * Added `dc`'s asciify command and stream printing to `bc`. * Added a command to `dc` (`Y`) to get the length of an array. * Added a command to `dc` (`,`) to get the depth of the execution stack. * Added bitwise and, or, xor, left shift, right shift, reverse, left rotate, right rotate, and mod functions to `lib2.bc`. * Added the functions `s2u(x)` and `s2un(x,n)`, to `lib2.bc`. ## 4.0.2 This is a production release that fixes two bugs: 1. If no files are used and the first statement on `stdin` is invalid, `scale` would not be set to `20` even if `-l` was used. 2. When using history, `bc` failed to respond properly to `SIGSTOP` and `SIGTSTP`. ## 4.0.1 This is a production release that only adds one thing: flushing output when it is printed with a print statement. ## 4.0.0 This is a production release with many fixes, a new command-line option, and a big surprise: * A bug was fixed in `dc`'s `P` command where the item on the stack was *not* popped. * Various bugs in the manuals have been fixed. * A known bug was fixed where history did not interact well with prompts printed by user code without newlines. * A new command-line option, `-R` and `--no-read-prompt` was added to disable just the prompt when using `read()` (`bc`) or `?` (`dc`). * And finally, **official support for Windows was added**. The last item is why this is a major version bump. 
Currently, only one set of build options (extra math and prompt enabled, history and NLS/locale support disabled, both calculators enabled) is supported on Windows. However, both debug and release builds are supported. In addition, Windows builds are supported for the library (`bcl`). For more details about how to build on Windows, see the [README][5] or the [build manual][13]. ## 3.3.4 This is a production release that fixes a small bug. The bug was that output was not flushed before a `read()` call, so prompts without a newline on the end were not flushed before the `read()` call. This is such a tiny bug that users only need to upgrade if they are affected. ## 3.3.3 This is a production release with one tweak and fixes for manuals. The tweak is that `length(0)` returns `1` instead of `0`. In `3.3.1`, I changed it so `length(0.x)`, where `x` could be any number of digits, returned the `scale`, but `length(0)` still returned `0` because I believe that `0` has `0` significant digits. At the request of FreeBSD and considering the arguments of a mathematician, compatibility with other `bc`'s, and the expectations of users, I decided to make the change. The fixes for manuals fixed a bug where `--` was rendered as `-`. ## 3.3.2 This is a production release that fixes a divide-by-zero bug in `root()` in the [extended math library][16]. All previous versions with `root()` have the bug. ## 3.3.1 This is a production release that fixes a bug. The bug was in the reporting of number length when the value was 0. ## 3.3.0 This is a production release that changes one behavior and fixes documentation bugs. The changed behavior is the treatment of `-e` and `-f` when given through `BC_ENV_ARGS` or `DC_ENV_ARGS`. Now `bc` and `dc` do not exit when those options (or their equivalents) are given through those environment variables. However, `bc` and `dc` still exit when they or their equivalents are given on the command-line.
## 3.2.7 This is a production release that removes a small non-portable shell operation in `configure.sh`. This problem was only noticed on OpenBSD, not FreeBSD or Linux. Non-OpenBSD users do ***NOT*** need to upgrade, although NetBSD users may also need to upgrade. ## 3.2.6 This is a production release that fixes the build on FreeBSD. There was a syntax error in `configure.sh` that the Linux shell did not catch, and FreeBSD depends on the existence of `tests/all.sh`. All users that already upgraded to `3.2.5` should update to this release, with my apologies for the poor release of `3.2.5`. Other users should skip `3.2.5` in favor of this version. ## 3.2.5 This is a production release that fixes several bugs and adds a couple of small things. The two most important bugs were bugs that caused `dc` to access memory out-of-bounds (crash in debug builds). These were found by upgrading to `afl++` from `afl`. Both were caused by a failure to distinguish between the same two cases. Another bug was the failure to put all of the licenses in the `LICENSE.md` file. Third, some warnings by `scan-build` were found and eliminated. This needed one big change: `bc` and `dc` now bail out as fast as possible on fatal errors instead of unwinding the stack. Fourth, the pseudo-random number generator now attempts to seed itself with `/dev/random` if `/dev/urandom` fails. Finally, this release has a few quality-of-life changes to the build system. The usage should not change at all; the only thing that changed was making sure the `Makefile.in` was written to rebuild properly when headers changed and to not rebuild when not necessary. ## 3.2.4 This is a production release that fixes a warning on `gcc` 6 or older, which does not have an attribute that is used. Users do ***NOT*** need to upgrade if they don't use `gcc` 6 or older. ## 3.2.3 This is a production release that fixes a bug in `gen/strgen.sh`. I recently changed `gen/strgen.c`, but I did not change `gen/strgen.sh`.
Users that do not use `gen/strgen.sh` do not need to upgrade. ## 3.2.2 This is a production release that fixes a portability bug in `configure.sh`. The bug was using the GNU `find` extension `-wholename`. ## 3.2.1 This is a production release that has one fix for `bcl(3)`. It is technically not a bug fix since the behavior is undefined, but the `BclNumber`s that `bcl_divmod()` returns will be set to `BCL_ERROR_INVALID_NUM` if there is an error. Previously, they were not set. ## 3.2.0 This is a production release that has one bug fix and a major addition. The bug fix was a missing `auto` variable in the bessel `j()` function in the math library. The major addition is a way to build a version of `bc`'s math code as a library. This is done with the `-a` option to `configure.sh`. The API for the library can be read in `./manuals/bcl.3.md` or `man bcl` once the library is installed with `make install`. This library was requested by developers before I even finished version 1.0, but I could not figure out how to do it until now. If the library has API breaking changes, the major version of `bc` will be incremented. ## 3.1.6 This is a production release that fixes a new warning from Clang 12 for FreeBSD and also removes some possible undefined behavior found by UBSan that compilers did not seem to take advantage of. Users do ***NOT*** need to upgrade, if they do not want to. ## 3.1.5 This is a production release that fixes the Chinese locales (which caused `bc` to crash) and a crash caused by `bc` executing code when it should not have been able to. ***ALL USERS SHOULD UPGRADE.*** ## 3.1.4 This is a production release that fixes one bug, changes two behaviors, and removes one environment variable. The bug is like the one in the last release except it applies if files are being executed. I also made the fix more general. The behavior that was changed is that `bc` now exits when given `-e`, `-f`, `--expression` or `--file`. 
However, if the last one of those is `-f-` (using `stdin` as the file), `bc` does not exit. If `-f-` exists and is not the last of the `-e` and `-f` options (and equivalents), `bc` gives a fatal error and exits. Next, I removed the `BC_EXPR_EXIT` and `DC_EXPR_EXIT` environment variables since their use is not needed with the behavior change. Finally, I made it so `bc` does not print the header, though the `-q` and `--quiet` options were kept for compatibility with GNU `bc`. ## 3.1.3 This is a production release that fixes one minor bug: if `bc` was invoked like the following, it would error: ``` echo "if (1 < 3) 1" | bc ``` Unless users run into this bug, they do not need to upgrade, but it is suggested that they do. ## 3.1.2 This is a production release that adds a way to install *all* locales. Users do ***NOT*** need to upgrade. For package maintainers wishing to make use of the change, just pass `-l` to `configure.sh`. ## 3.1.1 This is a production release that adds two Spanish locales. Users do ***NOT*** need to upgrade, unless they want those locales. ## 3.1.0 This is a production release that adjusts one behavior, fixes eight bugs, and improves manpages for FreeBSD. Because this release fixes bugs, **users and package maintainers should update to this version as soon as possible**. The behavior that was adjusted was how code from the `-e` and `-f` arguments (and equivalents) was executed. It used to be executed as one big chunk, but in this release, it is now executed line-by-line. The first bug fix was in how output to `stdout` was handled on `SIGINT`. If a `SIGINT` came in, the `stdout` buffer was not correctly flushed. In fact, a clean-up function was not getting called. This release fixes that bug. The second bug was in how `dc` handled input from `stdin`. This affected `bc` as well since it was a mishandling of the `stdin` buffer. The third fixed bug was that `bc` and `dc` could `abort()` (in debug mode) when receiving a `SIGTERM`.
This one was a race condition with pushing and popping items onto and out of vectors. The fourth bug fixed was that `bc` could leave extra items on the stack and thus not properly clean up some memory. (The memory would still get `free()`'ed, but it would not be `free()`'ed when it could have been.) The next two bugs were bugs in `bc`'s parser that caused crashes when executing the resulting code. The last two bugs were crashes in `dc` that resulted from mishandling of strings. The manpage improvement was done by switching from [ronn][20] to [Pandoc][21] to generate manpages. Pandoc generates much cleaner manpages and doesn't leave blank lines where they shouldn't be. ## 3.0.3 This is a production release that adds one new feature: specific manpages. Before this release, `bc` and `dc` only used one manpage each that referred to various build options. This release changes it so there is one manpage set per relevant build type. Each manual only has information about its particular build, and `configure.sh` selects the correct set for install. ## 3.0.2 This is a production release that adds `utf8` locale symlinks and removes an unused `auto` variable from the `ceil()` function in the [extended math library][16]. Users do ***NOT*** need to update unless they want the locales. ## 3.0.1 This is a production release with two small changes. Users do ***NOT*** need to upgrade to this release; however, if they haven't upgraded to `3.0.0` yet, it may be worthwhile to upgrade to this release. The first change is fixing a compiler warning on FreeBSD with strict warnings on. The second change is to make the new implementation of `ceil()` in `lib2.bc` much more efficient. ## 3.0.0 *Notes for package maintainers:* *First, the `2.7.0` release series saw a change in the option parsing. This made me change one error message and add a few others. The error message that was changed removed one format specifier. This means that `printf()` will segfault on old locale files.
Unfortunately, `bc` cannot use any locale files except the global ones that are already installed, so it will use the previous ones while running tests during install. **If `bc` segfaults while running arg tests when updating, it is because the global locale files have not been replaced. Make sure to either prevent the test suite from running on update or remove the old locale files before updating.** (Removing the locale files can be done with `make uninstall` or by running the [`locale_uninstall.sh`][22] script.) Once this is done, `bc` should install without problems.* *Second, **the option to build without signal support has been removed**. See below for the reasons why.* This is a production release with some small bug fixes, a few improvements, three major bug fixes, and a complete redesign of `bc`'s error and signal handling. **Users and package maintainers should update to this version as soon as possible.** The first major bug fix was in how `bc` executed files. Previously, a whole file was parsed before it was executed, but if a function is defined *after* code, especially if the function definition was actually a redefinition, and the code before the definition referred to the previous function, this `bc` would replace the function before executing any code. The fix was to make sure that all code that existed before a function definition was executed. The second major bug fix was in `bc`'s `lib2.bc`. The `ceil()` function had a bug where a `0` in the decimal place after the truncation position, caused it to output the wrong numbers if there was any non-zero digit after. The third major bug is that when passing parameters to functions, if an expression included an array (not an array element) as a parameter, it was accepted, when it should have been rejected. It is now correctly rejected. Beyond that, this `bc` got several improvements that both sped it up, improved the handling of signals, and improved the error handling. 
First, the requirements for `bc` were pushed back to POSIX 2008. `bc` uses one function, `strdup()`, which is not in POSIX 2001 proper but in the X/Open System Interfaces group of 2001. It is, however, in POSIX 2008, and since POSIX 2008 is old enough to be supported anywhere that I care, that should be the requirement. Second, the BcVm global variable was put into `bss`. This actually slightly reduces the size of the executable from a massive code shrink, and it will stop `bc` from allocating a large set of memory when `bc` starts. Third, the default Karatsuba length was updated from 64 to 32 after making the optimization changes below, since 32 is going to be better than 64 after the changes. Fourth, Spanish translations were added. Fifth, the interpreter received a speedup to make performance on non-math-heavy scripts more competitive with GNU `bc`. While improvements did, in fact, get it much closer (see the [benchmarks][19]), it isn't quite there. There were several things done to speed up the interpreter: First, several small inefficiencies were removed. These inefficiencies included calling the function `bc_vec_pop(v)` twice instead of calling `bc_vec_npop(v, 2)`. They also included an extra function call for checking the size of the stack and checking the size of the stack more than once on several operations. Second, since the current `bc` function is the one that stores constants and strings, the program caches pointers to the current function's vectors of constants and strings to prevent needing to grab the current function in order to grab a constant or a string. Third, `bc` tries to reuse `BcNum`'s (the internal representation of arbitrary-precision numbers). If a `BcNum` has the default capacity of `BC_NUM_DEF_SIZE` (32 on 64-bit and 16 on 32-bit) when it is freed, it is added to a list of available `BcNum`'s.
And then, when a `BcNum` is allocated with a capacity of `BC_NUM_DEF_SIZE` and any `BcNum`'s exist on the list of reusable ones, one of those ones is grabbed instead. In order to support these changes, the `BC_NUM_DEF_SIZE` was changed. It used to be 16 bytes on all systems, but it was changed to more closely align with the minimum allocation size on Linux, which is either 32 bytes (64-bit musl), 24 bytes (64-bit glibc), 16 bytes (32-bit musl), or 12 bytes (32-bit glibc). Since these are the minimum allocation sizes, these are the sizes that would be allocated anyway, making it worth it to just use the whole space, so the value of `BC_NUM_DEF_SIZE` on 64-bit systems was changed to 32 bytes. On top of that, at least on 64-bit, `BC_NUM_DEF_SIZE` supports numbers with either 72 integer digits or 45 integer digits and 27 fractional digits. This should be more than enough for most cases since `bc`'s default `scale` values are 0 or 20, meaning that, by default, it has at most 20 fractional digits. And 45 integer digits are *a lot*; it's enough to calculate the amount of mass in the Milky Way galaxy in kilograms. Also, 72 digits is enough to calculate the diameter of the universe in Planck lengths. (For 32-bit, these numbers are either 32 integer digits or 12 integer digits and 20 fractional digits. These are also quite big, and going much bigger on a 32-bit system seems a little pointless since 12 digits is just under a trillion and 20 fractional digits is still enough for about any use since `10^-20` light years is just under a millimeter.) All of this together means that for ordinary uses, and even uses in scientific work, the default number size will be all that is needed, which means that nearly all, if not all, numbers will be reused, relieving pressure on the system allocator. I did several experiments to find the changes that had the most impact, especially with regard to reusing `BcNum`'s. 
One was putting `BcNum`'s into buckets according to their capacity in powers of 2 up to 512. That performed worse than `bc` did in `2.7.2`. Another was putting any `BcNum` on the reuse list that had a capacity of `BC_NUM_DEF_SIZE * 2` and reusing them for `BcNum`'s that requested `BC_NUM_DEF_SIZE`. This did reduce the amount of time spent, but it also spent a lot of time in the system allocator for an unknown reason. (When using `strace`, a bunch more `brk` calls showed up.) Just reusing `BcNum`'s that had exactly `BC_NUM_DEF_SIZE` capacity spent the smallest amount of time in both user and system time. This makes sense, especially with the changes to make `BC_NUM_DEF_SIZE` bigger on 64-bit systems, since the vast majority of numbers will only ever use numbers with a size less than or equal to `BC_NUM_DEF_SIZE`. Last of all, `bc`'s signal handling underwent a complete redesign. (This is the reason that this version is `3.0.0` and not `2.8.0`.) The change was to move from a polling approach to signal handling to an interrupt-based approach. Previously, every single loop condition had a check for signals. I suspect that this could be expensive when in tight loops. Now, the signal handler just uses `longjmp()` (actually `siglongjmp()`) to start an unwinding of the stack until it is stopped or the stack is unwound to `main()`, which just returns. If `bc` is currently executing code that cannot be safely interrupted (according to POSIX), then signals are "locked." The signal handler checks if the lock is taken, and if it is, it just sets the status to indicate that a signal arrived. Later, when the signal lock is released, the status is checked to see if a signal came in. If so, the stack unwinding starts. This design eliminates polling in favor of maintaining a stack of `jmp_buf`'s. This has its own performance implications, but it gives better interaction. And the cost of pushing and popping a `jmp_buf` in a function is paid at most twice. 
Most functions do not pay that price, and most of the rest only pay it once. (There are only some 3 functions in `bc` that push and pop a `jmp_buf` twice.) As a side effect of this change, I had to eliminate the use of `stdio.h` in `bc` because `stdio` does not play nice with signals and `longjmp()`. I implemented custom I/O buffer code that takes a fraction of the size. This means that static builds will be smaller, but non-static builds will be bigger, though they will have less linking time. This change is also good because my history implementation was already bypassing `stdio` for good reasons, and unifying the architecture was a win. Another reason for this change is that my `bc` should *always* behave correctly in the presence of signals like `SIGINT`, `SIGTERM`, and `SIGQUIT`. With the addition of my own I/O buffering, I needed to also make sure that the buffers were correctly flushed even when such signals happened. For this reason, I **removed the option to build without signal support**. As a nice side effect of this change, the error handling code could be changed to take advantage of the stack unwinding that signals used. This means that signals and error handling use the same code paths, which means that the stack unwinding is well-tested. (Errors are tested heavily in the test suite.) It also means that functions do not need to return a status code that ***every*** caller needs to check. This eliminated over 100 branches that simply checked return codes and then passed that return code up the stack if necessary. The code bloat savings from this is at least 1700 bytes on `x86_64`, *before* taking into account the extra code from removing `stdio.h`. ## 2.7.2 This is a production release with one major bug fix. The `length()` built-in function can take either a number or an array. If it takes an array, it returns the length of the array. Arrays can be passed by reference. 
The bug is that the `length()` function would not properly dereference arrays that were references. This is a bug that affects all users. **ALL USERS SHOULD UPDATE `bc`**. ## 2.7.1 This is a production release with fixes for new locales and fixes for compiler warnings on FreeBSD. ## 2.7.0 This is a production release with a bug fix for Linux, new translations, and new features. Bug fixes: * Option parsing in `BC_ENV_ARGS` was broken on Linux in 2.6.1 because `glibc`'s `getopt_long()` is broken. To get around that, and to support long options on every platform, an adapted version of [`optparse`][17] was added. Now, `bc` does not even use `getopt()`. * Parsing `BC_ENV_ARGS` with quotes now works. It isn't the smartest, but it does the job if there are spaces in file names. The following new languages are supported: * Dutch * Polish * Russian * Japanese * Simplified Chinese All of these translations were generated using [DeepL][18], so improvements are welcome. There is only one new feature: **`bc` now has a built-in pseudo-random number generator** (PRNG). The PRNG is seedable, making it useful for applications where `/dev/urandom` does not work because output needs to be reproducible. However, it also uses `/dev/urandom` to seed itself by default, so it will start with a good seed. It also outputs 32 bits on 32-bit platforms and 64 bits on 64-bit platforms, far better than the 15 bits of C's `rand()` and `bash`'s `$RANDOM`. In addition, the PRNG can take a bound, and when it gets a bound, it automatically adjusts to remove bias. It can also generate numbers of arbitrary size. (As of the time of release, the largest pseudo-random number generated by this `bc` was generated with a bound of `2^(2^20)`.) ***IMPORTANT: read the [`bc` manual][9] and the [`dc` manual][10] to find out exactly what guarantees the PRNG provides.
The underlying implementation is not guaranteed to stay the same, but the guarantees that it provides are guaranteed to stay the same regardless of the implementation.*** On top of that, four functions were added to `bc`'s [extended math library][16] to make using the PRNG easier: * `frand(p)`: Generates a number between `[0,1)` to `p` decimal places. * `ifrand(i, p)`: Generates an integer with bound `i` and adds it to `frand(p)`. * `srand(x)`: Randomizes the sign of `x`. In other words, it flips the sign of `x` with probability `0.5`. * `brand()`: Returns a random boolean value (either `0` or `1`). ## 2.6.1 This is a production release with a bug fix for FreeBSD. The bug was that when `bc` was built without long options, it would give a fatal error on every run. This was caused by a mishandling of `optind`. ## 2.6.0 This release is a production release ***with no bugfixes***. If you do not want to upgrade, you don't have to. No source code changed; the only thing that changed was `lib2.bc`. This release adds one function to the [extended math library][16]: `p(x, y)`, which calculates `x` to the power of `y`, whether or not `y` is an integer. (The `^` operator can only accept integer powers.) This release also includes a couple of small tweaks to the [extended math library][16], mostly to fix returning numbers with too high of `scale`. ## 2.5.3 This release is a production release which addresses inconsistencies in the Portuguese locales. No `bc` code was changed. The issues were that the ISO files used different naming, and also that the files that should have been symlinks were not. I did not catch that because GitHub rendered them the exact same way. ## 2.5.2 This release is a production release. No code was changed, but the build system was changed to allow `CFLAGS` to be given to `CC`, like this: ``` CC="gcc -O3 -march=native" ./configure.sh ``` If this happens, the flags are automatically put into `CFLAGS`, and the compiler is set appropriately. 
In the example above, this means that `CC` will be "gcc" and `CFLAGS` will be "-O3 -march=native". This behavior was added to conform to GNU autotools practices. ## 2.5.1 This is a production release which addresses portability concerns discovered in the `bc` build system. No `bc` code was changed. * Support for Solaris SPARC and AIX was added. * Minor documentation edits were performed. * An option for `configure.sh` was added to disable long options if `getopt_long()` is missing. ## 2.5.0 This is a production release with new translations. No code changed. The translations were contributed by [bugcrazy][15], and they are for Portuguese, both Portugal and Brazil locales. ## 2.4.0 This is a production release primarily aimed at improving `dc`. * A couple of copy and paste errors in the [`dc` manual][10] were fixed. * `dc` startup was optimized by making sure it didn't have to set up `bc`-only things. * The `bc` `&&` and `||` operators were made available to `dc` through the `M` and `m` commands, respectively. * `dc` macros were changed to be tail call-optimized. The last item, tail call optimization, means that if the last thing in a macro is a call to another macro, then the old macro is popped before executing the new macro. This change was made to stop `dc` from consuming more and more memory as macros are executed in a loop. The `q` and `Q` commands still respect the "hidden" macros by way of recording how many macros were removed by tail call optimization. ## 2.3.2 This is a production release meant to fix warnings in the Gentoo `ebuild` by making it possible to disable binary stripping. Other users do *not* need to upgrade. ## 2.3.1 This is a production release. It fixes a bug that caused `-1000000000 < -1` to return `0`. This only happened with negative numbers and only if the value on the left was more negative by a certain amount. That said, this bug *is* a bad bug and needs to be fixed. **ALL USERS SHOULD UPDATE `bc`**.
## 2.3.0 This is a production release with changes to the build system. ## 2.2.0 This release is a production release. It only has new features and performance improvements. 1. The performance of `sqrt(x)` was improved. 2. The new function `root(x, n)` was added to the extended math library to calculate `n`th roots. 3. The new function `cbrt(x)` was added to the extended math library to calculate cube roots. ## 2.1.3 This is a non-critical release; it just changes the build system, and in non-breaking ways: 1. Linked locale files were changed to link to their sources with a relative link. 2. A bug in `configure.sh` that caused long option parsing to fail under `bash` was fixed. ## 2.1.2 This release is not a critical release. 1. A few codes were added to history. 2. Multiplication was optimized a bit more. 3. Addition and subtraction were both optimized a bit more. ## 2.1.1 This release contains a fix for the test suite made for Linux from Scratch: now the test suite prints `pass` when a test is passed. Other than that, there is no change in this release, so distros and other users do not need to upgrade. ## 2.1.0 This release is a production release. The following bugs were fixed: 1. A `dc` bug that caused stack mishandling was fixed. 2. A warning on OpenBSD was fixed. 3. Bugs in `ctrl+arrow` operations in history were fixed. 4. The ability to paste multiple lines in history was added. 5. A `bc` bug, mishandling of array arguments to functions, was fixed. 6. A crash caused by freeing the wrong pointer was fixed. 7. A `dc` bug where strings, in a rare case, were mishandled in parsing was fixed. In addition, the following changes were made: 1. Division was slightly optimized. 2. An option was added to the build to disable printing of prompts. 3. The special case of empty arguments is now handled. This is to prevent errors in scripts that end up passing empty arguments. 4. A harmless bug was fixed. 
   This bug was that, with the pop instructions (mostly) removed (see below), `bc` would leave extra values on its stack for `void` functions and in a few other cases. These extra items would not affect anything put on the stack and would not cause any sort of crash or even buggy behavior, but they would cause `bc` to take more memory than it needed.

On top of the above changes, the following optimizations were added:

1. The need for pop instructions in `bc` was removed.
2. Extra tests on every iteration of the interpreter loop were removed.
3. Updating function and code pointers on every iteration of the interpreter loop was changed to only updating them when necessary.
4. Extra assignments to pointers were removed.

Altogether, these changes sped up the interpreter by around 2x.

***NOTE***: This is the last release with new features because this `bc` is now considered complete. From now on, only bug fixes and new translations will be added to this `bc`.

## 2.0.3

This is a production, bug-fix release.

Two bugs were fixed in this release:

1. A rare and subtle signal handling bug was fixed.
2. A misbehavior on `0` to a negative power was fixed.

The last bug bears some mentioning.

When I originally wrote power, I did not thoroughly check its error cases; instead, I had it check if the first number was `0` and then if so, just return `0`. However, `0` to a negative power means that `1` will be divided by `0`, which is an error.

I caught this, but only after I stopped being cocky. You see, sometime later, I had noticed that GNU `bc` returned an error, correctly, but I thought it was wrong simply because that's not what my `bc` did. I saw it again later and had a double take. I checked for real, finally, and found out that my `bc` was wrong all along.

That was bad on me. But the bug was easy to fix, so it is fixed now.

There are two other things in this release:

1. Subtraction was optimized by [Stefan Eßer][14].
2. Division was also optimized, also by Stefan Eßer.
## 2.0.2

This release contains a fix for a possible overflow in the signal handling. I would be surprised if any users ran into it because it would only happen after 2 billion (`2^31-1`) `SIGINT`'s, but I saw it and had to fix it.

## 2.0.1

This release contains very few things that will apply to any users.

1. A slight bug in `dc`'s interactive mode was fixed.
2. A bug in the test suite that was only triggered on NetBSD was fixed.
3. **The `-P`/`--no-prompt` option** was added for users that do not want a prompt.
4. A `make check` target was added as an alias for `make test`.
5. `dc` got its own read prompt: `?> `.

## 2.0.0

This release is a production release.

This release is also a little different from previous releases. From here on out, I do not plan on adding any more features to this `bc`; I believe that it is complete. However, there may be bug fix releases in the future, if I or any others manage to find bugs.

This release has only a few new features:

1. `atan2(y, x)` was added to the extended math library as both `a2(y, x)` and `atan2(y, x)`.
2. Locales were fixed.
3. A **POSIX shell-compatible script was added as an alternative to compiling `gen/strgen.c`** on a host machine. More details about making the choice between the two can be found by running `./configure.sh --help` or reading the [build manual][13].
4. Multiplication was optimized by using **diagonal multiplication**, rather than straight brute force.
5. The `locale_install.sh` script was fixed.
6. `dc` was given the ability to **use the environment variable `DC_ENV_ARGS`**.
7. `dc` was also given the ability to **use the `-i` or `--interactive`** options.
8. Printing the prompt was fixed so that it did not print when it shouldn't.
9. Signal handling was fixed.
10. **Handling of `SIGTERM` and `SIGQUIT`** was fixed.
11. The **built-in functions `maxibase()`, `maxobase()`, and `maxscale()`** (the commands `T`, `U`, `V` in `dc`, respectively) were added to allow scripts to query for the max allowable values of those globals.
12. Some incompatibilities with POSIX were fixed.

In addition, this release is `2.0.0` for a big reason: the internal format for numbers changed. They used to be a `char` array. Now, they are an array of larger integers, packing more decimal digits into each integer. This has delivered ***HUGE*** performance improvements, especially for multiplication, division, and power.

This `bc` should now be the fastest `bc` available, but I may be wrong.

## 1.2.8

This release contains a fix for a harmless bug (it is harmless in that it still works, but it just copies extra data) in the [`locale_install.sh`][12] script.

## 1.2.7

This version contains fixes for the build on Arch Linux.

## 1.2.6

This release removes the use of `local` in shell scripts because it's not POSIX shell-compatible, and also updates a man page that should have been updated a long time ago but was missed.

## 1.2.5

This release contains some missing locale `*.msg` files.

## 1.2.4

This release contains a few bug fixes and new French translations.

## 1.2.3

This release contains a fix for a bug: use of uninitialized data. Such data was only used when outputting an error message, but I am striving for perfection. As Michelangelo said, "Trifles make perfection, and perfection is no trifle."

## 1.2.2

This release contains fixes for OpenBSD.

## 1.2.1

This release contains bug fixes for some rare bugs.

## 1.2.0

This is a production release.

There have been several changes since `1.1.0`:

1. The build system had some changes.
2. Locale support has been added. (Patches welcome for translations.)
3. **The ability to turn `ibase`, `obase`, and `scale` into stacks** was added with the `-g` command-line option. (See the [`bc` manual][9] for more details.)
4. Support for compiling on Mac OSX out of the box was added.
5. The extended math library got `t(x)`, `ceil(x)`, and some aliases.
6. The extended math library also got `r2d(x)` (for converting from radians to degrees) and `d2r(x)` (for converting from degrees to radians). This is to allow using degrees with the standard library.
7. Both calculators now accept numbers in **scientific notation**. See the [`bc` manual][9] and the [`dc` manual][10] for details.
8. Both calculators can **output in either scientific or engineering notation**. See the [`bc` manual][9] and the [`dc` manual][10] for details.
9. Some inefficiencies were removed.
10. Some bugs were fixed.
11. Some bugs in the extended library were fixed.
12. Some defects from [Coverity Scan][11] were fixed.

## 1.1.4

This release contains a fix to the build system that allows it to build on older versions of `glibc`.

## 1.1.3

This release contains a fix for a bug in the test suite where `bc` tests and `dc` tests could not be run in parallel.

## 1.1.2

This release has a fix for a history bug; the down arrow did not work.

## 1.1.1

This release fixes a bug in the `1.1.0` build system. The source is exactly the same.

The bug that was fixed was a failure to install if no `EXECSUFFIX` was used.

## 1.1.0

This is a production release. However, many new features were added since `1.0`.

1. **The build system has been changed** to use a custom, POSIX shell-compatible configure script ([`configure.sh`][6]) to generate a POSIX make-compatible `Makefile`, which means that `bc` and `dc` now build out of the box on any POSIX-compatible system.
2. Out-of-memory and output errors now cause the `bc` to report the error, clean up, and die, rather than just reporting and trying to continue.
3. **Strings and constants are now garbage collected** when possible.
4. Signal handling and checking has been made more simple and more thorough.
5. `BcGlobals` was refactored into `BcVm` and `BcVm` was made global. Some procedure names were changed to reflect its difference to everything else.
6. Addition got a speed improvement.
7. Some common code for addition and multiplication was refactored into its own procedure.
8. A bug was removed where `dc` could have been selected, but the internal `#define` that returned `true` for a query about `dc` would not have returned `true`.
9. Useless calls to `bc_num_zero()` were removed.
10. **History support was added.** The history support is based off of a [UTF-8 aware fork][7] of [`linenoise`][8], which has been customized with `bc`'s own data structures and signal handling.
11. Generating C source from the math library now removes tabs from the library, shrinking the size of the executable.
12. The math library was shrunk.
13. Error handling and reporting was improved.
14. Reallocations were reduced by giving access to the request size for each operation.
15. **`abs()` (`b` command for `dc`) was added as a builtin.**
16. Both calculators were tested on FreeBSD.
17. Many obscure parse bugs were fixed.
18. Markdown and man page manuals were added, and the man pages are installed by `make install`.
19. Executable size was reduced, though the added features probably made the executable end up bigger.
20. **GNU-style array references were added as a supported feature.**
21. Allocations were reduced.
22. **New operators were added**: `$` (`$` for `dc`), `@` (`@` for `dc`), `@=`, `<<` (`H` for `dc`), `<<=`, `>>` (`h` for `dc`), and `>>=`. See the [`bc` manual][9] and the [`dc` manual][10] for more details.
23. **An extended math library was added.** This library contains code that makes it so I can replace my desktop calculator with this `bc`. See the [`bc` manual][9] for more details.
24. Support for all capital letters as numbers was added.
25. **Support for GNU-style void functions was added.**
26. A bug fix for improper handling of function parameters was added.
27. Precedence for the or (`||`) operator was changed to match GNU `bc`.
28. `dc` was given an explicit negation command.
29. `dc` was changed to be able to handle strings in arrays.

## 1.1 Release Candidate 3

This release is the eighth release candidate for 1.1, though it is the third release candidate meant as a general release candidate. The new code has not been tested as thoroughly as it should for release.

## 1.1 Release Candidate 2

This release is the seventh release candidate for 1.1, though it is the second release candidate meant as a general release candidate. The new code has not been tested as thoroughly as it should for release.

## 1.1 FreeBSD Beta 5

This release is the sixth release candidate for 1.1, though it is the fifth release candidate meant specifically to test if `bc` works on FreeBSD. The new code has not been tested as thoroughly as it should for release.

## 1.1 FreeBSD Beta 4

This release is the fifth release candidate for 1.1, though it is the fourth release candidate meant specifically to test if `bc` works on FreeBSD. The new code has not been tested as thoroughly as it should for release.

## 1.1 FreeBSD Beta 3

This release is the fourth release candidate for 1.1, though it is the third release candidate meant specifically to test if `bc` works on FreeBSD. The new code has not been tested as thoroughly as it should for release.

## 1.1 FreeBSD Beta 2

This release is the third release candidate for 1.1, though it is the second release candidate meant specifically to test if `bc` works on FreeBSD. The new code has not been tested as thoroughly as it should for release.

## 1.1 FreeBSD Beta 1

This release is the second release candidate for 1.1, though it is meant specifically to test if `bc` works on FreeBSD. The new code has not been tested as thoroughly as it should for release.

## 1.1 Release Candidate 1

This is the first release candidate for 1.1. The new code has not been tested as thoroughly as it should for release.

## 1.0

This is the first non-beta release. `bc` is ready for production use.

As such, a lot has changed since 0.5.

1. `dc` has been added.
   It has been tested even more thoroughly than `bc` was for `0.5`. It does not have the `!` command, and for security reasons, it never will, so it is complete.
2. `bc` has been more thoroughly tested. An entire section of the test suite (for both programs) has been added to test for errors.
3. A prompt (`>>> `) has been added for interactive mode, making it easier to see inputs and outputs.
4. Interrupt handling has been improved, including elimination of race conditions (as much as possible).
5. MinGW and [Windows Subsystem for Linux][1] support has been added (see [xstatic][2] for binaries).
6. Memory leaks and errors have been eliminated (as far as ASan and Valgrind can tell).
7. Crashes have been eliminated (as far as [afl][3] can tell).
8. Karatsuba multiplication was added (and thoroughly tested), speeding up multiplication and power by orders of magnitude.
9. Performance was further enhanced by using a "divmod" function to reduce redundant divisions and by removing superfluous `memset()` calls.
10. To switch between Karatsuba and `O(n^2)` multiplication, the config variable `BC_NUM_KARATSUBA_LEN` was added. It is set to a sane default, but the optimal number can be found with [`karatsuba.py`][4] (requires Python 3) and then configured through `make`.
11. The random math test generator script was changed to Python 3 and improved. `bc` and `dc` have together been run through 30+ million random tests.
12. All known math bugs have been fixed, including out of control memory allocations in `sine` and `cosine` (that was actually a parse bug), certain cases of infinite loop on square root, and slight inaccuracies (as much as possible; see the [README][5]) in transcendental functions.
13. Parsing has been fixed as much as possible.
14. Test coverage was improved to 94.8%. The only paths not covered are ones that happen when `malloc()` or `realloc()` fails.
15. An extension to get the length of an array was added.
16. The boolean not (`!`) had its precedence change to match negation.
17. Data input was hardened.
18. `bc` was made fully compliant with POSIX when the `-s` flag is used or `POSIXLY_CORRECT` is defined.
19. Error handling was improved.
20. `bc` now checks that files it is given are not directories.

## 1.0 Release Candidate 7

This is the seventh release candidate for 1.0. It fixes a few bugs in 1.0 Release Candidate 6.

## 1.0 Release Candidate 6

This is the sixth release candidate for 1.0. It fixes a few bugs in 1.0 Release Candidate 5.

## 1.0 Release Candidate 5

This is the fifth release candidate for 1.0. It fixes a few bugs in 1.0 Release Candidate 4.

## 1.0 Release Candidate 4

This is the fourth release candidate for 1.0. It fixes a few bugs in 1.0 Release Candidate 3.

## 1.0 Release Candidate 3

This is the third release candidate for 1.0. It fixes a few bugs in 1.0 Release Candidate 2.

## 1.0 Release Candidate 2

This is the second release candidate for 1.0. It fixes a few bugs in 1.0 Release Candidate 1.

## 1.0 Release Candidate 1

This is the first Release Candidate for 1.0. `bc` is complete, with `dc`, but it is not tested.

## 0.5

This beta release completes more features, but it is still not complete nor tested as thoroughly as necessary.

## 0.4.1

This beta release fixes a few bugs in 0.4.

## 0.4

This is a beta release. It does not have the complete set of features, and it is not thoroughly tested.
[1]: https://docs.microsoft.com/en-us/windows/wsl/install-win10 [2]: https://pkg.musl.cc/bc/ [3]: http://lcamtuf.coredump.cx/afl/ [4]: ./scripts/karatsuba.py [5]: ./README.md [6]: ./configure.sh [7]: https://github.com/rain-1/linenoise-mob [8]: https://github.com/antirez/linenoise [9]: ./manuals/bc/A.1.md [10]: ./manuals/dc/A.1.md [11]: https://scan.coverity.com/projects/gavinhoward-bc [12]: ./scripts/locale_install.sh [13]: ./manuals/build.md [14]: https://github.com/stesser [15]: https://github.com/bugcrazy [16]: ./manuals/bc/A.1.md#extended-library [17]: https://github.com/skeeto/optparse [18]: https://www.deepl.com/translator [19]: ./manuals/benchmarks.md [20]: https://github.com/apjanke/ronn-ng [21]: https://pandoc.org/ [22]: ./scripts/locale_uninstall.sh diff --git a/include/bc.h b/include/bc.h index 3d4a11592875..a4198b91ebc6 100644 --- a/include/bc.h +++ b/include/bc.h @@ -1,458 +1,467 @@ /* * ***************************************************************************** * * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) 2018-2021 Gavin D. Howard and contributors. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * * Redistributions of source code must retain the above copyright notice, this * list of conditions and the following disclaimer. * * * Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * ***************************************************************************** * * Definitions for bc only. * */ #ifndef BC_BC_H #define BC_BC_H #if BC_ENABLED #include #include #include #include #include /** * The main function for bc. It just sets variables and passes its arguments * through to @a bc_vm_boot(). */ void bc_main(int argc, char *argv[]); // These are references to the help text, the library text, and the "filename" // for the library. extern const char bc_help[]; extern const char bc_lib[]; extern const char* bc_lib_name; // These are references to the second math library and its "filename." #if BC_ENABLE_EXTRA_MATH extern const char bc_lib2[]; extern const char* bc_lib2_name; #endif // BC_ENABLE_EXTRA_MATH /** * A struct containing information about a bc keyword. */ typedef struct BcLexKeyword { /// Holds the length of the keyword along with a bit that, if set, means the /// keyword is used in POSIX bc. uchar data; /// The keyword text. const char name[14]; } BcLexKeyword; /// Sets the most significant bit. Used for setting the POSIX bit in /// BcLexKeyword's data field. #define BC_LEX_CHAR_MSB(bit) ((bit) << (CHAR_BIT - 1)) /// Returns non-zero if the keyword is POSIX, zero otherwise. #define BC_LEX_KW_POSIX(kw) ((kw)->data & (BC_LEX_CHAR_MSB(1))) /// Returns the length of the keyword. #define BC_LEX_KW_LEN(kw) ((size_t) ((kw)->data & ~(BC_LEX_CHAR_MSB(1)))) /// A macro to easily build a keyword entry. 
See bc_lex_kws in src/data.c. #define BC_LEX_KW_ENTRY(a, b, c) \ { .data = ((b) & ~(BC_LEX_CHAR_MSB(1))) | BC_LEX_CHAR_MSB(c), .name = a } #if BC_ENABLE_EXTRA_MATH /// A macro for the number of keywords bc has. This has to be updated if any are /// added. This is for the redefined_kws field of the BcVm struct. #define BC_LEX_NKWS (35) #else // BC_ENABLE_EXTRA_MATH /// A macro for the number of keywords bc has. This has to be updated if any are /// added. This is for the redefined_kws field of the BcVm struct. #define BC_LEX_NKWS (31) #endif // BC_ENABLE_EXTRA_MATH // The array of keywords and its length. extern const BcLexKeyword bc_lex_kws[]; extern const size_t bc_lex_kws_len; /** * The @a BcLexNext function for bc. (See include/lex.h for a definition of * @a BcLexNext.) * @param l The lexer. */ void bc_lex_token(BcLex *l); // The following section is for flags needed when parsing bc code. These flags // are complicated, but necessary. Why you ask? Because bc's standard is awful. // // If you don't believe me, go read the bc Parsing section of the Development // manual (manuals/development.md). Then come back. // // In other words, these flags are the sign declaring, "Here be dragons." /** * This returns a pointer to the set of flags at the top of the flag stack. * @a p is expected to be a BcParse pointer. * @param p The parser. * @return A pointer to the top flag set. */ #define BC_PARSE_TOP_FLAG_PTR(p) ((uint16_t*) bc_vec_top(&(p)->flags)) /** * This returns the flag set at the top of the flag stack. @a p is expected to * be a BcParse pointer. * @param p The parser. * @return The top flag set. */ #define BC_PARSE_TOP_FLAG(p) (*(BC_PARSE_TOP_FLAG_PTR(p))) // After this point, all flag #defines are in sets of 2: one to define the flag, // and one to define a way to grab the flag from the flag set at the top of the // flag stack. All `p` arguments are pointers to a BcParse. // This flag is set if the parser has seen a left brace. 
#define BC_PARSE_FLAG_BRACE (UINTMAX_C(1)<<0) #define BC_PARSE_BRACE(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_BRACE) // This flag is set if the parser is parsing inside of the braces of a function // body. #define BC_PARSE_FLAG_FUNC_INNER (UINTMAX_C(1)<<1) #define BC_PARSE_FUNC_INNER(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_FUNC_INNER) // This flag is set if the parser is parsing a function. It is different from // the one above because it is set if it is parsing a function body *or* header, // not just if it's parsing a function body. #define BC_PARSE_FLAG_FUNC (UINTMAX_C(1)<<2) #define BC_PARSE_FUNC(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_FUNC) // This flag is set if the parser is expecting to parse a body, whether of a // function, an if statement, or a loop. #define BC_PARSE_FLAG_BODY (UINTMAX_C(1)<<3) #define BC_PARSE_BODY(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_BODY) // This flag is set if bc is parsing a loop. This is important because the break // and continue keywords are only valid inside of a loop. #define BC_PARSE_FLAG_LOOP (UINTMAX_C(1)<<4) #define BC_PARSE_LOOP(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_LOOP) // This flag is set if bc is parsing the body of a loop. It is different from // the one above the same way @a BC_PARSE_FLAG_FUNC_INNER is different from // @a BC_PARSE_FLAG_FUNC. #define BC_PARSE_FLAG_LOOP_INNER (UINTMAX_C(1)<<5) #define BC_PARSE_LOOP_INNER(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_LOOP_INNER) // This flag is set if bc is parsing an if statement. #define BC_PARSE_FLAG_IF (UINTMAX_C(1)<<6) #define BC_PARSE_IF(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_IF) // This flag is set if bc is parsing an else statement. This is important // because of "else if" constructions, among other things. #define BC_PARSE_FLAG_ELSE (UINTMAX_C(1)<<7) #define BC_PARSE_ELSE(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_ELSE) // This flag is set if bc just finished parsing an if statement and its body. 
// It tells the parser that it can probably expect an else statement next. This // flag is, thus, one of the most subtle. #define BC_PARSE_FLAG_IF_END (UINTMAX_C(1)<<8) #define BC_PARSE_IF_END(p) (BC_PARSE_TOP_FLAG(p) & BC_PARSE_FLAG_IF_END) /** * This returns true if bc is in a state where it should not execute any code * at all. * @param p The parser. * @return True if execution cannot proceed, false otherwise. */ #define BC_PARSE_NO_EXEC(p) ((p)->flags.len != 1 || BC_PARSE_TOP_FLAG(p) != 0) /** * This returns true if the token @a t is a statement delimiter, which is * either a newline or a semicolon. * @param t The token to check. * @return True if t is a statement delimiter token; false otherwise. */ #define BC_PARSE_DELIMITER(t) \ ((t) == BC_LEX_SCOLON || (t) == BC_LEX_NLINE || (t) == BC_LEX_EOF) /** * This is poorly named, but it basically returns whether or not the current * state is valid for the end of an else statement. * @param f The flag set to be checked. * @return True if the state is valid for the end of an else statement. */ #define BC_PARSE_BLOCK_STMT(f) \ ((f) & (BC_PARSE_FLAG_ELSE | BC_PARSE_FLAG_LOOP_INNER)) /** * This returns the value of the data for an operator with precedence @a p and * associativity @a l (true if left associative, false otherwise). This is used * to construct an array of operators, bc_parse_ops, in src/data.c. * @param p The precedence. * @param l True if the operator is left associative, false otherwise. * @return The data for the operator. */ #define BC_PARSE_OP(p, l) (((p) & ~(BC_LEX_CHAR_MSB(1))) | (BC_LEX_CHAR_MSB(l))) /** * Returns the operator data for the lex token @a t. * @param t The token to return operator data for. * @return The operator data for @a t. */ #define BC_PARSE_OP_DATA(t) bc_parse_ops[((t) - BC_LEX_OP_INC)] /** * Returns non-zero if operator @a op is left associative, zero otherwise. * @param op The operator to test for associativity. 
* @return Non-zero if the operator is left associative, zero otherwise. */ #define BC_PARSE_OP_LEFT(op) (BC_PARSE_OP_DATA(op) & BC_LEX_CHAR_MSB(1)) /** * Returns the precedence of operator @a op. Lower number means higher * precedence. * @param op The operator to return the precedence of. * @return The precedence of @a op. */ #define BC_PARSE_OP_PREC(op) (BC_PARSE_OP_DATA(op) & ~(BC_LEX_CHAR_MSB(1))) /** * A macro to easily define a series of bits for whether a lex token is an * expression token or not. It takes 8 expression bits, corresponding to the 8 * bits in a uint8_t. You can see this in use for bc_parse_exprs in src/data.c. * @param e1 The first bit. * @param e2 The second bit. * @param e3 The third bit. * @param e4 The fourth bit. * @param e5 The fifth bit. * @param e6 The sixth bit. * @param e7 The seventh bit. * @param e8 The eighth bit. * @return An expression entry for bc_parse_exprs[]. */ #define BC_PARSE_EXPR_ENTRY(e1, e2, e3, e4, e5, e6, e7, e8) \ ((UINTMAX_C(e1) << 7) | (UINTMAX_C(e2) << 6) | (UINTMAX_C(e3) << 5) | \ (UINTMAX_C(e4) << 4) | (UINTMAX_C(e5) << 3) | (UINTMAX_C(e6) << 2) | \ (UINTMAX_C(e7) << 1) | (UINTMAX_C(e8) << 0)) /** * Returns true if token @a i is a token that belongs in an expression. * @param i The token to test. * @return True if i is an expression token, false otherwise. */ #define BC_PARSE_EXPR(i) \ (bc_parse_exprs[(((i) & (uchar) ~(0x07)) >> 3)] & (1 << (7 - ((i) & 0x07)))) /** * Returns the operator (by lex token) that is at the top of the operator * stack. * @param p The parser. * @return The operator that is at the top of the operator stack, as a lex * token. */ #define BC_PARSE_TOP_OP(p) (*((BcLexType*) bc_vec_top(&(p)->ops))) /** * Returns true if bc has a "leaf" token. A "leaf" token is one that can stand * alone in an expression. For example, a number by itself can be an expression, * but a binary operator, while valid for an expression, cannot be alone in the * expression. 
  * It must have an expression to the left and right of itself. See
  * the documentation for @a bc_parse_expr_err() in src/bc_parse.c.
  * @param prev      The previous token as an instruction.
  * @param bin_last  True if the last operator was a binary operator, false
  *                  otherwise.
  * @param rparen    True if the last operator was a right paren.
  * @return          True if the last token was a leaf token, false otherwise.
  */
 #define BC_PARSE_LEAF(prev, bin_last, rparen) \
 	(!(bin_last) && ((rparen) || bc_parse_inst_isLeaf(prev)))

 /**
  * This returns true if the token @a t should be treated as though it's a
  * variable. This goes for actual variables, array elements, and globals.
  * @param t  The token to test.
  * @return   True if @a t should be treated as though it's a variable, false
  *           otherwise.
  */
 #if BC_ENABLE_EXTRA_MATH
 #define BC_PARSE_INST_VAR(t) \
 	((t) >= BC_INST_VAR && (t) <= BC_INST_SEED && (t) != BC_INST_ARRAY)
 #else // BC_ENABLE_EXTRA_MATH
 #define BC_PARSE_INST_VAR(t) \
 	((t) >= BC_INST_VAR && (t) <= BC_INST_SCALE && (t) != BC_INST_ARRAY)
 #endif // BC_ENABLE_EXTRA_MATH

 /**
  * Returns true if the previous token @a p (in the form of a bytecode
  * instruction) is a prefix operator. The fact that it is for bytecode
  * instructions is what makes it different from @a BC_PARSE_OP_PREFIX below.
  * @param p  The previous token.
  * @return   True if @a p is a prefix operator.
  */
 #define BC_PARSE_PREV_PREFIX(p) ((p) >= BC_INST_NEG && (p) <= BC_INST_BOOL_NOT)

 /**
  * Returns true if token @a t is a prefix operator.
  * @param t  The token to test.
  * @return   True if @a t is a prefix operator, false otherwise.
  */
 #define BC_PARSE_OP_PREFIX(t) ((t) == BC_LEX_OP_BOOL_NOT || (t) == BC_LEX_NEG)

 /**
  * We can calculate the conversion between tokens and bytecode instructions by
  * subtracting the position of the first operator in the lex enum and adding the
  * position of the first in the instruction enum. Note: This only works for
  * binary operators.
  * @param t  The token to turn into an instruction.
* @return The token as an instruction. */ #define BC_PARSE_TOKEN_INST(t) ((uchar) ((t) - BC_LEX_NEG + BC_INST_NEG)) /** * Returns true if the token is a bc keyword. * @param t The token to check. * @return True if @a t is a bc keyword, false otherwise. */ #define BC_PARSE_IS_KEYWORD(t) ((t) >= BC_LEX_KW_AUTO && (t) <= BC_LEX_KW_ELSE) /// A struct that holds data about what tokens should be expected next. There /// are a few instances of these, all named because they are used in specific /// cases. Basically, in certain situations, it's useful to use the same code, /// but have a list of valid tokens. /// /// Obviously, @a len is the number of tokens in the @a tokens array. If more /// than 4 is needed in the future, @a tokens will have to be changed. typedef struct BcParseNext { /// The number of tokens in the tokens array. uchar len; /// The tokens that can be expected next. uchar tokens[4]; } BcParseNext; /// A macro to construct an array literal of tokens from a parameter list. #define BC_PARSE_NEXT_TOKENS(...) .tokens = { __VA_ARGS__ } /// A macro to generate a BcParseNext literal from BcParseNext data. See /// src/data.c for examples. #define BC_PARSE_NEXT(a, ...) \ { .len = (uchar) (a), BC_PARSE_NEXT_TOKENS(__VA_ARGS__) } /// A status returned by @a bc_parse_expr_err(). It can either return success or /// an error indicating an empty expression. typedef enum BcParseStatus { BC_PARSE_STATUS_SUCCESS, BC_PARSE_STATUS_EMPTY_EXPR, } BcParseStatus; /** * The @a BcParseExpr function for bc. (See include/parse.h for a definition of * @a BcParseExpr.) * @param p The parser. * @param flags Flags that define the requirements that the parsed code must * meet or an error will result. See @a BcParseExpr for more info. */ void bc_parse_expr(BcParse *p, uint8_t flags); /** * The @a BcParseParse function for bc. (See include/parse.h for a definition of * @a BcParseParse.) * @param p The parser. */ void bc_parse_parse(BcParse *p); +/** + * Ends a series of if statements. 
+ * This is to ensure that full parses happen
+ * when a file finishes or before defining a function. Without this, bc thinks
+ * that it cannot parse any further. But if we reach the end of a file or a
+ * function definition, we know we can add an empty else clause.
+ * @param p  The parser.
+ */
+void bc_parse_endif(BcParse *p);
+
 /// References to the signal message and its length.
 extern const char bc_sig_msg[];
 extern const uchar bc_sig_msg_len;

 /// A reference to an array of bits that are set if the corresponding lex token
 /// is valid in an expression.
 extern const uint8_t bc_parse_exprs[];

 /// A reference to an array of bc operators.
 extern const uchar bc_parse_ops[];

 // References to the various instances of BcParseNext's.

 /// A reference to what tokens are valid as next tokens when parsing normal
 /// expressions. More accurately, these are the tokens that are valid for
 /// *ending* the expression.
 extern const BcParseNext bc_parse_next_expr;

 /// A reference to what tokens are valid as next tokens when parsing function
 /// parameters (well, actually arguments).
 extern const BcParseNext bc_parse_next_arg;

 /// A reference to what tokens are valid as next tokens when parsing a print
 /// statement.
 extern const BcParseNext bc_parse_next_print;

 /// A reference to what tokens are valid as next tokens when parsing things like
 /// loop headers and builtin functions where the only thing expected is a right
 /// paren.
 ///
 /// The name is an artifact of history, and is related to @a BC_PARSE_REL (see
 /// include/parse.h). It refers to how POSIX only allows some operators as part
 /// of the conditional of for loops, while loops, and if statements.
 extern const BcParseNext bc_parse_next_rel;

 // What tokens are valid as next tokens when parsing an array element
 // expression.
 extern const BcParseNext bc_parse_next_elem;

 /// A reference to what tokens are valid as next tokens when parsing the first
 /// two parts of a for loop header.
extern const BcParseNext bc_parse_next_for; /// A reference to what tokens are valid as next tokens when parsing a read /// expression. extern const BcParseNext bc_parse_next_read; /// A reference to what tokens are valid as next tokens when parsing a builtin /// function with multiple arguments. extern const BcParseNext bc_parse_next_builtin; #else // BC_ENABLED // If bc is not enabled, execution is always possible because dc has strict // rules that ensure execution can always proceed safely. #define BC_PARSE_NO_EXEC(p) (0) #endif // BC_ENABLED #endif // BC_BC_H diff --git a/include/version.h b/include/version.h index 3be823189b8f..72500c8e3f28 100644 --- a/include/version.h +++ b/include/version.h @@ -1,42 +1,42 @@ /* * ***************************************************************************** * * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) 2018-2021 Gavin D. Howard and contributors. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * * Redistributions of source code must retain the above copyright notice, this * list of conditions and the following disclaimer. * * * Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * ***************************************************************************** * * The version of bc. * */ #ifndef BC_VERSION_H #define BC_VERSION_H /// The current version. -#define VERSION 5.1.0 +#define VERSION 5.1.1 #endif // BC_VERSION_H diff --git a/manuals/build.md b/manuals/build.md index 13e969e8e673..1ed2b269f13c 100644 --- a/manuals/build.md +++ b/manuals/build.md @@ -1,838 +1,838 @@ # Build This `bc` attempts to be as portable as possible. It can be built on any POSIX-compliant system. To accomplish that, a POSIX-compatible, custom `configure.sh` script is used to select build options, compiler, and compiler flags and generate a `Makefile`. The general form of configuring, building, and installing this `bc` is as follows: ``` [ENVIRONMENT_VARIABLE=...] ./configure.sh [build_options...] make make install ``` To get all of the options, including any useful environment variables, use either one of the following commands: ``` ./configure.sh -h ./configure.sh --help ``` ***WARNING***: even though `configure.sh` supports both option types, short and long, it does not support handling both at the same time. Use only one type. To learn the available `make` targets run the following command after running the `configure.sh` script: ``` make help ``` See [Build Environment Variables][4] for a more detailed description of all accepted environment variables and [Build Options][5] for more detail about all accepted build options. 
## Windows For releases, Windows builds of `bc`, `dc`, and `bcl` are available for download from and GitHub. However, if you wish to build it yourself, this `bc` can be built using Visual Studio or MSBuild. Unfortunately, only one build configuration (besides Debug or Release) is supported: extra math enabled, history and NLS (locale support) disabled, with both calculators built. The default [settings][11] are `BC_BANNER=1`, `{BC,DC}_SIGINT_RESET=0`, `{BC,DC}_TTY_MODE=1`, `{BC,DC}_PROMPT=1`. The library can also be built on Windows. ### Visual Studio In Visual Studio, open up the solution file (`bc.sln` for `bc`, or `bcl.sln` for the library), select the desired configuration, and build. ### MSBuild To build with MSBuild, first, *be sure that you are using the MSBuild that comes with Visual Studio*. To build `bc`, run the following from the root directory: ``` -msbuild -property:Configuration= bc.sln +msbuild -property:Configuration= vs/bc.sln ``` where `` is either one of `Debug` or `Release`. To build the library, run the following from the root directory: ``` -msbuild -property:Configuration= bcl.sln +msbuild -property:Configuration= vs/bcl.sln ``` -where `` is either one of `Debug` or `Release`. +where `` is either one of `Debug`, `ReleaseMD`, or `ReleaseMT`. ## POSIX-Compatible Systems Building `bc`, `dc`, and `bcl` (the library) is more complex than on Windows because many build options are supported. ### Cross Compiling To cross-compile this `bc`, an appropriate compiler must be present and assigned to the environment variable `HOSTCC` or `HOST_CC` (the two are equivalent, though `HOSTCC` is prioritized). This is in order to bootstrap core file(s), if the architectures are not compatible (i.e., unlike i686 on x86_64). Thus, the approach is: ``` HOSTCC="/path/to/native/compiler" ./configure.sh make make install ``` `HOST_CC` will work in exactly the same way. `HOSTCFLAGS` and `HOST_CFLAGS` can be used to set compiler flags for `HOSTCC`. 
(The two are equivalent, as `HOSTCC` and `HOST_CC` are.) `HOSTCFLAGS` is prioritized over `HOST_CFLAGS`. If neither is present, `HOSTCC` (or `HOST_CC`) uses `CFLAGS` (see [Build Environment Variables][4] for more details). It is expected that `CC` produces code for the target system and `HOSTCC` produces code for the host system. See [Build Environment Variables][4] for more details. If an emulator is necessary to run the bootstrap binaries, it can be set with the environment variable `GEN_EMU`. ### Build Environment Variables This `bc` supports `CC`, `HOSTCC`, `HOST_CC`, `CFLAGS`, `HOSTCFLAGS`, `HOST_CFLAGS`, `CPPFLAGS`, `LDFLAGS`, `LDLIBS`, `PREFIX`, `DESTDIR`, `BINDIR`, `DATAROOTDIR`, `DATADIR`, `MANDIR`, `MAN1DIR`, `LOCALEDIR`, `EXECSUFFIX`, `EXECPREFIX`, `LONG_BIT`, `GEN_HOST`, and `GEN_EMU` environment variables in `configure.sh`. Any values of those variables given to `configure.sh` will be put into the generated Makefile. More detail on what those environment variables do can be found in the following sections. #### `CC` C compiler for the target system. `CC` must be compatible with POSIX `c99` behavior and options. However, **I encourage users to use any C99 or C11 compatible compiler they wish.** If there is a space in the basename of the compiler, the items after the first space are assumed to be compiler flags, and in that case, the flags are automatically moved into `CFLAGS`. Defaults to `c99`. #### `HOSTCC` or `HOST_CC` C compiler for the host system, used only in [cross compiling][6]. Must be compatible with POSIX `c99` behavior and options. If there is a space in the basename of the compiler, the items after the first space are assumed to be compiler flags, and in that case, the flags are automatically moved into `HOSTCFLAGS`. Defaults to `$CC`. #### `CFLAGS` Command-line flags that will be passed verbatim to `CC`. Defaults to empty. #### `HOSTCFLAGS` or `HOST_CFLAGS` Command-line flags that will be passed verbatim to `HOSTCC` or `HOST_CC`.
Defaults to `$CFLAGS`. #### `CPPFLAGS` Command-line flags for the C preprocessor. These are also passed verbatim to both compilers (`CC` and `HOSTCC`); they are supported just for legacy reasons. Defaults to empty. #### `LDFLAGS` Command-line flags for the linker. These are also passed verbatim to both compilers (`CC` and `HOSTCC`); they are supported just for legacy reasons. Defaults to empty. #### `LDLIBS` Libraries to link to. These are also passed verbatim to both compilers (`CC` and `HOSTCC`); they are supported just for legacy reasons and for cross compiling with different C standard libraries (like [musl][3]). Defaults to empty. #### `PREFIX` The prefix to install to. Can be overridden by passing the `--prefix` option to `configure.sh`. Defaults to `/usr/local`. #### `DESTDIR` Path to prepend onto `PREFIX`. This is mostly for distro and package maintainers. This can be passed either to `configure.sh` or `make install`. If it is passed to both, the one given to `configure.sh` takes precedence. Defaults to empty. #### `BINDIR` The directory to install binaries in. Can be overridden by passing the `--bindir` option to `configure.sh`. Defaults to `$PREFIX/bin`. #### `INCLUDEDIR` The directory to install header files in. Can be overridden by passing the `--includedir` option to `configure.sh`. Defaults to `$PREFIX/include`. #### `LIBDIR` The directory to install libraries in. Can be overridden by passing the `--libdir` option to `configure.sh`. Defaults to `$PREFIX/lib`. #### `DATAROOTDIR` The root directory to install data files in. Can be overridden by passing the `--datarootdir` option to `configure.sh`. Defaults to `$PREFIX/share`. #### `DATADIR` The directory to install data files in. Can be overridden by passing the `--datadir` option to `configure.sh`. Defaults to `$DATAROOTDIR`. #### `MANDIR` The directory to install manpages in. Can be overridden by passing the `--mandir` option to `configure.sh`. 
Defaults to `$DATADIR/man`. #### `MAN1DIR` The directory to install Section 1 manpages in. Because both `bc` and `dc` are Section 1 commands, this is the only relevant section directory. Can be overridden by passing the `--man1dir` option to `configure.sh`. Defaults to `$MANDIR/man1`. #### `LOCALEDIR` The directory to install locales in. Can be overridden by passing the `--localedir` option to `configure.sh`. Defaults to `$DATAROOTDIR/locale`. #### `EXECSUFFIX` The suffix to append onto the executable names *when installing*. This is for packagers and distro maintainers who want this `bc` as an option, but do not want to replace the default `bc`. Defaults to empty. #### `EXECPREFIX` The prefix to prepend onto the executable names *when building and installing*. This is for packagers and distro maintainers who want this `bc` as an option, but do not want to replace the default `bc`. Defaults to empty. #### `LONG_BIT` The number of bits in a C `long` type. This is mostly for the embedded space. This `bc` uses `long`s internally for overflow checking. In C99, a `long` is required to be at least 32 bits. For this reason, on 8-bit and 16-bit microcontrollers, the generated code to do math with `long` types may be inefficient. For most normal desktop systems, setting this is unnecessary, except that 32-bit platforms with 64-bit longs may want to set it to `32`. Defaults to the default value of `LONG_BIT` for the target platform. For compliance with the `bc` spec, the minimum allowed value is `32`. It is an error if the specified value is greater than the default value of `LONG_BIT` for the target platform. #### `GEN_HOST` Whether to use `gen/strgen.c`, instead of `gen/strgen.sh`, to produce the C files that contain the help texts as well as the math libraries. By default, `gen/strgen.c` is used, compiled by `$HOSTCC` and run on the host machine. Using `gen/strgen.sh` removes the need to compile and run an executable on the host machine since `gen/strgen.sh` is a POSIX shell script.
However, `gen/lib2.bc` is perilously close to 4095 characters, the max supported length of a string literal in C99 (and it could be added to in the future), and `gen/strgen.sh` generates a string literal instead of an array, as `gen/strgen.c` does. For most production-ready compilers, this limit probably is not enforced, but it could be. Both options are still available for this reason. If you are sure your compiler does not have the limit and do not want to compile and run a binary on the host machine, set this variable to "0". Any other value, or a non-existent value, will cause the build system to compile and run `gen/strgen.c`. Default is "". #### `GEN_EMU` The emulator to run bootstrap binaries under. This is only if the binaries produced by `HOSTCC` (or `HOST_CC`) need to be run under an emulator to work. Defaults to empty. ### Build Options This `bc` comes with several build options, all of which are enabled by default. All options can be used with each other, with a few exceptions that will be noted below. **NOTE**: All long options with mandatory arguments accept either one of the following forms: ``` --option arg --option=arg ``` #### Library To build the math library, use the following commands for the configure step: ``` ./configure.sh -a ./configure.sh --library ``` Both commands are equivalent. When the library is built, history and locales are disabled, and the functionality for `bc` and `dc` are both enabled, though the executables are *not* built. This is because the library's options clash with the executables. To build an optimized version of the library, users can pass optimization options to `configure.sh` or include them in `CFLAGS`. The library API can be found in `manuals/bcl.3.md` or `man bcl` once the library is installed. The library is built as `bin/libbcl.a`.
#### `bc` Only To build `bc` only (no `dc`), use any one of the following commands for the configure step: ``` ./configure.sh -b ./configure.sh --bc-only ./configure.sh -D ./configure.sh --disable-dc ``` Those commands are all equivalent. ***Warning***: It is an error to use those options if `bc` has also been disabled (see below). #### `dc` Only To build `dc` only (no `bc`), use any one of the following commands for the configure step: ``` ./configure.sh -d ./configure.sh --dc-only ./configure.sh -B ./configure.sh --disable-bc ``` Those commands are all equivalent. ***Warning***: It is an error to use those options if `dc` has also been disabled (see above). #### History To disable history, pass either the `-H` flag or the `--disable-history` option to `configure.sh`, as follows: ``` ./configure.sh -H ./configure.sh --disable-history ``` Both commands are equivalent. History is automatically disabled when building for Windows or on another platform that does not support the terminal handling that is required. ***WARNING***: Of all of the code in the `bc`, this is the only code that is not completely portable. If the `bc` does not work on your platform, your first step should be to retry with history disabled. This option affects the [build type][7]. #### NLS (Locale Support) To disable locale support (use only English), pass either the `-N` flag or the `--disable-nls` option to `configure.sh`, as follows: ``` ./configure.sh -N ./configure.sh --disable-nls ``` Both commands are equivalent. NLS (locale support) is automatically disabled when building for Windows or on another platform that does not support the POSIX locale API or utilities. This option affects the [build type][7].
#### Extra Math This `bc` has 7 extra operators: * `$` (truncation to integer) * `@` (set precision) * `@=` (set precision and assign) * `<<` (shift number left, shifts radix right) * `<<=` (shift number left and assign) * `>>` (shift number right, shifts radix left) * `>>=` (shift number right and assign) There is no assignment version of `$` because it is a unary operator. The assignment versions of the above operators are not available in `dc`, but the others are, as the operators `$`, `@`, `H`, and `h`, respectively. In addition, this `bc` has the option of outputting in scientific notation or engineering notation. It can also take input in scientific or engineering notation. On top of that, it has a pseudo-random number generator. (See the full manual for more details.) Extra operators, scientific notation, engineering notation, and the pseudo-random number generator can be disabled by passing either the `-E` flag or the `--disable-extra-math` option to `configure.sh`, as follows: ``` ./configure.sh -E ./configure.sh --disable-extra-math ``` Both commands are equivalent. This `bc` also has a larger library that is only enabled if extra operators and the pseudo-random number generator are. More information about the functions can be found in the Extended Library section of the full manual. This option affects the [build type][7]. #### Karatsuba Length The Karatsuba length is the point at which `bc` and `dc` switch from Karatsuba multiplication to brute force, `O(n^2)` multiplication. It can be set by passing the `-k` flag or the `--karatsuba-len` option to `configure.sh` as follows: ``` ./configure.sh -k32 ./configure.sh --karatsuba-len 32 ``` Both commands are equivalent. Default is `32`. ***WARNING***: The Karatsuba Length must be an **integer** greater than or equal to `16` (to prevent stack overflow). If it is not, `configure.sh` will give an error. #### Settings This `bc` and `dc` have a few settings to override default behavior.
The defaults for these settings can be set by package maintainers, and the settings themselves can be overridden by users. To set a default to **on**, use the `-s` or `--set-default-on` option to `configure.sh`, with the name of the setting, as follows: ``` ./configure.sh -s bc.banner ./configure.sh --set-default-on=bc.banner ``` Both commands are equivalent. To set a default to **off**, use the `-S` or `--set-default-off` option to `configure.sh`, with the name of the setting, as follows: ``` ./configure.sh -S bc.banner ./configure.sh --set-default-off=bc.banner ``` Both commands are equivalent. Users can override the default settings set by packagers with environment variables. If the environment variable has an integer, then the setting is turned **on** for a non-zero integer, and **off** for zero. The table of the available settings, along with their defaults and the environment variables to override them, is below: ``` | Setting | Description | Default | Env Variable | | =============== | ==================== | ============ | ==================== | | bc.banner | Whether to display | 0 | BC_BANNER | | | the bc version | | | | | banner when in | | | | | interactive mode. | | | | --------------- | -------------------- | ------------ | -------------------- | | bc.sigint_reset | Whether SIGINT will | 1 | BC_SIGINT_RESET | | | reset bc, instead of | | | | | exiting, when in | | | | | interactive mode. | | | | --------------- | -------------------- | ------------ | -------------------- | | dc.sigint_reset | Whether SIGINT will | 1 | DC_SIGINT_RESET | | | reset dc, instead of | | | | | exiting, when in | | | | | interactive mode. | | | | --------------- | -------------------- | ------------ | -------------------- | | bc.tty_mode | Whether TTY mode for | 1 | BC_TTY_MODE | | | bc should be on when | | | | | available.
| | | | --------------- | -------------------- | ------------ | -------------------- | | dc.tty_mode | Whether TTY mode for | 0 | DC_TTY_MODE | | | dc should be on when | | | | | available. | | | | --------------- | -------------------- | ------------ | -------------------- | | bc.prompt | Whether the prompt | $BC_TTY_MODE | BC_PROMPT | | | for bc should be on | | | | | in tty mode. | | | | --------------- | -------------------- | ------------ | -------------------- | | dc.prompt | Whether the prompt | $DC_TTY_MODE | DC_PROMPT | | | for dc should be on | | | | | in tty mode. | | | | --------------- | -------------------- | ------------ | -------------------- | ``` These settings are not meant to be changed on a whim. They are meant to ensure that this bc and dc will conform to the expectations of the user on each platform. #### Install Options The relevant `autotools`-style install options are supported in `configure.sh`: * `--prefix` * `--bindir` * `--datarootdir` * `--datadir` * `--mandir` * `--man1dir` * `--localedir` An example is: ``` ./configure.sh --prefix=/usr --localedir /usr/share/nls make make install ``` They correspond to the environment variables `$PREFIX`, `$BINDIR`, `$DATAROOTDIR`, `$DATADIR`, `$MANDIR`, `$MAN1DIR`, and `$LOCALEDIR`, respectively. ***WARNING***: If the option is given, the value of the corresponding environment variable is overridden. ***WARNING***: If any long command-line options are used, the long form of all other command-line options must be used. Mixing long and short options is not supported. ##### Manpages To disable installing manpages, pass either the `-M` flag or the `--disable-man-pages` option to `configure.sh` as follows: ``` ./configure.sh -M ./configure.sh --disable-man-pages ``` Both commands are equivalent. ##### Locales By default, `bc` and `dc` do not install all locales, but only the enabled locales. If `DESTDIR` exists and is not empty, then they will install all of the locales that exist on the system.
The `-l` flag or `--install-all-locales` option skips all of that and just installs all of the locales that `bc` and `dc` have, regardless. To enable that behavior, you can pass the `-l` flag or the `--install-all-locales` option to `configure.sh`, as follows: ``` ./configure.sh -l ./configure.sh --install-all-locales ``` Both commands are equivalent. ### Optimization The `configure.sh` script will accept an optimization level to pass to the compiler. Because `bc` is orders of magnitude faster with optimization, I ***highly*** recommend package and distro maintainers pass the highest optimization level available in `CC` to `configure.sh` with the `-O` flag or `--opt` option, as follows: ``` ./configure.sh -O3 ./configure.sh --opt 3 ``` Both commands are equivalent. The build and install can then be run as normal: ``` make make install ``` As usual, `configure.sh` will also accept additional `CFLAGS` on the command line, so for SSE4 architectures, the following can add a bit more speed: ``` CFLAGS="-march=native -msse4" ./configure.sh -O3 make make install ``` Building with link-time optimization (`-flto` in clang) can further increase the performance. I ***highly*** recommend doing so. I do ***NOT*** recommend building with `-march=native`; doing so reduces this `bc`'s performance. Manual stripping is not necessary; non-debug builds are automatically stripped in the link stage. ### Debug Builds Debug builds (which also disable optimization if no optimization level is given and if no extra `CFLAGS` are given) can be enabled with either the `-g` flag or the `--debug` option, as follows: ``` ./configure.sh -g ./configure.sh --debug ``` Both commands are equivalent. The build and install can then be run as normal: ``` make make install ``` ### Stripping Binaries By default, when `bc` and `dc` are not built in debug mode, the binaries are stripped. 
Stripping can be disabled with either the `-T` or the `--disable-strip` option, as follows: ``` ./configure.sh -T ./configure.sh --disable-strip ``` Both commands are equivalent. The build and install can then be run as normal: ``` make make install ``` ### Build Type `bc` and `dc` have 8 build types, affected by the [History][8], [NLS (Locale Support)][9], and [Extra Math][10] build options. The build types are as follows: * `A`: Nothing disabled. * `E`: Extra math disabled. * `H`: History disabled. * `N`: NLS disabled. * `EH`: Extra math and History disabled. * `EN`: Extra math and NLS disabled. * `HN`: History and NLS disabled. * `EHN`: Extra math, History, and NLS all disabled. These build types correspond to the generated manuals in `manuals/bc` and `manuals/dc`. ### Binary Size When built with both calculators, all available features, and `-Os` using `clang` and `musl`, the executable is 140.4 kb (140,386 bytes) on `x86_64`. That isn't much for what is contained in the binary, but if necessary, it can be reduced. The single largest user of space is the `bc` calculator. If just `dc` is needed, the size can be reduced to 107.6 kb (107,584 bytes). The next largest user of space is history support. If that is not needed, size can be reduced (for a build with both calculators) to 119.9 kb (119,866 bytes). There are several reasons that history is a bigger user of space than `dc` itself: * `dc`'s lexer and parser are *tiny* compared to `bc`'s because `dc` code is almost already in the form that it is executed in, while `bc` has to not only adjust the form to be executable, it has to parse functions, loops, `if` statements, and other extra features. * `dc` does not have much extra code in the interpreter. * History has a lot of const data for supporting `UTF-8` terminals. * History pulls in a good deal more code from the `libc`. The next biggest user is extra math support.
Without it, the size is reduced to 124.0 kb (123,986 bytes) with history and 107.6 kb (107,560 bytes) without history. The reasons why extra math support is bigger than `dc`, besides the fact that `dc` is small already, are: * Extra math support adds an extra math library that takes several kilobytes of constant data space. * Extra math support includes support for a pseudo-random number generator, including the code to convert a series of pseudo-random numbers into a number of arbitrary size. * Extra math support adds several operators. The next biggest user is `dc`, so if just `bc` is needed, the size can be reduced to 128.1 kb (128,096 bytes) with history and extra math support, 107.6 kb (107,576 bytes) without history and with extra math support, and 95.3 kb (95,272 bytes) without history and without extra math support. *Note*: all of these binary sizes were compiled using `musl` `1.2.0` as the `libc`, making a fully static executable, with `clang` `9.0.1` (well, `musl-clang` using `clang` `9.0.1`) as the compiler and using `-Os` optimizations. These builds were done on an `x86_64` machine running Gentoo Linux. ### Testing The default test suite can be run with the following command: ``` make test ``` To test `bc` only, run the following command: ``` make test_bc ``` To test `dc` only, run the following command: ``` make test_dc ``` This `bc`, if built, assumes a working, GNU-compatible `bc`, installed on the system and in the `PATH`, to generate some tests, unless the `-G` flag or `--disable-generated-tests` option is given to `configure.sh`, as follows: ``` ./configure.sh -G ./configure.sh --disable-generated-tests ``` After running `configure.sh`, build and run tests as follows: ``` make make test ``` This `dc` also assumes a working, GNU-compatible `dc`, installed on the system and in the `PATH`, to generate some tests, unless one of the above options is given to `configure.sh`.
To generate test coverage, pass the `-c` flag or the `--coverage` option to `configure.sh` as follows: ``` ./configure.sh -c ./configure.sh --coverage ``` Both commands are equivalent. ***WARNING***: Both `bc` and `dc` must be built for test coverage. Otherwise, `configure.sh` will give an error. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/bc.html [2]: https://www.gnu.org/software/bc/ [3]: https://www.musl-libc.org/ [4]: #build-environment-variables [5]: #build-options [6]: #cross-compiling [7]: #build-type [8]: #history [9]: #nls-locale-support [10]: #extra-math [11]: #settings diff --git a/src/bc_parse.c b/src/bc_parse.c index c64121ec5da8..c2fc2186a065 100644 --- a/src/bc_parse.c +++ b/src/bc_parse.c @@ -1,2278 +1,2313 @@ /* * ***************************************************************************** * * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) 2018-2021 Gavin D. Howard and contributors. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * * Redistributions of source code must retain the above copyright notice, this * list of conditions and the following disclaimer. * * * Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * ***************************************************************************** * * The parser for bc. * */ #if BC_ENABLED #include #include #include #include #include #include #include #include // Before you embark on trying to understand this code, have you read the // Development manual (manuals/development.md) and the comment in include/bc.h // yet? No? Do that first. I'm serious. // // The reason is because this file holds the most sensitive and finicky code in // the entire codebase. Even getting history to work on Windows was nothing // compared to this. This is where dreams go to die, where dragons live, and // from which Ken Thompson himself would flee. static void bc_parse_else(BcParse *p); static void bc_parse_stmt(BcParse *p); static BcParseStatus bc_parse_expr_err(BcParse *p, uint8_t flags, BcParseNext next); static void bc_parse_expr_status(BcParse *p, uint8_t flags, BcParseNext next); /** * Returns true if an instruction could only have come from a "leaf" expression. * For more on what leaf expressions are, read the comment for BC_PARSE_LEAF(). * @param t The instruction to test. */ static bool bc_parse_inst_isLeaf(BcInst t) { return (t >= BC_INST_NUM && t <= BC_INST_MAXSCALE) || #if BC_ENABLE_EXTRA_MATH t == BC_INST_TRUNC || #endif // BC_ENABLE_EXTRA_MATH t <= BC_INST_DEC; } /** * Returns true if the *previous* token was a delimiter. A delimiter is anything * that can legally end a statement. 
In bc's case, it could be a newline, a * semicolon, and a brace in certain cases. * @param p The parser. */ static bool bc_parse_isDelimiter(const BcParse *p) { BcLexType t = p->l.t; bool good; // If it's an obvious delimiter, say so. if (BC_PARSE_DELIMITER(t)) return true; good = false; // If the current token is a keyword, then...beware. That means that we need // to check for a "dangling" else, where there was no brace-delimited block // on the previous if. if (t == BC_LEX_KW_ELSE) { size_t i; uint16_t *fptr = NULL, flags = BC_PARSE_FLAG_ELSE; // As long as going up the stack is valid for a dangling else, keep on. for (i = 0; i < p->flags.len && BC_PARSE_BLOCK_STMT(flags); ++i) { fptr = bc_vec_item_rev(&p->flags, i); flags = *fptr; // If we need a brace and don't have one, then we don't have a // delimiter. if ((flags & BC_PARSE_FLAG_BRACE) && p->l.last != BC_LEX_RBRACE) return false; } // Oh, and we had also better have an if statement somewhere. good = ((flags & BC_PARSE_FLAG_IF) != 0); } else if (t == BC_LEX_RBRACE) { size_t i; // Since we have a brace, we need to just check if a brace was needed. for (i = 0; !good && i < p->flags.len; ++i) { uint16_t *fptr = bc_vec_item_rev(&p->flags, i); good = (((*fptr) & BC_PARSE_FLAG_BRACE) != 0); } } return good; } /** * Sets a previously defined exit label. What are labels? See the bc Parsing * section of the Development manual (manuals/development.md). * @param p The parser. */ static void bc_parse_setLabel(BcParse *p) { BcFunc *func = p->func; BcInstPtr *ip = bc_vec_top(&p->exits); size_t *label; assert(func == bc_vec_item(&p->prog->fns, p->fidx)); // Set the preallocated label to the correct index. label = bc_vec_item(&func->labels, ip->idx); *label = func->code.len; // Now, we don't need the exit label; it is done. bc_vec_pop(&p->exits); } /** * Creates a label and sets it to idx. If this is an exit label, then idx is * actually invalid, but it doesn't matter because it will be fixed by * bc_parse_setLabel() later. 
* @param p The parser. * @param idx The index of the label. */ static void bc_parse_createLabel(BcParse *p, size_t idx) { bc_vec_push(&p->func->labels, &idx); } /** * Creates a conditional label. Unlike an exit label, this label is set at * creation time because it comes *before* the code that will target it. * @param p The parser. * @param idx The index of the label. */ static void bc_parse_createCondLabel(BcParse *p, size_t idx) { bc_parse_createLabel(p, p->func->code.len); bc_vec_push(&p->conds, &idx); } /* * Creates an exit label to be filled in later by bc_parse_setLabel(). Also, why * create a label to be filled in later? Because exit labels are meant to be * targeted by code that comes *before* the label. Since we have to parse that * code first, and don't know how long it will be, we need to just make sure to * reserve a slot to be filled in later when we know. * * By the way, this uses BcInstPtr because it was convenient. The field idx * holds the index, and the field func holds the loop boolean. * * @param p The parser. * @param idx The index of the label's position. * @param loop True if the exit label is for a loop or not. */ static void bc_parse_createExitLabel(BcParse *p, size_t idx, bool loop) { BcInstPtr ip; assert(p->func == bc_vec_item(&p->prog->fns, p->fidx)); ip.func = loop; ip.idx = idx; ip.len = 0; bc_vec_push(&p->exits, &ip); bc_parse_createLabel(p, SIZE_MAX); } /** * Pops the correct operators off of the operator stack based on the current * operator. This is because of the Shunting-Yard algorithm. Lower prec means * higher precedence. * @param p The parser. * @param type The operator. * @param start The previous start of the operator stack. For more * information, see the bc Parsing section of the Development * manual (manuals/development.md). * @param nexprs A pointer to the current number of expressions that have not * been consumed yet. This is an IN and OUT parameter. 
*/ static void bc_parse_operator(BcParse *p, BcLexType type, size_t start, size_t *nexprs) { BcLexType t; uchar l, r = BC_PARSE_OP_PREC(type); uchar left = BC_PARSE_OP_LEFT(type); // While we haven't hit the stop point yet. while (p->ops.len > start) { // Get the top operator. t = BC_PARSE_TOP_OP(p); // If it's a left paren, we have reached the end of whatever expression // this is no matter what. if (t == BC_LEX_LPAREN) break; // Break for precedence. Precedence operates differently on left and // right associativity, by the way. A left associative operator that // matches the current precedence should take priority, but a right // associative operator should not. l = BC_PARSE_OP_PREC(t); if (l >= r && (l != r || !left)) break; // Do the housekeeping. In particular, make sure to note that one // expression was consumed. (Two were, but another was added.) bc_parse_push(p, BC_PARSE_TOKEN_INST(t)); bc_vec_pop(&p->ops); *nexprs -= !BC_PARSE_OP_PREFIX(t); } bc_vec_push(&p->ops, &type); } /** * Parses a right paren. In the Shunting-Yard algorithm, it needs to be put on * the operator stack. But before that, it needs to consume whatever operators * there are until it hits a left paren. * @param p The parser. * @param nexprs A pointer to the current number of expressions that have not * been consumed yet. This is an IN and OUT parameter. */ static void bc_parse_rightParen(BcParse *p, size_t *nexprs) { BcLexType top; // Consume operators until a left paren. while ((top = BC_PARSE_TOP_OP(p)) != BC_LEX_LPAREN) { bc_parse_push(p, BC_PARSE_TOKEN_INST(top)); bc_vec_pop(&p->ops); *nexprs -= !BC_PARSE_OP_PREFIX(top); } // We need to pop the left paren as well. bc_vec_pop(&p->ops); // Oh, and we also want the next token. bc_lex_next(&p->l); } /** * Parses function arguments. * @param p The parser. * @param flags Flags restricting what kind of expressions the arguments can * be.
*/ static void bc_parse_args(BcParse *p, uint8_t flags) { bool comma = false; size_t nargs; bc_lex_next(&p->l); // Print and comparison operators not allowed. Well, comparison operators // only for POSIX. But we do allow arrays, and we *must* get a value. flags &= ~(BC_PARSE_PRINT | BC_PARSE_REL); flags |= (BC_PARSE_ARRAY | BC_PARSE_NEEDVAL); // Count the arguments and parse them. for (nargs = 0; p->l.t != BC_LEX_RPAREN; ++nargs) { bc_parse_expr_status(p, flags, bc_parse_next_arg); comma = (p->l.t == BC_LEX_COMMA); if (comma) bc_lex_next(&p->l); } // An ending comma is FAIL. if (BC_ERR(comma)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Now do the call with the number of arguments. bc_parse_push(p, BC_INST_CALL); bc_parse_pushIndex(p, nargs); } /** * Parses a function call. * @param p The parser. * @param flags Flags restricting what kind of expressions the arguments can * be. */ static void bc_parse_call(BcParse *p, const char *name, uint8_t flags) { size_t idx; bc_parse_args(p, flags); // We just assert this because bc_parse_args() should // ensure that the next token is what it should be. assert(p->l.t == BC_LEX_RPAREN); // We cannot use bc_program_insertFunc() here // because it will overwrite an existing function. idx = bc_map_index(&p->prog->fn_map, name); // The function does not exist yet. Create a space for it. If the user does // not define it, it's a *runtime* error, not a parse error. if (idx == BC_VEC_INVALID_IDX) { BC_SIG_LOCK; idx = bc_program_insertFunc(p->prog, name); BC_SIG_UNLOCK; assert(idx != BC_VEC_INVALID_IDX); // Make sure that this pointer was not invalidated. p->func = bc_vec_item(&p->prog->fns, p->fidx); } // The function exists, so set the right function index. else idx = ((BcId*) bc_vec_item(&p->prog->fn_map, idx))->idx; bc_parse_pushIndex(p, idx); // Make sure to get the next token. bc_lex_next(&p->l); } /** * Parses a name/identifier-based expression. 
It could be a variable, an array * element, an array itself (for function arguments), a function call, etc. * */ static void bc_parse_name(BcParse *p, BcInst *type, bool *can_assign, uint8_t flags) { char *name; BC_SIG_LOCK; // We want a copy of the name since the lexer might overwrite its copy. name = bc_vm_strdup(p->l.str.v); BC_SETJMP_LOCKED(err); BC_SIG_UNLOCK; // We need the next token to see if it's just a variable or something more. bc_lex_next(&p->l); // Array element or array. if (p->l.t == BC_LEX_LBRACKET) { bc_lex_next(&p->l); // Array only. This has to be a function parameter. if (p->l.t == BC_LEX_RBRACKET) { // Error if arrays are not allowed. if (BC_ERR(!(flags & BC_PARSE_ARRAY))) bc_parse_err(p, BC_ERR_PARSE_EXPR); *type = BC_INST_ARRAY; *can_assign = false; } else { // If we are here, we have an array element. We need to set the // expression parsing flags. uint8_t flags2 = (flags & ~(BC_PARSE_PRINT | BC_PARSE_REL)) | BC_PARSE_NEEDVAL; bc_parse_expr_status(p, flags2, bc_parse_next_elem); // The next token *must* be a right bracket. if (BC_ERR(p->l.t != BC_LEX_RBRACKET)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); *type = BC_INST_ARRAY_ELEM; *can_assign = true; } // Make sure to get the next token. bc_lex_next(&p->l); // Push the instruction and the name of the identifier. bc_parse_push(p, *type); bc_parse_pushName(p, name, false); } else if (p->l.t == BC_LEX_LPAREN) { // We are parsing a function call; error if not allowed. if (BC_ERR(flags & BC_PARSE_NOCALL)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); *type = BC_INST_CALL; *can_assign = false; bc_parse_call(p, name, flags); } else { // Just a variable. *type = BC_INST_VAR; *can_assign = true; bc_parse_push(p, BC_INST_VAR); bc_parse_pushName(p, name, true); } err: // Need to make sure to unallocate the name. BC_SIG_MAYLOCK; free(name); BC_LONGJMP_CONT; } /** * Parses a builtin function that takes no arguments. This includes read(), * rand(), maxibase(), maxobase(), maxscale(), and maxrand(). 
* @param p The parser. * @param inst The instruction corresponding to the builtin. */ static void bc_parse_noArgBuiltin(BcParse *p, BcInst inst) { // Must have a left paren. bc_lex_next(&p->l); if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Must have a right paren. bc_lex_next(&p->l); if ((p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_parse_push(p, inst); bc_lex_next(&p->l); } /** * Parses a builtin function that takes 1 argument. This includes length(), * sqrt(), abs(), scale(), and irand(). * @param p The parser. * @param type The lex token. * @param flags The expression parsing flags for parsing the argument. * @param prev An out parameter; the previous instruction pointer. */ static void bc_parse_builtin(BcParse *p, BcLexType type, uint8_t flags, BcInst *prev) { // Must have a left paren. bc_lex_next(&p->l); if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // Change the flags as needed for parsing the argument. flags &= ~(BC_PARSE_PRINT | BC_PARSE_REL); flags |= BC_PARSE_NEEDVAL; // Since length can take arrays, we need to specially add that flag. if (type == BC_LEX_KW_LENGTH) flags |= BC_PARSE_ARRAY; bc_parse_expr_status(p, flags, bc_parse_next_rel); // Must have a right paren. if (BC_ERR(p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Adjust previous based on the token and push it. *prev = type - BC_LEX_KW_LENGTH + BC_INST_LENGTH; bc_parse_push(p, *prev); bc_lex_next(&p->l); } /** * Parses a builtin function that takes 3 arguments. This includes modexp() and * divmod(). */ static void bc_parse_builtin3(BcParse *p, BcLexType type, uint8_t flags, BcInst *prev) { assert(type == BC_LEX_KW_MODEXP || type == BC_LEX_KW_DIVMOD); // Must have a left paren. bc_lex_next(&p->l); if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // Change the flags as needed for parsing the argument. 
flags &= ~(BC_PARSE_PRINT | BC_PARSE_REL); flags |= BC_PARSE_NEEDVAL; bc_parse_expr_status(p, flags, bc_parse_next_builtin); // Must have a comma. if (BC_ERR(p->l.t != BC_LEX_COMMA)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); bc_parse_expr_status(p, flags, bc_parse_next_builtin); // Must have a comma. if (BC_ERR(p->l.t != BC_LEX_COMMA)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // If it is a divmod, parse an array name. Otherwise, just parse another // expression. if (type == BC_LEX_KW_DIVMOD) { // Must have a name. if (BC_ERR(p->l.t != BC_LEX_NAME)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // This is safe because the next token should not overwrite the name. bc_lex_next(&p->l); // Must have a left bracket. if (BC_ERR(p->l.t != BC_LEX_LBRACKET)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // This is safe because the next token should not overwrite the name. bc_lex_next(&p->l); // Must have a right bracket. if (BC_ERR(p->l.t != BC_LEX_RBRACKET)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // This is safe because the next token should not overwrite the name. bc_lex_next(&p->l); } else bc_parse_expr_status(p, flags, bc_parse_next_rel); // Must have a right paren. if (BC_ERR(p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Adjust previous based on the token and push it. *prev = type - BC_LEX_KW_MODEXP + BC_INST_MODEXP; bc_parse_push(p, *prev); // If we have divmod, we need to assign the modulus to the array element, so // we need to push the instructions for doing so. if (type == BC_LEX_KW_DIVMOD) { // The zeroth element. bc_parse_push(p, BC_INST_ZERO); bc_parse_push(p, BC_INST_ARRAY_ELEM); // Push the array. bc_parse_pushName(p, p->l.str.v, false); // Swap them and assign. After this, the top item on the stack should // be the quotient. bc_parse_push(p, BC_INST_SWAP); bc_parse_push(p, BC_INST_ASSIGN_NO_VAL); } bc_lex_next(&p->l); } /** * Parses the scale keyword. This is special because scale can be a value or a * builtin function. 
* @param p The parser. * @param type An out parameter; the instruction for the parse. * @param can_assign An out parameter; whether the expression can be assigned * to. * @param flags The expression parsing flags for parsing a scale() arg. */ static void bc_parse_scale(BcParse *p, BcInst *type, bool *can_assign, uint8_t flags) { bc_lex_next(&p->l); // Without the left paren, it's just the keyword. if (p->l.t != BC_LEX_LPAREN) { // Set, push, and return. *type = BC_INST_SCALE; *can_assign = true; bc_parse_push(p, BC_INST_SCALE); return; } // Handle the scale function. *type = BC_INST_SCALE_FUNC; *can_assign = false; // Once again, adjust the flags. flags &= ~(BC_PARSE_PRINT | BC_PARSE_REL); flags |= BC_PARSE_NEEDVAL; bc_lex_next(&p->l); bc_parse_expr_status(p, flags, bc_parse_next_rel); // Must have a right paren. if (BC_ERR(p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_parse_push(p, BC_INST_SCALE_FUNC); bc_lex_next(&p->l); } /** * Parses an increment or decrement operator. This is a bit complex. * @param p The parser. * @param prev An out parameter; the previous instruction pointer. * @param can_assign An out parameter; whether the expression can be assigned * to. * @param nexs An in/out parameter; the number of expressions in the * parse tree that are not used. * @param flags The expression parsing flags for parsing a scale() arg. */ static void bc_parse_incdec(BcParse *p, BcInst *prev, bool *can_assign, size_t *nexs, uint8_t flags) { BcLexType type; uchar inst; BcInst etype = *prev; BcLexType last = p->l.last; assert(prev != NULL && can_assign != NULL); // If we can't assign to the previous token, then we have an error. if (BC_ERR(last == BC_LEX_OP_INC || last == BC_LEX_OP_DEC || last == BC_LEX_RPAREN)) { bc_parse_err(p, BC_ERR_PARSE_ASSIGN); } // Is the previous instruction for a variable? if (BC_PARSE_INST_VAR(etype)) { // If so, this is a postfix operator.
if (!*can_assign) bc_parse_err(p, BC_ERR_PARSE_ASSIGN); // Only postfix uses BC_INST_INC and BC_INST_DEC. *prev = inst = BC_INST_INC + (p->l.t != BC_LEX_OP_INC); bc_parse_push(p, inst); bc_lex_next(&p->l); *can_assign = false; } else { // This is a prefix operator. In that case, we just convert it to // an assignment instruction. *prev = inst = BC_INST_ASSIGN_PLUS + (p->l.t != BC_LEX_OP_INC); bc_lex_next(&p->l); type = p->l.t; // Because we parse the next part of the expression // right here, we need to increment this. *nexs = *nexs + 1; // Is the next token a normal identifier? if (type == BC_LEX_NAME) { // Parse the name. uint8_t flags2 = flags & ~BC_PARSE_ARRAY; bc_parse_name(p, prev, can_assign, flags2 | BC_PARSE_NOCALL); } // Is the next token a global? else if (type >= BC_LEX_KW_LAST && type <= BC_LEX_KW_OBASE) { bc_parse_push(p, type - BC_LEX_KW_LAST + BC_INST_LAST); bc_lex_next(&p->l); } // Is the next token specifically scale, which needs special treatment? else if (BC_NO_ERR(type == BC_LEX_KW_SCALE)) { bc_lex_next(&p->l); // Check that scale() was not used. if (BC_ERR(p->l.t == BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); else bc_parse_push(p, BC_INST_SCALE); } // Now we know we have an error. else bc_parse_err(p, BC_ERR_PARSE_TOKEN); *can_assign = false; bc_parse_push(p, BC_INST_ONE); bc_parse_push(p, inst); } } /** * Parses the minus operator. This needs special treatment because it is either * subtract or negation. * @param p The parser. * @param prev An in/out parameter; the previous instruction. * @param ops_bgn The size of the operator stack. * @param rparen True if the last token was a right paren. * @param binlast True if the last token was a binary operator. * @param nexprs An in/out parameter; the number of unused expressions. */ static void bc_parse_minus(BcParse *p, BcInst *prev, size_t ops_bgn, bool rparen, bool binlast, size_t *nexprs) { BcLexType type; bc_lex_next(&p->l); // Figure out if it's a minus or a negation. 
type = BC_PARSE_LEAF(*prev, binlast, rparen) ? BC_LEX_OP_MINUS : BC_LEX_NEG; *prev = BC_PARSE_TOKEN_INST(type); // We can just push onto the op stack because this is the largest // precedence operator that gets pushed. Inc/dec does not. if (type != BC_LEX_OP_MINUS) bc_vec_push(&p->ops, &type); else bc_parse_operator(p, type, ops_bgn, nexprs); } /** * Parses a string. * @param p The parser. * @param inst The instruction corresponding to how the string was found and * how it should be printed. */ static void bc_parse_str(BcParse *p, BcInst inst) { bc_parse_addString(p); bc_parse_push(p, inst); bc_lex_next(&p->l); } /** * Parses a print statement. * @param p The parser. */ static void bc_parse_print(BcParse *p, BcLexType type) { BcLexType t; bool comma = false; BcInst inst = type == BC_LEX_KW_STREAM ? BC_INST_PRINT_STREAM : BC_INST_PRINT_POP; bc_lex_next(&p->l); t = p->l.t; // A print or stream statement has to have *something*. if (bc_parse_isDelimiter(p)) bc_parse_err(p, BC_ERR_PARSE_PRINT); do { // If the token is a string, then print it with escapes. // BC_INST_PRINT_POP plays that role for bc. if (t == BC_LEX_STR) bc_parse_str(p, inst); else { // We have an actual number; parse and add a print instruction. bc_parse_expr_status(p, BC_PARSE_NEEDVAL, bc_parse_next_print); bc_parse_push(p, inst); } // Is the next token a comma? comma = (p->l.t == BC_LEX_COMMA); // Get the next token if we have a comma. if (comma) bc_lex_next(&p->l); else { // If we don't have a comma, the statement needs to end. if (!bc_parse_isDelimiter(p)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); else break; } t = p->l.t; } while (true); // If we have a comma but no token, that's bad. if (BC_ERR(comma)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); } /** * Parses a return statement. * @param p The parser. */ static void bc_parse_return(BcParse *p) { BcLexType t; bool paren; uchar inst = BC_INST_RET0; // If we are not in a function, that's an error. 
if (BC_ERR(!BC_PARSE_FUNC(p))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // If we are in a void function, make sure to return void. if (p->func->voidfn) inst = BC_INST_RET_VOID; bc_lex_next(&p->l); t = p->l.t; paren = (t == BC_LEX_LPAREN); // An empty return statement just needs to push the selected instruction. if (bc_parse_isDelimiter(p)) bc_parse_push(p, inst); else { BcParseStatus s; // Need to parse the expression whose value will be returned. s = bc_parse_expr_err(p, BC_PARSE_NEEDVAL, bc_parse_next_expr); // If the expression was empty, just push the selected instruction. if (s == BC_PARSE_STATUS_EMPTY_EXPR) { bc_parse_push(p, inst); bc_lex_next(&p->l); } // POSIX requires parentheses. if (!paren || p->l.last != BC_LEX_RPAREN) { bc_parse_err(p, BC_ERR_POSIX_RET); } // Void functions require an empty expression. if (BC_ERR(p->func->voidfn)) { if (s != BC_PARSE_STATUS_EMPTY_EXPR) bc_parse_verr(p, BC_ERR_PARSE_RET_VOID, p->func->name); } // If we got here, we want to be sure to end the function with a real // return instruction, just in case. else bc_parse_push(p, BC_INST_RET); } } /** * Clears flags that indicate the end of an if statement and its block and sets * the jump location. * @param p The parser. */ static void bc_parse_noElse(BcParse *p) { uint16_t *flag_ptr = BC_PARSE_TOP_FLAG_PTR(p); *flag_ptr = (*flag_ptr & ~(BC_PARSE_FLAG_IF_END)); bc_parse_setLabel(p); } /** * Ends (finishes parsing) the body of a control statement or a function. * @param p The parser. * @param brace True if the body was ended by a brace, false otherwise. */ static void bc_parse_endBody(BcParse *p, bool brace) { bool has_brace, new_else = false; // We cannot be ending a body if there are no bodies to end. if (BC_ERR(p->flags.len <= 1)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); if (brace) { // The brace was already gotten; make sure that the caller did not lie. // We check for the requirement of braces later. 
assert(p->l.t == BC_LEX_RBRACE); bc_lex_next(&p->l); // If the next token is not a delimiter, that is a problem. if (BC_ERR(!bc_parse_isDelimiter(p))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); } // Do we have a brace flag? has_brace = (BC_PARSE_BRACE(p) != 0); do { size_t len = p->flags.len; bool loop; // If we have a brace flag but not a brace, that's a problem. if (has_brace && !brace) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Are we inside a loop? loop = (BC_PARSE_LOOP_INNER(p) != 0); // If we are ending a loop or an else... if (loop || BC_PARSE_ELSE(p)) { // Loops have condition labels that we have to take care of as well. if (loop) { size_t *label = bc_vec_top(&p->conds); bc_parse_push(p, BC_INST_JUMP); bc_parse_pushIndex(p, *label); bc_vec_pop(&p->conds); } bc_parse_setLabel(p); bc_vec_pop(&p->flags); } // If we are ending a function... else if (BC_PARSE_FUNC_INNER(p)) { BcInst inst = (p->func->voidfn ? BC_INST_RET_VOID : BC_INST_RET0); bc_parse_push(p, inst); bc_parse_updateFunc(p, BC_PROG_MAIN); bc_vec_pop(&p->flags); } // If we have a brace flag and not an if statement, we can pop the top // of the flags stack because they have been taken care of above. else if (has_brace && !BC_PARSE_IF(p)) bc_vec_pop(&p->flags); // This needs to be last to parse nested if's properly. if (BC_PARSE_IF(p) && (len == p->flags.len || !BC_PARSE_BRACE(p))) { // Eat newlines. while (p->l.t == BC_LEX_NLINE) bc_lex_next(&p->l); // *Now* we can pop the flags. bc_vec_pop(&p->flags); // If we are allowed non-POSIX stuff... if (!BC_S) { // Have we found yet another dangling else? *(BC_PARSE_TOP_FLAG_PTR(p)) |= BC_PARSE_FLAG_IF_END; new_else = (p->l.t == BC_LEX_KW_ELSE); // Parse the else or end the if statement body. if (new_else) bc_parse_else(p); else if (!has_brace && (!BC_PARSE_IF_END(p) || brace)) bc_parse_noElse(p); } // POSIX requires us to do the bare minimum only. else bc_parse_noElse(p); } // If these are both true, we have "used" the braces that we found. 
if (brace && has_brace) brace = false; // This condition was perhaps the hardest single part of the parser. If the // flags stack does not have enough, we should stop. If we have a new else // statement, we should stop. If we do have the end of an if statement and // we have eaten the brace, we should stop. If we do have a brace flag, we // should stop. } while (p->flags.len > 1 && !new_else && (!BC_PARSE_IF_END(p) || brace) && !(has_brace = (BC_PARSE_BRACE(p) != 0))); // If we have a brace, yet no body for it, that's a problem. if (BC_ERR(p->flags.len == 1 && brace)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); else if (brace && BC_PARSE_BRACE(p)) { // If we make it here, we have a brace and a flag for it. uint16_t flags = BC_PARSE_TOP_FLAG(p); // This condition ensures that the *last* body is correctly finished by // popping its flags. if (!(flags & (BC_PARSE_FLAG_FUNC_INNER | BC_PARSE_FLAG_LOOP_INNER)) && !(flags & (BC_PARSE_FLAG_IF | BC_PARSE_FLAG_ELSE)) && !(flags & (BC_PARSE_FLAG_IF_END))) { bc_vec_pop(&p->flags); } } } /** * Starts the body of a control statement or function. * @param p The parser. * @param flags The current flags (will be edited). */ static void bc_parse_startBody(BcParse *p, uint16_t flags) { assert(flags); flags |= (BC_PARSE_TOP_FLAG(p) & (BC_PARSE_FLAG_FUNC | BC_PARSE_FLAG_LOOP)); flags |= BC_PARSE_FLAG_BODY; bc_vec_push(&p->flags, &flags); } +void bc_parse_endif(BcParse *p) { + + size_t i; + bool good; + + // Not a problem if this is true. + if (BC_NO_ERR(!BC_PARSE_NO_EXEC(p))) return; + + good = true; + + // Find an instance of a body that needs closing, i.e., a statement that did + // not have a right brace when it should have. + for (i = 0; good && i < p->flags.len; ++i) { + uint16_t flag = *((uint16_t*) bc_vec_item(&p->flags, i)); + good = ((flag & BC_PARSE_FLAG_BRACE) != BC_PARSE_FLAG_BRACE); + } + + // If we did not find such an instance... + if (good) { + + // We set this to restore it later.
We don't want the parser thinking + // that we are on stdin for this one because it will want more. + bool is_stdin = vm.is_stdin; + + vm.is_stdin = false; + + // End all of the if statements and loops. + while (p->flags.len > 1 || BC_PARSE_IF_END(p)) { + if (BC_PARSE_IF_END(p)) bc_parse_noElse(p); + if (p->flags.len > 1) bc_parse_endBody(p, false); + } + + vm.is_stdin = is_stdin; + } + // If we reach here, a block was not properly closed, and we should error. + else bc_parse_err(&vm.prs, BC_ERR_PARSE_BLOCK); +} + /** * Parses an if statement. * @param p The parser. */ static void bc_parse_if(BcParse *p) { // We are allowed relational operators, and we must have a value. size_t idx; uint8_t flags = (BC_PARSE_REL | BC_PARSE_NEEDVAL); // Get the left paren and barf if necessary. bc_lex_next(&p->l); if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Parse the condition. bc_lex_next(&p->l); bc_parse_expr_status(p, flags, bc_parse_next_rel); // Must have a right paren. if (BC_ERR(p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // Insert the conditional jump instruction. bc_parse_push(p, BC_INST_JUMP_ZERO); idx = p->func->labels.len; // Push the index for the instruction and create an exit label for an else // statement. bc_parse_pushIndex(p, idx); bc_parse_createExitLabel(p, idx, false); bc_parse_startBody(p, BC_PARSE_FLAG_IF); } /** * Parses an else statement. * @param p The parser. */ static void bc_parse_else(BcParse *p) { size_t idx = p->func->labels.len; // We must be at the end of an if statement. if (BC_ERR(!BC_PARSE_IF_END(p))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Push an unconditional jump to make bc jump over the else statement if it // executed the original if statement. bc_parse_push(p, BC_INST_JUMP); bc_parse_pushIndex(p, idx); // Clear the else stuff. Yes, that function is misnamed for its use here, // but deal with it. bc_parse_noElse(p); // Create the exit label and parse the body. 
bc_parse_createExitLabel(p, idx, false); bc_parse_startBody(p, BC_PARSE_FLAG_ELSE); bc_lex_next(&p->l); } /** * Parse a while loop. * @param p The parser. */ static void bc_parse_while(BcParse *p) { // We are allowed relational operators, and we must have a value. size_t idx; uint8_t flags = (BC_PARSE_REL | BC_PARSE_NEEDVAL); // Get the left paren and barf if necessary. bc_lex_next(&p->l); if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // Create the labels. Loops need both. bc_parse_createCondLabel(p, p->func->labels.len); idx = p->func->labels.len; bc_parse_createExitLabel(p, idx, true); // Parse the actual condition and barf on non-right paren. bc_parse_expr_status(p, flags, bc_parse_next_rel); if (BC_ERR(p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // Now we can push the conditional jump and start the body. bc_parse_push(p, BC_INST_JUMP_ZERO); bc_parse_pushIndex(p, idx); bc_parse_startBody(p, BC_PARSE_FLAG_LOOP | BC_PARSE_FLAG_LOOP_INNER); } /** * Parse a for loop. * @param p The parser. */ static void bc_parse_for(BcParse *p) { size_t cond_idx, exit_idx, body_idx, update_idx; // Barf on the missing left paren. bc_lex_next(&p->l); if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // The first statement can be empty, but if it is, check for error in POSIX // mode. Otherwise, parse it. if (p->l.t != BC_LEX_SCOLON) bc_parse_expr_status(p, 0, bc_parse_next_for); else bc_parse_err(p, BC_ERR_POSIX_FOR); // Must have a semicolon. if (BC_ERR(p->l.t != BC_LEX_SCOLON)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // These are indices for labels. There are so many of them because the end // of the loop must unconditionally jump to the update code. Then the update // code must unconditionally jump to the condition code. Then the condition // code must *conditionally* jump to the exit. 
cond_idx = p->func->labels.len; update_idx = cond_idx + 1; body_idx = update_idx + 1; exit_idx = body_idx + 1; // This creates the condition label. bc_parse_createLabel(p, p->func->code.len); // Parse an expression if it exists. if (p->l.t != BC_LEX_SCOLON) { uint8_t flags = (BC_PARSE_REL | BC_PARSE_NEEDVAL); bc_parse_expr_status(p, flags, bc_parse_next_for); } else { // Set this for the next call to bc_parse_number because an empty // condition means that it is an infinite loop, so the condition must be // non-zero. This is safe to set because the current token is a // semicolon, which has no string requirement. bc_vec_string(&p->l.str, sizeof(bc_parse_one) - 1, bc_parse_one); bc_parse_number(p); // An empty condition makes POSIX mad. bc_parse_err(p, BC_ERR_POSIX_FOR); } // Must have a semicolon. if (BC_ERR(p->l.t != BC_LEX_SCOLON)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); // Now we can set up the conditional jump to the exit and an unconditional // jump to the body right after. The unconditional jump to the body is // because there is update code coming right after the condition, so we need // to skip it to get to the body. bc_parse_push(p, BC_INST_JUMP_ZERO); bc_parse_pushIndex(p, exit_idx); bc_parse_push(p, BC_INST_JUMP); bc_parse_pushIndex(p, body_idx); // Now create the label for the update code. bc_parse_createCondLabel(p, update_idx); // Parse if not empty, and if it is, let POSIX yell if necessary. if (p->l.t != BC_LEX_RPAREN) bc_parse_expr_status(p, 0, bc_parse_next_rel); else bc_parse_err(p, BC_ERR_POSIX_FOR); // Must have a right paren. if (BC_ERR(p->l.t != BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Set up a jump to the condition right after the update code. bc_parse_push(p, BC_INST_JUMP); bc_parse_pushIndex(p, cond_idx); bc_parse_createLabel(p, p->func->code.len); // Create an exit label for the body and start the body. 
bc_parse_createExitLabel(p, exit_idx, true); bc_lex_next(&p->l); bc_parse_startBody(p, BC_PARSE_FLAG_LOOP | BC_PARSE_FLAG_LOOP_INNER); } /** * Parse a statement or token that indicates a loop exit. This includes an * actual loop exit, the break keyword, or the continue keyword. * @param p The parser. * @param type The type of exit. */ static void bc_parse_loopExit(BcParse *p, BcLexType type) { size_t i; BcInstPtr *ip; // Must have a loop. If we don't, that's an error. if (BC_ERR(!BC_PARSE_LOOP(p))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // If we have a break statement... if (type == BC_LEX_KW_BREAK) { // If there are no exits, something went wrong somewhere. if (BC_ERR(!p->exits.len)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // Get the exit. i = p->exits.len - 1; ip = bc_vec_item(&p->exits, i); // The condition !ip->func is true if the exit is not for a loop, so we // need to find the first actual loop exit. while (!ip->func && i < p->exits.len) ip = bc_vec_item(&p->exits, i--); // Make sure everything is hunky dory. assert(ip != NULL && (i < p->exits.len || ip->func)); // Set the index for the exit. i = ip->idx; } // If we have a continue statement or just the loop end, jump to the // condition (or update for a for loop). else i = *((size_t*) bc_vec_top(&p->conds)); // Add the unconditional jump. bc_parse_push(p, BC_INST_JUMP); bc_parse_pushIndex(p, i); bc_lex_next(&p->l); } /** * Parse a function (header). * @param p The parser. */ static void bc_parse_func(BcParse *p) { bool comma = false, voidfn; uint16_t flags; size_t idx; bc_lex_next(&p->l); // Must have a name. if (BC_ERR(p->l.t != BC_LEX_NAME)) bc_parse_err(p, BC_ERR_PARSE_FUNC); // If the name is "void", and POSIX is not on, mark as void. voidfn = (!BC_IS_POSIX && p->l.t == BC_LEX_NAME && !strcmp(p->l.str.v, "void")); // We can safely do this because the expected token should not overwrite the // function name. bc_lex_next(&p->l); // If we *don't* have another name, then void is the name of the function.
voidfn = (voidfn && p->l.t == BC_LEX_NAME); // With a void function, allow POSIX to complain and get a new token. if (voidfn) { bc_parse_err(p, BC_ERR_POSIX_VOID); // We can safely do this because the expected token should not overwrite // the function name. bc_lex_next(&p->l); } // Must have a left paren. if (BC_ERR(p->l.t != BC_LEX_LPAREN)) bc_parse_err(p, BC_ERR_PARSE_FUNC); // Make sure the functions map and vector are synchronized. assert(p->prog->fns.len == p->prog->fn_map.len); // Must lock signals because vectors are changed, and the vector functions // expect signals to be locked. BC_SIG_LOCK; // Insert the function by name into the map and vector. idx = bc_program_insertFunc(p->prog, p->l.str.v); BC_SIG_UNLOCK; // Make sure the insert worked. assert(idx); // Update the function pointer and stuff in the parser and set its void. bc_parse_updateFunc(p, idx); p->func->voidfn = voidfn; bc_lex_next(&p->l); // While we do not have a right paren, we are still parsing arguments. while (p->l.t != BC_LEX_RPAREN) { BcType t = BC_TYPE_VAR; // If we have an asterisk, we are parsing a reference argument. if (p->l.t == BC_LEX_OP_MULTIPLY) { t = BC_TYPE_REF; bc_lex_next(&p->l); // Let POSIX complain if necessary. bc_parse_err(p, BC_ERR_POSIX_REF); } // If we don't have a name, the argument will not have a name. Barf. if (BC_ERR(p->l.t != BC_LEX_NAME)) bc_parse_err(p, BC_ERR_PARSE_FUNC); // Increment the number of parameters. p->func->nparams += 1; // Copy the string in the lexer so that we can use the lexer again. bc_vec_string(&p->buf, p->l.str.len, p->l.str.v); bc_lex_next(&p->l); // We are parsing an array parameter if this is true. if (p->l.t == BC_LEX_LBRACKET) { // Set the array type, unless we are already parsing a reference. if (t == BC_TYPE_VAR) t = BC_TYPE_ARRAY; bc_lex_next(&p->l); // The brackets *must* be empty. 
if (BC_ERR(p->l.t != BC_LEX_RBRACKET)) bc_parse_err(p, BC_ERR_PARSE_FUNC); bc_lex_next(&p->l); } // If we did *not* get a bracket, but we are expecting a reference, we // have a problem. else if (BC_ERR(t == BC_TYPE_REF)) bc_parse_verr(p, BC_ERR_PARSE_REF_VAR, p->buf.v); // Test for comma and get the next token if it exists. comma = (p->l.t == BC_LEX_COMMA); if (comma) bc_lex_next(&p->l); // Insert the parameter into the function. bc_func_insert(p->func, p->prog, p->buf.v, t, p->l.line); } // If we have a comma, but no parameter, barf. if (BC_ERR(comma)) bc_parse_err(p, BC_ERR_PARSE_FUNC); // Start the body. flags = BC_PARSE_FLAG_FUNC | BC_PARSE_FLAG_FUNC_INNER; bc_parse_startBody(p, flags); bc_lex_next(&p->l); // POSIX requires that a brace be on the same line as the function header. // If we don't have a brace, let POSIX throw an error. if (p->l.t != BC_LEX_LBRACE) bc_parse_err(p, BC_ERR_POSIX_BRACE); } /** * Parse an auto list. * @param p The parser. */ static void bc_parse_auto(BcParse *p) { bool comma, one; // Error if the auto keyword appeared in the wrong place. if (BC_ERR(!p->auto_part)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); bc_lex_next(&p->l); p->auto_part = comma = false; // We need at least one variable or array. one = (p->l.t == BC_LEX_NAME); // While we have a variable or array. while (p->l.t == BC_LEX_NAME) { BcType t; // Copy the name from the lexer, so we can use it again. bc_vec_string(&p->buf, p->l.str.len - 1, p->l.str.v); bc_lex_next(&p->l); // If we are parsing an array... if (p->l.t == BC_LEX_LBRACKET) { t = BC_TYPE_ARRAY; bc_lex_next(&p->l); // The brackets *must* be empty. if (BC_ERR(p->l.t != BC_LEX_RBRACKET)) bc_parse_err(p, BC_ERR_PARSE_FUNC); bc_lex_next(&p->l); } else t = BC_TYPE_VAR; // Test for comma and get the next token if it exists. comma = (p->l.t == BC_LEX_COMMA); if (comma) bc_lex_next(&p->l); // Insert the auto into the function. 
bc_func_insert(p->func, p->prog, p->buf.v, t, p->l.line); } // If we have a comma, but no auto, barf. if (BC_ERR(comma)) bc_parse_err(p, BC_ERR_PARSE_FUNC); // If we don't have any variables or arrays, barf. if (BC_ERR(!one)) bc_parse_err(p, BC_ERR_PARSE_NO_AUTO); // The auto statement should be all that's in the statement. if (BC_ERR(!bc_parse_isDelimiter(p))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); } /** * Parses a body. * @param p The parser. * @param brace True if a brace was encountered, false otherwise. */ static void bc_parse_body(BcParse *p, bool brace) { uint16_t *flag_ptr = BC_PARSE_TOP_FLAG_PTR(p); assert(flag_ptr != NULL); assert(p->flags.len >= 2); // The body flag is for when we expect a body. We got a body, so clear the // flag. *flag_ptr &= ~(BC_PARSE_FLAG_BODY); // If we are inside a function, that means we just barely entered it, and // we can expect an auto list. if (*flag_ptr & BC_PARSE_FLAG_FUNC_INNER) { // We *must* have a brace in this case. if (BC_ERR(!brace)) bc_parse_err(p, BC_ERR_PARSE_TOKEN); p->auto_part = (p->l.t != BC_LEX_KW_AUTO); if (!p->auto_part) { // Make sure this is true to not get a parse error. p->auto_part = true; // Since we already have the auto keyword, parse. bc_parse_auto(p); } // Eat a newline. if (p->l.t == BC_LEX_NLINE) bc_lex_next(&p->l); } else { // This is the easy part. size_t len = p->flags.len; assert(*flag_ptr); // Parse a statement. bc_parse_stmt(p); // This is a very important condition to get right. If there is no // brace, and no body flag, and the flags len hasn't shrunk, then we // have a body that was not delimited by braces, so we need to end it // now, after just one statement. if (!brace && !BC_PARSE_BODY(p) && len <= p->flags.len) bc_parse_endBody(p, false); } } /** * Parses a statement. This is the entry point for just about everything, except * function definitions. * @param p The parser. */ static void bc_parse_stmt(BcParse *p) { size_t len; uint16_t flags; BcLexType type = p->l.t; // Eat newline. 
if (type == BC_LEX_NLINE) { bc_lex_next(&p->l); return; } // Eat auto list. if (type == BC_LEX_KW_AUTO) { bc_parse_auto(p); return; } // If we reach this point, no auto list is allowed. p->auto_part = false; // Everything but an else needs to be taken care of here, but else is // special. if (type != BC_LEX_KW_ELSE) { // After an if, no else found. if (BC_PARSE_IF_END(p)) { // Clear the expectation for else, end body, and return. Returning // gives us a clean slate for parsing again. bc_parse_noElse(p); if (p->flags.len > 1 && !BC_PARSE_BRACE(p)) bc_parse_endBody(p, false); return; } // With a left brace, we are parsing a body. else if (type == BC_LEX_LBRACE) { // We need to start a body if we are not expecting one yet. if (!BC_PARSE_BODY(p)) { bc_parse_startBody(p, BC_PARSE_FLAG_BRACE); bc_lex_next(&p->l); } // If we *are* expecting a body, that body should get a brace. This // takes care of braces being on a different line than if and loop // headers. else { *(BC_PARSE_TOP_FLAG_PTR(p)) |= BC_PARSE_FLAG_BRACE; bc_lex_next(&p->l); bc_parse_body(p, true); } // If we have reached this point, we need to return for a clean // slate. return; } // This happens when we are expecting a body and get a single statement, // i.e., a body with no braces surrounding it. Returns after for a clean // slate. else if (BC_PARSE_BODY(p) && !BC_PARSE_BRACE(p)) { bc_parse_body(p, false); return; } } len = p->flags.len; flags = BC_PARSE_TOP_FLAG(p); switch (type) { // All of these are valid for expressions. 
case BC_LEX_OP_INC: case BC_LEX_OP_DEC: case BC_LEX_OP_MINUS: case BC_LEX_OP_BOOL_NOT: case BC_LEX_LPAREN: case BC_LEX_NAME: case BC_LEX_NUMBER: case BC_LEX_KW_IBASE: case BC_LEX_KW_LAST: case BC_LEX_KW_LENGTH: case BC_LEX_KW_OBASE: case BC_LEX_KW_SCALE: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_SEED: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_SQRT: case BC_LEX_KW_ABS: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_IRAND: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_ASCIIFY: case BC_LEX_KW_MODEXP: case BC_LEX_KW_DIVMOD: case BC_LEX_KW_READ: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_RAND: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_MAXIBASE: case BC_LEX_KW_MAXOBASE: case BC_LEX_KW_MAXSCALE: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_MAXRAND: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_LINE_LENGTH: case BC_LEX_KW_GLOBAL_STACKS: case BC_LEX_KW_LEADING_ZERO: { bc_parse_expr_status(p, BC_PARSE_PRINT, bc_parse_next_expr); break; } case BC_LEX_KW_ELSE: { bc_parse_else(p); break; } // Just eat. case BC_LEX_SCOLON: { // Do nothing. break; } case BC_LEX_RBRACE: { bc_parse_endBody(p, true); break; } case BC_LEX_STR: { bc_parse_str(p, BC_INST_PRINT_STR); break; } case BC_LEX_KW_BREAK: case BC_LEX_KW_CONTINUE: { bc_parse_loopExit(p, p->l.t); break; } case BC_LEX_KW_FOR: { bc_parse_for(p); break; } case BC_LEX_KW_HALT: { bc_parse_push(p, BC_INST_HALT); bc_lex_next(&p->l); break; } case BC_LEX_KW_IF: { bc_parse_if(p); break; } case BC_LEX_KW_LIMITS: { // `limits` is a compile-time command, so execute it right away. 
bc_vm_printf("BC_LONG_BIT = %lu\n", (ulong) BC_LONG_BIT); bc_vm_printf("BC_BASE_DIGS = %lu\n", (ulong) BC_BASE_DIGS); bc_vm_printf("BC_BASE_POW = %lu\n", (ulong) BC_BASE_POW); bc_vm_printf("BC_OVERFLOW_MAX = %lu\n", (ulong) BC_NUM_BIGDIG_MAX); bc_vm_printf("\n"); bc_vm_printf("BC_BASE_MAX = %lu\n", BC_MAX_OBASE); bc_vm_printf("BC_DIM_MAX = %lu\n", BC_MAX_DIM); bc_vm_printf("BC_SCALE_MAX = %lu\n", BC_MAX_SCALE); bc_vm_printf("BC_STRING_MAX = %lu\n", BC_MAX_STRING); bc_vm_printf("BC_NAME_MAX = %lu\n", BC_MAX_NAME); bc_vm_printf("BC_NUM_MAX = %lu\n", BC_MAX_NUM); #if BC_ENABLE_EXTRA_MATH bc_vm_printf("BC_RAND_MAX = %lu\n", BC_MAX_RAND); #endif // BC_ENABLE_EXTRA_MATH bc_vm_printf("MAX Exponent = %lu\n", BC_MAX_EXP); bc_vm_printf("Number of vars = %lu\n", BC_MAX_VARS); bc_lex_next(&p->l); break; } case BC_LEX_KW_STREAM: case BC_LEX_KW_PRINT: { bc_parse_print(p, type); break; } case BC_LEX_KW_QUIT: { // Quit is a compile-time command. We don't exit directly, so the vm // can clean up. vm.status = BC_STATUS_QUIT; BC_JMP; break; } case BC_LEX_KW_RETURN: { bc_parse_return(p); break; } case BC_LEX_KW_WHILE: { bc_parse_while(p); break; } default: { bc_parse_err(p, BC_ERR_PARSE_TOKEN); } } // If the flags did not change, we expect a delimiter. if (len == p->flags.len && flags == BC_PARSE_TOP_FLAG(p)) { if (BC_ERR(!bc_parse_isDelimiter(p))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); } // Make sure semicolons are eaten. while (p->l.t == BC_LEX_SCOLON) bc_lex_next(&p->l); } void bc_parse_parse(BcParse *p) { assert(p); BC_SETJMP(exit); // We should not let an EOF get here unless some partial parse was not // completed, in which case, it's the user's fault. if (BC_ERR(p->l.t == BC_LEX_EOF)) bc_parse_err(p, BC_ERR_PARSE_EOF); // Functions need special parsing. 
else if (p->l.t == BC_LEX_KW_DEFINE) { if (BC_ERR(BC_PARSE_NO_EXEC(p))) { - if (p->flags.len == 1 && - BC_PARSE_TOP_FLAG(p) == BC_PARSE_FLAG_IF_END) - { - bc_parse_noElse(p); - } - else bc_parse_err(p, BC_ERR_PARSE_TOKEN); + bc_parse_endif(p); + if (BC_ERR(BC_PARSE_NO_EXEC(p))) + bc_parse_err(p, BC_ERR_PARSE_TOKEN); } bc_parse_func(p); } // Otherwise, parse a normal statement. else bc_parse_stmt(p); exit: BC_SIG_MAYLOCK; // We need to reset on error. if (BC_ERR(((vm.status && vm.status != BC_STATUS_QUIT) || vm.sig))) bc_parse_reset(p); BC_LONGJMP_CONT; } /** * Parse an expression. This is the actual implementation of the Shunting-Yard * Algorithm. * @param p The parser. * @param flags The flags for what is valid in the expression. * @param next A set of tokens for what is valid *after* the expression. * @return A parse status. In some places, an empty expression is an * error, and sometimes, it is required. This allows this function * to tell the caller if the expression was empty and let the * caller handle it. */ static BcParseStatus bc_parse_expr_err(BcParse *p, uint8_t flags, BcParseNext next) { BcInst prev = BC_INST_PRINT; uchar inst = BC_INST_INVALID; BcLexType top, t; size_t nexprs, ops_bgn; uint32_t i, nparens, nrelops; bool pfirst, rprn, done, get_token, assign, bin_last, incdec, can_assign; // One of these *must* be true. assert(!(flags & BC_PARSE_PRINT) || !(flags & BC_PARSE_NEEDVAL)); // These are set very carefully. In fact, controlling the values of these // locals is the biggest part of making this work. ops_bgn especially is // important because it marks where the operator stack begins for *this* // invocation of this function. That's because bc_parse_expr_err() is // recursive (the Shunting-Yard Algorithm is most easily expressed // recursively when parsing subexpressions), and each invocation needs to // know where to stop. // // - nparens is the number of left parens without matches. 
// - nrelops is the number of relational operators that appear in the expr. // - nexprs is the number of unused expressions. // - rprn is a right paren encountered last. // - done means the expression has been fully parsed. // - get_token is true when a token is needed at the end of an iteration. // - assign is true when an assignment statement was parsed last. // - incdec is true when the previous operator was an inc or dec operator. // - can_assign is true when an assignment is valid. // - bin_last is true when the previous instruction was a binary operator. t = p->l.t; pfirst = (p->l.t == BC_LEX_LPAREN); nparens = nrelops = 0; nexprs = 0; ops_bgn = p->ops.len; rprn = done = get_token = assign = incdec = can_assign = false; bin_last = true; // We want to eat newlines if newlines are not a valid ending token. // This is for spacing in things like for loop headers. if (!(flags & BC_PARSE_NOREAD)) { while ((t = p->l.t) == BC_LEX_NLINE) bc_lex_next(&p->l); } // This is the Shunting-Yard algorithm loop. for (; !done && BC_PARSE_EXPR(t); t = p->l.t) { switch (t) { case BC_LEX_OP_INC: case BC_LEX_OP_DEC: { // These operators can only be used with items that can be // assigned to. if (BC_ERR(incdec)) bc_parse_err(p, BC_ERR_PARSE_ASSIGN); bc_parse_incdec(p, &prev, &can_assign, &nexprs, flags); rprn = get_token = bin_last = false; incdec = true; flags &= ~(BC_PARSE_ARRAY); break; } #if BC_ENABLE_EXTRA_MATH case BC_LEX_OP_TRUNC: { // The previous token must have been a leaf expression, or the // operator is in the wrong place. if (BC_ERR(!BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_TOKEN); // I can just add the instruction because // negative will already be taken care of.
bc_parse_push(p, BC_INST_TRUNC); rprn = can_assign = incdec = false; get_token = true; flags &= ~(BC_PARSE_ARRAY); break; } #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_OP_MINUS: { bc_parse_minus(p, &prev, ops_bgn, rprn, bin_last, &nexprs); rprn = get_token = can_assign = false; // This is true if it was a binary operator last. bin_last = (prev == BC_INST_MINUS); if (bin_last) incdec = false; flags &= ~(BC_PARSE_ARRAY); break; } // All of this group, including the fallthrough, is to parse binary // operators. case BC_LEX_OP_ASSIGN_POWER: case BC_LEX_OP_ASSIGN_MULTIPLY: case BC_LEX_OP_ASSIGN_DIVIDE: case BC_LEX_OP_ASSIGN_MODULUS: case BC_LEX_OP_ASSIGN_PLUS: case BC_LEX_OP_ASSIGN_MINUS: #if BC_ENABLE_EXTRA_MATH case BC_LEX_OP_ASSIGN_PLACES: case BC_LEX_OP_ASSIGN_LSHIFT: case BC_LEX_OP_ASSIGN_RSHIFT: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_OP_ASSIGN: { // We need to make sure the assignment is valid. if (!BC_PARSE_INST_VAR(prev)) bc_parse_err(p, BC_ERR_PARSE_ASSIGN); } // Fallthrough. BC_FALLTHROUGH case BC_LEX_OP_POWER: case BC_LEX_OP_MULTIPLY: case BC_LEX_OP_DIVIDE: case BC_LEX_OP_MODULUS: case BC_LEX_OP_PLUS: #if BC_ENABLE_EXTRA_MATH case BC_LEX_OP_PLACES: case BC_LEX_OP_LSHIFT: case BC_LEX_OP_RSHIFT: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_OP_REL_EQ: case BC_LEX_OP_REL_LE: case BC_LEX_OP_REL_GE: case BC_LEX_OP_REL_NE: case BC_LEX_OP_REL_LT: case BC_LEX_OP_REL_GT: case BC_LEX_OP_BOOL_NOT: case BC_LEX_OP_BOOL_OR: case BC_LEX_OP_BOOL_AND: { // This is true if the token is a prefix // operator. This is only for boolean not. if (BC_PARSE_OP_PREFIX(t)) { // Prefix operators are only allowed after binary operators // or prefix operators. if (BC_ERR(!bin_last && !BC_PARSE_OP_PREFIX(p->l.last))) bc_parse_err(p, BC_ERR_PARSE_EXPR); } // If we execute the else, that means we have a binary operator. // If the previous operator was a prefix or a binary operator, // then a binary operator is not allowed.
else if (BC_ERR(BC_PARSE_PREV_PREFIX(prev) || bin_last)) bc_parse_err(p, BC_ERR_PARSE_EXPR); nrelops += (t >= BC_LEX_OP_REL_EQ && t <= BC_LEX_OP_REL_GT); prev = BC_PARSE_TOKEN_INST(t); bc_parse_operator(p, t, ops_bgn, &nexprs); rprn = incdec = can_assign = false; get_token = true; bin_last = !BC_PARSE_OP_PREFIX(t); flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_LPAREN: { // A left paren is *not* allowed right after a leaf expr. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); nparens += 1; rprn = incdec = can_assign = false; get_token = true; // Push the paren onto the operator stack. bc_vec_push(&p->ops, &t); break; } case BC_LEX_RPAREN: { // This needs to be a status. The error is handled in // bc_parse_expr_status(). if (BC_ERR(p->l.last == BC_LEX_LPAREN)) return BC_PARSE_STATUS_EMPTY_EXPR; // The right paren must not come after a prefix or binary // operator. if (BC_ERR(bin_last || BC_PARSE_PREV_PREFIX(prev))) bc_parse_err(p, BC_ERR_PARSE_EXPR); // If there are no parens left, we are done, but we need another // token. if (!nparens) { done = true; get_token = false; break; } nparens -= 1; rprn = true; get_token = bin_last = incdec = false; bc_parse_rightParen(p, &nexprs); break; } case BC_LEX_STR: { // POSIX only allows strings alone. if (BC_IS_POSIX) bc_parse_err(p, BC_ERR_POSIX_EXPR_STRING); // A string is a leaf and cannot come right after a leaf. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); bc_parse_addString(p); get_token = true; bin_last = rprn = false; nexprs += 1; break; } case BC_LEX_NAME: { // A name is a leaf and cannot come right after a leaf. 
if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); get_token = bin_last = false; bc_parse_name(p, &prev, &can_assign, flags & ~BC_PARSE_NOCALL); rprn = (prev == BC_INST_CALL); nexprs += 1; flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_NUMBER: { // A number is a leaf and cannot come right after a leaf. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); // The number instruction is pushed in here. bc_parse_number(p); nexprs += 1; prev = BC_INST_NUM; get_token = true; rprn = bin_last = can_assign = false; flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_KW_IBASE: case BC_LEX_KW_LAST: case BC_LEX_KW_OBASE: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_SEED: #endif // BC_ENABLE_EXTRA_MATH { // All of these are leaves and cannot come right after a leaf. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); prev = t - BC_LEX_KW_LAST + BC_INST_LAST; bc_parse_push(p, prev); get_token = can_assign = true; rprn = bin_last = false; nexprs += 1; flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_KW_LENGTH: case BC_LEX_KW_SQRT: case BC_LEX_KW_ABS: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_IRAND: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_ASCIIFY: { // All of these are leaves and cannot come right after a leaf. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); bc_parse_builtin(p, t, flags, &prev); rprn = get_token = bin_last = incdec = can_assign = false; nexprs += 1; flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_KW_READ: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_RAND: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_MAXIBASE: case BC_LEX_KW_MAXOBASE: case BC_LEX_KW_MAXSCALE: #if BC_ENABLE_EXTRA_MATH case BC_LEX_KW_MAXRAND: #endif // BC_ENABLE_EXTRA_MATH case BC_LEX_KW_LINE_LENGTH: case BC_LEX_KW_GLOBAL_STACKS: case BC_LEX_KW_LEADING_ZERO: { // All of these are leaves and cannot come right after a leaf. 
if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); // Error if we have read and it's not allowed. else if (t == BC_LEX_KW_READ && BC_ERR(flags & BC_PARSE_NOREAD)) bc_parse_err(p, BC_ERR_EXEC_REC_READ); prev = t - BC_LEX_KW_READ + BC_INST_READ; bc_parse_noArgBuiltin(p, prev); rprn = get_token = bin_last = incdec = can_assign = false; nexprs += 1; flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_KW_SCALE: { // This is a leaf and cannot come right after a leaf. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); // Scale needs special work because it can be a variable *or* a // function. bc_parse_scale(p, &prev, &can_assign, flags); rprn = get_token = bin_last = false; nexprs += 1; flags &= ~(BC_PARSE_ARRAY); break; } case BC_LEX_KW_MODEXP: case BC_LEX_KW_DIVMOD: { // This is a leaf and cannot come right after a leaf. if (BC_ERR(BC_PARSE_LEAF(prev, bin_last, rprn))) bc_parse_err(p, BC_ERR_PARSE_EXPR); bc_parse_builtin3(p, t, flags, &prev); rprn = get_token = bin_last = incdec = can_assign = false; nexprs += 1; flags &= ~(BC_PARSE_ARRAY); break; } default: { #ifndef NDEBUG // We should never get here, even in debug builds. bc_parse_err(p, BC_ERR_PARSE_TOKEN); break; #endif // NDEBUG } } if (get_token) bc_lex_next(&p->l); } // Now that we have parsed the expression, we need to empty the operator // stack. while (p->ops.len > ops_bgn) { top = BC_PARSE_TOP_OP(p); assign = top >= BC_LEX_OP_ASSIGN_POWER && top <= BC_LEX_OP_ASSIGN; // There should not be *any* parens on the stack anymore. if (BC_ERR(top == BC_LEX_LPAREN || top == BC_LEX_RPAREN)) bc_parse_err(p, BC_ERR_PARSE_EXPR); bc_parse_push(p, BC_PARSE_TOKEN_INST(top)); // Adjust the number of unused expressions. nexprs -= !BC_PARSE_OP_PREFIX(top); bc_vec_pop(&p->ops); incdec = false; } // There must be only one expression at the top. if (BC_ERR(nexprs != 1)) bc_parse_err(p, BC_ERR_PARSE_EXPR); // Check that the next token is correct. 
for (i = 0; i < next.len && t != next.tokens[i]; ++i); if (BC_ERR(i == next.len && !bc_parse_isDelimiter(p))) bc_parse_err(p, BC_ERR_PARSE_EXPR); // Check that POSIX would be happy with the number of relational operators. if (!(flags & BC_PARSE_REL) && nrelops) bc_parse_err(p, BC_ERR_POSIX_REL_POS); else if ((flags & BC_PARSE_REL) && nrelops > 1) bc_parse_err(p, BC_ERR_POSIX_MULTIREL); // If this is true, then we might be in a situation where we don't print. // We would want to have the increment/decrement operator not make an extra // copy if it's not necessary. if (!(flags & BC_PARSE_NEEDVAL) && !pfirst) { // We have the easy case if the last operator was an assignment // operator. if (assign) { inst = *((uchar*) bc_vec_top(&p->func->code)); inst += (BC_INST_ASSIGN_POWER_NO_VAL - BC_INST_ASSIGN_POWER); incdec = false; } // If we have an inc/dec operator and we are *not* printing, implement // the optimization to get rid of the extra copy. else if (incdec && !(flags & BC_PARSE_PRINT)) { inst = *((uchar*) bc_vec_top(&p->func->code)); incdec = (inst <= BC_INST_DEC); inst = BC_INST_ASSIGN_PLUS_NO_VAL + (inst != BC_INST_INC && inst != BC_INST_ASSIGN_PLUS); } // This condition allows us to change the previous assignment // instruction (which does a copy) for a NO_VAL version, which does not. // This condition is set if either of the above if statements ends up // being true. if (inst >= BC_INST_ASSIGN_POWER_NO_VAL && inst <= BC_INST_ASSIGN_NO_VAL) { // Pop the previous assignment instruction and push a new one. // Inc/dec needs the extra instruction because it is now a binary // operator and needs a second operand. bc_vec_pop(&p->func->code); if (incdec) bc_parse_push(p, BC_INST_ONE); bc_parse_push(p, inst); } } // If we might have to print... if ((flags & BC_PARSE_PRINT)) { // With a paren first or the last operator not being an assignment, we // *do* want to print. 
if (pfirst || !assign) bc_parse_push(p, BC_INST_PRINT); } // We need to make sure to push a pop instruction for assignment statements // that will not print. The print will pop, but without it, we need to pop. else if (!(flags & BC_PARSE_NEEDVAL) && (inst < BC_INST_ASSIGN_POWER_NO_VAL || inst > BC_INST_ASSIGN_NO_VAL)) { bc_parse_push(p, BC_INST_POP); } // We want to eat newlines if newlines are not a valid ending token. // This is for spacing in things like for loop headers. // // Yes, this is one case where I reuse a variable for a different purpose; // in this case, incdec being true now means that newlines are not valid. for (incdec = true, i = 0; i < next.len && incdec; ++i) incdec = (next.tokens[i] != BC_LEX_NLINE); if (incdec) { while (p->l.t == BC_LEX_NLINE) bc_lex_next(&p->l); } return BC_PARSE_STATUS_SUCCESS; } /** * Parses an expression with bc_parse_expr_err(), but throws an error if it gets * an empty expression. * @param p The parser. * @param flags The flags for what is valid in the expression. * @param next A set of tokens for what is valid *after* the expression. */ static void bc_parse_expr_status(BcParse *p, uint8_t flags, BcParseNext next) { BcParseStatus s = bc_parse_expr_err(p, flags, next); if (BC_ERR(s == BC_PARSE_STATUS_EMPTY_EXPR)) bc_parse_err(p, BC_ERR_PARSE_EMPTY_EXPR); } void bc_parse_expr(BcParse *p, uint8_t flags) { assert(p); bc_parse_expr_status(p, flags, bc_parse_next_read); } #endif // BC_ENABLED diff --git a/src/vm.c b/src/vm.c index 8f222f8ccf69..853dff0820dd 100644 --- a/src/vm.c +++ b/src/vm.c @@ -1,1468 +1,1437 @@ /* * ***************************************************************************** * * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) 2018-2021 Gavin D. Howard and contributors. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * * Redistributions of source code must retain the above copyright notice, this * list of conditions and the following disclaimer. * * * Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * ***************************************************************************** * * Code common to all of bc and dc. * */ #include #include #include #include #include #include #include #ifndef _WIN32 #include #include #include #else // _WIN32 #define WIN32_LEAN_AND_MEAN #include #include #endif // _WIN32 #include #include #include #include #include #include // The actual globals. 
static BcDig* temps_buf[BC_VM_MAX_TEMPS]; char output_bufs[BC_VM_BUF_SIZE]; BcVm vm; #if BC_DEBUG_CODE BC_NORETURN void bc_vm_jmp(const char* f) { #else // BC_DEBUG_CODE BC_NORETURN void bc_vm_jmp(void) { #endif assert(BC_SIG_EXC); BC_SIG_MAYLOCK; #if BC_DEBUG_CODE bc_file_puts(&vm.ferr, bc_flush_none, "Longjmp: "); bc_file_puts(&vm.ferr, bc_flush_none, f); bc_file_putchar(&vm.ferr, bc_flush_none, '\n'); bc_file_flush(&vm.ferr, bc_flush_none); #endif // BC_DEBUG_CODE #ifndef NDEBUG assert(vm.jmp_bufs.len - (size_t) vm.sig_pop); #endif // NDEBUG if (vm.jmp_bufs.len == 0) abort(); if (vm.sig_pop) bc_vec_pop(&vm.jmp_bufs); else vm.sig_pop = 1; siglongjmp(*((sigjmp_buf*) bc_vec_top(&vm.jmp_bufs)), 1); } #if !BC_ENABLE_LIBRARY /** * Handles signals. This is the signal handler. * @param sig The signal to handle. */ static void bc_vm_sig(int sig) { // There is already a signal in flight. if (vm.status == (sig_atomic_t) BC_STATUS_QUIT || vm.sig) { if (!BC_I || sig != SIGINT) vm.status = BC_STATUS_QUIT; return; } // Only reset under these conditions; otherwise, quit. if (sig == SIGINT && BC_SIGINT && BC_I) { int err = errno; // Write the message. if (write(STDOUT_FILENO, vm.sigmsg, vm.siglen) != (ssize_t) vm.siglen) vm.status = BC_STATUS_ERROR_FATAL; else vm.sig = 1; errno = err; } else vm.status = BC_STATUS_QUIT; assert(vm.jmp_bufs.len); // Only jump if signals are not locked. The jump will happen by whoever // unlocks signals. if (!vm.sig_lock) BC_JMP; } /** * Sets up signal handling. 
*/ static void bc_vm_sigaction(void) { #ifndef _WIN32 struct sigaction sa; sigemptyset(&sa.sa_mask); sa.sa_handler = bc_vm_sig; sa.sa_flags = SA_NODEFER; sigaction(SIGTERM, &sa, NULL); sigaction(SIGQUIT, &sa, NULL); sigaction(SIGINT, &sa, NULL); #if BC_ENABLE_HISTORY if (BC_TTY) sigaction(SIGHUP, &sa, NULL); #endif // BC_ENABLE_HISTORY #else // _WIN32 signal(SIGTERM, bc_vm_sig); signal(SIGINT, bc_vm_sig); #endif // _WIN32 } void bc_vm_info(const char* const help) { BC_SIG_ASSERT_LOCKED; // Print the banner. bc_file_puts(&vm.fout, bc_flush_none, vm.name); bc_file_putchar(&vm.fout, bc_flush_none, ' '); bc_file_puts(&vm.fout, bc_flush_none, BC_VERSION); bc_file_putchar(&vm.fout, bc_flush_none, '\n'); bc_file_puts(&vm.fout, bc_flush_none, bc_copyright); // Print the help. if (help) { bc_file_putchar(&vm.fout, bc_flush_none, '\n'); #if BC_ENABLED if (BC_IS_BC) { const char* const banner = BC_DEFAULT_BANNER ? "to" : "to not"; const char* const sigint = BC_DEFAULT_SIGINT_RESET ? "to reset" : "to exit"; const char* const tty = BC_DEFAULT_TTY_MODE ? "enabled" : "disabled"; const char* const prompt = BC_DEFAULT_PROMPT ? "enabled" : "disabled"; bc_file_printf(&vm.fout, help, vm.name, vm.name, BC_VERSION, BC_BUILD_TYPE, banner, sigint, tty, prompt); } #endif // BC_ENABLED #if DC_ENABLED if (BC_IS_DC) { const char* const sigint = DC_DEFAULT_SIGINT_RESET ? "to reset" : "to exit"; const char* const tty = DC_DEFAULT_TTY_MODE ? "enabled" : "disabled"; const char* const prompt = DC_DEFAULT_PROMPT ? "enabled" : "disabled"; bc_file_printf(&vm.fout, help, vm.name, vm.name, BC_VERSION, BC_BUILD_TYPE, sigint, tty, prompt); } #endif // DC_ENABLED } // Flush. 
bc_file_flush(&vm.fout, bc_flush_none); } #endif // !BC_ENABLE_LIBRARY #if !BC_ENABLE_LIBRARY && !BC_ENABLE_MEMCHECK BC_NORETURN #endif // !BC_ENABLE_LIBRARY && !BC_ENABLE_MEMCHECK void bc_vm_fatalError(BcErr e) { bc_err(e); #if !BC_ENABLE_LIBRARY && !BC_ENABLE_MEMCHECK BC_UNREACHABLE abort(); #endif // !BC_ENABLE_LIBRARY && !BC_ENABLE_MEMCHECK } #if BC_ENABLE_LIBRARY void bc_vm_handleError(BcErr e) { assert(e < BC_ERR_NELEMS); assert(!vm.sig_pop); BC_SIG_LOCK; // If we have a normal error... if (e <= BC_ERR_MATH_DIVIDE_BY_ZERO) { // Set the error. vm.err = (BclError) (e - BC_ERR_MATH_NEGATIVE + BCL_ERROR_MATH_NEGATIVE); } // Abort if we should. else if (vm.abrt) abort(); else if (e == BC_ERR_FATAL_ALLOC_ERR) vm.err = BCL_ERROR_FATAL_ALLOC_ERR; else vm.err = BCL_ERROR_FATAL_UNKNOWN_ERR; BC_JMP; } #else // BC_ENABLE_LIBRARY void bc_vm_handleError(BcErr e, size_t line, ...) { BcStatus s; va_list args; uchar id = bc_err_ids[e]; const char* err_type = vm.err_ids[id]; sig_atomic_t lock; assert(e < BC_ERR_NELEMS); assert(!vm.sig_pop); #if BC_ENABLED // Figure out if the POSIX error should be an error, a warning, or nothing. if (!BC_S && e >= BC_ERR_POSIX_START) { if (BC_W) { // Make sure to not return an error. id = UCHAR_MAX; err_type = vm.err_ids[BC_ERR_IDX_WARN]; } else return; } #endif // BC_ENABLED BC_SIG_TRYLOCK(lock); // Make sure all of stdout is written first. s = bc_file_flushErr(&vm.fout, bc_flush_err); // Just jump out if the flush failed; there's nothing we can do. if (BC_ERR(s == BC_STATUS_ERROR_FATAL)) { vm.status = (sig_atomic_t) s; BC_JMP; } // Print the error message. va_start(args, line); bc_file_putchar(&vm.ferr, bc_flush_none, '\n'); bc_file_puts(&vm.ferr, bc_flush_none, err_type); bc_file_putchar(&vm.ferr, bc_flush_none, ' '); bc_file_vprintf(&vm.ferr, vm.err_msgs[e], args); va_end(args); // Print the extra information if we have it. if (BC_NO_ERR(vm.file != NULL)) { // This is the condition for parsing vs runtime. 
// If line is not 0, it is parsing. if (line) { bc_file_puts(&vm.ferr, bc_flush_none, "\n "); bc_file_puts(&vm.ferr, bc_flush_none, vm.file); bc_file_printf(&vm.ferr, bc_err_line, line); } else { BcInstPtr *ip = bc_vec_item_rev(&vm.prog.stack, 0); BcFunc *f = bc_vec_item(&vm.prog.fns, ip->func); bc_file_puts(&vm.ferr, bc_flush_none, "\n "); bc_file_puts(&vm.ferr, bc_flush_none, vm.func_header); bc_file_putchar(&vm.ferr, bc_flush_none, ' '); bc_file_puts(&vm.ferr, bc_flush_none, f->name); #if BC_ENABLED if (BC_IS_BC && ip->func != BC_PROG_MAIN && ip->func != BC_PROG_READ) { bc_file_puts(&vm.ferr, bc_flush_none, "()"); } #endif // BC_ENABLED } } bc_file_puts(&vm.ferr, bc_flush_none, "\n\n"); s = bc_file_flushErr(&vm.ferr, bc_flush_err); #if !BC_ENABLE_MEMCHECK // Because this function is called by a BC_NORETURN function when fatal // errors happen, we need to make sure to exit on fatal errors. This will // be faster anyway. This function *cannot jump when a fatal error occurs!* if (BC_ERR(id == BC_ERR_IDX_FATAL || s == BC_STATUS_ERROR_FATAL)) exit(bc_vm_atexit((int) BC_STATUS_ERROR_FATAL)); #else // !BC_ENABLE_MEMCHECK if (BC_ERR(s == BC_STATUS_ERROR_FATAL)) vm.status = (sig_atomic_t) s; else #endif // !BC_ENABLE_MEMCHECK { vm.status = (sig_atomic_t) (uchar) (id + 1); } // Only jump if there is an error. if (BC_ERR(vm.status)) BC_JMP; BC_SIG_TRYUNLOCK(lock); } char* bc_vm_getenv(const char* var) { char* ret; #ifndef _WIN32 ret = getenv(var); #else // _WIN32 _dupenv_s(&ret, NULL, var); #endif // _WIN32 return ret; } void bc_vm_getenvFree(char* val) { BC_UNUSED(val); #ifdef _WIN32 free(val); #endif // _WIN32 } /** * Sets a flag from an environment variable and the default. * @param var The environment variable. * @param def The default. * @param flag The flag to set. */ static void bc_vm_setenvFlag(const char* const var, int def, uint16_t flag) { // Get the value. char* val = bc_vm_getenv(var); // If there is no value... if (val == NULL) { // Set the default. 
if (def) vm.flags |= flag; else vm.flags &= ~(flag); } // Parse the value. else if (strtoul(val, NULL, 0)) vm.flags |= flag; else vm.flags &= ~(flag); bc_vm_getenvFree(val); } /** * Parses the arguments in {B,D}C_ENV_ARGS. * @param env_args_name The environment variable to use. */ static void bc_vm_envArgs(const char* const env_args_name) { char *env_args = bc_vm_getenv(env_args_name), *buf, *start; char instr = '\0'; BC_SIG_ASSERT_LOCKED; if (env_args == NULL) return; // Windows already allocates, so we don't need to. #ifndef _WIN32 start = buf = vm.env_args_buffer = bc_vm_strdup(env_args); #else // _WIN32 start = buf = vm.env_args_buffer = env_args; #endif // _WIN32 assert(buf != NULL); // Create two buffers for parsing. These need to stay throughout the entire // execution of bc, unfortunately, because of filenames that might be in // there. bc_vec_init(&vm.env_args, sizeof(char*), BC_DTOR_NONE); bc_vec_push(&vm.env_args, &env_args_name); // While we haven't reached the end of the args... while (*buf) { // If we don't have whitespace... if (!isspace(*buf)) { // If we have the start of a string... if (*buf == '"' || *buf == '\'') { // Set stuff appropriately. instr = *buf; buf += 1; // Check for the empty string. if (*buf == instr) { instr = '\0'; buf += 1; continue; } } // Push the pointer to the args buffer. bc_vec_push(&vm.env_args, &buf); // Parse the string. while (*buf && ((!instr && !isspace(*buf)) || (instr && *buf != instr))) { buf += 1; } // If we did find the end of the string... if (*buf) { if (instr) instr = '\0'; // Reset stuff. *buf = '\0'; buf += 1; start = buf; } else if (instr) bc_error(BC_ERR_FATAL_OPTION, 0, start); } // If we have whitespace, eat it. else buf += 1; } // Make sure to push a NULL pointer at the end. buf = NULL; bc_vec_push(&vm.env_args, &buf); // Parse the arguments. bc_args((int) vm.env_args.len - 1, bc_vec_item(&vm.env_args, 0), false); } /** * Gets the {B,D}C_LINE_LENGTH.
* @param var The environment variable to pull it from. * @return The line length. */ static size_t bc_vm_envLen(const char *var) { char *lenv = bc_vm_getenv(var); size_t i, len = BC_NUM_PRINT_WIDTH; int num; // Return the default with none. if (lenv == NULL) return len; len = strlen(lenv); // Figure out if it's a number. for (num = 1, i = 0; num && i < len; ++i) num = isdigit(lenv[i]); // If it is a number... if (num) { // Parse it and clamp it if needed. len = (size_t) atoi(lenv) - 1; if (len == 1 || len >= UINT16_MAX) len = BC_NUM_PRINT_WIDTH; } // Set the default. else len = BC_NUM_PRINT_WIDTH; bc_vm_getenvFree(lenv); return len; } #endif // BC_ENABLE_LIBRARY void bc_vm_shutdown(void) { BC_SIG_ASSERT_LOCKED; #if BC_ENABLE_NLS if (vm.catalog != BC_VM_INVALID_CATALOG) catclose(vm.catalog); #endif // BC_ENABLE_NLS #if BC_ENABLE_HISTORY // This must always run to ensure that the terminal is back to normal, i.e., // has raw mode disabled. if (BC_TTY) bc_history_free(&vm.history); #endif // BC_ENABLE_HISTORY #ifndef NDEBUG #if !BC_ENABLE_LIBRARY bc_vec_free(&vm.env_args); free(vm.env_args_buffer); bc_vec_free(&vm.files); bc_vec_free(&vm.exprs); if (BC_PARSE_IS_INITED(&vm.read_prs, &vm.prog)) { bc_vec_free(&vm.read_buf); bc_parse_free(&vm.read_prs); } bc_parse_free(&vm.prs); bc_program_free(&vm.prog); bc_slabvec_free(&vm.other_slabs); bc_slabvec_free(&vm.main_slabs); bc_slabvec_free(&vm.main_const_slab); #endif // !BC_ENABLE_LIBRARY bc_vm_freeTemps(); #endif // NDEBUG #if !BC_ENABLE_LIBRARY // We always want to flush. bc_file_free(&vm.fout); bc_file_free(&vm.ferr); #endif // !BC_ENABLE_LIBRARY } void bc_vm_addTemp(BcDig *num) { // If we don't have room, just free. if (vm.temps_len == BC_VM_MAX_TEMPS) free(num); else { // Add to the buffer and length. 
temps_buf[vm.temps_len] = num; vm.temps_len += 1; } } BcDig* bc_vm_takeTemp(void) { if (!vm.temps_len) return NULL; vm.temps_len -= 1; return temps_buf[vm.temps_len]; } void bc_vm_freeTemps(void) { size_t i; BC_SIG_ASSERT_LOCKED; if (!vm.temps_len) return; // Free them all... for (i = 0; i < vm.temps_len; ++i) free(temps_buf[i]); vm.temps_len = 0; } inline size_t bc_vm_arraySize(size_t n, size_t size) { size_t res = n * size; if (BC_ERR(BC_VM_MUL_OVERFLOW(n, size, res))) bc_vm_fatalError(BC_ERR_FATAL_ALLOC_ERR); return res; } inline size_t bc_vm_growSize(size_t a, size_t b) { size_t res = a + b; if (BC_ERR(res >= SIZE_MAX || res < a)) bc_vm_fatalError(BC_ERR_FATAL_ALLOC_ERR); return res; } void* bc_vm_malloc(size_t n) { void* ptr; BC_SIG_ASSERT_LOCKED; ptr = malloc(n); if (BC_ERR(ptr == NULL)) { bc_vm_freeTemps(); ptr = malloc(n); if (BC_ERR(ptr == NULL)) bc_vm_fatalError(BC_ERR_FATAL_ALLOC_ERR); } return ptr; } void* bc_vm_realloc(void *ptr, size_t n) { void* temp; BC_SIG_ASSERT_LOCKED; temp = realloc(ptr, n); if (BC_ERR(temp == NULL)) { bc_vm_freeTemps(); temp = realloc(ptr, n); if (BC_ERR(temp == NULL)) bc_vm_fatalError(BC_ERR_FATAL_ALLOC_ERR); } return temp; } char* bc_vm_strdup(const char *str) { char *s; BC_SIG_ASSERT_LOCKED; s = strdup(str); if (BC_ERR(s == NULL)) { bc_vm_freeTemps(); s = strdup(str); if (BC_ERR(s == NULL)) bc_vm_fatalError(BC_ERR_FATAL_ALLOC_ERR); } return s; } #if !BC_ENABLE_LIBRARY void bc_vm_printf(const char *fmt, ...) { va_list args; BC_SIG_LOCK; va_start(args, fmt); bc_file_vprintf(&vm.fout, fmt, args); va_end(args); vm.nchars = 0; BC_SIG_UNLOCK; } #endif // !BC_ENABLE_LIBRARY void bc_vm_putchar(int c, BcFlushType type) { #if BC_ENABLE_LIBRARY bc_vec_pushByte(&vm.out, (uchar) c); #else // BC_ENABLE_LIBRARY bc_file_putchar(&vm.fout, type, (uchar) c); vm.nchars = (c == '\n' ? 0 : vm.nchars + 1); #endif // BC_ENABLE_LIBRARY } #if !BC_ENABLE_LIBRARY #ifdef __OpenBSD__ /** * Aborts with a message. 
This should never be called because I have carefully * made sure that the calls to pledge() and unveil() are correct, but it's here * just in case. * @param msg The message to print. */ BC_NORETURN static void bc_abortm(const char* msg) { bc_file_puts(&vm.ferr, bc_flush_none, msg); bc_file_puts(&vm.ferr, bc_flush_none, "; this is a bug"); bc_file_flush(&vm.ferr, bc_flush_none); abort(); } void bc_pledge(const char *promises, const char* execpromises) { int r = pledge(promises, execpromises); if (r) bc_abortm("pledge() failed"); } #if BC_ENABLE_EXTRA_MATH /** * A convenience and portability function for OpenBSD's unveil(). * @param path The path. * @param permissions The permissions for the path. */ static void bc_unveil(const char *path, const char *permissions) { int r = unveil(path, permissions); if (r) bc_abortm("unveil() failed"); } #endif // BC_ENABLE_EXTRA_MATH #else // __OpenBSD__ void bc_pledge(const char *promises, const char *execpromises) { BC_UNUSED(promises); BC_UNUSED(execpromises); } #if BC_ENABLE_EXTRA_MATH static void bc_unveil(const char *path, const char *permissions) { BC_UNUSED(path); BC_UNUSED(permissions); } #endif // BC_ENABLE_EXTRA_MATH #endif // __OpenBSD__ /** * Cleans unneeded variables, arrays, functions, strings, and constants when * done executing a line of stdin. This is to prevent memory usage growing * without bound. This is an idea from busybox. */ static void bc_vm_clean(void) { BcVec *fns = &vm.prog.fns; BcFunc *f = bc_vec_item(fns, BC_PROG_MAIN); BcInstPtr *ip = bc_vec_item(&vm.prog.stack, 0); bool good = ((vm.status && vm.status != BC_STATUS_QUIT) || vm.sig); // If all is good, go ahead and reset. if (good) bc_program_reset(&vm.prog); #if BC_ENABLED // bc has this extra condition. If it is not satisfied, it is in the middle of // a parse.
if (good && BC_IS_BC) good = !BC_PARSE_NO_EXEC(&vm.prs); #endif // BC_ENABLED #if DC_ENABLED // For dc, it is safe only when all of the results on the results stack are // safe, which means that they are temporaries or other things that don't // need strings or constants. if (BC_IS_DC) { size_t i; good = true; for (i = 0; good && i < vm.prog.results.len; ++i) { BcResult *r = (BcResult*) bc_vec_item(&vm.prog.results, i); good = BC_VM_SAFE_RESULT(r); } } #endif // DC_ENABLED // If this condition is true, we can get rid of strings, // constants, and code. if (good && vm.prog.stack.len == 1 && ip->idx == f->code.len) { #if BC_ENABLED if (BC_IS_BC) { bc_vec_popAll(&f->labels); bc_vec_popAll(&f->strs); bc_vec_popAll(&f->consts); // I can't clear out the other_slabs because it has functions, // consts, strings, vars, and arrays. It has strings from *other* // functions, specifically. bc_slabvec_clear(&vm.main_const_slab); bc_slabvec_clear(&vm.main_slabs); } #endif // BC_ENABLED #if DC_ENABLED // Note to self: you cannot delete strings and functions. Deal with it. if (BC_IS_DC) { bc_vec_popAll(vm.prog.consts); bc_slabvec_clear(&vm.main_const_slab); } #endif // DC_ENABLED bc_vec_popAll(&f->code); ip->idx = 0; } } /** * Process a bunch of text. * @param text The text to process. * @param is_stdin True if the text came from stdin, false otherwise. */ static void bc_vm_process(const char *text, bool is_stdin) { // Set up the parser. bc_parse_text(&vm.prs, text, is_stdin); do { #if BC_ENABLED // If the first token is the keyword define, then we need to do this // specially because bc thinks it may not be able to parse. if (vm.prs.l.t == BC_LEX_KW_DEFINE) vm.parse(&vm.prs); #endif // BC_ENABLED // Parse it all. while (BC_PARSE_CAN_PARSE(vm.prs)) vm.parse(&vm.prs); // Execute if possible. if(BC_IS_DC || !BC_PARSE_NO_EXEC(&vm.prs)) bc_program_exec(&vm.prog); assert(BC_IS_DC || vm.prog.results.len == 0); // Flush in interactive mode. 
 		if (BC_I) bc_file_flush(&vm.fout, bc_flush_save);
 	} while (vm.prs.l.t != BC_LEX_EOF);
 }
 #if BC_ENABLED
 /**
- * Ends an if statement that ends a file. This is to ensure that full parses
- * happen when a file finishes. Without this, bc thinks that it cannot parse
- * any further. But if we reach the end of a file, we know we can add an empty
- * else clause.
+ * Ends a series of if statements. This is to ensure that full parses happen
+ * when a file finishes or stdin has no more data. Without this, bc thinks that
+ * it cannot parse any further. But if we reach the end of a file or stdin has
+ * no more data, we know we can add an empty else clause.
  */
 static void bc_vm_endif(void) {
-
-	size_t i;
-	bool good;
-
-	// Not a problem if this is true.
-	if (BC_NO_ERR(!BC_PARSE_NO_EXEC(&vm.prs))) return;
-
-	good = true;
-
-	// Find an instance of a body that needs closing, i.e., a statement that did
-	// not have a right brace when it should have.
-	for (i = 0; good && i < vm.prs.flags.len; ++i) {
-		uint16_t flag = *((uint16_t*) bc_vec_item(&vm.prs.flags, i));
-		good = ((flag & BC_PARSE_FLAG_BRACE) != BC_PARSE_FLAG_BRACE);
-	}
-
-	// If we did not find such an instance...
-	if (good) {
-
-		// We set this to restore it later. We don't want the parser thinking
-		// that we are on stdin for this one because it will want more.
-		bool is_stdin = vm.is_stdin;
-
-		vm.is_stdin = false;
-
-		// Cheat and keep parsing empty else clauses until all of them are
-		// satisfied.
-		while (BC_PARSE_IF_END(&vm.prs)) bc_vm_process("else {}", false);
-
-		vm.is_stdin = is_stdin;
-	}
-	// If we reach here, a block was not properly closed, and we should error.
-	else bc_parse_err(&vm.prs, BC_ERR_PARSE_BLOCK);
+	bc_parse_endif(&vm.prs);
+	bc_program_exec(&vm.prog);
 }
 #endif // BC_ENABLED
 /**
  * Processes a file.
  * @param file The filename.
  */
 static void bc_vm_file(const char *file) {
 	char *data = NULL;
 	assert(!vm.sig_pop);
 	// Set up the lexer.
bc_lex_file(&vm.prs.l, file); BC_SIG_LOCK; // Read the file. data = bc_read_file(file); assert(data != NULL); BC_SETJMP_LOCKED(err); BC_SIG_UNLOCK; // Process it. bc_vm_process(data, false); #if BC_ENABLED // Make sure to end any open if statements. if (BC_IS_BC) bc_vm_endif(); #endif // BC_ENABLED err: BC_SIG_MAYLOCK; // Cleanup. free(data); bc_vm_clean(); // bc_program_reset(), called by bc_vm_clean(), resets the status. // We want it to clear the sig_pop variable in case it was set. if (vm.status == (sig_atomic_t) BC_STATUS_SUCCESS) BC_LONGJMP_STOP; BC_LONGJMP_CONT; } bool bc_vm_readLine(bool clear) { BcStatus s; bool good; // Clear the buffer if desired. if (clear) bc_vec_empty(&vm.buffer); // Empty the line buffer. bc_vec_empty(&vm.line_buf); if (vm.eof) return false; do { // bc_read_line() must always return either BC_STATUS_SUCCESS or // BC_STATUS_EOF. Everything else, it and whatever it calls, must jump // out instead. s = bc_read_line(&vm.line_buf, ">>> "); vm.eof = (s == BC_STATUS_EOF); } while (!(s) && !vm.eof && vm.line_buf.len < 1); good = (vm.line_buf.len > 1); // Concat if we found something. if (good) bc_vec_concat(&vm.buffer, vm.line_buf.v); return good; } /** * Processes text from stdin. */ static void bc_vm_stdin(void) { bool clear = true; vm.is_stdin = true; // Set up the lexer. bc_lex_file(&vm.prs.l, bc_program_stdin_name); // These are global so that the dc lexer can access them, but they are tied // to this function, really. Well, this and bc_vm_readLine(). These are the // reason that we have vm.is_stdin to tell the dc lexer if we are reading // from stdin. Well, both lexers care. And the reason they care is so that // if a comment or a string goes across multiple lines, the lexer can // request more data from stdin until the comment or string is ended. 
BC_SIG_LOCK; bc_vec_init(&vm.buffer, sizeof(uchar), BC_DTOR_NONE); bc_vec_init(&vm.line_buf, sizeof(uchar), BC_DTOR_NONE); BC_SETJMP_LOCKED(err); BC_SIG_UNLOCK; // This label exists because errors can cause jumps to end up at the err label // below. If that happens, and the error should be cleared and execution // continue, then we need to jump back. restart: // While we still read data from stdin. while (bc_vm_readLine(clear)) { size_t len = vm.buffer.len - 1; const char *str = vm.buffer.v; // We don't want to clear the buffer when the line ends with a backslash // because a backslash newline is special in bc. clear = (len < 2 || str[len - 2] != '\\' || str[len - 1] != '\n'); if (!clear) continue; // Process the data. bc_vm_process(vm.buffer.v, true); if (vm.eof) break; else bc_vm_clean(); } #if BC_ENABLED // End the if statements. if (BC_IS_BC) bc_vm_endif(); #endif // BC_ENABLED err: BC_SIG_MAYLOCK; // Cleanup. bc_vm_clean(); #if !BC_ENABLE_MEMCHECK assert(vm.status != BC_STATUS_ERROR_FATAL); vm.status = vm.status == BC_STATUS_QUIT || !BC_I ? vm.status : BC_STATUS_SUCCESS; #else // !BC_ENABLE_MEMCHECK vm.status = vm.status == BC_STATUS_ERROR_FATAL || vm.status == BC_STATUS_QUIT || !BC_I ? vm.status : BC_STATUS_SUCCESS; #endif // !BC_ENABLE_MEMCHECK if (!vm.status && !vm.eof) { bc_vec_empty(&vm.buffer); BC_LONGJMP_STOP; BC_SIG_UNLOCK; goto restart; } #ifndef NDEBUG // Since these are tied to this function, free them here. bc_vec_free(&vm.line_buf); bc_vec_free(&vm.buffer); #endif // NDEBUG BC_LONGJMP_CONT; } #if BC_ENABLED /** * Loads a math library. * @param name The name of the library. * @param text The text of the source code. */ static void bc_vm_load(const char *name, const char *text) { bc_lex_file(&vm.prs.l, name); bc_parse_text(&vm.prs, text, false); while (vm.prs.l.t != BC_LEX_EOF) vm.parse(&vm.prs); } #endif // BC_ENABLED /** * Loads the default error messages. 
*/ static void bc_vm_defaultMsgs(void) { size_t i; vm.func_header = bc_err_func_header; // Load the error categories. for (i = 0; i < BC_ERR_IDX_NELEMS + BC_ENABLED; ++i) vm.err_ids[i] = bc_errs[i]; // Load the error messages. for (i = 0; i < BC_ERR_NELEMS; ++i) vm.err_msgs[i] = bc_err_msgs[i]; } /** * Loads the error messages for the locale. If NLS is disabled, this just loads * the default messages. */ static void bc_vm_gettext(void) { #if BC_ENABLE_NLS uchar id = 0; int set = 1, msg = 1; size_t i; // If no locale, load the defaults. if (vm.locale == NULL) { vm.catalog = BC_VM_INVALID_CATALOG; bc_vm_defaultMsgs(); return; } vm.catalog = catopen(BC_MAINEXEC, NL_CAT_LOCALE); // If no catalog, load the defaults. if (vm.catalog == BC_VM_INVALID_CATALOG) { bc_vm_defaultMsgs(); return; } // Load the function header. vm.func_header = catgets(vm.catalog, set, msg, bc_err_func_header); // Load the error categories. for (set += 1; msg <= BC_ERR_IDX_NELEMS + BC_ENABLED; ++msg) vm.err_ids[msg - 1] = catgets(vm.catalog, set, msg, bc_errs[msg - 1]); i = 0; id = bc_err_ids[i]; // Load the error messages. In order to understand this loop, you must know // the order of messages and categories in the enum and in the locale files. for (set = id + 3, msg = 1; i < BC_ERR_NELEMS; ++i, ++msg) { if (id != bc_err_ids[i]) { msg = 1; id = bc_err_ids[i]; set = id + 3; } vm.err_msgs[i] = catgets(vm.catalog, set, msg, bc_err_msgs[i]); } #else // BC_ENABLE_NLS bc_vm_defaultMsgs(); #endif // BC_ENABLE_NLS } /** * Starts execution. Really, this is a function of historical accident; it could * probably be combined with bc_vm_boot(), but I don't care enough. Really, this * function starts when execution of bc or dc source code starts. */ static void bc_vm_exec(void) { size_t i; bool has_file = false; BcVec buf; #if BC_ENABLED // Load the math libraries. if (BC_IS_BC && (vm.flags & BC_FLAG_L)) { // Can't allow redefinitions in the builtin library. 
vm.no_redefine = true; bc_vm_load(bc_lib_name, bc_lib); #if BC_ENABLE_EXTRA_MATH if (!BC_IS_POSIX) bc_vm_load(bc_lib2_name, bc_lib2); #endif // BC_ENABLE_EXTRA_MATH // Make sure to clear this. vm.no_redefine = false; // Execute to ensure that all is hunky dory. Without this, scale can be // set improperly. bc_program_exec(&vm.prog); } #endif // BC_ENABLED // If there are expressions to execute... if (vm.exprs.len) { size_t len = vm.exprs.len - 1; bool more; BC_SIG_LOCK; // Create this as a buffer for reading into. bc_vec_init(&buf, sizeof(uchar), BC_DTOR_NONE); #ifndef NDEBUG BC_SETJMP_LOCKED(err); #endif // NDEBUG BC_SIG_UNLOCK; // Prepare the lexer. bc_lex_file(&vm.prs.l, bc_program_exprs_name); // Process the expressions one at a time. do { more = bc_read_buf(&buf, vm.exprs.v, &len); bc_vec_pushByte(&buf, '\0'); bc_vm_process(buf.v, false); bc_vec_popAll(&buf); } while (more); BC_SIG_LOCK; bc_vec_free(&buf); #ifndef NDEBUG BC_UNSETJMP; #endif // NDEBUG BC_SIG_UNLOCK; // Sometimes, executing expressions means we need to quit. if (!vm.no_exprs && vm.exit_exprs) return; } // Process files. for (i = 0; i < vm.files.len; ++i) { char *path = *((char**) bc_vec_item(&vm.files, i)); if (!strcmp(path, "")) continue; has_file = true; bc_vm_file(path); } #if BC_ENABLE_EXTRA_MATH // These are needed for the pseudo-random number generator. bc_unveil("/dev/urandom", "r"); bc_unveil("/dev/random", "r"); bc_unveil(NULL, NULL); #endif // BC_ENABLE_EXTRA_MATH #if BC_ENABLE_HISTORY // We need to keep tty if history is enabled, and we need to keep rpath for // the times when we read from /dev/urandom. if (BC_TTY && !vm.history.badTerm) { bc_pledge(bc_pledge_end_history, NULL); } else #endif // BC_ENABLE_HISTORY { bc_pledge(bc_pledge_end, NULL); } #if BC_ENABLE_AFL // This is the thing that makes fuzzing with AFL++ so fast. If you move this // back, you won't cause any problems, but fuzzing will slow down. 
If you // move this forward, you won't fuzz anything because you will be skipping // the reading from stdin. __AFL_INIT(); #endif // BC_ENABLE_AFL // Execute from stdin. bc always does. if (BC_IS_BC || !has_file) bc_vm_stdin(); // These are all protected by ifndef NDEBUG because if these are needed, bc is // going to exit anyway, and I see no reason to include this code in a release // build when the OS is going to free all of the resources anyway. #ifndef NDEBUG return; err: BC_SIG_MAYLOCK; bc_vec_free(&buf); BC_LONGJMP_CONT; #endif // NDEBUG } void bc_vm_boot(int argc, char *argv[]) { int ttyin, ttyout, ttyerr; bool tty; const char* const env_len = BC_IS_BC ? "BC_LINE_LENGTH" : "DC_LINE_LENGTH"; const char* const env_args = BC_IS_BC ? "BC_ENV_ARGS" : "DC_ENV_ARGS"; // We need to know which of stdin, stdout, and stderr are tty's. ttyin = isatty(STDIN_FILENO); ttyout = isatty(STDOUT_FILENO); ttyerr = isatty(STDERR_FILENO); tty = (ttyin != 0 && ttyout != 0 && ttyerr != 0); vm.flags |= ttyin ? BC_FLAG_TTYIN : 0; vm.flags |= tty ? BC_FLAG_TTY : 0; vm.flags |= ttyin && ttyout ? BC_FLAG_I : 0; // Set up signals. bc_vm_sigaction(); // Initialize some vm stuff. This is separate to make things easier for the // library. bc_vm_init(); // Explicitly set this in case NULL isn't all zeroes. vm.file = NULL; // Set the error messages. bc_vm_gettext(); // Initialize the output file buffers. They each take portions of the global // buffer. stdout gets more because it will probably have more data. bc_file_init(&vm.ferr, STDERR_FILENO, output_bufs + BC_VM_STDOUT_BUF_SIZE, BC_VM_STDERR_BUF_SIZE); bc_file_init(&vm.fout, STDOUT_FILENO, output_bufs, BC_VM_STDOUT_BUF_SIZE); // Set the input buffer to the rest of the global buffer. vm.buf = output_bufs + BC_VM_STDOUT_BUF_SIZE + BC_VM_STDERR_BUF_SIZE; // Set the line length by environment variable. vm.line_len = (uint16_t) bc_vm_envLen(env_len); // Clear the files and expressions vectors, just in case. 
This marks them as // *not* allocated. bc_vec_clear(&vm.files); bc_vec_clear(&vm.exprs); #if !BC_ENABLE_LIBRARY // Initialize the slab vectors. bc_slabvec_init(&vm.main_const_slab); bc_slabvec_init(&vm.main_slabs); bc_slabvec_init(&vm.other_slabs); #endif // !BC_ENABLE_LIBRARY // Initialize the program and main parser. These have to be in this order // because the program has to be initialized first, since a pointer to it is // passed to the parser. bc_program_init(&vm.prog); bc_parse_init(&vm.prs, &vm.prog, BC_PROG_MAIN); #if BC_ENABLED // bc checks this environment variable to see if it should run in standard // mode. if (BC_IS_BC) { char* var = bc_vm_getenv("POSIXLY_CORRECT"); vm.flags |= BC_FLAG_S * (var != NULL); bc_vm_getenvFree(var); } #endif // BC_ENABLED // Set defaults. vm.flags |= BC_TTY ? BC_FLAG_P | BC_FLAG_R : 0; vm.flags |= BC_I ? BC_FLAG_Q : 0; #if BC_ENABLED if (BC_IS_BC && BC_I) { // Set whether we print the banner or not. bc_vm_setenvFlag("BC_BANNER", BC_DEFAULT_BANNER, BC_FLAG_Q); } #endif // BC_ENABLED // Are we in TTY mode? if (BC_TTY) { const char* const env_tty = BC_IS_BC ? "BC_TTY_MODE" : "DC_TTY_MODE"; int env_tty_def = BC_IS_BC ? BC_DEFAULT_TTY_MODE : DC_DEFAULT_TTY_MODE; const char* const env_prompt = BC_IS_BC ? "BC_PROMPT" : "DC_PROMPT"; int env_prompt_def = BC_IS_BC ? BC_DEFAULT_PROMPT : DC_DEFAULT_PROMPT; // Set flags for TTY mode and prompt. bc_vm_setenvFlag(env_tty, env_tty_def, BC_FLAG_TTY); bc_vm_setenvFlag(env_prompt, tty ? env_prompt_def : 0, BC_FLAG_P); #if BC_ENABLE_HISTORY // If TTY mode is used, activate history. if (BC_TTY) bc_history_init(&vm.history); #endif // BC_ENABLE_HISTORY } // Process environment and command-line arguments. bc_vm_envArgs(env_args); bc_args(argc, argv, true); // If we are in interactive mode... if (BC_I) { const char* const env_sigint = BC_IS_BC ? "BC_SIGINT_RESET" : "DC_SIGINT_RESET"; int env_sigint_def = BC_IS_BC ? 
BC_DEFAULT_SIGINT_RESET : DC_DEFAULT_SIGINT_RESET; // Set whether we reset on SIGINT or not. bc_vm_setenvFlag(env_sigint, env_sigint_def, BC_FLAG_SIGINT); } #if BC_ENABLED // Disable global stacks in POSIX mode. if (BC_IS_POSIX) vm.flags &= ~(BC_FLAG_G); #endif // BC_ENABLED #if BC_ENABLED // Print the banner if allowed. We have to be in bc, in interactive mode, // and not be quieted by command-line option or environment variable. if (BC_IS_BC && BC_I && (vm.flags & BC_FLAG_Q)) { bc_vm_info(NULL); bc_file_putchar(&vm.fout, bc_flush_none, '\n'); bc_file_flush(&vm.fout, bc_flush_none); } #endif // BC_ENABLED BC_SIG_UNLOCK; // Start executing. bc_vm_exec(); } #endif // !BC_ENABLE_LIBRARY void bc_vm_init(void) { BC_SIG_ASSERT_LOCKED; #if !BC_ENABLE_LIBRARY // Set up the constant zero. bc_num_setup(&vm.zero, vm.zero_num, BC_VM_ONE_CAP); #endif // !BC_ENABLE_LIBRARY // Set up more constant BcNum's. bc_num_setup(&vm.one, vm.one_num, BC_VM_ONE_CAP); bc_num_one(&vm.one); // Set up more constant BcNum's. memcpy(vm.max_num, bc_num_bigdigMax, bc_num_bigdigMax_size * sizeof(BcDig)); memcpy(vm.max2_num, bc_num_bigdigMax2, bc_num_bigdigMax2_size * sizeof(BcDig)); bc_num_setup(&vm.max, vm.max_num, BC_NUM_BIGDIG_LOG10); bc_num_setup(&vm.max2, vm.max2_num, BC_NUM_BIGDIG_LOG10); vm.max.len = bc_num_bigdigMax_size; vm.max2.len = bc_num_bigdigMax2_size; // Set up the maxes for the globals. vm.maxes[BC_PROG_GLOBALS_IBASE] = BC_NUM_MAX_POSIX_IBASE; vm.maxes[BC_PROG_GLOBALS_OBASE] = BC_MAX_OBASE; vm.maxes[BC_PROG_GLOBALS_SCALE] = BC_MAX_SCALE; #if BC_ENABLE_EXTRA_MATH vm.maxes[BC_PROG_MAX_RAND] = ((BcRand) 0) - 1; #endif // BC_ENABLE_EXTRA_MATH #if BC_ENABLED #if !BC_ENABLE_LIBRARY // bc has a higher max ibase when it's not in POSIX mode. 
if (BC_IS_BC && !BC_IS_POSIX) #endif // !BC_ENABLE_LIBRARY { vm.maxes[BC_PROG_GLOBALS_IBASE] = BC_NUM_MAX_IBASE; } #endif // BC_ENABLED } #if BC_ENABLE_LIBRARY void bc_vm_atexit(void) { bc_vm_shutdown(); #ifndef NDEBUG bc_vec_free(&vm.jmp_bufs); #endif // NDEBUG } #else // BC_ENABLE_LIBRARY int bc_vm_atexit(int status) { // Set the status correctly. int s = BC_STATUS_IS_ERROR(status) ? status : BC_STATUS_SUCCESS; bc_vm_shutdown(); #ifndef NDEBUG bc_vec_free(&vm.jmp_bufs); #endif // NDEBUG return s; } #endif // BC_ENABLE_LIBRARY diff --git a/tests/bc/scripts/all.txt b/tests/bc/scripts/all.txt index 4ebfe5643c3d..0a8d2fe17c6c 100644 --- a/tests/bc/scripts/all.txt +++ b/tests/bc/scripts/all.txt @@ -1,16 +1,18 @@ multiply.bc divide.bc subtract.bc add.bc print.bc parse.bc array.bc atan.bc bessel.bc functions.bc globals.bc len.bc rand.bc references.bc screen.bc strings2.bc +ifs.bc +ifs2.bc diff --git a/tests/bc/scripts/ifs.bc b/tests/bc/scripts/ifs.bc new file mode 100644 index 000000000000..9d6ab2dbb0ef --- /dev/null +++ b/tests/bc/scripts/ifs.bc @@ -0,0 +1,49 @@ +#! /usr/bin/bc -q + +a = 1 +b = 2 +c = 3 + +if (a == 1) if (b == 2) if (c == 3) print "Yay!\n" + +define void g(x) { + print "g: x: ", x, "\n" +} + +if (a == 1) { + if (b == 2) { + if (c == 3) { + g(5) + } + } +} + +define void h(x) { + print "h: x: ", x, "\n" +} + +if (z == 0) + for (i = 0; i < 2; ++i) + if (a == 1) + for (j = 0; j < 2; ++j) + if (b == 2) + for (k = 0; k < 2; ++k) + if (c == 3) h(k) + +define void i(x) { + print "i: x: ", x, "\n" +} + +if (z == 0) { + for (i = 0; i < 2; ++i) { + if (a == 1) { + for (j = 0; j < 2; ++j) { + if (b == 2) { + for (k = 0; k < 2; ++k) { + if (c == 3) i(k) + } + } + } + } + } +} diff --git a/tests/bc/scripts/ifs.txt b/tests/bc/scripts/ifs.txt new file mode 100644 index 000000000000..56e07330f1f2 --- /dev/null +++ b/tests/bc/scripts/ifs.txt @@ -0,0 +1,18 @@ +Yay! 
+g: x: 5 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +i: x: 0 +i: x: 1 +i: x: 0 +i: x: 1 +i: x: 0 +i: x: 1 +i: x: 0 +i: x: 1 diff --git a/tests/bc/scripts/ifs2.bc b/tests/bc/scripts/ifs2.bc new file mode 100644 index 000000000000..052ef06ee4e3 --- /dev/null +++ b/tests/bc/scripts/ifs2.bc @@ -0,0 +1,33 @@ +#! /usr/bin/bc -q + +a = 1 +b = 2 +c = 3 + +if (a == 1) if (b == 2) if (c == 3) print "Yay!\n" + +define void g(x) { + print "g: x: ", x, "\n" +} + +if (a == 1) { + if (b == 2) { + if (c == 3) { + g(5) + } + } +} + +define void h(x) { + print "h: x: ", x, "\n" +} + +if (z == 0) + for (i = 0; i < 2; ++i) + for (l = 0; l < 2; ++l) + if (a == 1) + for (j = 0; j < 2; ++j) + for (m = 0; m < 2; ++m) + if (b == 2) + for (k = 0; k < 2; ++k) + if (c == 3) h(k) diff --git a/tests/bc/scripts/ifs2.txt b/tests/bc/scripts/ifs2.txt new file mode 100644 index 000000000000..b226e98ad44b --- /dev/null +++ b/tests/bc/scripts/ifs2.txt @@ -0,0 +1,34 @@ +Yay! +g: x: 5 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 +h: x: 0 +h: x: 1 diff --git a/tests/bc/stdin2.txt b/tests/bc/stdin2.txt index f260cfa7dbcf..a7f1981c6658 100644 --- a/tests/bc/stdin2.txt +++ b/tests/bc/stdin2.txt @@ -1 +1,4 @@ for (i = 0; i < 3; ++i) if (2 < 3) 1 +if (3 < 4) for (i = 0; i < 3; ++i) if (4 < 5) 2 +for (j = 0; j < 3; ++j) if (5 < 6) for (i = 0; i < 3; ++i) if (4 < 5) 3 +if (6 < 7) for (j = 0; j < 3; ++j) if (5 < 6) for (i = 0; i < 3; ++i) if (4 < 5) 4 diff --git a/tests/bc/stdin2_results.txt b/tests/bc/stdin2_results.txt index e8183f05f5db..43e2b02f53f2 100644 --- a/tests/bc/stdin2_results.txt +++ b/tests/bc/stdin2_results.txt @@ -1,3 +1,24 @@ 1 1 1 +2 +2 +2 +3 +3 +3 +3 +3 +3 +3 +3 +3 +4 +4 +4 +4 +4 +4 +4 +4 +4 diff --git a/tests/stdin.sh b/tests/stdin.sh index 
c9e02253c30a..69e6f2cabf34 100755 --- a/tests/stdin.sh +++ b/tests/stdin.sh @@ -1,103 +1,103 @@ #! /bin/sh # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2018-2021 Gavin D. Howard and contributors. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # # * Redistributions of source code must retain the above copyright notice, this # list of conditions and the following disclaimer. # # * Redistributions in binary form must reproduce the above copyright notice, # this list of conditions and the following disclaimer in the documentation # and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE # POSSIBILITY OF SUCH DAMAGE. # set -e script="$0" testdir=$(dirname "$script") . "$testdir/../scripts/functions.sh" outputdir=${BC_TEST_OUTPUT_DIR:-$testdir} # Command-line processing. if [ "$#" -lt 1 ]; then printf 'usage: %s dir [exe [args...]]\n' "$0" printf 'valid dirs are:\n' printf '\n' cat "$testdir/all.txt" printf '\n' exit 1 fi d="$1" shift if [ "$#" -gt 0 ]; then exe="$1" shift else exe="$testdir/../bin/$d" fi out="$outputdir/${d}_outputs/stdin_results.txt" outdir=$(dirname "$out") # Make sure the directory exists. if [ ! 
-d "$outdir" ]; then mkdir -p "$outdir" fi # Set stuff for the correct calculator. if [ "$d" = "bc" ]; then options="-lq" else options="-x" fi rm -f "$out" # I use these, so unset them to make the tests work. unset BC_ENV_ARGS unset BC_LINE_LENGTH unset DC_ENV_ARGS unset DC_LINE_LENGTH set +e printf 'Running %s stdin tests...' "$d" # Run the file through stdin. cat "$testdir/$d/stdin.txt" | "$exe" "$@" "$options" > "$out" 2> /dev/null checktest "$d" "$?" "stdin" "$testdir/$d/stdin_results.txt" "$out" # bc has some more tests; run those. if [ "$d" = "bc" ]; then cat "$testdir/$d/stdin1.txt" | "$exe" "$@" "$options" > "$out" 2> /dev/null - checktest "$d" "$?" "stdin" "$testdir/$d/stdin1_results.txt" "$out" + checktest "$d" "$?" "stdin1" "$testdir/$d/stdin1_results.txt" "$out" cat "$testdir/$d/stdin2.txt" | "$exe" "$@" "$options" > "$out" 2> /dev/null - checktest "$d" "$?" "stdin" "$testdir/$d/stdin2_results.txt" "$out" + checktest "$d" "$?" "stdin2" "$testdir/$d/stdin2_results.txt" "$out" fi rm -f "$out" exec printf 'pass\n' diff --git a/vs/bin/some.txt b/vs/bin/some.txt deleted file mode 100644 index e69de29bb2d1..000000000000 diff --git a/vs/tests/some.txt b/vs/tests/some.txt deleted file mode 100644 index e69de29bb2d1..000000000000
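The allocation helpers in the `vm.c` context of this diff, `bc_vm_arraySize()` and `bc_vm_growSize()`, refuse sizes that would overflow `size_t` before any memory is requested. The following is a minimal sketch of that kind of check, not the code from `bc` itself. It assumes a division-based test in place of the `BC_VM_MUL_OVERFLOW` macro (whose definition is not shown in this diff), and it returns a flag instead of calling `bc_vm_fatalError()`. The names `checked_array_size` and `checked_grow_size` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of bc): computes n * size.
// Returns false if the product would not fit in a size_t; otherwise stores
// the product in *res and returns true.
static bool checked_array_size(size_t n, size_t size, size_t *res)
{
	// Assumption: a division-based test stands in for BC_VM_MUL_OVERFLOW.
	if (n != 0 && size > SIZE_MAX / n) return false;

	*res = n * size;

	return true;
}

// Hypothetical helper (not part of bc): computes a + b.
// Mirrors the condition visible in bc_vm_growSize(): a sum that wraps
// around, or that lands exactly on SIZE_MAX, is rejected.
static bool checked_grow_size(size_t a, size_t b, size_t *res)
{
	// Unsigned addition wraps on overflow, so a wrapped sum is less than a.
	size_t sum = a + b;

	if (sum >= SIZE_MAX || sum < a) return false;

	*res = sum;

	return true;
}
```

A caller would compute the byte count with `checked_array_size(n, sizeof(T), &bytes)` and only pass `bytes` to `malloc()` when the helper returns true. In `bc`, the failure path is fatal instead, which keeps every call site free of error handling.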