Page MenuHomeFreeBSD

(WIP) Rewrite makesyscalls.sh in Lua
ClosedPublic

Authored by kevans on Sep 24 2019, 3:43 AM.
Tags
None
Referenced Files
F103266697: D21775.diff
Fri, Nov 22, 8:27 PM
Unknown Object (File)
Thu, Nov 21, 5:00 AM
Unknown Object (File)
Tue, Nov 19, 7:41 PM
Unknown Object (File)
Tue, Nov 12, 2:16 AM
Unknown Object (File)
Sat, Nov 9, 10:47 AM
Unknown Object (File)
Mon, Nov 4, 8:10 AM
Unknown Object (File)
Thu, Oct 24, 12:15 AM
Unknown Object (File)
Oct 23 2024, 3:19 AM
Subscribers

Details

Summary

Here's a nice checkpoint... this gets us the rough equivalent to what we have now (plus some stuff mixed in from CheriBSD that's mostly untested), so we can start quibbling over architecture and other fun stuff...

To start, I've included a diff of the generated files. There's a diff for these reasons:

  • I've integrated Cheri's isptrtype and how it uses it (I thin k faithfully), so some systrace bits get extra casts because we didn't consider them pointers
  • The new approach screws with whitespace much less, and I've tried to preserve it as faithfully as possible (because it's easier in many regards)
  • I didn't understand how the comments are supposed to work and got annoyed at the awk bits, so the comments(?) got sucked into various C comments (e.g. "resuba (BSD/OS 2.x)" vs. "resuba") because the format description broke down in my head around this point, where it seemed like resuba should be function name and (BSD/OS 2.x) the comment.

For now, this has some other major problems:

1.) It's a monolithic lua script, and can probably be broken up

2.) Diagnostics are crap, dumping out the entire syscall line in most cases when it hits a failure but no line numbers or other fun stuff

Most importantly, I'm unsure if it's even close to structured how we want or what we even want here. The current approach is basically:

  • Grab capabilities.conf entries
  • Setup stuff
  • Process the syscall file; effectively run each line through pattern_table. This part either ignores the line, writes it to sysinc (#include), buffers the line as-is (other preprocessor directives), or collapses lines until it reaches another syscall looking definition (line starts with a number) stripping out any trailing escapes along the way
  • Then it processes the buffer, each line of which is either a preprocessor directive to be written out as-is or a standardized (fro m a whitespace/newline perspective) syscall definition

I think this is at least a good discussion piece, and we can rewrite all of it as many times as we need to.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 26669

Event Timeline

Slightly better version that kills off the intermediate buffering step. All lines are processed as we get them and decide they're complete, where complete is defined as:

  • Doesn't end in a comma
  • Doesn't end with a trailing backslash
  • If it has an opening brace, it must have a closing brace
kevans edited the summary of this revision. (Show Details)

Finalize some more details: build and install /usr/libexec/bsdlua, revert the src.sys.mk hack and just conditionally set LUA in the various sysent Makefiles.

The config for bsdlua is just the stock config from Lua 5.3.5, I've made no changes. Versioning is left out of the name (as opposed to standard ports policy of including it) so users can't assume the version from looking at it. I considered fundamentally breaking it in some way so entities external to the base system can't use it OOTB without jumping through hoops, but decided that was going too far when it's already out of PATH and disguised as bsdlua.

but decided that was going too far when it's already out of PATH and disguised as bsdlua.

I agree, disguising it and putting it in libexec is sufficient. I was thinking about it this morning and thought of this same approach, except calling it flua.

One question, is it feasible to generate (modulo whitespace) identical files (even if we follow up almost immediately with changes to the generator and output)? IMO being able to trivially show that it's functionally equivalent should quiet any concerns. Based on your initial comments it seems perhaps not.

some quite long lines in here; we don't have a good established lua style yet

sys/tools/makesyscalls.lua
37–38

What about a temporary directory, using fixed filenames in there? That might be easier to extend to keep around.

but decided that was going too far when it's already out of PATH and disguised as bsdlua.

I agree, disguising it and putting it in libexec is sufficient. I was thinking about it this morning and thought of this same approach, except calling it flua.

One question, is it feasible to generate (modulo whitespace) identical files (even if we follow up almost immediately with changes to the generator and output)? IMO being able to trivially show that it's functionally equivalent should quiet any concerns. Based on your initial comments it seems perhaps not.

Sure- I think I see where I was getting the comment part wrong, so that will be fixed in the next iteration.

some quite long lines in here; we don't have a good established lua style yet

Hmm... I thought style.lua(9) allowed up to 120 columns, but it looks like I instead wrote it for 80 with a note that it's ok to wrap much earlier to make it look better if needed. Will fix in the next iteration.

sys/tools/makesyscalls.lua
37–38

Stock lua doesn't provide a mkdir facility, so I'd need to hack that into bsdlua (or flua, not attached to any particular name)... that wouldn't be a major problem, though, and the kind of liberty we can take with private lua.

I'll likely take the minimal lfs out of liblua, add mkdir/rmdir functions (which is still a subset of real lfs functionality), then share between the two with the new functionality ripped out of loader builds.

This way we can say you either need base flua/bsdlua or lua+lfs from ports.

Note a use case which must not be broken: often the workstation where syscall files are regenerated is running some kind of stable or even a release.

In D21775#475239, @kib wrote:

Note a use case which must not be broken: often the workstation where syscall files are regenerated is running some kind of stable or even a release.

I think we'll have to make the compromise that flua become a bootstrap tool and at least toolchain/kernel-toolchain must be built before one can regenerate syscall files, then I fix the LUA references to use bootstrap version if it exists.

The top-level sysent target is trivial here, but I am not sure how cleanly I can do the individual sysent targets. That may be the other concession, that you have to run the top-level sysent target if you're relying on bootstrap. I wouldn't think that's a major problem, though.

imp requested changes to this revision.Sep 24 2019, 5:17 PM

I'm currently in transit from EuroBSDcon and will likely be out of commission for a couple of days. I'd like to request the favor of getting a chance to review this once I've recovered before you push it in. Thanks!

This revision now requires changes to proceed.Sep 24 2019, 5:17 PM

Build-related fun:

  • Bootstrap flua if we're bootstrapping < 1300048 (arbitrary, will change upon later commit)
  • Prefer bootstrap flua over system flua if it's available in the top-level target
  • Provide a <src.lua.mk> to set some common variables we'll use in all of the individual sysent targets. This includes a specific flua to point to in case one of the individual sysent targets is being invoked and it exists... this is kind of hacky, but given that we're internalizing our lua I don't see a better option. I was going to add these bits to src.sys.mk, but this looks like a nightmare because <bsd.own.mk> is desirable here.

On the actual flua/makesyscalls side:

  • Pulled lfs from the stand/ build and added mkdir/rmdir
  • Reimplemented a really tiny subset of lua-posix; just enough to getpid(2)
  • Now uses /tmp/sysent.<pid> as the tempdir and create all files in there. One can set cleantmp = false in the script to leave the tempdir around, even on abort, for debugging purposes.
  • Various cleanup/fixes

I figured out where I was screwing up the comment format... I think including "(BSD/OS 2.x)" in comments about resuba and similar syscalls would be interesting and is likely a bug in the current makesyscalls.sh due to how the line gets split up, but we can fix that up later. The only divergence now is in whitespace -- still included in this review for easy review.

In D21775#475324, @imp wrote:

I'm currently in transit from EuroBSDcon and will likely be out of commission for a couple of days. I'd like to request the favor of getting a chance to review this once I've recovered before you push it in. Thanks!

Sure... this needs both buy-in and thorough review. The initial set of reviewers I added are what I consider "minimum buy-in" in order for this to work out.

Tagging @bdrewery, too, to make sure my build system atrocities are kept to a minimum... I don't know that we have much good precedent for build-only tools that we're hiding like this, but I don't know much. =)

I'm planning to make the syscall files generate *during the build*. It will need to happen quite early. Is lua already an early bootstrap tool?

My next question is why?

Makefile.inc1
2124–2130 ↗(On Diff #62518)

Is this actually needed in this change? I will need it later but I don't think it's needed yet.

I'm planning to make the syscall files generate *during the build*. It will need to happen quite early. Is lua already an early bootstrap tool?

Lua's just getting added as a tool here.

My next question is why?

My immediate reasoning is that my attempt to generate awk to simplify adding new COMPAT* levels to makesyscalls.sh was basically rejected, because there was at least some level of desire (from @brooks) to remove the embedding of awk in sh (see: D21718). As a nice side bonus, the lua version of this is about 4x faster on my local machine for generating all of the sysent files after touch sys/tools/makesyscalls.lua.

Makefile.inc1
2124–2130 ↗(On Diff #62518)

kib wants to maintain the ability to run sysent on older stable/releases where flua won't be installed yet (because it's just now existing), so my response to this was throw it into bootstrap tools for these systems and require bootstrap to have been built for the top-level sysent target to work.

Minimize diff to just makesyscalls.lua; the flua bits are getting broken out into a parent review, and converting sysent targets into a child.

I don't know lua yet so cannot review most of this. But I'm OK with expanding lua usage. One "regression" I see is that move from a heredoc to a ton of write_line lines. Can lua not continue a string on the next line? Or maybe we should have some kind of template file instead.

Can lua not continue a string on the next line?

It can, with something like:

> x=[[Hello
>> there]]
> print(x)
Hello
there

Prefer multiline string format over multiple subsequent calls to write_line of the same file in most places.

I'm slightly biased against the long format due to disliking dealing with style details near the end, especially when they're used in conjunction with string.format(), but it's a clear win for readability.

Over all this is a massive improvement. I was able to port a number of local changes quickly.

It would be nice if flua were in the path in buildenv. It's a little odd to have to run make toolchain before make sysent but I doubt make sysent is in anyone else's finger memory...

I ended up adding this error handling improvement making it work with CheriBSD's syscalls.master:

@@ -854,6 +876,9 @@
 
        -- Split flags
        for flag in allflags:gmatch("([^|]+)") do
+               if known_flags[flag] == nil then
+                       abort(1, "Unknown flag " .. flag .. " for " ..  sysnum)
+               end
                flags = flags | known_flags[flag]
        end

I have more changes locally, but they don't yet make sense to add to FreeBSD yet.

This revision was not accepted when it landed; it landed in state Needs Review.Nov 17 2019, 2:08 PM
This revision was automatically updated to reflect the committed changes.