
Split libc into libc and libsys
Needs ReviewPublic

Authored by ali_mashtizadeh.com on Mar 7 2018, 7:19 PM.

Details

Summary

Split libc.so into libc.so and libsys.so so that third-party tools can
interpose on system calls more easily. A tool replaces the libsys library
at runtime with its own version. In a follow-up patch, rtld will be linked
against libsys.a rather than all of libc. This patch is based on work from
the CheriBSD project.

Test Plan

Run make universe, test amd64 builds, and build open source
applications against the new version of the library. Compare the symbols
present in libc before and after the change. Run performance tests as
well.
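
The symbol comparison step can be sketched as a set difference over two sorted name lists. This is only an illustration: `missing_symbols` is a hypothetical helper, and it assumes the export lists were already extracted from the old and new libc.so and sorted in strcmp() order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: collect the names that appear in `before` but not
 * in `after`. Both arrays must be sorted in strcmp() order. At most `cap`
 * names are stored in `missing`; the return value is the total number of
 * missing names, which may exceed `cap`.
 */
size_t
missing_symbols(const char *const *before, size_t nbefore,
    const char *const *after, size_t nafter,
    const char **missing, size_t cap)
{
    size_t i = 0, j = 0, n = 0;

    while (i < nbefore) {
        /* Once `after` is exhausted, every remaining name is missing. */
        int c = (j < nafter) ? strcmp(before[i], after[j]) : -1;

        if (c == 0) {
            i++;
            j++;
        } else if (c < 0) {
            if (n < cap)
                missing[n] = before[i];
            n++;
            i++;
        } else {
            j++;    /* name only in `after`: a new export, not a loss */
        }
    }
    return n;
}
```

A non-zero return value means the patched library dropped an export that the unpatched one had, which is the case the test plan is meant to catch.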

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 15413
Build 15462: arc lint + arc unit

Event Timeline

bdrewery added inline comments.Mar 7 2018, 7:26 PM
lib/libc/gen/__pthread_mutex_init_calloc_cb_stub.c
43–44

What's going on here exactly? I think a comment is warranted.

kib added a subscriber: kib.Mar 7 2018, 8:45 PM

I do not understand this. There has been a desire to have something named libsys for a long time.

But: the library must have a dynamic version, with a defined and versioned ABI. If such a library is provided, rtld should link against it statically, and perhaps libc should depend on the library. libc_nosyscalls is not needed then. Depending on the exported interfaces (do we want apps to directly reference syscall symbols from libsys, or only provide libsys for libc and libthr use?), libc might be linked as a filter for libsys.

What you do seems to be 0.1 of the whole road in this direction, and it feels like an unfinished hack.

lib/libc/gen/__pthread_mutex_init_calloc_cb_stub.c
43–44

No, it is spelled __unused.

sys/amd64/include/asm.h
48

Why is this needed ?

In CheriBSD this is implemented as a low-impact hack which avoids touching libc.so.7. I'd be fine with looking at a more aggressive approach. At one point I did try making the libc linker script add libsyscalls to a syscall-less libc. My motivation for that approach vs linking libc.so.7 against libsyscalls was to make it easy to reuse the syscall-less libc in a sandbox where a completely different libsyscalls is used. I'm happy to try to support a move to another approach.

I don't care much how we spell the name. If there is ABI stuff other than syscalls that makes sense to put in there then a libsys, libsystem, or libabi might be a better name.

@ali_mashtizadeh.com: we should run the missing prototypes and the like through as separate revisions so they aren't part of whatever this diff becomes.

Thanks for the comments. I'll rework the diff:

  • Change the names to libabi and libc_noabi unless there's an objection
  • Add versioning
  • Test a build with libabi added to libc linker script
  • Remove the warning fixes for now

lib/libc/gen/__pthread_mutex_init_calloc_cb_stub.c
43–44

I think this part is now unnecessary. A bit of cruft from updating to HEAD from where I cross ported the changes.

I set "WARNS=2" in libsyscalls/Makefile to eliminate a few warnings and not require these changes.

As Brooks suggested we can take care of this in another diff and I'm happy to do that. I also saw a bit of cleanup in a few files that folks should appreciate.

sys/amd64/include/asm.h
48

This prevents requiring an ifdef in the amd64 version of setlogin.S and keeps the code essentially on par with the i386 version.

kib added a comment.EditedMar 8 2018, 10:35 AM

Thanks for the comments I'll rework the diff

  • Change the names to libabi and libc_noabi unless there's an objection
  • Add versioning
  • Test a build with libabi added to libc linker script
  • Remove the warning fixes for now

libabi is the worst name among all proposed. What does the library have to do with ABI ?

I do not want binaries to grow direct dependency on lib<whatever> (I will continue calling it libsys. I like the name). libsys.so.1 should be a dependency of the libc, most likely libc should be a filter over libsys. So no linker script modifications are necessary. But note the next question.

What symbols should libsys.so export? Should the symbols be internal sys_<syscallname>, with libc redirecting <syscallname> to sys_XXX? Or do we want to allow libsys to directly export raw versioned syscall names, to be bound by application references? I suspect that the latter is undesirable, in particular because some 'syscall' symbols are not syscalls at all but wrappers, e.g. open(2).

Another related question: should libsys export symbols for compat syscalls? I think yes.

brooks added a comment.Mar 8 2018, 7:22 PM
In D14609#307039, @kib wrote:

I do not want binaries to grow direct dependency on lib<whatever> (I will continue calling it libsys. I like the name). libsys.so.1 should be a dependency of the libc, most likely libc should be a filter over libsys. So no linker script modifications are necessary. But note the next question.

I'm fine with libsys.

What symbols should libsys.so export? Should the symbols be internal sys_<syscallname>, with libc redirecting <syscallname> to sys_XXX? Or do we want to allow libsys to directly export raw versioned syscall names, to be bound by application references? I suspect that the latter is undesirable, in particular because some 'syscall' symbols are not syscalls at all but wrappers, e.g. open(2).

I think having libsys.so export __sys_<syscallname> and having libc provide the other symbols makes sense. That decouples the syscall mechanism from libc and nothing more which is the desired effect.
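
A minimal sketch of that split, collapsed into one file for illustration. The names `__sys_demo_call`, `_demo_call`, and `demo_call` are hypothetical, and the "syscall" body is a plain function so the example runs anywhere; in the real layout the first function would live in libsys.so and the two wrappers in libc.so.

```c
#include <assert.h>

/*
 * Stand-in for a libsys entry point (hypothetical). A real one would
 * trap into the kernel; this one only computes a value. The
 * double-underscore name mirrors the naming used in this thread.
 */
int
__sys_demo_call(int arg)
{
    return arg + 1;
}

/*
 * libc side: the public names are weak definitions that forward to the
 * __sys_ symbol, so an application or libthr can still override them,
 * while replacing libsys changes only the syscall mechanism.
 */
__attribute__((weak)) int
_demo_call(int arg)
{
    return __sys_demo_call(arg);
}

__attribute__((weak)) int
demo_call(int arg)
{
    return __sys_demo_call(arg);
}
```

With this shape, a tool that swaps in its own libsys only has to provide the `__sys_` entry points; the wrapper logic for calls such as open(2) stays in libc.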

Another related question: should libsys export symbols for compat syscalls? I think yes.

I think so too.

Copy from my email regarding the proposed plan

I've made a rough sketch of the changes and it seems to be working on my machine. So I wanted to show you my plan for the libc/libsys split before I polish things up.

I will make libsys into an auxiliary library for libc. This means libsys will have to export all of the same weak symbols that libc exports. Also, the linker doesn't require an aux library (it's optional) so I'll enforce that in the generated stubs in libc.
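
A rough sketch of what such a generated fallback stub might look like. All names here are hypothetical and the message text is made up. It also assumes write(2) is reachable, which a real stub in a libsys-less libc could not assume, so the real thing would need a raw trap or some other reporting path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Append `s` at offset `off`, truncating to fit; returns the new offset. */
static size_t
demo_append(char *buf, size_t cap, size_t off, const char *s)
{
    while (*s != '\0' && off + 1 < cap)
        buf[off++] = *s++;
    if (cap > 0)
        buf[off] = '\0';
    return off;
}

/* Build the diagnostic without stdio, which itself depends on syscalls. */
size_t
demo_stub_message(char *buf, size_t cap, const char *name)
{
    size_t off = 0;

    off = demo_append(buf, cap, off, "libc: ");
    off = demo_append(buf, cap, off, name);
    off = demo_append(buf, cap, off, ": libsys not loaded\n");
    return off;
}

/* Body shared by every default __sys_<syscall> stub in this sketch. */
void
demo_missing_libsys(const char *name)
{
    char msg[128];
    size_t len = demo_stub_message(msg, sizeof(msg), name);
    /* Assumption: write(2) works here; see the note above. */
    ssize_t r = write(STDERR_FILENO, msg, len);

    (void)r;
    abort();
}
```

Each generated stub would then reduce to a call such as `demo_missing_libsys("__sys_read")`, and the filtering done by the linker would normally make sure it is never reached.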

*** Dynamic Libraries ***

libc.so:

  • Changed: __sys_<syscall> (write a default error string & abort), (same weak symbols as before)

libsys.so:

  • Include: __sys_<syscall> (syscall implementation), _<syscall> (weak symbol), <syscall> (weak symbol)
  • Include: compat calls
  • Include: error(), set_error_selector(), single threaded error
  • Versioning and symbols have to match libc to an extent for this to be an auxiliary library
  • See vDSO discussion below

*** Static libc.so ***

  • Roughly unchanged

*** Static libc_pic.a ***

I think the easiest way to generate libc_pic.a correctly (so that -fPIE applications can be supported transparently) is to create a new lib/libc_pic/Makefile that links the necessary files from libc and libsys. This shouldn't add to the build time. If -fPIE is not a concern, rtld can be manually linked against libc_pic.so and libsys_pic.so.

General:
I plan to move the platform specific assembly routines into the lib/libsys/* directory.


vDSO:

  • Encapsulating vDSO into libsys isn't too bad, and it seems some of the related functions we would bring in with vDSO are duplicated in RTLD.
  • Requires including __vdso_* functions, auxv.c, sysconf.c
  • Only a couple additional files reference auxv.c so that could be encapsulated in libsys as well. Some of these provide functions that are duplicated in rtld again.
  • The alternative for packages that want to interpose by replacing libsys without including vDSO would be to modify the Elf_Auxinfo in the parent process or at startup. Seems simple enough and I've looked into this for my record/replay system.
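
The Elf_Auxinfo alternative in the last bullet could look roughly like the following. This is a sketch under stated assumptions: `struct demo_auxinfo` is a simplified stand-in for the real Elf_Auxinfo (which uses a union for the value), the tag numbers are assumptions rather than values copied from the headers, and it presumes the timekeeping code falls back to a plain system call when the entry is absent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for Elf_Auxinfo (assumption, not the real layout). */
struct demo_auxinfo {
    long a_type;
    long a_val;
};

/* Assumed tag values, for illustration only. */
enum {
    DEMO_AT_NULL = 0,       /* terminates the vector */
    DEMO_AT_IGNORE = 1,     /* entry should be skipped */
    DEMO_AT_TIMEKEEP = 22   /* shared-page timekeeping data */
};

/*
 * Retag every entry of `type` as "ignore" so later consumers never see
 * it. Returns the number of entries that were masked.
 */
size_t
demo_auxv_mask(struct demo_auxinfo *auxv, long type)
{
    size_t n = 0;

    for (; auxv->a_type != DEMO_AT_NULL; auxv++) {
        if (auxv->a_type == type) {
            auxv->a_type = DEMO_AT_IGNORE;
            auxv->a_val = 0;
            n++;
        }
    }
    return n;
}
```

Retagging instead of deleting keeps the vector the same length, so nothing else on the initial stack has to move.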

RTLD:

  • To link rtld against libsys we need these additional functions:
    • Memory/String operations: memcpy, str*cpy, str*cat, strchr, strsep, strdup, strspn/strcspn, bzero
    • Environment variables: getenv, unsetenv
    • Misc: assert(i.e. abort)
  • rtld already has replacement functions for getosreldate, stack protection, etc that we could bring in with vDSO into libsys? Maybe there's a case that those functions are worth moving into libsys.
  • I'll get a better sense of what else I've missed once I finish the rewrite of this patch. I've got a few patches towards the goal of linking rtld against libsys that I can clean up and submit separately.
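
To give a sense of how small the extra pieces in the list above are, here is a sketch of an environment lookup that needs nothing from libc. `demo_getenv` is a hypothetical name, not rtld's actual replacement, and it takes the environment vector as an argument instead of reading the global `environ`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Look up `name` in a NULL-terminated "NAME=value" vector. Returns a
 * pointer to the value, or NULL if the variable is not set. Uses no
 * string functions so it has no dependency on the rest of libc.
 */
const char *
demo_getenv(char *const *envp, const char *name)
{
    if (envp == NULL || name == NULL || *name == '\0')
        return NULL;

    for (; *envp != NULL; envp++) {
        const char *e = *envp;
        const char *n = name;

        /* Walk both strings while they agree. */
        while (*n != '\0' && *e == *n) {
            e++;
            n++;
        }
        /* Match only if the whole name was consumed right before '='. */
        if (*n == '\0' && *e == '=')
            return e + 1;
    }
    return NULL;
}
```

The memory and string routines in the list are of similar size, which is part of why moving them next to the syscall stubs looks feasible.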

Updated patch against head and lots of cleanup to amd64 and i386. Similar cleanup in progress on other platforms.

Brooks & Konstantin, if there are no objections, I'm consolidating all the system call symbols for libsys/libc together. They have to be kept in sync to support library filtering; otherwise it breaks. There's some fallout from that deduplication work, which I'll wrap up tomorrow.

ali_mashtizadeh.com edited the summary of this revision. (Show Details)Jun 27 2018, 6:01 PM
ali_mashtizadeh.com retitled this revision from Provide libc_nosyscalls and libsyscalls to Split libc into libc and libsys.Jun 27 2018, 6:27 PM

Refresh diff to head

This comment was removed by talg_cs.stanford.edu.

Sorry for the delay in dealing with this...

I've finally found some time to do some testing and something is not working right. I built and installed 12.0-ALPHA6 on a system, then built a poudriere jail with this patch applied. The programs seem to work fine when run on the host environment, but most dynamic programs fail with Abort trap when run. Interestingly, while the jail version of /bin/sh fails early, if I run one from the host, it gets as far as the prompt, but fails on exit.

Hello Brooks,

I'm refreshing the diff. The breakage came either from an outstanding change I applied to my diff or from rebasing onto head. I have a fix I'm testing now (with fixes for other archs too) and hope that will be done by tonight or tomorrow.

It would still be good to see what major changes/issues you have with the patch so I could start addressing those.

Best,
Ali

Hello Brooks,

Can you double check? You may have to delete your build directory because of my changes to the build. I made a few more improvements, but I'm fairly certain those wouldn't fix the bug you saw. I both PXE-booted the OS and ran a jail without any crash on x86-64. I'm going to finish testing other platforms.

kib added a comment.Sep 24 2018, 8:35 AM

Did you compare the export tables for patched vs. unpatched libc?

Can you put the pre-built libc.so and syscalls lib somewhere?

Ali,

Rebuilding with a clean objdir seems to have worked. We're probably going to want to find the root cause there so we can add workarounds to Makefile.inc1.

I'm now doing some large port builds to exercise things a bit.

I need to look at the changes some more, but one thing I'm noticing is that git's rename detection did a truly terrible job in creating the diff. Do you think you can recreate the list of files moved around in some form suitable for use with svn mv when we do an actual commit? That might also make things a little easier to understand.