Page MenuHomeFreeBSD

build: Introduce MK_MSAN
Needs ReviewPublic

Authored by markj on Feb 20 2024, 11:58 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Apr 29, 5:09 PM
Unknown Object (File)
Fri, Apr 26, 3:42 PM
Unknown Object (File)
Fri, Apr 26, 3:34 AM
Unknown Object (File)
Mon, Apr 22, 7:51 PM
Unknown Object (File)
Thu, Apr 4, 7:59 PM
Unknown Object (File)
Feb 24 2024, 9:17 PM
Unknown Object (File)
Feb 22 2024, 2:40 AM

Details

Reviewers
olce
Summary

As with MK_ASAN and MK_UBSAN, this lets one build parts of the tree, or
the whole tree, with LLVM's Memory Sanitizer enabled. This enables
runtime detection of uses of uninitialized memory.

MSAN has some extra constraints relative to ASAN and UBSAN:

  • All code loaded into a process must be instrumented, otherwise false positives can arise easily.
  • An MSAN-instrumented DSO cannot be linked into an uninstrumented executable.
  • MSAN-instrumented executables must be position-independent. In particular, anything compiled with MK_PIE=no cannot be instrumented.

In principle, libc.so does not need to be instrumented because the MSAN
runtime provides interceptors which handle shadow map updates for
various commit libc functions. However:

  • Some commonly used libc symbols are not intercepted currently (e.g., getc() and cgetent(), the latter is used by most ncurses applications).
  • There is at least one case where libc itself will generate a false positive: fts_sort() allocates memory using realloc(), which is intercepted, initializes the array inline, then sorts it with qsort(), which is also intercepted. MSAN ends up reporting that the memory is uninitialized.

I think the proper solution for FreeBSD is to provide syscall
interceptors in libsys, and nothing else. This would allow everything
to be instrumented and would simplify maintenance. To give another
example, libc internally uses _fstat() rather than fstat(), so
interceptors won't catch it. We can fix that by adding another
interceptor, but it makes much more sense to do so in libc/libsys rather
than LLVM. This will also ease maintenance as new system calls are
added.

This commits introduces the glue needed to compile most of the base
system with MSAN enabled. Because of the aforementioned considerations,
most applications will raise false positives currently, but this at least provides
a mechanism to compile specific executables or libraries with MSan enabled.

Currently I'm testing with the following flags:
WITHOUT_KERBEROS= WITHOUT_LIB32= WITHOUT_BOOT= on amd64 and
arm64.

Sponsored by: Klara, Inc.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 56135
Build 53023: arc lint + arc unit

Event Timeline

usr.bin/clang/Makefile.inc
3–5

Does this work? PIE was disabled in clang not because we didn't want it, but because it didn't build successfully.

markj added inline comments.
usr.bin/clang/Makefile.inc
3–5

It builds, but incremental builds fail. But that's not a problem here.

usr.bin/clang/Makefile.inc
3–5

Oh - I missed that you're forcing MK_PIE on in lib/clang/Makefile.inc. Hopefully we can (later on) get rid of this altogether and have Clang built with default tree-wide PIE setting.

Makefile.libcompat
79–83

It's tempting to restyle this command like the one above.

lib/libutil/tests/Makefile
7

!= "yes" seems odd to my eye. IIRC we did ==/!= "no" because we weren't initially confident the variables would be yes or no. Now we do use == "yes" in places and I tend to prefer the pair of == "no" and == "yes" when it doesn't conflict with local style.

share/mk/bsd.sanitizer.mk
15
markj marked 4 inline comments as done.

Handle Brooks' comments.

markj added inline comments.
usr.bin/clang/Makefile.inc
3–5

Yes, I effectively just reverted @dim's revert. :)

usr.bin/clang/Makefile.inc
3–5

Yeah I still figure out a good way to do the global clang build with PIE. I have not yet found out how you can determine from a .o file whether it is PIE or not.

usr.bin/clang/Makefile.inc
3–5

I don't think there's any information about it that's directly embedded into the ELF metadata. You'd have to infer that from the machine code and the types of relocations it uses.

share/mk/src.opts.mk
485

note that llvm-symbolizer is installed (with that name) by default; can it be used directly rather than via the addr2line alias?

lib/csu/tests/Makefile.tests
1

probably exclude this

tests/sys/audit/Makefile
4

private .a archives should be pie when MK_PIE is true

markj added inline comments.
lib/csu/tests/Makefile.tests
1

Oops, thanks.

share/mk/src.opts.mk
485

From my reading of ChooseExternalSymbolizer() in sanitizer_symbolizer_posix_libcdep.cpp, that's already what it does. It'll use addr2line only if llvm-symbolizer can't be found in $PATH. So this comment is wrong or outdated, but we still do indeed need to set MK_LLVM_BINUTILS I believe.

tests/sys/audit/Makefile
4

Empirically, this is not the case here. :)

I need to come back to this and figure out why I can't build, it was just a low priority when I ran into it.