amd64: Implement a KASAN shadow map

Authored by markj on Mar 24 2021, 11:40 PM.



The idea behind KASAN is to use a region of memory to track the validity
of buffers in the kernel map. This region is the shadow map. The
compiler inserts calls to the sanitizer runtime for every emitted load
and store, and the runtime uses the shadow map to decide whether the
access is valid. Various kernel allocators call kasan_mark() to update
the shadow map.

Accesses outside the kernel map cannot be validated this way. I spent
some time working towards having the direct map be optional on amd64,
but KASAN is useful regardless, as long as UMA_MD_SMALL_ALLOC is
disabled.

The shadow map uses one byte per eight bytes in the kernel map. In
pmap_bootstrap() we create an initial set of page tables for the kernel
and preloaded data. This is the majority of the patch.
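
The one-byte-per-eight-bytes ratio can be sketched as a simple address translation. This is only an illustration: the function names and both base addresses below are hypothetical placeholders, not the definitions used in the patch.

```c
#include <stdint.h>

/*
 * Illustrative values only (assumptions, not taken from the patch).
 * One shadow byte covers 2^3 = 8 bytes of the kernel map.
 */
#define SKETCH_SHADOW_SCALE_SHIFT	3
#define SKETCH_KERNEL_MIN	0xfffffe0000000000ULL	/* placeholder kernel map base */
#define SKETCH_SHADOW_MIN	0xfffff78000000000ULL	/* placeholder shadow map base */

/* Translate a kernel map address into the address of its shadow byte. */
static inline uint64_t
sketch_addr_to_shadow(uint64_t kva)
{
	return (SKETCH_SHADOW_MIN +
	    ((kva - SKETCH_KERNEL_MIN) >> SKETCH_SHADOW_SCALE_SHIFT));
}

/* Number of shadow bytes needed to cover kva_size bytes, rounded up. */
static inline uint64_t
sketch_shadow_size(uint64_t kva_size)
{
	return ((kva_size + 7) >> SKETCH_SHADOW_SCALE_SHIFT);
}
```

Under these assumptions, eight consecutive bytes of the kernel map share one shadow byte, and a 4KB page of kernel memory needs 512 bytes of shadow.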

When pmap_growkernel() is called, we call kasan_shadow_map() to extend
the shadow map. kasan_shadow_map() uses pmap_kasan_enter() to allocate
memory for the shadow region and map it.
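
As a rough sketch of what extending the shadow map involves, the helper below counts how many 4KB shadow pages would have to be mapped to cover a newly grown range of the kernel map. The names, base addresses and page size handling are assumptions made for illustration; this is not the code of kasan_shadow_map() or pmap_kasan_enter().

```c
#include <stdint.h>

/* Placeholder values (assumptions), chosen to be page-aligned. */
#define SKETCH_PAGE_SIZE		4096ULL
#define SKETCH_SHADOW_SCALE_SHIFT	3
#define SKETCH_KERNEL_MIN	0xfffffe0000000000ULL
#define SKETCH_SHADOW_MIN	0xfffff78000000000ULL

/* Shadow address corresponding to a kernel map address. */
static uint64_t
sketch_shadow_of(uint64_t kva)
{
	return (SKETCH_SHADOW_MIN +
	    ((kva - SKETCH_KERNEL_MIN) >> SKETCH_SHADOW_SCALE_SHIFT));
}

/*
 * Count the shadow pages covering the kernel map range [kva_start, kva_end).
 * The shadow range is widened to page boundaries because mappings are
 * created a whole page at a time.
 */
static uint64_t
sketch_shadow_pages(uint64_t kva_start, uint64_t kva_end)
{
	uint64_t start, end;

	start = sketch_shadow_of(kva_start) & ~(SKETCH_PAGE_SIZE - 1);
	end = (sketch_shadow_of(kva_end) + SKETCH_PAGE_SIZE - 1) &
	    ~(SKETCH_PAGE_SIZE - 1);
	return ((end - start) / SKETCH_PAGE_SIZE);
}
```

With these placeholder values, growing the kernel map by 2MB requires 64 shadow pages (256KB), which is the 1/8 ratio again.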

Diff Detail

rG FreeBSD src repository

Event Timeline

markj added inline comments.

This function is implemented in D29416.


This is already included by vm/pmap.h.

I am curious in general, how accesses not in the kernel map are marked so that corresponding loads/stores are not checked against shadow validity map. For instance, page table mappings, large map, shadow validity map itself and so on.

In D29417#659036, @kib wrote:

I am curious in general, how accesses not in the kernel map are marked so that corresponding loads/stores are not checked against shadow validity map. For instance, page table mappings, large map, shadow validity map itself and so on.

This is handled using a check on the address. See kasan_md_unsupported() in amd64/include/asan.h in this review. In particular, we only validate accesses between VM_MIN_KERNEL_ADDRESS and VM_MAX_KERNEL_ADDRESS. Currently the vm_page array (mapped starting at VM_MIN_KERNEL_ADDRESS) is excluded.

subr_asan.c is compiled without sanitizer instrumentation, so its own accesses to the shadow map are not instrumented, but this is mostly for performance.
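
A minimal sketch of the kind of address filter described above, assuming placeholder bounds and a hypothetical variable that stands in for the end of the vm_page array. It is not the body of kasan_md_unsupported(); it only illustrates the range check and the exclusion of the vm_page array once that array exists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bounds of the kernel map (assumptions, not the real values). */
#define SKETCH_KERNEL_MIN	0xfffffe0000000000ULL
#define SKETCH_KERNEL_MAX	0xffffffff00000000ULL

/*
 * Hypothetical stand-in for the end of the vm_page array, which is mapped
 * starting at the bottom of the kernel map.  Zero models "not initialized
 * yet", which is the case early in boot.
 */
static uint64_t sketch_page_array_end = 0;

/* Return true if accesses to addr should not be checked against the shadow. */
static bool
sketch_unsupported(uint64_t addr)
{
	uint64_t lower;

	/* Before the array is set up, the whole kernel map is eligible. */
	lower = sketch_page_array_end == 0 ? SKETCH_KERNEL_MIN :
	    sketch_page_array_end;
	return (addr < lower || addr >= SKETCH_KERNEL_MAX);
}
```

The lower bound moves up once the array is initialized, so the same address can be checked early in boot and skipped later.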


Why is the check for < VM_MIN_KERNEL_ADDRESS needed? From my understanding of the current layout, (vm_offset_t)vm_page_array == VM_MIN_KERNEL_ADDRESS.


In fact this is a lot of memory.

I wonder if KASAN should imply some reduction in the sizing of the kernel maps. Or is the idea that with KASAN we usually do not survive long enough for this to matter?


vm_page_array is not set until SI_SUB_VM, but we enable KASAN before starting sysinits. I tried to explain it with the comment, I'll try to make it more clear.

BTW, another solution is to create a shadow map of the vm_page array in pmap_page_array_startup(). Because we do not validate accesses to vm_pages, we can use a single 4KB or 2MB physical page for the entire shadow. In fact I did it this way originally but it added some complexity, and maybe a shadow map for vm_pages could become useful some day.


Well, the shadow map is grown lazily based on demand for KVA. NKASAMPML4E is just the number of reserved slots. Which sizes are you referring to exactly?

I was surprised that Peter did not manage to trigger any panics due to OOM conditions in pmap_growkernel() while testing the patch. It may be more of a theoretical concern for now. The only time I see panics in pmap_growkernel() is with kernel memory leaks or some kind of overcommit, e.g., something requests an absurdly large buffer with malloc(9).


I see what you mean. It would be enough, IMO, to note that vm_page_array is initialized after the first use of kasan_md_unsupported().


For instance, the clean map, buffer cache (number of buffers) + transient map sizing, and the kernel map itself. They are all sized based on the amount of physical memory.

For instance, on a mid-range modern machine with 128G, 1/8 is 16G, which is significant.

Sure, the real population of these maps is dynamic, and we probably do not grow simultaneously in all mappings; also, enough memory is consumed by userspace to provide a safety buffer. But it is still a large error to over-estimate the amount of available memory by 1/8.

[I do not suggest that this is blocker]

markj marked an inline comment as done.
  • Improve kasan_md_unsupported().
  • Try to scale several VM limits based on the shadow map scale. Specifically, limit the vm_kmem_size (mostly used to bound the size of UMA) and nbuf (other kernel maps derive sizes from that).

I made some changes to scale a few constants appropriately. I am not sure if it is enough to fully alleviate the problem.
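
To make the arithmetic concrete, here is a small sketch of one way a limit could be scaled to leave room for the shadow. The helper names are hypothetical and the formula is an assumption for illustration; it is not necessarily the scaling applied to vm_kmem_size or nbuf in this revision.

```c
#include <stdint.h>

/* One shadow byte per eight bytes of the kernel map. */
#define SKETCH_SHADOW_SCALE	8

/* Shadow memory needed to cover size bytes of kernel memory. */
static uint64_t
sketch_shadow_overhead(uint64_t size)
{
	return (size / SKETCH_SHADOW_SCALE);
}

/*
 * Given a memory budget, return a limit L such that L plus its shadow
 * (L / 8) still fits in the budget, i.e. roughly 8/9 of the budget.
 * Dividing first avoids overflow for large budgets.
 */
static uint64_t
sketch_scale_limit(uint64_t budget)
{
	return (budget / (SKETCH_SHADOW_SCALE + 1) * SKETCH_SHADOW_SCALE);
}
```

For the 128G example raised in the review, the shadow overhead is 16G, and a limit scaled this way would come out a little under 114G.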

kib added inline comments.

There is one use of defined(UMA_MD_SMALL_ALLOC) in openzfs arc_os.c, which probably should not depend on it.

This revision is now accepted and ready to land. Mar 25 2021, 10:41 PM

I think that should really be #if VM_KMEM_SIZE_SCALE != 1. The idea is to limit the size of the ARC based on the maximum kernel heap size. On platforms where that limit is only bounded by the amount of physical memory (i.e. VM_KMEM_SIZE_SCALE == 1), there is no need to consult uma_avail().

This revision was automatically updated to reflect the committed changes.