
link_elf_obj: Colour VM objects
Closed, Public

Authored by markj on Oct 15 2020, 6:38 PM.

Details

Summary

The OpenZFS zfs.ko is quite large on amd64; its .text section weighs in
at 2.5MB. However, we currently map it using 4KB pages. This change is
enough to ensure that the first 2MB of .text is mapped using a
superpage. I don't believe we need to make the colouring conditional on
the object size.
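
To make the "first 2MB" figure concrete, here is a rough userland sketch of the arithmetic. It is not code from the patch: estimate_mapping() and struct map_estimate are hypothetical names, and the result is a best case that assumes the backing physical memory is contiguous and suitably aligned and that protections are uniform across each 2MB chunk.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE ((uint64_t)4096)            /* 4KB base page */
#define SUPERPAGE_SIZE  ((uint64_t)2 * 1024 * 1024) /* 2MB superpage on amd64 */

/* Hypothetical helper type, for illustration only. */
struct map_estimate {
	uint64_t superpages;  /* 2MB-aligned chunks fully inside the range */
	uint64_t small_pages; /* 4KB pages needed for the rest of the range */
};

/*
 * Best-case estimate for mapping [start, start + len).
 * Only chunks that are 2MB-aligned and fully covered can use a superpage.
 */
static struct map_estimate
estimate_mapping(uint64_t start, uint64_t len)
{
	struct map_estimate est = { 0, 0 };
	uint64_t end, first, last, total_pages;

	/* Assumption: the range starts on a 4KB page boundary. */
	assert(start % SMALL_PAGE_SIZE == 0);

	end = start + len;
	/* First 2MB boundary at or after start. */
	first = (start + SUPERPAGE_SIZE - 1) & ~(SUPERPAGE_SIZE - 1);
	/* Last 2MB boundary at or before end. */
	last = end & ~(SUPERPAGE_SIZE - 1);
	total_pages = (len + SMALL_PAGE_SIZE - 1) / SMALL_PAGE_SIZE;

	if (last > first)
		est.superpages = (last - first) / SUPERPAGE_SIZE;
	est.small_pages = total_pages -
	    est.superpages * (SUPERPAGE_SIZE / SMALL_PAGE_SIZE);
	return (est);
}
```

For a .text of exactly 2.5MB starting on a 2MB boundary, this gives one superpage plus 128 4KB pages. The same section starting 1MB past a boundary gets no superpage at all, which is why alignment matters here.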

No change is needed for ET_DYN objects: they are already mapped using
superpages when possible, at least by default when SPARSE_MAPPING is not
defined.

We could go further and use the kernel module linker script to pad
sufficiently large .text sections to a 2MB boundary, at the cost of
wasting some memory.
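
As a rough illustration of that cost, the following sketch computes how much padding would be needed to round a section up to the next 2MB boundary. padded_size() and padding_waste() are hypothetical helpers, not part of the linker script or this change, and the sketch assumes the section itself starts on a 2MB boundary.

```c
#include <assert.h>
#include <stdint.h>

#define SUPERPAGE_SIZE ((uint64_t)2 * 1024 * 1024) /* 2MB superpage on amd64 */

/* Size of a section once its end is padded up to the next 2MB boundary. */
static uint64_t
padded_size(uint64_t text_size)
{
	/* Guard against overflow in the round-up below. */
	assert(text_size <= UINT64_MAX - (SUPERPAGE_SIZE - 1));
	return ((text_size + SUPERPAGE_SIZE - 1) & ~(SUPERPAGE_SIZE - 1));
}

/* Bytes of padding added, i.e. memory spent purely on alignment. */
static uint64_t
padding_waste(uint64_t text_size)
{
	return (padded_size(text_size) - text_size);
}
```

For a .text of exactly 2.5MB, padded_size() returns 4MB and padding_waste() returns 1.5MB, so the whole section could be covered by two superpages at the price of 1.5MB of unused memory.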

Test Plan

Verified using the vm.pmap.kernel_maps sysctl.

Diff Detail

Lint: Lint Passed
Unit: No Test Coverage
Build Status: Buildable 34192
Build 31343: arc lint + arc unit

Event Timeline

markj requested review of this revision. Oct 15 2020, 6:38 PM
markj added reviewers: alc, kib.

I think that saving memory is still more important than a really minor TLB usage optimization.

This revision is now accepted and ready to land. Oct 15 2020, 11:53 PM

A while back, Intel quietly made it possible to measure address translation overhead for instruction accesses. I say, "quietly," because the manual provides a low-level description of the counter that doesn't really explain what it effectively measures. However, some Intel people gave a presentation at an HPC workshop about 2 years ago that explained the counter's meaning, and Intel published those slides here: https://software.intel.com/content/www/us/en/develop/download/how-top-down-microarchitecture-analysis-tma-addresses-challenges-in-modern-servers.html

Beyond explaining the counter's meaning, these slides describe some details of the operation of Intel's L1 ITLB that relate to padding. Specifically, the L1 ITLB entry that provides the address translation for an in-flight instruction can't be reclaimed (and updated to hold a new translation) until the instruction is retired. And, with the small number of L1 ITLB entries for 2MB mappings, this is much more likely to be a problem if you too aggressively pad code sections for a program with poor locality of instruction access, e.g., the OpenJDK JVM and its shared libraries.

This revision was automatically updated to reflect the committed changes.