
Put kernel physaddr at explicit 2MB rather than inconsistent MAXPAGESIZE
ClosedPublic

Authored by emaste on Nov 21 2016, 11:20 PM.

Details

Summary

GNU ld.bfd has MAXPAGESIZE=0x200000 on x86_64, but GNU gold and LLVM's LLD currently set MAXPAGESIZE=0x1000.
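As a rough illustration of the inconsistency, the sketch below tabulates the MAXPAGESIZE values quoted above and compares a kernel physical address derived from the linker constant with an explicit 2MB literal. The function and dictionary names are hypothetical and exist only for this example; the values are the ones stated in this summary at the time of the review.

```python
# MAXPAGESIZE on x86_64 as quoted in the summary (values at the time of review).
MAXPAGESIZE_BY_LINKER = {
    "ld.bfd": 0x200000,
    "gold": 0x1000,
    "lld": 0x1000,
}

# The explicit constant proposed by this revision's title.
EXPLICIT_KERNPHYS = 0x200000


def kernphys_from_maxpagesize(linker):
    """Physical address if it is derived from the linker's MAXPAGESIZE."""
    return MAXPAGESIZE_BY_LINKER[linker]


def kernphys_explicit(linker):
    """Physical address if it is a literal: independent of the linker."""
    return EXPLICIT_KERNPHYS


for name in MAXPAGESIZE_BY_LINKER:
    print(name, hex(kernphys_from_maxpagesize(name)), hex(kernphys_explicit(name)))
```

Only the ld.bfd row agrees in both columns, which is the inconsistency the title refers to.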

Without this change a gold- or lld-linked kernel panics at boot (https://bugs.freebsd.org/214718). With this change I can successfully boot an lld-linked kernel.

See also https://llvm.org/bugs/show_bug.cgi?id=30891, https://reviews.llvm.org/D24987, and discussion on the llvm-commits list in response to D24987 (e.g. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161031/thread.html, http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161031/402183.html)

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

emaste retitled this revision to Put kernel physaddr at explicit 2MB rather than inconsistent MAXPAGESIZE.
emaste updated this object.
emaste edited the test plan for this revision.
emaste added reviewers: jhb, kib, dim.

I could instead pass -zmax-page-size in kernel LDFLAGS, although I suspect we actually want an explicitly chosen constant here.

From what I see in the responses, there are some big issues in understanding the expected binary layout on amd64, both on disk and in memory. The only requirement put forth by the ELF specification is that the on-disk segment start and the in-memory segment mapping address are equal mod PAGE_SIZE. To make it possible to use superpage mappings for segments loaded into memory, their VAs should be aligned on superpage boundaries, i.e. 2M. This does not affect the size of the binary due to the note above about the relation between file and memory offset. Moreover, to make the ELF file more compact, linkers usually put the beginning of the data segment right after the end of the text segment, and the dynamic linker double-maps this page, once into the text region and once into data.
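The congruence rule described in this comment can be sketched as follows. This is a minimal illustration that takes the rule as stated here (offset and address equal mod PAGE_SIZE); the helper names and the sample offsets and addresses are hypothetical.

```python
PAGE_SIZE = 0x1000         # 4K base page on amd64
SUPERPAGE_SIZE = 0x200000  # 2M superpage on amd64


def satisfies_elf_congruence(p_offset, p_vaddr, page_size=PAGE_SIZE):
    """True if the file offset and the virtual address are equal mod page_size."""
    return p_offset % page_size == p_vaddr % page_size


def is_superpage_aligned(p_vaddr, superpage_size=SUPERPAGE_SIZE):
    """True if the virtual address sits on a superpage boundary."""
    return p_vaddr % superpage_size == 0


# Hypothetical segment: it starts only 0x5000 bytes into the file,
# yet its virtual address is 2M-aligned. Both conditions hold at once,
# so superpage-aligned VAs do not by themselves force padding in the file.
offset, vaddr = 0x5000, 0x400000
print(satisfies_elf_congruence(offset, vaddr), is_superpage_aligned(vaddr))
```

A linker that instead requires the file offset to be congruent to the address mod 2M would have to pad the file, which is the behaviour discussed later in this thread.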

That said, the kernel base address is selected to satisfy different requirements. First, we must keep the low memory (< 1M) intact. This is needed to preserve BIOS data, to allow bootstrap of AP CPUs, for ACPI resume, and because of the 640K-1M non-memory window. Next, we want space for the loader heap, which must not overlap with the kernel blob. And last, we want the kernel text to be superpage-aligned, so that it can be mapped by a superpage. 2M is a good choice for all criteria.
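Two of these criteria are easy to check arithmetically. The sketch below computes the lowest superpage-aligned address that leaves the first 1M untouched; it does not model the loader heap requirement, and the names are hypothetical.

```python
LOW_MEMORY_LIMIT = 0x100000  # keep everything below 1M intact
SUPERPAGE_SIZE = 0x200000    # 2M superpage on amd64


def round_up(addr, align):
    """Round addr up to the next multiple of align."""
    return (addr + align - 1) // align * align


def lowest_kernel_base(low_limit=LOW_MEMORY_LIMIT, superpage=SUPERPAGE_SIZE):
    """Lowest superpage-aligned address at or above the low-memory limit."""
    return round_up(low_limit, superpage)


print(hex(lowest_kernel_base()))
```

Under these two constraints alone the result is 0x200000, matching the 2M figure in the comment.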

So I would say that the use of MAXPAGESIZE there is justifiable, and I also suspect that these linkers do something not quite reasonable for the amd64 target.

I generally agree with kib's reasoning. We want text segments in binaries and libraries to be superpage-aligned.

That said, MAXPAGESIZE seems a bit of an odd name in the face of 1GB pages. If MAXPAGESIZE were ever to change due to that, then we would want this fixed to 2MB. However, lld needs to be fixed to map user binaries and libraries more optimally regardless, which will happen to fix the current ldscript for the kernel. I don't mind using this as a squeaky wheel to force lld to be fixed.

In D8610#178800, @jhb wrote:

I generally agree with kib's reasoning. We want text segments in binaries and libraries to be superpage-aligned.

I agree, although .text alignment is a separate issue from this one. LLD's default .text segment start address is 0x10000 (64K); it does not come from either possible MAXPAGESIZE value.

LLD has this comment:

// On freebsd x86_64 the first page cannot be mmaped.
// On linux that is controled by vm.mmap_min_addr. At least on some x86_64
// installs that is 65536, so the first 15 pages cannot be used.
// Given that, the smallest value that can be used in here is 0x10000.
// If using 2MB pages, the smallest page aligned address that works is
// 0x200000, but it looks like every OS uses 4k pages for executables.
uint64_t DefaultImageBase = 0x10000;

This needs to be at least 0x200000 IMO, and probably ought to just be 0x400000 for consistency with ld.bfd/gold.

That said, MAXPAGESIZE seems a bit of an odd name in the face of 1GB pages. If MAXPAGESIZE were ever to change due to that, then we would want this fixed to 2MB. However, lld needs to be fixed to map user binaries and libraries more optimally regardless, which will happen to fix the current ldscript for the kernel. I don't mind using this as a squeaky wheel to force lld to be fixed.

The LLD developers are receptive to implementing whatever changes are shown to be necessary, and avoiding this change won't have any effect as a forcing function given the above.

The thread on using 0x400000 as the default text segment address is at http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161121/406817.html; there are no objections, it seems.

Note that this does not change MAXPAGESIZE, so we'd still need this patch to link with lld and gold.

This does not affect the size of the binary due to the note above about the relation between file and memory offset.

Right now lld will align each segment to MAXPAGESIZE and won't put multiple segments in the same MAXPAGESIZE-sized page, so if set to 2M it will pad the file to 2M-align the data segment. This could have a benefit in facilitating superpage-mapped data (although in general we probably don't expect data segments large enough to make use of this).
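The padding effect can be approximated with a deliberately simplified model: each segment's file offset is rounded up to MAXPAGESIZE before the segment is written. This is only a sketch of the behaviour described here, not lld's actual layout code, and the segment sizes are hypothetical.

```python
def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def padded_file_size(segment_sizes, max_page_size):
    """File size when every segment starts at a max_page_size-aligned offset."""
    offset = 0
    for size in segment_sizes:
        offset = round_up(offset, max_page_size) + size
    return offset


# Hypothetical segment sizes, in file order.
sizes = [0x3000, 0x8000, 0x2000]

print(hex(padded_file_size(sizes, 0x1000)))    # MAXPAGESIZE = 4K
print(hex(padded_file_size(sizes, 0x200000)))  # MAXPAGESIZE = 2M
```

In this model the same segments occupy 0xd000 bytes with 4K alignment and 0x402000 bytes with 2M alignment, which shows why a 2M MAXPAGESIZE inflates the file unless segments are allowed to share a page.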

As Rafael wrote, lld needs to do one of the following:

* Using 2MB and living with 6MB binaries.
* Using COMMONPAGESIZE everywhere but linker scripts. That is what we used to do, but it breaks compatibility with people using "-z max-page-size=X" and expecting PT_LOAD to be aligned to X.
* Implement the page overlap logic so that we can use 2MB pages. If we do this then we may as well merge ro and rx. Maybe do the overlap only if not using --rosegment?

But I think all of this is separate from this patch, where it's better for us to be explicit in using 0x200000 than to rely on a poorly documented MAXPAGESIZE constant that is already inconsistent between the two GNU linkers.

LLD will now put the image base address at 0x200000 on x86-64: https://reviews.llvm.org/rL287782

In rS214799 (in the projects/binutils-2.17 branch), we changed this from a literal 0x100000 to MAXPAGESIZE, to accommodate binutils commit f766154.

However, if it now turns out that different linkers, and different versions of linkers, have different definitions of MAXPAGESIZE, it is not unreasonable to go back to a literal constant.

This revision was automatically updated to reflect the committed changes.