Page MenuHomeFreeBSD

Refine comment about size of struct pmap_large_md_page.
AbandonedPublic

Authored by jhb on Feb 3 2023, 1:26 AM.
Tags
None
Referenced Files
Unknown Object (File)
Jan 23 2024, 11:41 PM
Unknown Object (File)
Dec 31 2023, 9:14 PM
Unknown Object (File)
Dec 2 2023, 10:53 PM
Unknown Object (File)
Oct 10 2023, 12:49 PM
Unknown Object (File)
Jun 22 2023, 12:56 AM
Unknown Object (File)
May 3 2023, 2:34 PM
Unknown Object (File)
Apr 5 2023, 9:29 PM
Unknown Object (File)
Mar 22 2023, 5:04 PM
Subscribers

Details

Reviewers
andrew
mjg
manu
Summary

pmap_init_pv_table does not assume a specific size of 64. It does
assume a size that evenly divides the size of a page since it inserts
pages from different domains into the kva backing the global pv_table
array.

Sponsored by: DARPA

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 49502
Build 46392: arc lint + arc unit

Event Timeline

jhb requested review of this revision.Feb 3 2023, 1:26 AM

I tripped over this in CheriBSD where we have a kernel with larger pointers for which this assertion doesn't hold. However, from what I could tell, there's nothing in the code that really assumes a value of 64, just that the calculations of the array indices into pv_table (namely the update to highest after each chunk) assumes a size that evenly divides a page.

The point here was to occupy one cache line and catch blow ups past that at compilation time, to avoid accidental memory overhead increase.

If you have an arch which differs, you should ifdef on it and add the page'd assert.

In D38366#871960, @mjg wrote:

The point here was to occupy one cache line and catch blow ups past that at compilation time, to avoid accidental memory overhead increase.

If you have an arch which differs, you should ifdef on it and add the page'd assert.

Then maybe say that? 64 is a magic number otherwise, and arm64 blindly copied the comment without understanding why because it wasn't stated. Also, I think the cache line property is more of a desired optimization rather than a hard goal. The code doesn't fail to execute properly and fault if pmap_large_md_page is 2 cachelines, it will just be less optimal. Presumably still more optimal than the prior approach however as you've still reduced contention.

would this comment do it for you?

/*
 * For correctness we depend on the size being evenly divisible into a
 * page. As a tradeoff between performance and total memory use, the
 * entry is 64 bytes (aka one cacheline) in size. Not being smaller
 * avoids false-sharing, but not being 128 bytes potentially allows for
 * avoidable traffic due to adjacent cacheline prefetcher.
 *
 * Assert the size so that accidental changes fail to compile.
 */

The cacheline size on arm64 isn't fixed. Most CPUs I've seen are 64, but a few are 128. The architecture allows for other size, but I'm not aware of any CPUs where this is the case.

In D38366#872245, @mjg wrote:

would this comment do it for you?

/*
 * For correctness we depend on the size being evenly divisible into a
 * page. As a tradeoff between performance and total memory use, the
 * entry is 64 bytes (aka one cacheline) in size. Not being smaller
 * avoids false-sharing, but not being 128 bytes potentially allows for
 * avoidable traffic due to adjacent cacheline prefetcher.
 *
 * Assert the size so that accidental changes fail to compile.
 */

I think this is fine for amd64. arm64 will likely want to tweak the comment to match what is true on that platform. I think it's fine if you just fix amd64 for now and Andy or I can work on tidying up arm64 separately.