Page MenuHomeFreeBSD

pmap: Fix largemap restart checks in the kernel_maps sysctl handler
ClosedPublic

Authored by markj on Feb 25 2021, 2:27 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 6, 4:23 PM
Unknown Object (File)
Mon, Jan 6, 4:04 PM
Unknown Object (File)
Sat, Jan 4, 4:04 PM
Unknown Object (File)
Nov 14 2024, 12:43 AM
Unknown Object (File)
Nov 14 2024, 12:36 AM
Unknown Object (File)
Nov 12 2024, 12:37 PM
Unknown Object (File)
Nov 6 2024, 10:06 AM
Unknown Object (File)
Nov 1 2024, 9:40 PM
Subscribers

Details

Summary

I believe my intent was to try and handle races with concurrent
modifications to the large map, but the check I added is bogus and
always fails. Instead, merely verify that the next page table page is
mapped into the direct map.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 37369
Build 34258: arc lint + arc unit

Event Timeline

markj requested review of this revision.Feb 25 2021, 2:27 PM
markj added a subscriber: rpokala.

@rpokala could you please see if this patch fixes the hang you see when reading vm.pmap.kernel_maps while an nvdimm is mapped?

I do not quite understand the change.

Typical nvdimm SPA is at very high physical address, definitely beyond dmaplimit. If PG_PS bit is not set then pa should be a page table page phys address, If PG_PS bit is set, then pa >= dmaplimit is probably always true.

Why it is not enough to test for PG_V, which is already done?

In D28922#647562, @kib wrote:

I do not quite understand the change.

Typical nvdimm SPA is at very high physical address, definitely beyond dmaplimit. If PG_PS bit is not set then pa should be a page table page phys address, If PG_PS bit is set, then pa >= dmaplimit is probably always true.

Oh, sigh, the real problem is that the check must come after the PG_PS check, not before.

Why it is not enough to test for PG_V, which is already done?

The PTP may be freed and reused at any point, so PG_V being set doesn't ensure that we can descend to the next level. This is not true of other maps in the kernel page tables.

Move the check, restore the original condition, add comments.

This revision is now accepted and ready to land.Feb 25 2021, 3:25 PM

@rpokala could you please see if this patch fixes the hang you see when reading vm.pmap.kernel_maps while an nvdimm is mapped?

Much better! It emitted several thousand lines, including the "Large map", and it was virtually instantaneous.

Thank you for figuring this out! (And for MFCing the CTLFLAG_SKIP change.)