
vm_map_wire: speed up wiring for large virtual address ranges
Needs ReviewPublic

Authored by bnovkov on Jan 26 2024, 6:18 PM.

Details

Reviewers
markj
alc
kib
Summary

This patch adds a new internal flag and routine to vm_map_wire designed to quickly wire a large virtual address range.
Currently, vm_map_wire uses vm_fault to allocate and map one 0-order page at a time.
The routine tries to speed up that process by preallocating and inserting (super)pages into an entry's object in order to avoid the excessive overhead of vm_fault's slow path.

This change is mostly aimed at speeding up guest memory wiring in hypervisors.

This work was sponsored by the Google Summer of Code '23 program.

Test Plan

The patch was tested and evaluated using a bhyve VM with 10GB of memory.
I compared the patched and unpatched kernels by timing the vm_mmap_seg routine in vmm.c, where the guest memory wiring takes place.
I repeated the timing 10 times for each case and found a significant speedup: the average time dropped from roughly 875 ms to roughly 92 ms.

Ministat output (in milliseconds):

x prefault_stat.txt
+ clean_stat.txt
    N           Min           Max        Median           Avg        Stddev
x  10     76.762507     115.02396     79.975776     92.488651     18.554921
+  10     761.68435     992.11996     883.77091     875.02305     80.788188
Difference at 95.0% confidence
	782.534 +/- 55.0727
	846.087% +/- 138.817%
	(Student's t, pooled s = 58.6132)
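The figures in the ministat summary can be cross-checked by hand. The following is a minimal sketch, not part of the patch, that recomputes the difference of means, the relative difference, and the pooled variance from the N, Avg, and Stddev columns above. The struct and function names are hypothetical. The pooled s reported by ministat is the square root of the pooled variance (58.6132 squared is about 3435.51).

```c
#include <assert.h>

/* Summary statistics for one ministat data set (times in milliseconds). */
struct sample {
	int n;		/* number of measurements */
	double mean;	/* "Avg" column */
	double stddev;	/* "Stddev" column */
};

/* Values copied from the ministat output above. */
static const struct sample prefault = { 10, 92.488651, 18.554921 };
static const struct sample clean = { 10, 875.02305, 80.788188 };

/* Absolute difference of means: how much slower b is than a. */
static double
mean_diff(struct sample a, struct sample b)
{
	return (b.mean - a.mean);
}

/* The same difference, as a percentage of a's mean. */
static double
percent_diff(struct sample a, struct sample b)
{
	assert(a.mean != 0.0);
	return (100.0 * mean_diff(a, b) / a.mean);
}

/* Pooled variance used by the equal-variance Student's t test. */
static double
pooled_variance(struct sample a, struct sample b)
{
	assert(a.n > 1 && b.n > 1);
	return (((a.n - 1) * a.stddev * a.stddev +
	    (b.n - 1) * b.stddev * b.stddev) / (a.n + b.n - 2));
}
```

Calling mean_diff(prefault, clean) and percent_diff(prefault, clean) should reproduce the 782.534 ms and 846.087% figures reported above, up to rounding.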

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/vm/vm_map.c
3536

Can you explain this validation and unbusying of the page? Doesn't it allow other consumers to see garbage?

Also, the vm_fault() call below would just create PTEs pointing to these pages as is.

bnovkov added inline comments.
sys/vm/vm_map.c
3536

Allocating busied pages hangs the VM boot process (somewhere in vm_fault, I assume), but this way of doing it is ugly. Thank you for pointing this out.
Passing VM_ALLOC_NOBUSY instead makes more sense.
The validation was redundant in this case, so I removed it.

Now we legitimately get wired/invalid/non-busy pages in the objects. Such conditions were at least known to be quite problematic.

For instance, I do not see anything that would prevent an object collapse from running right after the object unlock. The collapse would then see an invalid, unbusied page in the object queue and try to free it. Since the page is wired, I believe vm_page_free_prep() would panic.

sys/vm/vm_map.c
3514

&& should be on the previous line

3522

Or you could just increment pindex by the proper size on each allocation
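
To illustrate the suggestion, here is a userland sketch of a range walk that advances pindex by the size of each allocation. It is not the code under review: the helper name, the constant, and the assumed page sizes (4 KiB base pages, 2 MiB superpages, as on amd64) are all hypothetical, and the sketch only counts allocations instead of performing them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one superpage spans 512 base pages (2 MiB / 4 KiB). */
#define	SKETCH_SUPERPAGE_NPAGES	512

/*
 * Hypothetical helper: count the allocations needed to populate the
 * page indices [start, end), using a superpage wherever the current
 * index is superpage-aligned and enough pages remain in the range.
 */
static size_t
count_allocations(uint64_t start, uint64_t end)
{
	uint64_t npages, pindex;
	size_t nallocs;

	assert(start <= end);
	nallocs = 0;
	pindex = start;
	while (pindex < end) {
		npages = 1;
		if (pindex % SKETCH_SUPERPAGE_NPAGES == 0 &&
		    end - pindex >= SKETCH_SUPERPAGE_NPAGES)
			npages = SKETCH_SUPERPAGE_NPAGES;
		/* Advance by exactly what this allocation covered. */
		pindex += npages;
		nallocs++;
	}
	return (nallocs);
}
```

Under these assumptions, a fully aligned 10 GiB range (2,621,440 base pages) needs 5,120 superpage allocations, compared with one fault per base page when each page is wired individually.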