
vm_page: Break reservations to handle noobj allocations
Closed, Public

Authored by markj on Oct 21 2021, 3:53 PM.

Details

Summary

I omitted vm_reserv_reclaim() calls from the noobj allocator variants,
since breaking reservations should only release pages to the default
free pool, and for noobj we allocate from the direct free pool. However,
if the direct and default free pools are both empty, we may need to
break reservations to make progress.

Reported by: cy
Fixes: b498f71bc56a ("vm_page: Add a new page allocator interface for unnamed pages")
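
The fallback described in the summary can be illustrated with a small user-space sketch. The pool and reservation helpers below are hypothetical stand-ins, not the real vm_phys/vm_reserv interfaces; only the ordering (direct pool, then default pool, then break reservations and retry the default pool) reflects the change.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-pool free-page state. */
enum { POOL_DEFAULT, POOL_DIRECT, POOL_COUNT };

static int pool_free[POOL_COUNT];	/* free page counts per pool */
static int reserv_pages;		/* pages tied up in reservations */

/* Try to take one page from the given pool. */
static bool
pool_alloc(int pool)
{
	if (pool_free[pool] > 0) {
		pool_free[pool]--;
		return (true);
	}
	return (false);
}

/*
 * Breaking a reservation releases its pages into the default pool
 * only, mirroring the constraint noted in the summary.
 */
static bool
reserv_reclaim(void)
{
	if (reserv_pages > 0) {
		pool_free[POOL_DEFAULT] += reserv_pages;
		reserv_pages = 0;
		return (true);
	}
	return (false);
}

/*
 * noobj allocation: prefer the direct pool, fall back to the default
 * pool, and only when both are empty break reservations and retry.
 */
static bool
alloc_noobj(void)
{
	if (pool_alloc(POOL_DIRECT) || pool_alloc(POOL_DEFAULT))
		return (true);
	if (reserv_reclaim())
		return (pool_alloc(POOL_DEFAULT));
	return (false);
}
```

Without the reserv_reclaim() fallback, alloc_noobj() would fail even though reserved pages could be released, which is the bug reported here.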

Diff Detail

Repository
R10 FreeBSD src repository

Event Timeline

markj requested review of this revision. Oct 21 2021, 3:53 PM
sys/vm/vm_page.c
2414

I'm not sure how best to handle the freelist != VM_NFREELIST case here. It doesn't necessarily make sense to break reservations to satisfy vm_page_alloc_freelist(), and the old implementation didn't do that at all. OTOH, VM_NRESERVLEVEL is 0 on mips, and vm_page_alloc_freelist() is only used on mips.

kib added inline comments.
sys/vm/vm_page.c
2414

In fact, on amd64 for example, I think VM_FREELIST_DMA32 could get more uses than it has now. For instance, pmap_page_alloc_below_4g() might use it instead of contig(). Similarly, busdma_bounce would probably be (better) served by allocations from the freelist. Neither of them needs a run of pages.

This revision is now accepted and ready to land. Oct 21 2021, 4:23 PM
sys/vm/vm_page.c
2414

I will change the diff to only reclaim reservations if freelist == VM_NFREELIST, then; but if vm_page_alloc_freelist() develops more users, some other solution will be needed to avoid spurious allocation failures on amd64.

Why would using vm_page_alloc_freelist() be an improvement for busdma, aside from simplifying the calling code?
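
The guard proposed above can be sketched in isolation. VM_NFREELIST and the helper below are illustrative (the real value is machine-dependent); the point is only that reclaiming a reservation releases pages usable by an unconstrained search, so it is attempted only when the caller did not pin a specific freelist.

```c
#include <stdbool.h>

#define VM_NFREELIST	4	/* illustrative value; the real one is MD */

/*
 * Hypothetical sketch: callers pass VM_NFREELIST when any freelist
 * will do; a specific freelist index means the caller is constrained
 * and breaking reservations cannot be assumed to help.
 */
static bool
should_reclaim_reservations(int freelist)
{
	return (freelist == VM_NFREELIST);
}
```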

sys/vm/vm_page.c
2414

> Why would using vm_page_alloc_freelist() be an improvement for busdma, aside from simplifying the calling code?

I have not rechecked it, but doesn't alloc_contig() iterate over segments that fit into the [low, high] interval, while alloc_freelist() takes the page directly from the freelist? I.e., it is O(1) vs. O(n)?

sys/vm/vm_page.c
2414

Yes, that is true, though in practice I suspect it would not make much of a difference: alloc_contig will iterate over vm_phys segments looking for one that is in range. Since the vm_phys segments are sorted, it will visit the ones below 4GB first. Segments never cross the DMA32 boundary, so if allocating a single page, alloc_contig can check the segment free lists quickly, since any free page will satisfy the allocation. O(n) here really means O(num segments below 4G).
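
The cost argument above can be made concrete with a toy scan. The segment table below is a hypothetical stand-in for vm_phys_segs[]; the sketch only shows that, with segments sorted by address and none crossing the 4G boundary, a single-page search stops at the first in-range segment with any free page, so the cost is O(number of segments below 4G).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor; real code uses vm_phys segments. */
struct seg {
	uint64_t	start;		/* segments sorted by start */
	uint64_t	end;
	int		free_pages;
};

#define DMA32_MAX	((uint64_t)1 << 32)	/* 4G boundary */

/*
 * Single-page allocation below 4G: walk segments in sorted order and
 * stop at the first in-range segment with any free page, since any
 * free page satisfies a single-page request.
 */
static int
find_seg_below_4g(const struct seg *segs, size_t nsegs)
{
	for (size_t i = 0; i < nsegs; i++) {
		if (segs[i].start >= DMA32_MAX)
			break;	/* sorted: nothing below 4G remains */
		if (segs[i].free_pages > 0)
			return ((int)i);
	}
	return (-1);
}
```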

sys/vm/vm_page.c
2414

I would go as far as to claim that this split at 4G is rather useless, if not for DMA32. Moreover, it is not obvious that the split is not wrong: IMO, the fewer segments we have, the easier it is for the VM to operate. Further, I would say that the requirements of quite old hardware, like USB 2.0/1.1 controllers, should be satisfied with an IOMMU and not by hacks in the VM subsystem.

But this is just grumbling; I do not believe that we can realistically switch to this 'modern' approach.