Page MenuHomeFreeBSD

x86/busdma: Limit reserved pages if low nsegs
ClosedPublic

Authored by cperciva on Oct 21 2022, 6:53 PM.
Tags
None
Referenced Files
Unknown Object (File)
Jan 12 2024, 9:17 AM
Unknown Object (File)
Jan 12 2024, 5:55 AM
Unknown Object (File)
Sep 6 2023, 4:23 AM
Unknown Object (File)
May 30 2023, 3:34 PM
Unknown Object (File)
May 2 2023, 7:17 AM
Unknown Object (File)
Apr 8 2023, 3:38 PM
Unknown Object (File)
Apr 3 2023, 12:16 AM
Unknown Object (File)
Mar 27 2023, 12:04 AM
Subscribers
None

Details

Summary

When bus_dmamap_create is called, if bouncing might be required we
reserve enough pages for a maximum-length request, subject to the
MAX_BPAGES constraint (32 MB on amd64; 32 MB or 2 MB on i386
depending on the amount of RAM).

Since pages used for bouncing are typically non-consecutive, each
bounced page will typically constitute a busdma segment; as such, we
are unlikely to ever successfully use more pages than the nsegments
limit. Limit the number of pages reserved to nsegments.

On FreeBSD/Firecracker, this reduces bounce page memory consumption
from 32 MB to 512 kB, making VMs with 128 MB of RAM usable.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

cperciva created this revision.

I think this is good, but am not positive it is so.
Maybe try to see if mav@ or jhb@ can take a peek too.

This revision is now accepted and ready to land.Oct 21 2022, 8:09 PM

Alexander, John, can you chime in on this?

It might be nice to apply the same change to all the bounce backends on other arches as well?

(But I think this is probably fine to do). One odd thing to note is that we always map at least 1 page it seems. That is, it seems to me that bz->total_pages should just be the MAX(all possible tags). But it isn't, in fact it is the SUM(all possible tags) that is capped at MAX_BPAGES. That seems rather odd to me. I would expect the logic in general to be more of:

error = 0;

/* Figure out max number of pages needed for 1 request on this tag to make forward progress. */
pages = MAX(atop(dmat->common.maxsize), 1);
pages = MIN(dmat->common.nsegments, pages);

/* Cap the maximum pages reserved per zone. */
pages = MIN(pages, maxpages);

/* If the bounce zone doesn't contain that many total pages, grow the zone. */
if (bz->total_pages < pages) {
   pages = bz->total_pages - pages;
   if (alloc_bounce_pages(dmat, pages) < pages)
       error = ENOMEM;
}
...

I always wondered whether kernel could try harder to reduce the number of segments below the number of pages to help hardware that can benefit from it. But you are right that current kernel does not even try to allocate the bounce pages consequently, so allocating more pages than segments supported does not make much sense. Though many drivers should already request maximum map size as maximum number of segments multiplied by page size. So this should affect only drivers that are not doing it.

In D37082#842116, @jhb wrote:

It might be nice to apply the same change to all the bounce backends on other arches as well?

Yeah, I was planning on merging to other architectures later; x86 is the only place I could test this so I figured I'd do it first.

In D37082#842117, @jhb wrote:

(But I think this is probably fine to do). One odd thing to note is that we always map at least 1 page it seems. That is, it seems to me that bz->total_pages should just be the MAX(all possible tags). But it isn't, in fact it is the SUM(all possible tags) that is capped at MAX_BPAGES. That seems rather odd to me. I would expect the logic in general to be more of ...

In fact it's MIN(SUM(pages per dmamap), MAX_BPAGES) + [# dmamaps created after we hit MAX_BPAGES], since we always add one more page even after we hit MAX_BPAGES. I agree that the logic here is a bit strange, but I didn't know enough of the history to figure out why it was this way.

I think the ideal would be to replace MAX_BPAGES with something like "1% of system RAM" rather than having a hard-coded value -- but again that's more of a change than I wanted to try to make right now without understanding more of the history.

In D37082#842118, @mav wrote:

I always wondered whether kernel could try harder to reduce the number of segments below the number of pages to help hardware that can benefit from it. But you are right that current kernel does not even try to allocate the bounce pages consequently, so allocating more pages than segments supported does not make much sense.

Yeah, I didn't even realize that busdma didn't take care of nsegments limitations until recently.

Though many drivers should already request maximum map size as maximum number of segments multiplied by page size. So this should affect only drivers that are not doing it.

Right, I tripped over this in virtio_blk and I considered just fixing it there -- but I figured there was no harm (and potentially a benefit) to taking care of it in busdma in case anyone else wrote code the same way as I did.

sys/x86/x86/busdma_bounce.c
326

The final check here (bz->total_pages < maxpages) does keep us from adding pages for future maps once we do hit MAX_BPAGES. But I still find it rather convoluted.

sys/x86/x86/busdma_bounce.c
326

Not if BUS_DMA_MIN_ALLOC_COMP isn't set. Which I think means we add one extra page per tag (not per dmamap as I said earlier).

As before, I really can't say I understand why the code was designed this way...