Page MenuHomeFreeBSD

gve: Add support for 4k RX Buffers when using DQO queue formats
ClosedPublic

Authored by veethebee_google.com on Jun 10 2025, 9:09 PM.
Tags
None
Referenced Files
F132036710: D50786.diff
Mon, Oct 13, 3:34 AM
Unknown Object (File)
Sun, Oct 5, 12:41 AM
Unknown Object (File)
Sat, Sep 20, 7:27 PM
Unknown Object (File)
Sat, Sep 20, 2:30 PM
Unknown Object (File)
Sat, Sep 20, 2:22 PM
Unknown Object (File)
Thu, Sep 18, 2:50 PM
Unknown Object (File)
Tue, Sep 16, 4:01 AM
Unknown Object (File)
Mon, Sep 15, 7:30 PM
Subscribers

Details

Summary

This change adds support for using 4K RX Buffers when using DQO queue
formats when a boot-time tunable flag is set to true by the user.
When this flag is enabled, the driver will use 4K RX Buffer size either
when HW LRO is enabled or mtu > 2048.

Signed-off-by: Vee Agarwal <veethebee@google.com>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This revision is now accepted and ready to land.Jun 10 2025, 10:32 PM
sys/dev/gve/gve.h
685

Do you need _Static_asserts which verify that MJUMPAGESIZE == 4096 and MCLBYTES == 2048? The latter in particular might not hold; I'm aware of some downstreams which tweak that value, so it'd be nice to at least fail to build in that case if necessary.

sys/dev/gve/gve_main.c
414

I think this driver tries to follow the style(9) convention of declaring variables at the beginning of a scope, in which case the declaration should move up.

sys/dev/gve/gve_rx_dqo.c
323

@markj, @delphij or any other folks familiar with memory management in freebsd, do you have any insight on whether 4k mbuf jumbo clusters are as readily available as 2k mbuf clusters? From my understanding, fetching mbuf clusters only performs one fetch as there is a memory region for mbufs with pre-attached 2k clusters but when getting a 4k jumbo cluster, an mbuf is fetched separately from fetching a jumbo cluster. Would this pose any issues in replacing the use of 2k clusters with 4k jumbo clusters?

sys/dev/gve/gve_rx_dqo.c
323

By "fetch" you mean allocate from UMA? It shouldn't make a functional difference. There's a "packet" zone which caches mbuf headers with a 2KB cluster attached, whereas a 4KB packet buffer requires two back-to-back allocations. The packet zone is an optimization to elide a call to uma_zalloc(), but aside from extra performance overhead, the driver shouldn't notice.

Are you able to do some benchmarking to measure CPU usage and throughput with small packets to compare the two allocation strategies?

veethebee_google.com added inline comments.
sys/dev/gve/gve_rx_dqo.c
323

Yup, I meant the allocate from UMA. We did do extensive performance testing and didn't see any significant differences between the two.

So it sounds like replacing 2k buffers with 4k buffers shouldn't really have any issues overall. Thanks so much!

veethebee_google.com marked an inline comment as done.

Adding static asserts and fixing declaration to match style(9)

This revision now requires review to proceed.Jun 12 2025, 9:54 PM

Please send me a patch and I'll apply it.

This revision is now accepted and ready to land.Jun 13 2025, 1:01 AM
sys/dev/gve/gve.h
91

Sorry, I realized that this will trivially fail on arm64 when the page size is 16KB. There, we define MJUMPAGESIZE == 8096 (which is admittedly a bit odd). Using 8KB buffers is obviously suboptimal, but I'd expect it to work--can we relax this assertion to MJUMPAGESIZE >= GVE_4K_RX_BUFFER_SIZE_DQO?

This revision now requires review to proceed.Jun 13 2025, 5:51 PM
This revision was not accepted when it landed; it landed in state Needs Review.Jun 13 2025, 6:55 PM
This revision was automatically updated to reflect the committed changes.
sys/dev/gve/gve.h
91

I committed a follow-up change to relax the assertion for the MCLBYTES comparison as well--there are some non-default kernel configs which set MCLBYTES=4096. In that case we don't have to switch allocation strategies, so the driver could be modified to handle it, but I'm not sure it's worth bothering. The kernel config in question changes the default only as a testing measure.

sys/dev/gve/gve_main.c
438

Coverity points out that we should check for errors from gve_alloc_rx_rings(). I'm not sure how to handle errors though; perhaps this routine should be reworked to try and allocate new rings earlier?