Page MenuHomeFreeBSD

iflib: leave only 1 receive descriptor unused
ClosedPublic

Authored by vmaffione on Aug 26 2020, 8:39 AM.
Tags
None
Referenced Files
Unknown Object (File)
Jan 1 2024, 4:35 AM
Unknown Object (File)
Dec 20 2023, 7:06 AM
Unknown Object (File)
Nov 22 2023, 11:52 PM
Unknown Object (File)
Nov 10 2023, 2:05 AM
Unknown Object (File)
Oct 8 2023, 12:56 AM
Unknown Object (File)
Sep 18 2023, 3:46 PM
Unknown Object (File)
Sep 18 2023, 9:32 AM
Unknown Object (File)
Sep 3 2023, 3:11 AM

Details

Summary

The pidx argument of isc_rxd_flush() indicates which is the last valid receive descriptor to be used by the NIC.
However, current code has multiple issues:

  1. Intel drivers write pidx to their RDT register, which means that NICs will only use the descriptors up to pidx-1 (modulo ring size N), and won't actually use the one pointed by pidx. This does not break reception, but it is anyway confusing and suboptimal (the NIC will actually see only N-2 descriptors as available, rather than N-1). Other drivers (if_vmx, if_bnxt, if_mgb) adhere to this semantic).
  2. The semantic used by Intel (RDT is one descriptor past the last valid one) is used by most (if not all) NICs, and it is also used on the TX side (also in iflib). Since iflib is not currently using this semantic for RX, it must decrement fl->ifl_pidx (modulo N) before calling isc_rxd_flush(), and then the per-driver callback implementation must increment the index again (to match the real semantic). This is confusing and suboptimal.
  3. The iflib refill function is also called at initialization. However, in case the ring size is smaller than 128 (e.g. if_mgb), the refill function will actually prepare all the receive descriptors (N), without leaving one unused, as most of NICs assume (e.g. to avoid RDT to overrun RDH). I can speculate that the code looks like this right now because this issue showed up during testing (e.g. with if_mgb), and it was easy to workaround by decrementing pidx before isc_rxd_flush().

The goal of this change is to simplify the code (removing a bunch of instructions from the RX fast path), and to make the semantic of isc_rxd_flush() consistent across drivers. To achieve this, we:

  • change the semantics of the pidx argument to the usual one (that is the index one past the last valid one), so that both iflib and drivers avoid the decrement/increment dance.
  • fix the initialization code to prepare at most N-1 descriptors.
Test Plan

Tested on both if_em and if_vmx. It is worth noting that if_em has no separate completion queue and a single free list, whereas if_vmx has a separate completion queue and multiple free lists, so that tests cover more receive code.
Tests include netperf and netmap bridge.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

This looks ok to me, my comments are minor.

sys/dev/mgb/if_mgb.h
181 ↗(On Diff #76238)

Do I understand correctly that MGB_NEXT_RING_IDX() behaves exactly the same as before? It was just rewritten to be similar to PREV_RING_IDX?

sys/net/iflib.c
1963 ↗(On Diff #76238)

Can't we assert this with MPASS(count <= fl->ifl_size - 1)?

2222 ↗(On Diff #76238)

I would also note, "See the comment in iflib_fl_refill_all()."

2226 ↗(On Diff #76238)

Why are we asserting this condition? Can't it fail due to an allocation failure, as hinted by the check immediately below?

vmaffione marked 4 inline comments as done.

Added assert in iflib_fl_refill().
Improved comment.
Removed assert() at first refill.

sys/dev/mgb/if_mgb.h
181 ↗(On Diff #76238)

Yes, absolutely. If preferable, we can keep the previous definition.

sys/net/iflib.c
1963 ↗(On Diff #76238)

Yes, we can even do better and assert that we don't pass less than the current queue capacity.

2226 ↗(On Diff #76238)

Yes, indeed. I left this logic unchanged, but I agree we can just remove the assert.

markj added inline comments.
sys/dev/mgb/if_mgb.h
181 ↗(On Diff #76238)

I think it's fine, just wanted to make sure I wasn't missing something.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 1 2020, 8:42 PM
This revision was automatically updated to reflect the committed changes.