Page MenuHomeFreeBSD

Make sure end of receive doesn't cause interrupt starvation in iflib
ClosedPublic

Authored by hselasky on Jan 22 2020, 2:10 PM.

Details

Summary

When the receive ring cannot be filled with mbufs, due to lack of memory, no more interrupts will be generated to fill the receive ring later on. Make sure to have a watchdog, to try refilling the receive ring from time to time, hopefully when more mbufs are available.

Affects all clients of iflib.

Sponsored by: Mellanox Technologies

Test Plan

Try to provoke mbuf allocation failure by modifying code.
Add print to check code is working like expected.

Test OK with Intel Gigabit (igb)

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sys/net/iflib.c
2095 ↗(On Diff #67143)

I'm confused.. What if _iflib_fl_refill() actually refills? Then we are no longer empty but say we are?

Sorry if I'm missing something, I have not looked at this code in a very long time.

sys/net/iflib.c
2095 ↗(On Diff #67143)

I'll move the in-use check after the fl_refill() as an optimisation.

This revision is now accepted and ready to land.Jan 22 2020, 8:06 PM
hselasky edited the summary of this revision. (Show Details)
hselasky edited the test plan for this revision. (Show Details)

Testing revealed some more patches were needed.

Test OK.

This revision now requires review to proceed.Jan 22 2020, 9:39 PM
This revision is now accepted and ready to land.Jan 22 2020, 9:53 PM

As we discussed in slack, I'm not a huge fan of fixing things with a callout. If the hardware was amenable, I'd much rather leave the ring partially stocked and drop new packets until we were able to allocate mbufs again. However, based on the findings you reported with igb (it wanting a full rx ring to generate interrupts), I'm afraid that might require too much work from hardware drivers. And I'd prefer a fix to a real problem, even if its not something I personally find appealing, to leaving a real bug unfixed and having machines become unreachable.

Can you describe a bit what this is trying to address? This seems like a hopefully temporary condition that should be opted in by broken drivers, not default for every iflib driver.

I think we didn't see this at LLNW because we ran with much larger number of descriptors (4096 for em, 8192 for igb). Those should probably be set as such for appropriate hardware, we just didn't have a secret codex to make better decisions for different NIC types and would have needed some assistance from Intel to look through docs and errata to do so.

I don't think any of the past corporate sponsors of this work are stepping up so the community is going to have to here. @marius you have the most recent experience looking deeply at the e1000 code, can you share an opinion on the issue and proposal?

Can you describe a bit what this is trying to address? This seems like a hopefully temporary condition that should be opted in by broken drivers, not default for every iflib driver.

I am addressing an issue where filling the RX descriptors stop and so all RX packet processing, because the hardware doesn't output any more RXEOF interrupts. This happens in conjunction with temporary out of mbufs situations.

--HPS

Is there some way to make the problem easy to reproduce like reducing descriptors?

Simply force a m_getcl() failure like this:

static int counter;

counter++;
if (counter >= 10000 && counter < 40000)

break;

--HPS

I don't think any of the past corporate sponsors of this work are stepping up so the community is going to have to here. @marius you have the most recent experience looking deeply at the e1000 code, can you share an opinion on the issue and proposal?

Well, this is just one of many regressions that came with the conversion of the e1000 drivers to iflib and as you imply, looking up all relevant details is tedious work, if such information is publicly available at all.
Based on the - somewhat sparse - description of the problem for the IGB class, my first thought would be to use interrupt moderation to force another RX interrupt after some time, though, i. e. effectively to implement a callout in hardware. This should require only a few lines of code in if_em.c, but might need interrupt auto-clearing to be disabled so another interrupt is actually triggered when the counter reaches 0 (auto-clearing certainly is a nice feature, but hardly a loss for 1 GbE). Actually, at a quick glance the re-arming of EITR previously done in igb_msix_que() on every MSI-X interrupt was lost with the conversion to iflib, so bringing back that code and also doing it in the INTx and MSI case already might be most of what's necessary for this approach.
IIRC, >= 10 GbE Intel MACs have two kinds of interrupt moderation timers, making forcing another RX interrupt even more straight forward.