Page MenuHomeFreeBSD

Possible fix for "desc avail = " dmesg spam and igb breakage
AbandonedPublic

Authored by shurd on Aug 23 2018, 7:44 PM.
Tags
None
Referenced Files
Unknown Object (File)
Dec 20 2023, 3:48 AM
Unknown Object (File)
Oct 16 2023, 12:11 PM
Unknown Object (File)
Aug 15 2023, 9:48 PM
Unknown Object (File)
Jul 5 2023, 5:20 AM
Unknown Object (File)
Jun 30 2023, 10:59 AM
Unknown Object (File)
May 8 2023, 10:05 AM
Unknown Object (File)
Mar 21 2023, 12:05 PM
Unknown Object (File)
Mar 3 2023, 11:49 PM

Details

Reviewers
scottl
mmacy
gallatin
erj
Group Reviewers
iflib
Summary

The igb driver at least appears to be able to get into a state
where there's no carrier, and dmesg is spammed with something like:
"igb0 TX(2) desc avail = 1024, pidx = 0", and it remains in a broken
state even if link should be re-established. It's possible that this
is being caused by the watchdog running when link is down, and
IFLIB_ADMIN_ALWAYS_RUN is set.

Test Plan

Before committing, figure out a way to reliably reproduce this
and ensure this fixes the issue.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 19116
Build 18741: arc lint + arc unit

Event Timeline

This revision is now accepted and ready to land.Aug 23 2018, 11:57 PM

The in-tree igb never sets IFLIB_ADMIN_ALWAYS_RUN. Is this for a modified version of igb?

In D16875#359599, @erj wrote:

The in-tree igb never sets IFLIB_ADMIN_ALWAYS_RUN. Is this for a modified version of igb?

Not as far as I know... I see I confused ixl with igb.

It looks like I may need to have iflib_txq_drain() consume and reclaim descriptors when !LINK_ACTIVE() then... I was really hoping it was the watchdog.

Regarding this patch then, should the watchdog still be running when there's no link on ixl?

Possibly the easiest way to ensure we don't enter the STALLED state while link is down is to have iflib_link_state_change() set txq->ift_br->drain = iflib_txq_drain_free when link goes down, and then txq->ift_br->drain = iflib_txq_can_drain when it comes back up... which would mean stopping and resetting the callout timer on link changes.

Thoughts?

Looking at iflib_txq_drain_free() more, we likely want to just extract the free loop and put it in a LINK_ACTIVE() block in iflib_txq_drain.

Should this revision get closed? Nothing's happened on it for months.

This revision now requires review to proceed.Mar 19 2019, 6:01 PM