Page MenuHomeFreeBSD

Stop iflib(4) from leaking MSI messages and along with drivers let it use the correct RIDs when releasing resources
ClosedPublic

Authored by marius on Jan 26 2019, 1:51 PM.

Details

Summary
  • Stop iflib(4) from leaking MSI messages on detachment by calling bus_teardown_intr(9) before pci_release_msi(9).
  • Ensure that iflib(4) and associated drivers pass correct RIDs to bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9) on the corresponding resources instead of using the RIDs initially passed to bus_alloc_resource_any(9) as the latter function may change those RIDs. Solely bnxt(4) was using the correct RIDs by caching the ones returned by bus_alloc_resource_any(9).
  • Change the logic of iflib_msix_init() around to only map the MSI-X BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns > 0. Otherwise the "Unable to map MSIX table " message triggers for devices that simply don't support MSI-X and the user may think that something is wrong while in fact everything works as expected.
  • Put some (mostly redundant) debug messages emitted by iflib(4) and em(4) during attachment under bootverbose. The non-verbose output of em(4) seen during attachment now is close to the one prior to the conversion to iflib(4).
  • Replace various variants of spelling "MSI-X" (several in messages) with "MSI-X" as used in the PCI specifications.
  • Remove some trailing whitespace from messages emitted by iflib(4) and change them to consistently start with uppercase.
  • Remove some obsolete comments about releasing interrupts from drivers and correct a few others.
Test Plan

Verified that detaching em(4) now no longer causes pci(4) to emit "Device leaked MSI vectors" warnings when using MSI and that the output produced when attaching em(4) now is close to the one prior to the conversion to iflib(4).

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

marius created this revision.Jan 26 2019, 1:51 PM
shurd added a comment.Jan 28 2019, 6:58 PM

I think that the inability to map an MSI-X table should likely not be bootverbose... this is an allocation failure which will significantly impact device performance. While the user should know that MSI-X is disabled, I'm not sure the user should be expected to know that the allocation will fail.

@sbruno do you remember if this is the message we saw when we ran out of vectors in that one system? Or was it the "failed to allocate %d MSI-X vectors" one?

sys/net/iflib.c
5988 ↗(On Diff #53250)

I'm not sure this one should be bootverbose.

6067 ↗(On Diff #53250)

I don't really like splitting strings arbitrarily like this, it makes searches for them likely to fail. Moving the following arguments to a new line makes sense though.

jacob.e.keller_intel.com added inline comments.
sys/net/iflib.c
5988 ↗(On Diff #53250)

Yea, I wouldn't mark this one bootverbose either.

6067 ↗(On Diff #53250)

Yea, I agree. Put the arguments on their own line, but leave the string as one line.

marius updated this revision to Diff 53347.Jan 28 2019, 10:19 PM
marius marked 4 inline comments as done.
marius edited the summary of this revision. (Show Details)
marius added inline comments.Jan 28 2019, 10:20 PM
sys/net/iflib.c
5988 ↗(On Diff #53250)

Okay, I agree that if both system and MAC support MSI-X, failure to map the PBA or vector table BAR is a serious problem and that message shouldn't be under bootverbose.
However, then the logic should be changed around like in the updated patch to only try to map this BAR if MSI-X is supported, i. e. pci_msix_count(0) returns > 0. Because otherwise for devices like this one that simply don't support MSI-X:
em0@pci0:0:25:0: class=0x020000 card=0x21ce17aa chip=0x15028086 rev=0x04 hdr=0x00

vendor     = 'Intel Corporation'
device     = '82579LM Gigabit Network Connection'
class      = network
subclass   = ethernet
cap 01[c8] = powerspec 2  supports D0 D3  current D0
cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
cap 13[e0] = PCI Advanced Features: FLR TP

the "Unable to map MSIX table " message triggers and the user may think that something is wrong while in fact all is fine (or at least nothing can be done to improve the situation).

6067 ↗(On Diff #53250)

Okay, I broke the line at the arguments in the updated patch; this line now still violates the style(8) implied rule of staying below 80 columns, though (and I think that format specifiers like %d require one to search for some substrings of a message anyway).

shurd accepted this revision.Jan 28 2019, 11:59 PM

Looks good. This may also fix where bar == -1 && pci_msix_count(dev) == 0 (or break it less... or something)

This revision is now accepted and ready to land.Jan 28 2019, 11:59 PM
erj added a subscriber: erj.Jan 29 2019, 5:43 PM

Ensure that iflib(4) and associated drivers pass correct RIDs to bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9) on the corresponding resources instead of using the RIDs initially passed to bus_alloc_resource_any(9) as the latter function may change those RIDs. Solely bnxt(4) was using the correct RIDs by caching the ones returned by bus_alloc_resource_any(9).

I'm a little confused on this point -- aren't we supposed to use the possibly changed rid from bus_alloc_resource_any, and that's why it needs to be cached (like in the removed adapter->io_rid field)?

Though, it sounds like using rman_get_rid() is preferred, and I'm guessing that always has the correct rid to use, somehow.

Yes, as the manpage for the bus_release_resource(9) family of functions says: "The bus methods are free to change the RIDs that they are given as a parameter. You must not depend on the value you gave it earlier." On way do deal with that is to cache the returned RID in e. g. the softc. As is, em(4) does that for adapter->ioport via adapter->io_rid, but not for the adapter->flash and adapter->memory resources. The summary doesn't state this correctly, sorry, I'll address that when committing.

However, caching the RID isn't at all necessary; the possibly changed RID can also be obtained via rman_get_rid(9) on the corresponding resource. The RID doesn't get magically associated with the resource, though, the bus_alloc_resource methods of the bus drivers need to do that via rman_set_rid(9). Not implementing that in a bus driver is a bug, though.

This revision was automatically updated to reflect the committed changes.