Page MenuHomeFreeBSD

ntb_hw(4): Only record the first three MSIX vectors
ClosedPublic

Authored by cem on May 21 2016, 9:48 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sep 26 2024, 2:02 PM
Unknown Object (File)
Sep 26 2024, 6:56 AM
Unknown Object (File)
Sep 25 2024, 6:16 PM
Unknown Object (File)
Sep 25 2024, 4:53 PM
Unknown Object (File)
Sep 24 2024, 10:28 PM
Unknown Object (File)
Sep 24 2024, 12:53 PM
Unknown Object (File)
Sep 24 2024, 7:44 AM
Unknown Object (File)
Sep 24 2024, 3:58 AM
Subscribers

Details

Summary

Don't overrun the msix_data array by reading the (unused) link state
interrupt information.

Reported by: mav
Sponsored by: EMC / Isilon Storage Division

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

cem retitled this revision from to ntb_hw(4): Only record the first three MSIX vectors.
cem updated this object.
cem edited the test plan for this revision. (Show Details)
cem added a reviewer: mav.
mav edited edge metadata.

The patch looks good to me, and alike change I've tried myself before fixed the kernel panic.

On the other hand all this workaround stopped any interface traffic for me (without it traffic flows, but _very_ slowly), so can not really test it. Will investigate more.

This revision is now accepted and ready to land.May 22 2016, 7:20 AM

I'm just not sure why ntb_get_msix_info() needs argument to assert its value inside. Why not just remove it, doing fixed loop same as in other places?

In D6489#137571, @mav wrote:

The patch looks good to me, and alike change I've tried myself before fixed the kernel panic.

On the other hand all this workaround stopped any interface traffic for me (without it traffic flows, but _very_ slowly), so can not really test it. Will investigate more.

Wow, what panic did you hit? I'm surprised, I didn't run into it.

Re: if_ntb, I'm sure that code is buggy. We don't use it at Isilon so I haven't spent much effort debugging it.

In D6489#137575, @mav wrote:

I'm just not sure why ntb_get_msix_info() needs argument to assert its value inside. Why not just remove it, doing fixed loop same as in other places?

For now, that would be fine. Most of the code tries to handle the hardware-reported number of registers generally rather than hardcoding, though.

In D6489#137595, @cem wrote:

Wow, what panic did you hit? I'm surprised, I didn't run into it.

I am surprised you could not hit it. This bug overwrites ntb->peer_lapic_bar pointer with some garbage, that caused fault inside ntb_peer_db_set(). At least it happens so on FreeBSD 10 where I am trying to run this code.

Re: if_ntb, I'm sure that code is buggy. We don't use it at Isilon so I haven't spent much effort debugging it.

That somewhat explains my thoughts about this code after day spent reading it. :(

In D6489#137596, @mav wrote:
In D6489#137595, @cem wrote:

Wow, what panic did you hit? I'm surprised, I didn't run into it.

I am surprised you could not hit it. This bug overwrites ntb->peer_lapic_bar pointer with some garbage, that caused fault inside ntb_peer_db_set(). At least it happens so on FreeBSD 10 where I am trying to run this code.

Ah, we don't use that routine.

Re: if_ntb, I'm sure that code is buggy. We don't use it at Isilon so I haven't spent much effort debugging it.

That somewhat explains my thoughts about this code after day spent reading it. :(

:-)

This revision was automatically updated to reflect the committed changes.