ixl(4): Update to 1.9.9-k
Closed, Public

Authored by krzysztof.galazka_intel.com on Apr 6 2018, 10:33 AM.

Details

Summary

Refresh upstream driver before impending conversion to iflib.

Major changes:

  • Support for descriptor writeback mode
  • Ability for the user to disable the firmware LLDP agent (Bug 221530)
  • Fix for TX queue hang when using TSO (Bug 221919)
  • Separate descriptor ring sizes for TX and RX rings

The full history of changes in this update is available in the GitHub repository: https://github.com/intel-wired-ethernet/freebsd/commits/ixl-update

Test Plan

I've tried compiling this on amd64 with no errors.

Respectfully request waiting for our Validation Team to perform a test pass before committing.

Diff Detail

Lint: Lint OK
Unit: No Unit Test Coverage
Build Status: Buildable 16323
Build 16261: arc lint + arc unit

Event Timeline

krzysztof.galazka_intel.com created this object with edit policy "Intel Networking (Project)".
erj accepted this revision as: erj. Apr 8 2018, 6:37 PM
erj added a subscriber: erj.

I've reviewed this patch-by-patch; it all looks good to me. We just need to make sure validation says it compiles and does what it's supposed to, too.

jeffrey.e.pieper_intel.com requested changes to this revision. Apr 9 2018, 7:25 PM

Kernel panic with TCP bi-directional netperf traffic using 2 threads:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x0
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff80d3dc64
stack pointer = 0x28:0xfffffe00ca86d8d0
frame pointer = 0x28:0xfffffe00ca86d900
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1

processor eflags = interrupt enabled, resume, IOPL = 0
current process = 12 (irq305: ixl0:q1)
[ thread pid 12 tid 101696 ]
Stopped at tcp_lro_flush_all+0x114: movq %rax,(%rcx)
db> where
Tracing pid 12 tid 101696 td 0xfffff800235f8560
tcp_lro_flush_all() at tcp_lro_flush_all+0x114/frame 0xfffffe00ca86d900
ixl_rxeof() at ixl_rxeof+0x5cb/frame 0xfffffe00ca86d990
ixl_msix_que() at ixl_msix_que+0x42/frame 0xfffffe00ca86d9e0
intr_event_execute_handlers() at intr_event_execute_handlers+0xe9/frame 0xfffffe00ca86da20
ithread_loop() at ithread_loop+0xe7/frame 0xfffffe00ca86da70
fork_exit() at fork_exit+0x83/frame 0xfffffe00ca86dab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00ca86dab0

--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
This revision now requires changes to proceed. Apr 9 2018, 7:25 PM
  • Patch rebased on top of HEAD.
  • Fixed panic reported for TCP BX traffic.
jeffrey.e.pieper_intel.com requested changes to this revision. Apr 17 2018, 12:02 AM

The panic is still reproducible; it just takes longer. See attached screen cap.

This revision now requires changes to proceed. Apr 17 2018, 12:02 AM
smh added a subscriber: smh. Apr 17 2018, 9:54 AM

Is there anything happening with this as head is still 1.7.12-k which is very far behind the latest intel download which is 1.9.7?

In D14985#318249, @smh wrote:

Is there anything happening with this as head is still 1.7.12-k which is very far behind the latest intel download which is 1.9.7?

I'm sorry that the investigation of the panic reported by Jeff took so long. The root cause has been identified: the ixl_queue_hang_check function triggers IRQs while the queue is being processed by the ixl_handle_que task, which leads to unsynchronized accesses to the LRO state. I'm running some basic performance tests on the patch before updating the review.
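
The race described above, where two contexts touch a queue's LRO state at the same time, can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `queue_model`, `queue_try_enter`, `queue_leave`, `hang_check`, and `replay_scenario` are hypothetical names, the ownership flag is a generic technique, and none of this is taken from the driver or from the fix that was eventually committed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a per-queue structure. */
struct queue_model {
    atomic_flag busy;         /* set while one context owns the LRO state */
    int lro_flushes;          /* simulated LRO flushes; touched only by the owner */
    atomic_int skipped_kicks; /* hang-check kicks dropped because the queue was busy */
};

static void queue_init(struct queue_model *q)
{
    atomic_flag_clear(&q->busy);
    q->lro_flushes = 0;
    atomic_store(&q->skipped_kicks, 0);
}

/* Claim the queue; returns false if another context already owns it. */
static bool queue_try_enter(struct queue_model *q)
{
    return !atomic_flag_test_and_set(&q->busy);
}

static void queue_leave(struct queue_model *q)
{
    atomic_flag_clear(&q->busy);
}

/* Simulated hang check: processes the queue only if nobody else does. */
static void hang_check(struct queue_model *q)
{
    if (!queue_try_enter(q)) {
        atomic_fetch_add(&q->skipped_kicks, 1);
        return;
    }
    q->lro_flushes++;         /* safe: this context owns the LRO state */
    queue_leave(q);
}

/* Replays the overlap; returns true if the overlapping kick was skipped. */
static bool replay_scenario(void)
{
    struct queue_model q;
    queue_init(&q);

    bool owned = queue_try_enter(&q); /* deferred task starts processing */
    assert(owned);
    hang_check(&q);                   /* overlapping kick: must be skipped */
    q.lro_flushes++;                  /* task flushes LRO while it owns the queue */
    queue_leave(&q);                  /* task is done */
    hang_check(&q);                   /* idle queue: kick proceeds */

    return atomic_load(&q.skipped_kicks) == 1 && q.lro_flushes == 2;
}
```

In this model the overlapping kick is dropped instead of running concurrently with the task, so the simulated LRO counter is only ever modified by the context that holds the flag.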

krzysztof.galazka_intel.com edited the summary of this revision.
  • Queue hang detection refactored to prevent kernel panic
  • Version bumped to 1.9.9 to align with out-of-tree driver
krzysztof.galazka_intel.com retitled this revision from ixl(4): Update to 1.9.8-k to ixl(4): Update to 1.9.9-k.
  • Deduplicate queue hang detection code and use same method in PF and VF drivers. Fixes kernel panic on VF with TCP bi-directional traffic.
This revision is now accepted and ready to land. May 1 2018, 5:51 PM

Is this to be committed to head? Are the Intel folks going to fire this off?

It was committed on 5-1 and MFC'ed to STABLE on 5-7, so I'm not sure why this wasn't closed.

sbruno closed this revision. May 14 2018, 2:03 PM