
Fix cxgbe netmap when interface is DOWN
Closed, Public

Authored by jch on Nov 1 2018, 3:25 PM.

Details

Summary

A kernel panic occurs if the cxgbe interface is DOWN when activating netmap.
This patch prevents the driver from freeing up cxgbe netmap resources when
they have not been allocated.

Submitted by: Nicolas Witkowski <nwitkowski@Verisign.com>
MFC after: 1 week
Sponsored by: Verisign, Inc.
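
The idea described in the summary can be sketched as a small user-space model. This is an illustration only, not the committed diff: `struct fake_vi`, `fake_netmap_on()`, `fake_netmap_off()` and `fake_netmap_reg()` are hypothetical names, and the real driver's structures and checks may differ. The sketch assumes that enabling netmap fails with EAGAIN while the interface is down, and that the caller then rolls back through the "off" routine.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-interface state; not the real cxgbe structures. */
struct fake_vi {
	bool	 init_done;	/* set once the interface has been brought up */
	int	*nm_rings;	/* netmap resources, allocated only by fake_netmap_on() */
	size_t	 nrings;
};

/* Enable netmap: refuses with EAGAIN while the interface is down. */
static int
fake_netmap_on(struct fake_vi *vi)
{
	if (!vi->init_done)
		return (EAGAIN);
	vi->nm_rings = calloc(vi->nrings, sizeof(*vi->nm_rings));
	if (vi->nm_rings == NULL)
		return (ENOMEM);
	return (0);
}

/* Disable netmap: the guard makes this safe even if "on" never succeeded. */
static int
fake_netmap_off(struct fake_vi *vi)
{
	if (!vi->init_done || vi->nm_rings == NULL)
		return (0);	/* nothing was allocated, nothing to free */
	for (size_t i = 0; i < vi->nrings; i++)
		vi->nm_rings[i] = 0;	/* without the guard, a NULL array faults here */
	free(vi->nm_rings);
	vi->nm_rings = NULL;
	return (0);
}

/* Register path: on failure, roll back by calling the "off" routine. */
static int
fake_netmap_reg(struct fake_vi *vi)
{
	int rc = fake_netmap_on(vi);

	if (rc != 0)
		(void)fake_netmap_off(vi);	/* rollback must tolerate a failed "on" */
	return (rc);
}
```

In this model, calling `fake_netmap_reg()` on an interface with `init_done` unset returns EAGAIN and leaves `nm_rings` untouched, instead of walking an array that was never allocated.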

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jch created this revision. Nov 1 2018, 3:25 PM
np accepted this revision. Nov 5 2018, 6:38 PM

Can you provide exact steps to reproduce the problem that you described? I get an EAGAIN (as expected) and not a panic if I try to enable netmap when the link is down.

That said, the proposed change cannot hurt. Please put brackets around 0 in the return statement to match the rest of the file before you commit.

This revision is now accepted and ready to land. Nov 5 2018, 6:38 PM
jch updated this revision to Diff 50039. Nov 5 2018, 9:26 PM

Address @np comment

This revision now requires review to proceed. Nov 5 2018, 9:26 PM
jch added a comment. Nov 5 2018, 9:29 PM
In D17802#381512, @np wrote:

> Can you provide exact steps to reproduce the problem that you described? I get an EAGAIN (as expected) and not a panic if I try to enable netmap when the link is down.

Interesting, the steps to reproduce this issue are quite straightforward (for both old FreeBSD 10.3 and recent FreeBSD 11.2):

  1. Have the interface down, e.g. vcc0
  2. Try to start a netmap application on vcc0

Below is a FreeBSD 10.3 kernel panic as an example:

Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 0b
fault virtual address	= 0x6
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff804dd169
stack pointer	        = 0x28:0xfffffe201db262e0
frame pointer	        = 0x28:0xfffffe201db263b0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 2393 (lupe)
[ thread pid 2393 tid 100516 ]
Stopped at      cxgbe_netmap_reg+0x159: cmpw    0x6(%rcx,%rax,1),%dx

Tracing pid 2393 tid 100516 td 0xfffff80141878000
cxgbe_netmap_reg() at cxgbe_netmap_reg+0x159/frame 0xfffffe201db263b0
netmap_do_unregif() at netmap_do_unregif+0x89/frame 0xfffffe201db263f0
netmap_do_regif() at netmap_do_regif+0x1a1/frame 0xfffffe201db26430
netmap_ioctl() at netmap_ioctl+0x755/frame 0xfffffe201db267e0
devfs_ioctl_f() at devfs_ioctl_f+0x139/frame 0xfffffe201db26840
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe201db268b0
sys_ioctl() at sys_ioctl+0x140/frame 0xfffffe201db26990
amd64_syscall() at amd64_syscall+0x40f/frame 0xfffffe201db26ab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe201db26ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x801990fea, rsp = 0x7ffffffbe958, rbp = 0x7ffffffbf3a0 ---
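
The faulting instruction, `cmpw 0x6(%rcx,%rax,1),%dx`, reads a 16-bit value at `%rcx + %rax + 6`, and the fault virtual address is 0x6, so the two registers sum to zero. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is a 16-bit structure field at offset 6 being read through a NULL base pointer. The sketch below only shows that arithmetic; `struct fake_queue`, `field16` and `field_address()` are hypothetical and do not correspond to real cxgbe definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the offset of `field16` matters here. */
struct fake_queue {
	uint16_t pad[3];	/* bytes 0..5 */
	uint16_t field16;	/* bytes 6..7 */
};

/* Address that would be dereferenced when reading base[index].field16. */
static uintptr_t
field_address(uintptr_t base, size_t index)
{
	return (base + index * sizeof(struct fake_queue) +
	    offsetof(struct fake_queue, field16));
}
```

With a base of 0 and an index of 0, `field_address()` evaluates to 0x6, matching the fault address in the trace.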

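And indeed, netmap_do_regif() fails here because cxgbe_netmap_on() returned EAGAIN, since the interface was down. As the backtrace shows, netmap_do_regif() then calls netmap_do_unregif(), which calls back into cxgbe_netmap_reg(), where the page fault occurs. This issue might also be due to the way we initialize netmap.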

> That said, the proposed change cannot hurt. Please put brackets around 0 in the return statement to match the rest of the file before you commit.

Good catch, fixed.

jch updated this revision to Diff 50040. Nov 5 2018, 9:34 PM

Address @np comment

This revision was not accepted when it landed; it landed in state Needs Review. Nov 12 2018, 5:57 PM
This revision was automatically updated to reflect the committed changes.