Page MenuHomeFreeBSD

fix vlan locking to permit sx acquisition in ioctl calls
ClosedPublic

Authored by mmacy on Aug 20 2018, 1:00 AM.

Details

Summary

This is a first cut of a change to address the fact that vlan hasn't caught up with all the multicast locking changes:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230510

Test Plan

Sean - can you set up a system to test with?

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Nice. This overall looks like a pretty great idea to me.

@ae tested the original diff (D11370) and found that the lock overhead did have a measurable decrease in their forwarding performance. I'd be curious to see if there's a win here in taking the liveness-locking back out.

Aside from the VLAN_XUNLOCK() on return at at 1366, the rest are comment churn.

sys/net/if_vlan.c
212 ↗(On Diff #46952)

This comment needs to be updated.

256 ↗(On Diff #46952)

This comment also appears to be incorrect now.

447 ↗(On Diff #46952)

This comment implies we can't wait here... is NET_EPOCH_WAIT() safe?

575 ↗(On Diff #46952)

Obsolete comment?

714 ↗(On Diff #46952)

Obsolete comment.

795 ↗(On Diff #46952)

Obsolete comment

1344 ↗(On Diff #46952)

Another comment about an rmlock.

1366 ↗(On Diff #46952)

Isn't VLAN_XUNLOCK(); needed here as well?

1483 ↗(On Diff #46952)

Obsolete comment? We've already VLAN_XLOCK_ASSERT()ed, and we're not getting an exclusive lock here.

1610 ↗(On Diff #46952)

Is this comment correct? Won't TRUNK_WLOCK() sleep?

Other comments addressed in upcoming patch.

sys/net/if_vlan.c
1366 ↗(On Diff #46952)

it's handled at the jump target

1610 ↗(On Diff #46952)

It's a mutex.

  • rebase to HEAD
  • address shurd's comments
sys/net/if_vlan.c
1421 ↗(On Diff #47071)

my test machine panics at this line on vlan creation.
It seems this unlock is redundant, since lock was unlocked in line 1338.

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 0a
fault virtual address   = 0x20
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c22297
stack pointer           = 0x28:0xfffffe00cedb05f0
frame pointer           = 0x28:0xfffffe00cedb0600
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 2796 (ifconfig)
[ thread pid 2796 tid 100788 ]
Stopped at      turnstile_broadcast+0x47:       movq    0x20(%rbx,%rax,1),%rcx
db> bt
Tracing pid 2796 tid 100788 td 0xfffff80052ff0000
turnstile_broadcast() at turnstile_broadcast+0x47/frame 0xfffffe00cedb0600
__mtx_unlock_sleep() at __mtx_unlock_sleep+0xb9/frame 0xfffffe00cedb0630
vlan_config() at vlan_config+0x6c9/frame 0xfffffe00cedb0690
vlan_clone_create() at vlan_clone_create+0x2b9/frame 0xfffffe00cedb0700
if_clone_createif() at if_clone_createif+0x4a/frame 0xfffffe00cedb0750
ifioctl() at ifioctl+0x721/frame 0xfffffe00cedb0840
kern_ioctl() at kern_ioctl+0x26d/frame 0xfffffe00cedb08b0
sys_ioctl() at sys_ioctl+0x15e/frame 0xfffffe00cedb0980
amd64_syscall() at amd64_syscall+0x369/frame 0xfffffe00cedb0ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00cedb0ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800465bba, rsp = 0x7fffffffe0c8, rbp = 0x7fffffffe0d0 ---

@ae tested the original diff (D11370) and found that the lock overhead did have a measurable decrease in their forwarding performance. I'd be curious to see if there's a win here in taking the liveness-locking back out.

I did not seen any noticeable difference. I used two vlans on unpatched head/, and in both cases with and without D16808 I got about 9Mpps with single Mellanox 100G adapter.

Add Sean since he's being asked to setup a test system.

I tried the latest diff and ran into an assertion on VLAN creation:

panic: Assertion in_epoch(net_epoch_preempt) failed at /root/ws/head/sys/net/if_vlan.c:1608
cpuid = 1
time = 1535143358
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2c/frame 0xfffffe001c530ee0
kdb_backtrace() at kdb_backtrace+0x53/frame 0xfffffe001c530fb0
vpanic() at vpanic+0x277/frame 0xfffffe001c531080
doadump() at doadump/frame 0xfffffe001c5310e0
vlan_capabilities() at vlan_capabilities+0x93/frame 0xfffffe001c531160
vlan_config() at vlan_config+0x3b2/frame 0xfffffe001c531230
vlan_clone_create() at vlan_clone_create+0x49c/frame 0xfffffe001c531340
if_clone_createif() at if_clone_createif+0x7c/frame 0xfffffe001c5313a0
if_clone_create() at if_clone_create+0x2e0/frame 0xfffffe001c5313f0
ifioctl() at ifioctl+0x697/frame 0xfffffe001c531590
soo_ioctl() at soo_ioctl+0x79e/frame 0xfffffe001c531700
fo_ioctl() at fo_ioctl+0x4c/frame 0xfffffe001c531740
kern_ioctl() at kern_ioctl+0x361/frame 0xfffffe001c531810
sys_ioctl() at sys_ioctl+0x2ea/frame 0xfffffe001c531910
syscallenter() at syscallenter+0x4ed/frame 0xfffffe001c5319f0
amd64_syscall() at amd64_syscall+0x1b/frame 0xfffffe001c531ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe001c531ab0

  • syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800465bba, rsp = 0x7fffffffe

238, rbp = 0x7fffffffe280 ---
KDB: enter: panic

Tested this patch on head(trueos synced as of the 19th + vlan fix from np) with ix and cc(cxgbe) drivers. No issue with creation of vlans and nothing noticeable in cursory testing with ping floods and iperf.

Looks good to me.

This revision is now accepted and ready to land.Sep 14 2018, 1:48 PM
This revision was automatically updated to reflect the committed changes.