fix iflib netmap rx
ClosedPublic

Authored by shurd on Aug 28 2017, 12:11 AM.

Details

Summary

RX queue setup for netmap was broken because netmap_rxq_init was being called before IFDI_INIT, so the ring tail pointer ended up being reset to zero.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 11275

Event Timeline

em0@pci0:0:31:6:	class=0x020000 card=0x06db1028 chip=0x156f8086 rev=0x21 hdr=0x00
    device     = 'Ethernet Connection I219-LM'

Tested with netmap's pkt-gen

On receive I get a crash:

#7  0xffffffff80a72006 in kassert_panic (
   fmt=0xffffffff810482e7 "Assertion %s failed at %s:%d")
   at /usr/home/johannes/dev/freebsd/freebsd-base-graphics/sys/kern/kern_shutdown.c:669
#8  0xffffffff8053a20f in em_isc_rxd_pkt_get (arg=<unavailable>,
   ri=<optimized out>)
   at /usr/home/johannes/dev/freebsd/freebsd-base-graphics/sys/dev/e1000/em_txrx.c:697
#9  0xffffffff80b81288 in iflib_netmap_rxsync (kring=<optimized out>,
   flags=<optimized out>)
   at /usr/home/johannes/dev/freebsd/freebsd-base-graphics/sys/net/iflib.c:1008
#10 0xffffffff806c12fc in netmap_poll (priv=<optimized out>,
   events=<optimized out>, sr=0xfffff80220420560)
   at /usr/home/johannes/dev/freebsd/freebsd-base-graphics/sys/dev/netmap/netmap.c:2724
#11 0xffffffff806c3a02 in freebsd_netmap_poll (cdevi=<optimized out>,
   events=1, td=0xfffff80220420560)
   at /usr/home/johannes/dev/freebsd/freebsd-base-graphics/sys/dev/netmap/netmap_freebsd.c:1393
#12 0xffffffff80941eef in devfs_poll_f (fp=0xfffff8006cdd9c30, events=1,
   cred=0xfffff8000c4bba00, td=0xfffff80220420560)

On transmit I get:

621.256106 sender_body [1181] start, fd 3 main_fd 3
622.257327 main_thread [2056] 1.022 Kpps (1.023 Kpkts 491.040 Kbps in 1001225 usec) 511.50 avg_batch 0 min_space
623.258329 main_thread [2056] 0.000 pps (0.000 pkts 0.000 bps in 1001002 usec) 0.00 avg_batch 99999 min_space
623.260011 sender_body [1250] poll error/timeout on queue 0: No error: 0
624.259329 main_thread [2056] 1.023 Kpps (1.024 Kpkts 491.520 Kbps in 1000999 usec) 341.33 avg_batch 99999 min_space
625.263889 main_thread [2056] 0.000 pps (0.000 pkts 0.000 bps in 1004548 usec) 0.00 avg_batch 99999 min_space
625.263876 sender_body [1250] poll error/timeout on queue 0: No error: 0
626.265335 main_thread [2056] 1.023 Kpps (1.024 Kpkts 491.520 Kbps in 1001458 usec) 341.33 avg_batch 99999 min_space
627.265335 sender_body [1250] poll error/timeout on queue 0: No error: 0
627.276066 main_thread [2056] 1.013 Kpps (1.024 Kpkts 491.520 Kbps in 1010731 usec) 341.33 avg_batch 99999 min_space
628.276338 main_thread [2056] 0.000 pps (0.000 pkts 0.000 bps in 1000273 usec) 0.00 avg_batch 99999 min_space

The packets that do get transmitted are received by the receiving machine.

Yeah you need to be running the actual iflib development branch.

avoid gratuitous ithread dispatch

@sbruno @johalun0_gmail.com I've created a dedicated branch for this fix: iflib/netmap_rx

This revision is now accepted and ready to land. Aug 30 2017, 1:04 AM

I did a clean world+kernel build on the iflib/netmap_rx branch.

Running pkt-gen -i em1 -f rx on the receiver and pkt-gen -i em1 -f tx on the sender results in a crash on the receiver.
(The sender and receiver are two different machines connected back to back on em1.)

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe03668b2510
vpanic() at vpanic+0x19c/frame 0xfffffe03668b2590
kassert_panic() at kassert_panic+0x126/frame 0xfffffe03668b2600
em_isc_rxd_pkt_get() at em_isc_rxd_pkt_get+0xf1/frame 0xfffffe03668b2670
iflib_netmap_rxsync() at iflib_netmap_rxsync+0x235/frame 0xfffffe03668b2770
netmap_poll() at netmap_poll+0x79c/frame 0xfffffe03668b2870
freebsd_netmap_poll() at freebsd_netmap_poll+0x32/frame 0xfffffe03668b28a0
devfs_poll_f() at devfs_poll_f+0x7f/frame 0xfffffe03668b2900
kern_poll() at kern_poll+0x4fc/frame 0xfffffe03668b2aa0
sys_poll() at sys_poll+0x50/frame 0xfffffe03668b2ac0
amd64_syscall() at amd64_syscall+0x79b/frame 0xfffffe03668b2bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe03668b2bf0
--- syscall (209, FreeBSD ELF64, sys_poll), rip = 0x800daf2aa, rsp = 0x7fffdfff9e78, rbp = 0x7fffdfff9eb0 ---
KDB: enter: panic
[ thread pid 828 tid 100177 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why

However, if I limit the packet rate like so:
pkt-gen -i em1 -f rx on the receiver and pkt-gen -i em1 -f tx -R 100 on the sender
the receiver stops receiving after about 600 packets (6 batches received). After that the rate on the receiver goes to zero, but it does not crash.

Interrupting the process at that point might result in a crash:

cpuid = 2
time = 1504082223
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02398c0730
vpanic() at vpanic+0x19c/frame 0xfffffe02398c07b0
kassert_panic() at kassert_panic+0x126/frame 0xfffffe02398c0820
iflib_fl_bufs_free() at iflib_fl_bufs_free+0x1c2/frame 0xfffffe02398c0870
iflib_stop() at iflib_stop+0x478/frame 0xfffffe02398c08c0
iflib_netmap_register() at iflib_netmap_register+0x1a4/frame 0xfffffe02398c0900
netmap_hw_reg() at netmap_hw_reg+0x2c/frame 0xfffffe02398c0930
netmap_do_unregif() at netmap_do_unregif+0x16a/frame 0xfffffe02398c0960
netmap_priv_delete() at netmap_priv_delete+0x31/frame 0xfffffe02398c0980
netmap_dtor() at netmap_dtor+0x2b/frame 0xfffffe02398c09a0
devfs_destroy_cdevpriv() at devfs_destroy_cdevpriv+0x8b/frame 0xfffffe02398c09c0
devfs_close_f() at devfs_close_f+0x65/frame 0xfffffe02398c09f0
closef() at closef+0x1f5/frame 0xfffffe02398c0a80
closefp() at closefp+0x9f/frame 0xfffffe02398c0ac0
amd64_syscall() at amd64_syscall+0x79b/frame 0xfffffe02398c0bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe02398c0bf0

The igb0 interface works flawlessly at 1.4 Mpps, so the problem appears to be limited to em.

@johan_duh.se I've fixed these two panics in my dev branch. Both em and igb work fine for me now. I'll update the patch as soon as you can confirm that you find no further issues.

Confirmed that pkt-gen now works fine at max packet rate on both igb and em. ^C hasn't caused any issues either.

shurd edited reviewers, added: kmacy; removed: shurd.

I'll take this from here.

This revision was automatically updated to reflect the committed changes.