Improve netmap TX handling when TX IRQs are not used/supported
ClosedPublic

Authored by shurd on Jul 17 2018, 4:56 PM.

Details

Summary

Use the timer to poll for TX completions when there are
outstanding TX slots. Track when the last driver timer was called
to prevent overcalling it. Also clean up some kring vs NIC ring
usage.

Test Plan

Test netmap, especially tx-only use cases (e.g., pkt-gen).

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Test between two 1000baseT full-duplex NICs, connected through a gigabit switch. I218LM is sending, I219LM is receiving.
Kernel from 2018-07-18 (how to find svn revision when building from github?)
Patched kernel only on sending side. Receiving side is same revision but unpatched.

Before patch:

johannes@jd2:~ % cat netmap-head.log
427.969089 main [2699] interface is em0
427.969868 main [2824] using default burst size: 512
427.971339 main [2832] running on 1 cpus (have 4)
427.972807 extract_ip_range [465] range is 10.0.0.1:1234 to 10.0.0.1:1234
427.974517 extract_ip_range [465] range is 192.168.1.2:1234 to 192.168.1.2:1234
428.667628 main [2932] mapped 334980KB at 0x801200000
428.668278 main [3029] Sending 512 packets every  0.000000000 s
428.670795 start_threads [2374] Wait 2 secs for phy reset
430.673339 start_threads [2376] Ready...
430.674087 sender_body [1518] start, fd 3 main_fd 3
431.675291 main_thread [2464] 1.022 Kpps (1.023 Kpkts 491.040 Kbps in 1001194 usec) 511.50 avg_batch 0 min_space
432.678767 main_thread [2464] 0.000 pps (0.000 pkts 0.000 bps in 1003476 usec) 0.00 avg_batch 99999 min_space
432.713502 sender_body [1587] poll error/timeout on queue 0: No error: 0
433.687270 main_thread [2464] 1.015 Kpps (1.024 Kpkts 491.520 Kbps in 1008503 usec) 341.33 avg_batch 99999 min_space
434.692404 main_thread [2464] 0.000 pps (0.000 pkts 0.000 bps in 1005134 usec) 0.00 avg_batch 99999 min_space
434.770355 sender_body [1587] poll error/timeout on queue 0: No error: 0
435.752546 main_thread [2464] 966.000 pps (1.024 Kpkts 491.520 Kbps in 1060142 usec) 341.33 avg_batch 99999 min_space
436.757891 main_thread [2464] 0.000 pps (0.000 pkts 0.000 bps in 1005345 usec) 0.00 avg_batch 99999 min_space
436.800657 sender_body [1587] poll error/timeout on queue 0: No error: 0
437.762638 main_thread [2464] 1.019 Kpps (1.024 Kpkts 491.520 Kbps in 1004747 usec) 341.33 avg_batch 99999 min_space
438.772120 main_thread [2464] 0.000 pps (0.000 pkts 0.000 bps in 1009482 usec) 0.00 avg_batch 99999 min_space

After patch:

johannes@jd2:~ % cat netmap-shurd.log
136.918637 main [2699] interface is em0
136.919705 main [2824] using default burst size: 512
136.923207 main [2832] running on 1 cpus (have 4)
136.927070 extract_ip_range [465] range is 10.0.0.1:1234 to 10.0.0.1:1234
136.930774 extract_ip_range [465] range is 192.168.1.2:1234 to 192.168.1.2:1234
137.626649 main [2932] mapped 334980KB at 0x801200000
137.632362 main [3029] Sending 512 packets every  0.000000000 s
137.633838 start_threads [2374] Wait 2 secs for phy reset
139.761587 start_threads [2376] Ready...
139.767400 sender_body [1518] start, fd 3 main_fd 3
140.830972 main_thread [2464] 962.000 pps (1.023 Kpkts 491.040 Kbps in 1063570 usec) 511.50 avg_batch 0 min_space
141.888769 main_thread [2464] 6.776 Kpps (7.168 Kpkts 3.441 Mbps in 1057797 usec) 477.87 avg_batch 99999 min_space
142.935544 main_thread [2464] 5.380 Kpps (5.632 Kpkts 2.703 Mbps in 1046775 usec) 469.33 avg_batch 99999 min_space
143.998009 main_thread [2464] 5.783 Kpps (6.144 Kpkts 2.949 Mbps in 1062465 usec) 438.86 avg_batch 99999 min_space
145.022817 main_thread [2464] 5.496 Kpps (5.632 Kpkts 2.703 Mbps in 1024808 usec) 469.33 avg_batch 99999 min_space
146.052823 main_thread [2464] 5.965 Kpps (6.144 Kpkts 2.949 Mbps in 1030006 usec) 472.62 avg_batch 99999 min_space
147.122003 main_thread [2464] 5.268 Kpps (5.632 Kpkts 2.703 Mbps in 1069180 usec) 469.33 avg_batch 99999 min_space
148.191095 main_thread [2464] 5.268 Kpps (5.632 Kpkts 2.703 Mbps in 1069092 usec) 469.33 avg_batch 99999 min_space
149.259983 main_thread [2464] 5.748 Kpps (6.144 Kpkts 2.949 Mbps in 1068889 usec) 472.62 avg_batch 99999 min_space
150.329080 main_thread [2464] 5.268 Kpps (5.632 Kpkts 2.703 Mbps in 1069097 usec) 469.33 avg_batch 99999 min_space
151.382856 main_thread [2464] 5.345 Kpps (5.632 Kpkts 2.703 Mbps in 1053775 usec) 469.33 avg_batch 99999 min_space
152.430743 main_thread [2464] 8.306 Kpps (8.704 Kpkts 4.178 Mbps in 1047888 usec) 362.67 avg_batch 99999 min_space

Normally we should see around 1.4 Mpps on gigabit, so 5 Kpps seems extremely slow. There's no noticeable CPU usage that would explain the slowness.

Note: I replaced my router with a Netgear gigabit switch yesterday, so this is the first time I'm using pkt-gen with this switch, but that should not be an issue. Can someone reproduce the results?

Looks good to me. FYI, I get:

Linux 4.17:
IGB:
891.958138 main_thread [2501] 1.488 Mpps (1.490 Mpkts 715.006 Mbps in 1001011 usec) 2.11 avg_batch 99999 min_space
IXGBE:
909.881462 main_thread [2501] 14.882 Mpps (14.895 Mpkts 7.149 Gbps in 1000874 usec) 123.58 avg_batch 99999 min_space
IXL:
132.122482 main_thread [2501] 41.792 Mpps (41.834 Mpkts 20.080 Gbps in 1001011 usec) 208.12 avg_batch 99999 min_space

r336455:
IGB:
112.796698 main_thread [2056] 1.490 Mpps (1.584 Mpkts 760.081 Mbps in 1063000 usec) 81.17 avg_batch 99999 min_space
IXGBE:
622.373110 sender_body [1250] poll error/timeout on queue 0: No error: 0
622.781122 main_thread [2056] 481.000 pps (512.000 pkts 245.760 Kbps in 1063503 usec) 256.00 avg_batch 99999 min_space
IXL:
968.088320 main_thread [2056] 12.071 Mpps (12.418 Mpkts 5.961 Gbps in 1028750 usec) 29.22 avg_batch 99999 min_space

r336455 w/ D16300 patch:
IGB:
654.027077 main_thread [2056] 1.488 Mpps (1.491 Mpkts 715.455 Mbps in 1002000 usec) 53.54 avg_batch 99999 min_space
IXGBE:
750.355083 main_thread [2056] 14.891 Mpps (14.905 Mpkts 7.155 Gbps in 1000947 usec) 378.11 avg_batch 99999 min_space
IXL:
007.264052 main_thread [2056] 12.598 Mpps (12.623 Mpkts 6.059 Gbps in 1002000 usec) 24.28 avg_batch 99999 min_space

I repeatedly tried with head and IXGBE gear, but it just kept throwing errors. So this patch is definitely an improvement, and FreeBSD is on par with Linux except when it comes to IXL gear.

sys/net/iflib.c
925 ↗(On Diff #45419)

Is there a particular reason for dropping this const?

This revision is now accepted and ready to land. Jul 18 2018, 5:32 PM
shurd marked an inline comment as done.

Restore const declaration for head pointer in txsync

This revision now requires review to proceed. Jul 18 2018, 6:10 PM
sys/net/iflib.c
925 ↗(On Diff #45419)

No, this appears to have crept in from testing code.

Use the same callout reset logic in the admin task as the txq timer.

@johalun0_gmail.com I wonder if the admin task was overriding the txqs
with a longer timeout in your case.

Still the same. I'd like to try on my I219 machine since the I218 has been known to act weird, but the I219 is in my main dev machine, so I need to find the right time to update the kernel.

OK, switching TX/RX between I218LM and I219LM makes no difference, still 5 Kpps. iperf between the two machines measures 900+ Mbps.

Are the counters in dev.em.X.mac_stats.xoff_txd and dev.em.X.mac_stats.xon_txd incrementing? If so, you can try turning off flow control... sysctl dev.em.X.fc=0

The counters are not incrementing. I noticed a couple of odd things that made me unsure of my first test results (in particular, whether I had set the destination IP properly).
With the latest patch I get 5 Kpps if I don't specify a destination, which might be understandable if I'm flooding the network.
If I set the destination IP and MAC, I get the usual poll error on the receiving side.

I will try later today with the first version of the patch and with properly set destination IP and MAC.

One other thing maybe not related to this but what happened to TSO4 option? It is no longer visible in ifconfig output...

There were a lot of TSO changes in D15720 (r336313). I believe it's disabled by default now, but can be enabled if you know it works properly (and it sounds like it rarely does).

I tried the first version again and it seems OK on the TX side. Not sure about RX: I'm seeing some poll errors, but that might be my messed-up I218LM, which can't do more than 70 Kpps TX (it does 1.4 Mpps with FreeBSD 11), but that's another problem.
Between the I219LM and a Realtek NIC I get a stable 150 Kpps, and I guess that's the limit for emulated netmap.

The last remaining option for 1.4 Mpps is my Macbook Pro with a Thunderbolt Ethernet dongle. I'll return with results when I've updated the Macbook's dual boot.

Edit: the problem with the first test that showed 5 Kpps was that I only specified destination IP and not MAC.

Results with I219LM (em0) TX and Macbook thunderbolt dongle (bge0) RX

johannes@jd:~/dev/freebsd/kms-drm % sudo pkt-gen -i em0 -f tx -d 192.168.1.14 -D 10:dd:b1:ce:8b:dc -s 192.168.1.2 -S d4:81:d7:c2:f6:b9
014.550977 main [2699] interface is em0
014.551016 main [2824] using default burst size: 512
014.551025 main [2832] running on 1 cpus (have 4)
014.551084 extract_ip_range [465] range is 192.168.1.2:1234 to 192.168.1.2:1234
014.551097 extract_ip_range [465] range is 192.168.1.14:1234 to 192.168.1.14:1234
015.294761 main [2932] mapped 334980KB at 0x801200000
Sending on netmap:em0: 1 queues, 1 threads and 1 cpus.
192.168.1.2 -> 192.168.1.14 (d4:81:d7:c2:f6:b9 -> 10:dd:b1:ce:8b:dc)
015.295066 main [3029] Sending 512 packets every  0.000000000 s
015.295183 start_threads [2374] Wait 2 secs for phy reset
017.295524 start_threads [2376] Ready...
017.295866 sender_body [1518] start, fd 3 main_fd 3
018.303478 main_thread [2464] 1.015 Kpps (1.023 Kpkts 491.040 Kbps in 1007608 usec) 511.50 avg_batch 0 min_space
018.419521 sender_body [1600] drop copy
019.305338 main_thread [2464] 1.004 Mpps (1.006 Mpkts 482.673 Mbps in 1001860 usec) 341.33 avg_batch 99999 min_space
020.306338 main_thread [2464] 1.022 Mpps (1.023 Mpkts 491.274 Mbps in 1000999 usec) 341.50 avg_batch 99999 min_space
021.313395 main_thread [2464] 1.024 Mpps (1.031 Mpkts 494.961 Mbps in 1007055 usec) 341.33 avg_batch 99999 min_space
022.314367 main_thread [2464] 1.024 Mpps (1.025 Mpkts 492.012 Mbps in 1000975 usec) 341.33 avg_batch 99999 min_space
023.316351 main_thread [2464] 1.024 Mpps (1.026 Mpkts 492.503 Mbps in 1001983 usec) 341.33 avg_batch 99999 min_space
024.318342 main_thread [2464] 1.024 Mpps (1.026 Mpkts 492.503 Mbps in 1001992 usec) 341.45 avg_batch 99999 min_space
025.319348 main_thread [2464] 1.024 Mpps (1.025 Mpkts 492.012 Mbps in 1001005 usec) 341.45 avg_batch 99999 min_space
026.320603 main_thread [2464] 1.025 Mpps (1.026 Mpkts 492.503 Mbps in 1001255 usec) 341.33 avg_batch 99999 min_space

Hmm, it's interesting how it's capped at 1.024 Mpps (RX goes up to ~950 Kpps). I was expecting around 1.4 Mpps... Anyway, it seems stable both ways now.

This revision was not accepted when it landed; it landed in state Needs Review. Jul 20 2018, 5:24 PM
This revision was automatically updated to reflect the committed changes.