
ixgbe: Bring back accounting for tx in AIM
Needs Review · Public

Authored by kbowling on May 7 2021, 12:04 AM.

Details

Summary

This is a necessary fix to the AIM algorithm.

I would still not advise using it under any circumstances until D30094 is resolved.

Test Plan

Tested on two X552s. This does not fix the problems being discussed in D30094, but it will be necessary eventually to test/use AIM once it is functioning correctly.

Diff Detail

Lint: Skipped
Unit Tests: Skipped

Event Timeline

sys/dev/ixgbe/if_ix.c
2138

Looking at ixgbe_isc_txd_encap(), it seems txr->packets never gets adjusted.

sys/dev/ixgbe/if_ix.c
2138

That is true, thanks for bringing Tx in too. We need to make a change in ixgbe_isc_txd_encap() as well, to include txr->packets = txr->total_packets;

Ok I'll add that accounting and report my findings

I have good news and bad news to report.

The good news: AIM functions closer to as intended on the sender-heavy workload with the txr accounting in place.

The bad news: in my test environment it occasionally lops off around 2 Gb/s on a single-stream TCP TSO sender, with occasional packet loss. On the receiver, AIM reduces single-stream UDP performance by about 1 Gb/s and increases loss by about 20%. That seems like a bigger issue than the current situation, so as a break-fix I'd rather just change static int ixgbe_max_interrupt_rate = (4000000 / IXGBE_LOW_LATENCY); to use IXGBE_AVE_LATENCY instead of enabling AIM, while we continue to figure out this EITR interaction in the Intel drivers.

From my perspective there are two worthwhile paths to investigate: in one, we improve the AIM algorithm; in the other, we figure out what is going on in iflib and make it work the way it's supposed to. We have enough information on the sender that we really shouldn't need to dynamically tune EITR, as far as I can tell. I'm less sure about the receiver, but I think in the cases where FreeBSD is used a correct static EITR value would be OK if we get the iflib re-arms right. What do you think?

There are some optimizations in the iflib driver to decrease TX descriptor writeback (txq_max_rs_deferred; I think @gallatin mentioned this earlier). I wonder if this is just a matter of the old AIM algorithm being too aggressive and needing to be tamped down a bit for this batching.

sys/dev/ixgbe/if_ix.c
2117

I would refrain from using que->msix as the queue array index; this may not work when SR-IOV is enabled.
Ideally we would want to use txr.me, but with the Rx and Tx queue separation, I think we may have to introduce a new "index" variable to explicitly capture the corresponding TxQ index for a given RxQ.

@kbowling ,

I have a similar observation (the bad news) w.r.t. UDP, but TCP looks just fine to me. My runs are all on a NetApp platform.
Please note: my client is not HoL.

client% sudo iperf3 -c 192.168.228.0 -i 5 -u -b 2G
Connecting to host 192.168.228.0, port 5201
[  4] local 192.168.227.254 port 24476 connected to 192.168.228.0 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  4]   0.00-5.00   sec  1.16 GBytes  2.00 Gbits/sec  139506  
[  4]   5.00-10.00  sec  1.16 GBytes  2.00 Gbits/sec  139508  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  4]   0.00-10.00  sec  2.33 GBytes  2.00 Gbits/sec  0.000 ms  0/279014 (0%)  sender
[  4]   0.00-10.09  sec  1.61 GBytes  1.37 Gbits/sec  0.017 ms  85679/279012 (31%)  receiver

iperf Done.

W.r.t. TCP, I do not see your observation. My lab NIC is an embedded 10G (X552).

client% sudo iperf3 -c 192.168.228.0 -i 5 -b 2G
Connecting to host 192.168.228.0, port 5201
[  4] local 192.168.227.254 port 38791 connected to 192.168.228.0 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  4]   0.00-5.00   sec  1.16 GBytes  2.00 Gbits/sec    1   4.33 MBytes       
[  4]   5.00-10.00  sec  1.16 GBytes  2.00 Gbits/sec    0   7.32 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  4]   0.00-10.00  sec  2.33 GBytes  2.00 Gbits/sec    1             sender
[  4]   0.00-10.00  sec  2.33 GBytes  2.00 Gbits/sec                  receiver

iperf Done.

client% sudo iperf3 -c 192.168.228.0 -i 5 -b 7G
Connecting to host 192.168.228.0, port 5201
[  4] local 192.168.227.254 port 38773 connected to 192.168.228.0 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  4]   0.00-5.00   sec  4.07 GBytes  7.00 Gbits/sec    0   3.74 MBytes
[  4]   5.00-10.00  sec  4.07 GBytes  7.00 Gbits/sec    0   3.74 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  4]   0.00-10.00  sec  8.15 GBytes  7.00 Gbits/sec    0             sender
[  4]   0.00-10.09  sec  8.15 GBytes  6.94 Gbits/sec                  receiver

iperf Done.

Also, I would prefer to have a quick call to discuss the ideas and thoughts we have. We would need an expert from Intel to help us understand AIM.
On a side note: in NetApp performance experiments on NetApp platforms comparing BSD11 (legacy driver) vs. BSD12 (iflib-based driver), we noticed an almost 3.5x-4x latency spike in one of the write tests for the iflib-based drivers.

@stallamr_netapp.com thanks. There is a variable here in that I am running in two VMs, among other things. I'm also diving into this code for the first time in 3 years, so this is new to me; I'm just trying to understand the problem in the drivers and hopefully fix it, or find someone who can. @gnn is getting me access to the project's network lab, and I'll use that to see if I can take a look at the problem on other types of hardware.

I don't have any authority over Intel, but I agree it would be helpful if we could get them back on a regular call to discuss important networking development. Would you like me to send out a Google Calendar invite for an iflib meeting?

sys/dev/ixgbe/if_ix.c
2117

Ok, I will think a bit harder on this; thanks for the feedback.