Page MenuHomeFreeBSD

Update ixl(4) to use iflib.
ClosedPublic

Authored by erj on May 25 2018, 6:50 PM.

Details

Summary

You can view the individual git commits that make up this review here, on GitHub: https://github.com/freebsd/freebsd/compare/master...intel-wired-ethernet:D15577-ixl-iflib

Features that don't work in this version of the driver:

  • VF driver (ixlv)
  • SR-IOV host support
  • RDMA support

Notable features that do work, but require more testing:

  • nvmupdate

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

erj created this revision.May 25 2018, 6:50 PM
erj updated this revision to Diff 43011.May 25 2018, 6:52 PM

Remove testing config files and make spelling fixes.

erj edited the summary of this revision. (Show Details)May 25 2018, 6:54 PM
erj set the repository for this revision to rS FreeBSD src repository.
erj edited the summary of this revision. (Show Details)May 25 2018, 6:58 PM
erj edited the summary of this revision. (Show Details)
erj edited the summary of this revision. (Show Details)
erj updated this revision to Diff 43028.May 26 2018, 1:08 AM

Update due to D15575 being committed.

erj updated this revision to Diff 43030.May 26 2018, 1:20 AM

Pulled out TCP saving changes since those are in D15558

erj updated this revision to Diff 43031.May 26 2018, 1:43 AM

Include contents of/update to D15558 again, since relying on dependent revisions
in Phabricator is a PITA.

We see constant tx hangs when changing driver settings:

On client:

ping <SUT>

on SUT:

#kldload if_ixl
#ifconfig ixl0 <IPv4 address>

Client is now seeing ICMP replies:
64 bytes from 190.2.8.77: icmp_seq=33 ttl=64 time=0.162 ms
64 bytes from 190.2.8.77: icmp_seq=34 ttl=64 time=0.060 ms
64 bytes from 190.2.8.77: icmp_seq=35 ttl=64 time=0.057 ms
64 bytes from 190.2.8.77: icmp_seq=36 ttl=64 time=0.057 ms
64 bytes from 190.2.8.77: icmp_seq=37 ttl=64 time=0.057 ms
64 bytes from 190.2.8.77: icmp_seq=38 ttl=64 time=0.052 ms
64 bytes from 190.2.8.77: icmp_seq=39 ttl=64 time=0.067 ms
64 bytes from 190.2.8.77: icmp_seq=40 ttl=64 time=0.054 ms
64 bytes from 190.2.8.77: icmp_seq=41 ttl=64 time=0.051 ms
64 bytes from 190.2.8.77: icmp_seq=42 ttl=64 time=0.056 ms
64 bytes from 190.2.8.77: icmp_seq=43 ttl=64 time=0.058 ms

On SUT:

ifconfig ixl0 down up

ICMP replies on client stop.

On SUT:

#dmesg
ixl0: TX(10) desc avail = 1, pidx = 0
ixl0: TX(10) desc avail = 1, pidx = 0
ixl0: TX(10) desc avail = 1, pidx = 0
ixl0: TX(10) desc avail = 1, pidx = 0
ixl0: TX(10) desc avail = 1, pidx = 0
ixl0: TX(10) desc avail = 1, pidx = 0
ixl0: TX(10) desc avail = 1, pidx = 0

This can also be reproduced by:

#kldload if_ixl
#ifconfig ixl0 <IP address>
#ifconfig ixl0 <IP address> mtu 9000

jeffrey.e.pieper_intel.com requested changes to this revision.May 29 2018, 9:03 PM
This revision now requires changes to proceed.May 29 2018, 9:03 PM
gallatin added inline comments.May 29 2018, 9:25 PM
sys/dev/ixl/ixl_txrx.c
146 ↗(On Diff #43031)

This old version needs to be removed.

Somehow my last comment got lost. With real Netflix traffic, the sparse detection logic is not sufficient. When we have kernel TLS, which inserts 13 and 16 byte mbufs around TLS records, we hit MDDs regularly. I think you need to walk the packet stream, not the DMA descriptors, so that you keep the same segmentation used by the hardware. I re-wrote the sparse detection to do this, and our MDDs went away.

gallatin added a comment.EditedMay 29 2018, 9:28 PM

This is the sparse detect I'm using. Note how it walks the header, and then the TSO by packet on the wire.

static int
ixl_tso_detect_sparse(bus_dma_segment_t *segs, int nsegs, if_pkt_info_t pi)
{
        int     count, curseg, i, hlen, segsz, seglen, tsolen;

        if (nsegs <= IXL_MAX_TX_SEGS-2)
                return (0);
        segsz = pi->ipi_tso_segsz;
        curseg = count = 0;

        hlen = pi->ipi_ehdrlen + pi->ipi_ip_hlen + pi->ipi_tcp_hlen;
        tsolen = pi->ipi_len - hlen;

        i = 0;
        curseg = segs[0].ds_len;
        while (hlen > 0) {
                count++;
                if (count > IXL_MAX_TX_SEGS - 2)
                        return (1);
                if (curseg == 0) {
                        i++;
                        if (__predict_false(i == nsegs))
                                return (1);

                        curseg = segs[i].ds_len;
                }
                seglen = min(curseg, hlen);
                curseg -= seglen;
                hlen -= seglen;
                printf("H:seglen = %d, count=%d\n", seglen, count);
        }
        while (tsolen > 0) {
                segsz = pi->ipi_tso_segsz;
                while (segsz > 0 && tsolen > 0) {
                        count++;
                        if (count > IXL_MAX_TX_SEGS - 2)
                                return (1);
                        if (curseg == 0) {
                                i++;
                                if (__predict_false(i == nsegs))
                                        return (1);
                                curseg = segs[i].ds_len;
                        }
                        seglen = min(curseg, segsz);
                        segsz -= seglen;
                        curseg -= seglen;
                        tsolen -= seglen;
                        printf("D:seglen = %d, count=%d\n", seglen, count);
                }
                count = 0;
        }

        return (0);
}

This also needs to be rebased again , probably due to: https://reviews.freebsd.org/D15343:

git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)

cmdline: git rebase master
stdout: 'First, rewinding head to replay your work on top of it...

Applying: Update ixl(4) to use iflib.
Using index info to reconstruct a base tree...
M sys/net/iflib.c
Falling back to patching base and 3-way merge...
Auto-merging sys/net/iflib.c
CONFLICT (content): Merge conflict in sys/net/iflib.c
Patch failed at 0001 Update ixl(4) to use iflib.
Use 'git am --show-current-patch' to see the failed patch

Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".
'

stderr: '.git/rebase-apply/patch:56: trailing whitespace.
Copyright (c) 2013-2015, Intel Corporation

.git/rebase-apply/patch:58: trailing whitespace.

.git/rebase-apply/patch:59: trailing whitespace.

Redistribution and use in source and binary forms, with or without

.git/rebase-apply/patch:61: trailing whitespace.

.git/rebase-apply/patch:62: trailing whitespace.

  1. Redistributions of source code must retain the above copyright notice,

warning: squelched 81 whitespace errors
warning: 86 lines add whitespace errors.
error: Failed to merge in the changes.'

This applied and could be rebased cleanly before D15343 was committed.

krzysztof.galazka_intel.com updated this revision to Diff 43153.
  • ixl(4): Re-add descriptor writeback TX completion as optionable tunable
  • Merge branch 'master' into ixl-iflib
  • Merge commit 'fa5413e897294cb6a74d75e7317c5dbeb42e185c' into ixl-iflib
  • ixl(4): Remove unused variables, macros, and comments
  • ixl(4): Replace some numbers with constants
  • Merge upstream FreeBSD branch 'master' into ixl-iflib
  • ixl(4): Fix typo in sysctl help message
  • ixl(4): Edit whitespace and a function comment
  • ixl(4): Remove a debug printf()
  • ixl(4): Remove a couple old functions and prototypes
  • ixl(4): Remove unused variables from PF struct
  • ixl(4): Add two new I2C access methods
  • ixl(4): Actually increment TSO stat counter
  • ixl(4): Set some iflib flags and a fied to correct values
  • ixl(4): Stop reading header length value from RX descriptor
  • ixl(4): Fix and move an assert in the RX path
  • ixl(4): Enhance add/remove macvlan error reporting
  • ixl(4): Edit spacing and remove old comments
  • ixl(4): Fix queue allocation bug
  • ixl(4): Replace bsrl() with the architecture-independent fls()
  • ixl(4): Remove redundant MAC type / FW version check
  • ixl(4): Copy a couple things from Linux's RCTX setup
  • Merge branch 'intel-wired-ethernet/master' into ixl-iflib
  • iflib: Fix memory leak when TX packet defrag finally fails
  • ixl(4): Move function to where VF driver can use it
  • ixl(4): Remove unused function definitions and declarations
  • ixl(4): Remove hard link check in ixl_if_media_status
  • Merge freebsd/freebsd branch 'master' into ixl-iflib
  • ixl(4): Change ixl_tso_detect_sparse()
  • ixl(4): Edit debug i2c sysctl help messages
  • ixl(4): Add debug sysctl to read some diagnostic data from i2c
  • ixl(4): Remove unnecessary SYSCTL_OUT from previous commit
  • ixl(4): Return an error when debug i2c diag sysctl encounters one
  • ixl(4): Fix error handling during attach process
  • Merge remote-tracking branch 'origin/master' into ixl-iflib
  • ixl(4): Simplify ixl_find_filter()
  • ixl(4): Temporarily stop initializing SR-IOV support
  • ixl(4): Add driver version number to end of device description string
  • ixl(4): Get rid of unused tunables.
  • ixl(4): Initial implementation of driver private ioctl
  • ixl(4): Fix driver setup after receiving reset interrupt from HW
  • iflib: Add new shared flag to iflib IFLIB_ADMIN_ALWAYS_RUN
  • ixl(4): Remove VSI allocation message after device reset
  • ixl: Rename function to better reflect purpose
  • Merge freebsd/freebsd branch 'master' into ixl-iflib
  • Merge freebsd/freebsd branch 'master' into ixl-iflib (2018-05-30)
  • ixl/ixlv: Reset head writeback value on TX ring init
erj updated this revision to Diff 43407.Jun 7 2018, 1:58 PM

Update to exclude code in D15558, since that is now committed.

erj updated this revision to Diff 43626.EditedJun 11 2018, 10:53 PM

Add these changes from @krzysztof.galazka_intel.com

  • ixl: Remove default MAC/VLAN filters
  • ixl: Fix MAC addresses accounting in VSI

And one change from @gallatin:

  • ixl(4): Enhance ixl_tso_detect_sparse() to fix more MDD issues
gallatin added inline comments.Jun 11 2018, 11:40 PM
sys/dev/ixl/ixl_txrx.c
172 ↗(On Diff #43626)

You probably want to remove this debug print!

erj updated this revision to Diff 43631.Jun 12 2018, 12:11 AM

Comment out printf()'s introduced in previous commit

erj updated this revision to Diff 43724.Jun 13 2018, 8:50 PM

Based on a report by @gallatin:

ixl(4): Increase default ITR values to 500 (1000us)

You might want to go for some middle ground here. 1ms is a bit much (even for us).

gallatin accepted this revision.Jun 14 2018, 2:53 AM
erj added a comment.Jun 14 2018, 6:23 PM

You might want to go for some middle ground here. 1ms is a bit much (even for us).

I remember another company using that value for their default ITR values, but that might have been for the VF driver.

sys/dev/ixl/ixl_txrx.c
194 ↗(On Diff #43626)

And this one, too...

I think we need to lower it. 500 significantly impacts 40G performance w/ 9K MTU, especially rx.

erj marked an inline comment as done.Jun 18 2018, 4:43 PM
erj updated this revision to Diff 44013.
  • Revert "ixl(4): Increase default ITR values to 500 (1000us)"
erj updated this revision to Diff 44017.Jun 18 2018, 5:46 PM

Commit iflib style fixes separately.

erj added a comment.Jun 18 2018, 6:03 PM

Any last-minute objections? This is the version that's going in!

erj marked 2 inline comments as done.Jun 18 2018, 8:13 PM
This revision was not accepted when it landed; it landed in state Needs Review.
Closed by commit rS335338: ixl(4): Update to use iflib (authored by erj). · Explain Why
This revision was automatically updated to reflect the committed changes.
erj edited the summary of this revision. (Show Details)Jun 19 2018, 5:47 PM