Page MenuHomeFreeBSD

Assorted TSO fixes for em(4)/iflib(9) and dead code removal from em(4)
ClosedPublic

Authored by marius on Jun 9 2018, 7:52 PM.

Details

Summary
  • Ever since the workaround for the silicon bug of TSO4 causing MAC hangs was committed in r295133, CSUM_TSO always got disabled unconditionally by em(4) on the first invocation of em_init_locked(). However, even with that problem fixed, it turned out that for at least e. g. 82579 not all necessary TSO workarounds are in place, still causing MAC hangs even at Gigabit speed. Thus, for stable/11, TSO usage was deliberately disabled in r323292 (r323293 for stable/10) for the EM-class by default, allowing users to turn it on if it happens to work with their particular EM MAC in a Gigabit-only environment. In head, the TSO workaround for speeds other than Gigabit was lost with the conversion to iflib(9) in r311849 (possibly along with another one or two TSO workarounds). Yet at the same time, for EM-class MACs TSO4 is enabled by default again, causing device hangs. Therefore, change the default for this hardware class back to have TSO4 off, allowing users to turn it on manually if it happens to work in their environment as we do in stable/{10,11}. An alternative would be to add a whitelist of EM-class devices where TSO4 actually is reliable with the workarounds in place, but given that the advantage of TSO at Gigabit speed is rather limited - especially with the overhead of these workarounds -, that's really not worth it. [1]
  • Although 82543 support TSO4 in theory, the former lem(4) didn't have support for TSO4, presumably because TSO4 is even more broken in the LEM-class of MACs than the later EM ones. Still, TSO4 for LEM-class devices was enabled as part of the conversion to iflib(9) in r311849, causing device hangs. So revert back to the pre-r311849 behavior of not supporting TSO4 for LEM-class at all, which includes not creating a TSO DMA tag in iflib(9) for devices not having IFCAP_TSO4 set. [2]
  • In fact, the FreeBSD TCP stack can handle a TSO size of IP_MAXPACKET (65535) rather than FREEBSD_TSO_SIZE_MAX (65518). However, the TSO DMA must have a maxsize of the maximum TSO size plus the size of a VLAN header for software VLAN tagging. The iflib(9) converted em(4), thus, first correctly sets scctx->isc_tx_tso_size_max to EM_TSO_SIZE in em_if_attach_pre(), but later on overrides it with IP_MAXPACKET in em_setup_interface() (apparently, left-over from pre-iflib(9) times). So remove the later and correct iflib(9) to correctly cap the maximum TSO size reported to the stack at IP_MAXPACKET. While at it, let iflib(9) use if_sethwtsomax*().
  • Move the reduction of the maximum TSO segment count reported to the stack by the number of m_pullup(9) calls (which in the worst case, can add another mbuf and, thus, the requirement for another DMA segment each) in the transmit path for performance reasons from em_setup_interface() to iflib_txsd_alloc() as these pull-ups are now done in iflib_parse_header() rather than in the no longer existing em_xmit(). Moreover, this optimization applies to all drivers using iflib(9) and not just em(4); all in-tree iflib(9) consumers still have enough room to handle full size TSO packets. Also, reduce the adjustment to the maximum number of m_pullup(9)'s now performed in iflib_parse_header().
  • Prior to the conversion to iflib(9) in r311849, em(4) didn't enable IFCAP_VLAN_HWFILTER by default due to VLAN events not being passed through by lagg(4). With iflib(9), IFCAP_VLAN_HWFILTER now is turned on by default but also lagg(4) was fixed in that regard in r203548. So just remove the now redundant and defunct IFCAP_VLAN_HWFILTER handling in em_setup_interface().
  • Remove also other redundant IFCAP_* setting in em_setup_interface() which is (more completely) already done in em_if_attach_pre() now.
  • Remove some redundant/dead setting of scctx->isc_tx_csum_flags in em_if_attach_pre().
  • Remove some IFCAP_* duplicated either directly or indirectly (e. g. via IFCAP_HWCSUM) in {EM,IGB}_CAPS.
  • Remove some unused macros from em(4).

Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1]
PR: 220997 (part of) [2]

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Adding my "ok" pending bnxt review by @shurd

This revision is now accepted and ready to land.Jun 9 2018, 9:25 PM

I think that if the size of the DMA tag needs to be larger than the MACs max, iflib should handle that itself rather than have every driver need to add a non-obvious value for non-obvious reasons.

Wouldn't this be a nicer fix if iflib just added sizeof(struct ether_vlan_header) in the bus_dma_tag_create() and call? This seems to be the type of thing that iflib exists to move away from having a driver author worry about... the shared context should be filled using values fro the datasheet, and iflib should just DTRT.

The problem with letting iflib(9) automatically register a TSO DMA tag with a maxsize of isc_tx_tso_size_max + sizeof(struct ether_vlan_header) if the front-end driver declares IFCAP_VLAN_MTU as being supported is that besides a MAC having a restriction on the maximum TSO size, its DMA engine might have a restriction on the maximum transfer size as well (in fact, that's what the maxsize DMA tag parameter is all about). Suppose a MAC supports a maximum TSO size of 64 KB and its DMA engine has a 64 KB limit on transfers as well, one wouldn't be able to support a maximum TSO size of IP_MAXPACKET when doing software VLAN tagging. Actually, that's why I've added you as a reviewer because I don't have a datasheet for the bnxt(4)-driven MACs.

On the bottom line, a front-end driver has to supply both the maximum TSO size and the maximum DMA transfer size to iflib(9). The cleanest approach for that would be to have a separate member for the maximum DMA transfer size in struct if_softc_ctx. However, given that a maximum TSO size of 64 KB really is the low end as far as MACs on the market are concerned, we can cheat here a bit and only go with sc_tx_tso_size_max. Especially given that other members such as isc_capenable are also muddled (that one really refers to both interface capabilities supported but also to those which will be enabled by default), it seemed more worthwhile to just document the sc_tx_tso_size_max usage and to not break the binary interface of iflib(9).

On the bottom line, a front-end driver has to supply both the maximum TSO size and the maximum DMA transfer size to iflib(9). The cleanest approach for that would be to have a separate member for the maximum DMA transfer size in struct if_softc_ctx.

I think that iflib is still in a place where the cleanest approach should be the one that's used, and the binary interface still be considered a work in progress... it's certainly not "done".

However, given that a maximum TSO size of 64 KB really is the low end as far as MACs on the market are concerned, we can cheat here a bit and only go with sc_tx_tso_size_max.

Yeah, I think it might make more sense to cheat in the other direction though... especially given the member name.

Especially given that other members such as isc_capenable are also muddled (that one really refers to both interface capabilities supported but also to those which will be enabled by default), it seemed more worthwhile to just document the sc_tx_tso_size_max usage and to not break the binary interface of iflib(9).

With things like this where there's capabilities that shouldn't be enabled by default, I think it's clear that we *should* have separate fields for capenable and capabilities. Other members being muddled is a reason to fix the other members, not an excuse to muddle more of them :-)

A stronger case can actually be made for isc_capabilities than isc_dma_max_transfer, but both seems like a good idea in light if this patch.

Okay, then let's try to handle this the right way. Changes since the
previous version of the patch:

  • Add an isc_capabilities to struct if_softc_ctx so iflib(9) can also handle interface capabilities that shouldn't be enabled by default; use that to handle default-off capabilities of e1000 and move their handling from em_setup_interface() to em_if_attach_pre() accordingly.
  • Add isc_tso_max{seg,}size DMA engine constraints for the TSO DMA tag to struct if_shared_ctx and let iflib_txsd_alloc() automatically adjust the maxsize of that tag in case IFCAP_VLAN_MTU is supported.
  • Move the if_setifheaderlen(9) call for adjusting the maximum Ethernet header length from {ixgbe,ixv,em}_setup_interface() to iflib(9) so adjustment is automatically done in case IFCAP_VLAN_MTU is supported. As a consequence, this adjustment now is also done in case of bnxt(4) which missed it previously.
  • Remove code to disable IFCAP_VLAN_HWFILTER by default for ixgbe(4) (already not done for ixgbev(4)) as VLAN events are now passed through by lagg(4) since r203548.
  • Don't bother to fiddle with IFCAP_HWSTATS in ixgbe(4)/ixgbev(4) as iflib(9) adds that capability unconditionally.
  • Bump __FreeBSD_version for the above changes as modules using iflib(9) need to be recompiled.
This revision now requires review to proceed.Jun 17 2018, 9:32 PM

Extend to ixl(4)/ixlv(4) after their conversion to iflib(9) in r335338. Akin e1000,
after that conversion these drivers did redundant setup of interface capabilities in
ixl{v,}_setup_interface() and had defunct (but due to r203548 also obsolete) code
to disable IFCAP_VLAN_HWFILTER by default.

Any update? It's been almost a month.

Well, I'd prefer to also have feedback from at least shurd@ regarding bnxt(4) as I have no idea what the DMA/TSO constraints of the MACs it drives actually are (the links to the corresponding datasheets on broadcom.com are broken, unfortunately). It's rather unlikely that those DMA engines are really limited to 64 KB, i. e. BNXT_TSO_SIZE most likely only applies to TSO itself only, but you never know (I did check with the Intel datasheets for the other drivers, FWIW).

This revision was not accepted when it landed; it landed in state Needs Review.Jul 15 2018, 7:04 PM
This revision was automatically updated to reflect the committed changes.