Page MenuHomeFreeBSD

major update to if_ure
ClosedPublic

Authored by jmg on Jul 25 2020, 10:52 PM.

Details

Summary

This update adds support for:
HW VLAN tagging
HW checksum offload for IPv4 and IPv6
tx and rx aggreegation (for full gige speeds)
multiple transactions

In my testing, I am able to get 900-950Mbps depending upon
TCP or UDP, which is a significant improvement over the previous
91Mbps (~8kint/sec*1500bytes/packet*1packet/int).

Test Plan

iperf3 for performance, and an attached testing script for the various
hw offload features.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jmg requested review of this revision.Jul 25 2020, 10:52 PM
sys/dev/usb/net/if_ure.c
215 ↗(On Diff #74939)
>= 4 like in previous statement?
634 ↗(On Diff #74939)

style: return (m);

651 ↗(On Diff #74939)

style: return (m);

686 ↗(On Diff #74939)

Are you sure the right is not "len > URE_RXPKT_MAX_SEG" ?

710 ↗(On Diff #74939)

avoid negative length: if (len > ETHER_CRC_LEN)

832 ↗(On Diff #74939)

wrap long line.

873 ↗(On Diff #74939)

Do we need to mangle with the OACTIVE flag ? Can we just always have it cleared?

1045 ↗(On Diff #74939)

Make this a single line.

1060 ↗(On Diff #74939)

It is enough to set stall on the first TX transfer, else the USB stack will clear stall multiple times.

sys/dev/usb/net/if_urereg.h
446 ↗(On Diff #74939)

The USB stack will also keep track of pending USB transfers, so it would also be fine to just start all USB transfers on ifp start() . I.E. executing of USB transfers on the same USB endpoint is always serialized.

make it relative to HEAD.. Post white space commit..

jmg marked 10 inline comments as done.

update to address hselasky's comments.

ok, these should address all of your comments.

sys/dev/usb/net/if_ure.c
215 ↗(On Diff #74939)

I converted the other one to ==. I had tried to see if 8 would work, but got failure when allocating memory.

686 ↗(On Diff #74939)

Yes, I am sure. _MAX_SEG is 2048, and when a packet that is larger (or equal) to 2048, it will return 2048 (I guess it could be ==, but this is safer). So if the length is exactly _MAX_SEG, there is no way to tell if it's just a 2048 byte packet, OR it's part of a multi segment packet.

If I had the programmer's manual for the chip, it'd be easy to fix this, but I don't have access to it.

710 ↗(On Diff #74939)

Length cannot be negative here due to:
len = le32toh(pkt.ure_pktlen) & URE_RXPKT_LEN_MASK

873 ↗(On Diff #74939)

Do we need to mangle with the OACTIVE flag ? Can we just always have it cleared?

This will prevent excessive if_start calls under high packet loads, which saves unnecessary lock/unlock cycles that happens by calling it... if_handoff doesn't bother calling if_start (aka ue_start here, which does a lock/unlock cycle) if the _OACTIVE flag is set...

1060 ↗(On Diff #74939)

Thanks, wasn't sure what the correct process was, just did everything 4x for each xfer (except _start, but I explain why elsewhere).

sys/dev/usb/net/if_urereg.h
446 ↗(On Diff #74939)

The reason I don't start all the TX transfers is that if there is only have one packet queued up, the first tx xfer will get it, and then the remaining three will get call backs, and immediately go back to sleep. That is a lot of extra calls to _write_callback that are unnecessary. It allows us to only ever have as many TX xfers in flight as needed to send the currently pending packets.

Looks good to me!

sys/dev/usb/net/if_ure.c
642 ↗(On Diff #74992)

KASSERT(tlen != 0);

This revision is now accepted and ready to land.Jul 27 2020, 11:53 AM
jmg marked an inline comment as done.Jul 27 2020, 6:28 PM

I'll wait to get a few more reports of tests before committing..

sys/dev/usb/net/if_ure.c
642 ↗(On Diff #74992)

not needed

M_TRAILINGSPACE(mb) will always be non-zero, otherwise an mbuf returned by m_getm2 would have no storage space (_TRAILINGSPACE was used instead of M_SIZE because of the m_adj earlier). len is always != 0 due to loop condition.

jmg marked an inline comment as done.

move some debug prints to before the error...

Add a sysctl for dumping the chip version.

This revision now requires review to proceed.Sep 1 2020, 11:50 PM
This revision is now accepted and ready to land.Sep 2 2020, 7:25 AM

don't overwrite the RCR register, keep some bits set. This gets
MTU up to 4096 - mumble working w/o my devices crashing...

This revision now requires review to proceed.Sep 4 2020, 1:37 AM

minor update to include some new register definitions.. unused for now..

This revision was not accepted when it landed; it landed in state Needs Review.Sep 12 2020, 12:33 AM
This revision was automatically updated to reflect the committed changes.