LACP: When suppressing distributing, return ENOBUFS rather than ENETDOWN to preserve TCP conns
Closed, Public

Authored by gallatin on Nov 12 2020, 4:11 AM.

Details

Summary

When links come and go, LACP enters a "suppress distributing" mode in which it drops traffic for 3 seconds. I'm not 100% clear on the spec, but I think the intent is to avoid reordering: without the pause, traffic sent on a newly established link could pass traffic still queued on a pre-existing, congested link. While in this mode, lagg/lacp drops traffic and returns ENETDOWN. TCP closes any connection for which it gets that value back from the lower parts of the stack. As a result, any TCP session with active traffic during the 3-second window after a link comes up (or goes down) gets closed.
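
To make the failure window concrete, here is a minimal user-space sketch of the behavior described above. It is not the lagg(4)/LACP kernel code: the `lacp_model` structure, both function names, and the millisecond clock are hypothetical, chosen only to illustrate a 3-second window during which every transmit attempt is dropped with ENETDOWN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; none of these names exist in the FreeBSD tree. */
#define SUPPRESS_WINDOW_MS 3000	/* the 3 second window from the summary */

struct lacp_model {
	bool     suppressing;		/* true while distribution is suppressed */
	uint64_t suppress_until_ms;	/* time at which suppression ends */
};

/* A link joined or left the aggregation: start (or restart) the window. */
void
lacp_model_link_change(struct lacp_model *lm, uint64_t now_ms)
{
	assert(lm != NULL);
	lm->suppressing = true;
	lm->suppress_until_ms = now_ms + SUPPRESS_WINDOW_MS;
}

/*
 * Attempt to transmit one frame at time now_ms.
 * Returns 0 if the frame would be sent, or ENETDOWN if it is dropped
 * because distribution is currently suppressed.
 */
int
lacp_model_transmit(struct lacp_model *lm, uint64_t now_ms)
{
	assert(lm != NULL);
	/* Leave suppression once the window has elapsed. */
	if (lm->suppressing && now_ms >= lm->suppress_until_ms)
		lm->suppressing = false;
	if (lm->suppressing)
		return (ENETDOWN);	/* frame dropped; caller sees a "network down" error */
	return (0);
}
```

In this model, any sender that calls `lacp_model_transmit()` within 3000 ms of `lacp_model_link_change()` gets ENETDOWN back, which is the condition that ends up closing active TCP sessions.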

TCP treats return values of ENOBUFS as transient errors, and re-schedules transmission later. So rather than returning ENETDOWN, let's return ENOBUFS instead. This allows TCP connections to be preserved.
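
The difference between the two return values can be sketched as a small user-space model. This is an illustration of the behavior described in this summary, not the FreeBSD TCP output path: `conn_model` and `conn_model_handle_output_error()` are hypothetical names, and every errno other than ENOBUFS and ENETDOWN is deliberately left out of scope.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a TCP connection; not a kernel structure. */
struct conn_model {
	bool open;		/* connection is still alive */
	bool retry_pending;	/* a later transmission attempt is scheduled */
};

/*
 * React to the value returned by the lower layers for one send attempt,
 * following the behavior described in the summary:
 *   ENOBUFS  -> transient, keep the connection and try again later
 *   ENETDOWN -> the connection is closed
 */
void
conn_model_handle_output_error(struct conn_model *c, int error)
{
	assert(c != NULL);
	switch (error) {
	case 0:
		/* Sent successfully; nothing left to retry. */
		c->retry_pending = false;
		break;
	case ENOBUFS:
		/* Transient error: the connection survives and retransmits later. */
		c->retry_pending = true;
		break;
	case ENETDOWN:
		/* Treated as fatal: the connection is torn down. */
		c->open = false;
		c->retry_pending = false;
		break;
	default:
		/* Other errors are outside the scope of this sketch. */
		break;
	}
}
```

Under these assumptions, a lower layer that reports ENOBUFS during the suppression window leaves `open` set and only marks a retry, whereas ENETDOWN clears `open`.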

I've tested this by repeatedly bouncing links on a Netflix CDN server under moderate load (20 Gb/s), and observed ENOBUFS being seen by the TCP stack (as reported by a RACK TCP sysctl).

Diff Detail

Repository
rS FreeBSD src repository - subversion