LACP: When suppressing distributing, return ENOBUFS rather than ENETDOWN to preserve TCP conns
Closed, Public

Authored by gallatin on Nov 12 2020, 4:11 AM.
Tags
None

Details

Summary

When links come and go, lacp goes into a "suppress distributing" mode where it drops traffic for 3 seconds. I'm not 100% clear on the spec, but I think the intent is to avoid the re-ordering that would occur if traffic on a newly established link overtook traffic still queued on a pre-existing congested link. When in this mode, lagg/lacp drops traffic with ENETDOWN. TCP closes any connection for which it gets that value back from the lower parts of the stack. This means that any TCP session with active traffic during the 3 second window after a link comes up (or goes down) gets closed.

TCP treats return values of ENOBUFS as transient errors, and re-schedules transmission later. So rather than returning ENETDOWN, let's return ENOBUFS instead. This allows TCP connections to be preserved.
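The behavior described above can be illustrated with a small userland model. This is only a sketch under stated assumptions, not the kernel code touched by this revision: `struct lacp_state_sketch`, `select_tx_port_sketch()` and `tcp_react_sketch()` are hypothetical names invented for illustration, and the TCP side is reduced to the two error cases discussed in this summary.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the LACP state; not the kernel's structure. */
struct lacp_state_sketch {
	bool suppress_distributing;	/* true during the ~3 second window */
	int active_port;		/* toy identifier for the chosen port */
};

/*
 * Hypothetical port selection. While distributing is suppressed the
 * packet is dropped; the point of the change is which errno the caller
 * sees. Returns the port id, or -1 with *err set.
 */
static int
select_tx_port_sketch(const struct lacp_state_sketch *st, int *err)
{
	assert(st != NULL && err != NULL);

	if (st->suppress_distributing) {
		*err = ENOBUFS;		/* previously ENETDOWN */
		return (-1);
	}
	*err = 0;
	return (st->active_port);
}

/* Simplified model of how the summary says TCP reacts to each value. */
enum tcp_action_sketch {
	TCP_SENT,		/* no error */
	TCP_RETRY_LATER,	/* transient: transmission is re-scheduled */
	TCP_DROP_CONN,		/* connection is closed */
	TCP_NOT_MODELED		/* any other error; outside this sketch */
};

static enum tcp_action_sketch
tcp_react_sketch(int error)
{
	switch (error) {
	case 0:
		return (TCP_SENT);
	case ENOBUFS:
		return (TCP_RETRY_LATER);
	case ENETDOWN:
		return (TCP_DROP_CONN);
	default:
		return (TCP_NOT_MODELED);
	}
}
```

In this model, a send attempted while `suppress_distributing` is set yields ENOBUFS, which `tcp_react_sketch()` maps to a later retry; feeding it ENETDOWN instead maps to a dropped connection, which is the behavior this change avoids.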

I've tested this by repeatedly bouncing links on a Netflix CDN server under moderate load (20 Gb/s), and observed ENOBUFS being seen by the TCP stack (as reported by a RACK TCP sysctl).

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 34755