
LACP: When suppressing distributing, return ENOBUFS rather than ENETDOWN to preserve TCP conns
Closed, Public

Authored by gallatin on Nov 12 2020, 4:11 AM.

Details

Summary

When links come and go, lacp enters a "suppress distributing" mode in which it drops traffic for 3 seconds. I'm not 100% clear on the spec, but I think the intent is to avoid re-ordering: without the pause, traffic sent on a newly established link could overtake traffic still queued on a pre-existing congested link. While in this mode, lagg/lacp drops traffic and returns ENETDOWN. TCP closes any connection for which it gets that value back from the lower parts of the stack. In effect, any TCP session with active traffic during the 3 second window after a link comes up (or goes down) gets closed.

TCP treats return values of ENOBUFS as transient errors, and re-schedules transmission later. So rather than returning ENETDOWN, let's return ENOBUFS instead. This allows TCP connections to be preserved.
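
To make the behavior change concrete, here is a minimal userland sketch of the decision described above. It is a model under stated assumptions, not the kernel code: struct lacp_model, lacp_model_link_change(), lacp_model_tx_error() and sender_model_classify() are hypothetical names invented for illustration, and the sender side models only the two errno values discussed in this summary.

```c
#include <errno.h>

/* 3 second suppression window, as described in the summary. */
#define SUPPRESS_SECS	3

/* Hypothetical stand-in for LACP state; not the kernel's softc. */
struct lacp_model {
	long suppress_until;	/* time (seconds) at which suppression ends */
	int  active_ports;	/* ports currently usable for transmit */
};

/* A link came up or went down: start the suppression window. */
static void
lacp_model_link_change(struct lacp_model *lm, long now)
{
	lm->suppress_until = now + SUPPRESS_SECS;
}

/* Error reported to the upper layers when a packet is handed down. */
static int
lacp_model_tx_error(const struct lacp_model *lm, long now)
{
	if (now < lm->suppress_until)
		return (ENOBUFS);	/* transient: was ENETDOWN before this change */
	if (lm->active_ports == 0)
		return (ENETDOWN);	/* assumption: no usable port is still a hard error */
	return (0);
}

enum sender_action { SENDER_OK, SENDER_RETRY, SENDER_DROP, SENDER_OTHER };

/* Simplified sender reaction, covering only the cases in the summary. */
static enum sender_action
sender_model_classify(int error)
{
	switch (error) {
	case 0:
		return (SENDER_OK);
	case ENOBUFS:
		return (SENDER_RETRY);	/* reschedule transmission later */
	case ENETDOWN:
		return (SENDER_DROP);	/* connection is closed */
	default:
		return (SENDER_OTHER);	/* not modeled here */
	}
}
```

In this model, a packet sent inside the window maps to SENDER_RETRY rather than SENDER_DROP, which is the whole point of the change: the connection survives the 3 seconds and resumes once the window has passed.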

I've tested this by repeatedly bouncing links on a Netflix CDN server under moderate load (20g), and observed ENOBUFS being seen by the TCP stack (as reported by a RACK TCP sysctl).

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.