_NO_ OSes actually "negotiate" MSS. RFC 879: "... This Maximum Segment Size (MSS) announcement (often mistakenly called a negotiation) ..." This negotiation behaviour was introduced 11 years ago by r159955 without any explaination about why FreeBSD had to "negotiate" MSS: In syncache_respond() do not reply with a MSS that is larger than what the peer announced to us but make it at least tcp_minmss in size. Sponsored by: TCP/IP Optimization Fundraise 2005 The tcp_minmss behaviour is still kept.
We can't revert this patch, for one extremely simple reason, and another more subtle reason.
The simple reason is PPPoE.
PPPoE adds an 8 byte overhead to packets. Generally, devices behind the PPPoE router do not know about this 8 byte lower MTU / MSS, they advertise an MSS of 1460. The PPPoE device intercepts packets with the SYN flag set and reduces the MSS option by 8 bytes.
If the receiver of this altered MSS option doesn't incorporate it into its MSS option response, then the wrong MSS will be "negotiated." Specifically, the device behind the PPPoE router will be unaware of the lower MSS, and send larger packets.
For inbound connections, if the device doesn't lower the MSS by 8 bytes in the MSS option response, as directed by the PPPoE middlebox, then the external client will use the wrong MSS.
Yes, it is absolutely a negotiation.
The more subtle reason is, for organizations that have to deal with virtually every TCP stack ever made, past and future, there's really no upside to trusting the remote TCP stack more than we must.
as far as I know, middleboxes performing MSS clamping do it for incoming and outgoing traffic.
So considering your example:
If you have a client behind such a middlebox, it send the SYN with the MSS=1460, but the server receives the SYN with an MSS of 1452. Now the server sends back a SYN-ACK with an MSS of 1460, and the middlebox will change it also such that the client sees an MSS of 1452. So this will work.
If you have a server behind the PPPoE line, the client will send a SYN with an MSS of 1460, but the server will see the MSS of 1452. The server will send an SYN-ACK with an MSS of 1460, but the client will see 1452, since the middlebox will change it.
So it is not a negotiation. Each sides announces what maximum segments it can receive.
So I think the idea is OK.
Ah, yes you are correct with regard to the PPPoE case.
Hmm. Maybe it is ok.
But what is the advantage of doing this, what are we trying to fix here? It seems like it is not broken now, and could only expose us to bugs.
I think it is good to be standards compliant. It helps in case of suboptimal middleboxes likes the ones described by sepherosa which relatively decrement the MSS instead of enforcing a maximum. The benefit is the use larger segments which should improve the performance (less overhead).
Need to check if suggested patch is enough to handle all cases. Right now writing packetdrill scripts for testing...
Shouldn't we also apply something like
/* Map our computed MSS into the 3-bit index. */ - mss = min(tcp_mssopt(&sc->sc_inc), max(sc->sc_peer_mss, V_tcp_minmss)); + mss = sc->sc_peer_mss; for (i = nitems(tcp_sc_msstab) - 1; tcp_sc_msstab[i] > mss && i > 0; i--) ;
since we will use the computed index to restore the sc->sc_peer_mss later.
|1638 ↗||(On Diff #33255)|
I would prefer
mssopt = max(tcp_mssopt(&sc->sc_inc), V_tcp_minmss);
instead of the two lines.
For what it is worth:
While testing the MSS handling I figured out that FreeBSD was NOT negotiating the MSS in case of receiving a SYN-segment in the SYN-SENT state.
So the patch discussed here will also result in consistent handling of the MSS.