If an error occurs while processing a TCP segment with some data and the FIN
flag, the back out of the sequence number advance does not take into account the
increase by 1 due to the FIN flag.
Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
https://lists.freebsd.org/pipermail/freebsd-net/2015-June/042493.html is the original post which has 2 patches. I somehow liked the second patch as it deals with the FIN case specifically. If someone feels otherwise, please comment.
Thanks George.
I'll commit this next Friday (07/17) if I don't get any other feedback.
As a side note, I really dislike the conflation of logical sequence space and data accounting used in many places in our stack. It's something that's fairly straight forward to address and I have some proof of concept patches I did a while ago which we should dust off at some point.
Yes, we should. Let me know if/when you need help with testing/reviewing. I'll be happy to do it.
Are you okay with the patch proposed here?
This change seems inadequate given that we would have set TF_SENTFIN and updated snd_max. I haven't followed through all the implications of not reverting those changes, but if we're going to attempt a state rollback we'd better make sure we get it right. I'm also a bit unclear on some details in the original report given that an RTO would reset snd_nxt to snd_una and get us out of any permanent pickle. I'm not a fan of rollbacks in general as they're fragile. What's the use case where a rollback here matters?
Another solution proposed is to do like if the packet was lost (ignore the error). I also think that a rollback is difficult to maintain all over the code. Regarding the problem we faced: there was a FIN+DATA that was blocked by the local stack. In that case you loose 1 byte of the data each time you refuse to send it. The other effect is that it create a packet storm (if you have some process that request the missing data).
This fix is tested since one month and we never saw the problem again.
Is there any more comment to fix it ?
lstewart@ do you prefer the no rollback in case of error (will do the same as packet loss) ?
both solution is are better than keeping the code like this.
While this issue affects data segments with FIN (the left edge of the retransmitted packet shifts to the right by 1 byte per retransmission), that appears to be a different symptom than the recent panics on SACK rescue retransmission. In the transport call we agreed to commit this against HEAD, but not (yet) MFC this patch.