If an error occurs while processing a TCP segment with some data and the FIN
flag, the back out of the sequence number advance does not take into account the
increase by 1 due to the FIN flag.
https://lists.freebsd.org/pipermail/freebsd-net/2015-June/042493.html is the original post which has 2 patches. I somehow liked the second patch as it deals with the FIN case specifically. If someone feels otherwise, please comment.
As a side note, I really dislike the conflation of logical sequence space and data accounting used in many places in our stack. It's something that's fairly straight forward to address and I have some proof of concept patches I did a while ago which we should dust off at some point.
This change seems inadequate given that we would have set TF_SENTFIN and updated snd_max. I haven't followed through all the implications of not reverting those changes, but if we're going to attempt a state rollback we'd better make sure we get it right. I'm also a bit unclear on some details in the original report given that an RTO would reset snd_nxt to snd_una and get us out of any permanent pickle. I'm not a fan of rollbacks in general as they're fragile. What's the use case where a rollback here matters?
Another solution proposed is to do like if the packet was lost (ignore the error). I also think that a rollback is difficult to maintain all over the code. Regarding the problem we faced: there was a FIN+DATA that was blocked by the local stack. In that case you loose 1 byte of the data each time you refuse to send it. The other effect is that it create a packet storm (if you have some process that request the missing data).
This fix is tested since one month and we never saw the problem again.
Is there any more comment to fix it ?
lstewart@ do you prefer the no rollback in case of error (will do the same as packet loss) ?
both solution is are better than keeping the code like this.