Page MenuHomeFreeBSD

Fix dupack processing to detect loss correctly.
ClosedPublic

Authored by hiren on Nov 19 2015, 10:53 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Dec 15, 9:47 PM
Unknown Object (File)
Sun, Dec 15, 3:58 PM
Unknown Object (File)
Sat, Dec 7, 9:06 PM
Unknown Object (File)
Thu, Dec 5, 9:36 PM
Unknown Object (File)
Thu, Dec 5, 1:41 PM
Unknown Object (File)
Tue, Dec 3, 11:00 PM
Unknown Object (File)
Tue, Dec 3, 7:26 AM
Unknown Object (File)
Mon, Dec 2, 6:07 AM

Details

Summary

One of the ways to detect loss is to count dupacks till they reach decided
threshold (3 in our case).

dupack counting/incrementing is incorrect in a couple of ways:

  1. We reset dupack counter to 0 when certain conditions are not met and if an old ack arrives.
  2. When SACK is present, if the ack has window change, we don't consider it as a dupack.

Here is what this change is trying to bring from rfc5681 and rfc6675:
a) th_ack < snd_una --> ignore it completely
b) th_ack == snd_una, window == old_window --> increment
c) th_ack == snd_una, window != old_window, change_in_sacked_bytes == 0 --> do
nothing
d) th_ack == snd_una, window != old_window, change_in_sacked_bytes != 0 -->
increment

change_in_sacked_bytes is a new addition to sackhint. It keeps track of change
reported via SACK info. (This will be reused in PRR Proportional Rate Reduction
work I am doing.)

Test Plan

I've a local dummynet based testbed to do various tests. I'd appreciate any help
in making sure this doesn't break anything else.

I'll add some scenarios here to show what/how I am testing.

With a single packet drop, this is how it looks with the fix:
(I am just showing relevant acks which trigger fast-retransmit)

00:00:00.054513 IP 10.10.10.10.60460 > 10.10.11.12.http: Flags [.], ack 4576, win 1018, options [nop,nop,TS val 792866636 ecr 2286529430], length 0
00:00:00.054603 IP 10.10.10.10.60460 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 792866637 ecr 2286529430,nop,nop,sack 1 {6024:7472}], length 0
00:00:00.056520 IP 10.10.10.10.60460 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 792866638 ecr 2286529430,nop,nop,sack 1 {6024:8920}], length 0
00:00:00.058411 IP 10.10.10.10.60460 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 792866639 ecr 2286529430,nop,nop,sack 1 {6024:10368}], length 0
00:00:00.058474 IP 10.10.11.12.http > 10.10.10.10.60460: Flags [.], seq 4576:6024, ack 141, win 1040, options [nop,nop,TS val 2286529460 ecr 792866639], length 1448: HTTP

Here, without the proposed fix, second ack would've reset the dupack counter to 0 as its an ack with SACK but also updates window. This would've taken more acks to trigger fast-retransmit.

When I disabled SACK, I could see the old behavior (which is expected)

00:00:00.046556 IP 10.10.10.10.50305 > 10.10.11.12.http: Flags [.], ack 4576, win 1018, options [nop,nop,TS val 798054965 ecr 535100656], length 0
00:00:00.048634 IP 10.10.10.10.50305 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 798054967 ecr 535100656], length 0
00:00:00.048666 IP 10.10.10.10.50305 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 798054968 ecr 535100656], length 0
00:00:00.050614 IP 10.10.10.10.50305 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 798054969 ecr 535100656], length 0
00:00:00.050672 IP 10.10.10.10.50305 > 10.10.11.12.http: Flags [.], ack 4576, win 1040, options [nop,nop,TS val 798054970 ecr 535100656], length 0
00:00:00.050725 IP 10.10.11.12.http > 10.10.10.10.50305: Flags [.], seq 4576:6024, ack 141, win 1040, options [nop,nop,TS val 535100684 ecr 798054970], length 1448: HTTP

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

hiren retitled this revision from to Fix dupack processing to detect loss correctly..
hiren updated this object.
hiren edited the test plan for this revision. (Show Details)
hiren added reviewers: transport, network.
hiren updated this object.
hiren removed a subscriber: imp.

Here is what this change is trying to bring from rfc5681 and rfc6675:
a) th_ack < snd_una --> ignore it completely
b) th_ack == snd_una, window == old_window --> increment
c) th_ack == snd_una, window != old_window, change_in_sacked_bytes == 0 --> do nothing
d) th_ack == snd_una, window != old_window, change_in_sacked_bytes != 0 --> increment

If SACK is enabled on the segment, are you following the definition of "duplicate ACK" from RFC 5681 or RFC 6675? RFC 5681 only counts duplicate ACKs when they contain no data, they contain no window change, and they ACK snd_una. RFC 6675 counts duplicate ACKs anytime they contain a SACK for a previously unacknowledged block, even if the window has changed, they contain data, or they ACK beyond snd_una.

sys/netinet/tcp_sack.c
394 ↗(On Diff #10362)

This line appears to calculate the difference between the number of bytes covered by SACK options included with subsequent packets. However, these numbers are not directly comparable. In particular, each segment only includes SACKs for the four most recently-received data blocks. It is possible for consecutive ACKs to contain SACK blocks that cover the same amount of data, but which actually SACK different data.

For example, both of these ACKs cover 1000 bytes:

  • ACK #1
    • SACK 1000:1500
    • SACK 2000:2500
    • SACK 3000:3500
    • SACK 4000:4500
  • ACK #2
    • SACK 2000:2500
    • SACK 3000:3500
    • SACK 4000:4500
    • SACK 5000:5500

However, because ACK #2 covers an additional SACK block, this SACK contains additional information and may count as a duplicate ACK under either RFC 5681 or RFC 6675.

Also, it is possible for ACKs to arrive out of order. In that case, you might process an ACK with three 512-byte SACKs just before an ACK with two 512-byte SACKs. The second ACK might actually have been sent first. Because the second ACK is an "old ACK" by the time it arrive, it probably should not count as a duplicate ACK.

For purposes of RFC 6675 (if that's what you're aiming to support), you probably need to modify the scoreboard processing to increment a counter when bytes from a hole are SACKed.

In D4225#89228, @jtl wrote:

Here is what this change is trying to bring from rfc5681 and rfc6675:
a) th_ack < snd_una --> ignore it completely
b) th_ack == snd_una, window == old_window --> increment
c) th_ack == snd_una, window != old_window, change_in_sacked_bytes == 0 --> do nothing
d) th_ack == snd_una, window != old_window, change_in_sacked_bytes != 0 --> increment

If SACK is enabled on the segment, are you following the definition of "duplicate ACK" from RFC 5681 or RFC 6675? RFC 5681 only counts duplicate ACKs when they contain no data, they contain no window change, and they ACK snd_una. RFC 6675 counts duplicate ACKs anytime they contain a SACK for a previously unacknowledged block, even if the window has changed, they contain data, or they ACK beyond snd_una.

A very good catch. Aim here is to combine both, if possible.

One very interesting thing I missed here is this bit from rfc6675: It's a dupack even if it ACKs *beyond* snd_una as long as it has SACK data identifying something between HighAck (snd_una) and HighData (snd_max).

This is very confusing to me. HighData keeps moving as you go on. (Because you are also sending data while trying to gather dupacks and trigger fast-retransmit.) How can you count something that (s)acks close to right edge (HighData) as a duplicate ack?
Did I read the definition wrong or is there something obvious am I missing?

Thanks for for the review, I really appreciate it. :-)

Responding to your other comment inline.

sys/netinet/tcp_sack.c
394 ↗(On Diff #10362)

You caught me. I was cheating by adding this as I had it in my PRR (Proportional Rate Reduction) patch that I am working on. In that I need this changed bytes.

But yes, your example is clear and something I did think about and ignored.

Also, all we want to know is if we got new information or not. We don't really care about number of bytes here for the purpose of this change. I'll go back to the approach that Randall and I discussed before of making tcp_sack_doack() return 'true' when there is new data in incoming SACK information. I believe that should serve the purpose here.

WRT oldack, the code right now resets dupack counter to 0 when ad oldack arrives. I am planning to change that so that it ignores oldacks. No increment, no reset.

In D4225#89230, @hiren wrote:

One very interesting thing I missed here is this bit from rfc6675: It's a dupack even if it ACKs *beyond* snd_una as long as it has SACK data identifying something between HighAck (snd_una) and HighData (snd_max).

This is very confusing to me. HighData keeps moving as you go on. (Because you are also sending data while trying to gather dupacks and trigger fast-retransmit.) How can you count something that (s)acks close to right edge (HighData) as a duplicate ack?
Did I read the definition wrong or is there something obvious am I missing?

If the new ACK contains a SACK (not an ACK, but a SACK) for previously un-SACKed data, it is an indication that the un-(S)ACKed segments before it are potentially lost.

I find section 5 of RFC 6675 to be a little ambiguous. Does a cumulative ACK that is also a duplicate ACK first reset the counter to 0 and then also immediately increment it by 1? Or, does it merely increment it? I think it must be the former (reset to 0 and then immediately increment it). This makes sense as the combination of actions which would have been taken if the cumulative ACK (with no new SACK blocks) and a duplicate ACK (with a new SACK block) had been received separately. (And, perhaps, this helps to clarify why they have designated these as "duplicate ACKs"?)

Also, recall that RFC 6675 also counts SACK blocks for sequence numbers above each hole. In this context, it makes sense to count SACK blocks as duplicate ACKs on a per-hole basis in a wider range of cases. And, it is necessary to count these cumulative ACKs with new SACK blocks as duplicate ACKs in order to enter the section of the algorithm that checks the per-hole duplicate ACK counter.

This has been a most non-authoritative answer. But, hopefully, it is somewhat helpful.

In D4225#89237, @jtl wrote:
In D4225#89230, @hiren wrote:

One very interesting thing I missed here is this bit from rfc6675: It's a dupack even if it ACKs *beyond* snd_una as long as it has SACK data identifying something between HighAck (snd_una) and HighData (snd_max).

This is very confusing to me. HighData keeps moving as you go on. (Because you are also sending data while trying to gather dupacks and trigger fast-retransmit.) How can you count something that (s)acks close to right edge (HighData) as a duplicate ack?
Did I read the definition wrong or is there something obvious am I missing?

If the new ACK contains a SACK (not an ACK, but a SACK) for previously un-SACKed data, it is an indication that the un-(S)ACKed segments before it are potentially lost.

I find section 5 of RFC 6675 to be a little ambiguous. Does a cumulative ACK that is also a duplicate ACK first reset the counter to 0 and then also immediately increment it by 1? Or, does it merely increment it? I think it must be the former (reset to 0 and then immediately increment it). This makes sense as the combination of actions which would have been taken if the cumulative ACK (with no new SACK blocks) and a duplicate ACK (with a new SACK block) had been received separately. (And, perhaps, this helps to clarify why they have designated these as "duplicate ACKs"?)

Also, recall that RFC 6675 also counts SACK blocks for sequence numbers above each hole. In this context, it makes sense to count SACK blocks as duplicate ACKs on a per-hole basis in a wider range of cases. And, it is necessary to count these cumulative ACKs with new SACK blocks as duplicate ACKs in order to enter the section of the algorithm that checks the per-hole duplicate ACK counter.

This has been a most non-authoritative answer. But, hopefully, it is somewhat helpful.

Yeah, this whole thing is getting into a weird territory.

tcp_input.c: tcp_do_segment(), very early in the code, when we receive an ACK advancing snd_una,

                if (tlen == 0) { 
                        if (SEQ_GT(th->th_ack, tp->snd_una) &&
                            SEQ_LEQ(th->th_ack, tp->snd_max) &&
                            !IN_RECOVERY(tp->t_flags) &&
                            (to.to_flags & TOF_SACK) == 0 && 
                            TAILQ_EMPTY(&tp->snd_holes)) {

<blah>
                                tp->t_dupacks = 0;

And later we do 'Ack processing' where we try to process duplicate Acks. I suspect anything that advances snd_una would've caused dupack counter reset by the time it gets here to detect loss. I think.

On top of this we'd need per-hole dupack counters to do the right thing wrt this requirement.

I am of the opinion that I'd just mention in the commitlog that this is one piece that we are not following and tidy up part of the patch to detect actual change in sack info (as opposed to what I have now i.e change in bytes) and go with that.

Let me know if that sounds okay to you.

Edit: just realized we won't get into the condition above because of "(to.to_flags & TOF_SACK) == 0" part. But I still would like to proceed without complying to this one requirement.

In D4225#89249, @hiren wrote:

I am of the opinion that I'd just mention in the commitlog that this is one piece that we are not following and tidy up part of the patch to detect actual change in sack info (as opposed to what I have now i.e change in bytes) and go with that.

Let me know if that sounds okay to you.

I agree that full RFC 6675 support is non-trivial.

While not full RFC 6675 support, I think it would be nice to follow the part where we reset the dupack counter when we receive a new cumulative ACK, and then increment the counter if there is new SACK data. However, this changes some fundamental assumptions in our code and may be non-trivial depending on the new code path.

Having said that, working RFC 5681 support would be better than what we have today. :-)

RFC 5681 support changes very few of the assumptions we currently have except for using the change in SACKd bytes to determine whether something is really a duplicate ACK.

So, for RFC 5681-ish support (informed a bit by RFC 6675, but very much NOT supporting it), I would envision (in pseudo-code):

if (th_ack < snd_una)
    // Do nothing
else if SACK_ENABLED {
    if (new_sacked_bytes != old_sacked_bytes)
        dupack++;
} else if (new_window == old_window && tlen == 0)
    dupack++;

I think this follows the spirit of RFC 5861 (which says we should use SACK data to determine if something is a duplicate) without following its exact restrictions in the SACK case. In the SACK case, it loosens the dupack requirements somewhat (sort of like RFC 6675, but without fully supporting RFC 6675).

Is that about what you had in mind? In fact, looking back at your patch, I think it is just about what you implemented, except for the (new_window == old_window && SACK_ENABLED) case.

In D4225#89258, @jtl wrote:
In D4225#89249, @hiren wrote:

I am of the opinion that I'd just mention in the commitlog that this is one piece that we are not following and tidy up part of the patch to detect actual change in sack info (as opposed to what I have now i.e change in bytes) and go with that.

Let me know if that sounds okay to you.

I agree that full RFC 6675 support is non-trivial.

While not full RFC 6675 support, I think it would be nice to follow the part where we reset the dupack counter when we receive a new cumulative ACK, and then increment the counter if there is new SACK data. However, this changes some fundamental assumptions in our code and may be non-trivial depending on the new code path.

It seems doable. Right after the main 'if' loop that checks/detects dupacks for fast-retrans, I could do:

if (SEQ_GT(th->th_ack, tp->snd_una)) {
        t_dupacks = 0;
        if ((tp->t_flags & TF_SACK_PERMIT) && sack_changed)
                t_dupacks ++;
}

So, for RFC 5681-ish support (informed a bit by RFC 6675, but very much NOT supporting it), I would envision (in pseudo-code):

if (th_ack < snd_una)
    // Do nothing
else if SACK_ENABLED {
    if (new_sacked_bytes != old_sacked_bytes)
        dupack++;
} else if (new_window == old_window && tlen == 0)
    dupack++;

Is that about what you had in mind? In fact, looking back at your patch, I think it is just about what you implemented, except for the (new_window == old_window && SACK_ENABLED) case.

I _think_ that case is covered and we would do ++ no matter sack is present or not.

I have the patch ready with correct way to identify change in sack info. I'll update in a bit.

In D4225#89260, @hiren wrote:

Is that about what you had in mind? In fact, looking back at your patch, I think it is just about what you implemented, except for the (new_window == old_window && SACK_ENABLED) case.

I _think_ that case is covered and we would do ++ no matter sack is present or not.

Exactly. My pseudo-code proposal showed that we would not increment if (new_window == old_window && SACK_ENABLED && new_sacked_bytes == old_sacked_bytes). I think that is more in keeping with the spirit of the RFCs we are discussing.

See, for example, this snippet from RFC 5681:

Alternatively, a TCP that utilizes selective acknowledgments
(SACKs) [RFC2018, RFC2883] can leverage the SACK information to
determine when an incoming ACK is a "duplicate" (e.g., if the ACK
contains previously unknown SACK information).

I have the patch ready with correct way to identify change in sack info. I'll update in a bit.

Look forward to seeing it.

In D4225#89261, @jtl wrote:
In D4225#89260, @hiren wrote:

Is that about what you had in mind? In fact, looking back at your patch, I think it is just about what you implemented, except for the (new_window == old_window && SACK_ENABLED) case.

I _think_ that case is covered and we would do ++ no matter sack is present or not.

Exactly. My pseudo-code proposal showed that we would not increment if (new_window == old_window && SACK_ENABLED && new_sacked_bytes == old_sacked_bytes). I think that is more in keeping with the spirit of the RFCs we are discussing.

See, for example, this snippet from RFC 5681:

Alternatively, a TCP that utilizes selective acknowledgments
(SACKs) [RFC2018, RFC2883] can leverage the SACK information to
determine when an incoming ACK is a "duplicate" (e.g., if the ACK
contains previously unknown SACK information).

Ah, I see what you are saying. That would be correct if we take this sentence verbatim. But if we think about it, no change in window, no change in SACK info, wouldn't that ought to be a duplicate ack?

what I am assuming here is, that sentence is supposed to be applied on top of existing premise. (i.e. window is the same, its a dupack BUT it's also a dupack if sack info has changed evenif window is different).

I am not sure if you can make sense of anything I wrote above :p

Thanks again.

In D4225#89262, @hiren wrote:

Ah, I see what you are saying. That would be correct if we take this sentence verbatim. But if we think about it, no change in window, no change in SACK info, wouldn't that ought to be a duplicate ack?

This goes back to our discussion about the definition of "duplicate ack". ;-)

what I am assuming here is, that sentence is supposed to be applied on top of existing premise. (i.e. window is the same, its a dupack BUT it's also a dupack if sack info has changed evenif window is different).

I am not sure if you can make sense of anything I wrote above :p

I think so. You are thinking of the SACK data as being additive on top of the other things. And, that is one interpretation of the "Alternatively" in the section I quoted. The question is what is it alternative to? One of the checks, all of the checks, some set of checks?

But, let's go back to the reason we care about duplicate ACKs. Without SACK, they are our indication that the remote side might have received a data block above the next block it expected. Without SACK, all it can do is retransmit an ACK for the last packet it received. When we get enough of those, we kind of get the clue we need to do something.

But, with SACK, we have greater information. We know when a new packet (that didn't fill the first hole) has been received. And, we know when we're getting an ACK that adds no new SACK information (meaning no new packets got through). Especially in light of RFC 6675, I think it is reasonable to take some liberty and ignore duplicate ACKs on a SACK-enabled session unless they SACK more blocks.

  • Fix dupack processing to detect loss correctly.
  • Addressing jlt@'s concerns.
    • Fixed detection of change in SACK info.
    • Increment t_dupacks when an ACK is advancing the left edge (snd_una) but also has previously unkonwn SACK info.
In D4225#89278, @jtl wrote:

But, with SACK, we have greater information. We know when a new packet (that didn't fill the first hole) has been received. And, we know when we're getting an ACK that adds no new SACK information (meaning no new packets got through). Especially in light of RFC 6675, I think it is reasonable to take some liberty and ignore duplicate ACKs on a SACK-enabled session unless they SACK more blocks.

Okay, I can see your reasoning. But I cannot imagine a scenario when receiver would send 2 back to back acks with same cumulative ack number *and* same SACK info. Can you explain when this is possible? May be I am missing something.

In D4225#89288, @hiren wrote:

Okay, I can see your reasoning. But I cannot imagine a scenario when receiver would send 2 back to back acks with same cumulative ack number *and* same SACK info. Can you explain when this is possible?

Its certainly possible in the (tlen != 0) case, but I don't think we're talking about that. In the (tlen == 0) case, the most obvious reason is out-of-order ACKs (ACK1 SACKs block 1, ACK2 SACKs blocks 1 and 2, but we receive ACK2 before ACK1).

Other than that, off the top of my head, the next most obvious reasons are anomalies:

  • network errors that duplicate packets (its been seen before!)
  • anomalous TCP stacks
  • DoS attacks
In D4225#89289, @jtl wrote:
In D4225#89288, @hiren wrote:

Okay, I can see your reasoning. But I cannot imagine a scenario when receiver would send 2 back to back acks with same cumulative ack number *and* same SACK info. Can you explain when this is possible?

Its certainly possible in the (tlen != 0) case, but I don't think we're talking about that. In the (tlen == 0) case, the most obvious reason is out-of-order ACKs (ACK1 SACKs block 1, ACK2 SACKs blocks 1 and 2, but we receive ACK2 before ACK1).

In this case we'd detect change in SACK info when ACK1 arrives.

Other than that, off the top of my head, the next most obvious reasons are anomalies:

  • network errors that duplicate packets (its been seen before!)
  • anomalous TCP stacks
  • DoS attacks

Okay, I agree. Any sort of such anomaly should not be able to increment dupack counter to make us do fast retransmit.

Let me juggle around the if/else a bit :-)

  • Fix dupack processing to detect loss correctly.
  • Addressing jlt@'s concerns.
  • If sack is present but SACK info has not changed, it should not affect dupack counter.
sys/netinet/tcp_input.c
2489 ↗(On Diff #10391)

Hum, I need to update the comment about this and also should update the condition to say:
(tp->t_flags &TF_SACK_PERMIT) && tiwin == tp->snd_wnd && !sack_changed

Do we care about window being the same here in this check?

sys/netinet/tcp_input.c
2489 ↗(On Diff #10391)

I think I should add that condition. We don't want to skip striking the counter if window has changed.

  • Fix dupack processing to detect loss correctly.
  • Addressing jlt@'s concerns.
  • If sack is present but SACK info has not changed, it should not affect dupack counter.
  • Updating the condition to skip dupack increment when an ack arrives without new sack info or window update. Also added some commments.
  • Fix dupack processing to detect loss correctly.
  • Addressing jlt@'s concerns.
  • If sack is present but SACK info has not changed, it should not affect dupack counter.
  • Updating the condition to skip dupack increment when an ack arrives without new sack info or window update. Also added some commments.
  • Remove an old/wrong entry.
sys/netinet/tcp_input.c
2638 ↗(On Diff #10407)

This portion looks incorrect and unnecessary. This causes a wrong dupack increment event when an ACK comes which moves left edge and also is the first ack with SACK info in it. This code would detect that as a duplicate ack which is not correct. An example below:

00:00:00.046409 IP 10.10.10.10.56022 > 10.10.11.12.http: Flags [.], ack 1680, win 1018, options [nop,nop,TS val 1056985310 ecr 195535656], length 0
00:00:00.048354 IP 10.10.10.10.56022 > 10.10.11.12.http: Flags [.], ack 3128, win 1040, options [nop,nop,TS val 1056985313 ecr 195535657,nop,nop,sack 1 {4576:6024}], length 0
00:00:00.050344 IP 10.10.10.10.56022 > 10.10.11.12.http: Flags [.], ack 3128, win 1040, options [nop,nop,TS val 1056985315 ecr 195535657,nop,nop,sack 1 {4576:7472}], length 0
00:00:00.050580 IP 10.10.10.10.56022 > 10.10.11.12.http: Flags [.], ack 3128, win 1040, options [nop,nop,TS val 1056985315 ecr 195535657,nop,nop,sack 1 {4576:8920}], length 0
00:00:00.050738 IP 10.10.11.12.http > 10.10.10.10.56022: Flags [.], seq 3128:4576, ack 141, win 1040, options [nop,nop,TS val 195535684 ecr 1056985315], length 1448: HTTP

With this code, first 'ack 3128' is considered first dupack, which is wrong.

I'll have to think a bit more about this and update.

sys/netinet/tcp_input.c
2638 ↗(On Diff #10407)

Huh. Now that I think about it, this *is* the correct way to identify a "dupack" if we want to follow rfc6675.

jtl added a reviewer: jtl.

FWIW, RFC 6675 also contemplates looking for "duplicate ACKs" in packets with data. Implementing that check requires a change to the fast-path code. It also requires a change to the (tlen == 0) condition in tcp_do_segment at line 2439.

Personally, I think this change has merit in its current form. I've made some comments where I think we could make some small tweaks. But, if you want to commit it "as is", I'm OK with that.

sys/netinet/tcp_input.c
2489 ↗(On Diff #10407)

What's the rationale for checking the window information on a SACK session? On a non-SACK session, it is a clue that this a packet might have been lost (i.e. this is a duplicate ACK). On a SACK session, the updated SACK information is a clue that a packet might have been lost.

2638 ↗(On Diff #10407)

Huh. Now that I think about it, this *is* the correct way to identify a "dupack" if we want to follow rfc6675.

Agreed. However, I think this code will produce false positives. Namely, I think sack_changed will always be true when snd_una advances. I think this check would be closer to the behavior envisioned by RFC6675 if sack_changed only indicated whether there was a new SACK block (as distinct from advancing snd_una).

sys/netinet/tcp_sack.c
348 ↗(On Diff #10407)

Note that this function treats (snd_una, th_ack) as a SACK hole. Therefore, any new ACK should cause it to return 1.

This revision is now accepted and ready to land.Dec 1 2015, 5:34 PM
In D4225#91275, @jtl wrote:

FWIW, RFC 6675 also contemplates looking for "duplicate ACKs" in packets with data. Implementing that check requires a change to the fast-path code. It also requires a change to the (tlen == 0) condition in tcp_do_segment at line 2439.

Agree.

Personally, I think this change has merit in its current form. I've made some comments where I think we could make some small tweaks. But, if you want to commit it "as is", I'm OK with that.

Thank you. I've responded in-line.

sys/netinet/tcp_input.c
2489 ↗(On Diff #10407)

You are right. If SACK in present and sack_changed is FALSE, there is no point in checking window information. As at that point, it cannot be a dupack and we'd just want to skip it. I'll revert back to the prev condition and remove ' tiwin == tp->snd_wnd' check.

2638 ↗(On Diff #10407)

rfc6675 says "a segment that arrives carrying a SACK block that identifies previously unacknowledged and un-SACKed octets between HighACK and HighData." And not strictly a *new* sack block. And as you rightly said, we count snd_una-th_ack a hole, any change in that should be considered a new info.
(I am sure we are getting into the interpretation/implementation differences at this point ;-))

sys/netinet/tcp_sack.c
348 ↗(On Diff #10407)

Right, any ack that moves snd_una. As I think that is also new sack info. (see above)

hiren edited edge metadata.
  • Fix dupack processing to detect loss correctly.
  • Addressing jlt@'s concerns.
  • If sack is present but SACK info has not changed, it should not affect dupack counter.
  • Updating the condition to skip dupack increment when an ack arrives without new sack info or window update. Also added some commments.
  • Remove an old/wrong entry.
  • Remove window information check when acks with SACK have no new SACK info in them. We want to skip them.
This revision now requires review to proceed.Dec 1 2015, 8:54 PM
jtl edited edge metadata.

See my inline comment (and earlier comment about tlen != 0). But, again, its still OK with me to commit this "as is".

sys/netinet/tcp_input.c
2638 ↗(On Diff #10649)

Hmmm... I am looking at the same words as you, but reading them differently. :-)

To be a "duplicate acknowledgement", the ACK needs to contain "a SACK block that identifies previously unacknowledged and un-SACKed octets". Even though we internally treat an ACK that advances snd_una as a SACK for (snd_una, th_ack), that is merely an implementation choice. It doesn't make the ACK actually become a SACK block.

And, I think there is good reason to treat these differently. A new SACK block is an indication that packets with lower sequence numbers may have been lost. However, receiving an ACK that advances snd_una does not provide such an indication.

This revision is now accepted and ready to land.Dec 2 2015, 2:31 AM
In D4225#91474, @jtl wrote:

See my inline comment (and earlier comment about tlen != 0). But, again, its still OK with me to commit this "as is".

Ah, I agree to your point of just left edge moving is not a dupack as it is actually "good" news as opposed to "bad" news given by actual sack info change. But as it requires more work in the way we handle sack blocks, I'd skip that part and commit as is.

Thanks for a thorough review all along. I've learned quite a bit in the process.

This revision was automatically updated to reflect the committed changes.