Page MenuHomeFreeBSD

When TCP ECN decides it wants to assure an ACK is sent it needs to do it correctly and with some limits.
AcceptedPublic

Authored by rrs on Mon, Feb 23, 1:59 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Mar 1, 3:35 PM
Unknown Object (File)
Sun, Mar 1, 2:24 PM
Unknown Object (File)
Sun, Mar 1, 3:39 AM
Unknown Object (File)
Sun, Mar 1, 3:33 AM
Unknown Object (File)
Sun, Mar 1, 3:32 AM
Unknown Object (File)
Sun, Mar 1, 2:26 AM
Unknown Object (File)
Sun, Mar 1, 2:26 AM
Unknown Object (File)
Sun, Mar 1, 2:25 AM

Details

Reviewers
rscheff
tuexen
Group Reviewers
transport
Summary

So in testing I have found two interesting cases where ECN is going to
make it so that an ack will be sent right away. These cases need to be limited
to being in the ESTABLISHED state. You don't want ECN sending ACK's when
we are transitioning in front or end states. Also we don't start a delayed ack timer
<and> at the same time set the ACKNOW flag, thats just plain wrong.

Test Plan

Validate that we don't send the ack's incorrectly even when DCTCP
says we should send an ack in a trailing state.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

rrs requested review of this revision.Mon, Feb 23, 1:59 PM

What is with BBR?

sys/netinet/tcp_input.c
542–545

Shouldn't we also process the flags in TCPS_CLOSE_WAIT?

What is with BBR?

Hmm well for BBR its a non-issue since BBRv1 does *not* support ECN. And yes thinking on it
being in CLOSE_WAIT should still allow ack's to go out. I will change that...

Ok I have added the CLOSE_WAIT stage Michael suggested. However I am not
sure that is a wise thing. It may mean an extra ack in the closing state i.e. while you
are waiting for your user to close.. but on the other hand that may take a *very* long
time.. so I am ambivalent about it

In D55460#1268933, @rrs wrote:

Ok I have added the CLOSE_WAIT stage Michael suggested. However I am not
sure that is a wise thing. It may mean an extra ack in the closing state i.e. while you
are waiting for your user to close.. but on the other hand that may take a *very* long
time.. so I am ambivalent about it

Sorry, mistake on my part. You are receiving data and still acking it. That is the
direction which is relevant, right?
Then it needs to be ESTABLISHED or FIN WAIT 1 or FIN WAIT 2.

In D55460#1268552, @rrs wrote:

What is with BBR?

Hmm well for BBR its a non-issue since BBRv1 does *not* support ECN. And yes thinking on it

OK.

being in CLOSE_WAIT should still allow ack's to go out. I will change that...

This revision is now accepted and ready to land.Mon, Feb 23, 5:09 PM

LGTM

I think CLOSE WAIT should be replaced by FIN WAIT 1 and FIN WAIT 2. My mistake.

LGTM

I think CLOSE WAIT should be replaced by FIN WAIT 1 and FIN WAIT 2. My mistake.

Actually no. I don't think so. The idea here is the main stack *not* ecn should be controlling what
acks are sent when we are closing. In FIN WAIT1/2 we are doing the 4-way handshake dance and
ECN (and any CC behind it) needs to not be involved with it. The stack in theory should still
react the correct way and go through the proper handshake sending acks finacks and anything
else it needs. We just don't want ECN toggling in TH_ACKNOW in the middle since it is
unaware of the 4 way shutdown taking place..

In D55460#1268987, @rrs wrote:

LGTM

I think CLOSE WAIT should be replaced by FIN WAIT 1 and FIN WAIT 2. My mistake.

Actually no. I don't think so. The idea here is the main stack *not* ecn should be controlling what
acks are sent when we are closing. In FIN WAIT1/2 we are doing the 4-way handshake dance and
ECN (and any CC behind it) needs to not be involved with it. The stack in theory should still
react the correct way and go through the proper handshake sending acks finacks and anything
else it needs. We just don't want ECN toggling in TH_ACKNOW in the middle since it is
unaware of the 4 way shutdown taking place..

The reason I am suggesting to include FIN WAIT 1/2 is that we stopped sending, but the
peer can send us data for an unlimited amount of time.. Shouldn't this data transfer towards
us be able to use ECN? That would mean we should send ACKs based on the received ECN markings.
Am I wrong with this? Why shouldn't half closed TCP connections not use ECN for the
direction data is still sent?

The reason I am suggesting to include FIN WAIT 1/2 is that we stopped sending, but the
peer can send us data for an unlimited amount of time.. Shouldn't this data transfer towards
us be able to use ECN? That would mean we should send ACKs based on the received ECN markings.
Am I wrong with this? Why shouldn't half closed TCP connections not use ECN for the
direction data is still sent?

Hmm FIN WAIT 2, yes I can see that since at that point you have transitioned to fully closed. And yes
the peer could keep in established on its side a long time.. But are you not in the middle of your close
in FIN WAIT 1?

In D55460#1269031, @rrs wrote:

The reason I am suggesting to include FIN WAIT 1/2 is that we stopped sending, but the
peer can send us data for an unlimited amount of time.. Shouldn't this data transfer towards
us be able to use ECN? That would mean we should send ACKs based on the received ECN markings.
Am I wrong with this? Why shouldn't half closed TCP connections not use ECN for the
direction data is still sent?

Hmm FIN WAIT 2, yes I can see that since at that point you have transitioned to fully closed. And yes
the peer could keep in established on its side a long time.. But are you not in the middle of your close
in FIN WAIT 1?

Have a look at state diagram.
When you close your writing end, you send a FIN and enter FIN WAIT 1. When you receive the ACK for this FIN, you enter FIN WAIT 2. From the perspective of your peer, there is no difference if you are in FIN WAIT 1 or FIN WAIT 2 (or ESTABLISHED). The peer can send data. As long as it can send data, ECN should be usable, I think.

In D55460#1269031, @rrs wrote:

The reason I am suggesting to include FIN WAIT 1/2 is that we stopped sending, but the
peer can send us data for an unlimited amount of time.. Shouldn't this data transfer towards
us be able to use ECN? That would mean we should send ACKs based on the received ECN markings.
Am I wrong with this? Why shouldn't half closed TCP connections not use ECN for the
direction data is still sent?

Hmm FIN WAIT 2, yes I can see that since at that point you have transitioned to fully closed. And yes
the peer could keep in established on its side a long time.. But are you not in the middle of your close
in FIN WAIT 1?

Have a look at state diagram.
When you close your writing end, you send a FIN and enter FIN WAIT 1. When you receive the ACK for this FIN, you enter FIN WAIT 2. From the perspective of your peer, there is no difference if you are in FIN WAIT 1 or FIN WAIT 2 (or ESTABLISHED). The peer can send data. As long as it can send data, ECN should be usable, I think.

I will update the review.. however I have trepidation on this whole issue. You have two ack-regimes going on.

  1. TCP's normal acking -- delayed ack or ack every packet
  2. ECN commanding acks.

Now the interaction of the two regimes could prove interesting with un-anticipated consequences. Especially in
state machine transitions. So though yes, FIN WAIT 1/2 probably should be included in the exception list, I just
feel we could have issues down the road.. not sure what, but just the interactions of these algorithms could
be a src of problems... In ESTABLISHED I see no problem worse that happens is you end up acking every packet.
But in any other state I wonder...

I will increase your exclusions.. but I am concerned...

In D55460#1269353, @rrs wrote:
In D55460#1269031, @rrs wrote:

The reason I am suggesting to include FIN WAIT 1/2 is that we stopped sending, but the
peer can send us data for an unlimited amount of time.. Shouldn't this data transfer towards
us be able to use ECN? That would mean we should send ACKs based on the received ECN markings.
Am I wrong with this? Why shouldn't half closed TCP connections not use ECN for the
direction data is still sent?

Hmm FIN WAIT 2, yes I can see that since at that point you have transitioned to fully closed. And yes
the peer could keep in established on its side a long time.. But are you not in the middle of your close
in FIN WAIT 1?

Have a look at state diagram.
When you close your writing end, you send a FIN and enter FIN WAIT 1. When you receive the ACK for this FIN, you enter FIN WAIT 2. From the perspective of your peer, there is no difference if you are in FIN WAIT 1 or FIN WAIT 2 (or ESTABLISHED). The peer can send data. As long as it can send data, ECN should be usable, I think.

I will update the review.. however I have trepidation on this whole issue. You have two ack-regimes going on.

  1. TCP's normal acking -- delayed ack or ack every packet
  2. ECN commanding acks.

Now the interaction of the two regimes could prove interesting with un-anticipated consequences. Especially in
state machine transitions. So though yes, FIN WAIT 1/2 probably should be included in the exception list, I just
feel we could have issues down the road.. not sure what, but just the interactions of these algorithms could
be a src of problems... In ESTABLISHED I see no problem worse that happens is you end up acking every packet.
But in any other state I wonder...

I will increase your exclusions.. but I am concerned...

OK. Do you think we might have problems in the protocol specifications or do you think we might have problems in the implementation?
If you have implementation restrictions in mind, then restrict the processing to ESTABLISHED and add a comment, that you intentionally exclude FIN WAIT 1/2.
If you think that there are limitations related to the protocol specifications I would like to learn.

In D55460#1269353, @rrs wrote:

Have a look at state diagram.
When you close your writing end, you send a FIN and enter FIN WAIT 1. When you receive the ACK for this FIN, you enter FIN WAIT 2. From the perspective of your peer, there is no difference if you are in FIN WAIT 1 or FIN WAIT 2 (or ESTABLISHED). The peer can send data. As long as it can send data, ECN should be usable, I think.

I will update the review.. however I have trepidation on this whole issue. You have two ack-regimes going on.

  1. TCP's normal acking -- delayed ack or ack every packet
  2. ECN commanding acks.

Now the interaction of the two regimes could prove interesting with un-anticipated consequences. Especially in
state machine transitions. So though yes, FIN WAIT 1/2 probably should be included in the exception list, I just
feel we could have issues down the road.. not sure what, but just the interactions of these algorithms could
be a src of problems... In ESTABLISHED I see no problem worse that happens is you end up acking every packet.
But in any other state I wonder...

I will increase your exclusions.. but I am concerned...

But an immediate ACK should be safe in any state - provided it is not malformed (e.g. missing FIN or SYN bit in the sequence space). If two different mechanisms come to different conclusions if to send an ACK now, sending it should never have any detrimental side effects (other than consuming slightly more bandwidth on the back channel). Delaying ACKs OTOH can negatively impact some of the mechanisms (rtt measurements, ACK clock during slow start, loss recovery, ...).

So I see that thinning ACKs could be a problem, but not ACKing with every packet - obviously, sending two ACKs within a short fraction of time, and no intermediate incoming packet would not be good though, but that shouldn't happen either.

Sending ack's in general (ESTABLISHED) will of course not cause issues. But what concerns me
is that we *do* use acks to drive the various state-machines when we are in the front or back-states.
Can ack's driven by CC/ECN interfere .. I don't know.. is it protocol or implementation probably both.
I just think there needs to be careful thought on when the CC/ECN overrides and says "ACK" if
we are not in established.

In theory yes it should never hurt to send an extra ack here or there.. but we have already seen with
the ack-war bug that this can cause issues... and thus we spent a week or two figuring out why
rack sent an extra ack that triggered the ack-war bug.

I just have an uneasy feeling about this after sinking so much time in it and can't help
but feel it will bite us in the end someplace.

I will add the exclusions for FIN states.. but I am still nervous about the whole issue.. both implementation and protocol..

Updates per latest comments.

This revision now requires review to proceed.Tue, Feb 24, 3:05 PM
sys/netinet/tcp_ecn.c
356–359
  • I think you want to use tp->t_state instead of tp_t_flags when comparing state values.
  • The above condition matches TCPS_ESTABLISHED, TCPS_CLOSE_WAIT, TCPS_FIN_WAIT_1, TCPS_CLOSING, TCPS_LAST_ACK, and TCPS_FIN_WAIT_2. Not only TCPS_ESTABLISHED, TCPS_FIN_WAIT_1, and TCPS_FIN_WAIT_2.

I guess you want something like:

if ((tp->t_state == TCPS_ESTABLISHED) ||
    (tp->t_state == TCPS_FIN_WAIT_1) ||
    (tp->t_state == TCPS_FIN_WAIT_2))
        tp->t_flags |= TF_ACKNOW;
sys/netinet/tcp_input.c
544
  • I think you want to use tp->t_state instead of tp_t_flags for comparing state values.
  • The above condition matches TCPS_ESTABLISHED, TCPS_CLOSE_WAIT, TCPS_FIN_WAIT_1, TCPS_CLOSING, TCPS_LAST_ACK, and TCPS_FIN_WAIT_2. Not only TCPS_ESTABLISHED, TCPS_FIN_WAIT_1, and TCPS_FIN_WAIT_2.

I guess you want something like:

if (((tp->t_state == TCPS_ESTABLISHED) ||
     (tp->t_state == TCPS_FIN_WAIT_1) ||
     (tp->t_state == TCPS_FIN_WAIT_2)) &&
    (tp->t_ccv.flags & CCF_ACKNOW)) {
sys/netinet/tcp_ecn.c
356–359

let me go get some more coffee and give it another go :)

fix my earlier bobble :)

added close wait.. and fixed all my earlier errors :)

In D55460#1269594, @rrs wrote:

include close-wait

I don't think we need TCPS_CLOSE_WAIT. I made a mistake earlier when suggesting it. Instead of TCPS_CLOSE_WAIT we need TCPS_FIN_WAIT_1 and TCPS_FIN_WAIT_2. Please remove it.

sys/netinet/tcp_input.c
542–545

Shouldn't we also process the flags in TCPS_CLOSE_WAIT?

No. It is wrong. This was my mistake before suggesting TCPS_FIN_WAIT_1 and TCPS_FIN_WAIT_2.

This revision is now accepted and ready to land.Wed, Feb 25, 7:36 AM