Page MenuHomeFreeBSD

Close t_tfo_pending leaks
AbandonedPublic

Authored by jtl on Oct 11 2016, 3:26 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Aug 29, 3:54 PM
Unknown Object (File)
Fri, Aug 23, 7:57 AM
Unknown Object (File)
Fri, Aug 23, 3:57 AM
Unknown Object (File)
Thu, Aug 15, 1:38 AM
Unknown Object (File)
May 11 2024, 10:19 PM
Unknown Object (File)
May 11 2024, 10:18 PM
Unknown Object (File)
May 11 2024, 10:18 PM
Unknown Object (File)
May 10 2024, 5:47 AM
Subscribers

Details

Reviewers
pkelsey
Group Reviewers
transport
Summary

It appears there are circumstances in which the code will increment t_tfo_pending without decrementing it. This can lead to the TFO capability being disabled and/or a memory leak.

This was found by code inspection.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 5563
Build 5806: arc lint + arc unit

Event Timeline

jtl retitled this revision from to Close t_tfo_pending leaks.
jtl updated this object.
jtl edited the test plan for this revision. (Show Details)
jtl added reviewers: transport, pkelsey.

I can only identify two circumstances where the t_tfo_pending counter on a listen socket will be incremented without a corresponding decrement ever occurring.

The first is when all of the following conditions are met:

  1. TFO is enabled in the system
  2. A TFO SYN with an invalid TFO cookie matches a listen socket that has TFO enabled
  3. The current t_tfo_pending count on that socket is <= so_qlimit / 2

The second is when all of the following conditions are met:

  1. TFO is enabled in the system
  2. A TFO SYN matches a listen socket that has TFO enabled
  3. The current t_tfo_pending count on that socket is <= so_qlimit / 2
  4. A matching entry from a prior non-TFO SYN is in the syncache

The consequences are:

  1. If either of the above circumstances occurs at least once on a given listen socket, four bytes can be considered leaked when that listen socket and every socket passively created from it is destroyed.
  2. If the above circumstances occur a total of (so_qlimit / 2 + 1) times on a given listen socket, TFO will become disabled for that socket.

See line-specific comments in the patch.

I can only identify one circumstance where the t_tfo_pending counter on a listen socket will be incremented without a corresponding decrement ever occurring - when all of the following conditions are met:

  1. TFO is enabled in the system
  2. A TFO SYN with an invalid TFO cookie matches a listen socket that has TFO enabled
  3. The current t_tfo_pending count on that socket is <= so_qlimit / 2

The other way is...

  1. TFO enabled
  2. t_tfo_pending count <= so_qlimit / 2
  3. SYN without a TFO option, followed by a SYN with a valid TFO cookie and the same IPs and port numbers

See line-specific comments in the patch.

I don't see in-line comments. Perhaps you can resubmit them?

There is still a path that leaks when mac_syncache_init() fails.

There is another approach to solving this that might be a little cleaner.

At the top, declare:

unsigned int *tfo_decrement_counter = NULL;

Remove the atomic_subtract_int(tp->t_tfo_pending, 1) altogether and instead follow the atomic_fetchadd_int() with an unconditional:

tfo_decrement_counter = tp->t_tfo_pending;

Then, just prior to the tfo_done label:

if (tfo_decrement_counter)

tcp_fastopen_decrement_counter(tfo_decrement_counter);

The above amounts to decrementing the pending counter whenever it has been incremented but we did not also create a TFO socket, which is exactly the set of cases that need to be handled to avoid the leak (if a TFO socket is created, then that socket context provides the path to the future decrement). Saving a pointer to the allocated counter and using tcp_fastopen_decrement_counter() takes care of the fact that by the time we reach the 'done' label, the inp lock on the listen socket has been released, and the listen socket may be gone entirely (the counter itself is guaranteed to still exist because the increment we want to undo is the reference we hold to it).

sys/netinet/tcp_syncache.c
1232

This would solve the problem for the first circumstance, and also for the second circumstance when the TFO cookie is invalid.

1243

The problem remains in this path for the second circumstance where the TFO cookie is valid.

1281

This would solve the problem for the second circumstance when the TFO cookie is valid.

In D8218#170623, @jtl wrote:

I can only identify one circumstance where the t_tfo_pending counter on a listen socket will be incremented without a corresponding decrement ever occurring - when all of the following conditions are met:

  1. TFO is enabled in the system
  2. A TFO SYN with an invalid TFO cookie matches a listen socket that has TFO enabled
  3. The current t_tfo_pending count on that socket is <= so_qlimit / 2

The other way is...

  1. TFO enabled
  2. t_tfo_pending count <= so_qlimit / 2
  3. SYN without a TFO option, followed by a SYN with a valid TFO cookie and the same IPs and port numbers

See line-specific comments in the patch.

I don't see in-line comments. Perhaps you can resubmit them?

We are crossing in time. While you were writing this, I was editing my first comment and working on the per-line comments. There was probably a better order of operations I could have used with this interface w.r.t the per-line comments. The fact that I realized the second circumstance for the problem as soon as I hit submit on my first comment of course can't be helped :)

There is still a path that leaks when mac_syncache_init() fails.

Bah. You're right.

There is another approach to solving this that might be a little cleaner.

At the top, declare:

unsigned int *tfo_decrement_counter = NULL;

Remove the atomic_subtract_int(tp->t_tfo_pending, 1) altogether and instead follow the atomic_fetchadd_int() with an unconditional:

tfo_decrement_counter = tp->t_tfo_pending;

Then, just prior to the tfo_done label:

if (tfo_decrement_counter)

tcp_fastopen_decrement_counter(tfo_decrement_counter);

Having thought about it a bit, I think your approach is probably the best one. It provides the most "future-proofing" against accidental leaks introduced by logic changes to the non-TFO portions of the function.

I am probably going to tweak it a little bit to make it a little clearer to people reading the code (by my subjective interpretation) at the possible expense of a few extra instructions.

jtl edited edge metadata.

Rework the fix in keeping with the discussion in the review comments.

This is functionally equivalent, but I don't think it is clearer. I think it is less clear that the conditional decrement is placed after the tfo_done label as it will never execute after a goto tfo_done. There is clear one-to-one correspondence between creating a TFO socket and not needing this decrement that I think is blurred by this positioning because the decrement is not an action that is common to 'done' and 'tfo_done', it is unique to 'done'. I think we would be better off if the code wasn't arranging to cancel a conditional action because it was placed in a path it never needs to be in.

Really, the conditional decrement should be placed above the tfo_done label. I also think it would be plenty clear to then rework the comment above goto tfo_done (no longer needed there, and incomplete as it is) and place it above this conditional decrement to explain that all paths that increment the counter but that don't create a TFO socket that will decrement it later transit this point.

sys/netinet/tcp_syncache.c
1480

This description is not complete. If a valid TFO SYN creates a TFO socket, but the client never speaks again, the counter will be decremented in tcp_close() after the KEEPINIT timer expires - the tcp_input path will not be involved in this case.

This is functionally equivalent, but I don't think it is clearer. I think it is less clear that the conditional decrement is placed after the tfo_done label as it will never execute after a goto tfo_done.

I placed it there as a matter of future-proofing. It doesn't hurt anything for it to be after the tfo_done label. But, if someone makes a code change to add a new goto tfo_done, it will still hit this check.

Because you obviously have strong feelings for exactly how this code should be (and this doesn't really impact me), I've opened a bug and assigned it to you. You can fix it at your convenience.

In D8218#170900, @jtl wrote:

This is functionally equivalent, but I don't think it is clearer. I think it is less clear that the conditional decrement is placed after the tfo_done label as it will never execute after a goto tfo_done.

I placed it there as a matter of future-proofing. It doesn't hurt anything for it to be after the tfo_done label. But, if someone makes a code change to add a new goto tfo_done, it will still hit this check.

Because you obviously have strong feelings for exactly how this code should be (and this doesn't really impact me), I've opened a bug and assigned it to you. You can fix it at your convenience.

I explained why I didn't agree with your position that you had made the code clearer and also flagged a missing case to a code comment you inserted. I can't say there is much emotional weight behind that beyond feeling that I had some relevant thoughts that I shouldn't keep a secret.

My best guess as to what the next step was going to be was that you might ask me to elaborate on the opposite position I have on code clarity. This result is an odd one to me. You found an issue, made an initial patch, engaged the original author (me), there was some point/counterpoint (so far so good), and now what? I post the revision to your patch (that I have already described above) for review and ask anyone but the one other person who clearly has their head around the underlying issue (you) to review it?