Page MenuHomeFreeBSD

NULL out cc_data in pluggable TCP {cc}_cb_destroy
ClosedPublic

Authored by kbowling on Jul 16 2018, 3:27 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 31, 8:58 AM
Unknown Object (File)
Sun, Dec 29, 9:29 PM
Unknown Object (File)
Tue, Dec 24, 6:10 PM
Unknown Object (File)
Sun, Dec 22, 12:09 AM
Unknown Object (File)
Fri, Dec 20, 9:03 AM
Unknown Object (File)
Wed, Dec 18, 10:46 PM
Unknown Object (File)
Wed, Dec 18, 12:35 PM
Unknown Object (File)
Sun, Dec 15, 4:30 PM
Subscribers

Details

Summary

When ABE was added (rS331214) to NewReno and leak fixed (rS333699) , it now has a destructor (newreno_cb_destroy) for per connection state. Other congestion controls may allocate and free cc_data on entry and exit, but the field is never explicitly NULLed if moving back to NewReno which only internally allocates stateful data (no entry contstructor) resulting in a situation where newreno_cb_destory might be called on a junk pointer.

  • NULL out cc_data in the framework after calling {cc}_cb_destroy
  • free(9) checks for NULL so there is no need to perform not NULL checks before calling free.
  • Improve a comment about NewReno in tcp_ccalgounload

This is the result of a debugging session from Jason Wolfe, Jason Eggleston, and mmacy@ and very helpful insight from lstewart@.

Sponsored by: Limelight Networks

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

This revision is now accepted and ready to land.Jul 16 2018, 3:51 AM
kbowling edited the summary of this revision. (Show Details)

Fix comment and call order in tcp_ccalgounload, add some asserts about clearing CC_DATA in the modules.

This revision now requires review to proceed.Jul 16 2018, 4:20 AM
kbowling retitled this revision from NULL out cc_data in pluggable TCP {cc}_cb_destory to NULL out cc_data in pluggable TCP {cc}_cb_destroy.Jul 16 2018, 5:12 AM

When ABE was added (rS331214) to New Reno and leak fixed (rS333699) , some assumptions about the state of it as the default and always ready cc seem to have changed, notably it now has a constructor and destructor (newreno_cb_destroy) for per connection state.

Not true for either commit. 331214 still allowed for an "always available" NewReno in the event of malloc failure in the constructor, and r333699 goes one step better and killed off the constructor and allocs on demand only when a setting that requires state is changed.

As to the actual problem you're trying to nut out, I suspect the problematic interaction is that NewReno is now (post r333699) assuming that if cc_data != NULL that it has malloced that memory and can use it, which is indeed a bug in the context of falling back to NewReno. Your change is therefore correct but, should be enforced in the framework i.e. I'd suggest removing the changes to NULL cc_data from the individual modules and explicitly do it anywhere cb_destroy() is called in framework code.

To clarify though, are you actually using ABE or changing NewReno's beta/beta_ecn?

sys/netinet/tcp_subr.c
1738 ↗(On Diff #45346)

This comment shouldn't be ABE specific. Perhaps something along the lines of:
"NewReno may allocate memory on demand for certain stateful configuration as needed, but is coded in such a way as to never fail because of a memory allocation failure so that it can always act as a fail safe fallback."

Also, style(9) nit, multline comments always start/end with /* and */ on their own lines without any comment text.

If the cb_destroy virtual function shouldn't null the pointer, I assert that it shouldn't free it either.

There should be another cc_assert wrapper function that checks for not null, frees then nulls, after calling the virtual function.

If the cb_destroy virtual function shouldn't null the pointer, I assert that it shouldn't free it either.

There should be another cc_assert wrapper function that checks for not null, frees then nulls, after calling the virtual function.

I disagree. Only the module itself knows the nature of the memory it has allcoated and hung off the cc_data pointer, and the module is therefore the only thing that has any business freeing that memory. NULLing the pointer on transition between modules is something the framework should be doing as a contract/guarantee for any incoming module to know that if the cc_data pointer is not NULL, it can be assured it owns that memory, not some previous module.

Good point on sub structures.

I suppose, cb_destroy is equivalent to free. Your suggestion makes sense.

@lstewart we debated who should NULL before submitting this, I think it can be moved out of the destructors to the places I put the KASSERTs but I'll have to double check everything. One issue I foresaw NULLing in the framework is that we have to trust that the cc_cb_destroy destructor did indeed free() and can't catch mistakes with a KASSERT. But writing a new CC isn't exactly common and will probably be cribbed from an existing one so maybe I was over thinking that angle.

To answer your question, no we don't use the NewReno functions that would cause a malloc. We set cc_cubic as our default, but we noticed some transition during timewait that removes the CC_ALGO and sets it back to NewReno, and then double frees.

kbowling edited the summary of this revision. (Show Details)

Incorporate feedback from @lstewart

@lstewart we debated who should NULL before submitting this, I think it can be moved out of the destructors to the places I put the KASSERTs but I'll have to double check everything. One issue I foresaw NULLing in the framework is that we have to trust that the cc_cb_destroy destructor did indeed free() and can't catch mistakes with a KASSERT. But writing a new CC isn't exactly common and will probably be cribbed from an existing one so maybe I was over thinking that angle.

Right, if a module leaks memory, that's really its problem, and if the module gets unloaded and INVARIANTS is on, the kernel prints a warning about the leaked memory anyway (assuming the module declares its own MALLOC_TYPE which it should) so I think that's an adequate safety belt.

Thanks for finding + fixing this and apologies for introducing the bug.

One final nit: I think the change added to cdg's cb_destroy() checking for NULL cc_data is not needed - its cb_init() doesn't allow the algo to be used if malloc() fails.

I would prefer to leave the check in cc_cdg until we run to ground why the callback was getting hit twice (in the case someone has cc_cdg as default).

To leave a paper trail, the option we employ that prompted this initial investigation is net.inet.tcp.nolocaltimewait=1.

I would prefer to leave the check in cc_cdg until we run to ground why the callback was getting hit twice (in the case someone has cc_cdg as default).

Ok.

To leave a paper trail, the option we employ that prompted this initial investigation is net.inet.tcp.nolocaltimewait=1.

hrm... seems kinda unrelated.

Question: are you possibly switching between stacks as well as CC algos? I must admit to not having studied the interaction between stack and algo switching (someone else did that integration while I wasn't paying attention).

This revision is now accepted and ready to land.Jul 21 2018, 3:40 AM

Ah I think you are indeed correct, I looked at the QA test suite and they have a stress for the TCP_CONGESTION sockopt from our default cc_cubic to new reno. So we were coming into reno with freed but not NULL junk in cc_data and it therefore the framework tried to call newreno's destructor later on. Thanks for the insight!

kbowling edited the summary of this revision. (Show Details)

Remove NULL check in cc_cdg_destroy per @lstewart now that we understand the pathology.

This revision now requires review to proceed.Jul 21 2018, 7:11 PM
This revision is now accepted and ready to land.Jul 21 2018, 8:23 PM
This revision was automatically updated to reflect the committed changes.