Page MenuHomeFreeBSD

Restore multi threaded callouts in the TCP stack
AbandonedPublic

Authored by hselasky on Jan 20 2015, 7:25 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Oct 29, 2:58 PM
Unknown Object (File)
Sun, Oct 27, 11:02 AM
Unknown Object (File)
Sep 23 2024, 2:44 AM
Unknown Object (File)
Sep 16 2024, 12:54 PM
Unknown Object (File)
Sep 9 2024, 1:01 AM
Unknown Object (File)
Sep 8 2024, 12:10 PM
Unknown Object (File)
Sep 3 2024, 7:11 AM
Unknown Object (File)
Aug 10 2024, 11:56 PM
Subscribers

Details

Reviewers
lstewart
gnn
adrian
rwatson
Group Reviewers
network
Summary

Restore multi threaded callouts in the TCP stack. Use new callout_drain_async() API to avoid using callouts after freeing of the TCP timer context.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

hselasky retitled this revision from to Restore multi threaded callouts in the TCP stack.
hselasky updated this object.
hselasky edited the test plan for this revision. (Show Details)
hselasky added reviewers: adrian, lstewart, rwatson, gnn.
hselasky set the repository for this revision to rS FreeBSD src repository - subversion.

Don't like the magic values sprinked through out.
More commentary and analysis as to why these bugs are actually fixed and the workarounds can be removed is in order
since this is critical infrastructure code.
More information about why you are ditching the INFO lock locking in places is in order, either in the code, or in the commit message
The lack of this analysis makes me very uneasy as I (and others) can't check to make sure you haven't forgotten a case.
It would be nice to document also what regression testing was done in the absence of any formal TCP test suite in FreeBSD today.

sys/netinet/tcp_subr.c
911

I'm pretty sure that it isn't safe to drop this lock in this context. I don't think this is a viable solution.

936

Please don't use magic constants like this. It makes the code impossible to follow.

951

This is very hacky. Why was the assert before that this was locked insufficient?

993

why 6? I know I can count below, but some comment about this that's less lame than "set initial value" would be useful.

sys/netinet/tcp_timer.c
276

It would be good to explain in the commit message why these can't happen.

409

You should put back the assert.

It would be sensible to ask Julien Charbon at Verisign to review this patch -- the code (and potential race conditions) here are remarkably subtle.

jch added a subscriber: jch.

I believe D2079: Fix TCP timers use-after-free old race conditions fixed the same use-after-free race condition than this patch. The differences are:

  • It uses the old callout API only (no callout_drain_async())
  • It does not use inp_lock to protect callouts to avoid the INP_INFO_WLOCK/INP_WLOCK LOR management burden

That said:

  • I really like the new callout_drain_async() API. Once callout_drain_async() is adopted in -CURRENT, it will make sense to update rS281599 to use it
  • Using inp_lock to protect TCP timer callout is indeed a good idea, but I don't see how to implement it in a clean way as long a both INP_INFO and INP locks are required

Please drop this review as a new proposal will be required the in the wake of other changes.

gnn: What other changes are you referring to?

sys/netinet/tcp_subr.c
911

This is what the code is doing already inside one of the timer callbacks! Are you saying that current code is broken?

936

I'll define the three values used. No problem.

951

Because I allow this function being called not having V_tcpinfo locked initially, and in that case we should drop the lock before exiting again. I'll add a comment to clarify.

993

I'll add a comment to clarify what this value is based on.

sys/netinet/tcp_timer.c
276

They can't happen any more because callouts are cancelled atomically, because they are no longer so-called MPSAFE.

409

No problem.

BTW: Looks like my browser had cached a lot of comments :-)

A similar approach was implemented by r304218. Hopefully the "callout_init_rw()" for timers in the TCP stack will be implemented too. Abandoning for now.