Restore multi threaded callouts in the TCP stack. Use new callout_drain_async() API to avoid using callouts after freeing of the TCP timer context.
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
Don't like the magic values sprinked through out.
More commentary and analysis as to why these bugs are actually fixed and the workarounds can be removed is in order
since this is critical infrastructure code.
More information about why you are ditching the INFO lock locking in places is in order, either in the code, or in the commit message
The lack of this analysis makes me very uneasy as I (and others) can't check to make sure you haven't forgotten a case.
It would be nice to document also what regression testing was done in the absence of any formal TCP test suite in FreeBSD today.
sys/netinet/tcp_subr.c | ||
---|---|---|
911 | I'm pretty sure that it isn't safe to drop this lock in this context. I don't think this is a viable solution. | |
936 | Please don't use magic constants like this. It makes the code impossible to follow. | |
951 | This is very hacky. Why was the assert before that this was locked insufficient? | |
993 | why 6? I know I can count below, but some comment about this that's less lame than "set initial value" would be useful. | |
sys/netinet/tcp_timer.c | ||
276 | It would be good to explain in the commit message why these can't happen. | |
409 | You should put back the assert. |
It would be sensible to ask Julien Charbon at Verisign to review this patch -- the code (and potential race conditions) here are remarkably subtle.
I believe D2079: Fix TCP timers use-after-free old race conditions fixed the same use-after-free race condition than this patch. The differences are:
- It uses the old callout API only (no callout_drain_async())
- It does not use inp_lock to protect callouts to avoid the INP_INFO_WLOCK/INP_WLOCK LOR management burden
That said:
- I really like the new callout_drain_async() API. Once callout_drain_async() is adopted in -CURRENT, it will make sense to update rS281599 to use it
- Using inp_lock to protect TCP timer callout is indeed a good idea, but I don't see how to implement it in a clean way as long a both INP_INFO and INP locks are required
Please drop this review as a new proposal will be required the in the wake of other changes.
gnn: What other changes are you referring to?
sys/netinet/tcp_subr.c | ||
---|---|---|
911 | This is what the code is doing already inside one of the timer callbacks! Are you saying that current code is broken? | |
936 | I'll define the three values used. No problem. | |
951 | Because I allow this function being called not having V_tcpinfo locked initially, and in that case we should drop the lock before exiting again. I'll add a comment to clarify. | |
993 | I'll add a comment to clarify what this value is based on. | |
sys/netinet/tcp_timer.c | ||
276 | They can't happen any more because callouts are cancelled atomically, because they are no longer so-called MPSAFE. | |
409 | No problem. |
A similar approach was implemented by r304218. Hopefully the "callout_init_rw()" for timers in the TCP stack will be implemented too. Abandoning for now.