tcp_respond() might get called with an rlock only. For logging, try to upgrade the lock, and if that does not work, just don't log.
This fixes:
Differential D32983
tcp: Fix a locking issue tuexen on Nov 13 2021, 1:10 PM. Authored by Tags None Referenced Files
Details
tcp_respond() might get called with an rlock only. For logging, try to upgrade the lock, and if that does not work, just don't log. This fixes:
Diff Detail
Event Timeline
Comment Actions The whole point of this was that there were absolutely no logs what so ever when Let me chat with Michael on this my gut says its not worth doing anything else Comment Actions Michael and I have chatted about this. The only way you Now these are all very very rare events. And both Michael and I agree So at this point I think whats here is fine and I don't think I would be Comment Actions
For the purposes of this discussion, it's important to make a distinction between absence of log events caused by a failure to log vs there being no log event hook at all i.e. to @markj's point, if there's a log point in some code, I should reasonably expect that if that code path executes I can rely on a corresponding log event being generated if logging is enabled. Not having a way to infer the loss of a log event that otherwise would have been written ultimately undermines the utility/trustworthiness of the log. Given that there is a straightforward way to avoid this problem by incrementing the serial number, I see no upside and definite downside to not addressing this.
I would argue that the rarity of the events has no bearing on agreeing to the general principle that "where there is a log point in code and logging is enabled, we will guarantee that either a log is successfully written to the log buffer, or that the serial number is bumped to convey the loss of information". If we can agree on that, then figuring out the mechanism to implement/achieve that "contract" is secondary.
Firstly I'll note that there are other ways to achieve the proposed outcome than doing an atomic increment (e.g. deferring the increment until next time the wlock is held and the next log event is generated) but I think the atomic increment is the most straight forward option and is why I proposed it for this case. That being said, I'm only really concerned about making sure the "contract" is agreed and maintained, less so about how it's done. As to the technical concerns about doing an atomic increment in this case, could you please help me understand the problem you think it could cause? My understanding (and I'm certainly no expert here so keen to be educated if I'm missing something) is that:
Am I missing a scenario/sequence of operations where there could be a problem with the inp-rlocked atomic increment approach? Comment Actions
So what happens if you have someone with the wlock, they are going to increment the t_logsn, and they are not So there is another concern here too... The whole reason we have no wlock by the way is due to the fact that this path with the Hmm and thinking about those cases (there are 3 IIRR from my look yesterday) you are
or
In case 1, at that point we *should not* be logging anything. So I don't think I would want any bump in sn here. Overall I think whats here is fine. And I don't think it breaks the contract |