Page MenuHomeFreeBSD

Improve blackhole detection
ClosedPublic

Authored by tuexen on Apr 6 2020, 10:11 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 18, 10:00 PM
Unknown Object (File)
Tue, Jan 14, 1:25 PM
Unknown Object (File)
Tue, Jan 14, 5:56 AM
Unknown Object (File)
Dec 11 2024, 8:10 PM
Unknown Object (File)
Dec 4 2024, 11:44 AM
Unknown Object (File)
Nov 27 2024, 5:47 AM
Unknown Object (File)
Nov 25 2024, 12:46 PM
Unknown Object (File)
Nov 24 2024, 5:33 AM
Subscribers

Details

Summary

The principle of the current blackhole detection is to reduce the MSS in two steps and try each candidate two times. However, if for example the two candidates are identical (which is the case for TCP/IPv6), the candidate was tested four times. This patch ensures that each possible candidate is tested two times. This reduces the time window of misclassifying a temporary outage as an MTU issue. While there, also make sure the correct statistic counter is incremented.

t_blackhole_enter is currently only set to 2 and one could argue that this doesn't need to be stored. In a followup commit t_blackhole_enter will be computed and possibly will have different values, too.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 30434

Event Timeline

The logic looks good to me, but I'd refrain from storing the enter/exit values in tcpcb - I don't really see why this is necessary or what benefit it gains. Since this is a non-timing critical path, dynamic recalculation of the enter/exit values should be ok, while keeping tcpcb smaller for less cache evictions...

sys/netinet/tcp_var.h
172

Why add these to tcpcb? Couldn't these two values be caluclated dynamically whenever necessary instead. What is the benefit of having them in tcpcb?

sys/netinet/tcp_var.h
172

Right now t_blackhole_enter is constant. However, in a future commit, it will be based on the value of a sysctl variable, which specifies a minimum time before reducing the MSS. Storing this value then makes sure nothing bad happens, if someone changes the value of the sysctl variable while blackhole detection is ongoing.

The number of steps depends on the MSS, which is changed during the probing. That is why I'm storing t_blackhole_exit

I put the two new entries near t_pmtud_saved_maxseg, which is accessed in the same code path. I can move them down, if you want.

Understood. Bundling the new variables together makes sense - and a refreshed analysis of the cache line alignment may reorder them at a later time.

Thanks for providing the extra context on the transport call today. Overall, this change looks good. I left a few nits in in-line comments.

sys/netinet/tcp_timer.c
726

Isn't it true that you only need to calculate this on tp->t_rxtshift == 2? If you make that change, you would also need to guard the entry to the blackhole detection code with (tp-t_rxtshift > 1).

727

In general, I would prefer a #define over the "magic number" 2. OTOH, I believe you will be changing this soon, so you can decide whether it is helpful to make this a defined value.

826–827

This comment (beginning with "The limit '6'...") seems out of date now.

sys/netinet/tcp_var.h
180

This change will probably make this cache line marker become out of date. OTOH, I'm not sure we are good about maintaining the markers anyway.

This revision is now accepted and ready to land.Apr 9 2020, 3:21 PM
sys/netinet/tcp_timer.c
726

When doing the first retransmission, I need to calculate when to enter blackhole detection. This could be on the first retransmission or later, but not earlier. So I guess, this is the correct condition.

727

Let me add a comment describing the number. That is clearer as using a descriptive name and can be adopted when computing it dynamically.

826–827

Good catch. I often change the code and forget to update the comments. Will remove it and add a comment when we set t_blackhole_exit.

sys/netinet/tcp_var.h
180

They got introduced by rrs in r333041. Double checking results in Cache line 4 holding 17 4 byte values, whereas Cache line [123] hold 16 4 byte values. So I guess, they were not correct in the beginning. Since then, there have been multiple changes adding fields and one change removing fields and the comments were not updated. So they are stale. We can fix them, but I would suggest to do that separately or even remove the comments, since I guess they would become stale soon.

This revision now requires review to proceed.Apr 12 2020, 3:35 PM
This revision is now accepted and ready to land.Apr 14 2020, 3:14 PM