User Details
- User Since
- Jun 22 2015, 5:21 PM (461 w, 2 d)
Thu, Apr 18
I just tripped over this again when trying to use some of the 16K changes I have in my Netflix tree on a personal machine running a GENERIC kernel, so let's try this again in a different way.
Mon, Apr 15
Fri, Apr 5
Thank you for adding that option.
Wed, Apr 3
Below are the results from my testing. I'm sorry that it took so long. I had to redo the testing from the start because the new machine was not exactly identical to the old one (different BIOS rev) and was giving slightly different results.
The results are from 92Gb/s of traffic over a one hour period with 45-47K TCP connections established.
Fri, Mar 29
Tue, Mar 26
Mar 25 2024
OK, starting with an unpatched kernel and working my way through the patches. I'll report percent busy for the unpatched kernel and the various patches on our original 100G server (based around the Xeon E5-2697A v4, which tends to be a poster child for cache misses, as it runs very close to the limits of its memory bandwidth). I'll be disabling powerd and using RACK TCP's DGP pacing.
Mar 23 2024
Mar 21 2024
Guys, this is crazy. Every SDT probe does a test on a global variable. If this lands, it will cause a noticeable performance impact, especially in high packet rate workloads. Can we shelve this until / unless SDT is modified to insert nops rather than do tests on a global variable? Or put this under its own options EXTRA_IP_PROBES or something?
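To make the concern concrete, here is a minimal userland sketch of the two options: a probe site that tests a global on every pass through the hot path, versus one compiled out unless a build option is defined. It is not the kernel's SDT implementation. `EXTRA_IP_PROBES` is the option name floated above, used here as a plain preprocessor macro, and `probes_enabled`, `probe_fire()`, `IP_PROBE()`, and `process_packets()` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's SDT implementation. */
volatile bool probes_enabled = false;	/* global tested at each probe site */
uint64_t probe_fire_count = 0;		/* how many times a probe fired */

#ifdef EXTRA_IP_PROBES
/* Probes compiled in: every probe site loads and tests the global. */
void
probe_fire(int arg)
{
	(void)arg;
	probe_fire_count++;
}
#define	IP_PROBE(arg) do {			\
	if (probes_enabled)			\
		probe_fire(arg);		\
} while (0)
#else
/* Probes compiled out: the probe site adds no load and no branch. */
#define	IP_PROBE(arg) do { (void)(arg); } while (0)
#endif

/* Hypothetical per-packet hot path containing a single probe site. */
uint64_t
process_packets(int npkts)
{
	uint64_t handled = 0;

	for (int i = 0; i < npkts; i++) {
		IP_PROBE(i);
		handled++;
	}
	return (handled);
}
```

Built without `-DEXTRA_IP_PROBES`, the loop body contains no reference to `probes_enabled` at all. Built with it, each iteration pays for a load and a conditional branch even while probing is disabled, which is the per-packet cost being objected to.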
Mar 20 2024
This tests well. On my test system hw.model: Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz, I see a 50% reduction in syscall latency for the lmbench lat_syscall test (0.89us -> 0.44us) when the system is idle.
Mar 19 2024
Do you think you could also add a 'bool __read_mostly hpts_userrret_hook = true;' controlled by a sysctl to avoid calling into hpts at all from userret, for folks that want to disable this entirely?
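A rough userland sketch of the kind of gate being asked for follows. All names are hypothetical (`hpts_hook_enabled`, `hpts_hook()`, `userret_sketch()`). In the kernel the flag would presumably be marked `__read_mostly` and exposed through a sysctl; here it is just a global that the caller flips by hand.

```c
#include <stdbool.h>

/*
 * Hypothetical gate. In the kernel this would presumably be a
 * __read_mostly variable controlled by a sysctl; in this sketch it
 * is an ordinary global.
 */
bool hpts_hook_enabled = true;

/* Counts calls into the (stand-in) HPTS hook. */
unsigned long hpts_hook_calls = 0;

/* Hypothetical stand-in for the work HPTS does on return to user mode. */
void
hpts_hook(void)
{
	hpts_hook_calls++;
}

/* Sketch of the return-to-user path: one cheap test, then the hook. */
void
userret_sketch(void)
{
	if (hpts_hook_enabled)
		hpts_hook();
}
```

With the flag cleared, the return-to-user path never enters the hook, so anyone who wants the feature off pays only for the test of a rarely written variable.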
Mar 5 2024
- Added a comment explaining why the flags exist, as suggested by @glebius
- Added a blank line before ip6po_m, as requested by @bz
Mar 4 2024
Feb 21 2024
Feb 9 2024
Jan 23 2024
Jan 12 2024
Jan 11 2024
Jan 10 2024
Jan 9 2024
Removed hunk that was Netflix specific
Jan 3 2024
I pinged Nvidia/Mellanox last week, and I'm still waiting to hear back on whether they can support AccECN in their NICs.
Dec 26 2023
OK, I'm sorry, I was not aware of AccECN and its desired behavior of setting CWR on all segments.
Properly handling CWR is part of the NDIS spec... though the spec is broken, and says that "If the CWR bit in the TCP header of the large TCP packet is set, the miniport driver must set this bit in the TCP header of the first packet that it creates from the large TCP packet. The miniport driver may choose to set this bit in the TCP header of the last packet that it creates from the large TCP packet, although this is less desirable." [https://learn.microsoft.com/en-us/windows-hardware/drivers/network/offloading-the-segmentation-of-large-tcp-packets]
I'd prefer that you add a feature flag so that NICs which do properly support CWR are able to use TSO and avoid being pessimized by this.
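The two CWR behaviors discussed above can be sketched as a small function that computes the TCP flags for each segment cut from a large TSO packet. This is only an illustration under stated assumptions, not any driver's or NIC's actual logic: `tso_segment_flags()` and its `cwr_all` parameter are hypothetical, with `cwr_all` standing in for a NIC capability that replicates CWR on every segment as AccECN wants. The `TH_*` values are the standard TCP header flag bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard TCP header flag bits. */
#define	TH_FIN	0x01
#define	TH_PUSH	0x08
#define	TH_CWR	0x80

/*
 * Hypothetical helper: flags for segment 'idx' (0-based) of 'nsegs'
 * segments cut from one large packet whose header carried 'flags'.
 *
 * cwr_all == false: CWR stays on the first segment only, which is the
 *                   behavior the NDIS text requires.
 * cwr_all == true:  CWR is copied to every segment (AccECN-friendly).
 */
uint8_t
tso_segment_flags(uint8_t flags, int idx, int nsegs, bool cwr_all)
{
	uint8_t out = flags;
	bool first = (idx == 0);
	bool last = (idx == nsegs - 1);

	/* FIN and PSH belong only on the final segment. */
	if (!last)
		out &= (uint8_t)~(TH_FIN | TH_PUSH);
	/* Classic behavior strips CWR from all but the first segment. */
	if (!cwr_all && !first)
		out &= (uint8_t)~TH_CWR;
	return (out);
}
```

A stack could then keep TSO enabled for AccECN connections only on interfaces that advertise the `cwr_all`-style behavior, and fall back to software segmentation elsewhere.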
Dec 11 2023
Nov 16 2023
Nov 15 2023
- Tried to address @wulf's feedback
- Test signature as suggested by @imp
Nov 14 2023
Nov 11 2023
Nov 9 2023
It seems like this was obsoleted by:
Oct 26 2023
Wouldn't it be better to toggle the IFCAP2_BIT(IFCAP2_RXTLS4) and IFCAP2_BIT(IFCAP2_RXTLS6) than to add the check in mlx5e_tls_rx_snd_tag_alloc()?
Oct 12 2023
I'm abandoning this in favor of handling events directly (https://reviews.freebsd.org/D42158)
Oct 11 2023
Update to add a tunable to run ged events in a deferred context, as suggested by @andrew
Oct 10 2023
Oct 9 2023
Oct 5 2023
Sep 13 2023
Sep 6 2023
Thank you; this fixes the weird performance problem that we've yet to fully root cause.
Aug 22 2023
Aug 21 2023
Aug 17 2023
I think I agree with Eric and Kevin after looking at this further, and withdraw my approval
Aug 16 2023
Aug 14 2023
Don't you need a mtx_init()? Or did I just miss it?
Aug 13 2023
Again, you've hand-rolled what is essentially a mutex_trylock(). Why not just use a mutex?
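For illustration, here is a userland sketch of the equivalence being pointed out: a flag claimed with an atomic compare-and-swap behaves like a try-lock, so a real mutex does the same job with less code to get wrong. The function names are hypothetical, and pthreads stands in for the kernel's mutex(9) interface (`mtx_init()`, `mtx_trylock()`, `mtx_unlock()`), which this sketch does not use.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hand-rolled variant: a flag claimed with compare-and-swap. */
atomic_int busy_flag = 0;

bool
handrolled_try_enter(void)
{
	int expected = 0;

	/* Succeeds only if the flag was 0; sets it to 1 atomically. */
	return (atomic_compare_exchange_strong(&busy_flag, &expected, 1));
}

void
handrolled_leave(void)
{
	atomic_store(&busy_flag, 0);
}

/* Mutex variant: same semantics, using a real lock primitive. */
pthread_mutex_t section_lock = PTHREAD_MUTEX_INITIALIZER;

bool
mutex_try_enter(void)
{
	/* pthread_mutex_trylock() returns 0 on success, EBUSY if held. */
	return (pthread_mutex_trylock(&section_lock) == 0);
}

void
mutex_leave(void)
{
	pthread_mutex_unlock(&section_lock);
}
```

Both variants admit one caller and turn the next away until the first leaves. The mutex version also needs explicit initialization (static here, `mtx_init()` in the kernel), and using the kernel's own primitive presumably lets its existing lock debugging tools see the lock.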
Jul 17 2023
Jun 16 2023
This allows the ktls tests for chacha poly to pass when the ossl driver is attached.