Page MenuHomeFreeBSD

Modify lock_delay() to increase the delay time after spinning
ClosedPublic

Authored by trasz on Nov 23 2020, 1:25 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Dec 14, 6:15 PM
Unknown Object (File)
Nov 16 2024, 6:33 AM
Unknown Object (File)
Nov 14 2024, 4:48 PM
Unknown Object (File)
Oct 17 2024, 11:25 PM
Unknown Object (File)
Oct 14 2024, 2:58 PM
Unknown Object (File)
Oct 4 2024, 11:01 AM
Unknown Object (File)
Sep 30 2024, 1:18 PM
Unknown Object (File)
Sep 28 2024, 1:52 PM

Details

Summary

Modify lock_delay() to increase the delay time after spinning,
not before. Previously we would spin at least twice instead of once.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

trasz requested review of this revision.Nov 23 2020, 1:25 PM

It is quite unclear if this is a good idea as is, key point being that these locks are unfair and fiddling with this can further negatively affect the achieved fairness. In particular the current "wait 2 spins" setup make fresh arrivals less likely to grab the lock from under threads which already wait. Replacing this with fair locks in a manner which does not negatively impact single-threaded performance is complicated. Any changes here should come with several benchmarks, showcasing longest time to grab the lock and perhaps some fiddling with min/max for backoff.

In D27331#610694, @mjg wrote:

Any changes here should come with several benchmarks, showcasing longest time to grab the lock and perhaps some fiddling with min/max for backoff.

I don't have specific benchmark details, but I can definitively say we found this change to have a significant impact on throughput and latency in our performance testing (when compared to the BSD10 code which calls cpu_spinwait() directly).
--ap

Sure, it is a tradeoff, helps some workloads and hurts others. Can you share the hw spec? Most notably cores/threads, cpu model and socket count?

I'll add some extra measures here and give you a different patch to test.

In D27331#610723, @mjg wrote:

Can you share the hw spec? Most notably cores/threads, cpu model and socket count?

On the specific test that saw an improvement, we ran a 2-socket, Intel Sandy Bridge, 8-core/socket, 2.1GHz.

I get what you're saying. If you have something different to test out, we can try it.
--ap

So I got myself an Ivy Bridge for testing (among some other boxes). I failed to get benefit from the patch at hand, but I did not worsen the stuff I was worried about.

However, there is still significant unfairness here and making it less of a problem will negatively affect throughput. The real thing to do is to drop backoff altogether, but that's quite a lot of work.

tl;d i think the patch can go in for the time being

This revision is now accepted and ready to land.Nov 25 2020, 9:39 PM