Paths

Table of Contentst

Fix two potential overflows in the lock delay code
ClosedPublic
Actions

Authored by jtl on Mar 31 2023, 5:14 PM.

Details

Reviewers

gallatin
mjg
jhb

Commits

rG8a16fb47303e: locks: fix two potential overflows in the lock delay code

Summary

With large numbers of CPUs, the calculation of the maximum lock delay could overflow, leading to an unexpectedly low delay. In fact, the maximum delay would calculate to 0 on systems with between 128 and 255 cores (inclusive). Also, when calculating the new delay in lock_delay(), the delay would overflow if the old delay was >= 32,768.

This commit fixes these two overflow. It also updates the maximum delay from 32678 to 32768.

MFC: 2 weeks

Test Plan

After the change, a 128-core machine now shows a maximum delay of 32768 (as opposed to the 0 it showed before). And, the results show a noticeable performance difference for a maximum delay of 32768 before and after the change to lock_delay().

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

jtl created this revision.Mar 31 2023, 5:14 PM

Herald added a subscriber: imp. · View Herald TranscriptMar 31 2023, 5:14 PM

jtl requested review of this revision.Mar 31 2023, 5:14 PM

This is probably a fallout from moving from int to short.

Commit message should state:
Fixes: 6b8dd26e7c5f2caf ("locks: convert delay times to u_short")

sys/kern/subr_lock.c

130

instead of branching on it you should expand the field to an int:

diff --git a/sys/sys/lock.h b/sys/sys/lock.h
index 2db38f9df89a..e586a8ff14c8 100644
--- a/sys/sys/lock.h
+++ b/sys/sys/lock.h
@@ -182,7 +182,7 @@ extern u_short locks_delay_loops;
 
 struct lock_delay_arg {
        struct lock_delay_config *config;
-       u_short delay;
+       u_int delay;
        u_int spin_cnt;
 };

note the key was to keep global limits as short so that they use less space in the cache.

local state can be int'ed no problem

afedorov added a subscriber: afedorov.Mar 31 2023, 5:23 PM

btw all this machinery really should go away in favor of queued locks, but there is quite a bit of work for that to happen

gallatin added inline comments.Mar 31 2023, 5:40 PM

sys/kern/subr_lock.c
130	Are there any concerns about a compat issue due the size of delay changing in lock_delay_arg if this is MFCed? The struct will stay the same size, as there is a 2 byte alignment hole, but those 2 bytes wouldn't be initialized by kernel modules compiled against older kernels. How about a u_int local instead?

gallatin added a reviewer: jhb.Mar 31 2023, 5:40 PM

mjg added inline comments.Mar 31 2023, 5:59 PM

sys/kern/subr_lock.c
130	there are no concerns here, cmon man :)

gallatin accepted this revision.Mar 31 2023, 6:13 PM

This revision is now accepted and ready to land.Mar 31 2023, 6:13 PM

jtl added inline comments.Mar 31 2023, 6:15 PM

sys/kern/subr_lock.c
130	The struct will stay the same size, as there is a 2 byte alignment hole, but those 2 bytes wouldn't be initialized by kernel modules compiled against older kernels. In the in-tree code, it looks like this struct is only used in things which are part of the base kernel. there are no concerns here, cmon man :) If you really would prefer that solution, it is fine with me.

mjg added inline comments.Mar 31 2023, 6:22 PM

sys/kern/subr_lock.c
130	the branch is avoided at no cost, so yes

Updated the diff to upgrade the delay field from u_short to u_int in struct lock_delay_arg. Due to alignment requirements, this probably won't actually change the size or in-memory layout of the structure at all.

This revision now requires review to proceed.Mar 31 2023, 6:23 PM

gallatin accepted this revision.Mar 31 2023, 7:06 PM

This revision is now accepted and ready to land.Mar 31 2023, 7:06 PM

mjg accepted this revision.Mar 31 2023, 7:42 PM

jhb added inline comments.Apr 1 2023, 1:12 AM

sys/kern/subr_lock.c
152	Maybe use SHRT_MAX instead of the magic number? Could also simplify this further perhaps as `lc->max = min(cpu_factor * 256, SHRT_MAX);`?

jtl added inline comments.Apr 7 2023, 7:59 PM

sys/kern/subr_lock.c
152	Maybe use SHRT_MAX instead of the magic number? I agree with avoiding magic numbers. Maybe (1 << 15) would be more obvious than ~(SHRT_MAX) or (SHRT_MAX+1)? Also, (1 << 15) fits in nicely with the exponential backoff algorithm. Could also simplify this further perhaps as `lc->max = min(cpu_factor * 256, SHRT_MAX);`? Yes, thanks!

jhb added inline comments.Apr 10 2023, 4:40 PM

sys/kern/subr_lock.c
152	I would maybe just stick with `SHRT_MAX` even if it is one less than the real max as it will read more intuitively to others and will work just as well in practice.

jtl added inline comments.Apr 10 2023, 11:25 PM

sys/kern/subr_lock.c
152	I would maybe just stick with SHRT_MAX even if it is one less than the real max as it will read more intuitively to others and will work just as well in practice. I'm not sure I agree with that, or at least within the context of this change. Using SHRT_MAX here is really the equivalent of using 16384, so I'm not sure the result is actually intuitive. OTOH, the typo in the current code also equates to 16384, so using SHRT_MAX would actually avoid the functional change I'm introducing by fixing the typo...

jhb added inline comments.Apr 14 2023, 7:07 PM

sys/kern/subr_lock.c
152	Err, SHRT_MAX is 32767. > printf "#include <limits.h>\nSHRT_MAX\n" \| cc -E - \| tail -1 0x7fff > printf "%d\n" 0x7fff 32767

jtl added inline comments.Apr 17 2023, 1:56 PM

sys/kern/subr_lock.c
152	Doh. You're right. I had it in my mind that we only increased if (delay << 1) was <= the max, which would mean that anything less than 32768 would equate to a functional max of 16384. But, that's not what the code actually does, so you are (of course) right.