Correctly measure system load averages > 1024
Closed, Public

Authored by asomers on May 5 2022, 9:39 PM.

Details

Reviewers

Commits

rGdee01da58a27: Correctly measure system load averages > 1024
rG1d2421ad8b6d: Correctly measure system load averages > 1024

Summary

The old fixed-point arithmetic used for calculating load averages had an
overflow at 1024. So on systems with extremely high load, the observed
load average would actually fall back to 0 and shoot up again, creating
a kind of sawtooth graph.

Fix this by using 64-bit math internally, while still reporting the load
average to userspace as a 32-bit number.
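The overflow described above can be reproduced with a minimal sketch. It is not the committed diff. It assumes a fixed-point format with 11 fractional bits (the FSHIFT and FSCALE values below are assumptions, as is the decay constant used in the usage note), and the function names ldavg_step32 and ldavg_step64 are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 11 fractional bits, so a load of 1.0 is stored as 2048. */
#define FSHIFT	11
#define FSCALE	(1u << FSHIFT)

/*
 * One exponential-decay step with every intermediate kept in 32 bits.
 * decay * ldavg is roughly load * 2^22, so it wraps once load reaches 2^10.
 */
static uint32_t
ldavg_step32(uint32_t ldavg, uint32_t decay, uint32_t nrun)
{
	assert(decay <= FSCALE);
	uint32_t sum = decay * ldavg + nrun * FSCALE * (FSCALE - decay);
	return (sum >> FSHIFT);
}

/*
 * The same step with 64-bit intermediates. Only the final result is
 * narrowed back to the 32-bit value that would be reported.
 */
static uint32_t
ldavg_step64(uint32_t ldavg, uint32_t decay, uint32_t nrun)
{
	assert(decay <= FSCALE);
	uint64_t sum = (uint64_t)decay * ldavg +
	    (uint64_t)nrun * FSCALE * (FSCALE - decay);
	return ((uint32_t)(sum >> FSHIFT));
}
```

With a steady 2000 runnable threads and an illustrative decay constant of 1884 (about 0.92 * FSCALE), ldavg_step64(2000 * FSCALE, 1884, 2000) returns 2000 * FSCALE unchanged, while ldavg_step32 with the same arguments wraps and returns a smaller value. That wrap is what produces the sawtooth.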

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

asomers requested review of this revision. May 5 2022, 9:39 PM

asomers created this revision.

Harbormaster completed remote builds in B45482: Diff 105716. May 5 2022, 9:39 PM

This looks good. Normally fixpt_t is a 32-bit int, which suggests that there may be other places we need to widen.
And updating the comment with the new limit would be good (~2 million if I'm doing the math right)...
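The "~2 million" figure can be checked with a short sketch, assuming the reported value is an unsigned 32-bit fixed-point number with 11 fractional bits (FSHIFT = 11 is an assumption here, and max_reportable_load is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: 11 fractional bits in the 32-bit value reported to userspace. */
#define FSHIFT	11

/* Compile-time check of the arithmetic: (2^32 - 1) >> 11 == 2^21 - 1. */
static_assert((UINT32_MAX >> FSHIFT) == 2097151u,
    "largest whole load that fits is 2^21 - 1");

/* Largest whole-number load average a 32-bit fixed-point value can hold. */
static uint32_t
max_reportable_load(void)
{
	return (UINT32_MAX >> FSHIFT);
}
```

Under those assumptions, once the intermediate math is 64-bit the remaining ceiling comes from the 32-bit reported value itself, at just over two million.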

This revision is now accepted and ready to land. May 6 2022, 2:40 AM

I agree with you that two million is the new limit. As for other places that need changing, this is what I found:

linprocfs: no overflow
linux_sysinfo: overflows at a load average of 32? I don't know how to test it though. I'll ask trasz; he knows a lot about the linuxulator.
snake_saver: no overflow
schedcpu: overflows at 1 million, I think. It may be sooner, but I'm having trouble figuring out the maximum value of ts_estcpu. Either way, that should be plenty.
tty_info: overflows at 10,000. I guess I'll go ahead and fix this one.
sysctl_vm_loadavg: no overflow