
Correctly measure system load averages > 1024
Closed, Public

Authored by asomers on Thu, May 5, 9:39 PM.

Details

Summary

The old fixed-point arithmetic used for calculating load averages had an
overflow at 1024. So on systems with extremely high load, the observed
load average would actually fall back to 0 and shoot up again, creating
a kind of sawtooth graph.

Fix this by using 64-bit math internally, while still reporting the load
average to userspace as a 32-bit number.
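
The overflow can be reproduced outside the kernel. The following is a minimal sketch, not the committed change: it assumes FSHIFT is 11 (as in sys/param.h) and that one averaging step has roughly the shape of the exponential-decay update in loadav(). The names step32 and step64 and the decay constant 1884 (about exp(-1/12) scaled by FSCALE) are illustrative assumptions.

```c
#include <stdint.h>

#define FSHIFT 11                 /* bits to the right of the binary point */
#define FSCALE (1 << FSHIFT)      /* fixed-point representation of 1.0 */

/*
 * One decay step with 32-bit intermediates (hypothetical helper).
 * cexp * ldavg is about load * 2^22, so it wraps once load reaches 1024.
 */
static uint32_t
step32(uint32_t ldavg, uint32_t cexp, uint32_t nrun)
{
	return ((cexp * ldavg + nrun * FSCALE * (FSCALE - cexp)) >> FSHIFT);
}

/*
 * The same step with 64-bit intermediates (hypothetical helper).
 * Only the final, already-shifted result is narrowed to 32 bits.
 */
static uint32_t
step64(uint32_t ldavg, uint32_t cexp, uint32_t nrun)
{
	uint64_t sum = (uint64_t)cexp * ldavg +
	    (uint64_t)nrun * FSCALE * (FSCALE - cexp);

	return ((uint32_t)(sum >> FSHIFT));
}
```

With a steady load of 2000 (ldavg = 2000 << FSHIFT, nrun = 2000, cexp = 1884), step64 returns the input unchanged, while step32 returns 1998848, which reads back as a load average of 976. At a load of 100 the two agree.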

Sponsored by: Axcient

Test Plan

Created 4096 busy processes and used top to watch the load average rise to above 4000 for all three time buckets.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

asomers created this revision.

This looks good. Normally fixpt_t is a 32-bit int, which suggests that there may be other places we need to widen.
And updating the comment with the new limit would be good (~2 million, if I'm doing the math right).
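
For reference, the two limits can be worked out as follows. This is a sketch that assumes FSHIFT = 11, a 32-bit stored value, and a decay constant c that is at most FSCALE; the symbol L for the load average is introduced here only for illustration.

```latex
% Old limit: the 32-bit product c * ldavg must stay below 2^{32}.
% The stored value is ldavg = L * 2^{11}, and c <= 2^{11}.
\[
  c \cdot \mathit{ldavg} \le 2^{11} \cdot L \cdot 2^{11} < 2^{32}
  \quad\Longrightarrow\quad L < 2^{10} = 1024
\]
% New limit: with 64-bit intermediates, only the stored 32-bit value constrains L.
\[
  L \cdot 2^{11} < 2^{32}
  \quad\Longrightarrow\quad L < 2^{21} = 2\,097\,152 \approx 2 \text{ million}
\]
```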

This revision is now accepted and ready to land.
Fri, May 6, 2:40 AM

I agree with you that two million is the new limit. As for other places that need changing, this is what I found:

  • linprocfs: no overflow
  • linux_sysinfo: overflows at a load average of 32? I don't know how to test it though. I'll ask trasz; he knows a lot about the linuxulator.
  • snake_saver: no overflow
  • schedcpu: overflows at 1 million, I think. It may overflow sooner, but I'm having trouble figuring out the maximum value of ts_estcpu. Either way, that should be plenty.
  • tty_info: overflows at 10,000. I guess I'll go ahead and fix this one.
  • sysctl_vm_loadavg: no overflow