Details

share/man/man9/hz.9
57–58	While it is true that historically we used the same physical timer to drive statclock and profclock, it might be clearer to the reader to actually describe them as three separate logical clocks and then maybe add a note after the description of statclock and profclock to say that historically these two clocks were driven by the same physical timer. I think it will also fix Jess's comment about the missing or. Phab's UI won't let me select the body of text to do an suggestion since there's already a comment, so I'll just do a suggestion by hand: A second clock running at .Va stathz gathers statistics on the system and its processes. It computes values for ... .Xr top 1 . .Pp Finally, a third clock runs at .Vt profhz to sample user program counter values for profiling purposes. This profiling clock has been replaced by the more functional .Xr hwpmc 4 and may be removed in a future version of .Fx . Maybe it's not even worth mentioning the historical note that statclock and profclock use the same underlying timer, or that profclock doesn't run except when it is enabled by at least one process requesting profiling. If we did, maybe we would add it in a separate '.Sh IMPLEMENTATION NOTES' section. Also, I snuck in a fuller deprecation warning for profclock. :)

Update, per jhb@ review, adopting his wording...

Unsure about the implementation details section, so please comment.

Harbormaster completed remote builds in B39979: Diff 91062.Jun 18 2021, 6:16 PM

add in kib's observations.

Harbormaster completed remote builds in B39980: Diff 91063.Jun 18 2021, 7:17 PM

Added kib based on his observations on my original commit.

imp marked 2 inline comments as done.Jun 18 2021, 7:18 PM

ah, drop callout xr

Harbormaster completed remote builds in B39981: Diff 91064.Jun 18 2021, 7:19 PM

emaste added a subscriber: emaste.Jun 18 2021, 7:19 PM

emaste added inline comments.

share/man/man9/hz.9
52–54	Discussed on IRC and @kib pointed out in email - this is not correct. I think "up to .Va hz times per second" would be accurate, but I'm not sure if there's more context we should offer.

imp marked an inline comment as done.Jun 18 2021, 7:25 PM

imp added inline comments.

share/man/man9/hz.9
52–54	hardclock needs to be updated, and I'll put the details there. Forgot to update this based on IRC earlier, and kib's comments made me revisit. There's a lot of details that we try to do as well for when we call it when it isn't that many times a second, but that detail is best held elsewhere. thanks.

imp marked an inline comment as done.Jun 18 2021, 8:16 PM

imp added inline comments.

share/man/man9/hz.9
52–54	In the hardclock.9 page, I'll say "the sum of cnt over a second will tend to be about hz" or something similar.... It looks like we make our best guesses on ticks, but nothing really cares how exact we are on that for time keeping, within certain limits dictated by overflow and such.

kib added inline comments.Jun 18 2021, 8:16 PM

share/man/man9/hz.9
52	I think something along these lines would be more accurate: This routine is called with the varying frequency, enough to handle events fired from it, in time. Highest frequency is hz, and current implementation ensures that the function is called at least once in a second if CPU is idle.
69	s/clock/mechanism/

emaste added inline comments.Jun 18 2021, 8:41 PM

share/man/man9/hz.9
52–54	I think that's correct, but probably not quite the right way to express it, essentially hardclock is called on an upcoming tick 1/hz boundry when necessary. But anyway this is outside of the scope of hz.9.

Looks good to me overall. kib@'s remaining suggestions should be resolved, but I'm otherwise happy. A bit of a buzzsaw process-wise to get these pages in, but nice to have this more clearly documented. Thanks!

share/man/man9/hz.9
69	That suggestion to use "mechanism" is good with me.
126

This revision is now accepted and ready to land.Jun 18 2021, 9:22 PM

skate the middle ground between all the suggested frequency of hardclock
wordings. I believe this tersely, but accurately, reflects the system behavior.

This revision now requires review to proceed.Jun 19 2021, 12:46 AM

Harbormaster completed remote builds in B39985: Diff 91078.Jun 19 2021, 12:46 AM

imp marked 5 inline comments as done.Jun 19 2021, 12:56 AM

kib accepted this revision.Jun 19 2021, 1:03 AM

This revision is now accepted and ready to land.Jun 19 2021, 1:03 AM

tempt fate and tweak a word or two here and there and
add a senence saying what these clocks do.

This revision now requires review to proceed.Jun 19 2021, 1:09 AM

Harbormaster completed remote builds in B39986: Diff 91079.Jun 19 2021, 1:09 AM

emaste added inline comments.Jun 20 2021, 3:50 PM

share/man/man9/hz.9
40	As an aside I brought up removing -pg profiling support last year (https://lists.freebsd.org/pipermail/freebsd-current/2020-January/075105.html)
55	Interesting, I wonder if we can eliminate the at least once per second?

imp added inline comments.Jun 20 2021, 4:03 PM

share/man/man9/hz.9
40	Yea. I saw that and wasn't sure what to think, so said nothing. Is that where we turned off PROFILE by default? Or was that prior?
55	No. I don't think so. timecounters generally wrap in the few second range these days, and we have to poll them faster than that.. time_second needs to be bumped once a second. There may be other reasons as well. hardclock clocks polling devices and queueing the end of epoch processing (If I'm reading the code right, I just took a quick glance). There may be other reasons I'm unaware of that kib@ or mav@ could expound upon.

emaste added inline comments.Jun 20 2021, 4:18 PM

share/man/man9/hz.9
40	Is that where we turned off PROFILE by default? Or was that prior? We haven't turned PROFILE off by default yet.

kib added inline comments.Jun 20 2021, 5:45 PM

share/man/man9/hz.9
55	Yes, the main reason is the timecounters. They are set up to return uint32_t value which we allow to wrap in approximately a second. In particular, this is why we have TSC-low timecounter which shifts TSC value right. And this means that we should call tc_windup() once in half-second, actually, not once in a second. We indeed do, see getnextcpuevent(). Also this frequency determines how fast we can reclaim resources when epochs end, which imp noted above. On the other hand, this 'once in half-second' really need to occur once in half-second globally, not on each CPU. Right now we do not utilize this possible optimization.

The other niglet about once a second is so that the kern.cp_time is updated for idle CPUs at least once a second so that top(1) works. Right now when a CPU is idle we don't schedule statclock on it as an optimization and instead do a fixup later when the CPU resumes assuming all the missed ticks were idle ticks. Perhaps we could do this fixup globally in the same way as is done for tc_windup as Konstantin mentions, but I don't think that has been implemented.

Update based on the observations here. This tidbits I think are instructive.

Harbormaster completed remote builds in B40020: Diff 91174.Jun 21 2021, 5:01 PM

jhb added inline comments.Jun 21 2021, 6:39 PM