One side-effect of r284297 is that we now call lockstat_nsecs() each
time a rwlock read lock is taken, even in the uncontended case: rw_rlock
calls __rw_rlock directly rather than indirecting through some
intermediate "fast path". This is the only lock type to do so, which is
why the problem became immediately obvious when doing some lock
benchmarking.
The cost of lockstat_nsecs() can vary quite a bit between systems; in
the case of a bhyve VM, I'm seeing a slowdown of roughly 100x. I wonder
if this might also be the cause of PR 201517, i.e. something is taking a
read lock during boot, and the resulting timecounter read somehow hangs
the system.
This change uses a flag to indicate whether any lockstat probes are
actually enabled before calling binuptime().
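
As a rough userland sketch of the idea (not the actual patch): the timestamp helper checks a flag first and only pays for the clock read when some probe is enabled. All names below (sketch_probes_enabled, sketch_read_clock, sketch_lockstat_nsecs, sketch_clock_reads) are hypothetical, and clock_gettime() merely stands in for the kernel's timecounter read.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical flag: assumed to be set when the first probe is enabled
 * and cleared when the last one is disabled.
 */
static volatile bool sketch_probes_enabled = false;

/* Counts how often the expensive clock read actually happens. */
static unsigned long sketch_clock_reads = 0;

/* Stand-in for the expensive timecounter read. */
static uint64_t
sketch_read_clock(void)
{
	struct timespec ts;

	sketch_clock_reads++;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

/*
 * Sketch of a flag-gated timestamp helper: return 0 without touching
 * the clock when no probes are enabled.
 */
static uint64_t
sketch_lockstat_nsecs(void)
{
	if (!sketch_probes_enabled)
		return (0);
	return (sketch_read_clock());
}
```

With the flag clear, the uncontended read-lock path would cost one predictable branch instead of a timecounter read.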