general LOCK_PROFILING cleanup
- only collect timestamps when a lock is contested - this reduces the overhead of collecting profiles from 20x to 5x
- remove unused function from subr_lock.c
- generalize cnt_hold and cnt_lock statistics to be kept for all locks
- NOTE: rwlock profiling generates invalid statistics (and most likely always has) someone familiar with that should review