Implement userspace gettimeofday(2) with HPET timecounter.


Implement userspace gettimeofday(2) with HPET timecounter.

Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC. For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge. Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.

Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET. For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.

Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location. __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.

Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access. But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge

Tested by: Howard Su <howard0su@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D7473