
hyperv: Implement userspace gettimeofday(2) with Hyper-V reference TSC
ClosedPublic

Authored by sepherosa_gmail.com on Dec 14 2016, 8:49 AM.

Details

Summary

This improves gettimeofday performance roughly 6x (about 27 ns versus about 162 ns per call), as measured by tools/tools/syscall_timing.

Test Plan

perf03-sephe:syscall_timing# /syscall_timing gettimeofday
Clock resolution: 0.000000101
test loop time iterations periteration
gettimeofday 0 1.014996400 36657501 0.000000027
gettimeofday 1 1.014232700 36467423 0.000000027
gettimeofday 2 1.037720000 37498712 0.000000027
gettimeofday 3 1.040983300 37674919 0.000000027
gettimeofday 4 1.010988300 36572763 0.000000027
gettimeofday 5 1.006983400 36404722 0.000000027
gettimeofday 6 1.006974100 36229166 0.000000027
gettimeofday 7 1.005982500 36437948 0.000000027
gettimeofday 8 1.011984500 36638109 0.000000027
gettimeofday 9 1.004983900 36347236 0.000000027
perf03-sephe:syscall_timing# sysctl kern.timecounter.fast_gettime=0
kern.timecounter.fast_gettime: 1 -> 0
perf03-sephe:syscall_timing# /syscall_timing gettimeofday
Clock resolution: 0.000000101
test loop time iterations periteration
gettimeofday 0 1.000997300 6175800 0.000000162
gettimeofday 1 1.000986700 6181355 0.000000161
gettimeofday 2 1.000981500 6174445 0.000000162
gettimeofday 3 1.000985100 6086535 0.000000164
gettimeofday 4 1.000982500 6137102 0.000000163
gettimeofday 5 1.000984700 6173990 0.000000162
gettimeofday 6 1.033986300 6284882 0.000000164
gettimeofday 7 1.050068700 4551081 0.000000230
gettimeofday 8 1.011150300 6186478 0.000000163
gettimeofday 9 1.030976700 6369569 0.000000161

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sepherosa_gmail.com retitled this revision to hyperv: Implement userspace gettimeofday(2) with Hyper-V reference TSC.
sepherosa_gmail.com updated this object.
sepherosa_gmail.com edited the test plan for this revision.
lib/libc/x86/sys/__vdso_gettc.c
163 ↗(On Diff #22905)

If the device cannot be opened for some reason, e.g. the process is executing with a devfs instance that does not provide the device, say in a jail, and the kernel reports the hyperv algorithm for the timecounter, then each gettimeofday(2) will be accompanied by a failing open(2). This is the reason why I check for map == NULL _and_ MAP_FAILED for HPET.
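The caching idea in this comment can be sketched as follows. This is a minimal illustration, not the code under review: the names get_tsc_page, tsc_page and open_attempts are hypothetical, the counter exists only to make the behavior observable, and the sketch ignores thread safety. NULL means "never tried" and MAP_FAILED means "tried and failed", so a failure is paid for once rather than on every call.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical cache: NULL = never tried, MAP_FAILED = tried and failed. */
static void *tsc_page = NULL;
static int open_attempts = 0;	/* instrumentation for this example only */

/*
 * Map one read-only page from the device at `path` on first use.
 * Returns the mapping, or NULL if the device is unavailable.
 * Not thread-safe; a real implementation would need atomics.
 */
static void *
get_tsc_page(const char *path)
{
	void *p;
	int fd;

	assert(path != NULL);
	if (tsc_page == MAP_FAILED)
		return (NULL);		/* failure already recorded, no syscall */
	if (tsc_page != NULL)
		return (tsc_page);	/* already mapped */

	open_attempts++;
	fd = open(path, O_RDONLY);
	if (fd < 0) {
		tsc_page = MAP_FAILED;	/* remember the failure */
		return (NULL);
	}
	p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
	    MAP_SHARED, fd, 0);
	close(fd);
	tsc_page = p;			/* MAP_FAILED on error, else the mapping */
	return (p == MAP_FAILED ? NULL : p);
}
```

With a path that does not exist, the first call performs one open(2) and every later call returns NULL without entering the kernel, and the caller can then fall back to the syscall path.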

176 ↗(On Diff #22905)

Please follow style(9) and put declarations at the beginning of the function.

180 ↗(On Diff #22905)

I know about it, but it seems to be fine to use lfence on AMD; I never got reports of weird timecounter behavior on AMD. I suspect that mfence usage on AMD is somewhat cargo-cult.

sys/dev/hyperv/vmbus/amd64/hyperv_machdep.c
137 ↗(On Diff #22905)

You do not need to fill these fields when the algorithm is hv.

lib/libc/x86/sys/__vdso_gettc.c
159 ↗(On Diff #22937)

e.g. in a misconfigured jail.

186 ↗(On Diff #22937)

You moved one variable, but left two others.

sys/dev/hyperv/vmbus/amd64/hyperv_machdep.c
76 ↗(On Diff #22937)

I do not understand this. The libc part is only compiled on amd64, so why do you provide the compat32 version? Either libc for i386 should also implement the HV timecounter (preferred), or the compat32 part should become NULL.

137 ↗(On Diff #22937)

You should just bzero the whole part of the timehands structure starting with th_x86_shift. I am sorry that I was not quite clear in my previous note.
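A minimal sketch of zeroing the tail of a structure from a given member onward. The struct below is a hypothetical stand-in (only th_x86_shift is named in this review; the other members and the helper name clear_md_fields are assumptions for illustration), and memset is used in place of bzero for portability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; the real timehands structure may differ. */
struct example_timehands {
	uint32_t th_algo;
	uint64_t th_scale;
	uint32_t th_x86_shift;		/* first machine-dependent field */
	uint32_t th_x86_hpet_idx;
	uint32_t th_res[6];
};

/* Zero everything from th_x86_shift through the end of the structure. */
static void
clear_md_fields(struct example_timehands *th)
{
	const size_t off = offsetof(struct example_timehands, th_x86_shift);

	assert(th != NULL);
	memset((char *)th + off, 0, sizeof(*th) - off);
}
```

Using offsetof keeps the call correct if fields are later added after th_x86_shift, which is the advantage over assigning each field individually.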

lib/libc/x86/sys/__vdso_gettc.c
186 ↗(On Diff #22937)

I actually prefer to have the local variables close to their usage. IIRC, there was some discussion about the style of grouping all local variables at the beginning of the function, but it went nowhere.

Sure, I will move them to the beginning of the function.

sys/dev/hyperv/vmbus/amd64/hyperv_machdep.c
76 ↗(On Diff #22937)

OK, I will remove it.

137 ↗(On Diff #22937)

Heh, I used tsc.c as an example, which fills the "shift" and sets the "hpet" to ~0; that is why the original code did the same (since Hyper-V TSC does not use shift, I set it to 0 and hpet to ~0).

kib edited edge metadata.

Other than compat32 bits, this looks good.

This revision is now accepted and ready to land. Dec 16 2016, 9:49 AM
This revision was automatically updated to reflect the committed changes.

So you did not add support for 32-bit libc. Why?

In D8789#183203, @kib wrote:

So you did not add support for 32-bit libc. Why?

It currently requires mulq, which is not available on 32-bit systems.

The code doesn't require 128-bit multiplication support; it is useful as an optimization but not critical. It is possible to express the same calculation using big-number multiplication (or, if you prefer, a term from Russian elementary school, 'multiplication in column').

If the 64-bit values are X=a*g+b and Y=c*g+d, where g is 2^32 and a,b and c,d are the 32-bit high and low words of the corresponding 64-bit values, then X*Y = a*c*g*g + (a*d + b*c)*g + b*d. You need to take care of the carry bit. It is slightly more cumbersome than mulq, but not too complicated.
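The decomposition above can be sketched in portable C. This is an illustration only, assuming the caller needs just the upper 64 bits of the 128-bit product (as a scale-and-shift conversion would); the names mul64x64_hi and mul64x64_hi_selfcheck are hypothetical and this is not the code from the revision.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the high 64 bits of the 128-bit product x * y, using only
 * 32x32 -> 64 bit multiplications (schoolbook multiplication).
 */
static uint64_t
mul64x64_hi(uint64_t x, uint64_t y)
{
	uint64_t a = x >> 32, b = x & 0xffffffffu;	/* X = a*g + b */
	uint64_t c = y >> 32, d = y & 0xffffffffu;	/* Y = c*g + d */
	uint64_t ac = a * c;
	uint64_t ad = a * d;
	uint64_t bc = b * c;
	uint64_t bd = b * d;
	/*
	 * Carry out of the low 64 bits: add the high word of b*d to the
	 * low words of the two cross terms. The sum is at most
	 * 3 * (2^32 - 1), so it cannot overflow 64 bits.
	 */
	uint64_t mid = (bd >> 32) + (ad & 0xffffffffu) + (bc & 0xffffffffu);

	return (ac + (ad >> 32) + (bc >> 32) + (mid >> 32));
}

/* Small self-check with products that are easy to verify by hand. */
static void
mul64x64_hi_selfcheck(void)
{
	/* 2^32 * 2^32 = 2^64, so the high half is 1. */
	assert(mul64x64_hi(1ULL << 32, 1ULL << 32) == 1);
	/* (2^64 - 1)^2 = 2^128 - 2^65 + 1, so the high half is 2^64 - 2. */
	assert(mul64x64_hi(UINT64_MAX, UINT64_MAX) == 0xfffffffffffffffeULL);
}
```

Splitting the cross terms into high and low words before summing is what handles the carry: adding a*d and b*c directly as 64-bit values could overflow and silently lose a bit.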

You could copy/paste the code from contrib/libcompiler_rt/lib/builtins/multi3.c, the __mulddi3() function.

This is ..., hmm, ok, I will check :)