Differential D24902

Time to First byte in and out
ClosedPublic
Actions

Authored by rrs on May 18 2020, 7:59 PM.

Details

Reviewers

tuexen

Group Reviewers

transport

Commits

rS361926: An important statistic in determining if a server process (or client) is being…

Summary

An important statistic in determining if a server process (or client) is being delayed
is to know the time to first byte in and time to first byte out. Currently we
have no way to know these all we have is t_starttime. That (t_starttime) tells us
what time the 3 way handshake completed. We don't know when the first
request came in or how quickly we responded. Nor from a client perspective
do we know how long from when we sent out the first byte before the
server responded.

This small change adds the ability to track the TTFB's. This will show up in
BB logging which then can be pulled for later analysis.

Test Plan

Create a simple server that turns on BB logging and accepts a connection
gets a request and responds. It should do several request responses so we
can make sure that we see only the TTFB in and out once. We can examine
the BB logs to find this information.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

rrs requested review of this revision.May 18 2020, 7:59 PM

rrs created this revision.

Just a quick question - do you really need both (t_fbyte_in and t_fbyte_out) variables in TCPCB in full?

Wouldn't it suffice to first (before TF2_FBYTES_COMPLETE is set) only store the start timestamp (there is a provision to make it non-zero under all circumstances), and by the end-time, immediately subtract that from the current timestamp (ticks) to only store the delta (which, I believe, is the main deliverable here; or do you want to observe the relative t_fbyte_in / t_fbyte_out among a set of flows, to see if the application was particularly slow for certain overlapping requests (or when too many requests to the application became outstanding, etc?)

If you have both, and since you made provisions that neither can ever be zero, when assigned, the TF2_FBYTES_COMPLETE is a redundant flag...

lohithbsd_gmail.com added a subscriber: lohithbsd_gmail.com.May 19 2020, 6:51 PM

Richard:

I don't think that holding just the delta is what you really want. You want to be able
to tell when the first byte came in and the first byte went out. The delta only gives
one a partial view of what happened. And we are not living in 1984 where 64k of
memory is a lot. Four bytes to get more detailed information is I think worth it.

In D24902#548693, @rrs wrote:

Richard:

I don't think that holding just the delta is what you really want. You want to be able
to tell when the first byte came in and the first byte went out.

Right, but that is not clear from the description of this diff - do you want to know the time relative to wall clock on human timescales, or just the delta in machine timescale (where e.g. milliseconds are a lot).

Unless I'm mistaken, ticks runs at the rate of Hz (typically millisecond resolution) correct? So, while a 32 bit counter will only overrun once per ~1,5 months, I now wonder if the resolution is fine enough for the intended purpose. At least in my world, we try to track our IOs with a granularity of around 0.1 to 1 usec (which is no longer sufficient for NVMe stuff)...

And you could potentially have both - first use a uint64 with the absolte start time (at usec resolution), and then calculate a usec-res delta until the first byte out, and a msec-res "wall-clock" related start time, and store both as two uint32 in the same space...

The delta only gives one a partial view of what happened. And
we are not living in 1984

Oh, we are living in a future that is in many aspects so way beyond 1984, it's frightening :)

where 64k of memory is a lot. Four bytes to get more detailed information is I think worth it.

Right, but L1 / L2 / L3 cache is still a precious and contended resource IMHO.

Has discussed on the transport call, I will be updating these to have a uint64_t and a nano-second basis for the values (including t_starttime I think since this
is what you compare these against) :)

In D24902#550400, @rrs wrote:

Has discussed on the transport call, I will be updating these to have a uint64_t and a nano-second basis for the values (including t_starttime I think since this
is what you compare these against) :)

Wouldn't sbintime_t be easier and cheaper to work with?

As discussed on the transport telco, the first step of a two step approach.

This revision is now accepted and ready to land.Jun 4 2020, 8:44 PM

In D24902#550447, @rstone wrote

Wouldn't sbintime_t be easier and cheaper to work with?

Sbintime I don't think is any cheaper. You can use microuptime/binuptime/nanouptime or the getxxxx versions if
you want "faster" at least according to the man page.

Yes binuptime does not do the conversion to timespec or timeval but thats a very small difference. I have ran
comparison tests serving 100G of traffic with both getmicrouptime and microuptime for hpts and rack and I could
not measure a CPU difference.

My current plan will be to commit this as is, and then follow on with a V10 BB log that converts the 3 fields
to a 64bit number and enhances the resolution per Richards request. I want to do it that way so things
stay in sync with the BB tooling. I will most likely follow along with what the mlx nic does (or chesio
if they ever enable it) and use the 64bit timestamp that is used by the NIC cards i.e. see in
sys/mbuf.h mbuf_tstmp2timespec(). That gives one a very consistent timestamp inside
the network stack that can also be cheaply subtracted to get a nanosecond precision.
This will, I think, satisfy Richard as well as keep consistent with what is already happening within the network stack.