Page MenuHomeFreeBSD

Introduce <sys/qmath.h>
ClosedPublic

Authored by trasz on Apr 30 2019, 8:05 PM.

Details

Summary

Introduce <sys/qmath.h>, a fixed-point math library from Netflix.

This makes it possible to perform mathematical operations
 on
fractional values without using floating point. It operates on Q
numbers, which are integer-sized, opaque structures 
initialized
to hold a chosen number of integer and fractional
 bits.


For a general description of the Q number system, see the "Fixed Point
Representation & Fractional Math" whitepaper[1]; for the actual
API see the qmath(3) man page.

This is one of dependencies for the upcoming stats(3)
framework[2] that will be applied to the TCP stack in
a later commit.

  1. https://www.superkits.net/whitepapers/Fixed%20Point%20Representation%20&%20Fractional%20Math.pdf
  2. https://reviews.freebsd.org/D20477

Sponsored By: Klara Inc, Netflix
Obtained from: Netflix

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
This revision now requires changes to proceed.Jun 7 2019, 2:20 AM
trasz updated this revision to Diff 58360.Jun 7 2019, 2:05 PM

Fixes from sef@

trasz marked 2 inline comments as done.Jun 7 2019, 2:16 PM
In D20116#443270, @sef wrote:

This mostly seems ok -- it's pretty much nothing but a man page and header file, after all -- but my biggest complaint is that the man page is exceedingly complicated, and does not provide any real examples. The examples would also be a good way to show the justification for the code

I've added a simple example. What do you think?

I'm not going to strongly request that the man page be broken up, but it may be worth thinking about.

I've got mixed feelings about it. On one hand, separate pages are better when used with vim's ^K - because you immediately get the information you need - but on the other, to me it's easier to get a general idea about the API from a single page, without having to jump between them.

share/man/man3/qmath.3
39 ↗(On Diff #57368)

I'd swear I've fixed those... Should be ok now.

125 ↗(On Diff #57368)

I suppose it was planned and never actually implemented. Remove for now.

150 ↗(On Diff #57368)

Hm. How about Q_LTZ()?

trasz marked 2 inline comments as done.Jun 7 2019, 2:17 PM
trasz added a comment.Jun 7 2019, 2:23 PM
In D20116#443964, @cem wrote:

The concept is cool but it's impossible to closely review 2000 lines of novel content. On the plus side, half of that is manual page and 1/4 is test cases, so that increases confidence somewhat. Still 500 lines of dense header macros with approximately zero comments (just API specification).

It is dense, I agree. Still, there are some comments there, and places which lack them basically do bit manipulation, which makes sense when looked at within the context (qtype structure is described in the man page itself).

imp added a comment.Jun 7 2019, 2:24 PM
In D20116#443964, @cem wrote:

The concept is cool but it's impossible to closely review 2000 lines of novel content.

I had the same issues when I reviewed it internally... I wrote a specific one of these for the I/O scheduler, but I have have no confidence the test suite covers all the weird cases I ran into there which made me adjust the code: the biggest worry is shifting things too high (overflow) or too low (underflow). The individual test cases are fine, but there's no attempt to sweep the range of possible values to ensure a certain level of accuracy is maintained across them all.

cem added a comment.Jun 7 2019, 3:13 PM
In D20116#443270, @sef wrote:

the man page is exceedingly complicated

+1

I'm not going to strongly request that the man page be broken up, but it may be worth thinking about.

I would strongly encourage it. At a minimum, it needs significant revamping to put the important content first.

The DESCRIPTION section is ~241 lines long, and the first 89 are almost entirely irrelevant. They describe implementation details; the goal of a .3 page is to describe the API. You could put the implementation details in a .Sh IMPLEMENTATION NOTES, which is ordered after DESCRIPTION and before RETURN VALUES.

(The "SYNOPSIS" section as-is is 226 lines. This, and the description, is a good reason to break up the page.) It's fine if you want to group the sub-pages by category and continue to have a main page that refers to the category pages and maybe describes the general philosophy of the thing.

On one hand, separate pages are better ...

I agree :-)

but on the other, to me it's easier to get a general idea about the API from a single page, without having to jump between them.

Jumping around between multiple terms of manual page by specific name is easier than jumping around linearly in a single page, IMO.

share/man/man3/qmath.3
150 ↗(On Diff #57368)

Is there any reason not to just use Q_LT, Q_LE, Q_GT, Q_GE, etc?

46–50 ↗(On Diff #58360)

Typical convention for short functions like this is:

.Ft int
.Fn Q_QADDQ "QTYPE *a" "QTYPE b"

(Rather than the longer form.)

Note also that style(9) puts the * directly adjacent to the following parameter name, without a space in the middle.

(Same below, of course.)

416–426 ↗(On Diff #58360)

Is this required by convention around Q numbers, or how was this set of sizes chosen? It seems like an odd scale of bits if it is not mandated by some existing convention. E.g., is 2 bits of suffix especially valuable? Is it necessary that the fractional parts are all of size divisible by 2?

For example, you could instead have a non-linear relationship and instead map the lower 3 bits into some 8-entry lookup table of a non-linear function; e.g., floor((x+1)² * 0.791) + 1[4, 8, 13, 20, 29, 39, 51, 64]. You could substitute some other function to stack more of the widths in the low end of the spectrum, if that is desired.

(Is it actually valuable or even possible to represent qnums with 64 bits of fractional part? It seems like 64 bits is aspirational, due to the maximum unsigned QTYPE u64q_t and the minimum 3 control bits; there are only 61 bits available for precision.)

1101 ↗(On Diff #58360)

Typical style is

.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org

(I.e., add Mt)

1103–1104 ↗(On Diff #58360)

This isn't a helpful BUGS section. Be specific about what functions and what the worst case current error rate is, and what the best case error rate should be, or just drop the section.

trasz marked 2 inline comments as done.Jun 9 2019, 9:33 PM

Thanks. I've moved most of the stuff to IMPLEMENTATION NOTES, leaving only the "what you really need to know" part in DESCRIPTION. I'll see how it looks like when split into separate pages.

share/man/man3/qmath.3
150 ↗(On Diff #57368)

It kind of follows the convention with QMULQ (vs QMULI) et al. On the other hand this doesn't seem to be very consistent. I'll think about it some more.

416–426 ↗(On Diff #58360)

To be honest... I don't know. I'm not the original author. The current encoding does make sense - it's quite compact and does not need any additional table lookups (ie memory accesses), though.

I think you're right regarding the 64 bits - I've removed this part.

trasz updated this revision to Diff 58460.Jun 9 2019, 9:34 PM

Fix most of the issues found by cem@.

cem added inline comments.Jun 9 2019, 10:03 PM
share/man/man3/qmath.3
416–426 ↗(On Diff #58360)

An 8-entry LUT can be stored in 8 bytes; it probably would not generate additional memory accesses (i.e., compiler would store in the instruction stream, or it would already be present in cache). I wouldn't worry about it from a memory access perspective.

trasz updated this revision to Diff 58481.Jun 10 2019, 6:22 PM

Split into individual pages.

cem added a comment.Jun 10 2019, 7:29 PM

Thanks for splitting up the pages.

trasz added a comment.Jun 13 2019, 7:34 PM

Regarding the LUT idea - it does make sense, but I believe it would require somewhat complicated code surgery, and I'd rather not do it at this time: it's a vendor code which I'm trying to upstream, not to rewrite. It also sounds like something that can be done later, if neccessary.

sef added a comment.Jun 13 2019, 7:51 PM

The spilt-up man pages are much better. The examples definitely help a lot. But now I'm wondering about things like rounding -- the obvious usage that occurred to me here was, ahem, ZZZZZZ9.99, to borrow from a long-ago past :).

share/man/man3/Q_SIGNED.3
148 ↗(On Diff #58481)

Missing .Pp?

cem added a comment.Jun 13 2019, 8:02 PM

Regarding the LUT idea - it does make sense, but I believe it would require somewhat complicated code surgery, and I'd rather not do it at this time: it's a vendor code which I'm trying to upstream, not to rewrite. It also sounds like something that can be done later, if neccessary.

Sure, that's fine.

trasz updated this revision to Diff 58597.Jun 13 2019, 8:40 PM

Add missing .Pp.

trasz marked an inline comment as done.Jun 13 2019, 8:54 PM
In D20116#445852, @sef wrote:

The spilt-up man pages are much better. The examples definitely help a lot. But now I'm wondering about things like rounding -- the obvious usage that occurred to me here was, ahem, ZZZZZZ9.99, to borrow from a long-ago past :).

Thanks! Now, regarding the rounding - I'm not much into numerics, I'm afraid. I don't even recognize the "ZZZZZZ9.99" :-)

sef added a comment.Jun 13 2019, 8:57 PM

I'm too old for the world.

It's a COBOL formatting specifier; the important part is that it specifies two decimal digits. My point was that if you set up a Q7.2 number, then multiplied it by, say, 1.027, how does it treat rounding? This seems like it would be a big deal in using these? (I mean, it's entirely possible it uses normal rounding, or maybe it's using IEEE rounding settings? But shouldn't this be specified somewhere?)

trasz added a comment.Jun 14 2019, 6:49 PM

Ah, COBOL. Never used it, but I seem to remember reading that it natively supported fixed point.

Now, regarding precision - I'm pretty sure it doesn't use the floating point environment (fenv(3) et al). I think it takes the fractions, multiplies them, and them truncates back to the destination's precision.

trasz added a comment.Jun 25 2019, 1:15 PM

So... is there anything else left for me to do here before this can get committed?

cem requested changes to this revision.Jun 25 2019, 3:43 PM

So... is there anything else left for me to do here before this can get committed?

I think there are open questions around rounding behavior and making the extensive macros into functions.

This revision now requires changes to proceed.Jun 25 2019, 3:43 PM
trasz added a comment.Jun 25 2019, 5:30 PM
In D20116#448878, @cem wrote:

So... is there anything else left for me to do here before this can get committed?

I think there are open questions around rounding behavior and making the extensive macros into functions.

The problem with functions is that in many places it uses Q_TC() (which boils down to __typeof()), or macros that expand to Q_TC(). I guess I could try to rework the ones which don't, but that would look inconsistent.

As for rounding - the bugs section, which mentioned them, is gone. The code is just a fixed-point library, so the rounding is what one would expect, I suppose: stuff gets truncated according to the radix point. Should I document it somehow?

sef added a comment.Jun 25 2019, 5:33 PM

As for rounding - the bugs section, which mentioned them, is gone. The code is just a fixed-point library, so the rounding is what one would expect, I suppose: stuff gets truncated according to the radix point. Should I document it somehow?

Yes, it should be documented somehow. I gave an example of a multiplication that could be done -- what would the results of that be, and why?

It doesn't need to be an indepth discussion of rounding, but it does need to be discussed.

cem added a comment.Jun 26 2019, 6:24 AM
In D20116#448878, @cem wrote:

I think there are open questions around rounding behavior and making the extensive macros into functions.

The problem with functions is that in many places it uses Q_TC() (which boils down to __typeof()), or macros that expand to Q_TC(). I guess I could try to rework the ones which don't, but that would look inconsistent.

Is it really useful to have any of these operate on anything smaller than a machine word? I think you could basically do all the math on unsigned long (and perhaps uint64_t for 32-bit arch) in functions and just use macros for thin shims that convert word-sized results to the magical smaller width types. Functions are really a lot better than macros to debug and analyze so even if there is a little pain over the current macro approach, I think it is sometimes worth it.

In D20116#448996, @sef wrote:

Yes, it should be documented somehow. I gave an example of a multiplication that could be done -- what would the results of that be, and why?
It doesn't need to be an indepth discussion of rounding, but it does need to be discussed.

+1

sys/sys/qmath.h
238 ↗(On Diff #58597)

style(9), use of a pointer as a boolean value; and other instances

239 ↗(On Diff #58597)

long is definitely the wrong type for comparing against a pointer difference. ptrdiff_t might be appropriate (signed), or size_t (unsigned) (if ordering of _s and (s) is guaranteed). Ditto other uses.

trasz updated this revision to Diff 59171.Jun 28 2019, 8:45 PM

Fix use of pointers as booleans, use ptrdiff_t, and add brief explanation
of rounding (or lack thereof).

sef added inline comments.Jun 28 2019, 8:50 PM
share/man/man3/qmath.3
81 ↗(On Diff #59171)

What happens if I ${OPERATION} Q numbers with different precision?

I think you may want to say, "The fractional component is truncated to fit into the destination, with no rounding." Assuming that's what's actually being done of course :).

allanjude added inline comments.
share/man/man3/qmath.3
340 ↗(On Diff #59171)

Q_MAXSTRLEN seems to take 2 arguments now, q and base.

I think think needs to be changed to:
char buf[Q_MAXSTRLEN(s32, 10)];

to work (base 10 matching the Q_TOSTR with base 10 you are doing below)

trasz updated this revision to Diff 59400.Jul 4 2019, 5:26 PM
trasz marked 2 inline comments as done.

Fixes from sef@ and allanjude@.

trasz added inline comments.Jul 4 2019, 5:26 PM
share/man/man3/qmath.3
81 ↗(On Diff #59171)

Well... one of the implementation details is that the precision of both arguments must match. Still, I like your version, and I've also updated Q_QADDQ.3 to explicitly mention this.

trasz updated this revision to Diff 59424.Jul 4 2019, 8:21 PM

Merge upstream change from lstewart@:

The method used by Q_FBITS2CH() to convert fractional bits into their
base-specific string representation loses precision unnecessarily during the
conversion. In attempting to avoid overflow from multiplying the residual bits
by the base, the truncation caused by the right shift of bits prior to
multiplying by base loses any carry that would have been contributed by the
lower order truncated bits i.e. the representations were sometimes smaller than
if full precision had been maintained.

Improve Q_FBITS2CH() by explicitly factoring any carry into the calculation of
each digit, thereby maximising the string representation's precision.

A Q45.16 number holding the binary representation of decimal 3.1415 is rendered
as:

"3.1314947509765625" without change
"3.1414947509765625" with change
trasz updated this revision to Diff 59518.Jul 7 2019, 7:45 PM

Merge upstream change from lstewart@:

Add a new Qmath library function Q_DFV2BFV(), which converts a decimal
fractional value into its binary-encoded representation with nfbits
of binary precision. The value to convert must be passed as a preprocessor
literal to preserve leading zeroes.

Take advantage of the new macro for Q number initialisation by modifying Q_INI()
to take a desired initial decimal fractional value in addition to the desired
initial integral value.

Example usage to initialise a Q45.16 number with decimal 3.1415:

u64q_t q;
Q_INI(&q, 3, 1415, 16);

Also update the tests.

trasz added a comment.Jul 7 2019, 7:49 PM
In D20116#449151, @cem wrote:
In D20116#448878, @cem wrote:

I think there are open questions around rounding behavior and making the extensive macros into functions.

The problem with functions is that in many places it uses Q_TC() (which boils down to __typeof()), or macros that expand to Q_TC(). I guess I could try to rework the ones which don't, but that would look inconsistent.

Is it really useful to have any of these operate on anything smaller than a machine word? I think you could basically do all the math on unsigned long (and perhaps uint64_t for 32-bit arch) in functions and just use macros for thin shims that convert word-sized results to the magical smaller width types. Functions are really a lot better than macros to debug and analyze so even if there is a little pain over the current macro approach, I think it is sometimes worth it.

I agree that it might be a better way to do it, but honestly, I don't really want to rewrite it. I'd quite some work, and I'd probably end up introducing some new bugs.

In D20116#448996, @sef wrote:

Yes, it should be documented somehow. I gave an example of a multiplication that could be done -- what would the results of that be, and why?
It doesn't need to be an indepth discussion of rounding, but it does need to be discussed.

+1

Does it look ok now? Thanks!

trasz edited the summary of this revision. (Show Details)Jul 10 2019, 12:37 PM
lstewart added inline comments.
share/man/man3/qmath.3
39 ↗(On Diff #57368)

FYI, it was there to say that Q_INI() returned the first operand 'q'.

125 ↗(On Diff #57368)

Oops. FYI Q_INTMAX() did exist but became Q_IMAXVAL() during some refactoring at some point.

150 ↗(On Diff #57368)

FYI as background on my choice of naming convention... Q_ is intended as a library-wide prefix, and after the prefix, the placement of the numeric type identifiers from the man page ({Q | I | N}TYPE) indicates the types of the operands e.g. Q_Q<op>Q indicates the function operates on a Q number as its first numeric type argument (as opposed to a stdint or generic numeric type) and Q number as its second numeric type argument. Q_QFRACI by contrast operates on a Q first arg and stdint supplemental args. So Q_QGTQ compares 2 Q numbers, while a hypothetical Q_QGTI would target cross-type comparison.

328 ↗(On Diff #59518)

This will print the binary-encoded representation of the fractional bits as decimal, which I don't think is what you're trying to demonstrate here.

trasz updated this revision to Diff 59730.Jul 13 2019, 7:21 PM
trasz edited the summary of this revision. (Show Details)

Update the example.

trasz marked an inline comment as done.Jul 13 2019, 7:22 PM
trasz added inline comments.
share/man/man3/qmath.3
328 ↗(On Diff #59518)

D'oh, the same bug as before, just in reverse. Thanks, fixed.

trasz edited the summary of this revision. (Show Details)Jul 15 2019, 5:01 PM
trasz edited the summary of this revision. (Show Details)Jul 18 2019, 9:24 PM
allanjude edited the summary of this revision. (Show Details)Aug 13 2019, 4:59 PM

I've started to read the documentation and it looks like it's definitely worth plugging this math emulation layer into libcmb to see just how much performance gain is actually realized.

sef accepted this revision.Aug 13 2019, 6:11 PM
imp added inline comments.Aug 13 2019, 6:16 PM
tests/sys/sys/qmath_test.c
69 ↗(On Diff #59730)

All of these tests are extremely opaque. they should have some minimal comments at least describing what they are testing.

trasz updated this revision to Diff 61344.Aug 27 2019, 9:28 AM
trasz marked an inline comment as done.

Add comments explaining what each of the tests is doing,
as suggested by imp@.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 27 2019, 11:46 AM
This revision was automatically updated to reflect the committed changes.