Page MenuHomeFreeBSD

LinuxKPI: implement mul_u64_u64_div_u64()
ClosedPublic

Authored by bz on May 16 2023, 9:07 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Apr 4, 9:32 PM
Unknown Object (File)
Mon, Apr 1, 1:12 PM
Unknown Object (File)
Mar 17 2024, 11:42 AM
Unknown Object (File)
Mar 17 2024, 11:42 AM
Unknown Object (File)
Mar 17 2024, 11:42 AM
Unknown Object (File)
Jan 12 2024, 2:54 AM
Unknown Object (File)
Jan 3 2024, 3:58 AM
Unknown Object (File)
Jan 3 2024, 3:57 AM
Subscribers

Details

Summary

Implement mul_u64_u64_div_u64() for an updated iwlwifi driver (though
we do not yet use it there; it is used for in-kernel ptp on wifi).

Sponsored by: The FreeBSD Foundation
Submitted by: cperciva
MFC after: 10 days

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bz requested review of this revision.May 16 2023, 9:07 PM
hselasky added a subscriber: hselasky.

Please don't implement integers like floating point!

I think you need to do something exactly like below. I just quickly sketched this up. Can you verify it?

static inline uint64_t
mul_u64_u64_div_u64(uint64_t x, uint64_t y, uint64_t z)
{
	const uint64_t r32[4] = { x >> 32, x & 0xFFFFFFFFULL, y >> 32, y & 0xFFFFFFFFULL };
	const uint64_t rZ[8] =  { r32[0] / z, r32[0] % z, r32[1] / z, r32[1] % z,
					       r32[2] / z, r32[2] % z, r32[3] / z, r32[3] % z };

	return (
	(((rZ[0] * rZ[4]) * z) << 64) +
	((rZ[0] * rZ[5]) << 64) +
	(((rZ[0] * rZ[6]) * z) << 32) +
	((rZ[0] * rZ[7]) << 32) +
	
	((rZ[1] * rZ[4]) << 64) +
	(((rZ[1] * rZ[5]) / z) << 64) +
	((rZ[1] * rZ[6]) << 32) +
	(((rZ[1] * rZ[7]) / z) << 32) +
	
	(((rZ[2] * rZ[4]) * z) << 32) +
	((rZ[2] * rZ[5]) << 32) +
	((rZ[2] * rZ[6]) * z) +
	(rZ[2] * rZ[7]) +
	
	((rZ[3] * rZ[4]) << 32) +
	(((rZ[3] * rZ[5]) / z) << 32) +
	(rZ[3] * rZ[6]) +
	((rZ[3] * rZ[7]) / z)
	);
}
This revision now requires changes to proceed.May 17 2023, 7:06 AM

Please don't implement integers like floating point!

I think you need to do something exactly like below. I just quickly sketched this up. Can you verify it?

No

In D40120#913732, @bz wrote:

Please don't implement integers like floating point!

I think you need to do something exactly like below. I just quickly sketched this up. Can you verify it?

No

Can you explain a bit more what this function is supposed to do?

Loosing integer precision is not a good thing!

What is this function used for?

--HPS

@bz : Another thing coming to mind, is to apply the greatest common divisor between x,y,z. This computation is pretty cheap. When you have exact fractions, it doesn't look so good seeing rounding errors.

Is this function being used for clock computations?

@bz : Another thing coming to mind, is to apply the greatest common divisor between x,y,z. This computation is pretty cheap. When you have exact fractions, it doesn't look so good seeing rounding errors.

Is this function being used for clock computations?

Yeah, it's used in a driver ptp code (which we don't actually support yet, the in-kernel ptp bits)

Based on it's name it does X * Y / Z

Hans, any suggestions on how to proceed?

In D40120#918932, @bz wrote:

Hans, any suggestions on how to proceed?

@bz

Can you show me excatly where this function is needed?

The function name does not indicate the function is inaccurate.

--HPS

In D40120#918932, @bz wrote:

Hans, any suggestions on how to proceed?

@bz

Can you show me excatly where this function is needed?

https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-testing.git/tree/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c#n74

In case this is going to be too complicated I'll add a "skeleton" in given we are unlikely going to use this much -- if at all -- anywhere.

In D40120#921419, @bz wrote:
In D40120#918932, @bz wrote:

Hans, any suggestions on how to proceed?

@bz

Can you show me excatly where this function is needed?

https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-testing.git/tree/drivers/net/wireless/intel/iwlwifi/mvm/ptp.c#n74

In case this is going to be too complicated I'll add a "skeleton" in given we are unlikely going to use this much -- if at all -- anywhere.

Thank you!

I looked a bit at Linux, and it seems they have an exact implementation for this function, utilizing dedicated CPU instructions when possible to avoid bunches of code.

What do you think?

Some ideas:

What values are being passed to this function?

Can we compute the greatest common divisor on the mul/div values, and reduce this to a classic, scale 64-bit number by 32/32-bits as in sys/sys/time.h, after my recent changes?

Update with algo submitted by @cperciva.

bz removed a subscriber: hselasky.
sys/compat/linuxkpi/common/include/linux/math64.h
121

Even if it's not a KASSERT, I think it's worth documenting

// INVARIANT: x * y = res * z + rem + (y1 + y1z * z) * x1
// INVARIANT: y1 < z

to help the reader understand.

134

The second part of this should be y1 * 2 >= z.

142

Before returning I would probably add

KASSERT(res * z + rem == x * y);
KASSERT(rem < z);

Those don't prove that the computation is correct (since they only verify the bottom 64 bits of the 128-bit product) but it's still better than nothing (and would have caught the typo above).

bz marked 3 inline comments as done.Aug 16 2023, 4:39 AM

@dwmalone Please take a look at this too, just to make sure I didn't goof anywhere... it's always hard to review my own code. ;-)

Looks sensible to me. Minor observations:

  1. It will blow up with a division by zero if z = 0, however, that's probably the correct behaviour.
  2. The two checks for overflow are clever - if we overflow, then we have definitely exceed our invariant, and so need to do the reduce step.
  3. I've suggested recording one extra invariant above.
  4. I've run a few different versions of this test code, and it has always produced the correct answers, so I think it is working fine.
int main(void) { 
        uint64_t x, y, z, m;
        uint64_t a1, a2;
        int i;
        
        for (i = 0; i < 1000000; i++) { 
                x = arc4random();
                y = arc4random();
                z = arc4random();
                a2 = (x*(uint64_t)y)/z;
                m = arc4random();
                y *= m; 
                z *= m; 
                a1 = mul_u64_u64_div_u64(x,y,z);
                if (a1 != a2)
                        printf("%ju %ju %ju\n", (uintmax_t)x, (uintmax_t)y, (uintmax_t)z);
        }                       
                        
        return 0;
}
sys/compat/linuxkpi/common/include/linux/math64.h
122

Could also comment? // INVARIANT: rem < z

sys/compat/linuxkpi/common/include/linux/math64.h
122

Yes, I don't know why I didn't think of that. @bz can you add this too?

Add another INVARIANT as suggested by @dwmalone

This revision was not accepted when it landed; it landed in state Needs Review.Aug 18 2023, 1:21 AM
This revision was automatically updated to reflect the committed changes.