Page MenuHomeFreeBSD

First cut of new Murmur3 hash to be used with PF.
AbandonedPublic

Authored by gnn on Jul 14 2014, 6:35 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Dec 2, 4:56 AM
Unknown Object (File)
Oct 30 2024, 4:48 AM
Unknown Object (File)
Oct 22 2024, 8:58 PM
Unknown Object (File)
Oct 22 2024, 10:24 AM
Unknown Object (File)
Oct 21 2024, 4:21 PM
Unknown Object (File)
Oct 21 2024, 4:21 PM
Unknown Object (File)
Oct 21 2024, 4:20 PM
Unknown Object (File)
Oct 21 2024, 4:20 PM

Details

Reviewers
des
Summary

The Murmer3 hash is found to improve the performance of PF by
12% and remove the hashing functions from the hot path for instructions
retired (hwpmc) under a lookup (no decision) test load.

Diff Detail

Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

gnn retitled this revision from to First cut of new Murmur3 hash to be used with PF..
gnn updated this object.
gnn edited the test plan for this revision. (Show Details)
  • Add a new option MURMUR3_HASH to select between this new hash

murmur_hash.c is missing a licence block, it looks like it may be MIT.

Also, I realise the functions are named by the Murmur author but having a x86 named function in the MI pf code may be confusing.

Any plans to make it the default?

sys/libkern/murmur_hash.c
3 ↗(On Diff #706)

This isn't a BSDL copyright. Do you have more information on the licence?

Instead of an option, consider just replacing it. We have more than enough knobs as it is.

sys/libkern/murmur_hash.c
3 ↗(On Diff #706)

Here's one C implementation https://github.com/PeterScott/murmur3/blob/master/murmur3.c

It has the following statement:

-----------------------------------------------------------------------------
MurmurHash3 was written by Austin Appleby, and is placed in the public
// domain. The author hereby disclaims copyright to this source code.

Quibbles over names, but this looks good other than that (assuming the unused murmor_hash3_*_128 functions actually work).

sys/conf/files
3043 ↗(On Diff #706)

Why not optional MURMOR3_HASH? Then again, we seem to always include all the hash functions, which may not be the best policy as we accumulate more that are used in increasingly specialized areas. Maybe not the swamp you want to drain today, eh?

MURMOR3_HASH is a great name to enable inclusion of the hash, but isn't such a good name to have PF use it instead of Jenkins. Consider PF_MURMOR3 or similar if that's what this option really does.

sys/libkern/murmur_hash.c
132–133 ↗(On Diff #706)

This is unused.

237–239 ↗(On Diff #706)

This is also unused.

sys/sys/hash.h
128–132 ↗(On Diff #706)

So what's the difference between the _x86_ and the _x64_ variants. The name suggests ix86 specific code. Looking at the code, though, this seems to be the difference between using 32-bit math and 64-bit math to compute the hash.

In D406#10, @emaste wrote:

Instead of an option, consider just replacing it. We have more than enough knobs as it is.

I'm with Ed here. Don't add an option, just replace jenkins with murmur3 if it's really that much better.

Insert obligatory comment about the dubious legal validity of placing code in the public domain, and the risk of using code which the author claims to have placed in the public domain.

Here's a BSD-licensed clean-room implementation:

http://people.freebsd.org/~des/software/murmur3_32.c

In pf's case, the input is always aligned and its size is always a multiple of 4, so the code can be significantly simplified and optimized as explained in the comments.

des requested changes to this revision.Jul 19 2014, 6:53 AM
des added a reviewer: des.

I found a serious error in the code; see inline comments about the input length parameter to the hash function.

sys/netpfil/pf/pf.c
391–393

This is wrong. The jenkins code counts its input in 32-bit words, but the murmur3 code counts its input in bytes. You need to remove the divide-by-four.

410–411

See above about input length.

419–420

See above about input length.

This revision now requires changes to proceed.Jul 19 2014, 6:53 AM
gnn edited edge metadata.

Add a 2 clause BSD license to the file as approved via email from
the original author.

sys/sys/hash.h
128–132 ↗(On Diff #706)

Allegedly, the _x86_ variant is more efficient on 32-bit CPUs while the _x64_ variant is more efficient on 64-bit CPUs.

I'm withdrawing this suggestion to take up another from DES.

sys/libkern/murmur_hash.c
3 ↗(On Diff #706)

The version that I drew from https://code.google.com/p/smhasher/wiki/MurmurHash3 has that same comment. I will work out a proper top level comment before commit.

3 ↗(On Diff #706)

I have now received permission, from the author, who is personally known to me, to license this code as 2 clause BSD in our tree.

sys/netpfil/pf/pf.c
391–393

Fixed in next commit.

410–411

Fixed in next commit.

419–420

Fixed in next commit.