The Murmer3 hash is found to improve the performance of PF by
12% and remove the hashing functions from the hot path for instructions
retired (hwpmc) under a lookup (no decision) test load.
Details
- Reviewers
des
Diff Detail
- Lint
No Lint Coverage - Unit
No Test Coverage
Event Timeline
Also, I realise the functions are named by the Murmur author but having a x86 named function in the MI pf code may be confusing.
Any plans to make it the default?
sys/libkern/murmur_hash.c | ||
---|---|---|
3 | This isn't a BSDL copyright. Do you have more information on the licence? |
Instead of an option, consider just replacing it. We have more than enough knobs as it is.
sys/libkern/murmur_hash.c | ||
---|---|---|
3 | Here's one C implementation https://github.com/PeterScott/murmur3/blob/master/murmur3.c It has the following statement: ----------------------------------------------------------------------------- |
Quibbles over names, but this looks good other than that (assuming the unused murmor_hash3_*_128 functions actually work).
sys/conf/files | ||
---|---|---|
3043 | Why not optional MURMOR3_HASH? Then again, we seem to always include all the hash functions, which may not be the best policy as we accumulate more that are used in increasingly specialized areas. Maybe not the swamp you want to drain today, eh? MURMOR3_HASH is a great name to enable inclusion of the hash, but isn't such a good name to have PF use it instead of Jenkins. Consider PF_MURMOR3 or similar if that's what this option really does. | |
sys/libkern/murmur_hash.c | ||
132–133 | This is unused. | |
237–239 | This is also unused. | |
sys/sys/hash.h | ||
128–132 | So what's the difference between the _x86_ and the _x64_ variants. The name suggests ix86 specific code. Looking at the code, though, this seems to be the difference between using 32-bit math and 64-bit math to compute the hash. |
I'm with Ed here. Don't add an option, just replace jenkins with murmur3 if it's really that much better.
Insert obligatory comment about the dubious legal validity of placing code in the public domain, and the risk of using code which the author claims to have placed in the public domain.
Here's a BSD-licensed clean-room implementation:
http://people.freebsd.org/~des/software/murmur3_32.c
In pf's case, the input is always aligned and its size is always a multiple of 4, so the code can be significantly simplified and optimized as explained in the comments.
I found a serious error in the code; see inline comments about the input length parameter to the hash function.
sys/netpfil/pf/pf.c | ||
---|---|---|
391–393 | This is wrong. The jenkins code counts its input in 32-bit words, but the murmur3 code counts its input in bytes. You need to remove the divide-by-four. | |
410–411 | See above about input length. | |
419–420 | See above about input length. |
Add a 2 clause BSD license to the file as approved via email from
the original author.
sys/sys/hash.h | ||
---|---|---|
128–132 | Allegedly, the _x86_ variant is more efficient on 32-bit CPUs while the _x64_ variant is more efficient on 64-bit CPUs. |
I'm withdrawing this suggestion to take up another from DES.
sys/libkern/murmur_hash.c | ||
---|---|---|
3 | The version that I drew from https://code.google.com/p/smhasher/wiki/MurmurHash3 has that same comment. I will work out a proper top level comment before commit. | |
3 | I have now received permission, from the author, who is personally known to me, to license this code as 2 clause BSD in our tree. | |
sys/netpfil/pf/pf.c | ||
391–393 | Fixed in next commit. | |
410–411 | Fixed in next commit. | |
419–420 | Fixed in next commit. |