Page MenuHomeFreeBSD

First cut of new Murmur3 hash to be used with PF.
AbandonedPublic

Authored by gnn on Jul 14 2014, 6:35 PM.

Details

Reviewers
des
Summary

The Murmer3 hash is found to improve the performance of PF by
12% and remove the hashing functions from the hot path for instructions
retired (hwpmc) under a lookup (no decision) test load.

Diff Detail

Lint
No Linters Available
Unit
No Unit Test Coverage

Event Timeline

gnn updated this revision to Diff 704.Jul 14 2014, 6:35 PM
gnn retitled this revision from to First cut of new Murmur3 hash to be used with PF..
gnn updated this object.
gnn edited the test plan for this revision. (Show Details)
gnn updated this revision to Diff 706.Jul 15 2014, 1:36 AM
  • Add a new option MURMUR3_HASH to select between this new hash

murmur_hash.c is missing a licence block, it looks like it may be MIT.

Also, I realise the functions are named by the Murmur author but having a x86 named function in the MI pf code may be confusing.

rpaulo added a subscriber: rpaulo.Jul 15 2014, 3:45 AM

Any plans to make it the default?

sys/libkern/murmur_hash.c
3 ↗(On Diff #706)

This isn't a BSDL copyright. Do you have more information on the licence?

emaste added a subscriber: emaste.Jul 15 2014, 6:13 AM

Instead of an option, consider just replacing it. We have more than enough knobs as it is.

sys/libkern/murmur_hash.c
3 ↗(On Diff #706)

Here's one C implementation https://github.com/PeterScott/murmur3/blob/master/murmur3.c

It has the following statement:

-----------------------------------------------------------------------------
MurmurHash3 was written by Austin Appleby, and is placed in the public
// domain. The author hereby disclaims copyright to this source code.

imp added a subscriber: imp.Jul 15 2014, 3:15 PM

Quibbles over names, but this looks good other than that (assuming the unused murmor_hash3_*_128 functions actually work).

sys/conf/files
3043 ↗(On Diff #706)

Why not optional MURMOR3_HASH? Then again, we seem to always include all the hash functions, which may not be the best policy as we accumulate more that are used in increasingly specialized areas. Maybe not the swamp you want to drain today, eh?

MURMOR3_HASH is a great name to enable inclusion of the hash, but isn't such a good name to have PF use it instead of Jenkins. Consider PF_MURMOR3 or similar if that's what this option really does.

sys/libkern/murmur_hash.c
132–133 ↗(On Diff #706)

This is unused.

237–239 ↗(On Diff #706)

This is also unused.

sys/sys/hash.h
128–132 ↗(On Diff #706)

So what's the difference between the _x86_ and the _x64_ variants. The name suggests ix86 specific code. Looking at the code, though, this seems to be the difference between using 32-bit math and 64-bit math to compute the hash.

des added a subscriber: des.Jul 19 2014, 12:56 AM
In D406#10, @emaste wrote:

Instead of an option, consider just replacing it. We have more than enough knobs as it is.

I'm with Ed here. Don't add an option, just replace jenkins with murmur3 if it's really that much better.

des added a comment.Jul 19 2014, 6:46 AM

Insert obligatory comment about the dubious legal validity of placing code in the public domain, and the risk of using code which the author claims to have placed in the public domain.

Here's a BSD-licensed clean-room implementation:

http://people.freebsd.org/~des/software/murmur3_32.c

In pf's case, the input is always aligned and its size is always a multiple of 4, so the code can be significantly simplified and optimized as explained in the comments.

des requested changes to this revision.Jul 19 2014, 6:53 AM
des added a reviewer: des.

I found a serious error in the code; see inline comments about the input length parameter to the hash function.

sys/netpfil/pf/pf.c
391–393

This is wrong. The jenkins code counts its input in 32-bit words, but the murmur3 code counts its input in bytes. You need to remove the divide-by-four.

410–411

See above about input length.

419–420

See above about input length.

This revision now requires changes to proceed.Jul 19 2014, 6:53 AM
gnn updated this revision to Diff 779.Jul 20 2014, 12:07 AM
gnn edited edge metadata.

Add a 2 clause BSD license to the file as approved via email from
the original author.

des added inline comments.Jul 20 2014, 2:51 PM
sys/sys/hash.h
128–132 ↗(On Diff #706)

Allegedly, the _x86_ variant is more efficient on 32-bit CPUs while the _x64_ variant is more efficient on 64-bit CPUs.

gnn abandoned this revision.Jul 21 2014, 4:48 PM

I'm withdrawing this suggestion to take up another from DES.

sys/libkern/murmur_hash.c
3 ↗(On Diff #706)

The version that I drew from https://code.google.com/p/smhasher/wiki/MurmurHash3 has that same comment. I will work out a proper top level comment before commit.

3 ↗(On Diff #706)

I have now received permission, from the author, who is personally known to me, to license this code as 2 clause BSD in our tree.

sys/netpfil/pf/pf.c
391–393

Fixed in next commit.

410–411

Fixed in next commit.

419–420

Fixed in next commit.