Page MenuHomeFreeBSD

Add a SPD cache to speed up lookups

Authored by on Apr 12 2018, 1:11 PM.


Group Reviewers
rS334054: Add a SPD cache to speed up lookups.

Add a SPD cache to speed up lookups.

When large SPDs are used, we face two problems:

  • too many CPU cycles are spent during the linear searches in the SPD for each packet eat
  • too much contention on multi socket systems, since we use a single shared lock.

Main changes:

  • added the sysctl tree 'net.key.spdcache' to control the SPD cache (disabled by default)
  • cache the sp indexes that are used to perform SP lookups.
  • use a range of dedicated mutexes to protect the cache lines.

Diff Detail

rS FreeBSD src repository - subversion
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Can you provide some numbers what benefits you have with this patch?

In D15050#317087, @ae wrote:

Can you provide some numbers what benefits you have with this patch?

Performance increased from 3gpbs to 12gbps using a 2*6 cores Xeon system
SPD size: 1024
8192 different UDP flows

Update the maxentries sysctl once the hashtable has been init in order to reflect the actual size used.

223 ↗(On Diff #41402)

I think this breaks the ABI and makes the change a non-MFCable.

204 ↗(On Diff #41402)

It looks like key_spidxhash() is similar to key_saidxhash(). Maybe it makes sense to rework it like:

static uint32_t key_addrprotohash(const union sockaddr_union *src, const union sockaddr_union *dst,
     const uint8_t *proto);

#define SPDCACHE_HASHVAL(idx)  \
     (key_addrprotohash(&idx->src, &idx->dst, &idx->ul_proto) & V_spdcachehash_mask)
#define SAHADDRHASH_HASHVAL(idx)   \
     (key_addrprotohash(&idx->src, &idx->dst, &idx->proto) & V_sahaddrhash_mask)
245 ↗(On Diff #41402)

This confuses the reader, I think it is better to not hide & under the macro. Also it is not clear to me, why we can not just use NULL pointer for this purpose.

549 ↗(On Diff #41402)

I think this should be SYSCTL_PROC to correctly handle user's input.

8273 ↗(On Diff #41402)

hashinit() already did LISTs initialization. added inline comments.
223 ↗(On Diff #41402)

Yes indeed. Is it a problem to be non-MFCable?

245 ↗(On Diff #41402)

Yes you're right. This is some old code from the first version of this cache, NULL is ok for the purpose.

549 ↗(On Diff #41402)

I am not sure to understand what you mean.
Do you mean it is badly handled (bug?) or would you like the cache size to be changed during runtime?

549 ↗(On Diff #41402)

I meant that user can specify wrong values and it is bad. I'm not sure that value 8 or 9 is good enough to test the cache. Ah, I understand, SYSCTL_PROC is wrong suggestion, since it is RDTUN, I meant that we need to check the value and probably do not accept values that are below some default. added inline comments.
549 ↗(On Diff #41402)

A small cache would be probably inefficient, but how can we decide the minimum size to be set? Do you have suggestions? 128, 1024, ...?

8 would be probably OK if you have a huge SPD but very few traffic flows involved at the same time (a huge SPD usually means many different traffic flows though)

I think we can try to test this with wider auditory.

223 ↗(On Diff #41402)

I propose to reuse place of the following two counters, that are unused, for SPD cache. This will keep the structure size and also reduce number of unused counters :)

This revision is now accepted and ready to land.May 13 2018, 1:06 PM
223 ↗(On Diff #41402)

Just to make sure: you want to remove ips_mbcoalesced and ips_clcoalesced?

223 ↗(On Diff #41402)

Yes, they seem to be unused.

This revision now requires review to proceed.May 15 2018, 9:08 AM
This revision was not accepted when it landed; it landed in state Needs Review.May 22 2018, 3:54 PM
This revision was automatically updated to reflect the committed changes.