add tunable to eliminate conflict misses when frequently accessing the first cache line of mbuf & cluster
Abandoned, Public

Authored by kmacy on Aug 22 2017, 5:48 AM.

Details

Reviewers
shurd
markj
Summary
  • bits 6 & 7 of mbuf addresses are currently always zero, which means that when (mostly) only accessing the first 64 bytes of an mbuf we're only able to use 1/4 of the cache sets
  • similarly, bits 6-10 are always zero in mbuf cluster addresses; this means that, particularly in the rx path, sub-cache-line-sized packets are only able to use 1/32nd of the cache sets
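
The arithmetic behind those two fractions can be checked with a small standalone sketch. It is not part of the patch: the helper names (cache_set_index, sets_touched) are hypothetical, and it assumes 64-byte cache lines, a power-of-two number of sets indexed directly by address bits starting at bit 6, 256-byte mbufs, and 2048-byte clusters laid out back to back from an aligned base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SHIFT 6	/* assumption: 64-byte cache lines */
#define MAX_SETS 1024		/* upper bound on sets this sketch models */

/*
 * Set index of the cache line containing addr, for a cache with
 * nsets sets (nsets must be a power of two).
 */
static unsigned
cache_set_index(uintptr_t addr, unsigned nsets)
{
	assert(nsets != 0 && (nsets & (nsets - 1)) == 0);
	return ((unsigned)(addr >> CACHE_LINE_SHIFT) & (nsets - 1));
}

/*
 * Count the distinct sets hit when only the first cache line of each
 * of nobjs objects is touched, with objects placed every stride bytes
 * starting at address 0.
 */
static unsigned
sets_touched(size_t stride, size_t nobjs, unsigned nsets)
{
	unsigned char seen[MAX_SETS] = { 0 };
	unsigned count = 0;

	assert(nsets <= MAX_SETS);
	for (size_t i = 0; i < nobjs; i++) {
		unsigned idx = cache_set_index((uintptr_t)(i * stride), nsets);

		if (!seen[idx]) {
			seen[idx] = 1;
			count++;
		}
	}
	return (count);
}
```

For a cache modeled with 64 sets (for example a 32 KB, 8-way L1, which is an assumption here), sets_touched(256, n, 64) settles at 16 and sets_touched(2048, n, 64) at 2, i.e. 1/4 and 1/32 of the sets, while a 64-byte stride reaches all 64.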

This may, in part, explain why prefetching frequently worsens measured performance.

Poor cache utilization generally isn't measurable at lower packet rates. Nonetheless, there are some users who would benefit.

It looks like the underlying UMA / VM code is broken. @shurd reports that tests don't complete when it's enabled. I'm adding @markj in the hope that he'll take a look after his move.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 11202

Event Timeline

kmacy added a reviewer: markj.
kmacy added a subscriber: markj.
shurd requested changes to this revision. Aug 22 2017, 6:12 PM

With this patch, and kern.ipc.cachespread="1", network traffic stops after some indeterminate time.

This revision now requires changes to proceed. Aug 22 2017, 6:12 PM

I've made it work with additional changes, but don't have a workload + hardware combo that benefits from the reduction in cache misses.