
reduce cache coherence traffic on br_ring accesses
AbandonedPublic

Authored by kmacy on Feb 24 2015, 10:14 PM.
Tags
None

Details

Reviewers
imp
Summary

Short of creating separate buf_rings per package, there is no way to avoid a steady stream of coherence traffic on br_prod updates: by definition, many threads are simultaneously trying to acquire an index by updating it. However, once a producer has a unique index, there is no intrinsic cache line sharing with other producers. With the current implementation, if thread A is on package 1 and thread B is on package 2 and both are producing a steady stream of updates, each cache line of br_ring[] can change ownership up to CACHE_LINE_SIZE/sizeof(void *) times. If we instead pad out each entry to CACHE_LINE_SIZE, this ping-ponging can be avoided entirely.
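To make the layout argument concrete, here is a minimal sketch of which cache line a given slot lands on in the packed and padded layouts. It is not the patch itself: the 64-byte line size, the `sketch_` names, and the `sketch_padded_slot` type are assumptions for illustration, and the slot array is assumed to start on a cache line boundary.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* size_t */

/* Assumption for this sketch: 64-byte cache lines. */
#define SKETCH_CACHE_LINE_SIZE 64

/* Hypothetical padded entry: one pointer occupying a whole cache line. */
struct sketch_padded_slot {
	_Alignas(SKETCH_CACHE_LINE_SIZE) void *ptr;
};

static_assert(sizeof(struct sketch_padded_slot) == SKETCH_CACHE_LINE_SIZE,
    "padded slot must fill exactly one cache line");

/* Number of packed void * slots that share one cache line. */
static size_t
sketch_slots_per_line(void)
{
	return (SKETCH_CACHE_LINE_SIZE / sizeof(void *));
}

/* Cache line (relative to the array start) holding slot idx, packed layout. */
static size_t
sketch_packed_line(size_t idx)
{
	return ((idx * sizeof(void *)) / SKETCH_CACHE_LINE_SIZE);
}

/* Same for the padded layout: every slot gets a line of its own. */
static size_t
sketch_padded_line(size_t idx)
{
	return ((idx * sizeof(struct sketch_padded_slot)) /
	    SKETCH_CACHE_LINE_SIZE);
}
```

In the packed layout, consecutive indices handed to producers on different packages map to the same line until `sketch_slots_per_line()` slots have been filled; in the padded layout, `sketch_padded_line(idx)` is simply `idx`, so two producers holding distinct indices never write to the same line.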

The motivation for this change is that, at least on some platforms, such as those using AMD's HyperTransport, the number of coherence messages per second is lower than the speed of the link might imply. In other words, although latency is very low, the actual bandwidth is not that high.

Because this clearly explodes the size of the ring by a factor of CACHE_LINE_SIZE/sizeof(void *), I'm merely putting this out there and am not (currently) championing it. I seek (informed) commentary.
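For a rough sense of the cost, the sketch below computes the slot array footprint in both layouts. The 64-byte line size and the `footprint_` names are assumptions for illustration only. Under those assumptions and with 8-byte pointers, a 4096-entry ring grows from 32 KiB to 256 KiB.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* size_t */

/* Assumption for this sketch: 64-byte cache lines. */
#define FOOTPRINT_CACHE_LINE_SIZE 64

static_assert(FOOTPRINT_CACHE_LINE_SIZE % sizeof(void *) == 0,
    "cache line must hold a whole number of pointers");

/* Bytes used by the slot array when pointers are packed back to back. */
static size_t
footprint_packed(size_t entries)
{
	return (entries * sizeof(void *));
}

/* Bytes used by the slot array when each entry is padded to a full line. */
static size_t
footprint_padded(size_t entries)
{
	return (entries * FOOTPRINT_CACHE_LINE_SIZE);
}

/* Growth factor from padding: CACHE_LINE_SIZE / sizeof(void *). */
static size_t
footprint_factor(void)
{
	return (FOOTPRINT_CACHE_LINE_SIZE / sizeof(void *));
}
```

The growth factor is independent of the ring size, so the absolute cost scales with the number of entries and the number of rings in the system.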

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kmacy retitled this revision to reduce cache coherence traffic on br_ring accesses.
kmacy updated this object.
kmacy edited the test plan for this revision.
kmacy added a reviewer: imp.
kmacy added subscribers: andrew, rpaulo, zbb and 4 others.