Details

Reviewers

alc
markj
cem
rang_acm.org

Commits

rS362031: amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.

Summary

Right now the code first flushes all local TLB entries that need to be flushed, then sends IPIs to the remote cores, and then spins idle while waiting for acknowledgements. The VMware article 'Don’t shoot down TLB shootdowns!' notes that the time spent spinning is wasted and could be better spent doing the local TLB invalidation.

We could use the same invalidation handler for the local TLB as for the remote ones, but typically, for pmap == curpmap, we can use INVLPG locally, whereas the remote CPUs need INVPCID since we cannot control context switches on them. For that reason, keep the local code and pass it as a callback that smp_targeted_tlb_shootdown() calls after the IPIs are fired but before the spin wait starts.
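The ordering described in the summary can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions, not the kernel code: `sim_state`, `sim_shootdown`, `sim_local_flush`, and the event names are hypothetical, remote acknowledgements are modelled by clearing bits in a mask, and the real callback takes a pmap pointer rather than the simulation state.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_MAXLOG 8

/* Events recorded by the simulation, in the order they happen. */
enum sim_event { EV_IPI_SENT, EV_LOCAL_FLUSH, EV_REMOTE_ACK, EV_WAIT_DONE };

struct sim_state {
	unsigned pending;		/* remote CPUs that have not acknowledged */
	enum sim_event log[SIM_MAXLOG];
	int nlog;
};

/* Hypothetical callback type standing in for the local-flush callback. */
typedef void (*sim_cb_t)(struct sim_state *, uintptr_t addr1, uintptr_t addr2);

static void
sim_log(struct sim_state *s, enum sim_event ev)
{
	if (s->nlog < SIM_MAXLOG)
		s->log[s->nlog++] = ev;
}

/* Stand-in for the local invalidation (INVLPG in the real code). */
static void
sim_local_flush(struct sim_state *s, uintptr_t addr1, uintptr_t addr2)
{
	(void)addr1;
	(void)addr2;
	sim_log(s, EV_LOCAL_FLUSH);
}

static void
sim_shootdown(struct sim_state *s, unsigned remote_mask, sim_cb_t curcpu_cb,
    uintptr_t addr1, uintptr_t addr2)
{
	/* 1. Fire the IPIs first so the remote CPUs start working. */
	s->pending = remote_mask;
	sim_log(s, EV_IPI_SENT);

	/* 2. Do the local flush while the remote handlers run. */
	curcpu_cb(s, addr1, addr2);

	/* 3. Only now wait; each iteration models one acknowledgement. */
	while (s->pending != 0) {
		s->pending &= s->pending - 1;
		sim_log(s, EV_REMOTE_ACK);
	}
	sim_log(s, EV_WAIT_DONE);
}
```

Calling `sim_shootdown(&s, 0x6u, sim_local_flush, 0x1000, 0x2000)` on a zeroed `struct sim_state` records the local flush between the IPI send and the first acknowledgement, which is the reordering this revision makes.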

Tested by: pho

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 31580

Event Timeline

kib created this revision. Jun 8 2020, 1:37 PM

Herald added a subscriber: imp. Jun 8 2020, 1:37 PM

kib requested review of this revision. Jun 8 2020, 1:37 PM

Harbormaster completed remote builds in B31562: Diff 72825. Jun 8 2020, 1:37 PM

cem accepted this revision. Jun 8 2020, 2:21 PM

cem added inline comments.

sys/x86/x86/mp_x86.c
1746–1748	Bike-shed: The `bool` return feels like a weird abstraction to me. We seem to be avoiding uglifying the non-IPI case of `smp_targeted_tlb_shootdown_pinned` with `curcpu_cb()`. You could replace the `return (false)`s with `goto noipi_out;` and conclude the function something like:

	...
	mtx_unlock_spin(&smp_ipi_mtx);
	return;

noipi_out:
	if (curcpu_cb != NULL)
		curcpu_cb(pmap, addr1, addr2);
}

That has two return paths, but isn't unbearably ugly; IMO it is less ugly than the new construction where we are obliged to call the callback IFF the return value was `false`. Functionally, the change is fine as-is.
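The two-exit shape suggested in this comment can be tried out in a self-contained form. The sketch below is an assumption-laden illustration rather than the committed code: `sketch_shootdown`, `sketch_cb_t`, `count_cb`, and the counters are hypothetical, and the locking and spin wait are reduced to comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; the real one takes (pmap, addr1, addr2). */
typedef void (*sketch_cb_t)(int *cb_calls);

/* Test callback that only counts its invocations. */
static void
count_cb(int *cb_calls)
{
	(*cb_calls)++;
}

static void
sketch_shootdown(unsigned other_cpus, sketch_cb_t curcpu_cb, int *cb_calls,
    int *ipis_sent)
{
	/* Nothing to signal: skip the IPI machinery entirely. */
	if (other_cpus == 0)
		goto noipi_out;

	/* IPI path: one simulated IPI per set bit in the mask. */
	for (unsigned m = other_cpus; m != 0; m &= m - 1)
		(*ipis_sent)++;
	if (curcpu_cb != NULL)
		curcpu_cb(cb_calls);
	/* The spin wait and mtx_unlock_spin() would go here. */
	return;

noipi_out:
	/* Non-IPI path: the callback still runs, exactly once. */
	if (curcpu_cb != NULL)
		curcpu_cb(cb_calls);
}
```

With this structure the caller no longer has to inspect a `bool` result to decide whether to run the callback itself; both exits of the function take care of it.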

This revision is now accepted and ready to land. Jun 8 2020, 2:21 PM

markj accepted this revision. Jun 9 2020, 3:55 AM

markj added inline comments.

sys/x86/x86/mp_x86.c
1794	Why not perform wbinvd() the same way? Then you can eliminate a couple of conditional branches on curcpu_cb != NULL, I believe.

Looks reasonable to me. Good idea.

alc accepted this revision. Jun 9 2020, 7:21 AM

alc added inline comments.

sys/x86/x86/mp_x86.c
1746–1748	I agree with cem.

Review feedback:

handle cache invalidation.
drop wrapper, use goto.

Handle i386 with dummy curcpu_cb.

This revision now requires review to proceed. Jun 9 2020, 9:40 AM

Herald added a subscriber: rpokala. Jun 9 2020, 9:40 AM

Harbormaster completed remote builds in B31580: Diff 72851. Jun 9 2020, 9:40 AM

markj accepted this revision. Jun 9 2020, 1:39 PM

This revision is now accepted and ready to land. Jun 9 2020, 1:39 PM

cem accepted this revision. Jun 9 2020, 4:40 PM

cem added inline comments.

sys/i386/i386/pmap.c
1313–1318	Could this one be the same as amd64 (wbinvd callback), rather than using the dummy?

Handle i386 invalidate_cache() slightly more optimally.

This revision now requires review to proceed. Jun 9 2020, 5:02 PM

Harbormaster completed remote builds in B31592: Diff 72871. Jun 9 2020, 5:02 PM

Thanks!

This revision is now accepted and ready to land. Jun 9 2020, 5:13 PM

alc added inline comments. Jun 9 2020, 9:42 PM

sys/x86/x86/mp_x86.c
1679–1681	One thing about this code that might surprise or confuse newcomers who try to understand or use it is that the callback function is called unconditionally on the caller's underlying processor, even when that processor is not set in the mask. So, the callback function must be prepared to handle such spurious invocations. I think that there should be a comment here at the head of this function explaining this.
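What "prepared to handle such spurious invocations" could mean in practice is sketched below. This is a hypothetical model, not the pmap code: `demo_cpu`, `demo_invl_cb`, and `demo_shootdown` are invented names, address spaces are plain integers, and the check against the currently active address space is an assumed way for a callback to detect that it has nothing to do.

```c
#include <assert.h>

struct demo_cpu {
	int cur_pmap;	/* id of the address space active on this CPU */
	int cb_calls;	/* how often the callback ran on this CPU */
	int flushes;	/* how often it actually flushed something */
};

/* Callback that tolerates being called when no local flush is needed. */
static void
demo_invl_cb(struct demo_cpu *self, int pmap)
{
	self->cb_calls++;
	/* Spurious invocation: pmap is not active here, nothing to flush. */
	if (self->cur_pmap != pmap)
		return;
	self->flushes++;
}

/*
 * The mask only selects IPI targets (elided in this sketch); the callback
 * runs on the calling CPU whether or not that CPU is in the mask.
 */
static void
demo_shootdown(struct demo_cpu *self, unsigned mask, int pmap)
{
	(void)mask;
	demo_invl_cb(self, pmap);
}
```

In this model a CPU running address space 3 that issues a shootdown for address space 7 still gets its callback invoked, and the callback returns without flushing.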

Add herald comment for smp_targeted_tlb_shootdown().
i386 build fixes.

This revision now requires review to proceed. Jun 9 2020, 10:13 PM

Harbormaster completed remote builds in B31601: Diff 72895. Jun 9 2020, 10:13 PM

markj added inline comments. Jun 10 2020, 8:50 PM

sys/x86/x86/mp_x86.c
1682	"which are to be signalled"
1684	I would simplify a bit and write, "The curcpu_cb callback is invoked on the calling CPU while waiting for remote CPUs to complete the operation."