Port the code that blocks on a turnstile to the lock-less delayed invalidation implementation, instead of yielding. Yielding might cause a tight loop due to priority inversion.
Since it is impossible to avoid the race between blocking and wake-up, arm a 1-tick callout to wake the thread up when it blocks itself.
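
The bounded-sleep idea can be sketched in userland as follows. This is only an analogy under stated assumptions, not the kernel code: TICK, Gen, wait_for and finish_without_notify are hypothetical names, and a condition variable with a timeout stands in for the turnstile plus the 1-tick callout. The completion flag is set without any notification, which models a wake-up that raced with the block and was lost.

```python
import threading

TICK = 0.01  # hypothetical stand-in for one kernel tick, in seconds


class Gen:
    """Hypothetical invalidation generation that a waiter blocks on."""

    def __init__(self):
        self.done = False  # written by the finisher without taking cv
        self.cv = threading.Condition()


def wait_for(gen):
    """Block until gen.done is set, re-checking at least once per TICK.

    Returns the number of times the waiter slept.
    """
    sleeps = 0
    with gen.cv:
        while not gen.done:
            # The timeout plays the role of the 1-tick callout: even if
            # no notification ever arrives, the waiter wakes and re-checks.
            gen.cv.wait(timeout=TICK)
            sleeps += 1
    return sleeps


def finish_without_notify(gen):
    """Complete the generation but never notify (a lost wake-up)."""
    gen.done = True


if __name__ == "__main__":
    g = Gen()
    threading.Timer(5 * TICK, finish_without_notify, args=(g,)).start()
    print("waiter slept", wait_for(g), "time(s) before seeing completion")
```

Without the timeout the waiter in this sketch would sleep forever. With it, the cost of a lost wake-up is bounded by about one TICK.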
Reported and tested by: mjg