Currently this is mostly done in mi_switch. Unfortunately some of its consumers want the thread lock to remain held after it returns, making it infeasible to handle it there. Even if a flag argument or a new variant were introduced, it would still mean some callers manipulate the list with the lock held.
Therefore I decided to modify the mechanism to add the dying thread to the list early and then wait for it to be switched away from. Reaping is synchronized with a new state (TDS_UNUSED), which maintains the current semantics but also retains the pessimization of having to deal with it on context switch. Fixing the last part may be a little delicate so I'm leaving it as an exercise for the reader -- most of the benefit is already in place with the patch below.
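The ordering described above (publish the dying thread early, let the context switch path flag it, have the reaper wait for the flag) can be sketched in userspace C11. This is only an illustration under stated assumptions, not code from the patch: struct fake_thread, struct zombie_list, TD_EXITING, TD_UNUSED and all function names are hypothetical stand-ins, and the compare-and-swap list push is my assumption about how one might avoid taking a lock, not a claim about how the patch does it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical states; TD_UNUSED is modeled on TDS_UNUSED from the text. */
enum { TD_EXITING, TD_UNUSED };

struct fake_thread {
	_Atomic int state;
	struct fake_thread *next;
};

struct zombie_list {
	struct fake_thread *_Atomic head;
};

/* Dying thread: add itself to the list early, before switching away. */
static void
zombie_push(struct zombie_list *zl, struct fake_thread *td)
{
	struct fake_thread *old;

	assert(td != NULL);
	old = atomic_load_explicit(&zl->head, memory_order_relaxed);
	do {
		td->next = old;
	} while (!atomic_compare_exchange_weak_explicit(&zl->head, &old, td,
	    memory_order_release, memory_order_relaxed));
}

/* Context switch path: the thread is no longer running anywhere. */
static void
switched_away(struct fake_thread *td)
{
	atomic_store_explicit(&td->state, TD_UNUSED, memory_order_release);
}

/* Reaper: detach the whole list in one step. */
static struct fake_thread *
zombie_take_all(struct zombie_list *zl)
{
	return (atomic_exchange_explicit(&zl->head,
	    (struct fake_thread *)NULL, memory_order_acquire));
}

/* Reaper: spin until the thread is flagged; returns the spin count. */
static unsigned long
reap_wait(struct fake_thread *td)
{
	unsigned long spins = 0;

	while (atomic_load_explicit(&td->state, memory_order_acquire) !=
	    TD_UNUSED)
		spins++;
	return (spins);
}
```

In this sketch the store in switched_away is the cost that stays on the context switch path, which corresponds to the pessimization mentioned above.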
Perhaps I should add that a previous version of the patch removed deadthread entirely. It synchronized by first waiting for the thread to transition to TDS_INACTIVE and then for the thread lock to be released, which indicates in particular that mi_switch is done. Profiling showed it kept accumulating spins while waiting, so I abandoned it for now. Perhaps it can be revisited after (if?) contention drops. Any other indicator runs into possibly delicate issues.
I added a simple probe to get an idea of how often it waits in the first place during poudriere -j 104. 15 minutes of counting spins per wait reveals:
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 2699722
               1 |                                         1105
               2 |                                         2765
               4 |                                         4566
               8 |                                         7344
              16 |                                         9568
              32 |                                         3118
              64 |                                         45
             128 |                                         19
             256 |                                         3
             512 |                                         0
            1024 |                                         0
            2048 |                                         1
            4096 |                                         0
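To put the distribution in perspective, the bucket counts can be summed to see what share of waits spun at all. The sketch below just redoes that arithmetic in C. The struct and function names are made up for illustration, and it assumes each row's value is the lower bound of a power-of-two bucket, so only the 0 row means "did not spin". By my count roughly 1% of the waits had to spin at least once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of the distribution: bucket lower bound and sample count. */
struct bucket {
	int64_t value;
	uint64_t count;
};

/* Counts copied from the distribution above. */
static const struct bucket spin_hist[] = {
	{ -1, 0 }, { 0, 2699722 }, { 1, 1105 }, { 2, 2765 },
	{ 4, 4566 }, { 8, 7344 }, { 16, 9568 }, { 32, 3118 },
	{ 64, 45 }, { 128, 19 }, { 256, 3 }, { 512, 0 },
	{ 1024, 0 }, { 2048, 1 }, { 4096, 0 },
};
#define SPIN_HIST_LEN (sizeof(spin_hist) / sizeof(spin_hist[0]))

/* Total number of samples across all buckets. */
static uint64_t
hist_total(const struct bucket *h, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += h[i].count;
	return (sum);
}

/* Number of samples that spun at least once (bucket value >= 1). */
static uint64_t
hist_waited(const struct bucket *h, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		if (h[i].value >= 1)
			sum += h[i].count;
	return (sum);
}

/* Share of samples that had to spin, in percent. */
static double
hist_waited_pct(const struct bucket *h, size_t n)
{
	uint64_t total = hist_total(h, n);

	assert(total > 0);
	return (100.0 * (double)hist_waited(h, n) / (double)total);
}
```

Calling hist_waited_pct(spin_hist, SPIN_HIST_LEN) gives the percentage; hist_total and hist_waited give the raw sums behind it.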