
Don't hold the scheduler lock while doing context switches.
Closed, Public

Authored by jeff on Dec 12 2019, 6:23 AM.

Details

Summary

This significantly simplifies some of the ULE locking and should also reduce contention on the scheduler lock, since a context switch is quite a long operation. The simplification comes from no longer spinning in sched_switch(), with the tdq locked, waiting for the blocked lock to clear. This also gets rid of cases where we had to pretend the lock directly changed ownership, assigning the owner and clearing lock profiling by hand. All in all, it is less of a special-case locking pattern.

Switches are, of course, protected against preemption by spinlock_enter(). The blocked lock still protects against switching into a thread that is in the process of switching out.
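
A minimal sketch of the resulting pattern (an illustration based on the description above, not the committed diff; the exact ordering and the use of thread_lock_block() are assumptions):

    /*
     * Hedged sketch of the switch path described above; this is an
     * illustration of the pattern, not the committed code.
     */
    spinlock_enter();                   /* interrupts off, no preemption */
    mtx = thread_lock_block(td);        /* td_lock now points at blocked_lock */
    TDQ_UNLOCK(tdq);                    /* scheduler lock released before ... */
    cpu_switch(td, newtd, mtx);         /* ... the actual context switch */
    spinlock_exit();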

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

jeff added reviewers: jhb, kib, mav, markj.
jeff set the repository for this revision to rS FreeBSD src repository - subversion.
sys/kern/kern_synch.c
532–535 ↗(On Diff #65536)

Arguably this could be moved into the schedulers, and then we wouldn't have to worry about returning within a spinlock section. However, I wanted to return to mi_switch() while still in the spinlock section so the KTR trace would make sense.

sys/kern/sched_4bsd.c
1030 ↗(On Diff #65536)

This feature has been dead for something like a decade.

I could also remove the td argument; it is always curthread.

sys/kern/kern_synch.c
532–535 ↗(On Diff #65536)

I think a comment explaining this would be useful.

Do we need to hold the spinlock section across the thread_stash() calls?

sys/kern/sched_4bsd.c
1030 ↗(On Diff #65536)

sched_switch() has only the one caller, so I do not think it would be that helpful; sched_switch() would then also need to reload curthread. mi_switch() already performs this reload even though all of its callers must do so anyway.

sys/kern/sched_ule.c
2368 ↗(On Diff #65536)

We don't need the thread lock to update td_owepreempt.

sys/kern/subr_sleepqueue.c
626 ↗(On Diff #65536)

The KTR is racy. Maybe that's not the end of the world, but we should at least have a comment pointing that out.

sys/kern/subr_turnstile.c
820 ↗(On Diff #65536)

I guess this KTR is racy now?

sys/kern/kern_synch.c
532–535 ↗(On Diff #65536)

I think just a critical section is needed, for the pcpu get/set.

Another option would be for sched_switch() to return the thread to be zombied to mi_switch(), where it can take care of it. Then the stash lock wouldn't need to be a spinlock.
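
A hypothetical sketch of that alternative; the return-value signature below is invented for illustration and is not what was committed:

    /*
     * Hypothetical alternative: sched_switch() hands the zombie back
     * instead of stashing it itself, so the stash lock no longer needs
     * to be a spin lock.
     */
    struct thread *zombie;

    zombie = sched_switch(td, flags);   /* invented return value */
    spinlock_exit();                    /* leave the spinlock section */
    if (zombie != NULL)
        thread_stash(zombie);           /* a plain lock would now suffice */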

sys/kern/sched_4bsd.c
1030 ↗(On Diff #65536)

The thing I don't like about it is that in some instances it implies you can operate on a thread other than curthread. For some functions you can, and for some you can't. Here, with only a few arguments, the performance impact is negligible, but in VFS it can create enough arguments to push you onto the stack instead of into registers.
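
For illustration, an assumption about the amd64 SysV calling convention rather than anything from this review: the first six integer arguments travel in registers, so adding always-curthread parameters can push later arguments onto the stack.

    /* amd64 SysV ABI: a..f arrive in %rdi..%r9; g and h spill to the
     * stack. */
    void few_args(int a, int b, int c);                 /* all in registers */
    void many_args(int a, int b, int c, int d,
        int e, int f, int g, int h);                    /* g, h on the stack */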

sys/kern/sched_ule.c
2368 ↗(On Diff #65536)

I can reorder these. I could also have an unlock label and just return after the switches; that might be less ugly. I try not to repeat lock operations because they can inline a lot of code.
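
A generic sketch of the unlock-label structure being described (the condition and helper are hypothetical; the point is a single inlined TDQ_UNLOCK() site):

    TDQ_LOCK(tdq);
    if (!need_to_switch)            /* hypothetical condition */
        goto unlock;
    do_switches(tdq);               /* hypothetical helper */
    return;                         /* switch path returns directly */
unlock:
    TDQ_UNLOCK(tdq);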

sys/kern/subr_sleepqueue.c
626 ↗(On Diff #65536)

Another option is to move it into the resuming thread, so you know when it is woken. The KTR_PROC in sched_switch() is still protected by a spinlock, so wherever the above KTR is done, you will know when you return from that switch, though you may be preempted.
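
A sketch of that placement, with the mi_switch() signature and flags assumed for illustration:

    /*
     * Log in the resuming thread, right after the switch returns, while
     * still inside the spinlock section; the record then marks the wakeup.
     */
    mi_switch(SW_VOL | SWT_SLEEP);      /* signature assumed */
    KTR_STATE0(KTR_SCHED, "thread", sched_tdname(td), "running");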

sys/kern/subr_turnstile.c
820 ↗(On Diff #65536)

Same as above

sys/kern/sched_ule.c
2071 ↗(On Diff #65536)

I believe John once said that spinlock_enter() is not a KPI to be used outside of the spin lock implementation.

In fact, do we need interrupts disabled there, or just preemption disabled? [cpu_switch() on amd64 definitely needs interrupts disabled]

2081 ↗(On Diff #65536)

Could you explain both why td_lock should not have changed for the idle thread before, and why we can change it now?

sys/kern/sched_ule.c
2071 ↗(On Diff #65536)

Definitely need interrupts and preemption disabled. The scheduler is called from interrupt handlers.

We don't want spinlock_enter()/exit() creeping into code that really only needs a critical section. It's much more expensive.
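
A simplified contrast of the two KPIs, assuming the amd64 implementations; the comments describe the dominant cost, not every detail:

    critical_enter();   /* bumps td_critnest: preemption deferred, cheap */
    /* ... code that must not be preempted ... */
    critical_exit();

    spinlock_enter();   /* also disables interrupts: the expensive part */
    /* ... code that must additionally exclude interrupt handlers ... */
    spinlock_exit();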

2081 ↗(On Diff #65536)

Idle should never sleep or block on a turnstile; those are the two conditions that change td_lock. I can assert !INHIBITED here, which would be more accurate.
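
The proposed assertion would look something like this (a sketch using the existing TD_IS_INHIBITED() macro; the panic message is invented):

    KASSERT(!TD_IS_INHIBITED(td),
        ("sched_switch: idle thread %p is inhibited", td));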

sys/kern/sched_ule.c
2071 ↗(On Diff #65536)

So what would be the problem if we only disabled interrupts in cpu_switch() and not there?

2081 ↗(On Diff #65536)

I see, thanks. Yes, perhaps !INHIBITED would be cleaner.

sys/kern/sched_ule.c
2071 ↗(On Diff #65536)

We have removed newtd from the run queue. If interrupts were enabled between that operation and the switch, curthread could be preempted while holding the only reference to newtd.

Multiple scheduler entry points already happen in a spinlock critical section, and the scheduler is expected to DTRT. See sched_throw() and sched_fork_exit().
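
A sketch of the window in question; the names and ordering are assumptions based on the ULE code:

    spinlock_enter();               /* interrupts off from here ... */
    newtd = tdq_choose(tdq);        /* pick the next thread */
    tdq_runq_rem(tdq, newtd);       /* newtd now exists only in this frame */
    /* an interrupt here could preempt curthread and strand newtd */
    cpu_switch(td, newtd, mtx);
    spinlock_exit();                /* ... through the switch */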

kib added inline comments.
sys/kern/sched_ule.c
2071 ↗(On Diff #65536)

But we disabled preemption? I am only talking about interrupts.

Anyway, I in fact retract my question. I misremembered some code and hoped that pushing the interrupt disable into cpu_switch() would make it simpler, but it does not.

This revision is now accepted and ready to land. Dec 14 2019, 5:37 PM
sys/kern/sched_ule.c
2071 ↗(On Diff #65536)

Interrupts still schedule threads, so you are right that we could keep a critical section and not lose this thread forever. We would, however, have two threads in an indeterminate scheduling state. I think it would be possible to make it work, but I'm not sure it would be worth the effort.

This revision was automatically updated to reflect the committed changes.
jeff added a commit: rS355784: schedlock 4/4.