
mi_switch(9): update to current day
Closed, Public

Authored by mhorne on Jan 24 2023, 9:51 PM.

Details

Summary

The function itself and much of the information in this page remain
relevant, but many details need to be fixed.

  • Update the function signatures
  • Update the list of major uses of mi_switch() (the list is not exhaustive)
  • Document the 'flags' argument and its possible values
  • Document the thread lock requirement for callers
  • Thread runtime limits are out of scope now; there is no need to describe them
  • Remove outdated information regarding KSE, the runqueue, the non-preemptible kernel, etc.
  • Update the description of cpu_switch() and its responsibilities

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

share/man/man9/mi_switch.9
101

One important case not listed here is involuntary preemption, which corresponds to the calls in sched_preempt() (invoked when a remote CPU has added work to this CPU's scheduling queue) and sched_clock() (which checks whether the running thread has exhausted its quantum).
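
As a rough illustration of the quantum case, here is a toy model in plain C (quantum_tick(), maybe_switch(), and QUANTUM_TICKS are hypothetical names, not the kernel's; the real logic lives in sched_clock() and the AST machinery):

```c
/*
 * Toy model of quantum-driven involuntary preemption.  All names
 * here are hypothetical; this is not FreeBSD kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

#define QUANTUM_TICKS 10

static int ticks_left = QUANTUM_TICKS;
static bool need_resched = false;

/* Called from the (simulated) statistics clock interrupt. */
static void
quantum_tick(void)
{
	if (--ticks_left <= 0)
		need_resched = true;	/* quantum exhausted */
}

/* Checked on the way back from an interrupt or trap. */
static void
maybe_switch(void)
{
	if (need_resched) {
		need_resched = false;
		ticks_left = QUANTUM_TICKS;
		printf("involuntary switch: quantum expired\n");
		/* A real kernel would call mi_switch() here. */
	}
}

int
main(void)
{
	for (int i = 0; i < 25; i++) {
		quantum_tick();
		maybe_switch();
	}
	return (0);
}
```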

112

Should this one ever be used? Is it worth documenting?

116

Maybe something like "Switch after propagating scheduling priority to the owner of a resource."

Turnstiles themselves aren't contended per se; they're used to implement priority propagation for contended resources.
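
A minimal sketch of the priority-propagation idea, assuming a toy representation (struct toy_thread and lend_priority() are hypothetical; FreeBSD's real mechanism is the turnstile code, and lower numeric values mean higher priority):

```c
/* Toy model of priority lending; not the kernel's turnstile code. */
#include <stdio.h>

struct toy_thread {
	const char *name;
	int	prio;		/* effective priority */
	int	base_prio;	/* priority before any lending */
};

/* A blocking thread lends its priority to the resource owner. */
static void
lend_priority(struct toy_thread *blocker, struct toy_thread *owner)
{
	if (blocker->prio < owner->prio)
		owner->prio = blocker->prio;
	/* After this, the blocker would switch off the CPU. */
}

int
main(void)
{
	struct toy_thread owner = { "owner", 120, 120 };
	struct toy_thread waiter = { "waiter", 80, 80 };

	lend_priority(&waiter, &owner);
	printf("%s now runs at prio %d (base %d)\n",
	    owner.name, owner.prio, owner.base_prio);
	return (0);
}
```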

127

Specifically, a thread which handles interrupts. Currently this is either an ithread or a softclock thread; the latter is scheduled from a timer interrupt, so it is morally the same.

Other kernel worker threads, like taskqueues, just sleep.

170

Still unfinished. :)

It is probably worth explaining that even for involuntary context switches, the switch is performed by the thread being switched out. There is no external deity that swaps contexts.
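
A userspace analogy using swapcontext(3) makes this concrete: each context executes its own switch, on its own stack (worker() and the context names are purely illustrative):

```c
/*
 * Each context calls swapcontext() itself; nothing external ever
 * swaps a context out on its behalf.
 */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, thr_ctx;

static void
worker(void)
{
	printf("worker: switching myself back to main\n");
	swapcontext(&thr_ctx, &main_ctx);	/* I perform my own switch */
	printf("worker: resumed, finishing\n");
}

int
main(void)
{
	static char stack[64 * 1024];

	getcontext(&thr_ctx);
	thr_ctx.uc_stack.ss_sp = stack;
	thr_ctx.uc_stack.ss_size = sizeof(stack);
	thr_ctx.uc_link = &main_ctx;	/* return here when worker ends */
	makecontext(&thr_ctx, worker, 0);

	swapcontext(&main_ctx, &thr_ctx);	/* main switches itself out */
	printf("main: back, resuming worker\n");
	swapcontext(&main_ctx, &thr_ctx);
	printf("main: done\n");
	return (0);
}
```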

share/man/man9/mi_switch.9
76–87

This is a strange description for a modern (preemptive) kernel. There is indeed the TDA_SCHED AST, but the whole case of trap and signal handling is not too special compared with other reasons for sleeping.

I think it is better to drop this whole cause or replace it with TDA_SCHED.

Handle review comments. SWT_NONE is gone, SWT_BIND added.

mhorne added inline comments.
share/man/man9/mi_switch.9
170

This bit could probably still use more details; I am finding it difficult to understand exactly how it fits together.

In general (for ULE), threads will receive the mutex owned by the thread queue (tdq) they are on, is that right? There is a special blocked_lock mechanism which seems to be necessary to prevent another CPU from switching into a thread while it is being switched out.

I think overall the thread lock is a little confusing to me, since it gets swapped out in a couple of different places within the kernel.

173

I really have no idea exactly where/how the unlock happens...

174

This is more accurate terminology from the D&I book.

share/man/man9/mi_switch.9
173

This is both confusing and not true.

It is confusing because the sched_switch() function returns when the thread is again put on a CPU.
It is not true because there is no single 'the thread lock'; the lock serving as the thread lock points at the thread's current container (I do not think we have settled terminology for this). For running/runnable threads it is the runq lock, for sleeping threads the sleepqueue spin lock, for blocked threads the turnstile lock, and so on.

There is a special 'unlockable' lock called the blocked lock. It serves as the thread lock during the actual context switch (i.e. cpu_switch()). So what happens is that a thread entering sched_switch() gets its current thread lock replaced by the blocked lock. The old thread lock is saved in the mtx local variable. If mtx is not the runq lock, it is simply unlocked. If it is the runq lock, some more code is executed until the runq (tdq) is unlocked.

Only after that is cpu_switch() called, which replaces the outgoing thread's blocked lock with the previous lock.
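
A toy restatement of that sequence in C (struct mtx here is a stand-in, and toy_sched_switch()/toy_cpu_switch() are simplifications of the real sched_switch()/cpu_switch(), not the kernel code):

```c
#include <stdio.h>

/* Toy stand-in for struct mtx; not a real lock. */
struct mtx { const char *name; };

static struct mtx blocked_lock = { "blocked lock" };
static struct mtx tdq_lock = { "tdq lock" };

struct thread {
	const char *name;
	struct mtx *td_lock;
};

static void
mtx_unlock_spin(struct mtx *m)
{
	printf("unlock %s\n", m->name);
}

/* Stand-in for cpu_switch(): hands 'mtx' back to the outgoing thread. */
static void
toy_cpu_switch(struct thread *old, struct thread *new, struct mtx *mtx)
{
	(void)new;
	old->td_lock = mtx;	/* replace blocked_lock with the saved lock */
	printf("%s: td_lock restored to %s\n", old->name, mtx->name);
}

static void
toy_sched_switch(struct thread *td, struct thread *newtd)
{
	struct mtx *mtx;

	mtx = td->td_lock;		/* save the current thread lock */
	td->td_lock = &blocked_lock;	/* install the sentinel */

	if (mtx != &tdq_lock)
		mtx_unlock_spin(mtx);	/* not the runq lock: drop it now */
	/* else: more scheduler work runs before the tdq is unlocked */

	toy_cpu_switch(td, newtd, mtx);
}

int
main(void)
{
	struct mtx sleepq_lock = { "sleepq lock" };
	struct thread a = { "a", &sleepq_lock }, b = { "b", &tdq_lock };

	toy_sched_switch(&a, &b);
	return (0);
}
```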

174

This is the first time I have seen the term in a FreeBSD context.

share/man/man9/mi_switch.9
170

Yes, the thread lock is magical.

You're right, in ULE, a thread running on a CPU has its thread lock pointing to the CPU's runqueue lock. When a thread is preparing to yield the CPU, usually to go to sleep waiting for a resource, its thread lock may be switched out for that of some data structure tracking the thread (a sleepqueue lock or a turnstile lock, for example). Then the thread can yield while still holding the data structure mutex, which makes it easier to avoid lost wakeups.

blocked_lock is used to represent a special state where nothing is allowed to lock the thread because it is switching off-CPU and its state is volatile. An attempt to lock a thread in this state just spins until the thread has finished switching.
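
A sketch of how such a sentinel can work, assuming toy primitives (toy_mtx, toy_thread_lock(), and friends are hypothetical; the kernel's real primitives are spin mutexes): locking a thread means acquiring whatever its td_lock currently points at, then re-checking the pointer.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_mtx { atomic_flag held; };

/* The sentinel: a lock that can never be acquired. */
static struct toy_mtx blocked_lock = { ATOMIC_FLAG_INIT };

struct thread {
	struct toy_mtx *_Atomic td_lock;
};

static bool
toy_try_lock(struct toy_mtx *m)
{
	if (m == &blocked_lock)
		return (false);	/* a switching thread cannot be locked */
	return (!atomic_flag_test_and_set(&m->held));
}

static void
toy_unlock(struct toy_mtx *m)
{
	atomic_flag_clear(&m->held);
}

/*
 * Lock a thread: take whatever td_lock points at, then verify the
 * pointer did not move underneath us (e.g. to blocked_lock while the
 * thread is mid-switch); if it did, drop the lock and retry.
 */
static struct toy_mtx *
toy_thread_lock(struct thread *td)
{
	struct toy_mtx *m;

	for (;;) {
		m = atomic_load(&td->td_lock);
		if (!toy_try_lock(m))
			continue;	/* spin until lockable */
		if (m == atomic_load(&td->td_lock))
			return (m);	/* pointer stable: we own it */
		toy_unlock(m);		/* retargeted meanwhile: retry */
	}
}

int
main(void)
{
	struct toy_mtx runq = { ATOMIC_FLAG_INIT };
	struct thread td = { &runq };

	toy_unlock(toy_thread_lock(&td));	/* lock, then release */
	return (0);
}
```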

Handle this round of feedback.

mhorne added inline comments.
share/man/man9/mi_switch.9
170

Thanks. With this and @kib's comment above, the picture is much clearer.

Is it for memory-efficiency reasons that the thread lock is not just a mutex embedded in struct thread? Or is it somehow beneficial/required to be able to lock all threads on the runq/sleepq/turnstile at once?

173

Illuminating, thank you.

174

Maybe I am too eager then; restored.

share/man/man9/mi_switch.9
170

The thread lock does not lock the thread, but the thread's scheduling information. For 4BSD it is physically the global sched lock. This organization of the scheduling locking is what provides fine-grained locking for scheduler data.

The thread lock is there to protect the container (see above). For instance, when a thread is being woken up from sleep, we need to lock the corresponding sleepqueue lock, both in wakeup() and when the thread is removed from the sleepq; the same goes for turnstiles. We need the tdq lock to recalculate load and reschedule threads to a less busy tdq.

I believe Jeff found this scheme in Solaris; the Solaris Internals book could give a different perspective on it.
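
A minimal sketch of retargeting a thread's lock when it moves between containers (toy_thread_lock_set() is a hypothetical name; the toy locks just print):

```c
#include <stdio.h>

struct toy_mtx { const char *name; };
struct thread { struct toy_mtx *td_lock; };

static void toy_lock(struct toy_mtx *m)   { printf("lock %s\n", m->name); }
static void toy_unlock(struct toy_mtx *m) { printf("unlock %s\n", m->name); }

/*
 * Caller holds the thread's current (old container) lock.  Point the
 * thread at the new container's lock and release the old one; anyone
 * spinning on the thread re-reads td_lock and chases the new lock.
 */
static void
toy_thread_lock_set(struct thread *td, struct toy_mtx *new)
{
	struct toy_mtx *old;

	toy_lock(new);
	old = td->td_lock;
	td->td_lock = new;	/* retarget: e.g. runq -> sleepq */
	toy_unlock(old);
}

int
main(void)
{
	struct toy_mtx runq = { "runq" }, sleepq = { "sleepq" };
	struct thread td = { &runq };

	toy_lock(&runq);	/* acquire the current thread lock */
	toy_thread_lock_set(&td, &sleepq);
	toy_unlock(&sleepq);
	return (0);
}
```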

share/man/man9/mi_switch.9
159–160

Note that this is a return on a different stack, so it is not a return from the call that was mentioned above. I am not sure that 'exit from cpu_switch' is a good term there.

Maybe something like: 'After that, the CPU's instruction flow continues executing in the incoming thread's context, returning from the call to cpu_switch() done by the newly scheduled thread'.
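
A comment-only pseudo-C summary of cpu_switch()'s duties (sketch_cpu_switch() is illustrative; the real routine is machine-dependent assembly):

```c
struct pcb { void *saved_sp; /* plus callee-saved registers, etc. */ };
struct mtx;
struct thread {
	struct pcb *td_pcb;	/* per-thread machine state */
	struct mtx *td_lock;
};

void
sketch_cpu_switch(struct thread *old, struct thread *new, struct mtx *mtx)
{
	/* 1. Save callee-saved registers and the stack pointer into
	 *    old->td_pcb. */
	/* 2. Store mtx into old->td_lock, replacing the blocked lock
	 *    so 'old' becomes lockable again. */
	old->td_lock = mtx;
	/* 3. Load state from new->td_pcb: from here on, execution is
	 *    on the incoming thread's stack. */
	(void)new->td_pcb;
	/*
	 * 4. "Return": control resumes at the call to cpu_switch()
	 *    that the incoming thread itself made when it was switched
	 *    out, not at this invocation's call site.
	 */
}
```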

178

It might be useful to say that the pcb is pointed to by td->td_pcb.

Mention td_pcb pointer. Describe what it means to "return" from cpu_switch() more clearly.

This revision is now accepted and ready to land. Feb 8 2023, 12:29 AM
This revision was automatically updated to reflect the committed changes.