Update CPU numbers for exiting and newborn threads.
Closed, Public

Authored by jhb on Jul 24 2015, 10:17 PM.

Details

Summary

kgdb uses td_oncpu to decide whether a thread is running, in which case it
should use a PCB from stoppcbs[] rather than the thread's own PCB. However,
exited threads retained the td_oncpu value from the last time they ran, and
newborn threads had their CPU fields cleared to zero during fork and thread
creation, because those fields are in the set zeroed when threads are set up.
Either way, a thread that is not running can appear to be on a CPU. To fix
this, explicitly update the CPU fields for exiting threads in sched_throw()
to reflect the switch out, and reset the CPU fields for new threads to NOCPU
in sched_fork_thread().
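
The problem and the fix described above can be sketched as a small userland model. This is not the committed kernel change: the `model_` names, the `MODEL_NOCPU` value, and the exact assignments are assumptions made for illustration. Only the field names td_oncpu and td_lastcpu and the function names sched_throw() and sched_fork_thread() come from this revision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel; the kernel's real NOCPU value is not assumed here. */
#define MODEL_NOCPU (-1)

/* Simplified stand-in for the CPU fields of struct thread. */
struct model_thread {
	int td_lastcpu;	/* CPU the thread last ran on */
	int td_oncpu;	/* CPU the thread is on now, or MODEL_NOCPU */
};

/* Models thread setup: the CPU fields sit in the zeroed region, so they
 * end up as 0, which is also a valid CPU number. */
static void
model_thread_zero(struct model_thread *td)
{
	assert(td != NULL);
	memset(td, 0, sizeof(*td));
}

/* Models the fix for new threads: a thread that has never run is on no CPU. */
static void
model_sched_fork_thread(struct model_thread *child)
{
	assert(child != NULL);
	child->td_lastcpu = MODEL_NOCPU;
	child->td_oncpu = MODEL_NOCPU;
}

/* Models the fix for exiting threads: record the switch out. */
static void
model_sched_throw(struct model_thread *td)
{
	assert(td != NULL);
	td->td_lastcpu = td->td_oncpu;
	td->td_oncpu = MODEL_NOCPU;
}

/* Models the debugger's decision: a thread that claims to be on a CPU
 * gets the per-CPU stop PCB instead of its own saved PCB. */
static bool
model_use_stoppcb(const struct model_thread *td)
{
	return (td->td_oncpu != MODEL_NOCPU);
}
```

In this model, a thread that has only been through model_thread_zero() makes model_use_stoppcb() return true, which mirrors the wrong stack traces for never-run threads. After model_sched_fork_thread() or model_sched_throw(), it returns false.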

Test Plan
  • Run kgdb against a core where some threads have never run after being created. One case is the IRQ1 ithread in a bhyve VM.

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

jhb retitled this revision to "Update CPU numbers for exiting and newborn threads."
jhb updated this object.
jhb edited the test plan for this revision.
jhb added a reviewer: kib.
sys/kern/sched_ule.c
2084 ↗(On Diff #7279)

Should the td_oncpu and td_lastcpu fields be removed from the td_zero region?

2709 ↗(On Diff #7279)

This still leaves a race where td_oncpu is NOCPU for a still-running thread until cpu_throw() has switched the context. Is it important?

sys/kern/sched_ule.c
2084 ↗(On Diff #7279)

Probably. I can do a followup for HEAD that would do that.

2709 ↗(On Diff #7279)

I can't think of a good way to make this perfect. kgdb used to use TDS_RUNNING to test for this, but that was removed in r275644. I had suggested to Dmitry that he drop the TDS_RUNNING test entirely. Perhaps we could put that back. However, in thread_exit() we set TDS_INACTIVE before sched_throw() is even called, so this change has a narrower race than if we re-add the TDS_RUNNING test.

Of course, that won't fix cores from older kernels. OTOH, the way this breakage manifests is you get what should be an obviously-wrong stack trace for a thread that has never run.
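
The race comparison in this reply can be illustrated with a toy model of the exit path. The phase names and helper functions below are hypothetical, and the ordering (TDS_INACTIVE set in thread_exit(), then td_oncpu cleared in sched_throw(), then the context switch in cpu_throw()) is taken from the discussion above rather than from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phases of an exiting thread, in order. */
enum exit_phase {
	PHASE_RUNNING,		/* before thread_exit() changes the state */
	PHASE_INACTIVE_SET,	/* TDS_INACTIVE set, sched_throw() not yet called */
	PHASE_ONCPU_CLEARED,	/* sched_throw() has set td_oncpu to NOCPU */
	PHASE_SWITCHED,		/* cpu_throw() has switched the context */
	PHASE_COUNT
};

/* Ground truth in this model: the thread occupies the CPU until the switch. */
static bool
on_cpu_truth(enum exit_phase p)
{
	return (p < PHASE_SWITCHED);
}

/* What a TDS_RUNNING-style test would report. */
static bool
state_test(enum exit_phase p)
{
	return (p < PHASE_INACTIVE_SET);
}

/* What a td_oncpu != NOCPU test would report. */
static bool
oncpu_test(enum exit_phase p)
{
	return (p < PHASE_ONCPU_CLEARED);
}

/* Count the phases in which a test disagrees with the ground truth. */
static int
race_window(bool (*test)(enum exit_phase))
{
	int n = 0;

	assert(test != NULL);
	for (int p = PHASE_RUNNING; p < PHASE_COUNT; p++) {
		if (test((enum exit_phase)p) != on_cpu_truth((enum exit_phase)p))
			n++;
	}
	return (n);
}
```

In this model the td_oncpu test disagrees with the ground truth in one phase and the state-based test in two, which matches the claim that the td_oncpu change leaves the narrower window. Neither test closes the window entirely.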

kib edited edge metadata.
This revision is now accepted and ready to land. (Aug 1 2015, 9:08 PM)
This revision was automatically updated to reflect the committed changes.