Several fixes resulted from the ptrace(2) fuzzing by pho
ClosedPublic
Actions

Authored by kib on Oct 15 2015, 7:16 PM.

Details

Reviewers

jhb
pho

Commits

rS289664: Trim spaces at end of line to record the proper commit message for
rS289661: Mark struct thread zone as type-stable.
rS289660: Reviewed by: jhb, pho

Summary

Two changes.

Do not allow ptrace(PT_TRACE_ME) if parent would not trace us. I.e. when the parent is init. Not init can be killed and then trace state of the process is cleared, while this does not work for pid 1 parent.

Do not ever suspend if thread_suspend_check is called with return_instead == 1. Suspending there causes us stopping while upper levels are not ready for the thread suspension.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

kib updated this revision to Diff 9430.Oct 15 2015, 7:16 PM

kib retitled this revision from to Several fixes resulted from the ptrace(2) fuzzing by pho.

kib updated this object.

kib edited the test plan for this revision. (Show Details)

kib added a reviewer: jhb.

kib set the repository for this revision to rS FreeBSD src repository - subversion.

kib added a subscriber: pho.

Herald added a subscriber: imp. · View Herald TranscriptOct 15 2015, 7:16 PM

Do not free struct thread ever, the locks spins (e.g. mutex) depend on the struct thread not being ever unmapped.

emaste added a subscriber: emaste.Oct 19 2015, 4:35 PM

jhb added inline comments.Oct 19 2015, 9:26 PM

sys/kern/kern_thread.c
284	I think this is fine. Alternatively we could change the adaptive spinning code to grab td_oncpu and wait for pc_curthread of the relevant CPU to change instead? That would be a larger change though.
939	I need think about this one more, so maybe split this out for now? (Also since it seems it causes issues in Peter's tests?)
sys/kern/sys_process.c
755	My only question here is if init is the only special process? What about init processes inside of a jail for instance? Should this be part of being a "reaper" or should there perhaps be an additional procctl(2) operation to forbid TRACE_ME of descendants?

kib added inline comments.Oct 19 2015, 9:41 PM

sys/kern/kern_thread.c
284	This is interesting idea, but it would require larger changes and inspection of all locking primitives. I will try to evaluate this. Issue with marking struct thread type-stable is that we would end with a huge consumer of the memory and KVA after some loads which created and then exited a lot of threads. I asked Peter to test the scenario.
939	Sure, this chunk requires more work. I do not intend to commit it until the issue is understood. I left it there to show the original problem, along other problems uncovered by the Peter' fuzzing.
sys/kern/sys_process.c
755	We do not have init in the jails. Difference between real init with the pid 1, and other parents, possibly reapers, is that real init should never die. If the process parent dies, then the children' P_TRACED flag is cleared, and a self-inflicted state from ptrace(PT_TRACE_ME) is cleared as well, allowing the process to be killed. In other words, if the parent is not init, then we can kill the process which executed ptrace(PT_TRACE_ME), by first killing the parent. We cannot kill such process in any way if its parent is init.

kib added inline comments.Oct 19 2015, 9:46 PM

sys/kern/kern_thread.c
284	Hm, do you mean, accessing td_oncpu of the owner thread ? If yes, and it seems that we need owner->td_oncpu indeed, this would have the same issue as the current access to the owner->td_state.

jhb added inline comments.Oct 20 2015, 5:36 PM

sys/kern/kern_thread.c
284	The window would at least be narrower since you would only dereference the thread pointer at the start of each spinning loop rather than on each iteration. Something like: owner = lock_owner(); if (owner->td_state == TDS_RUNNING) { pc = pcpu_find(owner->td_oncpu); while (pc->pc_curthread == owner) cpu_spinwait(); } I suppose one way we could make this "safer" would be to only spin after setting a contested flag. I had avoided that originally so that the holder could do a cheap unlock when there were only spinning waiters. However, if you set the contested flag and read td_oncpu under the turnstile chain lock (but dropped the lock while you spun) that would be safe (albeit more expensive) as the thread would need to grab the turnstile chain lock to clear the contested flag. (It would also have to now handle the case that there might not be any sleeping threads when the contested flag was set.) I'm not sure if the overhead from that would be worse than making threads NOFREE though.
sys/kern/sys_process.c
755	Ok.

kib added inline comments.Oct 20 2015, 7:24 PM

sys/kern/kern_thread.c
284	IMO taking turnstile or sleepq chain lock in the spin loop would nullify any benefits of the spinning. My unserstanding is that spinning (by pretending that a blockable/sleepable lock is the spinlock until the owning thread is running) helps because it avoids double-enqueueing pattern. If we start obtaining turnstile spinlock even to spin, we in fact move the mutex implementation to double-queue, which is not really helpful, as was demonstrated by converting sx from double-enqueue to current atomic lock. Anyway, since Peter testing demonstrated that nothing awful happens with NOFREE struct thread zone, I want to commit the NOFREE and sys_process.c chunks now, since they fix real bugs. The return_instead still requires more digging.

jhb added inline comments.Oct 20 2015, 7:47 PM

sys/kern/kern_thread.c
284	Ok, sounds good. I think NOFREE is also preferable to pessimizing adaptive spinning. There is an old PR from ups@ about the thread reuse race that can possibly be closed when this is committed. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=65448