Paths

Table of Contentst

Differential D20362

procdesc: allow to collect status through wait(1) if process is traced
ClosedPublic
Actions

Authored by oshogbo on May 22 2019, 7:46 PM.

Details

Reviewers

markj
emaste
kib
jhb
rwatson

Commits

rS355097: procdesc: allow to collect status through wait(1) if process is traced

Summary

The debugger like truss(1) depends on the wait(2) syscall. This syscall
waits for ALL children. When it is waiting for ALL child's the children
created by process descriptors are not returned. This behavior was
introduced because we want to implement libraries which may pdfork(1).

The behavior of process descriptor brakes truss(1) because it will
not be able to collect the status of processes with process descriptors.

To address this problem the status is returned to parent when the
child is traced. While the process is traced the debugger is the new parent.
In case the original parent and debugger are the same process it means the
debugger explicitly used pdfork() to create the child. In that case the debugger
should be using kqueue()/pdwait() instead of wait().

Add test case to verify that. The test case was implemented by markj@.

Diff Detail

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 24480
Build 23283: arc lint + arc unit

Event Timeline

oshogbo created this revision.May 22 2019, 7:46 PM

Herald added a subscriber: emaste. · View Herald TranscriptMay 22 2019, 7:46 PM

Harbormaster completed remote builds in B24406: Diff 57717.May 22 2019, 7:46 PM

oshogbo added reviewers: markj, emaste, kib.May 22 2019, 7:47 PM

oshogbo added a reviewer: jhb.

kib added inline comments.May 22 2019, 8:46 PM

sys/kern/kern_exit.c
1005	Wouldn't this cause return of the process status through wait(2) to parent (and not a debugger) if the process was already reparented back from the debugger to the original parent ?

wait(2) and pdfork(2)

oshogbo added inline comments.May 22 2019, 8:56 PM

sys/kern/kern_exit.c
1005	The P_TRACED and P2_EXIT_TRACED flag would not be set if the child was returned.

oshogbo marked an inline comment as not done.May 22 2019, 9:06 PM

oshogbo added inline comments.

sys/kern/kern_exit.c
1005	Sorry maybe I respond to quickly. Could you clarify a little bit more? Is this possible that we have attached a debugger and we reparented back to the parent? I didn't find such scenario in the code. If we detach the debugger we are removing the P_TRACED flag.

oshogbo added a reviewer: rwatson.May 22 2019, 9:37 PM

kib added inline comments.May 22 2019, 9:37 PM

sys/kern/kern_exit.c
1005	Process is traced and exiting, which sets it P2_EXIT_TRACED flag. Debugger exits as well, which allows re-parenting of the child to the original parent. Then you get everything set up for returning status in wait(2) despite p_procdesc != NULL. Let me state it in other way: P2_EXIT_TRACED is not too different from not clearing P_TRACED at all.

oshogbo added inline comments.May 22 2019, 10:05 PM

sys/kern/kern_exit.c
1005	Yea you are right. I introduce a new flag to different our selves from the P_TRACED and not add our selves to the orphan list. I guess we should remove P2_EXIT_TRACED on re-parenting or detaching. Or those you see another way of solving this?

Clear the P2_EXIT_TRACED after reparenting.

Harbormaster completed remote builds in B24480: Diff 57853.May 24 2019, 6:40 PM

kib added inline comments.May 25 2019, 2:43 PM

sys/kern/kern_exit.c
1005	I wonder do you need P2_EXIT_TRACED flag. Isn't the condition where P_TRACED or P2_EXIT_TRACED are set equivalent to the orphaned state of the child ?

oshogbo added inline comments.May 27 2019, 3:02 PM

sys/kern/kern_exit.c
1005	I'm not sure. I think if the debugger is also a parent then we are not orphaned. But maybe I miss understand some code.

kib added inline comments.May 27 2019, 7:52 PM

sys/kern/kern_exit.c
1005	You can check if parent == oparent additionally ?

oshogbo added inline comments.May 27 2019, 8:08 PM

sys/kern/kern_exit.c
1005	When we are parent and debugger then, I think, I wouldn't be still able to fetch last status when process exited and the P_TRACED flag was removed. If I would like to leave P_TRACED flag then I need to figure out how to inform kern_wait6 not to add it to the orphan list.

markj added inline comments.Jun 3 2019, 7:48 PM

sys/kern/kern_exit.c
1005	Could we change ptrace() to add the child to the parent's orphan list even when the original parent is the debugger?

oshogbo added inline comments.Jun 9 2019, 10:47 PM

sys/kern/kern_exit.c
1005	I'm not sure about this approach. We may try to kill process twice first in the child list then in orphan lists. We can skip that if p_pid == q_oppid . Im not sure if this intuitive that the child would land on the orphan list in this situation.

markj added inline comments.Jun 10 2019, 4:03 PM

sys/kern/kern_exit.c
1005	Sorry, I'm not sure what you mean. exit() will send SIGKILL to children of the debugger, not orphans, but I guess you are referring to something else. IMO the "orphan" terminology is already non-intuitive: an orphan's parent may still be alive. Without having tried to implement it myself, I think having the debuggee always live on an orphan list is more consistent and easier to reason about.

jhb added inline comments.Jun 10 2019, 6:23 PM

sys/kern/kern_exit.c
1005	I have long wanted a 'p_debugees' list and a 'p_debugger' backpointer that was used when a debugger was detached and to not change the actual parent at all. I have had partial patches to do this in the past. The last time I brought this up when getppid was fixed to not report debuggers, kib@ convinced me that the new approach wouldn't really be any cleaner (you need workarounds to avoid reporting the status of a child with p_debugger != NULL to the "real" parent for example), but I still think it would probably be easier to reason about. Not sure if that is what you are asking about here? In case the parent is the debugger meaning the debugger explicitly used pdfork() to create the child, it seems like the debugger should be using pdwait() instead of wait(). OTOH, if the child process then uses pdfork() the debugger is only going to get pids back for that event unless we add additional changes (e.g. a new trace op) to get back a procdesc for the new process. I don't really think we want to target the case of a debugger using pdfork() to create debugees currently as that needs a lot more work.

markj added inline comments.Jun 17 2019, 6:11 PM

sys/kern/kern_exit.c
1005	Right, the problem is to recognize the case where the parent is also the debugger. I think you're right that we can not worry about the case where the debugger used pdfork() to create the child. The main motivation here is to fix the case where some existing debugger (truss, gdb, etc.) is pointed at a pdfork() child. If we make that simplifying assumption, then it should be sufficient to check: if (p->p_procdesc == NULL \|\| ((p->p_treeflag & P_TREE_ORPHANED) != 0 && p->p_pptr == td->td_proc) and maybe assert `p->p_oppid != td->td_proc->p_pid` in the latter case.

If the process is created by pdfork the final status have to be
collected through process descriptor.

Harbormaster completed remote builds in B25701: Diff 60500.Aug 6 2019, 10:20 AM

markj added inline comments.Aug 6 2019, 2:45 PM

tests/sys/kern/ptrace_test.c
3964	Won't this test fail?

I think the wait or pdfork man pages should clarify this behaviour, but the code changes look ok to me.

tests/sys/kern/ptrace_test.c
3964	Sorry, I was going off of the description, not what the test actually does. The comment should be updated: processes created this way are not visible to waitid(P_ALL).

This revision is now accepted and ready to land.Aug 7 2019, 8:31 PM

Can you rewrite the review summary according to the new code ? I want to see the commit message for the change.
Looking at proc_to_reap() and you initial note about kern_wait6(), I wonder should we check for p_procdesc for all idtype cases in proc_to_reap() ? In other words, isn't there somewhat larger problem with kern_wait6() reaping procdesc zombies at all ?

I'm not sure. If the process used explicit pid in wait shouldn't we allow to trace it?

In D20362#482644, @oshogbo wrote:

I'm not sure. If the process used explicit pid in wait shouldn't we allow to trace it?

So even despite there is a procdesc, if you pass the pid explicitly, it is returned, ok. But what If the caller specified e.g. P_GID, are we then fine that kern_wait() might select a procdesc-process among the members of the process group ? Is it fine, if yes, then how is this situation different from specifying a wildcard for wait(2) ? Same question for all other namespace identifiers for waitid().

kib accepted this revision.Oct 20 2019, 7:29 PM

@oshogbo did you see that Ryan reported that he is still seeing a problem even with this patch? https://lists.freebsd.org/pipermail/freebsd-current/2019-September/074316.html

Closed by commit rS355097: procdesc: allow to collect status through wait(1) if process is traced (authored by oshogbo). · Explain WhyNov 25 2019, 6:33 PM

This revision was automatically updated to reflect the committed changes.

oshogbo added a commit: rS355097: procdesc: allow to collect status through wait(1) if process is traced.

Herald added a subscriber: imp. · View Herald TranscriptNov 25 2019, 6:33 PM