Page MenuHomeFreeBSD

procdesc: allow to collect status through wait(1) if process is traced
ClosedPublic

Authored by oshogbo on May 22 2019, 7:46 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 21, 9:17 AM
Unknown Object (File)
Sat, Jan 18, 5:29 PM
Unknown Object (File)
Sun, Jan 12, 1:29 AM
Unknown Object (File)
Tue, Dec 31, 2:01 PM
Unknown Object (File)
Dec 22 2024, 4:00 PM
Unknown Object (File)
Dec 22 2024, 9:54 AM
Unknown Object (File)
Dec 5 2024, 3:46 AM
Unknown Object (File)
Dec 4 2024, 4:44 PM
Subscribers

Details

Summary

The debugger like truss(1) depends on the wait(2) syscall. This syscall
waits for ALL children. When it is waiting for ALL child's the children
created by process descriptors are not returned. This behavior was
introduced because we want to implement libraries which may pdfork(1).

The behavior of process descriptor brakes truss(1) because it will
not be able to collect the status of processes with process descriptors.

To address this problem the status is returned to parent when the
child is traced. While the process is traced the debugger is the new parent.
In case the original parent and debugger are the same process it means the
debugger explicitly used pdfork() to create the child. In that case the debugger
should be using kqueue()/pdwait() instead of wait().

Add test case to verify that. The test case was implemented by markj@.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/kern/kern_exit.c
1005 ↗(On Diff #57717)

Wouldn't this cause return of the process status through wait(2) to parent (and not a debugger) if the process was already reparented back from the debugger to the original parent ?

sys/kern/kern_exit.c
1005 ↗(On Diff #57717)

The P_TRACED and P2_EXIT_TRACED flag would not be set if the child was returned.

oshogbo marked an inline comment as not done.May 22 2019, 9:06 PM
oshogbo added inline comments.
sys/kern/kern_exit.c
1005 ↗(On Diff #57717)

Sorry maybe I respond to quickly. Could you clarify a little bit more?

Is this possible that we have attached a debugger and we reparented back to the parent? I didn't find such scenario in the code.
If we detach the debugger we are removing the P_TRACED flag.

sys/kern/kern_exit.c
1005 ↗(On Diff #57717)

Process is traced and exiting, which sets it P2_EXIT_TRACED flag. Debugger exits as well, which allows re-parenting of the child to the original parent. Then you get everything set up for returning status in wait(2) despite p_procdesc != NULL.

Let me state it in other way: P2_EXIT_TRACED is not too different from not clearing P_TRACED at all.

sys/kern/kern_exit.c
1005 ↗(On Diff #57717)

Yea you are right. I introduce a new flag to different our selves from the P_TRACED and not add our selves to the orphan list.

I guess we should remove P2_EXIT_TRACED on re-parenting or detaching.
Or those you see another way of solving this?

Clear the P2_EXIT_TRACED after reparenting.

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

I wonder do you need P2_EXIT_TRACED flag. Isn't the condition where P_TRACED or P2_EXIT_TRACED are set equivalent to the orphaned state of the child ?

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

I'm not sure.
I think if the debugger is also a parent then we are not orphaned.
But maybe I miss understand some code.

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

You can check if parent == oparent additionally ?

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

When we are parent and debugger then, I think, I wouldn't be still able to fetch last status when process exited and the P_TRACED flag was removed.

If I would like to leave P_TRACED flag then I need to figure out how to inform kern_wait6 not to add it to the orphan list.

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

Could we change ptrace() to add the child to the parent's orphan list even when the original parent is the debugger?

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

I'm not sure about this approach. We may try to kill process twice first in the child list then in orphan lists. We can skip that if p_pid == q_oppid .
Im not sure if this intuitive that the child would land on the orphan list in this situation.

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

Sorry, I'm not sure what you mean. exit() will send SIGKILL to children of the debugger, not orphans, but I guess you are referring to something else.

IMO the "orphan" terminology is already non-intuitive: an orphan's parent may still be alive. Without having tried to implement it myself, I think having the debuggee always live on an orphan list is more consistent and easier to reason about.

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

I have long wanted a 'p_debugees' list and a 'p_debugger' backpointer that was used when a debugger was detached and to not change the actual parent at all. I have had partial patches to do this in the past. The last time I brought this up when getppid was fixed to not report debuggers, kib@ convinced me that the new approach wouldn't really be any cleaner (you need workarounds to avoid reporting the status of a child with p_debugger != NULL to the "real" parent for example), but I still think it would probably be easier to reason about. Not sure if that is what you are asking about here?

In case the parent is the debugger meaning the debugger explicitly used pdfork() to create the child, it seems like the debugger should be using pdwait() instead of wait(). OTOH, if the child process then uses pdfork() the debugger is only going to get pids back for that event unless we add additional changes (e.g. a new trace op) to get back a procdesc for the new process. I don't really think we want to target the case of a debugger using pdfork() to create debugees currently as that needs a lot more work.

sys/kern/kern_exit.c
1005 ↗(On Diff #57853)

Right, the problem is to recognize the case where the parent is also the debugger.

I think you're right that we can not worry about the case where the debugger used pdfork() to create the child. The main motivation here is to fix the case where some existing debugger (truss, gdb, etc.) is pointed at a pdfork() child. If we make that simplifying assumption, then it should be sufficient to check:

if (p->p_procdesc == NULL || ((p->p_treeflag & P_TREE_ORPHANED) != 0 && p->p_pptr == td->td_proc)

and maybe assert p->p_oppid != td->td_proc->p_pid in the latter case.

If the process is created by pdfork the final status have to be
collected through process descriptor.

tests/sys/kern/ptrace_test.c
4135 ↗(On Diff #60500)

Won't this test fail?

I think the wait or pdfork man pages should clarify this behaviour, but the code changes look ok to me.

tests/sys/kern/ptrace_test.c
4135 ↗(On Diff #60500)

Sorry, I was going off of the description, not what the test actually does. The comment should be updated: processes created this way are *not* visible to waitid(P_ALL).

This revision is now accepted and ready to land.Aug 7 2019, 8:31 PM
  1. Can you rewrite the review summary according to the new code ? I want to see the commit message for the change.
  2. Looking at proc_to_reap() and you initial note about kern_wait6(), I wonder should we check for p_procdesc for all idtype cases in proc_to_reap() ? In other words, isn't there somewhat larger problem with kern_wait6() reaping procdesc zombies at all ?
  1. I'm not sure. If the process used explicit pid in wait shouldn't we allow to trace it?
  1. I'm not sure. If the process used explicit pid in wait shouldn't we allow to trace it?

So even despite there is a procdesc, if you pass the pid explicitly, it is returned, ok. But what If the caller specified e.g. P_GID, are we then fine that kern_wait() might select a procdesc-process among the members of the process group ? Is it fine, if yes, then how is this situation different from specifying a wildcard for wait(2) ? Same question for all other namespace identifiers for waitid().

@oshogbo did you see that Ryan reported that he is still seeing a problem even with this patch? https://lists.freebsd.org/pipermail/freebsd-current/2019-September/074316.html