Page MenuHomeFreeBSD

sys: add pdopenpid(2)
Needs ReviewPublic

Authored by kib on May 20 2026, 2:25 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Jun 21, 5:19 AM
Unknown Object (File)
Sun, Jun 21, 5:15 AM
Unknown Object (File)
Sat, Jun 20, 2:25 AM
Unknown Object (File)
Sat, Jun 20, 2:17 AM
Unknown Object (File)
Sat, Jun 20, 2:09 AM
Unknown Object (File)
Thu, Jun 18, 7:26 PM
Unknown Object (File)
Wed, Jun 17, 10:56 PM
Unknown Object (File)
Wed, Jun 17, 10:15 PM

Details

Reviewers
brooks
markj

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.May 20 2026, 2:25 AM
kib added a subscriber: asomers.

Illustrative program

/* $Id: pdopenpid.c,v 1.1 2026/05/20 05:16:25 kostik Exp kostik $ */

#include <sys/types.h>
#include <sys/procdesc.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

#define MY_SYS_pdopenpid	603

static int
my_pdopenpid(pid_t pid, int flags)
{
	return (syscall(MY_SYS_pdopenpid, pid, flags));
}

int
main(void)
{
	struct __wrusage wu;
	struct __siginfo si;
	pid_t child, pp;
	int error, fd, status;

	child = fork();
	if (child == -1)
		err(1, "fork");
	if (child == 0) {
		sleep(1);
		_exit(0x123);
	}
	fd = my_pdopenpid(child, 0);
	if (fd == -1)
		err(1, "pdopenpid");
	error = pdgetpid(fd, &pp);
	if (error == -1)
		err(1, "pgdetpid");
	printf("child pid %d %d\n", child, pp);
	error = pdwait(fd, &status, WEXITED, &wu, &si);
	if (error == -1)
		err(1, "pdwait");
	printf("exited pid %d status %#x si.si_status %#x si.si_code %#x\n",
	    pp, status, si.si_status, si.si_code);
	close(fd);
}

Provide dedicated error values for cases when the target process is exiting, or when the procdesc is already allocated for it.

Allow to have more than one file referencing procdesc.
I am unable to find a broken situation, so lift the restriction.

Track number of open files for given procdesc. Drop PDF_CLOSED flag.

sys/kern/sys_procdesc.c
258

This function seems to be unused.

626

fp is left uninitialized in the error case.

631

Use PGET_NOTWEXIT?

640

What if pd->pd_fpcount == 0 before the increment? I think that means either the process is exiting (should be impossible due to the P_WEXIT check above; procdesc_exit() should assert that P_WEXIT is set), or it was forked with PD_DAEMON.

642

Aren't we leaking pdf in this case?

649

Don't you need to call procdesc_finit() even if p->p_procdesc != NULL?

660

How can pdf == NULL?

kib marked 6 inline comments as done.May 21 2026, 4:07 PM
kib added inline comments.
sys/kern/sys_procdesc.c
258

It is used by kern_fork.c.

631

I want specific error code for the case of exiting process.

640

Why would having pd_fpcount == 0 is significant?
Also, why PD_DAEMON would imply that? The only difference from PD_DAEMON I see in the code is that SIGKILL is not sent of any procdesc close.

kib marked 2 inline comments as done.

Handle review notes except pd_fpcount == 0/PDF_DAEMON case.

sys/kern/sys_procdesc.c
258

Oops, I thought it was a new function.

640

I was trying to convince myself that there is no race between this code and procdesc_exit(). I think the WEXIT check is enough, since that flag will also be set for zombies. I would add an assertion that WEXIT is set to procdesc_exit(), otherwise this part seems ok.

With respect to PD_DAEMON, I just had some misunderstanding about how procdesc_close() works.

If I understand this code correctly we should always have pd->pd_fpcount > 0 before the increment.

sys/security/audit/audit_bsm.c
1417

It looks like the first parameter should be 1, not 0, but I am not certain.

kib marked 4 inline comments as done.

Assert that P_WEXIT is set in procdesc_exit(), It is trivially true because procdesc_exit() is only called from exit1() after single-threading.
Assert that pd->pd_fcount > 0 when pd already found in kern_pdopenpid().
Fix the arg number in bsm.
Write some documentation for the interface. Try to explain the quirk of multiple pdopenpid()s,

sys/kern/sys_procdesc.c
405

Now it's possible to have multiple file descriptions which reference the same process. Then, any time any such file description is closed, we send SIGKILL to the target process, even if there is some other procdesc referencing the process. That seems wrong.

640

Actually, I think there is a race with procdesc_close(). It decrements fpcount, then acquires the proc lock. If that happens first, and it decrements pd_fpcount to 0, then this assertion will fail.

kib marked 2 inline comments as done.Mon, May 25, 3:36 PM
kib added inline comments.
sys/kern/sys_procdesc.c
405

Yes, this was intended. I thought that existing behavior of sending SIGKILL should be kept because current users of procdesc expect it. The code that opened procdesc does not know if there is another actor opened the same pid so it cannot know if its close() is last.

For now I changed it to your preference, but might be we need an additional flag besides PDF_DAEMON which would say 'only send SIGKILL on close if I am last close'. I am not sure.

640

I locked proctree around the pdopenpid, and changed the locking annotation for pd_fpcount.

kib marked 2 inline comments as done.

Only send SIGKILL on last close.
Lock proctree around pdopenpid() working on the process.
Restructure kern_pdopenpid() to avoid the mess of specific error return pathes.

sys/kern/sys_procdesc.c
405

Hmm, now I'm not sure what the right semantics are. It does make some sense to kill the target process any time a procdesc is closed. I believe the original intent behind the procdesc interface was to make it easy to create "private" processes from within a library, invisible to wait(2) and cleaned up automatically when fds go away. If another process can arbitrarily keep the private process alive, then it's not possible to provide any guarantees about the lifetime of the child. So now I think the original semantics might be better after all. Maybe it is worth asking on the mailing lists.

kib marked an inline comment as done.Mon, May 25, 5:51 PM
kib added inline comments.
sys/kern/sys_procdesc.c
405

I think we can instead somewhat improve PD_DAEMON.: make it local to file descriptor.

kib marked an inline comment as done.

Make PD_DAEMON local for file.

This looks promising! Here's my plan on how to exploit this:

  • The "client" program gets a procdesc fd for itself with pdopenpid(getpid()).
  • It then sends the procdesc fd to the "server" program via UNIX socket.
  • The "server" program checks that client's PID matches the passed procdesc using pdgetpid().
  • If the descriptor is OK the "server" then polls (via poll, select or kevent) on it to have a notification when the process exits. I don't even need the exit code.

If I understand it correctly, my plan should be possible once this diff lands.

This looks promising! Here's my plan on how to exploit this:

  • The "client" program gets a procdesc fd for itself with pdopenpid(getpid()).
  • It then sends the procdesc fd to the "server" program via UNIX socket.
  • The "server" program checks that client's PID matches the passed procdesc using pdgetpid().
  • If the descriptor is OK the "server" then polls (via poll, select or kevent) on it to have a notification when the process exits. I don't even need the exit code.

If I understand it correctly, my plan should be possible once this diff lands.

I do not see why this would not work. Nonetheless, if you can provide a self-contained C snippet I will ensure that it is.

sys/kern/sys_procdesc.c
264

This can be one line.

622

Is there anything preventing this from attaching a procdesc to a P_SYSTEM process?

660

Wouldn't the error handling be simpler if we called falloc_noinstall() and checked for errors before allocating the procdesc?

Or, consider moving this refcount_release() into the if (pdf != NULL) case below: that way, pdopenpid1() does not need to release (*pdf)->pd_refcount if a procdesc is already present.

669

Suppose finstall() fails. Then fdrop() will release the last ref on the file and call procdesc_close(), sending SIGKILL (if PD_DAEMON was not passed). That seems like surprising behaviour.

sys/security/audit/audit_bsm.c
1421

This doesn't compile for me, there is no ar_arg_flags field.

kib marked 5 inline comments as done.Thu, Jun 4, 5:32 PM
kib added inline comments.
sys/kern/sys_procdesc.c
264

Bde recommended wrapping lines around 72 column. But ok.

kib marked an inline comment as done.

Do not kill target process if the procdesc file installation into fdp failed.
Streamline control flow for error cases.
Filter out P_SYSTEM processes.
Fix audit changes.

sys/kern/sys_procdesc.c
264

This is the first time I heard of such a rule... 72 columns for comments, maybe, but code too?

676

procdesc_close() has other side effects, e.g., it may reap the process. IMO it would be cleaner to avoid calling procdesc_close() at all in this case.

kib marked 2 inline comments as done.Fri, Jun 5, 2:19 AM
kib added inline comments.
sys/kern/sys_procdesc.c
264

It was not a rule, but recommendation to wrap any line around column 72.

676

Actually no. I am now even not sure about setting F_PD_NOKILL there. I assigned p->p_procdesc already, and made the procdesc visible to other possible consumers by unlocking the PROC_LOCK. If the file cannot be installed into the process file descriptor table, the procdesc file _is closed_.

First, user have to get the consequences of the close, like reap, or killing of PD_DAEMON was not specified. Second, I do not see how to correctly unwind from the finstall() failure since p_procdesc is visible. IMO this should be left as is.

kib marked 2 inline comments as done.

Extract procdesc_destroy(). Use it for freeing unattached procdesc to not clear pid in procdesc_free().

Document EMFILE from pdopenpid()

sys/kern/sys_procdesc.c
425

Is this racy? We don't hold the procdesc lock here.

kib marked an inline comment as done.Mon, Jun 8, 5:15 PM
kib added inline comments.
sys/kern/sys_procdesc.c
425

I do not think so. The proctree lock is owned there. I explicitly require both proctree lock and procdesc lock for modifications, but then either lock is enough for read.

It might be that procdesc lock is not quite needed at all for pd_fpcount, but I prefer to leave it for now.

kib marked an inline comment as done.

Rebase