Introducing pdopen.
Needs ReviewPublic
Actions

Authored by oshogbo on May 17 2019, 8:49 PM.

Details

Reviewers

kib
markj
brooks

Summary

pidgetpd will return a process descriptor for a given PID.
The current code assumes that the one struct file is pointing to a process descriptor - in my opionion this a bug. The process descriptor may by shared by multiple struct files.

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

oshogbo created this revision.May 17 2019, 8:49 PM

Update with U999999

kib added inline comments.May 18 2019, 11:31 AM

lib/libc/sys/Symbol.map
410	Why not pdopen() ?
sys/kern/kern_prot.c
1603	The comment is rather pointless for the wrapper.
sys/kern/sys_procdesc.c
191	Why read uap->flag second time ?
300	Why not return the error code directly, as most other kernel functions do ?
313	Comment is outdated.
357	() != 0
382	Why moving this block ?
sys/sys/proc.h
761	Comment is certainly wrong. Why is this flag needed ? It seems that you set p_procdesc to NULL and set the flag under the same region of the proc_lock.

I mess up this review.
Should be a little bit bather.
Thanks @kib

Herald added a reviewer: brooks. · View Herald TranscriptMay 18 2019, 6:51 PM

oshogbo retitled this revision from Introducing fdopen. to Introducing pdopen..May 18 2019, 6:51 PM

oshogbo added inline comments.

lib/libc/sys/Symbol.map
410	Ou that's name is much better.
sys/kern/sys_procdesc.c
382	I assumed the process descriptor should be marked as closed only when we are going to really closed it. That mean that the reference drops to 1 or the process is zombie already.
sys/sys/proc.h
761	I'm wasn't sure if this a right approach but I want to remove a RCE. We pdforked which creates a process descriptor. We are closing a process descriptor so we mark that there is no process descriptor in the p_procdesc to NULL and send signal to child to kill it. Then sb else use pdopen to get a file descriptor. We can't kill the process in this time because we would create an process descriptor at this time. So I'm assumed just to this allow that.

markj added inline comments.May 21 2019, 4:07 AM

sys/kern/sys_procdesc.c
219	We should avoid creating a new process descriptor for a zombie or process that has P_WEXIT set.
239	This may leak.
298	In other parts of the kernel we go through significant pain to ensure that syscalls do not fail because of transient memory allocation failures; I think we should do the same here too. The M_NOWAIT caller can check for ENOMEM, drop locks, allocate with M_WAITOK, and retry the lookup before either assigning the procedesc to the process, or discovering that the race was lost.
480	The proc lock synchronizes access to pd_refcount; we can simply read the value before deciding what to do instead of introducing the new refcount_release_ret() KPI. Or, if you prefer to keep it, the refcount(9) man page should be updated.
500	This reverts r339390.
502	Consider acquiring the (shared?) proctree lock in pdopen() instead. I believe that can be used to avoid the race without adding a new process flag.

markj added inline comments.May 31 2019, 4:15 PM

sys/kern/sys_procdesc.c
298	To expand on the reasoning a bit, it is quite difficult to write reliable system software if system calls are allowed to fail this way. Realistically, the only thing an application will do with ENOMEM is retry the call. If it forgets to do that, it now has a rare failure mode. Instead of forcing every application developer using pdopen() to deal with this, the kernel should do it.

Address @markj review.
Cover the wait issue.
Add test cases.

Herald added a subscriber: imp. · View Herald TranscriptAug 27 2021, 9:48 AM

Harbormaster completed remote builds in B41236: Diff 94267.Aug 27 2021, 9:48 AM

kib added inline comments.Aug 28 2021, 12:37 PM

sys/kern/kern_fork.c
656	I do not believe p2 is locked there.
sys/kern/sys_procdesc.c
121	I like the Mark' attitude of passing M_WAITOK/M_NOWAIT flags instead of bool.
131	`if ((flags & PD_DAEMON) != 0)`
200	So Markj' note about not allowing open of zombies/exiting processes is not handled. Am I wrong?
222	What for do you need proctree_lock there?
264	IMO this is overcomplicated. Why not allocate a spare procdesc in advance, before any locks are taken, and free if not needed? I do not think that the cost of unused allocation is high, esp. when comparing with the locking dance there.
440–441	Most parts of this comment should end up in the man page.
442
444
447
sys/sys/procdesc.h
139	Are these two utility flags needed for userspace?

oshogbo added inline comments.Aug 30 2021, 2:55 PM

sys/kern/sys_procdesc.c
264	Maybe we can allocate procdesc for all the process when created?

markj added inline comments.Aug 30 2021, 3:17 PM

sys/kern/sys_procdesc.c
264	Why is that preferable to pre-allocating at pdopen() time? It seems a bit undesirable to allocate mostly unused memory for each process. At least, if we go that route, I suspect the procdesc should be directly embedded in struct proc, not allocated separately.

oshogbo added inline comments.Aug 30 2021, 3:37 PM

sys/kern/sys_procdesc.c
264	I'm not saying it preferable. I'm just wondering about best approach. 'm also afraid of "some memory" that won't be used but its sound a little bit cleaner to assume that the prcodesc just always exists and its a part of the process itself. Currently we have to separate few more cases when the procdesc exists, when doesn't, when its free. I just wonder if the "cleaner" approach wouldn't be just allocating it and keep while the process is leaving.

kib added inline comments.Aug 30 2021, 8:00 PM

sys/kern/sys_procdesc.c
264	I looked a the procdesc structure, and I think that it probably would be better to embed it into struct proc. Not quite, because almost nothing from the struct would be left if we do so. I believe that what would we need are pd_flags, pd_selinfo, and pd_lock. But, pd_flags can be merged into p_flag2, and pd_lock replaced by e.g. process mutex, since we reuse the same structure. Then the only added field would be pd_selinfo which is of 72 bytes, which probably would go into the uma tail of unused bytes for the allocation. But if done correctly, this should remove a structure which lifetime is somewhat tricky.

markj added inline comments.Aug 31 2021, 2:22 AM

sys/kern/sys_procdesc.c
264	Then the only added field would be pd_selinfo which is of 72 bytes, which probably would go into the uma tail of unused bytes for the allocation. And the p_procdesc field would be removed, saving 8 bytes. But actually we have good slab efficiency right now: vm.uma.PROC.keg.ipers: 3 (items per slab) vm.uma.PROC.keg.ppera: 1 (pages per slab) vm.uma.PROC.keg.rsize: 1336 With an embedded procdesc we can only fit 2 procs per page, it seems. But UMA will dynamically grow the slab size to improve efficiency, so I think it is still worth doing.