- Disallow attaching to a process already being traced since filemon is typically intended to be used on children only. This is allowed for curproc as bmake relies on this behavior for rare cases.
- Detach any previously tracked process on FILEMON_SET_PID.
- Handle error from devfs_set_cdevpriv() in filemon_open().
- The global filemon lock and lists are removed.
- A free list is no longer kept. The filemon associated with an open /dev/filemon handle will be freed when the handle is closed. At that time all processes related to the filemon will be detached as well. The freelist was previously forever-growing and only freed on MOD_UNLOAD.
- No longer loses track of double-forks. If the process holding the filemon handle closes it will close the log rather than wait on a daemonized process, but it will log all activity until it closes its handle.
- proc.p_filemon is added which is protected by PROC_LOCK. This improves performance greatly and avoids double-fork issues, taking allproc_lock while in syscalls, and walking the process tree in syscalls.
- A reference count is kept on usage of the filemon pointer. The log is closed on the last reference.
- struct filemon access is protected by sx(9) filemon->lock as it was before.
- MOD_UNLOAD will invalidate any filemon on all processes via the destruction of the devfs device calling filemon_dtr.
Details
- A lot of builds.
- script -f file cmd
- script -f file sh -c 'daemon -f ...'
- kldunloading while in various states of script -f and building.
- Started working on ATF tests for it.
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 2773 Build 2794: arc lint + arc unit
Event Timeline
Patch did not triggered any warnings while I looked at the latest version, so I clicked the button.
Please note that I did not read the filemon, only looked at the patch, so consider my 'review' with a grain of mistrust.
I have not tested this code. It likely improves filemon in the sense it's not going to crash the kernel, but fatal design flaws and other bugs remain.
sys/dev/filemon/filemon.c | ||
---|---|---|
109 | This wake up plays no role and is buggy. The sleep function is called with parameters making the thread wake up periodically. The filemon object could have been freed the exact moment refcount_release is performed. | |
216 | There is no reason to lock the process just to check what filemon is set. If this is the filemon you are looking for, it cannot change. |
sys/dev/filemon/filemon.c | ||
---|---|---|
109 | Yes perhaps I should change to a pause(2). You're right that the last release could call filemon_free() but I would rather have the log footer written and closed via filemon_dtr() (devfs file handle close) than have it happen in some random syscall that finds that it had the last reference, or have it done while holding allproc_lock in filemon_untrack_processes (especially since writing to the filesystem will take more locks). | |
216 | The exit hook also unsets it and lowers its ref. This function is called from ioctl(FILEMON_SET_PID) as well. It is possible for another thread to be unsetting or setting this proc.p_filemon at the same time. Locking here is safe and this code is not called often and is already optimized with the refcnt check. |
sys/dev/filemon/filemon.c | ||
---|---|---|
216 | Well in the cases I was worried about I do hold filemon lock already which prevents adding/removing this filemon to any process. Removing this if check here also means I can remove all PROC_LOCKing IIUC. |
- Fix comment
- Fix not locking filemon in filemon_dtr
- Remove unneeded PROC_LOCKing
- Remove unneeded prototype
- Move setting p_filemon into filemon_proc_put() with KASSERT that p_filemon is NULL
- Assert non-NULL filemon in filemon_proc_drop
- Remove wakeup channel usage and just pause while draining.
sys/dev/filemon/filemon.c | ||
---|---|---|
218 | The comparison without the proc lock is completely fine. However, you do need it to modify the field. Otherwise you are racing with a thread doing fork which can proceed to use now-freed filemon. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | I do not understand now, what prevents filemon free while you read non-NULL p_filemon but not yet acquired a reference on it. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | All modifications to p->filemon need a proc lock and so does this bit. My other comment about unnecessary proc lock was only about the comparison to see if the process is tracked with particular filemon. I don't see why locking here was removed. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | I'm becoming very confused. Yes I see now this particular check is a use-after-free. However, why does the filemon_untrack_processes not need a lock? Why would I read 1 pointer unlocked and then take a lock to potentially release a different pointer? I would have to compare again wouldn't I? |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | It does not need the lock to conclude if given process uses the filemon structure in question. It does need the lock to modify it. Now that I look at it this also needs another check in fork - it is possible for a thread to obtain a reference on filemon object and then proceed to assign the object to the child after filemon_untrack_processes finishes. It should be enough to re-check in that last case that the process is still traced. If so, an exclusive lock on filemon prevents the race. If not, just don't assign anything to the child. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 |
In fork it locks the filemon before adding it to the child. In filemon_untrack_processes it locks the filemon before removing it from all processes. Either fork finds a filemon in the parent to set in the child or it finds no filemon in the parent and does nothing. This locking of filemon is why I assumed the proc locks were not needed as it seems to be your justification for not locking for the comparison in filemon_untrack_processes. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | CPU0 CPU1 filemon_event_process_fork filemon_proc_get PROC_LOCK(p); filemon = acq(p->p_filemon); PROC_UNLOCK(p); .. filemon_dtr sx_xlock(&filemon) sx_xlock(&filemon) /* blocks */ filemon_untrack_processes untrack p filemon_drain sx_sleep(..); /* unlocks */ /* the lock is acquired */ p2->p_filemon = acq(filemon); .. And now p2 has this particular filemon set and the code is past filemon_untrack_processes. For some reason the current patch uses pause() which does not drop the lock, turning this into a deadlock - the thread from CPU0 got a reference and is blocked waiting on a lock, while the lock holder is waiting for the reference to drop. filemon_untrack_processes does not need proc lock to compare because no new processes can get this particular filemon as long as it is locked. It does need the lock to clear tracing in order to prevent user after free in filemon_proc_get. Regardless of these measures, any thread can obtain a reference and block waiting on the lock in filemon_proc_get. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | No, the lock is dropped the line before filemon_drain is called. As for the allproc loop, I agree with the optimization and will implement it. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | I see the fork issue too. |
sys/dev/filemon/filemon.c | ||
---|---|---|
122 | A similar race is in the exit hook as well. There is also the possibility to lose track of the child's p_filemon in fork due to a race with ioctl(FILEMON_SET_PID) being used on the child pid. Fixing both. |
- No need to take PROC_LOCK when comparing p->p_filemon with filemon locked.
- Handle fork races
- Handle exit race
- Add comment about race before locking
- Assert non-NULL p_filemon before dropping it
- Document some of the locking
- Don't EBUSY child self-reattach which bmake relies upon.
- Add some asserts
- Add some comments
sys/dev/filemon/filemon.c | ||
---|---|---|
258 | Perhaps assert that (p->p_flag & P_WEXIT) == 0? | |
274 | I don't think this loop handles all possible races. There's no guarantee that p->p_filemon can't transition from NULL to non-NULL between loop termination and the acquisition of the proc lock below. In particular, I don't think the KASSERT below holds. I think you can address this by checking again once the proc lock is acquired. If p_filemon is non-NULL, you must drop locks and restart this loop. | |
306–307 | sizeof(*filemon) is preferable. | |
314 | No need to explicitly release the reference. | |
sys/sys/proc.h | ||
634 | I don't believe so... |
sys/dev/filemon/filemon.c | ||
---|---|---|
314 | Well I assert in filemon_free that refcnt==0 since it is only called like that. I didn't want anyone to think it would release the reference itself and wanted to ensure it was only called when appropriate. | |
sys/sys/proc.h | ||
634 | I have the MFC needs worked out. I'll move it to the end and initialize it in proc_init in stable/10. |
- Use sizeof(*filemon)
- Assert not-existing proc
- Handle p_filemon being attached to after dropping it
sys/dev/filemon/filemon.c | ||
---|---|---|
239 | I would use a do-while loop here - we already know that p->p_filemon != NULL in the first iteration. It'll also make the comment at the end more natural. |
sys/dev/filemon/filemon.c | ||
---|---|---|
226 | This should have been done closer to the code playing with lock/unlock, as opposed to be executed in all cases. But as noted earlier it is impossible to reach this state so this can be removed. | |
237 | If p == curproc, then this does not serve any purpose. Prior to actual exit the process must start running singlethreaded, which in worst case would have to wait for this thread to finish this syscall. | |
238 | This does not seem to serve any purpose either. Is not filemon implicitly referenced by fp used to get to this point? If not, what guarantees the pointer is safe to dereference in the first place? | |
249 | This is done incorrectly. By the time you PROC_LOCK here, the process could have got another filemon which you blindly overwrite later. The way to go about is to store the pointer to the old filemon (if any) and then drop it at the end of the function while temporarily unlocking the current filemon. Alternatively, the function can be expected to return with filemon unlocked. | |
sys/dev/filemon/filemon_wrapper.c | ||
396 | Here it either has to be NULL or equal to filemon. Would be good to assert this. |
sys/dev/filemon/filemon.c | ||
---|---|---|
237 | The purpose is I don't want to extend stealing tracing of any traced process. Note the EBUSY later when p != curproc. | |
238 | For the low cost of taking a reference I would rather not make assumptions about when the caller's filemon may be freed. Plus 2 threads may be racing on ioctl(FILEMON_SET_PID) and close(filemon_fd) which could free the filemon when it is unlocked next. | |
239 | Yes agreed. | |
249 | It's a loop, so it is rechecked. Plus it is asserted NULL while locked. |
sys/dev/filemon/filemon.c | ||
---|---|---|
237 | Ah you meant the PHOLD. I understand. |
- Use more clear do while
- Make the curproc logic more clear
- Remove unneeded logic due to any exit waiting on this ioctl thread to complete.
sys/dev/filemon/filemon.c | ||
---|---|---|
238 | If it can be freed, ioctl already does not work as it cannot reliably lock it. As it is, this call only serves to confuse the reader. However, as noted the last time, relocking employed here can just be removed rendering the point moot. | |
249 | Previous version of the patch had only one loop. ->p_filemon was not re-checked after PROC_LOCK was taken. The current patch as the bug fixed. | |
264 | As I tried to explain the last time, this does not help anything and only makes the code handle more races. So we have a thread from a different process trying to set filemon for this process. Both with and without the loop it can set it and have it unset later. The only difference is that if there are multiple threads setting it, the code can loop indefinitely (although that's a rather hypothetical problem than something to be worried about). Either way, there is no benefit for having the loop. Just save the current filemon (if any) and release the reference at the end.b |
sys/dev/filemon/filemon.c | ||
---|---|---|
249 | You must have been looking at a stale version. I added a loop some 18 hours ago. | |
264 | I can't just overwrite the p->p_filemon if it is not NULL. That breaks the invariant that p->p_filemon won't be changed to another filemon or NULL unless the filemon is locked. All of the if (p->p_filemon == filemon) proc_drop_locked(p); code relies on this. They read p->p_filemon without the PROC_LOCK because filemon lock is adequate for this state. You even noticed this. So I need to lock the bad p_filemon lock here before removing it. If I were to just overwrite p->p_filemon here then I need to change the locking on the other code. Suggesting to remove something because it handles "too many races" sounds odd. The simple implementation you suggest here races with fork and may cause the wrong filemon to be inherited. Granted all of these races are due to a developer doing stupid not-real-world things, but the theme I have picked up on this review is that all races should be handled, so why not these when the code is simple? I would rather handle things clearly than risk having a panic, deadlock, or other DoS in the code when someone does purposely do stupid things like multiple threads running ioctl/fork/close at once. |
- ioctl(2) holds an fp reference, thus filemon_dtr() can't be called during ioctl(FILEMON_SET_PID)
- Handle the last reference in filemon_release rather than explicit filemon_free
- Simplify the filemon_attach_proc loop further
- filemon_event_process_exit: Assert that p_filemon is as expected
- Comment and add some assertions on refcount expectations
- Clarify comment about parent