Page MenuHomeFreeBSD

Track filemon usage via proc p_filemon pointer rather than its own global list.
ClosedPublic

Authored by bdrewery on Mar 2 2016, 1:01 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Apr 20, 4:23 PM
Unknown Object (File)
Sun, Apr 14, 2:58 AM
Unknown Object (File)
Mar 25 2024, 9:16 AM
Unknown Object (File)
Mar 11 2024, 12:33 PM
Unknown Object (File)
Feb 4 2024, 10:50 AM
Unknown Object (File)
Jan 23 2024, 5:44 PM
Unknown Object (File)
Dec 20 2023, 12:08 AM
Unknown Object (File)
Dec 7 2023, 2:15 PM
Subscribers

Details

Summary
  • Disallow attaching to a process already being traced since filemon is typically intended to be used on children only. This is allowed for curproc as bmake relies on this behavior for rare cases.
  • Detach any previously tracked process on FILEMON_SET_PID.
  • Handle error from devfs_set_cdevpriv() in filemon_open().
  • The global filemon lock and lists are removed.
  • A free list is no longer kept. The filemon associated with an open /dev/filemon handle will be freed when the handle is closed. At that time all processes related to the filemon will be detached as well. The freelist was previously forever-growing and only freed on MOD_UNLOAD.
  • No longer loses track of double-forks. If the process holding the filemon handle closes it will close the log rather than wait on a daemonized process, but it will log all activity until it closes its handle.
  • proc.p_filemon is added which is protected by PROC_LOCK. This improves performance greatly and avoids double-fork issues, taking allproc_lock while in syscalls, and walking the process tree in syscalls.
  • A reference count is kept on usage of the filemon pointer. The log is closed on the last reference.
  • struct filemon access is protected by sx(9) filemon->lock as it was before.
  • MOD_UNLOAD will invalidate any filemon on all processes via the destruction of the devfs device calling filemon_dtr.
Test Plan
  • A lot of builds.
  • script -f file cmd
  • script -f file sh -c 'daemon -f ...'
  • kldunloading while in various states of script -f and building.
  • Started working on ATF tests for it.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
bdrewery edited edge metadata.
  • Move filemon_drain near filemon_release
  • Resolve lost wakeup by adding a timeout
kib edited edge metadata.

Patch did not triggered any warnings while I looked at the latest version, so I clicked the button.

Please note that I did not read the filemon, only looked at the patch, so consider my 'review' with a grain of mistrust.

This revision is now accepted and ready to land.Mar 5 2016, 6:40 PM

I have not tested this code. It likely improves filemon in the sense it's not going to crash the kernel, but fatal design flaws and other bugs remain.

sys/dev/filemon/filemon.c
109 ↗(On Diff #14081)

This wake up plays no role and is buggy. The sleep function is called with parameters making the thread wake up periodically. The filemon object could have been freed the exact moment refcount_release is performed.

224 ↗(On Diff #14081)

There is no reason to lock the process just to check what filemon is set. If this is the filemon you are looking for, it cannot change.

sys/dev/filemon/filemon.c
109 ↗(On Diff #14081)

Yes perhaps I should change to a pause(2). You're right that the last release could call filemon_free() but I would rather have the log footer written and closed via filemon_dtr() (devfs file handle close) than have it happen in some random syscall that finds that it had the last reference, or have it done while holding allproc_lock in filemon_untrack_processes (especially since writing to the filesystem will take more locks).

224 ↗(On Diff #14081)

The exit hook also unsets it and lowers its ref. This function is called from ioctl(FILEMON_SET_PID) as well. It is possible for another thread to be unsetting or setting this proc.p_filemon at the same time. Locking here is safe and this code is not called often and is already optimized with the refcnt check.

sys/dev/filemon/filemon.c
224 ↗(On Diff #14081)

Well in the cases I was worried about I do hold filemon lock already which prevents adding/removing this filemon to any process. Removing this if check here also means I can remove all PROC_LOCKing IIUC.

bdrewery edited edge metadata.
  • Fix comment
  • Fix not locking filemon in filemon_dtr
  • Remove unneeded PROC_LOCKing
  • Remove unneeded prototype
  • Move setting p_filemon into filemon_proc_put() with KASSERT that p_filemon is NULL
  • Assert non-NULL filemon in filemon_proc_drop
  • Remove wakeup channel usage and just pause while draining.
This revision now requires review to proceed.Mar 7 2016, 6:53 PM
sys/dev/filemon/filemon.c
223 ↗(On Diff #14139)

The comparison without the proc lock is completely fine. However, you do need it to modify the field. Otherwise you are racing with a thread doing fork which can proceed to use now-freed filemon.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14139)

I do not understand now, what prevents filemon free while you read non-NULL p_filemon but not yet acquired a reference on it.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14139)

All modifications to p->filemon need a proc lock and so does this bit. My other comment about unnecessary proc lock was only about the comparison to see if the process is tracked with particular filemon. I don't see why locking here was removed.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14139)

I'm becoming very confused. Yes I see now this particular check is a use-after-free.

However, why does the filemon_untrack_processes not need a lock? Why would I read 1 pointer unlocked and then take a lock to potentially release a different pointer? I would have to compare again wouldn't I?

sys/dev/filemon/filemon.c
122 ↗(On Diff #14139)

It does not need the lock to conclude if given process uses the filemon structure in question. It does need the lock to modify it. Now that I look at it this also needs another check in fork - it is possible for a thread to obtain a reference on filemon object and then proceed to assign the object to the child after filemon_untrack_processes finishes. It should be enough to re-check in that last case that the process is still traced. If so, an exclusive lock on filemon prevents the race. If not, just don't assign anything to the child.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14139)

in fork - it is possible for a thread to obtain a reference on filemon object and then proceed to assign the object to the child after filemon_untrack_processes finishes

In fork it locks the filemon before adding it to the child. In filemon_untrack_processes it locks the filemon before removing it from all processes. Either fork finds a filemon in the parent to set in the child or it finds no filemon in the parent and does nothing.

This locking of filemon is why I assumed the proc locks were not needed as it seems to be your justification for not locking for the comparison in filemon_untrack_processes.

bdrewery edited edge metadata.
  • restore all proc locking
sys/dev/filemon/filemon.c
122 ↗(On Diff #14143)
CPU0                                                        CPU1
filemon_event_process_fork
  filemon_proc_get
    PROC_LOCK(p);
    filemon = acq(p->p_filemon);
    PROC_UNLOCK(p);
    ..                                                      filemon_dtr
                                                              sx_xlock(&filemon)
    sx_xlock(&filemon) /* blocks */
                                                              filemon_untrack_processes
                                                                untrack p
                                                                      
                                                              filemon_drain
                                                                sx_sleep(..); /* unlocks */
/* the lock is acquired */
  p2->p_filemon = acq(filemon);
  ..

And now p2 has this particular filemon set and the code is past filemon_untrack_processes.

For some reason the current patch uses pause() which does not drop the lock, turning this into a deadlock - the thread from CPU0 got a reference and is blocked waiting on a lock, while the lock holder is waiting for the reference to drop.

filemon_untrack_processes does not need proc lock to compare because no new processes can get this particular filemon as long as it is locked. It does need the lock to clear tracing in order to prevent user after free in filemon_proc_get. Regardless of these measures, any thread can obtain a reference and block waiting on the lock in filemon_proc_get.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14143)

No, the lock is dropped the line before filemon_drain is called.

As for the allproc loop, I agree with the optimization and will implement it.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14143)

I see the fork issue too.

sys/dev/filemon/filemon.c
122 ↗(On Diff #14143)

A similar race is in the exit hook as well.

There is also the possibility to lose track of the child's p_filemon in fork due to a race with ioctl(FILEMON_SET_PID) being used on the child pid.

Fixing both.

bdrewery edited edge metadata.
  • No need to take PROC_LOCK when comparing p->p_filemon with filemon locked.
  • Handle fork races
  • Handle exit race
  • Add comment about race before locking
  • Assert non-NULL p_filemon before dropping it
bdrewery edited edge metadata.
  • Document some of the locking
  • Don't EBUSY child self-reattach which bmake relies upon.
  • Add some asserts
  • Add some comments
bdrewery edited the test plan for this revision. (Show Details)
bdrewery added a reviewer: markj.
sys/dev/filemon/filemon.c
282 ↗(On Diff #14299)

Perhaps assert that (p->p_flag & P_WEXIT) == 0?

298 ↗(On Diff #14299)

I don't think this loop handles all possible races. There's no guarantee that p->p_filemon can't transition from NULL to non-NULL between loop termination and the acquisition of the proc lock below. In particular, I don't think the KASSERT below holds.

I think you can address this by checking again once the proc lock is acquired. If p_filemon is non-NULL, you must drop locks and restart this loop.

391 ↗(On Diff #14299)

sizeof(*filemon) is preferable.

398 ↗(On Diff #14299)

No need to explicitly release the reference.

sys/sys/proc.h
634 ↗(On Diff #13961)

I don't believe so...

bdrewery added inline comments.
sys/dev/filemon/filemon.c
398 ↗(On Diff #14299)

Well I assert in filemon_free that refcnt==0 since it is only called like that. I didn't want anyone to think it would release the reference itself and wanted to ensure it was only called when appropriate.

sys/sys/proc.h
634 ↗(On Diff #13961)

I have the MFC needs worked out. I'll move it to the end and initialize it in proc_init in stable/10.

bdrewery marked an inline comment as done.
  • Use sizeof(*filemon)
  • Assert not-existing proc
  • Handle p_filemon being attached to after dropping it
markj edited edge metadata.
markj added inline comments.
sys/dev/filemon/filemon.c
299 ↗(On Diff #14349)

I would use a do-while loop here - we already know that p->p_filemon != NULL in the first iteration. It'll also make the comment at the end more natural.

This revision is now accepted and ready to land.Mar 16 2016, 4:16 AM
sys/dev/filemon/filemon.c
297 ↗(On Diff #14349)

If p == curproc, then this does not serve any purpose. Prior to actual exit the process must start running singlethreaded, which in worst case would have to wait for this thread to finish this syscall.

298 ↗(On Diff #14349)

This does not seem to serve any purpose either. Is not filemon implicitly referenced by fp used to get to this point? If not, what guarantees the pointer is safe to dereference in the first place?

309 ↗(On Diff #14349)

This is done incorrectly. By the time you PROC_LOCK here, the process could have got another filemon which you blindly overwrite later. The way to go about is to store the pointer to the old filemon (if any) and then drop it at the end of the function while temporarily unlocking the current filemon. Alternatively, the function can be expected to return with filemon unlocked.

333 ↗(On Diff #14349)

This should have been done closer to the code playing with lock/unlock, as opposed to be executed in all cases. But as noted earlier it is impossible to reach this state so this can be removed.

sys/dev/filemon/filemon_wrapper.c
396 ↗(On Diff #14349)

Here it either has to be NULL or equal to filemon. Would be good to assert this.

sys/dev/filemon/filemon.c
297 ↗(On Diff #14349)

The purpose is I don't want to extend stealing tracing of any traced process. Note the EBUSY later when p != curproc.

298 ↗(On Diff #14349)

For the low cost of taking a reference I would rather not make assumptions about when the caller's filemon may be freed. Plus 2 threads may be racing on ioctl(FILEMON_SET_PID) and close(filemon_fd) which could free the filemon when it is unlocked next.

299 ↗(On Diff #14349)

Yes agreed.

309 ↗(On Diff #14349)

It's a loop, so it is rechecked. Plus it is asserted NULL while locked.

sys/dev/filemon/filemon.c
297 ↗(On Diff #14349)

Ah you meant the PHOLD. I understand.

bdrewery edited edge metadata.
  • Use more clear do while
  • Make the curproc logic more clear
  • Remove unneeded logic due to any exit waiting on this ioctl thread to complete.
This revision now requires review to proceed.Mar 16 2016, 5:12 PM
bdrewery edited edge metadata.
  • Update comment
sys/dev/filemon/filemon.c
324 ↗(On Diff #14363)

As I tried to explain the last time, this does not help anything and only makes the code handle more races. So we have a thread from a different process trying to set filemon for this process. Both with and without the loop it can set it and have it unset later. The only difference is that if there are multiple threads setting it, the code can loop indefinitely (although that's a rather hypothetical problem than something to be worried about). Either way, there is no benefit for having the loop. Just save the current filemon (if any) and release the reference at the end.b

298 ↗(On Diff #14349)

If it can be freed, ioctl already does not work as it cannot reliably lock it. As it is, this call only serves to confuse the reader. However, as noted the last time, relocking employed here can just be removed rendering the point moot.

309 ↗(On Diff #14349)

Previous version of the patch had only one loop. ->p_filemon was not re-checked after PROC_LOCK was taken. The current patch as the bug fixed.

sys/dev/filemon/filemon.c
324 ↗(On Diff #14363)

I can't just overwrite the p->p_filemon if it is not NULL. That breaks the invariant that p->p_filemon won't be changed to another filemon or NULL unless the filemon is locked. All of the if (p->p_filemon == filemon) proc_drop_locked(p); code relies on this. They read p->p_filemon without the PROC_LOCK because filemon lock is adequate for this state. You even noticed this. So I need to lock the bad p_filemon lock here before removing it.

If I were to just overwrite p->p_filemon here then I need to change the locking on the other code.

Suggesting to remove something because it handles "too many races" sounds odd. The simple implementation you suggest here races with fork and may cause the wrong filemon to be inherited. Granted all of these races are due to a developer doing stupid not-real-world things, but the theme I have picked up on this review is that all races should be handled, so why not these when the code is simple? I would rather handle things clearly than risk having a panic, deadlock, or other DoS in the code when someone does purposely do stupid things like multiple threads running ioctl/fork/close at once.

309 ↗(On Diff #14349)

You must have been looking at a stale version. I added a loop some 18 hours ago.

bdrewery edited edge metadata.
  • ioctl(2) holds an fp reference, thus filemon_dtr() can't be called during ioctl(FILEMON_SET_PID)
  • Handle the last reference in filemon_release rather than explicit filemon_free
bdrewery marked an inline comment as done.
bdrewery edited edge metadata.
  • Simplify the filemon_attach_proc loop further
  • filemon_event_process_exit: Assert that p_filemon is as expected
  • Comment and add some assertions on refcount expectations
  • Clarify comment about parent
bdrewery edited edge metadata.
This revision was automatically updated to reflect the committed changes.