Page MenuHomeFreeBSD

unionfs: Ensure SAVENAME is set for unionfs vnode operations
ClosedPublic

Authored by jah on Sep 26 2021, 2:56 AM.

Details

Summary
"rm-style" system calls such as kern_frmdirat() and kern_funlinkat()
don't supply SAVENAME to preserve the pathname buffer for subsequent
vnode ops.  For unionfs this poses an issue because the pathname may
be needed for a relookup operation in unionfs_remove()/unionfs_rmdir().
Currently unionfs doesn't check for this case, leading to a panic on
DIAGNOSTIC kernels and use-after-free of cn_nameptr otherwise.

The unionfs node's stored buffer would suffice as a replacement for
cnp->cn_nameptr in some (but not all) cases, but it's cleaner to just
ensure that unionfs vnode ops always have a valid cn_nameptr by setting
SAVENAME in unionfs_lookup().

While here, do some light cleanup in unionfs_lookup() and assert that
HASBUF is always present in the relevant relookup calls.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jah requested review of this revision.Sep 26 2021, 2:56 AM

I do have a question here:
Instead of taking this approach, would it be acceptable to simply change kern_frmdirat()/kern_funlinkat() to pass SAVENAME to namei()?
unionfs might of course be the only FS that would make use of the name buffer in those cases, but it seems like a straightforward change.

If that's not acceptable, then I have a similar (but less likely to be encountered) issue to fix in unionfs_remove(), where we try to resolve the upper/lower vnode for VSOCK vnodes.
That will be harder to fix because unionfs doesn't actually create unionfs_nodes for VSOCK vnodes, so there won't be a path readily available. But I also don't immediately see why unionfs needs to make that special case for VSOCK; we probably don't want unionfs copy-on-write behavior for sockets, but I don't see why we need to completely avoid allocating a unionfs_node for them, so maybe with some other refactoring that code can just go away.

I do not see why SAVENAME would be not acceptable? Why did you think that it could be?

You should be able to add the flag within the bowels of unionfs.

In D32148#725527, @mjg wrote:

You should be able to add the flag within the bowels of unionfs.

I thought adding SAVENAME in unionfs_lookup() might be frowned upon, based on the comment for NDVALIDATE():

/*
 * Validate the final state of ndp after the lookup.
 *
 * Historically filesystems were allowed to modify cn_flags. Most notably they
 * can add SAVENAME to the request, resulting in HASBUF and pushing subsequent
 * clean up to the consumer. In practice this seems to only concern != LOOKUP
 * operations.
 *
 * As a step towards stricter API contract this routine validates the state to
 * clean up. Note validation is a work in progress with the intent of becoming
 * stricter over time.
 */
#define NDMODIFYINGFLAGS (LOCKLEAF | LOCKPARENT | WANTPARENT | SAVENAME | SAVESTART | HASBUF)

It looks like I can get away with it for any nameiop besides LOOKUP, so if this is still OK to do then I'm fine doing that instead.
That might be more foolproof for unionfs since it would ensure a valid pathname buffer regardless of what the namei() caller supplied.

Just do it. The idea here is to not put unionfs-specific stuff outside of it if it can be helped.

Set SAVEBANE in unionfs_lookup() instead

jah retitled this revision from unionfs: avoid trying to use cn_nameptr for lookup in unionfs_rmdir() to unionfs: Ensure SAVENAME is set for unionfs vnode operations.Sep 26 2021, 9:16 PM
jah edited the summary of this revision. (Show Details)
jah edited the summary of this revision. (Show Details)

I got this panic while testing with D32148.95756.diff:

panic: vn_lock: error 16 incompatible with flags 0x82400
cpuid = 1
time = 1632729527
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0141a13740
vpanic() at vpanic+0x187/frame 0xfffffe0141a137a0
panic() at panic+0x43/frame 0xfffffe0141a13800
_vn_lock_fallback() at _vn_lock_fallback+0xd6/frame 0xfffffe0141a13860
_vn_lock() at _vn_lock+0x86/frame 0xfffffe0141a138c0
unionfs_nodeget() at unionfs_nodeget+0x719/frame 0xfffffe0141a13960
unionfs_lookup() at unionfs_lookup+0x63b/frame 0xfffffe0141a13ab0
VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x5a/frame 0xfffffe0141a13ad0
vfs_cache_lookup() at vfs_cache_lookup+0xa6/frame 0xfffffe0141a13b20
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x5a/frame 0xfffffe0141a13b40
lookup() at lookup+0x4d1/frame 0xfffffe0141a13be0
namei() at namei+0x379/frame 0xfffffe0141a13c90
kern_frmdirat() at kern_frmdirat+0x15e/frame 0xfffffe0141a13e00
amd64_syscall() at amd64_syscall+0x147/frame 0xfffffe0141a13f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0141a13f30

https://people.freebsd.org/~pho/stress/log/log0182.txt

In D32148#725684, @pho wrote:

I got this panic while testing with D32148.95756.diff:

panic: vn_lock: error 16 incompatible with flags 0x82400
cpuid = 1
time = 1632729527
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0141a13740
vpanic() at vpanic+0x187/frame 0xfffffe0141a137a0
panic() at panic+0x43/frame 0xfffffe0141a13800
_vn_lock_fallback() at _vn_lock_fallback+0xd6/frame 0xfffffe0141a13860
_vn_lock() at _vn_lock+0x86/frame 0xfffffe0141a138c0
unionfs_nodeget() at unionfs_nodeget+0x719/frame 0xfffffe0141a13960
unionfs_lookup() at unionfs_lookup+0x63b/frame 0xfffffe0141a13ab0
VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0x5a/frame 0xfffffe0141a13ad0
vfs_cache_lookup() at vfs_cache_lookup+0xa6/frame 0xfffffe0141a13b20
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x5a/frame 0xfffffe0141a13b40
lookup() at lookup+0x4d1/frame 0xfffffe0141a13be0
namei() at namei+0x379/frame 0xfffffe0141a13c90
kern_frmdirat() at kern_frmdirat+0x15e/frame 0xfffffe0141a13e00
amd64_syscall() at amd64_syscall+0x147/frame 0xfffffe0141a13f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0141a13f30

https://people.freebsd.org/~pho/stress/log/log0182.txt

That's an unrelated issue; I have a local fix for it (along with the ENOENT-on-mkdir issue) that I'll post after this one.

This revision is now accepted and ready to land.Sep 29 2021, 1:54 PM