Feb 14 2024
In D43835#1001048, @imp wrote: git arc patch -c D43835 likely works and is less hassle.
In D43835#1001028, @olce wrote: Could you send me a patch prepared with git format-patch? Otherwise, I can take the patch here and put your name and email as the author myself (whichever you prefer).
In D43835#1001005, @dev_submerge.ch wrote: Thanks for the article, Olivier - now that I know the extent of your project, I suspect it won't be MFC'd?
If that's the case, it may be worth getting this minimal fix in right now and MFCing it to STABLE. The earlier this issue is fixed in all supported releases, the fewer workarounds will be needed in ports.
Feb 13 2024
Hi Florian,
The PRIV_SCHED_SETPOLICY and PRIV_SCHED_SET privilege checks are inconsistent with those in some other places and can be circumvented. Additionally, I don't think they serve any real security purpose (beyond what PRIV_SCHED_RTPRIO and PRIV_SCHED_IDPRIO already provide).
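For illustration only, here is a minimal hypothetical sketch of the idea that the priority-class privileges already cover the interesting cases. Only priv_check() and the PRIV_SCHED_RTPRIO constant are real FreeBSD kernel interfaces; the function, its arguments, and the policy classification are made up for this sketch.

    /*
     * Hypothetical sketch, not actual kernel code: gate a scheduling
     * policy change on the existing priority-class privilege rather
     * than on a separate PRIV_SCHED_SETPOLICY check.
     */
    #include <sys/param.h>
    #include <sys/proc.h>
    #include <sys/priv.h>

    static int
    setpolicy_priv_sketch(struct thread *td, bool realtime_policy)
    {
            if (realtime_policy) {
                    /* Realtime classes are already guarded by PRIV_SCHED_RTPRIO. */
                    return (priv_check(td, PRIV_SCHED_RTPRIO));
            }
            /* Other policies need no additional privilege. */
            return (0);
    }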
In D43815#1000700, @jah wrote: Thinking about it a little more, I should simply remove this part of the commit message. Accessing [base_vp]->v_mount does have risks, but any code that is subject to those risks is almost certainly going to face the same risks from careless access to ump->um_[upper|lower]mp (where 'ump' was obtained by a presumably-safe load of [unionfs_vp]->v_mount->mnt_data at the beginning of the call, as most of these operations do).
In D43815#1000687, @jah wrote: In D43815#1000600, @olce wrote: There is a misunderstanding. I'm very well aware of what you are saying, as you should know. But that is not my point, which concerns the sentence "Use of [vnode]->v_mount is unsafe in the presence of a concurrent forced unmount." in the context of the current change. The bulk of the latter is modifications to unionfs_vfsops.c, which contains VFS operations, not vnode ones. There are no vnodes involved there, except for accessing the layers' root ones. And what I'm saying, and what I showed above, is that v_mount on these, again in the context of a VFS operation, cannot become NULL because of a forced unmount (if you disagree, please show where you think the reasoning is flawed).
Actually the assertion about VFS operations isn't entirely true either (mostly, but not entirely); see the vfs_unbusy() dance we do in unionfs_quotactl().
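For readers unfamiliar with that dance, here is a rough, hypothetical sketch of the pattern being referred to: a VFS operation on the unionfs mount gives up the busy reference on its own mount before busying an underlying mount and forwarding the operation. vfs_busy() and vfs_unbusy() are real interfaces; the function and argument names are placeholders, and this is not the actual unionfs_quotactl() code.

    /*
     * Simplified sketch of a vfs_busy()/vfs_unbusy() dance: drop the
     * busy reference on the stacked mount, busy the underlying mount,
     * forward the operation, then undo.  Not the real unionfs code.
     */
    #include <sys/param.h>
    #include <sys/mount.h>

    static int
    forward_to_upper_sketch(struct mount *unionfs_mp, struct mount *upper_mp)
    {
            int error;

            /*
             * Release our own busy count so a pending unmount of the
             * underlying filesystem is not blocked behind us.
             */
            vfs_unbusy(unionfs_mp);

            /* Busy the upper mount before forwarding the operation. */
            error = vfs_busy(upper_mp, 0);
            if (error != 0)
                    return (error);

            /* Forward the VFS operation to the upper filesystem here. */

            vfs_unbusy(upper_mp);
            return (0);
    }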
In D43815#1000340, @jah wrote: In D43815#1000302, @olce wrote: I don't think it can. Given the first point above, there can't be any unmount of a layer (even forced) until the unionfs mount on top is unmounted. As the layers' root vnodes are vref()'ed, they can't become doomed (since unmount of their own FS is prevented), and consequently their v_mount is never modified (barring the ZFS rollback case). This is independent of holding (or not holding) any vnode lock.
Which is not to say that there aren't any problems of the sort you're reporting in unionfs; that's just a different matter.
That's not true: vref() does nothing to prevent a forced unmount from dooming the vnode; only holding its lock does that. As such, if the lock needs to be transiently dropped for some reason and the timing is sufficiently unfortunate, a concurrent recursive forced unmount can first unmount unionfs (dooming the unionfs vnode) and then the base FS (dooming the lower/upper vnode). The held references prevent the vnodes from being recycled (but not doomed), but even this isn't foolproof: for example, in the course of being doomed, the unionfs vnode will drop its references on the lower/upper vnodes, at which point they may become unreferenced unless additional action is taken. Whatever caller invoked the unionfs VOP will of course still hold a reference on the unionfs vnode, but this does not automatically guarantee that references will be held on the underlying vnodes for the duration of the call, due to the aforementioned scenario.
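A minimal sketch of the defensive pattern this implies, assuming a caller that must transiently drop the vnode lock: vref(), vunref(), VOP_UNLOCK(), vn_lock() and VN_IS_DOOMED() are real FreeBSD interfaces, while the function itself is hypothetical.

    /*
     * Sketch: a reference keeps the vnode from being recycled, but not
     * from being doomed, so after the lock has been dropped the caller
     * must re-validate before trusting vp->v_mount again.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/lock.h>
    #include <sys/mount.h>
    #include <sys/vnode.h>

    static int
    relock_and_revalidate_sketch(struct vnode *vp)
    {
            /* Entered with vp locked; keep a reference across the unlock. */
            vref(vp);
            VOP_UNLOCK(vp);

            /* ... work that requires the vnode lock to be dropped ... */

            vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
            vunref(vp);
            if (VN_IS_DOOMED(vp)) {
                    /*
                     * A concurrent forced unmount doomed vp while the lock
                     * was dropped; vp->v_mount may now be NULL.
                     */
                    return (ENOENT);
            }
            /* vp is locked and alive; vp->v_mount is safe to dereference. */
            return (0);
    }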
Feb 12 2024
In D43815#1000214, @jah wrote: In D43815#1000171, @olce wrote: If one of the layers is forcibly unmounted, there isn't much point in continuing operation. But, given the first point above, that cannot even happen. So really the only case where v_mount can become NULL is the ZFS rollback one (the layers' root vnodes can't be recycled since they are vref()'ed). Thinking more about it, always testing whether these are alive and well is going to be inevitable going forward. But I'm fine with this change as it is for now.
This can indeed happen, despite the first point above. If a unionfs VOP ever temporarily drops its lock, another thread is free to stage a recursive forced unmount of both the unionfs and the base FS during this window. Moreover, it's easy for this to happen without unionfs even being aware of it: because unionfs shares its lock with the base FS, if a base FS VOP (forwarded by a unionfs VOP) needs to drop the lock temporarily (this is common e.g. for FFS operations that need to update metadata), the unionfs vnode may effectively be unlocked during that time. That last point is a particularly dangerous one; I have another pending set of changes to deal with the problems that can arise in that situation.
This is why I say it's easy to make a mistake in accessing [base vp]->v_mount at an unsafe time.
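To make that hazard concrete, here is a hypothetical sketch of a unionfs VOP forwarding a call to the base filesystem: because the two vnodes share a lock, the forwarded call may drop and reacquire it internally, so the caller re-checks both vnodes afterwards. VOP_FSYNC(), VN_IS_DOOMED(), MNT_WAIT and curthread are real interfaces; the function name, and the choice of fsync as the forwarded operation, are only for illustration.

    /*
     * Sketch of the shared-lock hazard: the forwarded VOP may drop the
     * lock internally (common for FFS metadata updates), during which a
     * recursive forced unmount can doom both vnodes.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/proc.h>
    #include <sys/mount.h>
    #include <sys/vnode.h>

    static int
    forward_vop_sketch(struct vnode *unionfs_vp, struct vnode *base_vp)
    {
            int error;

            /* Forward the operation to the base filesystem. */
            error = VOP_FSYNC(base_vp, MNT_WAIT, curthread);

            /*
             * On return the shared lock is held again, but either vnode
             * may have been doomed by a concurrent recursive forced
             * unmount while the base FS had the lock dropped.
             */
            if (VN_IS_DOOMED(unionfs_vp) || VN_IS_DOOMED(base_vp))
                    return (error != 0 ? error : ENOENT);

            return (error);
    }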
In D43815#999937, @jah wrote: Well, as it is today, unmounting of the base FS is either recursive or it doesn't happen at all (i.e. the unmount attempt is rejected immediately because of the unionfs stacked atop the mount in question). I don't think it can work any other way, although I could see the default settings around recursive unmounts changing (maybe vfs.recursive_forced_unmount being enabled by default, or recursive unmounts even being allowed in the non-forced case as well). I don't have plans to change any of those defaults, though.
In D43818#1000015, @jah wrote: Actually I've been thinking of doing exactly that, although it depends on how much time I get away from $work over the next few weeks.
In D40850#1000012, @jah wrote:
Feb 11 2024
Nice catch.
OK as a workaround. Hopefully, we'll get OpenZFS fixed soon. If you don't plan to, I may try to submit a patch upstream, since it seems no one has proposed any change in https://github.com/openzfs/zfs/issues/15705.
I think this goes in the right direction long term also.