Page MenuHomeFreeBSD

posix_fallocate: push vnop implementation into the fileops layer
ClosedPublic

Authored by kevans on Jan 5 2020, 5:04 PM.

Details

Summary

As an example benefit, implement posix_fallocate(2) for shmfd. The shmfd implementation defers to shm_dotruncate as needed, which will reserve more swap space to accommodate -- this is something that Linux wants to be able to do, and it seems sensible enough.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kevans created this revision.Jan 5 2020, 5:04 PM
bcr accepted this revision as: manpages.Jan 5 2020, 5:46 PM
bcr added a subscriber: bcr.

OK from manpages.

kevans updated this revision to Diff 66380.Jan 5 2020, 8:31 PM

Move {sys,kern}_posix_fallocate into sys_generic.c -- there's no longer anything vfs specific to it, as those have been split off into vfs_vnops.

kib accepted this revision.Jan 6 2020, 12:15 AM
kib added inline comments.
sys/kern/sys_generic.c
857 ↗(On Diff #66380)
if ((fp->f_ops->fo_flags & DFLAG_SEEKABLE) == 0)
         error = ESPIPE;
else if ((fp->f_flag & FWRITE) == 0)
         error = EBADF;
 else
         error = fo_fallocate(fp, offset, len, td);

gets rid of the 'out' label.

sys/kern/uipc_shm.c
1459 ↗(On Diff #66380)

Does it make sense to instantiate pages for the fallocated range ? Even if it does, I suggest to do it in the follow-up commit.

This revision is now accepted and ready to land.Jan 6 2020, 12:15 AM
kevans added inline comments.Jan 6 2020, 12:27 AM
sys/kern/uipc_shm.c
1459 ↗(On Diff #66380)

This wasn't clear to me- the wording on the spec (which our manpage seems to basically copy) carefully avoids making any promises on the state of any newly allocated space

kib added inline comments.Jan 6 2020, 12:50 AM
sys/kern/uipc_shm.c
1459 ↗(On Diff #66380)

This is because the only way to guarantee that the pages are not get recycled is to wire them. But if the intent is to provide the backing storage so that later the app would not pay the allocation cost, e.g. by waiting for pages to become available, then pages should be allocated.

I do not know what is the use case for fallocate() over shmfd.

sys/kern/uipc_shm.c
1459 ↗(On Diff #66380)

I do not know what is the use case for fallocate() over shmfd

Examples:

The intent in the Weston code that ended up in many places was "to guarantee that disk space is available for the file at the given size"… even if the file is purely in-memory. It's code that was written assuming on-disk /tmp first, with memfd/SHM_ANON bolted on later. [[ https://gitlab.freedesktop.org/mesa/mesa/blob/642125edd97384b88f491c1383a06c42ed16e11e/src/util/anon_file.c#L97-158 | In Mesa, we don't use posix_fallocate anymore ]]. So it's relatively safe to assume that fallocate is used as just a "better" ftruncate here.

kevans added a comment.EditedJan 8 2020, 3:22 PM

I have one more local change to this that allows shm_dotruncate to borrow a rangelock from the caller to use, then shm_fallocate takes a write lock on it before checking the size, but the changes are fairly non-invasive so I suspect I will not rev this review one more time unless requested.

Edit: though I'm considering just adding vm object locking to shm_fallocate at that point and calling shm_dotruncate_locked directly.

sys/kern/uipc_shm.c
1459 ↗(On Diff #66380)

Thanks for the insight!

This revision was automatically updated to reflect the committed changes.