As an example benefit, implement posix_fallocate(2) for shmfd. The shmfd implementation defers to shm_dotruncate as needed, which will reserve more swap space to accommodate -- this is something that Linux wants to be able to do, and it seems sensible enough.
Details
- Reviewers
kib - Group Reviewers
manpages - Commits
- rS356512: posixshm: implement posix_fallocate(2)
rS356510: posix_fallocate: push vnop implementation into the fileop layer
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Skipped - Unit
Tests Skipped - Build Status
Buildable 28491
Event Timeline
Move {sys,kern}_posix_fallocate into sys_generic.c -- there's no longer anything vfs specific to it, as those have been split off into vfs_vnops.
sys/kern/sys_generic.c | ||
---|---|---|
857 ↗ | (On Diff #66380) | if ((fp->f_ops->fo_flags & DFLAG_SEEKABLE) == 0) error = ESPIPE; else if ((fp->f_flag & FWRITE) == 0) error = EBADF; else error = fo_fallocate(fp, offset, len, td); gets rid of the 'out' label. |
sys/kern/uipc_shm.c | ||
1459 | Does it make sense to instantiate pages for the fallocated range ? Even if it does, I suggest to do it in the follow-up commit. |
sys/kern/uipc_shm.c | ||
---|---|---|
1459 | This wasn't clear to me- the wording on the spec (which our manpage seems to basically copy) carefully avoids making any promises on the state of any newly allocated space |
sys/kern/uipc_shm.c | ||
---|---|---|
1459 | This is because the only way to guarantee that the pages are not get recycled is to wire them. But if the intent is to provide the backing storage so that later the app would not pay the allocation cost, e.g. by waiting for pages to become available, then pages should be allocated. I do not know what is the use case for fallocate() over shmfd. |
sys/kern/uipc_shm.c | ||
---|---|---|
1459 |
Examples:
The intent in the Weston code that ended up in many places was "to guarantee that disk space is available for the file at the given size"… even if the file is purely in-memory. It's code that was written assuming on-disk /tmp first, with memfd/SHM_ANON bolted on later. [[ https://gitlab.freedesktop.org/mesa/mesa/blob/642125edd97384b88f491c1383a06c42ed16e11e/src/util/anon_file.c#L97-158 | In Mesa, we don't use posix_fallocate anymore ]]. So it's relatively safe to assume that fallocate is used as just a "better" ftruncate here. |
I have one more local change to this that allows shm_dotruncate to borrow a rangelock from the caller to use, then shm_fallocate takes a write lock on it before checking the size, but the changes are fairly non-invasive so I suspect I will not rev this review one more time unless requested.
Edit: though I'm considering just adding vm object locking to shm_fallocate at that point and calling shm_dotruncate_locked directly.
sys/kern/uipc_shm.c | ||
---|---|---|
1459 | Thanks for the insight! |