Page MenuHomeFreeBSD

shm_open2: Implement SHM_GROW_ON_WRITE
ClosedPublic

Authored by kevans on Jun 29 2020, 3:53 AM.

Details

Summary

Lack of SHM_GROW_ON_WRITE is actively breaking Python tests, so go ahead and implement it. memfd_create will always set SHM_GROW_ON_WRITE, to match Linux behavior.

The basic premise is pretty simple, write(2) and friends should be allowed to grow the file, assuming F_SEAL_GROW has not been applied. Grab the total extent of the write (uio->uio_offset should be valid after foffset_lock_uio even if (flags & FOF_OFFSET) == 0) and quickly extend the file prior to writing if the write extends beyond the current shm_size.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sys/kern/uipc_shm.c
318 ↗(On Diff #73855)

The type cannot be size_t, it should be off_t, otherwise it wraps on 32bit arches.

327 ↗(On Diff #73855)

I think this should be checked for overflow.

sys/kern/uipc_shm.c
327 ↗(On Diff #73855)

Sure, this seems like a good idea. I guess for SHM_GROW_ON_WRITE we should probably return EFBIG on error, and if we're not growing from write we just clamp it to shmfd->shm_size?

Previously we would've overflowed when grabbing the rangelock, but I suppose we're more likely to overflow in the !FOF_OFFSET case than offset-supplied.

kevans marked an inline comment as done.

Unwhoops the datatype for size

Do an overflow check for the calculation; if we would've wrapped before, I suppose we'd have likely ended up with an invalid range for the rangelock... the new version will rangelock from uio->uio_offset to shmfd->shm_size in that particular case instead of offset, but perhaps it'd be better to use OFF_MAX instead of shmfd->shm_size? It's a bit of an edge-case, I think.

I realized that we do not check for the overflow in vn_io_fault() either.
I am fine with this commit as-is, and you could fix vn_io_fault() (if needed) later.
For instance, despite no check on filedesc/VFS layer, ffs_write() eventually does

	if ((uoff_t)uio->uio_offset + uio->uio_resid > fs->fs_maxfilesize)
		return (EFBIG);

which probably is good enough to avoid corruption for FFS.
But rangelocks are probably not immune.

This revision is now accepted and ready to land.Jun 30 2020, 6:11 AM
In D25502#563799, @kib wrote:

I realized that we do not check for the overflow in vn_io_fault() either.
I am fine with this commit as-is, and you could fix vn_io_fault() (if needed) later.
For instance, despite no check on filedesc/VFS layer, ffs_write() eventually does

	if ((uoff_t)uio->uio_offset + uio->uio_resid > fs->fs_maxfilesize)
		return (EFBIG);

which probably is good enough to avoid corruption for FFS.
But rangelocks are probably not immune.

Sure- I'll check out vn_io_fault later. I'm somewhat unsure about what to do there; it seems like in some cases (e.g. shmfd) it could make sense to allow the overflow because the file won't grow like most traditional filesystems so we'll just truncate the write to the pre-allocated size of the file. It would be simpler to not care about shmfd being an odd duck.

For this current diff, it occurs to me that this specific instance should be checking for overflow of SIZE_MAX rather than OFF_MAX since shm_size is a size_t. I'll make that tweak pre-commit.

In D25502#563799, @kib wrote:

I realized that we do not check for the overflow in vn_io_fault() either.
I am fine with this commit as-is, and you could fix vn_io_fault() (if needed) later.
For instance, despite no check on filedesc/VFS layer, ffs_write() eventually does

	if ((uoff_t)uio->uio_offset + uio->uio_resid > fs->fs_maxfilesize)
		return (EFBIG);

which probably is good enough to avoid corruption for FFS.
But rangelocks are probably not immune.

Sure- I'll check out vn_io_fault later. I'm somewhat unsure about what to do there; it seems like in some cases (e.g. shmfd) it could make sense to allow the overflow because the file won't grow like most traditional filesystems so we'll just truncate the write to the pre-allocated size of the file. It would be simpler to not care about shmfd being an odd duck.

vn_io_fault() is not used for shmfd.

My general concern there is that overflow for end causes rangelocks to misbehave. This is besides a possibility that some filesystems might be not prepared to this as well.

For this current diff, it occurs to me that this specific instance should be checking for overflow of SIZE_MAX rather than OFF_MAX since shm_size is a size_t. I'll make that tweak pre-commit.

Uhh. So the check in shm_dotruncate_locked() is boosted, and also we could have mis-synchronization between shm_object size and shm_size.

In D25502#564145, @kib wrote:
In D25502#563799, @kib wrote:

I realized that we do not check for the overflow in vn_io_fault() either.
I am fine with this commit as-is, and you could fix vn_io_fault() (if needed) later.
For instance, despite no check on filedesc/VFS layer, ffs_write() eventually does

	if ((uoff_t)uio->uio_offset + uio->uio_resid > fs->fs_maxfilesize)
		return (EFBIG);

which probably is good enough to avoid corruption for FFS.
But rangelocks are probably not immune.

Sure- I'll check out vn_io_fault later. I'm somewhat unsure about what to do there; it seems like in some cases (e.g. shmfd) it could make sense to allow the overflow because the file won't grow like most traditional filesystems so we'll just truncate the write to the pre-allocated size of the file. It would be simpler to not care about shmfd being an odd duck.

vn_io_fault() is not used for shmfd.

My general concern there is that overflow for end causes rangelocks to misbehave. This is besides a possibility that some filesystems might be not prepared to this as well.

Oh, whoops, of course- my bad.

For this current diff, it occurs to me that this specific instance should be checking for overflow of SIZE_MAX rather than OFF_MAX since shm_size is a size_t. I'll make that tweak pre-commit.

Uhh. So the check in shm_dotruncate_locked() is boosted, and also we could have mis-synchronization between shm_object size and shm_size.

I'll (separately) add a check for length > SIZE_MAX in shm_dotruncate_locked and kick back an EFBIG there; is there another source of mis-synchronization for shm_object size and shm_size, or are you generally referring to the truncation of shm_size that could happen when assigning the off_t length to 32-bit quantity?

In D25502#564145, @kib wrote:

For this current diff, it occurs to me that this specific instance should be checking for overflow of SIZE_MAX rather than OFF_MAX since shm_size is a size_t. I'll make that tweak pre-commit.

Uhh. So the check in shm_dotruncate_locked() is boosted, and also we could have mis-synchronization between shm_object size and shm_size.

I'll (separately) add a check for length > SIZE_MAX in shm_dotruncate_locked and kick back an EFBIG there; is there another source of mis-synchronization for shm_object size and shm_size, or are you generally referring to the truncation of shm_size that could happen when assigning the off_t length to 32-bit quantity?

I am not sure why check for SIZE_MAX and not change shm_size type to either off_t or better vm_ooffset_t.

In D25502#564823, @kib wrote:
In D25502#564145, @kib wrote:

For this current diff, it occurs to me that this specific instance should be checking for overflow of SIZE_MAX rather than OFF_MAX since shm_size is a size_t. I'll make that tweak pre-commit.

Uhh. So the check in shm_dotruncate_locked() is boosted, and also we could have mis-synchronization between shm_object size and shm_size.

I'll (separately) add a check for length > SIZE_MAX in shm_dotruncate_locked and kick back an EFBIG there; is there another source of mis-synchronization for shm_object size and shm_size, or are you generally referring to the truncation of shm_size that could happen when assigning the off_t length to 32-bit quantity?

I am not sure why check for SIZE_MAX and not change shm_size type to either off_t or better vm_ooffset_t.

My recollection was that we can't really change struct shmfd like that on stable without KBI break, and we should probably fix it at least on stable/12; leading to "ok, just validate it for the MFC then we can drop the check and expand shm_size after."

My recollection was that we can't really change struct shmfd like that on stable without KBI break, and we should probably fix it at least on stable/12; leading to "ok, just validate it for the MFC then we can drop the check and expand shm_size after."

Can you point me to the discussion, please ? I see struct shmfd as internal to uipc_shm.c. There are some uses in kern_sendfile.c but this is still internal to the kernel. There are more references to it in mac module, but there I think it is either 'do not care' or you can put large shm_size at the end of the structure for stable/12.

No discussion, just my own previous inspection when I was doing sealing stuff. Looking again, I see that you're correct -- the only thing that would seemingly break is libprocstat's procstat_get_shm_info_kvm.