Details

Reviewers

khng
kib

Commits

rG08b9cc316a31: nfscl: Add a VOP_DEALLOCATE() for the NFSv4.2 client
rGda779f262c58: vfs_default: Change vop_stddeallocate() from static to global

Summary

This patch adds a VOP_DEALLOCATE() to the NFS client.
For NFSv4.2 servers that support the Deallocate operation,
it is used. Otherwise, it falls back on calling
vop_stddeallocate(), which is defined as a non-static global
function.

It calls vinvalbuf() via ncl_vinvalbuf() and vnode_pager_purge_range()
before doing the deallocate, to ensure that stale cached file
data purged.

It returns EFBIG when the rqst->r_offset + rqst->r_len > maxfilesize.
This error needs to be added to VOP_DEALLOCATE.9 in a future commit.

Test Plan

It has been tested against both a patched FreeBSD and Linux
NFSv4.2 servers. It was also tested for NFSv3 and NFSv4.1
mounts to exercise the fallback to vop_stddeallocate().

The main chunk of code for review is nfs_deallocate(),
round in sys/fs/nfsclient/nfs_clvnops.c.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

rmacklem requested review of this revision.Aug 22 2021, 10:58 PM

rmacklem created this revision.

It returns EFBIG when the rqst->r_offset + rqst->r_len > maxfilesize.

In my opinion this eliminates a use case that specifying offset to be 0 and length to be OFF_MAX turns the whole file sparse.

Modify nfs_deallocate() so that it returns EFBIG
if rqst->r_offset >= maxfilesize.
If rqst->r_offset < maxfilesize and rqst->r_offset + rqst->r_len > maxfilesize

rqst->r_len = maxfilesize - rqst->r_offset

This should avoid the EFBIG error return from the NFSv4.2 servers,
such as the Linux one, for the case where the request is

rqst->r_offset == 0
rqst->r_len == OFF_MAX

as requested by khng@.

Hmm, I think that both Ka Ho' and mine opinions now is that deallocate should stop at EOF, and report r_offset == EOF, len == 0.

So for instance, if somebody wants to deallocate everything except the first block, he would do deallocate(offset = 512, len = OFF_MAX - 512) and get the result 0, with r_offset = EOF, len = 0.

sys/kern/vfs_default.c
1128	Please commit this in advance.

The current patch should exhibit that behaviour. In nfs_deallocate()
line#3713 - will set len = mafilesize - offset (which avoids the EFBIG error reply)

the RPC will zero from offset (512bytes for the example) to EOF and reply NFS_OK

line#3729 - will set r_offset = file_size (EOF)
line#3733 - will set r_len = 0

The only case where behaviour will be slightly different than vop_stddeallocate()
is when rqst->r_offset >= mafilesize
--> This results in an EFBIG error reply, because NFSv4.2 servers such as Linux will

generate the same error for that case.

Is it possible to simply turn offset >= maxfilesize into a no-op and just return the EOF for client? The interface's error case when offset + length >= some_random_value seems more consistent in this manner.

Not directly related to this, maxfilesize should be something that can be reflected through statfs or other means, so that callers need not to guess what the file size limit is in the dark.

In D31640#713956, @khng wrote:

Is it possible to simply turn offset >= maxfilesize into a no-op and just return the EOF for client? The interface's error case when offset + length >= some_random_value seems more consistent in this manner.

I agree with this request.

Not directly related to this, maxfilesize should be something that can be reflected through statfs or other means, so that callers need not to guess what the file size limit is in the dark.

I agree with this on principle, but I think it should become a pathconf value, statfs() is more about mount itself than about the volume properties.

I could make r_offset >= maxfilesize a no-op.
The problem is that the client will only have a (possibly stale)
va.va_size for the file to use to set the reply r_offset.
--> To be honest, it is always a bit bogus, since another

client can change the file size at any time.

If just guessing at what va_size is to set r_offset to is fine with
you guys, I can do that.

I think the Linux fallocate(2) returns EFBIG for the cases where
r_offset or r_offset + r_len > maxfilesize, so this will make the interface
less Linux compatible.

Btw, yes, it would be nice if code above the VOP/VFS could
get maxfilesize.. The NFS server currently just makes a conservative
guess (what I once thought was the minimum for FFS2 with a small blocksize).

In D31640#714099, @rmacklem wrote:
I could make r_offset >= maxfilesize a no-op.
The problem is that the client will only have a (possibly stale)
va.va_size for the file to use to set the reply r_offset.
--> To be honest, it is always a bit bogus, since another
client can change the file size at any time.
If just guessing at what va_size is to set r_offset to is fine with
you guys, I can do that.

I am fine with this under such condition.

This version has nfs_deallocate() modified so that it
will return r_len = 0, r_offset = <best guess of file size>
and no error when rqst->r_offset >= maxfilesize.

This change was requested by khng@ and kib@.

khng added inline comments.Aug 25 2021, 4:18 PM

sys/fs/nfsclient/nfs_clvnops.c
3712 ↗	(On Diff #94155)	starting from 1400032 the new r_offset semantic should make this case easier.
3731–3735 ↗	(On Diff #94155)	starting from 1400032 the new r_offset semantic should make this case easier. Under the new semantic rmsr.r_offset equals rqsr.r_offset + number of bytes zeroed before EOF. So this does not require returning file size at all.

rmacklem added inline comments.Aug 25 2021, 9:59 PM

sys/fs/nfsclient/nfs_clvnops.c
3712 ↗	(On Diff #94155)	Ok, just to make this clear, you and kib@ have now decided that the correct setting for case where rqsr->r_offset > file size (past EOF) is to just leave rmsr->r_offset there and not set it to file size (EOF)?