Page MenuHomeFreeBSD

"Long" filename support changes
Needs ReviewPublic

Authored by cem on Sep 12 2017, 6:12 AM.

Details

Reviewers
julian
Summary

Bump up filename and path limits to accommodate 255 Unicode character paths,
like Windows (UTF-8 encoded). (The maximum length of an encoded code point
is four bytes, so NAME_MAX / MAXNAMLEN becomes ~1020.) PATH_MAX is bumped by
a factor of four as well to accomodate longer names.

Some associated limits like _XOPEN_NAME_MAX, _XOPEN_PATH_MAX, FILENAME_MAX
are bumped.

Glob is modified to allocate now larger path buffers on the heap, rather
than the stack.

EXTATTR_MAXNAMELEN is de-linked from NAME_MAX; there's no obvious reason
attribute names need to be longer than 255 bytes. In particular, Windows
Alternate Data Stream names cannot be longer than 255 bytes.

The old 1024 byte path limit is retained as OLD_PATH_MAX. Several ABIs are
preserved by retaining the old limit.

Finally, UFS is modified to enforce its existing 255 byte name limit.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 11506
Build 11861: arc lint + arc unit

Event Timeline

cem created this revision.Sep 12 2017, 6:12 AM
cem added a comment.Sep 12 2017, 6:28 AM

It seems that ZFS (which we don't use) has some ABI dependency on these constants via zfs_cmd, and this change breaks a CTASSERT on line 202 of zfs_ioctl.c. For some reason passing 3 MAXPATHLENs and 1 MAXNAMELEN via ioctl is too much. Someone would need to figure out how to fix ZFS to use this patch.

In D12330#255757, @cem wrote:

It seems that ZFS (which we don't use) has some ABI dependency on these constants via zfs_cmd, and this change breaks a CTASSERT on line 202 of zfs_ioctl.c. For some reason passing 3 MAXPATHLENs and 1 MAXNAMELEN via ioctl is too much. Someone would need to figure out how to fix ZFS to use this patch.

We have the ZFS expertise to fix that,,.. we DO use ZFS.,

cem added a comment.Sep 12 2017, 5:26 PM

It wouldn't be too hard to just convert the ioctl interface ZFS uses now to a syscall interface, which can take arbitrarily large commands. I think that's probably the most straightforward route.

cem added inline comments.Sep 12 2017, 5:31 PM
sys/ufs/ufs/ufs_lookup.c
227

Might want similar checks on other existing filesystems.

NFS already seems to check against its own MAXNAMLEN constant. tmpfs I think we can reasonably allow to just support long names. UDF is readonly. msdosfs probably needs a check added. FUSE? Not sure whether we should restrict userspace filesystems or not. ext2fs probably also should verify against EXT2FS_MAXNAMLEN (if it does now, it isn't explicit enough for me to see).

avg added a subscriber: avg.Sep 13 2017, 6:19 AM
In D12330#255757, @cem wrote:

It seems that ZFS (which we don't use) has some ABI dependency on these constants via zfs_cmd, and this change breaks a CTASSERT on line 202 of zfs_ioctl.c. For some reason passing 3 MAXPATHLENs and 1 MAXNAMELEN via ioctl is too much. Someone would need to figure out how to fix ZFS to use this patch.

I think that zfs_cmd_t is supposed to use ZFS_MAXNAMELEN. I am not sure why it doesn't.
zc_value probably needs to be at least MAXPATHLEN long because of zfs_ioc_obj_to_path, but zc_name definitely should be ZFS_MAXNAMELEN.

ed added a subscriber: ed.Sep 13 2017, 7:11 PM
ed added inline comments.
include/limits.h
131

These should not be modified. POSIX requires them to be exactly 255/1024.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/limits.h.html

ed added a comment.Sep 13 2017, 7:51 PM

I'm personally not a big fan of going into this direction. The problem with these constants is that they are defined in the first place. A lot of people don't realise that POSIX allows us to omit PATH_MAX and NAME_MAX entirely. By omitting them, you specify that these limits are not known at compile-time for this implementation. This is in our case arguably even more correct, as in our environment, the actual value depends on the mount point. The true limit may then be extracted using pathconf().

Instead of simply bumping these constants, I personally think we should just spend the effort to change our ABI to be path size independent. For example, struct dirent::d_name could as well be declared as having no size at all. POSIX allows this. I would even argue that this may even reduce memory usage here and there, as we often allocate PATH_MAX bytes, just to store some short pathname. This wouldn't be a problem if code just made dynamic allocations.

Once those kinds of changes are made, we could change limits.h to make PATH_MAX and NAME_MAX kernel-only. People may then change them to any value they prefer without breaking the ABI.

cem added a comment.Sep 13 2017, 8:42 PM
In D12330#256361, @ed wrote:

I'm personally not a big fan of going into this direction. The problem with these constants is that they are defined in the first place. A lot of people don't realise that POSIX allows us to omit PATH_MAX and NAME_MAX entirely. By omitting them, you specify that these limits are not known at compile-time for this implementation. This is in our case arguably even more correct, as in our environment, the actual value depends on the mount point. The true limit may then be extracted using pathconf().

Technically, sure. But porting software to e.g. GNU Hurd, which does this, is more difficult because of it. I don't think we want to increase porters' burdens.

Instead of simply bumping these constants, I personally think we should just spend the effort to change our ABI to be path size independent. For example, struct dirent::d_name could as well be declared as having no size at all. POSIX allows this. I would even argue that this may even reduce memory usage here and there, as we often allocate PATH_MAX bytes, just to store some short pathname. This wouldn't be a problem if code just made dynamic allocations.

I think your proposal is orthogonal to this work. Please counter propose a patch if you have one, though. I think it will be a pretty big change.

Once those kinds of changes are made, we could change limits.h to make PATH_MAX and NAME_MAX kernel-only. People may then change them to any value they prefer without breaking the ABI.

First you have to extract all of the constant-sized string fields out of various ABIs.

include/limits.h
131

Ok. I think this change was solely to support this test code, which (mis)uses _XOPEN_PATH_MAX in place of PATH_MAX: https://svnweb.freebsd.org/base?view=revision&revision=318210

so there are bits of this missing if you are (like us) trying to make this work in stable/10 or stable/11 because Isilon have already aommitted parts of it..
like change 313475 for example..
I need to merge that into our 10 based code until I can use this..

include/limits.h
131

maybe ngie (who works for Isilon) can weigh in on this?

cem added inline comments.Sep 14 2017, 5:56 PM
include/limits.h
131

Ngie has moved on to another employer.

avg added a comment.Sep 15 2017, 12:36 PM
In D12330#255757, @cem wrote:

It seems that ZFS (which we don't use) has some ABI dependency on these constants via zfs_cmd, and this change breaks a CTASSERT on line 202 of zfs_ioctl.c. For some reason passing 3 MAXPATHLENs and 1 MAXNAMELEN via ioctl is too much. Someone would need to figure out how to fix ZFS to use this patch.

I've just found out that that CTASSERT is no longer needed as we do not pass zfs_cmd_t directly via the ioctl path. Instead, we pass zfs_iocparm_t that contains a "pointer" to zfs_cmd_t and the latter is passed via explicit copyin and copyout.

I've applied these changes to 10.3 and will give them a run..

Have applied this patch at $JOB now to stable/10

do we need to allocate new MIBs for longer kinfo ioctls?
the code says:

#define KINFO_FILE_SIZE 1392

while the patch is not too bit it addresses a long standing issue in FreeBSD, that others have addressed long ago.

cem added a comment.Nov 10 2017, 4:43 PM

Have applied this patch at $JOB now to stable/10
do we need to allocate new MIBs for longer kinfo ioctls?
the code says:
#define KINFO_FILE_SIZE 1392
while the patch is not too bit it addresses a long standing issue in FreeBSD, that others have addressed long ago.

We attempted to preserve the existing KINFO (kldstat, etc) ABI. (Hence the changes to kf_path, kve_path, and kvo_path in this diff.) So I think it is fine to use the existing MIBs. We don't see 255-character unicode filenames as being especially relevant to kldstat and it was one less ABI to break.

cem updated this revision to Diff 40701.Mar 24 2018, 5:15 PM

Just a rebase on more recent CURRENT - untested

julian added a comment.Apr 4 2018, 1:28 PM

Can we make a list of other things that we'd need to change?
ABIs for Start

and there is a 1 byte name field in the name cache.

emaste added a subscriber: emaste.Aug 24 2019, 12:41 AM