Page MenuHomeFreeBSD

vfs: stop populating spares in struct stat
AbandonedPublic

Authored by mjg on Jan 30 2020, 11:13 AM.
Tags
None
Referenced Files
Unknown Object (File)
Feb 10 2024, 9:36 PM
Unknown Object (File)
Feb 2 2024, 10:50 PM
Unknown Object (File)
Dec 27 2023, 10:43 PM
Unknown Object (File)
Dec 20 2023, 7:30 AM
Unknown Object (File)
Oct 31 2023, 8:56 AM
Unknown Object (File)
Sep 29 2023, 8:52 AM
Unknown Object (File)
Sep 18 2023, 9:10 AM
Unknown Object (File)
May 20 2023, 10:17 AM
Subscribers

Details

Reviewers
kib
jeff
Summary

The struct is 144 bytes + 80 bytes of spares. This adds avoidable work for both zeroing and copying.

Introduce kernel_stat for internal use.

The rest of the change which replaces stat with kernel_stat where appropriate is here (along with this patch): https://people.freebsd.org/~mjg/stat-full.diff

Test Plan

make tinderbox, passed

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested changes to this revision.Jan 30 2020, 8:39 PM

I do not understand how this could work. It breaks ABI for anything that happen to include struct stat, and causes extremely hard to diagnose issues.

This revision now requires changes to proceed.Jan 30 2020, 8:39 PM

It does not. Only the kernel sees the smaller struct, while the userspace keeps using the full size. The only difference is that the st_spare array is not touched by the kernel.

In D23428#514116, @mjg wrote:

It does not. Only the kernel sees the smaller struct, while the userspace keeps using the full size. The only difference is that the st_spare array is not touched by the kernel.

Exactly. And if there is a structure that embeds struct stat, then its layout is different between userspace and kernel.

Assumption that all layouts are same regardless of the environments is fundamental for our source tree. We already had problems for lesser mismatches, this is why our struct mtx includes space for everything regardless of the kernel compilation options. Making that broken for user<->kernel ABI thing just blows stuff.

According to my grep nothing embedds the struct, but point taken.

The cleanest take on it would create a dedicated struct for the kernel (kernel_stat?) to make it clear it must never escape and a converter function which outputs userspace-friendly stat.

mjg edited the summary of this revision. (Show Details)
mjg edited the test plan for this revision. (Show Details)
  • introduce kernel_stat

If you insist doing this, I think that kernel_stat is still too rude.

You can reduce the size supplied to bzero() in vn_stat(), and do the same for size in copyouts. Of course they can utilize some handy macro instead of typing __offsetof() into all that places.

Also if you do this, I believe that compat32 stat and all fstats should be microoptimized.

In D23428#514731, @kib wrote:

If you insist doing this, I think that kernel_stat is still too rude.

I don't like the name but I don't have better ideas for it. "kstat" or so is already a term in ZFS.

You can reduce the size supplied to bzero() in vn_stat(), and do the same for size in copyouts. Of course they can utilize some handy macro instead of typing __offsetof() into all that places.

This would be too error prone in my opinion. To your previous point that perhaps the struct can become embedded in something else and should someone be looking at exporting it to userspace, 'struct stat' which has uninitialized spares in the kernel will risk exposing said area. In contrast untangled kernel-side stat and userspace stat makes it harder to make the mistake. That is, even if there were no significant size changes I think decoupling the two would provide a nice cleanup.

Also if you do this, I believe that compat32 stat and all fstats should be microoptimized.

ok

How about 'struct stat_internal'?

In D23428#515283, @mjg wrote:

How about 'struct stat_internal'?

I am still not convinced that adding this structure is a good idea. You already added copyout_kernel_stat(), so there is no reason not to add e.g. bzero_kernel_stat() and do it on normal struct stat with __offsetof. Adding a comment above st_spare that the part below is not copied/bzeroed should provide enough context.

So any incarnation of this patch makes the following fail contrib/netbsd-tests/lib/libc/c063/t_fstatat.c:

ATF_TC_BODY(fstatat_fd, tc)
{
        int dfd;
        int fd;
        struct stat st1, st2;

        ATF_REQUIRE(mkdir(DIR, 0755) == 0);
        ATF_REQUIRE((fd = open(FILE, O_CREAT|O_RDWR, 0644)) != -1);
        ATF_REQUIRE(close(fd) == 0);

        ATF_REQUIRE((dfd = open(DIR, O_RDONLY, 0)) != -1);
        ATF_REQUIRE(fstatat(dfd, BASEFILE, &st1, 0) == 0);
        ATF_REQUIRE(close(dfd) == 0);

        ATF_REQUIRE(stat(FILE, &st2) == 0);
        ATF_REQUIRE(memcmp(&st1, &st2, sizeof(st1)) == 0);
}

That is, they fstat twice to different buffers and memcmp the result. Since the spare area is not populated this fails. This makes me worried there are actual programs which also do it and which would break because of this change.

Thus another take is this: on Linux struct stat is 144 bytes. On FreeBSD this struct without any spares would have precisely the same size. The upshot of shrinking it to Linux size would be that userspace sees the same one (and uses less memory). If spares are to be preserved, we can use fewer of them. Adding 16 bytes in spares brings the structure to 160 bytes in total. I can add the necessary compat for 12.x binaries.

Note this change on top of other speed ups gives about +10% for fstat/second rate which I think makes it worth it given how often fstat is executed.

Thoughts?

In D23428#522227, @mjg wrote:

So any incarnation of this patch makes the following fail contrib/netbsd-tests/lib/libc/c063/t_fstatat.c:

ATF_TC_BODY(fstatat_fd, tc)
{
        int dfd;
        int fd;
        struct stat st1, st2;

        ATF_REQUIRE(mkdir(DIR, 0755) == 0);
        ATF_REQUIRE((fd = open(FILE, O_CREAT|O_RDWR, 0644)) != -1);
        ATF_REQUIRE(close(fd) == 0);

        ATF_REQUIRE((dfd = open(DIR, O_RDONLY, 0)) != -1);
        ATF_REQUIRE(fstatat(dfd, BASEFILE, &st1, 0) == 0);
        ATF_REQUIRE(close(dfd) == 0);

        ATF_REQUIRE(stat(FILE, &st2) == 0);
        ATF_REQUIRE(memcmp(&st1, &st2, sizeof(st1)) == 0);
}

That is, they fstat twice to different buffers and memcmp the result. Since the spare area is not populated this fails. This makes me worried there are actual programs which also do it and which would break because of this change.

Thus another take is this: on Linux struct stat is 144 bytes. On FreeBSD this struct without any spares would have precisely the same size. The upshot of shrinking it to Linux size would be that userspace sees the same one (and uses less memory). If spares are to be preserved, we can use fewer of them. Adding 16 bytes in spares brings the structure to 160 bytes in total. I can add the necessary compat for 12.x binaries.

Note this change on top of other speed ups gives about +10% for fstat/second rate which I think makes it worth it given how often fstat is executed.

Thoughts?

This is equivalent, by the size of ABI breakage and required work to keep up with it, to ino64. In other words, you impose huge troubles on all FreeBSD users to save copying several bytes. Also you need to patch bootstraps for all languages, like ghc/rust. In other words, this is no-go proposal.

Regardless of that, I think that the test is broken, if you take on it from the pedantic point of view, even if spares are excluded. If there is any padding, its content is also undefined.

Let me restate my position: it is fine not to bzero in kernel, and then to not copyout the trailing spares for struct stat. I am against kernel_stat, and very strongly against changing struct stat.

Ok, I'll write a patch which keeps the struct intact and only takes care of the non-spares area and then ask for a ports exp-run just in case.

I'll go farther: changing the size of struct stat is impossible. Logistically, it's a non starter due to how many tendrils this has...

I decided to drop this altogether as the statbuf comparison fails and is probably performed by real-world obscure code.