Page MenuHomeFreeBSD

fix file_v_unlock
ClosedPublic

Authored by mjg on Sun, Oct 5, 5:23 PM.
Tags
None
Referenced Files
F131739231: D52915.diff
Fri, Oct 10, 7:10 PM
F131633811: D52915.id163596.diff
Thu, Oct 9, 10:14 PM
F131633805: D52915.id.diff
Thu, Oct 9, 10:14 PM
F131633803: D52915.id163606.diff
Thu, Oct 9, 10:14 PM
Unknown Object (File)
Thu, Oct 9, 6:53 PM
Unknown Object (File)
Tue, Oct 7, 4:35 PM
Unknown Object (File)
Tue, Oct 7, 4:23 PM
Unknown Object (File)
Mon, Oct 6, 9:46 AM
Subscribers

Details

Summary
commit 3f7d2a970f31232006ccd73a054b52b4149b316c (HEAD -> vlock2)
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Sun Oct 5 17:03:54 2025 +0000

    vfs offset: fix assertion failure in face of racing ffofset and setfl locking
    
    Both use the same 16 bit var to store their locked and waiters bits,
    then this in file_v_unlock:
    
            state = atomic_load_16(flagsp);
            if ((state & lock_wait_bit) == 0 &&
                atomic_cmpset_rel_16(flagsp, state, state & ~lock_bit))
                    return;
    
    can fail if for example foffset is being unlocked while setfl is getting
    locked.
    
    Afterwards the code assumes there are blocked waiters on foffset.

commit d87d378eae3bebd167dd9bb71450a9e44b2717a1
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Sun Oct 5 17:02:01 2025 +0000

    vfs foffset: drop weird commentary about offset protection
    
    It is readily apparent the FOFFSET_LOCKED flag protects it because the
    read followed its use.
    
    It also does not matter who said how it was protected earlier.
Test Plan

This panicked for me while running poudriere.

I created a testcase based on will-it-scale which reproduces it within seconds. No issues with my patch.

#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>

#define BUFLEN 4096
#define FILESIZE (1 * 1024 * 1024)

static char tmpfile[] = "/tmp/willitscale.XXXXXX";

char *testcase_description = "Same file read";

int fd;

void testcase_prepare(unsigned long nr_tasks)
{
        char buf[FILESIZE];

        fd = mkstemp(tmpfile);
        memset(buf, 0, sizeof(buf));
        assert(fd >= 0);
        assert(write(fd, buf, sizeof(buf)) == sizeof(buf));
}

void testcase(unsigned long long *iterations, unsigned long nr)
{
        char buf[BUFLEN];

        if (nr == 0) {
                while (1) {
                        int ret = read(fd, buf, BUFLEN);
                        assert(ret >= 0);
                        if (ret == 0)
                                lseek(fd, 0, SEEK_SET);

                        (*iterations)++;
                }
        } else {
                while (1) {
                        int ret = fcntl(fd, F_SETFL, FRDAHEAD);
                        assert(ret == 0);
                }
        }
}

void testcase_cleanup(void)
{
        unlink(tmpfile);
}

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

mjg requested review of this revision.Sun, Oct 5, 5:23 PM
sys/kern/vfs_vnops.c
857

This assertion is still racy. It should operate on state rather than *flagsp.

This seems ok to me. We could also just remove the assertion in question instead.

sys/kern/vfs_vnops.c
857

Oh, never mind, once the wait bit is set it won't be cleared.

This revision is now accepted and ready to land.Sun, Oct 5, 8:25 PM