Index: projects/fuse2/sbin/mount_fusefs/mount_fusefs.8 =================================================================== --- projects/fuse2/sbin/mount_fusefs/mount_fusefs.8 (revision 350114) +++ projects/fuse2/sbin/mount_fusefs/mount_fusefs.8 (revision 350115) @@ -1,397 +1,402 @@ .\" Copyright (c) 1980, 1989, 1991, 1993 .\" The Regents of the University of California. .\" Copyright (c) 2005, 2006 Csaba Henk .\" All rights reserved. .\" .\" Copyright (c) 2019 The FreeBSD Foundation .\" .\" Portions of this documentation were written by BFF Storage Systems under .\" sponsorship from the FreeBSD Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd June 14, 2019 +.Dd July 18, 2019 .Dt MOUNT_FUSEFS 8 .Os .Sh NAME .Nm mount_fusefs .Nd mount a Fuse file system daemon .Sh SYNOPSIS .Nm .Op Fl A .Op Fl S .Op Fl v .Op Fl D Ar fuse_daemon .Op Fl O Ar daemon_opts .Op Fl s Ar special .Op Fl m Ar node .Op Fl h .Op Fl V .Op Fl o Ar option ... .Ar special node .Op Ar fuse_daemon ... .Sh DESCRIPTION Basic usage is to start a fuse daemon on the given .Ar special file. In practice, the daemon is assigned a .Ar special file automatically, which can then be indentified via .Xr fstat 1 . That special file can then be mounted by .Nm . .Pp However, the procedure of spawning a daemon will usually be automated so that it is performed by .Nm . If the command invoking a given .Ar fuse_daemon is appended to the list of arguments, .Nm will call the .Ar fuse_daemon via that command. In that way the .Ar fuse_daemon will be instructed to attach itself to .Ar special . From that on mounting goes as in the simple case. (See .Sx DAEMON MOUNTS . ) .Pp The .Ar special argument will normally be treated as the path of the special file to mount. .Pp However, if .Pa auto is passed as .Ar special , then .Nm will look for a suitable free fuse device by itself. .Pp Finally, if .Ar special is an integer it will be interpreted as the number of the file descriptor of an already open fuse device (used when the Fuse library invokes .Nm . (See .Sx DAEMON MOUNTS ) . .Pp The options are as follows: .Bl -tag -width indent .It Fl A , Ic --reject-allow_other Prohibit the .Cm allow_other mount flag. Intended for use in scripts and the .Xr sudoers 5 file. .It Fl S , Ic --safe Run in safe mode (i.e. reject invoking a filesystem daemon) .It Fl v Be verbose .It Fl D, Ic --daemon Ar daemon Call the specified .Ar daemon .It Fl O, Ic --daemon_opts Ar opts Add .Ar opts to the daemon's command line .It Fl s, Ic --special Ar special Use .Ar special as special .It Fl m, Ic --mountpath Ar node Mount on .Ar node .It Fl h, Ic --help Show help .It Fl V, Ic --version Show version information .It Fl o Mount options are specified via .Fl o . The following options are available (and also their negated versions, by prefixing them with .Dq no ) : .Bl -tag -width indent .It Cm allow_other Do not apply .Sx STRICT ACCESS POLICY . Only root can use this option .It Cm async I/O to the file system may be done asynchronously. Writes may delayed and/or reordered. .It Cm default_permissions Enable traditional (file mode based) permission checking in kernel +.It Cm intr +Allow signals to interrupt operations that are blocked waiting for a reply from the server. +When this option is in use, system calls may fail with +.Er EINTR +whenever a signal is received. .It Cm max_read Ns = Ns Ar n Limit size of read requests to .Ar n .It Cm neglect_shares Do not refuse unmounting if there are secondary mounts .It Cm private Refuse shared mounting of the daemon. This is the default behaviour, to allow sharing, expicitly use .Fl o Cm noprivate .It Cm push_symlinks_in Prefix absolute symlinks with the mountpoint .It Cm subtype Ns = Ns Ar fsname Suffix .Ar fsname to the file system name as reported by .Xr statfs 2 . This option can be used to identify the file system implemented by .Ar fuse_daemon . .El .El .Pp Besides the above mount options, there is a set of pseudo-mount options which are supported by the Fuse library. One can list these by passing .Fl h to a Fuse daemon. Most of these options only have affect on the behavior of the daemon (that is, their scope is limited to userspace). However, there are some which do require in-kernel support. Currently the options supported by the kernel are: .Bl -tag -width indent .It Cm direct_io Bypass the buffer cache system .It Cm kernel_cache By default cached buffers of a given file are flushed at each .Xr open 2 . This option disables this behaviour .El .Sh DAEMON MOUNTS Usually users do not need to use .Nm directly, as the Fuse library enables Fuse daemons to invoke .Nm . That is, .Pp .Dl fuse_daemon device mountpoint .Pp has the same effect as .Pp .Dl mount_fusefs auto mountpoint fuse_daemon .Pp This is the recommended usage when you want basic usage (eg, run the daemon at a low privilege level but mount it as root). .Sh STRICT ACCESS POLICY The strict access policy for Fuse filesystems lets one to use the filesystem only if the filesystem daemon has the same credentials (uid, real uid, gid, real gid) as the user. .Pp This is applied for Fuse mounts by default and only root can mount without the strict access policy (i.e. the .Cm allow_other mount option). .Pp This is to shield users from the daemon .Dq spying on their I/O activities. .Pp Users might opt to willingly relax strict access policy (as far they are concerned) by doing their own secondary mount (See .Sx SHARED MOUNTS ) . .Sh SHARED MOUNTS A Fuse daemon can be shared (i.e. mounted multiple times). When doing the first (primary) mount, the spawner and the mounter of the daemon must have the same uid, or the mounter should be the superuser. .Pp After the primary mount is in place, secondary mounts can be done by anyone unless this feature is disabled by .Cm private . The behaviour of a secondary mount is analogous to that of symbolic links: they redirect all filesystem operations to the primary mount. .Pp Doing a secondary mount is like signing an agreement: by this action, the mounter agrees that the Fuse daemon can trace her I/O activities. From then on she is not banned from using the filesystem (either via her own mount or via the primary mount), regardless whether .Cm allow_other is used or not. .Pp The device name of a secondary mount is the device name of the corresponding primary mount, followed by a '#' character and the index of the secondary mount; e.g. .Pa /dev/fuse0#3 . .Sh SECURITY System administrators might want to use a custom mount policy (ie., one going beyond the .Va vfs.usermount sysctl). The primary tool for such purposes is .Xr sudo 8 . However, given that .Nm is capable of invoking an arbitrary program, one must be careful when doing this. .Nm is designed in a way such that it makes that easy. For this purpose, there are options which disable certain risky features (i.e. .Fl S and .Fl A ) , and command line parsing is done in a flexible way: mixing options and non-options is allowed, but processing them stops at the third non-option argument (after the first two has been utilized as device and mountpoint). The rest of the command line specifies the daemon and its arguments. (Alternatively, the daemon, the special and the mount path can be specified using the respective options.) Note that .Nm ignores the environment variable .Ev POSIXLY_CORRECT and always behaves as described. .Pp In general, to be as scripting / .Xr sudoers 5 friendly as possible, no information has a fixed position in the command line, but once a given piece of information is provided, subsequent arguments/options cannot override it (with the exception of some non-critical ones). .Sh ENVIRONMENT .Bl -tag -width ".Ev MOUNT_FUSEFS_SAFE" .It Ev MOUNT_FUSEFS_SAFE This has the same effect as the .Fl S option. .It Ev MOUNT_FUSEFS_VERBOSE This has the same effect as the .Fl v option. .It Ev MOUNT_FUSEFS_IGNORE_UNKNOWN If set, .Nm will ignore uknown mount options. .It Ev MOUNT_FUSEFS_CALL_BY_LIB Adjust behavior to the needs of the FUSE library. Currently it effects help output. .El .Pp Although the following variables do not have any effect on .Nm itself, they affect the behaviour of fuse daemons: .Bl -tag -width ".Ev FUSE_DEV_NAME" .It Ev FUSE_DEV_NAME Device to attach. If not set, the multiplexer path .Ar /dev/fuse is used. .It Ev FUSE_DEV_FD File desciptor of an opened Fuse device to use. Overrides .Ev FUSE_DEV_NAME . .It Ev FUSE_NO_MOUNT If set, the library will not attempt to mount the filesystem, even if a mountpoint argument is supplied. .El .Sh FILES .Bl -tag -width /dev/fuse .It Pa /dev/fuse Fuse device with which the kernel and Fuse daemons can communicate. .It Pa /dev/fuse The multiplexer path. An .Xr open 2 performed on it automatically is passed to a free Fuse device by the kernel (which might be created just for this puprose). .El .Sh EXAMPLES Mount the example filesystem in the Fuse distribution (from its directory): either .Pp .Dl ./fusexmp /mnt/fuse .Pp or .Pp .Dl mount_fusefs auto /mnt/fuse ./fusexmp .Pp Doing the same in two steps, using .Pa /dev/fuse0 : .Pp .Dl FUSE_DEV_NAME=/dev/fuse ./fusexmp && .Dl mount_fusefs /dev/fuse /mnt/fuse .Pp A script wrapper for fusexmp which ensures that .Nm does not call any external utility and also provides a hacky (non race-free) automatic device selection: .Pp .Dl #!/bin/sh -e .Pp .Dl FUSE_DEV_NAME=/dev/fuse fusexmp .Dl mount_fusefs -S /dev/fuse /mnt/fuse \(lq$@\(rq .Sh SEE ALSO .Xr fstat 1 , .Xr mount 8 , .Xr sudo 8 , .Xr umount 8 .Sh HISTORY .Nm was written as the part of the .Fx implementation of the Fuse userspace filesystem framework (see .Xr https://github.com/libfuse/libfuse ) and first appeared in the .Pa sysutils/fusefs-kmod port, supporting .Fx 6.0 . It was added to the base system in .Fx 10.0 . .Sh CAVEATS This user interface is .Fx specific. Secondary mounts should be unmounted via their device name. If an attempt is made to unmount them via their filesystem root path, the unmount request will be forwarded to the primary mount path. In general, unmounting by device name is less error-prone than by mount path (although the latter will also work under normal circumstances). .Pp If the daemon is specified via the .Fl D and .Fl O options, it will be invoked via .Xr system 3 , and the daemon's command line will also have an .Dq & control operator appended, so that we do not have to wait for its termination. You should use a simple command line when invoking the daemon via these options. .Sh BUGS .Ar special is treated as a multiplexer if and only if it is literally the same as .Pa auto or .Pa /dev/fuse . Other paths which are equivalent with .Pa /dev/fuse (eg., .Pa /../dev/fuse ) are not. Index: projects/fuse2/sbin/mount_fusefs/mount_fusefs.c =================================================================== --- projects/fuse2/sbin/mount_fusefs/mount_fusefs.c (revision 350114) +++ projects/fuse2/sbin/mount_fusefs/mount_fusefs.c (revision 350115) @@ -1,491 +1,494 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2005 Jean-Sebastien Pedron * Copyright (c) 2005 Csaba Henk * All rights reserved. * * Copyright (c) 2019 The FreeBSD Foundation * * Portions of this software were developed by BFF Storage Systems under * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mntopts.h" #ifndef FUSE4BSD_VERSION #define FUSE4BSD_VERSION "0.3.9-pre1" #endif void __usage_short(void); void usage(void); void helpmsg(void); void showversion(void); static struct mntopt mopts[] = { #define ALTF_PRIVATE 0x01 { "private", 0, ALTF_PRIVATE, 1 }, { "neglect_shares", 0, 0x02, 1 }, { "push_symlinks_in", 0, 0x04, 1 }, { "allow_other", 0, 0x08, 1 }, { "default_permissions", 0, 0x10, 1 }, #define ALTF_MAXREAD 0x20 { "max_read=", 0, ALTF_MAXREAD, 1 }, #define ALTF_SUBTYPE 0x40 { "subtype=", 0, ALTF_SUBTYPE, 1 }, /* * MOPT_AUTOMOUNTED, included by MOPT_STDOPTS, does not fit into * the 'flags' argument to nmount(2). We have to abuse altflags * to pass it, as string, via iovec. */ #define ALTF_AUTOMOUNTED 0x100 { "automounted", 0, ALTF_AUTOMOUNTED, 1 }, + #define ALTF_INTR 0x200 + { "intr", 0, ALTF_INTR, 1 }, /* Linux specific options, we silently ignore them */ { "fsname=", 0, 0x00, 1 }, { "fd=", 0, 0x00, 1 }, { "rootmode=", 0, 0x00, 1 }, { "user_id=", 0, 0x00, 1 }, { "group_id=", 0, 0x00, 1 }, { "large_read", 0, 0x00, 1 }, /* "nonempty", just the first two chars are stripped off during parsing */ { "nempty", 0, 0x00, 1 }, { "async", 0, MNT_ASYNC, 0}, { "noasync", 1, MNT_ASYNC, 0}, MOPT_STDOPTS, MOPT_END }; struct mntval { int mv_flag; void *mv_value; int mv_len; }; static struct mntval mvals[] = { { ALTF_MAXREAD, NULL, 0 }, { ALTF_SUBTYPE, NULL, 0 }, { 0, NULL, 0 } }; #define DEFAULT_MOUNT_FLAGS ALTF_PRIVATE int main(int argc, char *argv[]) { struct iovec *iov; int mntflags, iovlen, verbose = 0; char *dev = NULL, *dir = NULL, mntpath[MAXPATHLEN]; char *devo = NULL, *diro = NULL; char ndev[128], fdstr[15]; int i, done = 0, reject_allow_other = 0, safe_level = 0; int altflags = DEFAULT_MOUNT_FLAGS; int __altflags = DEFAULT_MOUNT_FLAGS; int ch = 0; struct mntopt *mo; struct mntval *mv; static struct option longopts[] = { {"reject-allow_other", no_argument, NULL, 'A'}, {"safe", no_argument, NULL, 'S'}, {"daemon", required_argument, NULL, 'D'}, {"daemon_opts", required_argument, NULL, 'O'}, {"special", required_argument, NULL, 's'}, {"mountpath", required_argument, NULL, 'm'}, {"version", no_argument, NULL, 'V'}, {"help", no_argument, NULL, 'h'}, {0,0,0,0} }; int pid = 0; int fd = -1, fdx; char *ep; char *daemon_str = NULL, *daemon_opts = NULL; /* * We want a parsing routine which is not sensitive to * the position of args/opts; it should extract the * first two args and stop at the beginning of the rest. * (This makes it easier to call mount_fusefs from external * utils than it is with a strict "util flags args" syntax.) */ iov = NULL; iovlen = 0; mntflags = 0; /* All in all, I feel it more robust this way... */ unsetenv("POSIXLY_CORRECT"); if (getenv("MOUNT_FUSEFS_IGNORE_UNKNOWN")) getmnt_silent = 1; if (getenv("MOUNT_FUSEFS_VERBOSE")) verbose = 1; do { for (i = 0; i < 3; i++) { if (optind < argc && argv[optind][0] != '-') { if (dir) { done = 1; break; } if (dev) dir = argv[optind]; else dev = argv[optind]; optind++; } } switch(ch) { case 'A': reject_allow_other = 1; break; case 'S': safe_level = 1; break; case 'D': if (daemon_str) errx(1, "daemon specified inconsistently"); daemon_str = optarg; break; case 'O': if (daemon_opts) errx(1, "daemon opts specified inconsistently"); daemon_opts = optarg; break; case 'o': getmntopts(optarg, mopts, &mntflags, &altflags); for (mv = mvals; mv->mv_flag; ++mv) { if (! (altflags & mv->mv_flag)) continue; for (mo = mopts; mo->m_flag; ++mo) { char *p, *q; if (mo->m_flag != mv->mv_flag) continue; p = strstr(optarg, mo->m_option); if (p) { p += strlen(mo->m_option); q = p; while (*q != '\0' && *q != ',') q++; mv->mv_len = q - p + 1; mv->mv_value = malloc(mv->mv_len); memcpy(mv->mv_value, p, mv->mv_len - 1); ((char *)mv->mv_value)[mv->mv_len - 1] = '\0'; break; } } } break; case 's': if (devo) errx(1, "special specified inconsistently"); devo = optarg; break; case 'm': if (diro) errx(1, "mount path specified inconsistently"); diro = optarg; break; case 'v': verbose = 1; break; case 'h': helpmsg(); break; case 'V': showversion(); break; case '\0': break; case '?': default: usage(); } if (done) break; } while ((ch = getopt_long(argc, argv, "AvVho:SD:O:s:m:", longopts, NULL)) != -1); argc -= optind; argv += optind; if (devo) { if (dev) errx(1, "special specified inconsistently"); dev = devo; } else if (diro) errx(1, "if mountpoint is given via an option, special should also be given via an option"); if (diro) { if (dir) errx(1, "mount path specified inconsistently"); dir = diro; } if ((! dev) && argc > 0) { dev = *argv++; argc--; } if ((! dir) && argc > 0) { dir = *argv++; argc--; } if (! (dev && dir)) errx(1, "missing special and/or mountpoint"); for (mo = mopts; mo->m_flag; ++mo) { if (altflags & mo->m_flag) { int iov_done = 0; if (reject_allow_other && strcmp(mo->m_option, "allow_other") == 0) /* * reject_allow_other is stronger than a * negative of allow_other: if this is set, * allow_other is blocked, period. */ errx(1, "\"allow_other\" usage is banned by respective option"); for (mv = mvals; mv->mv_flag; ++mv) { if (mo->m_flag != mv->mv_flag) continue; if (mv->mv_value) { build_iovec(&iov, &iovlen, mo->m_option, mv->mv_value, mv->mv_len); iov_done = 1; break; } } if (! iov_done) build_iovec(&iov, &iovlen, mo->m_option, __DECONST(void *, ""), -1); } if (__altflags & mo->m_flag) { char *uscore_opt; if (asprintf(&uscore_opt, "__%s", mo->m_option) == -1) err(1, "failed to allocate memory"); build_iovec(&iov, &iovlen, uscore_opt, __DECONST(void *, ""), -1); free(uscore_opt); } } if (getenv("MOUNT_FUSEFS_SAFE")) safe_level = 1; if (safe_level > 0 && (argc > 0 || daemon_str || daemon_opts)) errx(1, "safe mode, spawning daemon not allowed"); if ((argc > 0 && (daemon_str || daemon_opts)) || (daemon_opts && ! daemon_str)) errx(1, "daemon specified inconsistently"); /* * Resolve the mountpoint with realpath(3) and remove unnecessary * slashes from the devicename if there are any. */ if (checkpath(dir, mntpath) != 0) err(1, "%s", mntpath); (void)rmslashes(dev, dev); if (strcmp(dev, "auto") == 0) dev = __DECONST(char *, "/dev/fuse"); if (strcmp(dev, "/dev/fuse") == 0) { if (! (argc > 0 || daemon_str)) { fprintf(stderr, "Please also specify the fuse daemon to run when mounting via the multiplexer!\n"); usage(); } if ((fd = open(dev, O_RDWR)) < 0) err(1, "failed to open fuse device"); } else { fdx = strtol(dev, &ep, 10); if (*ep == '\0') fd = fdx; } /* Identifying device */ if (fd >= 0) { struct stat sbuf; char *ndevbas, *lep; if (fstat(fd, &sbuf) == -1) err(1, "cannot stat device file descriptor"); strcpy(ndev, _PATH_DEV); ndevbas = ndev + strlen(_PATH_DEV); devname_r(sbuf.st_rdev, S_IFCHR, ndevbas, sizeof(ndev) - strlen(_PATH_DEV)); if (strncmp(ndevbas, "fuse", 4)) errx(1, "mounting inappropriate device"); strtol(ndevbas + 4, &lep, 10); if (*lep != '\0') errx(1, "mounting inappropriate device"); dev = ndev; } if (argc > 0 || daemon_str) { char *fds; if (fd < 0 && (fd = open(dev, O_RDWR)) < 0) err(1, "failed to open fuse device"); if (asprintf(&fds, "%d", fd) == -1) err(1, "failed to allocate memory"); setenv("FUSE_DEV_FD", fds, 1); free(fds); setenv("FUSE_NO_MOUNT", "1", 1); if (daemon_str) { char *bgdaemon; int len; if (! daemon_opts) daemon_opts = __DECONST(char *, ""); len = strlen(daemon_str) + 1 + strlen(daemon_opts) + 2 + 1; bgdaemon = calloc(1, len); if (! bgdaemon) err(1, "failed to allocate memory"); strlcpy(bgdaemon, daemon_str, len); strlcat(bgdaemon, " ", len); strlcat(bgdaemon, daemon_opts, len); strlcat(bgdaemon, " &", len); if (system(bgdaemon)) err(1, "failed to call fuse daemon"); } else { if ((pid = fork()) < 0) err(1, "failed to fork for fuse daemon"); if (pid == 0) { execvp(argv[0], argv); err(1, "failed to exec fuse daemon"); } } } /* Prepare the options vector for nmount(). build_iovec() is declared * in mntopts.h. */ sprintf(fdstr, "%d", fd); build_iovec(&iov, &iovlen, "fstype", __DECONST(void *, "fusefs"), -1); build_iovec(&iov, &iovlen, "fspath", mntpath, -1); build_iovec(&iov, &iovlen, "from", dev, -1); build_iovec(&iov, &iovlen, "fd", fdstr, -1); if (verbose) fprintf(stderr, "mounting fuse daemon on device %s\n", dev); if (nmount(iov, iovlen, mntflags) < 0) err(EX_OSERR, "%s on %s", dev, mntpath); exit(0); } void __usage_short(void) { fprintf(stderr, "usage:\n%s [-A|-S|-v|-V|-h|-D daemon|-O args|-s special|-m node|-o option...] special node [daemon args...]\n\n", getprogname()); } void usage(void) { struct mntopt *mo; __usage_short(); fprintf(stderr, "known options:\n"); for (mo = mopts; mo->m_flag; ++mo) fprintf(stderr, "\t%s\n", mo->m_option); fprintf(stderr, "\n(use -h for a detailed description of these options)\n"); exit(EX_USAGE); } void helpmsg(void) { if (! getenv("MOUNT_FUSEFS_CALL_BY_LIB")) { __usage_short(); fprintf(stderr, "description of options:\n"); } /* * The main use case of this function is giving info embedded in general * FUSE lib help output. Therefore the style and the content of the output * tries to fit there as much as possible. */ fprintf(stderr, " -o allow_other allow access to other users\n" /* " -o nonempty allow mounts over non-empty file/dir\n" */ " -o default_permissions enable permission checking by kernel\n" + " -o intr interruptible mount\n" /* " -o fsname=NAME set filesystem name\n" " -o large_read issue large read requests (2.4 only)\n" */ " -o subtype=NAME set filesystem type\n" " -o max_read=N set maximum size of read requests\n" " -o noprivate allow secondary mounting of the filesystem\n" " -o neglect_shares don't report EBUSY when unmount attempted\n" " in presence of secondary mounts\n" " -o push_symlinks_in prefix absolute symlinks with mountpoint\n" ); exit(EX_USAGE); } void showversion(void) { puts("mount_fusefs [fuse4bsd] version: " FUSE4BSD_VERSION); exit(EX_USAGE); } Index: projects/fuse2/sys/fs/fuse/fuse_ipc.c =================================================================== --- projects/fuse2/sys/fs/fuse/fuse_ipc.c (revision 350114) +++ projects/fuse2/sys/fs/fuse/fuse_ipc.c (revision 350115) @@ -1,1098 +1,1098 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * * Copyright (c) 2019 The FreeBSD Foundation * * Portions of this software were developed by BFF Storage Systems, LLC under * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_node.h" #include "fuse_ipc.h" #include "fuse_internal.h" SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. Higher numbers give more verbose messages * arg1: Textual message */ SDT_PROBE_DEFINE2(fusefs, , ipc, trace, "int", "char*"); static void fdisp_make_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct fuse_data *data, uint64_t nid, pid_t pid, struct ucred *cred); static void fuse_interrupt_send(struct fuse_ticket *otick, int err); static struct fuse_ticket *fticket_alloc(struct fuse_data *data); static void fticket_refresh(struct fuse_ticket *ftick); static void fticket_destroy(struct fuse_ticket *ftick); static int fticket_wait_answer(struct fuse_ticket *ftick); static inline int fticket_aw_pull_uio(struct fuse_ticket *ftick, struct uio *uio); static int fuse_body_audit(struct fuse_ticket *ftick, size_t blen); static fuse_handler_t fuse_standard_handler; static counter_u64_t fuse_ticket_count; SYSCTL_COUNTER_U64(_vfs_fusefs_stats, OID_AUTO, ticket_count, CTLFLAG_RD, &fuse_ticket_count, "Number of allocated tickets"); static long fuse_iov_permanent_bufsize = 1 << 19; SYSCTL_LONG(_vfs_fusefs, OID_AUTO, iov_permanent_bufsize, CTLFLAG_RW, &fuse_iov_permanent_bufsize, 0, "limit for permanently stored buffer size for fuse_iovs"); static int fuse_iov_credit = 16; SYSCTL_INT(_vfs_fusefs, OID_AUTO, iov_credit, CTLFLAG_RW, &fuse_iov_credit, 0, "how many times is an oversized fuse_iov tolerated"); MALLOC_DEFINE(M_FUSEMSG, "fuse_msgbuf", "fuse message buffer"); static uma_zone_t ticket_zone; /* * TODO: figure out how to timeout INTERRUPT requests, because the daemon may * leagally never respond */ static int fuse_interrupt_callback(struct fuse_ticket *tick, struct uio *uio) { struct fuse_ticket *otick, *x_tick; struct fuse_interrupt_in *fii; struct fuse_data *data = tick->tk_data; bool found = false; fii = (struct fuse_interrupt_in*)((char*)tick->tk_ms_fiov.base + sizeof(struct fuse_in_header)); fuse_lck_mtx_lock(data->aw_mtx); TAILQ_FOREACH_SAFE(otick, &data->aw_head, tk_aw_link, x_tick) { if (otick->tk_unique == fii->unique) { found = true; break; } } fuse_lck_mtx_unlock(data->aw_mtx); if (!found) { /* Original is already complete. Just return */ return 0; } /* Clear the original ticket's interrupt association */ otick->irq_unique = 0; if (tick->tk_aw_ohead.error == ENOSYS) { fsess_set_notimpl(data->mp, FUSE_INTERRUPT); return 0; } else if (tick->tk_aw_ohead.error == EAGAIN) { /* * There are two reasons we might get this: * 1) the daemon received the INTERRUPT request before the * original, or * 2) the daemon received the INTERRUPT request after it * completed the original request. * In the first case we should re-send the INTERRUPT. In the * second, we should ignore it. */ /* Resend */ fuse_interrupt_send(otick, EINTR); return 0; } else { /* Illegal FUSE_INTERRUPT response */ return EINVAL; } } /* Interrupt the operation otick. Return err as its error code */ void fuse_interrupt_send(struct fuse_ticket *otick, int err) { struct fuse_dispatcher fdi; struct fuse_interrupt_in *fii; struct fuse_in_header *ftick_hdr; struct fuse_data *data = otick->tk_data; struct fuse_ticket *tick, *xtick; struct ucred reused_creds; gid_t reused_groups[1]; if (otick->irq_unique == 0) { /* * If the daemon hasn't yet received otick, then we can answer * it ourselves and return. */ fuse_lck_mtx_lock(data->ms_mtx); STAILQ_FOREACH_SAFE(tick, &otick->tk_data->ms_head, tk_ms_link, xtick) { if (tick == otick) { STAILQ_REMOVE(&otick->tk_data->ms_head, tick, fuse_ticket, tk_ms_link); otick->tk_data->ms_count--; otick->tk_ms_link.stqe_next = NULL; fuse_lck_mtx_unlock(data->ms_mtx); fuse_lck_mtx_lock(otick->tk_aw_mtx); if (!fticket_answered(otick)) { fticket_set_answered(otick); otick->tk_aw_errno = err; wakeup(otick); } fuse_lck_mtx_unlock(otick->tk_aw_mtx); fuse_ticket_drop(tick); return; } } fuse_lck_mtx_unlock(data->ms_mtx); /* * If the fuse daemon doesn't support interrupts, then there's * nothing more that we can do */ if (!fsess_isimpl(data->mp, FUSE_INTERRUPT)) return; /* * If the fuse daemon has already received otick, then we must * send FUSE_INTERRUPT. */ ftick_hdr = fticket_in_header(otick); reused_creds.cr_uid = ftick_hdr->uid; reused_groups[0] = ftick_hdr->gid; reused_creds.cr_groups = reused_groups; fdisp_init(&fdi, sizeof(*fii)); fdisp_make_pid(&fdi, FUSE_INTERRUPT, data, ftick_hdr->nodeid, ftick_hdr->pid, &reused_creds); fii = fdi.indata; fii->unique = otick->tk_unique; fuse_insert_callback(fdi.tick, fuse_interrupt_callback); otick->irq_unique = fdi.tick->tk_unique; /* Interrupt ops should be delivered ASAP */ fuse_insert_message(fdi.tick, true); fdisp_destroy(&fdi); } else { /* This ticket has already been interrupted */ } } void fiov_init(struct fuse_iov *fiov, size_t size) { uint32_t msize = FU_AT_LEAST(size); fiov->len = 0; fiov->base = malloc(msize, M_FUSEMSG, M_WAITOK | M_ZERO); fiov->allocated_size = msize; fiov->credit = fuse_iov_credit; } void fiov_teardown(struct fuse_iov *fiov) { MPASS(fiov->base != NULL); free(fiov->base, M_FUSEMSG); } void fiov_adjust(struct fuse_iov *fiov, size_t size) { if (fiov->allocated_size < size || (fuse_iov_permanent_bufsize >= 0 && fiov->allocated_size - size > fuse_iov_permanent_bufsize && --fiov->credit < 0)) { fiov->base = realloc(fiov->base, FU_AT_LEAST(size), M_FUSEMSG, M_WAITOK | M_ZERO); if (!fiov->base) { panic("FUSE: realloc failed"); } fiov->allocated_size = FU_AT_LEAST(size); fiov->credit = fuse_iov_credit; /* Clear data buffer after reallocation */ bzero(fiov->base, size); } else if (size > fiov->len) { /* Clear newly extended portion of data buffer */ bzero((char*)fiov->base + fiov->len, size - fiov->len); } fiov->len = size; } /* Resize the fiov if needed, and clear it's buffer */ void fiov_refresh(struct fuse_iov *fiov) { fiov_adjust(fiov, 0); } static int fticket_ctor(void *mem, int size, void *arg, int flags) { struct fuse_ticket *ftick = mem; struct fuse_data *data = arg; FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); ftick->tk_data = data; if (ftick->tk_unique != 0) fticket_refresh(ftick); /* May be truncated to 32 bits */ ftick->tk_unique = atomic_fetchadd_long(&data->ticketer, 1); if (ftick->tk_unique == 0) ftick->tk_unique = atomic_fetchadd_long(&data->ticketer, 1); ftick->irq_unique = 0; refcount_init(&ftick->tk_refcount, 1); counter_u64_add(fuse_ticket_count, 1); return 0; } static void fticket_dtor(void *mem, int size, void *arg) { #ifdef INVARIANTS struct fuse_ticket *ftick = mem; #endif FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); counter_u64_add(fuse_ticket_count, -1); } static int fticket_init(void *mem, int size, int flags) { struct fuse_ticket *ftick = mem; bzero(ftick, sizeof(struct fuse_ticket)); fiov_init(&ftick->tk_ms_fiov, sizeof(struct fuse_in_header)); ftick->tk_ms_type = FT_M_FIOV; mtx_init(&ftick->tk_aw_mtx, "fuse answer delivery mutex", NULL, MTX_DEF); fiov_init(&ftick->tk_aw_fiov, 0); ftick->tk_aw_type = FT_A_FIOV; return 0; } static void fticket_fini(void *mem, int size) { struct fuse_ticket *ftick = mem; fiov_teardown(&ftick->tk_ms_fiov); fiov_teardown(&ftick->tk_aw_fiov); mtx_destroy(&ftick->tk_aw_mtx); } static inline struct fuse_ticket * fticket_alloc(struct fuse_data *data) { return uma_zalloc_arg(ticket_zone, data, M_WAITOK); } static inline void fticket_destroy(struct fuse_ticket *ftick) { return uma_zfree(ticket_zone, ftick); } static inline void fticket_refresh(struct fuse_ticket *ftick) { FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); fiov_refresh(&ftick->tk_ms_fiov); ftick->tk_ms_bufdata = NULL; ftick->tk_ms_bufsize = 0; ftick->tk_ms_type = FT_M_FIOV; bzero(&ftick->tk_aw_ohead, sizeof(struct fuse_out_header)); fiov_refresh(&ftick->tk_aw_fiov); ftick->tk_aw_errno = 0; ftick->tk_aw_bufdata = NULL; ftick->tk_aw_bufsize = 0; ftick->tk_aw_type = FT_A_FIOV; ftick->tk_flag = 0; } /* Prepar the ticket to be reused, but don't clear its data buffers */ static inline void fticket_reset(struct fuse_ticket *ftick) { FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); ftick->tk_ms_bufdata = NULL; ftick->tk_ms_bufsize = 0; ftick->tk_ms_type = FT_M_FIOV; bzero(&ftick->tk_aw_ohead, sizeof(struct fuse_out_header)); ftick->tk_aw_errno = 0; ftick->tk_aw_bufdata = NULL; ftick->tk_aw_bufsize = 0; ftick->tk_aw_type = FT_A_FIOV; ftick->tk_flag = 0; } static int fticket_wait_answer(struct fuse_ticket *ftick) { struct thread *td = curthread; sigset_t blockedset, oldset; int err = 0, stops_deferred; - struct fuse_data *data; + struct fuse_data *data = ftick->tk_data; bool interrupted = false; - if (fsess_isimpl(ftick->tk_data->mp, FUSE_INTERRUPT)) { + if (fsess_isimpl(ftick->tk_data->mp, FUSE_INTERRUPT) && + data->dataflags & FSESS_INTR) { SIGEMPTYSET(blockedset); } else { /* Block all signals except (implicitly) SIGKILL */ SIGFILLSET(blockedset); } stops_deferred = sigdeferstop(SIGDEFERSTOP_SILENT); kern_sigprocmask(td, SIG_BLOCK, NULL, &oldset, 0); fuse_lck_mtx_lock(ftick->tk_aw_mtx); retry: if (fticket_answered(ftick)) { goto out; } - data = ftick->tk_data; if (fdata_get_dead(data)) { err = ENOTCONN; fticket_set_answered(ftick); goto out; } kern_sigprocmask(td, SIG_BLOCK, &blockedset, NULL, 0); err = msleep(ftick, &ftick->tk_aw_mtx, PCATCH, "fu_ans", data->daemon_timeout * hz); kern_sigprocmask(td, SIG_SETMASK, &oldset, NULL, 0); if (err == EWOULDBLOCK) { SDT_PROBE2(fusefs, , ipc, trace, 3, "fticket_wait_answer: EWOULDBLOCK"); #ifdef XXXIP /* die conditionally */ if (!fdata_get_dead(data)) { fdata_set_dead(data); } #endif err = ETIMEDOUT; fticket_set_answered(ftick); } else if ((err == EINTR || err == ERESTART)) { /* * Whether we get EINTR or ERESTART depends on whether * SA_RESTART was set by sigaction(2). * * Try to interrupt the operation and wait for an EINTR response * to the original operation. If the file system does not * support FUSE_INTERRUPT, then we'll just wait for it to * complete like normal. If it does support FUSE_INTERRUPT, * then it will either respond EINTR to the original operation, * or EAGAIN to the interrupt. */ sigset_t tmpset; SDT_PROBE2(fusefs, , ipc, trace, 4, "fticket_wait_answer: interrupt"); fuse_lck_mtx_unlock(ftick->tk_aw_mtx); fuse_interrupt_send(ftick, err); PROC_LOCK(td->td_proc); mtx_lock(&td->td_proc->p_sigacts->ps_mtx); tmpset = td->td_proc->p_siglist; SIGSETOR(tmpset, td->td_siglist); mtx_unlock(&td->td_proc->p_sigacts->ps_mtx); PROC_UNLOCK(td->td_proc); fuse_lck_mtx_lock(ftick->tk_aw_mtx); if (!interrupted && !SIGISMEMBER(tmpset, SIGKILL)) { /* * Block all signals while we wait for an interrupt * response. The protocol doesn't discriminate between * different signals. */ SIGFILLSET(blockedset); interrupted = true; goto retry; } else { /* * Return immediately for fatal signals, or if this is * the second interruption. We should only be * interrupted twice if the thread is stopped, for * example during sigexit. */ } } else if (err) { SDT_PROBE2(fusefs, , ipc, trace, 6, "fticket_wait_answer: other error"); } else { SDT_PROBE2(fusefs, , ipc, trace, 7, "fticket_wait_answer: OK"); } out: if (!(err || fticket_answered(ftick))) { SDT_PROBE2(fusefs, , ipc, trace, 1, "FUSE: requester was woken up but still no answer"); err = ENXIO; } fuse_lck_mtx_unlock(ftick->tk_aw_mtx); sigallowstop(stops_deferred); return err; } static inline int fticket_aw_pull_uio(struct fuse_ticket *ftick, struct uio *uio) { int err = 0; size_t len = uio_resid(uio); if (len) { switch (ftick->tk_aw_type) { case FT_A_FIOV: fiov_adjust(fticket_resp(ftick), len); err = uiomove(fticket_resp(ftick)->base, len, uio); break; case FT_A_BUF: ftick->tk_aw_bufsize = len; err = uiomove(ftick->tk_aw_bufdata, len, uio); break; default: panic("FUSE: unknown answer type for ticket %p", ftick); } } return err; } int fticket_pull(struct fuse_ticket *ftick, struct uio *uio) { int err = 0; if (ftick->tk_aw_ohead.error) { return 0; } err = fuse_body_audit(ftick, uio_resid(uio)); if (!err) { err = fticket_aw_pull_uio(ftick, uio); } return err; } struct fuse_data * fdata_alloc(struct cdev *fdev, struct ucred *cred) { struct fuse_data *data; data = malloc(sizeof(struct fuse_data), M_FUSEMSG, M_WAITOK | M_ZERO); data->fdev = fdev; mtx_init(&data->ms_mtx, "fuse message list mutex", NULL, MTX_DEF); STAILQ_INIT(&data->ms_head); data->ms_count = 0; knlist_init_mtx(&data->ks_rsel.si_note, &data->ms_mtx); mtx_init(&data->aw_mtx, "fuse answer list mutex", NULL, MTX_DEF); TAILQ_INIT(&data->aw_head); data->daemoncred = crhold(cred); data->daemon_timeout = FUSE_DEFAULT_DAEMON_TIMEOUT; sx_init(&data->rename_lock, "fuse rename lock"); data->ref = 1; return data; } void fdata_trydestroy(struct fuse_data *data) { data->ref--; MPASS(data->ref >= 0); if (data->ref != 0) return; /* Driving off stage all that stuff thrown at device... */ sx_destroy(&data->rename_lock); crfree(data->daemoncred); mtx_destroy(&data->aw_mtx); knlist_delete(&data->ks_rsel.si_note, curthread, 0); knlist_destroy(&data->ks_rsel.si_note); mtx_destroy(&data->ms_mtx); free(data, M_FUSEMSG); } void fdata_set_dead(struct fuse_data *data) { FUSE_LOCK(); if (fdata_get_dead(data)) { FUSE_UNLOCK(); return; } fuse_lck_mtx_lock(data->ms_mtx); data->dataflags |= FSESS_DEAD; wakeup_one(data); selwakeuppri(&data->ks_rsel, PZERO + 1); wakeup(&data->ticketer); fuse_lck_mtx_unlock(data->ms_mtx); FUSE_UNLOCK(); } struct fuse_ticket * fuse_ticket_fetch(struct fuse_data *data) { int err = 0; struct fuse_ticket *ftick; ftick = fticket_alloc(data); if (!(data->dataflags & FSESS_INITED)) { /* Sleep until get answer for INIT messsage */ FUSE_LOCK(); if (!(data->dataflags & FSESS_INITED) && data->ticketer > 2) { err = msleep(&data->ticketer, &fuse_mtx, PCATCH | PDROP, "fu_ini", 0); if (err) fdata_set_dead(data); } else FUSE_UNLOCK(); } return ftick; } int fuse_ticket_drop(struct fuse_ticket *ftick) { int die; die = refcount_release(&ftick->tk_refcount); if (die) fticket_destroy(ftick); return die; } void fuse_insert_callback(struct fuse_ticket *ftick, fuse_handler_t * handler) { if (fdata_get_dead(ftick->tk_data)) { return; } ftick->tk_aw_handler = handler; fuse_lck_mtx_lock(ftick->tk_data->aw_mtx); fuse_aw_push(ftick); fuse_lck_mtx_unlock(ftick->tk_data->aw_mtx); } /* * Insert a new upgoing ticket into the message queue * * If urgent is true, insert at the front of the queue. Otherwise, insert in * FIFO order. */ void fuse_insert_message(struct fuse_ticket *ftick, bool urgent) { if (ftick->tk_flag & FT_DIRTY) { panic("FUSE: ticket reused without being refreshed"); } ftick->tk_flag |= FT_DIRTY; if (fdata_get_dead(ftick->tk_data)) { return; } fuse_lck_mtx_lock(ftick->tk_data->ms_mtx); if (urgent) fuse_ms_push_head(ftick); else fuse_ms_push(ftick); wakeup_one(ftick->tk_data); selwakeuppri(&ftick->tk_data->ks_rsel, PZERO + 1); KNOTE_LOCKED(&ftick->tk_data->ks_rsel.si_note, 0); fuse_lck_mtx_unlock(ftick->tk_data->ms_mtx); } static int fuse_body_audit(struct fuse_ticket *ftick, size_t blen) { int err = 0; enum fuse_opcode opcode; opcode = fticket_opcode(ftick); switch (opcode) { case FUSE_BMAP: err = (blen == sizeof(struct fuse_bmap_out)) ? 0 : EINVAL; break; case FUSE_LINK: case FUSE_LOOKUP: case FUSE_MKDIR: case FUSE_MKNOD: case FUSE_SYMLINK: if (fuse_libabi_geq(ftick->tk_data, 7, 9)) { err = (blen == sizeof(struct fuse_entry_out)) ? 0 : EINVAL; } else { err = (blen == FUSE_COMPAT_ENTRY_OUT_SIZE) ? 0 : EINVAL; } break; case FUSE_FORGET: panic("FUSE: a handler has been intalled for FUSE_FORGET"); break; case FUSE_GETATTR: case FUSE_SETATTR: if (fuse_libabi_geq(ftick->tk_data, 7, 9)) { err = (blen == sizeof(struct fuse_attr_out)) ? 0 : EINVAL; } else { err = (blen == FUSE_COMPAT_ATTR_OUT_SIZE) ? 0 : EINVAL; } break; case FUSE_READLINK: err = (PAGE_SIZE >= blen) ? 0 : EINVAL; break; case FUSE_UNLINK: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_RMDIR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_RENAME: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_OPEN: err = (blen == sizeof(struct fuse_open_out)) ? 0 : EINVAL; break; case FUSE_READ: err = (((struct fuse_read_in *)( (char *)ftick->tk_ms_fiov.base + sizeof(struct fuse_in_header) ))->size >= blen) ? 0 : EINVAL; break; case FUSE_WRITE: err = (blen == sizeof(struct fuse_write_out)) ? 0 : EINVAL; break; case FUSE_STATFS: if (fuse_libabi_geq(ftick->tk_data, 7, 4)) { err = (blen == sizeof(struct fuse_statfs_out)) ? 0 : EINVAL; } else { err = (blen == FUSE_COMPAT_STATFS_SIZE) ? 0 : EINVAL; } break; case FUSE_RELEASE: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_FSYNC: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_SETXATTR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_GETXATTR: case FUSE_LISTXATTR: /* * These can have varying response lengths, and 0 length * isn't necessarily invalid. */ err = 0; break; case FUSE_REMOVEXATTR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_FLUSH: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_INIT: if (blen == sizeof(struct fuse_init_out) || blen == FUSE_COMPAT_INIT_OUT_SIZE || blen == FUSE_COMPAT_22_INIT_OUT_SIZE) { err = 0; } else { err = EINVAL; } break; case FUSE_OPENDIR: err = (blen == sizeof(struct fuse_open_out)) ? 0 : EINVAL; break; case FUSE_READDIR: err = (((struct fuse_read_in *)( (char *)ftick->tk_ms_fiov.base + sizeof(struct fuse_in_header) ))->size >= blen) ? 0 : EINVAL; break; case FUSE_RELEASEDIR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_FSYNCDIR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_GETLK: err = (blen == sizeof(struct fuse_lk_out)) ? 0 : EINVAL; break; case FUSE_SETLK: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_SETLKW: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_ACCESS: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_CREATE: if (fuse_libabi_geq(ftick->tk_data, 7, 9)) { err = (blen == sizeof(struct fuse_entry_out) + sizeof(struct fuse_open_out)) ? 0 : EINVAL; } else { err = (blen == FUSE_COMPAT_ENTRY_OUT_SIZE + sizeof(struct fuse_open_out)) ? 0 : EINVAL; } break; case FUSE_DESTROY: err = (blen == 0) ? 0 : EINVAL; break; default: panic("FUSE: opcodes out of sync (%d)\n", opcode); } return err; } static inline void fuse_setup_ihead(struct fuse_in_header *ihead, struct fuse_ticket *ftick, uint64_t nid, enum fuse_opcode op, size_t blen, pid_t pid, struct ucred *cred) { ihead->len = sizeof(*ihead) + blen; ihead->unique = ftick->tk_unique; ihead->nodeid = nid; ihead->opcode = op; ihead->pid = pid; ihead->uid = cred->cr_uid; ihead->gid = cred->cr_groups[0]; } /* * fuse_standard_handler just pulls indata and wakes up pretender. * Doesn't try to interpret data, that's left for the pretender. * Though might do a basic size verification before the pull-in takes place */ static int fuse_standard_handler(struct fuse_ticket *ftick, struct uio *uio) { int err = 0; err = fticket_pull(ftick, uio); fuse_lck_mtx_lock(ftick->tk_aw_mtx); if (!fticket_answered(ftick)) { fticket_set_answered(ftick); ftick->tk_aw_errno = err; wakeup(ftick); } fuse_lck_mtx_unlock(ftick->tk_aw_mtx); return err; } /* * Reinitialize a dispatcher from a pid and node id, without resizing or * clearing its data buffers */ static void fdisp_refresh_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct mount *mp, uint64_t nid, pid_t pid, struct ucred *cred) { MPASS(fdip->tick); MPASS2(sizeof(fdip->finh) + fdip->iosize <= fdip->tick->tk_ms_fiov.len, "Must use fdisp_make_pid to increase the size of the fiov"); fticket_reset(fdip->tick); FUSE_DIMALLOC(&fdip->tick->tk_ms_fiov, fdip->finh, fdip->indata, fdip->iosize); fuse_setup_ihead(fdip->finh, fdip->tick, nid, op, fdip->iosize, pid, cred); } /* Initialize a dispatcher from a pid and node id */ static void fdisp_make_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct fuse_data *data, uint64_t nid, pid_t pid, struct ucred *cred) { if (fdip->tick) { fticket_refresh(fdip->tick); } else { fdip->tick = fuse_ticket_fetch(data); } /* FUSE_DIMALLOC will bzero the fiovs when it enlarges them */ FUSE_DIMALLOC(&fdip->tick->tk_ms_fiov, fdip->finh, fdip->indata, fdip->iosize); fuse_setup_ihead(fdip->finh, fdip->tick, nid, op, fdip->iosize, pid, cred); } void fdisp_make(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct mount *mp, uint64_t nid, struct thread *td, struct ucred *cred) { struct fuse_data *data = fuse_get_mpdata(mp); RECTIFY_TDCR(td, cred); return fdisp_make_pid(fdip, op, data, nid, td->td_proc->p_pid, cred); } void fdisp_make_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred) { struct mount *mp = vnode_mount(vp); struct fuse_data *data = fuse_get_mpdata(mp); RECTIFY_TDCR(td, cred); return fdisp_make_pid(fdip, op, data, VTOI(vp), td->td_proc->p_pid, cred); } /* Refresh a fuse_dispatcher so it can be reused, but don't zero its data */ void fdisp_refresh_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred) { RECTIFY_TDCR(td, cred); return fdisp_refresh_pid(fdip, op, vnode_mount(vp), VTOI(vp), td->td_proc->p_pid, cred); } void fdisp_refresh(struct fuse_dispatcher *fdip) { fticket_refresh(fdip->tick); } SDT_PROBE_DEFINE2(fusefs, , ipc, fdisp_wait_answ_error, "char*", "int"); int fdisp_wait_answ(struct fuse_dispatcher *fdip) { int err = 0; fdip->answ_stat = 0; fuse_insert_callback(fdip->tick, fuse_standard_handler); fuse_insert_message(fdip->tick, false); if ((err = fticket_wait_answer(fdip->tick))) { fuse_lck_mtx_lock(fdip->tick->tk_aw_mtx); if (fticket_answered(fdip->tick)) { /* * Just between noticing the interrupt and getting here, * the standard handler has completed his job. * So we drop the ticket and exit as usual. */ SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: interrupted, already answered", err); fuse_lck_mtx_unlock(fdip->tick->tk_aw_mtx); goto out; } else { /* * So we were faster than the standard handler. * Then by setting the answered flag we get *him* * to drop the ticket. */ SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: interrupted, setting to answered", err); fticket_set_answered(fdip->tick); fuse_lck_mtx_unlock(fdip->tick->tk_aw_mtx); return err; } } if (fdip->tick->tk_aw_errno == ENOTCONN) { /* The daemon died while we were waiting for a response */ err = ENOTCONN; goto out; } else if (fdip->tick->tk_aw_errno) { /* * There was some sort of communication error with the daemon * that the client wouldn't understand. */ SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: explicit EIO-ing", fdip->tick->tk_aw_errno); err = EIO; goto out; } if ((err = fdip->tick->tk_aw_ohead.error)) { SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: setting status", fdip->tick->tk_aw_ohead.error); /* * This means a "proper" fuse syscall error. * We record this value so the caller will * be able to know it's not a boring messaging * failure, if she wishes so (and if not, she can * just simply propagate the return value of this routine). * [XXX Maybe a bitflag would do the job too, * if other flags needed, this will be converted thusly.] */ fdip->answ_stat = err; goto out; } fdip->answ = fticket_resp(fdip->tick)->base; fdip->iosize = fticket_resp(fdip->tick)->len; return 0; out: return err; } void fuse_ipc_init(void) { ticket_zone = uma_zcreate("fuse_ticket", sizeof(struct fuse_ticket), fticket_ctor, fticket_dtor, fticket_init, fticket_fini, UMA_ALIGN_PTR, 0); fuse_ticket_count = counter_u64_alloc(M_WAITOK); counter_u64_zero(fuse_ticket_count); } void fuse_ipc_destroy(void) { counter_u64_free(fuse_ticket_count); uma_zdestroy(ticket_zone); } Index: projects/fuse2/sys/fs/fuse/fuse_ipc.h =================================================================== --- projects/fuse2/sys/fs/fuse/fuse_ipc.h (revision 350114) +++ projects/fuse2/sys/fs/fuse/fuse_ipc.h (revision 350115) @@ -1,427 +1,428 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * * Copyright (c) 2019 The FreeBSD Foundation * * Portions of this software were developed by BFF Storage Systems, LLC under * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _FUSE_IPC_H_ #define _FUSE_IPC_H_ #include #include enum fuse_data_cache_mode { FUSE_CACHE_UC, FUSE_CACHE_WT, FUSE_CACHE_WB, }; struct fuse_iov { void *base; size_t len; size_t allocated_size; int credit; }; void fiov_init(struct fuse_iov *fiov, size_t size); void fiov_teardown(struct fuse_iov *fiov); void fiov_refresh(struct fuse_iov *fiov); void fiov_adjust(struct fuse_iov *fiov, size_t size); #define FUSE_DIMALLOC(fiov, spc1, spc2, amnt) do { \ fiov_adjust(fiov, (sizeof(*(spc1)) + (amnt))); \ (spc1) = (fiov)->base; \ (spc2) = (char *)(fiov)->base + (sizeof(*(spc1))); \ } while (0) #define FU_AT_LEAST(siz) max((siz), 160) #define FUSE_ASSERT_AW_DONE(ftick) \ KASSERT((ftick)->tk_aw_link.tqe_next == NULL && \ (ftick)->tk_aw_link.tqe_prev == NULL, \ ("FUSE: ticket still on answer delivery list %p", (ftick))) #define FUSE_ASSERT_MS_DONE(ftick) \ KASSERT((ftick)->tk_ms_link.stqe_next == NULL, \ ("FUSE: ticket still on message list %p", (ftick))) struct fuse_ticket; struct fuse_data; typedef int fuse_handler_t(struct fuse_ticket *ftick, struct uio *uio); struct fuse_ticket { /* fields giving the identity of the ticket */ uint64_t tk_unique; struct fuse_data *tk_data; int tk_flag; u_int tk_refcount; /* * If this ticket's operation has been interrupted, this will hold the * unique value of the FUSE_INTERRUPT operation. Otherwise, it will be * 0. */ uint64_t irq_unique; /* fields for initiating an upgoing message */ struct fuse_iov tk_ms_fiov; void *tk_ms_bufdata; size_t tk_ms_bufsize; enum { FT_M_FIOV, FT_M_BUF } tk_ms_type; STAILQ_ENTRY(fuse_ticket) tk_ms_link; /* fields for handling answers coming from userspace */ struct fuse_iov tk_aw_fiov; void *tk_aw_bufdata; size_t tk_aw_bufsize; enum { FT_A_FIOV, FT_A_BUF } tk_aw_type; struct fuse_out_header tk_aw_ohead; int tk_aw_errno; struct mtx tk_aw_mtx; fuse_handler_t *tk_aw_handler; TAILQ_ENTRY(fuse_ticket) tk_aw_link; }; #define FT_ANSW 0x01 /* request of ticket has already been answered */ #define FT_DIRTY 0x04 /* ticket has been used */ static inline struct fuse_iov * fticket_resp(struct fuse_ticket *ftick) { return (&ftick->tk_aw_fiov); } static inline bool fticket_answered(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_aw_mtx, MA_OWNED); return (ftick->tk_flag & FT_ANSW); } static inline void fticket_set_answered(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_aw_mtx, MA_OWNED); ftick->tk_flag |= FT_ANSW; } static inline struct fuse_in_header* fticket_in_header(struct fuse_ticket *ftick) { return (struct fuse_in_header *)(ftick->tk_ms_fiov.base); } static inline enum fuse_opcode fticket_opcode(struct fuse_ticket *ftick) { return fticket_in_header(ftick)->opcode; } int fticket_pull(struct fuse_ticket *ftick, struct uio *uio); /* * The data representing a FUSE session. */ struct fuse_data { struct cdev *fdev; struct mount *mp; struct vnode *vroot; struct ucred *daemoncred; int dataflags; int ref; struct mtx ms_mtx; STAILQ_HEAD(, fuse_ticket) ms_head; int ms_count; struct mtx aw_mtx; TAILQ_HEAD(, fuse_ticket) aw_head; /* * Holds the next value of the FUSE operation unique value. * Also, serves as a wakeup channel to prevent any operations from * being created before INIT completes. */ u_long ticketer; struct sx rename_lock; uint32_t fuse_libabi_major; uint32_t fuse_libabi_minor; uint32_t max_readahead_blocks; uint32_t max_write; uint32_t max_read; uint32_t subtype; char volname[MAXPATHLEN]; struct selinfo ks_rsel; int daemon_timeout; unsigned time_gran; uint64_t notimpl; uint64_t mnt_flag; enum fuse_data_cache_mode cache_mode; }; #define FSESS_DEAD 0x0001 /* session is to be closed */ #define FSESS_INITED 0x0004 /* session has been inited */ #define FSESS_DAEMON_CAN_SPY 0x0010 /* let non-owners access this fs */ /* (and being observed by the daemon) */ #define FSESS_PUSH_SYMLINKS_IN 0x0020 /* prefix absolute symlinks with mp */ #define FSESS_DEFAULT_PERMISSIONS 0x0040 /* kernel does permission checking */ #define FSESS_ASYNC_READ 0x1000 /* allow multiple reads of some file */ #define FSESS_POSIX_LOCKS 0x2000 /* daemon supports POSIX locks */ #define FSESS_EXPORT_SUPPORT 0x10000 /* daemon supports NFS-style lookups */ +#define FSESS_INTR 0x20000 /* interruptible mounts */ #define FSESS_MNTOPTS_MASK ( \ FSESS_DAEMON_CAN_SPY | FSESS_PUSH_SYMLINKS_IN | \ - FSESS_DEFAULT_PERMISSIONS) + FSESS_DEFAULT_PERMISSIONS | FSESS_INTR) extern int fuse_data_cache_mode; static inline struct fuse_data * fuse_get_mpdata(struct mount *mp) { return mp->mnt_data; } static inline bool fsess_isimpl(struct mount *mp, int opcode) { struct fuse_data *data = fuse_get_mpdata(mp); return ((data->notimpl & (1ULL << opcode)) == 0); } static inline void fsess_set_notimpl(struct mount *mp, int opcode) { struct fuse_data *data = fuse_get_mpdata(mp); data->notimpl |= (1ULL << opcode); } static inline bool fsess_opt_datacache(struct mount *mp) { struct fuse_data *data = fuse_get_mpdata(mp); return (data->cache_mode != FUSE_CACHE_UC); } static inline bool fsess_opt_mmap(struct mount *mp) { return (fsess_opt_datacache(mp)); } static inline bool fsess_opt_writeback(struct mount *mp) { struct fuse_data *data = fuse_get_mpdata(mp); return (data->cache_mode == FUSE_CACHE_WB); } /* Insert a new upgoing message */ static inline void fuse_ms_push(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->ms_mtx, MA_OWNED); refcount_acquire(&ftick->tk_refcount); STAILQ_INSERT_TAIL(&ftick->tk_data->ms_head, ftick, tk_ms_link); ftick->tk_data->ms_count++; } /* Insert a new upgoing message to the front of the queue */ static inline void fuse_ms_push_head(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->ms_mtx, MA_OWNED); refcount_acquire(&ftick->tk_refcount); STAILQ_INSERT_HEAD(&ftick->tk_data->ms_head, ftick, tk_ms_link); ftick->tk_data->ms_count++; } static inline struct fuse_ticket * fuse_ms_pop(struct fuse_data *data) { struct fuse_ticket *ftick = NULL; mtx_assert(&data->ms_mtx, MA_OWNED); if ((ftick = STAILQ_FIRST(&data->ms_head))) { STAILQ_REMOVE_HEAD(&data->ms_head, tk_ms_link); data->ms_count--; #ifdef INVARIANTS MPASS(data->ms_count >= 0); ftick->tk_ms_link.stqe_next = NULL; #endif } return (ftick); } static inline void fuse_aw_push(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->aw_mtx, MA_OWNED); refcount_acquire(&ftick->tk_refcount); TAILQ_INSERT_TAIL(&ftick->tk_data->aw_head, ftick, tk_aw_link); } static inline void fuse_aw_remove(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->aw_mtx, MA_OWNED); TAILQ_REMOVE(&ftick->tk_data->aw_head, ftick, tk_aw_link); #ifdef INVARIANTS ftick->tk_aw_link.tqe_next = NULL; ftick->tk_aw_link.tqe_prev = NULL; #endif } static inline struct fuse_ticket * fuse_aw_pop(struct fuse_data *data) { struct fuse_ticket *ftick; mtx_assert(&data->aw_mtx, MA_OWNED); if ((ftick = TAILQ_FIRST(&data->aw_head)) != NULL) fuse_aw_remove(ftick); return (ftick); } struct fuse_ticket *fuse_ticket_fetch(struct fuse_data *data); int fuse_ticket_drop(struct fuse_ticket *ftick); void fuse_insert_callback(struct fuse_ticket *ftick, fuse_handler_t *handler); void fuse_insert_message(struct fuse_ticket *ftick, bool irq); static inline bool fuse_libabi_geq(struct fuse_data *data, uint32_t abi_maj, uint32_t abi_min) { return (data->fuse_libabi_major > abi_maj || (data->fuse_libabi_major == abi_maj && data->fuse_libabi_minor >= abi_min)); } struct fuse_data *fdata_alloc(struct cdev *dev, struct ucred *cred); void fdata_trydestroy(struct fuse_data *data); void fdata_set_dead(struct fuse_data *data); static inline bool fdata_get_dead(struct fuse_data *data) { return (data->dataflags & FSESS_DEAD); } struct fuse_dispatcher { struct fuse_ticket *tick; struct fuse_in_header *finh; void *indata; size_t iosize; uint64_t nodeid; int answ_stat; void *answ; }; static inline void fdisp_init(struct fuse_dispatcher *fdisp, size_t iosize) { fdisp->iosize = iosize; fdisp->tick = NULL; } static inline void fdisp_destroy(struct fuse_dispatcher *fdisp) { fuse_ticket_drop(fdisp->tick); #ifdef INVARIANTS fdisp->tick = NULL; #endif } void fdisp_refresh(struct fuse_dispatcher *fdip); void fdisp_make(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct mount *mp, uint64_t nid, struct thread *td, struct ucred *cred); void fdisp_make_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred); void fdisp_refresh_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred); int fdisp_wait_answ(struct fuse_dispatcher *fdip); static inline int fdisp_simple_putget_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred) { fdisp_make_vp(fdip, op, vp, td, cred); return (fdisp_wait_answ(fdip)); } #endif /* _FUSE_IPC_H_ */ Index: projects/fuse2/sys/fs/fuse/fuse_vfsops.c =================================================================== --- projects/fuse2/sys/fs/fuse/fuse_vfsops.c (revision 350114) +++ projects/fuse2/sys/fs/fuse/fuse_vfsops.c (revision 350115) @@ -1,691 +1,692 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * * Copyright (c) 2019 The FreeBSD Foundation * * Portions of this software were developed by BFF Storage Systems, LLC under * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_node.h" #include "fuse_ipc.h" #include "fuse_internal.h" #include #include SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. Higher numbers give more verbose messages * arg1: Textual message */ SDT_PROBE_DEFINE2(fusefs, , vfsops, trace, "int", "char*"); /* This will do for privilege types for now */ #ifndef PRIV_VFS_FUSE_ALLOWOTHER #define PRIV_VFS_FUSE_ALLOWOTHER PRIV_VFS_MOUNT_NONUSER #endif #ifndef PRIV_VFS_FUSE_MOUNT_NONUSER #define PRIV_VFS_FUSE_MOUNT_NONUSER PRIV_VFS_MOUNT_NONUSER #endif #ifndef PRIV_VFS_FUSE_SYNC_UNMOUNT #define PRIV_VFS_FUSE_SYNC_UNMOUNT PRIV_VFS_MOUNT_NONUSER #endif static vfs_fhtovp_t fuse_vfsop_fhtovp; static vfs_mount_t fuse_vfsop_mount; static vfs_unmount_t fuse_vfsop_unmount; static vfs_root_t fuse_vfsop_root; static vfs_statfs_t fuse_vfsop_statfs; static vfs_vget_t fuse_vfsop_vget; struct vfsops fuse_vfsops = { .vfs_fhtovp = fuse_vfsop_fhtovp, .vfs_mount = fuse_vfsop_mount, .vfs_unmount = fuse_vfsop_unmount, .vfs_root = fuse_vfsop_root, .vfs_statfs = fuse_vfsop_statfs, .vfs_vget = fuse_vfsop_vget, }; static int fuse_enforce_dev_perms = 0; SYSCTL_INT(_vfs_fusefs, OID_AUTO, enforce_dev_perms, CTLFLAG_RW, &fuse_enforce_dev_perms, 0, "enforce fuse device permissions for secondary mounts"); MALLOC_DEFINE(M_FUSEVFS, "fuse_filesystem", "buffer for fuse vfs layer"); static int fuse_getdevice(const char *fspec, struct thread *td, struct cdev **fdevp) { struct nameidata nd, *ndp = &nd; struct vnode *devvp; struct cdev *fdev; int err; /* * Not an update, or updating the name: look up the name * and verify that it refers to a sensible disk device. */ NDINIT(ndp, LOOKUP, FOLLOW, UIO_SYSSPACE, fspec, td); if ((err = namei(ndp)) != 0) return err; NDFREE(ndp, NDF_ONLY_PNBUF); devvp = ndp->ni_vp; if (devvp->v_type != VCHR) { vrele(devvp); return ENXIO; } fdev = devvp->v_rdev; dev_ref(fdev); if (fuse_enforce_dev_perms) { /* * Check if mounter can open the fuse device. * * This has significance only if we are doing a secondary mount * which doesn't involve actually opening fuse devices, but we * still want to enforce the permissions of the device (in * order to keep control over the circle of fuse users). * * (In case of primary mounts, we are either the superuser so * we can do anything anyway, or we can mount only if the * device is already opened by us, ie. we are permitted to open * the device.) */ #if 0 #ifdef MAC err = mac_check_vnode_open(td->td_ucred, devvp, VREAD | VWRITE); if (!err) #endif #endif /* 0 */ err = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (err) { vrele(devvp); dev_rel(fdev); return err; } } /* * according to coda code, no extra lock is needed -- * although in sys/vnode.h this field is marked "v" */ vrele(devvp); if (!fdev->si_devsw || strcmp("fuse", fdev->si_devsw->d_name)) { dev_rel(fdev); return ENXIO; } *fdevp = fdev; return 0; } #define FUSE_FLAGOPT(fnam, fval) do { \ vfs_flagopt(opts, #fnam, &mntopts, fval); \ vfs_flagopt(opts, "__" #fnam, &__mntopts, fval); \ } while (0) SDT_PROBE_DEFINE1(fusefs, , vfsops, mntopts, "uint64_t"); SDT_PROBE_DEFINE4(fusefs, , vfsops, mount_err, "char*", "struct fuse_data*", "struct mount*", "int"); static int fuse_vfs_remount(struct mount *mp, struct thread *td, uint64_t mntopts, uint32_t max_read, int daemon_timeout) { int err = 0; struct fuse_data *data = fuse_get_mpdata(mp); /* Don't allow these options to be changed */ const static unsigned long long cant_update_opts = MNT_USER; /* Mount owner must be the user running the daemon */ FUSE_LOCK(); if ((mp->mnt_flag ^ data->mnt_flag) & cant_update_opts) { err = EOPNOTSUPP; SDT_PROBE4(fusefs, , vfsops, mount_err, "Can't change these mount options during remount", data, mp, err); goto out; } if (((data->dataflags ^ mntopts) & FSESS_MNTOPTS_MASK) || (data->max_read != max_read) || (data->daemon_timeout != daemon_timeout)) { // TODO: allow changing options where it makes sense err = EOPNOTSUPP; SDT_PROBE4(fusefs, , vfsops, mount_err, "Can't change fuse mount options during remount", data, mp, err); goto out; } if (fdata_get_dead(data)) { err = ENOTCONN; SDT_PROBE4(fusefs, , vfsops, mount_err, "device is dead during mount", data, mp, err); goto out; } /* Sanity + permission checks */ if (!data->daemoncred) panic("fuse daemon found, but identity unknown"); if (mntopts & FSESS_DAEMON_CAN_SPY) err = priv_check(td, PRIV_VFS_FUSE_ALLOWOTHER); if (err == 0 && td->td_ucred->cr_uid != data->daemoncred->cr_uid) /* are we allowed to do the first mount? */ err = priv_check(td, PRIV_VFS_FUSE_MOUNT_NONUSER); out: FUSE_UNLOCK(); return err; } static int fuse_vfsop_fhtovp(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp) { struct fuse_fid *ffhp = (struct fuse_fid *)fhp; struct fuse_vnode_data *fvdat; struct vnode *nvp; int error; if (!(fuse_get_mpdata(mp)->dataflags & FSESS_EXPORT_SUPPORT)) return EOPNOTSUPP; error = VFS_VGET(mp, ffhp->nid, LK_EXCLUSIVE, &nvp); if (error) { *vpp = NULLVP; return (error); } fvdat = VTOFUD(nvp); if (fvdat->generation != ffhp->gen ) { vput(nvp); *vpp = NULLVP; return (ESTALE); } *vpp = nvp; vnode_create_vobject(*vpp, 0, curthread); return (0); } static int fuse_vfsop_mount(struct mount *mp) { int err; uint64_t mntopts, __mntopts; uint32_t max_read; int daemon_timeout; int fd; size_t len; struct cdev *fdev; struct fuse_data *data = NULL; struct thread *td; struct file *fp, *fptmp; char *fspec, *subtype; struct vfsoptlist *opts; subtype = NULL; max_read = ~0; err = 0; mntopts = 0; __mntopts = 0; td = curthread; /* Get the new options passed to mount */ opts = mp->mnt_optnew; if (!opts) return EINVAL; /* `fspath' contains the mount point (eg. /mnt/fuse/sshfs); REQUIRED */ if (!vfs_getopts(opts, "fspath", &err)) return err; /* * With the help of underscored options the mount program * can inform us from the flags it sets by default */ FUSE_FLAGOPT(allow_other, FSESS_DAEMON_CAN_SPY); FUSE_FLAGOPT(push_symlinks_in, FSESS_PUSH_SYMLINKS_IN); FUSE_FLAGOPT(default_permissions, FSESS_DEFAULT_PERMISSIONS); + FUSE_FLAGOPT(intr, FSESS_INTR); (void)vfs_scanopt(opts, "max_read=", "%u", &max_read); if (vfs_scanopt(opts, "timeout=", "%u", &daemon_timeout) == 1) { if (daemon_timeout < FUSE_MIN_DAEMON_TIMEOUT) daemon_timeout = FUSE_MIN_DAEMON_TIMEOUT; else if (daemon_timeout > FUSE_MAX_DAEMON_TIMEOUT) daemon_timeout = FUSE_MAX_DAEMON_TIMEOUT; } else { daemon_timeout = FUSE_DEFAULT_DAEMON_TIMEOUT; } subtype = vfs_getopts(opts, "subtype=", &err); SDT_PROBE1(fusefs, , vfsops, mntopts, mntopts); if (mp->mnt_flag & MNT_UPDATE) { return fuse_vfs_remount(mp, td, mntopts, max_read, daemon_timeout); } /* `from' contains the device name (eg. /dev/fuse0); REQUIRED */ fspec = vfs_getopts(opts, "from", &err); if (!fspec) return err; /* `fd' contains the filedescriptor for this session; REQUIRED */ if (vfs_scanopt(opts, "fd", "%d", &fd) != 1) return EINVAL; err = fuse_getdevice(fspec, td, &fdev); if (err != 0) return err; err = fget(td, fd, &cap_read_rights, &fp); if (err != 0) { SDT_PROBE2(fusefs, , vfsops, trace, 1, "invalid or not opened device"); goto out; } fptmp = td->td_fpop; td->td_fpop = fp; err = devfs_get_cdevpriv((void **)&data); td->td_fpop = fptmp; fdrop(fp, td); FUSE_LOCK(); if (err != 0 || data == NULL) { err = ENXIO; SDT_PROBE4(fusefs, , vfsops, mount_err, "invalid or not opened device", data, mp, err); FUSE_UNLOCK(); goto out; } if (fdata_get_dead(data)) { err = ENOTCONN; SDT_PROBE4(fusefs, , vfsops, mount_err, "device is dead during mount", data, mp, err); FUSE_UNLOCK(); goto out; } /* Sanity + permission checks */ if (!data->daemoncred) panic("fuse daemon found, but identity unknown"); if (mntopts & FSESS_DAEMON_CAN_SPY) err = priv_check(td, PRIV_VFS_FUSE_ALLOWOTHER); if (err == 0 && td->td_ucred->cr_uid != data->daemoncred->cr_uid) /* are we allowed to do the first mount? */ err = priv_check(td, PRIV_VFS_FUSE_MOUNT_NONUSER); if (err) { FUSE_UNLOCK(); goto out; } data->ref++; data->mp = mp; data->dataflags |= mntopts; data->max_read = max_read; data->daemon_timeout = daemon_timeout; data->mnt_flag = mp->mnt_flag & MNT_UPDATEMASK; FUSE_UNLOCK(); vfs_getnewfsid(mp); MNT_ILOCK(mp); mp->mnt_data = data; /* * FUSE file systems can be either local or remote, but the kernel * can't tell the difference. */ mp->mnt_flag &= ~MNT_LOCAL; mp->mnt_kern_flag |= MNTK_USES_BCACHE; MNT_IUNLOCK(mp); /* We need this here as this slot is used by getnewvnode() */ mp->mnt_stat.f_iosize = maxbcachebuf; if (subtype) { strlcat(mp->mnt_stat.f_fstypename, ".", MFSNAMELEN); strlcat(mp->mnt_stat.f_fstypename, subtype, MFSNAMELEN); } copystr(fspec, mp->mnt_stat.f_mntfromname, MNAMELEN - 1, &len); bzero(mp->mnt_stat.f_mntfromname + len, MNAMELEN - len); mp->mnt_iosize_max = MAXPHYS; /* Now handshaking with daemon */ fuse_internal_send_init(data, td); out: if (err) { FUSE_LOCK(); if (data != NULL && data->mp == mp) { /* * Destroy device only if we acquired reference to * it */ SDT_PROBE4(fusefs, , vfsops, mount_err, "mount failed, destroy device", data, mp, err); data->mp = NULL; mp->mnt_data = NULL; fdata_trydestroy(data); } FUSE_UNLOCK(); dev_rel(fdev); } return err; } static int fuse_vfsop_unmount(struct mount *mp, int mntflags) { int err = 0; int flags = 0; struct cdev *fdev; struct fuse_data *data; struct fuse_dispatcher fdi; struct thread *td = curthread; if (mntflags & MNT_FORCE) { flags |= FORCECLOSE; } data = fuse_get_mpdata(mp); if (!data) { panic("no private data for mount point?"); } /* There is 1 extra root vnode reference (mp->mnt_data). */ FUSE_LOCK(); if (data->vroot != NULL) { struct vnode *vroot = data->vroot; data->vroot = NULL; FUSE_UNLOCK(); vrele(vroot); } else FUSE_UNLOCK(); err = vflush(mp, 0, flags, td); if (err) { return err; } if (fdata_get_dead(data)) { goto alreadydead; } if (fsess_isimpl(mp, FUSE_DESTROY)) { fdisp_init(&fdi, 0); fdisp_make(&fdi, FUSE_DESTROY, mp, 0, td, NULL); (void)fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); } fdata_set_dead(data); alreadydead: FUSE_LOCK(); data->mp = NULL; fdev = data->fdev; fdata_trydestroy(data); FUSE_UNLOCK(); MNT_ILOCK(mp); mp->mnt_data = NULL; MNT_IUNLOCK(mp); dev_rel(fdev); return 0; } SDT_PROBE_DEFINE1(fusefs, , vfsops, invalidate_without_export, "struct mount*"); static int fuse_vfsop_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp) { struct fuse_data *data = fuse_get_mpdata(mp); uint64_t nodeid = ino; struct thread *td = curthread; struct fuse_dispatcher fdi; struct fuse_entry_out *feo; struct fuse_vnode_data *fvdat; const char dot[] = "."; off_t filesize; enum vtype vtyp; int error; if (!(data->dataflags & FSESS_EXPORT_SUPPORT)) { /* * Unreachable unless you do something stupid, like export a * nullfs mount of a fusefs file system. */ SDT_PROBE1(fusefs, , vfsops, invalidate_without_export, mp); return (EOPNOTSUPP); } error = fuse_internal_get_cached_vnode(mp, ino, flags, vpp); if (error || *vpp != NULL) return error; /* Do a LOOKUP, using nodeid as the parent and "." as filename */ fdisp_init(&fdi, sizeof(dot)); fdisp_make(&fdi, FUSE_LOOKUP, mp, nodeid, td, td->td_ucred); memcpy(fdi.indata, dot, sizeof(dot)); error = fdisp_wait_answ(&fdi); if (error) return error; feo = (struct fuse_entry_out *)fdi.answ; if (feo->nodeid == 0) { /* zero nodeid means ENOENT and cache it */ error = ENOENT; goto out; } vtyp = IFTOVT(feo->attr.mode); error = fuse_vnode_get(mp, feo, nodeid, NULL, vpp, NULL, vtyp); if (error) goto out; filesize = feo->attr.size; /* * In the case where we are looking up a FUSE node represented by an * existing cached vnode, and the true size reported by FUSE_LOOKUP * doesn't match the vnode's cached size, then any cached writes beyond * the file's current size are lost. * * We can get here: * * following attribute cache expiration, or * * due a bug in the daemon, or */ fvdat = VTOFUD(*vpp); if (vnode_isreg(*vpp) && filesize != fvdat->cached_attrs.va_size && fvdat->flag & FN_SIZECHANGE) { printf("%s: WB cache incoherent on %s!\n", __func__, vnode_mount(*vpp)->mnt_stat.f_mntonname); fvdat->flag &= ~FN_SIZECHANGE; } fuse_internal_cache_attrs(*vpp, &feo->attr, feo->attr_valid, feo->attr_valid_nsec, NULL); fuse_validity_2_bintime(feo->entry_valid, feo->entry_valid_nsec, &fvdat->entry_cache_timeout); out: fdisp_destroy(&fdi); return error; } static int fuse_vfsop_root(struct mount *mp, int lkflags, struct vnode **vpp) { struct fuse_data *data = fuse_get_mpdata(mp); int err = 0; if (data->vroot != NULL) { err = vget(data->vroot, lkflags, curthread); if (err == 0) *vpp = data->vroot; } else { err = fuse_vnode_get(mp, NULL, FUSE_ROOT_ID, NULL, vpp, NULL, VDIR); if (err == 0) { FUSE_LOCK(); MPASS(data->vroot == NULL || data->vroot == *vpp); if (data->vroot == NULL) { SDT_PROBE2(fusefs, , vfsops, trace, 1, "new root vnode"); data->vroot = *vpp; FUSE_UNLOCK(); vref(*vpp); } else if (data->vroot != *vpp) { SDT_PROBE2(fusefs, , vfsops, trace, 1, "root vnode race"); FUSE_UNLOCK(); VOP_UNLOCK(*vpp, 0); vrele(*vpp); vrecycle(*vpp); *vpp = data->vroot; } else FUSE_UNLOCK(); } } return err; } static int fuse_vfsop_statfs(struct mount *mp, struct statfs *sbp) { struct fuse_dispatcher fdi; int err = 0; struct fuse_statfs_out *fsfo; struct fuse_data *data; data = fuse_get_mpdata(mp); if (!(data->dataflags & FSESS_INITED)) goto fake; fdisp_init(&fdi, 0); fdisp_make(&fdi, FUSE_STATFS, mp, FUSE_ROOT_ID, NULL, NULL); err = fdisp_wait_answ(&fdi); if (err) { fdisp_destroy(&fdi); if (err == ENOTCONN) { /* * We want to seem a legitimate fs even if the daemon * is stiff dead... (so that, eg., we can still do path * based unmounting after the daemon dies). */ goto fake; } return err; } fsfo = fdi.answ; sbp->f_blocks = fsfo->st.blocks; sbp->f_bfree = fsfo->st.bfree; sbp->f_bavail = fsfo->st.bavail; sbp->f_files = fsfo->st.files; sbp->f_ffree = fsfo->st.ffree; /* cast from uint64_t to int64_t */ sbp->f_namemax = fsfo->st.namelen; sbp->f_bsize = fsfo->st.frsize; /* cast from uint32_t to uint64_t */ fdisp_destroy(&fdi); return 0; fake: sbp->f_blocks = 0; sbp->f_bfree = 0; sbp->f_bavail = 0; sbp->f_files = 0; sbp->f_ffree = 0; sbp->f_namemax = 0; sbp->f_bsize = S_BLKSIZE; return 0; } Index: projects/fuse2/sys/sys/param.h =================================================================== --- projects/fuse2/sys/sys/param.h (revision 350114) +++ projects/fuse2/sys/sys/param.h (revision 350115) @@ -1,367 +1,367 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)param.h 8.3 (Berkeley) 4/4/95 * $FreeBSD$ */ #ifndef _SYS_PARAM_H_ #define _SYS_PARAM_H_ #include #define BSD 199506 /* System version (year & month). */ #define BSD4_3 1 #define BSD4_4 1 /* * __FreeBSD_version numbers are documented in the Porter's Handbook. * If you bump the version for any reason, you should update the documentation * there. * Currently this lives here in the doc/ repository: * * head/en_US.ISO8859-1/books/porters-handbook/versions/chapter.xml * * scheme is: Rxx * 'R' is in the range 0 to 4 if this is a release branch or * X.0-CURRENT before releng/X.0 is created, otherwise 'R' is * in the range 5 to 9. */ #undef __FreeBSD_version -#define __FreeBSD_version 1300034 /* Master, propagated to newvers */ +#define __FreeBSD_version 1300035 /* Master, propagated to newvers */ /* * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD, * which by definition is always true on FreeBSD. This macro is also defined * on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD. * * It is tempting to use this macro in userland code when we want to enable * kernel-specific routines, and in fact it's fine to do this in code that * is part of FreeBSD itself. However, be aware that as presence of this * macro is still not widespread (e.g. older FreeBSD versions, 3rd party * compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in * external applications without also checking for __FreeBSD__ as an * alternative. */ #undef __FreeBSD_kernel__ #define __FreeBSD_kernel__ #if defined(_KERNEL) || defined(IN_RTLD) #define P_OSREL_SIGWAIT 700000 #define P_OSREL_SIGSEGV 700004 #define P_OSREL_MAP_ANON 800104 #define P_OSREL_MAP_FSTRICT 1100036 #define P_OSREL_SHUTDOWN_ENOTCONN 1100077 #define P_OSREL_MAP_GUARD 1200035 #define P_OSREL_WRFSBASE 1200041 #define P_OSREL_CK_CYLGRP 1200046 #define P_OSREL_VMTOTAL64 1200054 #define P_OSREL_CK_SUPERBLOCK 1300000 #define P_OSREL_CK_INODE 1300005 #define P_OSREL_MAJOR(x) ((x) / 100000) #endif #ifndef LOCORE #include #endif /* * Machine-independent constants (some used in following include files). * Redefined constants are from POSIX 1003.1 limits file. * * MAXCOMLEN should be >= sizeof(ac_comm) (see ) */ #include #define MAXCOMLEN 19 /* max command name remembered */ #define MAXINTERP PATH_MAX /* max interpreter file name length */ #define MAXLOGNAME 33 /* max login name length (incl. NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ #define NGROUPS (NGROUPS_MAX+1) /* max number groups */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ #define SPECNAMELEN 255 /* max length of devicename */ /* More types and definitions used throughout the kernel. */ #ifdef _KERNEL #include #include #ifndef LOCORE #include #include #endif #ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif #endif #ifndef _KERNEL /* Signals. */ #include #endif /* Machine type dependent parameters. */ #include #ifndef _KERNEL #include #endif #ifndef DEV_BSHIFT #define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ #endif #define DEV_BSIZE (1<>PAGE_SHIFT) #endif /* * btodb() is messy and perhaps slow because `bytes' may be an off_t. We * want to shift an unsigned type to avoid sign extension and we don't * want to widen `bytes' unnecessarily. Assume that the result fits in * a daddr_t. */ #ifndef btodb #define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \ (sizeof (bytes) > sizeof(long) \ ? (daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \ : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT)) #endif #ifndef dbtob #define dbtob(db) /* calculates (db * DEV_BSIZE) */ \ ((off_t)(db) << DEV_BSHIFT) #endif #define PRIMASK 0x0ff #define PCATCH 0x100 /* OR'd with pri for tsleep to check signals */ #define PDROP 0x200 /* OR'd with pri to stop re-entry of interlock mutex */ #define NZERO 0 /* default "nice" */ #define NBBY 8 /* number of bits in a byte */ #define NBPW sizeof(int) /* number of bytes per word (integer) */ #define CMASK 022 /* default file mask: S_IWGRP|S_IWOTH */ #define NODEV (dev_t)(-1) /* non-existent device */ /* * File system parameters and macros. * * MAXBSIZE - Filesystems are made out of blocks of at most MAXBSIZE bytes * per block. MAXBSIZE may be made larger without effecting * any existing filesystems as long as it does not exceed MAXPHYS, * and may be made smaller at the risk of not being able to use * filesystems which require a block size exceeding MAXBSIZE. * * MAXBCACHEBUF - Maximum size of a buffer in the buffer cache. This must * be >= MAXBSIZE and can be set differently for different * architectures by defining it in . * Making this larger allows NFS to do larger reads/writes. * * BKVASIZE - Nominal buffer space per buffer, in bytes. BKVASIZE is the * minimum KVM memory reservation the kernel is willing to make. * Filesystems can of course request smaller chunks. Actual * backing memory uses a chunk size of a page (PAGE_SIZE). * The default value here can be overridden on a per-architecture * basis by defining it in . * * If you make BKVASIZE too small you risk seriously fragmenting * the buffer KVM map which may slow things down a bit. If you * make it too big the kernel will not be able to optimally use * the KVM memory reserved for the buffer cache and will wind * up with too-few buffers. * * The default is 16384, roughly 2x the block size used by a * normal UFS filesystem. */ #define MAXBSIZE 65536 /* must be power of 2 */ #ifndef MAXBCACHEBUF #define MAXBCACHEBUF MAXBSIZE /* must be a power of 2 >= MAXBSIZE */ #endif #ifndef BKVASIZE #define BKVASIZE 16384 /* must be power of 2 */ #endif #define BKVAMASK (BKVASIZE-1) /* * MAXPATHLEN defines the longest permissible path length after expanding * symbolic links. It is used to allocate a temporary buffer from the buffer * pool in which to do the name expansion, hence should be a power of two, * and must be less than or equal to MAXBSIZE. MAXSYMLINKS defines the * maximum number of symbolic links that may be expanded in a path name. * It should be set high enough to allow all legitimate uses, but halt * infinite loops reasonably quickly. */ #define MAXPATHLEN PATH_MAX #define MAXSYMLINKS 32 /* Bit map related macros. */ #define setbit(a,i) (((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY)) #define clrbit(a,i) (((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY))) #define isset(a,i) \ (((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) #define isclr(a,i) \ ((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0) /* Macros for counting and rounding. */ #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #define nitems(x) (sizeof((x)) / sizeof((x)[0])) #define rounddown(x, y) (((x)/(y))*(y)) #define rounddown2(x, y) ((x)&(~((y)-1))) /* if y is power of two */ #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #define roundup2(x, y) (((x)+((y)-1))&(~((y)-1))) /* if y is powers of two */ #define powerof2(x) ((((x)-1)&(x))==0) /* Macros for min/max. */ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) #ifdef _KERNEL /* * Basic byte order function prototypes for non-inline functions. */ #ifndef LOCORE #ifndef _BYTEORDER_PROTOTYPED #define _BYTEORDER_PROTOTYPED __BEGIN_DECLS __uint32_t htonl(__uint32_t); __uint16_t htons(__uint16_t); __uint32_t ntohl(__uint32_t); __uint16_t ntohs(__uint16_t); __END_DECLS #endif #endif #ifndef _BYTEORDER_FUNC_DEFINED #define _BYTEORDER_FUNC_DEFINED #define htonl(x) __htonl(x) #define htons(x) __htons(x) #define ntohl(x) __ntohl(x) #define ntohs(x) __ntohs(x) #endif /* !_BYTEORDER_FUNC_DEFINED */ #endif /* _KERNEL */ /* * Scale factor for scaled integers used to count %cpu time and load avgs. * * The number of CPU `tick's that map to a unique `%age' can be expressed * by the formula (1 / (2 ^ (FSHIFT - 11))). The maximum load average that * can be calculated (assuming 32 bits) can be closely approximated using * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15). * * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age', * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024. */ #define FSHIFT 11 /* bits to right of fixed binary point */ #define FSCALE (1<> (PAGE_SHIFT - DEV_BSHIFT)) #define ctodb(db) /* calculates pages to devblks */ \ ((db) << (PAGE_SHIFT - DEV_BSHIFT)) /* * Old spelling of __containerof(). */ #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) /* * Access a variable length array that has been declared as a fixed * length array. */ #define __PAST_END(array, offset) (((__typeof__(*(array)) *)(array))[offset]) #endif /* _SYS_PARAM_H_ */ Index: projects/fuse2/tests/sys/fs/fusefs/interrupt.cc =================================================================== --- projects/fuse2/tests/sys/fs/fusefs/interrupt.cc (revision 350114) +++ projects/fuse2/tests/sys/fs/fusefs/interrupt.cc (revision 350115) @@ -1,733 +1,790 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 The FreeBSD Foundation * * This software was developed by BFF Storage Systems, LLC under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ extern "C" { #include #include #include #include #include #include #include #include } #include "mockfs.hh" #include "utils.hh" using namespace testing; /* Initial size of files used by these tests */ const off_t FILESIZE = 1000; /* Access mode used by all directories in these tests */ const mode_t MODE = 0755; const char FULLDIRPATH0[] = "mountpoint/some_dir"; const char RELDIRPATH0[] = "some_dir"; const char FULLDIRPATH1[] = "mountpoint/other_dir"; const char RELDIRPATH1[] = "other_dir"; static sem_t *blocked_semaphore; static sem_t *signaled_semaphore; static bool killer_should_sleep = false; /* Don't do anything; all we care about is that the syscall gets interrupted */ void sigusr2_handler(int __unused sig) { if (verbosity > 1) { printf("Signaled! thread %p\n", pthread_self()); } } void* killer(void* target) { /* Wait until the main thread is blocked in fdisp_wait_answ */ if (killer_should_sleep) nap(); else sem_wait(blocked_semaphore); if (verbosity > 1) printf("Signalling! thread %p\n", target); pthread_kill((pthread_t)target, SIGUSR2); if (signaled_semaphore != NULL) sem_post(signaled_semaphore); return(NULL); } class Interrupt: public FuseTest { public: pthread_t m_child; Interrupt(): m_child(NULL) {}; void expect_lookup(const char *relpath, uint64_t ino) { FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, FILESIZE, 1); } /* * Expect a FUSE_MKDIR but don't reply. Instead, just record the unique value * to the provided pointer */ void expect_mkdir(uint64_t *mkdir_unique) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_MKDIR); }, Eq(true)), _) ).WillOnce(Invoke([=](auto in, auto &out __unused) { *mkdir_unique = in.header.unique; sem_post(blocked_semaphore); })); } /* * Expect a FUSE_READ but don't reply. Instead, just record the unique value * to the provided pointer */ void expect_read(uint64_t ino, uint64_t *read_unique) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_READ && in.header.nodeid == ino); }, Eq(true)), _) ).WillOnce(Invoke([=](auto in, auto &out __unused) { *read_unique = in.header.unique; sem_post(blocked_semaphore); })); } /* * Expect a FUSE_WRITE but don't reply. Instead, just record the unique value * to the provided pointer */ void expect_write(uint64_t ino, uint64_t *write_unique) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_WRITE && in.header.nodeid == ino); }, Eq(true)), _) ).WillOnce(Invoke([=](auto in, auto &out __unused) { *write_unique = in.header.unique; sem_post(blocked_semaphore); })); } void setup_interruptor(pthread_t target, bool sleep = false) { ASSERT_NE(SIG_ERR, signal(SIGUSR2, sigusr2_handler)) << strerror(errno); killer_should_sleep = sleep; ASSERT_EQ(0, pthread_create(&m_child, NULL, killer, (void*)target)) << strerror(errno); } void SetUp() { const int mprot = PROT_READ | PROT_WRITE; const int mflags = MAP_ANON | MAP_SHARED; signaled_semaphore = NULL; blocked_semaphore = (sem_t*)mmap(NULL, sizeof(*blocked_semaphore), mprot, mflags, -1, 0); ASSERT_NE(MAP_FAILED, blocked_semaphore) << strerror(errno); ASSERT_EQ(0, sem_init(blocked_semaphore, 1, 0)) << strerror(errno); ASSERT_EQ(0, siginterrupt(SIGUSR2, 1)); FuseTest::SetUp(); } void TearDown() { struct sigaction sa; if (m_child != NULL) { pthread_join(m_child, NULL); } bzero(&sa, sizeof(sa)); sa.sa_handler = SIG_DFL; sigaction(SIGUSR2, &sa, NULL); sem_destroy(blocked_semaphore); munmap(blocked_semaphore, sizeof(*blocked_semaphore)); FuseTest::TearDown(); } }; +class Intr: public Interrupt {}; + +class Nointr: public Interrupt { + void SetUp() { + m_nointr = true; + Interrupt::SetUp(); + } +}; + static void* mkdir0(void* arg __unused) { ssize_t r; r = mkdir(FULLDIRPATH0, MODE); if (r >= 0) return 0; else return (void*)(intptr_t)errno; } static void* read1(void* arg) { const size_t bufsize = FILESIZE; char buf[bufsize]; int fd = (int)(intptr_t)arg; ssize_t r; r = read(fd, buf, bufsize); if (r >= 0) return 0; else return (void*)(intptr_t)errno; } /* * An interrupt operation that gets received after the original command is * complete should generate an EAGAIN response. */ /* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ -TEST_F(Interrupt, already_complete) +TEST_F(Intr, already_complete) { uint64_t ino = 42; pthread_t self; uint64_t mkdir_unique = 0; Sequence seq; self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .InSequence(seq) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_mkdir(&mkdir_unique); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).WillOnce(Invoke([&](auto in, auto &out) { // First complete the mkdir request std::unique_ptr out0(new mockfs_buf_out); out0->header.unique = mkdir_unique; SET_OUT_HEADER_LEN(*out0, entry); out0->body.create.entry.attr.mode = S_IFDIR | MODE; out0->body.create.entry.nodeid = ino; out.push_back(std::move(out0)); // Then, respond EAGAIN to the interrupt request std::unique_ptr out1(new mockfs_buf_out); out1->header.unique = in.header.unique; out1->header.error = -EAGAIN; out1->header.len = sizeof(out1->header); out.push_back(std::move(out1)); })); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .InSequence(seq) .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { SET_OUT_HEADER_LEN(out, entry); out.body.entry.attr.mode = S_IFDIR | MODE; out.body.entry.nodeid = ino; out.body.entry.attr.nlink = 2; }))); setup_interruptor(self); EXPECT_EQ(0, mkdir(FULLDIRPATH0, MODE)) << strerror(errno); /* * The final syscall simply ensures that the test's main thread doesn't * end before the daemon finishes responding to the FUSE_INTERRUPT. */ EXPECT_EQ(0, access(FULLDIRPATH0, F_OK)) << strerror(errno); } /* * If a FUSE file system returns ENOSYS for a FUSE_INTERRUPT operation, the * kernel should not attempt to interrupt any other operations on that mount * point. */ -TEST_F(Interrupt, enosys) +TEST_F(Intr, enosys) { uint64_t ino0 = 42, ino1 = 43;; uint64_t mkdir_unique; pthread_t self, th0; sem_t sem0, sem1; void *thr0_value; Sequence seq; self = pthread_self(); ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH1) .WillOnce(Invoke(ReturnErrno(ENOENT))); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_mkdir(&mkdir_unique); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke([&](auto in, auto &out) { // reject FUSE_INTERRUPT and respond to the FUSE_MKDIR std::unique_ptr out0(new mockfs_buf_out); std::unique_ptr out1(new mockfs_buf_out); out0->header.unique = in.header.unique; out0->header.error = -ENOSYS; out0->header.len = sizeof(out0->header); out.push_back(std::move(out0)); SET_OUT_HEADER_LEN(*out1, entry); out1->body.create.entry.attr.mode = S_IFDIR | MODE; out1->body.create.entry.nodeid = ino1; out1->header.unique = mkdir_unique; out.push_back(std::move(out1)); })); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_MKDIR); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke([&](auto in, auto &out) { std::unique_ptr out0(new mockfs_buf_out); sem_post(&sem0); sem_wait(&sem1); SET_OUT_HEADER_LEN(*out0, entry); out0->body.create.entry.attr.mode = S_IFDIR | MODE; out0->body.create.entry.nodeid = ino0; out0->header.unique = in.header.unique; out.push_back(std::move(out0)); })); setup_interruptor(self); /* First mkdir operation should finish synchronously */ ASSERT_EQ(0, mkdir(FULLDIRPATH1, MODE)) << strerror(errno); ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) << strerror(errno); sem_wait(&sem0); /* * th0 should be blocked waiting for the fuse daemon thread. * Signal it. No FUSE_INTERRUPT should result */ pthread_kill(th0, SIGUSR1); /* Allow the daemon thread to proceed */ sem_post(&sem1); pthread_join(th0, &thr0_value); /* Second mkdir should've finished without error */ EXPECT_EQ(0, (intptr_t)thr0_value); } /* * A FUSE filesystem is legally allowed to ignore INTERRUPT operations, and * complete the original operation whenever it damn well pleases. */ /* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ -TEST_F(Interrupt, ignore) +TEST_F(Intr, ignore) { uint64_t ino = 42; pthread_t self; uint64_t mkdir_unique; self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_mkdir(&mkdir_unique); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).WillOnce(Invoke([&](auto in __unused, auto &out) { // Ignore FUSE_INTERRUPT; respond to the FUSE_MKDIR std::unique_ptr out0(new mockfs_buf_out); out0->header.unique = mkdir_unique; SET_OUT_HEADER_LEN(*out0, entry); out0->body.create.entry.attr.mode = S_IFDIR | MODE; out0->body.create.entry.nodeid = ino; out.push_back(std::move(out0)); })); setup_interruptor(self); ASSERT_EQ(0, mkdir(FULLDIRPATH0, MODE)) << strerror(errno); } /* * A restartable operation (basically, anything except write or setextattr) * that hasn't yet been sent to userland can be interrupted without sending * FUSE_INTERRUPT, and will be automatically restarted. */ -TEST_F(Interrupt, in_kernel_restartable) +TEST_F(Intr, in_kernel_restartable) { const char FULLPATH1[] = "mountpoint/other_file.txt"; const char RELPATH1[] = "other_file.txt"; uint64_t ino0 = 42, ino1 = 43; int fd1; pthread_t self, th0, th1; sem_t sem0, sem1; void *thr0_value, *thr1_value; ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_lookup(RELPATH1, ino1); expect_open(ino1, 0, 1); EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_MKDIR); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([&](auto in __unused, auto& out) { /* Let the next write proceed */ sem_post(&sem1); /* Pause the daemon thread so it won't read the next op */ sem_wait(&sem0); SET_OUT_HEADER_LEN(out, entry); out.body.create.entry.attr.mode = S_IFDIR | MODE; out.body.create.entry.nodeid = ino0; }))); FuseTest::expect_read(ino1, 0, FILESIZE, 0, NULL); fd1 = open(FULLPATH1, O_RDONLY); ASSERT_LE(0, fd1) << strerror(errno); /* Use a separate thread for each operation */ ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) << strerror(errno); sem_wait(&sem1); /* Sequence the two operations */ ASSERT_EQ(0, pthread_create(&th1, NULL, read1, (void*)(intptr_t)fd1)) << strerror(errno); setup_interruptor(self, true); pause(); /* Wait for signal */ /* Unstick the daemon */ ASSERT_EQ(0, sem_post(&sem0)) << strerror(errno); /* Wait awhile to make sure the signal generates no FUSE_INTERRUPT */ nap(); pthread_join(th1, &thr1_value); pthread_join(th0, &thr0_value); EXPECT_EQ(0, (intptr_t)thr1_value); EXPECT_EQ(0, (intptr_t)thr0_value); sem_destroy(&sem1); sem_destroy(&sem0); } /* * An operation that hasn't yet been sent to userland can be interrupted * without sending FUSE_INTERRUPT. If it's a non-restartable operation (write * or setextattr) it will return EINTR. */ -TEST_F(Interrupt, in_kernel_nonrestartable) +TEST_F(Intr, in_kernel_nonrestartable) { const char FULLPATH1[] = "mountpoint/other_file.txt"; const char RELPATH1[] = "other_file.txt"; const char value[] = "whatever"; ssize_t value_len = strlen(value) + 1; uint64_t ino0 = 42, ino1 = 43; int ns = EXTATTR_NAMESPACE_USER; int fd1; pthread_t self, th0; sem_t sem0, sem1; void *thr0_value; ssize_t r; ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_lookup(RELPATH1, ino1); expect_open(ino1, 0, 1); EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_MKDIR); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([&](auto in __unused, auto& out) { /* Let the next write proceed */ sem_post(&sem1); /* Pause the daemon thread so it won't read the next op */ sem_wait(&sem0); SET_OUT_HEADER_LEN(out, entry); out.body.create.entry.attr.mode = S_IFDIR | MODE; out.body.create.entry.nodeid = ino0; }))); fd1 = open(FULLPATH1, O_WRONLY); ASSERT_LE(0, fd1) << strerror(errno); /* Use a separate thread for the first write */ ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) << strerror(errno); sem_wait(&sem1); /* Sequence the two operations */ setup_interruptor(self, true); r = extattr_set_fd(fd1, ns, "foo", (void*)value, value_len); EXPECT_EQ(EINTR, errno); /* Unstick the daemon */ ASSERT_EQ(0, sem_post(&sem0)) << strerror(errno); /* Wait awhile to make sure the signal generates no FUSE_INTERRUPT */ nap(); pthread_join(th0, &thr0_value); EXPECT_EQ(0, (intptr_t)thr0_value); sem_destroy(&sem1); sem_destroy(&sem0); } /* * A syscall that gets interrupted while blocking on FUSE I/O should send a * FUSE_INTERRUPT command to the fuse filesystem, which should then send EINTR * in response to the _original_ operation. The kernel should ultimately * return EINTR to userspace */ /* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ -TEST_F(Interrupt, in_progress) +TEST_F(Intr, in_progress) { pthread_t self; uint64_t mkdir_unique; self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_mkdir(&mkdir_unique); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).WillOnce(Invoke([&](auto in __unused, auto &out) { std::unique_ptr out0(new mockfs_buf_out); out0->header.error = -EINTR; out0->header.unique = mkdir_unique; out0->header.len = sizeof(out0->header); out.push_back(std::move(out0)); })); setup_interruptor(self); ASSERT_EQ(-1, mkdir(FULLDIRPATH0, MODE)); EXPECT_EQ(EINTR, errno); } /* Reads should also be interruptible */ -TEST_F(Interrupt, in_progress_read) +TEST_F(Intr, in_progress_read) { const char FULLPATH[] = "mountpoint/some_file.txt"; const char RELPATH[] = "some_file.txt"; const size_t bufsize = 80; char buf[bufsize]; uint64_t ino = 42; int fd; pthread_t self; uint64_t read_unique; self = pthread_self(); expect_lookup(RELPATH, ino); expect_open(ino, 0, 1); expect_read(ino, &read_unique); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == read_unique); }, Eq(true)), _) ).WillOnce(Invoke([&](auto in __unused, auto &out) { std::unique_ptr out0(new mockfs_buf_out); out0->header.error = -EINTR; out0->header.unique = read_unique; out0->header.len = sizeof(out0->header); out.push_back(std::move(out0)); })); fd = open(FULLPATH, O_RDONLY); ASSERT_LE(0, fd) << strerror(errno); setup_interruptor(self); ASSERT_EQ(-1, read(fd, buf, bufsize)); EXPECT_EQ(EINTR, errno); } +/* + * When mounted with -o nointr, fusefs will block signals while waiting for the + * server. + */ +TEST_F(Nointr, block) +{ + uint64_t ino = 42; + pthread_t self; + sem_t sem0; + + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + signaled_semaphore = &sem0; + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([&](auto in __unused, auto& out) { + /* Let the killer proceed */ + sem_post(blocked_semaphore); + + /* Wait until after the signal has been sent */ + sem_wait(signaled_semaphore); + /* Allow time for the mkdir thread to receive the signal */ + nap(); + + /* Finally, complete the original op */ + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | MODE; + out.body.create.entry.nodeid = ino; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT); + }, Eq(true)), + _) + ).Times(0); + + setup_interruptor(self); + ASSERT_EQ(0, mkdir(FULLDIRPATH0, MODE)) << strerror(errno); + + sem_destroy(&sem0); +} + /* FUSE_INTERRUPT operations should take priority over other pending ops */ -TEST_F(Interrupt, priority) +TEST_F(Intr, priority) { Sequence seq; uint64_t ino1 = 43; uint64_t mkdir_unique; pthread_t self, th0; sem_t sem0, sem1; ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH1) .WillOnce(Invoke(ReturnErrno(ENOENT))); EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_MKDIR); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke(ReturnImmediate([&](auto in, auto& out) { mkdir_unique = in.header.unique; /* Let the next mkdir proceed */ sem_post(&sem1); /* Pause the daemon thread so it won't read the next op */ sem_wait(&sem0); /* Finally, interrupt the original op */ out.header.error = -EINTR; out.header.unique = mkdir_unique; out.header.len = sizeof(out.header); }))); /* * FUSE_INTERRUPT should be received before the second FUSE_MKDIR, * even though it was generated later */ EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke(ReturnErrno(EAGAIN))); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_MKDIR); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { SET_OUT_HEADER_LEN(out, entry); out.body.create.entry.attr.mode = S_IFDIR | MODE; out.body.create.entry.nodeid = ino1; }))); /* Use a separate thread for the first mkdir */ ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) << strerror(errno); signaled_semaphore = &sem0; sem_wait(&sem1); /* Sequence the two mkdirs */ setup_interruptor(th0, true); ASSERT_EQ(0, mkdir(FULLDIRPATH1, MODE)) << strerror(errno); pthread_join(th0, NULL); sem_destroy(&sem1); sem_destroy(&sem0); } /* * If the FUSE filesystem receives the FUSE_INTERRUPT operation before * processing the original, then it should wait for "some timeout" for the * original operation to arrive. If not, it should send EAGAIN to the * INTERRUPT operation, and the kernel should requeue the INTERRUPT. * * In this test, we'll pretend that the INTERRUPT arrives too soon, gets * EAGAINed, then the kernel requeues it, and the second time around it * successfully interrupts the original */ /* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ -TEST_F(Interrupt, too_soon) +TEST_F(Intr, too_soon) { Sequence seq; pthread_t self; uint64_t mkdir_unique; self = pthread_self(); EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) .WillOnce(Invoke(ReturnErrno(ENOENT))); expect_mkdir(&mkdir_unique); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke(ReturnErrno(EAGAIN))); EXPECT_CALL(*m_mock, process( ResultOf([&](auto in) { return (in.header.opcode == FUSE_INTERRUPT && in.body.interrupt.unique == mkdir_unique); }, Eq(true)), _) ).InSequence(seq) .WillOnce(Invoke([&](auto in __unused, auto &out __unused) { std::unique_ptr out0(new mockfs_buf_out); out0->header.error = -EINTR; out0->header.unique = mkdir_unique; out0->header.len = sizeof(out0->header); out.push_back(std::move(out0)); })); setup_interruptor(self); ASSERT_EQ(-1, mkdir(FULLDIRPATH0, MODE)); EXPECT_EQ(EINTR, errno); } // TODO: add a test where write returns EWOULDBLOCK Index: projects/fuse2/tests/sys/fs/fusefs/mockfs.cc =================================================================== --- projects/fuse2/tests/sys/fs/fusefs/mockfs.cc (revision 350114) +++ projects/fuse2/tests/sys/fs/fusefs/mockfs.cc (revision 350115) @@ -1,726 +1,733 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 The FreeBSD Foundation * * This software was developed by BFF Storage Systems, LLC under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ extern "C" { #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mntopts.h" // for build_iovec } #include #include #include "mockfs.hh" using namespace testing; int verbosity = 0; const char* opcode2opname(uint32_t opcode) { const int NUM_OPS = 39; const char* table[NUM_OPS] = { "Unknown (opcode 0)", "LOOKUP", "FORGET", "GETATTR", "SETATTR", "READLINK", "SYMLINK", "Unknown (opcode 7)", "MKNOD", "MKDIR", "UNLINK", "RMDIR", "RENAME", "LINK", "OPEN", "READ", "WRITE", "STATFS", "RELEASE", "Unknown (opcode 19)", "FSYNC", "SETXATTR", "GETXATTR", "LISTXATTR", "REMOVEXATTR", "FLUSH", "INIT", "OPENDIR", "READDIR", "RELEASEDIR", "FSYNCDIR", "GETLK", "SETLK", "SETLKW", "ACCESS", "CREATE", "INTERRUPT", "BMAP", "DESTROY" }; if (opcode >= NUM_OPS) return ("Unknown (opcode > max)"); else return (table[opcode]); } ProcessMockerT ReturnErrno(int error) { return([=](auto in, auto &out) { std::unique_ptr out0(new mockfs_buf_out); out0->header.unique = in.header.unique; out0->header.error = -error; out0->header.len = sizeof(out0->header); out.push_back(std::move(out0)); }); } /* Helper function used for returning negative cache entries for LOOKUP */ ProcessMockerT ReturnNegativeCache(const struct timespec *entry_valid) { return([=](auto in, auto &out) { /* nodeid means ENOENT and cache it */ std::unique_ptr out0(new mockfs_buf_out); out0->body.entry.nodeid = 0; out0->header.unique = in.header.unique; out0->header.error = 0; out0->body.entry.entry_valid = entry_valid->tv_sec; out0->body.entry.entry_valid_nsec = entry_valid->tv_nsec; SET_OUT_HEADER_LEN(*out0, entry); out.push_back(std::move(out0)); }); } ProcessMockerT ReturnImmediate(std::function f) { return([=](auto& in, auto &out) { std::unique_ptr out0(new mockfs_buf_out); out0->header.unique = in.header.unique; f(in, *out0); out.push_back(std::move(out0)); }); } void sigint_handler(int __unused sig) { // Don't do anything except interrupt the daemon's read(2) call } void MockFS::debug_request(const mockfs_buf_in &in) { printf("%-11s ino=%2" PRIu64, opcode2opname(in.header.opcode), in.header.nodeid); if (verbosity > 1) { printf(" uid=%5u gid=%5u pid=%5u unique=%" PRIu64 " len=%u", in.header.uid, in.header.gid, in.header.pid, in.header.unique, in.header.len); } switch (in.header.opcode) { const char *name, *value; case FUSE_ACCESS: printf(" mask=%#x", in.body.access.mask); break; case FUSE_BMAP: printf(" block=%" PRIx64 " blocksize=%#x", in.body.bmap.block, in.body.bmap.blocksize); break; case FUSE_CREATE: if (m_kernel_minor_version >= 12) name = (const char*)in.body.bytes + sizeof(fuse_create_in); else name = (const char*)in.body.bytes + sizeof(fuse_open_in); printf(" flags=%#x name=%s", in.body.open.flags, name); break; case FUSE_FLUSH: printf(" fh=%#" PRIx64 " lock_owner=%" PRIu64, in.body.flush.fh, in.body.flush.lock_owner); break; case FUSE_FORGET: printf(" nlookup=%" PRIu64, in.body.forget.nlookup); break; case FUSE_FSYNC: printf(" flags=%#x", in.body.fsync.fsync_flags); break; case FUSE_FSYNCDIR: printf(" flags=%#x", in.body.fsyncdir.fsync_flags); break; case FUSE_INTERRUPT: printf(" unique=%" PRIu64, in.body.interrupt.unique); break; case FUSE_LINK: printf(" oldnodeid=%" PRIu64, in.body.link.oldnodeid); break; case FUSE_LOOKUP: printf(" %s", in.body.lookup); break; case FUSE_MKDIR: name = (const char*)in.body.bytes + sizeof(fuse_mkdir_in); printf(" name=%s mode=%#o umask=%#o", name, in.body.mkdir.mode, in.body.mkdir.umask); break; case FUSE_MKNOD: if (m_kernel_minor_version >= 12) name = (const char*)in.body.bytes + sizeof(fuse_mknod_in); else name = (const char*)in.body.bytes + FUSE_COMPAT_MKNOD_IN_SIZE; printf(" mode=%#o rdev=%x umask=%#o name=%s", in.body.mknod.mode, in.body.mknod.rdev, in.body.mknod.umask, name); break; case FUSE_OPEN: printf(" flags=%#x", in.body.open.flags); break; case FUSE_OPENDIR: printf(" flags=%#x", in.body.opendir.flags); break; case FUSE_READ: printf(" offset=%" PRIu64 " size=%u", in.body.read.offset, in.body.read.size); if (verbosity > 1) printf(" flags=%#x", in.body.read.flags); break; case FUSE_READDIR: printf(" fh=%#" PRIx64 " offset=%" PRIu64 " size=%u", in.body.readdir.fh, in.body.readdir.offset, in.body.readdir.size); break; case FUSE_RELEASE: printf(" fh=%#" PRIx64 " flags=%#x lock_owner=%" PRIu64, in.body.release.fh, in.body.release.flags, in.body.release.lock_owner); break; case FUSE_SETATTR: if (verbosity <= 1) { printf(" valid=%#x", in.body.setattr.valid); break; } if (in.body.setattr.valid & FATTR_MODE) printf(" mode=%#o", in.body.setattr.mode); if (in.body.setattr.valid & FATTR_UID) printf(" uid=%u", in.body.setattr.uid); if (in.body.setattr.valid & FATTR_GID) printf(" gid=%u", in.body.setattr.gid); if (in.body.setattr.valid & FATTR_SIZE) printf(" size=%" PRIu64, in.body.setattr.size); if (in.body.setattr.valid & FATTR_ATIME) printf(" atime=%" PRIu64 ".%u", in.body.setattr.atime, in.body.setattr.atimensec); if (in.body.setattr.valid & FATTR_MTIME) printf(" mtime=%" PRIu64 ".%u", in.body.setattr.mtime, in.body.setattr.mtimensec); if (in.body.setattr.valid & FATTR_FH) printf(" fh=%" PRIu64 "", in.body.setattr.fh); break; case FUSE_SETLK: printf(" fh=%#" PRIx64 " owner=%" PRIu64 " type=%u pid=%u", in.body.setlk.fh, in.body.setlk.owner, in.body.setlk.lk.type, in.body.setlk.lk.pid); if (verbosity >= 2) { printf(" range=[%" PRIu64 "-%" PRIu64 "]", in.body.setlk.lk.start, in.body.setlk.lk.end); } break; case FUSE_SETXATTR: /* * In theory neither the xattr name and value need be * ASCII, but in this test suite they always are. */ name = (const char*)in.body.bytes + sizeof(fuse_setxattr_in); value = name + strlen(name) + 1; printf(" %s=%s", name, value); break; case FUSE_WRITE: printf(" fh=%#" PRIx64 " offset=%" PRIu64 " size=%u write_flags=%u", in.body.write.fh, in.body.write.offset, in.body.write.size, in.body.write.write_flags); if (verbosity > 1) printf(" flags=%#x", in.body.write.flags); break; default: break; } printf("\n"); } /* * Debug a FUSE response. * * This is mostly useful for asynchronous notifications, which don't correspond * to any request */ void MockFS::debug_response(const mockfs_buf_out &out) { const char *name; if (verbosity == 0) return; switch (out.header.error) { case FUSE_NOTIFY_INVAL_ENTRY: name = (const char*)out.body.bytes + sizeof(fuse_notify_inval_entry_out); printf("<- INVAL_ENTRY parent=%" PRIu64 " %s\n", out.body.inval_entry.parent, name); break; case FUSE_NOTIFY_INVAL_INODE: printf("<- INVAL_INODE ino=%" PRIu64 " off=%" PRIi64 " len=%" PRIi64 "\n", out.body.inval_inode.ino, out.body.inval_inode.off, out.body.inval_inode.len); break; case FUSE_NOTIFY_STORE: printf("<- STORE ino=%" PRIu64 " off=%" PRIu64 " size=%" PRIu32 "\n", out.body.store.nodeid, out.body.store.offset, out.body.store.size); break; default: break; } } MockFS::MockFS(int max_readahead, bool allow_other, bool default_permissions, bool push_symlinks_in, bool ro, enum poll_method pm, uint32_t flags, uint32_t kernel_minor_version, uint32_t max_write, bool async, - bool noclusterr, unsigned time_gran) + bool noclusterr, unsigned time_gran, bool nointr) { struct sigaction sa; struct iovec *iov = NULL; int iovlen = 0; char fdstr[15]; const bool trueval = true; m_daemon_id = NULL; m_kernel_minor_version = kernel_minor_version; m_maxreadahead = max_readahead; m_maxwrite = max_write; m_nready = -1; m_pm = pm; m_time_gran = time_gran; m_quit = false; if (m_pm == KQ) m_kq = kqueue(); else m_kq = -1; /* * Kyua sets pwd to a testcase-unique tempdir; no need to use * mkdtemp */ /* * googletest doesn't allow ASSERT_ in constructors, so we must throw * instead. */ if (mkdir("mountpoint" , 0755) && errno != EEXIST) throw(std::system_error(errno, std::system_category(), "Couldn't make mountpoint directory")); switch (m_pm) { case BLOCKING: m_fuse_fd = open("/dev/fuse", O_CLOEXEC | O_RDWR); break; default: m_fuse_fd = open("/dev/fuse", O_CLOEXEC | O_RDWR | O_NONBLOCK); break; } if (m_fuse_fd < 0) throw(std::system_error(errno, std::system_category(), "Couldn't open /dev/fuse")); m_pid = getpid(); m_child_pid = -1; build_iovec(&iov, &iovlen, "fstype", __DECONST(void *, "fusefs"), -1); build_iovec(&iov, &iovlen, "fspath", __DECONST(void *, "mountpoint"), -1); build_iovec(&iov, &iovlen, "from", __DECONST(void *, "/dev/fuse"), -1); sprintf(fdstr, "%d", m_fuse_fd); build_iovec(&iov, &iovlen, "fd", fdstr, -1); if (allow_other) { build_iovec(&iov, &iovlen, "allow_other", __DECONST(void*, &trueval), sizeof(bool)); } if (default_permissions) { build_iovec(&iov, &iovlen, "default_permissions", __DECONST(void*, &trueval), sizeof(bool)); } if (push_symlinks_in) { build_iovec(&iov, &iovlen, "push_symlinks_in", __DECONST(void*, &trueval), sizeof(bool)); } if (ro) { build_iovec(&iov, &iovlen, "ro", __DECONST(void*, &trueval), sizeof(bool)); } if (async) { build_iovec(&iov, &iovlen, "async", __DECONST(void*, &trueval), sizeof(bool)); } if (noclusterr) { build_iovec(&iov, &iovlen, "noclusterr", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (nointr) { + build_iovec(&iov, &iovlen, "nointr", + __DECONST(void*, &trueval), sizeof(bool)); + } else { + build_iovec(&iov, &iovlen, "intr", __DECONST(void*, &trueval), sizeof(bool)); } if (nmount(iov, iovlen, 0)) throw(std::system_error(errno, std::system_category(), "Couldn't mount filesystem")); // Setup default handler ON_CALL(*this, process(_, _)) .WillByDefault(Invoke(this, &MockFS::process_default)); init(flags); bzero(&sa, sizeof(sa)); sa.sa_handler = sigint_handler; sa.sa_flags = 0; /* Don't set SA_RESTART! */ if (0 != sigaction(SIGUSR1, &sa, NULL)) throw(std::system_error(errno, std::system_category(), "Couldn't handle SIGUSR1")); if (pthread_create(&m_daemon_id, NULL, service, (void*)this)) throw(std::system_error(errno, std::system_category(), "Couldn't Couldn't start fuse thread")); } MockFS::~MockFS() { kill_daemon(); if (m_daemon_id != NULL) { pthread_join(m_daemon_id, NULL); m_daemon_id = NULL; } ::unmount("mountpoint", MNT_FORCE); rmdir("mountpoint"); if (m_kq >= 0) close(m_kq); } void MockFS::init(uint32_t flags) { std::unique_ptr in(new mockfs_buf_in); std::unique_ptr out(new mockfs_buf_out); read_request(*in); ASSERT_EQ(FUSE_INIT, in->header.opcode); out->header.unique = in->header.unique; out->header.error = 0; out->body.init.major = FUSE_KERNEL_VERSION; out->body.init.minor = m_kernel_minor_version;; out->body.init.flags = in->body.init.flags & flags; out->body.init.max_write = m_maxwrite; out->body.init.max_readahead = m_maxreadahead; if (m_kernel_minor_version < 23) { SET_OUT_HEADER_LEN(*out, init_7_22); } else { out->body.init.time_gran = m_time_gran; SET_OUT_HEADER_LEN(*out, init); } write(m_fuse_fd, out.get(), out->header.len); } void MockFS::kill_daemon() { m_quit = true; if (m_daemon_id != NULL) pthread_kill(m_daemon_id, SIGUSR1); // Closing the /dev/fuse file descriptor first allows unmount to // succeed even if the daemon doesn't correctly respond to commands // during the unmount sequence. close(m_fuse_fd); m_fuse_fd = -1; } void MockFS::loop() { std::vector> out; std::unique_ptr in(new mockfs_buf_in); ASSERT_TRUE(in != NULL); while (!m_quit) { bzero(in.get(), sizeof(*in)); read_request(*in); if (m_quit) break; if (verbosity > 0) debug_request(*in); if (pid_ok((pid_t)in->header.pid)) { process(*in, out); } else { /* * Reject any requests from unknown processes. Because * we actually do mount a filesystem, plenty of * unrelated system daemons may try to access it. */ if (verbosity > 1) printf("\tREJECTED (wrong pid %d)\n", in->header.pid); process_default(*in, out); } for (auto &it: out) write_response(*it); out.clear(); } } int MockFS::notify_inval_entry(ino_t parent, const char *name, size_t namelen) { std::unique_ptr out(new mockfs_buf_out); out->header.unique = 0; /* 0 means asynchronous notification */ out->header.error = FUSE_NOTIFY_INVAL_ENTRY; out->body.inval_entry.parent = parent; out->body.inval_entry.namelen = namelen; strlcpy((char*)&out->body.bytes + sizeof(out->body.inval_entry), name, sizeof(out->body.bytes) - sizeof(out->body.inval_entry)); out->header.len = sizeof(out->header) + sizeof(out->body.inval_entry) + namelen; debug_response(*out); write_response(*out); return 0; } int MockFS::notify_inval_inode(ino_t ino, off_t off, ssize_t len) { std::unique_ptr out(new mockfs_buf_out); out->header.unique = 0; /* 0 means asynchronous notification */ out->header.error = FUSE_NOTIFY_INVAL_INODE; out->body.inval_inode.ino = ino; out->body.inval_inode.off = off; out->body.inval_inode.len = len; out->header.len = sizeof(out->header) + sizeof(out->body.inval_inode); debug_response(*out); write_response(*out); return 0; } int MockFS::notify_store(ino_t ino, off_t off, void* data, ssize_t size) { std::unique_ptr out(new mockfs_buf_out); out->header.unique = 0; /* 0 means asynchronous notification */ out->header.error = FUSE_NOTIFY_STORE; out->body.store.nodeid = ino; out->body.store.offset = off; out->body.store.size = size; bcopy(data, (char*)&out->body.bytes + sizeof(out->body.store), size); out->header.len = sizeof(out->header) + sizeof(out->body.store) + size; debug_response(*out); write_response(*out); return 0; } bool MockFS::pid_ok(pid_t pid) { if (pid == m_pid) { return (true); } else if (pid == m_child_pid) { return (true); } else { struct kinfo_proc *ki; bool ok = false; ki = kinfo_getproc(pid); if (ki == NULL) return (false); /* * Allow access by the aio daemon processes so that our tests * can use aio functions */ if (0 == strncmp("aiod", ki->ki_comm, 4)) ok = true; free(ki); return (ok); } } void MockFS::process_default(const mockfs_buf_in& in, std::vector> &out) { std::unique_ptr out0(new mockfs_buf_out); out0->header.unique = in.header.unique; out0->header.error = -EOPNOTSUPP; out0->header.len = sizeof(out0->header); out.push_back(std::move(out0)); } void MockFS::read_request(mockfs_buf_in &in) { ssize_t res; int nready = 0; fd_set readfds; pollfd fds[1]; struct kevent changes[1]; struct kevent events[1]; struct timespec timeout_ts; struct timeval timeout_tv; const int timeout_ms = 999; int timeout_int, nfds; switch (m_pm) { case BLOCKING: break; case KQ: timeout_ts.tv_sec = 0; timeout_ts.tv_nsec = timeout_ms * 1'000'000; while (nready == 0) { EV_SET(&changes[0], m_fuse_fd, EVFILT_READ, EV_ADD, 0, 0, 0); nready = kevent(m_kq, &changes[0], 1, &events[0], 1, &timeout_ts); if (m_quit) return; } ASSERT_LE(0, nready) << strerror(errno); ASSERT_EQ(events[0].ident, (uintptr_t)m_fuse_fd); if (events[0].flags & EV_ERROR) FAIL() << strerror(events[0].data); else if (events[0].flags & EV_EOF) FAIL() << strerror(events[0].fflags); m_nready = events[0].data; break; case POLL: timeout_int = timeout_ms; fds[0].fd = m_fuse_fd; fds[0].events = POLLIN; while (nready == 0) { nready = poll(fds, 1, timeout_int); if (m_quit) return; } ASSERT_LE(0, nready) << strerror(errno); ASSERT_TRUE(fds[0].revents & POLLIN); break; case SELECT: timeout_tv.tv_sec = 0; timeout_tv.tv_usec = timeout_ms * 1'000; nfds = m_fuse_fd + 1; while (nready == 0) { FD_ZERO(&readfds); FD_SET(m_fuse_fd, &readfds); nready = select(nfds, &readfds, NULL, NULL, &timeout_tv); if (m_quit) return; } ASSERT_LE(0, nready) << strerror(errno); ASSERT_TRUE(FD_ISSET(m_fuse_fd, &readfds)); break; default: FAIL() << "not yet implemented"; } res = read(m_fuse_fd, &in, sizeof(in)); if (res < 0 && !m_quit) { FAIL() << "read: " << strerror(errno); m_quit = true; } ASSERT_TRUE(res >= static_cast(sizeof(in.header)) || m_quit); } void MockFS::write_response(const mockfs_buf_out &out) { fd_set writefds; pollfd fds[1]; int nready, nfds; ssize_t r; switch (m_pm) { case BLOCKING: case KQ: /* EVFILT_WRITE is not supported */ break; case POLL: fds[0].fd = m_fuse_fd; fds[0].events = POLLOUT; nready = poll(fds, 1, INFTIM); ASSERT_LE(0, nready) << strerror(errno); ASSERT_EQ(1, nready) << "NULL timeout expired?"; ASSERT_TRUE(fds[0].revents & POLLOUT); break; case SELECT: FD_ZERO(&writefds); FD_SET(m_fuse_fd, &writefds); nfds = m_fuse_fd + 1; nready = select(nfds, NULL, &writefds, NULL, NULL); ASSERT_LE(0, nready) << strerror(errno); ASSERT_EQ(1, nready) << "NULL timeout expired?"; ASSERT_TRUE(FD_ISSET(m_fuse_fd, &writefds)); break; default: FAIL() << "not yet implemented"; } r = write(m_fuse_fd, &out, out.header.len); ASSERT_TRUE(r > 0 || errno == EAGAIN) << strerror(errno); } void* MockFS::service(void *pthr_data) { MockFS *mock_fs = (MockFS*)pthr_data; mock_fs->loop(); return (NULL); } void MockFS::unmount() { ::unmount("mountpoint", 0); } Index: projects/fuse2/tests/sys/fs/fusefs/mockfs.hh =================================================================== --- projects/fuse2/tests/sys/fs/fusefs/mockfs.hh (revision 350114) +++ projects/fuse2/tests/sys/fs/fusefs/mockfs.hh (revision 350115) @@ -1,394 +1,394 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 The FreeBSD Foundation * * This software was developed by BFF Storage Systems, LLC under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ extern "C" { #include #include #include "fuse_kernel.h" } #include #define TIME_T_MAX (std::numeric_limits::max()) /* * A pseudo-fuse errno used indicate that a fuse operation should have no * response, at least not immediately */ #define FUSE_NORESPONSE 9999 #define SET_OUT_HEADER_LEN(out, variant) { \ (out).header.len = (sizeof((out).header) + \ sizeof((out).body.variant)); \ } /* * Create an expectation on FUSE_LOOKUP and return it so the caller can set * actions. * * This must be a macro instead of a method because EXPECT_CALL returns a type * with a deleted constructor. */ #define EXPECT_LOOKUP(parent, path) \ EXPECT_CALL(*m_mock, process( \ ResultOf([=](auto in) { \ return (in.header.opcode == FUSE_LOOKUP && \ in.header.nodeid == (parent) && \ strcmp(in.body.lookup, (path)) == 0); \ }, Eq(true)), \ _) \ ) extern int verbosity; /* This struct isn't defined by fuse_kernel.h or libfuse, but it should be */ struct fuse_create_out { struct fuse_entry_out entry; struct fuse_open_out open; }; /* Protocol 7.8 version of struct fuse_attr */ struct fuse_attr_7_8 { uint64_t ino; uint64_t size; uint64_t blocks; uint64_t atime; uint64_t mtime; uint64_t ctime; uint32_t atimensec; uint32_t mtimensec; uint32_t ctimensec; uint32_t mode; uint32_t nlink; uint32_t uid; uint32_t gid; uint32_t rdev; }; /* Protocol 7.8 version of struct fuse_attr_out */ struct fuse_attr_out_7_8 { uint64_t attr_valid; uint32_t attr_valid_nsec; uint32_t dummy; struct fuse_attr_7_8 attr; }; /* Protocol 7.8 version of struct fuse_entry_out */ struct fuse_entry_out_7_8 { uint64_t nodeid; /* Inode ID */ uint64_t generation; /* Inode generation: nodeid:gen must be unique for the fs's lifetime */ uint64_t entry_valid; /* Cache timeout for the name */ uint64_t attr_valid; /* Cache timeout for the attributes */ uint32_t entry_valid_nsec; uint32_t attr_valid_nsec; struct fuse_attr_7_8 attr; }; /* Output struct for FUSE_CREATE for protocol 7.8 servers */ struct fuse_create_out_7_8 { struct fuse_entry_out_7_8 entry; struct fuse_open_out open; }; /* Output struct for FUSE_INIT for protocol 7.22 and earlier servers */ struct fuse_init_out_7_22 { uint32_t major; uint32_t minor; uint32_t max_readahead; uint32_t flags; uint16_t max_background; uint16_t congestion_threshold; uint32_t max_write; }; union fuse_payloads_in { fuse_access_in access; fuse_bmap_in bmap; /* value is from fuse_kern_chan.c in fusefs-libs */ uint8_t bytes[0x21000 - sizeof(struct fuse_in_header)]; fuse_create_in create; fuse_flush_in flush; fuse_fsync_in fsync; fuse_fsync_in fsyncdir; fuse_forget_in forget; fuse_interrupt_in interrupt; fuse_lk_in getlk; fuse_getxattr_in getxattr; fuse_init_in init; fuse_link_in link; fuse_listxattr_in listxattr; char lookup[0]; fuse_mkdir_in mkdir; fuse_mknod_in mknod; fuse_open_in open; fuse_open_in opendir; fuse_read_in read; fuse_read_in readdir; fuse_release_in release; fuse_release_in releasedir; fuse_rename_in rename; char rmdir[0]; fuse_setattr_in setattr; fuse_setxattr_in setxattr; fuse_lk_in setlk; fuse_lk_in setlkw; char unlink[0]; fuse_write_in write; }; struct mockfs_buf_in { fuse_in_header header; union fuse_payloads_in body; }; union fuse_payloads_out { fuse_attr_out attr; fuse_attr_out_7_8 attr_7_8; fuse_bmap_out bmap; fuse_create_out create; fuse_create_out_7_8 create_7_8; /* * The protocol places no limits on the size of bytes. Choose * a size big enough for anything we'll test. */ uint8_t bytes[0x20000]; fuse_entry_out entry; fuse_entry_out_7_8 entry_7_8; fuse_lk_out getlk; fuse_getxattr_out getxattr; fuse_init_out init; fuse_init_out_7_22 init_7_22; /* The inval_entry structure should be followed by the entry's name */ fuse_notify_inval_entry_out inval_entry; fuse_notify_inval_inode_out inval_inode; /* The store structure should be followed by the data to store */ fuse_notify_store_out store; fuse_listxattr_out listxattr; fuse_open_out open; fuse_statfs_out statfs; /* * The protocol places no limits on the length of the string. This is * merely convenient for testing. */ char str[80]; fuse_write_out write; }; struct mockfs_buf_out { fuse_out_header header; union fuse_payloads_out body; /* Default constructor: zero everything */ mockfs_buf_out() { memset(this, 0, sizeof(*this)); } }; /* A function that can be invoked in place of MockFS::process */ typedef std::function> &out)> ProcessMockerT; /* * Helper function used for setting an error expectation for any fuse operation. * The operation will return the supplied error */ ProcessMockerT ReturnErrno(int error); /* Helper function used for returning negative cache entries for LOOKUP */ ProcessMockerT ReturnNegativeCache(const struct timespec *entry_valid); /* Helper function used for returning a single immediate response */ ProcessMockerT ReturnImmediate( std::function f); /* How the daemon should check /dev/fuse for readiness */ enum poll_method { BLOCKING, SELECT, POLL, KQ }; /* * Fake FUSE filesystem * * "Mounts" a filesystem to a temporary directory and services requests * according to the programmed expectations. * * Operates directly on the fusefs(4) kernel API, not the libfuse(3) user api. */ class MockFS { /* * thread id of the fuse daemon thread * * It must run in a separate thread so it doesn't deadlock with the * client test code. */ pthread_t m_daemon_id; /* file descriptor of /dev/fuse control device */ int m_fuse_fd; /* The minor version of the kernel API that this mock daemon targets */ uint32_t m_kernel_minor_version; int m_kq; /* The max_readahead file system option */ uint32_t m_maxreadahead; /* pid of the test process */ pid_t m_pid; /* Method the daemon should use for I/O to and from /dev/fuse */ enum poll_method m_pm; /* Timestamp granularity in nanoseconds */ unsigned m_time_gran; void debug_request(const mockfs_buf_in&); void debug_response(const mockfs_buf_out&); /* Initialize a session after mounting */ void init(uint32_t flags); /* Is pid from a process that might be involved in the test? */ bool pid_ok(pid_t pid); /* Default request handler */ void process_default(const mockfs_buf_in&, std::vector>&); /* Entry point for the daemon thread */ static void* service(void*); /* Read, but do not process, a single request from the kernel */ void read_request(mockfs_buf_in& in); /* Write a single response back to the kernel */ void write_response(const mockfs_buf_out &out); public: /* pid of child process, for two-process test cases */ pid_t m_child_pid; /* Maximum size of a FUSE_WRITE write */ uint32_t m_maxwrite; /* * Number of events that were available from /dev/fuse after the last * kevent call. Only valid when m_pm = KQ. */ int m_nready; /* Tell the daemon to shut down ASAP */ bool m_quit; /* Create a new mockfs and mount it to a tempdir */ MockFS(int max_readahead, bool allow_other, bool default_permissions, bool push_symlinks_in, bool ro, enum poll_method pm, uint32_t flags, uint32_t kernel_minor_version, uint32_t max_write, bool async, - bool no_clusterr, unsigned time_gran); + bool no_clusterr, unsigned time_gran, bool nointr); virtual ~MockFS(); /* Kill the filesystem daemon without unmounting the filesystem */ void kill_daemon(); /* Process FUSE requests endlessly */ void loop(); /* * Send an asynchronous notification to invalidate a directory entry. * Similar to libfuse's fuse_lowlevel_notify_inval_entry * * This method will block until the client has responded, so it should * generally be run in a separate thread from request processing. * * @param parent Parent directory's inode number * @param name name of dirent to invalidate * @param namelen size of name, including the NUL */ int notify_inval_entry(ino_t parent, const char *name, size_t namelen); /* * Send an asynchronous notification to invalidate an inode's cached * data and/or attributes. Similar to libfuse's * fuse_lowlevel_notify_inval_inode. * * This method will block until the client has responded, so it should * generally be run in a separate thread from request processing. * * @param ino File's inode number * @param off offset at which to begin invalidation. A * negative offset means to invalidate attributes * only. * @param len Size of region of data to invalidate. 0 means * to invalidate all cached data. */ int notify_inval_inode(ino_t ino, off_t off, ssize_t len); /* * Send an asynchronous notification to store data directly into an * inode's cache. Similar to libfuse's fuse_lowlevel_notify_store. * * This method will block until the client has responded, so it should * generally be run in a separate thread from request processing. * * @param ino File's inode number * @param off Offset at which to store data * @param data Pointer to the data to cache * @param len Size of data */ int notify_store(ino_t ino, off_t off, void* data, ssize_t size); /* * Request handler * * This method is expected to provide the responses to each FUSE * operation. For an immediate response, push one buffer into out. * For a delayed response, push nothing. For an immediate response * plus a delayed response to an earlier operation, push two bufs. * Test cases must define each response using Googlemock expectations */ MOCK_METHOD2(process, void(const mockfs_buf_in&, std::vector>&)); /* Gracefully unmount */ void unmount(); }; Index: projects/fuse2/tests/sys/fs/fusefs/utils.cc =================================================================== --- projects/fuse2/tests/sys/fs/fusefs/utils.cc (revision 350114) +++ projects/fuse2/tests/sys/fs/fusefs/utils.cc (revision 350115) @@ -1,592 +1,593 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 The FreeBSD Foundation * * This software was developed by BFF Storage Systems, LLC under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ extern "C" { #include #include #include #include #include #include #include #include #include #include #include } #include #include "mockfs.hh" #include "utils.hh" using namespace testing; /* * The default max_write is set to this formula in libfuse, though * individual filesystems can lower it. The "- 4096" was added in * commit 154ffe2, with the commit message "fix". */ const uint32_t libfuse_max_write = 32 * getpagesize() + 0x1000 - 4096; /* * Set the default max_write to a distinct value from MAXPHYS to catch bugs * that confuse the two. */ const uint32_t default_max_write = MIN(libfuse_max_write, MAXPHYS / 2); /* Check that fusefs(4) is accessible and the current user can mount(2) */ void check_environment() { const char *devnode = "/dev/fuse"; const char *usermount_node = "vfs.usermount"; int usermount_val = 0; size_t usermount_size = sizeof(usermount_val); if (eaccess(devnode, R_OK | W_OK)) { if (errno == ENOENT) { GTEST_SKIP() << devnode << " does not exist"; } else if (errno == EACCES) { GTEST_SKIP() << devnode << " is not accessible by the current user"; } else { GTEST_SKIP() << strerror(errno); } } sysctlbyname(usermount_node, &usermount_val, &usermount_size, NULL, 0); if (geteuid() != 0 && !usermount_val) GTEST_SKIP() << "current user is not allowed to mount"; } class FuseEnv: public Environment { virtual void SetUp() { } }; void FuseTest::SetUp() { const char *maxbcachebuf_node = "vfs.maxbcachebuf"; const char *maxphys_node = "kern.maxphys"; int val = 0; size_t size = sizeof(val); /* * XXX check_environment should be called from FuseEnv::SetUp, but * can't due to https://github.com/google/googletest/issues/2189 */ check_environment(); if (IsSkipped()) return; ASSERT_EQ(0, sysctlbyname(maxbcachebuf_node, &val, &size, NULL, 0)) << strerror(errno); m_maxbcachebuf = val; ASSERT_EQ(0, sysctlbyname(maxphys_node, &val, &size, NULL, 0)) << strerror(errno); m_maxphys = val; try { m_mock = new MockFS(m_maxreadahead, m_allow_other, m_default_permissions, m_push_symlinks_in, m_ro, m_pm, m_init_flags, m_kernel_minor_version, - m_maxwrite, m_async, m_noclusterr, m_time_gran); + m_maxwrite, m_async, m_noclusterr, m_time_gran, + m_nointr); /* * FUSE_ACCESS is called almost universally. Expecting it in * each test case would be super-annoying. Instead, set a * default expectation for FUSE_ACCESS and return ENOSYS. * * Individual test cases can override this expectation since * googlemock evaluates expectations in LIFO order. */ EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_ACCESS); }, Eq(true)), _) ).Times(AnyNumber()) .WillRepeatedly(Invoke(ReturnErrno(ENOSYS))); /* * FUSE_BMAP is called for most test cases that read data. Set * a default expectation and return ENOSYS. * * Individual test cases can override this expectation since * googlemock evaluates expectations in LIFO order. */ EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_BMAP); }, Eq(true)), _) ).Times(AnyNumber()) .WillRepeatedly(Invoke(ReturnErrno(ENOSYS))); } catch (std::system_error err) { FAIL() << err.what(); } } void FuseTest::expect_access(uint64_t ino, mode_t access_mode, int error) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_ACCESS && in.header.nodeid == ino && in.body.access.mask == access_mode); }, Eq(true)), _) ).WillOnce(Invoke(ReturnErrno(error))); } void FuseTest::expect_destroy(int error) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_DESTROY); }, Eq(true)), _) ).WillOnce(Invoke( ReturnImmediate([&](auto in, auto& out) { m_mock->m_quit = true; out.header.len = sizeof(out.header); out.header.unique = in.header.unique; out.header.error = -error; }))); } void FuseTest::expect_flush(uint64_t ino, int times, ProcessMockerT r) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_FLUSH && in.header.nodeid == ino); }, Eq(true)), _) ).Times(times) .WillRepeatedly(Invoke(r)); } void FuseTest::expect_forget(uint64_t ino, uint64_t nlookup, sem_t *sem) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_FORGET && in.header.nodeid == ino && in.body.forget.nlookup == nlookup); }, Eq(true)), _) ).WillOnce(Invoke([=](auto in __unused, auto &out __unused) { if (sem != NULL) sem_post(sem); /* FUSE_FORGET has no response! */ })); } void FuseTest::expect_getattr(uint64_t ino, uint64_t size) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_GETATTR && in.header.nodeid == ino); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { SET_OUT_HEADER_LEN(out, attr); out.body.attr.attr.ino = ino; // Must match nodeid out.body.attr.attr.mode = S_IFREG | 0644; out.body.attr.attr.size = size; out.body.attr.attr_valid = UINT64_MAX; }))); } void FuseTest::expect_lookup(const char *relpath, uint64_t ino, mode_t mode, uint64_t size, int times, uint64_t attr_valid, uid_t uid, gid_t gid) { EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) .Times(times) .WillRepeatedly(Invoke( ReturnImmediate([=](auto in __unused, auto& out) { SET_OUT_HEADER_LEN(out, entry); out.body.entry.attr.mode = mode; out.body.entry.nodeid = ino; out.body.entry.attr.nlink = 1; out.body.entry.attr_valid = attr_valid; out.body.entry.attr.size = size; out.body.entry.attr.uid = uid; out.body.entry.attr.gid = gid; }))); } void FuseTest::expect_lookup_7_8(const char *relpath, uint64_t ino, mode_t mode, uint64_t size, int times, uint64_t attr_valid, uid_t uid, gid_t gid) { EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) .Times(times) .WillRepeatedly(Invoke( ReturnImmediate([=](auto in __unused, auto& out) { SET_OUT_HEADER_LEN(out, entry_7_8); out.body.entry.attr.mode = mode; out.body.entry.nodeid = ino; out.body.entry.attr.nlink = 1; out.body.entry.attr_valid = attr_valid; out.body.entry.attr.size = size; out.body.entry.attr.uid = uid; out.body.entry.attr.gid = gid; }))); } void FuseTest::expect_open(uint64_t ino, uint32_t flags, int times) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_OPEN && in.header.nodeid == ino); }, Eq(true)), _) ).Times(times) .WillRepeatedly(Invoke( ReturnImmediate([=](auto in __unused, auto& out) { out.header.len = sizeof(out.header); SET_OUT_HEADER_LEN(out, open); out.body.open.fh = FH; out.body.open.open_flags = flags; }))); } void FuseTest::expect_opendir(uint64_t ino) { /* opendir(3) calls fstatfs */ EXPECT_CALL(*m_mock, process( ResultOf([](auto in) { return (in.header.opcode == FUSE_STATFS); }, Eq(true)), _) ).WillRepeatedly(Invoke( ReturnImmediate([=](auto i __unused, auto& out) { SET_OUT_HEADER_LEN(out, statfs); }))); EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_OPENDIR && in.header.nodeid == ino); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { out.header.len = sizeof(out.header); SET_OUT_HEADER_LEN(out, open); out.body.open.fh = FH; }))); } void FuseTest::expect_read(uint64_t ino, uint64_t offset, uint64_t isize, uint64_t osize, const void *contents, int flags) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_READ && in.header.nodeid == ino && in.body.read.fh == FH && in.body.read.offset == offset && in.body.read.size == isize && flags == -1 ? (in.body.read.flags == O_RDONLY || in.body.read.flags == O_RDWR) : in.body.read.flags == (uint32_t)flags); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { out.header.len = sizeof(struct fuse_out_header) + osize; memmove(out.body.bytes, contents, osize); }))).RetiresOnSaturation(); } void FuseTest::expect_readdir(uint64_t ino, uint64_t off, std::vector &ents) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_READDIR && in.header.nodeid == ino && in.body.readdir.fh == FH && in.body.readdir.offset == off); }, Eq(true)), _) ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in, auto& out) { struct fuse_dirent *fde = (struct fuse_dirent*)&(out.body); int i = 0; out.header.error = 0; out.header.len = 0; for (const auto& it: ents) { size_t entlen, entsize; fde->ino = it.d_fileno; fde->off = it.d_off; fde->type = it.d_type; fde->namelen = it.d_namlen; strncpy(fde->name, it.d_name, it.d_namlen); entlen = FUSE_NAME_OFFSET + fde->namelen; entsize = FUSE_DIRENT_SIZE(fde); /* * The FUSE protocol does not require zeroing out the * unused portion of the name. But it's a good * practice to prevent information disclosure to the * FUSE client, even though the client is usually the * kernel */ memset(fde->name + fde->namelen, 0, entsize - entlen); if (out.header.len + entsize > in.body.read.size) { printf("Overflow in readdir expectation: i=%d\n" , i); break; } out.header.len += entsize; fde = (struct fuse_dirent*) ((intmax_t*)fde + entsize / sizeof(intmax_t)); i++; } out.header.len += sizeof(out.header); }))); } void FuseTest::expect_release(uint64_t ino, uint64_t fh) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_RELEASE && in.header.nodeid == ino && in.body.release.fh == fh); }, Eq(true)), _) ).WillOnce(Invoke(ReturnErrno(0))); } void FuseTest::expect_releasedir(uint64_t ino, ProcessMockerT r) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_RELEASEDIR && in.header.nodeid == ino && in.body.release.fh == FH); }, Eq(true)), _) ).WillOnce(Invoke(r)); } void FuseTest::expect_unlink(uint64_t parent, const char *path, int error) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { return (in.header.opcode == FUSE_UNLINK && 0 == strcmp(path, in.body.unlink) && in.header.nodeid == parent); }, Eq(true)), _) ).WillOnce(Invoke(ReturnErrno(error))); } void FuseTest::expect_write(uint64_t ino, uint64_t offset, uint64_t isize, uint64_t osize, uint32_t flags_set, uint32_t flags_unset, const void *contents) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { const char *buf = (const char*)in.body.bytes + sizeof(struct fuse_write_in); bool pid_ok; uint32_t wf = in.body.write.write_flags; if (wf & FUSE_WRITE_CACHE) pid_ok = true; else pid_ok = (pid_t)in.header.pid == getpid(); return (in.header.opcode == FUSE_WRITE && in.header.nodeid == ino && in.body.write.fh == FH && in.body.write.offset == offset && in.body.write.size == isize && pid_ok && (wf & flags_set) == flags_set && (wf & flags_unset) == 0 && (in.body.write.flags == O_WRONLY || in.body.write.flags == O_RDWR) && 0 == bcmp(buf, contents, isize)); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { SET_OUT_HEADER_LEN(out, write); out.body.write.size = osize; }))); } void FuseTest::expect_write_7_8(uint64_t ino, uint64_t offset, uint64_t isize, uint64_t osize, const void *contents) { EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { const char *buf = (const char*)in.body.bytes + FUSE_COMPAT_WRITE_IN_SIZE; bool pid_ok = (pid_t)in.header.pid == getpid(); return (in.header.opcode == FUSE_WRITE && in.header.nodeid == ino && in.body.write.fh == FH && in.body.write.offset == offset && in.body.write.size == isize && pid_ok && 0 == bcmp(buf, contents, isize)); }, Eq(true)), _) ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { SET_OUT_HEADER_LEN(out, write); out.body.write.size = osize; }))); } void get_unprivileged_id(uid_t *uid, gid_t *gid) { struct passwd *pw; struct group *gr; /* * First try "tests", Kyua's default unprivileged user. XXX after * GoogleTest gains a proper Kyua wrapper, get this with the Kyua API */ pw = getpwnam("tests"); if (pw == NULL) { /* Fall back to "nobody" */ pw = getpwnam("nobody"); } if (pw == NULL) GTEST_SKIP() << "Test requires an unprivileged user"; /* Use group "nobody", which is Kyua's default unprivileged group */ gr = getgrnam("nobody"); if (gr == NULL) GTEST_SKIP() << "Test requires an unprivileged group"; *uid = pw->pw_uid; *gid = gr->gr_gid; } void FuseTest::fork(bool drop_privs, int *child_status, std::function parent_func, std::function child_func) { sem_t *sem; int mprot = PROT_READ | PROT_WRITE; int mflags = MAP_ANON | MAP_SHARED; pid_t child; uid_t uid; gid_t gid; if (drop_privs) { get_unprivileged_id(&uid, &gid); if (IsSkipped()) return; } sem = (sem_t*)mmap(NULL, sizeof(*sem), mprot, mflags, -1, 0); ASSERT_NE(MAP_FAILED, sem) << strerror(errno); ASSERT_EQ(0, sem_init(sem, 1, 0)) << strerror(errno); if ((child = ::fork()) == 0) { /* In child */ int err = 0; if (sem_wait(sem)) { perror("sem_wait"); err = 1; goto out; } if (drop_privs && 0 != setegid(gid)) { perror("setegid"); err = 1; goto out; } if (drop_privs && 0 != setreuid(-1, uid)) { perror("setreuid"); err = 1; goto out; } err = child_func(); out: sem_destroy(sem); _exit(err); } else if (child > 0) { /* * In parent. Cleanup must happen here, because it's still * privileged. */ m_mock->m_child_pid = child; ASSERT_NO_FATAL_FAILURE(parent_func()); /* Signal the child process to go */ ASSERT_EQ(0, sem_post(sem)) << strerror(errno); ASSERT_LE(0, wait(child_status)) << strerror(errno); } else { FAIL() << strerror(errno); } munmap(sem, sizeof(*sem)); return; } static void usage(char* progname) { fprintf(stderr, "Usage: %s [-v]\n\t-v increase verbosity\n", progname); exit(2); } int main(int argc, char **argv) { int ch; FuseEnv *fuse_env = new FuseEnv; InitGoogleTest(&argc, argv); AddGlobalTestEnvironment(fuse_env); while ((ch = getopt(argc, argv, "v")) != -1) { switch (ch) { case 'v': verbosity++; break; default: usage(argv[0]); break; } } return (RUN_ALL_TESTS()); } Index: projects/fuse2/tests/sys/fs/fusefs/utils.hh =================================================================== --- projects/fuse2/tests/sys/fs/fusefs/utils.hh (revision 350114) +++ projects/fuse2/tests/sys/fs/fusefs/utils.hh (revision 350115) @@ -1,234 +1,236 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 The FreeBSD Foundation * * This software was developed by BFF Storage Systems, LLC under sponsorship * from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ struct _sem; typedef struct _sem sem_t; struct _dirdesc; typedef struct _dirdesc DIR; /* Nanoseconds to sleep, for tests that must */ #define NAP_NS (100'000'000) void get_unprivileged_id(uid_t *uid, gid_t *gid); inline void nap() { usleep(NAP_NS / 1000); } extern const uint32_t libfuse_max_write; extern const uint32_t default_max_write; class FuseTest : public ::testing::Test { protected: uint32_t m_maxreadahead; uint32_t m_maxwrite; uint32_t m_init_flags; bool m_allow_other; bool m_default_permissions; uint32_t m_kernel_minor_version; enum poll_method m_pm; bool m_push_symlinks_in; bool m_ro; bool m_async; bool m_noclusterr; + bool m_nointr; unsigned m_time_gran; MockFS *m_mock = NULL; const static uint64_t FH = 0xdeadbeef1a7ebabe; public: int m_maxbcachebuf; int m_maxphys; FuseTest(): m_maxreadahead(0), m_maxwrite(default_max_write), m_init_flags(0), m_allow_other(false), m_default_permissions(false), m_kernel_minor_version(FUSE_KERNEL_MINOR_VERSION), m_pm(BLOCKING), m_push_symlinks_in(false), m_ro(false), m_async(false), m_noclusterr(false), + m_nointr(false), m_time_gran(1) {} virtual void SetUp(); virtual void TearDown() { if (m_mock) delete m_mock; } /* * Create an expectation that FUSE_ACCESS will be called once for the * given inode with the given access_mode, returning the given errno */ void expect_access(uint64_t ino, mode_t access_mode, int error); /* Expect FUSE_DESTROY and shutdown the daemon */ void expect_destroy(int error); /* * Create an expectation that FUSE_FLUSH will be called times times for * the given inode */ void expect_flush(uint64_t ino, int times, ProcessMockerT r); /* * Create an expectation that FUSE_FORGET will be called for the given * inode. There will be no response. If sem is provided, it will be * posted after the operation is received by the daemon. */ void expect_forget(uint64_t ino, uint64_t nlookup, sem_t *sem = NULL); /* * Create an expectation that FUSE_GETATTR will be called for the given * inode any number of times. It will respond with a few basic * attributes, like the given size and the mode S_IFREG | 0644 */ void expect_getattr(uint64_t ino, uint64_t size); /* * Create an expectation that FUSE_LOOKUP will be called for the given * path exactly times times and cache validity period. It will respond * with inode ino, mode mode, filesize size. */ void expect_lookup(const char *relpath, uint64_t ino, mode_t mode, uint64_t size, int times, uint64_t attr_valid = UINT64_MAX, uid_t uid = 0, gid_t gid = 0); /* The protocol 7.8 version of expect_lookup */ void expect_lookup_7_8(const char *relpath, uint64_t ino, mode_t mode, uint64_t size, int times, uint64_t attr_valid = UINT64_MAX, uid_t uid = 0, gid_t gid = 0); /* * Create an expectation that FUSE_OPEN will be called for the given * inode exactly times times. It will return with open_flags flags and * file handle FH. */ void expect_open(uint64_t ino, uint32_t flags, int times); /* * Create an expectation that FUSE_OPENDIR will be called exactly once * for inode ino. */ void expect_opendir(uint64_t ino); /* * Create an expectation that FUSE_READ will be called exactly once for * the given inode, at offset offset and with size isize. It will * return the first osize bytes from contents * * Protocol 7.8 tests can use this same expectation method because * nothing currently validates the size of the fuse_read_in struct. */ void expect_read(uint64_t ino, uint64_t offset, uint64_t isize, uint64_t osize, const void *contents, int flags = -1); /* * Create an expectation that FUSE_READIR will be called any number of * times on the given ino with the given offset, returning (by copy) * the provided entries */ void expect_readdir(uint64_t ino, uint64_t off, std::vector &ents); /* * Create an expectation that FUSE_RELEASE will be called exactly once * for the given inode and filehandle, returning success */ void expect_release(uint64_t ino, uint64_t fh); /* * Create an expectation that FUSE_RELEASEDIR will be called exactly * once for the given inode */ void expect_releasedir(uint64_t ino, ProcessMockerT r); /* * Create an expectation that FUSE_UNLINK will be called exactly once * for the given path, returning an errno */ void expect_unlink(uint64_t parent, const char *path, int error); /* * Create an expectation that FUSE_WRITE will be called exactly once * for the given inode, at offset offset, with size isize and buffer * contents. Any flags present in flags_set must be set, and any * present in flags_unset must not be set. Other flags are don't care. * It will return osize. */ void expect_write(uint64_t ino, uint64_t offset, uint64_t isize, uint64_t osize, uint32_t flags_set, uint32_t flags_unset, const void *contents); /* Protocol 7.8 version of expect_write */ void expect_write_7_8(uint64_t ino, uint64_t offset, uint64_t isize, uint64_t osize, const void *contents); /* * Helper that runs code in a child process. * * First, parent_func runs in the parent process. * Then, child_func runs in the child process, dropping privileges if * desired. * Finally, fusetest_fork returns. * * # Returns * * fusetest_fork may SKIP the test, which the caller should detect with * the IsSkipped() method. If not, then the child's exit status will * be returned in status. */ void fork(bool drop_privs, int *status, std::function parent_func, std::function child_func); /* * Deliberately leak a file descriptor. * * Closing a file descriptor on fusefs would cause the server to * receive FUSE_CLOSE and possibly FUSE_INACTIVE. Handling those * operations would needlessly complicate most tests. So most tests * deliberately leak the file descriptors instead. This method serves * to document the leakage, and provide a single point of suppression * for static analyzers. */ static void leak(int fd __unused) {} /* * Deliberately leak a DIR* pointer * * See comments for FuseTest::leak */ static void leakdir(DIR* dirp __unused) {} };