Index: head/MAINTAINERS =================================================================== --- head/MAINTAINERS (revision 350664) +++ head/MAINTAINERS (revision 350665) @@ -1,132 +1,133 @@ $FreeBSD$ Please note that the content of this file is strictly advisory. No locks listed here are valid. The only strict review requirements are granted by core. These are documented in head/LOCKS and enforced by svnadmin/conf/approvers. The source tree is a community effort. However, some folks go to the trouble of looking after particular areas of the tree. In return for their active caretaking of the code it is polite to coordinate changes with them. This is a list of people who have expressed an interest in part of the code or listed their active caretaking role so that other committers can easily find somebody who is familiar with it. The notes should specify if there is a 3rd party source tree involved or other things that should be kept in mind. However, this is not a 'big stick', it is an offer to help and a source of guidance. It does not override the communal nature of the tree. It is not a registry of 'turf' or private property. *** This list is prone to becoming stale quickly. The best way to find the recent maintainer of a sub-system is to check recent logs for that directory or sub-system. *** *** Maintainers are encouraged to visit: https://reviews.freebsd.org/herald and configure notifications for parts of the tree which they maintain. Notifications can automatically be sent when someone proposes a revision or makes a commit to the specified subtree. *** subsystem login notes ----------------------------- ath(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org contrib/atf ngie,#test Pre-commit review requested. contrib/capsicum-test ngie,#capsicum,#test Pre-commit review requested. contrib/compiler-rt dim Pre-commit review preferred. contrib/googletest ngie,#test Pre-commit review requested. 
contrib/ipfilter cy Pre-commit review requested. contrib/libc++ dim Pre-commit review preferred. contrib/libcxxrt dim Pre-commit review preferred. contrib/libunwind dim,emaste,jhb Pre-commit review preferred. contrib/llvm dim Pre-commit review preferred. contrib/llvm/tools/lldb dim,emaste Pre-commit review preferred. contrib/netbsd-tests ngie,#test Pre-commit review requested. contrib/pjdfstest asomers,ngie,pjd,#test Pre-commit review requested. *env(3) secteam Due to the problematic security history of this code, please have patches reviewed by secteam. etc/mail gshapiro Pre-commit review requested. Keep in sync with -STABLE. etc/sendmail gshapiro Pre-commit review requested. Keep in sync with -STABLE. fetch des Pre-commit review requested, email only. +fusefs(5) asomers Pre-commit review requested. geli pjd Pre-commit review requested (both sys/geom/eli/ and sbin/geom/class/eli/). isci(4) jimharris Pre-commit review requested. iwm(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org iwn(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org kqueue jmg Pre-commit review requested. Documentation Required. libdpv dteske Pre-commit review requested. Keep in sync with dpv(1). libfetch des Pre-commit review requested, email only. libfigpar dteske Pre-commit review requested. libm freebsd-numerics Send email with patches to freebsd-numerics@ libpam des Pre-commit review requested, email only. linprocfs des Pre-commit review requested, email only. lpr gad Pre-commit review requested, particularly for lpd/recvjob.c and lpd/printjob.c. nanobsd imp Pre-commit phabricator review requested. net80211 adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org nfs freebsd-fs@FreeBSD.org, rmacklem is best for reviews. nvd(4) jimharris Pre-commit review requested. nvme(4) jimharris Pre-commit review requested. nvmecontrol(8) jimharris Pre-commit review requested. opencrypto jmg Pre-commit review requested. 
Documentation Required. openssh des Pre-commit review requested, email only. openssl benl,jkim Pre-commit review requested. otus(4) adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org pci bus imp,jhb Pre-commit review requested. pmcstudy(8) rrs Pre-commit review requested. procfs des Pre-commit review requested, email only. pseudofs des Pre-commit review requested, email only. release/release.sh gjb,re Pre-commit review and regression tests requested. sctp rrs,tuexen Pre-commit review requested (changes need to be backported to github). sendmail gshapiro Pre-commit review requested. sh(1) jilles Pre-commit review requested. This also applies to kill(1), printf(1) and test(1) which are compiled in as builtins. share/mk imp, bapt, bdrewery, emaste, sjg Make is hard. share/mk/*.test.mk imp,bapt,bdrewery, Pre-commit review requested. emaste,ngie,sjg,#test stand/forth dteske Pre-commit review requested. stand/lua kevans Pre-commit review requested sys/compat/linuxkpi hselasky If in doubt, ask. zeising, johalun pre-commit review requested via #x11 phabricator group. (to avoid drm graphics drivers impact) sys/contrib/ipfilter cy Pre-commit review requested. sys/dev/e1000 erj Pre-commit phabricator review requested. sys/dev/ixgbe erj Pre-commit phabricator review requested. sys/dev/ixl erj Pre-commit phabricator review requested. sys/dev/sound/usb hselasky If in doubt, ask. sys/dev/usb hselasky If in doubt, ask. sys/dev/xen royger Pre-commit review recommended. sys/netinet/ip_carp.c glebius Pre-commit review recommended. sys/netpfil/pf kp,glebius Pre-commit review recommended. sys/x86/xen royger Pre-commit review recommended. sys/xen royger Pre-commit review recommended. tests ngie,#test Pre-commit review requested. tools/build imp Pre-commit review requested, especially to fix bootstrap issues. top(1) eadler Pre-commit review requested. usr.sbin/bsdconfig dteske Pre-commit phabricator review requested. usr.sbin/dpv dteske Pre-commit review requested. 
Keep in sync with libdpv. usr.sbin/pkg pkg@ Please coordinate behavior or flag changes with pkg team. usr.sbin/sysrc dteske Pre-commit phabricator review requested. Keep in sync with bsdconfig(8) sysrc.subr. vmm(4) tychon, jhb Pre-commit review requested via #bhyve phabricator group. libvmmapi tychon, jhb Pre-commit review requested via #bhyve phabricator group. usr.sbin/bhyve* tychon, jhb Pre-commit review requested via #bhyve phabricator group. autofs(5) trasz Pre-commit review recommended. iscsi(4) trasz Pre-commit review recommended. rctl(8) trasz Pre-commit review recommended. sys/dev/ofw nwhitehorn Pre-commit review recommended. sys/dev/drm* imp Pre-commit review requested in phabricator. Changes need to be mirrored in github repo. sys/dev/usb/wlan adrian Pre-commit review requested, send to freebsd-wireless@freebsd.org sys/arm/allwinner manu Pre-commit review requested sys/arm64/rockchip manu Pre-commit review requested Property changes on: head/MAINTAINERS ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /projects/fuse2/MAINTAINERS:r344558-350621 Index: head/UPDATING =================================================================== --- head/UPDATING (revision 350664) +++ head/UPDATING (revision 350665) @@ -1,2076 +1,2088 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. 
 NOTE TO PEOPLE WHO THINK THAT FreeBSD 13.x IS SLOW:
 	FreeBSD 13.x has many debugging features turned on, in both the kernel
 	and userland. These features attempt to detect incorrect use of
 	system primitives, and encourage loud failure through extra sanity
 	checking and fail stop semantics. They also substantially impact
 	system performance. If you want to do performance measurement,
 	benchmarking, and optimization, you'll want to turn them off. This
 	includes various WITNESS-related kernel options, INVARIANTS, malloc
 	debugging flags in userland, and various verbose features in the
 	kernel. Many developers choose to disable these features on build
 	machines to maximize performance. (To completely disable malloc
 	debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely
 	disable the most expensive debugging functionality run
 	"ln -s 'abort:false,junk:false' /etc/malloc.conf".)

+20190727:
+	The vfs.fusefs.sync_unmount and vfs.fusefs.init_backgrounded sysctls
+	and the "-o sync_unmount" and "-o init_backgrounded" mount options have
+	been removed from mount_fusefs(8). You can safely remove them from
+	your scripts, because they had no effect.
+
+	The vfs.fusefs.fix_broken_io, vfs.fusefs.sync_resize,
+	vfs.fusefs.refresh_size, vfs.fusefs.mmap_enable,
+	vfs.fusefs.reclaim_revoked, and vfs.fusefs.data_cache_invalidate
+	sysctls have been removed. If you felt the need to set any of them to
+	a non-default value, please tell asomers@FreeBSD.org why.
+
 20190713:
 	Default permissions on the /var/account/acct file (and copies of it
 	rotated by periodic daily scripts) are changed from 0644 to 0640
 	because the file contains sensitive information that should not be
 	world-readable. If the /var/account directory must be created by
 	rc.d/accounting, the mode used is now 0750. Admins who use the
 	accounting feature are encouraged to change the mode of an existing
 	/var/account directory to 0750 or 0700.
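For the 20190727 entry above, leftover references to the removed fusefs knobs can be found with a plain grep(1). The following is a minimal sketch, not part of the commit: the function name find_removed_fusefs_knobs is hypothetical, and which files to scan is left to the caller.

```sh
#!/bin/sh
# Sketch only: list lines that still mention the fusefs sysctls or mount
# options removed on 20190727. The function name is hypothetical.
find_removed_fusefs_knobs() {
	# /dev/null is appended so that grep always prints the file name,
	# even when only one file is given. Output is file:line:text.
	grep -n -E \
	    'sync_unmount|init_backgrounded|vfs\.fusefs\.(fix_broken_io|sync_resize|refresh_size|mmap_enable|reclaim_revoked|data_cache_invalidate)' \
	    "$@" /dev/null
}
```

One might run it as "find_removed_fusefs_knobs /etc/sysctl.conf /etc/fstab" (those two files are only a guess at where such settings live). The exit status is that of grep: non-zero means nothing was found.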
20190620: Entropy collection and the /dev/random device are no longer optional components. The "device random" option has been removed. Implementations of distilling algorithms can still be made loadable with "options RANDOM_LOADABLE" (e.g., random_fortuna.ko). 20190612: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 8.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20190608: A fix was applied to i386 kernel modules to avoid panics with dpcpu or vnet. Users need to recompile i386 kernel modules having pcpu or vnet sections or they will refuse to load. 20190513: User-wired pages now have their own counter, vm.stats.vm.v_user_wire_count. The vm.max_wired sysctl was renamed to vm.max_user_wired and changed from an unsigned int to an unsigned long. bhyve VMs wired with the -S are now subject to the user wiring limit; the vm.max_user_wired sysctl may need to be tuned to avoid running into the limit. 20190507: The IPSEC option has been removed from GENERIC. Users requiring ipsec(4) must now load the ipsec(4) kernel module. 20190507: The tap(4) driver has been folded into tun(4), and the module has been renamed to tuntap. You should update any kld_load="if_tap" or kld_load="if_tun" entries in /etc/rc.conf, if_tap_load="YES" or if_tun_load="YES" entries in /boot/loader.conf to load the if_tuntap module instead, and "device tap" or "device tun" entries in kernel config files to select the tuntap device instead. 20190418: The following knobs have been added related to tradeoffs between safe use of the random device and availability in the absence of entropy: kern.random.initial_seeding.bypass_before_seeding: tunable; set non-zero to bypass the random device prior to seeding, or zero to block random requests until the random device is initially seeded. For now, set to 1 (unsafe) by default to restore pre-r346250 boot availability properties. 
 	kern.random.initial_seeding.read_random_bypassed_before_seeding:
 	read-only diagnostic sysctl that is set when bypass is enabled and
 	read_random(9) is bypassed, to enable programmatic handling of this
 	initial condition, if desired.

 	kern.random.initial_seeding.arc4random_bypassed_before_seeding:
 	Similar to the above, but for arc4random(9) initial seeding.

 	kern.random.initial_seeding.disable_bypass_warnings: tunable; set
 	non-zero to disable warnings in dmesg when the same conditions are
 	met as for the diagnostic sysctls above. Defaults to zero, i.e.,
 	produce warnings in dmesg when the conditions are met.

 20190416:
 	The loadable random module KPI has changed; the random_infra_init()
 	routine now requires a 3rd function pointer for a bool (*)(void)
 	method that returns true if the random device is seeded (and
 	therefore unblocked).

 20190404:
 	r345895 reverts r320698. This implies that an nfsuserd(8) daemon
 	built from head sources between r320757 (July 6, 2017) and
 	r338192 (Aug. 22, 2018) will not work unless the "-use-udpsock"
 	option is added to the command line.
 	nfsuserd daemons built from head sources that are post-r338192 are
 	not affected and should continue to work.

 20190320:
 	The fuse(4) module has been renamed to fusefs(4) for consistency with
 	other filesystems. You should update any kld_load="fuse" entries in
 	/etc/rc.conf, fuse_load="YES" entries in /boot/loader.conf, and
 	"options FUSE" entries in kernel config files.

 20190304:
 	Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to
 	8.0.0. Please see the 20141231 entry below for information about
 	prerequisites and upgrading, if you are not already using clang 3.5.0
 	or higher.

 20190226:
 	geom_uzip(4) depends on the new module xz. If geom_uzip is statically
 	compiled into your custom kernel, add a 'device xz' statement to the
 	kernel config.

 20190219:
 	drm and drm2 have been removed from the tree. Please see
 	https://wiki.freebsd.org/Graphics for the latest information on
 	migrating to the drm ports.
 20190131:
 	Iflib is no longer unconditionally compiled into the kernel. Drivers
 	using iflib and statically compiled into the kernel now require
 	the 'device iflib' config option. For the same drivers loaded as
 	modules on kernels not having 'device iflib', the iflib.ko module
 	is loaded automatically.

 20190125:
 	The IEEE80211_AMPDU_AGE and AH_SUPPORT_AR5416 kernel configuration
 	options no longer exist since r343219 and r343427 respectively;
 	nothing uses them, so they should simply be removed from custom
 	kernel config files.

 20181230:
 	r342635 changes the way efibootmgr(8) works by requiring users to add
 	the -b (bootnum) parameter for commands where the bootnum was
 	previously specified with each option. For example
 	'efibootmgr -B 0001' is now 'efibootmgr -B -b 0001'.

 20181220:
 	r342286 modifies the NFSv4 server so that it obeys
 	vfs.nfsd.nfs_privport in the same way as it is applied to NFSv2
 	and 3. This implies that NFSv4 servers that have
 	vfs.nfsd.nfs_privport set will only allow mounts from clients using
 	a reserved port#. Since both the FreeBSD and Linux NFSv4 clients use
 	reserved port#s by default, this should not affect most NFSv4 mounts.

 20181219:
 	The XLP config has been removed. We can't support 64-bit atomics in
 	this kernel because it is running in 32-bit mode. XLP users must
 	transition to running a 64-bit kernel (XLP64 or XLPN32).

 	The mips GXEMUL support has been removed from FreeBSD. MALTA* + qemu
 	is the preferred emulator today and we don't need two different ones.

 	The old sibyte / swarm / Broadcom BCM1250 support has been
 	removed from the mips port.

 20181211:
 	Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to
 	7.0.1. Please see the 20141231 entry below for information about
 	prerequisites and upgrading, if you are not already using clang 3.5.0
 	or higher.

 20181211:
 	Remove the timed and netdate programs from the base tree. Setting
 	the time with these daemons has been obsolete for over a decade.
20181126: On amd64, arm64 and armv7 (architectures that install LLVM's ld.lld linker as /usr/bin/ld) GNU ld is no longer installed as ld.bfd, as it produces broken binaries when ifuncs are in use. Users needing GNU ld should install the binutils port or package. 20181123: The BSD crtbegin and crtend code has been enabled by default. It has had extensive testing on amd64, arm64, and i386. It can be disabled by building a world with -DWITHOUT_BSD_CRTBEGIN. 20181115: The set of CTM commands (ctm, ctm_smail, ctm_rmail, ctm_dequeue) has been converted to a port (misc/ctm) and will be removed from FreeBSD-13. It is available as a package (ctm) for all supported FreeBSD versions. 20181110: The default newsyslog.conf(5) file has been changed to only include files in /etc/newsyslog.conf.d/ and /usr/local/etc/newsyslog.conf.d/ if the filenames end in '.conf' and do not begin with a '.'. You should check the configuration files in these two directories match this naming convention. You can verify which configuration files are being included using the command: $ newsyslog -Nrv 20181015: Ports for the DRM modules have been simplified. Now, amd64 users should just install the drm-kmod port. All others should install drm-legacy-kmod. Graphics hardware that's newer than about 2010 usually works with drm-kmod. For hardware older than 2013, however, some users will need to use drm-legacy-kmod if drm-kmod doesn't work for them. Hardware older than 2008 usually only works in drm-legacy-kmod. The graphics team can only commit to hardware made since 2013 due to the complexity of the market and difficulty to test all the older cards effectively. If you have hardware supported by drm-kmod, you are strongly encouraged to use that as you will get better support. Other than KPI chasing, drm-legacy-kmod will not be updated. As outlined elsewhere, the drm and drm2 modules will be eliminated from the src base soon (with a limited exception for arm). 
 	Please update to the package asap and report any issues to
 	x11@freebsd.org.

 	Generally, anybody using the drm*-kmod packages should add
 	WITHOUT_DRM_MODULE=t and WITHOUT_DRM2_MODULE=t to avoid nasty
 	cross-threading surprises, especially with automatic driver
 	loading from X11 startup. These will become the defaults in
 	13-current shortly.

 20181012:
 	The ixlv(4) driver has been renamed to iavf(4). As a consequence,
 	custom kernel and module loading configuration files must be updated
 	accordingly. Moreover, interfaces previously presented as ixlvN to
 	the system are now exposed as iavfN and network configuration files
 	must be adjusted as necessary.

 20181009:
 	OpenSSL has been updated to version 1.1.1. This update included
 	additional various API changes throughout the base system. It is
 	important to rebuild third-party software after upgrading. The value
 	of __FreeBSD_version has been bumped accordingly.

 20181006:
 	The legacy DRM modules and drivers have now been added to the loader's
 	module blacklist, in favor of loading them with kld_list in
 	rc.conf(5). The module blacklist may be overridden with the
 	loader.conf(5) 'module_blacklist' variable, but loading them via
 	rc.conf(5) is strongly encouraged.

 20181002:
 	The cam(4) based nda(4) driver will be used over nvd(4) by default on
 	powerpc64. You may set 'options NVME_USE_NVD=1' in your kernel conf
 	or loader tunable 'hw.nvme.use_nvd=1' if you wish to use the existing
 	driver. Make sure to edit /boot/etc/kboot.conf and fstab to use the
 	nda device name.

 20180913:
 	Reproducible build mode is now on by default, in preparation for
 	FreeBSD 12.0. This eliminates build metadata such as the user,
 	host, and time from the kernel (and uname), unless the working tree
 	corresponds to a modified checkout from a version control system.
 	The previous behavior can be obtained by setting the /etc/src.conf
 	knob WITHOUT_REPRODUCIBLE_BUILD.
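For the 20181012 entry above, the ixlvN to iavfN interface rename can be previewed on a copy of a configuration file before editing it. The following is a minimal sketch under stated assumptions: rename_ixlv_to_iavf is a hypothetical helper, it only rewrites numbered interface names, and module or kernel config lines that name the driver without a unit number are left for manual review.

```sh
#!/bin/sh
# Sketch only: rewrite interface names ixlvN to iavfN in text read from
# standard input and write the result to standard output.
# rename_ixlv_to_iavf is a hypothetical helper name.
rename_ixlv_to_iavf() {
	# Only names followed by a unit number are touched, so a bare
	# "ixlv" (for example in a module name) is passed through unchanged.
	sed -e 's/ixlv\([0-9][0-9]*\)/iavf\1/g'
}
```

For example, "rename_ixlv_to_iavf < /etc/rc.conf > /tmp/rc.conf.new" writes a candidate file that can be compared with diff(1) before it replaces the original. The output path is only an illustration.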
 20180826:
 	The Yarrow CSPRNG has been removed from the kernel as it has not been
 	supported by its designers since at least 2003. Fortuna has been the
 	default since FreeBSD-11.

 20180822:
 	devctl freeze/thaw have gone into the tree, the rc scripts have been
 	updated to use them and devmatch has been changed. You should update
 	kernel, userland and rc scripts all at the same time.

 20180818:
 	The default interpreter has been switched from 4th to Lua.
 	LOADER_DEFAULT_INTERP, documented in build(7), will override the
 	default interpreter. If you have custom FORTH code you will need to
 	set LOADER_DEFAULT_INTERP=4th (valid values are 4th, lua or simp) in
 	src.conf for the build. This will create default hard links between
 	loader and loader_4th instead of loader and loader_lua, the new
 	default. If you are using UEFI it will create the proper hard link to
 	loader.efi.

 	bhyve uses userboot.so. It remains 4th-only until some issues
 	regarding coexistence with multiple versions of FreeBSD are resolved.

 20180815:
 	ls(1) now respects the COLORTERM environment variable used in other
 	systems and software to indicate that a colored terminal is both
 	supported and desired. If ls(1) is suddenly emitting colors, they may
 	be disabled again by either removing the unwanted COLORTERM from your
 	environment, or using `ls --color=never`. The ls(1) specific CLICOLOR
 	may not be observed in a future release.

 20180808:
 	The default pager for most commands has been changed to "less". To
 	restore the old behavior, set PAGER="more" and MANPAGER="more -s" in
 	your environment.

 20180731:
 	The jedec_ts(4) driver has been removed. A superset of its
 	functionality is available in the jedec_dimm(4) driver, and the
 	manpage for that driver includes migration instructions. If you have
 	"device jedec_ts" in your kernel configuration file, it must be
 	removed.

 20180730:
 	amd64/GENERIC now has EFI runtime services, EFIRT, enabled by default.
 	This should have no effect if the kernel is booted via BIOS/legacy boot.
 	EFIRT may be disabled via a loader tunable, efi.rt.disabled, if a system
 	has a buggy firmware that prevents a successful boot due to use of
 	runtime services.

 20180727:
 	Atmel AT91RM9200 and AT91SAM9, Cavium CNS 11xx and XScale
 	support has been removed from the tree. These ports were
 	obsolete and/or known to be broken for many years.

 20180723:
 	loader.efi has been augmented to participate more fully in the
 	UEFI boot manager protocol. loader.efi will now look at the
 	BootXXXX environment variable to determine if a specific kernel
 	or root partition was specified. XXXX is derived from BootCurrent.
 	efibootmgr(8) manages these standard UEFI variables.

 20180720:
 	zfsloader's functionality has now been folded into loader.
 	zfsloader is no longer necessary once you've updated your
 	boot blocks. For a transition period, we will install a
 	hardlink for zfsloader to loader to allow a smooth transition
 	until the boot blocks can be updated (hard link because old
 	zfs boot blocks don't understand symlinks).

 20180719:
 	ARM64 now has efifb support. If you want to have a serial console
 	on your arm64 board when a screen is connected and the bootloader
 	sets up a frame buffer for us to use, just add:
 	boot_serial=YES
 	boot_multicons=YES
 	in /boot/loader.conf.
 	For Raspberry Pi 3 (RPI) users, this is needed even if you don't have
 	a screen connected, as the firmware will set up a frame buffer that
 	u-boot will expose as an EFI frame buffer.

 20180719:
 	New uid:gid added, ntpd:ntpd (123:123). Be sure to run mergemaster
 	or take steps to update /etc/passwd before doing installworld on
 	existing systems. Do not skip the "mergemaster -Fp" step before
 	installworld, as described in the update procedures near the bottom
 	of this document. Also, rc.d/ntpd now starts ntpd(8) as user ntpd
 	if the new mac_ntpd(4) policy is available, unless ntpd_flags or
 	the ntp config file contain options that change file/dir locations.
When such options (e.g., "statsdir" or "crypto") are used, ntpd can still be run as non-root by setting ntpd_user=ntpd in rc.conf, after taking steps to ensure that all required files/dirs are accessible by the ntpd user. 20180717: Big endian arm support has been removed. 20180711: The static environment setup in kernel configs is no longer mutually exclusive with the loader(8) environment by default. In order to restore the previous default behavior of disabling the loader(8) environment if a static environment is present, you must specify loader_env.disabled=1 in the static environment. 20180705: The ABI of syscalls used by management tools like sockstat and netstat has been broken to allow 32-bit binaries to work on 64-bit kernels without modification. These programs will need to match the kernel in order to function. External programs may require minor modifications to accommodate a change of type in structures from pointers to 64-bit virtual addresses. 20180702: On i386 and amd64 atomics are now inlined. Out of tree modules using atomics will need to be rebuilt. 20180701: The '%I' format in the kern.corefile sysctl limits the number of core files that a process can generate to the number stored in the debug.ncores sysctl. The '%I' format is replaced by the single digit index. Previously, if all indexes were taken the kernel would overwrite only a core file with the highest index in a filename. Currently the system will create a new core file if there is a free index or if all slots are taken it will overwrite the oldest one. 20180630: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180628: r335753 introduced a new quoting method. However, etc/devd/devmatch.conf needed to be changed to work with it. 
This change was made with r335763 and requires a mergemaster / etcupdate / etc to update the installed file. 20180612: r334930 changed the interface between the NFS modules, so they all need to be rebuilt. r335018 did a __FreeBSD_version bump for this. 20180530: As of r334391 lld is the default amd64 system linker; it is installed as /usr/bin/ld. Kernel build workarounds (see 20180510 entry) are no longer necessary. 20180530: The kernel / userland interface for devinfo changed, so you'll need a new kernel and userland as a pair for it to work (rebuilding lib/libdevinfo is all that's required). devinfo and devmatch will not work, but everything else will when there's a mismatch. 20180523: The on-disk format for hwpmc callchain records has changed to include threadid corresponding to a given record. This changes the field offsets and thus requires that libpmcstat be rebuilt before using a kernel later than r334108. 20180517: The vxge(4) driver has been removed. This driver was introduced into HEAD one week before the Exar left the Ethernet market and is not known to be used. If you have device vxge in your kernel config file it must be removed. 20180510: The amd64 kernel now requires a ld that supports ifunc to produce a working kernel, either lld or a newer binutils. lld is built by default on amd64, and the 'buildkernel' target uses it automatically. However, it is not the default linker, so building the kernel the traditional way requires LD=ld.lld on the command line (or LD=/usr/local/bin/ld for binutils port/package). lld will soon be default, and this requirement will go away. NOTE: As of r334391 lld is the default system linker on amd64, and no workaround is necessary. 20180508: The nxge(4) driver has been removed. This driver was for PCI-X 10g cards made by s2io/Neterion. The company was acquired by Exar and no longer sells or supports Ethernet products. If you have device nxge in your kernel config file it must be removed. 
20180504: The tz database (tzdb) has been updated to 2018e. This version more correctly models time stamps in time zones with negative DST such as Europe/Dublin (from 1971 on), Europe/Prague (1946/7), and Africa/Windhoek (1994/2017). This does not affect the UT offsets, only time zone abbreviations and the tm_isdst flag. 20180502: The ixgb(4) driver has been removed. This driver was for an early and uncommon legacy PCI 10GbE for a single ASIC, Intel 82597EX. Intel quickly shifted to the long lived ixgbe family. If you have device ixgb in your kernel config file it must be removed. 20180501: The lmc(4) driver has been removed. This was a WAN interface card that was already reportedly rare in 2003, and had an ambiguous license. If you have device lmc in your kernel config file it must be removed. 20180413: Support for Arcnet networks has been removed. If you have device arcnet or device cm in your kernel config file they must be removed. 20180411: Support for FDDI networks has been removed. If you have device fddi or device fpa in your kernel config file they must be removed. 20180406: In addition to supporting RFC 3164 formatted messages, the syslogd(8) service is now capable of parsing RFC 5424 formatted log messages. The main benefit of using RFC 5424 is that clients may now send log messages with timestamps containing year numbers, microseconds and time zone offsets. Similarly, the syslog(3) C library function has been altered to send RFC 5424 formatted messages to the local system logging daemon. On systems using syslogd(8), this change should have no negative impact, as long as syslogd(8) and the C library are updated at the same time. On systems using a different system logging daemon, it may be necessary to make configuration adjustments, depending on the software used. 
When using syslog-ng, add the 'syslog-protocol' flag to local input sources to enable parsing of RFC 5424 formatted messages: source src { unix-dgram("/var/run/log" flags(syslog-protocol)); } When using rsyslog, disable the 'SysSock.UseSpecialParser' option of the 'imuxsock' module to let messages be processed by the regular RFC 3164/5424 parsing pipeline: module(load="imuxsock" SysSock.UseSpecialParser="off") Do note that these changes only affect communication between local applications and syslogd(8). The format that syslogd(8) uses to store messages on disk or forward messages to other systems remains unchanged. syslogd(8) still uses RFC 3164 for these purposes. Options to customize this behaviour will be added in the future. Utilities that process log files stored in /var/log are thus expected to continue to function as before. __FreeBSD_version has been incremented to 1200061 to denote this change. 20180328: Support for token ring networks has been removed. If you have "device token" in your kernel config you should remove it. No device drivers supported token ring. 20180323: makefs was modified to be able to tag ISO9660 El Torito boot catalog entries as EFI instead of overloading the i386 tag as done previously. The amd64 mkisoimages.sh script used to build amd64 ISO images for release was updated to use this. This may mean that makefs must be updated before "make cdrom" can be run in the release directory. This should be as simple as: $ cd $SRCDIR/usr.sbin/makefs $ make depend all install 20180212: FreeBSD boot loader enhanced with Lua scripting. It's purely opt-in for now by building WITH_LOADER_LUA and WITHOUT_FORTH in /etc/src.conf. Co-existence for the transition period will come shortly. Booting is a complex environment and test coverage for Lua-enabled loaders has been thin, so it would be prudent to assume it might not work and make provisions for backup boot methods. 20180211: devmatch functionality has been turned on in devd. 
It will automatically load drivers for unattached devices. This may cause unexpected drivers to be loaded. Please report any problems to current@ and imp@freebsd.org. 20180114: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180110: LLVM's lld linker is now used as the FreeBSD/amd64 bootstrap linker. This means it is used to link the kernel and userland libraries and executables, but is not yet installed as /usr/bin/ld by default. To revert to ld.bfd as the bootstrap linker, in /etc/src.conf set WITHOUT_LLD_BOOTSTRAP=yes 20180110: On i386, pmtimer has been removed. Its functionality has been folded into apm. It was a no-op on ACPI in current for a while now (but was still needed on i386 in FreeBSD 11 and earlier). Users may need to remove it from kernel config files. 20180104: The use of RSS hash from the network card aka flowid has been disabled by default for lagg(4) as it's currently incompatible with the lacp and loadbalance protocols. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" 20180102: The SW_WATCHDOG option is no longer necessary to enable the hardclock-based software watchdog if no hardware watchdog is configured. As before, SW_WATCHDOG will cause the software watchdog to be enabled even if a hardware watchdog is configured. 20171215: r326887 fixes the issue described in the 20171214 UPDATING entry. r326888 flips the switch back to building GELI support always. 20171214: r362593 broke ZFS + GELI support for reasons unknown. However, it also broke ZFS support generally, so GELI has been turned off by default as the lesser evil in r326857. If you boot off ZFS and/or GELI, it might not be a good time to update. 
20171125: PowerPC users must update loader(8) by rebuilding world before installing a new kernel, as the protocol connecting them has changed. Without the update, loader metadata will not be passed successfully to the kernel and users will have to enter their root partition at the kernel mountroot prompt to continue booting. Newer versions of loader can boot old kernels without issue. 20171110: The LOADER_FIREWIRE_SUPPORT build variable has been renamed to WITH/OUT_LOADER_FIREWIRE. LOADER_{NO_,}GELI_SUPPORT has been renamed to WITH/OUT_LOADER_GELI. 20171106: The naive and non-compliant support of posix_fallocate(2) in ZFS has been removed as of r325320. The system call now returns EINVAL when used on a ZFS file. Although the new behavior complies with the standard, some consumers are not prepared to cope with it. One known victim is lld prior to r325420. 20171102: Building in a FreeBSD src checkout will automatically create object directories now rather than store files in the current directory if 'make obj' was not run. Calling 'make obj' is no longer necessary. This feature can be disabled by setting WITHOUT_AUTO_OBJ=yes in /etc/src-env.conf (not /etc/src.conf), or passing the option in the environment. 20171101: The default MAKEOBJDIR has changed from /usr/obj/ for native builds, and /usr/obj// for cross-builds, to a unified /usr/obj//. This behavior can be changed to the old format by setting WITHOUT_UNIFIED_OBJDIR=yes in /etc/src-env.conf, the environment, or with -DWITHOUT_UNIFIED_OBJDIR when building. The UNIFIED_OBJDIR option is a transitional feature that will be removed for the 12.0 release; please migrate any tools to the new format by looking up the OBJDIR with 'make -V .OBJDIR' rather than hardcoding paths. 20171028: The native-xtools target no longer installs the files by default to the OBJDIR. Use the native-xtools-install target with a DESTDIR to install to ${DESTDIR}/${NXTP} where NXTP defaults to /nxb-bin.
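The 20171101 entry recommends asking make for the object directory instead of hardcoding a /usr/obj path. A tool could do that roughly as sketched below. The function name and the MAKE override are assumptions made for illustration; the -V option is the one quoted in the entry.

```shell
#!/bin/sh
# objdir_of SRCDIR
# Print the object directory make(1) would use for SRCDIR by asking
# make itself, instead of assuming a particular /usr/obj layout.
# MAKE may name a different make binary (assumption: it accepts the
# -V option quoted in the 20171101 entry).
objdir_of() {
    (cd "$1" && ${MAKE:-make} -V .OBJDIR)
}
```

For example, "objdir_of /usr/src/usr.sbin/makefs" prints whatever directory the current settings select, so the same script keeps working with either objdir layout.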
20171021: As part of the boot loader infrastructure cleanup, LOADER_*_SUPPORT options are changing from controlling the build if defined / undefined to controlling the build with explicit 'yes' or 'no' values. They will shift to WITH/WITHOUT options to match other options in the system. 20171010: libstand has turned into a private library for sys/boot use only. It is no longer supported as a public interface outside of sys/boot. 20171005: The arm port has split armv6 into armv6 and armv7. armv7 is now a valid TARGET_ARCH/MACHINE_ARCH setting. If you have an armv7 system and are running a kernel from before r324363, you will need to add MACHINE_ARCH=armv7 to 'make buildworld' to do a native build. 20171003: When building multiple kernels using KERNCONF, non-existent KERNCONF files will produce an error and buildkernel will fail. Previously missing KERNCONF files silently failed giving no indication as to why, only to subsequently discover during installkernel that the desired kernel was never built in the first place. 20170912: The default serial number format for CTL LUNs has changed. This will affect users who use /dev/diskid/* device nodes, or whose FibreChannel or iSCSI clients care about their LUNs' serial numbers. Users who require serial number stability should hardcode serial numbers in /etc/ctl.conf . 20170912: For 32-bit arm compiled for hard-float support, soft-floating point binaries now always get their shared libraries from LD_SOFT_LIBRARY_PATH (in the past, this was only used if /usr/libsoft also existed). Only users with a hard-float ld.so, but soft-float everything else should be affected. 20170826: The geli password typed at boot is now hidden. To restore the previous behavior, see geli(8) for configuration options. 20170825: Move PMTUD blackhole counters to TCPSTATS and remove them from bare sysctl values. Minor nit, but requires a rebuild of both world/kernel to complete. 
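Since the 20171003 change makes buildkernel fail on a missing KERNCONF file, a build script may want to check the names up front. This is only a sketch: the function name is hypothetical, and the caller is assumed to know the directory that holds the configuration files.

```shell
#!/bin/sh
# missing_kernconfs CONFDIR NAME...
# Print each kernel configuration name that has no file in CONFDIR.
# Returns 0 when every name has a file, 1 otherwise.
missing_kernconfs() {
    confdir=$1
    shift
    status=0
    for name in "$@"; do
        if [ ! -f "$confdir/$name" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}
```

A caller might run "missing_kernconfs /usr/src/sys/amd64/conf GENERIC MYKERNEL" (an illustrative path and names) and start buildkernel only when it succeeds.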
20170814: "make check" behavior (made in ^/head@r295380) has been changed to execute from a limited sandbox, as opposed to executing from ${TESTSDIR}. Behavioral changes: - The "beforecheck" and "aftercheck" targets are now specified. - ${CHECKDIR} (added in commit noted above) has been removed. - Legacy behavior can be enabled by setting WITHOUT_MAKE_CHECK_USE_SANDBOX in src.conf(5) or the environment. If the limited sandbox mode is enabled, "make check" will execute "make distribution", then install, execute the tests, and clean up the sandbox if successful. The "make distribution" and "make install" targets are typically run as root to set appropriate permissions and ownership at installation time. The end-user should set "WITH_INSTALL_AS_USER" in src.conf(5) or the environment if executing "make check" with limited sandbox mode using an unprivileged user. 20170808: Since the switch to GPT disk labels, fsck for UFS/FFS has been unable to automatically find alternate superblocks. As of r322297, the information needed to find alternate superblocks has been moved to the end of the area reserved for the boot block. Filesystems created with a newfs of this vintage or later will create the recovery information. If you have a filesystem created prior to this change and wish to have a recovery block created for your filesystem, you can do so by running fsck in foreground mode (i.e., do not use the -p or -y options). As it starts, fsck will ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should answer yes. 20170728: As of r321665, an NFSv4 server configuration that services Kerberos mounts or clients that do not support the uid/gid in owner/owner_group string capability, must explicitly enable the nfsuserd daemon by adding nfsuserd_enable="YES" to the machine's /etc/rc.conf file. 20170722: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 5.0.0. 
Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170701: WITHOUT_RCMDS is now the default. Set WITH_RCMDS if you need the r-commands (rlogin, rsh, etc.) to be built with the base system. 20170625: The FreeBSD/powerpc platform now uses a 64-bit type for time_t. This is a major ABI-incompatible change, so users of FreeBSD/powerpc must be careful when performing source upgrades. It is best to run 'make installworld' from an alternate root system, either a live CD/memory stick, or a temporary root partition. Additionally, all ports must be recompiled. powerpc64 is largely unaffected, except in the case of 32-bit compatibility. All 32-bit binaries will be affected. 20170623: Forward compatibility for the "ino64" project has been committed. This will allow most new binaries to run on older kernels in a limited fashion. This prevents many of the common foot-shooting actions in the upgrade and provides a limited ability to roll back the kernel across the ino64 upgrade. Complicated use cases may not work properly, though enough simpler ones work to allow recovery in most situations. 20170620: Switch back to the BSDL dtc (Device Tree Compiler). Set WITH_GPL_DTC if you require the GPL compiler. 20170618: The internal ABI used for communication between the NFS kernel modules was changed by r320085, so __FreeBSD_version was bumped to ensure all the NFS related modules are updated together. 20170617: The ABI of struct event was changed by extending the data member to 64bit and adding ext fields. When upgrading, follow the same precautions as for the 20170523 "ino64" entry. 20170531: The GNU roff toolchain has been removed from base. To render manpages which are not supported by mandoc(1), man(1) can fall back on GNU roff from ports (and recommends installing it).
To render roff(7) documents, consider using GNU roff from ports or the heirloom doctools roff toolchain from ports via pkg install groff or via pkg install heirloom-doctools. 20170524: The ath(4) and ath_hal(4) modules now build piecemeal to allow for smaller runtime footprint builds. This is useful for embedded systems which only require one chipset support. If you load it as a module, make sure this is in /boot/loader.conf: if_ath_load="YES" This will load the HAL, all chip/RF backends and if_ath_pci. If you have if_ath_pci in /boot/loader.conf, ensure it is after if_ath or it will not load any HAL chipset support. If you want to selectively load things (eg on ye cheape ARM/MIPS platforms where RAM is at a premium) you should: * load ath_hal * load the chip modules in question * load ath_rate, ath_dfs * load ath_main * load if_ath_pci and/or if_ath_ahb depending upon your particular bus bind type - this is where probe/attach is done. For further comments/feedback, poke adrian@ . 20170523: The "ino64" 64-bit inode project has been committed, which extends a number of types to 64 bits. Upgrading in place requires care and adherence to the documented upgrade procedure. If using a custom kernel configuration ensure that the COMPAT_FREEBSD11 option is included (as during the upgrade the system will be running the ino64 kernel with the existing world). For the safest in-place upgrade begin by removing previous build artifacts via "rm -rf /usr/obj/*". Then, carefully follow the full procedure documented below under the heading "To rebuild everything and install it on the current system." Specifically, a reboot is required after installing the new kernel before installing world. While an installworld normally works by accident from multiuser after rebooting the proper kernel, there are many cases where this will fail across this upgrade and installworld from single user is required. 
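The 20170523 entry requires COMPAT_FREEBSD11 in custom kernel configurations. One way to check for it before starting the upgrade is sketched below. The function name is hypothetical, and the check has a known limitation: options inherited through an "include" line are not followed, so a negative result on such a file proves nothing.

```shell
#!/bin/sh
# has_kernel_option KERNCONF OPTION
# Succeed if KERNCONF contains an uncommented "options OPTION" line.
# Limitation: options pulled in through an "include" directive are
# not followed; only the named file is searched.
has_kernel_option() {
    grep -Eq "^[[:space:]]*options[[:space:]]+$2([[:space:]]|=|#|\$)" "$1"
}
```

For instance, "has_kernel_option MYKERNEL COMPAT_FREEBSD11 || echo 'add COMPAT_FREEBSD11 first'" (MYKERNEL being an illustrative file name) flags a configuration that needs attention.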
20170424: The NATM framework including the en(4), fatm(4), hatm(4), and patm(4) devices has been removed. Consumers should plan a migration before the end-of-life date for FreeBSD 11. 20170420: GNU diff has been replaced by a BSD licensed diff. Some features of GNU diff have not been implemented; if those are needed, a newer version of GNU diff is available via the diffutils package under the gdiff name. 20170413: As of r316810 for ipfilter, keep frags is no longer assumed when keep state is specified in a rule. r316810 aligns ipfilter with documentation in man pages separating keep frags from keep state. This allows keep state to be specified without forcing keep frags and allows keep frags to be specified independently of keep state. To maintain previous behaviour, also specify keep frags with keep state (as documented in ipf.conf.5). 20170407: arm64 builds now use the base system LLD 4.0.0 linker by default, instead of requiring that the aarch64-binutils port or package be installed. To continue using aarch64-binutils, set CROSS_BINUTILS_PREFIX=/usr/local/aarch64-freebsd/bin . 20170405: The UDP optimization in entry 20160818 that added the sysctl net.inet.udp.require_l2_bcast has been reverted. L2 broadcast packets will no longer be treated as L3 broadcast packets. 20170331: Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail instead of using the first assigned address of the jail. 20170329: The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. If building cfiscsi.ko as a kernel module, the module can be loaded via one of the following methods: - `cfiscsi_load="YES"` in loader.conf(5). - Add `cfiscsi` to `$kld_list` in rc.conf(5). - ctladm(8)/ctld(8), when compiled with iSCSI support (`WITH_ISCSI=yes` in src.conf(5)) Please see cfiscsi(4) for more details. 20170316: The mmcsd.ko module now additionally depends on geom_flashmap.ko.
Also, mmc.ko and mmcsd.ko need to be a matching pair built from the same source (previously, the dependency of mmcsd.ko on mmc.ko was missing, but mmcsd.ko now will refuse to load if it is incompatible with mmc.ko). 20170315: The syntax of ipfw(8) named states was changed to avoid ambiguity. If you have used named states in the firewall rules, you need to modify them after installworld and before rebooting. Named states must now be prefixed with a colon. 20170311: The old drm (sys/dev/drm/) drivers for i915 and radeon have been removed as the userland we provide cannot use them. The KMS version (sys/dev/drm2) supports the same hardware. 20170302: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170221: The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It is no longer possible to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change. 20170216: EISA bus support has been removed. The WITH_EISA option is no longer valid. 20170215: MCA bus support has been removed. 20170127: The WITH_LLD_AS_LD / WITHOUT_LLD_AS_LD build knobs have been renamed WITH_LLD_IS_LD / WITHOUT_LLD_IS_LD, for consistency with CLANG_IS_CC. 20170112: The EM_MULTIQUEUE kernel configuration option is deprecated now that the em(4) driver conforms to iflib specifications. 20170109: The igb(4), em(4) and lem(4) ethernet drivers are now implemented via IFLIB. If you have a custom kernel configuration that excludes em(4) but you use igb(4), you need to re-add em(4) to your custom configuration. 20161217: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher.
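The knob rename in the 20170127 entry is mechanical, so an existing src.conf can be converted with a small filter. The sketch below is one possible way to do it; the function name is hypothetical and nothing beyond the two renamed knobs is touched.

```shell
#!/bin/sh
# rename_lld_knobs
# Filter: rewrite WITH_LLD_AS_LD / WITHOUT_LLD_AS_LD to the *_LLD_IS_LD
# names from the 20170127 entry. Reads stdin, writes stdout; all other
# text passes through unchanged.
rename_lld_knobs() {
    sed -e 's/WITH_LLD_AS_LD/WITH_LLD_IS_LD/g' \
        -e 's/WITHOUT_LLD_AS_LD/WITHOUT_LLD_IS_LD/g'
}
```

Writing the result to a new file first, for example "rename_lld_knobs < /etc/src.conf > /tmp/src.conf.new", allows the change to be reviewed before it replaces the original.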
20161124: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161119: The layout of the pmap structure has changed for powerpc to put the pmap statistics at the front for all CPU variations. libkvm(3) and all tools that link against it need to be recompiled. 20161030: isl(4) and cyapa(4) drivers now require a new driver, chromebook_platform(4), to work properly on Chromebook-class hardware. On other types of hardware the drivers may need to be configured using device hints. Please see the corresponding manual pages for details. 20161017: The urtwn(4) driver was merged into rtwn(4) and now consists of rtwn(4) main module + rtwn_usb(4) and rtwn_pci(4) bus-specific parts. Also, firmware for RTL8188CE was renamed due to possible name conflict (rtwnrtl8192cU(B) -> rtwnrtl8192cE(B)) 20161015: GNU rcs has been removed from base. It is available as packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) before it was removed from base. 20161008: Use of the cc_cdg, cc_chd, cc_hd, or cc_vegas congestion control modules now requires that the kernel configuration contain the TCP_HHOOK option. (This option is included in the GENERIC kernel.) 20161003: The WITHOUT_ELFCOPY_AS_OBJCOPY src.conf(5) knob has been retired. ELF Tool Chain's elfcopy is always installed as /usr/bin/objcopy. 20160924: Relocatable object files with the extension of .So have been renamed to use an extension of .pico instead. The purpose of this change is to avoid a name clash with shared libraries on case-insensitive file systems. On those file systems, foo.So is the same file as foo.so. 20160918: GNU rcs has been turned off by default. It can (temporarily) be built again by adding WITH_RCS knob in src.conf. Otherwise, GNU rcs is available from packages: - rcs: Latest GPLv3 GNU rcs version. 
- rcs57: Copy of the latest version of GNU rcs (GPLv2) from base. 20160918: The backup_uses_rcs functionality has been removed from rc.subr. 20160908: The queue(3) debugging macro, QUEUE_MACRO_DEBUG, has been split into two separate components, QUEUE_MACRO_DEBUG_TRACE and QUEUE_MACRO_DEBUG_TRASH. Define both for the original QUEUE_MACRO_DEBUG behavior. 20160824: r304787 changed some ioctl interfaces between the iSCSI userspace programs and the kernel. ctladm, ctld, iscsictl, and iscsid must be rebuilt to work with new kernels. __FreeBSD_version has been bumped to 1200005. 20160818: The UDP receive code has been updated to only treat incoming UDP packets that were addressed to an L2 broadcast address as L3 broadcast packets. It is not expected that this will affect any standards-conforming UDP application. The new behaviour can be disabled by setting the sysctl net.inet.udp.require_l2_bcast to 0. 20160818: Remove the openbsd_poll system call. __FreeBSD_version has been bumped because of this. 20160708: The stable/11 branch has been created from head@r302406. 20160622: The libc stub for the pipe(2) system call has been replaced with a wrapper that calls the pipe2(2) system call and the pipe(2) system call is now only implemented by the kernels that include "options COMPAT_FREEBSD10" in their config file (this is the default). Users should ensure that this option is enabled in their kernel or upgrade userspace to r302092 before upgrading their kernel. 20160527: CAM will now strip leading spaces from SCSI disks' serial numbers. This will affect users who create UFS filesystems on SCSI disks using those disk's diskid device nodes. For example, if /etc/fstab previously contained a line like "/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should change it to "/dev/diskid/DISK-ABCDEFG0123456". Users of geom transforms like gmirror may also be affected. ZFS users should generally be fine. 
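The /etc/fstab adjustment described in the 20160527 entry amounts to dropping the run of "%20" sequences that follows "DISK-". A minimal sketch of that rewrite as a filter is shown here; the function name is hypothetical, and it only handles escaped spaces at the start of the serial number, as in the entry's example.

```shell
#!/bin/sh
# strip_diskid_spaces
# Filter: remove the leading "%20" (escaped space) run from the serial
# number part of /dev/diskid/DISK-* names. Reads stdin, writes stdout.
strip_diskid_spaces() {
    sed 's|\(/dev/diskid/DISK-\)\(%20\)\{1,\}|\1|g'
}
```

As with any fstab edit, it is safer to write the output to a scratch file and compare it with the original before installing it.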
20160523: The bitstring(3) API has been updated with new functionality and improved performance, but it is binary-incompatible with the old API. Objects built with the new headers may not be linked against objects built with the old headers. 20160520: The brk and sbrk functions have been removed from libc on arm64. Binutils from ports has been updated to not link to these functions and should be updated to the latest version before installing a new libc. 20160517: The armv6 port now defaults to hard float ABI. Limited support for running both hardfloat and soft float on the same system is available using the libraries installed with -DWITH_LIBSOFT. This has only been tested as an upgrade path for installworld and packages may fail or need manual intervention to run. New packages will be needed. To update an existing self-hosted armv6hf system, you must add TARGET_ARCH=armv6 on the make command line for both the build and the install steps. 20160510: Kernel modules compiled outside of a kernel build now default to installing to /boot/modules instead of /boot/kernel. Many kernel modules built this way (such as those in ports) already overrode KMODDIR explicitly to install into /boot/modules. However, manually building and installing a module from /sys/modules will now install to /boot/modules instead of /boot/kernel. 20160414: The CAM I/O scheduler has been committed to the kernel. There should be no user visible impact. This does enable NCQ Trim on ada SSDs. While the list of known rogues that claim support for this but actually corrupt data is believed to be complete, be on the lookout for data corruption. The known rogues are:
    o Crucial MX100, M550 drives with MU01 firmware.
    o Micron M510 and M550 drives with MU01 firmware.
    o Micron M500 prior to MU07 firmware
    o Samsung 830, 840, and 850 all firmwares
    o FCCT M500 all firmwares
Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware with working NCQ TRIM.
For Micron branded drives, see your sales rep for updated firmware. Blacklisted drives will work correctly so long as no NCQ TRIMs are sent to them. Given this list is the same as found in Linux, it's believed there are no other rogues in the market place. All other models from the above vendors work. To be safe, if you are at all concerned, you can quirk each of your drives to prevent NCQ TRIM from being sent by setting: kern.cam.ada.X.quirks="0x2" in loader.conf. If the drive requires the 4k sector quirk, set the quirks entry to 0x3. 20160330: The FAST_DEPEND build option has been removed and its functionality is now the one true way. The old mkdep(1) style of 'make depend' has been removed. See 20160311 for further details. 20160317: Resource range types have grown from unsigned long to uintmax_t. All drivers, and anything using libdevinfo, need to be recompiled. 20160311: WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree builds. It no longer runs mkdep(1) during 'make depend', and the 'make depend' stage can safely be skipped now, as it is run automatically by 'make all' and will generate all SRCS and DPSRCS before building anything else. Dependencies are gathered at compile time with -MF flags kept in separate .depend files per object file. Users should run 'make cleandepend' once if using -DNO_CLEAN to clean out older stale .depend files. 20160306: On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND into kernel modules. Therefore, if you load any kernel modules at boot time, please install the boot loaders after you install the kernel, but before rebooting, e.g.:
    make buildworld
    make buildkernel KERNCONF=YOUR_KERNEL_HERE
    make installkernel KERNCONF=YOUR_KERNEL_HERE
    make -C sys/boot install
Then follow the usual steps, described in the General Notes section, below. 20160305: Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0.
Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20160301: The AIO subsystem is now a standard part of the kernel. The VFS_AIO kernel option and aio.ko kernel module have been removed. Due to stability concerns, asynchronous I/O requests are only permitted on sockets and raw disks by default. To enable asynchronous I/O requests on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero value. 20160226: The ELF object manipulation tool objcopy is now provided by the ELF Tool Chain project rather than by GNU binutils. It should be a drop-in replacement, with the addition of arm64 support. The (temporary) src.conf knob WITHOUT_ELFCOPY_AS_OBJCOPY may be set to obtain the GNU version if necessary. 20160129: Building ZFS pools on top of zvols is prohibited by default. That feature has never worked safely; it's always been prone to deadlocks. Using a zvol as the backing store for a VM guest's virtual disk will still work, even if the guest is using ZFS. Legacy behavior can be restored by setting vfs.zfs.vol.recursive=1. 20160119: The NONE and HPN patches have been removed from OpenSSH. They are still available in the security/openssh-portable port. 20160113: With the addition of ypldap(8), a new _ypldap user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20151216: The tftp loader (pxeboot) now uses the option root-path directive. As a consequence it no longer looks for a pxeboot.4th file on the tftp server. Instead it uses the regular /boot infrastructure as with the other loaders. 20151211: The code to start recording plug and play data into the modules has been committed. While the old tools will properly build a new kernel, a number of warnings about "unknown metadata record 4" will be produced for an older kldxref.
To avoid such warnings, make sure to rebuild the kernel toolchain (or world). Make sure that you have r292078 or later when trying to build 292077 or later before rebuilding. 20151207: Debug data files are now built by default with 'make buildworld' and installed with 'make installworld'. This facilitates debugging but requires more disk space both during the build and for the installed world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes in src.conf(5). 20151130: r291527 changed the internal interface between the nfsd.ko and nfscommon.ko modules. As such, they must both be upgraded together. __FreeBSD_version has been bumped because of this. 20151108: The addition of support for Unicode collation strings changes, for example, the order of files listed by ls(1). To get back to the old behaviour, set the LC_COLLATE environment variable to "C". Database administrators will need to reindex their databases, as collation results will be different. Due to a bug in install(1) it is recommended to remove the ancient locales before running make installworld: rm -rf /usr/share/locale/* 20151030: OpenSSL has been upgraded to 1.0.2d. Any binaries requiring libcrypto.so.7 or libssl.so.7 must be recompiled. 20151020: Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0. Kernel modules isp_2400_multi and isp_2500_multi were removed and should be replaced with isp_2400 and isp_2500 modules respectively. 20151017: The build previously allowed using 'make -n' to not recurse into sub-directories while showing what commands would be executed, and 'make -n -n' to recursively show commands. Now 'make -n' will recurse and 'make -N' will not. 20151012: If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster and etcupdate will now use this file. A custom sendmail.cf is now updated via this mechanism rather than via installworld.
If you had excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want to remove the exclusion or change it to "always install". /etc/mail/sendmail.cf is now managed the same way regardless of whether SENDMAIL_MC/SENDMAIL_CF is used. If you are not using SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior. 20151011: Compatibility shims for legacy ATA device names have been removed. This includes the ATA_STATIC_ID kernel option, the kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader tunables, the kern.devalias.* environment variables, and the /dev/ad* and /dev/ar* symbolic links. 20151006: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.7.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20150924: Kernel debug files have been moved to /usr/lib/debug/boot/kernel/, and renamed from .symbols to .debug. This reduces the size requirements on the boot partition or file system and provides consistency with userland debug files. When using the supported kernel installation method the /usr/lib/debug/boot/kernel directory will be renamed (to kernel.old) as is done with /boot/kernel. Developers wishing to maintain the historical behavior of installing debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5). 20150827: The wireless drivers have undergone changes that remove the 'parent interface' from the ifconfig -l output. The rc.d network scripts used to check presence of a parent interface in the list, so old scripts would fail to start wireless networking. Thus, an etcupdate(8) or mergemaster(8) run is required after kernel update, to update your rc.d scripts in /etc. 20150827: pf no longer supports 'scrub fragment crop' or 'scrub fragment drop-ovl'. These configurations are now automatically interpreted as 'scrub fragment reassemble'. 20150817: Kernel-loadable modules for the random(4) device are back.
To use them, the kernel must have
    device  random
    options RANDOM_LOADABLE
kldload(8) can then be used to load random_fortuna.ko or random_yarrow.ko. Please note that due to the indirect function calls that the loadable modules need to provide, the built-in variants will be slightly more efficient. The random(4) kernel option RANDOM_DUMMY has been retired due to unpopularity. It was not all that useful anyway. 20150813: The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired. Control over building the ELF Tool Chain tools is now provided by the WITHOUT_TOOLCHAIN knob. 20150810: The polarity of Pulse Per Second (PPS) capture events with the uart(4) driver has been corrected. Prior to this change the PPS "assert" event corresponded to the trailing edge of a positive PPS pulse and the "clear" event was the leading edge of the next pulse. As the width of a PPS pulse in a typical GPS receiver is on the order of 1 millisecond, most users will not notice any significant difference with this change. Anyone who has compensated for the historical polarity reversal by configuring a negative offset equal to the pulse width will need to remove that workaround. 20150809: The default group assigned to /dev/dri entries has been changed from 'wheel' to 'video' with the id of '44'. If you want to have access to the dri devices please add yourself to the video group with: # pw groupmod video -m $USER 20150806: The menu.rc and loader.rc files will now be replaced during upgrades. Please migrate local changes to menu.rc.local and loader.rc.local instead. 20150805: GNU Binutils versions of addr2line, c++filt, nm, readelf, size, strings and strip have been removed. The src.conf(5) knob WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools. 20150728: As ZFS requires more kernel stack pages than is the default on some architectures e.g. i386, it now warns if KSTACK_PAGES is less than ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing).
Please consider using 'options KSTACK_PAGES=X' where X is greater than or equal to ZFS_MIN_KSTACK_PAGES i.e. 4 in such configurations. 20150706: sendmail has been updated to 8.15.2. Starting with FreeBSD 11.0 and sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default, i.e., they will not contain "::". For example, instead of ::1, it will be 0:0:0:0:0:0:0:1. This permits a zero subnet to have a more specific match, such as different map entries for IPv6:0:0 vs IPv6:0. This change requires that configuration data (including maps, files, classes, custom ruleset, etc.) use the same format, so make certain such configuration data is upgraded. As a very simple check, search for patterns like 'IPv6:[0-9a-fA-F:]*::' and 'IPv6::'. To return to the old behavior, set the m4 option confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option UseCompressedIPv6Addresses. 20150630: The default kernel entropy-processing algorithm is now Fortuna, replacing Yarrow. Assuming you have 'device random' in your kernel config file, the configurations allow a kernel option to override this default. You may choose *ONE* of:
    options RANDOM_YARROW   # Legacy /dev/random algorithm.
    options RANDOM_DUMMY    # Blocking-only driver.
If you have neither, you get Fortuna. For most people, read no further, Fortuna will give a /dev/random that works like it always used to, and the difference will be irrelevant. If you remove 'device random', you get *NO* kernel-processed entropy at all. This may be acceptable to folks building embedded systems, but has complications. Carry on reading, and it is assumed you know what you need. *PLEASE* read random(4) and random(9) if you are in the habit of tweaking kernel configs, and/or if you are a member of the embedded community, wanting specific and not-usual behaviour from your security subsystems. NOTE!!
If you use RANDOM_DUMMY and/or have no 'device random', you will NOT have a functioning /dev/random, and many cryptographic features will not work, including SSH. You may also find strange behaviour from the random(3) set of library functions, in particular sranddev(3), srandomdev(3) and arc4random(3). The reason for this is that the KERN_ARND sysctl only returns entropy if it thinks it has some to share, and with RANDOM_DUMMY or no 'device random' this will never happen. 20150623: An additional fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284717. 20150616: FreeBSD's old make (fmake) has been removed from the system. It is available as the devel/fmake port or via pkg install fmake. 20150615: The fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284436. The work around described in that entry is no longer needed unless the default setting is overridden by a confDH_PARAMETERS configuration setting of '5' or pointing to a 512 bit DH parameter file. 20150614: ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf and devel/kyua to version 0.20+ and adjust any calling code to work with Kyuafile and kyua. 20150614: The import of openssl to address the FreeBSD-SA-15:10.openssl security advisory includes a change which rejects handshakes with DH parameters below 768 bits. sendmail releases prior to 8.15.2 (not yet released), defaulted to a 512 bit DH parameter setting for client connections. To work around this interoperability, sendmail can be configured to use a 2048 bit DH parameter by: 1. Edit /etc/mail/`hostname`.mc 2. If a setting for confDH_PARAMETERS does not exist or exists and is set to a string beginning with '5', replace it with '2'. 3. If a setting for confDH_PARAMETERS exists and is set to a file path, create a new file with: openssl dhparam -out /path/to/file 2048 4. 
Rebuild the .cf file: cd /etc/mail/; make; make install 5. Restart sendmail: cd /etc/mail/; make restart A sendmail patch is coming, at which time this file will be updated. 20150604: Generation of legacy formatted entries has been disabled by default in pwd_mkdb(8), as all base system consumers of the legacy formatted entries were converted to use the new format by default when the new, machine-independent format was added; that format has been supported since FreeBSD 5.x. Please see the pwd_mkdb(8) manual page for further details. 20150525: Clang and llvm have been upgraded to 3.6.1 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150521: TI platform code switched to using vendor DTS files and this update may break existing systems running on Beaglebone, Beaglebone Black, and Pandaboard: - dtb files should be regenerated/reinstalled. Filenames are the same but content is different now - GPIO addressing was changed, now each GPIO bank (32 pins per bank) has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in old addressing scheme is now pin 25 on /dev/gpioc3. - Pandaboard: /etc/ttys should be updated, serial console device is now /dev/ttyu2, not /dev/ttyu0 20150501: soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim. If you need the GNU extension from groff soelim(1), install groff from package: pkg install groff, or via ports: textproc/groff. 20150423: chmod, chflags, chown and chgrp now affect symlinks in -R mode as defined in symlink(7); previously symlinks were silently ignored. 20150415: The const qualifier has been removed from iconv(3) to comply with POSIX. The ports tree is aware of this from r384038 onwards. 20150416: Libraries specified by LIBADD in Makefiles must have a corresponding DPADD_ variable to ensure correct dependencies. This is now enforced in src.libnames.mk.
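The GPIO renumbering in the 20150521 entry above can be worked out from the bank size of 32 pins: the bank is the old pin number divided by 32, and the new pin is the remainder. The following is only a sketch, assuming the old scheme numbered pins contiguously across banks as the entry's example suggests; the function name gpio_new_addr is made up for illustration and is not part of the base system.

```sh
# Hypothetical helper: translate an old-style global pin number
# (as used on /dev/gpioc0 before the change) into the new
# per-bank device and pin. Assumes 32 pins per bank.
gpio_new_addr() {
    old_pin=$1
    bank=$((old_pin / 32))   # which /dev/gpiocX device
    pin=$((old_pin % 32))    # pin number within that bank
    echo "/dev/gpioc${bank} pin ${pin}"
}

gpio_new_addr 121    # prints "/dev/gpioc3 pin 25", matching the entry
```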
20150324: Support for SATA controllers that are handled by the more functional drivers ahci(4), siis(4) and mvs(4) has been removed from the legacy ata(4) driver. Kernel modules ataahci and ataadaptec were removed completely, replaced by ahci and mvs modules respectively. 20150315: Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150307: The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting. 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated with a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support.
The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. See 20150805 for updated information. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround.
This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. 
Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest. 20140922: At svn r271982, the default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer.
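Several entries above ask for a single line to be added to a configuration file (kern.vty=sc in /boot/loader.conf for 20141104, compat.linux.osrelease=2.6.16 in sysctl.conf for 20140922). One way to do that without adding duplicates on repeated runs is sketched below. The helper name ensure_line is hypothetical, and the sketch only checks for an exact matching line; it does not notice an existing line that sets the same variable to a different value.

```sh
# Hypothetical helper: append a line to a file unless that exact
# line is already present.
#   $1 = file to edit, $2 = line to ensure
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Possible use for the entries above (as root):
#   ensure_line /boot/loader.conf 'kern.vty=sc'
#   ensure_line /etc/sysctl.conf 'compat.linux.osrelease=2.6.16'
```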
20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 20140709: The GNU texinfo and GNU info pages are no longer built and installed; the WITH_INFO knob has been added to allow building and installing them again. UPDATE: see the 20150102 entry on texinfo's removal. 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: The maximum length of the serial number in CTL was increased from 16 to 64 chars, which breaks the ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected.
These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 
20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 
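For the 20140405 uart(4) entry above, the check "serial console device configured as 3wire without baudrate" can be done with a short awk test over /etc/ttys. This is a sketch only: the function name ttys_is_3wire is invented here, and it assumes the getty command is written in the usual quoted form shown in the entry ("/usr/libexec/getty 3wire"), so that the third whitespace-separated field is exactly 3wire followed by the closing quote.

```sh
# Hypothetical check: succeed if the given device has a ttys entry
# whose getty type is exactly "3wire" (no baudrate suffix such as
# 3wire.9600).
#   $1 = device name (e.g. ttyu0)
#   $2 = ttys file to inspect (defaults to /etc/ttys)
ttys_is_3wire() {
    awk -v dev="$1" '
        $1 == dev && $3 == "3wire\"" { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/ttys}"
}

# Possible use before upgrading the kernel:
#   ttys_is_3wire ttyu0 || echo "check the ttyu0 line in /etc/ttys"
```

Commented-out entries are not matched, because their first field is "#" or "#ttyu0" rather than the bare device name.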
20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by the kernel lacking Capsicum capability mode support. Please note that enabling the feature in the kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in the kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change.
The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you find you need identical output, adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify that bsdyml is not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The stable/10 branch has been created in subversion from head revision r256279. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (e.g. ones that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details.
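The effect of the 'env -i ' prefix mentioned above can be seen with a small experiment: a variable exported in the calling shell is visible to an ordinary child process, but not to one started through env -i. The wrapper name in_clean_room and the variable FOO are made up for this illustration; /bin/sh stands in for make so the sketch can be run anywhere.

```sh
# Hypothetical wrapper: run a command with an empty environment.
in_clean_room() {
    env -i "$@"
}

FOO=poison
export FOO

# An ordinary child inherits FOO; the clean-room child does not.
/bin/sh -c 'echo "inherited:  FOO=${FOO:-unset}"'
in_clean_room /bin/sh -c 'echo "clean room: FOO=${FOO:-unset}"'

# A build would be started the same way, e.g.:
#   env -i make buildworld
```

Note that env -i also drops variables the build may legitimately need; as stated under note [9] later in this file, MAKEOBJDIRPREFIX must come from the environment, so it would have to be passed explicitly after env -i if it is used.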
When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach if you encounter problems with a major version upgrade. Since the stable 4.x branch point, one has generally been able to upgrade from anywhere in the most recent stable branch to head / current (or even the last couple of stable branches). See the top of this file when there's an exception. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. This file should be read as a log of events. When a later event changes information of a prior event, the prior event should not be deleted. Instead, a pointer to the entry with the new information should be placed in the old entry. Readers of this file should also sanity check older entries before relying on them blindly. Authors of new entries should write them with this in mind. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. 
It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installkernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make buildkernel KERNCONF=YOUR_KERNEL_HERE [8] make installkernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. 
At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that, before you update your sources, you have read and understood all the recent messages there. If in doubt, please track -stable, which has far fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a sh /etc/rc.d/zfs start # mount zfs filesystem, if needed cd src # full path to source adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a no-op. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] The new kernel must be able to run existing binaries used by an installworld.
When upgrading across major versions, the new kernel's configuration must include the correct COMPAT_FREEBSD option for existing binaries (e.g. COMPAT_FREEBSD11 to run 11.x binaries). Failure to do so may leave you with a system that is hard to boot to recover. A GENERIC kernel will include suitable compatibility options to run binaries from older branches. Note that the ability to run binaries from unsupported branches is not guaranteed. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. Options also change over time, so you may need to adjust your custom kernels for these as well. [9] If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: head/etc/mtree/BSD.tests.dist =================================================================== --- head/etc/mtree/BSD.tests.dist (revision 350664) +++ head/etc/mtree/BSD.tests.dist (revision 350665) @@ -1,1092 +1,1094 @@ # $FreeBSD$ # # Please see the file src/etc/mtree/README before making changes to this file. # /set type=dir uname=root gname=wheel mode=0755 . bin cat .. chflags .. chmod .. date .. dd .. echo .. expr .. ln .. ls .. mkdir .. mv .. pax .. pkill .. pwait .. rm .. rmdir .. sh builtins .. errors .. execution .. expansion .. invocation .. parameters .. parser .. set-e .. .. sleep .. test .. .. cddl lib .. sbin .. usr.bin ctfconvert .. ztest .. .. usr.sbin dtrace common aggs .. arithmetic .. arrays .. assocs .. begin .. bitfields .. buffering .. builtinvar .. cg .. clauses .. cpc .. decls .. drops .. dtraceUtil .. end .. env .. enum .. error .. exit .. fbtprovider .. funcs .. grammar .. include .. inline .. io .. ip .. java_api .. json .. lexer .. llquantize .. mdb .. mib .. misc .. multiaggs .. offsetof .. operators .. pid .. plockstat .. pointers .. pragma .. predicates .. preprocessor .. print .. printa .. printf .. privs .. probes .. proc .. profile-n .. providers .. raise .. rates .. safety .. scalars .. sched .. scripting .. sdt .. sizeof .. speculation .. stability .. stack .. stackdepth .. stop .. strlen .. strtoll .. struct .. sugar .. syscall .. sysevent .. 
tick-n .. trace .. tracemem .. translators .. typedef .. types .. uctf .. union .. usdt .. ustack .. vars .. version .. .. i386 arrays .. funcs .. pid .. ustack .. .. amd64 arrays .. .. .. zfsd .. .. .. etc rc.d .. .. games .. gnu lib .. usr.bin diff .. .. .. lib atf libatf-c detail .. .. libatf-c++ detail .. .. test-programs .. .. csu dynamic .. dynamiclib .. static .. .. googletest gmock .. gmock_main .. gtest .. gtest_main .. .. libarchive .. libbe .. libc c063 .. db .. gen execve .. posix_spawn .. .. hash data .. .. iconv .. inet .. locale .. net getaddrinfo data .. .. .. nss .. regex data .. .. resolv .. rpc .. ssp .. setjmp .. stdio .. stdlib .. string .. sys .. time .. tls dso .. .. termios .. ttyio .. .. libcam .. libcasper services cap_dns .. cap_grp .. cap_pwd .. cap_sysctl .. .. .. libcrypt .. libdevdctl .. libkvm .. libmp .. libnv .. libproc .. libregex data .. .. librt .. libsbuf .. libthr dlopen .. .. libutil .. libxo .. msun .. .. libexec atf atf-check .. atf-sh .. .. rtld-elf .. tftpd .. .. sbin bectl .. dhclient .. devd .. growfs .. ifconfig .. mdconfig .. pfctl files .. .. .. secure lib .. libexec .. usr.bin .. usr.sbin .. .. share examples tests atf .. googletest .. plain .. tap .. .. .. zoneinfo .. .. sys acl .. aio .. audit .. auditpipe .. capsicum .. cddl zfs bin .. include .. tests acl cifs .. nontrivial .. trivial .. .. atime .. bootfs .. cache .. cachefile .. clean_mirror .. cli_root zfs_upgrade .. zfs_promote .. zfs_clone .. zfs_property .. zfs_destroy .. zpool_create .. zpool_history .. zpool_expand .. zpool_remove .. zfs_mount .. zfs_unshare .. zdb .. zpool_online .. zpool_get .. zpool_export .. zfs_copies .. zfs_get .. zfs .. zpool_clear .. zpool_import blockfiles .. .. zpool .. zpool_offline .. zpool_replace .. zfs_rollback .. zpool_set .. zfs_send .. zfs_set .. zpool_detach .. zfs_diff .. zpool_scrub .. zfs_inherit .. zfs_snapshot .. zfs_share .. zpool_destroy .. zpool_status .. zfs_unmount .. zfs_receive .. zfs_create .. 
zpool_upgrade blockfiles .. .. zpool_add .. zfs_rename .. zpool_attach .. zfs_reservation .. .. cli_user misc .. zfs_list .. zpool_iostat .. zpool_list .. .. compression .. ctime .. delegate .. devices .. exec .. grow_pool .. grow_replicas .. history .. hotplug .. hotspare .. inheritance .. interop .. inuse .. iscsi .. large_files .. largest_pool .. link_count .. migration .. mmap .. mount .. mv_files .. nestedfs .. no_space .. online_offline .. pool_names .. poolversion .. quota .. redundancy .. refquota .. refreserv .. rename_dirs .. replacement .. reservation .. rootpool .. rsend .. scrub_mirror .. slog .. snapshot .. snapused .. sparse .. threadsappend .. truncate .. txg_integrity .. userquota .. utils_test .. write_dirs .. xattr .. zfsd .. zil .. zinject .. zones .. zvol zvol_ENOSPC .. zvol_cli .. zvol_misc .. zvol_swap .. .. zvol_thrash .. .. .. .. devrandom .. dtrace .. fifo .. file .. fs + fusefs + .. tmpfs .. .. geom class concat .. eli .. gate .. gpt .. mirror .. nop .. part .. raid3 .. shsec .. stripe .. uzip etalon .. .. .. .. kern acct .. execve .. pipe .. .. kqueue libkqueue .. .. mac bsdextended .. portacl .. .. mqueue .. net .. netinet .. netipsec tunnel .. .. netmap .. netpfil common .. pf ioctl .. .. .. opencrypto .. pjdfstest chflags .. chmod .. chown .. ftruncate .. granular .. link .. mkdir .. mkfifo .. mknod .. open .. rename .. rmdir .. symlink .. truncate .. unlink .. utimensat .. .. posixshm .. sys .. vfs .. vm .. .. usr.bin apply .. awk .. basename .. bmake archives fmt_44bsd .. fmt_44bsd_mod .. fmt_oldbsd .. .. basic t0 .. t1 .. t2 .. t3 .. .. execution ellipsis .. empty .. joberr .. plus .. .. shell builtin .. meta .. path .. path_select .. replace .. select .. .. suffixes basic .. src_wild1 .. src_wild2 .. .. syntax directive-t0 .. enl .. funny-targets .. semi .. .. sysmk t0 2 1 .. .. mk .. .. t1 2 1 .. .. mk .. .. t2 2 1 .. .. mk .. .. .. variables modifier_M .. modifier_t .. opt_V .. t0 .. .. .. bsdcat .. calendar .. cmp .. 
compress .. cpio .. col .. comm .. csplit .. cut .. dc .. diff .. dirname .. du .. file2c .. find .. fold .. getconf .. grep .. gzip .. head .. hexdump .. ident .. indent .. join .. jot .. lastcomm .. limits .. m4 .. mkimg .. ncal .. opensm .. pr .. printf .. procstat .. rs .. sdiff .. sed regress.multitest.out .. .. seq .. soelim .. stat .. tail .. tar .. timeout .. tr .. truncate .. units .. uudecode .. uuencode .. uniq .. vmstat .. xargs .. xinstall .. xo .. yacc yacc .. .. .. usr.sbin chown .. etcupdate .. extattr .. fstyp .. makefs .. newsyslog .. nmtree .. praudit .. pw .. rpcbind .. sa .. .. .. # vim: set expandtab ts=4 sw=4: Index: head/sbin/mount_fusefs/mount_fusefs.8 =================================================================== --- head/sbin/mount_fusefs/mount_fusefs.8 (revision 350664) +++ head/sbin/mount_fusefs/mount_fusefs.8 (revision 350665) @@ -1,382 +1,402 @@ .\" Copyright (c) 1980, 1989, 1991, 1993 .\" The Regents of the University of California. .\" Copyright (c) 2005, 2006 Csaba Henk .\" All rights reserved. .\" +.\" Copyright (c) 2019 The FreeBSD Foundation +.\" +.\" Portions of this documentation were written by BFF Storage Systems under +.\" sponsorship from the FreeBSD Foundation. +.\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. 
.\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ .\" -.Dd November 17, 2018 +.Dd July 31, 2019 .Dt MOUNT_FUSEFS 8 .Os .Sh NAME .Nm mount_fusefs .Nd mount a Fuse file system daemon .Sh SYNOPSIS .Nm .Op Fl A .Op Fl S .Op Fl v .Op Fl D Ar fuse_daemon .Op Fl O Ar daemon_opts .Op Fl s Ar special .Op Fl m Ar node .Op Fl h .Op Fl V .Op Fl o Ar option ... .Ar special node .Op Ar fuse_daemon ... .Sh DESCRIPTION Basic usage is to start a fuse daemon on the given .Ar special file. In practice, the daemon is assigned a .Ar special file automatically, which can then be indentified via .Xr fstat 1 . That special file can then be mounted by .Nm . .Pp However, the procedure of spawning a daemon will usually be automated so that it is performed by .Nm . If the command invoking a given .Ar fuse_daemon is appended to the list of arguments, .Nm will call the .Ar fuse_daemon via that command. In that way the .Ar fuse_daemon will be instructed to attach itself to .Ar special . From that on mounting goes as in the simple case. (See .Sx DAEMON MOUNTS . ) .Pp The .Ar special argument will normally be treated as the path of the special file to mount. 
.Pp However, if .Pa auto is passed as .Ar special , then .Nm will look for a suitable free fuse device by itself. .Pp Finally, if .Ar special is an integer it will be interpreted as the number of the file descriptor of an already open fuse device (used when the Fuse library invokes .Nm . (See .Sx DAEMON MOUNTS ) . .Pp The options are as follows: .Bl -tag -width indent .It Fl A , Ic --reject-allow_other Prohibit the .Cm allow_other mount flag. Intended for use in scripts and the .Xr sudoers 5 file. .It Fl S , Ic --safe -Run in safe mode (i.e. reject invoking a filesystem daemon) +Run in safe mode (i.e., reject invoking a filesystem daemon). .It Fl v -Be verbose -.It Fl D, Ic --daemon Ar daemon +Be verbose. +.It Fl D , Ic --daemon Ar daemon Call the specified -.Ar daemon -.It Fl O, Ic --daemon_opts Ar opts +.Ar daemon . +.It Fl O , Ic --daemon_opts Ar opts Add .Ar opts -to the daemon's command line -.It Fl s, Ic --special Ar special +to the daemon's command line. +.It Fl s , Ic --special Ar special Use .Ar special -as special -.It Fl m, Ic --mountpath Ar node +as special. +.It Fl m , Ic --mountpath Ar node Mount on -.Ar node -.It Fl h, Ic --help -Show help -.It Fl V, Ic --version -Show version information +.Ar node . +.It Fl h , Ic --help +Show help. +.It Fl V , Ic --version +Show version information. .It Fl o Mount options are specified via .Fl o . The following options are available (and also their negated versions, by prefixing them with .Dq no ) : .Bl -tag -width indent -.It Cm default_permissions -Enable traditional (file mode based) permission checking in kernel .It Cm allow_other Do not apply .Sx STRICT ACCESS POLICY . -Only root can use this option +Only root can use this option. +.It Cm async +I/O to the file system may be done asynchronously. +Writes may be delayed and/or reordered. +.It Cm default_permissions +Enable traditional (file mode based) permission checking in kernel. 
+.It Cm intr +Allow signals to interrupt operations that are blocked waiting for a reply from the server. +When this option is in use, system calls may fail with +.Er EINTR +whenever a signal is received. .It Cm max_read Ns = Ns Ar n Limit size of read requests to -.Ar n +.Ar n . +.It Cm neglect_shares +Do not refuse unmounting if there are secondary mounts. .It Cm private Refuse shared mounting of the daemon. This is the default behaviour, to allow sharing, expicitly use -.Fl o Cm noprivate -.It Cm neglect_shares -Do not refuse unmounting if there are secondary mounts +.Fl o Cm noprivate . .It Cm push_symlinks_in -Prefix absolute symlinks with the mountpoint +Prefix absolute symlinks with the mountpoint. +.It Cm subtype Ns = Ns Ar fsname +Suffix +.Ar fsname +to the file system name as reported by +.Xr statfs 2 . +This option can be used to identify the file system implemented by +.Ar fuse_daemon . .El .El .Pp Besides the above mount options, there is a set of pseudo-mount options which are supported by the Fuse library. One can list these by passing .Fl h to a Fuse daemon. Most of these options only have affect on the behavior of the daemon (that is, their scope is limited to userspace). However, there are some which do require in-kernel support. Currently the options supported by the kernel are: .Bl -tag -width indent .It Cm direct_io -Bypass the buffer cache system +Bypass the buffer cache system. .It Cm kernel_cache By default cached buffers of a given file are flushed at each .Xr open 2 . -This option disables this behaviour +This option disables this behaviour. .El .Sh DAEMON MOUNTS Usually users do not need to use .Nm directly, as the Fuse library enables Fuse daemons to invoke .Nm . That is, .Pp .Dl fuse_daemon device mountpoint .Pp has the same effect as .Pp .Dl mount_fusefs auto mountpoint fuse_daemon .Pp This is the recommended usage when you want basic usage (eg, run the daemon at a low privilege level but mount it as root). 
.Sh STRICT ACCESS POLICY The strict access policy for Fuse filesystems lets one to use the filesystem only if the filesystem daemon has the same credentials (uid, real uid, gid, real gid) as the user. .Pp This is applied for Fuse mounts by default and only root can mount without -the strict access policy (i.e. the +the strict access policy (i.e., the .Cm allow_other mount option). .Pp This is to shield users from the daemon .Dq spying on their I/O activities. .Pp Users might opt to willingly relax strict access policy (as far they are concerned) by doing their own secondary mount (See .Sx SHARED MOUNTS ) . .Sh SHARED MOUNTS -A Fuse daemon can be shared (i.e. mounted multiple times). +A Fuse daemon can be shared (i.e., mounted multiple times). When doing the first (primary) mount, the spawner and the mounter of the daemon must have the same uid, or the mounter should be the superuser. .Pp After the primary mount is in place, secondary mounts can be done by anyone unless this feature is disabled by .Cm private . The behaviour of a secondary mount is analogous to that of symbolic links: they redirect all filesystem operations to the primary mount. .Pp Doing a secondary mount is like signing an agreement: by this action, the mounter agrees that the Fuse daemon can trace her I/O activities. From then on she is not banned from using the filesystem (either via her own mount or via the primary mount), regardless whether .Cm allow_other is used or not. .Pp The device name of a secondary mount is the device name of the corresponding primary mount, followed by a '#' character and the index of the secondary -mount; e.g. +mount; e.g., .Pa /dev/fuse0#3 . .Sh SECURITY System administrators might want to use a custom mount policy (ie., one going beyond the .Va vfs.usermount sysctl). The primary tool for such purposes is .Xr sudo 8 . However, given that .Nm is capable of invoking an arbitrary program, one must be careful when doing this. 
.Nm is designed in a way such that it makes that easy. -For this purpose, there are options which disable certain risky features (i.e. +For this purpose, there are options which disable certain risky features ( .Fl S and .Fl A ) , and command line parsing is done in a flexible way: mixing options and non-options is allowed, but processing them stops at the third non-option argument (after the first two has been utilized as device and mountpoint). The rest of the command line specifies the daemon and its arguments. (Alternatively, the daemon, the special and the mount path can be specified using the respective options.) Note that .Nm ignores the environment variable .Ev POSIXLY_CORRECT and always behaves as described. .Pp In general, to be as scripting / .Xr sudoers 5 friendly as possible, no information has a fixed position in the command line, but once a given piece of information is provided, subsequent arguments/options cannot override it (with the exception of some non-critical ones). .Sh ENVIRONMENT .Bl -tag -width ".Ev MOUNT_FUSEFS_SAFE" .It Ev MOUNT_FUSEFS_SAFE This has the same effect as the .Fl S option. .It Ev MOUNT_FUSEFS_VERBOSE This has the same effect as the .Fl v option. .It Ev MOUNT_FUSEFS_IGNORE_UNKNOWN If set, .Nm will ignore uknown mount options. .It Ev MOUNT_FUSEFS_CALL_BY_LIB Adjust behavior to the needs of the FUSE library. Currently it effects help output. .El .Pp Although the following variables do not have any effect on .Nm itself, they affect the behaviour of fuse daemons: .Bl -tag -width ".Ev FUSE_DEV_NAME" .It Ev FUSE_DEV_NAME Device to attach. If not set, the multiplexer path .Ar /dev/fuse is used. .It Ev FUSE_DEV_FD File desciptor of an opened Fuse device to use. Overrides .Ev FUSE_DEV_NAME . .It Ev FUSE_NO_MOUNT If set, the library will not attempt to mount the filesystem, even if a mountpoint argument is supplied. 
.El .Sh FILES .Bl -tag -width /dev/fuse .It Pa /dev/fuse Fuse device with which the kernel and Fuse daemons can communicate. .It Pa /dev/fuse The multiplexer path. An .Xr open 2 performed on it automatically is passed to a free Fuse device by the kernel (which might be created just for this puprose). .El .Sh EXAMPLES Mount the example filesystem in the Fuse distribution (from its directory): either .Pp .Dl ./fusexmp /mnt/fuse .Pp or .Pp .Dl mount_fusefs auto /mnt/fuse ./fusexmp .Pp Doing the same in two steps, using .Pa /dev/fuse0 : .Pp .Dl FUSE_DEV_NAME=/dev/fuse ./fusexmp && .Dl mount_fusefs /dev/fuse /mnt/fuse .Pp A script wrapper for fusexmp which ensures that .Nm does not call any external utility and also provides a hacky (non race-free) automatic device selection: .Pp .Dl #!/bin/sh -e .Pp .Dl FUSE_DEV_NAME=/dev/fuse fusexmp .Dl mount_fusefs -S /dev/fuse /mnt/fuse \(lq$@\(rq .Sh SEE ALSO .Xr fstat 1 , .Xr mount 8 , .Xr sudo 8 , .Xr umount 8 .Sh HISTORY .Nm was written as the part of the .Fx implementation of the Fuse userspace filesystem framework (see -.Xr https://github.com/libfuse/libfuse ) +.Lk https://github.com/libfuse/libfuse ) and first appeared in the .Pa sysutils/fusefs-kmod port, supporting .Fx 6.0 . It was added to the base system in .Fx 10.0 . .Sh CAVEATS This user interface is .Fx specific. Secondary mounts should be unmounted via their device name. If an attempt is made to unmount them via their filesystem root path, the unmount request will be forwarded to the primary mount path. In general, unmounting by device name is less error-prone than by mount path (although the latter will also work under normal circumstances). .Pp If the daemon is specified via the .Fl D and .Fl O options, it will be invoked via .Xr system 3 , and the daemon's command line will also have an .Dq & control operator appended, so that we do not have to wait for its termination. You should use a simple command line when invoking the daemon via these options. 
.Sh BUGS .Ar special is treated as a multiplexer if and only if it is literally the same as .Pa auto or .Pa /dev/fuse . Other paths which are equivalent with .Pa /dev/fuse (eg., .Pa /../dev/fuse ) are not. Index: head/sbin/mount_fusefs/mount_fusefs.c =================================================================== --- head/sbin/mount_fusefs/mount_fusefs.c (revision 350664) +++ head/sbin/mount_fusefs/mount_fusefs.c (revision 350665) @@ -1,508 +1,494 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2005 Jean-Sebastien Pedron * Copyright (c) 2005 Csaba Henk * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mntopts.h" #ifndef FUSE4BSD_VERSION #define FUSE4BSD_VERSION "0.3.9-pre1" #endif void __usage_short(void); void usage(void); void helpmsg(void); void showversion(void); -int init_backgrounded(void); static struct mntopt mopts[] = { #define ALTF_PRIVATE 0x01 { "private", 0, ALTF_PRIVATE, 1 }, { "neglect_shares", 0, 0x02, 1 }, { "push_symlinks_in", 0, 0x04, 1 }, { "allow_other", 0, 0x08, 1 }, { "default_permissions", 0, 0x10, 1 }, #define ALTF_MAXREAD 0x20 { "max_read=", 0, ALTF_MAXREAD, 1 }, #define ALTF_SUBTYPE 0x40 { "subtype=", 0, ALTF_SUBTYPE, 1 }, - #define ALTF_SYNC_UNMOUNT 0x80 - { "sync_unmount", 0, ALTF_SYNC_UNMOUNT, 1 }, /* * MOPT_AUTOMOUNTED, included by MOPT_STDOPTS, does not fit into * the 'flags' argument to nmount(2). We have to abuse altflags * to pass it, as string, via iovec. 
*/ #define ALTF_AUTOMOUNTED 0x100 { "automounted", 0, ALTF_AUTOMOUNTED, 1 }, + #define ALTF_INTR 0x200 + { "intr", 0, ALTF_INTR, 1 }, /* Linux specific options, we silently ignore them */ { "fsname=", 0, 0x00, 1 }, { "fd=", 0, 0x00, 1 }, { "rootmode=", 0, 0x00, 1 }, { "user_id=", 0, 0x00, 1 }, { "group_id=", 0, 0x00, 1 }, { "large_read", 0, 0x00, 1 }, /* "nonempty", just the first two chars are stripped off during parsing */ { "nempty", 0, 0x00, 1 }, + { "async", 0, MNT_ASYNC, 0}, + { "noasync", 1, MNT_ASYNC, 0}, MOPT_STDOPTS, MOPT_END }; struct mntval { int mv_flag; void *mv_value; int mv_len; }; static struct mntval mvals[] = { { ALTF_MAXREAD, NULL, 0 }, { ALTF_SUBTYPE, NULL, 0 }, { 0, NULL, 0 } }; -#define DEFAULT_MOUNT_FLAGS ALTF_PRIVATE | ALTF_SYNC_UNMOUNT +#define DEFAULT_MOUNT_FLAGS ALTF_PRIVATE int main(int argc, char *argv[]) { struct iovec *iov; int mntflags, iovlen, verbose = 0; char *dev = NULL, *dir = NULL, mntpath[MAXPATHLEN]; char *devo = NULL, *diro = NULL; char ndev[128], fdstr[15]; int i, done = 0, reject_allow_other = 0, safe_level = 0; int altflags = DEFAULT_MOUNT_FLAGS; int __altflags = DEFAULT_MOUNT_FLAGS; int ch = 0; struct mntopt *mo; struct mntval *mv; static struct option longopts[] = { {"reject-allow_other", no_argument, NULL, 'A'}, {"safe", no_argument, NULL, 'S'}, {"daemon", required_argument, NULL, 'D'}, {"daemon_opts", required_argument, NULL, 'O'}, {"special", required_argument, NULL, 's'}, {"mountpath", required_argument, NULL, 'm'}, {"version", no_argument, NULL, 'V'}, {"help", no_argument, NULL, 'h'}, {0,0,0,0} }; int pid = 0; int fd = -1, fdx; char *ep; char *daemon_str = NULL, *daemon_opts = NULL; /* * We want a parsing routine which is not sensitive to * the position of args/opts; it should extract the * first two args and stop at the beginning of the rest. * (This makes it easier to call mount_fusefs from external * utils than it is with a strict "util flags args" syntax.) 
*/ iov = NULL; iovlen = 0; mntflags = 0; /* All in all, I feel it more robust this way... */ unsetenv("POSIXLY_CORRECT"); if (getenv("MOUNT_FUSEFS_IGNORE_UNKNOWN")) getmnt_silent = 1; if (getenv("MOUNT_FUSEFS_VERBOSE")) verbose = 1; do { for (i = 0; i < 3; i++) { if (optind < argc && argv[optind][0] != '-') { if (dir) { done = 1; break; } if (dev) dir = argv[optind]; else dev = argv[optind]; optind++; } } switch(ch) { case 'A': reject_allow_other = 1; break; case 'S': safe_level = 1; break; case 'D': if (daemon_str) errx(1, "daemon specified inconsistently"); daemon_str = optarg; break; case 'O': if (daemon_opts) errx(1, "daemon opts specified inconsistently"); daemon_opts = optarg; break; case 'o': getmntopts(optarg, mopts, &mntflags, &altflags); for (mv = mvals; mv->mv_flag; ++mv) { if (! (altflags & mv->mv_flag)) continue; for (mo = mopts; mo->m_flag; ++mo) { char *p, *q; if (mo->m_flag != mv->mv_flag) continue; p = strstr(optarg, mo->m_option); if (p) { p += strlen(mo->m_option); q = p; while (*q != '\0' && *q != ',') q++; mv->mv_len = q - p + 1; mv->mv_value = malloc(mv->mv_len); memcpy(mv->mv_value, p, mv->mv_len - 1); ((char *)mv->mv_value)[mv->mv_len - 1] = '\0'; break; } } } break; case 's': if (devo) errx(1, "special specified inconsistently"); devo = optarg; break; case 'm': if (diro) errx(1, "mount path specified inconsistently"); diro = optarg; break; case 'v': verbose = 1; break; case 'h': helpmsg(); break; case 'V': showversion(); break; case '\0': break; case '?': default: usage(); } if (done) break; } while ((ch = getopt_long(argc, argv, "AvVho:SD:O:s:m:", longopts, NULL)) != -1); argc -= optind; argv += optind; if (devo) { if (dev) errx(1, "special specified inconsistently"); dev = devo; } else if (diro) errx(1, "if mountpoint is given via an option, special should also be given via an option"); if (diro) { if (dir) errx(1, "mount path specified inconsistently"); dir = diro; } if ((! dev) && argc > 0) { dev = *argv++; argc--; } if ((! 
dir) && argc > 0) { dir = *argv++; argc--; } if (! (dev && dir)) errx(1, "missing special and/or mountpoint"); for (mo = mopts; mo->m_flag; ++mo) { if (altflags & mo->m_flag) { int iov_done = 0; if (reject_allow_other && strcmp(mo->m_option, "allow_other") == 0) /* * reject_allow_other is stronger than a * negative of allow_other: if this is set, * allow_other is blocked, period. */ errx(1, "\"allow_other\" usage is banned by respective option"); for (mv = mvals; mv->mv_flag; ++mv) { if (mo->m_flag != mv->mv_flag) continue; if (mv->mv_value) { build_iovec(&iov, &iovlen, mo->m_option, mv->mv_value, mv->mv_len); iov_done = 1; break; } } if (! iov_done) build_iovec(&iov, &iovlen, mo->m_option, __DECONST(void *, ""), -1); } if (__altflags & mo->m_flag) { char *uscore_opt; if (asprintf(&uscore_opt, "__%s", mo->m_option) == -1) err(1, "failed to allocate memory"); build_iovec(&iov, &iovlen, uscore_opt, __DECONST(void *, ""), -1); free(uscore_opt); } } if (getenv("MOUNT_FUSEFS_SAFE")) safe_level = 1; if (safe_level > 0 && (argc > 0 || daemon_str || daemon_opts)) errx(1, "safe mode, spawning daemon not allowed"); if ((argc > 0 && (daemon_str || daemon_opts)) || (daemon_opts && ! daemon_str)) errx(1, "daemon specified inconsistently"); /* * Resolve the mountpoint with realpath(3) and remove unnecessary * slashes from the devicename if there are any. */ if (checkpath(dir, mntpath) != 0) err(1, "%s", mntpath); (void)rmslashes(dev, dev); if (strcmp(dev, "auto") == 0) dev = __DECONST(char *, "/dev/fuse"); if (strcmp(dev, "/dev/fuse") == 0) { if (! 
(argc > 0 || daemon_str)) { fprintf(stderr, "Please also specify the fuse daemon to run when mounting via the multiplexer!\n"); usage(); } if ((fd = open(dev, O_RDWR)) < 0) err(1, "failed to open fuse device"); } else { fdx = strtol(dev, &ep, 10); if (*ep == '\0') fd = fdx; } /* Identifying device */ if (fd >= 0) { struct stat sbuf; char *ndevbas, *lep; if (fstat(fd, &sbuf) == -1) err(1, "cannot stat device file descriptor"); strcpy(ndev, _PATH_DEV); ndevbas = ndev + strlen(_PATH_DEV); devname_r(sbuf.st_rdev, S_IFCHR, ndevbas, sizeof(ndev) - strlen(_PATH_DEV)); if (strncmp(ndevbas, "fuse", 4)) errx(1, "mounting inappropriate device"); strtol(ndevbas + 4, &lep, 10); if (*lep != '\0') errx(1, "mounting inappropriate device"); dev = ndev; } if (argc > 0 || daemon_str) { char *fds; if (fd < 0 && (fd = open(dev, O_RDWR)) < 0) err(1, "failed to open fuse device"); if (asprintf(&fds, "%d", fd) == -1) err(1, "failed to allocate memory"); setenv("FUSE_DEV_FD", fds, 1); free(fds); setenv("FUSE_NO_MOUNT", "1", 1); if (daemon_str) { char *bgdaemon; int len; if (! daemon_opts) daemon_opts = __DECONST(char *, ""); len = strlen(daemon_str) + 1 + strlen(daemon_opts) + 2 + 1; bgdaemon = calloc(1, len); if (! bgdaemon) err(1, "failed to allocate memory"); strlcpy(bgdaemon, daemon_str, len); strlcat(bgdaemon, " ", len); strlcat(bgdaemon, daemon_opts, len); strlcat(bgdaemon, " &", len); if (system(bgdaemon)) err(1, "failed to call fuse daemon"); } else { if ((pid = fork()) < 0) err(1, "failed to fork for fuse daemon"); if (pid == 0) { execvp(argv[0], argv); err(1, "failed to exec fuse daemon"); } } } - if (fd >= 0 && ! init_backgrounded() && close(fd) < 0) { - if (pid) - kill(pid, SIGKILL); - err(1, "failed to close fuse device"); - } - /* Prepare the options vector for nmount(). build_iovec() is declared * in mntopts.h. 
*/ sprintf(fdstr, "%d", fd); build_iovec(&iov, &iovlen, "fstype", __DECONST(void *, "fusefs"), -1); build_iovec(&iov, &iovlen, "fspath", mntpath, -1); build_iovec(&iov, &iovlen, "from", dev, -1); build_iovec(&iov, &iovlen, "fd", fdstr, -1); if (verbose) fprintf(stderr, "mounting fuse daemon on device %s\n", dev); if (nmount(iov, iovlen, mntflags) < 0) err(EX_OSERR, "%s on %s", dev, mntpath); exit(0); } void __usage_short(void) { fprintf(stderr, "usage:\n%s [-A|-S|-v|-V|-h|-D daemon|-O args|-s special|-m node|-o option...] special node [daemon args...]\n\n", getprogname()); } void usage(void) { struct mntopt *mo; __usage_short(); fprintf(stderr, "known options:\n"); for (mo = mopts; mo->m_flag; ++mo) fprintf(stderr, "\t%s\n", mo->m_option); fprintf(stderr, "\n(use -h for a detailed description of these options)\n"); exit(EX_USAGE); } void helpmsg(void) { if (! getenv("MOUNT_FUSEFS_CALL_BY_LIB")) { __usage_short(); fprintf(stderr, "description of options:\n"); } /* * The main use case of this function is giving info embedded in general * FUSE lib help output. Therefore the style and the content of the output * tries to fit there as much as possible. 
*/ fprintf(stderr, " -o allow_other allow access to other users\n" /* " -o nonempty allow mounts over non-empty file/dir\n" */ " -o default_permissions enable permission checking by kernel\n" + " -o intr interruptible mount\n" /* " -o fsname=NAME set filesystem name\n" " -o large_read issue large read requests (2.4 only)\n" */ " -o subtype=NAME set filesystem type\n" " -o max_read=N set maximum size of read requests\n" " -o noprivate allow secondary mounting of the filesystem\n" " -o neglect_shares don't report EBUSY when unmount attempted\n" " in presence of secondary mounts\n" " -o push_symlinks_in prefix absolute symlinks with mountpoint\n" - " -o sync_unmount do unmount synchronously\n" ); exit(EX_USAGE); } void showversion(void) { puts("mount_fusefs [fuse4bsd] version: " FUSE4BSD_VERSION); exit(EX_USAGE); -} - -int -init_backgrounded(void) -{ - int ibg; - size_t len; - - len = sizeof(ibg); - - if (sysctlbyname("vfs.fusefs.init_backgrounded", &ibg, &len, NULL, 0)) - return (0); - - return (ibg); } Index: head/share/man/man5/fusefs.5 =================================================================== --- head/share/man/man5/fusefs.5 (revision 350664) +++ head/share/man/man5/fusefs.5 (revision 350665) @@ -1,147 +1,138 @@ .\" .\" SPDX-License-Identifier: BSD-2-Clause-FreeBSD .\" .\" Copyright (c) 2019 The FreeBSD Foundation .\" -.\" This software was developed by BFF Storage Systems, LLC under sponsorship -.\" from the FreeBSD Foundation. +.\" This documentation was written by BFF Storage Systems, LLC under +.\" sponsorship from the FreeBSD Foundation. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. 
Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" $FreeBSD$ -.Dd April 13, 2019 +.Dd July 31, 2019 .Dt FUSEFS 5 .Os .Sh NAME .Nm fusefs .Nd "File system in USErspace" .Sh SYNOPSIS To link into the kernel: .Bd -ragged -offset indent .Cd "options FUSEFS" .Ed .Pp To load as a loadable kernel module: .Pp .Dl "kldload fusefs" .Sh DESCRIPTION The .Nm driver implements a file system that is serviced by a userspace program. .Pp There are many uses for .Nm . Userspace daemons can access libraries or programming languages that cannot run in kernel-mode, for example. .Nm is also useful for developing and debugging file systems, because a crash of the daemon will not take down the entire operating system. Finally, the .Nm API is portable. Many daemons can run on multiple operating systems with minimal modifications. 
.Sh SYSCTL VARIABLES -The following variables are available as both +The following .Xr sysctl 8 -variables and -.Xr loader 8 -tunables: +variables are available: .Bl -tag -width indent .It Va vfs.fusefs.kernelabi_major Major version of the FUSE kernel ABI supported by this driver. .It Va vfs.fusefs.kernelabi_minor Minor version of the FUSE kernel ABI supported by this driver. .It Va vfs.fusefs.data_cache_mode Controls how .Nm -will cache file data. +will cache file data for pre-7.23 file systems. A value of 0 will disable caching entirely. Every data access will be forwarded to the daemon. A value of 1 will select write-through caching. Reads will be cached in the VFS layer as usual. Writes will be immediately forwarded to the daemon, and also added to the cache. A value of 2 will select write-back caching. Reads and writes will both be cached, and writes will occasionally be flushed to the daemon by the page daemon. Write-back caching is usually unsafe, especially for FUSE file systems that require network access. -.It Va vfs.fusefs.lookup_cache_enable -Controls whether -.Nm -will cache lookup responses from the file system. -FUSE file systems indicate whether lookup responses should be cacheable, but -it may be useful to globally disable caching them if a file system is -misbehaving. +.Pp +FUSE file systems using protocol 7.23 or later specify their cache behavior +on a per-mountpoint basis, ignoring this sysctl. +.It Va vfs.fusefs.stats.filehandle_count +Current number of open FUSE file handles. +.It Va vfs.fusefs.stats.lookup_cache_hits +Total number of lookup cache hits. +.It Va vfs.fusefs.stats.lookup_cache_misses +Total number of lookup cache misses. +.It Va vfs.fusefs.stats.node_count +Current number of allocated FUSE vnodes. +.It Va vfs.fusefs.stats.ticket_count +Current number of allocated FUSE tickets, which is roughly equal to the number +of FUSE operations currently being processed by daemons. 
.\" Undocumented sysctls .\" ==================== -.\" Counters: I intend to rename to vfs.fusefs.stats.* for clarity -.\" vfs.fusefs.lookup_cache_{hits, misses} -.\" vfs.fusefs.filehandle_count -.\" vfs.fusefs.ticker_count -.\" vfs.fusefs.node_count -.\" -.\" vfs.fusefs.version - useless since the driver moved in-tree -.\" vfs.fusefs.reclaim_revoked: I don't understand it well-enough -.\" vfs.fusefs.sync_unmount: dead code .\" vfs.fusefs.enforce_dev_perms: I don't understand it well enough. -.\" vfs.fusefs.init_backgrounded: dead code .\" vfs.fusefs.iov_credit: I don't understand it well enough .\" vfs.fusefs.iov_permanent_bufsize: I don't understand it well enough -.\" vfs.fusefs.fix_broken_io: I don't understand it well enough -.\" vfs.fusefs.sync_resize: useless and should be removed -.\" vfs.fusefs.refresh_size: probably useless? -.\" vfs.fusefs.mmap_enable: why is this optional? -.\" vfs.fusefs.data_cache_invalidate: what is this needed for? +.El .Sh SEE ALSO .Xr mount_fusefs 8 .Sh HISTORY The .Nm fuse driver was written as the part of the .Fx implementation of the FUSE userspace file system framework (see -.Xr https://github.com/libfuse/libfuse ) +.Lk https://github.com/libfuse/libfuse ) and first appeared in the .Pa sysutils/fusefs-kmod port, supporting .Fx 6.0 . It was added to the base system in .Fx 10.0 , and renamed to .Nm for .Fx 12.1 . .Sh AUTHORS .An -nosplit The .Nm fuse driver was originally written by .An Csaba Henk as a Google Summer of Code project in 2005. It was further developed by .An Ilya Putsikau during Google Summer of Code 2011, and that version was integrated into the base system by .An Attilio Rao Aq Mt attilio@FreeBSD.org . .Pp This manual page was written by .An Alan Somers Aq Mt asomers@FreeBSD.org . 
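Editorial note, not part of revision 350665: the `-o` handling in mount_fusefs.c above locates a valued option such as `max_read=` or `subtype=` inside the option string with strstr(3) and copies everything up to the next comma. A minimal standalone sketch of that step follows. The function name `opt_value` and its signature are hypothetical and do not appear in the commit; the real code stores the result in `mv->mv_value` and `mv->mv_len` instead of returning it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in the commit): return a freshly allocated
 * copy of the value that follows `name` (for example "max_read=") in the
 * comma-separated option string `opts`, or NULL if `name` does not occur
 * or allocation fails.  The caller frees the result.
 *
 * Like the loop in mount_fusefs.c, this uses strstr(3), so it matches
 * `name` anywhere in the string, not only at the start of an option.
 */
static char *
opt_value(const char *opts, const char *name)
{
	const char *p, *q;
	char *val;
	size_t len;

	assert(opts != NULL && name != NULL);

	p = strstr(opts, name);
	if (p == NULL)
		return (NULL);
	p += strlen(name);		/* skip past "name=" */
	for (q = p; *q != '\0' && *q != ','; q++)
		;			/* value ends at ',' or end of string */
	len = (size_t)(q - p);
	val = malloc(len + 1);
	if (val == NULL)
		return (NULL);
	memcpy(val, p, len);
	val[len] = '\0';
	return (val);
}
```

Under these assumptions, `opt_value("allow_other,max_read=4096,intr", "max_read=")` yields the string `4096`, and a lookup for an option that is absent yields NULL. Because the match is a plain substring search, an option whose name merely contains `max_read=` would also match; the sketch keeps that behavior to stay close to the original loop.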
Index: head/sys/fs/fuse/fuse_param.h =================================================================== --- head/sys/fs/fuse/fuse_param.h (revision 350664) +++ head/sys/fs/fuse/fuse_param.h (nonexistent) @@ -1,82 +0,0 @@ -/*- - * SPDX-License-Identifier: BSD-3-Clause - * - * Copyright (c) 2007-2009 Google Inc. and Amit Singh - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions are - * met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following disclaimer - * in the documentation and/or other materials provided with the - * distribution. - * * Neither the name of Google Inc. nor the names of its - * contributors may be used to endorse or promote products derived from - * this software without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
- * - * $FreeBSD$ - */ - -#ifndef _FUSE_PARAM_H_ -#define _FUSE_PARAM_H_ - -/* - * This is the prefix ("fuse" by default) of the name of a FUSE device node - * in devfs. The suffix is the device number. "/dev/fuse0" is the first FUSE - * device by default. If you change the prefix from the default to something - * else, the user-space FUSE library will need to know about it too. - */ -#define FUSE_DEVICE_BASENAME "fuse" - -/* - * This is the number of /dev/fuse nodes we will create. goes from - * 0 to (FUSE_NDEVICES - 1). - */ -#define FUSE_NDEVICES 16 - -/* - * This is the default block size of the virtual storage devices that are - * implicitly implemented by the FUSE kernel extension. This can be changed - * on a per-mount basis (there's one such virtual device for each mount). - */ -#define FUSE_DEFAULT_BLOCKSIZE 4096 - -/* - * This is default I/O size used while accessing the virtual storage devices. - * This can be changed on a per-mount basis. - */ -#define FUSE_DEFAULT_IOSIZE 4096 - -#ifdef KERNEL - -/* - * This is the soft upper limit on the number of "request tickets" FUSE's - * user-kernel IPC layer can have for a given mount. This can be modified - * through the fuse.* sysctl interface. 
- */ -#define FUSE_DEFAULT_MAX_FREE_TICKETS 1024 - -#define FUSE_DEFAULT_IOV_PERMANENT_BUFSIZE (1L << 19) -#define FUSE_DEFAULT_IOV_CREDIT 16 - -#endif - -#define FUSE_LINK_MAX UINT32_MAX - -#endif /* _FUSE_PARAM_H_ */ Property changes on: head/sys/fs/fuse/fuse_param.h ___________________________________________________________________ Deleted: svn:eol-style ## -1 +0,0 ## -native \ No newline at end of property Deleted: svn:keywords ## -1 +0,0 ## -FreeBSD=%H \ No newline at end of property Deleted: svn:mime-type ## -1 +0,0 ## -text/plain \ No newline at end of property Index: head/sys/fs/fuse/fuse.h =================================================================== --- head/sys/fs/fuse/fuse.h (revision 350664) +++ head/sys/fs/fuse/fuse.h (revision 350665) @@ -1,169 +1,97 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include "fuse_kernel.h" #define FUSE_DEFAULT_DAEMON_TIMEOUT 60 /* s */ #define FUSE_MIN_DAEMON_TIMEOUT 0 /* s */ #define FUSE_MAX_DAEMON_TIMEOUT 600 /* s */ -#ifndef FUSE_FREEBSD_VERSION -#define FUSE_FREEBSD_VERSION "0.4.4" -#endif - -/* Mapping versions to features */ - -#define FUSE_KERNELABI_GEQ(maj, min) \ -(FUSE_KERNEL_VERSION > (maj) || (FUSE_KERNEL_VERSION == (maj) && FUSE_KERNEL_MINOR_VERSION >= (min))) - -/* - * Appearance of new FUSE operations is not always in par with version - * numbering... At least, 7.3 is a sufficient condition for having - * FUSE_{ACCESS,CREATE}. 
- */ -#if FUSE_KERNELABI_GEQ(7, 3) -#ifndef FUSE_HAS_ACCESS -#define FUSE_HAS_ACCESS 1 -#endif -#ifndef FUSE_HAS_CREATE -#define FUSE_HAS_CREATE 1 -#endif -#else /* FUSE_KERNELABI_GEQ(7, 3) */ -#ifndef FUSE_HAS_ACCESS -#define FUSE_HAS_ACCESS 0 -#endif -#ifndef FUSE_HAS_CREATE -#define FUSE_HAS_CREATE 0 -#endif -#endif - -#if FUSE_KERNELABI_GEQ(7, 7) -#ifndef FUSE_HAS_GETLK -#define FUSE_HAS_GETLK 1 -#endif -#ifndef FUSE_HAS_SETLK -#define FUSE_HAS_SETLK 1 -#endif -#ifndef FUSE_HAS_SETLKW -#define FUSE_HAS_SETLKW 1 -#endif -#ifndef FUSE_HAS_INTERRUPT -#define FUSE_HAS_INTERRUPT 1 -#endif -#else /* FUSE_KERNELABI_GEQ(7, 7) */ -#ifndef FUSE_HAS_GETLK -#define FUSE_HAS_GETLK 0 -#endif -#ifndef FUSE_HAS_SETLK -#define FUSE_HAS_SETLK 0 -#endif -#ifndef FUSE_HAS_SETLKW -#define FUSE_HAS_SETLKW 0 -#endif -#ifndef FUSE_HAS_INTERRUPT -#define FUSE_HAS_INTERRUPT 0 -#endif -#endif - -#if FUSE_KERNELABI_GEQ(7, 8) -#ifndef FUSE_HAS_FLUSH_RELEASE -#define FUSE_HAS_FLUSH_RELEASE 1 -/* - * "DESTROY" came in the middle of the 7.8 era, - * so this is not completely exact... - */ -#ifndef FUSE_HAS_DESTROY -#define FUSE_HAS_DESTROY 1 -#endif -#endif -#else /* FUSE_KERNELABI_GEQ(7, 8) */ -#ifndef FUSE_HAS_FLUSH_RELEASE -#define FUSE_HAS_FLUSH_RELEASE 0 -#ifndef FUSE_HAS_DESTROY -#define FUSE_HAS_DESTROY 0 -#endif -#endif -#endif - /* misc */ SYSCTL_DECL(_vfs_fusefs); +SYSCTL_DECL(_vfs_fusefs_stats); /* Fuse locking */ extern struct mtx fuse_mtx; #define FUSE_LOCK() fuse_lck_mtx_lock(fuse_mtx) #define FUSE_UNLOCK() fuse_lck_mtx_unlock(fuse_mtx) #define RECTIFY_TDCR(td, cred) \ do { \ if (! (td)) \ (td) = curthread; \ if (! 
(cred)) \ (cred) = (td)->td_ucred; \ } while (0) #define fuse_lck_mtx_lock(mtx) mtx_lock(&(mtx)) #define fuse_lck_mtx_unlock(mtx) mtx_unlock(&(mtx)) void fuse_ipc_init(void); void fuse_ipc_destroy(void); int fuse_device_init(void); void fuse_device_destroy(void); Index: head/sys/fs/fuse/fuse_device.c =================================================================== --- head/sys/fs/fuse/fuse_device.c (revision 350664) +++ head/sys/fs/fuse/fuse_device.c (revision 350665) @@ -1,460 +1,591 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" +#include "fuse_internal.h" #include "fuse_ipc.h" -SDT_PROVIDER_DECLARE(fuse); +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , device, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , device, trace, "int", "char*"); static struct cdev *fuse_dev; +static d_kqfilter_t fuse_device_filter; static d_open_t fuse_device_open; -static d_close_t fuse_device_close; static d_poll_t fuse_device_poll; static d_read_t fuse_device_read; static d_write_t fuse_device_write; static struct cdevsw fuse_device_cdevsw = { + .d_kqfilter = fuse_device_filter, .d_open = fuse_device_open, - .d_close = fuse_device_close, .d_name = "fuse", .d_poll = fuse_device_poll, .d_read = fuse_device_read, .d_write = fuse_device_write, .d_version = D_VERSION, }; +static int fuse_device_filt_read(struct knote *kn, long hint); +static void fuse_device_filt_detach(struct knote *kn); + +struct filterops fuse_device_rfiltops = { + .f_isfd = 1, + .f_detach = fuse_device_filt_detach, + .f_event = fuse_device_filt_read, +}; + /**************************** * * >>> Fuse device op defs * ****************************/ static void fdata_dtor(void *arg) 
{ struct fuse_data *fdata; + struct fuse_ticket *tick; fdata = arg; + if (fdata == NULL) + return; + + fdata_set_dead(fdata); + + FUSE_LOCK(); + fuse_lck_mtx_lock(fdata->aw_mtx); + /* wakup poll()ers */ + selwakeuppri(&fdata->ks_rsel, PZERO + 1); + /* Don't let syscall handlers wait in vain */ + while ((tick = fuse_aw_pop(fdata))) { + fuse_lck_mtx_lock(tick->tk_aw_mtx); + fticket_set_answered(tick); + tick->tk_aw_errno = ENOTCONN; + wakeup(tick); + fuse_lck_mtx_unlock(tick->tk_aw_mtx); + FUSE_ASSERT_AW_DONE(tick); + fuse_ticket_drop(tick); + } + fuse_lck_mtx_unlock(fdata->aw_mtx); + + /* Cleanup unsent operations */ + fuse_lck_mtx_lock(fdata->ms_mtx); + while ((tick = fuse_ms_pop(fdata))) { + fuse_ticket_drop(tick); + } + fuse_lck_mtx_unlock(fdata->ms_mtx); + FUSE_UNLOCK(); + fdata_trydestroy(fdata); } +static int +fuse_device_filter(struct cdev *dev, struct knote *kn) +{ + struct fuse_data *data; + int error; + + error = devfs_get_cdevpriv((void **)&data); + + /* EVFILT_WRITE is not supported; the device is always ready to write */ + if (error == 0 && kn->kn_filter == EVFILT_READ) { + kn->kn_fop = &fuse_device_rfiltops; + kn->kn_hook = data; + knlist_add(&data->ks_rsel.si_note, kn, 0); + error = 0; + } else if (error == 0) { + error = EINVAL; + kn->kn_data = error; + } + + return (error); +} + +static void +fuse_device_filt_detach(struct knote *kn) +{ + struct fuse_data *data; + + data = (struct fuse_data*)kn->kn_hook; + MPASS(data != NULL); + knlist_remove(&data->ks_rsel.si_note, kn, 0); + kn->kn_hook = NULL; +} + +static int +fuse_device_filt_read(struct knote *kn, long hint) +{ + struct fuse_data *data; + int ready; + + data = (struct fuse_data*)kn->kn_hook; + MPASS(data != NULL); + + mtx_assert(&data->ms_mtx, MA_OWNED); + if (fdata_get_dead(data)) { + kn->kn_flags |= EV_EOF; + kn->kn_fflags = ENODEV; + kn->kn_data = 1; + ready = 1; + } else if (STAILQ_FIRST(&data->ms_head)) { + MPASS(data->ms_count >= 1); + kn->kn_data = data->ms_count; + ready = 1; + } else { 
+ ready = 0; + } + + return (ready); +} + /* * Resources are set up on a per-open basis */ static int fuse_device_open(struct cdev *dev, int oflags, int devtype, struct thread *td) { struct fuse_data *fdata; int error; - SDT_PROBE2(fuse, , device, trace, 1, "device open"); + SDT_PROBE2(fusefs, , device, trace, 1, "device open"); fdata = fdata_alloc(dev, td->td_ucred); error = devfs_set_cdevpriv(fdata, fdata_dtor); if (error != 0) fdata_trydestroy(fdata); else - SDT_PROBE2(fuse, , device, trace, 1, "device open success"); + SDT_PROBE2(fusefs, , device, trace, 1, "device open success"); return (error); } -static int -fuse_device_close(struct cdev *dev, int fflag, int devtype, struct thread *td) -{ - struct fuse_data *data; - struct fuse_ticket *tick; - int error; - - error = devfs_get_cdevpriv((void **)&data); - if (error != 0) - return (error); - if (!data) - panic("no fuse data upon fuse device close"); - fdata_set_dead(data); - - FUSE_LOCK(); - fuse_lck_mtx_lock(data->aw_mtx); - /* wakup poll()ers */ - selwakeuppri(&data->ks_rsel, PZERO + 1); - /* Don't let syscall handlers wait in vain */ - while ((tick = fuse_aw_pop(data))) { - fuse_lck_mtx_lock(tick->tk_aw_mtx); - fticket_set_answered(tick); - tick->tk_aw_errno = ENOTCONN; - wakeup(tick); - fuse_lck_mtx_unlock(tick->tk_aw_mtx); - FUSE_ASSERT_AW_DONE(tick); - fuse_ticket_drop(tick); - } - fuse_lck_mtx_unlock(data->aw_mtx); - FUSE_UNLOCK(); - - SDT_PROBE2(fuse, , device, trace, 1, "device close"); - return (0); -} - int fuse_device_poll(struct cdev *dev, int events, struct thread *td) { struct fuse_data *data; int error, revents = 0; error = devfs_get_cdevpriv((void **)&data); if (error != 0) return (events & (POLLHUP|POLLIN|POLLRDNORM|POLLOUT|POLLWRNORM)); if (events & (POLLIN | POLLRDNORM)) { fuse_lck_mtx_lock(data->ms_mtx); if (fdata_get_dead(data) || STAILQ_FIRST(&data->ms_head)) revents |= events & (POLLIN | POLLRDNORM); else selrecord(td, &data->ks_rsel); fuse_lck_mtx_unlock(data->ms_mtx); } if (events & 
(POLLOUT | POLLWRNORM)) { revents |= events & (POLLOUT | POLLWRNORM); } return (revents); } /* * fuse_device_read hangs on the queue of VFS messages. * When it's notified that there is a new one, it picks that and * passes up to the daemon */ int fuse_device_read(struct cdev *dev, struct uio *uio, int ioflag) { int err; struct fuse_data *data; struct fuse_ticket *tick; void *buf[] = {NULL, NULL, NULL}; int buflen[3]; int i; - SDT_PROBE2(fuse, , device, trace, 1, "fuse device read"); + SDT_PROBE2(fusefs, , device, trace, 1, "fuse device read"); err = devfs_get_cdevpriv((void **)&data); if (err != 0) return (err); fuse_lck_mtx_lock(data->ms_mtx); again: if (fdata_get_dead(data)) { - SDT_PROBE2(fuse, , device, trace, 2, + SDT_PROBE2(fusefs, , device, trace, 2, "we know early on that reader should be kicked so we " "don't wait for news"); fuse_lck_mtx_unlock(data->ms_mtx); return (ENODEV); } if (!(tick = fuse_ms_pop(data))) { /* check if we may block */ if (ioflag & O_NONBLOCK) { /* get outa here soon */ fuse_lck_mtx_unlock(data->ms_mtx); return (EAGAIN); } else { err = msleep(data, &data->ms_mtx, PCATCH, "fu_msg", 0); if (err != 0) { fuse_lck_mtx_unlock(data->ms_mtx); return (fdata_get_dead(data) ? ENODEV : err); } tick = fuse_ms_pop(data); } } if (!tick) { /* * We can get here if fuse daemon suddenly terminates, * eg, by being hit by a SIGKILL * -- and some other cases, too, tho not totally clear, when * (cv_signal/wakeup_one signals the whole process ?) 
*/ - SDT_PROBE2(fuse, , device, trace, 1, "no message on thread"); + SDT_PROBE2(fusefs, , device, trace, 1, "no message on thread"); goto again; } fuse_lck_mtx_unlock(data->ms_mtx); if (fdata_get_dead(data)) { /* * somebody somewhere -- eg., umount routine -- * wants this liaison finished off */ - SDT_PROBE2(fuse, , device, trace, 2, "reader is to be sacked"); + SDT_PROBE2(fusefs, , device, trace, 2, + "reader is to be sacked"); if (tick) { - SDT_PROBE2(fuse, , device, trace, 2, "weird -- " + SDT_PROBE2(fusefs, , device, trace, 2, "weird -- " "\"kick\" is set tho there is message"); FUSE_ASSERT_MS_DONE(tick); fuse_ticket_drop(tick); } return (ENODEV); /* This should make the daemon get off * of us */ } - SDT_PROBE2(fuse, , device, trace, 1, + SDT_PROBE2(fusefs, , device, trace, 1, "fuse device read message successfully"); KASSERT(tick->tk_ms_bufdata || tick->tk_ms_bufsize == 0, ("non-null buf pointer with positive size")); switch (tick->tk_ms_type) { case FT_M_FIOV: buf[0] = tick->tk_ms_fiov.base; buflen[0] = tick->tk_ms_fiov.len; break; case FT_M_BUF: buf[0] = tick->tk_ms_fiov.base; buflen[0] = tick->tk_ms_fiov.len; buf[1] = tick->tk_ms_bufdata; buflen[1] = tick->tk_ms_bufsize; break; default: panic("unknown message type for fuse_ticket %p", tick); } for (i = 0; buf[i]; i++) { /* * Why not ban mercilessly stupid daemons who can't keep up * with us? (There is no much use of a partial read here...) */ /* * XXX note that in such cases Linux FUSE throws EIO at the * syscall invoker and stands back to the message queue. The * rationale should be made clear (and possibly adopt that * behaviour). Keeping the current scheme at least makes * fallacy as loud as possible... 
*/ if (uio->uio_resid < buflen[i]) { fdata_set_dead(data); - SDT_PROBE2(fuse, , device, trace, 2, + SDT_PROBE2(fusefs, , device, trace, 2, "daemon is stupid, kick it off..."); err = ENODEV; break; } err = uiomove(buf[i], buflen[i], uio); if (err) break; } FUSE_ASSERT_MS_DONE(tick); fuse_ticket_drop(tick); return (err); } static inline int fuse_ohead_audit(struct fuse_out_header *ohead, struct uio *uio) { if (uio->uio_resid + sizeof(struct fuse_out_header) != ohead->len) { - SDT_PROBE2(fuse, , device, trace, 1, "Format error: body size " + SDT_PROBE2(fusefs, , device, trace, 1, + "Format error: body size " "differs from size claimed by header"); return (EINVAL); } - if (uio->uio_resid && ohead->error) { - SDT_PROBE2(fuse, , device, trace, 1, + if (uio->uio_resid && ohead->unique != 0 && ohead->error) { + SDT_PROBE2(fusefs, , device, trace, 1, "Format error: non zero error but message had a body"); return (EINVAL); } - /* Sanitize the linuxism of negative errnos */ - ohead->error = -(ohead->error); return (0); } -SDT_PROBE_DEFINE1(fuse, , device, fuse_device_write_bumped_into_callback, - "uint64_t"); +SDT_PROBE_DEFINE1(fusefs, , device, fuse_device_write_notify, + "struct fuse_out_header*"); +SDT_PROBE_DEFINE1(fusefs, , device, fuse_device_write_missing_ticket, + "uint64_t"); +SDT_PROBE_DEFINE1(fusefs, , device, fuse_device_write_found, + "struct fuse_ticket*"); /* * fuse_device_write first reads the header sent by the daemon. * If that's OK, looks up ticket/callback node by the unique id seen in header. * If the callback node contains a handler function, the uio is passed over * that. 
*/ static int fuse_device_write(struct cdev *dev, struct uio *uio, int ioflag) { struct fuse_out_header ohead; int err = 0; struct fuse_data *data; - struct fuse_ticket *tick, *x_tick; + struct mount *mp; + struct fuse_ticket *tick, *itick, *x_tick; int found = 0; err = devfs_get_cdevpriv((void **)&data); if (err != 0) return (err); + mp = data->mp; if (uio->uio_resid < sizeof(struct fuse_out_header)) { - SDT_PROBE2(fuse, , device, trace, 1, + SDT_PROBE2(fusefs, , device, trace, 1, "fuse_device_write got less than a header!"); fdata_set_dead(data); return (EINVAL); } if ((err = uiomove(&ohead, sizeof(struct fuse_out_header), uio)) != 0) return (err); /* * We check header information (which is redundant) and compare it * with what we see. If we see some inconsistency we discard the * whole answer and proceed on as if it had never existed. In * particular, no pretender will be woken up, regardless the * "unique" value in the header. */ if ((err = fuse_ohead_audit(&ohead, uio))) { fdata_set_dead(data); return (err); } /* Pass stuff over to callback if there is one installed */ /* Looking for ticket with the unique id of header */ fuse_lck_mtx_lock(data->aw_mtx); TAILQ_FOREACH_SAFE(tick, &data->aw_head, tk_aw_link, x_tick) { - SDT_PROBE1(fuse, , device, - fuse_device_write_bumped_into_callback, - tick->tk_unique); if (tick->tk_unique == ohead.unique) { + SDT_PROBE1(fusefs, , device, fuse_device_write_found, + tick); found = 1; fuse_aw_remove(tick); break; } } + if (found && tick->irq_unique > 0) { + /* + * Discard the FUSE_INTERRUPT ticket that tried to interrupt + * this operation + */ + TAILQ_FOREACH_SAFE(itick, &data->aw_head, tk_aw_link, + x_tick) { + if (itick->tk_unique == tick->irq_unique) { + fuse_aw_remove(itick); + fuse_ticket_drop(itick); + break; + } + } + tick->irq_unique = 0; + } fuse_lck_mtx_unlock(data->aw_mtx); if (found) { if (tick->tk_aw_handler) { /* * We found a callback with proper handler. 
In this * case the out header will be 0wnd by the callback, * so the fun of freeing that is left for her. * (Then, by all chance, she'll just get that's done * via ticket_drop(), so no manual mucking * around...) */ - SDT_PROBE2(fuse, , device, trace, 1, + SDT_PROBE2(fusefs, , device, trace, 1, "pass ticket to a callback"); + /* Sanitize the linuxism of negative errnos */ + ohead.error *= -1; memcpy(&tick->tk_aw_ohead, &ohead, sizeof(ohead)); err = tick->tk_aw_handler(tick, uio); } else { /* pretender doesn't wanna do anything with answer */ - SDT_PROBE2(fuse, , device, trace, 1, + SDT_PROBE2(fusefs, , device, trace, 1, "stuff devalidated, so we drop it"); } /* * As aw_mtx was not held during the callback execution the * ticket may have been inserted again. However, this is safe * because fuse_ticket_drop() will deal with refcount anyway. */ fuse_ticket_drop(tick); + } else if (ohead.unique == 0){ + /* unique == 0 means asynchronous notification */ + SDT_PROBE1(fusefs, , device, fuse_device_write_notify, &ohead); + switch (ohead.error) { + case FUSE_NOTIFY_INVAL_ENTRY: + err = fuse_internal_invalidate_entry(mp, uio); + break; + case FUSE_NOTIFY_INVAL_INODE: + err = fuse_internal_invalidate_inode(mp, uio); + break; + case FUSE_NOTIFY_RETRIEVE: + case FUSE_NOTIFY_STORE: + /* + * Unimplemented. I don't know of any file systems + * that use them, and the protocol isn't sound anyway, + * since the notification messages don't include the + * inode's generation number. Without that, it's + * possible to manipulate the cache of the wrong vnode. + * Finally, it's not defined what this message should + * do for a file with dirty cache. + */ + case FUSE_NOTIFY_POLL: + /* Unimplemented. See comments in fuse_vnops */ + default: + /* Not implemented */ + err = ENOSYS; + } } else { /* no callback at all! 
*/ - SDT_PROBE2(fuse, , device, trace, 1, - "erhm, no handler for this response"); - err = EINVAL; + SDT_PROBE1(fusefs, , device, fuse_device_write_missing_ticket, + ohead.unique); + if (ohead.error == -EAGAIN) { + /* + * This was probably a response to a FUSE_INTERRUPT + * operation whose original operation is already + * complete. We can't store FUSE_INTERRUPT tickets + * indefinitely because their responses are optional. + * So we delete them when the original operation + * completes. And sadly the fuse_header_out doesn't + * identify the opcode, so we have to guess. + */ + err = 0; + } else { + err = EINVAL; + } } return (err); } int fuse_device_init(void) { fuse_dev = make_dev(&fuse_device_cdevsw, 0, UID_ROOT, GID_OPERATOR, - S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP, "fuse"); + S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH, "fuse"); if (fuse_dev == NULL) return (ENOMEM); return (0); } void fuse_device_destroy(void) { MPASS(fuse_dev != NULL); destroy_dev(fuse_dev); } Index: head/sys/fs/fuse/fuse_file.c =================================================================== --- head/sys/fs/fuse/fuse_file.c (revision 350664) +++ head/sys/fs/fuse/fuse_file.c (revision 350665) @@ -1,282 +1,376 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. 
nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include -#include #include +#include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_file.h" #include "fuse_internal.h" +#include "fuse_io.h" #include "fuse_ipc.h" #include "fuse_node.h" -SDT_PROVIDER_DECLARE(fuse); +MALLOC_DEFINE(M_FUSE_FILEHANDLE, "fuse_filefilehandle", "FUSE file handle"); + +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. 
Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , file, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , file, trace, "int", "char*"); -static int fuse_fh_count = 0; +static counter_u64_t fuse_fh_count; -SYSCTL_INT(_vfs_fusefs, OID_AUTO, filehandle_count, CTLFLAG_RD, - &fuse_fh_count, 0, "number of open FUSE filehandles"); +SYSCTL_COUNTER_U64(_vfs_fusefs_stats, OID_AUTO, filehandle_count, CTLFLAG_RD, + &fuse_fh_count, "number of open FUSE filehandles"); +/* Get the FUFH type for a particular access mode */ +static inline fufh_type_t +fflags_2_fufh_type(int fflags) +{ + if ((fflags & FREAD) && (fflags & FWRITE)) + return FUFH_RDWR; + else if (fflags & (FWRITE)) + return FUFH_WRONLY; + else if (fflags & (FREAD)) + return FUFH_RDONLY; + else if (fflags & (FEXEC)) + return FUFH_EXEC; + else + panic("FUSE: What kind of a flag is this (%x)?", fflags); +} + int -fuse_filehandle_open(struct vnode *vp, fufh_type_t fufh_type, +fuse_filehandle_open(struct vnode *vp, int a_mode, struct fuse_filehandle **fufhp, struct thread *td, struct ucred *cred) { struct fuse_dispatcher fdi; struct fuse_open_in *foi; struct fuse_open_out *foo; + fufh_type_t fufh_type; int err = 0; int oflags = 0; int op = FUSE_OPEN; - if (fuse_filehandle_valid(vp, fufh_type)) { - panic("FUSE: filehandle_open called despite valid fufh (type=%d)", - fufh_type); - /* NOTREACHED */ - } - /* - * Note that this means we are effectively FILTERING OUT open() flags. 
- */ - oflags = fuse_filehandle_xlate_to_oflags(fufh_type); + fufh_type = fflags_2_fufh_type(a_mode); + oflags = fufh_type_2_fflags(fufh_type); if (vnode_isdir(vp)) { op = FUSE_OPENDIR; - if (fufh_type != FUFH_RDONLY) { - SDT_PROBE2(fuse, , file, trace, 1, - "non-rdonly fh requested for a directory?"); - printf("FUSE:non-rdonly fh requested for a directory?\n"); - fufh_type = FUFH_RDONLY; - } + /* vn_open_vnode already rejects FWRITE on directories */ + MPASS(fufh_type == FUFH_RDONLY || fufh_type == FUFH_EXEC); } fdisp_init(&fdi, sizeof(*foi)); fdisp_make_vp(&fdi, op, vp, td, cred); foi = fdi.indata; foi->flags = oflags; if ((err = fdisp_wait_answ(&fdi))) { - SDT_PROBE2(fuse, , file, trace, 1, + SDT_PROBE2(fusefs, , file, trace, 1, "OUCH ... daemon didn't give fh"); if (err == ENOENT) { fuse_internal_vnode_disappear(vp); } goto out; } foo = fdi.answ; - fuse_filehandle_init(vp, fufh_type, fufhp, foo->fh); + fuse_filehandle_init(vp, fufh_type, fufhp, td, cred, foo); + fuse_vnode_open(vp, foo->open_flags, td); - /* - * For WRONLY opens, force DIRECT_IO. This is necessary - * since writing a partial block through the buffer cache - * will result in a read of the block and that read won't - * be allowed by the WRONLY open. 
- */ - if (fufh_type == FUFH_WRONLY) - fuse_vnode_open(vp, foo->open_flags | FOPEN_DIRECT_IO, td); - else - fuse_vnode_open(vp, foo->open_flags, td); - out: fdisp_destroy(&fdi); return err; } int -fuse_filehandle_close(struct vnode *vp, fufh_type_t fufh_type, +fuse_filehandle_close(struct vnode *vp, struct fuse_filehandle *fufh, struct thread *td, struct ucred *cred) { struct fuse_dispatcher fdi; struct fuse_release_in *fri; - struct fuse_vnode_data *fvdat = VTOFUD(vp); - struct fuse_filehandle *fufh = NULL; int err = 0; int op = FUSE_RELEASE; - fufh = &(fvdat->fufh[fufh_type]); - if (!FUFH_IS_VALID(fufh)) { - panic("FUSE: filehandle_put called on invalid fufh (type=%d)", - fufh_type); - /* NOTREACHED */ - } if (fuse_isdeadfs(vp)) { goto out; } if (vnode_isdir(vp)) op = FUSE_RELEASEDIR; fdisp_init(&fdi, sizeof(*fri)); fdisp_make_vp(&fdi, op, vp, td, cred); fri = fdi.indata; fri->fh = fufh->fh_id; - fri->flags = fuse_filehandle_xlate_to_oflags(fufh_type); + fri->flags = fufh_type_2_fflags(fufh->fufh_type); + /* + * If the file has a POSIX lock then we're supposed to set lock_owner. + * If not, then lock_owner is undefined. So we may as well always set + * it. + */ + fri->lock_owner = td->td_proc->p_pid; err = fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); out: - atomic_subtract_acq_int(&fuse_fh_count, 1); - fufh->fh_id = (uint64_t)-1; - fufh->fh_type = FUFH_INVALID; + counter_u64_add(fuse_fh_count, -1); + LIST_REMOVE(fufh, next); + free(fufh, M_FUSE_FILEHANDLE); return err; } -int -fuse_filehandle_valid(struct vnode *vp, fufh_type_t fufh_type) -{ - struct fuse_vnode_data *fvdat = VTOFUD(vp); - struct fuse_filehandle *fufh; - - fufh = &(fvdat->fufh[fufh_type]); - return FUFH_IS_VALID(fufh); -} - /* * Check for a valid file handle, first the type requested, but if that * isn't valid, try for FUFH_RDWR. - * Return the FUFH type that is valid or FUFH_INVALID if there are none. - * This is a variant of fuse_filehandle_vaild() analogous to - * fuse_filehandle_getrw(). 
+ * Return true if there is any file handle with the correct credentials and + * a fufh type that includes the provided one. + * A pid of 0 means "don't care" */ -fufh_type_t -fuse_filehandle_validrw(struct vnode *vp, fufh_type_t fufh_type) +bool +fuse_filehandle_validrw(struct vnode *vp, int mode, + struct ucred *cred, pid_t pid) { struct fuse_vnode_data *fvdat = VTOFUD(vp); struct fuse_filehandle *fufh; + fufh_type_t fufh_type = fflags_2_fufh_type(mode); - fufh = &fvdat->fufh[fufh_type]; - if (FUFH_IS_VALID(fufh) != 0) - return (fufh_type); - fufh = &fvdat->fufh[FUFH_RDWR]; - if (FUFH_IS_VALID(fufh) != 0) - return (FUFH_RDWR); - return (FUFH_INVALID); + /* + * Unlike fuse_filehandle_get, we want to search for a filehandle with + * the exact cred, and no fallback + */ + LIST_FOREACH(fufh, &fvdat->handles, next) { + if (fufh->fufh_type == fufh_type && + fufh->uid == cred->cr_uid && + fufh->gid == cred->cr_rgid && + (pid == 0 || fufh->pid == pid)) + return true; + } + + if (fufh_type == FUFH_EXEC) + return false; + + /* Fallback: find a RDWR list entry with the right cred */ + LIST_FOREACH(fufh, &fvdat->handles, next) { + if (fufh->fufh_type == FUFH_RDWR && + fufh->uid == cred->cr_uid && + fufh->gid == cred->cr_rgid && + (pid == 0 || fufh->pid == pid)) + return true; + } + + return false; } int -fuse_filehandle_get(struct vnode *vp, fufh_type_t fufh_type, - struct fuse_filehandle **fufhp) +fuse_filehandle_get(struct vnode *vp, int fflag, + struct fuse_filehandle **fufhp, struct ucred *cred, pid_t pid) { struct fuse_vnode_data *fvdat = VTOFUD(vp); struct fuse_filehandle *fufh; + fufh_type_t fufh_type; - fufh = &(fvdat->fufh[fufh_type]); - if (!FUFH_IS_VALID(fufh)) + fufh_type = fflags_2_fufh_type(fflag); + /* cred can be NULL for in-kernel clients */ + if (cred == NULL) + goto fallback; + + LIST_FOREACH(fufh, &fvdat->handles, next) { + if (fufh->fufh_type == fufh_type && + fufh->uid == cred->cr_uid && + fufh->gid == cred->cr_rgid && + (pid == 0 || fufh->pid == pid)) 
+ goto found; + } + +fallback: + /* Fallback: find a list entry with the right flags */ + LIST_FOREACH(fufh, &fvdat->handles, next) { + if (fufh->fufh_type == fufh_type) + break; + } + + if (fufh == NULL) return EBADF; + +found: if (fufhp != NULL) *fufhp = fufh; return 0; } +/* Get a file handle with any kind of flags */ int -fuse_filehandle_getrw(struct vnode *vp, fufh_type_t fufh_type, - struct fuse_filehandle **fufhp) +fuse_filehandle_get_anyflags(struct vnode *vp, + struct fuse_filehandle **fufhp, struct ucred *cred, pid_t pid) { struct fuse_vnode_data *fvdat = VTOFUD(vp); struct fuse_filehandle *fufh; - fufh = &(fvdat->fufh[fufh_type]); - if (!FUFH_IS_VALID(fufh)) { - fufh_type = FUFH_RDWR; + if (cred == NULL) + goto fallback; + + LIST_FOREACH(fufh, &fvdat->handles, next) { + if (fufh->uid == cred->cr_uid && + fufh->gid == cred->cr_rgid && + (pid == 0 || fufh->pid == pid)) + goto found; } - return fuse_filehandle_get(vp, fufh_type, fufhp); + +fallback: + /* Fallback: find any list entry */ + fufh = LIST_FIRST(&fvdat->handles); + + if (fufh == NULL) + return EBADF; + +found: + if (fufhp != NULL) + *fufhp = fufh; + return 0; } +int +fuse_filehandle_getrw(struct vnode *vp, int fflag, + struct fuse_filehandle **fufhp, struct ucred *cred, pid_t pid) +{ + int err; + + err = fuse_filehandle_get(vp, fflag, fufhp, cred, pid); + if (err) + err = fuse_filehandle_get(vp, FREAD | FWRITE, fufhp, cred, pid); + return err; +} + void fuse_filehandle_init(struct vnode *vp, fufh_type_t fufh_type, - struct fuse_filehandle **fufhp, uint64_t fh_id) + struct fuse_filehandle **fufhp, struct thread *td, struct ucred *cred, + struct fuse_open_out *foo) { struct fuse_vnode_data *fvdat = VTOFUD(vp); struct fuse_filehandle *fufh; - fufh = &(fvdat->fufh[fufh_type]); - MPASS(!FUFH_IS_VALID(fufh)); - fufh->fh_id = fh_id; - fufh->fh_type = fufh_type; + fufh = malloc(sizeof(struct fuse_filehandle), M_FUSE_FILEHANDLE, + M_WAITOK); + MPASS(fufh != NULL); + fufh->fh_id = foo->fh; + 
fufh->fufh_type = fufh_type; + fufh->gid = cred->cr_rgid; + fufh->uid = cred->cr_uid; + fufh->pid = td->td_proc->p_pid; + fufh->fuse_open_flags = foo->open_flags; if (!FUFH_IS_VALID(fufh)) { panic("FUSE: init: invalid filehandle id (type=%d)", fufh_type); } + LIST_INSERT_HEAD(&fvdat->handles, fufh, next); if (fufhp != NULL) *fufhp = fufh; - atomic_add_acq_int(&fuse_fh_count, 1); + counter_u64_add(fuse_fh_count, 1); + + if (foo->open_flags & FOPEN_DIRECT_IO) { + ASSERT_VOP_ELOCKED(vp, __func__); + VTOFUD(vp)->flag |= FN_DIRECTIO; + fuse_io_invalbuf(vp, td); + } else { + if ((foo->open_flags & FOPEN_KEEP_CACHE) == 0) + fuse_io_invalbuf(vp, td); + VTOFUD(vp)->flag &= ~FN_DIRECTIO; + } + +} + +void +fuse_file_init(void) +{ + fuse_fh_count = counter_u64_alloc(M_WAITOK); +} + +void +fuse_file_destroy(void) +{ + counter_u64_free(fuse_fh_count); } Index: head/sys/fs/fuse/fuse_file.h =================================================================== --- head/sys/fs/fuse/fuse_file.h (revision 350664) +++ head/sys/fs/fuse/fuse_file.h (revision 350665) @@ -1,146 +1,224 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _FUSE_FILE_H_ #define _FUSE_FILE_H_ #include #include #include #include #include +/* + * The fufh type is the access mode of the fuse file handle. It's the portion + * of the open(2) flags related to permission. + */ typedef enum fufh_type { FUFH_INVALID = -1, - FUFH_RDONLY = 0, - FUFH_WRONLY = 1, - FUFH_RDWR = 2, - FUFH_MAXTYPE = 3, + FUFH_RDONLY = O_RDONLY, + FUFH_WRONLY = O_WRONLY, + FUFH_RDWR = O_RDWR, + FUFH_EXEC = O_EXEC, } fufh_type_t; -_Static_assert(FUFH_RDONLY == O_RDONLY, "RDONLY"); -_Static_assert(FUFH_WRONLY == O_WRONLY, "WRONLY"); -_Static_assert(FUFH_RDWR == O_RDWR, "RDWR"); +/* + * FUSE File Handles + * + * The FUSE protocol says that a server may assign a unique 64-bit file handle + * every time that a file is opened. Effectively, that's once for each file + * descriptor. + * + * Unfortunately, the VFS doesn't help us here. VOPs don't have a + * struct file* argument. fileops do, but many syscalls bypass the fileops + * layer and go straight to a vnode. Some, like writing from cache, can't + * track a file handle even in theory. The entire concept of the file handle + * is a product of FUSE's Linux origins; Linux lacks vnodes and almost every + * file system operation takes a struct file* argument. + * + * Since FreeBSD's VFS is more file descriptor-agnostic, we must store FUSE + * filehandles in the vnode. One option would be to only store a single file + * handle and never open FUSE files concurrently. 
That's what NetBSD does. + * But that violates FUSE's security model. FUSE expects the server to do all + * authorization (except when mounted with -o default_permissions). In order + * to do that, the server needs us to send FUSE_OPEN every time somebody opens + * a new file descriptor. + * + * Another option would be to never open FUSE files concurrently, but send a + * FUSE_ACCESS prior to every open after the first. That would give the server + * the opportunity to authorize the access. Unfortunately, the FUSE protocol + * makes ACCESS optional. File systems that don't implement it are assumed to + * authorize everything. A survey of 32 fuse file systems showed that only 14 + * implemented access. Among the laggards were a few that really ought to be + * doing server-side authorization. + * + * So we do something hacky, similar to what OpenBSD, Illumos, and OSXFuse do. + * We store a list of file handles, one for each combination of vnode, uid, + * gid, pid, and access mode. When opening a file, we first check whether + * there's already a matching file handle. If so, we reuse it. If not, we + * send FUSE_OPEN and create a new file handle. That minimizes the number of + * open file handles while still allowing the server to authorize stuff. + * + * VOPs that need a file handle search through the list for a close match. + * They aren't guaranteed to find an exact match because, for example, a + * process may have changed its UID since opening the file. Also, most VOPs + * don't know exactly what permission they need. Is O_RDWR required or is + * O_RDONLY good enough? So the file handle we end up using may not be exactly + * the one we're supposed to use with that file descriptor. But if the FUSE + * file system isn't too picky, it will work. (FWIW even Linux sometimes + * guesses the file handle, during writes from cache or most SETATTR + * operations).
+ * + * I suspect this mess is part of the reason why neither NFS nor 9P has an + * equivalent of FUSE file handles. + */ struct fuse_filehandle { + LIST_ENTRY(fuse_filehandle) next; + + /* The filehandle returned by FUSE_OPEN */ uint64_t fh_id; - fufh_type_t fh_type; -}; -#define FUFH_IS_VALID(f) ((f)->fh_type != FUFH_INVALID) + /* + * flags returned by FUSE_OPEN + * Supported flags: FOPEN_DIRECT_IO, FOPEN_KEEP_CACHE + * Unsupported: + * FOPEN_NONSEEKABLE: Adding support would require a new per-file + * or per-vnode attribute, which would have to be checked by + * kern_lseek (and others) for every file system. The benefit is + * dubious, since I'm unaware of any file systems in ports that use + * this flag. + */ + uint32_t fuse_open_flags; -static inline fufh_type_t -fuse_filehandle_xlate_from_mmap(int fflags) -{ - if (fflags & (PROT_READ | PROT_WRITE)) - return FUFH_RDWR; - else if (fflags & (PROT_WRITE)) - return FUFH_WRONLY; - else if ((fflags & PROT_READ) || (fflags & PROT_EXEC)) - return FUFH_RDONLY; - else - return FUFH_INVALID; -} + /* The access mode of the file handle */ + fufh_type_t fufh_type; -static inline fufh_type_t -fuse_filehandle_xlate_from_fflags(int fflags) -{ - if ((fflags & FREAD) && (fflags & FWRITE)) - return FUFH_RDWR; - else if (fflags & (FWRITE)) - return FUFH_WRONLY; - else if (fflags & (FREAD)) - return FUFH_RDONLY; - else - panic("FUSE: What kind of a flag is this (%x)?", fflags); -} + /* Credentials used to open the file */ + gid_t gid; + pid_t pid; + uid_t uid; +}; +#define FUFH_IS_VALID(f) ((f)->fufh_type != FUFH_INVALID) + +/* + * Get the flags to use for FUSE_CREATE, FUSE_OPEN and FUSE_RELEASE + * + * These are supposed to be the same as the flags argument to open(2). + * However, since we can't reliably associate a fuse_filehandle with a specific + * file descriptor it would be dangerous to include anything more than + * the access mode flags.
For example, suppose we open a file twice, once with + * O_APPEND and once without. Then the user pwrite(2)s to offset 0 using the + * second file descriptor. If fusefs uses the first file handle, then the + * server may append the write to the end of the file rather than at offset 0. + * To prevent problems like this, we only ever send the portion of flags + * related to access mode. + * + * It's essential to send that portion, because FUSE uses it for server-side + * authorization. + */ static inline int -fuse_filehandle_xlate_to_oflags(fufh_type_t type) +fufh_type_2_fflags(fufh_type_t type) { int oflags = -1; switch (type) { case FUFH_RDONLY: case FUFH_WRONLY: case FUFH_RDWR: + case FUFH_EXEC: oflags = type; break; default: break; } return oflags; } -int fuse_filehandle_valid(struct vnode *vp, fufh_type_t fufh_type); -fufh_type_t fuse_filehandle_validrw(struct vnode *vp, fufh_type_t fufh_type); -int fuse_filehandle_get(struct vnode *vp, fufh_type_t fufh_type, - struct fuse_filehandle **fufhp); -int fuse_filehandle_getrw(struct vnode *vp, fufh_type_t fufh_type, - struct fuse_filehandle **fufhp); +bool fuse_filehandle_validrw(struct vnode *vp, int mode, + struct ucred *cred, pid_t pid); +int fuse_filehandle_get(struct vnode *vp, int fflag, + struct fuse_filehandle **fufhp, struct ucred *cred, + pid_t pid); +int fuse_filehandle_get_anyflags(struct vnode *vp, + struct fuse_filehandle **fufhp, struct ucred *cred, + pid_t pid); +int fuse_filehandle_getrw(struct vnode *vp, int fflag, + struct fuse_filehandle **fufhp, struct ucred *cred, + pid_t pid); void fuse_filehandle_init(struct vnode *vp, fufh_type_t fufh_type, - struct fuse_filehandle **fufhp, uint64_t fh_id); -int fuse_filehandle_open(struct vnode *vp, fufh_type_t fufh_type, + struct fuse_filehandle **fufhp, struct thread *td, + struct ucred *cred, struct fuse_open_out *foo); +int fuse_filehandle_open(struct vnode *vp, int mode, struct fuse_filehandle **fufhp, struct thread *td, struct ucred *cred); -int
fuse_filehandle_close(struct vnode *vp, fufh_type_t fufh_type, +int fuse_filehandle_close(struct vnode *vp, struct fuse_filehandle *fufh, struct thread *td, struct ucred *cred); + +void fuse_file_init(void); +void fuse_file_destroy(void); #endif /* _FUSE_FILE_H_ */ Index: head/sys/fs/fuse/fuse_internal.c =================================================================== --- head/sys/fs/fuse/fuse_internal.c (revision 350664) +++ head/sys/fs/fuse/fuse_internal.c (revision 350665) @@ -1,694 +1,1216 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include -#include #include +#include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_file.h" #include "fuse_internal.h" +#include "fuse_io.h" #include "fuse_ipc.h" #include "fuse_node.h" #include "fuse_file.h" -#include "fuse_param.h" -SDT_PROVIDER_DECLARE(fuse); +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. 
Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , internal, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , internal, trace, "int", "char*"); #ifdef ZERO_PAD_INCOMPLETE_BUFS static int isbzero(void *buf, size_t len); #endif -/* access */ +counter_u64_t fuse_lookup_cache_hits; +counter_u64_t fuse_lookup_cache_misses; +SYSCTL_COUNTER_U64(_vfs_fusefs_stats, OID_AUTO, lookup_cache_hits, CTLFLAG_RD, + &fuse_lookup_cache_hits, "number of positive cache hits in lookup"); + +SYSCTL_COUNTER_U64(_vfs_fusefs_stats, OID_AUTO, lookup_cache_misses, CTLFLAG_RD, + &fuse_lookup_cache_misses, "number of cache misses in lookup"); + int +fuse_internal_get_cached_vnode(struct mount* mp, ino_t ino, int flags, + struct vnode **vpp) +{ + struct bintime now; + struct thread *td = curthread; + uint64_t nodeid = ino; + int error; + + *vpp = NULL; + + error = vfs_hash_get(mp, fuse_vnode_hash(nodeid), flags, td, vpp, + fuse_vnode_cmp, &nodeid); + if (error) + return error; + /* + * Check the entry cache timeout. 
We have to do this within fusefs + * instead of by using cache_enter_time/cache_lookup because those + * routines are only intended to work with pathnames, not inodes + */ + if (*vpp != NULL) { + getbinuptime(&now); + if (bintime_cmp(&(VTOFUD(*vpp)->entry_cache_timeout), &now, >)){ + counter_u64_add(fuse_lookup_cache_hits, 1); + return 0; + } else { + /* Entry cache timeout */ + counter_u64_add(fuse_lookup_cache_misses, 1); + cache_purge(*vpp); + vput(*vpp); + *vpp = NULL; + } + } + return 0; +} + +/* Synchronously send a FUSE_ACCESS operation */ +int fuse_internal_access(struct vnode *vp, - mode_t mode, - struct fuse_access_param *facp, + accmode_t mode, struct thread *td, struct ucred *cred) { int err = 0; - uint32_t mask = 0; + uint32_t mask = F_OK; int dataflags; int vtype; struct mount *mp; struct fuse_dispatcher fdi; struct fuse_access_in *fai; struct fuse_data *data; - /* NOT YET DONE */ - /* - * If this vnop gives you trouble, just return 0 here for a lazy - * kludge. - */ - /* return 0;*/ - mp = vnode_mount(vp); vtype = vnode_vtype(vp); data = fuse_get_mpdata(mp); dataflags = data->dataflags; - if ((mode & VWRITE) && vfs_isrdonly(mp)) { - return EACCES; - } - /* Unless explicitly permitted, deny everyone except the fs owner. */ - if (vnode_isvroot(vp) && !(facp->facc_flags & FACCESS_NOCHECKSPY)) { - if (!(dataflags & FSESS_DAEMON_CAN_SPY)) { - int denied = fuse_match_cred(data->daemoncred, - cred); + if (mode == 0) + return 0; - if (denied) { - return EPERM; - } + if (mode & VMODIFY_PERMS && vfs_isrdonly(mp)) { + switch (vp->v_type) { + case VDIR: + /* FALLTHROUGH */ + case VLNK: + /* FALLTHROUGH */ + case VREG: + return EROFS; + default: + break; } - facp->facc_flags |= FACCESS_NOCHECKSPY; } - if (!(facp->facc_flags & FACCESS_DO_ACCESS)) { - return 0; + + /* Unless explicitly permitted, deny everyone except the fs owner. 
*/ + if (!(dataflags & FSESS_DAEMON_CAN_SPY)) { + if (fuse_match_cred(data->daemoncred, cred)) + return EPERM; } - if (((vtype == VREG) && (mode & VEXEC))) { -#ifdef NEED_MOUNT_ARGUMENT_FOR_THIS - /* Let the kernel handle this through open / close heuristics.*/ - return ENOTSUP; -#else - /* Let the kernel handle this. */ - return 0; -#endif - } - if (!fsess_isimpl(mp, FUSE_ACCESS)) { - /* Let the kernel handle this. */ - return 0; - } + if (dataflags & FSESS_DEFAULT_PERMISSIONS) { - /* Let the kernel handle this. */ - return 0; + struct vattr va; + + fuse_internal_getattr(vp, &va, cred, td); + return vaccess(vp->v_type, va.va_mode, va.va_uid, + va.va_gid, mode, cred, NULL); } - if ((mode & VADMIN) != 0) { - err = priv_check_cred(cred, PRIV_VFS_ADMIN); - if (err) { - return err; - } - } - if ((mode & (VWRITE | VAPPEND | VADMIN)) != 0) { + + if (!fsess_isimpl(mp, FUSE_ACCESS)) + return 0; + + if ((mode & (VWRITE | VAPPEND | VADMIN)) != 0) mask |= W_OK; - } - if ((mode & VREAD) != 0) { + if ((mode & VREAD) != 0) mask |= R_OK; - } - if ((mode & VEXEC) != 0) { + if ((mode & VEXEC) != 0) mask |= X_OK; - } - bzero(&fdi, sizeof(fdi)); fdisp_init(&fdi, sizeof(*fai)); fdisp_make_vp(&fdi, FUSE_ACCESS, vp, td, cred); fai = fdi.indata; - fai->mask = F_OK; - fai->mask |= mask; + fai->mask = mask; err = fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); if (err == ENOSYS) { fsess_set_notimpl(mp, FUSE_ACCESS); err = 0; } return err; } /* - * Cache FUSE attributes from feo, in attr cache associated with vnode 'vp'. - * Optionally, if argument 'vap' is not NULL, store a copy of the converted - * attributes there as well. + * Cache FUSE attributes from attr, in attribute cache associated with vnode + * 'vp'. Optionally, if argument 'vap' is not NULL, store a copy of the + * converted attributes there as well. * * If the nominal attribute cache TTL is zero, do not cache on the 'vp' (but do * return the result to the caller). 
*/ void fuse_internal_cache_attrs(struct vnode *vp, struct fuse_attr *attr, uint64_t attr_valid, uint32_t attr_valid_nsec, struct vattr *vap) { struct mount *mp; struct fuse_vnode_data *fvdat; + struct fuse_data *data; struct vattr *vp_cache_at; mp = vnode_mount(vp); fvdat = VTOFUD(vp); + data = fuse_get_mpdata(mp); - /* Honor explicit do-not-cache requests from user filesystems. */ - if (attr_valid == 0 && attr_valid_nsec == 0) - fvdat->valid_attr_cache = false; - else - fvdat->valid_attr_cache = true; + ASSERT_VOP_ELOCKED(vp, "fuse_internal_cache_attrs"); - vp_cache_at = VTOVA(vp); + fuse_validity_2_bintime(attr_valid, attr_valid_nsec, + &fvdat->attr_cache_timeout); - if (vap == NULL && vp_cache_at == NULL) + /* Fix our buffers if the filesize changed without us knowing */ + if (vnode_isreg(vp) && attr->size != fvdat->cached_attrs.va_size) { + (void)fuse_vnode_setsize(vp, attr->size); + fvdat->cached_attrs.va_size = attr->size; + } + + if (attr_valid > 0 || attr_valid_nsec > 0) + vp_cache_at = &(fvdat->cached_attrs); + else if (vap != NULL) + vp_cache_at = vap; + else return; - if (vap == NULL) - vap = vp_cache_at; - - vattr_null(vap); - - vap->va_fsid = mp->mnt_stat.f_fsid.val[0]; - vap->va_fileid = attr->ino; - vap->va_mode = attr->mode & ~S_IFMT; - vap->va_nlink = attr->nlink; - vap->va_uid = attr->uid; - vap->va_gid = attr->gid; - vap->va_rdev = attr->rdev; - vap->va_size = attr->size; + vattr_null(vp_cache_at); + vp_cache_at->va_fsid = mp->mnt_stat.f_fsid.val[0]; + vp_cache_at->va_fileid = attr->ino; + vp_cache_at->va_mode = attr->mode & ~S_IFMT; + vp_cache_at->va_nlink = attr->nlink; + vp_cache_at->va_uid = attr->uid; + vp_cache_at->va_gid = attr->gid; + vp_cache_at->va_rdev = attr->rdev; + vp_cache_at->va_size = attr->size; /* XXX on i386, seconds are truncated to 32 bits */ - vap->va_atime.tv_sec = attr->atime; - vap->va_atime.tv_nsec = attr->atimensec; - vap->va_mtime.tv_sec = attr->mtime; - vap->va_mtime.tv_nsec = attr->mtimensec; - vap->va_ctime.tv_sec 
= attr->ctime; - vap->va_ctime.tv_nsec = attr->ctimensec; - vap->va_blocksize = PAGE_SIZE; - vap->va_type = IFTOVT(attr->mode); - vap->va_bytes = attr->blocks * S_BLKSIZE; - vap->va_flags = 0; + vp_cache_at->va_atime.tv_sec = attr->atime; + vp_cache_at->va_atime.tv_nsec = attr->atimensec; + vp_cache_at->va_mtime.tv_sec = attr->mtime; + vp_cache_at->va_mtime.tv_nsec = attr->mtimensec; + vp_cache_at->va_ctime.tv_sec = attr->ctime; + vp_cache_at->va_ctime.tv_nsec = attr->ctimensec; + if (fuse_libabi_geq(data, 7, 9) && attr->blksize > 0) + vp_cache_at->va_blocksize = attr->blksize; + else + vp_cache_at->va_blocksize = PAGE_SIZE; + vp_cache_at->va_type = IFTOVT(attr->mode); + vp_cache_at->va_bytes = attr->blocks * S_BLKSIZE; + vp_cache_at->va_flags = 0; - if (vap != vp_cache_at && vp_cache_at != NULL) - memcpy(vp_cache_at, vap, sizeof(*vap)); + if (vap != vp_cache_at && vap != NULL) + memcpy(vap, vp_cache_at, sizeof(*vap)); } /* fsync */ int fuse_internal_fsync_callback(struct fuse_ticket *tick, struct uio *uio) { if (tick->tk_aw_ohead.error == ENOSYS) { fsess_set_notimpl(tick->tk_data->mp, fticket_opcode(tick)); } return 0; } int fuse_internal_fsync(struct vnode *vp, struct thread *td, - struct ucred *cred, - struct fuse_filehandle *fufh) + int waitfor, + bool datasync) { - int op = FUSE_FSYNC; - struct fuse_fsync_in *ffsi; + struct fuse_fsync_in *ffsi = NULL; struct fuse_dispatcher fdi; + struct fuse_filehandle *fufh; + struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct mount *mp = vnode_mount(vp); + int op = FUSE_FSYNC; + int err = 0; - if (vnode_isdir(vp)) { - op = FUSE_FSYNCDIR; + if (!fsess_isimpl(vnode_mount(vp), + (vnode_vtype(vp) == VDIR ? 
FUSE_FSYNCDIR : FUSE_FSYNC))) { + return 0; } - fdisp_init(&fdi, sizeof(*ffsi)); - fdisp_make_vp(&fdi, op, vp, td, cred); - ffsi = fdi.indata; - ffsi->fh = fufh->fh_id; + if (vnode_isdir(vp)) + op = FUSE_FSYNCDIR; - ffsi->fsync_flags = 1; /* datasync */ + if (!fsess_isimpl(mp, op)) + return 0; - fuse_insert_callback(fdi.tick, fuse_internal_fsync_callback); - fuse_insert_message(fdi.tick); + fdisp_init(&fdi, sizeof(*ffsi)); + /* + * fsync every open file handle for this file, because we can't be sure + * which file handle the caller is really referring to. + */ + LIST_FOREACH(fufh, &fvdat->handles, next) { + if (ffsi == NULL) + fdisp_make_vp(&fdi, op, vp, td, NULL); + else + fdisp_refresh_vp(&fdi, op, vp, td, NULL); + ffsi = fdi.indata; + ffsi->fh = fufh->fh_id; + ffsi->fsync_flags = 0; + if (datasync) + ffsi->fsync_flags = 1; + + if (waitfor == MNT_WAIT) { + err = fdisp_wait_answ(&fdi); + } else { + fuse_insert_callback(fdi.tick, + fuse_internal_fsync_callback); + fuse_insert_message(fdi.tick, false); + } + if (err == ENOSYS) { + /* ENOSYS means "success, and don't call again" */ + fsess_set_notimpl(mp, op); + err = 0; + break; + } + } fdisp_destroy(&fdi); - return 0; + return err; +} +/* Asynchronous invalidation */ +SDT_PROBE_DEFINE2(fusefs, , internal, invalidate_cache_hit, + "struct vnode*", "struct vnode*"); +int +fuse_internal_invalidate_entry(struct mount *mp, struct uio *uio) +{ + struct fuse_notify_inval_entry_out fnieo; + struct componentname cn; + struct vnode *dvp, *vp; + char name[PATH_MAX]; + int err; + + if ((err = uiomove(&fnieo, sizeof(fnieo), uio)) != 0) + return (err); + + if ((err = uiomove(name, fnieo.namelen, uio)) != 0) + return (err); + name[fnieo.namelen] = '\0'; + /* fusefs does not cache "." or ".." 
entries */ + if (strncmp(name, ".", sizeof(".")) == 0 || + strncmp(name, "..", sizeof("..")) == 0) + return (0); + + if (fnieo.parent == FUSE_ROOT_ID) + err = VFS_ROOT(mp, LK_SHARED, &dvp); + else + err = fuse_internal_get_cached_vnode(mp, fnieo.parent, + LK_SHARED, &dvp); + /* + * If dvp is not in the cache, then it must've been reclaimed. And + * since fuse_vnop_reclaim does a cache_purge, name's entry must've + * been invalidated already. So we can safely return if dvp == NULL + */ + if (err != 0 || dvp == NULL) + return (err); + /* + * XXX we can't check dvp's generation because the FUSE invalidate + * entry message doesn't include it. Worst case is that we invalidate + * an entry that didn't need to be invalidated. + */ + + cn.cn_nameiop = LOOKUP; + cn.cn_flags = 0; /* !MAKEENTRY means free cached entry */ + cn.cn_thread = curthread; + cn.cn_cred = curthread->td_ucred; + cn.cn_lkflags = LK_SHARED; + cn.cn_pnbuf = NULL; + cn.cn_nameptr = name; + cn.cn_namelen = fnieo.namelen; + err = cache_lookup(dvp, &vp, &cn, NULL, NULL); + MPASS(err == 0); + fuse_vnode_clear_attr_cache(dvp); + vput(dvp); + return (0); } +int +fuse_internal_invalidate_inode(struct mount *mp, struct uio *uio) +{ + struct fuse_notify_inval_inode_out fniio; + struct vnode *vp; + int err; + + if ((err = uiomove(&fniio, sizeof(fniio), uio)) != 0) + return (err); + + if (fniio.ino == FUSE_ROOT_ID) + err = VFS_ROOT(mp, LK_EXCLUSIVE, &vp); + else + err = fuse_internal_get_cached_vnode(mp, fniio.ino, LK_SHARED, + &vp); + if (err != 0 || vp == NULL) + return (err); + /* + * XXX we can't check vp's generation because the FUSE invalidate + * inode message doesn't include it. Worst case is that we invalidate + * an inode that didn't need to be invalidated. + */ + + /* + * Flush and invalidate buffers if off >= 0. Technically we only need + * to flush and invalidate the range of offsets [off, off + len), but + * for simplicity's sake we do everything. 
+ */ + if (fniio.off >= 0) + fuse_io_invalbuf(vp, curthread); + fuse_vnode_clear_attr_cache(vp); + vput(vp); + return (0); +} + +/* mknod */ +int +fuse_internal_mknod(struct vnode *dvp, struct vnode **vpp, + struct componentname *cnp, struct vattr *vap) +{ + struct fuse_data *data; + struct fuse_mknod_in fmni; + size_t insize; + + data = fuse_get_mpdata(dvp->v_mount); + + fmni.mode = MAKEIMODE(vap->va_type, vap->va_mode); + fmni.rdev = vap->va_rdev; + if (fuse_libabi_geq(data, 7, 12)) { + insize = sizeof(fmni); + fmni.umask = curthread->td_proc->p_fd->fd_cmask; + } else { + insize = FUSE_COMPAT_MKNOD_IN_SIZE; + } + return (fuse_internal_newentry(dvp, vpp, cnp, FUSE_MKNOD, &fmni, + insize, vap->va_type)); +} + /* readdir */ int fuse_internal_readdir(struct vnode *vp, struct uio *uio, + off_t startoff, struct fuse_filehandle *fufh, - struct fuse_iov *cookediov) + struct fuse_iov *cookediov, + int *ncookies, + u_long *cookies) { int err = 0; struct fuse_dispatcher fdi; - struct fuse_read_in *fri; + struct fuse_read_in *fri = NULL; + int fnd_start; - if (uio_resid(uio) == 0) { + if (uio_resid(uio) == 0) return 0; - } fdisp_init(&fdi, 0); /* * Note that we DO NOT have a UIO_SYSSPACE here (so no need for p2p * I/O). */ + /* + * fnd_start is set non-zero once the offset in the directory gets + * to the startoff. This is done because directories must be read + * from the beginning (offset == 0) when fuse_vnop_readdir() needs + * to do an open of the directory. + * If it is not set non-zero here, it will be set non-zero in + * fuse_internal_readdir_processdata() when uio_offset == startoff. 
+ */ + fnd_start = 0; + if (uio->uio_offset == startoff) + fnd_start = 1; while (uio_resid(uio) > 0) { - fdi.iosize = sizeof(*fri); - fdisp_make_vp(&fdi, FUSE_READDIR, vp, NULL, NULL); + if (fri == NULL) + fdisp_make_vp(&fdi, FUSE_READDIR, vp, NULL, NULL); + else + fdisp_refresh_vp(&fdi, FUSE_READDIR, vp, NULL, NULL); fri = fdi.indata; fri->fh = fufh->fh_id; fri->offset = uio_offset(uio); - fri->size = min(uio_resid(uio), FUSE_DEFAULT_IOSIZE); - /* mp->max_read */ + fri->size = MIN(uio->uio_resid, + fuse_get_mpdata(vp->v_mount)->max_read); - if ((err = fdisp_wait_answ(&fdi))) { + if ((err = fdisp_wait_answ(&fdi))) break; - } - if ((err = fuse_internal_readdir_processdata(uio, fri->size, fdi.answ, - fdi.iosize, cookediov))) { + if ((err = fuse_internal_readdir_processdata(uio, startoff, + &fnd_start, fri->size, fdi.answ, fdi.iosize, cookediov, + ncookies, &cookies))) break; - } } fdisp_destroy(&fdi); return ((err == -1) ? 0 : err); } +/* + * Return -1 to indicate that this readdir is finished, 0 if it copied + * all the directory data read in and it may be possible to read more + * and greater than 0 for a failure. + */ int fuse_internal_readdir_processdata(struct uio *uio, + off_t startoff, + int *fnd_start, size_t reqsize, void *buf, size_t bufsize, - void *param) + struct fuse_iov *cookediov, + int *ncookies, + u_long **cookiesp) { int err = 0; - int cou = 0; int bytesavail; size_t freclen; struct dirent *de; struct fuse_dirent *fudge; - struct fuse_iov *cookediov = param; + u_long *cookies; - if (bufsize < FUSE_NAME_OFFSET) { + cookies = *cookiesp; + if (bufsize < FUSE_NAME_OFFSET) return -1; - } for (;;) { - if (bufsize < FUSE_NAME_OFFSET) { err = -1; break; } fudge = (struct fuse_dirent *)buf; freclen = FUSE_DIRENT_SIZE(fudge); - cou++; - if (bufsize < freclen) { - err = ((cou == 1) ? -1 : 0); + /* + * This indicates a partial directory entry at the + * end of the directory data. 
+ */ + err = -1; break; } #ifdef ZERO_PAD_INCOMPLETE_BUFS if (isbzero(buf, FUSE_NAME_OFFSET)) { err = -1; break; } #endif if (!fudge->namelen || fudge->namelen > MAXNAMLEN) { err = EINVAL; break; } bytesavail = GENERIC_DIRSIZ((struct pseudo_dirent *) &fudge->namelen); if (bytesavail > uio_resid(uio)) { + /* Out of space for the dir so we are done. */ err = -1; break; } - fiov_refresh(cookediov); - fiov_adjust(cookediov, bytesavail); + /* + * Don't start to copy the directory entries out until + * the requested offset in the directory is found. + */ + if (*fnd_start != 0) { + fiov_adjust(cookediov, bytesavail); + bzero(cookediov->base, bytesavail); - de = (struct dirent *)cookediov->base; - de->d_fileno = fudge->ino; - de->d_reclen = bytesavail; - de->d_type = fudge->type; - de->d_namlen = fudge->namelen; - memcpy((char *)cookediov->base + sizeof(struct dirent) - - MAXNAMLEN - 1, - (char *)buf + FUSE_NAME_OFFSET, fudge->namelen); - dirent_terminate(de); + de = (struct dirent *)cookediov->base; + de->d_fileno = fudge->ino; + de->d_reclen = bytesavail; + de->d_type = fudge->type; + de->d_namlen = fudge->namelen; + memcpy((char *)cookediov->base + sizeof(struct dirent) - + MAXNAMLEN - 1, + (char *)buf + FUSE_NAME_OFFSET, fudge->namelen); + dirent_terminate(de); - err = uiomove(cookediov->base, cookediov->len, uio); - if (err) { - break; - } + err = uiomove(cookediov->base, cookediov->len, uio); + if (err) + break; + if (cookies != NULL) { + if (*ncookies == 0) { + err = -1; + break; + } + *cookies = fudge->off; + cookies++; + (*ncookies)--; + } + } else if (startoff == fudge->off) + *fnd_start = 1; buf = (char *)buf + freclen; bufsize -= freclen; uio_setoffset(uio, fudge->off); } + *cookiesp = cookies; return err; } /* remove */ int fuse_internal_remove(struct vnode *dvp, struct vnode *vp, struct componentname *cnp, enum fuse_opcode op) { struct fuse_dispatcher fdi; - struct fuse_vnode_data *fvdat; - int err; + nlink_t nlink; + int err = 0; - err = 0; - fvdat = 
VTOFUD(vp); - fdisp_init(&fdi, cnp->cn_namelen + 1); fdisp_make_vp(&fdi, op, dvp, cnp->cn_thread, cnp->cn_cred); memcpy(fdi.indata, cnp->cn_nameptr, cnp->cn_namelen); ((char *)fdi.indata)[cnp->cn_namelen] = '\0'; err = fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); + + if (err) + return (err); + + /* + * Access the cached nlink even if the attr cached has expired. If + * it's inaccurate, the worst that will happen is: + * 1) We'll recycle the vnode even though the file has another link we + * don't know about, costing a bit of cpu time, or + * 2) We won't recycle the vnode even though all of its links are gone. + * It will linger around until vnlru reclaims it, costing a bit of + * temporary memory. + */ + nlink = VTOFUD(vp)->cached_attrs.va_nlink--; + + /* + * Purge the parent's attribute cache because the daemon + * should've updated its mtime and ctime. + */ + fuse_vnode_clear_attr_cache(dvp); + + /* NB: nlink could be zero if it was never cached */ + if (nlink <= 1 || vnode_vtype(vp) == VDIR) { + fuse_internal_vnode_disappear(vp); + } else { + cache_purge(vp); + fuse_vnode_update(vp, FN_CTIMECHANGE); + } + return err; } /* rename */ int fuse_internal_rename(struct vnode *fdvp, struct componentname *fcnp, struct vnode *tdvp, struct componentname *tcnp) { struct fuse_dispatcher fdi; struct fuse_rename_in *fri; int err = 0; fdisp_init(&fdi, sizeof(*fri) + fcnp->cn_namelen + tcnp->cn_namelen + 2); fdisp_make_vp(&fdi, FUSE_RENAME, fdvp, tcnp->cn_thread, tcnp->cn_cred); fri = fdi.indata; fri->newdir = VTOI(tdvp); memcpy((char *)fdi.indata + sizeof(*fri), fcnp->cn_nameptr, fcnp->cn_namelen); ((char *)fdi.indata)[sizeof(*fri) + fcnp->cn_namelen] = '\0'; memcpy((char *)fdi.indata + sizeof(*fri) + fcnp->cn_namelen + 1, tcnp->cn_nameptr, tcnp->cn_namelen); ((char *)fdi.indata)[sizeof(*fri) + fcnp->cn_namelen + tcnp->cn_namelen + 1] = '\0'; err = fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); return err; } /* strategy */ /* entity creation */ void 
fuse_internal_newentry_makerequest(struct mount *mp, uint64_t dnid, struct componentname *cnp, enum fuse_opcode op, void *buf, size_t bufsize, struct fuse_dispatcher *fdip) { fdip->iosize = bufsize + cnp->cn_namelen + 1; fdisp_make(fdip, op, mp, dnid, cnp->cn_thread, cnp->cn_cred); memcpy(fdip->indata, buf, bufsize); memcpy((char *)fdip->indata + bufsize, cnp->cn_nameptr, cnp->cn_namelen); ((char *)fdip->indata)[bufsize + cnp->cn_namelen] = '\0'; } int fuse_internal_newentry_core(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, enum vtype vtyp, struct fuse_dispatcher *fdip) { int err = 0; struct fuse_entry_out *feo; struct mount *mp = vnode_mount(dvp); if ((err = fdisp_wait_answ(fdip))) { return err; } feo = fdip->answ; if ((err = fuse_internal_checkentry(feo, vtyp))) { return err; } err = fuse_vnode_get(mp, feo, feo->nodeid, dvp, vpp, cnp, vtyp); if (err) { fuse_internal_forget_send(mp, cnp->cn_thread, cnp->cn_cred, feo->nodeid, 1); return err; } + + /* + * Purge the parent's attribute cache because the daemon should've + * updated its mtime and ctime + */ + fuse_vnode_clear_attr_cache(dvp); + fuse_internal_cache_attrs(*vpp, &feo->attr, feo->attr_valid, feo->attr_valid_nsec, NULL); return err; } int fuse_internal_newentry(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, enum fuse_opcode op, void *buf, size_t bufsize, enum vtype vtype) { int err; struct fuse_dispatcher fdi; struct mount *mp = vnode_mount(dvp); fdisp_init(&fdi, 0); fuse_internal_newentry_makerequest(mp, VTOI(dvp), cnp, op, buf, bufsize, &fdi); err = fuse_internal_newentry_core(dvp, vpp, cnp, vtype, &fdi); fdisp_destroy(&fdi); return err; } /* entity destruction */ int fuse_internal_forget_callback(struct fuse_ticket *ftick, struct uio *uio) { fuse_internal_forget_send(ftick->tk_data->mp, curthread, NULL, ((struct fuse_in_header *)ftick->tk_ms_fiov.base)->nodeid, 1); return 0; } void fuse_internal_forget_send(struct mount *mp, struct thread *td, struct ucred *cred, 
uint64_t nodeid, uint64_t nlookup) { struct fuse_dispatcher fdi; struct fuse_forget_in *ffi; /* * KASSERT(nlookup > 0, ("zero-times forget for vp #%llu", * (long long unsigned) nodeid)); */ fdisp_init(&fdi, sizeof(*ffi)); fdisp_make(&fdi, FUSE_FORGET, mp, nodeid, td, cred); ffi = fdi.indata; ffi->nlookup = nlookup; - fuse_insert_message(fdi.tick); + fuse_insert_message(fdi.tick, false); fdisp_destroy(&fdi); } +/* Fetch the vnode's attributes from the daemon*/ +int +fuse_internal_do_getattr(struct vnode *vp, struct vattr *vap, + struct ucred *cred, struct thread *td) +{ + struct fuse_dispatcher fdi; + struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct fuse_getattr_in *fgai; + struct fuse_attr_out *fao; + off_t old_filesize = fvdat->cached_attrs.va_size; + struct timespec old_ctime = fvdat->cached_attrs.va_ctime; + struct timespec old_mtime = fvdat->cached_attrs.va_mtime; + enum vtype vtyp; + int err; + + fdisp_init(&fdi, 0); + fdisp_make_vp(&fdi, FUSE_GETATTR, vp, td, cred); + fgai = fdi.indata; + /* + * We could look up a file handle and set it in fgai->fh, but that + * involves extra runtime work and I'm unaware of any file systems that + * care. 
+ */ + fgai->getattr_flags = 0; + if ((err = fdisp_simple_putget_vp(&fdi, FUSE_GETATTR, vp, td, cred))) { + if (err == ENOENT) + fuse_internal_vnode_disappear(vp); + goto out; + } + + fao = (struct fuse_attr_out *)fdi.answ; + vtyp = IFTOVT(fao->attr.mode); + if (fvdat->flag & FN_SIZECHANGE) + fao->attr.size = old_filesize; + if (fvdat->flag & FN_CTIMECHANGE) { + fao->attr.ctime = old_ctime.tv_sec; + fao->attr.ctimensec = old_ctime.tv_nsec; + } + if (fvdat->flag & FN_MTIMECHANGE) { + fao->attr.mtime = old_mtime.tv_sec; + fao->attr.mtimensec = old_mtime.tv_nsec; + } + fuse_internal_cache_attrs(vp, &fao->attr, fao->attr_valid, + fao->attr_valid_nsec, vap); + if (vtyp != vnode_vtype(vp)) { + fuse_internal_vnode_disappear(vp); + err = ENOENT; + } + +out: + fdisp_destroy(&fdi); + return err; +} + +/* Read a vnode's attributes from cache or fetch them from the fuse daemon */ +int +fuse_internal_getattr(struct vnode *vp, struct vattr *vap, struct ucred *cred, + struct thread *td) +{ + struct vattr *attrs; + + if ((attrs = VTOVA(vp)) != NULL) { + *vap = *attrs; /* struct copy */ + return 0; + } + + return fuse_internal_do_getattr(vp, vap, cred, td); +} + void fuse_internal_vnode_disappear(struct vnode *vp) { struct fuse_vnode_data *fvdat = VTOFUD(vp); ASSERT_VOP_ELOCKED(vp, "fuse_internal_vnode_disappear"); fvdat->flag |= FN_REVOKED; - fvdat->valid_attr_cache = false; cache_purge(vp); } /* fuse start/stop */ int fuse_internal_init_callback(struct fuse_ticket *tick, struct uio *uio) { int err = 0; struct fuse_data *data = tick->tk_data; struct fuse_init_out *fiio; if ((err = tick->tk_aw_ohead.error)) { goto out; } if ((err = fticket_pull(tick, uio))) { goto out; } fiio = fticket_resp(tick)->base; - /* XXX: Do we want to check anything further besides this? 
*/ - if (fiio->major < 7) { - SDT_PROBE2(fuse, , internal, trace, 1, + data->fuse_libabi_major = fiio->major; + data->fuse_libabi_minor = fiio->minor; + if (!fuse_libabi_geq(data, 7, 4)) { + /* + * With a little work we could support servers as old as 7.1. + * But there would be little payoff. + */ + SDT_PROBE2(fusefs, , internal, trace, 1, "userpace version too low"); err = EPROTONOSUPPORT; goto out; } - data->fuse_libabi_major = fiio->major; - data->fuse_libabi_minor = fiio->minor; if (fuse_libabi_geq(data, 7, 5)) { - if (fticket_resp(tick)->len == sizeof(struct fuse_init_out)) { + if (fticket_resp(tick)->len == sizeof(struct fuse_init_out) || + fticket_resp(tick)->len == FUSE_COMPAT_22_INIT_OUT_SIZE) { data->max_write = fiio->max_write; + if (fiio->flags & FUSE_ASYNC_READ) + data->dataflags |= FSESS_ASYNC_READ; + if (fiio->flags & FUSE_POSIX_LOCKS) + data->dataflags |= FSESS_POSIX_LOCKS; + if (fiio->flags & FUSE_EXPORT_SUPPORT) + data->dataflags |= FSESS_EXPORT_SUPPORT; + /* + * Don't bother to check FUSE_BIG_WRITES, because it's + * redundant with max_write + */ + /* + * max_background and congestion_threshold are not + * implemented + */ } else { err = EINVAL; } } else { - /* Old fix values */ + /* Old fixed values */ data->max_write = 4096; } + if (fuse_libabi_geq(data, 7, 6)) + data->max_readahead_blocks = fiio->max_readahead / maxbcachebuf; + + if (!fuse_libabi_geq(data, 7, 7)) + fsess_set_notimpl(data->mp, FUSE_INTERRUPT); + + if (!fuse_libabi_geq(data, 7, 8)) { + fsess_set_notimpl(data->mp, FUSE_BMAP); + fsess_set_notimpl(data->mp, FUSE_DESTROY); + } + + if (fuse_libabi_geq(data, 7, 23) && fiio->time_gran >= 1 && + fiio->time_gran <= 1000000000) + data->time_gran = fiio->time_gran; + else + data->time_gran = 1; + + if (!fuse_libabi_geq(data, 7, 23)) + data->cache_mode = fuse_data_cache_mode; + else if (fiio->flags & FUSE_WRITEBACK_CACHE) + data->cache_mode = FUSE_CACHE_WB; + else + data->cache_mode = FUSE_CACHE_WT; + out: if (err) { fdata_set_dead(data); 
} FUSE_LOCK(); data->dataflags |= FSESS_INITED; wakeup(&data->ticketer); FUSE_UNLOCK(); return 0; } void fuse_internal_send_init(struct fuse_data *data, struct thread *td) { struct fuse_init_in *fiii; struct fuse_dispatcher fdi; fdisp_init(&fdi, sizeof(*fiii)); fdisp_make(&fdi, FUSE_INIT, data->mp, 0, td, NULL); fiii = fdi.indata; fiii->major = FUSE_KERNEL_VERSION; fiii->minor = FUSE_KERNEL_MINOR_VERSION; - fiii->max_readahead = FUSE_DEFAULT_IOSIZE * 16; - fiii->flags = 0; + /* + * fusefs currently reads ahead no more than one cache block at a time. + * See fuse_read_biobackend + */ + fiii->max_readahead = maxbcachebuf; + /* + * Unsupported features: + * FUSE_FILE_OPS: No known FUSE server or client supports it + * FUSE_ATOMIC_O_TRUNC: our VFS cannot support it + * FUSE_DONT_MASK: unlike Linux, FreeBSD always applies the umask, even + * when default ACLs are in use. + * FUSE_SPLICE_WRITE, FUSE_SPLICE_MOVE, FUSE_SPLICE_READ: FreeBSD + * doesn't have splice(2). + * FUSE_FLOCK_LOCKS: not yet implemented + * FUSE_HAS_IOCTL_DIR: not yet implemented + * FUSE_AUTO_INVAL_DATA: not yet implemented + * FUSE_DO_READDIRPLUS: not yet implemented + * FUSE_READDIRPLUS_AUTO: not yet implemented + * FUSE_ASYNC_DIO: not yet implemented + * FUSE_NO_OPEN_SUPPORT: not yet implemented + */ + fiii->flags = FUSE_ASYNC_READ | FUSE_POSIX_LOCKS | FUSE_EXPORT_SUPPORT + | FUSE_BIG_WRITES | FUSE_WRITEBACK_CACHE; fuse_insert_callback(fdi.tick, fuse_internal_init_callback); - fuse_insert_message(fdi.tick); + fuse_insert_message(fdi.tick, false); fdisp_destroy(&fdi); } +/* + * Send a FUSE_SETATTR operation with no permissions checks. 
If cred is NULL, + * send the request with root credentials + */ +int fuse_internal_setattr(struct vnode *vp, struct vattr *vap, + struct thread *td, struct ucred *cred) +{ + struct fuse_vnode_data *fvdat; + struct fuse_dispatcher fdi; + struct fuse_setattr_in *fsai; + struct mount *mp; + pid_t pid = td->td_proc->p_pid; + struct fuse_data *data; + int dataflags; + int err = 0; + enum vtype vtyp; + int sizechanged = -1; + uint64_t newsize = 0; + + mp = vnode_mount(vp); + fvdat = VTOFUD(vp); + data = fuse_get_mpdata(mp); + dataflags = data->dataflags; + + fdisp_init(&fdi, sizeof(*fsai)); + fdisp_make_vp(&fdi, FUSE_SETATTR, vp, td, cred); + if (!cred) { + fdi.finh->uid = 0; + fdi.finh->gid = 0; + } + fsai = fdi.indata; + fsai->valid = 0; + + if (vap->va_uid != (uid_t)VNOVAL) { + fsai->uid = vap->va_uid; + fsai->valid |= FATTR_UID; + } + if (vap->va_gid != (gid_t)VNOVAL) { + fsai->gid = vap->va_gid; + fsai->valid |= FATTR_GID; + } + if (vap->va_size != VNOVAL) { + struct fuse_filehandle *fufh = NULL; + + /* Truncate to a new value. 
*/ + fsai->size = vap->va_size; + sizechanged = 1; + newsize = vap->va_size; + fsai->valid |= FATTR_SIZE; + + fuse_filehandle_getrw(vp, FWRITE, &fufh, cred, pid); + if (fufh) { + fsai->fh = fufh->fh_id; + fsai->valid |= FATTR_FH; + } + VTOFUD(vp)->flag &= ~FN_SIZECHANGE; + } + if (vap->va_atime.tv_sec != VNOVAL) { + fsai->atime = vap->va_atime.tv_sec; + fsai->atimensec = vap->va_atime.tv_nsec; + fsai->valid |= FATTR_ATIME; + if (vap->va_vaflags & VA_UTIMES_NULL) + fsai->valid |= FATTR_ATIME_NOW; + } + if (vap->va_mtime.tv_sec != VNOVAL) { + fsai->mtime = vap->va_mtime.tv_sec; + fsai->mtimensec = vap->va_mtime.tv_nsec; + fsai->valid |= FATTR_MTIME; + if (vap->va_vaflags & VA_UTIMES_NULL) + fsai->valid |= FATTR_MTIME_NOW; + } else if (fvdat->flag & FN_MTIMECHANGE) { + fsai->mtime = fvdat->cached_attrs.va_mtime.tv_sec; + fsai->mtimensec = fvdat->cached_attrs.va_mtime.tv_nsec; + fsai->valid |= FATTR_MTIME; + } + if (fuse_libabi_geq(data, 7, 23) && fvdat->flag & FN_CTIMECHANGE) { + fsai->ctime = fvdat->cached_attrs.va_ctime.tv_sec; + fsai->ctimensec = fvdat->cached_attrs.va_ctime.tv_nsec; + fsai->valid |= FATTR_CTIME; + } + if (vap->va_mode != (mode_t)VNOVAL) { + fsai->mode = vap->va_mode & ALLPERMS; + fsai->valid |= FATTR_MODE; + } + if (!fsai->valid) { + goto out; + } + + if ((err = fdisp_wait_answ(&fdi))) + goto out; + vtyp = IFTOVT(((struct fuse_attr_out *)fdi.answ)->attr.mode); + + if (vnode_vtype(vp) != vtyp) { + if (vnode_vtype(vp) == VNON && vtyp != VNON) { + SDT_PROBE2(fusefs, , internal, trace, 1, "FUSE: Dang! " + "vnode_vtype is VNON and vtype isn't."); + } else { + /* + * STALE vnode, ditch + * + * The vnode has changed its type "behind our back". + * There's nothing really we can do, so let us just + * force an internal revocation and tell the caller to + * try again, if interested. 
+ */ + fuse_internal_vnode_disappear(vp); + err = EAGAIN; + } + } + if (err == 0) { + struct fuse_attr_out *fao = (struct fuse_attr_out*)fdi.answ; + fuse_vnode_undirty_cached_timestamps(vp); + fuse_internal_cache_attrs(vp, &fao->attr, fao->attr_valid, + fao->attr_valid_nsec, NULL); + } + +out: + fdisp_destroy(&fdi); + return err; +} + #ifdef ZERO_PAD_INCOMPLETE_BUFS static int isbzero(void *buf, size_t len) { int i; for (i = 0; i < len; i++) { if (((char *)buf)[i]) return (0); } return (1); } #endif + +void +fuse_internal_init(void) +{ + fuse_lookup_cache_misses = counter_u64_alloc(M_WAITOK); + fuse_lookup_cache_hits = counter_u64_alloc(M_WAITOK); +} + +void +fuse_internal_destroy(void) +{ + counter_u64_free(fuse_lookup_cache_hits); + counter_u64_free(fuse_lookup_cache_misses); +} Index: head/sys/fs/fuse/fuse_internal.h =================================================================== --- head/sys/fs/fuse/fuse_internal.h (revision 350664) +++ head/sys/fs/fuse/fuse_internal.h (revision 350665) @@ -1,274 +1,326 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _FUSE_INTERNAL_H_ #define _FUSE_INTERNAL_H_ #include +#include +#include #include #include #include #include "fuse_ipc.h" #include "fuse_node.h" +extern counter_u64_t fuse_lookup_cache_hits; +extern counter_u64_t fuse_lookup_cache_misses; + static inline bool vfs_isrdonly(struct mount *mp) { return ((mp->mnt_flag & MNT_RDONLY) != 0); } static inline struct mount * vnode_mount(struct vnode *vp) { return (vp->v_mount); } -static inline bool -vnode_mountedhere(struct vnode *vp) -{ - return (vp->v_mountedhere != NULL); -} - static inline enum vtype vnode_vtype(struct vnode *vp) { return (vp->v_type); } static inline bool vnode_isvroot(struct vnode *vp) { return ((vp->v_vflag & VV_ROOT) != 0); } static inline bool vnode_isreg(struct vnode *vp) { return (vp->v_type == VREG); } static inline bool vnode_isdir(struct vnode *vp) { return (vp->v_type == VDIR); } static inline bool vnode_islnk(struct vnode *vp) { return (vp->v_type == VLNK); } static inline ssize_t uio_resid(struct uio *uio) { return (uio->uio_resid); } static inline off_t uio_offset(struct uio *uio) { return (uio->uio_offset); } static inline void uio_setoffset(struct uio *uio, off_t offset) { uio->uio_offset = offset; } -static inline void -uio_setresid(struct uio *uio, ssize_t resid) -{ - uio->uio_resid = resid; -} - /* miscellaneous */ static inline bool fuse_isdeadfs(struct vnode *vp) { struct fuse_data *data = fuse_get_mpdata(vnode_mount(vp)); return (data->dataflags & 
FSESS_DEAD); } static inline uint64_t fuse_iosize(struct vnode *vp) { return (vp->v_mount->mnt_stat.f_iosize); } -/* access */ +/* + * Make a cacheable timeout in bintime format value based on a fuse_attr_out + * response + */ +static inline void +fuse_validity_2_bintime(uint64_t attr_valid, uint32_t attr_valid_nsec, + struct bintime *timeout) +{ + struct timespec now, duration, timeout_ts; -#define FVP_ACCESS_NOOP 0x01 + getnanouptime(&now); + /* "+ 2" is the bound of attr_valid_nsec + now.tv_nsec */ + /* Why oh why isn't there a TIME_MAX defined? */ + if (attr_valid >= INT_MAX || attr_valid + now.tv_sec + 2 >= INT_MAX) { + timeout->sec = INT_MAX; + } else { + duration.tv_sec = attr_valid; + duration.tv_nsec = attr_valid_nsec; + timespecadd(&duration, &now, &timeout_ts); + timespec2bintime(&timeout_ts, timeout); + } +} -#define FACCESS_VA_VALID 0x01 -#define FACCESS_DO_ACCESS 0x02 -#define FACCESS_STICKY 0x04 -#define FACCESS_CHOWN 0x08 -#define FACCESS_NOCHECKSPY 0x10 -#define FACCESS_SETGID 0x12 +/* + * Make a cacheable timeout value in timespec format based on the fuse_entry_out + * response + */ +static inline void +fuse_validity_2_timespec(const struct fuse_entry_out *feo, + struct timespec *timeout) +{ + struct timespec duration, now; -#define FACCESS_XQUERIES (FACCESS_STICKY | FACCESS_CHOWN | FACCESS_SETGID) + getnanouptime(&now); + /* "+ 2" is the bound of entry_valid_nsec + now.tv_nsec */ + if (feo->entry_valid >= INT_MAX || + feo->entry_valid + now.tv_sec + 2 >= INT_MAX) { + timeout->tv_sec = INT_MAX; + } else { + duration.tv_sec = feo->entry_valid; + duration.tv_nsec = feo->entry_valid_nsec; + timespecadd(&duration, &now, timeout); + } +} -struct fuse_access_param { - uid_t xuid; - gid_t xgid; - uint32_t facc_flags; -}; +/* VFS ops */ +int +fuse_internal_get_cached_vnode(struct mount*, ino_t, int, struct vnode**); + +/* access */ static inline int fuse_match_cred(struct ucred *basecred, struct ucred *usercred) { if (basecred->cr_uid == usercred->cr_uid 
&& basecred->cr_uid == usercred->cr_ruid && basecred->cr_uid == usercred->cr_svuid && basecred->cr_groups[0] == usercred->cr_groups[0] && basecred->cr_groups[0] == usercred->cr_rgid && basecred->cr_groups[0] == usercred->cr_svgid) return (0); return (EPERM); } -int fuse_internal_access(struct vnode *vp, mode_t mode, - struct fuse_access_param *facp, struct thread *td, struct ucred *cred); +int fuse_internal_access(struct vnode *vp, accmode_t mode, + struct thread *td, struct ucred *cred); /* attributes */ void fuse_internal_cache_attrs(struct vnode *vp, struct fuse_attr *attr, uint64_t attr_valid, uint32_t attr_valid_nsec, struct vattr *vap); /* fsync */ -int fuse_internal_fsync(struct vnode *vp, struct thread *td, - struct ucred *cred, struct fuse_filehandle *fufh); +int fuse_internal_fsync(struct vnode *vp, struct thread *td, int waitfor, + bool datasync); int fuse_internal_fsync_callback(struct fuse_ticket *tick, struct uio *uio); -/* readdir */ +/* getattr */ +int fuse_internal_do_getattr(struct vnode *vp, struct vattr *vap, + struct ucred *cred, struct thread *td); +int fuse_internal_getattr(struct vnode *vp, struct vattr *vap, + struct ucred *cred, struct thread *td); +/* asynchronous invalidation */ +int fuse_internal_invalidate_entry(struct mount *mp, struct uio *uio); +int fuse_internal_invalidate_inode(struct mount *mp, struct uio *uio); + +/* mknod */ +int fuse_internal_mknod(struct vnode *dvp, struct vnode **vpp, + struct componentname *cnp, struct vattr *vap); + +/* readdir */ struct pseudo_dirent { uint32_t d_namlen; }; +int fuse_internal_readdir(struct vnode *vp, struct uio *uio, off_t startoff, + struct fuse_filehandle *fufh, struct fuse_iov *cookediov, int *ncookies, + u_long *cookies); +int fuse_internal_readdir_processdata(struct uio *uio, off_t startoff, + int *fnd_start, size_t reqsize, void *buf, size_t bufsize, + struct fuse_iov *cookediov, int *ncookies, u_long **cookiesp); -int fuse_internal_readdir(struct vnode *vp, struct uio *uio, - 
struct fuse_filehandle *fufh, struct fuse_iov *cookediov); -int fuse_internal_readdir_processdata(struct uio *uio, size_t reqsize, - void *buf, size_t bufsize, void *param); - /* remove */ int fuse_internal_remove(struct vnode *dvp, struct vnode *vp, struct componentname *cnp, enum fuse_opcode op); /* rename */ int fuse_internal_rename(struct vnode *fdvp, struct componentname *fcnp, struct vnode *tdvp, struct componentname *tcnp); /* revoke */ void fuse_internal_vnode_disappear(struct vnode *vp); +/* setattr */ +int fuse_internal_setattr(struct vnode *vp, struct vattr *va, + struct thread *td, struct ucred *cred); + /* strategy */ /* entity creation */ static inline int fuse_internal_checkentry(struct fuse_entry_out *feo, enum vtype vtyp) { if (vtyp != IFTOVT(feo->attr.mode)) { return (EINVAL); } if (feo->nodeid == FUSE_NULL_ID) { return (EINVAL); } if (feo->nodeid == FUSE_ROOT_ID) { return (EINVAL); } return (0); } int fuse_internal_newentry(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, enum fuse_opcode op, void *buf, size_t bufsize, enum vtype vtyp); void fuse_internal_newentry_makerequest(struct mount *mp, uint64_t dnid, struct componentname *cnp, enum fuse_opcode op, void *buf, size_t bufsize, struct fuse_dispatcher *fdip); int fuse_internal_newentry_core(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, enum vtype vtyp, struct fuse_dispatcher *fdip); /* entity destruction */ int fuse_internal_forget_callback(struct fuse_ticket *tick, struct uio *uio); void fuse_internal_forget_send(struct mount *mp, struct thread *td, struct ucred *cred, uint64_t nodeid, uint64_t nlookup); /* fuse start/stop */ int fuse_internal_init_callback(struct fuse_ticket *tick, struct uio *uio); void fuse_internal_send_init(struct fuse_data *data, struct thread *td); + +/* module load/unload */ +void fuse_internal_init(void); +void fuse_internal_destroy(void); #endif /* _FUSE_INTERNAL_H_ */ Index: head/sys/fs/fuse/fuse_io.c 
=================================================================== --- head/sys/fs/fuse/fuse_io.c (revision 350664) +++ head/sys/fs/fuse/fuse_io.c (revision 350665) @@ -1,836 +1,1162 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. 
+ * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #include #include #include "fuse.h" #include "fuse_file.h" #include "fuse_node.h" #include "fuse_internal.h" #include "fuse_ipc.h" #include "fuse_io.h" -SDT_PROVIDER_DECLARE(fuse); /* + * Set in a struct buf to indicate that the write came from the buffer cache + * and the originating cred and pid are no longer known. 
+ */ +#define B_FUSEFS_WRITE_CACHE B_FS_FLAG1 + +SDT_PROVIDER_DECLARE(fusefs); +/* * Fuse trace probe: * arg0: verbosity. Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , io, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , io, trace, "int", "char*"); +static int +fuse_inval_buf_range(struct vnode *vp, off_t filesize, off_t start, off_t end); +static void +fuse_io_clear_suid_on_write(struct vnode *vp, struct ucred *cred, + struct thread *td); static int fuse_read_directbackend(struct vnode *vp, struct uio *uio, struct ucred *cred, struct fuse_filehandle *fufh); static int -fuse_read_biobackend(struct vnode *vp, struct uio *uio, - struct ucred *cred, struct fuse_filehandle *fufh); +fuse_read_biobackend(struct vnode *vp, struct uio *uio, int ioflag, + struct ucred *cred, struct fuse_filehandle *fufh, pid_t pid); static int fuse_write_directbackend(struct vnode *vp, struct uio *uio, - struct ucred *cred, struct fuse_filehandle *fufh, int ioflag); + struct ucred *cred, struct fuse_filehandle *fufh, off_t filesize, + int ioflag, bool pages); static int fuse_write_biobackend(struct vnode *vp, struct uio *uio, - struct ucred *cred, struct fuse_filehandle *fufh, int ioflag); + struct ucred *cred, struct fuse_filehandle *fufh, int ioflag, pid_t pid); -SDT_PROBE_DEFINE5(fuse, , io, io_dispatch, "struct vnode*", "struct uio*", +/* Invalidate a range of cached data, whether dirty of not */ +static int +fuse_inval_buf_range(struct vnode *vp, off_t filesize, off_t start, off_t end) +{ + struct buf *bp; + daddr_t left_lbn, end_lbn, right_lbn; + off_t new_filesize; + int iosize, left_on, right_on, right_blksize; + + iosize = fuse_iosize(vp); + left_lbn = start / iosize; + end_lbn = howmany(end, iosize); + left_on = start & (iosize - 1); + if (left_on != 0) { + bp = getblk(vp, left_lbn, iosize, PCATCH, 0, 0); + if ((bp->b_flags & B_CACHE) != 0 && bp->b_dirtyend >= left_on) { + /* + * Flush the dirty buffer, because we don't have a 
+ * byte-granular way to record which parts of the + * buffer are valid. + */ + bwrite(bp); + if (bp->b_error) + return (bp->b_error); + } else { + brelse(bp); + } + } + right_on = end & (iosize - 1); + if (right_on != 0) { + right_lbn = end / iosize; + new_filesize = MAX(filesize, end); + right_blksize = MIN(iosize, new_filesize - iosize * right_lbn); + bp = getblk(vp, right_lbn, right_blksize, PCATCH, 0, 0); + if ((bp->b_flags & B_CACHE) != 0 && bp->b_dirtyoff < right_on) { + /* + * Flush the dirty buffer, because we don't have a + * byte-granular way to record which parts of the + * buffer are valid. + */ + bwrite(bp); + if (bp->b_error) + return (bp->b_error); + } else { + brelse(bp); + } + } + + v_inval_buf_range(vp, left_lbn, end_lbn, iosize); + return (0); +} + +/* + * FreeBSD clears the SUID and SGID bits on any write by a non-root user. + */ +static void +fuse_io_clear_suid_on_write(struct vnode *vp, struct ucred *cred, + struct thread *td) +{ + struct fuse_data *data; + struct mount *mp; + struct vattr va; + int dataflags; + + mp = vnode_mount(vp); + data = fuse_get_mpdata(mp); + dataflags = data->dataflags; + + if (dataflags & FSESS_DEFAULT_PERMISSIONS) { + if (priv_check_cred(cred, PRIV_VFS_RETAINSUGID)) { + fuse_internal_getattr(vp, &va, cred, td); + if (va.va_mode & (S_ISUID | S_ISGID)) { + mode_t mode = va.va_mode & ~(S_ISUID | S_ISGID); + /* Clear all vattr fields except mode */ + vattr_null(&va); + va.va_mode = mode; + + /* + * Ignore fuse_internal_setattr's return value, + * because at this point the write operation has + * already succeeded and we don't want to return + * failing status for that. 
+ */ + (void)fuse_internal_setattr(vp, &va, td, NULL); + } + } + } +} + +SDT_PROBE_DEFINE5(fusefs, , io, io_dispatch, "struct vnode*", "struct uio*", "int", "struct ucred*", "struct fuse_filehandle*"); +SDT_PROBE_DEFINE4(fusefs, , io, io_dispatch_filehandles_closed, "struct vnode*", + "struct uio*", "int", "struct ucred*"); int fuse_io_dispatch(struct vnode *vp, struct uio *uio, int ioflag, - struct ucred *cred) + struct ucred *cred, pid_t pid) { struct fuse_filehandle *fufh; int err, directio; + int fflag; + bool closefufh = false; MPASS(vp->v_type == VREG || vp->v_type == VDIR); - err = fuse_filehandle_getrw(vp, - (uio->uio_rw == UIO_READ) ? FUFH_RDONLY : FUFH_WRONLY, &fufh); - if (err) { + fflag = (uio->uio_rw == UIO_READ) ? FREAD : FWRITE; + err = fuse_filehandle_getrw(vp, fflag, &fufh, cred, pid); + if (err == EBADF && vnode_mount(vp)->mnt_flag & MNT_EXPORTED) { + /* + * nfsd will do I/O without first doing VOP_OPEN. We + * must implicitly open the file here + */ + err = fuse_filehandle_open(vp, fflag, &fufh, curthread, cred); + closefufh = true; + } + else if (err) { + SDT_PROBE4(fusefs, , io, io_dispatch_filehandles_closed, + vp, uio, ioflag, cred); printf("FUSE: io dispatch: filehandles are closed\n"); return err; } - SDT_PROBE5(fuse, , io, io_dispatch, vp, uio, ioflag, cred, fufh); + if (err) + goto out; + SDT_PROBE5(fusefs, , io, io_dispatch, vp, uio, ioflag, cred, fufh); /* * Ideally, when the daemon asks for direct io at open time, the * standard file flag should be set according to this, so that would * just change the default mode, which later on could be changed via * fcntl(2). * But this doesn't work, the O_DIRECT flag gets cleared at some point * (don't know where). So to make any use of the Fuse direct_io option, * we hardwire it into the file's private data (similarly to Linux, * btw.). 
*/ directio = (ioflag & IO_DIRECT) || !fsess_opt_datacache(vnode_mount(vp)); switch (uio->uio_rw) { case UIO_READ: if (directio) { - SDT_PROBE2(fuse, , io, trace, 1, + SDT_PROBE2(fusefs, , io, trace, 1, "direct read of vnode"); err = fuse_read_directbackend(vp, uio, cred, fufh); } else { - SDT_PROBE2(fuse, , io, trace, 1, + SDT_PROBE2(fusefs, , io, trace, 1, "buffered read of vnode"); - err = fuse_read_biobackend(vp, uio, cred, fufh); + err = fuse_read_biobackend(vp, uio, ioflag, cred, fufh, + pid); } break; case UIO_WRITE: - /* - * Kludge: simulate write-through caching via write-around - * caching. Same effect, as far as never caching dirty data, - * but slightly pessimal in that newly written data is not - * cached. - */ - if (directio || fuse_data_cache_mode == FUSE_CACHE_WT) { - SDT_PROBE2(fuse, , io, trace, 1, + fuse_vnode_update(vp, FN_MTIMECHANGE | FN_CTIMECHANGE); + if (directio) { + off_t start, end, filesize; + + SDT_PROBE2(fusefs, , io, trace, 1, "direct write of vnode"); - err = fuse_write_directbackend(vp, uio, cred, fufh, ioflag); + + err = fuse_vnode_size(vp, &filesize, cred, curthread); + if (err) + goto out; + + start = uio->uio_offset; + end = start + uio->uio_resid; + KASSERT((ioflag & (IO_VMIO | IO_DIRECT)) != + (IO_VMIO | IO_DIRECT), + ("IO_DIRECT used for a cache flush?")); + /* Invalidate the write cache when writing directly */ + err = fuse_inval_buf_range(vp, filesize, start, end); + if (err) + return (err); + err = fuse_write_directbackend(vp, uio, cred, fufh, + filesize, ioflag, false); } else { - SDT_PROBE2(fuse, , io, trace, 1, + SDT_PROBE2(fusefs, , io, trace, 1, "buffered write of vnode"); - err = fuse_write_biobackend(vp, uio, cred, fufh, ioflag); + if (!fsess_opt_writeback(vnode_mount(vp))) + ioflag |= IO_SYNC; + err = fuse_write_biobackend(vp, uio, cred, fufh, ioflag, + pid); } + fuse_io_clear_suid_on_write(vp, cred, uio->uio_td); break; default: panic("uninterpreted mode passed to fuse_io_dispatch"); } +out: + if (closefufh) + 
fuse_filehandle_close(vp, fufh, curthread, cred); + return (err); } -SDT_PROBE_DEFINE3(fuse, , io, read_bio_backend_start, "int", "int", "int"); -SDT_PROBE_DEFINE2(fuse, , io, read_bio_backend_feed, "int", "int"); -SDT_PROBE_DEFINE3(fuse, , io, read_bio_backend_end, "int", "ssize_t", "int"); +SDT_PROBE_DEFINE4(fusefs, , io, read_bio_backend_start, "int", "int", "int", "int"); +SDT_PROBE_DEFINE2(fusefs, , io, read_bio_backend_feed, "int", "struct buf*"); +SDT_PROBE_DEFINE4(fusefs, , io, read_bio_backend_end, "int", "ssize_t", "int", + "struct buf*"); static int -fuse_read_biobackend(struct vnode *vp, struct uio *uio, - struct ucred *cred, struct fuse_filehandle *fufh) +fuse_read_biobackend(struct vnode *vp, struct uio *uio, int ioflag, + struct ucred *cred, struct fuse_filehandle *fufh, pid_t pid) { struct buf *bp; - daddr_t lbn; - int bcount; - int err = 0, n = 0, on = 0; + struct mount *mp; + struct fuse_data *data; + daddr_t lbn, nextlbn; + int bcount, nextsize; + int err, n = 0, on = 0, seqcount; off_t filesize; const int biosize = fuse_iosize(vp); + mp = vnode_mount(vp); + data = fuse_get_mpdata(mp); - if (uio->uio_resid == 0) - return (0); if (uio->uio_offset < 0) return (EINVAL); - bcount = biosize; - filesize = VTOFUD(vp)->filesize; + seqcount = ioflag >> IO_SEQSHIFT; - do { + err = fuse_vnode_size(vp, &filesize, cred, curthread); + if (err) + return err; + + for (err = 0, bp = NULL; uio->uio_resid > 0; bp = NULL) { if (fuse_isdeadfs(vp)) { err = ENXIO; break; } + if (filesize - uio->uio_offset <= 0) + break; lbn = uio->uio_offset / biosize; on = uio->uio_offset & (biosize - 1); - SDT_PROBE3(fuse, , io, read_bio_backend_start, - biosize, (int)lbn, on); - - /* - * Obtain the buffer cache block. Figure out the buffer size - * when we are at EOF. If we are modifying the size of the - * buffer based on an EOF condition we need to hold - * nfs_rslock() through obtaining the buffer to prevent - * a potential writer-appender from messing with n_size. 
- * Otherwise we may accidentally truncate the buffer and - * lose dirty data. - * - * Note that bcount is *not* DEV_BSIZE aligned. - */ if ((off_t)lbn * biosize >= filesize) { bcount = 0; } else if ((off_t)(lbn + 1) * biosize > filesize) { bcount = filesize - (off_t)lbn *biosize; + } else { + bcount = biosize; } - bp = getblk(vp, lbn, bcount, PCATCH, 0, 0); + nextlbn = lbn + 1; + nextsize = MIN(biosize, filesize - nextlbn * biosize); - if (!bp) - return (EINTR); + SDT_PROBE4(fusefs, , io, read_bio_backend_start, + biosize, (int)lbn, on, bcount); - /* - * If B_CACHE is not set, we must issue the read. If this - * fails, we return an error. - */ + if (bcount < biosize) { + /* If near EOF, don't do readahead */ + err = bread(vp, lbn, bcount, NOCRED, &bp); + } else if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERR) == 0) { + /* Try clustered read */ + long totread = uio->uio_resid + on; + seqcount = MIN(seqcount, + data->max_readahead_blocks + 1); + err = cluster_read(vp, filesize, lbn, bcount, NOCRED, + totread, seqcount, 0, &bp); + } else if (seqcount > 1 && data->max_readahead_blocks >= 1) { + /* Try non-clustered readahead */ + err = breadn(vp, lbn, bcount, &nextlbn, &nextsize, 1, + NOCRED, &bp); + } else { + /* Just read what was requested */ + err = bread(vp, lbn, bcount, NOCRED, &bp); + } - if ((bp->b_flags & B_CACHE) == 0) { - bp->b_iocmd = BIO_READ; - vfs_busy_pages(bp, 0); - err = fuse_io_strategy(vp, bp); - if (err) { - brelse(bp); - return (err); - } + if (err) { + brelse(bp); + bp = NULL; + break; } + /* * on is the offset into the current bp. Figure out how many * bytes we can copy out of the bp. Note that bcount is * NOT DEV_BSIZE aligned. * * Then figure out how many bytes we can copy into the uio. 
*/ n = 0; - if (on < bcount) - n = MIN((unsigned)(bcount - on), uio->uio_resid); + if (on < bcount - bp->b_resid) + n = MIN((unsigned)(bcount - bp->b_resid - on), + uio->uio_resid); if (n > 0) { - SDT_PROBE2(fuse, , io, read_bio_backend_feed, - n, n + (int)bp->b_resid); + SDT_PROBE2(fusefs, , io, read_bio_backend_feed, n, bp); err = uiomove(bp->b_data + on, n, uio); } - brelse(bp); - SDT_PROBE3(fuse, , io, read_bio_backend_end, err, - uio->uio_resid, n); - } while (err == 0 && uio->uio_resid > 0 && n > 0); + vfs_bio_brelse(bp, ioflag); + SDT_PROBE4(fusefs, , io, read_bio_backend_end, err, + uio->uio_resid, n, bp); + if (bp->b_resid > 0) { + /* Short read indicates EOF */ + break; + } + } return (err); } -SDT_PROBE_DEFINE1(fuse, , io, read_directbackend_start, "struct fuse_read_in*"); -SDT_PROBE_DEFINE2(fuse, , io, read_directbackend_complete, - "struct fuse_dispatcher*", "struct uio*"); +SDT_PROBE_DEFINE1(fusefs, , io, read_directbackend_start, + "struct fuse_read_in*"); +SDT_PROBE_DEFINE3(fusefs, , io, read_directbackend_complete, + "struct fuse_dispatcher*", "struct fuse_read_in*", "struct uio*"); static int fuse_read_directbackend(struct vnode *vp, struct uio *uio, struct ucred *cred, struct fuse_filehandle *fufh) { + struct fuse_data *data; struct fuse_dispatcher fdi; struct fuse_read_in *fri; int err = 0; + data = fuse_get_mpdata(vp->v_mount); + if (uio->uio_resid == 0) return (0); fdisp_init(&fdi, 0); /* * XXX In "normal" case we use an intermediate kernel buffer for * transmitting data from daemon's context to ours. Eventually, we should * get rid of this. Anyway, if the target uio lives in sysspace (we are * called from pageops), and the input data doesn't need kernel-side * processing (we are not called from readdir) we can already invoke * an optimized, "peer-to-peer" I/O routine. 
*/ while (uio->uio_resid > 0) { fdi.iosize = sizeof(*fri); fdisp_make_vp(&fdi, FUSE_READ, vp, uio->uio_td, cred); fri = fdi.indata; fri->fh = fufh->fh_id; fri->offset = uio->uio_offset; fri->size = MIN(uio->uio_resid, fuse_get_mpdata(vp->v_mount)->max_read); + if (fuse_libabi_geq(data, 7, 9)) { + /* See comment regarding FUSE_WRITE_LOCKOWNER */ + fri->read_flags = 0; + fri->flags = fufh_type_2_fflags(fufh->fufh_type); + } - SDT_PROBE1(fuse, , io, read_directbackend_start, fri); + SDT_PROBE1(fusefs, , io, read_directbackend_start, fri); if ((err = fdisp_wait_answ(&fdi))) goto out; - SDT_PROBE2(fuse, , io, read_directbackend_complete, - fdi.iosize, uio); + SDT_PROBE3(fusefs, , io, read_directbackend_complete, + &fdi, fri, uio); if ((err = uiomove(fdi.answ, MIN(fri->size, fdi.iosize), uio))) break; - if (fdi.iosize < fri->size) + if (fdi.iosize < fri->size) { + /* + * Short read. Should only happen at EOF or with + * direct io. + */ break; + } } out: fdisp_destroy(&fdi); return (err); } static int fuse_write_directbackend(struct vnode *vp, struct uio *uio, - struct ucred *cred, struct fuse_filehandle *fufh, int ioflag) + struct ucred *cred, struct fuse_filehandle *fufh, off_t filesize, + int ioflag, bool pages) { struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct fuse_data *data; struct fuse_write_in *fwi; + struct fuse_write_out *fwo; struct fuse_dispatcher fdi; size_t chunksize; + void *fwi_data; + off_t as_written_offset; int diff; int err = 0; + bool direct_io = fufh->fuse_open_flags & FOPEN_DIRECT_IO; + bool wrote_anything = false; + uint32_t write_flags; + data = fuse_get_mpdata(vp->v_mount); + + /* + * Don't set FUSE_WRITE_LOCKOWNER in write_flags. It can't be set + * accurately when using POSIX AIO, libfuse doesn't use it, and I'm not + * aware of any file systems that do. It was an attempt to add + * Linux-style mandatory locking to the FUSE protocol, but mandatory + * locking is deprecated even on Linux. 
See Linux commit + * f33321141b273d60cbb3a8f56a5489baad82ba5e . + */ + /* + * Set FUSE_WRITE_CACHE whenever we don't know the uid, gid, and/or pid + * that originated a write. For example when writing from the + * writeback cache. I don't know of a single file system that cares, + * but the protocol says we're supposed to do this. + */ + write_flags = !pages && ( + (ioflag & IO_DIRECT) || + !fsess_opt_datacache(vnode_mount(vp)) || + !fsess_opt_writeback(vnode_mount(vp))) ? 0 : FUSE_WRITE_CACHE; + if (uio->uio_resid == 0) return (0); + if (ioflag & IO_APPEND) - uio_setoffset(uio, fvdat->filesize); + uio_setoffset(uio, filesize); + if (vn_rlimit_fsize(vp, uio, uio->uio_td)) + return (EFBIG); + fdisp_init(&fdi, 0); while (uio->uio_resid > 0) { - chunksize = MIN(uio->uio_resid, - fuse_get_mpdata(vp->v_mount)->max_write); + chunksize = MIN(uio->uio_resid, data->max_write); fdi.iosize = sizeof(*fwi) + chunksize; fdisp_make_vp(&fdi, FUSE_WRITE, vp, uio->uio_td, cred); fwi = fdi.indata; fwi->fh = fufh->fh_id; fwi->offset = uio->uio_offset; fwi->size = chunksize; + fwi->write_flags = write_flags; + if (fuse_libabi_geq(data, 7, 9)) { + fwi->flags = fufh_type_2_fflags(fufh->fufh_type); + fwi_data = (char *)fdi.indata + sizeof(*fwi); + } else { + fwi_data = (char *)fdi.indata + + FUSE_COMPAT_WRITE_IN_SIZE; + } - if ((err = uiomove((char *)fdi.indata + sizeof(*fwi), - chunksize, uio))) + if ((err = uiomove(fwi_data, chunksize, uio))) break; - if ((err = fdisp_wait_answ(&fdi))) +retry: + err = fdisp_wait_answ(&fdi); + if (err == ERESTART || err == EINTR || err == EWOULDBLOCK) { + /* + * Rewind the uio so dofilewrite will know it's + * incomplete + */ + uio->uio_resid += fwi->size; + uio->uio_offset -= fwi->size; + /* + * Change ERESTART into EINTR because we can't rewind + * uio->uio_iov. Basically, once uiomove(9) has been + * called, it's impossible to restart a syscall. 
+ */ + if (err == ERESTART) + err = EINTR; break; + } else if (err) { + break; + } else { + wrote_anything = true; + } + fwo = ((struct fuse_write_out *)fdi.answ); + /* Adjust the uio in the case of short writes */ - diff = chunksize - ((struct fuse_write_out *)fdi.answ)->size; + diff = fwi->size - fwo->size; + as_written_offset = uio->uio_offset - diff; + + if (as_written_offset - diff > filesize) + fuse_vnode_setsize(vp, as_written_offset); + if (as_written_offset - diff >= filesize) + fvdat->flag &= ~FN_SIZECHANGE; + if (diff < 0) { + printf("WARNING: misbehaving FUSE filesystem " + "wrote more data than we provided it\n"); err = EINVAL; break; - } else if (diff > 0 && !(ioflag & IO_DIRECT)) { - /* - * XXX We really should be directly checking whether - * the file was opened with FOPEN_DIRECT_IO, not - * IO_DIRECT. IO_DIRECT can be set in multiple ways. - */ - SDT_PROBE2(fuse, , io, trace, 1, - "misbehaving filesystem: short writes are only " - "allowed with direct_io"); + } else if (diff > 0) { + /* Short write */ + if (!direct_io) { + printf("WARNING: misbehaving FUSE filesystem: " + "short writes are only allowed with " + "direct_io\n"); + } + if (ioflag & IO_DIRECT) { + /* Return early */ + uio->uio_resid += diff; + uio->uio_offset -= diff; + break; + } else { + /* Resend the unwritten portion of data */ + fdi.iosize = sizeof(*fwi) + diff; + /* Refresh fdi without clearing data buffer */ + fdisp_refresh_vp(&fdi, FUSE_WRITE, vp, + uio->uio_td, cred); + fwi = fdi.indata; + MPASS2(fwi == fdi.indata, "FUSE dispatcher " + "reallocated despite no increase in " + "size?"); + void *src = (char*)fwi_data + fwo->size; + memmove(fwi_data, src, diff); + fwi->fh = fufh->fh_id; + fwi->offset = as_written_offset; + fwi->size = diff; + fwi->write_flags = write_flags; + goto retry; + } } - uio->uio_resid += diff; - uio->uio_offset -= diff; - - if (uio->uio_offset > fvdat->filesize && - fuse_data_cache_mode != FUSE_CACHE_UC) { - fuse_vnode_setsize(vp, uio->uio_offset); - 
fvdat->flag &= ~FN_SIZECHANGE; - } } fdisp_destroy(&fdi); + if (wrote_anything) + fuse_vnode_undirty_cached_timestamps(vp); + return (err); } -SDT_PROBE_DEFINE6(fuse, , io, write_biobackend_start, "int64_t", "int", "int", +SDT_PROBE_DEFINE6(fusefs, , io, write_biobackend_start, "int64_t", "int", "int", "struct uio*", "int", "bool"); -SDT_PROBE_DEFINE2(fuse, , io, write_biobackend_append_race, "long", "int"); +SDT_PROBE_DEFINE2(fusefs, , io, write_biobackend_append_race, "long", "int"); +SDT_PROBE_DEFINE2(fusefs, , io, write_biobackend_issue, "int", "struct buf*"); static int fuse_write_biobackend(struct vnode *vp, struct uio *uio, - struct ucred *cred, struct fuse_filehandle *fufh, int ioflag) + struct ucred *cred, struct fuse_filehandle *fufh, int ioflag, pid_t pid) { struct fuse_vnode_data *fvdat = VTOFUD(vp); struct buf *bp; daddr_t lbn; + off_t filesize; int bcount; - int n, on, err = 0; + int n, on, seqcount, err = 0; + bool last_page; const int biosize = fuse_iosize(vp); - KASSERT(uio->uio_rw == UIO_WRITE, ("ncl_write mode")); + seqcount = ioflag >> IO_SEQSHIFT; + + KASSERT(uio->uio_rw == UIO_WRITE, ("fuse_write_biobackend mode")); if (vp->v_type != VREG) return (EIO); if (uio->uio_offset < 0) return (EINVAL); if (uio->uio_resid == 0) return (0); + + err = fuse_vnode_size(vp, &filesize, cred, curthread); + if (err) + return err; + if (ioflag & IO_APPEND) - uio_setoffset(uio, fvdat->filesize); + uio_setoffset(uio, filesize); - /* - * Find all of this file's B_NEEDCOMMIT buffers. If our writes - * would exceed the local maximum per-file write commit size when - * combined with those, we must decide whether to flush, - * go synchronous, or return err. We don't bother checking - * IO_UNIT -- we just make all writes atomic anyway, as there's - * no point optimizing for something that really won't ever happen. 
- */ + if (vn_rlimit_fsize(vp, uio, uio->uio_td)) + return (EFBIG); + do { + bool direct_append, extending; + if (fuse_isdeadfs(vp)) { err = ENXIO; break; } lbn = uio->uio_offset / biosize; on = uio->uio_offset & (biosize - 1); n = MIN((unsigned)(biosize - on), uio->uio_resid); again: - /* - * Handle direct append and file extension cases, calculate - * unaligned buffer size. - */ - if (uio->uio_offset == fvdat->filesize && n) { - /* - * Get the buffer (in its pre-append state to maintain - * B_CACHE if it was previously set). Resize the - * nfsnode after we have locked the buffer to prevent - * readers from reading garbage. - */ - bcount = on; - SDT_PROBE6(fuse, , io, write_biobackend_start, - lbn, on, n, uio, bcount, true); - bp = getblk(vp, lbn, bcount, PCATCH, 0, 0); - + /* Get or create a buffer for the write */ + direct_append = uio->uio_offset == filesize && n; + if (uio->uio_offset + n < filesize) { + extending = false; + if ((off_t)(lbn + 1) * biosize < filesize) { + /* Not the file's last block */ + bcount = biosize; + } else { + /* The file's last block */ + bcount = filesize - (off_t)lbn * biosize; + } + } else { + extending = true; + bcount = on + n; + } + if (howmany(((off_t)lbn * biosize + on + n - 1), PAGE_SIZE) >= + howmany(filesize, PAGE_SIZE)) + last_page = true; + else + last_page = false; + if (direct_append) { + /* + * Take care to preserve the buffer's B_CACHE state so + * as not to cause an unnecessary read. + */ + bp = getblk(vp, lbn, on, PCATCH, 0, 0); if (bp != NULL) { - long save; - - err = fuse_vnode_setsize(vp, - uio->uio_offset + n); - if (err) { - brelse(bp); - break; - } - save = bp->b_flags & B_CACHE; - bcount += n; + uint32_t save = bp->b_flags & B_CACHE; allocbuf(bp, bcount); bp->b_flags |= save; } } else { - /* - * Obtain the locked cache block first, and then - * adjust the file's size as appropriate. 
- */ - bcount = on + n; - if ((off_t)lbn * biosize + bcount < fvdat->filesize) { - if ((off_t)(lbn + 1) * biosize < fvdat->filesize) - bcount = biosize; - else - bcount = fvdat->filesize - - (off_t)lbn *biosize; - } - SDT_PROBE6(fuse, , io, write_biobackend_start, - lbn, on, n, uio, bcount, false); bp = getblk(vp, lbn, bcount, PCATCH, 0, 0); - if (bp && uio->uio_offset + n > fvdat->filesize) { - err = fuse_vnode_setsize(vp, - uio->uio_offset + n); - if (err) { - brelse(bp); - break; - } - } } - if (!bp) { err = EINTR; break; } + if (extending) { + /* + * Extend file _after_ locking buffer so we won't race + * with other readers + */ + err = fuse_vnode_setsize(vp, uio->uio_offset + n); + filesize = uio->uio_offset + n; + fvdat->flag |= FN_SIZECHANGE; + if (err) { + brelse(bp); + break; + } + } + + SDT_PROBE6(fusefs, , io, write_biobackend_start, + lbn, on, n, uio, bcount, direct_append); /* * Issue a READ if B_CACHE is not set. In special-append * mode, B_CACHE is based on the buffer prior to the write * op and is typically set, avoiding the read. If a read * is required in special append mode, the server will * probably send us a short-read since we extended the file * on our end, resulting in b_resid == 0 and, thusly, * B_CACHE getting set. * * We can also avoid issuing the read if the write covers * the entire buffer. We have to make sure the buffer state * is reasonable in this case since we will not be initiating * I/O. See the comments in kern/vfs_bio.c's getblk() for * more information. * * B_CACHE may also be set due to the buffer being cached * normally. */ if (on == 0 && n == bcount) { bp->b_flags |= B_CACHE; bp->b_flags &= ~B_INVAL; bp->b_ioflags &= ~BIO_ERROR; } if ((bp->b_flags & B_CACHE) == 0) { bp->b_iocmd = BIO_READ; vfs_busy_pages(bp, 0); fuse_io_strategy(vp, bp); if ((err = bp->b_error)) { brelse(bp); break; } + if (bp->b_resid > 0) { + /* + * Short read indicates EOF. Update file size + * from the server and try again. 
+ */ + SDT_PROBE2(fusefs, , io, trace, 1, + "Short read during a RMW"); + brelse(bp); + err = fuse_vnode_size(vp, &filesize, cred, + curthread); + if (err) + break; + else + goto again; + } } if (bp->b_wcred == NOCRED) bp->b_wcred = crhold(cred); /* * If dirtyend exceeds file size, chop it down. This should * not normally occur but there is an append race where it * might occur XXX, so we log it. * * If the chopping creates a reverse-indexed or degenerate * situation with dirtyoff/end, we 0 both of them. */ - if (bp->b_dirtyend > bcount) { - SDT_PROBE2(fuse, , io, write_biobackend_append_race, + SDT_PROBE2(fusefs, , io, write_biobackend_append_race, (long)bp->b_blkno * biosize, bp->b_dirtyend - bcount); bp->b_dirtyend = bcount; } if (bp->b_dirtyoff >= bp->b_dirtyend) bp->b_dirtyoff = bp->b_dirtyend = 0; /* * If the new write will leave a contiguous dirty * area, just update the b_dirtyoff and b_dirtyend, * otherwise force a write rpc of the old dirty area. * * While it is possible to merge discontiguous writes due to * our having a B_CACHE buffer ( and thus valid read data * for the hole), we don't because it could lead to * significant cache coherency problems with multiple clients, * especially if locking is implemented later on. * * as an optimization we could theoretically maintain * a linked list of discontinuous areas, but we would still * have to commit them separately so there isn't much * advantage to it except perhaps a bit of asynchronization. */ if (bp->b_dirtyend > 0 && (on > bp->b_dirtyend || (on + n) < bp->b_dirtyoff)) { /* * Yes, we mean it. Write out everything to "storage" * immediately, without hesitation. (Apart from other * reasons: the only way to know if a write is valid * if its actually written out.) 
*/ + SDT_PROBE2(fusefs, , io, write_biobackend_issue, 0, bp); bwrite(bp); if (bp->b_error == EINTR) { err = EINTR; break; } goto again; } err = uiomove((char *)bp->b_data + on, n, uio); - /* - * Since this block is being modified, it must be written - * again and not just committed. Since write clustering does - * not work for the stage 1 data write, only the stage 2 - * commit rpc, we have to clear B_CLUSTEROK as well. - */ - bp->b_flags &= ~(B_NEEDCOMMIT | B_CLUSTEROK); - if (err) { bp->b_ioflags |= BIO_ERROR; bp->b_error = err; brelse(bp); break; + /* TODO: vfs_bio_clrbuf like ffs_write does? */ } /* * Only update dirtyoff/dirtyend if not a degenerate * condition. */ if (n) { if (bp->b_dirtyend > 0) { bp->b_dirtyoff = MIN(on, bp->b_dirtyoff); bp->b_dirtyend = MAX((on + n), bp->b_dirtyend); } else { bp->b_dirtyoff = on; bp->b_dirtyend = on + n; } vfs_bio_set_valid(bp, on, n); } - err = bwrite(bp); + + vfs_bio_set_flags(bp, ioflag); + + bp->b_flags |= B_FUSEFS_WRITE_CACHE; + if (ioflag & IO_SYNC) { + SDT_PROBE2(fusefs, , io, write_biobackend_issue, 2, bp); + if (!(ioflag & IO_VMIO)) + bp->b_flags &= ~B_FUSEFS_WRITE_CACHE; + err = bwrite(bp); + } else if (vm_page_count_severe() || + buf_dirty_count_severe() || + (ioflag & IO_ASYNC)) { + bp->b_flags |= B_CLUSTEROK; + SDT_PROBE2(fusefs, , io, write_biobackend_issue, 3, bp); + bawrite(bp); + } else if (on == 0 && n == bcount) { + if ((vp->v_mount->mnt_flag & MNT_NOCLUSTERW) == 0) { + bp->b_flags |= B_CLUSTEROK; + SDT_PROBE2(fusefs, , io, write_biobackend_issue, + 4, bp); + cluster_write(vp, bp, filesize, seqcount, 0); + } else { + SDT_PROBE2(fusefs, , io, write_biobackend_issue, + 5, bp); + bawrite(bp); + } + } else if (ioflag & IO_DIRECT) { + bp->b_flags |= B_CLUSTEROK; + SDT_PROBE2(fusefs, , io, write_biobackend_issue, 6, bp); + bawrite(bp); + } else { + bp->b_flags &= ~B_CLUSTEROK; + SDT_PROBE2(fusefs, , io, write_biobackend_issue, 7, bp); + bdwrite(bp); + } if (err) break; } while (uio->uio_resid > 0 && n > 0); - 
if (fuse_sync_resize && (fvdat->flag & FN_SIZECHANGE) != 0) - fuse_vnode_savesize(vp, cred); - return (err); } int fuse_io_strategy(struct vnode *vp, struct buf *bp) { - struct fuse_filehandle *fufh; struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct fuse_filehandle *fufh; struct ucred *cred; struct uio *uiop; struct uio uio; struct iovec io; + off_t filesize; int error = 0; + int fflag; + /* We don't know the true pid when we're dealing with the cache */ + pid_t pid = 0; const int biosize = fuse_iosize(vp); MPASS(vp->v_type == VREG || vp->v_type == VDIR); MPASS(bp->b_iocmd == BIO_READ || bp->b_iocmd == BIO_WRITE); - error = fuse_filehandle_getrw(vp, - (bp->b_iocmd == BIO_READ) ? FUFH_RDONLY : FUFH_WRONLY, &fufh); + fflag = bp->b_iocmd == BIO_READ ? FREAD : FWRITE; + cred = bp->b_iocmd == BIO_READ ? bp->b_rcred : bp->b_wcred; + error = fuse_filehandle_getrw(vp, fflag, &fufh, cred, pid); + if (bp->b_iocmd == BIO_READ && error == EBADF) { + /* + * This may be a read-modify-write operation on a cached file + * opened O_WRONLY. The FUSE protocol allows this. + */ + error = fuse_filehandle_get(vp, FWRITE, &fufh, cred, pid); + } if (error) { printf("FUSE: strategy: filehandles are closed\n"); bp->b_ioflags |= BIO_ERROR; bp->b_error = error; + bufdone(bp); return (error); } - cred = bp->b_iocmd == BIO_READ ? bp->b_rcred : bp->b_wcred; uiop = &uio; uiop->uio_iov = &io; uiop->uio_iovcnt = 1; uiop->uio_segflg = UIO_SYSSPACE; uiop->uio_td = curthread; /* * clear BIO_ERROR and B_INVAL state prior to initiating the I/O. We * do this here so we do not have to do it in all the code that * calls us. 
*/ bp->b_flags &= ~B_INVAL; bp->b_ioflags &= ~BIO_ERROR; KASSERT(!(bp->b_flags & B_DONE), ("fuse_io_strategy: bp %p already marked done", bp)); if (bp->b_iocmd == BIO_READ) { + ssize_t left; + io.iov_len = uiop->uio_resid = bp->b_bcount; io.iov_base = bp->b_data; uiop->uio_rw = UIO_READ; - uiop->uio_offset = ((off_t)bp->b_blkno) * biosize; + uiop->uio_offset = ((off_t)bp->b_lblkno) * biosize; error = fuse_read_directbackend(vp, uiop, cred, fufh); + /* + * Store the amount we failed to read in the buffer's private + * field, so callers can truncate the file if necessary. + */ - /* XXXCEM: Potentially invalid access to cached_attrs here */ - if ((!error && uiop->uio_resid) || - (fsess_opt_brokenio(vnode_mount(vp)) && error == EIO && - uiop->uio_offset < fvdat->filesize && fvdat->filesize > 0 && - uiop->uio_offset >= fvdat->cached_attrs.va_size)) { - /* - * If we had a short read with no error, we must have - * hit a file hole. We should zero-fill the remainder. - * This can also occur if the server hits the file EOF. - * - * Holes used to be able to occur due to pending - * writes, but that is not possible any longer. - */ + if (!error && uiop->uio_resid) { int nread = bp->b_bcount - uiop->uio_resid; - int left = uiop->uio_resid; + left = uiop->uio_resid; + bzero((char *)bp->b_data + nread, left); - if (error != 0) { - printf("FUSE: Fix broken io: offset %ju, " - " resid %zd, file size %ju/%ju\n", - (uintmax_t)uiop->uio_offset, - uiop->uio_resid, fvdat->filesize, - fvdat->cached_attrs.va_size); - error = 0; + if ((fvdat->flag & FN_SIZECHANGE) == 0) { + /* + * A short read with no error, when not using + * direct io, and when no writes are cached, + * indicates EOF caused by a server-side + * truncation. Clear the attr cache so we'll + * pick up the new file size and timestamps. + * + * We must still bzero the remaining buffer so + * uninitialized data doesn't get exposed by a + * future truncate that extends the file.
+ * + * To prevent lock order problems, we must + * truncate the file upstack, not here. + */ + SDT_PROBE2(fusefs, , io, trace, 1, + "Short read of a clean file"); + fuse_vnode_clear_attr_cache(vp); + } else { + /* + * If dirty writes _are_ cached beyond EOF, + * that indicates a newly created hole that the + * server doesn't know about. Those don't pose + * any problem. + * XXX: we don't currently track whether dirty + * writes are cached beyond EOF, before EOF, or + * both. + */ + SDT_PROBE2(fusefs, , io, trace, 1, + "Short read of a dirty file"); + uiop->uio_resid = 0; } - if (left > 0) - bzero((char *)bp->b_data + nread, left); - uiop->uio_resid = 0; + } if (error) { bp->b_ioflags |= BIO_ERROR; bp->b_error = error; } } else { /* - * If we only need to commit, try to commit - */ - if (bp->b_flags & B_NEEDCOMMIT) { - SDT_PROBE2(fuse, , io, trace, 1, - "write: B_NEEDCOMMIT flags set"); - } - /* * Setup for actual write */ - if ((off_t)bp->b_blkno * biosize + bp->b_dirtyend > - fvdat->filesize) - bp->b_dirtyend = fvdat->filesize - - (off_t)bp->b_blkno * biosize; + error = fuse_vnode_size(vp, &filesize, cred, curthread); + if (error) { + bp->b_ioflags |= BIO_ERROR; + bp->b_error = error; + bufdone(bp); + return (error); + } + if ((off_t)bp->b_lblkno * biosize + bp->b_dirtyend > filesize) + bp->b_dirtyend = filesize - + (off_t)bp->b_lblkno * biosize; + if (bp->b_dirtyend > bp->b_dirtyoff) { io.iov_len = uiop->uio_resid = bp->b_dirtyend - bp->b_dirtyoff; - uiop->uio_offset = (off_t)bp->b_blkno * biosize + uiop->uio_offset = (off_t)bp->b_lblkno * biosize + bp->b_dirtyoff; io.iov_base = (char *)bp->b_data + bp->b_dirtyoff; uiop->uio_rw = UIO_WRITE; - error = fuse_write_directbackend(vp, uiop, cred, fufh, 0); + bool pages = bp->b_flags & B_FUSEFS_WRITE_CACHE; + error = fuse_write_directbackend(vp, uiop, cred, fufh, + filesize, 0, pages); - if (error == EINTR || error == ETIMEDOUT - || (!error && (bp->b_flags & B_NEEDCOMMIT))) { - + if (error == EINTR || error == 
ETIMEDOUT) { bp->b_flags &= ~(B_INVAL | B_NOCACHE); if ((bp->b_flags & B_PAGING) == 0) { bdirty(bp); bp->b_flags &= ~B_DONE; } if ((error == EINTR || error == ETIMEDOUT) && (bp->b_flags & B_ASYNC) == 0) bp->b_flags |= B_EINTR; } else { if (error) { bp->b_ioflags |= BIO_ERROR; bp->b_flags |= B_INVAL; bp->b_error = error; } bp->b_dirtyoff = bp->b_dirtyend = 0; } } else { bp->b_resid = 0; bufdone(bp); return (0); } } bp->b_resid = uiop->uio_resid; bufdone(bp); return (error); } int fuse_io_flushbuf(struct vnode *vp, int waitfor, struct thread *td) { return (vn_fsync_buf(vp, waitfor)); } /* * Flush and invalidate all dirty buffers. If another process is already * doing the flush, just wait for completion. */ int fuse_io_invalbuf(struct vnode *vp, struct thread *td) { struct fuse_vnode_data *fvdat = VTOFUD(vp); int error = 0; if (vp->v_iflag & VI_DOOMED) return 0; ASSERT_VOP_ELOCKED(vp, "fuse_io_invalbuf"); while (fvdat->flag & FN_FLUSHINPROG) { struct proc *p = td->td_proc; if (vp->v_mount->mnt_kern_flag & MNTK_UNMOUNTF) return EIO; fvdat->flag |= FN_FLUSHWANT; tsleep(&fvdat->flag, PRIBIO + 2, "fusevinv", 2 * hz); error = 0; if (p != NULL) { PROC_LOCK(p); if (SIGNOTEMPTY(p->p_siglist) || SIGNOTEMPTY(td->td_siglist)) error = EINTR; PROC_UNLOCK(p); } if (error == EINTR) return EINTR; } fvdat->flag |= FN_FLUSHINPROG; if (vp->v_bufobj.bo_object != NULL) { VM_OBJECT_WLOCK(vp->v_bufobj.bo_object); vm_object_page_clean(vp->v_bufobj.bo_object, 0, 0, OBJPC_SYNC); VM_OBJECT_WUNLOCK(vp->v_bufobj.bo_object); } error = vinvalbuf(vp, V_SAVE, PCATCH, 0); while (error) { if (error == ERESTART || error == EINTR) { fvdat->flag &= ~FN_FLUSHINPROG; if (fvdat->flag & FN_FLUSHWANT) { fvdat->flag &= ~FN_FLUSHWANT; wakeup(&fvdat->flag); } return EINTR; } error = vinvalbuf(vp, V_SAVE, PCATCH, 0); } fvdat->flag &= ~FN_FLUSHINPROG; if (fvdat->flag & FN_FLUSHWANT) { fvdat->flag &= ~FN_FLUSHWANT; wakeup(&fvdat->flag); } return (error); } Index: head/sys/fs/fuse/fuse_io.h 
=================================================================== --- head/sys/fs/fuse/fuse_io.h (revision 350664) +++ head/sys/fs/fuse/fuse_io.h (revision 350665) @@ -1,69 +1,74 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _FUSE_IO_H_ #define _FUSE_IO_H_ int fuse_io_dispatch(struct vnode *vp, struct uio *uio, int ioflag, - struct ucred *cred); + struct ucred *cred, pid_t pid); int fuse_io_strategy(struct vnode *vp, struct buf *bp); int fuse_io_flushbuf(struct vnode *vp, int waitfor, struct thread *td); int fuse_io_invalbuf(struct vnode *vp, struct thread *td); #endif /* _FUSE_IO_H_ */ Index: head/sys/fs/fuse/fuse_ipc.c =================================================================== --- head/sys/fs/fuse/fuse_ipc.c (revision 350664) +++ head/sys/fs/fuse/fuse_ipc.c (revision 350665) @@ -1,824 +1,1097 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_node.h" #include "fuse_ipc.h" #include "fuse_internal.h" -SDT_PROVIDER_DECLARE(fuse); +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. 
Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , ipc, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , ipc, trace, "int", "char*"); +static void fdisp_make_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, + struct fuse_data *data, uint64_t nid, pid_t pid, struct ucred *cred); +static void fuse_interrupt_send(struct fuse_ticket *otick, int err); static struct fuse_ticket *fticket_alloc(struct fuse_data *data); static void fticket_refresh(struct fuse_ticket *ftick); static void fticket_destroy(struct fuse_ticket *ftick); static int fticket_wait_answer(struct fuse_ticket *ftick); static inline int fticket_aw_pull_uio(struct fuse_ticket *ftick, struct uio *uio); static int fuse_body_audit(struct fuse_ticket *ftick, size_t blen); static fuse_handler_t fuse_standard_handler; -SYSCTL_NODE(_vfs, OID_AUTO, fusefs, CTLFLAG_RW, 0, "FUSE tunables"); -SYSCTL_STRING(_vfs_fusefs, OID_AUTO, version, CTLFLAG_RD, - FUSE_FREEBSD_VERSION, 0, "fuse-freebsd version"); -static int fuse_ticket_count = 0; +static counter_u64_t fuse_ticket_count; +SYSCTL_COUNTER_U64(_vfs_fusefs_stats, OID_AUTO, ticket_count, CTLFLAG_RD, + &fuse_ticket_count, "Number of allocated tickets"); -SYSCTL_INT(_vfs_fusefs, OID_AUTO, ticket_count, CTLFLAG_RW, - &fuse_ticket_count, 0, "number of allocated tickets"); static long fuse_iov_permanent_bufsize = 1 << 19; SYSCTL_LONG(_vfs_fusefs, OID_AUTO, iov_permanent_bufsize, CTLFLAG_RW, &fuse_iov_permanent_bufsize, 0, "limit for permanently stored buffer size for fuse_iovs"); static int fuse_iov_credit = 16; SYSCTL_INT(_vfs_fusefs, OID_AUTO, iov_credit, CTLFLAG_RW, &fuse_iov_credit, 0, "how many times is an oversized fuse_iov tolerated"); MALLOC_DEFINE(M_FUSEMSG, "fuse_msgbuf", "fuse message buffer"); static uma_zone_t ticket_zone; -static void -fuse_block_sigs(sigset_t *oldset) +/* + * TODO: figure out how to time out INTERRUPT requests, because the daemon may + * legally never respond + */ +static int
+fuse_interrupt_callback(struct fuse_ticket *tick, struct uio *uio) { - sigset_t newset; + struct fuse_ticket *otick, *x_tick; + struct fuse_interrupt_in *fii; + struct fuse_data *data = tick->tk_data; + bool found = false; - SIGFILLSET(newset); - SIGDELSET(newset, SIGKILL); - if (kern_sigprocmask(curthread, SIG_BLOCK, &newset, oldset, 0)) - panic("%s: Invalid operation for kern_sigprocmask()", - __func__); + fii = (struct fuse_interrupt_in*)((char*)tick->tk_ms_fiov.base + + sizeof(struct fuse_in_header)); + + fuse_lck_mtx_lock(data->aw_mtx); + TAILQ_FOREACH_SAFE(otick, &data->aw_head, tk_aw_link, x_tick) { + if (otick->tk_unique == fii->unique) { + found = true; + break; + } + } + fuse_lck_mtx_unlock(data->aw_mtx); + + if (!found) { + /* Original is already complete. Just return */ + return 0; + } + + /* Clear the original ticket's interrupt association */ + otick->irq_unique = 0; + + if (tick->tk_aw_ohead.error == ENOSYS) { + fsess_set_notimpl(data->mp, FUSE_INTERRUPT); + return 0; + } else if (tick->tk_aw_ohead.error == EAGAIN) { + /* + * There are two reasons we might get this: + * 1) the daemon received the INTERRUPT request before the + * original, or + * 2) the daemon received the INTERRUPT request after it + * completed the original request. + * In the first case we should re-send the INTERRUPT. In the + * second, we should ignore it. + */ + /* Resend */ + fuse_interrupt_send(otick, EINTR); + return 0; + } else { + /* Illegal FUSE_INTERRUPT response */ + return EINVAL; + } } -static void -fuse_restore_sigs(sigset_t *oldset) +/* Interrupt the operation otick. 
Return err as its error code */ +void +fuse_interrupt_send(struct fuse_ticket *otick, int err) { + struct fuse_dispatcher fdi; + struct fuse_interrupt_in *fii; + struct fuse_in_header *ftick_hdr; + struct fuse_data *data = otick->tk_data; + struct fuse_ticket *tick, *xtick; + struct ucred reused_creds; + gid_t reused_groups[1]; - if (kern_sigprocmask(curthread, SIG_SETMASK, oldset, NULL, 0)) - panic("%s: Invalid operation for kern_sigprocmask()", - __func__); + if (otick->irq_unique == 0) { + /* + * If the daemon hasn't yet received otick, then we can answer + * it ourselves and return. + */ + fuse_lck_mtx_lock(data->ms_mtx); + STAILQ_FOREACH_SAFE(tick, &otick->tk_data->ms_head, tk_ms_link, + xtick) { + if (tick == otick) { + STAILQ_REMOVE(&otick->tk_data->ms_head, tick, + fuse_ticket, tk_ms_link); + otick->tk_data->ms_count--; + otick->tk_ms_link.stqe_next = NULL; + fuse_lck_mtx_unlock(data->ms_mtx); + + fuse_lck_mtx_lock(otick->tk_aw_mtx); + if (!fticket_answered(otick)) { + fticket_set_answered(otick); + otick->tk_aw_errno = err; + wakeup(otick); + } + fuse_lck_mtx_unlock(otick->tk_aw_mtx); + + fuse_ticket_drop(tick); + return; + } + } + fuse_lck_mtx_unlock(data->ms_mtx); + + /* + * If the fuse daemon doesn't support interrupts, then there's + * nothing more that we can do + */ + if (!fsess_isimpl(data->mp, FUSE_INTERRUPT)) + return; + + /* + * If the fuse daemon has already received otick, then we must + * send FUSE_INTERRUPT. 
+ */ + ftick_hdr = fticket_in_header(otick); + reused_creds.cr_uid = ftick_hdr->uid; + reused_groups[0] = ftick_hdr->gid; + reused_creds.cr_groups = reused_groups; + fdisp_init(&fdi, sizeof(*fii)); + fdisp_make_pid(&fdi, FUSE_INTERRUPT, data, ftick_hdr->nodeid, + ftick_hdr->pid, &reused_creds); + + fii = fdi.indata; + fii->unique = otick->tk_unique; + fuse_insert_callback(fdi.tick, fuse_interrupt_callback); + + otick->irq_unique = fdi.tick->tk_unique; + /* Interrupt ops should be delivered ASAP */ + fuse_insert_message(fdi.tick, true); + fdisp_destroy(&fdi); + } else { + /* This ticket has already been interrupted */ + } } void fiov_init(struct fuse_iov *fiov, size_t size) { uint32_t msize = FU_AT_LEAST(size); fiov->len = 0; fiov->base = malloc(msize, M_FUSEMSG, M_WAITOK | M_ZERO); fiov->allocated_size = msize; fiov->credit = fuse_iov_credit; } void fiov_teardown(struct fuse_iov *fiov) { MPASS(fiov->base != NULL); free(fiov->base, M_FUSEMSG); } void fiov_adjust(struct fuse_iov *fiov, size_t size) { if (fiov->allocated_size < size || (fuse_iov_permanent_bufsize >= 0 && fiov->allocated_size - size > fuse_iov_permanent_bufsize && --fiov->credit < 0)) { fiov->base = realloc(fiov->base, FU_AT_LEAST(size), M_FUSEMSG, M_WAITOK | M_ZERO); if (!fiov->base) { panic("FUSE: realloc failed"); } fiov->allocated_size = FU_AT_LEAST(size); fiov->credit = fuse_iov_credit; + /* Clear data buffer after reallocation */ + bzero(fiov->base, size); + } else if (size > fiov->len) { + /* Clear newly extended portion of data buffer */ + bzero((char*)fiov->base + fiov->len, size - fiov->len); } fiov->len = size; } +/* Resize the fiov if needed, and clear its buffer */ void fiov_refresh(struct fuse_iov *fiov) { - bzero(fiov->base, fiov->len); fiov_adjust(fiov, 0); } static int fticket_ctor(void *mem, int size, void *arg, int flags) { struct fuse_ticket *ftick = mem; struct fuse_data *data = arg; FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); ftick->tk_data = data; if
(ftick->tk_unique != 0) fticket_refresh(ftick); /* May be truncated to 32 bits */ ftick->tk_unique = atomic_fetchadd_long(&data->ticketer, 1); if (ftick->tk_unique == 0) ftick->tk_unique = atomic_fetchadd_long(&data->ticketer, 1); + ftick->irq_unique = 0; + refcount_init(&ftick->tk_refcount, 1); - atomic_add_acq_int(&fuse_ticket_count, 1); + counter_u64_add(fuse_ticket_count, 1); return 0; } static void fticket_dtor(void *mem, int size, void *arg) { #ifdef INVARIANTS struct fuse_ticket *ftick = mem; #endif FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); - atomic_subtract_acq_int(&fuse_ticket_count, 1); + counter_u64_add(fuse_ticket_count, -1); } static int fticket_init(void *mem, int size, int flags) { struct fuse_ticket *ftick = mem; bzero(ftick, sizeof(struct fuse_ticket)); fiov_init(&ftick->tk_ms_fiov, sizeof(struct fuse_in_header)); ftick->tk_ms_type = FT_M_FIOV; mtx_init(&ftick->tk_aw_mtx, "fuse answer delivery mutex", NULL, MTX_DEF); fiov_init(&ftick->tk_aw_fiov, 0); ftick->tk_aw_type = FT_A_FIOV; return 0; } static void fticket_fini(void *mem, int size) { struct fuse_ticket *ftick = mem; fiov_teardown(&ftick->tk_ms_fiov); fiov_teardown(&ftick->tk_aw_fiov); mtx_destroy(&ftick->tk_aw_mtx); } static inline struct fuse_ticket * fticket_alloc(struct fuse_data *data) { return uma_zalloc_arg(ticket_zone, data, M_WAITOK); } static inline void fticket_destroy(struct fuse_ticket *ftick) { return uma_zfree(ticket_zone, ftick); } -static inline +static inline void fticket_refresh(struct fuse_ticket *ftick) { FUSE_ASSERT_MS_DONE(ftick); FUSE_ASSERT_AW_DONE(ftick); fiov_refresh(&ftick->tk_ms_fiov); ftick->tk_ms_bufdata = NULL; ftick->tk_ms_bufsize = 0; ftick->tk_ms_type = FT_M_FIOV; bzero(&ftick->tk_aw_ohead, sizeof(struct fuse_out_header)); fiov_refresh(&ftick->tk_aw_fiov); ftick->tk_aw_errno = 0; ftick->tk_aw_bufdata = NULL; ftick->tk_aw_bufsize = 0; ftick->tk_aw_type = FT_A_FIOV; ftick->tk_flag = 0; } +/* Prepare the ticket to be reused, but don't clear its data
buffers */ +static inline void +fticket_reset(struct fuse_ticket *ftick) +{ + FUSE_ASSERT_MS_DONE(ftick); + FUSE_ASSERT_AW_DONE(ftick); + + ftick->tk_ms_bufdata = NULL; + ftick->tk_ms_bufsize = 0; + ftick->tk_ms_type = FT_M_FIOV; + + bzero(&ftick->tk_aw_ohead, sizeof(struct fuse_out_header)); + + ftick->tk_aw_errno = 0; + ftick->tk_aw_bufdata = NULL; + ftick->tk_aw_bufsize = 0; + ftick->tk_aw_type = FT_A_FIOV; + + ftick->tk_flag = 0; +} + static int fticket_wait_answer(struct fuse_ticket *ftick) { - sigset_t tset; - int err = 0; - struct fuse_data *data; + struct thread *td = curthread; + sigset_t blockedset, oldset; + int err = 0, stops_deferred; + struct fuse_data *data = ftick->tk_data; + bool interrupted = false; + if (fsess_isimpl(ftick->tk_data->mp, FUSE_INTERRUPT) && + data->dataflags & FSESS_INTR) { + SIGEMPTYSET(blockedset); + } else { + /* Block all signals except (implicitly) SIGKILL */ + SIGFILLSET(blockedset); + } + stops_deferred = sigdeferstop(SIGDEFERSTOP_SILENT); + kern_sigprocmask(td, SIG_BLOCK, NULL, &oldset, 0); + fuse_lck_mtx_lock(ftick->tk_aw_mtx); +retry: if (fticket_answered(ftick)) { goto out; } - data = ftick->tk_data; if (fdata_get_dead(data)) { err = ENOTCONN; fticket_set_answered(ftick); goto out; } - fuse_block_sigs(&tset); + kern_sigprocmask(td, SIG_BLOCK, &blockedset, NULL, 0); err = msleep(ftick, &ftick->tk_aw_mtx, PCATCH, "fu_ans", data->daemon_timeout * hz); - fuse_restore_sigs(&tset); - if (err == EAGAIN) { /* same as EWOULDBLOCK */ + kern_sigprocmask(td, SIG_SETMASK, &oldset, NULL, 0); + if (err == EWOULDBLOCK) { + SDT_PROBE2(fusefs, , ipc, trace, 3, + "fticket_wait_answer: EWOULDBLOCK"); #ifdef XXXIP /* die conditionally */ if (!fdata_get_dead(data)) { fdata_set_dead(data); } #endif err = ETIMEDOUT; fticket_set_answered(ftick); + } else if ((err == EINTR || err == ERESTART)) { + /* + * Whether we get EINTR or ERESTART depends on whether + * SA_RESTART was set by sigaction(2). 
+ * + * Try to interrupt the operation and wait for an EINTR response + * to the original operation. If the file system does not + * support FUSE_INTERRUPT, then we'll just wait for it to + * complete like normal. If it does support FUSE_INTERRUPT, + * then it will either respond EINTR to the original operation, + * or EAGAIN to the interrupt. + */ + sigset_t tmpset; + + SDT_PROBE2(fusefs, , ipc, trace, 4, + "fticket_wait_answer: interrupt"); + fuse_lck_mtx_unlock(ftick->tk_aw_mtx); + fuse_interrupt_send(ftick, err); + + PROC_LOCK(td->td_proc); + mtx_lock(&td->td_proc->p_sigacts->ps_mtx); + tmpset = td->td_proc->p_siglist; + SIGSETOR(tmpset, td->td_siglist); + mtx_unlock(&td->td_proc->p_sigacts->ps_mtx); + PROC_UNLOCK(td->td_proc); + + fuse_lck_mtx_lock(ftick->tk_aw_mtx); + if (!interrupted && !SIGISMEMBER(tmpset, SIGKILL)) { + /* + * Block all signals while we wait for an interrupt + * response. The protocol doesn't discriminate between + * different signals. + */ + SIGFILLSET(blockedset); + interrupted = true; + goto retry; + } else { + /* + * Return immediately for fatal signals, or if this is + * the second interruption. We should only be + * interrupted twice if the thread is stopped, for + * example during sigexit. 
+ */ + } + } else if (err) { + SDT_PROBE2(fusefs, , ipc, trace, 6, + "fticket_wait_answer: other error"); + } else { + SDT_PROBE2(fusefs, , ipc, trace, 7, "fticket_wait_answer: OK"); } out: if (!(err || fticket_answered(ftick))) { - SDT_PROBE2(fuse, , ipc, trace, 1, + SDT_PROBE2(fusefs, , ipc, trace, 1, "FUSE: requester was woken up but still no answer"); err = ENXIO; } fuse_lck_mtx_unlock(ftick->tk_aw_mtx); + sigallowstop(stops_deferred); return err; } static inline int fticket_aw_pull_uio(struct fuse_ticket *ftick, struct uio *uio) { int err = 0; size_t len = uio_resid(uio); if (len) { switch (ftick->tk_aw_type) { case FT_A_FIOV: fiov_adjust(fticket_resp(ftick), len); err = uiomove(fticket_resp(ftick)->base, len, uio); break; case FT_A_BUF: ftick->tk_aw_bufsize = len; err = uiomove(ftick->tk_aw_bufdata, len, uio); break; default: panic("FUSE: unknown answer type for ticket %p", ftick); } } return err; } int fticket_pull(struct fuse_ticket *ftick, struct uio *uio) { int err = 0; if (ftick->tk_aw_ohead.error) { return 0; } err = fuse_body_audit(ftick, uio_resid(uio)); if (!err) { err = fticket_aw_pull_uio(ftick, uio); } return err; } struct fuse_data * fdata_alloc(struct cdev *fdev, struct ucred *cred) { struct fuse_data *data; data = malloc(sizeof(struct fuse_data), M_FUSEMSG, M_WAITOK | M_ZERO); data->fdev = fdev; mtx_init(&data->ms_mtx, "fuse message list mutex", NULL, MTX_DEF); STAILQ_INIT(&data->ms_head); + data->ms_count = 0; + knlist_init_mtx(&data->ks_rsel.si_note, &data->ms_mtx); mtx_init(&data->aw_mtx, "fuse answer list mutex", NULL, MTX_DEF); TAILQ_INIT(&data->aw_head); data->daemoncred = crhold(cred); data->daemon_timeout = FUSE_DEFAULT_DAEMON_TIMEOUT; sx_init(&data->rename_lock, "fuse rename lock"); data->ref = 1; return data; } void fdata_trydestroy(struct fuse_data *data) { data->ref--; MPASS(data->ref >= 0); if (data->ref != 0) return; /* Driving off stage all that stuff thrown at device... 
*/ - mtx_destroy(&data->ms_mtx); - mtx_destroy(&data->aw_mtx); sx_destroy(&data->rename_lock); - crfree(data->daemoncred); + mtx_destroy(&data->aw_mtx); + knlist_delete(&data->ks_rsel.si_note, curthread, 0); + knlist_destroy(&data->ks_rsel.si_note); + mtx_destroy(&data->ms_mtx); free(data, M_FUSEMSG); } void fdata_set_dead(struct fuse_data *data) { FUSE_LOCK(); if (fdata_get_dead(data)) { FUSE_UNLOCK(); return; } fuse_lck_mtx_lock(data->ms_mtx); data->dataflags |= FSESS_DEAD; wakeup_one(data); selwakeuppri(&data->ks_rsel, PZERO + 1); wakeup(&data->ticketer); fuse_lck_mtx_unlock(data->ms_mtx); FUSE_UNLOCK(); } struct fuse_ticket * fuse_ticket_fetch(struct fuse_data *data) { int err = 0; struct fuse_ticket *ftick; ftick = fticket_alloc(data); if (!(data->dataflags & FSESS_INITED)) { /* Sleep until get answer for INIT messsage */ FUSE_LOCK(); if (!(data->dataflags & FSESS_INITED) && data->ticketer > 2) { err = msleep(&data->ticketer, &fuse_mtx, PCATCH | PDROP, "fu_ini", 0); if (err) fdata_set_dead(data); } else FUSE_UNLOCK(); } return ftick; } int fuse_ticket_drop(struct fuse_ticket *ftick) { int die; die = refcount_release(&ftick->tk_refcount); if (die) fticket_destroy(ftick); return die; } void fuse_insert_callback(struct fuse_ticket *ftick, fuse_handler_t * handler) { if (fdata_get_dead(ftick->tk_data)) { return; } ftick->tk_aw_handler = handler; fuse_lck_mtx_lock(ftick->tk_data->aw_mtx); fuse_aw_push(ftick); fuse_lck_mtx_unlock(ftick->tk_data->aw_mtx); } +/* + * Insert a new upgoing ticket into the message queue + * + * If urgent is true, insert at the front of the queue. Otherwise, insert in + * FIFO order. 
+ */ void -fuse_insert_message(struct fuse_ticket *ftick) +fuse_insert_message(struct fuse_ticket *ftick, bool urgent) { if (ftick->tk_flag & FT_DIRTY) { panic("FUSE: ticket reused without being refreshed"); } ftick->tk_flag |= FT_DIRTY; if (fdata_get_dead(ftick->tk_data)) { return; } fuse_lck_mtx_lock(ftick->tk_data->ms_mtx); - fuse_ms_push(ftick); + if (urgent) + fuse_ms_push_head(ftick); + else + fuse_ms_push(ftick); wakeup_one(ftick->tk_data); selwakeuppri(&ftick->tk_data->ks_rsel, PZERO + 1); + KNOTE_LOCKED(&ftick->tk_data->ks_rsel.si_note, 0); fuse_lck_mtx_unlock(ftick->tk_data->ms_mtx); } static int fuse_body_audit(struct fuse_ticket *ftick, size_t blen) { int err = 0; enum fuse_opcode opcode; opcode = fticket_opcode(ftick); switch (opcode) { + case FUSE_BMAP: + err = (blen == sizeof(struct fuse_bmap_out)) ? 0 : EINVAL; + break; + + case FUSE_LINK: case FUSE_LOOKUP: - err = (blen == sizeof(struct fuse_entry_out)) ? 0 : EINVAL; + case FUSE_MKDIR: + case FUSE_MKNOD: + case FUSE_SYMLINK: + if (fuse_libabi_geq(ftick->tk_data, 7, 9)) { + err = (blen == sizeof(struct fuse_entry_out)) ? + 0 : EINVAL; + } else { + err = (blen == FUSE_COMPAT_ENTRY_OUT_SIZE) ? 0 : EINVAL; + } break; case FUSE_FORGET: panic("FUSE: a handler has been intalled for FUSE_FORGET"); break; case FUSE_GETATTR: - err = (blen == sizeof(struct fuse_attr_out)) ? 0 : EINVAL; - break; - case FUSE_SETATTR: - err = (blen == sizeof(struct fuse_attr_out)) ? 0 : EINVAL; + if (fuse_libabi_geq(ftick->tk_data, 7, 9)) { + err = (blen == sizeof(struct fuse_attr_out)) ? + 0 : EINVAL; + } else { + err = (blen == FUSE_COMPAT_ATTR_OUT_SIZE) ? 0 : EINVAL; + } break; case FUSE_READLINK: err = (PAGE_SIZE >= blen) ? 0 : EINVAL; break; - case FUSE_SYMLINK: - err = (blen == sizeof(struct fuse_entry_out)) ? 0 : EINVAL; - break; - - case FUSE_MKNOD: - err = (blen == sizeof(struct fuse_entry_out)) ? 0 : EINVAL; - break; - - case FUSE_MKDIR: - err = (blen == sizeof(struct fuse_entry_out)) ? 
0 : EINVAL; - break; - case FUSE_UNLINK: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_RMDIR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_RENAME: err = (blen == 0) ? 0 : EINVAL; break; - case FUSE_LINK: - err = (blen == sizeof(struct fuse_entry_out)) ? 0 : EINVAL; - break; - case FUSE_OPEN: err = (blen == sizeof(struct fuse_open_out)) ? 0 : EINVAL; break; case FUSE_READ: err = (((struct fuse_read_in *)( (char *)ftick->tk_ms_fiov.base + sizeof(struct fuse_in_header) ))->size >= blen) ? 0 : EINVAL; break; case FUSE_WRITE: err = (blen == sizeof(struct fuse_write_out)) ? 0 : EINVAL; break; case FUSE_STATFS: if (fuse_libabi_geq(ftick->tk_data, 7, 4)) { err = (blen == sizeof(struct fuse_statfs_out)) ? 0 : EINVAL; } else { err = (blen == FUSE_COMPAT_STATFS_SIZE) ? 0 : EINVAL; } break; case FUSE_RELEASE: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_FSYNC: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_SETXATTR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_GETXATTR: case FUSE_LISTXATTR: /* * These can have varying response lengths, and 0 length * isn't necessarily invalid. */ err = 0; break; case FUSE_REMOVEXATTR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_FLUSH: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_INIT: - if (blen == sizeof(struct fuse_init_out) || blen == 8) { + if (blen == sizeof(struct fuse_init_out) || + blen == FUSE_COMPAT_INIT_OUT_SIZE || + blen == FUSE_COMPAT_22_INIT_OUT_SIZE) { err = 0; } else { err = EINVAL; } break; case FUSE_OPENDIR: err = (blen == sizeof(struct fuse_open_out)) ? 0 : EINVAL; break; case FUSE_READDIR: err = (((struct fuse_read_in *)( (char *)ftick->tk_ms_fiov.base + sizeof(struct fuse_in_header) ))->size >= blen) ? 0 : EINVAL; break; case FUSE_RELEASEDIR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_FSYNCDIR: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_GETLK: - panic("FUSE: no response body format check for FUSE_GETLK"); + err = (blen == sizeof(struct fuse_lk_out)) ? 
0 : EINVAL; break; case FUSE_SETLK: - panic("FUSE: no response body format check for FUSE_SETLK"); + err = (blen == 0) ? 0 : EINVAL; break; case FUSE_SETLKW: - panic("FUSE: no response body format check for FUSE_SETLKW"); + err = (blen == 0) ? 0 : EINVAL; break; case FUSE_ACCESS: err = (blen == 0) ? 0 : EINVAL; break; case FUSE_CREATE: - err = (blen == sizeof(struct fuse_entry_out) + - sizeof(struct fuse_open_out)) ? 0 : EINVAL; + if (fuse_libabi_geq(ftick->tk_data, 7, 9)) { + err = (blen == sizeof(struct fuse_entry_out) + + sizeof(struct fuse_open_out)) ? 0 : EINVAL; + } else { + err = (blen == FUSE_COMPAT_ENTRY_OUT_SIZE + + sizeof(struct fuse_open_out)) ? 0 : EINVAL; + } break; case FUSE_DESTROY: err = (blen == 0) ? 0 : EINVAL; break; default: panic("FUSE: opcodes out of sync (%d)\n", opcode); } return err; } static inline void fuse_setup_ihead(struct fuse_in_header *ihead, struct fuse_ticket *ftick, uint64_t nid, enum fuse_opcode op, size_t blen, pid_t pid, struct ucred *cred) { ihead->len = sizeof(*ihead) + blen; ihead->unique = ftick->tk_unique; ihead->nodeid = nid; ihead->opcode = op; ihead->pid = pid; ihead->uid = cred->cr_uid; - ihead->gid = cred->cr_rgid; + ihead->gid = cred->cr_groups[0]; } /* * fuse_standard_handler just pulls indata and wakes up pretender. * Doesn't try to interpret data, that's left for the pretender. 
* Though might do a basic size verification before the pull-in takes place */ static int fuse_standard_handler(struct fuse_ticket *ftick, struct uio *uio) { int err = 0; err = fticket_pull(ftick, uio); fuse_lck_mtx_lock(ftick->tk_aw_mtx); if (!fticket_answered(ftick)) { fticket_set_answered(ftick); ftick->tk_aw_errno = err; wakeup(ftick); } fuse_lck_mtx_unlock(ftick->tk_aw_mtx); return err; } -void -fdisp_make_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, +/* + * Reinitialize a dispatcher from a pid and node id, without resizing or + * clearing its data buffers + */ +static void +fdisp_refresh_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct mount *mp, uint64_t nid, pid_t pid, struct ucred *cred) { - struct fuse_data *data = fuse_get_mpdata(mp); + MPASS(fdip->tick); + MPASS2(sizeof(fdip->finh) + fdip->iosize <= fdip->tick->tk_ms_fiov.len, + "Must use fdisp_make_pid to increase the size of the fiov"); + fticket_reset(fdip->tick); + FUSE_DIMALLOC(&fdip->tick->tk_ms_fiov, fdip->finh, + fdip->indata, fdip->iosize); + + fuse_setup_ihead(fdip->finh, fdip->tick, nid, op, fdip->iosize, pid, + cred); +} + +/* Initialize a dispatcher from a pid and node id */ +static void +fdisp_make_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, + struct fuse_data *data, uint64_t nid, pid_t pid, struct ucred *cred) +{ if (fdip->tick) { fticket_refresh(fdip->tick); } else { fdip->tick = fuse_ticket_fetch(data); } + /* FUSE_DIMALLOC will bzero the fiovs when it enlarges them */ FUSE_DIMALLOC(&fdip->tick->tk_ms_fiov, fdip->finh, fdip->indata, fdip->iosize); fuse_setup_ihead(fdip->finh, fdip->tick, nid, op, fdip->iosize, pid, cred); } void fdisp_make(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct mount *mp, uint64_t nid, struct thread *td, struct ucred *cred) { + struct fuse_data *data = fuse_get_mpdata(mp); RECTIFY_TDCR(td, cred); - return fdisp_make_pid(fdip, op, mp, nid, td->td_proc->p_pid, cred); + return fdisp_make_pid(fdip, op, data, nid, 
td->td_proc->p_pid, cred); } void fdisp_make_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred) { + struct mount *mp = vnode_mount(vp); + struct fuse_data *data = fuse_get_mpdata(mp); + RECTIFY_TDCR(td, cred); - return fdisp_make_pid(fdip, op, vnode_mount(vp), VTOI(vp), + return fdisp_make_pid(fdip, op, data, VTOI(vp), td->td_proc->p_pid, cred); } -SDT_PROBE_DEFINE2(fuse, , ipc, fdisp_wait_answ_error, "char*", "int"); +/* Refresh a fuse_dispatcher so it can be reused, but don't zero its data */ +void +fdisp_refresh_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, + struct vnode *vp, struct thread *td, struct ucred *cred) +{ + RECTIFY_TDCR(td, cred); + return fdisp_refresh_pid(fdip, op, vnode_mount(vp), VTOI(vp), + td->td_proc->p_pid, cred); +} +void +fdisp_refresh(struct fuse_dispatcher *fdip) +{ + fticket_refresh(fdip->tick); +} + +SDT_PROBE_DEFINE2(fusefs, , ipc, fdisp_wait_answ_error, "char*", "int"); + int fdisp_wait_answ(struct fuse_dispatcher *fdip) { int err = 0; fdip->answ_stat = 0; fuse_insert_callback(fdip->tick, fuse_standard_handler); - fuse_insert_message(fdip->tick); + fuse_insert_message(fdip->tick, false); if ((err = fticket_wait_answer(fdip->tick))) { fuse_lck_mtx_lock(fdip->tick->tk_aw_mtx); if (fticket_answered(fdip->tick)) { /* * Just between noticing the interrupt and getting here, * the standard handler has completed his job. * So we drop the ticket and exit as usual. */ - SDT_PROBE2(fuse, , ipc, fdisp_wait_answ_error, + SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: interrupted, already answered", err); fuse_lck_mtx_unlock(fdip->tick->tk_aw_mtx); goto out; } else { /* * So we were faster than the standard handler. * Then by setting the answered flag we get *him* * to drop the ticket. 
*/ - SDT_PROBE2(fuse, , ipc, fdisp_wait_answ_error, + SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: interrupted, setting to answered", err); fticket_set_answered(fdip->tick); fuse_lck_mtx_unlock(fdip->tick->tk_aw_mtx); return err; } } - if (fdip->tick->tk_aw_errno) { - SDT_PROBE2(fuse, , ipc, fdisp_wait_answ_error, + if (fdip->tick->tk_aw_errno == ENOTCONN) { + /* The daemon died while we were waiting for a response */ + err = ENOTCONN; + goto out; + } else if (fdip->tick->tk_aw_errno) { + /* + * There was some sort of communication error with the daemon + * that the client wouldn't understand. + */ + SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: explicit EIO-ing", fdip->tick->tk_aw_errno); err = EIO; goto out; } if ((err = fdip->tick->tk_aw_ohead.error)) { - SDT_PROBE2(fuse, , ipc, fdisp_wait_answ_error, + SDT_PROBE2(fusefs, , ipc, fdisp_wait_answ_error, "IPC: setting status", fdip->tick->tk_aw_ohead.error); /* * This means a "proper" fuse syscall error. * We record this value so the caller will * be able to know it's not a boring messaging * failure, if she wishes so (and if not, she can * just simply propagate the return value of this routine). * [XXX Maybe a bitflag would do the job too, * if other flags needed, this will be converted thusly.] 
*/ fdip->answ_stat = err; goto out; } fdip->answ = fticket_resp(fdip->tick)->base; fdip->iosize = fticket_resp(fdip->tick)->len; return 0; out: return err; } void fuse_ipc_init(void) { ticket_zone = uma_zcreate("fuse_ticket", sizeof(struct fuse_ticket), fticket_ctor, fticket_dtor, fticket_init, fticket_fini, UMA_ALIGN_PTR, 0); + fuse_ticket_count = counter_u64_alloc(M_WAITOK); } void fuse_ipc_destroy(void) { + counter_u64_free(fuse_ticket_count); uma_zdestroy(ticket_zone); } Index: head/sys/fs/fuse/fuse_ipc.h =================================================================== --- head/sys/fs/fuse/fuse_ipc.h (revision 350664) +++ head/sys/fs/fuse/fuse_ipc.h (revision 350665) @@ -1,396 +1,428 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _FUSE_IPC_H_ #define _FUSE_IPC_H_ #include #include +enum fuse_data_cache_mode { + FUSE_CACHE_UC, + FUSE_CACHE_WT, + FUSE_CACHE_WB, +}; + struct fuse_iov { void *base; size_t len; size_t allocated_size; int credit; }; void fiov_init(struct fuse_iov *fiov, size_t size); void fiov_teardown(struct fuse_iov *fiov); void fiov_refresh(struct fuse_iov *fiov); void fiov_adjust(struct fuse_iov *fiov, size_t size); #define FUSE_DIMALLOC(fiov, spc1, spc2, amnt) do { \ fiov_adjust(fiov, (sizeof(*(spc1)) + (amnt))); \ (spc1) = (fiov)->base; \ (spc2) = (char *)(fiov)->base + (sizeof(*(spc1))); \ } while (0) #define FU_AT_LEAST(siz) max((siz), 160) #define FUSE_ASSERT_AW_DONE(ftick) \ KASSERT((ftick)->tk_aw_link.tqe_next == NULL && \ (ftick)->tk_aw_link.tqe_prev == NULL, \ ("FUSE: ticket still on answer delivery list %p", (ftick))) #define FUSE_ASSERT_MS_DONE(ftick) \ KASSERT((ftick)->tk_ms_link.stqe_next == NULL, \ ("FUSE: ticket still on message list %p", (ftick))) struct fuse_ticket; struct fuse_data; typedef int fuse_handler_t(struct fuse_ticket *ftick, struct uio *uio); struct fuse_ticket { /* fields giving the identity of the ticket */ uint64_t tk_unique; struct fuse_data *tk_data; int tk_flag; u_int tk_refcount; + /* + * If this ticket's operation has been interrupted, this will hold the + * unique value of the FUSE_INTERRUPT operation. Otherwise, it will be + * 0. 
+ */ + uint64_t irq_unique; /* fields for initiating an upgoing message */ struct fuse_iov tk_ms_fiov; void *tk_ms_bufdata; size_t tk_ms_bufsize; enum { FT_M_FIOV, FT_M_BUF } tk_ms_type; STAILQ_ENTRY(fuse_ticket) tk_ms_link; /* fields for handling answers coming from userspace */ struct fuse_iov tk_aw_fiov; void *tk_aw_bufdata; size_t tk_aw_bufsize; enum { FT_A_FIOV, FT_A_BUF } tk_aw_type; struct fuse_out_header tk_aw_ohead; int tk_aw_errno; struct mtx tk_aw_mtx; fuse_handler_t *tk_aw_handler; TAILQ_ENTRY(fuse_ticket) tk_aw_link; }; #define FT_ANSW 0x01 /* request of ticket has already been answered */ #define FT_DIRTY 0x04 /* ticket has been used */ static inline struct fuse_iov * fticket_resp(struct fuse_ticket *ftick) { return (&ftick->tk_aw_fiov); } static inline bool fticket_answered(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_aw_mtx, MA_OWNED); return (ftick->tk_flag & FT_ANSW); } static inline void fticket_set_answered(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_aw_mtx, MA_OWNED); ftick->tk_flag |= FT_ANSW; } +static inline struct fuse_in_header* +fticket_in_header(struct fuse_ticket *ftick) +{ + return (struct fuse_in_header *)(ftick->tk_ms_fiov.base); +} + static inline enum fuse_opcode fticket_opcode(struct fuse_ticket *ftick) { - return (((struct fuse_in_header *)(ftick->tk_ms_fiov.base))->opcode); + return fticket_in_header(ftick)->opcode; } int fticket_pull(struct fuse_ticket *ftick, struct uio *uio); -enum mountpri { FM_NOMOUNTED, FM_PRIMARY, FM_SECONDARY }; - /* * The data representing a FUSE session. */ struct fuse_data { struct cdev *fdev; struct mount *mp; struct vnode *vroot; struct ucred *daemoncred; int dataflags; int ref; struct mtx ms_mtx; STAILQ_HEAD(, fuse_ticket) ms_head; + int ms_count; struct mtx aw_mtx; TAILQ_HEAD(, fuse_ticket) aw_head; + /* + * Holds the next value of the FUSE operation unique value. + * Also, serves as a wakeup channel to prevent any operations from + * being created before INIT completes. 
+ */ u_long ticketer; struct sx rename_lock; uint32_t fuse_libabi_major; uint32_t fuse_libabi_minor; + uint32_t max_readahead_blocks; uint32_t max_write; uint32_t max_read; uint32_t subtype; char volname[MAXPATHLEN]; struct selinfo ks_rsel; int daemon_timeout; + unsigned time_gran; uint64_t notimpl; + uint64_t mnt_flag; + enum fuse_data_cache_mode cache_mode; }; #define FSESS_DEAD 0x0001 /* session is to be closed */ -#define FSESS_UNUSED0 0x0002 /* unused */ #define FSESS_INITED 0x0004 /* session has been inited */ #define FSESS_DAEMON_CAN_SPY 0x0010 /* let non-owners access this fs */ /* (and being observed by the daemon) */ #define FSESS_PUSH_SYMLINKS_IN 0x0020 /* prefix absolute symlinks with mp */ #define FSESS_DEFAULT_PERMISSIONS 0x0040 /* kernel does permission checking */ -#define FSESS_NO_ATTRCACHE 0x0080 /* no attribute caching */ -#define FSESS_NO_READAHEAD 0x0100 /* no readaheads */ -#define FSESS_NO_DATACACHE 0x0200 /* disable buffer cache */ -#define FSESS_NO_NAMECACHE 0x0400 /* disable name cache */ -#define FSESS_NO_MMAP 0x0800 /* disable mmap */ -#define FSESS_BROKENIO 0x1000 /* fix broken io */ +#define FSESS_ASYNC_READ 0x1000 /* allow multiple reads of some file */ +#define FSESS_POSIX_LOCKS 0x2000 /* daemon supports POSIX locks */ +#define FSESS_EXPORT_SUPPORT 0x10000 /* daemon supports NFS-style lookups */ +#define FSESS_INTR 0x20000 /* interruptible mounts */ +#define FSESS_MNTOPTS_MASK ( \ + FSESS_DAEMON_CAN_SPY | FSESS_PUSH_SYMLINKS_IN | \ + FSESS_DEFAULT_PERMISSIONS | FSESS_INTR) -enum fuse_data_cache_mode { - FUSE_CACHE_UC, - FUSE_CACHE_WT, - FUSE_CACHE_WB, -}; - extern int fuse_data_cache_mode; -extern int fuse_data_cache_invalidate; -extern int fuse_mmap_enable; -extern int fuse_sync_resize; -extern int fuse_fix_broken_io; static inline struct fuse_data * fuse_get_mpdata(struct mount *mp) { return mp->mnt_data; } static inline bool fsess_isimpl(struct mount *mp, int opcode) { struct fuse_data *data = fuse_get_mpdata(mp); return 
((data->notimpl & (1ULL << opcode)) == 0); } static inline void fsess_set_notimpl(struct mount *mp, int opcode) { struct fuse_data *data = fuse_get_mpdata(mp); data->notimpl |= (1ULL << opcode); } static inline bool fsess_opt_datacache(struct mount *mp) { struct fuse_data *data = fuse_get_mpdata(mp); - return (fuse_data_cache_mode != FUSE_CACHE_UC && - (data->dataflags & FSESS_NO_DATACACHE) == 0); + return (data->cache_mode != FUSE_CACHE_UC); } static inline bool fsess_opt_mmap(struct mount *mp) { - struct fuse_data *data = fuse_get_mpdata(mp); - - if (!fuse_mmap_enable || fuse_data_cache_mode == FUSE_CACHE_UC) - return (false); - return ((data->dataflags & (FSESS_NO_DATACACHE | FSESS_NO_MMAP)) == 0); + return (fsess_opt_datacache(mp)); } static inline bool -fsess_opt_brokenio(struct mount *mp) +fsess_opt_writeback(struct mount *mp) { struct fuse_data *data = fuse_get_mpdata(mp); - return (fuse_fix_broken_io || (data->dataflags & FSESS_BROKENIO)); + return (data->cache_mode == FUSE_CACHE_WB); } +/* Insert a new upgoing message */ static inline void fuse_ms_push(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->ms_mtx, MA_OWNED); refcount_acquire(&ftick->tk_refcount); STAILQ_INSERT_TAIL(&ftick->tk_data->ms_head, ftick, tk_ms_link); + ftick->tk_data->ms_count++; } +/* Insert a new upgoing message to the front of the queue */ +static inline void +fuse_ms_push_head(struct fuse_ticket *ftick) +{ + mtx_assert(&ftick->tk_data->ms_mtx, MA_OWNED); + refcount_acquire(&ftick->tk_refcount); + STAILQ_INSERT_HEAD(&ftick->tk_data->ms_head, ftick, tk_ms_link); + ftick->tk_data->ms_count++; +} + static inline struct fuse_ticket * fuse_ms_pop(struct fuse_data *data) { struct fuse_ticket *ftick = NULL; mtx_assert(&data->ms_mtx, MA_OWNED); if ((ftick = STAILQ_FIRST(&data->ms_head))) { STAILQ_REMOVE_HEAD(&data->ms_head, tk_ms_link); + data->ms_count--; #ifdef INVARIANTS + MPASS(data->ms_count >= 0); ftick->tk_ms_link.stqe_next = NULL; #endif } return (ftick); } static inline 
void fuse_aw_push(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->aw_mtx, MA_OWNED); refcount_acquire(&ftick->tk_refcount); TAILQ_INSERT_TAIL(&ftick->tk_data->aw_head, ftick, tk_aw_link); } static inline void fuse_aw_remove(struct fuse_ticket *ftick) { mtx_assert(&ftick->tk_data->aw_mtx, MA_OWNED); TAILQ_REMOVE(&ftick->tk_data->aw_head, ftick, tk_aw_link); #ifdef INVARIANTS ftick->tk_aw_link.tqe_next = NULL; ftick->tk_aw_link.tqe_prev = NULL; #endif } static inline struct fuse_ticket * fuse_aw_pop(struct fuse_data *data) { struct fuse_ticket *ftick; mtx_assert(&data->aw_mtx, MA_OWNED); if ((ftick = TAILQ_FIRST(&data->aw_head)) != NULL) fuse_aw_remove(ftick); return (ftick); } struct fuse_ticket *fuse_ticket_fetch(struct fuse_data *data); int fuse_ticket_drop(struct fuse_ticket *ftick); void fuse_insert_callback(struct fuse_ticket *ftick, fuse_handler_t *handler); -void fuse_insert_message(struct fuse_ticket *ftick); +void fuse_insert_message(struct fuse_ticket *ftick, bool irq); static inline bool fuse_libabi_geq(struct fuse_data *data, uint32_t abi_maj, uint32_t abi_min) { return (data->fuse_libabi_major > abi_maj || (data->fuse_libabi_major == abi_maj && data->fuse_libabi_minor >= abi_min)); } struct fuse_data *fdata_alloc(struct cdev *dev, struct ucred *cred); void fdata_trydestroy(struct fuse_data *data); void fdata_set_dead(struct fuse_data *data); static inline bool fdata_get_dead(struct fuse_data *data) { return (data->dataflags & FSESS_DEAD); } struct fuse_dispatcher { struct fuse_ticket *tick; struct fuse_in_header *finh; void *indata; size_t iosize; uint64_t nodeid; int answ_stat; void *answ; }; static inline void fdisp_init(struct fuse_dispatcher *fdisp, size_t iosize) { fdisp->iosize = iosize; fdisp->tick = NULL; } static inline void fdisp_destroy(struct fuse_dispatcher *fdisp) { fuse_ticket_drop(fdisp->tick); #ifdef INVARIANTS fdisp->tick = NULL; #endif } +void fdisp_refresh(struct fuse_dispatcher *fdip); + void fdisp_make(struct 
fuse_dispatcher *fdip, enum fuse_opcode op, struct mount *mp, uint64_t nid, struct thread *td, struct ucred *cred); -void fdisp_make_pid(struct fuse_dispatcher *fdip, enum fuse_opcode op, - struct mount *mp, uint64_t nid, pid_t pid, struct ucred *cred); - void fdisp_make_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, + struct vnode *vp, struct thread *td, struct ucred *cred); + +void fdisp_refresh_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred); int fdisp_wait_answ(struct fuse_dispatcher *fdip); static inline int fdisp_simple_putget_vp(struct fuse_dispatcher *fdip, enum fuse_opcode op, struct vnode *vp, struct thread *td, struct ucred *cred) { fdisp_make_vp(fdip, op, vp, td, cred); return (fdisp_wait_answ(fdip)); } #endif /* _FUSE_IPC_H_ */ Index: head/sys/fs/fuse/fuse_kernel.h =================================================================== --- head/sys/fs/fuse/fuse_kernel.h (revision 350664) +++ head/sys/fs/fuse/fuse_kernel.h (revision 350665) @@ -1,383 +1,754 @@ /*-- * This file defines the kernel interface of FUSE - * Copyright (C) 2001-2007 Miklos Szeredi + * Copyright (C) 2001-2008 Miklos Szeredi * * This program can be distributed under the terms of the GNU GPL. * See the file COPYING. * * This -- and only this -- header file may also be distributed under * the terms of the BSD Licence as follows: * * Copyright (C) 2001-2007 Miklos Szeredi. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ -#ifndef linux -#include -#define __u64 uint64_t -#define __u32 uint32_t -#define __s32 int32_t +/* + * This file defines the kernel interface of FUSE + * + * Protocol changelog: + * + * 7.9: + * - new fuse_getattr_in input argument of GETATTR + * - add lk_flags in fuse_lk_in + * - add lock_owner field to fuse_setattr_in, fuse_read_in and fuse_write_in + * - add blksize field to fuse_attr + * - add file flags field to fuse_read_in and fuse_write_in + * + * 7.10 + * - add nonseekable open flag + * + * 7.11 + * - add IOCTL message + * - add unsolicited notification support + * + * 7.12 + * - add umask flag to input argument of open, mknod and mkdir + * - add notification messages for invalidation of inodes and + * directory entries + * + * 7.13 + * - make max number of background requests and congestion threshold + * tunables + * + * 7.14 + * - add splice support to fuse device + * + * 7.15 + * - add store notify + * - add retrieve notify + * + * 7.16 + * - add BATCH_FORGET request + * - FUSE_IOCTL_UNRESTRICTED shall now return with array of 'struct + * fuse_ioctl_iovec' instead of ambiguous 'struct iovec' + * - add FUSE_IOCTL_32BIT flag + * + * 7.17 + * - add FUSE_FLOCK_LOCKS and FUSE_RELEASE_FLOCK_UNLOCK + * + 
* 7.18 + * - add FUSE_IOCTL_DIR flag + * - add FUSE_NOTIFY_DELETE + * + * 7.19 + * - add FUSE_FALLOCATE + * + * 7.20 + * - add FUSE_AUTO_INVAL_DATA + * 7.21 + * - add FUSE_READDIRPLUS + * - send the requested events in POLL request + * + * 7.22 + * - add FUSE_ASYNC_DIO + * + * 7.23 + * - add FUSE_WRITEBACK_CACHE + * - add time_gran to fuse_init_out + * - add reserved space to fuse_init_out + * - add FATTR_CTIME + * - add ctime and ctimensec to fuse_setattr_in + * - add FUSE_RENAME2 request + * - add FUSE_NO_OPEN_SUPPORT flag + */ + +#ifndef _FUSE_FUSE_KERNEL_H +#define _FUSE_FUSE_KERNEL_H + +#ifdef __linux__ +#include #else -#include -#include +#include #endif /** Version number of this interface */ #define FUSE_KERNEL_VERSION 7 /** Minor version number of this interface */ -#define FUSE_KERNEL_MINOR_VERSION 8 +#define FUSE_KERNEL_MINOR_VERSION 23 /** The node ID of the root inode */ #define FUSE_ROOT_ID 1 -/** The major number of the fuse character device */ -#define FUSE_MAJOR MISC_MAJOR - -/** The minor number of the fuse character device */ -#define FUSE_MINOR 229 - /* Make sure all structures are padded to 64bit boundary, so 32bit userspace works under 64bit kernels */ struct fuse_attr { - __u64 ino; - __u64 size; - __u64 blocks; - __u64 atime; - __u64 mtime; - __u64 ctime; - __u32 atimensec; - __u32 mtimensec; - __u32 ctimensec; - __u32 mode; - __u32 nlink; - __u32 uid; - __u32 gid; - __u32 rdev; + uint64_t ino; + uint64_t size; + uint64_t blocks; + uint64_t atime; + uint64_t mtime; + uint64_t ctime; + uint32_t atimensec; + uint32_t mtimensec; + uint32_t ctimensec; + uint32_t mode; + uint32_t nlink; + uint32_t uid; + uint32_t gid; + uint32_t rdev; + uint32_t blksize; + uint32_t padding; }; struct fuse_kstatfs { - __u64 blocks; - __u64 bfree; - __u64 bavail; - __u64 files; - __u64 ffree; - __u32 bsize; - __u32 namelen; - __u32 frsize; - __u32 padding; - __u32 spare[6]; + uint64_t blocks; + uint64_t bfree; + uint64_t bavail; + uint64_t files; + uint64_t ffree; 
+ uint32_t bsize; + uint32_t namelen; + uint32_t frsize; + uint32_t padding; + uint32_t spare[6]; }; struct fuse_file_lock { - __u64 start; - __u64 end; - __u32 type; - __u32 pid; /* tgid */ + uint64_t start; + uint64_t end; + uint32_t type; + uint32_t pid; /* tgid */ }; /** * Bitmasks for fuse_setattr_in.valid */ #define FATTR_MODE (1 << 0) #define FATTR_UID (1 << 1) #define FATTR_GID (1 << 2) #define FATTR_SIZE (1 << 3) #define FATTR_ATIME (1 << 4) #define FATTR_MTIME (1 << 5) #define FATTR_FH (1 << 6) +#define FATTR_ATIME_NOW (1 << 7) +#define FATTR_MTIME_NOW (1 << 8) +#define FATTR_LOCKOWNER (1 << 9) +#define FATTR_CTIME (1 << 10) /** * Flags returned by the OPEN request * * FOPEN_DIRECT_IO: bypass page cache for this open file * FOPEN_KEEP_CACHE: don't invalidate the data cache on open + * FOPEN_NONSEEKABLE: the file is not seekable */ #define FOPEN_DIRECT_IO (1 << 0) #define FOPEN_KEEP_CACHE (1 << 1) +#define FOPEN_NONSEEKABLE (1 << 2) /** * INIT request/reply flags + * + * FUSE_ASYNC_READ: asynchronous read requests + * FUSE_POSIX_LOCKS: remote locking for POSIX file locks + * FUSE_FILE_OPS: kernel sends file handle for fstat, etc... (not yet supported) + * FUSE_ATOMIC_O_TRUNC: handles the O_TRUNC open flag in the filesystem + * FUSE_EXPORT_SUPPORT: filesystem handles lookups of "." and ".." 
+ * FUSE_BIG_WRITES: filesystem can handle write size larger than 4kB + * FUSE_DONT_MASK: don't apply umask to file mode on create operations + * FUSE_SPLICE_WRITE: kernel supports splice write on the device + * FUSE_SPLICE_MOVE: kernel supports splice move on the device + * FUSE_SPLICE_READ: kernel supports splice read on the device + * FUSE_FLOCK_LOCKS: remote locking for BSD style file locks + * FUSE_HAS_IOCTL_DIR: kernel supports ioctl on directories + * FUSE_AUTO_INVAL_DATA: automatically invalidate cached pages + * FUSE_DO_READDIRPLUS: do READDIRPLUS (READDIR+LOOKUP in one) + * FUSE_READDIRPLUS_AUTO: adaptive readdirplus + * FUSE_ASYNC_DIO: asynchronous direct I/O submission + * FUSE_WRITEBACK_CACHE: use writeback cache for buffered writes + * FUSE_NO_OPEN_SUPPORT: kernel supports zero-message opens */ #define FUSE_ASYNC_READ (1 << 0) #define FUSE_POSIX_LOCKS (1 << 1) +#define FUSE_FILE_OPS (1 << 2) +#define FUSE_ATOMIC_O_TRUNC (1 << 3) +#define FUSE_EXPORT_SUPPORT (1 << 4) +#define FUSE_BIG_WRITES (1 << 5) +#define FUSE_DONT_MASK (1 << 6) +#define FUSE_SPLICE_WRITE (1 << 7) +#define FUSE_SPLICE_MOVE (1 << 8) +#define FUSE_SPLICE_READ (1 << 9) +#define FUSE_FLOCK_LOCKS (1 << 10) +#define FUSE_HAS_IOCTL_DIR (1 << 11) +#define FUSE_AUTO_INVAL_DATA (1 << 12) +#define FUSE_DO_READDIRPLUS (1 << 13) +#define FUSE_READDIRPLUS_AUTO (1 << 14) +#define FUSE_ASYNC_DIO (1 << 15) +#define FUSE_WRITEBACK_CACHE (1 << 16) +#define FUSE_NO_OPEN_SUPPORT (1 << 17) +#ifdef linux /** + * CUSE INIT request/reply flags + * + * CUSE_UNRESTRICTED_IOCTL: use unrestricted ioctl + */ +#define CUSE_UNRESTRICTED_IOCTL (1 << 0) +#endif /* linux */ + +/** * Release flags */ #define FUSE_RELEASE_FLUSH (1 << 0) +#define FUSE_RELEASE_FLOCK_UNLOCK (1 << 1) +/** + * Getattr flags + */ +#define FUSE_GETATTR_FH (1 << 0) + +/** + * Lock flags + */ +#define FUSE_LK_FLOCK (1 << 0) + +/** + * WRITE flags + * + * FUSE_WRITE_CACHE: delayed write from page cache, file handle is guessed + * 
FUSE_WRITE_LOCKOWNER: lock_owner field is valid + */ +#define FUSE_WRITE_CACHE (1 << 0) +#define FUSE_WRITE_LOCKOWNER (1 << 1) + +/** + * Read flags + */ +#define FUSE_READ_LOCKOWNER (1 << 1) + +/** + * Ioctl flags + * + * FUSE_IOCTL_COMPAT: 32bit compat ioctl on 64bit machine + * FUSE_IOCTL_UNRESTRICTED: not restricted to well-formed ioctls, retry allowed + * FUSE_IOCTL_RETRY: retry with new iovecs + * FUSE_IOCTL_32BIT: 32bit ioctl + * FUSE_IOCTL_DIR: is a directory + * + * FUSE_IOCTL_MAX_IOV: maximum of in_iovecs + out_iovecs + */ +#define FUSE_IOCTL_COMPAT (1 << 0) +#define FUSE_IOCTL_UNRESTRICTED (1 << 1) +#define FUSE_IOCTL_RETRY (1 << 2) +#define FUSE_IOCTL_32BIT (1 << 3) +#define FUSE_IOCTL_DIR (1 << 4) + +#define FUSE_IOCTL_MAX_IOV 256 + +/** + * Poll flags + * + * FUSE_POLL_SCHEDULE_NOTIFY: request poll notify + */ +#define FUSE_POLL_SCHEDULE_NOTIFY (1 << 0) + enum fuse_opcode { FUSE_LOOKUP = 1, FUSE_FORGET = 2, /* no reply */ FUSE_GETATTR = 3, FUSE_SETATTR = 4, FUSE_READLINK = 5, FUSE_SYMLINK = 6, FUSE_MKNOD = 8, FUSE_MKDIR = 9, FUSE_UNLINK = 10, FUSE_RMDIR = 11, FUSE_RENAME = 12, FUSE_LINK = 13, FUSE_OPEN = 14, FUSE_READ = 15, FUSE_WRITE = 16, FUSE_STATFS = 17, FUSE_RELEASE = 18, FUSE_FSYNC = 20, FUSE_SETXATTR = 21, FUSE_GETXATTR = 22, FUSE_LISTXATTR = 23, FUSE_REMOVEXATTR = 24, FUSE_FLUSH = 25, FUSE_INIT = 26, FUSE_OPENDIR = 27, FUSE_READDIR = 28, FUSE_RELEASEDIR = 29, FUSE_FSYNCDIR = 30, FUSE_GETLK = 31, FUSE_SETLK = 32, FUSE_SETLKW = 33, FUSE_ACCESS = 34, FUSE_CREATE = 35, FUSE_INTERRUPT = 36, FUSE_BMAP = 37, FUSE_DESTROY = 38, + FUSE_IOCTL = 39, + FUSE_POLL = 40, + FUSE_NOTIFY_REPLY = 41, + FUSE_BATCH_FORGET = 42, + FUSE_FALLOCATE = 43, + FUSE_READDIRPLUS = 44, + FUSE_RENAME2 = 45, + +#ifdef linux + /* CUSE specific operations */ + CUSE_INIT = 4096, +#endif /* linux */ }; +enum fuse_notify_code { + FUSE_NOTIFY_POLL = 1, + FUSE_NOTIFY_INVAL_INODE = 2, + FUSE_NOTIFY_INVAL_ENTRY = 3, + FUSE_NOTIFY_STORE = 4, + FUSE_NOTIFY_RETRIEVE = 5, + 
FUSE_NOTIFY_DELETE = 6, + FUSE_NOTIFY_CODE_MAX, +}; + /* The read buffer is required to be at least 8k, but may be much larger */ #define FUSE_MIN_READ_BUFFER 8192 +#define FUSE_COMPAT_ENTRY_OUT_SIZE 120 + struct fuse_entry_out { - __u64 nodeid; /* Inode ID */ - __u64 generation; /* Inode generation: nodeid:gen must - be unique for the fs's lifetime */ - __u64 entry_valid; /* Cache timeout for the name */ - __u64 attr_valid; /* Cache timeout for the attributes */ - __u32 entry_valid_nsec; - __u32 attr_valid_nsec; + uint64_t nodeid; /* Inode ID */ + uint64_t generation; /* Inode generation: nodeid:gen must + be unique for the fs's lifetime */ + uint64_t entry_valid; /* Cache timeout for the name */ + uint64_t attr_valid; /* Cache timeout for the attributes */ + uint32_t entry_valid_nsec; + uint32_t attr_valid_nsec; struct fuse_attr attr; }; struct fuse_forget_in { - __u64 nlookup; + uint64_t nlookup; }; +struct fuse_forget_one { + uint64_t nodeid; + uint64_t nlookup; +}; + +struct fuse_batch_forget_in { + uint32_t count; + uint32_t dummy; +}; + +struct fuse_getattr_in { + uint32_t getattr_flags; + uint32_t dummy; + uint64_t fh; +}; + +#define FUSE_COMPAT_ATTR_OUT_SIZE 96 + struct fuse_attr_out { - __u64 attr_valid; /* Cache timeout for the attributes */ - __u32 attr_valid_nsec; - __u32 dummy; + uint64_t attr_valid; /* Cache timeout for the attributes */ + uint32_t attr_valid_nsec; + uint32_t dummy; struct fuse_attr attr; }; +#define FUSE_COMPAT_MKNOD_IN_SIZE 8 + +struct fuse_mknod_in { + uint32_t mode; + uint32_t rdev; + uint32_t umask; + uint32_t padding; +}; + struct fuse_mkdir_in { - __u32 mode; - __u32 padding; + uint32_t mode; + uint32_t umask; }; struct fuse_rename_in { - __u64 newdir; + uint64_t newdir; }; +struct fuse_rename2_in { + uint64_t newdir; + uint32_t flags; + uint32_t padding; +}; + struct fuse_link_in { - __u64 oldnodeid; + uint64_t oldnodeid; }; struct fuse_setattr_in { - __u32 valid; - __u32 padding; - __u64 fh; - __u64 size; - __u64 unused1; - 
__u64 atime; - __u64 mtime; - __u64 unused2; - __u32 atimensec; - __u32 mtimensec; - __u32 unused3; - __u32 mode; - __u32 unused4; - __u32 uid; - __u32 gid; - __u32 unused5; + uint32_t valid; + uint32_t padding; + uint64_t fh; + uint64_t size; + uint64_t lock_owner; + uint64_t atime; + uint64_t mtime; + uint64_t ctime; + uint32_t atimensec; + uint32_t mtimensec; + uint32_t ctimensec; + uint32_t mode; + uint32_t unused4; + uint32_t uid; + uint32_t gid; + uint32_t unused5; }; struct fuse_open_in { - __u32 flags; - __u32 mode; + uint32_t flags; + uint32_t unused; }; +struct fuse_create_in { + uint32_t flags; + uint32_t mode; + uint32_t umask; + uint32_t padding; +}; + struct fuse_open_out { - __u64 fh; - __u32 open_flags; - __u32 padding; + uint64_t fh; + uint32_t open_flags; + uint32_t padding; }; struct fuse_release_in { - __u64 fh; - __u32 flags; - __u32 release_flags; - __u64 lock_owner; + uint64_t fh; + uint32_t flags; + uint32_t release_flags; + uint64_t lock_owner; }; struct fuse_flush_in { - __u64 fh; - __u32 unused; - __u32 padding; - __u64 lock_owner; + uint64_t fh; + uint32_t unused; + uint32_t padding; + uint64_t lock_owner; }; struct fuse_read_in { - __u64 fh; - __u64 offset; - __u32 size; - __u32 padding; + uint64_t fh; + uint64_t offset; + uint32_t size; + uint32_t read_flags; + uint64_t lock_owner; + uint32_t flags; + uint32_t padding; }; +#define FUSE_COMPAT_WRITE_IN_SIZE 24 + struct fuse_write_in { - __u64 fh; - __u64 offset; - __u32 size; - __u32 write_flags; + uint64_t fh; + uint64_t offset; + uint32_t size; + uint32_t write_flags; + uint64_t lock_owner; + uint32_t flags; + uint32_t padding; }; struct fuse_write_out { - __u32 size; - __u32 padding; + uint32_t size; + uint32_t padding; }; #define FUSE_COMPAT_STATFS_SIZE 48 struct fuse_statfs_out { struct fuse_kstatfs st; }; struct fuse_fsync_in { - __u64 fh; - __u32 fsync_flags; - __u32 padding; + uint64_t fh; + uint32_t fsync_flags; + uint32_t padding; }; +struct fuse_setxattr_in { + uint32_t size; 
+ uint32_t flags; +}; + struct fuse_listxattr_in { - __u32 size; - __u32 flags; + uint32_t size; + uint32_t padding; }; struct fuse_listxattr_out { - __u32 size; - __u32 flags; + uint32_t size; + uint32_t padding; }; struct fuse_getxattr_in { - __u32 size; - __u32 padding; + uint32_t size; + uint32_t padding; }; struct fuse_getxattr_out { - __u32 size; - __u32 padding; + uint32_t size; + uint32_t padding; }; -struct fuse_setxattr_in { - __u32 size; - __u32 flags; -}; - struct fuse_lk_in { - __u64 fh; - __u64 owner; + uint64_t fh; + uint64_t owner; struct fuse_file_lock lk; + uint32_t lk_flags; + uint32_t padding; }; struct fuse_lk_out { struct fuse_file_lock lk; }; struct fuse_access_in { - __u32 mask; - __u32 padding; + uint32_t mask; + uint32_t padding; }; struct fuse_init_in { - __u32 major; - __u32 minor; - __u32 max_readahead; - __u32 flags; + uint32_t major; + uint32_t minor; + uint32_t max_readahead; + uint32_t flags; }; +#define FUSE_COMPAT_INIT_OUT_SIZE 8 +#define FUSE_COMPAT_22_INIT_OUT_SIZE 24 + struct fuse_init_out { - __u32 major; - __u32 minor; - __u32 max_readahead; - __u32 flags; - __u32 unused; - __u32 max_write; + uint32_t major; + uint32_t minor; + uint32_t max_readahead; + uint32_t flags; + uint16_t max_background; + uint16_t congestion_threshold; + uint32_t max_write; + uint32_t time_gran; + uint32_t unused[9]; }; +#ifdef linux +#define CUSE_INIT_INFO_MAX 4096 + +struct cuse_init_in { + uint32_t major; + uint32_t minor; + uint32_t unused; + uint32_t flags; +}; + +struct cuse_init_out { + uint32_t major; + uint32_t minor; + uint32_t unused; + uint32_t flags; + uint32_t max_read; + uint32_t max_write; + uint32_t dev_major; /* chardev major */ + uint32_t dev_minor; /* chardev minor */ + uint32_t spare[10]; +}; +#endif /* linux */ + struct fuse_interrupt_in { - __u64 unique; + uint64_t unique; }; struct fuse_bmap_in { - __u64 block; - __u32 blocksize; - __u32 padding; + uint64_t block; + uint32_t blocksize; + uint32_t padding; }; struct 
fuse_bmap_out { - __u64 block; + uint64_t block; }; +struct fuse_ioctl_in { + uint64_t fh; + uint32_t flags; + uint32_t cmd; + uint64_t arg; + uint32_t in_size; + uint32_t out_size; +}; + +struct fuse_ioctl_iovec { + uint64_t base; + uint64_t len; +}; + +struct fuse_ioctl_out { + int32_t result; + uint32_t flags; + uint32_t in_iovs; + uint32_t out_iovs; +}; + +struct fuse_poll_in { + uint64_t fh; + uint64_t kh; + uint32_t flags; + uint32_t events; +}; + +struct fuse_poll_out { + uint32_t revents; + uint32_t padding; +}; + +struct fuse_notify_poll_wakeup_out { + uint64_t kh; +}; + +struct fuse_fallocate_in { + uint64_t fh; + uint64_t offset; + uint64_t length; + uint32_t mode; + uint32_t padding; +}; + struct fuse_in_header { - __u32 len; - __u32 opcode; - __u64 unique; - __u64 nodeid; - __u32 uid; - __u32 gid; - __u32 pid; - __u32 padding; + uint32_t len; + uint32_t opcode; + uint64_t unique; + uint64_t nodeid; + uint32_t uid; + uint32_t gid; + uint32_t pid; + uint32_t padding; }; struct fuse_out_header { - __u32 len; - __s32 error; - __u64 unique; + uint32_t len; + int32_t error; + uint64_t unique; }; struct fuse_dirent { - __u64 ino; - __u64 off; - __u32 namelen; - __u32 type; - char name[0]; + uint64_t ino; + uint64_t off; + uint32_t namelen; + uint32_t type; + char name[]; }; #define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name) -#define FUSE_DIRENT_ALIGN(x) (((x) + sizeof(__u64) - 1) & ~(sizeof(__u64) - 1)) +#define FUSE_DIRENT_ALIGN(x) \ + (((x) + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1)) #define FUSE_DIRENT_SIZE(d) \ FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + (d)->namelen) + +struct fuse_direntplus { + struct fuse_entry_out entry_out; + struct fuse_dirent dirent; +}; + +#define FUSE_NAME_OFFSET_DIRENTPLUS \ + offsetof(struct fuse_direntplus, dirent.name) +#define FUSE_DIRENTPLUS_SIZE(d) \ + FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET_DIRENTPLUS + (d)->dirent.namelen) + +struct fuse_notify_inval_inode_out { + uint64_t ino; + int64_t off; + int64_t len; +}; + 
+struct fuse_notify_inval_entry_out { + uint64_t parent; + uint32_t namelen; + uint32_t padding; +}; + +struct fuse_notify_delete_out { + uint64_t parent; + uint64_t child; + uint32_t namelen; + uint32_t padding; +}; + +struct fuse_notify_store_out { + uint64_t nodeid; + uint64_t offset; + uint32_t size; + uint32_t padding; +}; + +struct fuse_notify_retrieve_out { + uint64_t notify_unique; + uint64_t nodeid; + uint64_t offset; + uint32_t size; + uint32_t padding; +}; + +/* Matches the size of fuse_write_in */ +struct fuse_notify_retrieve_in { + uint64_t dummy1; + uint64_t offset; + uint32_t size; + uint32_t dummy2; + uint64_t dummy3; + uint64_t dummy4; +}; + +#endif /* _FUSE_FUSE_KERNEL_H */ Index: head/sys/fs/fuse/fuse_main.c =================================================================== --- head/sys/fs/fuse/fuse_main.c (revision 350664) +++ head/sys/fs/fuse/fuse_main.c (revision 350665) @@ -1,168 +1,179 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" +#include "fuse_file.h" +#include "fuse_ipc.h" +#include "fuse_internal.h" +#include "fuse_node.h" static void fuse_bringdown(eventhandler_tag eh_tag); static int fuse_loader(struct module *m, int what, void *arg); struct mtx fuse_mtx; extern struct vfsops fuse_vfsops; extern struct cdevsw fuse_cdevsw; -extern struct vop_vector fuse_vnops; +extern struct vop_vector fuse_fifonops; extern uma_zone_t fuse_pbuf_zone; static struct vfsconf fuse_vfsconf = { .vfc_version = VFS_VERSION, .vfc_name = "fusefs", .vfc_vfsops = &fuse_vfsops, .vfc_typenum = -1, .vfc_flags = VFCF_JAIL | VFCF_SYNTHETIC }; +SYSCTL_NODE(_vfs, OID_AUTO, fusefs, CTLFLAG_RW, 0, "FUSE tunables"); +SYSCTL_NODE(_vfs_fusefs, OID_AUTO, stats, CTLFLAG_RW, 0, "FUSE statistics"); SYSCTL_INT(_vfs_fusefs, OID_AUTO, kernelabi_major, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, FUSE_KERNEL_VERSION, "FUSE kernel abi major version"); SYSCTL_INT(_vfs_fusefs, OID_AUTO, kernelabi_minor, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, FUSE_KERNEL_MINOR_VERSION, "FUSE kernel abi minor version"); -SDT_PROVIDER_DEFINE(fuse); +SDT_PROVIDER_DEFINE(fusefs); /****************************** * * >>> Module management stuff * ******************************/ static void fuse_bringdown(eventhandler_tag eh_tag) { - + fuse_node_destroy(); + 
fuse_internal_destroy(); + fuse_file_destroy(); fuse_ipc_destroy(); fuse_device_destroy(); mtx_destroy(&fuse_mtx); } static int fuse_loader(struct module *m, int what, void *arg) { static eventhandler_tag eh_tag = NULL; int err = 0; switch (what) { case MOD_LOAD: /* kldload */ mtx_init(&fuse_mtx, "fuse_mtx", NULL, MTX_DEF); err = fuse_device_init(); if (err) { mtx_destroy(&fuse_mtx); return (err); } fuse_ipc_init(); + fuse_file_init(); + fuse_internal_init(); + fuse_node_init(); fuse_pbuf_zone = pbuf_zsecond_create("fusepbuf", nswbuf / 2); /* vfs_modevent ignores its first arg */ if ((err = vfs_modevent(NULL, what, &fuse_vfsconf))) fuse_bringdown(eh_tag); - else - printf("fuse-freebsd: version %s, FUSE ABI %d.%d\n", - FUSE_FREEBSD_VERSION, - FUSE_KERNEL_VERSION, FUSE_KERNEL_MINOR_VERSION); - break; case MOD_UNLOAD: if ((err = vfs_modevent(NULL, what, &fuse_vfsconf))) return (err); fuse_bringdown(eh_tag); uma_zdestroy(fuse_pbuf_zone); break; default: return (EINVAL); } return (err); } /* Registering the module */ static moduledata_t fuse_moddata = { "fusefs", fuse_loader, &fuse_vfsconf }; DECLARE_MODULE(fusefs, fuse_moddata, SI_SUB_VFS, SI_ORDER_MIDDLE); MODULE_VERSION(fusefs, 1); Index: head/sys/fs/fuse/fuse_node.c =================================================================== --- head/sys/fs/fuse/fuse_node.c (revision 350664) +++ head/sys/fs/fuse/fuse_node.c (revision 350665) @@ -1,428 +1,500 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. 
* * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include -#include #include +#include +#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include -#include #include +#include #include #include #include #include "fuse.h" #include "fuse_node.h" #include "fuse_internal.h" #include "fuse_io.h" #include "fuse_ipc.h" -SDT_PROVIDER_DECLARE(fuse); +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , node, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , node, trace, "int", "char*"); MALLOC_DEFINE(M_FUSEVN, "fuse_vnode", "fuse vnode private data"); static int sysctl_fuse_cache_mode(SYSCTL_HANDLER_ARGS); -static int fuse_node_count = 0; +static counter_u64_t fuse_node_count; -SYSCTL_INT(_vfs_fusefs, OID_AUTO, node_count, CTLFLAG_RD, - &fuse_node_count, 0, "Count of FUSE vnodes"); +SYSCTL_COUNTER_U64(_vfs_fusefs_stats, OID_AUTO, node_count, CTLFLAG_RD, + &fuse_node_count, "Count of FUSE vnodes"); int fuse_data_cache_mode = FUSE_CACHE_WT; +/* + * DEPRECATED + * This sysctl is no longer needed as of fuse protocol 7.23. 
Individual + * servers can select the cache behavior they need for each mountpoint: + * - writethrough: the default + * - writeback: set FUSE_WRITEBACK_CACHE in fuse_init_out.flags + * - uncached: set FOPEN_DIRECT_IO for every file + * The sysctl is retained primarily for use by jails supporting older FUSE + * protocols. It may be removed entirely once FreeBSD 11.3 and 12.0 are EOL. + */ SYSCTL_PROC(_vfs_fusefs, OID_AUTO, data_cache_mode, CTLTYPE_INT|CTLFLAG_RW, &fuse_data_cache_mode, 0, sysctl_fuse_cache_mode, "I", "Zero: disable caching of FUSE file data; One: write-through caching " "(default); Two: write-back caching (generally unsafe)"); -int fuse_data_cache_invalidate = 0; - -SYSCTL_INT(_vfs_fusefs, OID_AUTO, data_cache_invalidate, CTLFLAG_RW, - &fuse_data_cache_invalidate, 0, - "If non-zero, discard cached clean file data when there are no active file" - " users"); - -int fuse_mmap_enable = 1; - -SYSCTL_INT(_vfs_fusefs, OID_AUTO, mmap_enable, CTLFLAG_RW, - &fuse_mmap_enable, 0, - "If non-zero, and data_cache_mode is also non-zero, enable mmap(2) of " - "FUSE files"); - -int fuse_refresh_size = 0; - -SYSCTL_INT(_vfs_fusefs, OID_AUTO, refresh_size, CTLFLAG_RW, - &fuse_refresh_size, 0, - "If non-zero, and no dirty file extension data is buffered, fetch file " - "size before write operations"); - -int fuse_sync_resize = 1; - -SYSCTL_INT(_vfs_fusefs, OID_AUTO, sync_resize, CTLFLAG_RW, - &fuse_sync_resize, 0, - "If a cached write extended a file, inform FUSE filesystem of the changed" - "size immediately subsequent to the issued writes"); - -int fuse_fix_broken_io = 0; - -SYSCTL_INT(_vfs_fusefs, OID_AUTO, fix_broken_io, CTLFLAG_RW, - &fuse_fix_broken_io, 0, - "If non-zero, print a diagnostic warning if a userspace filesystem returns" - " EIO on reads of recently extended portions of files"); - static int sysctl_fuse_cache_mode(SYSCTL_HANDLER_ARGS) { int val, error; val = *(int *)arg1; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr) return 
(error); switch (val) { case FUSE_CACHE_UC: case FUSE_CACHE_WT: case FUSE_CACHE_WB: *(int *)arg1 = val; break; default: return (EDOM); } return (0); } static void fuse_vnode_init(struct vnode *vp, struct fuse_vnode_data *fvdat, uint64_t nodeid, enum vtype vtyp) { - int i; - fvdat->nid = nodeid; + LIST_INIT(&fvdat->handles); vattr_null(&fvdat->cached_attrs); if (nodeid == FUSE_ROOT_ID) { vp->v_vflag |= VV_ROOT; } vp->v_type = vtyp; vp->v_data = fvdat; - for (i = 0; i < FUFH_MAXTYPE; i++) - fvdat->fufh[i].fh_type = FUFH_INVALID; - - atomic_add_acq_int(&fuse_node_count, 1); + counter_u64_add(fuse_node_count, 1); } void fuse_vnode_destroy(struct vnode *vp) { struct fuse_vnode_data *fvdat = vp->v_data; vp->v_data = NULL; + KASSERT(LIST_EMPTY(&fvdat->handles), + ("Destroying fuse vnode with open files!")); free(fvdat, M_FUSEVN); - atomic_subtract_acq_int(&fuse_node_count, 1); + counter_u64_add(fuse_node_count, -1); } -static int +int fuse_vnode_cmp(struct vnode *vp, void *nidp) { return (VTOI(vp) != *((uint64_t *)nidp)); } -static uint32_t inline -fuse_vnode_hash(uint64_t id) -{ - return (fnv_32_buf(&id, sizeof(id), FNV1_32_INIT)); -} - +SDT_PROBE_DEFINE3(fusefs, , node, stale_vnode, "struct vnode*", "enum vtype", + "uint64_t"); static int fuse_vnode_alloc(struct mount *mp, struct thread *td, uint64_t nodeid, enum vtype vtyp, struct vnode **vpp) { + struct fuse_data *data; struct fuse_vnode_data *fvdat; struct vnode *vp2; int err = 0; + data = fuse_get_mpdata(mp); if (vtyp == VNON) { return EINVAL; } *vpp = NULL; err = vfs_hash_get(mp, fuse_vnode_hash(nodeid), LK_EXCLUSIVE, td, vpp, fuse_vnode_cmp, &nodeid); if (err) return (err); if (*vpp) { - MPASS((*vpp)->v_type == vtyp && (*vpp)->v_data != NULL); - SDT_PROBE2(fuse, , node, trace, 1, "vnode taken from hash"); + if ((*vpp)->v_type != vtyp) { + /* + * STALE vnode! 
This probably indicates a buggy + * server, but it could also be the result of a race + * between FUSE_LOOKUP and another client's + * FUSE_UNLINK/FUSE_CREATE + */ + SDT_PROBE3(fusefs, , node, stale_vnode, *vpp, vtyp, + nodeid); + fuse_internal_vnode_disappear(*vpp); + lockmgr((*vpp)->v_vnlock, LK_RELEASE, NULL); + *vpp = NULL; + return (EAGAIN); + } + MPASS((*vpp)->v_data != NULL); + MPASS(VTOFUD(*vpp)->nid == nodeid); + SDT_PROBE2(fusefs, , node, trace, 1, "vnode taken from hash"); return (0); } fvdat = malloc(sizeof(*fvdat), M_FUSEVN, M_WAITOK | M_ZERO); - err = getnewvnode("fuse", mp, &fuse_vnops, vpp); + switch (vtyp) { + case VFIFO: + err = getnewvnode("fuse", mp, &fuse_fifoops, vpp); + break; + default: + err = getnewvnode("fuse", mp, &fuse_vnops, vpp); + break; + } if (err) { free(fvdat, M_FUSEVN); return (err); } lockmgr((*vpp)->v_vnlock, LK_EXCLUSIVE, NULL); fuse_vnode_init(*vpp, fvdat, nodeid, vtyp); err = insmntque(*vpp, mp); ASSERT_VOP_ELOCKED(*vpp, "fuse_vnode_alloc"); if (err) { + lockmgr((*vpp)->v_vnlock, LK_RELEASE, NULL); free(fvdat, M_FUSEVN); *vpp = NULL; return (err); } + /* Disallow async reads for fifos because UFS does. I don't know why */ + if (data->dataflags & FSESS_ASYNC_READ && vtyp != VFIFO) + VN_LOCK_ASHARE(*vpp); + err = vfs_hash_insert(*vpp, fuse_vnode_hash(nodeid), LK_EXCLUSIVE, td, &vp2, fuse_vnode_cmp, &nodeid); - if (err) + if (err) { + lockmgr((*vpp)->v_vnlock, LK_RELEASE, NULL); + free(fvdat, M_FUSEVN); + *vpp = NULL; return (err); + } if (vp2 != NULL) { *vpp = vp2; return (0); } ASSERT_VOP_ELOCKED(*vpp, "fuse_vnode_alloc"); return (0); } int fuse_vnode_get(struct mount *mp, struct fuse_entry_out *feo, uint64_t nodeid, struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, enum vtype vtyp) { struct thread *td = (cnp != NULL ? cnp->cn_thread : curthread); + /* + * feo should only be NULL for the root directory, which (when libfuse + * is used) always has generation 0 + */ + uint64_t generation = feo ? 
feo->generation : 0; int err = 0; err = fuse_vnode_alloc(mp, td, nodeid, vtyp, vpp); if (err) { return err; } if (dvp != NULL) { - MPASS((cnp->cn_flags & ISDOTDOT) == 0); - MPASS(!(cnp->cn_namelen == 1 && cnp->cn_nameptr[0] == '.')); + MPASS(cnp && (cnp->cn_flags & ISDOTDOT) == 0); + MPASS(cnp && + !(cnp->cn_namelen == 1 && cnp->cn_nameptr[0] == '.')); fuse_vnode_setparent(*vpp, dvp); } if (dvp != NULL && cnp != NULL && (cnp->cn_flags & MAKEENTRY) != 0 && feo != NULL && (feo->entry_valid != 0 || feo->entry_valid_nsec != 0)) { + struct timespec timeout; + ASSERT_VOP_LOCKED(*vpp, "fuse_vnode_get"); ASSERT_VOP_LOCKED(dvp, "fuse_vnode_get"); - cache_enter(dvp, *vpp, cnp); + + fuse_validity_2_timespec(feo, &timeout); + cache_enter_time(dvp, *vpp, cnp, &timeout, NULL); } + VTOFUD(*vpp)->generation = generation; /* * In userland, libfuse uses cached lookups for dot and dotdot entries, * thus it does not really bump the nlookup counter for forget. - * Follow the same semantic and avoid tu bump it in order to keep + * Follow the same semantic and avoid the bump in order to keep * nlookup counters consistent. */ if (cnp == NULL || ((cnp->cn_flags & ISDOTDOT) == 0 && (cnp->cn_namelen != 1 || cnp->cn_nameptr[0] != '.'))) VTOFUD(*vpp)->nlookup++; return 0; } +/* + * Called for every fusefs vnode open to initialize the vnode (not + * fuse_filehandle) for use + */ void fuse_vnode_open(struct vnode *vp, int32_t fuse_open_flags, struct thread *td) { - /* - * Funcation is called for every vnode open. - * Merge fuse_open_flags it may be 0 - */ - /* - * Ideally speaking, direct io should be enabled on - * fd's but do not see of any way of providing that - * this implementation. - * - * Also cannot think of a reason why would two - * different fd's on same vnode would like - * have DIRECT_IO turned on and off. But linux - * based implementation works on an fd not an - * inode and provides such a feature. 
- * - * XXXIP: Handle fd based DIRECT_IO - */ - if (fuse_open_flags & FOPEN_DIRECT_IO) { - ASSERT_VOP_ELOCKED(vp, __func__); - VTOFUD(vp)->flag |= FN_DIRECTIO; - fuse_io_invalbuf(vp, td); - } else { - if ((fuse_open_flags & FOPEN_KEEP_CACHE) == 0) - fuse_io_invalbuf(vp, td); - VTOFUD(vp)->flag &= ~FN_DIRECTIO; - } - - if (vnode_vtype(vp) == VREG) { - /* XXXIP prevent getattr, by using cached node size */ + if (vnode_vtype(vp) == VREG) vnode_create_vobject(vp, 0, td); - } } int -fuse_vnode_savesize(struct vnode *vp, struct ucred *cred) +fuse_vnode_savesize(struct vnode *vp, struct ucred *cred, pid_t pid) { struct fuse_vnode_data *fvdat = VTOFUD(vp); struct thread *td = curthread; struct fuse_filehandle *fufh = NULL; struct fuse_dispatcher fdi; struct fuse_setattr_in *fsai; int err = 0; ASSERT_VOP_ELOCKED(vp, "fuse_io_extend"); if (fuse_isdeadfs(vp)) { return EBADF; } if (vnode_vtype(vp) == VDIR) { return EISDIR; } if (vfs_isrdonly(vnode_mount(vp))) { return EROFS; } if (cred == NULL) { cred = td->td_ucred; } fdisp_init(&fdi, sizeof(*fsai)); fdisp_make_vp(&fdi, FUSE_SETATTR, vp, td, cred); fsai = fdi.indata; fsai->valid = 0; /* Truncate to a new value. 
*/ - fsai->size = fvdat->filesize; + MPASS((fvdat->flag & FN_SIZECHANGE) != 0); + fsai->size = fvdat->cached_attrs.va_size; fsai->valid |= FATTR_SIZE; - fuse_filehandle_getrw(vp, FUFH_WRONLY, &fufh); + fuse_filehandle_getrw(vp, FWRITE, &fufh, cred, pid); if (fufh) { fsai->fh = fufh->fh_id; fsai->valid |= FATTR_FH; } err = fdisp_wait_answ(&fdi); fdisp_destroy(&fdi); if (err == 0) fvdat->flag &= ~FN_SIZECHANGE; return err; } -void -fuse_vnode_refreshsize(struct vnode *vp, struct ucred *cred) -{ - - struct fuse_vnode_data *fvdat = VTOFUD(vp); - struct vattr va; - - if ((fvdat->flag & FN_SIZECHANGE) != 0 || - fuse_data_cache_mode == FUSE_CACHE_UC || - (fuse_refresh_size == 0 && fvdat->filesize != 0)) - return; - - VOP_GETATTR(vp, &va, cred); - SDT_PROBE2(fuse, , node, trace, 1, "refreshed file size"); -} - +/* + * Adjust the vnode's size to a new value, such as that provided by + * FUSE_GETATTR. + */ int fuse_vnode_setsize(struct vnode *vp, off_t newsize) { struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct vattr *attrs; off_t oldsize; + size_t iosize; + struct buf *bp = NULL; int err = 0; ASSERT_VOP_ELOCKED(vp, "fuse_vnode_setsize"); - oldsize = fvdat->filesize; - fvdat->filesize = newsize; - fvdat->flag |= FN_SIZECHANGE; + iosize = fuse_iosize(vp); + oldsize = fvdat->cached_attrs.va_size; + fvdat->cached_attrs.va_size = newsize; + if ((attrs = VTOVA(vp)) != NULL) + attrs->va_size = newsize; if (newsize < oldsize) { + daddr_t lbn; + err = vtruncbuf(vp, newsize, fuse_iosize(vp)); + if (err) + goto out; + if (newsize % iosize == 0) + goto out; + /* + * Zero the contents of the last partial block. + * Sure seems like vtruncbuf should do this for us. 
+ */ + + lbn = newsize / iosize; + bp = getblk(vp, lbn, iosize, PCATCH, 0, 0); + if (!bp) { + err = EINTR; + goto out; + } + if (!(bp->b_flags & B_CACHE)) + goto out; /* Nothing to do */ + MPASS(bp->b_flags & B_VMIO); + vfs_bio_clrbuf(bp); + bp->b_dirtyend = MIN(bp->b_dirtyend, newsize - lbn * iosize); } +out: + if (bp) + brelse(bp); vnode_pager_setsize(vp, newsize); return err; +} + +/* Get the current, possibly dirty, size of the file */ +int +fuse_vnode_size(struct vnode *vp, off_t *filesize, struct ucred *cred, + struct thread *td) +{ + struct fuse_vnode_data *fvdat = VTOFUD(vp); + int error = 0; + + if (!(fvdat->flag & FN_SIZECHANGE) && + (VTOVA(vp) == NULL || fvdat->cached_attrs.va_size == VNOVAL)) + error = fuse_internal_do_getattr(vp, NULL, cred, td); + + if (!error) + *filesize = fvdat->cached_attrs.va_size; + + return error; +} + +void +fuse_vnode_undirty_cached_timestamps(struct vnode *vp) +{ + struct fuse_vnode_data *fvdat = VTOFUD(vp); + + fvdat->flag &= ~(FN_MTIMECHANGE | FN_CTIMECHANGE); +} + +/* Update a fuse file's cached timestamps */ +void +fuse_vnode_update(struct vnode *vp, int flags) +{ + struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct fuse_data *data = fuse_get_mpdata(vnode_mount(vp)); + struct timespec ts; + + vfs_timestamp(&ts); + + if (data->time_gran > 1) + ts.tv_nsec = rounddown(ts.tv_nsec, data->time_gran); + + if (flags & FN_MTIMECHANGE) + fvdat->cached_attrs.va_mtime = ts; + if (flags & FN_CTIMECHANGE) + fvdat->cached_attrs.va_ctime = ts; + + fvdat->flag |= flags; +} + +void +fuse_node_init(void) +{ + fuse_node_count = counter_u64_alloc(M_WAITOK); +} + +void +fuse_node_destroy(void) +{ + counter_u64_free(fuse_node_count); } Index: head/sys/fs/fuse/fuse_node.h =================================================================== --- head/sys/fs/fuse/fuse_node.h (revision 350664) +++ head/sys/fs/fuse/fuse_node.h (revision 350665) @@ -1,132 +1,202 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. 
and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _FUSE_NODE_H_ #define _FUSE_NODE_H_ +#include #include #include #include "fuse_file.h" -#define FN_REVOKED 0x00000020 -#define FN_FLUSHINPROG 0x00000040 -#define FN_FLUSHWANT 0x00000080 -#define FN_SIZECHANGE 0x00000100 -#define FN_DIRECTIO 0x00000200 +#define FN_REVOKED 0x00000020 +#define FN_FLUSHINPROG 0x00000040 +#define FN_FLUSHWANT 0x00000080 +/* + * Indicates that the file's size is dirty; the kernel has changed it but not + * yet sent the change to the daemon. When this bit is set, the + * cached_attrs.va_size field does not time out. + */ +#define FN_SIZECHANGE 0x00000100 +#define FN_DIRECTIO 0x00000200 +/* Indicates that parent_nid is valid */ +#define FN_PARENT_NID 0x00000400 +/* + * Indicates that the file's cached timestamps are dirty. They will be flushed + * during the next SETATTR or WRITE. Until then, the cached fields will not + * time out.
+ */ +#define FN_MTIMECHANGE 0x00000800 +#define FN_CTIMECHANGE 0x00001000 + struct fuse_vnode_data { /** self **/ uint64_t nid; + uint64_t generation; /** parent **/ - /* XXXIP very likely to be stale, it's not updated in rename() */ uint64_t parent_nid; /** I/O **/ - struct fuse_filehandle fufh[FUFH_MAXTYPE]; + /* List of file handles for all of the vnode's open file descriptors */ + LIST_HEAD(, fuse_filehandle) handles; /** flags **/ uint32_t flag; /** meta **/ - bool valid_attr_cache; + /* The monotonic time after which the attr cache is invalid */ + struct bintime attr_cache_timeout; + /* + * Monotonic time after which the entry is invalid. Used for lookups + * by nodeid instead of pathname. + */ + struct bintime entry_cache_timeout; struct vattr cached_attrs; - off_t filesize; uint64_t nlookup; enum vtype vtype; }; +/* + * This overlays the fid structure (see mount.h). Mostly the same as the types + * used by UFS and ext2. + */ +struct fuse_fid { + uint16_t len; /* Length of structure. */ + uint16_t pad; /* Force 32-bit alignment. */ + uint32_t gen; /* Generation number. */ + uint64_t nid; /* FUSE node id. */ +}; + #define VTOFUD(vp) \ ((struct fuse_vnode_data *)((vp)->v_data)) #define VTOI(vp) (VTOFUD(vp)->nid) -#define VTOVA(vp) \ - (VTOFUD(vp)->valid_attr_cache ? \ - &(VTOFUD(vp)->cached_attrs) : NULL) +static inline struct vattr* +VTOVA(struct vnode *vp) +{ + struct bintime now; + + getbinuptime(&now); + if (bintime_cmp(&(VTOFUD(vp)->attr_cache_timeout), &now, >)) + return &(VTOFUD(vp)->cached_attrs); + else + return NULL; +} + +static inline void +fuse_vnode_clear_attr_cache(struct vnode *vp) +{ + bintime_clear(&VTOFUD(vp)->attr_cache_timeout); +} + +static uint32_t inline +fuse_vnode_hash(uint64_t id) +{ + return (fnv_32_buf(&id, sizeof(id), FNV1_32_INIT)); +} + #define VTOILLU(vp) ((uint64_t)(VTOFUD(vp) ? 
VTOI(vp) : 0)) #define FUSE_NULL_ID 0 +extern struct vop_vector fuse_fifoops; extern struct vop_vector fuse_vnops; +int fuse_vnode_cmp(struct vnode *vp, void *nidp); + static inline void fuse_vnode_setparent(struct vnode *vp, struct vnode *dvp) { if (dvp != NULL && vp->v_type == VDIR) { MPASS(dvp->v_type == VDIR); VTOFUD(vp)->parent_nid = VTOI(dvp); + VTOFUD(vp)->flag |= FN_PARENT_NID; } } +int fuse_vnode_size(struct vnode *vp, off_t *filesize, struct ucred *cred, + struct thread *td); + void fuse_vnode_destroy(struct vnode *vp); int fuse_vnode_get(struct mount *mp, struct fuse_entry_out *feo, uint64_t nodeid, struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, enum vtype vtyp); void fuse_vnode_open(struct vnode *vp, int32_t fuse_open_flags, struct thread *td); -void fuse_vnode_refreshsize(struct vnode *vp, struct ucred *cred); +int fuse_vnode_savesize(struct vnode *vp, struct ucred *cred, pid_t pid); -int fuse_vnode_savesize(struct vnode *vp, struct ucred *cred); - int fuse_vnode_setsize(struct vnode *vp, off_t newsize); +void fuse_vnode_undirty_cached_timestamps(struct vnode *vp); + +void fuse_vnode_update(struct vnode *vp, int flags); + +void fuse_node_init(void); +void fuse_node_destroy(void); #endif /* _FUSE_NODE_H_ */ Index: head/sys/fs/fuse/fuse_vfsops.c =================================================================== --- head/sys/fs/fuse/fuse_vfsops.c (revision 350664) +++ head/sys/fs/fuse/fuse_vfsops.c (revision 350665) @@ -1,529 +1,692 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. 
* * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" -#include "fuse_param.h" #include "fuse_node.h" #include "fuse_ipc.h" #include "fuse_internal.h" #include #include -SDT_PROVIDER_DECLARE(fuse); +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. 
Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , vfsops, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , vfsops, trace, "int", "char*"); /* This will do for privilege types for now */ #ifndef PRIV_VFS_FUSE_ALLOWOTHER #define PRIV_VFS_FUSE_ALLOWOTHER PRIV_VFS_MOUNT_NONUSER #endif #ifndef PRIV_VFS_FUSE_MOUNT_NONUSER #define PRIV_VFS_FUSE_MOUNT_NONUSER PRIV_VFS_MOUNT_NONUSER #endif #ifndef PRIV_VFS_FUSE_SYNC_UNMOUNT #define PRIV_VFS_FUSE_SYNC_UNMOUNT PRIV_VFS_MOUNT_NONUSER #endif +static vfs_fhtovp_t fuse_vfsop_fhtovp; static vfs_mount_t fuse_vfsop_mount; static vfs_unmount_t fuse_vfsop_unmount; static vfs_root_t fuse_vfsop_root; static vfs_statfs_t fuse_vfsop_statfs; +static vfs_vget_t fuse_vfsop_vget; struct vfsops fuse_vfsops = { + .vfs_fhtovp = fuse_vfsop_fhtovp, .vfs_mount = fuse_vfsop_mount, .vfs_unmount = fuse_vfsop_unmount, .vfs_root = fuse_vfsop_root, .vfs_statfs = fuse_vfsop_statfs, + .vfs_vget = fuse_vfsop_vget, }; -SYSCTL_INT(_vfs_fusefs, OID_AUTO, init_backgrounded, CTLFLAG_RD, - SYSCTL_NULL_INT_PTR, 1, "indicate async handshake"); static int fuse_enforce_dev_perms = 0; SYSCTL_INT(_vfs_fusefs, OID_AUTO, enforce_dev_perms, CTLFLAG_RW, &fuse_enforce_dev_perms, 0, "enforce fuse device permissions for secondary mounts"); -static unsigned sync_unmount = 1; -SYSCTL_UINT(_vfs_fusefs, OID_AUTO, sync_unmount, CTLFLAG_RW, - &sync_unmount, 0, "specify when to use synchronous unmount"); - MALLOC_DEFINE(M_FUSEVFS, "fuse_filesystem", "buffer for fuse vfs layer"); static int fuse_getdevice(const char *fspec, struct thread *td, struct cdev **fdevp) { struct nameidata nd, *ndp = &nd; struct vnode *devvp; struct cdev *fdev; int err; /* * Not an update, or updating the name: look up the name * and verify that it refers to a sensible disk device. 
*/ NDINIT(ndp, LOOKUP, FOLLOW, UIO_SYSSPACE, fspec, td); if ((err = namei(ndp)) != 0) return err; NDFREE(ndp, NDF_ONLY_PNBUF); devvp = ndp->ni_vp; if (devvp->v_type != VCHR) { vrele(devvp); return ENXIO; } fdev = devvp->v_rdev; dev_ref(fdev); if (fuse_enforce_dev_perms) { /* * Check if mounter can open the fuse device. * * This has significance only if we are doing a secondary mount * which doesn't involve actually opening fuse devices, but we * still want to enforce the permissions of the device (in * order to keep control over the circle of fuse users). * * (In case of primary mounts, we are either the superuser so * we can do anything anyway, or we can mount only if the * device is already opened by us, ie. we are permitted to open * the device.) */ #if 0 #ifdef MAC err = mac_check_vnode_open(td->td_ucred, devvp, VREAD | VWRITE); if (!err) #endif #endif /* 0 */ err = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td); if (err) { vrele(devvp); dev_rel(fdev); return err; } } /* * according to coda code, no extra lock is needed -- * although in sys/vnode.h this field is marked "v" */ vrele(devvp); if (!fdev->si_devsw || strcmp("fuse", fdev->si_devsw->d_name)) { dev_rel(fdev); return ENXIO; } *fdevp = fdev; return 0; } #define FUSE_FLAGOPT(fnam, fval) do { \ vfs_flagopt(opts, #fnam, &mntopts, fval); \ vfs_flagopt(opts, "__" #fnam, &__mntopts, fval); \ } while (0) -SDT_PROBE_DEFINE1(fuse, , vfsops, mntopts, "uint64_t"); -SDT_PROBE_DEFINE4(fuse, , vfsops, mount_err, "char*", "struct fuse_data*", +SDT_PROBE_DEFINE1(fusefs, , vfsops, mntopts, "uint64_t"); +SDT_PROBE_DEFINE4(fusefs, , vfsops, mount_err, "char*", "struct fuse_data*", "struct mount*", "int"); static int +fuse_vfs_remount(struct mount *mp, struct thread *td, uint64_t mntopts, + uint32_t max_read, int daemon_timeout) +{ + int err = 0; + struct fuse_data *data = fuse_get_mpdata(mp); + /* Don't allow these options to be changed */ + const static unsigned long long cant_update_opts = + MNT_USER; /* Mount owner 
must be the user running the daemon */ + + FUSE_LOCK(); + + if ((mp->mnt_flag ^ data->mnt_flag) & cant_update_opts) { + err = EOPNOTSUPP; + SDT_PROBE4(fusefs, , vfsops, mount_err, + "Can't change these mount options during remount", + data, mp, err); + goto out; + } + if (((data->dataflags ^ mntopts) & FSESS_MNTOPTS_MASK) || + (data->max_read != max_read) || + (data->daemon_timeout != daemon_timeout)) { + // TODO: allow changing options where it makes sense + err = EOPNOTSUPP; + SDT_PROBE4(fusefs, , vfsops, mount_err, + "Can't change fuse mount options during remount", + data, mp, err); + goto out; + } + + if (fdata_get_dead(data)) { + err = ENOTCONN; + SDT_PROBE4(fusefs, , vfsops, mount_err, + "device is dead during mount", data, mp, err); + goto out; + } + + /* Sanity + permission checks */ + if (!data->daemoncred) + panic("fuse daemon found, but identity unknown"); + if (mntopts & FSESS_DAEMON_CAN_SPY) + err = priv_check(td, PRIV_VFS_FUSE_ALLOWOTHER); + if (err == 0 && td->td_ucred->cr_uid != data->daemoncred->cr_uid) + /* are we allowed to do the first mount? 
*/ + err = priv_check(td, PRIV_VFS_FUSE_MOUNT_NONUSER); + +out: + FUSE_UNLOCK(); + return err; +} + +static int +fuse_vfsop_fhtovp(struct mount *mp, struct fid *fhp, int flags, + struct vnode **vpp) +{ + struct fuse_fid *ffhp = (struct fuse_fid *)fhp; + struct fuse_vnode_data *fvdat; + struct vnode *nvp; + int error; + + if (!(fuse_get_mpdata(mp)->dataflags & FSESS_EXPORT_SUPPORT)) + return EOPNOTSUPP; + + error = VFS_VGET(mp, ffhp->nid, LK_EXCLUSIVE, &nvp); + if (error) { + *vpp = NULLVP; + return (error); + } + fvdat = VTOFUD(nvp); + if (fvdat->generation != ffhp->gen ) { + vput(nvp); + *vpp = NULLVP; + return (ESTALE); + } + *vpp = nvp; + vnode_create_vobject(*vpp, 0, curthread); + return (0); +} + +static int fuse_vfsop_mount(struct mount *mp) { int err; uint64_t mntopts, __mntopts; uint32_t max_read; int daemon_timeout; int fd; size_t len; struct cdev *fdev; struct fuse_data *data = NULL; struct thread *td; struct file *fp, *fptmp; char *fspec, *subtype; struct vfsoptlist *opts; subtype = NULL; max_read = ~0; err = 0; mntopts = 0; __mntopts = 0; td = curthread; - if (mp->mnt_flag & MNT_UPDATE) - return EOPNOTSUPP; - - MNT_ILOCK(mp); - mp->mnt_flag |= MNT_SYNCHRONOUS; - mp->mnt_data = NULL; - MNT_IUNLOCK(mp); /* Get the new options passed to mount */ opts = mp->mnt_optnew; if (!opts) return EINVAL; /* `fspath' contains the mount point (eg. /mnt/fuse/sshfs); REQUIRED */ if (!vfs_getopts(opts, "fspath", &err)) return err; - /* `from' contains the device name (eg. 
/dev/fuse0); REQUIRED */ - fspec = vfs_getopts(opts, "from", &err); - if (!fspec) - return err; - - /* `fd' contains the filedescriptor for this session; REQUIRED */ - if (vfs_scanopt(opts, "fd", "%d", &fd) != 1) - return EINVAL; - - err = fuse_getdevice(fspec, td, &fdev); - if (err != 0) - return err; - /* * With the help of underscored options the mount program * can inform us from the flags it sets by default */ FUSE_FLAGOPT(allow_other, FSESS_DAEMON_CAN_SPY); FUSE_FLAGOPT(push_symlinks_in, FSESS_PUSH_SYMLINKS_IN); FUSE_FLAGOPT(default_permissions, FSESS_DEFAULT_PERMISSIONS); - FUSE_FLAGOPT(no_attrcache, FSESS_NO_ATTRCACHE); - FUSE_FLAGOPT(no_readahed, FSESS_NO_READAHEAD); - FUSE_FLAGOPT(no_datacache, FSESS_NO_DATACACHE); - FUSE_FLAGOPT(no_namecache, FSESS_NO_NAMECACHE); - FUSE_FLAGOPT(no_mmap, FSESS_NO_MMAP); - FUSE_FLAGOPT(brokenio, FSESS_BROKENIO); + FUSE_FLAGOPT(intr, FSESS_INTR); (void)vfs_scanopt(opts, "max_read=", "%u", &max_read); if (vfs_scanopt(opts, "timeout=", "%u", &daemon_timeout) == 1) { if (daemon_timeout < FUSE_MIN_DAEMON_TIMEOUT) daemon_timeout = FUSE_MIN_DAEMON_TIMEOUT; else if (daemon_timeout > FUSE_MAX_DAEMON_TIMEOUT) daemon_timeout = FUSE_MAX_DAEMON_TIMEOUT; } else { daemon_timeout = FUSE_DEFAULT_DAEMON_TIMEOUT; } subtype = vfs_getopts(opts, "subtype=", &err); - SDT_PROBE1(fuse, , vfsops, mntopts, mntopts); + SDT_PROBE1(fusefs, , vfsops, mntopts, mntopts); + if (mp->mnt_flag & MNT_UPDATE) { + return fuse_vfs_remount(mp, td, mntopts, max_read, + daemon_timeout); + } + + /* `from' contains the device name (eg. 
/dev/fuse0); REQUIRED */ + fspec = vfs_getopts(opts, "from", &err); + if (!fspec) + return err; + + /* `fd' contains the filedescriptor for this session; REQUIRED */ + if (vfs_scanopt(opts, "fd", "%d", &fd) != 1) + return EINVAL; + + err = fuse_getdevice(fspec, td, &fdev); + if (err != 0) + return err; + err = fget(td, fd, &cap_read_rights, &fp); if (err != 0) { - SDT_PROBE2(fuse, , vfsops, trace, 1, + SDT_PROBE2(fusefs, , vfsops, trace, 1, "invalid or not opened device"); goto out; } fptmp = td->td_fpop; td->td_fpop = fp; err = devfs_get_cdevpriv((void **)&data); td->td_fpop = fptmp; fdrop(fp, td); FUSE_LOCK(); - if (err != 0 || data == NULL || data->mp != NULL) { + + if (err != 0 || data == NULL) { err = ENXIO; - SDT_PROBE4(fuse, , vfsops, mount_err, + SDT_PROBE4(fusefs, , vfsops, mount_err, "invalid or not opened device", data, mp, err); FUSE_UNLOCK(); goto out; } if (fdata_get_dead(data)) { err = ENOTCONN; - SDT_PROBE4(fuse, , vfsops, mount_err, + SDT_PROBE4(fusefs, , vfsops, mount_err, "device is dead during mount", data, mp, err); FUSE_UNLOCK(); goto out; } /* Sanity + permission checks */ if (!data->daemoncred) panic("fuse daemon found, but identity unknown"); if (mntopts & FSESS_DAEMON_CAN_SPY) err = priv_check(td, PRIV_VFS_FUSE_ALLOWOTHER); if (err == 0 && td->td_ucred->cr_uid != data->daemoncred->cr_uid) /* are we allowed to do the first mount? */ err = priv_check(td, PRIV_VFS_FUSE_MOUNT_NONUSER); if (err) { FUSE_UNLOCK(); goto out; } data->ref++; data->mp = mp; data->dataflags |= mntopts; data->max_read = max_read; data->daemon_timeout = daemon_timeout; + data->mnt_flag = mp->mnt_flag & MNT_UPDATEMASK; FUSE_UNLOCK(); vfs_getnewfsid(mp); MNT_ILOCK(mp); mp->mnt_data = data; - mp->mnt_flag |= MNT_LOCAL; + /* + * FUSE file systems can be either local or remote, but the kernel + * can't tell the difference. 
+ */ + mp->mnt_flag &= ~MNT_LOCAL; mp->mnt_kern_flag |= MNTK_USES_BCACHE; MNT_IUNLOCK(mp); /* We need this here as this slot is used by getnewvnode() */ mp->mnt_stat.f_iosize = maxbcachebuf; if (subtype) { strlcat(mp->mnt_stat.f_fstypename, ".", MFSNAMELEN); strlcat(mp->mnt_stat.f_fstypename, subtype, MFSNAMELEN); } copystr(fspec, mp->mnt_stat.f_mntfromname, MNAMELEN - 1, &len); bzero(mp->mnt_stat.f_mntfromname + len, MNAMELEN - len); + mp->mnt_iosize_max = MAXPHYS; /* Now handshaking with daemon */ fuse_internal_send_init(data, td); out: if (err) { FUSE_LOCK(); if (data != NULL && data->mp == mp) { /* * Destroy device only if we acquired reference to * it */ - SDT_PROBE4(fuse, , vfsops, mount_err, + SDT_PROBE4(fusefs, , vfsops, mount_err, "mount failed, destroy device", data, mp, err); data->mp = NULL; + mp->mnt_data = NULL; fdata_trydestroy(data); } FUSE_UNLOCK(); dev_rel(fdev); } return err; } static int fuse_vfsop_unmount(struct mount *mp, int mntflags) { int err = 0; int flags = 0; struct cdev *fdev; struct fuse_data *data; struct fuse_dispatcher fdi; struct thread *td = curthread; if (mntflags & MNT_FORCE) { flags |= FORCECLOSE; } data = fuse_get_mpdata(mp); if (!data) { panic("no private data for mount point?"); } /* There is 1 extra root vnode reference (mp->mnt_data). 
*/ FUSE_LOCK(); if (data->vroot != NULL) { struct vnode *vroot = data->vroot; data->vroot = NULL; FUSE_UNLOCK(); vrele(vroot); } else FUSE_UNLOCK(); err = vflush(mp, 0, flags, td); if (err) { return err; } if (fdata_get_dead(data)) { goto alreadydead; } - fdisp_init(&fdi, 0); - fdisp_make(&fdi, FUSE_DESTROY, mp, 0, td, NULL); + if (fsess_isimpl(mp, FUSE_DESTROY)) { + fdisp_init(&fdi, 0); + fdisp_make(&fdi, FUSE_DESTROY, mp, 0, td, NULL); - err = fdisp_wait_answ(&fdi); - fdisp_destroy(&fdi); + (void)fdisp_wait_answ(&fdi); + fdisp_destroy(&fdi); + } fdata_set_dead(data); alreadydead: FUSE_LOCK(); data->mp = NULL; fdev = data->fdev; fdata_trydestroy(data); FUSE_UNLOCK(); MNT_ILOCK(mp); mp->mnt_data = NULL; - mp->mnt_flag &= ~MNT_LOCAL; MNT_IUNLOCK(mp); dev_rel(fdev); return 0; } +SDT_PROBE_DEFINE1(fusefs, , vfsops, invalidate_without_export, + "struct mount*"); static int +fuse_vfsop_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp) +{ + struct fuse_data *data = fuse_get_mpdata(mp); + uint64_t nodeid = ino; + struct thread *td = curthread; + struct fuse_dispatcher fdi; + struct fuse_entry_out *feo; + struct fuse_vnode_data *fvdat; + const char dot[] = "."; + off_t filesize; + enum vtype vtyp; + int error; + + if (!(data->dataflags & FSESS_EXPORT_SUPPORT)) { + /* + * Unreachable unless you do something stupid, like export a + * nullfs mount of a fusefs file system. + */ + SDT_PROBE1(fusefs, , vfsops, invalidate_without_export, mp); + return (EOPNOTSUPP); + } + + error = fuse_internal_get_cached_vnode(mp, ino, flags, vpp); + if (error || *vpp != NULL) + return error; + + /* Do a LOOKUP, using nodeid as the parent and "." 
as filename */ + fdisp_init(&fdi, sizeof(dot)); + fdisp_make(&fdi, FUSE_LOOKUP, mp, nodeid, td, td->td_ucred); + memcpy(fdi.indata, dot, sizeof(dot)); + error = fdisp_wait_answ(&fdi); + + if (error) + return error; + + feo = (struct fuse_entry_out *)fdi.answ; + if (feo->nodeid == 0) { + /* zero nodeid means ENOENT and cache it */ + error = ENOENT; + goto out; + } + + vtyp = IFTOVT(feo->attr.mode); + error = fuse_vnode_get(mp, feo, nodeid, NULL, vpp, NULL, vtyp); + if (error) + goto out; + filesize = feo->attr.size; + + /* + * In the case where we are looking up a FUSE node represented by an + * existing cached vnode, and the true size reported by FUSE_LOOKUP + * doesn't match the vnode's cached size, then any cached writes beyond + * the file's current size are lost. + * + * We can get here: + * * following attribute cache expiration, or + * * due to a bug in the daemon + */ + fvdat = VTOFUD(*vpp); + if (vnode_isreg(*vpp) && + filesize != fvdat->cached_attrs.va_size && + fvdat->flag & FN_SIZECHANGE) { + printf("%s: WB cache incoherent on %s!\n", __func__, + vnode_mount(*vpp)->mnt_stat.f_mntonname); + + fvdat->flag &= ~FN_SIZECHANGE; + } + + fuse_internal_cache_attrs(*vpp, &feo->attr, feo->attr_valid, + feo->attr_valid_nsec, NULL); + fuse_validity_2_bintime(feo->entry_valid, feo->entry_valid_nsec, + &fvdat->entry_cache_timeout); +out: + fdisp_destroy(&fdi); + return error; +} + +static int fuse_vfsop_root(struct mount *mp, int lkflags, struct vnode **vpp) { struct fuse_data *data = fuse_get_mpdata(mp); int err = 0; if (data->vroot != NULL) { err = vget(data->vroot, lkflags, curthread); if (err == 0) *vpp = data->vroot; } else { err = fuse_vnode_get(mp, NULL, FUSE_ROOT_ID, NULL, vpp, NULL, VDIR); if (err == 0) { FUSE_LOCK(); MPASS(data->vroot == NULL || data->vroot == *vpp); if (data->vroot == NULL) { - SDT_PROBE2(fuse, , vfsops, trace, 1, + SDT_PROBE2(fusefs, , vfsops, trace, 1, "new root vnode"); data->vroot = *vpp; FUSE_UNLOCK(); vref(*vpp); } else if
(data->vroot != *vpp) { - SDT_PROBE2(fuse, , vfsops, trace, 1, + SDT_PROBE2(fusefs, , vfsops, trace, 1, "root vnode race"); FUSE_UNLOCK(); VOP_UNLOCK(*vpp, 0); vrele(*vpp); vrecycle(*vpp); *vpp = data->vroot; } else FUSE_UNLOCK(); } } return err; } static int fuse_vfsop_statfs(struct mount *mp, struct statfs *sbp) { struct fuse_dispatcher fdi; int err = 0; struct fuse_statfs_out *fsfo; struct fuse_data *data; data = fuse_get_mpdata(mp); if (!(data->dataflags & FSESS_INITED)) goto fake; fdisp_init(&fdi, 0); fdisp_make(&fdi, FUSE_STATFS, mp, FUSE_ROOT_ID, NULL, NULL); err = fdisp_wait_answ(&fdi); if (err) { fdisp_destroy(&fdi); if (err == ENOTCONN) { /* * We want to seem a legitimate fs even if the daemon * is stiff dead... (so that, eg., we can still do path * based unmounting after the daemon dies). */ goto fake; } return err; } fsfo = fdi.answ; sbp->f_blocks = fsfo->st.blocks; sbp->f_bfree = fsfo->st.bfree; sbp->f_bavail = fsfo->st.bavail; sbp->f_files = fsfo->st.files; sbp->f_ffree = fsfo->st.ffree; /* cast from uint64_t to int64_t */ sbp->f_namemax = fsfo->st.namelen; sbp->f_bsize = fsfo->st.frsize; /* cast from uint32_t to uint64_t */ fdisp_destroy(&fdi); return 0; fake: sbp->f_blocks = 0; sbp->f_bfree = 0; sbp->f_bavail = 0; sbp->f_files = 0; sbp->f_ffree = 0; sbp->f_namemax = 0; - sbp->f_bsize = FUSE_DEFAULT_BLOCKSIZE; + sbp->f_bsize = S_BLKSIZE; return 0; } Index: head/sys/fs/fuse/fuse_vnops.c =================================================================== --- head/sys/fs/fuse/fuse_vnops.c (revision 350664) +++ head/sys/fs/fuse/fuse_vnops.c (revision 350665) @@ -1,2376 +1,2471 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2007-2009 Google Inc. and Amit Singh * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * * * Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following disclaimer * in the documentation and/or other materials provided with the * distribution. * * Neither the name of Google Inc. nor the names of its * contributors may be used to endorse or promote products derived from * this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Copyright (C) 2005 Csaba Henk. * All rights reserved. * + * Copyright (c) 2019 The FreeBSD Foundation + * + * Portions of this software were developed by BFF Storage Systems, LLC under + * sponsorship from the FreeBSD Foundation. + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "fuse.h" #include "fuse_file.h" #include "fuse_internal.h" #include "fuse_ipc.h" #include "fuse_node.h" -#include "fuse_param.h" #include "fuse_io.h" #include -SDT_PROVIDER_DECLARE(fuse); +/* Maximum number of hardlinks to a single FUSE file */ +#define FUSE_LINK_MAX UINT32_MAX + +SDT_PROVIDER_DECLARE(fusefs); /* * Fuse trace probe: * arg0: verbosity. 
Higher numbers give more verbose messages * arg1: Textual message */ -SDT_PROBE_DEFINE2(fuse, , vnops, trace, "int", "char*"); +SDT_PROBE_DEFINE2(fusefs, , vnops, trace, "int", "char*"); /* vnode ops */ static vop_access_t fuse_vnop_access; +static vop_advlock_t fuse_vnop_advlock; +static vop_bmap_t fuse_vnop_bmap; +static vop_close_t fuse_fifo_close; static vop_close_t fuse_vnop_close; static vop_create_t fuse_vnop_create; static vop_deleteextattr_t fuse_vnop_deleteextattr; +static vop_fdatasync_t fuse_vnop_fdatasync; static vop_fsync_t fuse_vnop_fsync; static vop_getattr_t fuse_vnop_getattr; static vop_getextattr_t fuse_vnop_getextattr; static vop_inactive_t fuse_vnop_inactive; static vop_link_t fuse_vnop_link; static vop_listextattr_t fuse_vnop_listextattr; static vop_lookup_t fuse_vnop_lookup; static vop_mkdir_t fuse_vnop_mkdir; static vop_mknod_t fuse_vnop_mknod; static vop_open_t fuse_vnop_open; static vop_pathconf_t fuse_vnop_pathconf; static vop_read_t fuse_vnop_read; static vop_readdir_t fuse_vnop_readdir; static vop_readlink_t fuse_vnop_readlink; static vop_reclaim_t fuse_vnop_reclaim; static vop_remove_t fuse_vnop_remove; static vop_rename_t fuse_vnop_rename; static vop_rmdir_t fuse_vnop_rmdir; static vop_setattr_t fuse_vnop_setattr; static vop_setextattr_t fuse_vnop_setextattr; static vop_strategy_t fuse_vnop_strategy; static vop_symlink_t fuse_vnop_symlink; static vop_write_t fuse_vnop_write; static vop_getpages_t fuse_vnop_getpages; -static vop_putpages_t fuse_vnop_putpages; static vop_print_t fuse_vnop_print; +static vop_vptofh_t fuse_vnop_vptofh; +struct vop_vector fuse_fifoops = { + .vop_default = &fifo_specops, + .vop_access = fuse_vnop_access, + .vop_close = fuse_fifo_close, + .vop_fsync = fuse_vnop_fsync, + .vop_getattr = fuse_vnop_getattr, + .vop_inactive = fuse_vnop_inactive, + .vop_pathconf = fuse_vnop_pathconf, + .vop_print = fuse_vnop_print, + .vop_read = VOP_PANIC, + .vop_reclaim = fuse_vnop_reclaim, + .vop_setattr = fuse_vnop_setattr, + 
.vop_write = VOP_PANIC, + .vop_vptofh = fuse_vnop_vptofh, +}; + struct vop_vector fuse_vnops = { + .vop_allocate = VOP_EINVAL, .vop_default = &default_vnodeops, .vop_access = fuse_vnop_access, + .vop_advlock = fuse_vnop_advlock, + .vop_bmap = fuse_vnop_bmap, .vop_close = fuse_vnop_close, .vop_create = fuse_vnop_create, .vop_deleteextattr = fuse_vnop_deleteextattr, .vop_fsync = fuse_vnop_fsync, + .vop_fdatasync = fuse_vnop_fdatasync, .vop_getattr = fuse_vnop_getattr, .vop_getextattr = fuse_vnop_getextattr, .vop_inactive = fuse_vnop_inactive, + /* + * TODO: implement vop_ioctl after upgrading to protocol 7.16. + * FUSE_IOCTL was added in 7.11, but 32-bit compat is broken until + * 7.16. + */ .vop_link = fuse_vnop_link, .vop_listextattr = fuse_vnop_listextattr, .vop_lookup = fuse_vnop_lookup, .vop_mkdir = fuse_vnop_mkdir, .vop_mknod = fuse_vnop_mknod, .vop_open = fuse_vnop_open, .vop_pathconf = fuse_vnop_pathconf, + /* + * TODO: implement vop_poll after upgrading to protocol 7.21. + * FUSE_POLL was added in protocol 7.11, but it's kind of broken until + * 7.21, which adds the ability for the client to choose which poll + * events it wants, and for a client to deregister a file handle + */ .vop_read = fuse_vnop_read, .vop_readdir = fuse_vnop_readdir, .vop_readlink = fuse_vnop_readlink, .vop_reclaim = fuse_vnop_reclaim, .vop_remove = fuse_vnop_remove, .vop_rename = fuse_vnop_rename, .vop_rmdir = fuse_vnop_rmdir, .vop_setattr = fuse_vnop_setattr, .vop_setextattr = fuse_vnop_setextattr, .vop_strategy = fuse_vnop_strategy, .vop_symlink = fuse_vnop_symlink, .vop_write = fuse_vnop_write, .vop_getpages = fuse_vnop_getpages, - .vop_putpages = fuse_vnop_putpages, .vop_print = fuse_vnop_print, + .vop_vptofh = fuse_vnop_vptofh, }; -static u_long fuse_lookup_cache_hits = 0; +uma_zone_t fuse_pbuf_zone; -SYSCTL_ULONG(_vfs_fusefs, OID_AUTO, lookup_cache_hits, CTLFLAG_RD, - &fuse_lookup_cache_hits, 0, "number of positive cache hits in lookup"); +#define fuse_vm_page_lock(m) 
vm_page_lock((m)); +#define fuse_vm_page_unlock(m) vm_page_unlock((m)); +#define fuse_vm_page_lock_queues() ((void)0) +#define fuse_vm_page_unlock_queues() ((void)0) -static u_long fuse_lookup_cache_misses = 0; +/* Check permission for extattr operations, much like extattr_check_cred */ +static int +fuse_extattr_check_cred(struct vnode *vp, int ns, struct ucred *cred, + struct thread *td, accmode_t accmode) +{ + struct mount *mp = vnode_mount(vp); + struct fuse_data *data = fuse_get_mpdata(mp); -SYSCTL_ULONG(_vfs_fusefs, OID_AUTO, lookup_cache_misses, CTLFLAG_RD, - &fuse_lookup_cache_misses, 0, "number of cache misses in lookup"); + /* + * Kernel-invoked always succeeds. + */ + if (cred == NOCRED) + return (0); -int fuse_lookup_cache_enable = 1; + /* + * Do not allow privileged processes in jail to directly manipulate + * system attributes. + */ + switch (ns) { + case EXTATTR_NAMESPACE_SYSTEM: + if (data->dataflags & FSESS_DEFAULT_PERMISSIONS) { + return (priv_check_cred(cred, PRIV_VFS_EXTATTR_SYSTEM)); + } + /* FALLTHROUGH */ + case EXTATTR_NAMESPACE_USER: + return (fuse_internal_access(vp, accmode, td, cred)); + default: + return (EPERM); + } +} -SYSCTL_INT(_vfs_fusefs, OID_AUTO, lookup_cache_enable, CTLFLAG_RW, - &fuse_lookup_cache_enable, 0, "if non-zero, enable lookup cache"); +/* Get a filehandle for a directory */ +static int +fuse_filehandle_get_dir(struct vnode *vp, struct fuse_filehandle **fufhp, + struct ucred *cred, pid_t pid) +{ + if (fuse_filehandle_get(vp, FREAD, fufhp, cred, pid) == 0) + return 0; + return fuse_filehandle_get(vp, FEXEC, fufhp, cred, pid); +} -/* - * XXX: This feature is highly experimental and can bring to instabilities, - * needs revisiting before to be enabled by default. 
- */ -static int fuse_reclaim_revoked = 0; +/* Send FUSE_FLUSH for this vnode */ +static int +fuse_flush(struct vnode *vp, struct ucred *cred, pid_t pid, int fflag) +{ + struct fuse_flush_in *ffi; + struct fuse_filehandle *fufh; + struct fuse_dispatcher fdi; + struct thread *td = curthread; + struct mount *mp = vnode_mount(vp); + int err; -SYSCTL_INT(_vfs_fusefs, OID_AUTO, reclaim_revoked, CTLFLAG_RW, - &fuse_reclaim_revoked, 0, ""); + if (!fsess_isimpl(vnode_mount(vp), FUSE_FLUSH)) + return 0; -uma_zone_t fuse_pbuf_zone; + err = fuse_filehandle_getrw(vp, fflag, &fufh, cred, pid); + if (err) + return err; -#define fuse_vm_page_lock(m) vm_page_lock((m)); -#define fuse_vm_page_unlock(m) vm_page_unlock((m)); -#define fuse_vm_page_lock_queues() ((void)0) -#define fuse_vm_page_unlock_queues() ((void)0) + fdisp_init(&fdi, sizeof(*ffi)); + fdisp_make_vp(&fdi, FUSE_FLUSH, vp, td, cred); + ffi = fdi.indata; + ffi->fh = fufh->fh_id; + /* + * If the file has a POSIX lock then we're supposed to set lock_owner. + * If not, then lock_owner is undefined. So we may as well always set + * it. + */ + ffi->lock_owner = td->td_proc->p_pid; + err = fdisp_wait_answ(&fdi); + if (err == ENOSYS) { + fsess_set_notimpl(mp, FUSE_FLUSH); + err = 0; + } + fdisp_destroy(&fdi); + return err; +} + +/* Close wrapper for fifos. 
*/ +static int +fuse_fifo_close(struct vop_close_args *ap) +{ + return (fifo_specops.vop_close(ap)); +} + /* struct vnop_access_args { struct vnode *a_vp; #if VOP_ACCESS_TAKES_ACCMODE_T accmode_t a_accmode; #else int a_mode; #endif struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_access(struct vop_access_args *ap) { struct vnode *vp = ap->a_vp; int accmode = ap->a_accmode; struct ucred *cred = ap->a_cred; - struct fuse_access_param facp; struct fuse_data *data = fuse_get_mpdata(vnode_mount(vp)); int err; if (fuse_isdeadfs(vp)) { if (vnode_isvroot(vp)) { return 0; } return ENXIO; } if (!(data->dataflags & FSESS_INITED)) { if (vnode_isvroot(vp)) { if (priv_check_cred(cred, PRIV_VFS_ADMIN) || (fuse_match_cred(data->daemoncred, cred) == 0)) { return 0; } } return EBADF; } if (vnode_islnk(vp)) { return 0; } - bzero(&facp, sizeof(facp)); - err = fuse_internal_access(vp, accmode, &facp, ap->a_td, ap->a_cred); + err = fuse_internal_access(vp, accmode, ap->a_td, ap->a_cred); return err; } /* - struct vnop_close_args { + * struct vop_advlock_args { + * struct vop_generic_args a_gen; + * struct vnode *a_vp; + * void *a_id; + * int a_op; + * struct flock *a_fl; + * int a_flags; + * } + */ +static int +fuse_vnop_advlock(struct vop_advlock_args *ap) +{ + struct vnode *vp = ap->a_vp; + struct flock *fl = ap->a_fl; + struct thread *td = curthread; + struct ucred *cred = td->td_ucred; + pid_t pid = td->td_proc->p_pid; + struct fuse_filehandle *fufh; + struct fuse_dispatcher fdi; + struct fuse_lk_in *fli; + struct fuse_lk_out *flo; + enum fuse_opcode op; + int dataflags, err; + int flags = ap->a_flags; + + dataflags = fuse_get_mpdata(vnode_mount(vp))->dataflags; + + if (fuse_isdeadfs(vp)) { + return ENXIO; + } + + if (!(dataflags & FSESS_POSIX_LOCKS)) + return vop_stdadvlock(ap); + /* FUSE doesn't properly support flock until protocol 7.17 */ + if (flags & F_FLOCK) + return vop_stdadvlock(ap); + + err = fuse_filehandle_get_anyflags(vp, &fufh, cred, pid); + if 
(err) + return err; + + fdisp_init(&fdi, sizeof(*fli)); + + switch(ap->a_op) { + case F_GETLK: + op = FUSE_GETLK; + break; + case F_SETLK: + op = FUSE_SETLK; + break; + case F_SETLKW: + op = FUSE_SETLKW; + break; + default: + return EINVAL; + } + + fdisp_make_vp(&fdi, op, vp, td, cred); + fli = fdi.indata; + fli->fh = fufh->fh_id; + fli->owner = fl->l_pid; + fli->lk.start = fl->l_start; + if (fl->l_len != 0) + fli->lk.end = fl->l_start + fl->l_len - 1; + else + fli->lk.end = INT64_MAX; + fli->lk.type = fl->l_type; + fli->lk.pid = fl->l_pid; + + err = fdisp_wait_answ(&fdi); + fdisp_destroy(&fdi); + + if (err == 0 && op == FUSE_GETLK) { + flo = fdi.answ; + fl->l_type = flo->lk.type; + fl->l_pid = flo->lk.pid; + if (flo->lk.type != F_UNLCK) { + fl->l_start = flo->lk.start; + if (flo->lk.end == INT64_MAX) + fl->l_len = 0; + else + fl->l_len = flo->lk.end - flo->lk.start + 1; + fl->l_start = flo->lk.start; + } + } + + return err; +} + +/* { struct vnode *a_vp; + daddr_t a_bn; + struct bufobj **a_bop; + daddr_t *a_bnp; + int *a_runp; + int *a_runb; +} */ +static int +fuse_vnop_bmap(struct vop_bmap_args *ap) +{ + struct vnode *vp = ap->a_vp; + struct bufobj **bo = ap->a_bop; + struct thread *td = curthread; + struct mount *mp; + struct fuse_dispatcher fdi; + struct fuse_bmap_in *fbi; + struct fuse_bmap_out *fbo; + struct fuse_data *data; + uint64_t biosize; + off_t filesize; + daddr_t lbn = ap->a_bn; + daddr_t *pbn = ap->a_bnp; + int *runp = ap->a_runp; + int *runb = ap->a_runb; + int error = 0; + int maxrun; + + if (fuse_isdeadfs(vp)) { + return ENXIO; + } + + mp = vnode_mount(vp); + data = fuse_get_mpdata(mp); + biosize = fuse_iosize(vp); + maxrun = MIN(vp->v_mount->mnt_iosize_max / biosize - 1, + data->max_readahead_blocks); + + if (bo != NULL) + *bo = &vp->v_bufobj; + + /* + * The FUSE_BMAP operation does not include the runp and runb + * variables, so we must guess. Report nonzero contiguous runs so + * cluster_read will combine adjacent reads. 
It's worthwhile to reduce + * upcalls even if we don't know the true physical layout of the file. + * + * FUSE file systems may opt out of read clustering in two ways: + * * mounting with -onoclusterr + * * Setting max_readahead <= maxbcachebuf during FUSE_INIT + */ + if (runb != NULL) + *runb = MIN(lbn, maxrun); + if (runp != NULL) { + error = fuse_vnode_size(vp, &filesize, td->td_ucred, td); + if (error == 0) + *runp = MIN(MAX(0, filesize / biosize - lbn - 1), + maxrun); + else + *runp = 0; + } + + if (fsess_isimpl(mp, FUSE_BMAP)) { + fdisp_init(&fdi, sizeof(*fbi)); + fdisp_make_vp(&fdi, FUSE_BMAP, vp, td, td->td_ucred); + fbi = fdi.indata; + fbi->block = lbn; + fbi->blocksize = biosize; + error = fdisp_wait_answ(&fdi); + if (error == ENOSYS) { + fdisp_destroy(&fdi); + fsess_set_notimpl(mp, FUSE_BMAP); + error = 0; + } else { + fbo = fdi.answ; + if (error == 0 && pbn != NULL) + *pbn = fbo->block; + fdisp_destroy(&fdi); + return error; + } + } + + /* If the daemon doesn't support BMAP, make up a sensible default */ + if (pbn != NULL) + *pbn = lbn * btodb(biosize); + return (error); +} + +/* + struct vop_close_args { + struct vnode *a_vp; int a_fflag; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_close(struct vop_close_args *ap) { struct vnode *vp = ap->a_vp; struct ucred *cred = ap->a_cred; int fflag = ap->a_fflag; - fufh_type_t fufh_type; + struct thread *td = ap->a_td; + pid_t pid = td->td_proc->p_pid; + int err = 0; - if (fuse_isdeadfs(vp)) { + if (fuse_isdeadfs(vp)) return 0; - } - if (vnode_isdir(vp)) { - if (fuse_filehandle_valid(vp, FUFH_RDONLY)) { - fuse_filehandle_close(vp, FUFH_RDONLY, NULL, cred); - } + if (vnode_isdir(vp)) return 0; - } - if (fflag & IO_NDELAY) { + if (fflag & IO_NDELAY) return 0; - } - fufh_type = fuse_filehandle_xlate_from_fflags(fflag); - if (!fuse_filehandle_valid(vp, fufh_type)) { - int i; - - for (i = 0; i < FUFH_MAXTYPE; i++) - if (fuse_filehandle_valid(vp, i)) - break; - if (i == FUFH_MAXTYPE) - 
panic("FUSE: fufh type %d found to be invalid in close" - " (fflag=0x%x)\n", - fufh_type, fflag); - } + err = fuse_flush(vp, cred, pid, fflag); + /* TODO: close the file handle, if we're sure it's no longer used */ if ((VTOFUD(vp)->flag & FN_SIZECHANGE) != 0) { - fuse_vnode_savesize(vp, cred); + fuse_vnode_savesize(vp, cred, td->td_proc->p_pid); } - return 0; + return err; } +static void +fdisp_make_mknod_for_fallback( + struct fuse_dispatcher *fdip, + struct componentname *cnp, + struct vnode *dvp, + uint64_t parentnid, + struct thread *td, + struct ucred *cred, + mode_t mode, + enum fuse_opcode *op) +{ + struct fuse_mknod_in *fmni; + + fdisp_init(fdip, sizeof(*fmni) + cnp->cn_namelen + 1); + *op = FUSE_MKNOD; + fdisp_make(fdip, *op, vnode_mount(dvp), parentnid, td, cred); + fmni = fdip->indata; + fmni->mode = mode; + fmni->rdev = 0; + memcpy((char *)fdip->indata + sizeof(*fmni), cnp->cn_nameptr, + cnp->cn_namelen); + ((char *)fdip->indata)[sizeof(*fmni) + cnp->cn_namelen] = '\0'; +} /* struct vnop_create_args { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; }; */ static int fuse_vnop_create(struct vop_create_args *ap) { struct vnode *dvp = ap->a_dvp; struct vnode **vpp = ap->a_vpp; struct componentname *cnp = ap->a_cnp; struct vattr *vap = ap->a_vap; struct thread *td = cnp->cn_thread; struct ucred *cred = cnp->cn_cred; - struct fuse_open_in *foi; + struct fuse_data *data; + struct fuse_create_in *fci; struct fuse_entry_out *feo; - struct fuse_dispatcher fdi; + struct fuse_open_out *foo; + struct fuse_dispatcher fdi, fdi2; struct fuse_dispatcher *fdip = &fdi; + struct fuse_dispatcher *fdip2 = NULL; int err; struct mount *mp = vnode_mount(dvp); + data = fuse_get_mpdata(mp); uint64_t parentnid = VTOFUD(dvp)->nid; mode_t mode = MAKEIMODE(vap->va_type, vap->va_mode); - uint64_t x_fh_id; - uint32_t x_open_flags; + enum fuse_opcode op; + int flags; - if (fuse_isdeadfs(dvp)) { + if (fuse_isdeadfs(dvp)) return ENXIO; - } + + 
/* FUSE expects sockets to be created with FUSE_MKNOD */ + if (vap->va_type == VSOCK) + return fuse_internal_mknod(dvp, vpp, cnp, vap); + + /* + * VOP_CREATE doesn't tell us the open(2) flags, so we guess. Only a + * writable mode makes sense, and we might as well include readability + * too. + */ + flags = O_RDWR; + bzero(&fdi, sizeof(fdi)); - /* XXX: Will we ever want devices ? */ - if ((vap->va_type != VREG)) { - printf("fuse_vnop_create: unsupported va_type %d\n", - vap->va_type); + if (vap->va_type != VREG) return (EINVAL); - } - fdisp_init(fdip, sizeof(*foi) + cnp->cn_namelen + 1); - if (!fsess_isimpl(mp, FUSE_CREATE)) { - SDT_PROBE2(fuse, , vnops, trace, 1, - "eh, daemon doesn't implement create?"); - return (EINVAL); - } - fdisp_make(fdip, FUSE_CREATE, vnode_mount(dvp), parentnid, td, cred); + if (!fsess_isimpl(mp, FUSE_CREATE) || vap->va_type == VSOCK) { + /* Fallback to FUSE_MKNOD/FUSE_OPEN */ + fdisp_make_mknod_for_fallback(fdip, cnp, dvp, parentnid, td, + cred, mode, &op); + } else { + /* Use FUSE_CREATE */ + size_t insize; - foi = fdip->indata; - foi->mode = mode; - foi->flags = O_CREAT | O_RDWR; + op = FUSE_CREATE; + fdisp_init(fdip, sizeof(*fci) + cnp->cn_namelen + 1); + fdisp_make(fdip, op, vnode_mount(dvp), parentnid, td, cred); + fci = fdip->indata; + fci->mode = mode; + fci->flags = O_CREAT | flags; + if (fuse_libabi_geq(data, 7, 12)) { + insize = sizeof(*fci); + fci->umask = td->td_proc->p_fd->fd_cmask; + } else { + insize = sizeof(struct fuse_open_in); + } - memcpy((char *)fdip->indata + sizeof(*foi), cnp->cn_nameptr, - cnp->cn_namelen); - ((char *)fdip->indata)[sizeof(*foi) + cnp->cn_namelen] = '\0'; + memcpy((char *)fdip->indata + insize, cnp->cn_nameptr, + cnp->cn_namelen); + ((char *)fdip->indata)[insize + cnp->cn_namelen] = '\0'; + } err = fdisp_wait_answ(fdip); if (err) { - if (err == ENOSYS) + if (err == ENOSYS && op == FUSE_CREATE) { fsess_set_notimpl(mp, FUSE_CREATE); - goto out; + fdisp_destroy(fdip); + 
fdisp_make_mknod_for_fallback(fdip, cnp, dvp, + parentnid, td, cred, mode, &op); + err = fdisp_wait_answ(fdip); + } + if (err) + goto out; } feo = fdip->answ; - if ((err = fuse_internal_checkentry(feo, VREG))) { + if ((err = fuse_internal_checkentry(feo, vap->va_type))) { goto out; } - err = fuse_vnode_get(mp, feo, feo->nodeid, dvp, vpp, cnp, VREG); + + if (op == FUSE_CREATE) { + foo = (struct fuse_open_out*)(feo + 1); + } else { + /* Issue a separate FUSE_OPEN */ + struct fuse_open_in *foi; + + fdip2 = &fdi2; + fdisp_init(fdip2, sizeof(*foi)); + fdisp_make(fdip2, FUSE_OPEN, vnode_mount(dvp), feo->nodeid, td, + cred); + foi = fdip2->indata; + foi->flags = flags; + err = fdisp_wait_answ(fdip2); + if (err) + goto out; + foo = fdip2->answ; + } + err = fuse_vnode_get(mp, feo, feo->nodeid, dvp, vpp, cnp, vap->va_type); if (err) { struct fuse_release_in *fri; uint64_t nodeid = feo->nodeid; - uint64_t fh_id = ((struct fuse_open_out *)(feo + 1))->fh; + uint64_t fh_id = foo->fh; fdisp_init(fdip, sizeof(*fri)); fdisp_make(fdip, FUSE_RELEASE, mp, nodeid, td, cred); fri = fdip->indata; fri->fh = fh_id; - fri->flags = OFLAGS(mode); + fri->flags = flags; fuse_insert_callback(fdip->tick, fuse_internal_forget_callback); - fuse_insert_message(fdip->tick); - return err; + fuse_insert_message(fdip->tick, false); + goto out; } ASSERT_VOP_ELOCKED(*vpp, "fuse_vnop_create"); + fuse_internal_cache_attrs(*vpp, &feo->attr, feo->attr_valid, + feo->attr_valid_nsec, NULL); - fdip->answ = feo + 1; - - x_fh_id = ((struct fuse_open_out *)(feo + 1))->fh; - x_open_flags = ((struct fuse_open_out *)(feo + 1))->open_flags; - fuse_filehandle_init(*vpp, FUFH_RDWR, NULL, x_fh_id); - fuse_vnode_open(*vpp, x_open_flags, td); + fuse_filehandle_init(*vpp, FUFH_RDWR, NULL, td, cred, foo); + fuse_vnode_open(*vpp, foo->open_flags, td); + /* + * Purge the parent's attribute cache because the daemon should've + * updated its mtime and ctime + */ + fuse_vnode_clear_attr_cache(dvp); cache_purge_negative(dvp); out: 
+ if (fdip2) + fdisp_destroy(fdip2); fdisp_destroy(fdip); return err; } /* - * Our vnop_fsync roughly corresponds to the FUSE_FSYNC method. The Linux - * version of FUSE also has a FUSE_FLUSH method. - * - * On Linux, fsync() synchronizes a file's complete in-core state with that - * on disk. The call is not supposed to return until the system has completed - * that action or until an error is detected. - * - * Linux also has an fdatasync() call that is similar to fsync() but is not - * required to update the metadata such as access time and modification time. - */ + struct vnop_fdatasync_args { + struct vop_generic_args a_gen; + struct vnode * a_vp; + struct thread * a_td; + }; +*/ +static int +fuse_vnop_fdatasync(struct vop_fdatasync_args *ap) +{ + struct vnode *vp = ap->a_vp; + struct thread *td = ap->a_td; + int waitfor = MNT_WAIT; + int err = 0; + + if (fuse_isdeadfs(vp)) { + return 0; + } + if ((err = vop_stdfdatasync_buf(ap))) + return err; + + return fuse_internal_fsync(vp, td, waitfor, true); +} + /* struct vnop_fsync_args { - struct vnodeop_desc *a_desc; + struct vop_generic_args a_gen; struct vnode * a_vp; - struct ucred * a_cred; int a_waitfor; struct thread * a_td; }; */ static int fuse_vnop_fsync(struct vop_fsync_args *ap) { struct vnode *vp = ap->a_vp; struct thread *td = ap->a_td; + int waitfor = ap->a_waitfor; + int err = 0; - struct fuse_filehandle *fufh; - struct fuse_vnode_data *fvdat = VTOFUD(vp); - - int type, err = 0; - if (fuse_isdeadfs(vp)) { return 0; } if ((err = vop_stdfsync(ap))) return err; - if (!fsess_isimpl(vnode_mount(vp), - (vnode_vtype(vp) == VDIR ? 
FUSE_FSYNCDIR : FUSE_FSYNC))) { - goto out; - } - for (type = 0; type < FUFH_MAXTYPE; type++) { - fufh = &(fvdat->fufh[type]); - if (FUFH_IS_VALID(fufh)) { - fuse_internal_fsync(vp, td, NULL, fufh); - } - } - -out: - return 0; + return fuse_internal_fsync(vp, td, waitfor, false); } /* struct vnop_getattr_args { struct vnode *a_vp; struct vattr *a_vap; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_getattr(struct vop_getattr_args *ap) { struct vnode *vp = ap->a_vp; struct vattr *vap = ap->a_vap; struct ucred *cred = ap->a_cred; struct thread *td = curthread; - struct fuse_vnode_data *fvdat = VTOFUD(vp); - struct fuse_attr_out *fao; int err = 0; int dataflags; - struct fuse_dispatcher fdi; dataflags = fuse_get_mpdata(vnode_mount(vp))->dataflags; /* Note that we are not bailing out on a dead file system just yet. */ if (!(dataflags & FSESS_INITED)) { if (!vnode_isvroot(vp)) { fdata_set_dead(fuse_get_mpdata(vnode_mount(vp))); err = ENOTCONN; return err; } else { goto fake; } } - fdisp_init(&fdi, 0); - if ((err = fdisp_simple_putget_vp(&fdi, FUSE_GETATTR, vp, td, cred))) { - if ((err == ENOTCONN) && vnode_isvroot(vp)) { - /* see comment in fuse_vfsop_statfs() */ - fdisp_destroy(&fdi); - goto fake; - } - if (err == ENOENT) { - fuse_internal_vnode_disappear(vp); - } - goto out; + err = fuse_internal_getattr(vp, vap, cred, td); + if (err == ENOTCONN && vnode_isvroot(vp)) { + /* see comment in fuse_vfsop_statfs() */ + goto fake; + } else { + return err; } - fao = (struct fuse_attr_out *)fdi.answ; - fuse_internal_cache_attrs(vp, &fao->attr, fao->attr_valid, - fao->attr_valid_nsec, vap); - if (vap->va_type != vnode_vtype(vp)) { - fuse_internal_vnode_disappear(vp); - err = ENOENT; - goto out; - } - if ((fvdat->flag & FN_SIZECHANGE) != 0) - vap->va_size = fvdat->filesize; - - if (vnode_isreg(vp) && (fvdat->flag & FN_SIZECHANGE) == 0) { - /* - * This is for those cases when the file size changed without us - * knowing, and we want to catch up. 
- */ - off_t new_filesize = ((struct fuse_attr_out *) - fdi.answ)->attr.size; - - if (fvdat->filesize != new_filesize) { - fuse_vnode_setsize(vp, new_filesize); - fvdat->flag &= ~FN_SIZECHANGE; - } - } - -out: - fdisp_destroy(&fdi); - return err; - fake: bzero(vap, sizeof(*vap)); vap->va_type = vnode_vtype(vp); return 0; } /* struct vnop_inactive_args { struct vnode *a_vp; struct thread *a_td; }; */ static int fuse_vnop_inactive(struct vop_inactive_args *ap) { struct vnode *vp = ap->a_vp; struct thread *td = ap->a_td; struct fuse_vnode_data *fvdat = VTOFUD(vp); - struct fuse_filehandle *fufh = NULL; + struct fuse_filehandle *fufh, *fufh_tmp; - int type, need_flush = 1; + int need_flush = 1; - for (type = 0; type < FUFH_MAXTYPE; type++) { - fufh = &(fvdat->fufh[type]); - if (FUFH_IS_VALID(fufh)) { - if (need_flush && vp->v_type == VREG) { - if ((VTOFUD(vp)->flag & FN_SIZECHANGE) != 0) { - fuse_vnode_savesize(vp, NULL); - } - if (fuse_data_cache_invalidate || - (fvdat->flag & FN_REVOKED) != 0) - fuse_io_invalbuf(vp, td); - else - fuse_io_flushbuf(vp, MNT_WAIT, td); - need_flush = 0; + LIST_FOREACH_SAFE(fufh, &fvdat->handles, next, fufh_tmp) { + if (need_flush && vp->v_type == VREG) { + if ((VTOFUD(vp)->flag & FN_SIZECHANGE) != 0) { + fuse_vnode_savesize(vp, NULL, 0); } - fuse_filehandle_close(vp, type, td, NULL); + if ((fvdat->flag & FN_REVOKED) != 0) + fuse_io_invalbuf(vp, td); + else + fuse_io_flushbuf(vp, MNT_WAIT, td); + need_flush = 0; } + fuse_filehandle_close(vp, fufh, td, NULL); } - if ((fvdat->flag & FN_REVOKED) != 0 && fuse_reclaim_revoked) { + if ((fvdat->flag & FN_REVOKED) != 0) vrecycle(vp); - } + return 0; } /* struct vnop_link_args { struct vnode *a_tdvp; struct vnode *a_vp; struct componentname *a_cnp; }; */ static int fuse_vnop_link(struct vop_link_args *ap) { struct vnode *vp = ap->a_vp; struct vnode *tdvp = ap->a_tdvp; struct componentname *cnp = ap->a_cnp; struct vattr *vap = VTOVA(vp); struct fuse_dispatcher fdi; struct fuse_entry_out *feo; 
struct fuse_link_in fli; int err; if (fuse_isdeadfs(vp)) { return ENXIO; } if (vnode_mount(tdvp) != vnode_mount(vp)) { return EXDEV; } /* * This is a seatbelt check to protect naive userspace filesystems from * themselves and the limitations of the FUSE IPC protocol. If a * filesystem does not allow attribute caching, assume it is capable of * validating that nlink does not overflow. */ if (vap != NULL && vap->va_nlink >= FUSE_LINK_MAX) return EMLINK; fli.oldnodeid = VTOI(vp); fdisp_init(&fdi, 0); fuse_internal_newentry_makerequest(vnode_mount(tdvp), VTOI(tdvp), cnp, FUSE_LINK, &fli, sizeof(fli), &fdi); if ((err = fdisp_wait_answ(&fdi))) { goto out; } feo = fdi.answ; err = fuse_internal_checkentry(feo, vnode_vtype(vp)); + if (!err) { + /* + * Purge the parent's attribute cache because the daemon + * should've updated its mtime and ctime + */ + fuse_vnode_clear_attr_cache(tdvp); + fuse_internal_cache_attrs(vp, &feo->attr, feo->attr_valid, + feo->attr_valid_nsec, NULL); + } out: fdisp_destroy(&fdi); return err; } +struct fuse_lookup_alloc_arg { + struct fuse_entry_out *feo; + struct componentname *cnp; + uint64_t nid; + enum vtype vtyp; +}; + +/* Callback for vn_get_ino */ +static int +fuse_lookup_alloc(struct mount *mp, void *arg, int lkflags, struct vnode **vpp) +{ + struct fuse_lookup_alloc_arg *flaa = arg; + + return fuse_vnode_get(mp, flaa->feo, flaa->nid, NULL, vpp, flaa->cnp, + flaa->vtyp); +} + +SDT_PROBE_DEFINE3(fusefs, , vnops, cache_lookup, + "int", "struct timespec*", "struct timespec*"); /* struct vnop_lookup_args { struct vnodeop_desc *a_desc; struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; }; */ int fuse_vnop_lookup(struct vop_lookup_args *ap) { struct vnode *dvp = ap->a_dvp; struct vnode **vpp = ap->a_vpp; struct componentname *cnp = ap->a_cnp; struct thread *td = cnp->cn_thread; struct ucred *cred = cnp->cn_cred; int nameiop = cnp->cn_nameiop; int flags = cnp->cn_flags; int wantparent = flags & (LOCKPARENT | WANTPARENT); int 
islastcn = flags & ISLASTCN; struct mount *mp = vnode_mount(dvp); int err = 0; int lookup_err = 0; struct vnode *vp = NULL; struct fuse_dispatcher fdi; - enum fuse_opcode op; + bool did_lookup = false; + struct fuse_entry_out *feo = NULL; + enum vtype vtyp; /* vnode type of target */ + off_t filesize; /* filesize of target */ uint64_t nid; - struct fuse_access_param facp; if (fuse_isdeadfs(dvp)) { *vpp = NULL; return ENXIO; } - if (!vnode_isdir(dvp)) { + if (!vnode_isdir(dvp)) return ENOTDIR; - } - if (islastcn && vfs_isrdonly(mp) && (nameiop != LOOKUP)) { + + if (islastcn && vfs_isrdonly(mp) && (nameiop != LOOKUP)) return EROFS; - } - /* - * We do access check prior to doing anything else only in the case - * when we are at fs root (we'd like to say, "we are at the first - * component", but that's not exactly the same... nevermind). - * See further comments at further access checks. - */ - bzero(&facp, sizeof(facp)); - if (vnode_isvroot(dvp)) { /* early permission check hack */ - if ((err = fuse_internal_access(dvp, VEXEC, &facp, td, cred))) { - return err; - } - } + if ((err = fuse_internal_access(dvp, VEXEC, td, cred))) + return err; + if (flags & ISDOTDOT) { + KASSERT(VTOFUD(dvp)->flag & FN_PARENT_NID, + ("Looking up .. is TODO")); nid = VTOFUD(dvp)->parent_nid; - if (nid == 0) { + if (nid == 0) return ENOENT; - } - fdisp_init(&fdi, 0); - op = FUSE_GETATTR; - goto calldaemon; + /* .. is obviously a directory */ + vtyp = VDIR; + filesize = 0; } else if (cnp->cn_namelen == 1 && *(cnp->cn_nameptr) == '.') { nid = VTOI(dvp); - fdisp_init(&fdi, 0); - op = FUSE_GETATTR; - goto calldaemon; - } else if (fuse_lookup_cache_enable) { - err = cache_lookup(dvp, vpp, cnp, NULL, NULL); - switch (err) { + /* . 
is obviously a directory */ + vtyp = VDIR; + filesize = 0; + } else { + struct timespec now, timeout; + err = cache_lookup(dvp, vpp, cnp, &timeout, NULL); + getnanouptime(&now); + SDT_PROBE3(fusefs, , vnops, cache_lookup, err, &timeout, &now); + switch (err) { case -1: /* positive match */ - atomic_add_acq_long(&fuse_lookup_cache_hits, 1); + if (timespeccmp(&timeout, &now, >)) { + counter_u64_add(fuse_lookup_cache_hits, 1); + } else { + /* Cache timeout */ + counter_u64_add(fuse_lookup_cache_misses, 1); + bintime_clear( + &VTOFUD(*vpp)->entry_cache_timeout); + cache_purge(*vpp); + if (dvp != *vpp) + vput(*vpp); + else + vrele(*vpp); + *vpp = NULL; + break; + } return 0; case 0: /* no match in cache */ - atomic_add_acq_long(&fuse_lookup_cache_misses, 1); + counter_u64_add(fuse_lookup_cache_misses, 1); break; case ENOENT: /* negative match */ + getnanouptime(&now); + if (timespeccmp(&timeout, &now, <=)) { + /* Cache timeout */ + cache_purge_negative(dvp); + break; + } /* fall through */ default: return err; } - } - nid = VTOI(dvp); - fdisp_init(&fdi, cnp->cn_namelen + 1); - op = FUSE_LOOKUP; -calldaemon: - fdisp_make(&fdi, op, mp, nid, td, cred); + nid = VTOI(dvp); + fdisp_init(&fdi, cnp->cn_namelen + 1); + fdisp_make(&fdi, FUSE_LOOKUP, mp, nid, td, cred); - if (op == FUSE_LOOKUP) { memcpy(fdi.indata, cnp->cn_nameptr, cnp->cn_namelen); ((char *)fdi.indata)[cnp->cn_namelen] = '\0'; - } - lookup_err = fdisp_wait_answ(&fdi); + lookup_err = fdisp_wait_answ(&fdi); + did_lookup = true; - if ((op == FUSE_LOOKUP) && !lookup_err) { /* lookup call succeeded */ - nid = ((struct fuse_entry_out *)fdi.answ)->nodeid; - if (!nid) { - /* - * zero nodeid is the same as "not found", - * but it's also cacheable (which we keep - * keep on doing not as of writing this) - */ - lookup_err = ENOENT; - } else if (nid == FUSE_ROOT_ID) { - lookup_err = EINVAL; + if (!lookup_err) { + /* lookup call succeeded */ + feo = (struct fuse_entry_out *)fdi.answ; + nid = feo->nodeid; + if (nid == 0) { + 
/* zero nodeid means ENOENT and cache it */ + struct timespec timeout; + + fdi.answ_stat = ENOENT; + lookup_err = ENOENT; + if (cnp->cn_flags & MAKEENTRY) { + fuse_validity_2_timespec(feo, &timeout); + cache_enter_time(dvp, *vpp, cnp, + &timeout, NULL); + } + } else if (nid == FUSE_ROOT_ID) { + lookup_err = EINVAL; + } + vtyp = IFTOVT(feo->attr.mode); + filesize = feo->attr.size; } + if (lookup_err && (!fdi.answ_stat || lookup_err != ENOENT)) { + fdisp_destroy(&fdi); + return lookup_err; + } } - if (lookup_err && - (!fdi.answ_stat || lookup_err != ENOENT || op != FUSE_LOOKUP)) { - fdisp_destroy(&fdi); - return lookup_err; - } /* lookup_err, if non-zero, must be ENOENT at this point */ if (lookup_err) { + /* Entry not found */ + if ((nameiop == CREATE || nameiop == RENAME) && islastcn) { + err = fuse_internal_access(dvp, VWRITE, td, cred); + if (!err) { + /* + * Set the SAVENAME flag to hold onto the + * pathname for use later in VOP_CREATE or + * VOP_RENAME. + */ + cnp->cn_flags |= SAVENAME; - if ((nameiop == CREATE || nameiop == RENAME) && islastcn - /* && directory dvp has not been removed */ ) { - - if (vfs_isrdonly(mp)) { - err = EROFS; - goto out; + err = EJUSTRETURN; } -#if 0 /* THINK_ABOUT_THIS */ - if ((err = fuse_internal_access(dvp, VWRITE, cred, td, &facp))) { - goto out; - } -#endif - - /* - * Possibly record the position of a slot in the - * directory large enough for the new component name. - * This can be recorded in the vnode private data for - * dvp. Set the SAVENAME flag to hold onto the - * pathname for use later in VOP_CREATE or VOP_RENAME. - */ - cnp->cn_flags |= SAVENAME; - - err = EJUSTRETURN; - goto out; - } - /* Consider inserting name into cache. */ - - /* - * No we can't use negative caching, as the fs - * changes are out of our control. - * False positives' falseness turns out just as things - * go by, but false negatives' falseness doesn't. 
- * (and aiding the caching mechanism with extra control - * mechanisms comes quite close to beating the whole purpose - * caching...) - */ -#if 0 - if ((cnp->cn_flags & MAKEENTRY) != 0) { - SDT_PROBE2(fuse, , vnops, trace, 1, - "inserting NULL into cache"); - cache_enter(dvp, NULL, cnp); - } -#endif - err = ENOENT; - goto out; - - } else { - - /* !lookup_err */ - - struct fuse_entry_out *feo = NULL; - struct fuse_attr *fattr = NULL; - - if (op == FUSE_GETATTR) { - fattr = &((struct fuse_attr_out *)fdi.answ)->attr; } else { - feo = (struct fuse_entry_out *)fdi.answ; - fattr = &(feo->attr); + err = ENOENT; } - - /* - * If deleting, and at end of pathname, return parameters - * which can be used to remove file. If the wantparent flag - * isn't set, we return only the directory, otherwise we go on - * and lock the inode, being careful with ".". - */ - if (nameiop == DELETE && islastcn) { - /* - * Check for write access on directory. - */ - facp.xuid = fattr->uid; - facp.facc_flags |= FACCESS_STICKY; - err = fuse_internal_access(dvp, VWRITE, &facp, td, cred); - facp.facc_flags &= ~FACCESS_XQUERIES; - - if (err) { - goto out; - } - if (nid == VTOI(dvp)) { - vref(dvp); - *vpp = dvp; - } else { - err = fuse_vnode_get(dvp->v_mount, feo, nid, - dvp, &vp, cnp, IFTOVT(fattr->mode)); - if (err) - goto out; - *vpp = vp; - } - - /* - * Save the name for use in VOP_RMDIR and VOP_REMOVE - * later. - */ - cnp->cn_flags |= SAVENAME; - goto out; - - } - /* - * If rewriting (RENAME), return the inode and the - * information required to rewrite the present directory - * Must get inode of directory entry to verify it's a - * regular file, or empty directory. - */ - if (nameiop == RENAME && wantparent && islastcn) { - -#if 0 /* THINK_ABOUT_THIS */ - if ((err = fuse_internal_access(dvp, VWRITE, cred, td, &facp))) { - goto out; - } -#endif - - /* - * Check for "." 
- */ - if (nid == VTOI(dvp)) { - err = EISDIR; - goto out; - } - err = fuse_vnode_get(vnode_mount(dvp), feo, nid, dvp, - &vp, cnp, IFTOVT(fattr->mode)); - if (err) { - goto out; - } - *vpp = vp; - /* - * Save the name for use in VOP_RENAME later. - */ - cnp->cn_flags |= SAVENAME; - - goto out; - } + } else { + /* Entry was found */ if (flags & ISDOTDOT) { - struct mount *mp; - int ltype; + struct fuse_lookup_alloc_arg flaa; - /* - * Expanded copy of vn_vget_ino() so that - * fuse_vnode_get() can be used. - */ - mp = dvp->v_mount; - ltype = VOP_ISLOCKED(dvp); - err = vfs_busy(mp, MBF_NOWAIT); - if (err != 0) { - vfs_ref(mp); - VOP_UNLOCK(dvp, 0); - err = vfs_busy(mp, 0); - vn_lock(dvp, ltype | LK_RETRY); - vfs_rel(mp); - if (err) - goto out; - if ((dvp->v_iflag & VI_DOOMED) != 0) { - err = ENOENT; - vfs_unbusy(mp); - goto out; - } - } - VOP_UNLOCK(dvp, 0); - err = fuse_vnode_get(vnode_mount(dvp), feo, nid, NULL, - &vp, cnp, IFTOVT(fattr->mode)); - vfs_unbusy(mp); - vn_lock(dvp, ltype | LK_RETRY); - if ((dvp->v_iflag & VI_DOOMED) != 0) { - if (err == 0) - vput(vp); - err = ENOENT; - } - if (err) - goto out; + flaa.nid = nid; + flaa.feo = feo; + flaa.cnp = cnp; + flaa.vtyp = vtyp; + err = vn_vget_ino_gen(dvp, fuse_lookup_alloc, &flaa, 0, + &vp); *vpp = vp; } else if (nid == VTOI(dvp)) { vref(dvp); *vpp = dvp; } else { struct fuse_vnode_data *fvdat; err = fuse_vnode_get(vnode_mount(dvp), feo, nid, dvp, - &vp, cnp, IFTOVT(fattr->mode)); - if (err) { + &vp, cnp, vtyp); + if (err) goto out; - } - fuse_vnode_setparent(vp, dvp); + *vpp = vp; /* * In the case where we are looking up a FUSE node * represented by an existing cached vnode, and the * true size reported by FUSE_LOOKUP doesn't match - * the vnode's cached size, fix the vnode cache to - * match the real object size. + * the vnode's cached size, then any cached writes + * beyond the file's current size are lost. * - * This can occur via FUSE distributed filesystems, - * irregular files, etc. 
+ * We can get here: + * * following attribute cache expiration, or + * * due to a bug in the daemon, or */ fvdat = VTOFUD(vp); if (vnode_isreg(vp) && - fattr->size != fvdat->filesize) { + filesize != fvdat->cached_attrs.va_size && + fvdat->flag & FN_SIZECHANGE) { /* * The FN_SIZECHANGE flag reflects a dirty * append. If userspace lets us know our cache * is invalid, that write was lost. (Dirty * writes that do not cause append are also * lost, but we don't detect them here.) * * XXX: Maybe disable WB caching on this mount. */ - if (fvdat->flag & FN_SIZECHANGE) - printf("%s: WB cache incoherent on " - "%s!\n", __func__, - vnode_mount(vp)->mnt_stat.f_mntonname); + printf("%s: WB cache incoherent on %s!\n", + __func__, + vnode_mount(vp)->mnt_stat.f_mntonname); - (void)fuse_vnode_setsize(vp, fattr->size); fvdat->flag &= ~FN_SIZECHANGE; } - *vpp = vp; - } - if (op == FUSE_GETATTR) { - struct fuse_attr_out *fao = - (struct fuse_attr_out*)fdi.answ; - fuse_internal_cache_attrs(*vpp, - &fao->attr, fao->attr_valid, - fao->attr_valid_nsec, NULL); - } else { - struct fuse_entry_out *feo = - (struct fuse_entry_out*)fdi.answ; - fuse_internal_cache_attrs(*vpp, - &feo->attr, feo->attr_valid, - feo->attr_valid_nsec, NULL); - } + MPASS(feo != NULL); + fuse_internal_cache_attrs(*vpp, &feo->attr, + feo->attr_valid, feo->attr_valid_nsec, NULL); + fuse_validity_2_bintime(feo->entry_valid, + feo->entry_valid_nsec, + &fvdat->entry_cache_timeout); - /* Insert name into cache if appropriate. */ + if ((nameiop == DELETE || nameiop == RENAME) && + islastcn) + { + struct vattr dvattr; - /* - * Nooo, caching is evil. With caching, we can't avoid stale - * information taking over the playground (cached info is not - * just positive/negative, it does have qualitative aspects, - * too). And a (VOP/FUSE)_GETATTR is always thrown anyway, when - * walking down along cached path components, and that's not - * any cheaper than FUSE_LOOKUP. 
This might change with - * implementing kernel side attr caching, but... In Linux, - * lookup results are not cached, and the daemon is bombarded - * with FUSE_LOOKUPS on and on. This shows that by design, the - * daemon is expected to handle frequent lookup queries - * efficiently, do its caching in userspace, and so on. - * - * So just leave the name cache alone. - */ - - /* - * Well, now I know, Linux caches lookups, but with a - * timeout... So it's the same thing as attribute caching: - * we can deal with it when implement timeouts. - */ -#if 0 - if (cnp->cn_flags & MAKEENTRY) { - cache_enter(dvp, *vpp, cnp); - } -#endif - } -out: - if (!lookup_err) { - - /* No lookup error; need to clean up. */ - - if (err) { /* Found inode; exit with no vnode. */ - if (op == FUSE_LOOKUP) { - fuse_internal_forget_send(vnode_mount(dvp), td, cred, - nid, 1); - } - fdisp_destroy(&fdi); - return err; - } else { -#ifndef NO_EARLY_PERM_CHECK_HACK - if (!islastcn) { - /* - * We have the attributes of the next item - * *now*, and it's a fact, and we do not - * have to do extra work for it (ie, beg the - * daemon), and it neither depends on such - * accidental things like attr caching. So - * the big idea: check credentials *now*, - * not at the beginning of the next call to - * lookup. - * - * The first item of the lookup chain (fs root) - * won't be checked then here, of course, as - * its never "the next". But go and see that - * the root is taken care about at the very - * beginning of this function. - * - * Now, given we want to do the access check - * this way, one might ask: so then why not - * do the access check just after fetching - * the inode and its attributes from the - * daemon? Why bother with producing the - * corresponding vnode at all if something - * is not OK? We know what's the deal as - * soon as we get those attrs... There is - * one bit of info though not given us by - * the daemon: whether his response is - * authoritative or not... 
His response should - * be ignored if something is mounted over - * the dir in question. But that can be - * known only by having the vnode... + err = fuse_internal_access(dvp, VWRITE, td, + cred); + if (err != 0) + goto out; + /* + * if the parent's sticky bit is set, check + * whether we're allowed to remove the file. + * Need to figure out the vnode locking to make + * this work. */ - int tmpvtype = vnode_vtype(*vpp); - - bzero(&facp, sizeof(facp)); - /*the early perm check hack */ - facp.facc_flags |= FACCESS_VA_VALID; - - if ((tmpvtype != VDIR) && (tmpvtype != VLNK)) { - err = ENOTDIR; + fuse_internal_getattr(dvp, &dvattr, cred, td); + if ((dvattr.va_mode & S_ISTXT) && + fuse_internal_access(dvp, VADMIN, td, + cred) && + fuse_internal_access(*vpp, VADMIN, td, + cred)) { + err = EPERM; + goto out; } - if (!err && !vnode_mountedhere(*vpp)) { - err = fuse_internal_access(*vpp, VEXEC, &facp, td, cred); - } - if (err) { - if (tmpvtype == VLNK) - SDT_PROBE2(fuse, , vnops, trace, - 1, "weird, permission " - "error with a symlink?"); - vput(*vpp); - *vpp = NULL; - } } -#endif + + if (islastcn && ( + (nameiop == DELETE) || + (nameiop == RENAME && wantparent))) { + cnp->cn_flags |= SAVENAME; + } + } } - fdisp_destroy(&fdi); +out: + if (err) { + if (vp != NULL && dvp != vp) + vput(vp); + else if (vp != NULL) + vrele(vp); + *vpp = NULL; + } + if (did_lookup) + fdisp_destroy(&fdi); return err; } /* struct vnop_mkdir_args { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; }; */ static int fuse_vnop_mkdir(struct vop_mkdir_args *ap) { struct vnode *dvp = ap->a_dvp; struct vnode **vpp = ap->a_vpp; struct componentname *cnp = ap->a_cnp; struct vattr *vap = ap->a_vap; struct fuse_mkdir_in fmdi; if (fuse_isdeadfs(dvp)) { return ENXIO; } fmdi.mode = MAKEIMODE(vap->va_type, vap->va_mode); + fmdi.umask = curthread->td_proc->p_fd->fd_cmask; return (fuse_internal_newentry(dvp, vpp, cnp, FUSE_MKDIR, &fmdi, sizeof(fmdi), VDIR)); } /* struct 
vnop_mknod_args { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; }; */ static int fuse_vnop_mknod(struct vop_mknod_args *ap) { - return (EINVAL); -} + struct vnode *dvp = ap->a_dvp; + struct vnode **vpp = ap->a_vpp; + struct componentname *cnp = ap->a_cnp; + struct vattr *vap = ap->a_vap; + if (fuse_isdeadfs(dvp)) + return ENXIO; + return fuse_internal_mknod(dvp, vpp, cnp, vap); +} + /* - struct vnop_open_args { + struct vop_open_args { struct vnode *a_vp; int a_mode; struct ucred *a_cred; struct thread *a_td; int a_fdidx; / struct file *a_fp; }; */ static int fuse_vnop_open(struct vop_open_args *ap) { struct vnode *vp = ap->a_vp; - int mode = ap->a_mode; + int a_mode = ap->a_mode; struct thread *td = ap->a_td; struct ucred *cred = ap->a_cred; - - fufh_type_t fufh_type; + pid_t pid = td->td_proc->p_pid; struct fuse_vnode_data *fvdat; - int error, isdir = 0; - int32_t fuse_open_flags; - - if (fuse_isdeadfs(vp)) { + if (fuse_isdeadfs(vp)) return ENXIO; - } - if ((mode & (FREAD | FWRITE)) == 0) + if (vp->v_type == VCHR || vp->v_type == VBLK || vp->v_type == VFIFO) + return (EOPNOTSUPP); + if ((a_mode & (FREAD | FWRITE | FEXEC)) == 0) return EINVAL; fvdat = VTOFUD(vp); - if (vnode_isdir(vp)) { - isdir = 1; - } - fuse_open_flags = 0; - if (isdir) { - fufh_type = FUFH_RDONLY; - } else { - fufh_type = fuse_filehandle_xlate_from_fflags(mode); - /* - * For WRONLY opens, force DIRECT_IO. This is necessary - * since writing a partial block through the buffer cache - * will result in a read of the block and that read won't - * be allowed by the WRONLY open. 
- */ - if (fufh_type == FUFH_WRONLY || - (fvdat->flag & FN_DIRECTIO) != 0) - fuse_open_flags = FOPEN_DIRECT_IO; - } - - if (fuse_filehandle_validrw(vp, fufh_type) != FUFH_INVALID) { - fuse_vnode_open(vp, fuse_open_flags, td); + if (fuse_filehandle_validrw(vp, a_mode, cred, pid)) { + fuse_vnode_open(vp, 0, td); return 0; } - error = fuse_filehandle_open(vp, fufh_type, NULL, td, cred); - return error; + return fuse_filehandle_open(vp, a_mode, NULL, td, cred); } static int fuse_vnop_pathconf(struct vop_pathconf_args *ap) { switch (ap->a_name) { case _PC_FILESIZEBITS: *ap->a_retval = 64; return (0); case _PC_NAME_MAX: *ap->a_retval = NAME_MAX; return (0); case _PC_LINK_MAX: *ap->a_retval = MIN(LONG_MAX, FUSE_LINK_MAX); return (0); case _PC_SYMLINK_MAX: *ap->a_retval = MAXPATHLEN; return (0); case _PC_NO_TRUNC: *ap->a_retval = 1; return (0); default: return (vop_stdpathconf(ap)); } } /* struct vnop_read_args { struct vnode *a_vp; struct uio *a_uio; int a_ioflag; struct ucred *a_cred; }; */ static int fuse_vnop_read(struct vop_read_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; int ioflag = ap->a_ioflag; struct ucred *cred = ap->a_cred; + pid_t pid = curthread->td_proc->p_pid; if (fuse_isdeadfs(vp)) { return ENXIO; } if (VTOFUD(vp)->flag & FN_DIRECTIO) { ioflag |= IO_DIRECT; } - return fuse_io_dispatch(vp, uio, ioflag, cred); + return fuse_io_dispatch(vp, uio, ioflag, cred, pid); } /* struct vnop_readdir_args { struct vnode *a_vp; struct uio *a_uio; struct ucred *a_cred; int *a_eofflag; - int *ncookies; + int *a_ncookies; u_long **a_cookies; }; */ static int fuse_vnop_readdir(struct vop_readdir_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; struct ucred *cred = ap->a_cred; - struct fuse_filehandle *fufh = NULL; struct fuse_iov cookediov; - int err = 0; - int freefufh = 0; + u_long *cookies; + off_t startoff; + ssize_t tresid; + int ncookies; + bool closefufh = false; + pid_t pid = curthread->td_proc->p_pid; + if 
(ap->a_eofflag) + *ap->a_eofflag = 0; if (fuse_isdeadfs(vp)) { return ENXIO; } if ( /* XXXIP ((uio_iovcnt(uio) > 1)) || */ (uio_resid(uio) < sizeof(struct dirent))) { return EINVAL; } - if (!fuse_filehandle_valid(vp, FUFH_RDONLY)) { - SDT_PROBE2(fuse, , vnops, trace, 1, - "calling readdir() before open()"); - err = fuse_filehandle_open(vp, FUFH_RDONLY, &fufh, NULL, cred); - freefufh = 1; - } else { - err = fuse_filehandle_get(vp, FUFH_RDONLY, &fufh); + tresid = uio->uio_resid; + startoff = uio->uio_offset; + err = fuse_filehandle_get_dir(vp, &fufh, cred, pid); + if (err == EBADF && vnode_mount(vp)->mnt_flag & MNT_EXPORTED) { + /* + * nfsd will do VOP_READDIR without first doing VOP_OPEN. We + * must implicitly open the directory here + */ + err = fuse_filehandle_open(vp, FREAD, &fufh, curthread, cred); + if (err == 0) { + /* + * When a directory is opened, it must be read from + * the beginning. Hopefully, the "startoff" still + * exists as an offset cookie for the directory. + * If not, it will read the entire directory without + * returning any entries and just return eof. 
+ */ + uio->uio_offset = 0; + } + closefufh = true; } - if (err) { + if (err) return (err); + if (ap->a_ncookies != NULL) { + ncookies = uio->uio_resid / + (offsetof(struct dirent, d_name) + 4) + 1; + cookies = malloc(ncookies * sizeof(*cookies), M_TEMP, M_WAITOK); + *ap->a_ncookies = ncookies; + *ap->a_cookies = cookies; + } else { + ncookies = 0; + cookies = NULL; } #define DIRCOOKEDSIZE FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + MAXNAMLEN + 1) fiov_init(&cookediov, DIRCOOKEDSIZE); - err = fuse_internal_readdir(vp, uio, fufh, &cookediov); + err = fuse_internal_readdir(vp, uio, startoff, fufh, &cookediov, + &ncookies, cookies); fiov_teardown(&cookediov); - if (freefufh) { - fuse_filehandle_close(vp, FUFH_RDONLY, NULL, cred); + if (closefufh) + fuse_filehandle_close(vp, fufh, curthread, cred); + + if (ap->a_ncookies != NULL) { + if (err == 0) { + *ap->a_ncookies -= ncookies; + } else { + free(*ap->a_cookies, M_TEMP); + *ap->a_ncookies = 0; + *ap->a_cookies = NULL; + } } + if (err == 0 && tresid == uio->uio_resid) + *ap->a_eofflag = 1; + return err; } /* struct vnop_readlink_args { struct vnode *a_vp; struct uio *a_uio; struct ucred *a_cred; }; */ static int fuse_vnop_readlink(struct vop_readlink_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; struct ucred *cred = ap->a_cred; struct fuse_dispatcher fdi; int err; if (fuse_isdeadfs(vp)) { return ENXIO; } if (!vnode_islnk(vp)) { return EINVAL; } fdisp_init(&fdi, 0); err = fdisp_simple_putget_vp(&fdi, FUSE_READLINK, vp, curthread, cred); if (err) { goto out; } if (((char *)fdi.answ)[0] == '/' && fuse_get_mpdata(vnode_mount(vp))->dataflags & FSESS_PUSH_SYMLINKS_IN) { char *mpth = vnode_mount(vp)->mnt_stat.f_mntonname; err = uiomove(mpth, strlen(mpth), uio); } if (!err) { err = uiomove(fdi.answ, fdi.iosize, uio); } out: fdisp_destroy(&fdi); return err; } /* struct vnop_reclaim_args { struct vnode *a_vp; struct thread *a_td; }; */ static int fuse_vnop_reclaim(struct vop_reclaim_args *ap) { struct vnode *vp 
= ap->a_vp; struct thread *td = ap->a_td; - struct fuse_vnode_data *fvdat = VTOFUD(vp); - struct fuse_filehandle *fufh = NULL; + struct fuse_filehandle *fufh, *fufh_tmp; - int type; - if (!fvdat) { panic("FUSE: no vnode data during recycling"); } - for (type = 0; type < FUFH_MAXTYPE; type++) { - fufh = &(fvdat->fufh[type]); - if (FUFH_IS_VALID(fufh)) { - printf("FUSE: vnode being reclaimed but fufh (type=%d) is valid", - type); - fuse_filehandle_close(vp, type, td, NULL); - } + LIST_FOREACH_SAFE(fufh, &fvdat->handles, next, fufh_tmp) { + printf("FUSE: vnode being reclaimed with open fufh " + "(type=%#x)", fufh->fufh_type); + fuse_filehandle_close(vp, fufh, td, NULL); } if ((!fuse_isdeadfs(vp)) && (fvdat->nlookup)) { fuse_internal_forget_send(vnode_mount(vp), td, NULL, VTOI(vp), fvdat->nlookup); } fuse_vnode_setparent(vp, NULL); cache_purge(vp); vfs_hash_remove(vp); vnode_destroy_vobject(vp); fuse_vnode_destroy(vp); return 0; } /* struct vnop_remove_args { struct vnode *a_dvp; struct vnode *a_vp; struct componentname *a_cnp; }; */ static int fuse_vnop_remove(struct vop_remove_args *ap) { struct vnode *dvp = ap->a_dvp; struct vnode *vp = ap->a_vp; struct componentname *cnp = ap->a_cnp; int err; if (fuse_isdeadfs(vp)) { return ENXIO; } if (vnode_isdir(vp)) { return EPERM; } - cache_purge(vp); err = fuse_internal_remove(dvp, vp, cnp, FUSE_UNLINK); - if (err == 0) - fuse_internal_vnode_disappear(vp); return err; } /* struct vnop_rename_args { struct vnode *a_fdvp; struct vnode *a_fvp; struct componentname *a_fcnp; struct vnode *a_tdvp; struct vnode *a_tvp; struct componentname *a_tcnp; }; */ static int fuse_vnop_rename(struct vop_rename_args *ap) { struct vnode *fdvp = ap->a_fdvp; struct vnode *fvp = ap->a_fvp; struct componentname *fcnp = ap->a_fcnp; struct vnode *tdvp = ap->a_tdvp; struct vnode *tvp = ap->a_tvp; struct componentname *tcnp = ap->a_tcnp; struct fuse_data *data; - + bool newparent = fdvp != tdvp; + bool isdir = fvp->v_type == VDIR; int err = 0; if 
(fuse_isdeadfs(fdvp)) { return ENXIO; } if (fvp->v_mount != tdvp->v_mount || (tvp && fvp->v_mount != tvp->v_mount)) { - SDT_PROBE2(fuse, , vnops, trace, 1, "cross-device rename"); + SDT_PROBE2(fusefs, , vnops, trace, 1, "cross-device rename"); err = EXDEV; goto out; } cache_purge(fvp); /* * FUSE library is expected to check if target directory is not * under the source directory in the file system tree. * Linux performs this check at VFS level. */ + /* + * If source is a directory, and it will get a new parent, user must + * have write permission to it, so ".." can be modified. + */ data = fuse_get_mpdata(vnode_mount(tdvp)); + if (data->dataflags & FSESS_DEFAULT_PERMISSIONS && isdir && newparent) { + err = fuse_internal_access(fvp, VWRITE, + tcnp->cn_thread, tcnp->cn_cred); + if (err) + goto out; + } sx_xlock(&data->rename_lock); err = fuse_internal_rename(fdvp, fcnp, tdvp, tcnp); if (err == 0) { if (tdvp != fdvp) fuse_vnode_setparent(fvp, tdvp); if (tvp != NULL) fuse_vnode_setparent(tvp, NULL); } sx_unlock(&data->rename_lock); if (tvp != NULL && tvp != fvp) { cache_purge(tvp); } if (vnode_isdir(fvp)) { if ((tvp != NULL) && vnode_isdir(tvp)) { cache_purge(tdvp); } cache_purge(fdvp); } out: if (tdvp == tvp) { vrele(tdvp); } else { vput(tdvp); } if (tvp != NULL) { vput(tvp); } vrele(fdvp); vrele(fvp); return err; } /* struct vnop_rmdir_args { struct vnode *a_dvp; struct vnode *a_vp; struct componentname *a_cnp; } *ap; */ static int fuse_vnop_rmdir(struct vop_rmdir_args *ap) { struct vnode *dvp = ap->a_dvp; struct vnode *vp = ap->a_vp; int err; if (fuse_isdeadfs(vp)) { return ENXIO; } if (VTOFUD(vp) == VTOFUD(dvp)) { return EINVAL; } err = fuse_internal_remove(dvp, vp, ap->a_cnp, FUSE_RMDIR); - if (err == 0) - fuse_internal_vnode_disappear(vp); return err; } /* struct vnop_setattr_args { struct vnode *a_vp; struct vattr *a_vap; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_setattr(struct vop_setattr_args *ap) { struct vnode *vp = ap->a_vp; 
struct vattr *vap = ap->a_vap; struct ucred *cred = ap->a_cred; struct thread *td = curthread; + struct mount *mp; + struct fuse_data *data; + struct vattr old_va; + int dataflags; + int err = 0, err2; + accmode_t accmode = 0; + bool checkperm; + bool drop_suid = false; + gid_t cr_gid; - struct fuse_dispatcher fdi; - struct fuse_setattr_in *fsai; - struct fuse_access_param facp; + mp = vnode_mount(vp); + data = fuse_get_mpdata(mp); + dataflags = data->dataflags; + checkperm = dataflags & FSESS_DEFAULT_PERMISSIONS; + if (cred->cr_ngroups > 0) + cr_gid = cred->cr_groups[0]; + else + cr_gid = 0; - int err = 0; - enum vtype vtyp; - int sizechanged = 0; - uint64_t newsize = 0; - if (fuse_isdeadfs(vp)) { return ENXIO; } - fdisp_init(&fdi, sizeof(*fsai)); - fdisp_make_vp(&fdi, FUSE_SETATTR, vp, td, cred); - fsai = fdi.indata; - fsai->valid = 0; - bzero(&facp, sizeof(facp)); - - facp.xuid = vap->va_uid; - facp.xgid = vap->va_gid; - if (vap->va_uid != (uid_t)VNOVAL) { - facp.facc_flags |= FACCESS_CHOWN; - fsai->uid = vap->va_uid; - fsai->valid |= FATTR_UID; + if (checkperm) { + /* Only root may change a file's owner */ + err = priv_check_cred(cred, PRIV_VFS_CHOWN); + if (err) { + /* As a special case, allow the null chown */ + err2 = fuse_internal_getattr(vp, &old_va, cred, + td); + if (err2) + return (err2); + if (vap->va_uid != old_va.va_uid) + return err; + else + accmode |= VADMIN; + drop_suid = true; + } else + accmode |= VADMIN; + } else + accmode |= VADMIN; } if (vap->va_gid != (gid_t)VNOVAL) { - facp.facc_flags |= FACCESS_CHOWN; - fsai->gid = vap->va_gid; - fsai->valid |= FATTR_GID; + if (checkperm && priv_check_cred(cred, PRIV_VFS_CHOWN)) + drop_suid = true; + if (checkperm && !groupmember(vap->va_gid, cred)) + { + /* + * Non-root users may only chgrp to one of their own + * groups + */ + err = priv_check_cred(cred, PRIV_VFS_CHOWN); + if (err) { + /* As a special case, allow the null chgrp */ + err2 = fuse_internal_getattr(vp, &old_va, cred, + td); + if (err2) + 
return (err2); + if (vap->va_gid != old_va.va_gid) + return err; + accmode |= VADMIN; + } else + accmode |= VADMIN; + } else + accmode |= VADMIN; } if (vap->va_size != VNOVAL) { - - struct fuse_filehandle *fufh = NULL; - - /*Truncate to a new value. */ - fsai->size = vap->va_size; - sizechanged = 1; - newsize = vap->va_size; - fsai->valid |= FATTR_SIZE; - - fuse_filehandle_getrw(vp, FUFH_WRONLY, &fufh); - if (fufh) { - fsai->fh = fufh->fh_id; - fsai->valid |= FATTR_FH; + switch (vp->v_type) { + case VDIR: + return (EISDIR); + case VLNK: + case VREG: + if (vfs_isrdonly(mp)) + return (EROFS); + break; + default: + /* + * According to POSIX, the result is unspecified + * for file types other than regular files, + * directories and shared memory objects. We + * don't support shared memory objects in the file + * system, and have dubious support for truncating + * symlinks. Just ignore the request in other cases. + */ + return (0); } + /* Don't set accmode. Permission to trunc is checked upstack */ } - if (vap->va_atime.tv_sec != VNOVAL) { - fsai->atime = vap->va_atime.tv_sec; - fsai->atimensec = vap->va_atime.tv_nsec; - fsai->valid |= FATTR_ATIME; + if (vap->va_atime.tv_sec != VNOVAL || vap->va_mtime.tv_sec != VNOVAL) { + if (vap->va_vaflags & VA_UTIMES_NULL) + accmode |= VWRITE; + else + accmode |= VADMIN; } - if (vap->va_mtime.tv_sec != VNOVAL) { - fsai->mtime = vap->va_mtime.tv_sec; - fsai->mtimensec = vap->va_mtime.tv_nsec; - fsai->valid |= FATTR_MTIME; + if (drop_suid) { + if (vap->va_mode != (mode_t)VNOVAL) + vap->va_mode &= ~(S_ISUID | S_ISGID); + else { + err = fuse_internal_getattr(vp, &old_va, cred, td); + if (err) + return (err); + vap->va_mode = old_va.va_mode & ~(S_ISUID | S_ISGID); + } } if (vap->va_mode != (mode_t)VNOVAL) { - fsai->mode = vap->va_mode & ALLPERMS; - fsai->valid |= FATTR_MODE; + /* Only root may set the sticky bit on non-directories */ + if (checkperm && vp->v_type != VDIR && (vap->va_mode & S_ISTXT) + && priv_check_cred(cred, 
PRIV_VFS_STICKYFILE)) + return EFTYPE; + if (checkperm && (vap->va_mode & S_ISGID)) { + err = fuse_internal_getattr(vp, &old_va, cred, td); + if (err) + return (err); + if (!groupmember(old_va.va_gid, cred)) { + err = priv_check_cred(cred, PRIV_VFS_SETGID); + if (err) + return (err); + } + } + accmode |= VADMIN; } - if (!fsai->valid) { - goto out; - } - vtyp = vnode_vtype(vp); - if (fsai->valid & FATTR_SIZE && vtyp == VDIR) { - err = EISDIR; - goto out; - } - if (vfs_isrdonly(vnode_mount(vp)) && (fsai->valid & ~FATTR_SIZE || vtyp == VREG)) { - err = EROFS; - goto out; - } - if (fsai->valid & ~FATTR_SIZE) { - /*err = fuse_internal_access(vp, VADMIN, context, &facp); */ - /*XXX */ - err = 0; - } - facp.facc_flags &= ~FACCESS_XQUERIES; + if (vfs_isrdonly(mp)) + return EROFS; - if (err && !(fsai->valid & ~(FATTR_ATIME | FATTR_MTIME)) && - vap->va_vaflags & VA_UTIMES_NULL) { - err = fuse_internal_access(vp, VWRITE, &facp, td, cred); - } + err = fuse_internal_access(vp, accmode, td, cred); if (err) - goto out; - if ((err = fdisp_wait_answ(&fdi))) - goto out; - vtyp = IFTOVT(((struct fuse_attr_out *)fdi.answ)->attr.mode); - - if (vnode_vtype(vp) != vtyp) { - if (vnode_vtype(vp) == VNON && vtyp != VNON) { - SDT_PROBE2(fuse, , vnops, trace, 1, "FUSE: Dang! " - "vnode_vtype is VNON and vtype isn't."); - } else { - /* - * STALE vnode, ditch - * - * The vnode has changed its type "behind our back". - * There's nothing really we can do, so let us just - * force an internal revocation and tell the caller to - * try again, if interested. 
- */ - fuse_internal_vnode_disappear(vp); - err = EAGAIN; - } - } - if (err == 0) { - struct fuse_attr_out *fao = (struct fuse_attr_out*)fdi.answ; - fuse_internal_cache_attrs(vp, &fao->attr, fao->attr_valid, - fao->attr_valid_nsec, NULL); - } - -out: - fdisp_destroy(&fdi); - if (!err && sizechanged) { - fuse_vnode_setsize(vp, newsize); - VTOFUD(vp)->flag &= ~FN_SIZECHANGE; - } - return err; + return err; + else + return fuse_internal_setattr(vp, vap, td, cred); } /* struct vnop_strategy_args { struct vnode *a_vp; struct buf *a_bp; }; */ static int fuse_vnop_strategy(struct vop_strategy_args *ap) { struct vnode *vp = ap->a_vp; struct buf *bp = ap->a_bp; if (!vp || fuse_isdeadfs(vp)) { bp->b_ioflags |= BIO_ERROR; bp->b_error = ENXIO; bufdone(bp); - return ENXIO; + return 0; } - if (bp->b_iocmd == BIO_WRITE) - fuse_vnode_refreshsize(vp, NOCRED); - (void)fuse_io_strategy(vp, bp); - /* - * This is a dangerous function. If returns error, that might mean a - * panic. We prefer pretty much anything over being forced to panic - * by a malicious daemon (a demon?). So we just return 0 anyway. You - * should never mind this: this function has its own error - * propagation mechanism via the argument buffer, so - * not-that-melodramatic residents of the call chain still will be - * able to know what to do. + * VOP_STRATEGY always returns zero and signals error via bp->b_ioflags. 
+ * fuse_io_strategy sets bp's error fields */ + (void)fuse_io_strategy(vp, bp); + return 0; } /* struct vnop_symlink_args { struct vnode *a_dvp; struct vnode **a_vpp; struct componentname *a_cnp; struct vattr *a_vap; char *a_target; }; */ static int fuse_vnop_symlink(struct vop_symlink_args *ap) { struct vnode *dvp = ap->a_dvp; struct vnode **vpp = ap->a_vpp; struct componentname *cnp = ap->a_cnp; const char *target = ap->a_target; struct fuse_dispatcher fdi; int err; size_t len; if (fuse_isdeadfs(dvp)) { return ENXIO; } /* * Unlike the other creator type calls, here we have to create a message * where the name of the new entry comes first, and the data describing * the entry comes second. * Hence we can't rely on our handy fuse_internal_newentry() routine, * but put together the message manually and just call the core part. */ len = strlen(target) + 1; fdisp_init(&fdi, len + cnp->cn_namelen + 1); fdisp_make_vp(&fdi, FUSE_SYMLINK, dvp, curthread, NULL); memcpy(fdi.indata, cnp->cn_nameptr, cnp->cn_namelen); ((char *)fdi.indata)[cnp->cn_namelen] = '\0'; memcpy((char *)fdi.indata + cnp->cn_namelen + 1, target, len); err = fuse_internal_newentry_core(dvp, vpp, cnp, VLNK, &fdi); fdisp_destroy(&fdi); return err; } /* struct vnop_write_args { struct vnode *a_vp; struct uio *a_uio; int a_ioflag; struct ucred *a_cred; }; */ static int fuse_vnop_write(struct vop_write_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; int ioflag = ap->a_ioflag; struct ucred *cred = ap->a_cred; + pid_t pid = curthread->td_proc->p_pid; if (fuse_isdeadfs(vp)) { return ENXIO; } - fuse_vnode_refreshsize(vp, cred); if (VTOFUD(vp)->flag & FN_DIRECTIO) { ioflag |= IO_DIRECT; } - return fuse_io_dispatch(vp, uio, ioflag, cred); + return fuse_io_dispatch(vp, uio, ioflag, cred, pid); } -SDT_PROBE_DEFINE1(fuse, , vnops, vnop_getpages_error, "int"); -/* - struct vnop_getpages_args { - struct vnode *a_vp; - vm_page_t *a_m; - int a_count; - int a_reqpage; - }; -*/ -static int 
-fuse_vnop_getpages(struct vop_getpages_args *ap) +static daddr_t +fuse_gbp_getblkno(struct vnode *vp, vm_ooffset_t off) { - int i, error, nextoff, size, toff, count, npages; - struct uio uio; - struct iovec iov; - vm_offset_t kva; - struct buf *bp; - struct vnode *vp; - struct thread *td; - struct ucred *cred; - vm_page_t *pages; + const int biosize = fuse_iosize(vp); - vp = ap->a_vp; - KASSERT(vp->v_object, ("objectless vp passed to getpages")); - td = curthread; /* XXX */ - cred = curthread->td_ucred; /* XXX */ - pages = ap->a_m; - npages = ap->a_count; + return (off / biosize); +} - if (!fsess_opt_mmap(vnode_mount(vp))) { - SDT_PROBE2(fuse, , vnops, trace, 1, - "called on non-cacheable vnode??\n"); - return (VM_PAGER_ERROR); - } +static int +fuse_gbp_getblksz(struct vnode *vp, daddr_t lbn) +{ + off_t filesize; + int blksz, err; + const int biosize = fuse_iosize(vp); - /* - * If the last page is partially valid, just return it and allow - * the pager to zero-out the blanks. Partially valid pages can - * only occur at the file EOF. - * - * XXXGL: is that true for FUSE, which is a local filesystem, - * but still somewhat disconnected from the kernel? - */ - VM_OBJECT_WLOCK(vp->v_object); - if (pages[npages - 1]->valid != 0 && --npages == 0) - goto out; - VM_OBJECT_WUNLOCK(vp->v_object); + err = fuse_vnode_size(vp, &filesize, NULL, NULL); + KASSERT(err == 0, ("vfs_bio_getpages can't handle errors here")); + if (err) + return biosize; - /* - * We use only the kva address for the buffer, but this is extremely - * convenient and fast. 
- */ - bp = uma_zalloc(fuse_pbuf_zone, M_WAITOK); - - kva = (vm_offset_t)bp->b_data; - pmap_qenter(kva, pages, npages); - VM_CNT_INC(v_vnodein); - VM_CNT_ADD(v_vnodepgsin, npages); - - count = npages << PAGE_SHIFT; - iov.iov_base = (caddr_t)kva; - iov.iov_len = count; - uio.uio_iov = &iov; - uio.uio_iovcnt = 1; - uio.uio_offset = IDX_TO_OFF(pages[0]->pindex); - uio.uio_resid = count; - uio.uio_segflg = UIO_SYSSPACE; - uio.uio_rw = UIO_READ; - uio.uio_td = td; - - error = fuse_io_dispatch(vp, &uio, IO_DIRECT, cred); - pmap_qremove(kva, npages); - - uma_zfree(fuse_pbuf_zone, bp); - - if (error && (uio.uio_resid == count)) { - SDT_PROBE1(fuse, , vnops, vnop_getpages_error, error); - return VM_PAGER_ERROR; + if ((off_t)lbn * biosize >= filesize) { + blksz = 0; + } else if ((off_t)(lbn + 1) * biosize > filesize) { + blksz = filesize - (off_t)lbn *biosize; + } else { + blksz = biosize; } - /* - * Calculate the number of bytes read and validate only that number - * of bytes. Note that due to pending writes, size may be 0. This - * does not mean that the remaining data is invalid! - */ - - size = count - uio.uio_resid; - VM_OBJECT_WLOCK(vp->v_object); - fuse_vm_page_lock_queues(); - for (i = 0, toff = 0; i < npages; i++, toff = nextoff) { - vm_page_t m; - - nextoff = toff + PAGE_SIZE; - m = pages[i]; - - if (nextoff <= size) { - /* - * Read operation filled an entire page - */ - m->valid = VM_PAGE_BITS_ALL; - KASSERT(m->dirty == 0, - ("fuse_getpages: page %p is dirty", m)); - } else if (size > toff) { - /* - * Read operation filled a partial page. - */ - m->valid = 0; - vm_page_set_valid_range(m, 0, size - toff); - KASSERT(m->dirty == 0, - ("fuse_getpages: page %p is dirty", m)); - } else { - /* - * Read operation was short. If no error occurred - * we may have hit a zero-fill section. We simply - * leave valid set to 0. 
- */ - ; - } - } - fuse_vm_page_unlock_queues(); -out: - VM_OBJECT_WUNLOCK(vp->v_object); - if (ap->a_rbehind) - *ap->a_rbehind = 0; - if (ap->a_rahead) - *ap->a_rahead = 0; - return (VM_PAGER_OK); + return (blksz); } /* - struct vnop_putpages_args { + struct vnop_getpages_args { struct vnode *a_vp; vm_page_t *a_m; int a_count; - int a_sync; - int *a_rtvals; - vm_ooffset_t a_offset; + int a_reqpage; }; */ static int -fuse_vnop_putpages(struct vop_putpages_args *ap) +fuse_vnop_getpages(struct vop_getpages_args *ap) { - struct uio uio; - struct iovec iov; - vm_offset_t kva; - struct buf *bp; - int i, error, npages, count; - off_t offset; - int *rtvals; - struct vnode *vp; - struct thread *td; - struct ucred *cred; - vm_page_t *pages; - vm_ooffset_t fsize; + struct vnode *vp = ap->a_vp; - vp = ap->a_vp; - KASSERT(vp->v_object, ("objectless vp passed to putpages")); - fsize = vp->v_object->un_pager.vnp.vnp_size; - td = curthread; /* XXX */ - cred = curthread->td_ucred; /* XXX */ - pages = ap->a_m; - count = ap->a_count; - rtvals = ap->a_rtvals; - npages = btoc(count); - offset = IDX_TO_OFF(pages[0]->pindex); - if (!fsess_opt_mmap(vnode_mount(vp))) { - SDT_PROBE2(fuse, , vnops, trace, 1, + SDT_PROBE2(fusefs, , vnops, trace, 1, "called on non-cacheable vnode??\n"); + return (VM_PAGER_ERROR); } - for (i = 0; i < npages; i++) - rtvals[i] = VM_PAGER_AGAIN; - /* - * When putting pages, do not extend file past EOF. - */ - - if (offset + count > fsize) { - count = fsize - offset; - if (count < 0) - count = 0; - } - /* - * We use only the kva address for the buffer, but this is extremely - * convenient and fast. 
- */ - bp = uma_zalloc(fuse_pbuf_zone, M_WAITOK); - - kva = (vm_offset_t)bp->b_data; - pmap_qenter(kva, pages, npages); - VM_CNT_INC(v_vnodeout); - VM_CNT_ADD(v_vnodepgsout, count); - - iov.iov_base = (caddr_t)kva; - iov.iov_len = count; - uio.uio_iov = &iov; - uio.uio_iovcnt = 1; - uio.uio_offset = offset; - uio.uio_resid = count; - uio.uio_segflg = UIO_SYSSPACE; - uio.uio_rw = UIO_WRITE; - uio.uio_td = td; - - error = fuse_io_dispatch(vp, &uio, IO_DIRECT, cred); - - pmap_qremove(kva, npages); - uma_zfree(fuse_pbuf_zone, bp); - - if (!error) { - int nwritten = round_page(count - uio.uio_resid) / PAGE_SIZE; - - for (i = 0; i < nwritten; i++) { - rtvals[i] = VM_PAGER_OK; - VM_OBJECT_WLOCK(pages[i]->object); - vm_page_undirty(pages[i]); - VM_OBJECT_WUNLOCK(pages[i]->object); - } - } - return rtvals[0]; + return (vfs_bio_getpages(vp, ap->a_m, ap->a_count, ap->a_rbehind, + ap->a_rahead, fuse_gbp_getblkno, fuse_gbp_getblksz)); } static const char extattr_namespace_separator = '.'; /* struct vop_getextattr_args { struct vop_generic_args a_gen; struct vnode *a_vp; int a_attrnamespace; const char *a_name; struct uio *a_uio; size_t *a_size; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_getextattr(struct vop_getextattr_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; struct fuse_dispatcher fdi; struct fuse_getxattr_in *get_xattr_in; struct fuse_getxattr_out *get_xattr_out; struct mount *mp = vnode_mount(vp); struct thread *td = ap->a_td; struct ucred *cred = ap->a_cred; char *prefix; char *attr_str; size_t len; int err; if (fuse_isdeadfs(vp)) return (ENXIO); + if (!fsess_isimpl(mp, FUSE_GETXATTR)) + return EOPNOTSUPP; + + err = fuse_extattr_check_cred(vp, ap->a_attrnamespace, cred, td, VREAD); + if (err) + return err; + /* Default to looking for user attributes. 
*/ if (ap->a_attrnamespace == EXTATTR_NAMESPACE_SYSTEM) prefix = EXTATTR_NAMESPACE_SYSTEM_STRING; else prefix = EXTATTR_NAMESPACE_USER_STRING; len = strlen(prefix) + sizeof(extattr_namespace_separator) + strlen(ap->a_name) + 1; fdisp_init(&fdi, len + sizeof(*get_xattr_in)); fdisp_make_vp(&fdi, FUSE_GETXATTR, vp, td, cred); get_xattr_in = fdi.indata; /* * Check to see whether we're querying the available size or * issuing the actual request. If we pass in 0, we get back struct * fuse_getxattr_out. If we pass in a non-zero size, we get back * that much data, without the struct fuse_getxattr_out header. */ if (uio == NULL) get_xattr_in->size = 0; else get_xattr_in->size = uio->uio_resid; attr_str = (char *)fdi.indata + sizeof(*get_xattr_in); snprintf(attr_str, len, "%s%c%s", prefix, extattr_namespace_separator, ap->a_name); err = fdisp_wait_answ(&fdi); if (err != 0) { - if (err == ENOSYS) + if (err == ENOSYS) { fsess_set_notimpl(mp, FUSE_GETXATTR); + err = EOPNOTSUPP; + } goto out; } get_xattr_out = fdi.answ; if (ap->a_size != NULL) *ap->a_size = get_xattr_out->size; if (uio != NULL) err = uiomove(fdi.answ, fdi.iosize, uio); out: fdisp_destroy(&fdi); return (err); } /* struct vop_setextattr_args { struct vop_generic_args a_gen; struct vnode *a_vp; int a_attrnamespace; const char *a_name; struct uio *a_uio; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_setextattr(struct vop_setextattr_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; struct fuse_dispatcher fdi; struct fuse_setxattr_in *set_xattr_in; struct mount *mp = vnode_mount(vp); struct thread *td = ap->a_td; struct ucred *cred = ap->a_cred; char *prefix; size_t len; char *attr_str; int err; if (fuse_isdeadfs(vp)) return (ENXIO); + if (!fsess_isimpl(mp, FUSE_SETXATTR)) + return EOPNOTSUPP; + + if (vfs_isrdonly(mp)) + return EROFS; + + /* Deleting xattrs must use VOP_DELETEEXTATTR instead */ + if (ap->a_uio == NULL) { + /* + * If we got here as fallback from 
VOP_DELETEEXTATTR, then + * return EOPNOTSUPP. + */ + if (!fsess_isimpl(mp, FUSE_REMOVEXATTR)) + return (EOPNOTSUPP); + else + return (EINVAL); + } + + err = fuse_extattr_check_cred(vp, ap->a_attrnamespace, cred, td, + VWRITE); + if (err) + return err; + /* Default to looking for user attributes. */ if (ap->a_attrnamespace == EXTATTR_NAMESPACE_SYSTEM) prefix = EXTATTR_NAMESPACE_SYSTEM_STRING; else prefix = EXTATTR_NAMESPACE_USER_STRING; len = strlen(prefix) + sizeof(extattr_namespace_separator) + strlen(ap->a_name) + 1; fdisp_init(&fdi, len + sizeof(*set_xattr_in) + uio->uio_resid); fdisp_make_vp(&fdi, FUSE_SETXATTR, vp, td, cred); set_xattr_in = fdi.indata; set_xattr_in->size = uio->uio_resid; attr_str = (char *)fdi.indata + sizeof(*set_xattr_in); snprintf(attr_str, len, "%s%c%s", prefix, extattr_namespace_separator, ap->a_name); err = uiomove((char *)fdi.indata + sizeof(*set_xattr_in) + len, uio->uio_resid, uio); if (err != 0) { goto out; } err = fdisp_wait_answ(&fdi); - if (err != 0) { - if (err == ENOSYS) - fsess_set_notimpl(mp, FUSE_SETXATTR); - goto out; + if (err == ENOSYS) { + fsess_set_notimpl(mp, FUSE_SETXATTR); + err = EOPNOTSUPP; } + if (err == ERESTART) { + /* Can't restart after calling uiomove */ + err = EINTR; + } out: fdisp_destroy(&fdi); return (err); } /* * The Linux / FUSE extended attribute list is simply a collection of * NUL-terminated strings. The FreeBSD extended attribute list is a single * byte length followed by a non-NUL terminated string. So, this allows * conversion of the Linux / FUSE format to the FreeBSD format in place. * Linux attribute names are reported with the namespace as a prefix (e.g. * "user.attribute_name"), but in FreeBSD they are reported without the * namespace prefix (e.g. "attribute_name"). So, we're going from: * * user.attr_name1\0user.attr_name2\0 * * to: * * attr_name1attr_name2 * * Where "" is a single byte number of characters in the attribute name. 
* * Args: * prefix - extattr namespace prefix string * list, list_len - input list with namespace prefixes * bsd_list, bsd_list_len - output list compatible with bsd vfs */ static int fuse_xattrlist_convert(char *prefix, const char *list, int list_len, char *bsd_list, int *bsd_list_len) { int len, pos, dist_to_next, prefix_len; pos = 0; *bsd_list_len = 0; prefix_len = strlen(prefix); while (pos < list_len && list[pos] != '\0') { dist_to_next = strlen(&list[pos]) + 1; if (bcmp(&list[pos], prefix, prefix_len) == 0 && list[pos + prefix_len] == extattr_namespace_separator) { len = dist_to_next - (prefix_len + sizeof(extattr_namespace_separator)) - 1; if (len >= EXTATTR_MAXNAMELEN) return (ENAMETOOLONG); bsd_list[*bsd_list_len] = len; memcpy(&bsd_list[*bsd_list_len + 1], &list[pos + prefix_len + sizeof(extattr_namespace_separator)], len); *bsd_list_len += len + 1; } pos += dist_to_next; } return (0); } /* struct vop_listextattr_args { struct vop_generic_args a_gen; struct vnode *a_vp; int a_attrnamespace; struct uio *a_uio; size_t *a_size; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_listextattr(struct vop_listextattr_args *ap) { struct vnode *vp = ap->a_vp; struct uio *uio = ap->a_uio; struct fuse_dispatcher fdi; struct fuse_listxattr_in *list_xattr_in; struct fuse_listxattr_out *list_xattr_out; struct mount *mp = vnode_mount(vp); struct thread *td = ap->a_td; struct ucred *cred = ap->a_cred; size_t len; char *prefix; char *attr_str; char *bsd_list = NULL; char *linux_list; int bsd_list_len; int linux_list_len; int err; if (fuse_isdeadfs(vp)) return (ENXIO); + if (!fsess_isimpl(mp, FUSE_LISTXATTR)) + return EOPNOTSUPP; + + err = fuse_extattr_check_cred(vp, ap->a_attrnamespace, cred, td, VREAD); + if (err) + return err; + /* * Add space for a NUL and the period separator if enabled. * Default to looking for user attributes.
*/ if (ap->a_attrnamespace == EXTATTR_NAMESPACE_SYSTEM) prefix = EXTATTR_NAMESPACE_SYSTEM_STRING; else prefix = EXTATTR_NAMESPACE_USER_STRING; len = strlen(prefix) + sizeof(extattr_namespace_separator) + 1; fdisp_init(&fdi, sizeof(*list_xattr_in) + len); fdisp_make_vp(&fdi, FUSE_LISTXATTR, vp, td, cred); /* * Retrieve Linux / FUSE compatible list size. */ list_xattr_in = fdi.indata; list_xattr_in->size = 0; attr_str = (char *)fdi.indata + sizeof(*list_xattr_in); snprintf(attr_str, len, "%s%c", prefix, extattr_namespace_separator); err = fdisp_wait_answ(&fdi); if (err != 0) { - if (err == ENOSYS) + if (err == ENOSYS) { fsess_set_notimpl(mp, FUSE_LISTXATTR); + err = EOPNOTSUPP; + } goto out; } list_xattr_out = fdi.answ; linux_list_len = list_xattr_out->size; if (linux_list_len == 0) { if (ap->a_size != NULL) *ap->a_size = linux_list_len; goto out; } /* * Retrieve Linux / FUSE compatible list values. */ - fdisp_make_vp(&fdi, FUSE_LISTXATTR, vp, td, cred); + fdisp_refresh_vp(&fdi, FUSE_LISTXATTR, vp, td, cred); list_xattr_in = fdi.indata; list_xattr_in->size = linux_list_len + sizeof(*list_xattr_out); attr_str = (char *)fdi.indata + sizeof(*list_xattr_in); snprintf(attr_str, len, "%s%c", prefix, extattr_namespace_separator); err = fdisp_wait_answ(&fdi); if (err != 0) goto out; linux_list = fdi.answ; linux_list_len = fdi.iosize; /* * Retrieve the BSD compatible list values. * The Linux / FUSE attribute list format isn't the same * as FreeBSD's format. So we need to transform it into * FreeBSD's format before giving it to the user. 
*/ bsd_list = malloc(linux_list_len, M_TEMP, M_WAITOK); err = fuse_xattrlist_convert(prefix, linux_list, linux_list_len, bsd_list, &bsd_list_len); if (err != 0) goto out; if (ap->a_size != NULL) *ap->a_size = bsd_list_len; if (uio != NULL) err = uiomove(bsd_list, bsd_list_len, uio); out: free(bsd_list, M_TEMP); fdisp_destroy(&fdi); return (err); } /* struct vop_deleteextattr_args { struct vop_generic_args a_gen; struct vnode *a_vp; int a_attrnamespace; const char *a_name; struct ucred *a_cred; struct thread *a_td; }; */ static int fuse_vnop_deleteextattr(struct vop_deleteextattr_args *ap) { struct vnode *vp = ap->a_vp; struct fuse_dispatcher fdi; struct mount *mp = vnode_mount(vp); struct thread *td = ap->a_td; struct ucred *cred = ap->a_cred; char *prefix; size_t len; char *attr_str; int err; if (fuse_isdeadfs(vp)) return (ENXIO); + if (!fsess_isimpl(mp, FUSE_REMOVEXATTR)) + return EOPNOTSUPP; + + if (vfs_isrdonly(mp)) + return EROFS; + + err = fuse_extattr_check_cred(vp, ap->a_attrnamespace, cred, td, + VWRITE); + if (err) + return err; + /* Default to looking for user attributes. 
*/ if (ap->a_attrnamespace == EXTATTR_NAMESPACE_SYSTEM) prefix = EXTATTR_NAMESPACE_SYSTEM_STRING; else prefix = EXTATTR_NAMESPACE_USER_STRING; len = strlen(prefix) + sizeof(extattr_namespace_separator) + strlen(ap->a_name) + 1; fdisp_init(&fdi, len); fdisp_make_vp(&fdi, FUSE_REMOVEXATTR, vp, td, cred); attr_str = fdi.indata; snprintf(attr_str, len, "%s%c%s", prefix, extattr_namespace_separator, ap->a_name); err = fdisp_wait_answ(&fdi); - if (err != 0) { - if (err == ENOSYS) - fsess_set_notimpl(mp, FUSE_REMOVEXATTR); + if (err == ENOSYS) { + fsess_set_notimpl(mp, FUSE_REMOVEXATTR); + err = EOPNOTSUPP; } fdisp_destroy(&fdi); return (err); } /* struct vnop_print_args { struct vnode *a_vp; }; */ static int fuse_vnop_print(struct vop_print_args *ap) { struct fuse_vnode_data *fvdat = VTOFUD(ap->a_vp); printf("nodeid: %ju, parent nodeid: %ju, nlookup: %ju, flag: %#x\n", (uintmax_t)VTOILLU(ap->a_vp), (uintmax_t)fvdat->parent_nid, (uintmax_t)fvdat->nlookup, fvdat->flag); return 0; } + +/* + * Get an NFS filehandle for a FUSE file. + * + * This will only work for FUSE file systems that guarantee the uniqueness of + * nodeid:generation, which most don't. 
+ */ +/* +vop_vptofh { + IN struct vnode *a_vp; + IN struct fid *a_fhp; +}; +*/ +static int +fuse_vnop_vptofh(struct vop_vptofh_args *ap) +{ + struct vnode *vp = ap->a_vp; + struct fuse_vnode_data *fvdat = VTOFUD(vp); + struct fuse_fid *fhp = (struct fuse_fid *)(ap->a_fhp); + _Static_assert(sizeof(struct fuse_fid) <= sizeof(struct fid), + "FUSE fid type is too big"); + struct mount *mp = vnode_mount(vp); + struct fuse_data *data = fuse_get_mpdata(mp); + struct vattr va; + int err; + + if (!(data->dataflags & FSESS_EXPORT_SUPPORT)) + return EOPNOTSUPP; + + err = fuse_internal_getattr(vp, &va, curthread->td_ucred, curthread); + if (err) + return err; + + /*ip = VTOI(ap->a_vp);*/ + /*ufhp = (struct ufid *)ap->a_fhp;*/ + fhp->len = sizeof(struct fuse_fid); + fhp->nid = fvdat->nid; + if (fvdat->generation <= UINT32_MAX) + fhp->gen = fvdat->generation; + else + return EOVERFLOW; + return (0); +} + + Index: head/sys/sys/param.h =================================================================== --- head/sys/sys/param.h (revision 350664) +++ head/sys/sys/param.h (revision 350665) @@ -1,367 +1,367 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)param.h 8.3 (Berkeley) 4/4/95 * $FreeBSD$ */ #ifndef _SYS_PARAM_H_ #define _SYS_PARAM_H_ #include #define BSD 199506 /* System version (year & month). */ #define BSD4_3 1 #define BSD4_4 1 /* * __FreeBSD_version numbers are documented in the Porter's Handbook. * If you bump the version for any reason, you should update the documentation * there. * Currently this lives here in the doc/ repository: * * head/en_US.ISO8859-1/books/porters-handbook/versions/chapter.xml * * scheme is: Rxx * 'R' is in the range 0 to 4 if this is a release branch or * X.0-CURRENT before releng/X.0 is created, otherwise 'R' is * in the range 5 to 9. 
*/ #undef __FreeBSD_version -#define __FreeBSD_version 1300038 /* Master, propagated to newvers */ +#define __FreeBSD_version 1300039 /* Master, propagated to newvers */ /* * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD, * which by definition is always true on FreeBSD. This macro is also defined * on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD. * * It is tempting to use this macro in userland code when we want to enable * kernel-specific routines, and in fact it's fine to do this in code that * is part of FreeBSD itself. However, be aware that as presence of this * macro is still not widespread (e.g. older FreeBSD versions, 3rd party * compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in * external applications without also checking for __FreeBSD__ as an * alternative. */ #undef __FreeBSD_kernel__ #define __FreeBSD_kernel__ #if defined(_KERNEL) || defined(IN_RTLD) #define P_OSREL_SIGWAIT 700000 #define P_OSREL_SIGSEGV 700004 #define P_OSREL_MAP_ANON 800104 #define P_OSREL_MAP_FSTRICT 1100036 #define P_OSREL_SHUTDOWN_ENOTCONN 1100077 #define P_OSREL_MAP_GUARD 1200035 #define P_OSREL_WRFSBASE 1200041 #define P_OSREL_CK_CYLGRP 1200046 #define P_OSREL_VMTOTAL64 1200054 #define P_OSREL_CK_SUPERBLOCK 1300000 #define P_OSREL_CK_INODE 1300005 #define P_OSREL_MAJOR(x) ((x) / 100000) #endif #ifndef LOCORE #include #endif /* * Machine-independent constants (some used in following include files). * Redefined constants are from POSIX 1003.1 limits file. * * MAXCOMLEN should be >= sizeof(ac_comm) (see ) */ #include #define MAXCOMLEN 19 /* max command name remembered */ #define MAXINTERP PATH_MAX /* max interpreter file name length */ #define MAXLOGNAME 33 /* max login name length (incl. 
NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ #define NGROUPS (NGROUPS_MAX+1) /* max number groups */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ #define SPECNAMELEN 255 /* max length of devicename */ /* More types and definitions used throughout the kernel. */ #ifdef _KERNEL #include #include #ifndef LOCORE #include #include #endif #ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif #endif #ifndef _KERNEL /* Signals. */ #include #endif /* Machine type dependent parameters. */ #include #ifndef _KERNEL #include #endif #ifndef DEV_BSHIFT #define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ #endif #define DEV_BSIZE (1<>PAGE_SHIFT) #endif /* * btodb() is messy and perhaps slow because `bytes' may be an off_t. We * want to shift an unsigned type to avoid sign extension and we don't * want to widen `bytes' unnecessarily. Assume that the result fits in * a daddr_t. */ #ifndef btodb #define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \ (sizeof (bytes) > sizeof(long) \ ? (daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \ : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT)) #endif #ifndef dbtob #define dbtob(db) /* calculates (db * DEV_BSIZE) */ \ ((off_t)(db) << DEV_BSHIFT) #endif #define PRIMASK 0x0ff #define PCATCH 0x100 /* OR'd with pri for tsleep to check signals */ #define PDROP 0x200 /* OR'd with pri to stop re-entry of interlock mutex */ #define NZERO 0 /* default "nice" */ #define NBBY 8 /* number of bits in a byte */ #define NBPW sizeof(int) /* number of bytes per word (integer) */ #define CMASK 022 /* default file mask: S_IWGRP|S_IWOTH */ #define NODEV (dev_t)(-1) /* non-existent device */ /* * File system parameters and macros. * * MAXBSIZE - Filesystems are made out of blocks of at most MAXBSIZE bytes * per block. 
MAXBSIZE may be made larger without affecting * any existing filesystems as long as it does not exceed MAXPHYS, * and may be made smaller at the risk of not being able to use * filesystems which require a block size exceeding MAXBSIZE. * * MAXBCACHEBUF - Maximum size of a buffer in the buffer cache. This must * be >= MAXBSIZE and can be set differently for different * architectures by defining it in . * Making this larger allows NFS to do larger reads/writes. * * BKVASIZE - Nominal buffer space per buffer, in bytes. BKVASIZE is the * minimum KVM memory reservation the kernel is willing to make. * Filesystems can of course request smaller chunks. Actual * backing memory uses a chunk size of a page (PAGE_SIZE). * The default value here can be overridden on a per-architecture * basis by defining it in . * * If you make BKVASIZE too small you risk seriously fragmenting * the buffer KVM map which may slow things down a bit. If you * make it too big the kernel will not be able to optimally use * the KVM memory reserved for the buffer cache and will wind * up with too-few buffers. * * The default is 16384, roughly 2x the block size used by a * normal UFS filesystem. */ #define MAXBSIZE 65536 /* must be power of 2 */ #ifndef MAXBCACHEBUF #define MAXBCACHEBUF MAXBSIZE /* must be a power of 2 >= MAXBSIZE */ #endif #ifndef BKVASIZE #define BKVASIZE 16384 /* must be power of 2 */ #endif #define BKVAMASK (BKVASIZE-1) /* * MAXPATHLEN defines the longest permissible path length after expanding * symbolic links. It is used to allocate a temporary buffer from the buffer * pool in which to do the name expansion, hence should be a power of two, * and must be less than or equal to MAXBSIZE. MAXSYMLINKS defines the * maximum number of symbolic links that may be expanded in a path name. * It should be set high enough to allow all legitimate uses, but halt * infinite loops reasonably quickly. */ #define MAXPATHLEN PATH_MAX #define MAXSYMLINKS 32 /* Bit map related macros.
*/ #define setbit(a,i) (((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY)) #define clrbit(a,i) (((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY))) #define isset(a,i) \ (((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) #define isclr(a,i) \ ((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0) /* Macros for counting and rounding. */ #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #define nitems(x) (sizeof((x)) / sizeof((x)[0])) #define rounddown(x, y) (((x)/(y))*(y)) #define rounddown2(x, y) ((x)&(~((y)-1))) /* if y is power of two */ #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #define roundup2(x, y) (((x)+((y)-1))&(~((y)-1))) /* if y is power of two */ #define powerof2(x) ((((x)-1)&(x))==0) /* Macros for min/max. */ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) #ifdef _KERNEL /* * Basic byte order function prototypes for non-inline functions. */ #ifndef LOCORE #ifndef _BYTEORDER_PROTOTYPED #define _BYTEORDER_PROTOTYPED __BEGIN_DECLS __uint32_t htonl(__uint32_t); __uint16_t htons(__uint16_t); __uint32_t ntohl(__uint32_t); __uint16_t ntohs(__uint16_t); __END_DECLS #endif #endif #ifndef _BYTEORDER_FUNC_DEFINED #define _BYTEORDER_FUNC_DEFINED #define htonl(x) __htonl(x) #define htons(x) __htons(x) #define ntohl(x) __ntohl(x) #define ntohs(x) __ntohs(x) #endif /* !_BYTEORDER_FUNC_DEFINED */ #endif /* _KERNEL */ /* * Scale factor for scaled integers used to count %cpu time and load avgs. * * The number of CPU `tick's that map to a unique `%age' can be expressed * by the formula (1 / (2 ^ (FSHIFT - 11))). The maximum load average that * can be calculated (assuming 32 bits) can be closely approximated using * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15). * * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age', * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024.
*/ #define FSHIFT 11 /* bits to right of fixed binary point */ #define FSCALE (1<> (PAGE_SHIFT - DEV_BSHIFT)) #define ctodb(db) /* calculates pages to devblks */ \ ((db) << (PAGE_SHIFT - DEV_BSHIFT)) /* * Old spelling of __containerof(). */ #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) /* * Access a variable length array that has been declared as a fixed * length array. */ #define __PAST_END(array, offset) (((__typeof__(*(array)) *)(array))[offset]) #endif /* _SYS_PARAM_H_ */ Index: head/tests/sys/fs/Makefile =================================================================== --- head/tests/sys/fs/Makefile (revision 350664) +++ head/tests/sys/fs/Makefile (revision 350665) @@ -1,23 +1,28 @@ # $FreeBSD$ +.include + PACKAGE= tests TESTSDIR= ${TESTSBASE}/sys/fs TESTSRC= ${SRCTOP}/contrib/netbsd-tests/fs #TESTS_SUBDIRS+= nullfs # XXX: needs rump +.if ${COMPILER_FEATURES:Mc++14} +TESTS_SUBDIRS+= fusefs +.endif TESTS_SUBDIRS+= tmpfs ${PACKAGE}FILES+= h_funcs.subr ${PACKAGE}FILESDIR= ${TESTSDIR} CLEANFILES+= h_funcs.subr CLEANFILES+= h_funcs.subr.tmp h_funcs.subr: ${TESTSRC}/h_funcs.subr cat ${.ALLSRC} | \ sed -e '/atf_require_prog mount_$${name}/d' >>${.TARGET}.tmp mv ${.TARGET}.tmp ${.TARGET} .include Index: head/tests/sys/fs/fusefs/Makefile =================================================================== --- head/tests/sys/fs/fusefs/Makefile (nonexistent) +++ head/tests/sys/fs/fusefs/Makefile (revision 350665) @@ -0,0 +1,81 @@ +# $FreeBSD$ + +PACKAGE= tests + +TESTSDIR= ${TESTSBASE}/sys/fs/fusefs + +# We could simply link all of these files into a single executable. But since +# Kyua treats googletest programs as plain tests, it's better to separate them +# out, so we get more granular reporting. 
+GTESTS+= access +GTESTS+= allow_other +GTESTS+= bmap +GTESTS+= create +GTESTS+= default_permissions +GTESTS+= default_permissions_privileged +GTESTS+= destroy +GTESTS+= dev_fuse_poll +GTESTS+= fifo +GTESTS+= flush +GTESTS+= forget +GTESTS+= fsync +GTESTS+= fsyncdir +GTESTS+= getattr +GTESTS+= interrupt +GTESTS+= io +GTESTS+= link +GTESTS+= locks +GTESTS+= lookup +GTESTS+= mkdir +GTESTS+= mknod +GTESTS+= mount +GTESTS+= nfs +GTESTS+= notify +GTESTS+= open +GTESTS+= opendir +GTESTS+= read +GTESTS+= readdir +GTESTS+= readlink +GTESTS+= release +GTESTS+= releasedir +GTESTS+= rename +GTESTS+= rmdir +GTESTS+= setattr +GTESTS+= statfs +GTESTS+= symlink +GTESTS+= unlink +GTESTS+= write +GTESTS+= xattr + +.for p in ${GTESTS} +SRCS.$p+= ${p}.cc +SRCS.$p+= getmntopts.c +SRCS.$p+= mockfs.cc +SRCS.$p+= utils.cc +.endfor + +TEST_METADATA.default_permissions+= required_user="unprivileged" +TEST_METADATA.default_permissions_privileged+= required_user="root" +TEST_METADATA.mknod+= required_user="root" +TEST_METADATA.nfs+= required_user="root" + +# TODO: drastically increase timeout after test development is mostly complete +TEST_METADATA+= timeout=10 + +FUSEFS= ${SRCTOP}/sys/fs/fuse +MOUNT= ${SRCTOP}/sbin/mount +# Suppress warnings that GCC generates for the libc++ and gtest headers. 
+CXXWARNFLAGS.gcc+= -Wno-placement-new -Wno-attributes -Wno-class-memaccess +CXXFLAGS+= -I${SRCTOP}/tests +CXXFLAGS+= -I${FUSEFS} +CXXFLAGS+= -I${MOUNT} +.PATH: ${MOUNT} +CXXSTD= c++14 + +LIBADD+= pthread +LIBADD+= gmock gtest +LIBADD+= util + +WARNS?= 6 + +.include Property changes on: head/tests/sys/fs/fusefs/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/readdir.cc =================================================================== --- head/tests/sys/fs/fusefs/readdir.cc (nonexistent) +++ head/tests/sys/fs/fusefs/readdir.cc (revision 350665) @@ -0,0 +1,375 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; +using namespace std; + +class Readdir: public FuseTest { +public: +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFDIR | 0755, 0, 1); +} +}; + +class Readdir_7_8: public Readdir { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + Readdir::SetUp(); +} + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup_7_8(relpath, ino, S_IFDIR | 0755, 0, 1); +} +}; + +/* FUSE_READDIR returns nothing but "." and ".." 
*/ +TEST_F(Readdir, dots) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + DIR *dir; + struct dirent *de; + vector ents(2); + vector empty_ents(0); + const char dot[] = "."; + const char dotdot[] = ".."; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + ents[0].d_fileno = 2; + ents[0].d_off = 2000; + ents[0].d_namlen = sizeof(dotdot); + ents[0].d_type = DT_DIR; + strncpy(ents[0].d_name, dotdot, ents[0].d_namlen); + ents[1].d_fileno = 3; + ents[1].d_off = 3000; + ents[1].d_namlen = sizeof(dot); + ents[1].d_type = DT_DIR; + strncpy(ents[1].d_name, dot, ents[1].d_namlen); + expect_readdir(ino, 0, ents); + expect_readdir(ino, 3000, empty_ents); + + errno = 0; + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); + + errno = 0; + de = readdir(dir); + ASSERT_NE(nullptr, de) << strerror(errno); + EXPECT_EQ(2ul, de->d_fileno); + /* + * fuse(4) doesn't actually set d_off, which is ok for now because + * nothing uses it. 
+ */ + //EXPECT_EQ(2000, de->d_off); + EXPECT_EQ(DT_DIR, de->d_type); + EXPECT_EQ(sizeof(dotdot), de->d_namlen); + EXPECT_EQ(0, strcmp(dotdot, de->d_name)); + + errno = 0; + de = readdir(dir); + ASSERT_NE(nullptr, de) << strerror(errno); + EXPECT_EQ(3ul, de->d_fileno); + //EXPECT_EQ(3000, de->d_off); + EXPECT_EQ(DT_DIR, de->d_type); + EXPECT_EQ(sizeof(dot), de->d_namlen); + EXPECT_EQ(0, strcmp(dot, de->d_name)); + + ASSERT_EQ(nullptr, readdir(dir)); + ASSERT_EQ(0, errno); + + leakdir(dir); +} + +TEST_F(Readdir, eio) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + DIR *dir; + struct dirent *de; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino && + in.body.readdir.offset == 0); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EIO))); + + errno = 0; + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); + + errno = 0; + de = readdir(dir); + ASSERT_EQ(nullptr, de); + ASSERT_EQ(EIO, errno); + + leakdir(dir); +} + +/* getdirentries(2) can use a larger buffer size than readdir(3) */ +TEST_F(Readdir, getdirentries) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + char buf[8192]; + ssize_t r; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino && + in.body.readdir.size == 8192); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.error = 0; + out.header.len = sizeof(out.header); + }))); + + fd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + r = getdirentries(fd, buf, sizeof(buf), 0); + ASSERT_EQ(0, r) << strerror(errno); + + leak(fd); +} + +/* + * Nothing bad should happen 
if getdirentries is called on two file descriptors + * which were concurrently open, but one has already been closed. + * This is a regression test for a specific bug dating from r238402. + */ +TEST_F(Readdir, getdirentries_concurrent) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd0, fd1; + char buf[8192]; + ssize_t r; + + FuseTest::expect_lookup(RELPATH, ino, S_IFDIR | 0755, 0, 2); + expect_opendir(ino); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino && + in.body.readdir.size == 8192); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.error = 0; + out.header.len = sizeof(out.header); + }))); + + fd0 = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd0) << strerror(errno); + + fd1 = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd1) << strerror(errno); + + r = getdirentries(fd0, buf, sizeof(buf), 0); + ASSERT_EQ(0, r) << strerror(errno); + + EXPECT_EQ(0, close(fd0)) << strerror(errno); + + r = getdirentries(fd1, buf, sizeof(buf), 0); + ASSERT_EQ(0, r) << strerror(errno); + + leak(fd0); + leak(fd1); +} + +/* + * FUSE_READDIR returns nothing, not even "." and "..". This is legal, though + * the filesystem obviously won't be fully functional. 
+ */ +TEST_F(Readdir, nodots) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + DIR *dir; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.error = 0; + out.header.len = sizeof(out.header); + }))); + + errno = 0; + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); + errno = 0; + ASSERT_EQ(nullptr, readdir(dir)); + ASSERT_EQ(0, errno); + + leakdir(dir); +} + +/* telldir(3) and seekdir(3) should work with fuse */ +TEST_F(Readdir, seekdir) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + DIR *dir; + struct dirent *de; + /* + * use enough entries to be > 4096 bytes, so getdirentries must be + * called + * multiple times. 
+ */ + vector<struct dirent> ents0(122), ents1(102), ents2(30); + long bookmark; + int i = 0; + + for (auto& it: ents0) { + snprintf(it.d_name, MAXNAMLEN, "file.%d", i); + it.d_fileno = 2 + i; + it.d_off = (2 + i) * 1000; + it.d_namlen = strlen(it.d_name); + it.d_type = DT_REG; + i++; + } + for (auto& it: ents1) { + snprintf(it.d_name, MAXNAMLEN, "file.%d", i); + it.d_fileno = 2 + i; + it.d_off = (2 + i) * 1000; + it.d_namlen = strlen(it.d_name); + it.d_type = DT_REG; + i++; + } + for (auto& it: ents2) { + snprintf(it.d_name, MAXNAMLEN, "file.%d", i); + it.d_fileno = 2 + i; + it.d_off = (2 + i) * 1000; + it.d_namlen = strlen(it.d_name); + it.d_type = DT_REG; + i++; + } + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + + expect_readdir(ino, 0, ents0); + expect_readdir(ino, 123000, ents1); + expect_readdir(ino, 225000, ents2); + + errno = 0; + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); + + for (i=0; i < 128; i++) { + errno = 0; + de = readdir(dir); + ASSERT_NE(nullptr, de) << strerror(errno); + EXPECT_EQ(2 + (ino_t)i, de->d_fileno); + } + bookmark = telldir(dir); + + for (; i < 232; i++) { + errno = 0; + de = readdir(dir); + ASSERT_NE(nullptr, de) << strerror(errno); + EXPECT_EQ(2 + (ino_t)i, de->d_fileno); + } + + seekdir(dir, bookmark); + de = readdir(dir); + ASSERT_NE(nullptr, de) << strerror(errno); + EXPECT_EQ(130ul, de->d_fileno); + + leakdir(dir); +} + +TEST_F(Readdir_7_8, nodots) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + DIR *dir; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.error = 0; + out.header.len = sizeof(out.header); + }))); + + errno = 0; + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); +
errno = 0; + ASSERT_EQ(nullptr, readdir(dir)); + ASSERT_EQ(0, errno); + + leakdir(dir); +} Property changes on: head/tests/sys/fs/fusefs/readdir.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/write.cc =================================================================== --- head/tests/sys/fs/fusefs/write.cc (nonexistent) +++ head/tests/sys/fs/fusefs/write.cc (revision 350665) @@ -0,0 +1,1309 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Write: public FuseTest { + +public: +static sig_atomic_t s_sigxfsz; + +void SetUp() { + s_sigxfsz = 0; + FuseTest::SetUp(); +} + +void TearDown() { + struct sigaction sa; + + bzero(&sa, sizeof(sa)); + sa.sa_handler = SIG_DFL; + sigaction(SIGXFSZ, &sa, NULL); + + FuseTest::TearDown(); +} + +void expect_lookup(const char *relpath, uint64_t ino, uint64_t size) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, size, 1); +} + +void expect_release(uint64_t ino, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RELEASE && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(r)); +} + +void expect_write(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, const void *contents) +{ + FuseTest::expect_write(ino, offset, isize, osize, 0, 0, contents); +} + +/* Expect a write that may or may not come, depending on the cache mode */ +void maybe_expect_write(uint64_t ino, uint64_t offset, uint64_t size, + const void *contents) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *buf = (const char*)in.body.bytes + + sizeof(struct fuse_write_in); + + return (in.header.opcode == FUSE_WRITE && + in.header.nodeid == ino 
&& + in.body.write.offset == offset && + in.body.write.size == size && + 0 == bcmp(buf, contents, size)); + }, Eq(true)), + _) + ).Times(AtMost(1)) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, write); + out.body.write.size = size; + }) + )); +} + +}; + +sig_atomic_t Write::s_sigxfsz = 0; + +class Write_7_8: public FuseTest { + +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + FuseTest::SetUp(); +} + +void expect_lookup(const char *relpath, uint64_t ino, uint64_t size) +{ + FuseTest::expect_lookup_7_8(relpath, ino, S_IFREG | 0644, size, 1); +} + +}; + +class AioWrite: public Write { +virtual void SetUp() { + const char *node = "vfs.aio.enable_unsafe"; + int val = 0; + size_t size = sizeof(val); + + FuseTest::SetUp(); + + ASSERT_EQ(0, sysctlbyname(node, &val, &size, NULL, 0)) + << strerror(errno); + if (!val) + GTEST_SKIP() << + "vfs.aio.enable_unsafe must be set for this test"; +} +}; + +/* Tests for the writeback cache mode */ +class WriteBack: public Write { +public: +virtual void SetUp() { + m_init_flags |= FUSE_WRITEBACK_CACHE; + FuseTest::SetUp(); + if (IsSkipped()) + return; +} + +void expect_write(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, const void *contents) +{ + FuseTest::expect_write(ino, offset, isize, osize, FUSE_WRITE_CACHE, 0, + contents); +} +}; + +class WriteBackAsync: public WriteBack { +public: +virtual void SetUp() { + m_async = true; + WriteBack::SetUp(); +} +}; + +class TimeGran: public WriteBackAsync, public WithParamInterface { +public: +virtual void SetUp() { + m_time_gran = 1 << GetParam(); + WriteBackAsync::SetUp(); +} +}; + +/* Tests for clustered writes with WriteBack caching */ +class WriteCluster: public WriteBack { +public: +virtual void SetUp() { + m_async = true; + m_maxwrite = m_maxphys; + WriteBack::SetUp(); + if (m_maxphys < 2 * DFLTPHYS) + GTEST_SKIP() << "MAXPHYS must be at least twice DFLTPHYS" + << " for this test"; + if
(m_maxphys < 2 * m_maxbcachebuf) + GTEST_SKIP() << "MAXPHYS must be at least twice maxbcachebuf" + << " for this test"; +} +}; + +void sigxfsz_handler(int __unused sig) { + Write::s_sigxfsz = 1; +} + +/* AIO writes need to set the header's pid field correctly */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236379 */ +TEST_F(AioWrite, DISABLED_aio_write) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + uint64_t offset = 4096; + int fd; + ssize_t bufsize = strlen(CONTENTS); + struct aiocb iocb, *piocb; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, offset, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + iocb.aio_nbytes = bufsize; + iocb.aio_fildes = fd; + iocb.aio_buf = __DECONST(void *, CONTENTS); + iocb.aio_offset = offset; + iocb.aio_sigevent.sigev_notify = SIGEV_NONE; + ASSERT_EQ(0, aio_write(&iocb)) << strerror(errno); + ASSERT_EQ(bufsize, aio_waitcomplete(&piocb, NULL)) << strerror(errno); + leak(fd); +} + +/* + * When a file is opened with O_APPEND, we should forward that flag to + * FUSE_OPEN (tested by Open.o_append) but still attempt to calculate the + * offset internally. That way we'll work both with filesystems that + * understand O_APPEND (and ignore the offset) and filesystems that don't (and + * simply use the offset). + * + * Note that verifying the O_APPEND flag in FUSE_OPEN is done in the + * Open.o_append test. 
+ */ +TEST_F(Write, append) +{ + const ssize_t BUFSIZE = 9; + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char CONTENTS[BUFSIZE] = "abcdefgh"; + uint64_t ino = 42; + /* + * Set offset to a maxbcachebuf boundary so we don't need to RMW when + * using writeback caching + */ + uint64_t initial_offset = m_maxbcachebuf; + int fd; + + expect_lookup(RELPATH, ino, initial_offset); + expect_open(ino, 0, 1); + expect_write(ino, initial_offset, BUFSIZE, BUFSIZE, CONTENTS); + + /* Must open O_RDWR or fuse(4) implicitly sets direct_io */ + fd = open(FULLPATH, O_RDWR | O_APPEND); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(BUFSIZE, write(fd, CONTENTS, BUFSIZE)) << strerror(errno); + leak(fd); +} + +/* If a file is cached, then appending to the end should not cause a read */ +TEST_F(Write, append_to_cached) +{ + const ssize_t BUFSIZE = 9; + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + char *oldcontents, *oldbuf; + const char CONTENTS[BUFSIZE] = "abcdefgh"; + uint64_t ino = 42; + /* + * Set offset in between maxbcachebuf boundary to test buffer handling + */ + uint64_t oldsize = m_maxbcachebuf / 2; + int fd; + + oldcontents = (char*)calloc(1, oldsize); + ASSERT_NE(nullptr, oldcontents) << strerror(errno); + oldbuf = (char*)malloc(oldsize); + ASSERT_NE(nullptr, oldbuf) << strerror(errno); + + expect_lookup(RELPATH, ino, oldsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, oldsize, oldsize, oldcontents); + maybe_expect_write(ino, oldsize, BUFSIZE, CONTENTS); + + /* Must open O_RDWR or fuse(4) implicitly sets direct_io */ + fd = open(FULLPATH, O_RDWR | O_APPEND); + EXPECT_LE(0, fd) << strerror(errno); + + /* Read the old data into the cache */ + ASSERT_EQ((ssize_t)oldsize, read(fd, oldbuf, oldsize)) + << strerror(errno); + + /* Write the new data. 
There should be no more read operations */ + ASSERT_EQ(BUFSIZE, write(fd, CONTENTS, BUFSIZE)) << strerror(errno); + leak(fd); +} + +TEST_F(Write, append_direct_io) +{ + const ssize_t BUFSIZE = 9; + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char CONTENTS[BUFSIZE] = "abcdefgh"; + uint64_t ino = 42; + uint64_t initial_offset = 4096; + int fd; + + expect_lookup(RELPATH, ino, initial_offset); + expect_open(ino, FOPEN_DIRECT_IO, 1); + expect_write(ino, initial_offset, BUFSIZE, BUFSIZE, CONTENTS); + + fd = open(FULLPATH, O_WRONLY | O_APPEND); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(BUFSIZE, write(fd, CONTENTS, BUFSIZE)) << strerror(errno); + leak(fd); +} + +/* A direct write should evict any overlapping cached data */ +TEST_F(Write, direct_io_evicts_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char CONTENTS0[] = "abcdefgh"; + const char CONTENTS1[] = "ijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS0) + 1; + char readbuf[bufsize]; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, bufsize, bufsize, CONTENTS0); + expect_write(ino, 0, bufsize, bufsize, CONTENTS1); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + // Prime cache + ASSERT_EQ(bufsize, read(fd, readbuf, bufsize)) << strerror(errno); + + // Write directly, evicting cache + ASSERT_EQ(0, fcntl(fd, F_SETFL, O_DIRECT)) << strerror(errno); + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS1, bufsize)) << strerror(errno); + + // Read again. 
Cache should be bypassed + expect_read(ino, 0, bufsize, bufsize, CONTENTS1); + ASSERT_EQ(0, fcntl(fd, F_SETFL, 0)) << strerror(errno); + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(bufsize, read(fd, readbuf, bufsize)) << strerror(errno); + ASSERT_STREQ(readbuf, CONTENTS1); + + leak(fd); +} + +/* + * If the server doesn't return FOPEN_DIRECT_IO during FUSE_OPEN, then it's not + * allowed to return a short write for that file handle. However, if it does + * then we should still do our darndest to handle it by resending the unwritten + * portion. + */ +TEST_F(Write, indirect_io_short_write) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + ssize_t bufsize0 = 11; + ssize_t bufsize1 = strlen(CONTENTS) - bufsize0; + const char *contents1 = CONTENTS + bufsize0; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, 0, bufsize, bufsize0, CONTENTS); + expect_write(ino, bufsize0, bufsize1, bufsize1, contents1); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + leak(fd); +} + +/* + * When the direct_io option is used, filesystems are allowed to write less + * data than requested. We should return the short write to userland. 
+ */ +TEST_F(Write, direct_io_short_write) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + ssize_t halfbufsize = bufsize / 2; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, FOPEN_DIRECT_IO, 1); + expect_write(ino, 0, bufsize, halfbufsize, CONTENTS); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(halfbufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + leak(fd); +} + +/* + * An insidious edge case: the filesystem returns a short write, and the + * difference between what we requested and what it actually wrote crosses an + * iov element boundary + */ +TEST_F(Write, direct_io_short_write_iov) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS0 = "abcdefgh"; + const char *CONTENTS1 = "ijklmnop"; + const char *EXPECTED0 = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t size0 = strlen(CONTENTS0) - 1; + ssize_t size1 = strlen(CONTENTS1) + 1; + ssize_t totalsize = size0 + size1; + struct iovec iov[2]; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, FOPEN_DIRECT_IO, 1); + expect_write(ino, 0, totalsize, size0, EXPECTED0); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + iov[0].iov_base = __DECONST(void*, CONTENTS0); + iov[0].iov_len = strlen(CONTENTS0); + iov[1].iov_base = __DECONST(void*, CONTENTS1); + iov[1].iov_len = strlen(CONTENTS1); + ASSERT_EQ(size0, writev(fd, iov, 2)) << strerror(errno); + leak(fd); +} + +/* fusefs should respect RLIMIT_FSIZE */ +TEST_F(Write, rlimit_fsize) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + struct rlimit rl; + ssize_t bufsize = strlen(CONTENTS); + off_t offset = 1'000'000'000; + uint64_t ino = 42; + int fd; 
+ + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + + rl.rlim_cur = offset; + rl.rlim_max = 10 * offset; + ASSERT_EQ(0, setrlimit(RLIMIT_FSIZE, &rl)) << strerror(errno); + ASSERT_NE(SIG_ERR, signal(SIGXFSZ, sigxfsz_handler)) << strerror(errno); + + fd = open(FULLPATH, O_WRONLY); + + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(-1, pwrite(fd, CONTENTS, bufsize, offset)); + EXPECT_EQ(EFBIG, errno); + EXPECT_EQ(1, s_sigxfsz); + leak(fd); +} + +/* + * A short read indicates EOF. Test that nothing bad happens if we get EOF + * during the R of a RMW operation. + */ +TEST_F(Write, eof_during_rmw) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + const char *INITIAL = "XXXXXXXXXX"; + uint64_t ino = 42; + uint64_t offset = 1; + ssize_t bufsize = strlen(CONTENTS); + off_t orig_fsize = 10; + off_t truncated_fsize = 5; + off_t final_fsize = bufsize; + int fd; + + FuseTest::expect_lookup(RELPATH, ino, S_IFREG | 0644, orig_fsize, 1); + expect_open(ino, 0, 1); + expect_read(ino, 0, orig_fsize, truncated_fsize, INITIAL, O_RDWR); + expect_getattr(ino, truncated_fsize); + expect_read(ino, 0, final_fsize, final_fsize, INITIAL, O_RDWR); + maybe_expect_write(ino, offset, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, pwrite(fd, CONTENTS, bufsize, offset)) + << strerror(errno); + leak(fd); +} + +/* + * If the kernel cannot be sure which uid, gid, or pid was responsible for a + * write, then it must set the FUSE_WRITE_CACHE bit + */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236378 */ +TEST_F(Write, mmap) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + void *p; + uint64_t offset = 10; + size_t len; + void *zeros, *expected; + + len = getpagesize(); + 
+ zeros = calloc(1, len); + ASSERT_NE(nullptr, zeros); + expected = calloc(1, len); + ASSERT_NE(nullptr, expected); + memmove((uint8_t*)expected + offset, CONTENTS, bufsize); + + expect_lookup(RELPATH, ino, len); + expect_open(ino, 0, 1); + expect_read(ino, 0, len, len, zeros); + /* + * Writes from the pager may or may not be associated with the correct + * pid, so they must set FUSE_WRITE_CACHE. + */ + FuseTest::expect_write(ino, 0, len, len, FUSE_WRITE_CACHE, 0, expected); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, ReturnErrno(0)); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + ASSERT_NE(MAP_FAILED, p) << strerror(errno); + + memmove((uint8_t*)p + offset, CONTENTS, bufsize); + + ASSERT_EQ(0, munmap(p, len)) << strerror(errno); + close(fd); // Write mmap'd data on close + + free(expected); + free(zeros); +} + +TEST_F(Write, pwrite) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + uint64_t offset = m_maxbcachebuf; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, offset, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, pwrite(fd, CONTENTS, bufsize, offset)) + << strerror(errno); + leak(fd); +} + +/* Writing a file should update its cached mtime and ctime */ +TEST_F(Write, timestamps) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + struct stat sb0, sb1; + int fd; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + maybe_expect_write(ino, 0, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + 
ASSERT_EQ(0, fstat(fd, &sb0)) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + + nap(); + + ASSERT_EQ(0, fstat(fd, &sb1)) << strerror(errno); + + EXPECT_EQ(sb0.st_atime, sb1.st_atime); + EXPECT_NE(sb0.st_mtime, sb1.st_mtime); + EXPECT_NE(sb0.st_ctime, sb1.st_ctime); +} + +TEST_F(Write, write) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + leak(fd); +} + +/* fuse(4) should not issue writes of greater size than the daemon requests */ +TEST_F(Write, write_large) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + int *contents; + uint64_t ino = 42; + int fd; + ssize_t halfbufsize, bufsize; + + halfbufsize = m_mock->m_maxwrite; + bufsize = halfbufsize * 2; + contents = (int*)malloc(bufsize); + ASSERT_NE(nullptr, contents); + for (int i = 0; i < (int)bufsize / (int)sizeof(i); i++) { + contents[i] = i; + } + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + maybe_expect_write(ino, 0, halfbufsize, contents); + maybe_expect_write(ino, halfbufsize, halfbufsize, + &contents[halfbufsize / sizeof(int)]); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, contents, bufsize)) << strerror(errno); + leak(fd); + + free(contents); +} + +TEST_F(Write, write_nothing) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = ""; + uint64_t ino = 42; + int fd; + ssize_t bufsize = 0; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + + fd 
= open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + leak(fd); +} + +TEST_F(Write_7_8, write) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write_7_8(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + leak(fd); +} + +/* In writeback mode, dirty data should be written on close */ +TEST_F(WriteBackAsync, close) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, 0, bufsize, bufsize, CONTENTS); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + }))); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, ReturnErrno(0)); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + close(fd); +} + +/* In writeback mode, adjacent writes will be clustered together */ +TEST_F(WriteCluster, clustering) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int i, fd; + void *wbuf, *wbuf2x; + ssize_t bufsize = m_maxbcachebuf; + off_t filesize = 5 * bufsize; + + wbuf = malloc(bufsize); + ASSERT_NE(nullptr, 
wbuf) << strerror(errno); + memset(wbuf, 'X', bufsize); + wbuf2x = malloc(2 * bufsize); + ASSERT_NE(nullptr, wbuf2x) << strerror(errno); + memset(wbuf2x, 'X', 2 * bufsize); + + expect_lookup(RELPATH, ino, filesize); + expect_open(ino, 0, 1); + /* + * Writes of bufsize-bytes each should be clustered into greater sizes. + * The amount of clustering is adaptive, so the first write actually + * issued will be 2x bufsize and subsequent writes may be larger + */ + expect_write(ino, 0, 2 * bufsize, 2 * bufsize, wbuf2x); + expect_write(ino, 2 * bufsize, 2 * bufsize, 2 * bufsize, wbuf2x); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, ReturnErrno(0)); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + + for (i = 0; i < 4; i++) { + ASSERT_EQ(bufsize, write(fd, wbuf, bufsize)) + << strerror(errno); + } + close(fd); +} + +/* + * When clustering writes, an I/O error to any of the cluster's children should + * not panic the system on unmount + */ +/* + * Disabled because it panics. 
+ * https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238565 + */ +TEST_F(WriteCluster, DISABLED_cluster_write_err) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int i, fd; + void *wbuf; + ssize_t bufsize = m_maxbcachebuf; + off_t filesize = 4 * bufsize; + + wbuf = malloc(bufsize); + ASSERT_NE(nullptr, wbuf) << strerror(errno); + memset(wbuf, 'X', bufsize); + + expect_lookup(RELPATH, ino, filesize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_WRITE); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnErrno(EIO))); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, ReturnErrno(0)); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + + for (i = 0; i < 3; i++) { + ASSERT_EQ(bufsize, write(fd, wbuf, bufsize)) + << strerror(errno); + } + close(fd); +} + +/* + * In writeback mode, writes to an O_WRONLY file could trigger reads from the + * server. The FUSE protocol explicitly allows that. 
+ */ +TEST_F(WriteBack, rmw) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + const char *INITIAL = "XXXXXXXXXX"; + uint64_t ino = 42; + uint64_t offset = 1; + off_t fsize = 10; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + FuseTest::expect_lookup(RELPATH, ino, S_IFREG | 0644, fsize, 1); + expect_open(ino, 0, 1); + expect_read(ino, 0, fsize, fsize, INITIAL, O_WRONLY); + maybe_expect_write(ino, offset, bufsize, CONTENTS); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, pwrite(fd, CONTENTS, bufsize, offset)) + << strerror(errno); + leak(fd); +} + +/* + * Without direct_io, writes should be committed to cache + */ +TEST_F(WriteBack, cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char readbuf[bufsize]; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + /* + * A subsequent read should be serviced by cache, without querying the + * filesystem daemon + */ + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(bufsize, read(fd, readbuf, bufsize)) << strerror(errno); + leak(fd); +} + +/* + * With O_DIRECT, writes should not be committed to cache. Admittedly this is + * an odd test, because it would be unusual to use O_DIRECT for writes but not + * reads.
+ */ +TEST_F(WriteBack, o_direct) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char readbuf[bufsize]; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + FuseTest::expect_write(ino, 0, bufsize, bufsize, 0, FUSE_WRITE_CACHE, + CONTENTS); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDWR | O_DIRECT); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + /* A subsequent read must query the daemon because cache is empty */ + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(0, fcntl(fd, F_SETFL, 0)) << strerror(errno); + ASSERT_EQ(bufsize, read(fd, readbuf, bufsize)) << strerror(errno); + leak(fd); +} + +/* + * When mounted with -o async, the writeback cache mode should delay writes + */ +TEST_F(WriteBackAsync, delay) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + /* Write should be cached, but FUSE_WRITE shouldn't be sent */ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_WRITE); + }, Eq(true)), + _) + ).Times(0); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + + /* Don't close the file because that would flush the cache */ +} + +/* + * A direct write should not evict dirty cached data from outside of its own + * byte range. 
+ */ +TEST_F(WriteBackAsync, direct_io_ignores_unrelated_cached) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char CONTENTS0[] = "abcdefgh"; + const char CONTENTS1[] = "ijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS0) + 1; + ssize_t fsize = 2 * m_maxbcachebuf; + char readbuf[bufsize]; + void *zeros; + + zeros = calloc(1, m_maxbcachebuf); + ASSERT_NE(nullptr, zeros); + + expect_lookup(RELPATH, ino, fsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, m_maxbcachebuf, m_maxbcachebuf, zeros); + FuseTest::expect_write(ino, m_maxbcachebuf, bufsize, bufsize, 0, 0, + CONTENTS1); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + // Cache first block with dirty data. This will entail first reading + // the existing data. + ASSERT_EQ(bufsize, pwrite(fd, CONTENTS0, bufsize, 0)) + << strerror(errno); + + // Write directly to second block + ASSERT_EQ(0, fcntl(fd, F_SETFL, O_DIRECT)) << strerror(errno); + ASSERT_EQ(bufsize, pwrite(fd, CONTENTS1, bufsize, m_maxbcachebuf)) + << strerror(errno); + + // Read from the first block again. Should be serviced by cache. + ASSERT_EQ(0, fcntl(fd, F_SETFL, 0)) << strerror(errno); + ASSERT_EQ(bufsize, pread(fd, readbuf, bufsize, 0)) << strerror(errno); + ASSERT_STREQ(readbuf, CONTENTS0); + + leak(fd); + free(zeros); +} + +/* + * If a direct io write partially overlaps one or two blocks of dirty cached + * data, no dirty data should be lost. Admittedly this is a weird test, + * because it would be unusual to use O_DIRECT and the writeback cache.
+ */ +TEST_F(WriteBackAsync, direct_io_partially_overlaps_cached_block) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + off_t bs = m_maxbcachebuf; + ssize_t fsize = 3 * bs; + void *readbuf, *zeros, *ones, *zeroones, *onezeros; + + readbuf = malloc(bs); + ASSERT_NE(nullptr, readbuf) << strerror(errno); + zeros = calloc(1, 3 * bs); + ASSERT_NE(nullptr, zeros); + ones = calloc(1, 2 * bs); + ASSERT_NE(nullptr, ones); + memset(ones, 1, 2 * bs); + zeroones = calloc(1, bs); + ASSERT_NE(nullptr, zeroones); + memset((uint8_t*)zeroones + bs / 2, 1, bs / 2); + onezeros = calloc(1, bs); + ASSERT_NE(nullptr, onezeros); + memset(onezeros, 1, bs / 2); + + expect_lookup(RELPATH, ino, fsize); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + /* Cache first and third blocks with dirty data. */ + ASSERT_EQ(3 * bs, pwrite(fd, zeros, 3 * bs, 0)) << strerror(errno); + + /* + * Write directly to all three blocks. The partially written blocks + * will be flushed because they're dirty. + */ + FuseTest::expect_write(ino, 0, bs, bs, 0, 0, zeros); + FuseTest::expect_write(ino, 2 * bs, bs, bs, 0, 0, zeros); + /* The direct write is split in two because of the m_maxwrite value */ + FuseTest::expect_write(ino, bs / 2, bs, bs, 0, 0, ones); + FuseTest::expect_write(ino, 3 * bs / 2, bs, bs, 0, 0, ones); + ASSERT_EQ(0, fcntl(fd, F_SETFL, O_DIRECT)) << strerror(errno); + ASSERT_EQ(2 * bs, pwrite(fd, ones, 2 * bs, bs / 2)) << strerror(errno); + + /* + * Read from both the valid and invalid portions of the first and third + * blocks again. This will entail FUSE_READ operations because these + * blocks were invalidated by the direct write. 
+ */ + expect_read(ino, 0, bs, bs, zeroones); + expect_read(ino, 2 * bs, bs, bs, onezeros); + ASSERT_EQ(0, fcntl(fd, F_SETFL, 0)) << strerror(errno); + ASSERT_EQ(bs / 2, pread(fd, readbuf, bs / 2, 0)) << strerror(errno); + EXPECT_EQ(0, memcmp(zeros, readbuf, bs / 2)); + ASSERT_EQ(bs / 2, pread(fd, readbuf, bs / 2, 5 * bs / 2)) + << strerror(errno); + EXPECT_EQ(0, memcmp(zeros, readbuf, bs / 2)); + ASSERT_EQ(bs / 2, pread(fd, readbuf, bs / 2, bs / 2)) + << strerror(errno); + EXPECT_EQ(0, memcmp(ones, readbuf, bs / 2)); + ASSERT_EQ(bs / 2, pread(fd, readbuf, bs / 2, 2 * bs)) + << strerror(errno); + EXPECT_EQ(0, memcmp(ones, readbuf, bs / 2)); + + leak(fd); + free(zeroones); + free(onezeros); + free(ones); + free(zeros); + free(readbuf); +} + +/* + * In WriteBack mode, writes may be cached beyond what the server thinks is the + * EOF. In this case, a short read at EOF should _not_ cause fusefs to update + * the file's size. + */ +TEST_F(WriteBackAsync, eof) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS0 = "abcdefgh"; + const char *CONTENTS1 = "ijklmnop"; + uint64_t ino = 42; + int fd; + off_t offset = m_maxbcachebuf; + ssize_t wbufsize = strlen(CONTENTS1); + off_t old_filesize = (off_t)strlen(CONTENTS0); + ssize_t rbufsize = 2 * old_filesize; + char readbuf[rbufsize]; + size_t holesize = rbufsize - old_filesize; + char hole[holesize]; + struct stat sb; + ssize_t r; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_read(ino, 0, m_maxbcachebuf, old_filesize, CONTENTS0); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + /* Write and cache data beyond EOF */ + ASSERT_EQ(wbufsize, pwrite(fd, CONTENTS1, wbufsize, offset)) + << strerror(errno); + + /* Read from the old EOF */ + r = pread(fd, readbuf, rbufsize, 0); + ASSERT_LE(0, r) << strerror(errno); + EXPECT_EQ(rbufsize, r) << "read should've synthesized a hole"; + EXPECT_EQ(0, memcmp(CONTENTS0, 
readbuf, old_filesize)); + bzero(hole, holesize); + EXPECT_EQ(0, memcmp(hole, readbuf + old_filesize, holesize)); + + /* The file's size should still be what was established by pwrite */ + ASSERT_EQ(0, fstat(fd, &sb)) << strerror(errno); + EXPECT_EQ(offset + wbufsize, sb.st_size); + leak(fd); +} + +/* + * When a file has dirty writes that haven't been flushed, the server's notion + * of its mtime and ctime will be wrong. The kernel should ignore those if it + * gets them from a FUSE_GETATTR before flushing. + */ +TEST_F(WriteBackAsync, timestamps) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + uint64_t attr_valid = 0; + uint64_t attr_valid_nsec = 0; + uint64_t server_time = 12345; + mode_t mode = S_IFREG | 0644; + int fd; + + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = attr_valid; + out.body.entry.attr_valid_nsec = attr_valid_nsec; + }))); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke( + ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = mode; + out.body.attr.attr_valid = attr_valid; + out.body.attr.attr_valid_nsec = attr_valid_nsec; + out.body.attr.attr.atime = server_time; + out.body.attr.attr.mtime = server_time; + out.body.attr.attr.ctime = server_time; + }))); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + + ASSERT_EQ(0, 
fstat(fd, &sb)) << strerror(errno); + EXPECT_EQ((time_t)server_time, sb.st_atime); + EXPECT_NE((time_t)server_time, sb.st_mtime); + EXPECT_NE((time_t)server_time, sb.st_ctime); +} + +/* Any dirty timestamp fields should be flushed during a SETATTR */ +TEST_F(WriteBackAsync, timestamps_during_setattr) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + const mode_t newmode = 0755; + int fd; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_MODE | FATTR_MTIME | FATTR_CTIME; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = S_IFREG | newmode; + }))); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + ASSERT_EQ(0, fchmod(fd, newmode)) << strerror(errno); +} + +/* fuse_init_out.time_gran controls the granularity of timestamps */ +TEST_P(TimeGran, timestamps_during_setattr) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + const mode_t newmode = 0755; + int fd; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_MODE | FATTR_MTIME | FATTR_CTIME; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mtimensec % m_time_gran == 0 && + in.body.setattr.ctimensec % m_time_gran == 0); + }, Eq(true)), 
+ _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = S_IFREG | newmode; + }))); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + ASSERT_EQ(0, fchmod(fd, newmode)) << strerror(errno); +} + +INSTANTIATE_TEST_CASE_P(RA, TimeGran, Range(0u, 10u)); + +/* + * Without direct_io, writes should be committed to cache + */ +TEST_F(Write, writethrough) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char readbuf[bufsize]; + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + /* + * A subsequent read should be serviced by cache, without querying the + * filesystem daemon + */ + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(bufsize, read(fd, readbuf, bufsize)) << strerror(errno); + leak(fd); +} + +/* Writes that extend a file should update the cached file size */ +TEST_F(Write, update_file_size) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + struct stat sb; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + + expect_lookup(RELPATH, ino, 0); + expect_open(ino, 0, 1); + expect_write(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDWR); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + /* Get cached attributes */ + ASSERT_EQ(0, fstat(fd, &sb)) << strerror(errno); + ASSERT_EQ(bufsize, sb.st_size); + 
leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/write.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/allow_other.cc =================================================================== --- head/tests/sys/fs/fusefs/allow_other.cc (nonexistent) +++ head/tests/sys/fs/fusefs/allow_other.cc (revision 350665) @@ -0,0 +1,303 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * Tests for the "allow_other" mount option. They must be in their own + * file so they can be run as root + */ + +extern "C" { +#include +#include +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +const static char FULLPATH[] = "mountpoint/some_file.txt"; +const static char RELPATH[] = "some_file.txt"; + +class NoAllowOther: public FuseTest { + +public: +/* Unprivileged user id */ +int m_uid; + +virtual void SetUp() { + if (geteuid() != 0) { + GTEST_SKIP() << "This test must be run as root"; + } + + FuseTest::SetUp(); +} +}; + +class AllowOther: public NoAllowOther { + +public: +virtual void SetUp() { + m_allow_other = true; + NoAllowOther::SetUp(); +} +}; + +TEST_F(AllowOther, allowed) +{ + int status; + + fork(true, &status, [&] { + uint64_t ino = 42; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, FH); + }, []() { + int fd; + + fd = open(FULLPATH, O_RDONLY); + if (fd < 0) { + perror("open"); + return(1); + } + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); +} + +/* Check that fusefs uses the correct credentials for FUSE operations */ +TEST_F(AllowOther, creds) +{ + int status; + uid_t uid; + gid_t gid; + + get_unprivileged_id(&uid, &gid); + fork(true, &status, [=] { + EXPECT_CALL(*m_mock, process( ResultOf([=](auto in) { + return (in.header.opcode == FUSE_LOOKUP && + 
in.header.uid == uid && + in.header.gid == gid); + }, Eq(true)), + _) + ).Times(1) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + }, []() { + eaccess(FULLPATH, F_OK); + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); +} + +/* + * A variation of the Open.multiple_creds test showing how the bug can lead to a + * privilege elevation. The first process is privileged and opens a file only + * visible to root. The second process is unprivileged and shouldn't be able + * to open the file, but does thanks to the bug + */ +TEST_F(AllowOther, privilege_escalation) +{ + int fd1, status; + const static uint64_t ino = 42; + const static uint64_t fh = 100; + + /* Fork a child to open the file with different credentials */ + fork(true, &status, [&] { + + expect_lookup(RELPATH, ino, S_IFREG | 0600, 0, 2); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.pid == (uint32_t)getpid() && + in.header.uid == (uint32_t)geteuid() && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke( + ReturnImmediate([](auto in __unused, auto& out) { + out.body.open.fh = fh; + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.pid != (uint32_t)getpid() && + in.header.uid != (uint32_t)geteuid() && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(AnyNumber()) + .WillRepeatedly(Invoke(ReturnErrno(EPERM))); + + fd1 = open(FULLPATH, O_RDONLY); + EXPECT_LE(0, fd1) << strerror(errno); + }, [] { + int fd0; + + fd0 = open(FULLPATH, O_RDONLY); + if (fd0 >= 0) { + fprintf(stderr, "Privilege escalation!\n"); + return 1; + } + if (errno != EPERM) { + fprintf(stderr, "Unexpected error %s\n", + strerror(errno)); + return 1; + } + leak(fd0); + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); + leak(fd1); +} + +TEST_F(NoAllowOther, disallowed) +{ + int status; + + fork(true, 
&status, [] { + }, []() { + int fd; + + fd = open(FULLPATH, O_RDONLY); + if (fd >= 0) { + fprintf(stderr, "open should've failed\n"); + return(1); + } else if (errno != EPERM) { + fprintf(stderr, "Unexpected error: %s\n", + strerror(errno)); + return(1); + } + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); +} + +/* + * When -o allow_other is not used, users other than the owner aren't allowed + * to open anything inside of the mount point, not just the mountpoint itself. + * This is a regression test for bug 237052. + */ +TEST_F(NoAllowOther, disallowed_beneath_root) +{ + const static char RELPATH2[] = "other_dir"; + const static uint64_t ino = 42; + const static uint64_t ino2 = 43; + int dfd, status; + + expect_lookup(RELPATH, ino, S_IFDIR | 0755, 0, 1); + EXPECT_LOOKUP(ino, RELPATH2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino2; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + }))); + expect_opendir(ino); + dfd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, dfd) << strerror(errno); + + fork(true, &status, [] { + }, [&]() { + int fd; + + fd = openat(dfd, RELPATH2, O_RDONLY); + if (fd >= 0) { + fprintf(stderr, "openat should've failed\n"); + return(1); + } else if (errno != EPERM) { + fprintf(stderr, "Unexpected error: %s\n", + strerror(errno)); + return(1); + } + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); +} + +/* + * Provide coverage for the extattr methods, which have a slightly different + * code path + */ +TEST_F(NoAllowOther, setextattr) +{ + int ino = 42, status; + + fork(true, &status, [&] { + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFREG | 0644; +
out.body.entry.nodeid = ino; + }))); + + /* + * lookup the file to get it into the cache. + * Otherwise, the unprivileged lookup will fail with + * EACCES + */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + }, [&]() { + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + r = extattr_set_file(FULLPATH, ns, "foo", + (const void*)value, value_len); + if (r >= 0) { + fprintf(stderr, "should've failed\n"); + return(1); + } else if (errno != EPERM) { + fprintf(stderr, "Unexpected error: %s\n", + strerror(errno)); + return(1); + } + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); +} Property changes on: head/tests/sys/fs/fusefs/allow_other.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/default_permissions.cc =================================================================== --- head/tests/sys/fs/fusefs/default_permissions.cc (nonexistent) +++ head/tests/sys/fs/fusefs/default_permissions.cc (revision 350665) @@ -0,0 +1,1305 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * Tests for the "default_permissions" mount option. 
They must be in their own + * file so they can be run as an unprivileged user + */ + +extern "C" { +#include +#include + +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class DefaultPermissions: public FuseTest { + +virtual void SetUp() { + m_default_permissions = true; + FuseTest::SetUp(); + if (HasFatalFailure() || IsSkipped()) + return; + + if (geteuid() == 0) { + GTEST_SKIP() << "This test requires an unprivileged user"; + } + + /* With -o default_permissions, FUSE_ACCESS should never be called */ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_ACCESS); + }, Eq(true)), + _) + ).Times(0); +} + +public: +void expect_chmod(uint64_t ino, mode_t mode, uint64_t size = 0) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == FATTR_MODE && + in.body.setattr.mode == mode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | mode; + out.body.attr.attr.size = size; + out.body.attr.attr_valid = UINT64_MAX; + }))); +} + +void expect_create(const char *relpath, uint64_t ino) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_create_in); + return (in.header.opcode == FUSE_CREATE && + (0 == strcmp(relpath, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = S_IFREG | 0644; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + }))); +} + +void expect_getattr(uint64_t ino, mode_t mode, uint64_t attr_valid, int times, + uid_t uid = 0, gid_t gid 
= 0) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(times) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = mode; + out.body.attr.attr.size = 0; + out.body.attr.attr.uid = uid; + out.body.attr.attr.gid = gid; + out.body.attr.attr_valid = attr_valid; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino, mode_t mode, + uint64_t attr_valid, uid_t uid = 0, gid_t gid = 0) +{ + FuseTest::expect_lookup(relpath, ino, mode, 0, 1, attr_valid, uid, gid); +} + +}; + +class Access: public DefaultPermissions {}; +class Chown: public DefaultPermissions {}; +class Chgrp: public DefaultPermissions {}; +class Lookup: public DefaultPermissions {}; +class Open: public DefaultPermissions {}; +class Setattr: public DefaultPermissions {}; +class Unlink: public DefaultPermissions {}; +class Utimensat: public DefaultPermissions {}; +class Write: public DefaultPermissions {}; + +/* + * Test permission handling during create, mkdir, mknod, link, symlink, and + * rename vops (they all share a common path for permission checks in + * VOP_LOOKUP) + */ +class Create: public DefaultPermissions {}; + +class Deleteextattr: public DefaultPermissions { +public: +void expect_removexattr() +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_REMOVEXATTR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); +} +}; + +class Getextattr: public DefaultPermissions { +public: +void expect_getxattr(ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETXATTR); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} +}; + +class Listextattr: public DefaultPermissions { +public: +void expect_listxattr() +{ + EXPECT_CALL(*m_mock, process( +
ResultOf([=](auto in) { + return (in.header.opcode == FUSE_LISTXATTR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto i __unused, auto& out) { + out.body.listxattr.size = 0; + SET_OUT_HEADER_LEN(out, listxattr); + }))); +} +}; + +class Rename: public DefaultPermissions { +public: + /* + * Expect a rename and respond with the given error. Don't bother to + * validate arguments; the tests in rename.cc do that. + */ + void expect_rename(int error) + { + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RENAME); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); + } +}; + +class Setextattr: public DefaultPermissions { +public: +void expect_setxattr(int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETXATTR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} +}; + +/* Return a group to which this user does not belong */ +static gid_t excluded_group() +{ + int i, ngroups = 64; + gid_t newgid, groups[ngroups]; + + getgrouplist(getlogin(), getegid(), groups, &ngroups); + for (newgid = 0; ; newgid++) { + bool belongs = false; + + for (i = 0; i < ngroups; i++) { + if (groups[i] == newgid) + belongs = true; + } + if (!belongs) + break; + } + /* newgid is now a group to which the current user does not belong */ + return newgid; +} + +TEST_F(Access, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = X_OK; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX); + + ASSERT_NE(0, access(FULLPATH, access_mode)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Access, eacces_no_cached_attrs) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = X_OK; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755,
0, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0); + expect_getattr(ino, S_IFREG | 0644, 0, 1); + /* + * Once default_permissions is properly implemented, there might be + * another FUSE_GETATTR or something in here. But there should not be + * a FUSE_ACCESS + */ + + ASSERT_NE(0, access(FULLPATH, access_mode)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Access, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = R_OK; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX); + /* + * Once default_permissions is properly implemented, there might be + * another FUSE_GETATTR or something in here. + */ + + ASSERT_EQ(0, access(FULLPATH, access_mode)) << strerror(errno); +} + +/* Unprivileged users may chown a file to their own uid */ +TEST_F(Chown, chown_to_self) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t mode = 0755; + uid_t uid; + + uid = geteuid(); + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, uid); + expect_lookup(RELPATH, ino, S_IFREG | mode, UINT64_MAX, uid); + /* The OS may optimize chown by omitting the redundant setattr */ + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out){ + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | mode; + out.body.attr.attr.uid = uid; + }))); + + EXPECT_EQ(0, chown(FULLPATH, uid, -1)) << strerror(errno); +} + +/* + * A successful chown by a non-privileged non-owner should clear a file's SUID + * bit + */ +TEST_F(Chown, clear_suid) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + const mode_t oldmode = 06755; + const mode_t 
newmode = 0755; + uid_t uid = geteuid(); + uint32_t valid = FATTR_UID | FATTR_MODE; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, uid); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, uid); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | newmode; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + EXPECT_EQ(0, chown(FULLPATH, uid, -1)) << strerror(errno); +} + + +/* Only root may change a file's owner */ +TEST_F(Chown, eperm) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t mode = 0755; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, geteuid()); + expect_lookup(RELPATH, ino, S_IFREG | mode, UINT64_MAX, geteuid()); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).Times(0); + + EXPECT_NE(0, chown(FULLPATH, 0, -1)); + EXPECT_EQ(EPERM, errno); +} + +/* + * A successful chgrp by a non-privileged non-owner should clear a file's SUID + * bit + */ +TEST_F(Chgrp, clear_suid) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + const mode_t oldmode = 06755; + const mode_t newmode = 0755; + uid_t uid = geteuid(); + gid_t gid = getegid(); + uint32_t valid = FATTR_GID | FATTR_MODE; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, uid); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, uid, gid); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR && + 
in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | newmode; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + EXPECT_EQ(0, chown(FULLPATH, -1, gid)) << strerror(errno); +} + +/* non-root users may only chgrp a file to a group they belong to */ +TEST_F(Chgrp, eperm) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t mode = 0755; + uid_t uid; + gid_t gid, newgid; + + uid = geteuid(); + gid = getegid(); + newgid = excluded_group(); + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, uid, gid); + expect_lookup(RELPATH, ino, S_IFREG | mode, UINT64_MAX, uid, gid); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).Times(0); + + EXPECT_NE(0, chown(FULLPATH, -1, newgid)); + EXPECT_EQ(EPERM, errno); +} + +TEST_F(Chgrp, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t mode = 0755; + uid_t uid; + gid_t gid, newgid; + + uid = geteuid(); + gid = 0; + newgid = getegid(); + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, uid, gid); + expect_lookup(RELPATH, ino, S_IFREG | mode, UINT64_MAX, uid, gid); + /* The OS may optimize chgrp by omitting the redundant setattr */ + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out){ + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | mode; + out.body.attr.attr.uid = uid; + out.body.attr.attr.gid = newgid; + 
}))); + + EXPECT_EQ(0, chown(FULLPATH, -1, newgid)) << strerror(errno); +} + +TEST_F(Create, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, ino); + + fd = open(FULLPATH, O_CREAT | O_EXCL, 0644); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +TEST_F(Create, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_EQ(-1, open(FULLPATH, O_CREAT | O_EXCL, 0644)); + EXPECT_EQ(EACCES, errno); +} + +TEST_F(Deleteextattr, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, 0); + + ASSERT_EQ(-1, extattr_delete_file(FULLPATH, ns, "foo")); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Deleteextattr, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, geteuid()); + expect_removexattr(); + + ASSERT_EQ(0, extattr_delete_file(FULLPATH, ns, "foo")) + << strerror(errno); +} + +/* Deleting system attributes requires superuser privilege */ +TEST_F(Deleteextattr, system) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_SYSTEM; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR |
0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0666, UINT64_MAX, geteuid()); + + ASSERT_EQ(-1, extattr_delete_file(FULLPATH, ns, "foo")); + ASSERT_EQ(EPERM, errno); +} + +/* Anybody with write permission can set both timestamps to UTIME_NOW */ +TEST_F(Utimensat, utime_now) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + /* Write permissions for everybody */ + const mode_t mode = 0666; + uid_t owner = 0; + const timespec times[2] = { + {.tv_sec = 0, .tv_nsec = UTIME_NOW}, + {.tv_sec = 0, .tv_nsec = UTIME_NOW}, + }; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | mode, UINT64_MAX, owner); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid & FATTR_ATIME && + in.body.setattr.valid & FATTR_MTIME); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | mode; + }))); + + ASSERT_EQ(0, utimensat(AT_FDCWD, FULLPATH, &times[0], 0)) + << strerror(errno); +} + +/* Anybody can set both timestamps to UTIME_OMIT */ +TEST_F(Utimensat, utime_omit) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + /* Write permissions for no one */ + const mode_t mode = 0444; + uid_t owner = 0; + const timespec times[2] = { + {.tv_sec = 0, .tv_nsec = UTIME_OMIT}, + {.tv_sec = 0, .tv_nsec = UTIME_OMIT}, + }; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | mode, UINT64_MAX, owner); + + ASSERT_EQ(0, utimensat(AT_FDCWD, FULLPATH, &times[0], 0)) + << strerror(errno); +} + +/* Deleting user attributes merely requires WRITE privilege */ +TEST_F(Deleteextattr, user) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + 
const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0666, UINT64_MAX, 0); + expect_removexattr(); + + ASSERT_EQ(0, extattr_delete_file(FULLPATH, ns, "foo")) + << strerror(errno); +} + +TEST_F(Getextattr, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + char data[80]; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0600, UINT64_MAX, 0); + + ASSERT_EQ(-1, + extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data))); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Getextattr, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + char data[80]; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + /* Getting user attributes only requires read access */ + expect_lookup(RELPATH, ino, S_IFREG | 0444, UINT64_MAX, 0); + expect_getxattr( + ReturnImmediate([&](auto in __unused, auto& out) { + memcpy((void*)out.body.bytes, value, value_len); + out.header.len = sizeof(out.header) + value_len; + }) + ); + + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(value_len, r) << strerror(errno); + EXPECT_STREQ(value, data); +} + +/* Getting system attributes requires superuser privileges */ +TEST_F(Getextattr, system) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + char data[80]; + int ns = EXTATTR_NAMESPACE_SYSTEM; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0666, UINT64_MAX, geteuid()); + + ASSERT_EQ(-1, + 
extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data))); + ASSERT_EQ(EPERM, errno); +} + +TEST_F(Listextattr, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0600, UINT64_MAX, 0); + + ASSERT_EQ(-1, extattr_list_file(FULLPATH, ns, NULL, 0)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Listextattr, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1); + /* Listing user extended attributes merely requires read access */ + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, 0); + expect_listxattr(); + + ASSERT_EQ(0, extattr_list_file(FULLPATH, ns, NULL, 0)) + << strerror(errno); +} + +/* Listing system xattrs requires superuser privileges */ +TEST_F(Listextattr, system) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_SYSTEM; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1); + /* Listing user extended attributes merely requires read access */ + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, geteuid()); + + ASSERT_EQ(-1, extattr_list_file(FULLPATH, ns, NULL, 0)); + ASSERT_EQ(EPERM, errno); +} + +/* A component of the search path lacks execute permissions */ +TEST_F(Lookup, eacces) +{ + const char FULLPATH[] = "mountpoint/some_dir/some_file.txt"; + const char RELDIRPATH[] = "some_dir"; + uint64_t dir_ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELDIRPATH, dir_ino, S_IFDIR | 0700, UINT64_MAX, 0); + + EXPECT_EQ(-1, access(FULLPATH, F_OK)); + EXPECT_EQ(EACCES, errno); +} + +TEST_F(Open, eacces) +{ + const char 
FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX); + + EXPECT_NE(0, open(FULLPATH, O_RDWR)); + EXPECT_EQ(EACCES, errno); +} + +TEST_F(Open, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDONLY); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +TEST_F(Rename, eacces_on_srcdir) +{ + const char FULLDST[] = "mountpoint/d/dst"; + const char RELDST[] = "d/dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, ino, S_IFREG | 0644, UINT64_MAX); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .Times(AnyNumber()) + .WillRepeatedly(Invoke(ReturnErrno(ENOENT))); + + ASSERT_EQ(-1, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Rename, eacces_on_dstdir_for_creating) +{ + const char FULLDST[] = "mountpoint/d/dst"; + const char RELDSTDIR[] = "d"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t src_ino = 42; + uint64_t dstdir_ino = 43; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, src_ino, S_IFREG | 0644, UINT64_MAX); + expect_lookup(RELDSTDIR, dstdir_ino, S_IFDIR | 0755, UINT64_MAX); + EXPECT_LOOKUP(dstdir_ino, RELDST).WillOnce(Invoke(ReturnErrno(ENOENT))); + + ASSERT_EQ(-1, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Rename, eacces_on_dstdir_for_removing) +{ + const char FULLDST[] = "mountpoint/d/dst"; + const char RELDSTDIR[] = "d"; + const char RELDST[] 
= "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t src_ino = 42; + uint64_t dstdir_ino = 43; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, src_ino, S_IFREG | 0644, UINT64_MAX); + expect_lookup(RELDSTDIR, dstdir_ino, S_IFDIR | 0755, UINT64_MAX); + EXPECT_LOOKUP(dstdir_ino, RELDST).WillOnce(Invoke(ReturnErrno(ENOENT))); + + ASSERT_EQ(-1, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Rename, eperm_on_sticky_srcdir) +{ + const char FULLDST[] = "mountpoint/d/dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 01777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, ino, S_IFREG | 0644, UINT64_MAX); + + ASSERT_EQ(-1, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EPERM, errno); +} + +/* + * A user cannot move out a subdirectory that he does not own, because that + * would require changing the subdirectory's ".." 
dirent + */ +TEST_F(Rename, eperm_for_subdirectory) +{ + const char FULLDST[] = "mountpoint/d/dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELDSTDIR[] = "d"; + const char RELDST[] = "dst"; + const char RELSRC[] = "src"; + uint64_t ino = 42; + uint64_t dstdir_ino = 43; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, ino, S_IFDIR | 0755, UINT64_MAX, 0); + expect_lookup(RELDSTDIR, dstdir_ino, S_IFDIR | 0777, UINT64_MAX, 0); + EXPECT_LOOKUP(dstdir_ino, RELDST).WillOnce(Invoke(ReturnErrno(ENOENT))); + + ASSERT_EQ(-1, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EACCES, errno); +} + +/* + * A user _can_ rename a subdirectory to which he lacks write permissions, if + * it will keep the same parent + */ +TEST_F(Rename, subdirectory_to_same_dir) +{ + const char FULLDST[] = "mountpoint/dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELDST[] = "dst"; + const char RELSRC[] = "src"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, ino, S_IFDIR | 0755, UINT64_MAX, 0); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_rename(0); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); +} + +TEST_F(Rename, eperm_on_sticky_dstdir) +{ + const char FULLDST[] = "mountpoint/d/dst"; + const char RELDSTDIR[] = "d"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t src_ino = 42; + uint64_t dstdir_ino = 43; + uint64_t dst_ino = 44; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, src_ino, S_IFREG | 0644, UINT64_MAX); + expect_lookup(RELDSTDIR, dstdir_ino, S_IFDIR | 01777, UINT64_MAX); + EXPECT_LOOKUP(dstdir_ino, RELDST) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = dst_ino; + 
out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.uid = 0; + }))); + + ASSERT_EQ(-1, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EPERM, errno); +} + +/* Successfully rename a file, overwriting the destination */ +TEST_F(Rename, ok) +{ + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + // The inode of the already-existing destination file + uint64_t dst_ino = 2; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1, geteuid()); + expect_lookup(RELSRC, ino, S_IFREG | 0644, UINT64_MAX); + expect_lookup(RELDST, dst_ino, S_IFREG | 0644, UINT64_MAX); + expect_rename(0); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); +} + +TEST_F(Rename, ok_to_remove_src_because_of_stickiness) +{ + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 01777, UINT64_MAX, 1, 0); + expect_lookup(RELSRC, ino, S_IFREG | 0644, UINT64_MAX, geteuid()); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_rename(0); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); +} + +TEST_F(Setattr, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0755; + const mode_t newmode = 0644; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, geteuid()); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + 
out.body.attr.attr.mode = S_IFREG | newmode; + }))); + + EXPECT_EQ(0, chmod(FULLPATH, newmode)) << strerror(errno); +} + +TEST_F(Setattr, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0755; + const mode_t newmode = 0644; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, 0); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).Times(0); + + EXPECT_NE(0, chmod(FULLPATH, newmode)); + EXPECT_EQ(EPERM, errno); +} + +/* + * ftruncate() of a file without writable permissions should succeed as long as + * the file descriptor is writable. This is important when combined with + * O_CREAT + */ +TEST_F(Setattr, ftruncate_of_newly_created_file) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t mode = 0000; + int fd; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, ino); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + (in.body.setattr.valid & FATTR_SIZE)); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = S_IFREG | mode; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + fd = open(FULLPATH, O_CREAT | O_RDWR, 0); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, ftruncate(fd, 100)) << strerror(errno); + leak(fd); +} + +/* + * Setting the sgid bit should fail for an unprivileged user who doesn't belong + * to the file's group + */ +TEST_F(Setattr, sgid_by_non_group_member) +{ + const char 
FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0755; + const mode_t newmode = 02755; + uid_t uid = geteuid(); + gid_t gid = excluded_group(); + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, uid, gid); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).Times(0); + + EXPECT_NE(0, chmod(FULLPATH, newmode)); + EXPECT_EQ(EPERM, errno); +} + +/* Only the superuser may set the sticky bit on a non-directory */ +TEST_F(Setattr, sticky_regular_file) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0644; + const mode_t newmode = 01644; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, geteuid()); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).Times(0); + + EXPECT_NE(0, chmod(FULLPATH, newmode)); + EXPECT_EQ(EFTYPE, errno); +} + +TEST_F(Setextattr, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, geteuid()); + expect_setxattr(0); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(value_len, r) << strerror(errno); +} + +TEST_F(Setextattr, eacces) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = 
strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, 0); + + ASSERT_EQ(-1, extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len)); + ASSERT_EQ(EACCES, errno); +} + +// Setting system attributes requires superuser privileges +TEST_F(Setextattr, system) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_SYSTEM; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0666, UINT64_MAX, geteuid()); + + ASSERT_EQ(-1, extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len)); + ASSERT_EQ(EPERM, errno); +} + +// Setting user attributes merely requires write privileges +TEST_F(Setextattr, user) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0666, UINT64_MAX, 0); + expect_setxattr(0); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(value_len, r) << strerror(errno); +} + +TEST_F(Unlink, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0777, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, geteuid()); + expect_unlink(FUSE_ROOT_ID, RELPATH, 0); + + ASSERT_EQ(0, unlink(FULLPATH)) << strerror(errno); +} + +/* + * Ensure that a cached name doesn't cause unlink to bypass permission checks + * in VOP_LOOKUP. 
+ * + * This test should pass because lookup(9) purges the namecache entry by doing + * a vfs_cache_lookup with ~MAKEENTRY when nameiop == DELETE. + */ +TEST_F(Unlink, cached_unwritable_directory) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .Times(AnyNumber()) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + })) + ); + + /* Fill name cache */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + /* Despite the cached name, unlink should fail */ + ASSERT_EQ(-1, unlink(FULLPATH)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Unlink, unwritable_directory) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, geteuid()); + + ASSERT_EQ(-1, unlink(FULLPATH)); + ASSERT_EQ(EACCES, errno); +} + +TEST_F(Unlink, sticky_directory) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 01777, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | 0644, UINT64_MAX, 0); + + ASSERT_EQ(-1, unlink(FULLPATH)); + ASSERT_EQ(EPERM, errno); +} + +/* A write by a non-owner should clear a file's SUID bit */ +TEST_F(Write, clear_suid) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + struct stat sb; + uint64_t ino = 42; + mode_t oldmode = 04777; + mode_t newmode = 0777; + char wbuf[1] = {'x'}; + int fd; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + 
expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX); + expect_open(ino, 0, 1); + expect_write(ino, 0, sizeof(wbuf), sizeof(wbuf), 0, 0, wbuf); + expect_chmod(ino, newmode, sizeof(wbuf)); + + fd = open(FULLPATH, O_WRONLY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(1, write(fd, wbuf, sizeof(wbuf))) << strerror(errno); + ASSERT_EQ(0, fstat(fd, &sb)) << strerror(errno); + EXPECT_EQ(S_IFREG | newmode, sb.st_mode); + leak(fd); +} + +/* A write by a non-owner should clear a file's SGID bit */ +TEST_F(Write, clear_sgid) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + struct stat sb; + uint64_t ino = 42; + mode_t oldmode = 02777; + mode_t newmode = 0777; + char wbuf[1] = {'x'}; + int fd; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX); + expect_open(ino, 0, 1); + expect_write(ino, 0, sizeof(wbuf), sizeof(wbuf), 0, 0, wbuf); + expect_chmod(ino, newmode, sizeof(wbuf)); + + fd = open(FULLPATH, O_WRONLY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(1, write(fd, wbuf, sizeof(wbuf))) << strerror(errno); + ASSERT_EQ(0, fstat(fd, &sb)) << strerror(errno); + EXPECT_EQ(S_IFREG | newmode, sb.st_mode); + leak(fd); +} + +/* Regression test for a specific recurse-of-nonrecursive-lock panic + * + * With writeback caching, we can't call vtruncbuf from fuse_io_strategy, or it + * may panic. That happens if the FUSE_SETATTR response indicates that the + * file's size has changed since the write. 
+ */ +TEST_F(Write, recursion_panic_while_clearing_suid) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t oldmode = 04777; + mode_t newmode = 0777; + char wbuf[1] = {'x'}; + int fd; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX); + expect_open(ino, 0, 1); + expect_write(ino, 0, sizeof(wbuf), sizeof(wbuf), 0, 0, wbuf); + /* XXX Return a smaller file size than what we just wrote! */ + expect_chmod(ino, newmode, 0); + + fd = open(FULLPATH, O_WRONLY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(1, write(fd, wbuf, sizeof(wbuf))) << strerror(errno); + leak(fd); +} + + Property changes on: head/tests/sys/fs/fusefs/default_permissions.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/destroy.cc =================================================================== --- head/tests/sys/fs/fusefs/destroy.cc (nonexistent) +++ head/tests/sys/fs/fusefs/destroy.cc (revision 350665) @@ -0,0 +1,158 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +/* Tests for orderly unmounts */ +class Destroy: public FuseTest {}; + +/* Tests for unexpected deaths of the server */ +class Death: public FuseTest{}; + +static void* open_th(void* arg) { + int fd; + const char *path = (const char*)arg; + + fd = open(path, O_RDONLY); + EXPECT_EQ(-1, fd); + EXPECT_EQ(ENOTCONN, errno); + return 0; +} + +/* + * The server dies with unsent operations still on the message queue. + * Check for any memory leaks like this: + * 1) kldunload fusefs, if necessary + * 2) kldload fusefs + * 3) ./destroy --gtest_filter=Destroy.unsent_operations + * 4) kldunload fusefs + * 5) check /var/log/messages for anything like this: +Freed UMA keg (fuse_ticket) was not empty (31 items). Lost 2 pages of memory. +Warning: memory type fuse_msgbuf leaked memory on destroy (68 allocations, 428800 bytes leaked). 
+ */ +TEST_F(Death, unsent_operations) +{ + const char FULLPATH0[] = "mountpoint/some_file.txt"; + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH0[] = "some_file.txt"; + const char RELPATH1[] = "other_file.txt"; + pthread_t th0, th1; + ino_t ino0 = 42, ino1 = 43; + sem_t sem; + mode_t mode = S_IFREG | 0644; + + sem_init(&sem, 0, 0); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH0) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino0; + out.body.entry.attr.nlink = 1; + }))); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH1) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino1; + out.body.entry.attr.nlink = 1; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_OPEN); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out __unused) { + sem_post(&sem); + pause(); + })); + + /* + * One thread's operation will be sent to the daemon and block, and the + * other's will be stuck in the message queue. + */ + ASSERT_EQ(0, pthread_create(&th0, NULL, open_th, + __DECONST(void*, FULLPATH0))) << strerror(errno); + ASSERT_EQ(0, pthread_create(&th1, NULL, open_th, + __DECONST(void*, FULLPATH1))) << strerror(errno); + + /* Wait for the first thread to block */ + sem_wait(&sem); + /* Give the second thread time to block */ + nap(); + + m_mock->kill_daemon(); + + pthread_join(th0, NULL); + pthread_join(th1, NULL); + + sem_destroy(&sem); +} + +/* + * On unmount the kernel should send a FUSE_DESTROY operation. It should also + * send FUSE_FORGET operations for all inodes with lookup_count > 0. 
+ */ +TEST_F(Destroy, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + expect_forget(FUSE_ROOT_ID, 1); + expect_forget(ino, 2); + expect_destroy(0); + + /* + * access(2) the file to force a lookup. Access it twice to double its + * lookup count. + */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + + /* + * Unmount, triggering a FUSE_DESTROY and also causing a VOP_RECLAIM + * for every vnode on this mp, triggering FUSE_FORGET for each of them. + */ + m_mock->unmount(); +} Property changes on: head/tests/sys/fs/fusefs/destroy.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/dev_fuse_poll.cc =================================================================== --- head/tests/sys/fs/fusefs/dev_fuse_poll.cc (nonexistent) +++ head/tests/sys/fs/fusefs/dev_fuse_poll.cc (revision 350665) @@ -0,0 +1,227 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * This file tests different polling methods for the /dev/fuse device + */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +const char FULLPATH[] = "mountpoint/some_file.txt"; +const char RELPATH[] = "some_file.txt"; +const uint64_t ino = 42; +const mode_t access_mode = R_OK; + +/* + * Translate a poll method's string representation to the enum value. 
+ * Using strings with ::testing::Values gives better output with + * --gtest_list_tests + */ +enum poll_method poll_method_from_string(const char *s) +{ + if (0 == strcmp("BLOCKING", s)) + return BLOCKING; + else if (0 == strcmp("KQ", s)) + return KQ; + else if (0 == strcmp("POLL", s)) + return POLL; + else + return SELECT; +} + +class DevFusePoll: public FuseTest, public WithParamInterface<const char *> { + virtual void SetUp() { + m_pm = poll_method_from_string(GetParam()); + FuseTest::SetUp(); + } +}; + +class Kqueue: public FuseTest { + virtual void SetUp() { + m_pm = KQ; + FuseTest::SetUp(); + } +}; + +TEST_P(DevFusePoll, access) +{ + expect_access(FUSE_ROOT_ID, X_OK, 0); + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_access(ino, access_mode, 0); + + ASSERT_EQ(0, access(FULLPATH, access_mode)) << strerror(errno); +} + +/* Ensure that we wake up pollers during unmount */ +TEST_P(DevFusePoll, destroy) +{ + expect_forget(FUSE_ROOT_ID, 1); + expect_destroy(0); + + m_mock->unmount(); +} + +INSTANTIATE_TEST_CASE_P(PM, DevFusePoll, + ::testing::Values("BLOCKING", "KQ", "POLL", "SELECT")); + +static void* statter(void* arg) { + const char *name; + struct stat sb; + + name = (const char*)arg; + stat(name, &sb); + return 0; +} + +/* + * A kevent's data field should contain the number of operations available to + * be immediately read. 
+ */ +TEST_F(Kqueue, data) +{ + pthread_t th0, th1, th2; + sem_t sem0, sem1; + int nready0, nready1, nready2; + uint64_t foo_ino = 42; + uint64_t bar_ino = 43; + uint64_t baz_ino = 44; + Sequence seq; + + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); + + EXPECT_LOOKUP(FUSE_ROOT_ID, "foo") + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = foo_ino; + }))); + EXPECT_LOOKUP(FUSE_ROOT_ID, "bar") + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = bar_ino; + }))); + EXPECT_LOOKUP(FUSE_ROOT_ID, "baz") + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = baz_ino; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == foo_ino); + }, Eq(true)), + _) + ) + .WillOnce(Invoke(ReturnImmediate([&](auto in, auto& out) { + nready0 = m_mock->m_nready; + + sem_post(&sem0); + // Block the daemon so we can accumulate a few more ops + sem_wait(&sem1); + + out.header.unique = in.header.unique; + out.header.error = -EIO; + out.header.len = sizeof(out.header); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + (in.header.nodeid == bar_ino || + in.header.nodeid == baz_ino)); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([&](auto in, auto& out) { + nready1 = m_mock->m_nready; + out.header.unique = in.header.unique; + out.header.error = -EIO; + out.header.len = 
sizeof(out.header); + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + (in.header.nodeid == bar_ino || + in.header.nodeid == baz_ino)); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([&](auto in, auto& out) { + nready2 = m_mock->m_nready; + out.header.unique = in.header.unique; + out.header.error = -EIO; + out.header.len = sizeof(out.header); + }))); + + /* + * Create cached lookup entries for these files. It seems that only + * one thread at a time can be in VOP_LOOKUP for a given directory + */ + access("mountpoint/foo", F_OK); + access("mountpoint/bar", F_OK); + access("mountpoint/baz", F_OK); + ASSERT_EQ(0, pthread_create(&th0, NULL, statter, + __DECONST(void*, "mountpoint/foo"))) << strerror(errno); + EXPECT_EQ(0, sem_wait(&sem0)) << strerror(errno); + ASSERT_EQ(0, pthread_create(&th1, NULL, statter, + __DECONST(void*, "mountpoint/bar"))) << strerror(errno); + ASSERT_EQ(0, pthread_create(&th2, NULL, statter, + __DECONST(void*, "mountpoint/baz"))) << strerror(errno); + + nap(); // Allow th1 and th2 to send their ops to the daemon + EXPECT_EQ(0, sem_post(&sem1)) << strerror(errno); + + pthread_join(th0, NULL); + pthread_join(th1, NULL); + pthread_join(th2, NULL); + + EXPECT_EQ(1, nready0); + EXPECT_EQ(2, nready1); + EXPECT_EQ(1, nready2); + + sem_destroy(&sem0); + sem_destroy(&sem1); +} Property changes on: head/tests/sys/fs/fusefs/dev_fuse_poll.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/interrupt.cc =================================================================== --- head/tests/sys/fs/fusefs/interrupt.cc (nonexistent) +++ head/tests/sys/fs/fusefs/interrupt.cc (revision 350665) @@ 
-0,0 +1,790 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +#include +#include +#include +#include +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +/* Initial size of files used by these tests */ +const off_t FILESIZE = 1000; +/* Access mode used by all directories in these tests */ +const mode_t MODE = 0755; +const char FULLDIRPATH0[] = "mountpoint/some_dir"; +const char RELDIRPATH0[] = "some_dir"; +const char FULLDIRPATH1[] = "mountpoint/other_dir"; +const char RELDIRPATH1[] = "other_dir"; + +static sem_t *blocked_semaphore; +static sem_t *signaled_semaphore; + +static bool killer_should_sleep = false; + +/* Don't do anything; all we care about is that the syscall gets interrupted */ +void sigusr2_handler(int __unused sig) { + if (verbosity > 1) { + printf("Signaled! thread %p\n", pthread_self()); + } + +} + +void* killer(void* target) { + /* Wait until the main thread is blocked in fdisp_wait_answ */ + if (killer_should_sleep) + nap(); + else + sem_wait(blocked_semaphore); + if (verbosity > 1) + printf("Signalling! thread %p\n", target); + pthread_kill((pthread_t)target, SIGUSR2); + if (signaled_semaphore != NULL) + sem_post(signaled_semaphore); + + return(NULL); +} + +class Interrupt: public FuseTest { +public: +pthread_t m_child; + +Interrupt(): m_child(NULL) {}; + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, FILESIZE, 1); +} + +/* + * Expect a FUSE_MKDIR but don't reply. Instead, just record the unique value + * to the provided pointer + */ +void expect_mkdir(uint64_t *mkdir_unique) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).WillOnce(Invoke([=](auto in, auto &out __unused) { + *mkdir_unique = in.header.unique; + sem_post(blocked_semaphore); + })); +} + +/* + * Expect a FUSE_READ but don't reply. 
Instead, just record the unique value + * to the provided pointer + */ +void expect_read(uint64_t ino, uint64_t *read_unique) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke([=](auto in, auto &out __unused) { + *read_unique = in.header.unique; + sem_post(blocked_semaphore); + })); +} + +/* + * Expect a FUSE_WRITE but don't reply. Instead, just record the unique value + * to the provided pointer + */ +void expect_write(uint64_t ino, uint64_t *write_unique) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_WRITE && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke([=](auto in, auto &out __unused) { + *write_unique = in.header.unique; + sem_post(blocked_semaphore); + })); +} + +void setup_interruptor(pthread_t target, bool sleep = false) +{ + ASSERT_NE(SIG_ERR, signal(SIGUSR2, sigusr2_handler)) << strerror(errno); + killer_should_sleep = sleep; + ASSERT_EQ(0, pthread_create(&m_child, NULL, killer, (void*)target)) + << strerror(errno); +} + +void SetUp() { + const int mprot = PROT_READ | PROT_WRITE; + const int mflags = MAP_ANON | MAP_SHARED; + + signaled_semaphore = NULL; + + blocked_semaphore = (sem_t*)mmap(NULL, sizeof(*blocked_semaphore), + mprot, mflags, -1, 0); + ASSERT_NE(MAP_FAILED, blocked_semaphore) << strerror(errno); + ASSERT_EQ(0, sem_init(blocked_semaphore, 1, 0)) << strerror(errno); + ASSERT_EQ(0, siginterrupt(SIGUSR2, 1)); + + FuseTest::SetUp(); +} + +void TearDown() { + struct sigaction sa; + + if (m_child != NULL) { + pthread_join(m_child, NULL); + } + bzero(&sa, sizeof(sa)); + sa.sa_handler = SIG_DFL; + sigaction(SIGUSR2, &sa, NULL); + + sem_destroy(blocked_semaphore); + munmap(blocked_semaphore, sizeof(*blocked_semaphore)); + + FuseTest::TearDown(); +} +}; + +class Intr: public Interrupt {}; + +class Nointr: public Interrupt { + void SetUp() { + m_nointr = true; 
+ Interrupt::SetUp(); + } +}; + +static void* mkdir0(void* arg __unused) { + ssize_t r; + + r = mkdir(FULLDIRPATH0, MODE); + if (r >= 0) + return 0; + else + return (void*)(intptr_t)errno; +} + +static void* read1(void* arg) { + const size_t bufsize = FILESIZE; + char buf[bufsize]; + int fd = (int)(intptr_t)arg; + ssize_t r; + + r = read(fd, buf, bufsize); + if (r >= 0) + return 0; + else + return (void*)(intptr_t)errno; +} + +/* + * An interrupt operation that gets received after the original command is + * complete should generate an EAGAIN response. + */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ +TEST_F(Intr, already_complete) +{ + uint64_t ino = 42; + pthread_t self; + uint64_t mkdir_unique = 0; + Sequence seq; + + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .InSequence(seq) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_mkdir(&mkdir_unique); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in, auto &out) { + // First complete the mkdir request + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + out0->header.unique = mkdir_unique; + SET_OUT_HEADER_LEN(*out0, entry); + out0->body.create.entry.attr.mode = S_IFDIR | MODE; + out0->body.create.entry.nodeid = ino; + out.push_back(std::move(out0)); + + // Then, respond EAGAIN to the interrupt request + std::unique_ptr<mockfs_buf_out> out1(new mockfs_buf_out); + out1->header.unique = in.header.unique; + out1->header.error = -EAGAIN; + out1->header.len = sizeof(out1->header); + out.push_back(std::move(out1)); + })); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | MODE; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 2; + }))); + + setup_interruptor(self); + EXPECT_EQ(0,
mkdir(FULLDIRPATH0, MODE)) << strerror(errno); + /* + * The final syscall simply ensures that the test's main thread doesn't + * end before the daemon finishes responding to the FUSE_INTERRUPT. + */ + EXPECT_EQ(0, access(FULLDIRPATH0, F_OK)) << strerror(errno); +} + +/* + * If a FUSE file system returns ENOSYS for a FUSE_INTERRUPT operation, the + * kernel should not attempt to interrupt any other operations on that mount + * point. + */ +TEST_F(Intr, enosys) +{ + uint64_t ino0 = 42, ino1 = 43; + uint64_t mkdir_unique; + pthread_t self, th0; + sem_t sem0, sem1; + void *thr0_value; + Sequence seq; + + self = pthread_self(); + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH1) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_mkdir(&mkdir_unique); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke([&](auto in, auto &out) { + // reject FUSE_INTERRUPT and respond to the FUSE_MKDIR + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + std::unique_ptr<mockfs_buf_out> out1(new mockfs_buf_out); + + out0->header.unique = in.header.unique; + out0->header.error = -ENOSYS; + out0->header.len = sizeof(out0->header); + out.push_back(std::move(out0)); + + SET_OUT_HEADER_LEN(*out1, entry); + out1->body.create.entry.attr.mode = S_IFDIR | MODE; + out1->body.create.entry.nodeid = ino1; + out1->header.unique = mkdir_unique; + out.push_back(std::move(out1)); + })); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke([&](auto in, auto &out) { + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + + sem_post(&sem0); + sem_wait(&sem1); + + SET_OUT_HEADER_LEN(*out0,
entry); + out0->body.create.entry.attr.mode = S_IFDIR | MODE; + out0->body.create.entry.nodeid = ino0; + out0->header.unique = in.header.unique; + out.push_back(std::move(out0)); + })); + + setup_interruptor(self); + /* First mkdir operation should finish synchronously */ + ASSERT_EQ(0, mkdir(FULLDIRPATH1, MODE)) << strerror(errno); + + ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) + << strerror(errno); + + sem_wait(&sem0); + /* + * th0 should be blocked waiting for the fuse daemon thread. + * Signal it. No FUSE_INTERRUPT should result + */ + pthread_kill(th0, SIGUSR1); + /* Allow the daemon thread to proceed */ + sem_post(&sem1); + pthread_join(th0, &thr0_value); + /* Second mkdir should've finished without error */ + EXPECT_EQ(0, (intptr_t)thr0_value); +} + +/* + * A FUSE filesystem is legally allowed to ignore INTERRUPT operations, and + * complete the original operation whenever it damn well pleases. + */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ +TEST_F(Intr, ignore) +{ + uint64_t ino = 42; + pthread_t self; + uint64_t mkdir_unique; + + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_mkdir(&mkdir_unique); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out) { + // Ignore FUSE_INTERRUPT; respond to the FUSE_MKDIR + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + out0->header.unique = mkdir_unique; + SET_OUT_HEADER_LEN(*out0, entry); + out0->body.create.entry.attr.mode = S_IFDIR | MODE; + out0->body.create.entry.nodeid = ino; + out.push_back(std::move(out0)); + })); + + setup_interruptor(self); + ASSERT_EQ(0, mkdir(FULLDIRPATH0, MODE)) << strerror(errno); +} + +/* + * A restartable operation (basically, anything except write or setextattr) + * that hasn't yet been sent to userland can be
interrupted without sending + * FUSE_INTERRUPT, and will be automatically restarted. + */ +TEST_F(Intr, in_kernel_restartable) +{ + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + uint64_t ino0 = 42, ino1 = 43; + int fd1; + pthread_t self, th0, th1; + sem_t sem0, sem1; + void *thr0_value, *thr1_value; + + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_lookup(RELPATH1, ino1); + expect_open(ino1, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([&](auto in __unused, auto& out) { + /* Let the next write proceed */ + sem_post(&sem1); + /* Pause the daemon thread so it won't read the next op */ + sem_wait(&sem0); + + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | MODE; + out.body.create.entry.nodeid = ino0; + }))); + FuseTest::expect_read(ino1, 0, FILESIZE, 0, NULL); + + fd1 = open(FULLPATH1, O_RDONLY); + ASSERT_LE(0, fd1) << strerror(errno); + + /* Use a separate thread for each operation */ + ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) + << strerror(errno); + + sem_wait(&sem1); /* Sequence the two operations */ + + ASSERT_EQ(0, pthread_create(&th1, NULL, read1, (void*)(intptr_t)fd1)) + << strerror(errno); + + setup_interruptor(self, true); + + pause(); /* Wait for signal */ + + /* Unstick the daemon */ + ASSERT_EQ(0, sem_post(&sem0)) << strerror(errno); + + /* Wait awhile to make sure the signal generates no FUSE_INTERRUPT */ + nap(); + + pthread_join(th1, &thr1_value); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr1_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + sem_destroy(&sem1); + sem_destroy(&sem0); +} + +/* + * An operation that hasn't yet been sent to 
userland can be interrupted + * without sending FUSE_INTERRUPT. If it's a non-restartable operation (write + * or setextattr) it will return EINTR. + */ +TEST_F(Intr, in_kernel_nonrestartable) +{ + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + uint64_t ino0 = 42, ino1 = 43; + int ns = EXTATTR_NAMESPACE_USER; + int fd1; + pthread_t self, th0; + sem_t sem0, sem1; + void *thr0_value; + ssize_t r; + + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_lookup(RELPATH1, ino1); + expect_open(ino1, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([&](auto in __unused, auto& out) { + /* Let the next write proceed */ + sem_post(&sem1); + /* Pause the daemon thread so it won't read the next op */ + sem_wait(&sem0); + + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | MODE; + out.body.create.entry.nodeid = ino0; + }))); + + fd1 = open(FULLPATH1, O_WRONLY); + ASSERT_LE(0, fd1) << strerror(errno); + + /* Use a separate thread for the first write */ + ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) + << strerror(errno); + + sem_wait(&sem1); /* Sequence the two operations */ + + setup_interruptor(self, true); + + r = extattr_set_fd(fd1, ns, "foo", (const void*)value, value_len); + EXPECT_NE(0, r); + EXPECT_EQ(EINTR, errno); + + /* Unstick the daemon */ + ASSERT_EQ(0, sem_post(&sem0)) << strerror(errno); + + /* Wait awhile to make sure the signal generates no FUSE_INTERRUPT */ + nap(); + + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + sem_destroy(&sem1); + sem_destroy(&sem0); +} + +/* + * A syscall 
that gets interrupted while blocking on FUSE I/O should send a + * FUSE_INTERRUPT command to the fuse filesystem, which should then send EINTR + * in response to the _original_ operation. The kernel should ultimately + * return EINTR to userspace + */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ +TEST_F(Intr, in_progress) +{ + pthread_t self; + uint64_t mkdir_unique; + + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_mkdir(&mkdir_unique); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out) { + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + out0->header.error = -EINTR; + out0->header.unique = mkdir_unique; + out0->header.len = sizeof(out0->header); + out.push_back(std::move(out0)); + })); + + setup_interruptor(self); + ASSERT_EQ(-1, mkdir(FULLDIRPATH0, MODE)); + EXPECT_EQ(EINTR, errno); +} + +/* Reads should also be interruptible */ +TEST_F(Intr, in_progress_read) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const size_t bufsize = 80; + char buf[bufsize]; + uint64_t ino = 42; + int fd; + pthread_t self; + uint64_t read_unique; + + self = pthread_self(); + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_read(ino, &read_unique); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == read_unique); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out) { + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + out0->header.error = -EINTR; + out0->header.unique = read_unique; + out0->header.len = sizeof(out0->header); + out.push_back(std::move(out0)); + })); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + +
setup_interruptor(self); + ASSERT_EQ(-1, read(fd, buf, bufsize)); + EXPECT_EQ(EINTR, errno); +} + +/* + * When mounted with -o nointr, fusefs will block signals while waiting for the + * server. + */ +TEST_F(Nointr, block) +{ + uint64_t ino = 42; + pthread_t self; + sem_t sem0; + + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + signaled_semaphore = &sem0; + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([&](auto in __unused, auto& out) { + /* Let the killer proceed */ + sem_post(blocked_semaphore); + + /* Wait until after the signal has been sent */ + sem_wait(signaled_semaphore); + /* Allow time for the mkdir thread to receive the signal */ + nap(); + + /* Finally, complete the original op */ + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | MODE; + out.body.create.entry.nodeid = ino; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT); + }, Eq(true)), + _) + ).Times(0); + + setup_interruptor(self); + ASSERT_EQ(0, mkdir(FULLDIRPATH0, MODE)) << strerror(errno); + + sem_destroy(&sem0); +} + +/* FUSE_INTERRUPT operations should take priority over other pending ops */ +TEST_F(Intr, priority) +{ + Sequence seq; + uint64_t ino1 = 43; + uint64_t mkdir_unique; + pthread_t th0; + sem_t sem0, sem1; + + ASSERT_EQ(0, sem_init(&sem0, 0, 0)) << strerror(errno); + ASSERT_EQ(0, sem_init(&sem1, 0, 0)) << strerror(errno); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH1) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).InSequence(seq) + 
.WillOnce(Invoke(ReturnImmediate([&](auto in, auto& out) { + mkdir_unique = in.header.unique; + + /* Let the next mkdir proceed */ + sem_post(&sem1); + + /* Pause the daemon thread so it won't read the next op */ + sem_wait(&sem0); + + /* Finally, interrupt the original op */ + out.header.error = -EINTR; + out.header.unique = mkdir_unique; + out.header.len = sizeof(out.header); + }))); + /* + * FUSE_INTERRUPT should be received before the second FUSE_MKDIR, + * even though it was generated later + */ + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke(ReturnErrno(EAGAIN))); + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_MKDIR); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | MODE; + out.body.create.entry.nodeid = ino1; + }))); + + /* Use a separate thread for the first mkdir */ + ASSERT_EQ(0, pthread_create(&th0, NULL, mkdir0, NULL)) + << strerror(errno); + + signaled_semaphore = &sem0; + + sem_wait(&sem1); /* Sequence the two mkdirs */ + setup_interruptor(th0, true); + ASSERT_EQ(0, mkdir(FULLDIRPATH1, MODE)) << strerror(errno); + + pthread_join(th0, NULL); + sem_destroy(&sem1); + sem_destroy(&sem0); +} + +/* + * If the FUSE filesystem receives the FUSE_INTERRUPT operation before + * processing the original, then it should wait for "some timeout" for the + * original operation to arrive. If not, it should send EAGAIN to the + * INTERRUPT operation, and the kernel should requeue the INTERRUPT. 
+ * + * In this test, we'll pretend that the INTERRUPT arrives too soon, gets + * EAGAINed, then the kernel requeues it, and the second time around it + * successfully interrupts the original + */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236530 */ +TEST_F(Intr, too_soon) +{ + Sequence seq; + pthread_t self; + uint64_t mkdir_unique; + + self = pthread_self(); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH0) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_mkdir(&mkdir_unique); + + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke(ReturnErrno(EAGAIN))); + + EXPECT_CALL(*m_mock, process( + ResultOf([&](auto in) { + return (in.header.opcode == FUSE_INTERRUPT && + in.body.interrupt.unique == mkdir_unique); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke([&](auto in __unused, auto &out) { + std::unique_ptr<mockfs_buf_out> out0(new mockfs_buf_out); + out0->header.error = -EINTR; + out0->header.unique = mkdir_unique; + out0->header.len = sizeof(out0->header); + out.push_back(std::move(out0)); + })); + + setup_interruptor(self); + ASSERT_EQ(-1, mkdir(FULLDIRPATH0, MODE)); + EXPECT_EQ(EINTR, errno); +} + + +// TODO: add a test where write returns EWOULDBLOCK Property changes on: head/tests/sys/fs/fusefs/interrupt.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/io.cc =================================================================== --- head/tests/sys/fs/fusefs/io.cc (nonexistent) +++ head/tests/sys/fs/fusefs/io.cc (revision 350665) @@ -0,0 +1,543 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright
(c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include + +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +/* + * For testing I/O like fsx does, but deterministically and without a real + * underlying file system + * + * TODO: after fusefs gains the options to select cache mode for each mount + * point, run each of these tests for all cache modes. 
+ */ + +using namespace testing; + +enum cache_mode { + Uncached, + Writethrough, + Writeback, + WritebackAsync +}; + +const char *cache_mode_to_s(enum cache_mode cm) { + switch (cm) { + case Uncached: + return "Uncached"; + case Writethrough: + return "Writethrough"; + case Writeback: + return "Writeback"; + case WritebackAsync: + return "WritebackAsync"; + default: + return "Unknown"; + } +} + +const char FULLPATH[] = "mountpoint/some_file.txt"; +const char RELPATH[] = "some_file.txt"; +const uint64_t ino = 42; + +static void compare(const void *tbuf, const void *controlbuf, off_t baseofs, + ssize_t size) +{ + int i; + + for (i = 0; i < size; i++) { + if (((const char*)tbuf)[i] != ((const char*)controlbuf)[i]) { + off_t ofs = baseofs + i; + FAIL() << "miscompare at offset " + << std::hex + << std::showbase + << ofs + << ". expected = " + << std::setw(2) + << (unsigned)((const uint8_t*)controlbuf)[i] + << " got = " + << (unsigned)((const uint8_t*)tbuf)[i]; + } + } +} + +typedef tuple<bool, uint32_t, cache_mode> IoParam; + +class Io: public FuseTest, public WithParamInterface<IoParam> { +public: +int m_backing_fd, m_control_fd, m_test_fd; +off_t m_filesize; +bool m_direct_io; + +Io(): m_backing_fd(-1), m_control_fd(-1), m_direct_io(false) {}; + +void SetUp() +{ + m_filesize = 0; + m_backing_fd = open("backing_file", O_RDWR | O_CREAT | O_TRUNC, 0644); + if (m_backing_fd < 0) + FAIL() << strerror(errno); + m_control_fd = open("control", O_RDWR | O_CREAT | O_TRUNC, 0644); + if (m_control_fd < 0) + FAIL() << strerror(errno); + srandom(22'9'1982); // Seed with my birthday + + if (get<0>(GetParam())) + m_init_flags |= FUSE_ASYNC_READ; + m_maxwrite = get<1>(GetParam()); + switch (get<2>(GetParam())) { + case Uncached: + m_direct_io = true; + break; + case WritebackAsync: + m_async = true; + /* FALLTHROUGH */ + case Writeback: + m_init_flags |= FUSE_WRITEBACK_CACHE; + /* FALLTHROUGH */ + case Writethrough: + break; + default: + FAIL() << "Unknown cache mode"; + } + + FuseTest::SetUp(); + if (IsSkipped()) +
return; + + if (verbosity > 0) { + printf("Test Parameters: init_flags=%#x maxwrite=%#x " + "%sasync cache=%s\n", + m_init_flags, m_maxwrite, m_async? "" : "no", + cache_mode_to_s(get<2>(GetParam()))); + } + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_open(ino, m_direct_io ? FOPEN_DIRECT_IO : 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_WRITE && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in, auto& out) { + const char *buf = (const char*)in.body.bytes + + sizeof(struct fuse_write_in); + ssize_t isize = in.body.write.size; + off_t iofs = in.body.write.offset; + + ASSERT_EQ(isize, pwrite(m_backing_fd, buf, isize, iofs)) + << strerror(errno); + SET_OUT_HEADER_LEN(out, write); + out.body.write.size = isize; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in, auto& out) { + ssize_t isize = in.body.write.size; + off_t iofs = in.body.write.offset; + void *buf = out.body.bytes; + ssize_t osize; + + osize = pread(m_backing_fd, buf, isize, iofs); + ASSERT_LE(0, osize) << strerror(errno); + out.header.len = sizeof(struct fuse_out_header) + osize; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + (in.body.setattr.valid & FATTR_SIZE)); + + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in, auto& out) { + ASSERT_EQ(0, ftruncate(m_backing_fd, in.body.setattr.size)) + << strerror(errno); + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = S_IFREG | 0755; + out.body.attr.attr.size = in.body.setattr.size; + out.body.attr.attr_valid = UINT64_MAX; + }))); + /* Any test that close()s will send FUSE_FLUSH and FUSE_RELEASE */ + EXPECT_CALL(*m_mock, 
process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FLUSH && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnErrno(0))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RELEASE && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnErrno(0))); + + m_test_fd = open(FULLPATH, O_RDWR ); + EXPECT_LE(0, m_test_fd) << strerror(errno); +} + +void TearDown() +{ + if (m_test_fd >= 0) + close(m_test_fd); + if (m_backing_fd >= 0) + close(m_backing_fd); + if (m_control_fd >= 0) + close(m_control_fd); + FuseTest::TearDown(); + leak(m_test_fd); +} + +void do_closeopen() +{ + ASSERT_EQ(0, close(m_test_fd)) << strerror(errno); + m_test_fd = open("backing_file", O_RDWR); + ASSERT_LE(0, m_test_fd) << strerror(errno); + + ASSERT_EQ(0, close(m_control_fd)) << strerror(errno); + m_control_fd = open("control", O_RDWR); + ASSERT_LE(0, m_control_fd) << strerror(errno); +} + +void do_ftruncate(off_t offs) +{ + ASSERT_EQ(0, ftruncate(m_test_fd, offs)) << strerror(errno); + ASSERT_EQ(0, ftruncate(m_control_fd, offs)) << strerror(errno); + m_filesize = offs; +} + +void do_mapread(ssize_t size, off_t offs) +{ + void *control_buf, *p; + off_t pg_offset, page_mask; + size_t map_size; + + page_mask = getpagesize() - 1; + pg_offset = offs & page_mask; + map_size = pg_offset + size; + + p = mmap(NULL, map_size, PROT_READ, MAP_FILE | MAP_SHARED, m_test_fd, + offs - pg_offset); + ASSERT_NE(p, MAP_FAILED) << strerror(errno); + + control_buf = malloc(size); + ASSERT_NE(nullptr, control_buf) << strerror(errno); + + ASSERT_EQ(size, pread(m_control_fd, control_buf, size, offs)) + << strerror(errno); + + compare((void*)((char*)p + pg_offset), control_buf, offs, size); + + ASSERT_EQ(0, munmap(p, map_size)) << strerror(errno); + free(control_buf); +} + +void do_read(ssize_t size, off_t offs) +{ + void *test_buf, *control_buf; + ssize_t r; + + test_buf = malloc(size); + 
ASSERT_NE(nullptr, test_buf) << strerror(errno); + control_buf = malloc(size); + ASSERT_NE(nullptr, control_buf) << strerror(errno); + + errno = 0; + r = pread(m_test_fd, test_buf, size, offs); + ASSERT_NE(-1, r) << strerror(errno); + ASSERT_EQ(size, r) << "unexpected short read"; + r = pread(m_control_fd, control_buf, size, offs); + ASSERT_NE(-1, r) << strerror(errno); + ASSERT_EQ(size, r) << "unexpected short read"; + + compare(test_buf, control_buf, offs, size); + + free(control_buf); + free(test_buf); +} + +void do_mapwrite(ssize_t size, off_t offs) +{ + char *buf; + void *p; + off_t pg_offset, page_mask; + size_t map_size; + long i; + + page_mask = getpagesize() - 1; + pg_offset = offs & page_mask; + map_size = pg_offset + size; + + buf = (char*)malloc(size); + ASSERT_NE(nullptr, buf) << strerror(errno); + for (i=0; i < size; i++) + buf[i] = random(); + + if (offs + size > m_filesize) { + /* + * Must manually extend. vm_mmap_vnode will not implicitly + * extend a vnode + */ + do_ftruncate(offs + size); + } + + p = mmap(NULL, map_size, PROT_READ | PROT_WRITE, + MAP_FILE | MAP_SHARED, m_test_fd, offs - pg_offset); + ASSERT_NE(p, MAP_FAILED) << strerror(errno); + + bcopy(buf, (char*)p + pg_offset, size); + ASSERT_EQ(size, pwrite(m_control_fd, buf, size, offs)) + << strerror(errno); + + free(buf); + ASSERT_EQ(0, munmap(p, map_size)) << strerror(errno); +} + +void do_write(ssize_t size, off_t offs) +{ + char *buf; + long i; + + buf = (char*)malloc(size); + ASSERT_NE(nullptr, buf) << strerror(errno); + for (i=0; i < size; i++) + buf[i] = random(); + + ASSERT_EQ(size, pwrite(m_test_fd, buf, size, offs )) + << strerror(errno); + ASSERT_EQ(size, pwrite(m_control_fd, buf, size, offs)) + << strerror(errno); + m_filesize = std::max(m_filesize, offs + size); + + free(buf); +} + +}; + +class IoCacheable: public Io { +public: +virtual void SetUp() { + Io::SetUp(); +} +}; + +/* + * Extend a file with dirty data in the last page of the last block. 
+ * + * fsx -WR -P /tmp -S8 -N3 fsx.bin + */ +TEST_P(Io, extend_from_dirty_page) +{ + off_t wofs = 0x21a0; + ssize_t wsize = 0xf0a8; + off_t rofs = 0xb284; + ssize_t rsize = 0x9b22; + off_t truncsize = 0x28702; + + do_write(wsize, wofs); + do_ftruncate(truncsize); + do_read(rsize, rofs); +} + +/* + * mapwrite into a newly extended part of a file. + * + * fsx -c 100 -i 100 -l 524288 -o 131072 -N5 -P /tmp -S19 fsx.bin + */ +TEST_P(IoCacheable, extend_by_mapwrite) +{ + do_mapwrite(0x849e, 0x29a3a); /* [0x29a3a, 0x31ed7] */ + do_mapwrite(0x3994, 0x3c7d8); /* [0x3c7d8, 0x4016b] */ + do_read(0xf556, 0x30c16); /* [0x30c16, 0x4016b] */ +} + +/* + * When writing the last page of a file, it must be written synchronously. + * Otherwise the cached page can be invalidated by a subsequent extend + * operation. + * + * fsx -WR -P /tmp -S642 -N3 fsx.bin + */ +TEST_P(Io, last_page) +{ + do_write(0xcc77, 0x1134f); /* [0x1134f, 0x1dfc5] */ + do_write(0xdfa7, 0x2096a); /* [0x2096a, 0x2e910] */ + do_read(0xb5b7, 0x1a3aa); /* [0x1a3aa, 0x25960] */ +} + +/* + * Read a hole using mmap + * + * fsx -c 100 -i 100 -l 524288 -o 131072 -N11 -P /tmp -S14 fsx.bin + */ +TEST_P(IoCacheable, mapread_hole) +{ + do_write(0x123b7, 0xf205); /* [0xf205, 0x215bb] */ + do_mapread(0xeeea, 0x2f4c); /* [0x2f4c, 0x11e35] */ +} + +/* + * Read a hole from a block that contains some cached data. + * + * fsx -WR -P /tmp -S55 fsx.bin + */ +TEST_P(Io, read_hole_from_cached_block) +{ + off_t wofs = 0x160c5; + ssize_t wsize = 0xa996; + off_t rofs = 0x472e; + ssize_t rsize = 0xd8d5; + + do_write(wsize, wofs); + do_read(rsize, rofs); +} + +/* + * Truncating a file into a dirty buffer should not cause anything untoward + * to happen when that buffer is eventually flushed. 
+ * + * fsx -WR -P /tmp -S839 -d -N6 fsx.bin + */ +TEST_P(Io, truncate_into_dirty_buffer) +{ + off_t wofs0 = 0x3bad7; + ssize_t wsize0 = 0x4529; + off_t wofs1 = 0xc30d; + ssize_t wsize1 = 0x5f77; + off_t truncsize0 = 0x10916; + off_t rofs = 0xdf17; + ssize_t rsize = 0x29ff; + off_t truncsize1 = 0x152b4; + + do_write(wsize0, wofs0); + do_write(wsize1, wofs1); + do_ftruncate(truncsize0); + do_read(rsize, rofs); + do_ftruncate(truncsize1); + close(m_test_fd); +} + +/* + * Truncating a file into a dirty buffer should not cause anything untoward + * to happen when that buffer is eventually flushed, even when the buffer's + * dirty_off is > 0. + * + * Based on this command with a few steps removed: + * fsx -WR -P /tmp -S677 -d -N8 fsx.bin + */ +TEST_P(Io, truncate_into_dirty_buffer2) +{ + off_t truncsize0 = 0x344f3; + off_t wofs = 0x2790c; + ssize_t wsize = 0xd86a; + off_t truncsize1 = 0x2de38; + off_t rofs2 = 0x1fd7a; + ssize_t rsize2 = 0xc594; + off_t truncsize2 = 0x31e71; + + /* Sets the file size to something larger than the next write */ + do_ftruncate(truncsize0); + /* + * Creates a dirty buffer. The part in lbn 2 doesn't flush + * synchronously. + */ + do_write(wsize, wofs); + /* Truncates part of the dirty buffer created in step 2 */ + do_ftruncate(truncsize1); + /* XXX I don't know why this is necessary. */ + do_read(rsize2, rofs2); + /* Truncates the dirty buffer */ + do_ftruncate(truncsize2); + close(m_test_fd); +} + +/* + * Regression test for a bug introduced in r348931 + * + * Sequence of operations: + * 1) The first write reads lbn so it can modify it + * 2) The first write flushes lbn 3 immediately because it's the end of file + * 3) The first write then flushes lbn 4 because it's the end of the file + * 4) The second write modifies the cached versions of lbn 3 and 4 + * 5) The third write's getblkx invalidates lbn 4's B_CACHE because it's + * extending the buffer. Then it flushes lbn 4 because B_DELWRI was set but + * B_CACHE was clear. 
+ * 6) fuse_write_biobackend erroneously called vfs_bio_clrbuf, putting the + * buffer into a weird write-only state. All read operations would return + * 0. Writes were apparently still processed, because the buffer's contents + * were correct when examined in a core dump. + * 7) The third write reads lbn 4 because cache is clear + * 8) uiomove dutifully copies new data into the buffer + * 9) The buffer's dirty data is flushed to lbn 4 + * 10) The read returns all zeros because of step 6. + * + * Based on: + * fsx -WR -l 524388 -o 131072 -P /tmp -S6456 -q fsx.bin + */ +TEST_P(Io, resize_a_valid_buffer_while_extending) +{ + do_write(0x14530, 0x36ee6); /* [0x36ee6, 0x4b415] */ + do_write(0x1507c, 0x33256); /* [0x33256, 0x482d1] */ + do_write(0x175c, 0x4c03d); /* [0x4c03d, 0x4d798] */ + do_read(0xe277, 0x3599c); /* [0x3599c, 0x43c12] */ + close(m_test_fd); +} + +INSTANTIATE_TEST_CASE_P(Io, Io, + Combine(Bool(), /* async read */ + Values(0x1000, 0x10000, 0x20000), /* m_maxwrite */ + Values(Uncached, Writethrough, Writeback, WritebackAsync) + ) +); + +INSTANTIATE_TEST_CASE_P(Io, IoCacheable, + Combine(Bool(), /* async read */ + Values(0x1000, 0x10000, 0x20000), /* m_maxwrite */ + Values(Writethrough, Writeback, WritebackAsync) + ) +); Property changes on: head/tests/sys/fs/fusefs/io.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/mockfs.cc =================================================================== --- head/tests/sys/fs/fusefs/mockfs.cc (nonexistent) +++ head/tests/sys/fs/fusefs/mockfs.cc (revision 350665) @@ -0,0 +1,733 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage 
Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include + +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include "mntopts.h" // for build_iovec +} + +#include + +#include + +#include "mockfs.hh" + +using namespace testing; + +int verbosity = 0; + +const char* opcode2opname(uint32_t opcode) +{ + const int NUM_OPS = 39; + const char* table[NUM_OPS] = { + "Unknown (opcode 0)", + "LOOKUP", + "FORGET", + "GETATTR", + "SETATTR", + "READLINK", + "SYMLINK", + "Unknown (opcode 7)", + "MKNOD", + "MKDIR", + "UNLINK", + "RMDIR", + "RENAME", + "LINK", + "OPEN", + "READ", + "WRITE", + "STATFS", + "RELEASE", + "Unknown (opcode 19)", + "FSYNC", + "SETXATTR", + "GETXATTR", + "LISTXATTR", + "REMOVEXATTR", + "FLUSH", + "INIT", + "OPENDIR", + "READDIR", + "RELEASEDIR", + "FSYNCDIR", + "GETLK", + "SETLK", + "SETLKW", + "ACCESS", + "CREATE", + "INTERRUPT", + "BMAP", + "DESTROY" + }; + if (opcode >= NUM_OPS) + return ("Unknown (opcode > max)"); + else + return (table[opcode]); +} + +ProcessMockerT +ReturnErrno(int error) +{ + return([=](auto in, auto &out) { + std::unique_ptr out0(new mockfs_buf_out); + out0->header.unique = in.header.unique; + out0->header.error = -error; + out0->header.len = sizeof(out0->header); + out.push_back(std::move(out0)); + }); +} + +/* Helper function used for returning negative cache entries for LOOKUP */ +ProcessMockerT +ReturnNegativeCache(const struct timespec *entry_valid) +{ + return([=](auto in, auto &out) { + /* nodeid means ENOENT and cache it */ + std::unique_ptr out0(new mockfs_buf_out); + out0->body.entry.nodeid = 0; + out0->header.unique = in.header.unique; + out0->header.error = 0; + out0->body.entry.entry_valid = entry_valid->tv_sec; + out0->body.entry.entry_valid_nsec = entry_valid->tv_nsec; + SET_OUT_HEADER_LEN(*out0, entry); + out.push_back(std::move(out0)); + }); +} + +ProcessMockerT +ReturnImmediate(std::function f) +{ + return([=](auto& in, auto &out) { + std::unique_ptr out0(new 
mockfs_buf_out); + out0->header.unique = in.header.unique; + f(in, *out0); + out.push_back(std::move(out0)); + }); +} + +void sigint_handler(int __unused sig) { + // Don't do anything except interrupt the daemon's read(2) call +} + +void MockFS::debug_request(const mockfs_buf_in &in) +{ + printf("%-11s ino=%2" PRIu64, opcode2opname(in.header.opcode), + in.header.nodeid); + if (verbosity > 1) { + printf(" uid=%5u gid=%5u pid=%5u unique=%" PRIu64 " len=%u", + in.header.uid, in.header.gid, in.header.pid, + in.header.unique, in.header.len); + } + switch (in.header.opcode) { + const char *name, *value; + + case FUSE_ACCESS: + printf(" mask=%#x", in.body.access.mask); + break; + case FUSE_BMAP: + printf(" block=%" PRIx64 " blocksize=%#x", + in.body.bmap.block, in.body.bmap.blocksize); + break; + case FUSE_CREATE: + if (m_kernel_minor_version >= 12) + name = (const char*)in.body.bytes + + sizeof(fuse_create_in); + else + name = (const char*)in.body.bytes + + sizeof(fuse_open_in); + printf(" flags=%#x name=%s", + in.body.open.flags, name); + break; + case FUSE_FLUSH: + printf(" fh=%#" PRIx64 " lock_owner=%" PRIu64, + in.body.flush.fh, + in.body.flush.lock_owner); + break; + case FUSE_FORGET: + printf(" nlookup=%" PRIu64, in.body.forget.nlookup); + break; + case FUSE_FSYNC: + printf(" flags=%#x", in.body.fsync.fsync_flags); + break; + case FUSE_FSYNCDIR: + printf(" flags=%#x", in.body.fsyncdir.fsync_flags); + break; + case FUSE_INTERRUPT: + printf(" unique=%" PRIu64, in.body.interrupt.unique); + break; + case FUSE_LINK: + printf(" oldnodeid=%" PRIu64, in.body.link.oldnodeid); + break; + case FUSE_LOOKUP: + printf(" %s", in.body.lookup); + break; + case FUSE_MKDIR: + name = (const char*)in.body.bytes + + sizeof(fuse_mkdir_in); + printf(" name=%s mode=%#o umask=%#o", name, + in.body.mkdir.mode, in.body.mkdir.umask); + break; + case FUSE_MKNOD: + if (m_kernel_minor_version >= 12) + name = (const char*)in.body.bytes + + sizeof(fuse_mknod_in); + else + name = (const 
char*)in.body.bytes + + FUSE_COMPAT_MKNOD_IN_SIZE; + printf(" mode=%#o rdev=%x umask=%#o name=%s", + in.body.mknod.mode, in.body.mknod.rdev, + in.body.mknod.umask, name); + break; + case FUSE_OPEN: + printf(" flags=%#x", in.body.open.flags); + break; + case FUSE_OPENDIR: + printf(" flags=%#x", in.body.opendir.flags); + break; + case FUSE_READ: + printf(" offset=%" PRIu64 " size=%u", + in.body.read.offset, + in.body.read.size); + if (verbosity > 1) + printf(" flags=%#x", in.body.read.flags); + break; + case FUSE_READDIR: + printf(" fh=%#" PRIx64 " offset=%" PRIu64 " size=%u", + in.body.readdir.fh, in.body.readdir.offset, + in.body.readdir.size); + break; + case FUSE_RELEASE: + printf(" fh=%#" PRIx64 " flags=%#x lock_owner=%" PRIu64, + in.body.release.fh, + in.body.release.flags, + in.body.release.lock_owner); + break; + case FUSE_SETATTR: + if (verbosity <= 1) { + printf(" valid=%#x", in.body.setattr.valid); + break; + } + if (in.body.setattr.valid & FATTR_MODE) + printf(" mode=%#o", in.body.setattr.mode); + if (in.body.setattr.valid & FATTR_UID) + printf(" uid=%u", in.body.setattr.uid); + if (in.body.setattr.valid & FATTR_GID) + printf(" gid=%u", in.body.setattr.gid); + if (in.body.setattr.valid & FATTR_SIZE) + printf(" size=%" PRIu64, in.body.setattr.size); + if (in.body.setattr.valid & FATTR_ATIME) + printf(" atime=%" PRIu64 ".%u", + in.body.setattr.atime, + in.body.setattr.atimensec); + if (in.body.setattr.valid & FATTR_MTIME) + printf(" mtime=%" PRIu64 ".%u", + in.body.setattr.mtime, + in.body.setattr.mtimensec); + if (in.body.setattr.valid & FATTR_FH) + printf(" fh=%" PRIu64 "", in.body.setattr.fh); + break; + case FUSE_SETLK: + printf(" fh=%#" PRIx64 " owner=%" PRIu64 + " type=%u pid=%u", + in.body.setlk.fh, in.body.setlk.owner, + in.body.setlk.lk.type, + in.body.setlk.lk.pid); + if (verbosity >= 2) { + printf(" range=[%" PRIu64 "-%" PRIu64 "]", + in.body.setlk.lk.start, + in.body.setlk.lk.end); + } + break; + case FUSE_SETXATTR: + /* + * In theory neither 
the xattr name and value need be + * ASCII, but in this test suite they always are. + */ + name = (const char*)in.body.bytes + + sizeof(fuse_setxattr_in); + value = name + strlen(name) + 1; + printf(" %s=%s", name, value); + break; + case FUSE_WRITE: + printf(" fh=%#" PRIx64 " offset=%" PRIu64 + " size=%u write_flags=%u", + in.body.write.fh, + in.body.write.offset, in.body.write.size, + in.body.write.write_flags); + if (verbosity > 1) + printf(" flags=%#x", in.body.write.flags); + break; + default: + break; + } + printf("\n"); +} + +/* + * Debug a FUSE response. + * + * This is mostly useful for asynchronous notifications, which don't correspond + * to any request + */ +void MockFS::debug_response(const mockfs_buf_out &out) { + const char *name; + + if (verbosity == 0) + return; + + switch (out.header.error) { + case FUSE_NOTIFY_INVAL_ENTRY: + name = (const char*)out.body.bytes + + sizeof(fuse_notify_inval_entry_out); + printf("<- INVAL_ENTRY parent=%" PRIu64 " %s\n", + out.body.inval_entry.parent, name); + break; + case FUSE_NOTIFY_INVAL_INODE: + printf("<- INVAL_INODE ino=%" PRIu64 " off=%" PRIi64 + " len=%" PRIi64 "\n", + out.body.inval_inode.ino, + out.body.inval_inode.off, + out.body.inval_inode.len); + break; + case FUSE_NOTIFY_STORE: + printf("<- STORE ino=%" PRIu64 " off=%" PRIu64 + " size=%" PRIu32 "\n", + out.body.store.nodeid, + out.body.store.offset, + out.body.store.size); + break; + default: + break; + } +} + +MockFS::MockFS(int max_readahead, bool allow_other, bool default_permissions, + bool push_symlinks_in, bool ro, enum poll_method pm, uint32_t flags, + uint32_t kernel_minor_version, uint32_t max_write, bool async, + bool noclusterr, unsigned time_gran, bool nointr) +{ + struct sigaction sa; + struct iovec *iov = NULL; + int iovlen = 0; + char fdstr[15]; + const bool trueval = true; + + m_daemon_id = NULL; + m_kernel_minor_version = kernel_minor_version; + m_maxreadahead = max_readahead; + m_maxwrite = max_write; + m_nready = -1; + m_pm = pm; + 
m_time_gran = time_gran; + m_quit = false; + if (m_pm == KQ) + m_kq = kqueue(); + else + m_kq = -1; + + /* + * Kyua sets pwd to a testcase-unique tempdir; no need to use + * mkdtemp + */ + /* + * googletest doesn't allow ASSERT_ in constructors, so we must throw + * instead. + */ + if (mkdir("mountpoint" , 0755) && errno != EEXIST) + throw(std::system_error(errno, std::system_category(), + "Couldn't make mountpoint directory")); + + switch (m_pm) { + case BLOCKING: + m_fuse_fd = open("/dev/fuse", O_CLOEXEC | O_RDWR); + break; + default: + m_fuse_fd = open("/dev/fuse", O_CLOEXEC | O_RDWR | O_NONBLOCK); + break; + } + if (m_fuse_fd < 0) + throw(std::system_error(errno, std::system_category(), + "Couldn't open /dev/fuse")); + + m_pid = getpid(); + m_child_pid = -1; + + build_iovec(&iov, &iovlen, "fstype", __DECONST(void *, "fusefs"), -1); + build_iovec(&iov, &iovlen, "fspath", + __DECONST(void *, "mountpoint"), -1); + build_iovec(&iov, &iovlen, "from", __DECONST(void *, "/dev/fuse"), -1); + sprintf(fdstr, "%d", m_fuse_fd); + build_iovec(&iov, &iovlen, "fd", fdstr, -1); + if (allow_other) { + build_iovec(&iov, &iovlen, "allow_other", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (default_permissions) { + build_iovec(&iov, &iovlen, "default_permissions", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (push_symlinks_in) { + build_iovec(&iov, &iovlen, "push_symlinks_in", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (ro) { + build_iovec(&iov, &iovlen, "ro", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (async) { + build_iovec(&iov, &iovlen, "async", __DECONST(void*, &trueval), + sizeof(bool)); + } + if (noclusterr) { + build_iovec(&iov, &iovlen, "noclusterr", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (nointr) { + build_iovec(&iov, &iovlen, "nointr", + __DECONST(void*, &trueval), sizeof(bool)); + } else { + build_iovec(&iov, &iovlen, "intr", + __DECONST(void*, &trueval), sizeof(bool)); + } + if (nmount(iov, iovlen, 0)) + 
throw(std::system_error(errno, std::system_category(), + "Couldn't mount filesystem")); + + // Set up default handler + ON_CALL(*this, process(_, _)) + .WillByDefault(Invoke(this, &MockFS::process_default)); + + init(flags); + bzero(&sa, sizeof(sa)); + sa.sa_handler = sigint_handler; + sa.sa_flags = 0; /* Don't set SA_RESTART! */ + if (0 != sigaction(SIGUSR1, &sa, NULL)) + throw(std::system_error(errno, std::system_category(), + "Couldn't handle SIGUSR1")); + if (pthread_create(&m_daemon_id, NULL, service, (void*)this)) + throw(std::system_error(errno, std::system_category(), + "Couldn't start fuse thread")); +} + +MockFS::~MockFS() { + kill_daemon(); + if (m_daemon_id != NULL) { + pthread_join(m_daemon_id, NULL); + m_daemon_id = NULL; + } + ::unmount("mountpoint", MNT_FORCE); + rmdir("mountpoint"); + if (m_kq >= 0) + close(m_kq); +} + +void MockFS::init(uint32_t flags) { + std::unique_ptr<mockfs_buf_in> in(new mockfs_buf_in); + std::unique_ptr<mockfs_buf_out> out(new mockfs_buf_out); + + read_request(*in); + ASSERT_EQ(FUSE_INIT, in->header.opcode); + + out->header.unique = in->header.unique; + out->header.error = 0; + out->body.init.major = FUSE_KERNEL_VERSION; + out->body.init.minor = m_kernel_minor_version; + out->body.init.flags = in->body.init.flags & flags; + out->body.init.max_write = m_maxwrite; + out->body.init.max_readahead = m_maxreadahead; + + if (m_kernel_minor_version < 23) { + SET_OUT_HEADER_LEN(*out, init_7_22); + } else { + out->body.init.time_gran = m_time_gran; + SET_OUT_HEADER_LEN(*out, init); + } + + write(m_fuse_fd, out.get(), out->header.len); +} + +void MockFS::kill_daemon() { + m_quit = true; + if (m_daemon_id != NULL) + pthread_kill(m_daemon_id, SIGUSR1); + // Closing the /dev/fuse file descriptor first allows unmount to + // succeed even if the daemon doesn't correctly respond to commands + // during the unmount sequence. 
+ close(m_fuse_fd); + m_fuse_fd = -1; +} + +void MockFS::loop() { + std::vector> out; + + std::unique_ptr in(new mockfs_buf_in); + ASSERT_TRUE(in != NULL); + while (!m_quit) { + bzero(in.get(), sizeof(*in)); + read_request(*in); + if (m_quit) + break; + if (verbosity > 0) + debug_request(*in); + if (pid_ok((pid_t)in->header.pid)) { + process(*in, out); + } else { + /* + * Reject any requests from unknown processes. Because + * we actually do mount a filesystem, plenty of + * unrelated system daemons may try to access it. + */ + if (verbosity > 1) + printf("\tREJECTED (wrong pid %d)\n", + in->header.pid); + process_default(*in, out); + } + for (auto &it: out) + write_response(*it); + out.clear(); + } +} + +int MockFS::notify_inval_entry(ino_t parent, const char *name, size_t namelen) +{ + std::unique_ptr out(new mockfs_buf_out); + + out->header.unique = 0; /* 0 means asynchronous notification */ + out->header.error = FUSE_NOTIFY_INVAL_ENTRY; + out->body.inval_entry.parent = parent; + out->body.inval_entry.namelen = namelen; + strlcpy((char*)&out->body.bytes + sizeof(out->body.inval_entry), + name, sizeof(out->body.bytes) - sizeof(out->body.inval_entry)); + out->header.len = sizeof(out->header) + sizeof(out->body.inval_entry) + + namelen; + debug_response(*out); + write_response(*out); + return 0; +} + +int MockFS::notify_inval_inode(ino_t ino, off_t off, ssize_t len) +{ + std::unique_ptr out(new mockfs_buf_out); + + out->header.unique = 0; /* 0 means asynchronous notification */ + out->header.error = FUSE_NOTIFY_INVAL_INODE; + out->body.inval_inode.ino = ino; + out->body.inval_inode.off = off; + out->body.inval_inode.len = len; + out->header.len = sizeof(out->header) + sizeof(out->body.inval_inode); + debug_response(*out); + write_response(*out); + return 0; +} + +int MockFS::notify_store(ino_t ino, off_t off, const void* data, ssize_t size) +{ + std::unique_ptr out(new mockfs_buf_out); + + out->header.unique = 0; /* 0 means asynchronous notification */ + 
out->header.error = FUSE_NOTIFY_STORE; + out->body.store.nodeid = ino; + out->body.store.offset = off; + out->body.store.size = size; + bcopy(data, (char*)&out->body.bytes + sizeof(out->body.store), size); + out->header.len = sizeof(out->header) + sizeof(out->body.store) + size; + debug_response(*out); + write_response(*out); + return 0; +} + +bool MockFS::pid_ok(pid_t pid) { + if (pid == m_pid) { + return (true); + } else if (pid == m_child_pid) { + return (true); + } else { + struct kinfo_proc *ki; + bool ok = false; + + ki = kinfo_getproc(pid); + if (ki == NULL) + return (false); + /* + * Allow access by the aio daemon processes so that our tests + * can use aio functions + */ + if (0 == strncmp("aiod", ki->ki_comm, 4)) + ok = true; + free(ki); + return (ok); + } +} + +void MockFS::process_default(const mockfs_buf_in& in, + std::vector> &out) +{ + std::unique_ptr out0(new mockfs_buf_out); + out0->header.unique = in.header.unique; + out0->header.error = -EOPNOTSUPP; + out0->header.len = sizeof(out0->header); + out.push_back(std::move(out0)); +} + +void MockFS::read_request(mockfs_buf_in &in) { + ssize_t res; + int nready = 0; + fd_set readfds; + pollfd fds[1]; + struct kevent changes[1]; + struct kevent events[1]; + struct timespec timeout_ts; + struct timeval timeout_tv; + const int timeout_ms = 999; + int timeout_int, nfds; + + switch (m_pm) { + case BLOCKING: + break; + case KQ: + timeout_ts.tv_sec = 0; + timeout_ts.tv_nsec = timeout_ms * 1'000'000; + while (nready == 0) { + EV_SET(&changes[0], m_fuse_fd, EVFILT_READ, EV_ADD, 0, + 0, 0); + nready = kevent(m_kq, &changes[0], 1, &events[0], 1, + &timeout_ts); + if (m_quit) + return; + } + ASSERT_LE(0, nready) << strerror(errno); + ASSERT_EQ(events[0].ident, (uintptr_t)m_fuse_fd); + if (events[0].flags & EV_ERROR) + FAIL() << strerror(events[0].data); + else if (events[0].flags & EV_EOF) + FAIL() << strerror(events[0].fflags); + m_nready = events[0].data; + break; + case POLL: + timeout_int = timeout_ms; + 
fds[0].fd = m_fuse_fd; + fds[0].events = POLLIN; + while (nready == 0) { + nready = poll(fds, 1, timeout_int); + if (m_quit) + return; + } + ASSERT_LE(0, nready) << strerror(errno); + ASSERT_TRUE(fds[0].revents & POLLIN); + break; + case SELECT: + timeout_tv.tv_sec = 0; + timeout_tv.tv_usec = timeout_ms * 1'000; + nfds = m_fuse_fd + 1; + while (nready == 0) { + FD_ZERO(&readfds); + FD_SET(m_fuse_fd, &readfds); + nready = select(nfds, &readfds, NULL, NULL, + &timeout_tv); + if (m_quit) + return; + } + ASSERT_LE(0, nready) << strerror(errno); + ASSERT_TRUE(FD_ISSET(m_fuse_fd, &readfds)); + break; + default: + FAIL() << "not yet implemented"; + } + res = read(m_fuse_fd, &in, sizeof(in)); + + if (res < 0 && !m_quit) { + FAIL() << "read: " << strerror(errno); + m_quit = true; + } + ASSERT_TRUE(res >= static_cast(sizeof(in.header)) || m_quit); +} + +void MockFS::write_response(const mockfs_buf_out &out) { + fd_set writefds; + pollfd fds[1]; + int nready, nfds; + ssize_t r; + + switch (m_pm) { + case BLOCKING: + case KQ: /* EVFILT_WRITE is not supported */ + break; + case POLL: + fds[0].fd = m_fuse_fd; + fds[0].events = POLLOUT; + nready = poll(fds, 1, INFTIM); + ASSERT_LE(0, nready) << strerror(errno); + ASSERT_EQ(1, nready) << "NULL timeout expired?"; + ASSERT_TRUE(fds[0].revents & POLLOUT); + break; + case SELECT: + FD_ZERO(&writefds); + FD_SET(m_fuse_fd, &writefds); + nfds = m_fuse_fd + 1; + nready = select(nfds, NULL, &writefds, NULL, NULL); + ASSERT_LE(0, nready) << strerror(errno); + ASSERT_EQ(1, nready) << "NULL timeout expired?"; + ASSERT_TRUE(FD_ISSET(m_fuse_fd, &writefds)); + break; + default: + FAIL() << "not yet implemented"; + } + r = write(m_fuse_fd, &out, out.header.len); + ASSERT_TRUE(r > 0 || errno == EAGAIN) << strerror(errno); +} + +void* MockFS::service(void *pthr_data) { + MockFS *mock_fs = (MockFS*)pthr_data; + + mock_fs->loop(); + + return (NULL); +} + +void MockFS::unmount() { + ::unmount("mountpoint", 0); +} Property changes on: 
head/tests/sys/fs/fusefs/mockfs.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/mockfs.hh =================================================================== --- head/tests/sys/fs/fusefs/mockfs.hh (nonexistent) +++ head/tests/sys/fs/fusefs/mockfs.hh (revision 350665) @@ -0,0 +1,394 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include + +#include + +#include "fuse_kernel.h" +} + +#include + +#define TIME_T_MAX (std::numeric_limits<time_t>::max()) + +/* + * A pseudo-fuse errno used to indicate that a fuse operation should have no + * response, at least not immediately + */ +#define FUSE_NORESPONSE 9999 + +#define SET_OUT_HEADER_LEN(out, variant) { \ + (out).header.len = (sizeof((out).header) + \ + sizeof((out).body.variant)); \ +} + +/* + * Create an expectation on FUSE_LOOKUP and return it so the caller can set + * actions. + * + * This must be a macro instead of a method because EXPECT_CALL returns a type + * with a deleted constructor. 
+ */ +#define EXPECT_LOOKUP(parent, path) \ + EXPECT_CALL(*m_mock, process( \ + ResultOf([=](auto in) { \ + return (in.header.opcode == FUSE_LOOKUP && \ + in.header.nodeid == (parent) && \ + strcmp(in.body.lookup, (path)) == 0); \ + }, Eq(true)), \ + _) \ + ) + +extern int verbosity; + +/* This struct isn't defined by fuse_kernel.h or libfuse, but it should be */ +struct fuse_create_out { + struct fuse_entry_out entry; + struct fuse_open_out open; +}; + +/* Protocol 7.8 version of struct fuse_attr */ +struct fuse_attr_7_8 +{ + uint64_t ino; + uint64_t size; + uint64_t blocks; + uint64_t atime; + uint64_t mtime; + uint64_t ctime; + uint32_t atimensec; + uint32_t mtimensec; + uint32_t ctimensec; + uint32_t mode; + uint32_t nlink; + uint32_t uid; + uint32_t gid; + uint32_t rdev; +}; + +/* Protocol 7.8 version of struct fuse_attr_out */ +struct fuse_attr_out_7_8 +{ + uint64_t attr_valid; + uint32_t attr_valid_nsec; + uint32_t dummy; + struct fuse_attr_7_8 attr; +}; + +/* Protocol 7.8 version of struct fuse_entry_out */ +struct fuse_entry_out_7_8 { + uint64_t nodeid; /* Inode ID */ + uint64_t generation; /* Inode generation: nodeid:gen must + be unique for the fs's lifetime */ + uint64_t entry_valid; /* Cache timeout for the name */ + uint64_t attr_valid; /* Cache timeout for the attributes */ + uint32_t entry_valid_nsec; + uint32_t attr_valid_nsec; + struct fuse_attr_7_8 attr; +}; + +/* Output struct for FUSE_CREATE for protocol 7.8 servers */ +struct fuse_create_out_7_8 { + struct fuse_entry_out_7_8 entry; + struct fuse_open_out open; +}; + +/* Output struct for FUSE_INIT for protocol 7.22 and earlier servers */ +struct fuse_init_out_7_22 { + uint32_t major; + uint32_t minor; + uint32_t max_readahead; + uint32_t flags; + uint16_t max_background; + uint16_t congestion_threshold; + uint32_t max_write; +}; + +union fuse_payloads_in { + fuse_access_in access; + fuse_bmap_in bmap; + /* value is from fuse_kern_chan.c in fusefs-libs */ + uint8_t bytes[0x21000 - sizeof(struct 
fuse_in_header)]; + fuse_create_in create; + fuse_flush_in flush; + fuse_fsync_in fsync; + fuse_fsync_in fsyncdir; + fuse_forget_in forget; + fuse_interrupt_in interrupt; + fuse_lk_in getlk; + fuse_getxattr_in getxattr; + fuse_init_in init; + fuse_link_in link; + fuse_listxattr_in listxattr; + char lookup[0]; + fuse_mkdir_in mkdir; + fuse_mknod_in mknod; + fuse_open_in open; + fuse_open_in opendir; + fuse_read_in read; + fuse_read_in readdir; + fuse_release_in release; + fuse_release_in releasedir; + fuse_rename_in rename; + char rmdir[0]; + fuse_setattr_in setattr; + fuse_setxattr_in setxattr; + fuse_lk_in setlk; + fuse_lk_in setlkw; + char unlink[0]; + fuse_write_in write; +}; + +struct mockfs_buf_in { + fuse_in_header header; + union fuse_payloads_in body; +}; + +union fuse_payloads_out { + fuse_attr_out attr; + fuse_attr_out_7_8 attr_7_8; + fuse_bmap_out bmap; + fuse_create_out create; + fuse_create_out_7_8 create_7_8; + /* + * The protocol places no limits on the size of bytes. Choose + * a size big enough for anything we'll test. + */ + uint8_t bytes[0x20000]; + fuse_entry_out entry; + fuse_entry_out_7_8 entry_7_8; + fuse_lk_out getlk; + fuse_getxattr_out getxattr; + fuse_init_out init; + fuse_init_out_7_22 init_7_22; + /* The inval_entry structure should be followed by the entry's name */ + fuse_notify_inval_entry_out inval_entry; + fuse_notify_inval_inode_out inval_inode; + /* The store structure should be followed by the data to store */ + fuse_notify_store_out store; + fuse_listxattr_out listxattr; + fuse_open_out open; + fuse_statfs_out statfs; + /* + * The protocol places no limits on the length of the string. This is + * merely convenient for testing. 
+ */ + char str[80]; + fuse_write_out write; +}; + +struct mockfs_buf_out { + fuse_out_header header; + union fuse_payloads_out body; + + /* Default constructor: zero everything */ + mockfs_buf_out() { + memset(this, 0, sizeof(*this)); + } +}; + +/* A function that can be invoked in place of MockFS::process */ +typedef std::function> &out)> +ProcessMockerT; + +/* + * Helper function used for setting an error expectation for any fuse operation. + * The operation will return the supplied error + */ +ProcessMockerT ReturnErrno(int error); + +/* Helper function used for returning negative cache entries for LOOKUP */ +ProcessMockerT ReturnNegativeCache(const struct timespec *entry_valid); + +/* Helper function used for returning a single immediate response */ +ProcessMockerT ReturnImmediate( + std::function f); + +/* How the daemon should check /dev/fuse for readiness */ +enum poll_method { + BLOCKING, + SELECT, + POLL, + KQ +}; + +/* + * Fake FUSE filesystem + * + * "Mounts" a filesystem to a temporary directory and services requests + * according to the programmed expectations. + * + * Operates directly on the fusefs(4) kernel API, not the libfuse(3) user api. + */ +class MockFS { + /* + * thread id of the fuse daemon thread + * + * It must run in a separate thread so it doesn't deadlock with the + * client test code. 
+ */ + pthread_t m_daemon_id; + + /* file descriptor of /dev/fuse control device */ + int m_fuse_fd; + + /* The minor version of the kernel API that this mock daemon targets */ + uint32_t m_kernel_minor_version; + + int m_kq; + + /* The max_readahead file system option */ + uint32_t m_maxreadahead; + + /* pid of the test process */ + pid_t m_pid; + + /* Method the daemon should use for I/O to and from /dev/fuse */ + enum poll_method m_pm; + + /* Timestamp granularity in nanoseconds */ + unsigned m_time_gran; + + void debug_request(const mockfs_buf_in&); + void debug_response(const mockfs_buf_out&); + + /* Initialize a session after mounting */ + void init(uint32_t flags); + + /* Is pid from a process that might be involved in the test? */ + bool pid_ok(pid_t pid); + + /* Default request handler */ + void process_default(const mockfs_buf_in&, + std::vector>&); + + /* Entry point for the daemon thread */ + static void* service(void*); + + /* Read, but do not process, a single request from the kernel */ + void read_request(mockfs_buf_in& in); + + /* Write a single response back to the kernel */ + void write_response(const mockfs_buf_out &out); + + public: + /* pid of child process, for two-process test cases */ + pid_t m_child_pid; + + /* Maximum size of a FUSE_WRITE write */ + uint32_t m_maxwrite; + + /* + * Number of events that were available from /dev/fuse after the last + * kevent call. Only valid when m_pm = KQ. 
+ */ + int m_nready; + + /* Tell the daemon to shut down ASAP */ + bool m_quit; + + /* Create a new mockfs and mount it to a tempdir */ + MockFS(int max_readahead, bool allow_other, + bool default_permissions, bool push_symlinks_in, bool ro, + enum poll_method pm, uint32_t flags, + uint32_t kernel_minor_version, uint32_t max_write, bool async, + bool no_clusterr, unsigned time_gran, bool nointr); + + virtual ~MockFS(); + + /* Kill the filesystem daemon without unmounting the filesystem */ + void kill_daemon(); + + /* Process FUSE requests endlessly */ + void loop(); + + /* + * Send an asynchronous notification to invalidate a directory entry. + * Similar to libfuse's fuse_lowlevel_notify_inval_entry + * + * This method will block until the client has responded, so it should + * generally be run in a separate thread from request processing. + * + * @param parent Parent directory's inode number + * @param name name of dirent to invalidate + * @param namelen size of name, including the NUL + */ + int notify_inval_entry(ino_t parent, const char *name, size_t namelen); + + /* + * Send an asynchronous notification to invalidate an inode's cached + * data and/or attributes. Similar to libfuse's + * fuse_lowlevel_notify_inval_inode. + * + * This method will block until the client has responded, so it should + * generally be run in a separate thread from request processing. + * + * @param ino File's inode number + * @param off offset at which to begin invalidation. A + * negative offset means to invalidate attributes + * only. + * @param len Size of region of data to invalidate. 0 means + * to invalidate all cached data. + */ + int notify_inval_inode(ino_t ino, off_t off, ssize_t len); + + /* + * Send an asynchronous notification to store data directly into an + * inode's cache. Similar to libfuse's fuse_lowlevel_notify_store. + * + * This method will block until the client has responded, so it should + * generally be run in a separate thread from request processing. 
+ * + * @param ino File's inode number + * @param off Offset at which to store data + * @param data Pointer to the data to cache + * @param size Size of data + */ + int notify_store(ino_t ino, off_t off, const void* data, ssize_t size); + + /* + * Request handler + * + * This method is expected to provide the responses to each FUSE + * operation. For an immediate response, push one buffer into out. + * For a delayed response, push nothing. For an immediate response + * plus a delayed response to an earlier operation, push two bufs. + * Test cases must define each response using Googlemock expectations. + */ + MOCK_METHOD2(process, void(const mockfs_buf_in&, + std::vector>&)); + + /* Gracefully unmount */ + void unmount(); +}; Index: head/tests/sys/fs/fusefs/notify.cc =================================================================== --- head/tests/sys/fs/fusefs/notify.cc (nonexistent) +++ head/tests/sys/fs/fusefs/notify.cc (revision 350665) @@ -0,0 +1,545 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include + +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +/* + * FUSE asynchronous notification + * + * FUSE servers can send unprompted notification messages for things like cache + * invalidation. This file tests our client's handling of those messages. + */ + +class Notify: public FuseTest { +public: +/* Ignore an optional FUSE_FSYNC */ +void maybe_expect_fsync(uint64_t ino) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FSYNC && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); +} + +void expect_lookup(uint64_t parent, const char *relpath, uint64_t ino, + off_t size, Sequence &seq) +{ + EXPECT_LOOKUP(parent, relpath) + .InSequence(seq) + .WillOnce(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr.ino = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr.size = size; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); +} +}; + +class NotifyWriteback: public Notify { +public: +virtual void SetUp() { + m_init_flags |= FUSE_WRITEBACK_CACHE; + m_async = true; + Notify::SetUp(); + if (IsSkipped()) + return; +} + +void expect_write(uint64_t ino, uint64_t offset, uint64_t size, + const void
*contents) +{ + FuseTest::expect_write(ino, offset, size, size, 0, 0, contents); +} + +}; + +struct inval_entry_args { + MockFS *mock; + ino_t parent; + const char *name; + size_t namelen; +}; + +static void* inval_entry(void* arg) { + const struct inval_entry_args *iea = (struct inval_entry_args*)arg; + ssize_t r; + + r = iea->mock->notify_inval_entry(iea->parent, iea->name, iea->namelen); + if (r >= 0) + return 0; + else + return (void*)(intptr_t)errno; +} + +struct inval_inode_args { + MockFS *mock; + ino_t ino; + off_t off; + ssize_t len; +}; + +struct store_args { + MockFS *mock; + ino_t nodeid; + off_t offset; + ssize_t size; + const void* data; +}; + +static void* inval_inode(void* arg) { + const struct inval_inode_args *iia = (struct inval_inode_args*)arg; + ssize_t r; + + r = iia->mock->notify_inval_inode(iia->ino, iia->off, iia->len); + if (r >= 0) + return 0; + else + return (void*)(intptr_t)errno; +} + +static void* store(void* arg) { + const struct store_args *sa = (struct store_args*)arg; + ssize_t r; + + r = sa->mock->notify_store(sa->nodeid, sa->offset, sa->data, sa->size); + if (r >= 0) + return 0; + else + return (void*)(intptr_t)errno; +} + +/* Invalidate a nonexistent entry */ +TEST_F(Notify, inval_entry_nonexistent) +{ + const static char *name = "foo"; + struct inval_entry_args iea; + void *thr0_value; + pthread_t th0; + + iea.mock = m_mock; + iea.parent = FUSE_ROOT_ID; + iea.name = name; + iea.namelen = strlen(name); + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_entry, &iea)) + << strerror(errno); + pthread_join(th0, &thr0_value); + /* It's not an error for an entry to not be cached */ + EXPECT_EQ(0, (intptr_t)thr0_value); +} + +/* Invalidate a cached entry */ +TEST_F(Notify, inval_entry) +{ + const static char FULLPATH[] = "mountpoint/foo"; + const static char RELPATH[] = "foo"; + struct inval_entry_args iea; + struct stat sb; + void *thr0_value; + uint64_t ino0 = 42; + uint64_t ino1 = 43; + Sequence seq; + pthread_t th0; + + 
expect_lookup(FUSE_ROOT_ID, RELPATH, ino0, 0, seq); + expect_lookup(FUSE_ROOT_ID, RELPATH, ino1, 0, seq); + + /* Fill the entry cache */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(ino0, sb.st_ino); + + /* Now invalidate the entry */ + iea.mock = m_mock; + iea.parent = FUSE_ROOT_ID; + iea.name = RELPATH; + iea.namelen = strlen(RELPATH); + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_entry, &iea)) + << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + /* The second lookup should return the alternate ino */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(ino1, sb.st_ino); +} + +/* + * Invalidate a cached entry beneath the root, which uses a slightly different + * code path. + */ +TEST_F(Notify, inval_entry_below_root) +{ + const static char FULLPATH[] = "mountpoint/some_dir/foo"; + const static char DNAME[] = "some_dir"; + const static char FNAME[] = "foo"; + struct inval_entry_args iea; + struct stat sb; + void *thr0_value; + uint64_t dir_ino = 41; + uint64_t ino0 = 42; + uint64_t ino1 = 43; + Sequence seq; + pthread_t th0; + + EXPECT_LOOKUP(FUSE_ROOT_ID, DNAME) + .WillOnce(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = dir_ino; + out.body.entry.attr.nlink = 2; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + expect_lookup(dir_ino, FNAME, ino0, 0, seq); + expect_lookup(dir_ino, FNAME, ino1, 0, seq); + + /* Fill the entry cache */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(ino0, sb.st_ino); + + /* Now invalidate the entry */ + iea.mock = m_mock; + iea.parent = dir_ino; + iea.name = FNAME; + iea.namelen = strlen(FNAME); + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_entry, &iea)) + << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + /* The second 
lookup should return the alternate ino */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(ino1, sb.st_ino); +} + +/* Invalidating an entry invalidates the parent directory's attributes */ +TEST_F(Notify, inval_entry_invalidates_parent_attrs) +{ + const static char FULLPATH[] = "mountpoint/foo"; + const static char RELPATH[] = "foo"; + struct inval_entry_args iea; + struct stat sb; + void *thr0_value; + uint64_t ino = 42; + Sequence seq; + pthread_t th0; + + expect_lookup(FUSE_ROOT_ID, RELPATH, ino, 0, seq); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == FUSE_ROOT_ID); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFDIR | 0755; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + /* Fill the attr and entry cache */ + ASSERT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + + /* Now invalidate the entry */ + iea.mock = m_mock; + iea.parent = FUSE_ROOT_ID; + iea.name = RELPATH; + iea.namelen = strlen(RELPATH); + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_entry, &iea)) + << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + /* /'s attribute cache should be cleared */ + ASSERT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); +} + + +TEST_F(Notify, inval_inode_nonexistent) +{ + struct inval_inode_args iia; + ino_t ino = 42; + void *thr0_value; + pthread_t th0; + + iia.mock = m_mock; + iia.ino = ino; + iia.off = 0; + iia.len = 0; + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_inode, &iia)) + << strerror(errno); + pthread_join(th0, &thr0_value); + /* It's not an error for an inode to not be cached */ + EXPECT_EQ(0, (intptr_t)thr0_value); +} + +TEST_F(Notify, inval_inode_with_clean_cache) +{ + const static char FULLPATH[] = "mountpoint/foo"; + 
const static char RELPATH[] = "foo"; + const char CONTENTS0[] = "abcdefgh"; + const char CONTENTS1[] = "ijklmnopqrstuvwxyz"; + struct inval_inode_args iia; + struct stat sb; + ino_t ino = 42; + void *thr0_value; + Sequence seq; + uid_t uid = 12345; + pthread_t th0; + ssize_t size0 = sizeof(CONTENTS0); + ssize_t size1 = sizeof(CONTENTS1); + char buf[80]; + int fd; + + expect_lookup(FUSE_ROOT_ID, RELPATH, ino, size0, seq); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr_valid = UINT64_MAX; + out.body.attr.attr.size = size1; + out.body.attr.attr.uid = uid; + }))); + expect_read(ino, 0, size0, size0, CONTENTS0); + expect_read(ino, 0, size1, size1, CONTENTS1); + + /* Fill the data cache */ + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(size0, read(fd, buf, size0)) << strerror(errno); + EXPECT_EQ(0, memcmp(buf, CONTENTS0, size0)); + + /* Evict the data cache */ + iia.mock = m_mock; + iia.ino = ino; + iia.off = 0; + iia.len = 0; + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_inode, &iia)) + << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + /* cached attributes have been purged; this will trigger a new GETATTR */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(uid, sb.st_uid); + EXPECT_EQ(size1, sb.st_size); + + /* This read should not be serviced by cache */ + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(size1, read(fd, buf, size1)) << strerror(errno); + EXPECT_EQ(0, memcmp(buf, CONTENTS1, size1)); + + leak(fd); +} + +/* FUSE_NOTIFY_STORE with a file that's not in the entry cache */ +/* disabled because FUSE_NOTIFY_STORE is not yet implemented */
+TEST_F(Notify, DISABLED_store_nonexistent) +{ + struct store_args sa; + ino_t ino = 42; + void *thr0_value; + pthread_t th0; + + sa.mock = m_mock; + sa.nodeid = ino; + sa.offset = 0; + sa.size = 0; + ASSERT_EQ(0, pthread_create(&th0, NULL, store, &sa)) << strerror(errno); + pthread_join(th0, &thr0_value); + /* It's not an error for a file to be unknown to the kernel */ + EXPECT_EQ(0, (intptr_t)thr0_value); +} + +/* Store data into a file that does not yet have anything cached */ +/* disabled because FUSE_NOTIFY_STORE is not yet implemented */ +TEST_F(Notify, DISABLED_store_with_blank_cache) +{ + const static char FULLPATH[] = "mountpoint/foo"; + const static char RELPATH[] = "foo"; + const char CONTENTS1[] = "ijklmnopqrstuvwxyz"; + struct store_args sa; + ino_t ino = 42; + void *thr0_value; + Sequence seq; + pthread_t th0; + ssize_t size1 = sizeof(CONTENTS1); + char buf[80]; + int fd; + + expect_lookup(FUSE_ROOT_ID, RELPATH, ino, size1, seq); + expect_open(ino, 0, 1); + + /* Open the file; nothing is cached yet */ + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + + /* Store data directly into the data cache */ + sa.mock = m_mock; + sa.nodeid = ino; + sa.offset = 0; + sa.size = size1; + sa.data = (const void*)CONTENTS1; + ASSERT_EQ(0, pthread_create(&th0, NULL, store, &sa)) << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + /* This read should be serviced by cache */ + ASSERT_EQ(size1, read(fd, buf, size1)) << strerror(errno); + EXPECT_EQ(0, memcmp(buf, CONTENTS1, size1)); + + leak(fd); +} + +TEST_F(NotifyWriteback, inval_inode_with_dirty_cache) +{ + const static char FULLPATH[] = "mountpoint/foo"; + const static char RELPATH[] = "foo"; + const char CONTENTS[] = "abcdefgh"; + struct inval_inode_args iia; + ino_t ino = 42; + void *thr0_value; + Sequence seq; + pthread_t th0; + ssize_t bufsize = sizeof(CONTENTS); + int fd; + + expect_lookup(FUSE_ROOT_ID, RELPATH, ino, 0, seq); + expect_open(ino, 0, 1); + + /* Fill the data cache */
+ fd = open(FULLPATH, O_RDWR); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + + expect_write(ino, 0, bufsize, CONTENTS); + /* + * The FUSE protocol does not require an fsync here, but FreeBSD's + * bufobj_invalbuf sends it anyway + */ + maybe_expect_fsync(ino); + + /* Evict the data cache */ + iia.mock = m_mock; + iia.ino = ino; + iia.off = 0; + iia.len = 0; + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_inode, &iia)) + << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + leak(fd); +} + +TEST_F(NotifyWriteback, inval_inode_attrs_only) +{ + const static char FULLPATH[] = "mountpoint/foo"; + const static char RELPATH[] = "foo"; + const char CONTENTS[] = "abcdefgh"; + struct inval_inode_args iia; + struct stat sb; + uid_t uid = 12345; + ino_t ino = 42; + void *thr0_value; + Sequence seq; + pthread_t th0; + ssize_t bufsize = sizeof(CONTENTS); + int fd; + + expect_lookup(FUSE_ROOT_ID, RELPATH, ino, 0, seq); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_WRITE); + }, Eq(true)), + _) + ).Times(0); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr_valid = UINT64_MAX; + out.body.attr.attr.size = bufsize; + out.body.attr.attr.uid = uid; + }))); + + /* Fill the data cache */ + fd = open(FULLPATH, O_RDWR); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + + /* Evict the attributes, but not data cache */ + iia.mock = m_mock; + iia.ino = ino; + iia.off = -1; + iia.len = 0; + ASSERT_EQ(0, pthread_create(&th0, NULL, inval_inode, &iia)) + << strerror(errno); + pthread_join(th0, &thr0_value); + EXPECT_EQ(0, (intptr_t)thr0_value); + + /* cache attributes 
have been purged; this will trigger a new GETATTR */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(uid, sb.st_uid); + EXPECT_EQ(bufsize, sb.st_size); + + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/notify.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/opendir.cc =================================================================== --- head/tests/sys/fs/fusefs/opendir.cc (nonexistent) +++ head/tests/sys/fs/fusefs/opendir.cc (revision 350665) @@ -0,0 +1,155 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Opendir: public FuseTest { +public: +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFDIR | 0755, 0, 1); +} + +void expect_opendir(uint64_t ino, uint32_t flags, ProcessMockerT r) +{ + /* opendir(3) calls fstatfs */ + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, statfs); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPENDIR && + in.header.nodeid == ino && + in.body.opendir.flags == flags); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +}; + + +/* + * The fuse daemon fails the request with ENOENT.
This usually indicates a + * race condition: some other FUSE client removed the file in between when the + * kernel checked for it with lookup and tried to open it + */ +TEST_F(Opendir, enoent) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino); + expect_opendir(ino, O_RDONLY, ReturnErrno(ENOENT)); + + EXPECT_NE(0, open(FULLPATH, O_DIRECTORY)); + EXPECT_EQ(ENOENT, errno); +} + +/* + * The daemon is responsible for checking file permissions (unless the + * default_permissions mount option was used) + */ +TEST_F(Opendir, eperm) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino); + expect_opendir(ino, O_RDONLY, ReturnErrno(EPERM)); + + EXPECT_NE(0, open(FULLPATH, O_DIRECTORY)); + EXPECT_EQ(EPERM, errno); +} + +TEST_F(Opendir, open) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino); + expect_opendir(ino, O_RDONLY, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, open); + })); + + EXPECT_LE(0, open(FULLPATH, O_DIRECTORY)) << strerror(errno); +} + +/* Directories can be opened O_EXEC for stuff like fchdir(2) */ +TEST_F(Opendir, open_exec) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino, O_EXEC, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, open); + })); + + fd = open(FULLPATH, O_EXEC | O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); +} + +TEST_F(Opendir, opendir) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino); + expect_opendir(ino, O_RDONLY, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, 
open); + })); + + errno = 0; + EXPECT_NE(nullptr, opendir(FULLPATH)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/opendir.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/read.cc =================================================================== --- head/tests/sys/fs/fusefs/read.cc (nonexistent) +++ head/tests/sys/fs/fusefs/read.cc (revision 350665) @@ -0,0 +1,916 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +#include +#include + +#include +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Read: public FuseTest { + +public: +void expect_lookup(const char *relpath, uint64_t ino, uint64_t size) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, size, 1); +} +}; + +class Read_7_8: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + FuseTest::SetUp(); +} + +void expect_lookup(const char *relpath, uint64_t ino, uint64_t size) +{ + FuseTest::expect_lookup_7_8(relpath, ino, S_IFREG | 0644, size, 1); +} +}; + +class AioRead: public Read { +public: +virtual void SetUp() { + const char *node = "vfs.aio.enable_unsafe"; + int val = 0; + size_t size = sizeof(val); + + FuseTest::SetUp(); + + ASSERT_EQ(0, sysctlbyname(node, &val, &size, NULL, 0)) + << strerror(errno); + if (!val) + GTEST_SKIP() << + "vfs.aio.enable_unsafe must be set for this test"; +} +}; + +class AsyncRead: public AioRead { + virtual void SetUp() { + m_init_flags = FUSE_ASYNC_READ; + AioRead::SetUp(); + } +}; + +class ReadAhead: public Read, + public WithParamInterface> +{ + virtual void SetUp() { + int val; + const char *node = "vfs.maxbcachebuf"; + size_t size = sizeof(val); + ASSERT_EQ(0, sysctlbyname(node, &val, &size, NULL, 0)) + << strerror(errno); + + m_maxreadahead = val * get<1>(GetParam()); + m_noclusterr = get<0>(GetParam()); + 
Read::SetUp(); + } +}; + +/* AIO reads need to set the header's pid field correctly */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236379 */ +TEST_F(AioRead, aio_read) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + struct aiocb iocb, *piocb; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + iocb.aio_nbytes = bufsize; + iocb.aio_fildes = fd; + iocb.aio_buf = buf; + iocb.aio_offset = 0; + iocb.aio_sigevent.sigev_notify = SIGEV_NONE; + ASSERT_EQ(0, aio_read(&iocb)) << strerror(errno); + ASSERT_EQ(bufsize, aio_waitcomplete(&piocb, NULL)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + + leak(fd); +} + +/* + * Without the FUSE_ASYNC_READ mount option, fuse(4) should ensure that there + * is at most one outstanding read operation per file handle + */ +TEST_F(AioRead, async_read_disabled) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = 50; + char buf0[bufsize], buf1[bufsize]; + off_t off0 = 0; + off_t off1 = m_maxbcachebuf; + struct aiocb iocb0, iocb1; + volatile sig_atomic_t read_count = 0; + + expect_lookup(RELPATH, ino, 131072); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh == FH && + in.body.read.offset == (uint64_t)off0); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke([&](auto in __unused, auto &out __unused) { + read_count++; + /* Filesystem is slow to respond */ + })); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid 
== ino && + in.body.read.fh == FH && + in.body.read.offset == (uint64_t)off1); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke([&](auto in __unused, auto &out __unused) { + read_count++; + /* Filesystem is slow to respond */ + })); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + /* + * Submit two AIO read requests, and respond to neither. If the + * filesystem ever gets the second read request, then we failed to + * limit outstanding reads. + */ + iocb0.aio_nbytes = bufsize; + iocb0.aio_fildes = fd; + iocb0.aio_buf = buf0; + iocb0.aio_offset = off0; + iocb0.aio_sigevent.sigev_notify = SIGEV_NONE; + ASSERT_EQ(0, aio_read(&iocb0)) << strerror(errno); + + iocb1.aio_nbytes = bufsize; + iocb1.aio_fildes = fd; + iocb1.aio_buf = buf1; + iocb1.aio_offset = off1; + iocb1.aio_sigevent.sigev_notify = SIGEV_NONE; + ASSERT_EQ(0, aio_read(&iocb1)) << strerror(errno); + + /* + * Sleep for a while to make sure the kernel has had a chance to issue + * the second read, even though the first has not yet returned + */ + nap(); + EXPECT_EQ(read_count, 1); + + m_mock->kill_daemon(); + /* Wait for AIO activity to complete, but ignore errors */ + (void)aio_waitcomplete(NULL, NULL); + + leak(fd); +} + +/* + * With the FUSE_ASYNC_READ mount option, fuse(4) may issue multiple + * simultaneous read requests on the same file handle.
+ */ +TEST_F(AsyncRead, async_read) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = 50; + char buf0[bufsize], buf1[bufsize]; + off_t off0 = 0; + off_t off1 = m_maxbcachebuf; + off_t fsize = 2 * m_maxbcachebuf; + struct aiocb iocb0, iocb1; + sem_t sem; + + ASSERT_EQ(0, sem_init(&sem, 0, 0)) << strerror(errno); + + expect_lookup(RELPATH, ino, fsize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh == FH && + in.body.read.offset == (uint64_t)off0); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out __unused) { + sem_post(&sem); + /* Filesystem is slow to respond */ + })); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh == FH && + in.body.read.offset == (uint64_t)off1); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out __unused) { + sem_post(&sem); + /* Filesystem is slow to respond */ + })); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + /* + * Submit two AIO read requests, but respond to neither. Ensure that + * we received both. 
+ */ + iocb0.aio_nbytes = bufsize; + iocb0.aio_fildes = fd; + iocb0.aio_buf = buf0; + iocb0.aio_offset = off0; + iocb0.aio_sigevent.sigev_notify = SIGEV_NONE; + ASSERT_EQ(0, aio_read(&iocb0)) << strerror(errno); + + iocb1.aio_nbytes = bufsize; + iocb1.aio_fildes = fd; + iocb1.aio_buf = buf1; + iocb1.aio_offset = off1; + iocb1.aio_sigevent.sigev_notify = SIGEV_NONE; + ASSERT_EQ(0, aio_read(&iocb1)) << strerror(errno); + + /* Wait until both reads have reached the daemon */ + ASSERT_EQ(0, sem_wait(&sem)) << strerror(errno); + ASSERT_EQ(0, sem_wait(&sem)) << strerror(errno); + + m_mock->kill_daemon(); + /* Wait for AIO activity to complete, but ignore errors */ + (void)aio_waitcomplete(NULL, NULL); + + leak(fd); +} + +/* 0-length reads shouldn't cause any confusion */ +TEST_F(Read, direct_io_read_nothing) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + uint64_t offset = 100; + char buf[80]; + + expect_lookup(RELPATH, ino, offset + 1000); + expect_open(ino, FOPEN_DIRECT_IO, 1); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(0, pread(fd, buf, 0, offset)) << strerror(errno); + leak(fd); +} + +/* + * With direct_io, reads should not fill the cache. 
They should go straight to + * the daemon + */ +TEST_F(Read, direct_io_pread) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + uint64_t offset = 100; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + expect_lookup(RELPATH, ino, offset + bufsize); + expect_open(ino, FOPEN_DIRECT_IO, 1); + expect_read(ino, offset, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, pread(fd, buf, bufsize, offset)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + + // With FOPEN_DIRECT_IO, the cache should be bypassed. The server will + // get a 2nd read request. + expect_read(ino, offset, bufsize, bufsize, CONTENTS); + ASSERT_EQ(bufsize, pread(fd, buf, bufsize, offset)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + leak(fd); +} + +/* + * With direct_io, filesystems are allowed to return less data than is + * requested. fuse(4) should return a short read to userland. 
+ */ +TEST_F(Read, direct_io_short_read) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + uint64_t offset = 100; + ssize_t bufsize = strlen(CONTENTS); + ssize_t halfbufsize = bufsize / 2; + char buf[bufsize]; + + expect_lookup(RELPATH, ino, offset + bufsize); + expect_open(ino, FOPEN_DIRECT_IO, 1); + expect_read(ino, offset, bufsize, halfbufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(halfbufsize, pread(fd, buf, bufsize, offset)) + << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, halfbufsize)); + leak(fd); +} + +TEST_F(Read, eio) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EIO))); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(-1, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(EIO, errno); + leak(fd); +} + +/* + * If the server returns a short read when direct io is not in use, that + * indicates EOF, because of a server-side truncation. We should invalidate + * all cached attributes. 
We may update the file size. + */ +TEST_F(Read, eof) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + uint64_t offset = 100; + ssize_t bufsize = strlen(CONTENTS); + ssize_t partbufsize = 3 * bufsize / 4; + ssize_t r; + char buf[bufsize]; + struct stat sb; + + expect_lookup(RELPATH, ino, offset + bufsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, offset + bufsize, offset + partbufsize, CONTENTS); + expect_getattr(ino, offset + partbufsize); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + r = pread(fd, buf, bufsize, offset); + ASSERT_LE(0, r) << strerror(errno); + EXPECT_EQ(partbufsize, r) << strerror(errno); + ASSERT_EQ(0, fstat(fd, &sb)); + EXPECT_EQ((off_t)(offset + partbufsize), sb.st_size); + leak(fd); +} + +/* Like Read.eof, but causes an entire buffer to be invalidated */ +TEST_F(Read, eof_of_whole_buffer) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + off_t old_filesize = m_maxbcachebuf * 2 + bufsize; + char buf[bufsize]; + struct stat sb; + + expect_lookup(RELPATH, ino, old_filesize); + expect_open(ino, 0, 1); + expect_read(ino, 2 * m_maxbcachebuf, bufsize, bufsize, CONTENTS); + expect_read(ino, m_maxbcachebuf, m_maxbcachebuf, 0, CONTENTS); + expect_getattr(ino, m_maxbcachebuf); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + /* Cache the third block */ + ASSERT_EQ(bufsize, pread(fd, buf, bufsize, m_maxbcachebuf * 2)) + << strerror(errno); + /* Try to read the 2nd block, but it's past EOF */ + ASSERT_EQ(0, pread(fd, buf, bufsize, m_maxbcachebuf)) + << strerror(errno); + ASSERT_EQ(0, fstat(fd, &sb)); + EXPECT_EQ((off_t)(m_maxbcachebuf), sb.st_size); + leak(fd); +} + +/* + * With the keep_cache option,
the kernel may keep its read cache across + * multiple open(2)s. + */ +TEST_F(Read, keep_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd0, fd1; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + FuseTest::expect_lookup(RELPATH, ino, S_IFREG | 0644, bufsize, 2); + expect_open(ino, FOPEN_KEEP_CACHE, 2); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd0 = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd0) << strerror(errno); + ASSERT_EQ(bufsize, read(fd0, buf, bufsize)) << strerror(errno); + + fd1 = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd1) << strerror(errno); + + /* + * This read should be serviced by cache, even though it's on the other + * file descriptor + */ + ASSERT_EQ(bufsize, read(fd1, buf, bufsize)) << strerror(errno); + + leak(fd0); + leak(fd1); +} + +/* + * Without the keep_cache option, the kernel should drop its read caches on + * every open + */ +TEST_F(Read, keep_cache_disabled) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd0, fd1; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + FuseTest::expect_lookup(RELPATH, ino, S_IFREG | 0644, bufsize, 2); + expect_open(ino, 0, 2); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd0 = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd0) << strerror(errno); + ASSERT_EQ(bufsize, read(fd0, buf, bufsize)) << strerror(errno); + + fd1 = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd1) << strerror(errno); + + /* + * This read should not be serviced by cache, even though it's on the + * original file descriptor + */ + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + ASSERT_EQ(0, lseek(fd0, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(bufsize, read(fd0, buf, bufsize)) << strerror(errno); + + leak(fd0); + leak(fd1); +} + +TEST_F(Read, mmap) +{ + const 
char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t len; + size_t bufsize = strlen(CONTENTS); + void *p; + + len = getpagesize(); + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh == Read::FH && + in.body.read.offset == 0 && + in.body.read.size == bufsize); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(struct fuse_out_header) + bufsize; + memmove(out.body.bytes, CONTENTS, bufsize); + }))); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0); + ASSERT_NE(MAP_FAILED, p) << strerror(errno); + + ASSERT_EQ(0, memcmp(p, CONTENTS, bufsize)); + + ASSERT_EQ(0, munmap(p, len)) << strerror(errno); + leak(fd); +} + +/* + * A read via mmap comes up short, indicating that the file was truncated + * server-side. 
+ */ +TEST_F(Read, mmap_eof) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t len; + size_t bufsize = strlen(CONTENTS); + struct stat sb; + void *p; + + len = getpagesize(); + + expect_lookup(RELPATH, ino, m_maxbcachebuf); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh == Read::FH && + in.body.read.offset == 0 && + in.body.read.size == (uint32_t)m_maxbcachebuf); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(struct fuse_out_header) + bufsize; + memmove(out.body.bytes, CONTENTS, bufsize); + }))); + expect_getattr(ino, bufsize); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0); + ASSERT_NE(MAP_FAILED, p) << strerror(errno); + + /* The file size should be automatically truncated */ + ASSERT_EQ(0, memcmp(p, CONTENTS, bufsize)); + ASSERT_EQ(0, fstat(fd, &sb)) << strerror(errno); + EXPECT_EQ((off_t)bufsize, sb.st_size); + + ASSERT_EQ(0, munmap(p, len)) << strerror(errno); + leak(fd); +} + +/* + * Just as when FOPEN_DIRECT_IO is used, reads with O_DIRECT should bypass + * cache and go straight to the daemon + */ +TEST_F(Read, o_direct) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + // Fill the cache + ASSERT_EQ(bufsize, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS,
bufsize)); + + // Reads with o_direct should bypass the cache + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + ASSERT_EQ(0, fcntl(fd, F_SETFL, O_DIRECT)) << strerror(errno); + ASSERT_EQ(0, lseek(fd, 0, SEEK_SET)) << strerror(errno); + ASSERT_EQ(bufsize, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + + leak(fd); +} + +TEST_F(Read, pread) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + /* + * Set offset to a maxbcachebuf boundary so we'll be sure what offset + * to read from. Without this, the read might start at a lower offset. + */ + uint64_t offset = m_maxbcachebuf; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + expect_lookup(RELPATH, ino, offset + bufsize); + expect_open(ino, 0, 1); + expect_read(ino, offset, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, pread(fd, buf, bufsize, offset)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + leak(fd); +} + +TEST_F(Read, read) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + + leak(fd); +} + +TEST_F(Read_7_8, read) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + + 
expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + expect_read(ino, 0, bufsize, bufsize, CONTENTS); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + + leak(fd); +} + +/* + * If caching is enabled, the kernel should try to read an entire cache block + * at a time. + */ +TEST_F(Read, cache_block) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS0 = "abcdefghijklmnop"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = 8; + ssize_t filesize = m_maxbcachebuf * 2; + char *contents; + char buf[bufsize]; + const char *contents1 = CONTENTS0 + bufsize; + + contents = (char*)calloc(1, filesize); + ASSERT_NE(nullptr, contents); + memmove(contents, CONTENTS0, strlen(CONTENTS0)); + + expect_lookup(RELPATH, ino, filesize); + expect_open(ino, 0, 1); + expect_read(ino, 0, m_maxbcachebuf, m_maxbcachebuf, + contents); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(bufsize, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS0, bufsize)); + + /* A subsequent read should be serviced by cache */ + ASSERT_EQ(bufsize, read(fd, buf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(buf, contents1, bufsize)); + leak(fd); +} + +/* Reading with sendfile should work (though it obviously won't be 0-copy) */ +TEST_F(Read, sendfile) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + size_t bufsize = strlen(CONTENTS); + char buf[bufsize]; + int sp[2]; + off_t sbytes; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh ==
Read::FH && + in.body.read.offset == 0 && + in.body.read.size == bufsize); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(struct fuse_out_header) + bufsize; + memmove(out.body.bytes, CONTENTS, bufsize); + }))); + + ASSERT_EQ(0, socketpair(PF_LOCAL, SOCK_STREAM, 0, sp)) + << strerror(errno); + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(0, sendfile(fd, sp[1], 0, bufsize, NULL, &sbytes, 0)) + << strerror(errno); + ASSERT_EQ(static_cast<ssize_t>(bufsize), read(sp[0], buf, bufsize)) + << strerror(errno); + ASSERT_EQ(0, memcmp(buf, CONTENTS, bufsize)); + + close(sp[1]); + close(sp[0]); + leak(fd); +} + +/* sendfile should fail gracefully if fuse declines the read */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236466 */ +TEST_F(Read, sendfile_eio) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + uint64_t ino = 42; + int fd; + ssize_t bufsize = strlen(CONTENTS); + int sp[2]; + off_t sbytes; + + expect_lookup(RELPATH, ino, bufsize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EIO))); + + ASSERT_EQ(0, socketpair(PF_LOCAL, SOCK_STREAM, 0, sp)) + << strerror(errno); + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + ASSERT_NE(0, sendfile(fd, sp[1], 0, bufsize, NULL, &sbytes, 0)); + + close(sp[1]); + close(sp[0]); + leak(fd); +} + +/* + * Sequential reads should use readahead. And if allowed, large reads should + * be clustered.
+ */ +TEST_P(ReadAhead, readahead) { + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd, maxcontig, clustersize; + ssize_t bufsize = 4 * m_maxbcachebuf; + ssize_t filesize = bufsize; + uint64_t len; + char *rbuf, *contents; + off_t offs; + + contents = (char*)malloc(filesize); + ASSERT_NE(nullptr, contents); + memset(contents, 'X', filesize); + rbuf = (char*)calloc(1, bufsize); + + expect_lookup(RELPATH, ino, filesize); + expect_open(ino, 0, 1); + maxcontig = m_noclusterr ? m_maxbcachebuf : + m_maxbcachebuf + m_maxreadahead; + clustersize = MIN(maxcontig, m_maxphys); + for (offs = 0; offs < bufsize; offs += clustersize) { + len = std::min((size_t)clustersize, (size_t)(filesize - offs)); + expect_read(ino, offs, len, len, contents + offs); + } + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + + /* Set the internal readahead counter to a "large" value */ + ASSERT_EQ(0, fcntl(fd, F_READAHEAD, 1'000'000'000)) << strerror(errno); + + ASSERT_EQ(bufsize, read(fd, rbuf, bufsize)) << strerror(errno); + ASSERT_EQ(0, memcmp(rbuf, contents, bufsize)); + + leak(fd); +} + +INSTANTIATE_TEST_CASE_P(RA, ReadAhead, + Values(tuple(false, 0), + tuple(false, 1), + tuple(false, 2), + tuple(false, 3), + tuple(true, 0), + tuple(true, 1), + tuple(true, 2))); Property changes on: head/tests/sys/fs/fusefs/read.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/readlink.cc =================================================================== --- head/tests/sys/fs/fusefs/readlink.cc (nonexistent) +++ head/tests/sys/fs/fusefs/readlink.cc (revision 350665) @@ -0,0 +1,123 @@ +/*- + * SPDX-License-Identifier: 
BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include + +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Readlink: public FuseTest { +public: +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFLNK | 0777, 0, 1); +} +void expect_readlink(uint64_t ino, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READLINK && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +}; + +class PushSymlinksIn: public Readlink { + virtual void SetUp() { + m_push_symlinks_in = true; + Readlink::SetUp(); + } +}; + +TEST_F(Readlink, eloop) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const uint64_t ino = 42; + char buf[80]; + + expect_lookup(RELPATH, ino); + expect_readlink(ino, ReturnErrno(ELOOP)); + + EXPECT_EQ(-1, readlink(FULLPATH, buf, sizeof(buf))); + EXPECT_EQ(ELOOP, errno); +} + +TEST_F(Readlink, ok) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char dst[] = "dst"; + const uint64_t ino = 42; + char buf[80]; + + expect_lookup(RELPATH, ino); + expect_readlink(ino, ReturnImmediate([=](auto in __unused, auto& out) { + strlcpy(out.body.str, dst, sizeof(out.body.str)); + out.header.len = sizeof(out.header) + strlen(dst) + 1; + })); + + EXPECT_EQ(static_cast<ssize_t>(strlen(dst)) + 1, + readlink(FULLPATH, buf, sizeof(buf))); + EXPECT_STREQ(dst, buf); +} + +TEST_F(PushSymlinksIn, readlink) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char dst[] = "/dst"; + const uint64_t ino = 42; + char buf[MAXPATHLEN], wd[MAXPATHLEN], want[MAXPATHLEN]; + int len; + + expect_lookup(RELPATH, ino); + expect_readlink(ino, ReturnImmediate([=](auto in __unused, auto& out) { + strlcpy(out.body.str, dst, sizeof(out.body.str)); + out.header.len = sizeof(out.header) + strlen(dst) + 1; + })); + + ASSERT_NE(nullptr, getcwd(wd, sizeof(wd))) <<
strerror(errno); + len = snprintf(want, sizeof(want), "%s/mountpoint%s", wd, dst); + ASSERT_LE(0, len) << strerror(errno); + + EXPECT_EQ(static_cast<ssize_t>(len) + 1, + readlink(FULLPATH, buf, sizeof(buf))); + EXPECT_STREQ(want, buf); +} Property changes on: head/tests/sys/fs/fusefs/readlink.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/releasedir.cc =================================================================== --- head/tests/sys/fs/fusefs/releasedir.cc (nonexistent) +++ head/tests/sys/fs/fusefs/releasedir.cc (revision 350665) @@ -0,0 +1,116 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class ReleaseDir: public FuseTest { + +public: +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFDIR | 0755, 0, 1); +} +}; + +/* If a file descriptor is duplicated, only the last close causes RELEASE */ +TEST_F(ReleaseDir, dup) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + DIR *dir, *dir2; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino && + in.body.readdir.offset == 0); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.error = 0; + out.header.len = sizeof(out.header); + }))); + expect_releasedir(ino, ReturnErrno(0)); + + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); + + dir2 = fdopendir(dup(dirfd(dir))); + ASSERT_NE(nullptr, dir2) << strerror(errno); + + ASSERT_EQ(0, closedir(dir)) << strerror(errno); + ASSERT_EQ(0, closedir(dir2)) << strerror(errno); +} + +TEST_F(ReleaseDir, ok) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + DIR *dir; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_releasedir(ino, 
ReturnErrno(0)); + + dir = opendir(FULLPATH); + ASSERT_NE(nullptr, dir) << strerror(errno); + + ASSERT_EQ(0, closedir(dir)) << strerror(errno); +} + +/* Directories opened O_EXEC should be properly released, too */ +TEST_F(ReleaseDir, o_exec) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_releasedir(ino, ReturnErrno(0)); + + fd = open(FULLPATH, O_EXEC | O_DIRECTORY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(0, close(fd)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/releasedir.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/setattr.cc =================================================================== --- head/tests/sys/fs/fusefs/setattr.cc (nonexistent) +++ head/tests/sys/fs/fusefs/setattr.cc (revision 350665) @@ -0,0 +1,774 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include + +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Setattr : public FuseTest {}; + +class RofsSetattr: public Setattr { +public: +virtual void SetUp() { + m_ro = true; + Setattr::SetUp(); +} +}; + +class Setattr_7_8: public Setattr { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + Setattr::SetUp(); +} +}; + + +/* + * If setattr returns a non-zero cache timeout, then subsequent VOP_GETATTRs + * should use the cached attributes, rather than query the daemon + */ +TEST_F(Setattr, attr_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + const mode_t newmode = 0644; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + 
).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | newmode; + out.body.attr.attr_valid = UINT64_MAX; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR); + }, Eq(true)), + _) + ).Times(0); + + /* Set an attribute with SETATTR */ + ASSERT_EQ(0, chmod(FULLPATH, newmode)) << strerror(errno); + + /* The stat(2) should use cached attributes */ + ASSERT_EQ(0, stat(FULLPATH, &sb)); + EXPECT_EQ(S_IFREG | newmode, sb.st_mode); +} + +/* Change the mode of a file */ +TEST_F(Setattr, chmod) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0755; + const mode_t newmode = 0644; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | oldmode; + out.body.entry.nodeid = ino; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + uint32_t valid = FATTR_MODE; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | newmode; + }))); + EXPECT_EQ(0, chmod(FULLPATH, newmode)) << strerror(errno); +} + +/* + * Chmod a multiply-linked file with cached attributes. Check that both files' + * attributes have changed. 
+ */ +TEST_F(Setattr, chmod_multiply_linked) +{ + const char FULLPATH0[] = "mountpoint/some_file.txt"; + const char RELPATH0[] = "some_file.txt"; + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + struct stat sb; + const uint64_t ino = 42; + const mode_t oldmode = 0777; + const mode_t newmode = 0666; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH0) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | oldmode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 2; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH1) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | oldmode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 2; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + uint32_t valid = FATTR_MODE; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = S_IFREG | newmode; + out.body.attr.attr.nlink = 2; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + /* Force a lookup of the 2nd file to get it into the cache */ + ASSERT_EQ(0, stat(FULLPATH1, &sb)) << strerror(errno); + EXPECT_EQ(S_IFREG | oldmode, sb.st_mode); + + ASSERT_EQ(0, chmod(FULLPATH0, newmode)) << strerror(errno); + ASSERT_EQ(0, stat(FULLPATH0, &sb)) << strerror(errno); + EXPECT_EQ(S_IFREG | newmode, sb.st_mode); + ASSERT_EQ(0, stat(FULLPATH1, &sb)) << strerror(errno); + EXPECT_EQ(S_IFREG | newmode, 
sb.st_mode); +} + + +/* Change the owner and group of a file */ +TEST_F(Setattr, chown) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const gid_t oldgroup = 66; + const gid_t newgroup = 99; + const uid_t olduser = 33; + const uid_t newuser = 44; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr.gid = oldgroup; + out.body.entry.attr.uid = olduser; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + uint32_t valid = FATTR_GID | FATTR_UID; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.uid == newuser && + in.body.setattr.gid == newgroup); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.uid = newuser; + out.body.attr.attr.gid = newgroup; + }))); + EXPECT_EQ(0, chown(FULLPATH, newuser, newgroup)) << strerror(errno); +} + + + +/* + * FUSE daemons are allowed to check permissions however they like. If the + * daemon returns EPERM, even if the file permissions "should" grant access, + * then fuse(4) should return EPERM too. 
+ */ +TEST_F(Setattr, eperm) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0777; + out.body.entry.nodeid = ino; + out.body.entry.attr.uid = in.header.uid; + out.body.entry.attr.gid = in.header.gid; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EPERM))); + EXPECT_NE(0, truncate(FULLPATH, 10)); + EXPECT_EQ(EPERM, errno); +} + +/* Change the mode of an open file, by its file descriptor */ +TEST_F(Setattr, fchmod) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + const mode_t oldmode = 0755; + const mode_t newmode = 0644; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | oldmode; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_MODE; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match 
nodeid + out.body.attr.attr.mode = S_IFREG | newmode; + }))); + + fd = open(FULLPATH, O_RDONLY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, fchmod(fd, newmode)) << strerror(errno); + leak(fd); +} + +/* Change the size of an open file, by its file descriptor */ +TEST_F(Setattr, ftruncate) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + uint64_t fh = 0xdeadbeef1a7ebabe; + const off_t oldsize = 99; + const off_t newsize = 12345; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0755; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.size = oldsize; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + out.body.open.fh = fh; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_SIZE | FATTR_FH; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.fh == fh); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0755; + out.body.attr.attr.size = newsize; + }))); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, ftruncate(fd, newsize)) << strerror(errno); + leak(fd); +} + +/* Change the size of the file */ +TEST_F(Setattr, truncate) { + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino 
= 42; + const uint64_t oldsize = 100'000'000; + const uint64_t newsize = 20'000'000; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr.size = oldsize; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + uint32_t valid = FATTR_SIZE; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.size == newsize); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.size = newsize; + }))); + EXPECT_EQ(0, truncate(FULLPATH, newsize)) << strerror(errno); +} + +/* + * Truncating a file should discard cached data past the truncation point. + * This is a regression test for bug 233783. + * + * There are two distinct failure modes. The first one is a failure to zero + * the portion of the file's final buffer past EOF. It can be reproduced by + * fsx -WR -P /tmp -S10 fsx.bin + * + * The second is a failure to drop buffers beyond that. 
It can be reproduced by + * fsx -WR -P /tmp -S18 -n fsx.bin + * Also reproducible in sh with: + * $> /path/to/libfuse/build/example/passthrough -d /tmp/mnt + * $> cd /tmp/mnt/tmp + * $> dd if=/dev/random of=randfile bs=1k count=192 + * $> truncate -s 1k randfile && truncate -s 192k randfile + * $> xxd randfile | less # xxd will wrongly show random data at offset 0x8000 + */ +TEST_F(Setattr, truncate_discards_cached_data) { + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + void *w0buf, *r0buf, *r1buf, *expected; + off_t w0_offset = 0; + size_t w0_size = 0x30000; + off_t r0_offset = 0; + off_t r0_size = w0_size; + size_t trunc0_size = 0x400; + size_t trunc1_size = w0_size; + off_t r1_offset = trunc0_size; + off_t r1_size = w0_size - trunc0_size; + size_t cur_size = 0; + const uint64_t ino = 42; + mode_t mode = S_IFREG | 0644; + int fd, r; + bool should_have_data = false; + + w0buf = malloc(w0_size); + ASSERT_NE(nullptr, w0buf) << strerror(errno); + memset(w0buf, 'X', w0_size); + + r0buf = malloc(r0_size); + ASSERT_NE(nullptr, r0buf) << strerror(errno); + r1buf = malloc(r1_size); + ASSERT_NE(nullptr, r1buf) << strerror(errno); + + expected = malloc(r1_size); + ASSERT_NE(nullptr, expected) << strerror(errno); + memset(expected, 0, r1_size); + + expect_lookup(RELPATH, ino, mode, 0, 1); + expect_open(ino, O_RDWR, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([&](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = mode; + out.body.attr.attr.size = cur_size; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_WRITE); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([&](auto in, auto& out) { + SET_OUT_HEADER_LEN(out, write); + 
out.body.attr.attr.ino = ino; + out.body.write.size = in.body.write.size; + cur_size = std::max(static_cast<uint64_t>(cur_size), + in.body.write.size + in.body.write.offset); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + (in.body.setattr.valid & FATTR_SIZE)); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([&](auto in, auto& out) { + auto trunc_size = in.body.setattr.size; + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; + out.body.attr.attr.mode = mode; + out.body.attr.attr.size = trunc_size; + cur_size = trunc_size; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([&](auto in, auto& out) { + auto osize = std::min( + static_cast<uint64_t>(cur_size) - in.body.read.offset, + static_cast<uint64_t>(in.body.read.size)); + out.header.len = sizeof(struct fuse_out_header) + osize; + if (should_have_data) + memset(out.body.bytes, 'X', osize); + else + bzero(out.body.bytes, osize); + }))); + + fd = open(FULLPATH, O_RDWR, 0644); + ASSERT_LE(0, fd) << strerror(errno); + + /* Fill the file with Xs */ + ASSERT_EQ(static_cast<ssize_t>(w0_size), + pwrite(fd, w0buf, w0_size, w0_offset)); + should_have_data = true; + /* Fill the cache */ + ASSERT_EQ(static_cast<ssize_t>(r0_size), + pread(fd, r0buf, r0_size, r0_offset)); + /* 1st truncate should discard cached data */ + EXPECT_EQ(0, ftruncate(fd, trunc0_size)) << strerror(errno); + should_have_data = false; + /* 2nd truncate extends file into previously cached data */ + EXPECT_EQ(0, ftruncate(fd, trunc1_size)) << strerror(errno); + /* Read should return all zeros */ + ASSERT_EQ(static_cast<ssize_t>(r1_size), + pread(fd, r1buf, r1_size, r1_offset)); + + r = memcmp(expected, r1buf, r1_size); + ASSERT_EQ(0, r); + + free(expected); + free(r1buf); + free(r0buf); + free(w0buf); +} + +/* Change a file's timestamps */ +TEST_F(Setattr, utimensat) { 
+ const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const timespec oldtimes[2] = { + {.tv_sec = 1, .tv_nsec = 2}, + {.tv_sec = 3, .tv_nsec = 4}, + }; + const timespec newtimes[2] = { + {.tv_sec = 5, .tv_nsec = 6}, + {.tv_sec = 7, .tv_nsec = 8}, + }; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.atime = oldtimes[0].tv_sec; + out.body.entry.attr.atimensec = oldtimes[0].tv_nsec; + out.body.entry.attr.mtime = oldtimes[1].tv_sec; + out.body.entry.attr.mtimensec = oldtimes[1].tv_nsec; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_ATIME | FATTR_MTIME; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + (time_t)in.body.setattr.atime == + newtimes[0].tv_sec && + in.body.setattr.atimensec == + newtimes[0].tv_nsec && + (time_t)in.body.setattr.mtime == + newtimes[1].tv_sec && + in.body.setattr.mtimensec == + newtimes[1].tv_nsec); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.atime = newtimes[0].tv_sec; + out.body.attr.attr.atimensec = newtimes[0].tv_nsec; + out.body.attr.attr.mtime = newtimes[1].tv_sec; + out.body.attr.attr.mtimensec = newtimes[1].tv_nsec; + }))); + EXPECT_EQ(0, utimensat(AT_FDCWD, FULLPATH, &newtimes[0], 0)) + << strerror(errno); +} + +/* Change a file mtime but not its atime */ +TEST_F(Setattr, utimensat_mtime_only) { + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const 
timespec oldtimes[2] = { + {.tv_sec = 1, .tv_nsec = 2}, + {.tv_sec = 3, .tv_nsec = 4}, + }; + const timespec newtimes[2] = { + {.tv_sec = 5, .tv_nsec = UTIME_OMIT}, + {.tv_sec = 7, .tv_nsec = 8}, + }; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.atime = oldtimes[0].tv_sec; + out.body.entry.attr.atimensec = oldtimes[0].tv_nsec; + out.body.entry.attr.mtime = oldtimes[1].tv_sec; + out.body.entry.attr.mtimensec = oldtimes[1].tv_nsec; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_MTIME; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + (time_t)in.body.setattr.mtime == + newtimes[1].tv_sec && + in.body.setattr.mtimensec == + newtimes[1].tv_nsec); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.atime = oldtimes[0].tv_sec; + out.body.attr.attr.atimensec = oldtimes[0].tv_nsec; + out.body.attr.attr.mtime = newtimes[1].tv_sec; + out.body.attr.attr.mtimensec = newtimes[1].tv_nsec; + }))); + EXPECT_EQ(0, utimensat(AT_FDCWD, FULLPATH, &newtimes[0], 0)) + << strerror(errno); +} + +/* + * Set a file's mtime and atime to now + * + * The design of FreeBSD's VFS does not allow fusefs to set just one of atime + * or mtime to UTIME_NOW; it's both or neither. 
+ */ +TEST_F(Setattr, utimensat_utime_now) { + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const timespec oldtimes[2] = { + {.tv_sec = 1, .tv_nsec = 2}, + {.tv_sec = 3, .tv_nsec = 4}, + }; + const timespec newtimes[2] = { + {.tv_sec = 0, .tv_nsec = UTIME_NOW}, + {.tv_sec = 0, .tv_nsec = UTIME_NOW}, + }; + /* "now" is whatever the server says it is */ + const timespec now[2] = { + {.tv_sec = 5, .tv_nsec = 7}, + {.tv_sec = 6, .tv_nsec = 8}, + }; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.atime = oldtimes[0].tv_sec; + out.body.entry.attr.atimensec = oldtimes[0].tv_nsec; + out.body.entry.attr.mtime = oldtimes[1].tv_sec; + out.body.entry.attr.mtimensec = oldtimes[1].tv_nsec; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + uint32_t valid = FATTR_ATIME | FATTR_ATIME_NOW | + FATTR_MTIME | FATTR_MTIME_NOW; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.atime = now[0].tv_sec; + out.body.attr.attr.atimensec = now[0].tv_nsec; + out.body.attr.attr.mtime = now[1].tv_sec; + out.body.attr.attr.mtimensec = now[1].tv_nsec; + out.body.attr.attr_valid = UINT64_MAX; + }))); + ASSERT_EQ(0, utimensat(AT_FDCWD, FULLPATH, &newtimes[0], 0)) + << strerror(errno); + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(now[0].tv_sec, sb.st_atim.tv_sec); + EXPECT_EQ(now[0].tv_nsec, 
sb.st_atim.tv_nsec); + EXPECT_EQ(now[1].tv_sec, sb.st_mtim.tv_sec); + EXPECT_EQ(now[1].tv_nsec, sb.st_mtim.tv_nsec); +} + +/* On a read-only mount, no attributes may be changed */ +TEST_F(RofsSetattr, erofs) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0755; + const mode_t newmode = 0644; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | oldmode; + out.body.entry.nodeid = ino; + }))); + + ASSERT_EQ(-1, chmod(FULLPATH, newmode)); + ASSERT_EQ(EROFS, errno); +} + +/* Change the mode of a file */ +TEST_F(Setattr_7_8, chmod) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0755; + const mode_t newmode = 0644; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.attr.mode = S_IFREG | oldmode; + out.body.entry.nodeid = ino; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + uint32_t valid = FATTR_MODE; + return (in.header.opcode == FUSE_SETATTR && + in.header.nodeid == ino && + in.body.setattr.valid == valid && + in.body.setattr.mode == newmode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr_7_8); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | newmode; + }))); + EXPECT_EQ(0, chmod(FULLPATH, newmode)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/setattr.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of 
property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/statfs.cc =================================================================== --- head/tests/sys/fs/fusefs/statfs.cc (nonexistent) +++ head/tests/sys/fs/fusefs/statfs.cc (revision 350665) @@ -0,0 +1,171 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Statfs: public FuseTest {}; + +TEST_F(Statfs, eio) +{ + struct statfs statbuf; + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EIO))); + + ASSERT_NE(0, statfs("mountpoint", &statbuf)); + ASSERT_EQ(EIO, errno); +} + +/* + * When the daemon is dead but the filesystem is still mounted, fuse(4) fakes + * the statfs(2) response, which is necessary for unmounting. + */ +TEST_F(Statfs, enotconn) +{ + struct statfs statbuf; + char mp[PATH_MAX]; + + m_mock->kill_daemon(); + + ASSERT_NE(nullptr, getcwd(mp, PATH_MAX)) << strerror(errno); + strlcat(mp, "/mountpoint", PATH_MAX); + ASSERT_EQ(0, statfs("mountpoint", &statbuf)) << strerror(errno); + + EXPECT_EQ(getuid(), statbuf.f_owner); + EXPECT_EQ(0, strcmp("fusefs", statbuf.f_fstypename)); + EXPECT_EQ(0, strcmp("/dev/fuse", statbuf.f_mntfromname)); + EXPECT_EQ(0, strcmp(mp, statbuf.f_mntonname)); +} + +static void* statfs_th(void* arg) { + ssize_t r; + struct statfs *sb = (struct statfs*)arg; + + r = statfs("mountpoint", sb); + if (r >= 0) + return 0; + else + return (void*)(intptr_t)errno; +} + +/* + * Like the enotconn test, but in this case the daemon dies after we send the + * FUSE_STATFS operation but before we get a response. 
+ */ +TEST_F(Statfs, enotconn_while_blocked) +{ + struct statfs statbuf; + void *thr0_value; + pthread_t th0; + char mp[PATH_MAX]; + sem_t sem; + + ASSERT_EQ(0, sem_init(&sem, 0, 0)) << strerror(errno); + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillOnce(Invoke([&](auto in __unused, auto &out __unused) { + sem_post(&sem); + /* Just block until the daemon dies */ + })); + + ASSERT_NE(nullptr, getcwd(mp, PATH_MAX)) << strerror(errno); + strlcat(mp, "/mountpoint", PATH_MAX); + ASSERT_EQ(0, pthread_create(&th0, NULL, statfs_th, (void*)&statbuf)) + << strerror(errno); + + ASSERT_EQ(0, sem_wait(&sem)) << strerror(errno); + m_mock->kill_daemon(); + + pthread_join(th0, &thr0_value); + ASSERT_EQ(0, (intptr_t)thr0_value); + + EXPECT_EQ(getuid(), statbuf.f_owner); + EXPECT_EQ(0, strcmp("fusefs", statbuf.f_fstypename)); + EXPECT_EQ(0, strcmp("/dev/fuse", statbuf.f_mntfromname)); + EXPECT_EQ(0, strcmp(mp, statbuf.f_mntonname)); +} + +TEST_F(Statfs, ok) +{ + struct statfs statbuf; + char mp[PATH_MAX]; + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, statfs); + out.body.statfs.st.blocks = 1000; + out.body.statfs.st.bfree = 100; + out.body.statfs.st.bavail = 200; + out.body.statfs.st.files = 5; + out.body.statfs.st.ffree = 6; + out.body.statfs.st.namelen = 128; + out.body.statfs.st.frsize = 1024; + }))); + + ASSERT_NE(nullptr, getcwd(mp, PATH_MAX)) << strerror(errno); + strlcat(mp, "/mountpoint", PATH_MAX); + ASSERT_EQ(0, statfs("mountpoint", &statbuf)) << strerror(errno); + EXPECT_EQ(1024ul, statbuf.f_bsize); + /* + * fuse(4) ignores the filesystem's reported optimal transfer size, and + * chooses a size that works well with the rest of the system instead + */ + EXPECT_EQ(1000ul, statbuf.f_blocks); + EXPECT_EQ(100ul, 
statbuf.f_bfree); + EXPECT_EQ(200l, statbuf.f_bavail); + EXPECT_EQ(5ul, statbuf.f_files); + EXPECT_EQ(6l, statbuf.f_ffree); + EXPECT_EQ(128u, statbuf.f_namemax); + EXPECT_EQ(getuid(), statbuf.f_owner); + EXPECT_EQ(0, strcmp("fusefs", statbuf.f_fstypename)); + EXPECT_EQ(0, strcmp("/dev/fuse", statbuf.f_mntfromname)); + EXPECT_EQ(0, strcmp(mp, statbuf.f_mntonname)); +} Property changes on: head/tests/sys/fs/fusefs/statfs.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/xattr.cc =================================================================== --- head/tests/sys/fs/fusefs/xattr.cc (nonexistent) +++ head/tests/sys/fs/fusefs/xattr.cc (revision 350665) @@ -0,0 +1,641 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* Tests for all things relating to extended attributes and FUSE */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +const char FULLPATH[] = "mountpoint/some_file.txt"; +const char RELPATH[] = "some_file.txt"; + +class Xattr: public FuseTest { +public: +void expect_getxattr(uint64_t ino, const char *attr, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *a = (const char*)in.body.bytes + + sizeof(fuse_getxattr_in); + return (in.header.opcode == FUSE_GETXATTR && + in.header.nodeid == ino && + 0 == strcmp(attr, a)); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +void expect_listxattr(uint64_t ino, uint32_t size, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_LISTXATTR && + in.header.nodeid == ino && + in.body.listxattr.size == size); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)) + .RetiresOnSaturation(); +} + +void expect_removexattr(uint64_t ino, const char *attr, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *a = (const char*)in.body.bytes; + return (in.header.opcode == FUSE_REMOVEXATTR && + in.header.nodeid == ino && + 0 == strcmp(attr, a)); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} + +void expect_setxattr(uint64_t ino, const char *attr, const char *value, + ProcessMockerT r) +{ + 
EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *a = (const char*)in.body.bytes + + sizeof(fuse_setxattr_in); + const char *v = a + strlen(a) + 1; + return (in.header.opcode == FUSE_SETXATTR && + in.header.nodeid == ino && + 0 == strcmp(attr, a) && + 0 == strcmp(value, v)); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +}; + +class Getxattr: public Xattr {}; +class Listxattr: public Xattr {}; +class Removexattr: public Xattr {}; +class Setxattr: public Xattr {}; +class RofsXattr: public Xattr { +public: +virtual void SetUp() { + m_ro = true; + Xattr::SetUp(); +} +}; + +/* + * If the extended attribute does not exist on this file, the daemon should + * return ENOATTR (ENODATA on Linux, but it's up to the daemon to choose the + * correct error code) + */ +TEST_F(Getxattr, enoattr) +{ + char data[80]; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_getxattr(ino, "user.foo", ReturnErrno(ENOATTR)); + + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(-1, r); + ASSERT_EQ(ENOATTR, errno); +} + +/* + * If the filesystem returns ENOSYS, then it will be treated as a permanent + * failure and all future VOP_GETEXTATTR calls will fail with EOPNOTSUPP + * without querying the filesystem daemon + */ +TEST_F(Getxattr, enosys) +{ + char data[80]; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + expect_getxattr(ino, "user.foo", ReturnErrno(ENOSYS)); + + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(-1, r); + EXPECT_EQ(EOPNOTSUPP, errno); + + /* Subsequent attempts should not query the filesystem at all */ + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(-1, r); + EXPECT_EQ(EOPNOTSUPP, errno); +} + +/* + * On FreeBSD, if the user passes an insufficiently large buffer then the + * filesystem is supposed to copy as 
much of the attribute's value as will fit. + * + * On Linux, however, the filesystem is supposed to return ERANGE. + * + * libfuse specifies the Linux behavior. However, that's probably an error. + * It would probably be correct for the filesystem to use platform-dependent + * behavior. + * + * This test case covers a filesystem that uses the Linux behavior + */ +TEST_F(Getxattr, erange) +{ + char data[10]; + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_getxattr(ino, "user.foo", ReturnErrno(ERANGE)); + + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(-1, r); + ASSERT_EQ(ERANGE, errno); +} + +/* + * If the user passes a 0-length buffer, then the daemon should just return the + * size of the attribute + */ +TEST_F(Getxattr, size_only) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_getxattr(ino, "user.foo", + ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, getxattr); + out.body.getxattr.size = 99; + }) + ); + + ASSERT_EQ(99, extattr_get_file(FULLPATH, ns, "foo", NULL, 0)) + << strerror(errno); +} + +/* + * Successfully get an attribute from the system namespace + */ +TEST_F(Getxattr, system) +{ + uint64_t ino = 42; + char data[80]; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_SYSTEM; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_getxattr(ino, "system.foo", + ReturnImmediate([&](auto in __unused, auto& out) { + memcpy((void*)out.body.bytes, value, value_len); + out.header.len = sizeof(out.header) + value_len; + }) + ); + + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(value_len, r) << strerror(errno); + EXPECT_STREQ(value, data); +} + +/* + * Successfully get an attribute from the user namespace + */ +TEST_F(Getxattr, user) +{ + 
uint64_t ino = 42; + char data[80]; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_getxattr(ino, "user.foo", + ReturnImmediate([&](auto in __unused, auto& out) { + memcpy((void*)out.body.bytes, value, value_len); + out.header.len = sizeof(out.header) + value_len; + }) + ); + + r = extattr_get_file(FULLPATH, ns, "foo", data, sizeof(data)); + ASSERT_EQ(value_len, r) << strerror(errno); + EXPECT_STREQ(value, data); +} + +/* + * If the filesystem returns ENOSYS, then it will be treated as a permanent + * failure and all future VOP_LISTEXTATTR calls will fail with EOPNOTSUPP + * without querying the filesystem daemon + */ +TEST_F(Listxattr, enosys) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + expect_listxattr(ino, 0, ReturnErrno(ENOSYS)); + + ASSERT_EQ(-1, extattr_list_file(FULLPATH, ns, NULL, 0)); + EXPECT_EQ(EOPNOTSUPP, errno); + + /* Subsequent attempts should not query the filesystem at all */ + ASSERT_EQ(-1, extattr_list_file(FULLPATH, ns, NULL, 0)); + EXPECT_EQ(EOPNOTSUPP, errno); +} + +/* + * Listing extended attributes failed because they aren't configured on this + * filesystem + */ +TEST_F(Listxattr, enotsup) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, ReturnErrno(ENOTSUP)); + + ASSERT_EQ(-1, extattr_list_file(FULLPATH, ns, NULL, 0)); + ASSERT_EQ(ENOTSUP, errno); +} + +/* + * On FreeBSD, if the user passes an insufficiently large buffer then the + * filesystem is supposed to copy as much of the attribute's value as will fit. + * + * On Linux, however, the filesystem is supposed to return ERANGE. + * + * libfuse specifies the Linux behavior. However, that's probably an error. 
+ * It would probably be correct for the filesystem to use platform-dependent + * behavior. + * + * This test case covers a filesystem that uses the Linux behavior + */ +TEST_F(Listxattr, erange) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, ReturnErrno(ERANGE)); + + ASSERT_EQ(-1, extattr_list_file(FULLPATH, ns, NULL, 0)); + ASSERT_EQ(ERANGE, errno); +} + +/* + * Get the size of the list that it would take to list no extended attributes + */ +TEST_F(Listxattr, size_only_empty) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, ReturnImmediate([](auto i __unused, auto& out) { + out.body.listxattr.size = 0; + SET_OUT_HEADER_LEN(out, listxattr); + })); + + ASSERT_EQ(0, extattr_list_file(FULLPATH, ns, NULL, 0)) + << strerror(errno); +} + +/* + * Get the size of the list that it would take to list some extended + * attributes. Due to the format differences between a FreeBSD and a + * Linux/FUSE extended attribute list, fuse(4) will actually allocate a buffer + * and get the whole list, then convert it, just to figure out its size. + */ +TEST_F(Listxattr, size_only_nonempty) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, ReturnImmediate([](auto i __unused, auto& out) { + out.body.listxattr.size = 45; + SET_OUT_HEADER_LEN(out, listxattr); + })); + + // TODO: fix the expected size after fixing the size calculation bug in + // fuse_vnop_listextattr. It should be exactly 45. 
+ expect_listxattr(ino, 53, + ReturnImmediate([](auto in __unused, auto& out) { + const char l[] = "user.foo"; + strlcpy((char*)out.body.bytes, l, + sizeof(out.body.bytes)); + out.header.len = sizeof(fuse_out_header) + sizeof(l); + }) + ); + + ASSERT_EQ(4, extattr_list_file(FULLPATH, ns, NULL, 0)) + << strerror(errno); +} + +TEST_F(Listxattr, size_only_really_big) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, ReturnImmediate([](auto i __unused, auto& out) { + out.body.listxattr.size = 16000; + SET_OUT_HEADER_LEN(out, listxattr); + })); + + // TODO: fix the expected size after fixing the size calculation bug in + // fuse_vnop_listextattr. It should be exactly 16000. + expect_listxattr(ino, 16008, + ReturnImmediate([](auto in __unused, auto& out) { + const char l[16] = "user.foobarbang"; + for (int i=0; i < 1000; i++) { + memcpy(&out.body.bytes[16 * i], l, 16); + } + out.header.len = sizeof(fuse_out_header) + 16000; + }) + ); + + ASSERT_EQ(11000, extattr_list_file(FULLPATH, ns, NULL, 0)) + << strerror(errno); +} + +/* + * List all of the user attributes of a file which has both user and system + * attributes + */ +TEST_F(Listxattr, user) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + char data[80]; + char expected[9] = {3, 'f', 'o', 'o', 4, 'b', 'a', 'n', 'g'}; + char attrs[28] = "user.foo\0system.x\0user.bang"; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, + ReturnImmediate([&](auto in __unused, auto& out) { + out.body.listxattr.size = sizeof(attrs); + SET_OUT_HEADER_LEN(out, listxattr); + }) + ); + + // TODO: fix the expected size after fixing the size calculation bug in + // fuse_vnop_listextattr. 
+ expect_listxattr(ino, sizeof(attrs) + 8, + ReturnImmediate([&](auto in __unused, auto& out) { + memcpy((void*)out.body.bytes, attrs, sizeof(attrs)); + out.header.len = sizeof(fuse_out_header) + sizeof(attrs); + })); + + ASSERT_EQ(static_cast<ssize_t>(sizeof(expected)), + extattr_list_file(FULLPATH, ns, data, sizeof(data))) + << strerror(errno); + ASSERT_EQ(0, memcmp(expected, data, sizeof(expected))); +} + +/* + * List all of the system attributes of a file which has both user and system + * attributes + */ +TEST_F(Listxattr, system) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_SYSTEM; + char data[80]; + char expected[2] = {1, 'x'}; + char attrs[28] = "user.foo\0system.x\0user.bang"; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_listxattr(ino, 0, + ReturnImmediate([&](auto in __unused, auto& out) { + out.body.listxattr.size = sizeof(attrs); + SET_OUT_HEADER_LEN(out, listxattr); + }) + ); + + // TODO: fix the expected size after fixing the size calculation bug in + // fuse_vnop_listextattr. 
+ expect_listxattr(ino, sizeof(attrs) + 8, + ReturnImmediate([&](auto in __unused, auto& out) { + memcpy((void*)out.body.bytes, attrs, sizeof(attrs)); + out.header.len = sizeof(fuse_out_header) + sizeof(attrs); + })); + + ASSERT_EQ(static_cast<ssize_t>(sizeof(expected)), + extattr_list_file(FULLPATH, ns, data, sizeof(data))) + << strerror(errno); + ASSERT_EQ(0, memcmp(expected, data, sizeof(expected))); +} + +/* Fail to remove a nonexistent attribute */ +TEST_F(Removexattr, enoattr) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_removexattr(ino, "user.foo", ENOATTR); + + ASSERT_EQ(-1, extattr_delete_file(FULLPATH, ns, "foo")); + ASSERT_EQ(ENOATTR, errno); +} + +/* + * If the filesystem returns ENOSYS, then it will be treated as a permanent + * failure and all future VOP_DELETEEXTATTR calls will fail with EOPNOTSUPP + * without querying the filesystem daemon + */ +TEST_F(Removexattr, enosys) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + expect_removexattr(ino, "user.foo", ENOSYS); + + ASSERT_EQ(-1, extattr_delete_file(FULLPATH, ns, "foo")); + EXPECT_EQ(EOPNOTSUPP, errno); + + /* Subsequent attempts should not query the filesystem at all */ + ASSERT_EQ(-1, extattr_delete_file(FULLPATH, ns, "foo")); + EXPECT_EQ(EOPNOTSUPP, errno); +} + +/* Successfully remove a user xattr */ +TEST_F(Removexattr, user) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_removexattr(ino, "user.foo", 0); + + ASSERT_EQ(0, extattr_delete_file(FULLPATH, ns, "foo")) + << strerror(errno); +} + +/* Successfully remove a system xattr */ +TEST_F(Removexattr, system) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_SYSTEM; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_removexattr(ino, "system.foo", 0); + + ASSERT_EQ(0, extattr_delete_file(FULLPATH, ns, "foo")) + << 
strerror(errno); +} + +/* + * If the filesystem returns ENOSYS, then it will be treated as a permanent + * failure and all future VOP_SETEXTATTR calls will fail with EOPNOTSUPP + * without querying the filesystem daemon + */ +TEST_F(Setxattr, enosys) +{ + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + expect_setxattr(ino, "user.foo", value, ReturnErrno(ENOSYS)); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(-1, r); + EXPECT_EQ(EOPNOTSUPP, errno); + + /* Subsequent attempts should not query the filesystem at all */ + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(-1, r); + EXPECT_EQ(EOPNOTSUPP, errno); +} + +/* + * SETXATTR will return ENOTSUP if the namespace is invalid or the filesystem + * as currently configured doesn't support extended attributes. + */ +TEST_F(Setxattr, enotsup) +{ + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_setxattr(ino, "user.foo", value, ReturnErrno(ENOTSUP)); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(-1, r); + EXPECT_EQ(ENOTSUP, errno); +} + +/* + * Successfully set a user attribute. + */ +TEST_F(Setxattr, user) +{ + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_setxattr(ino, "user.foo", value, ReturnErrno(0)); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(value_len, r) << strerror(errno); +} + +/* + * Successfully set a system attribute. 
+ */ +TEST_F(Setxattr, system) +{ + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_SYSTEM; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + expect_setxattr(ino, "system.foo", value, ReturnErrno(0)); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(value_len, r) << strerror(errno); +} + +TEST_F(RofsXattr, deleteextattr_erofs) +{ + uint64_t ino = 42; + int ns = EXTATTR_NAMESPACE_USER; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + + ASSERT_EQ(-1, extattr_delete_file(FULLPATH, ns, "foo")); + ASSERT_EQ(EROFS, errno); +} + +TEST_F(RofsXattr, setextattr_erofs) +{ + uint64_t ino = 42; + const char value[] = "whatever"; + ssize_t value_len = strlen(value) + 1; + int ns = EXTATTR_NAMESPACE_USER; + ssize_t r; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + + r = extattr_set_file(FULLPATH, ns, "foo", (const void*)value, + value_len); + ASSERT_EQ(-1, r); + EXPECT_EQ(EROFS, errno); +} Property changes on: head/tests/sys/fs/fusefs/xattr.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/utils.cc =================================================================== --- head/tests/sys/fs/fusefs/utils.cc (nonexistent) +++ head/tests/sys/fs/fusefs/utils.cc (revision 350665) @@ -0,0 +1,593 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +} + +#include + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +/* + * The default max_write is set to this formula in libfuse, though + * individual filesystems can lower it. The "- 4096" was added in + * commit 154ffe2, with the commit message "fix". + */ +const uint32_t libfuse_max_write = 32 * getpagesize() + 0x1000 - 4096; + +/* + * Set the default max_write to a distinct value from MAXPHYS to catch bugs + * that confuse the two. 
+ */ +const uint32_t default_max_write = MIN(libfuse_max_write, MAXPHYS / 2); + + +/* Check that fusefs(4) is accessible and the current user can mount(2) */ +void check_environment() +{ + const char *devnode = "/dev/fuse"; + const char *usermount_node = "vfs.usermount"; + int usermount_val = 0; + size_t usermount_size = sizeof(usermount_val); + if (eaccess(devnode, R_OK | W_OK)) { + if (errno == ENOENT) { + GTEST_SKIP() << devnode << " does not exist"; + } else if (errno == EACCES) { + GTEST_SKIP() << devnode << + " is not accessible by the current user"; + } else { + GTEST_SKIP() << strerror(errno); + } + } + sysctlbyname(usermount_node, &usermount_val, &usermount_size, + NULL, 0); + if (geteuid() != 0 && !usermount_val) + GTEST_SKIP() << "current user is not allowed to mount"; +} + +class FuseEnv: public Environment { + virtual void SetUp() { + } +}; + +void FuseTest::SetUp() { + const char *maxbcachebuf_node = "vfs.maxbcachebuf"; + const char *maxphys_node = "kern.maxphys"; + int val = 0; + size_t size = sizeof(val); + + /* + * XXX check_environment should be called from FuseEnv::SetUp, but + * can't due to https://github.com/google/googletest/issues/2189 + */ + check_environment(); + if (IsSkipped()) + return; + + ASSERT_EQ(0, sysctlbyname(maxbcachebuf_node, &val, &size, NULL, 0)) + << strerror(errno); + m_maxbcachebuf = val; + ASSERT_EQ(0, sysctlbyname(maxphys_node, &val, &size, NULL, 0)) + << strerror(errno); + m_maxphys = val; + + try { + m_mock = new MockFS(m_maxreadahead, m_allow_other, + m_default_permissions, m_push_symlinks_in, m_ro, + m_pm, m_init_flags, m_kernel_minor_version, + m_maxwrite, m_async, m_noclusterr, m_time_gran, + m_nointr); + /* + * FUSE_ACCESS is called almost universally. Expecting it in + * each test case would be super-annoying. Instead, set a + * default expectation for FUSE_ACCESS and return ENOSYS. + * + * Individual test cases can override this expectation since + * googlemock evaluates expectations in LIFO order. 
+ */ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_ACCESS); + }, Eq(true)), + _) + ).Times(AnyNumber()) + .WillRepeatedly(Invoke(ReturnErrno(ENOSYS))); + /* + * FUSE_BMAP is called for most test cases that read data. Set + * a default expectation and return ENOSYS. + * + * Individual test cases can override this expectation since + * googlemock evaluates expectations in LIFO order. + */ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_BMAP); + }, Eq(true)), + _) + ).Times(AnyNumber()) + .WillRepeatedly(Invoke(ReturnErrno(ENOSYS))); + } catch (std::system_error err) { + FAIL() << err.what(); + } +} + +void +FuseTest::expect_access(uint64_t ino, mode_t access_mode, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_ACCESS && + in.header.nodeid == ino && + in.body.access.mask == access_mode); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} + +void +FuseTest::expect_destroy(int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_DESTROY); + }, Eq(true)), + _) + ).WillOnce(Invoke( ReturnImmediate([&](auto in, auto& out) { + m_mock->m_quit = true; + out.header.len = sizeof(out.header); + out.header.unique = in.header.unique; + out.header.error = -error; + }))); +} + +void +FuseTest::expect_flush(uint64_t ino, int times, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FLUSH && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(times) + .WillRepeatedly(Invoke(r)); +} + +void +FuseTest::expect_forget(uint64_t ino, uint64_t nlookup, sem_t *sem) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FORGET && + in.header.nodeid == ino && + in.body.forget.nlookup == nlookup); + }, Eq(true)), + _) + ).WillOnce(Invoke([=](auto in __unused, auto &out 
__unused) { + if (sem != NULL) + sem_post(sem); + /* FUSE_FORGET has no response! */ + })); +} + +void FuseTest::expect_getattr(uint64_t ino, uint64_t size) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.size = size; + out.body.attr.attr_valid = UINT64_MAX; + }))); +} + +void FuseTest::expect_lookup(const char *relpath, uint64_t ino, mode_t mode, + uint64_t size, int times, uint64_t attr_valid, uid_t uid, gid_t gid) +{ + EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) + .Times(times) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = attr_valid; + out.body.entry.attr.size = size; + out.body.entry.attr.uid = uid; + out.body.entry.attr.gid = gid; + }))); +} + +void FuseTest::expect_lookup_7_8(const char *relpath, uint64_t ino, mode_t mode, + uint64_t size, int times, uint64_t attr_valid, uid_t uid, gid_t gid) +{ + EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) + .Times(times) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = attr_valid; + out.body.entry.attr.size = size; + out.body.entry.attr.uid = uid; + out.body.entry.attr.gid = gid; + }))); +} + +void FuseTest::expect_open(uint64_t ino, uint32_t flags, int times) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(times) + 
.WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + out.body.open.fh = FH; + out.body.open.open_flags = flags; + }))); +} + +void FuseTest::expect_opendir(uint64_t ino) +{ + /* opendir(3) calls fstatfs */ + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke( + ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, statfs); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPENDIR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + out.body.open.fh = FH; + }))); +} + +void FuseTest::expect_read(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, const void *contents, int flags) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READ && + in.header.nodeid == ino && + in.body.read.fh == FH && + in.body.read.offset == offset && + in.body.read.size == isize && + (flags == -1 ? 
+ (in.body.read.flags == O_RDONLY || + in.body.read.flags == O_RDWR) + : in.body.read.flags == (uint32_t)flags)); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.len = sizeof(struct fuse_out_header) + osize; + memmove(out.body.bytes, contents, osize); + }))).RetiresOnSaturation(); +} + +void FuseTest::expect_readdir(uint64_t ino, uint64_t off, + std::vector<struct dirent> &ents) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino && + in.body.readdir.fh == FH && + in.body.readdir.offset == off); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in, auto& out) { + struct fuse_dirent *fde = (struct fuse_dirent*)&(out.body); + int i = 0; + + out.header.error = 0; + out.header.len = 0; + + for (const auto& it: ents) { + size_t entlen, entsize; + + fde->ino = it.d_fileno; + fde->off = it.d_off; + fde->type = it.d_type; + fde->namelen = it.d_namlen; + strncpy(fde->name, it.d_name, it.d_namlen); + entlen = FUSE_NAME_OFFSET + fde->namelen; + entsize = FUSE_DIRENT_SIZE(fde); + /* + * The FUSE protocol does not require zeroing out the + * unused portion of the name. 
But it's a good + * practice to prevent information disclosure to the + * FUSE client, even though the client is usually the + * kernel + */ + memset(fde->name + fde->namelen, 0, entsize - entlen); + if (out.header.len + entsize > in.body.read.size) { + printf("Overflow in readdir expectation: i=%d\n" + , i); + break; + } + out.header.len += entsize; + fde = (struct fuse_dirent*) + ((intmax_t*)fde + entsize / sizeof(intmax_t)); + i++; + } + out.header.len += sizeof(out.header); + }))); + +} +void FuseTest::expect_release(uint64_t ino, uint64_t fh) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RELEASE && + in.header.nodeid == ino && + in.body.release.fh == fh); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); +} + +void FuseTest::expect_releasedir(uint64_t ino, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RELEASEDIR && + in.header.nodeid == ino && + in.body.release.fh == FH); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +void FuseTest::expect_unlink(uint64_t parent, const char *path, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_UNLINK && + 0 == strcmp(path, in.body.unlink) && + in.header.nodeid == parent); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} + +void FuseTest::expect_write(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, uint32_t flags_set, uint32_t flags_unset, + const void *contents) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *buf = (const char*)in.body.bytes + + sizeof(struct fuse_write_in); + bool pid_ok; + uint32_t wf = in.body.write.write_flags; + + if (wf & FUSE_WRITE_CACHE) + pid_ok = true; + else + pid_ok = (pid_t)in.header.pid == getpid(); + + return (in.header.opcode == FUSE_WRITE && + in.header.nodeid == ino && + in.body.write.fh == FH && + in.body.write.offset == offset && + 
in.body.write.size == isize && + pid_ok && + (wf & flags_set) == flags_set && + (wf & flags_unset) == 0 && + (in.body.write.flags == O_WRONLY || + in.body.write.flags == O_RDWR) && + 0 == bcmp(buf, contents, isize)); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, write); + out.body.write.size = osize; + }))); +} + +void FuseTest::expect_write_7_8(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, const void *contents) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *buf = (const char*)in.body.bytes + + FUSE_COMPAT_WRITE_IN_SIZE; + bool pid_ok = (pid_t)in.header.pid == getpid(); + return (in.header.opcode == FUSE_WRITE && + in.header.nodeid == ino && + in.body.write.fh == FH && + in.body.write.offset == offset && + in.body.write.size == isize && + pid_ok && + 0 == bcmp(buf, contents, isize)); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, write); + out.body.write.size = osize; + }))); +} + +void +get_unprivileged_id(uid_t *uid, gid_t *gid) +{ + struct passwd *pw; + struct group *gr; + + /* + * First try "tests", Kyua's default unprivileged user. 
XXX after + * GoogleTest gains a proper Kyua wrapper, get this with the Kyua API + */ + pw = getpwnam("tests"); + if (pw == NULL) { + /* Fall back to "nobody" */ + pw = getpwnam("nobody"); + } + if (pw == NULL) + GTEST_SKIP() << "Test requires an unprivileged user"; + /* Use group "nobody", which is Kyua's default unprivileged group */ + gr = getgrnam("nobody"); + if (gr == NULL) + GTEST_SKIP() << "Test requires an unprivileged group"; + *uid = pw->pw_uid; + *gid = gr->gr_gid; +} + +void +FuseTest::fork(bool drop_privs, int *child_status, + std::function<void()> parent_func, + std::function<int()> child_func) +{ + sem_t *sem; + int mprot = PROT_READ | PROT_WRITE; + int mflags = MAP_ANON | MAP_SHARED; + pid_t child; + uid_t uid; + gid_t gid; + + if (drop_privs) { + get_unprivileged_id(&uid, &gid); + if (IsSkipped()) + return; + } + + sem = (sem_t*)mmap(NULL, sizeof(*sem), mprot, mflags, -1, 0); + ASSERT_NE(MAP_FAILED, sem) << strerror(errno); + ASSERT_EQ(0, sem_init(sem, 1, 0)) << strerror(errno); + + if ((child = ::fork()) == 0) { + /* In child */ + int err = 0; + + if (sem_wait(sem)) { + perror("sem_wait"); + err = 1; + goto out; + } + + if (drop_privs && 0 != setegid(gid)) { + perror("setegid"); + err = 1; + goto out; + } + if (drop_privs && 0 != setreuid(-1, uid)) { + perror("setreuid"); + err = 1; + goto out; + } + err = child_func(); + +out: + sem_destroy(sem); + _exit(err); + } else if (child > 0) { + /* + * In parent. Cleanup must happen here, because it's still + * privileged. 
+ */ + m_mock->m_child_pid = child; + ASSERT_NO_FATAL_FAILURE(parent_func()); + + /* Signal the child process to go */ + ASSERT_EQ(0, sem_post(sem)) << strerror(errno); + + ASSERT_LE(0, wait(child_status)) << strerror(errno); + } else { + FAIL() << strerror(errno); + } + munmap(sem, sizeof(*sem)); + return; +} + +static void usage(char* progname) { + fprintf(stderr, "Usage: %s [-v]\n\t-v increase verbosity\n", progname); + exit(2); +} + +int main(int argc, char **argv) { + int ch; + FuseEnv *fuse_env = new FuseEnv; + + InitGoogleTest(&argc, argv); + AddGlobalTestEnvironment(fuse_env); + + while ((ch = getopt(argc, argv, "v")) != -1) { + switch (ch) { + case 'v': + verbosity++; + break; + default: + usage(argv[0]); + break; + } + } + + return (RUN_ALL_TESTS()); +} Property changes on: head/tests/sys/fs/fusefs/utils.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/utils.hh =================================================================== --- head/tests/sys/fs/fusefs/utils.hh (nonexistent) +++ head/tests/sys/fs/fusefs/utils.hh (revision 350665) @@ -0,0 +1,236 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +struct _sem; +typedef struct _sem sem_t; +struct _dirdesc; +typedef struct _dirdesc DIR; + +/* Nanoseconds to sleep, for tests that must */ +#define NAP_NS (100'000'000) + +void get_unprivileged_id(uid_t *uid, gid_t *gid); +inline void nap() +{ + usleep(NAP_NS / 1000); +} + +extern const uint32_t libfuse_max_write; +extern const uint32_t default_max_write; +class FuseTest : public ::testing::Test { + protected: + uint32_t m_maxreadahead; + uint32_t m_maxwrite; + uint32_t m_init_flags; + bool m_allow_other; + bool m_default_permissions; + uint32_t m_kernel_minor_version; + enum poll_method m_pm; + bool m_push_symlinks_in; + bool m_ro; + bool m_async; + bool m_noclusterr; + bool m_nointr; + unsigned m_time_gran; + MockFS *m_mock = NULL; + const static uint64_t FH = 0xdeadbeef1a7ebabe; + + public: + int m_maxbcachebuf; + int m_maxphys; + + FuseTest(): + m_maxreadahead(0), + m_maxwrite(default_max_write), + m_init_flags(0), + m_allow_other(false), + m_default_permissions(false), + 
m_kernel_minor_version(FUSE_KERNEL_MINOR_VERSION), + m_pm(BLOCKING), + m_push_symlinks_in(false), + m_ro(false), + m_async(false), + m_noclusterr(false), + m_nointr(false), + m_time_gran(1) + {} + + virtual void SetUp(); + + virtual void TearDown() { + if (m_mock) + delete m_mock; + } + + /* + * Create an expectation that FUSE_ACCESS will be called once for the + * given inode with the given access_mode, returning the given errno + */ + void expect_access(uint64_t ino, mode_t access_mode, int error); + + /* Expect FUSE_DESTROY and shut down the daemon */ + void expect_destroy(int error); + + /* + * Create an expectation that FUSE_FLUSH will be called times times for + * the given inode + */ + void expect_flush(uint64_t ino, int times, ProcessMockerT r); + + /* + * Create an expectation that FUSE_FORGET will be called for the given + * inode. There will be no response. If sem is provided, it will be + * posted after the operation is received by the daemon. + */ + void expect_forget(uint64_t ino, uint64_t nlookup, sem_t *sem = NULL); + + /* + * Create an expectation that FUSE_GETATTR will be called for the given + * inode any number of times. It will respond with a few basic + * attributes, like the given size and the mode S_IFREG | 0644 + */ + void expect_getattr(uint64_t ino, uint64_t size); + + /* + * Create an expectation that FUSE_LOOKUP will be called for the given + * path exactly times times, with the given cache validity period. It + * will respond with inode ino, mode mode, filesize size. 
+ */ + void expect_lookup(const char *relpath, uint64_t ino, mode_t mode, + uint64_t size, int times, uint64_t attr_valid = UINT64_MAX, + uid_t uid = 0, gid_t gid = 0); + + /* The protocol 7.8 version of expect_lookup */ + void expect_lookup_7_8(const char *relpath, uint64_t ino, mode_t mode, + uint64_t size, int times, uint64_t attr_valid = UINT64_MAX, + uid_t uid = 0, gid_t gid = 0); + + /* + * Create an expectation that FUSE_OPEN will be called for the given + * inode exactly times times. It will return with open_flags flags and + * file handle FH. + */ + void expect_open(uint64_t ino, uint32_t flags, int times); + + /* + * Create an expectation that FUSE_OPENDIR will be called exactly once + * for inode ino. + */ + void expect_opendir(uint64_t ino); + + /* + * Create an expectation that FUSE_READ will be called exactly once for + * the given inode, at offset offset and with size isize. It will + * return the first osize bytes from contents + * + * Protocol 7.8 tests can use this same expectation method because + * nothing currently validates the size of the fuse_read_in struct. 
+ */ + void expect_read(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, const void *contents, int flags = -1); + + /* + * Create an expectation that FUSE_READDIR will be called any number of + * times on the given ino with the given offset, returning (by copy) + * the provided entries + */ + void expect_readdir(uint64_t ino, uint64_t off, + std::vector &ents); + + /* + * Create an expectation that FUSE_RELEASE will be called exactly once + * for the given inode and filehandle, returning success + */ + void expect_release(uint64_t ino, uint64_t fh); + + /* + * Create an expectation that FUSE_RELEASEDIR will be called exactly + * once for the given inode + */ + void expect_releasedir(uint64_t ino, ProcessMockerT r); + + /* + * Create an expectation that FUSE_UNLINK will be called exactly once + * for the given path, returning an errno + */ + void expect_unlink(uint64_t parent, const char *path, int error); + + /* + * Create an expectation that FUSE_WRITE will be called exactly once + * for the given inode, at offset offset, with size isize and buffer + * contents. Any flags present in flags_set must be set, and any + * present in flags_unset must not be set. Other flags are don't care. + * It will return osize. + */ + void expect_write(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, uint32_t flags_set, uint32_t flags_unset, + const void *contents); + + /* Protocol 7.8 version of expect_write */ + void expect_write_7_8(uint64_t ino, uint64_t offset, uint64_t isize, + uint64_t osize, const void *contents); + + /* + * Helper that runs code in a child process. + * + * First, parent_func runs in the parent process. + * Then, child_func runs in the child process, dropping privileges if + * desired. + * Finally, fusetest_fork returns. + * + * # Returns + * + * fusetest_fork may SKIP the test, which the caller should detect with + * the IsSkipped() method. If not, then the child's exit status will + * be returned in status. 
+ */ + void fork(bool drop_privs, int *status, + std::function parent_func, + std::function child_func); + + /* + * Deliberately leak a file descriptor. + * + * Closing a file descriptor on fusefs would cause the server to + * receive FUSE_CLOSE and possibly FUSE_INACTIVE. Handling those + * operations would needlessly complicate most tests. So most tests + * deliberately leak the file descriptors instead. This method serves + * to document the leakage, and provide a single point of suppression + * for static analyzers. + */ + static void leak(int fd __unused) {} + + /* + * Deliberately leak a DIR* pointer + * + * See comments for FuseTest::leak + */ + static void leakdir(DIR* dirp __unused) {} +}; Index: head/tests/sys/fs/fusefs/rmdir.cc =================================================================== --- head/tests/sys/fs/fusefs/rmdir.cc (nonexistent) +++ head/tests/sys/fs/fusefs/rmdir.cc (revision 350665) @@ -0,0 +1,172 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Rmdir: public FuseTest { +public: +void expect_getattr(uint64_t ino, mode_t mode) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = mode; + out.body.attr.attr_valid = UINT64_MAX; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino, int times=1) +{ + EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) + .Times(times) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 2; + }))); +} + +void expect_rmdir(uint64_t parent, const char *relpath, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RMDIR && + 0 == strcmp(relpath, in.body.rmdir) && + in.header.nodeid == parent); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} +}; + +/* + * A successful rmdir should clear the parent directory's attribute cache, + * because the fuse daemon should update its mtime and ctime 
+ */ +TEST_F(Rmdir, parent_attr_cache) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + struct stat sb; + sem_t sem; + uint64_t ino = 42; + + ASSERT_EQ(0, sem_init(&sem, 0, 0)) << strerror(errno); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == FUSE_ROOT_ID); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFDIR | 0755; + out.body.attr.attr_valid = UINT64_MAX; + }))); + expect_lookup(RELPATH, ino); + expect_rmdir(FUSE_ROOT_ID, RELPATH, 0); + expect_forget(ino, 1, &sem); + + ASSERT_EQ(0, rmdir(FULLPATH)) << strerror(errno); + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); + sem_wait(&sem); + sem_destroy(&sem); +} + +TEST_F(Rmdir, enotempty) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELPATH, ino); + expect_rmdir(FUSE_ROOT_ID, RELPATH, ENOTEMPTY); + + ASSERT_NE(0, rmdir(FULLPATH)); + ASSERT_EQ(ENOTEMPTY, errno); +} + +/* Removing a directory should expire its entry cache */ +TEST_F(Rmdir, entry_cache) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + sem_t sem; + uint64_t ino = 42; + + expect_getattr(1, S_IFDIR | 0755); + expect_lookup(RELPATH, ino, 2); + expect_rmdir(FUSE_ROOT_ID, RELPATH, 0); + expect_forget(ino, 1, &sem); + + ASSERT_EQ(0, rmdir(FULLPATH)) << strerror(errno); + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + sem_wait(&sem); + sem_destroy(&sem); +} + +TEST_F(Rmdir, ok) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + sem_t sem; + uint64_t ino = 42; + + ASSERT_EQ(0, sem_init(&sem, 0, 0)) << strerror(errno); + + 
expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELPATH, ino); + expect_rmdir(FUSE_ROOT_ID, RELPATH, 0); + expect_forget(ino, 1, &sem); + + ASSERT_EQ(0, rmdir(FULLPATH)) << strerror(errno); + sem_wait(&sem); + sem_destroy(&sem); +} Property changes on: head/tests/sys/fs/fusefs/rmdir.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/unlink.cc =================================================================== --- head/tests/sys/fs/fusefs/unlink.cc (nonexistent) +++ head/tests/sys/fs/fusefs/unlink.cc (revision 350665) @@ -0,0 +1,251 @@ +/*- + * Copyright (c) 2019 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Unlink: public FuseTest { +public: +void expect_getattr(uint64_t ino, mode_t mode) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = mode; + out.body.attr.attr_valid = UINT64_MAX; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino, int times, int nlink=1) +{ + EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) + .Times(times) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = nlink; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.size = 0; + }))); +} + +}; + +/* + * Unlinking a multiply linked file should update its ctime and nlink. This + * could be handled simply by invalidating the attributes, necessitating a new + * GETATTR, but we implement it in-kernel for efficiency's sake. 
+ */ +TEST_F(Unlink, attr_cache) +{ + const char FULLPATH0[] = "mountpoint/some_file.txt"; + const char RELPATH0[] = "some_file.txt"; + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + uint64_t ino = 42; + struct stat sb_old, sb_new; + int fd1; + + expect_getattr(1, S_IFDIR | 0755); + expect_lookup(RELPATH0, ino, 1, 2); + expect_lookup(RELPATH1, ino, 1, 2); + expect_open(ino, 0, 1); + expect_unlink(1, RELPATH0, 0); + + fd1 = open(FULLPATH1, O_RDONLY); + ASSERT_LE(0, fd1) << strerror(errno); + + ASSERT_EQ(0, fstat(fd1, &sb_old)) << strerror(errno); + ASSERT_EQ(0, unlink(FULLPATH0)) << strerror(errno); + ASSERT_EQ(0, fstat(fd1, &sb_new)) << strerror(errno); + EXPECT_NE(sb_old.st_ctime, sb_new.st_ctime); + EXPECT_EQ(1u, sb_new.st_nlink); + + leak(fd1); +} + +/* + * A successful unlink should clear the parent directory's attribute cache, + * because the fuse daemon should update its mtime and ctime + */ +TEST_F(Unlink, parent_attr_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + struct stat sb; + uint64_t ino = 42; + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == FUSE_ROOT_ID); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFDIR | 0755; + out.body.attr.attr_valid = UINT64_MAX; + }))); + /* Use nlink=2 so we don't get a FUSE_FORGET */ + expect_lookup(RELPATH, ino, 1, 2); + expect_unlink(1, RELPATH, 0); + + ASSERT_EQ(0, unlink(FULLPATH)) << strerror(errno); + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); +} + +TEST_F(Unlink, eperm) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(1, S_IFDIR | 0755); + 
expect_lookup(RELPATH, ino, 1); + expect_unlink(1, RELPATH, EPERM); + + ASSERT_NE(0, unlink(FULLPATH)); + ASSERT_EQ(EPERM, errno); +} + +/* + * Unlinking a file should expire its entry cache, even if it's multiply linked + */ +TEST_F(Unlink, entry_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_getattr(1, S_IFDIR | 0755); + expect_lookup(RELPATH, ino, 2, 2); + expect_unlink(1, RELPATH, 0); + + ASSERT_EQ(0, unlink(FULLPATH)) << strerror(errno); + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +/* + * Unlink a multiply-linked file. There should be no FUSE_FORGET because the + * file is still linked. + */ +TEST_F(Unlink, multiply_linked) +{ + const char FULLPATH0[] = "mountpoint/some_file.txt"; + const char RELPATH0[] = "some_file.txt"; + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + uint64_t ino = 42; + + expect_getattr(1, S_IFDIR | 0755); + expect_lookup(RELPATH0, ino, 1, 2); + expect_unlink(1, RELPATH0, 0); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FORGET && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(0); + expect_lookup(RELPATH1, ino, 1, 1); + + ASSERT_EQ(0, unlink(FULLPATH0)) << strerror(errno); + + /* + * The final syscall simply ensures that no FUSE_FORGET was ever sent, + * by scheduling an arbitrary different operation after a FUSE_FORGET + * would've been sent. 
+ */ + ASSERT_EQ(0, access(FULLPATH1, F_OK)) << strerror(errno); +} + +TEST_F(Unlink, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + sem_t sem; + + ASSERT_EQ(0, sem_init(&sem, 0, 0)) << strerror(errno); + + expect_getattr(1, S_IFDIR | 0755); + expect_lookup(RELPATH, ino, 1); + expect_unlink(1, RELPATH, 0); + expect_forget(ino, 1, &sem); + + ASSERT_EQ(0, unlink(FULLPATH)) << strerror(errno); + sem_wait(&sem); + sem_destroy(&sem); +} + +/* Unlink an open file */ +TEST_F(Unlink, open_but_deleted) +{ + const char FULLPATH0[] = "mountpoint/some_file.txt"; + const char RELPATH0[] = "some_file.txt"; + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + uint64_t ino = 42; + int fd; + + expect_getattr(1, S_IFDIR | 0755); + expect_lookup(RELPATH0, ino, 2); + expect_open(ino, 0, 1); + expect_unlink(1, RELPATH0, 0); + expect_lookup(RELPATH1, ino, 1, 1); + + fd = open(FULLPATH0, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, unlink(FULLPATH0)) << strerror(errno); + + /* + * The final syscall simply ensures that no FUSE_FORGET was ever sent, + * by scheduling an arbitrary different operation after a FUSE_FORGET + * would've been sent. 
+ */ + ASSERT_EQ(0, access(FULLPATH1, F_OK)) << strerror(errno); + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/unlink.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/forget.cc =================================================================== --- head/tests/sys/fs/fusefs/forget.cc (nonexistent) +++ head/tests/sys/fs/fusefs/forget.cc (revision 350665) @@ -0,0 +1,154 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include + +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +const char reclaim_mib[] = "debug.try_reclaim_vnode"; + +class Forget: public FuseTest { +public: +void SetUp() { + if (geteuid() != 0) + GTEST_SKIP() << "Only root may use " << reclaim_mib; + + if (-1 == sysctlbyname(reclaim_mib, NULL, 0, NULL, 0) && + errno == ENOENT) + GTEST_SKIP() << reclaim_mib << " is not available"; + + FuseTest::SetUp(); +} + +}; + +/* + * When a fusefs vnode is reclaimed, it should send a FUSE_FORGET operation. + */ +TEST_F(Forget, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t mode = S_IFREG | 0755; + sem_t sem; + int err; + + ASSERT_EQ(0, sem_init(&sem, 0, 0)) << strerror(errno); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .Times(3) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + }))); + expect_forget(ino, 3, &sem); + + /* + * access(2) the file to force a lookup. Access it twice to double its + * lookup count. 
+ */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + + err = sysctlbyname(reclaim_mib, NULL, 0, FULLPATH, sizeof(FULLPATH)); + ASSERT_EQ(0, err) << strerror(errno); + + sem_wait(&sem); + sem_destroy(&sem); +} + +/* + * When a directory is reclaimed, the names of its entries vanish from the + * namecache + */ +TEST_F(Forget, invalidate_names) +{ + const char FULLFPATH[] = "mountpoint/some_dir/some_file.txt"; + const char FULLDPATH[] = "mountpoint/some_dir"; + const char DNAME[] = "some_dir"; + const char FNAME[] = "some_file.txt"; + uint64_t dir_ino = 42; + uint64_t file_ino = 43; + int err; + + EXPECT_LOOKUP(FUSE_ROOT_ID, DNAME) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = dir_ino; + out.body.entry.attr.nlink = 2; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + /* + * Even though we don't reclaim FNAME and its entry is cacheable, we + * should get two lookups because the reclaim of DNAME will invalidate + * the cached FNAME entry. 
+ */ + EXPECT_LOOKUP(dir_ino, FNAME) + .Times(2) + .WillRepeatedly(Invoke( + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = file_ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + expect_forget(dir_ino, 2); + + /* Access the file to cache its name */ + ASSERT_EQ(0, access(FULLFPATH, F_OK)) << strerror(errno); + + /* Reclaim the directory, invalidating its children from namecache */ + err = sysctlbyname(reclaim_mib, NULL, 0, FULLDPATH, sizeof(FULLDPATH)); + ASSERT_EQ(0, err) << strerror(errno); + + /* Access the file again, causing another lookup */ + ASSERT_EQ(0, access(FULLFPATH, F_OK)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/forget.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/create.cc =================================================================== --- head/tests/sys/fs/fusefs/create.cc (nonexistent) +++ head/tests/sys/fs/fusefs/create.cc (revision 350665) @@ -0,0 +1,449 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Create: public FuseTest { +public: + +void expect_create(const char *relpath, mode_t mode, ProcessMockerT r) +{ + mode_t mask = umask(0); + (void)umask(mask); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_create_in); + return (in.header.opcode == FUSE_CREATE && + in.body.create.mode == mode && + in.body.create.umask == mask && + (0 == strcmp(relpath, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +}; + +/* FUSE_CREATE operations for a protocol 7.8 server */ +class Create_7_8: public Create { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + Create::SetUp(); +} + +void expect_create(const char *relpath, mode_t mode, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_open_in); + 
return (in.header.opcode == FUSE_CREATE && + in.body.create.mode == mode && + (0 == strcmp(relpath, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +}; + +/* FUSE_CREATE operations for a server built at protocol <= 7.11 */ +class Create_7_11: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 11; + FuseTest::SetUp(); +} + +void expect_create(const char *relpath, mode_t mode, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_open_in); + return (in.header.opcode == FUSE_CREATE && + in.body.create.mode == mode && + (0 == strcmp(relpath, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(r)); +} + +}; + + +/* + * If FUSE_CREATE sets attr_valid, then subsequent GETATTRs should use the + * attribute cache + */ +TEST_F(Create, attr_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(0); + + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +/* A successful CREATE operation should purge the parent dir's attr cache */ +TEST_F(Create, clear_attr_cache) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + struct stat sb; + + 
EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == FUSE_ROOT_ID); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = FUSE_ROOT_ID; + out.body.attr.attr.mode = S_IFDIR | 0755; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + EXPECT_LE(0, fd) << strerror(errno); + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); + + leak(fd); +} + +/* + * The fuse daemon fails the request with EEXIST. 
This usually indicates a + * race condition: some other FUSE client created the file in between when the + * kernel checked for it with lookup and tried to create it with create + */ +TEST_F(Create, eexist) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, ReturnErrno(EEXIST)); + EXPECT_NE(0, open(FULLPATH, O_CREAT | O_EXCL, mode)); + EXPECT_EQ(EEXIST, errno); +} + +/* + * If the daemon doesn't implement FUSE_CREATE, then the kernel should fall back + * to FUSE_MKNOD/FUSE_OPEN + */ +TEST_F(Create, Enosys) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, ReturnErrno(ENOSYS)); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mknod_in); + return (in.header.opcode == FUSE_MKNOD && + in.body.mknod.mode == (S_IFREG | mode) && + in.body.mknod.rdev == 0 && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + }))); + + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +/* + * Creating a 
new file after FUSE_LOOKUP returned a negative cache entry + */ +TEST_F(Create, entry_cache_negative) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + /* + * Set entry_valid = 0 because this test isn't concerned with whether + * or not we actually cache negative entries, only with whether we + * interpret negative cache responses correctly. + */ + struct timespec entry_valid = {.tv_sec = 0, .tv_nsec = 0}; + + /* create will first do a LOOKUP, adding a negative cache entry */ + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(ReturnNegativeCache(&entry_valid)); + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + ASSERT_LE(0, fd) << strerror(errno); + leak(fd); +} + +/* + * Creating a new file should purge any negative namecache entries + */ +TEST_F(Create, entry_cache_negative_purge) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + struct timespec entry_valid = {.tv_sec = TIME_T_MAX, .tv_nsec = 0}; + + /* create will first do a LOOKUP, adding a negative cache entry */ + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH).Times(1) + .WillOnce(Invoke(ReturnNegativeCache(&entry_valid))) + .RetiresOnSaturation(); + + /* Then the CREATE should purge the negative cache entry */ + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + fd = open(FULLPATH, O_CREAT | 
O_EXCL, mode); + ASSERT_LE(0, fd) << strerror(errno); + + /* Finally, a subsequent lookup should query the daemon */ + expect_lookup(RELPATH, ino, S_IFREG | mode, 0, 1); + + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + leak(fd); +} + +/* + * The daemon is responsible for checking file permissions (unless the + * default_permissions mount option was used) + */ +TEST_F(Create, eperm) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, ReturnErrno(EPERM)); + + EXPECT_NE(0, open(FULLPATH, O_CREAT | O_EXCL, mode)); + EXPECT_EQ(EPERM, errno); +} + +TEST_F(Create, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +/* + * A regression test for a bug that affected old FUSE implementations: + * open(..., O_WRONLY | O_CREAT, 0444) should work despite the seeming + * contradiction between O_WRONLY and 0444 + * + * For example: + * https://bugs.launchpad.net/ubuntu/+source/sshfs-fuse/+bug/44886 + */ +TEST_F(Create, wronly_0444) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0444; + uint64_t ino = 42; + int fd; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + 
expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + fd = open(FULLPATH, O_CREAT | O_WRONLY, mode); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +TEST_F(Create_7_8, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create_7_8); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} + +TEST_F(Create_7_11, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = S_IFREG | 0755; + uint64_t ino = 42; + int fd; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_create(RELPATH, mode, + ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, create); + out.body.create.entry.attr.mode = mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + })); + + fd = open(FULLPATH, O_CREAT | O_EXCL, mode); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/create.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 
+1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/fifo.cc =================================================================== --- head/tests/sys/fs/fusefs/fifo.cc (nonexistent) +++ head/tests/sys/fs/fusefs/fifo.cc (revision 350665) @@ -0,0 +1,207 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +const char FULLPATH[] = "mountpoint/some_fifo"; +const char RELPATH[] = "some_fifo"; +const char MESSAGE[] = "Hello, World!\n"; +const int msgsize = sizeof(MESSAGE); + +class Fifo: public FuseTest { +public: +pthread_t m_child; + +Fifo(): m_child(NULL) {}; + +void TearDown() { + if (m_child != NULL) { + pthread_join(m_child, NULL); + } + FuseTest::TearDown(); +} +}; + +class Socket: public Fifo {}; + +/* Writer thread */ +static void* writer(void* arg) { + ssize_t sent = 0; + int fd; + + fd = *(int*)arg; + while (sent < msgsize) { + ssize_t r; + + r = write(fd, MESSAGE + sent, msgsize - sent); + if (r < 0) + return (void*)(intptr_t)errno; + else + sent += r; + + } + return 0; +} + +/* + * Reading and writing FIFOs works. None of the I/O actually goes through FUSE + */ +TEST_F(Fifo, read_write) +{ + mode_t mode = S_IFIFO | 0755; + const int bufsize = 80; + char message[bufsize]; + ssize_t recvd = 0, r; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino, mode, 0, 1); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, pthread_create(&m_child, NULL, writer, &fd)) + << strerror(errno); + while (recvd < msgsize) { + r = read(fd, message + recvd, bufsize - recvd); + ASSERT_LE(0, r) << strerror(errno); + ASSERT_LT(0, r) << "unexpected EOF"; + recvd += r; + } + ASSERT_STREQ(message, MESSAGE); + + leak(fd); +} + +/* Writer thread */ +static void* socket_writer(void* arg __unused) { + ssize_t sent = 0; + int fd, err; + struct sockaddr_un sa; + + fd = socket(AF_UNIX, SOCK_STREAM, 0); + if (fd < 0) { + perror("socket"); + return (void*)(intptr_t)errno; + } + sa.sun_family = AF_UNIX; + strlcpy(sa.sun_path, FULLPATH, sizeof(sa.sun_path)); + err = connect(fd, (struct sockaddr*)&sa, sizeof(sa)); + if (err < 0) { + perror("connect"); + return (void*)(intptr_t)errno; + } + + while (sent < msgsize) { 
+ ssize_t r; + + r = write(fd, MESSAGE + sent, msgsize - sent); + if (r < 0) + return (void*)(intptr_t)errno; + else + sent += r; + + } + return 0; +} + +/* + * Reading and writing unix-domain sockets works. None of the I/O actually + * goes through FUSE. + */ +TEST_F(Socket, read_write) +{ + mode_t mode = S_IFSOCK | 0755; + const int bufsize = 80; + char message[bufsize]; + struct sockaddr_un sa; + ssize_t recvd = 0, r; + uint64_t ino = 42; + int fd, connected; + Sequence seq; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_MKNOD); + }, Eq(true)), + _) + ).InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + }))); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + fd = socket(AF_UNIX, SOCK_STREAM, 0); + ASSERT_LE(0, fd) << strerror(errno); + sa.sun_family = AF_UNIX; + strlcpy(sa.sun_path, FULLPATH, sizeof(sa.sun_path)); + ASSERT_EQ(0, bind(fd, (struct sockaddr*)&sa, sizeof(sa))) + << strerror(errno); + listen(fd, 5); + ASSERT_EQ(0, pthread_create(&m_child, NULL, socket_writer, NULL)) + << strerror(errno); + connected = accept(fd, 0, 0); + ASSERT_LE(0, connected) << strerror(errno); + + while (recvd < msgsize) { + r = read(connected, message + recvd, bufsize - recvd); + ASSERT_LE(0, r) << strerror(errno); + ASSERT_LT(0, r) << "unexpected EOF"; + recvd += r; + } + ASSERT_STREQ(message, MESSAGE); + + leak(fd); +} Property 
changes on: head/tests/sys/fs/fusefs/fifo.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/flush.cc =================================================================== --- head/tests/sys/fs/fusefs/flush.cc (nonexistent) +++ head/tests/sys/fs/fusefs/flush.cc (revision 350665) @@ -0,0 +1,232 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Flush: public FuseTest { + +public: +void +expect_flush(uint64_t ino, int times, pid_t lo, ProcessMockerT r) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FLUSH && + in.header.nodeid == ino && + in.body.flush.lock_owner == (uint64_t)lo && + in.body.flush.fh == FH); + }, Eq(true)), + _) + ).Times(times) + .WillRepeatedly(Invoke(r)); +} + +void expect_lookup(const char *relpath, uint64_t ino, int times) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, 0, times); +} + +/* + * When testing FUSE_FLUSH, the FUSE_RELEASE calls are uninteresting. 
This + * expectation will silence googlemock warnings + */ +void expect_release() +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RELEASE); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnErrno(0))); +} +}; + +class FlushWithLocks: public Flush { + virtual void SetUp() { + m_init_flags = FUSE_POSIX_LOCKS; + Flush::SetUp(); + } +}; + +/* + * If multiple file descriptors refer to the same file handle, closing each + * should send FUSE_FLUSH + */ +TEST_F(Flush, open_twice) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd, fd2; + + expect_lookup(RELPATH, ino, 2); + expect_open(ino, 0, 1); + expect_flush(ino, 2, getpid(), ReturnErrno(0)); + expect_release(); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + fd2 = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd2) << strerror(errno); + + EXPECT_EQ(0, close(fd2)) << strerror(errno); + EXPECT_EQ(0, close(fd)) << strerror(errno); +} + +/* + * Some FUSE filesystems cache data internally and flush it on release. Such + * filesystems may generate errors during release. On Linux, these get + * returned by close(2). However, POSIX does not require close(2) to return + * this error. FreeBSD's fuse(4) should return EIO if it returns an error at + * all.
+ */ +/* http://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html */ +TEST_F(Flush, eio) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, getpid(), ReturnErrno(EIO)); + expect_release(); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_TRUE(0 == close(fd) || errno == EIO) << strerror(errno); +} + +/* + * If the filesystem returns ENOSYS, it will be treated as success and + * no more FUSE_FLUSH operations will be sent to the daemon + */ +TEST_F(Flush, enosys) +{ + const char FULLPATH0[] = "mountpoint/some_file.txt"; + const char RELPATH0[] = "some_file.txt"; + const char FULLPATH1[] = "mountpoint/other_file.txt"; + const char RELPATH1[] = "other_file.txt"; + uint64_t ino0 = 42; + uint64_t ino1 = 43; + int fd0, fd1; + + expect_lookup(RELPATH0, ino0, 1); + expect_open(ino0, 0, 1); + /* On the 2nd close, FUSE_FLUSH won't be sent at all */ + expect_flush(ino0, 1, getpid(), ReturnErrno(ENOSYS)); + expect_release(); + + expect_lookup(RELPATH1, ino1, 1); + expect_open(ino1, 0, 1); + /* On the 2nd close, FUSE_FLUSH won't be sent at all */ + expect_release(); + + fd0 = open(FULLPATH0, O_WRONLY); + ASSERT_LE(0, fd0) << strerror(errno); + + fd1 = open(FULLPATH1, O_WRONLY); + ASSERT_LE(0, fd1) << strerror(errno); + + EXPECT_EQ(0, close(fd0)) << strerror(errno); + EXPECT_EQ(0, close(fd1)) << strerror(errno); +} + +/* A FUSE_FLUSH should be sent on close(2) */ +TEST_F(Flush, flush) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, getpid(), ReturnErrno(0)); + expect_release(); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_TRUE(0 == close(fd)) << strerror(errno); +} + 
+/* + * When closing a file with a POSIX file lock, flush should release the lock, + * _even_if_ it's not the process's last file descriptor for this file. + */ +TEST_F(FlushWithLocks, unlock_on_close) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd, fd2; + struct flock fl; + pid_t pid = getpid(); + + expect_lookup(RELPATH, ino, 2); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETLK && + in.header.nodeid == ino && + in.body.setlk.fh == FH); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + expect_flush(ino, 1, pid, ReturnErrno(0)); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 0; + fl.l_len = 0; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLKW, &fl)) << strerror(errno); + + fd2 = open(FULLPATH, O_WRONLY); + ASSERT_LE(0, fd2) << strerror(errno); + ASSERT_EQ(0, close(fd2)) << strerror(errno); + leak(fd); + leak(fd2); +} Property changes on: head/tests/sys/fs/fusefs/flush.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/fsync.cc =================================================================== --- head/tests/sys/fs/fusefs/fsync.cc (nonexistent) +++ head/tests/sys/fs/fusefs/fsync.cc (revision 350665) @@ -0,0 +1,249 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +/* + * TODO: remove FUSE_FSYNC_FDATASYNC definition when upgrading to protocol 7.28. 
+ * This bit was actually part of kernel protocol version 5.2, but never + * documented until after 7.28 + */ +#ifndef FUSE_FSYNC_FDATASYNC +#define FUSE_FSYNC_FDATASYNC 1 +#endif + +class Fsync: public FuseTest { +public: +void expect_fsync(uint64_t ino, uint32_t flags, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FSYNC && + in.header.nodeid == ino && + /* + * TODO: reenable pid check after fixing + * bug 236379 + */ + //(pid_t)in.header.pid == getpid() && + in.body.fsync.fh == FH && + in.body.fsync.fsync_flags == flags); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, 0, 1); +} + +void expect_write(uint64_t ino, uint64_t size, const void *contents) +{ + FuseTest::expect_write(ino, 0, size, size, 0, 0, contents); +} + +}; + +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236379 */ +TEST_F(Fsync, aio_fsync) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + struct aiocb iocb, *piocb; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_write(ino, bufsize, CONTENTS); + expect_fsync(ino, 0, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + + bzero(&iocb, sizeof(iocb)); + iocb.aio_fildes = fd; + + ASSERT_EQ(0, aio_fsync(O_SYNC, &iocb)) << strerror(errno); + ASSERT_EQ(0, aio_waitcomplete(&piocb, NULL)) << strerror(errno); + + leak(fd); +} + +/* + * fuse(4) should NOT fsync during VOP_RELEASE or VOP_INACTIVE + * + * This test only really makes sense in writeback caching mode, but it should + * still pass in any cache mode.
+ */ +TEST_F(Fsync, close) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_write(ino, bufsize, CONTENTS); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FSYNC); + }, Eq(true)), + _) + ).Times(0); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, FH); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + close(fd); +} + +TEST_F(Fsync, eio) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_write(ino, bufsize, CONTENTS); + expect_fsync(ino, FUSE_FSYNC_FDATASYNC, EIO); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + ASSERT_NE(0, fdatasync(fd)); + ASSERT_EQ(EIO, errno); + + leak(fd); +} + +/* + * If the filesystem returns ENOSYS, it will be treated as success and + * subsequent calls to VOP_FSYNC will succeed automatically without being sent + * to the filesystem daemon + */ +TEST_F(Fsync, enosys) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + int fd; + 
+ expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_write(ino, bufsize, CONTENTS); + expect_fsync(ino, FUSE_FSYNC_FDATASYNC, ENOSYS); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + EXPECT_EQ(0, fdatasync(fd)); + + /* Subsequent calls shouldn't query the daemon*/ + EXPECT_EQ(0, fdatasync(fd)); + leak(fd); +} + + +TEST_F(Fsync, fdatasync) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_write(ino, bufsize, CONTENTS); + expect_fsync(ino, FUSE_FSYNC_FDATASYNC, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + ASSERT_EQ(0, fdatasync(fd)) << strerror(errno); + + leak(fd); +} + +TEST_F(Fsync, fsync) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const char *CONTENTS = "abcdefgh"; + ssize_t bufsize = strlen(CONTENTS); + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_write(ino, bufsize, CONTENTS); + expect_fsync(ino, 0, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(bufsize, write(fd, CONTENTS, bufsize)) << strerror(errno); + ASSERT_EQ(0, fsync(fd)) << strerror(errno); + + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/fsync.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/fsyncdir.cc 
=================================================================== --- head/tests/sys/fs/fusefs/fsyncdir.cc (nonexistent) +++ head/tests/sys/fs/fusefs/fsyncdir.cc (revision 350665) @@ -0,0 +1,186 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +/* + * TODO: remove FUSE_FSYNC_FDATASYNC definition when upgrading to protocol 7.28. 
+ * This bit was actually part of kernel protocol version 5.2, but never + * documented until after 7.28 + */ +#ifndef FUSE_FSYNC_FDATASYNC +#define FUSE_FSYNC_FDATASYNC 1 +#endif + +class FsyncDir: public FuseTest { +public: +void expect_fsyncdir(uint64_t ino, uint32_t flags, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_FSYNCDIR && + in.header.nodeid == ino && + /* + * TODO: reenable pid check after fixing + * bug 236379 + */ + //(pid_t)in.header.pid == getpid() && + in.body.fsyncdir.fh == FH && + in.body.fsyncdir.fsync_flags == flags); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))); +} + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFDIR | 0755, 0, 1); +} + +}; + +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236379 */ +TEST_F(FsyncDir, aio_fsync) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct aiocb iocb, *piocb; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_fsyncdir(ino, 0, 0); + + fd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + + bzero(&iocb, sizeof(iocb)); + iocb.aio_fildes = fd; + + ASSERT_EQ(0, aio_fsync(O_SYNC, &iocb)) << strerror(errno); + ASSERT_EQ(0, aio_waitcomplete(&piocb, NULL)) << strerror(errno); + + leak(fd); +} + +TEST_F(FsyncDir, eio) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_fsyncdir(ino, 0, EIO); + + fd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_NE(0, fsync(fd)); + ASSERT_EQ(EIO, errno); + + leak(fd); +} + +/* + * If the filesystem returns ENOSYS, it will be treated as success and + * subsequent calls to VOP_FSYNC will succeed automatically without being sent + * to the filesystem daemon + */ 
+TEST_F(FsyncDir, enosys) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_fsyncdir(ino, 0, ENOSYS); + + fd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + EXPECT_EQ(0, fsync(fd)) << strerror(errno); + + /* Subsequent calls shouldn't query the daemon*/ + EXPECT_EQ(0, fsync(fd)) << strerror(errno); + + leak(fd); +} + +TEST_F(FsyncDir, fsyncdata) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_fsyncdir(ino, FUSE_FSYNC_FDATASYNC, 0); + + fd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, fdatasync(fd)) << strerror(errno); + + leak(fd); +} + +/* + * Unlike regular files, the kernel doesn't know whether a directory is or + * isn't dirty, so fuse(4) should always send FUSE_FSYNCDIR on fsync(2) + */ +TEST_F(FsyncDir, fsync) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_opendir(ino); + expect_fsyncdir(ino, 0, 0); + + fd = open(FULLPATH, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, fsync(fd)) << strerror(errno); + + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/fsyncdir.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/locks.cc =================================================================== --- head/tests/sys/fs/fusefs/locks.cc (nonexistent) +++ head/tests/sys/fs/fusefs/locks.cc (revision 350665) @@ -0,0 +1,478 @@ 
+/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +/* This flag value should probably be defined in fuse_kernel.h */ +#define OFFSET_MAX 0x7fffffffffffffffLL + +using namespace testing; + +/* For testing filesystems without posix locking support */ +class Fallback: public FuseTest { +public: + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, 0, 1); +} + +}; + +/* For testing filesystems with posix locking support */ +class Locks: public Fallback { + virtual void SetUp() { + m_init_flags = FUSE_POSIX_LOCKS; + Fallback::SetUp(); + } +}; + +class Fcntl: public Locks { +public: +void expect_setlk(uint64_t ino, pid_t pid, uint64_t start, uint64_t end, + uint32_t type, int err) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETLK && + in.header.nodeid == ino && + in.body.setlk.fh == FH && + in.body.setlkw.owner == (uint32_t)pid && + in.body.setlkw.lk.start == start && + in.body.setlkw.lk.end == end && + in.body.setlkw.lk.type == type && + in.body.setlkw.lk.pid == (uint64_t)pid); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(err))); +} +}; + +class Flock: public Locks { +public: +void expect_setlk(uint64_t ino, uint32_t type, int err) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETLK && + in.header.nodeid == ino && + in.body.setlk.fh == FH && + /* + * The owner should be set to the address of + * the vnode. That's hard to verify. + */ + /* in.body.setlk.owner == ??? 
&& */ + in.body.setlk.lk.type == type); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(err))); +} +}; + +class FlockFallback: public Fallback {}; +class GetlkFallback: public Fallback {}; +class Getlk: public Fcntl {}; +class SetlkFallback: public Fallback {}; +class Setlk: public Fcntl {}; +class SetlkwFallback: public Fallback {}; +class Setlkw: public Fcntl {}; + +/* + * If the fuse filesystem does not support flock locks, then the kernel should + * fall back to local locks. + */ +TEST_F(FlockFallback, local) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, flock(fd, LOCK_EX)) << strerror(errno); + leak(fd); +} + +/* + * Even if the fuse file system supports POSIX locks, we must implement flock + * locks locally until protocol 7.17. Protocol 7.9 added partial buggy support + * but we won't implement that. 
+ */ +TEST_F(Flock, local) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, flock(fd, LOCK_EX)) << strerror(errno); + leak(fd); +} + +/* Set a new flock lock with FUSE_SETLK */ +/* TODO: enable after upgrading to protocol 7.17 */ +TEST_F(Flock, DISABLED_set) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_setlk(ino, F_WRLCK, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_EQ(0, flock(fd, LOCK_EX)) << strerror(errno); + leak(fd); +} + +/* Fail to set a flock lock in non-blocking mode */ +/* TODO: enable after upgrading to protocol 7.17 */ +TEST_F(Flock, DISABLED_eagain) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_setlk(ino, F_WRLCK, EAGAIN); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + ASSERT_NE(0, flock(fd, LOCK_EX | LOCK_NB)); + ASSERT_EQ(EAGAIN, errno); + leak(fd); +} + +/* + * If the fuse filesystem does not support posix file locks, then the kernel + * should fall back to local locks. 
+ */ +TEST_F(GetlkFallback, local) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = getpid(); + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_GETLK, &fl)) << strerror(errno); + leak(fd); +} + +/* + * If the filesystem has no locks that fit the description, the filesystem + * should return F_UNLCK + */ +TEST_F(Getlk, no_locks) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + pid_t pid = 1234; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETLK && + in.header.nodeid == ino && + in.body.getlk.fh == FH && + in.body.getlk.owner == (uint32_t)pid && + in.body.getlk.lk.start == 10 && + in.body.getlk.lk.end == 1009 && + in.body.getlk.lk.type == F_RDLCK && + in.body.getlk.lk.pid == (uint64_t)pid); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in, auto& out) { + SET_OUT_HEADER_LEN(out, getlk); + out.body.getlk.lk = in.body.getlk.lk; + out.body.getlk.lk.type = F_UNLCK; + }))); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_GETLK, &fl)) << strerror(errno); + ASSERT_EQ(F_UNLCK, fl.l_type); + leak(fd); +} + +/* A different pid does have a lock */ +TEST_F(Getlk, lock_exists) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + pid_t pid = 1234; + pid_t pid2 = 1235; 
+ + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETLK && + in.header.nodeid == ino && + in.body.getlk.fh == FH && + in.body.getlk.owner == (uint32_t)pid && + in.body.getlk.lk.start == 10 && + in.body.getlk.lk.end == 1009 && + in.body.getlk.lk.type == F_RDLCK && + in.body.getlk.lk.pid == (uint64_t)pid); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, getlk); + out.body.getlk.lk.start = 100; + out.body.getlk.lk.end = 199; + out.body.getlk.lk.type = F_WRLCK; + out.body.getlk.lk.pid = (uint32_t)pid2; + }))); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_GETLK, &fl)) << strerror(errno); + EXPECT_EQ(100, fl.l_start); + EXPECT_EQ(100, fl.l_len); + EXPECT_EQ(pid2, fl.l_pid); + EXPECT_EQ(F_WRLCK, fl.l_type); + EXPECT_EQ(SEEK_SET, fl.l_whence); + EXPECT_EQ(0, fl.l_sysid); + leak(fd); +} + +/* + * If the fuse filesystem does not support posix file locks, then the kernel + * should fall back to local locks. 
+ */ +TEST_F(SetlkFallback, local) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = getpid(); + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLK, &fl)) << strerror(errno); + leak(fd); +} + +/* Set a new lock with FUSE_SETLK */ +TEST_F(Setlk, set) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + pid_t pid = 1234; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_setlk(ino, pid, 10, 1009, F_RDLCK, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLK, &fl)) << strerror(errno); + leak(fd); +} + +/* l_len = 0 is a flag value that means to lock until EOF */ +TEST_F(Setlk, set_eof) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + pid_t pid = 1234; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_setlk(ino, pid, 10, OFFSET_MAX, F_RDLCK, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 0; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLK, &fl)) << strerror(errno); + leak(fd); +} + +/* Fail to set a new lock with FUSE_SETLK due to a conflict */ +TEST_F(Setlk, eagain) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int 
fd; + pid_t pid = 1234; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_setlk(ino, pid, 10, 1009, F_RDLCK, EAGAIN); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_EQ(-1, fcntl(fd, F_SETLK, &fl)); + ASSERT_EQ(EAGAIN, errno); + leak(fd); +} + +/* + * If the fuse filesystem does not support posix file locks, then the kernel + * should fall back to local locks. + */ +TEST_F(SetlkwFallback, local) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = getpid(); + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLKW, &fl)) << strerror(errno); + leak(fd); +} + +/* + * Set a new lock with FUSE_SETLK. If the lock is not available, then the + * command should block. 
But to the kernel, that's the same as just being + * slow, so we don't need a separate test method + */ +TEST_F(Setlkw, set) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + struct flock fl; + int fd; + pid_t pid = 1234; + + expect_lookup(RELPATH, ino); + expect_open(ino, 0, 1); + expect_setlk(ino, pid, 10, 1009, F_RDLCK, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 10; + fl.l_len = 1000; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLKW, &fl)) << strerror(errno); + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/locks.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/nfs.cc =================================================================== --- head/tests/sys/fs/fusefs/nfs.cc (nonexistent) +++ head/tests/sys/fs/fusefs/nfs.cc (revision 350665) @@ -0,0 +1,344 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* This file tests functionality needed by NFS servers */ +extern "C" { +#include +#include + +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace std; +using namespace testing; + + +class Nfs: public FuseTest { +public: +virtual void SetUp() { + if (geteuid() != 0) + GTEST_SKIP() << "This test requires a privileged user"; + FuseTest::SetUp(); +} +}; + +class Exportable: public Nfs { +public: +virtual void SetUp() { + m_init_flags = FUSE_EXPORT_SUPPORT; + Nfs::SetUp(); +} +}; + +class Fhstat: public Exportable {}; +class FhstatNotExportable: public Nfs {}; +class Getfh: public Exportable {}; +class Readdir: public Exportable {}; + +/* If the server returns a different generation number, then file is stale */ +TEST_F(Fhstat, estale) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + fhandle_t fhp; + struct stat sb; + const uint64_t ino = 42; + const mode_t mode = S_IFDIR | 0755; + Sequence seq; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + 
out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + EXPECT_LOOKUP(ino, ".") + .InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 2; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + ASSERT_EQ(0, getfh(FULLPATH, &fhp)) << strerror(errno); + ASSERT_EQ(-1, fhstat(&fhp, &sb)); + EXPECT_EQ(ESTALE, errno); +} + +/* If we must lookup an entry from the server, send a LOOKUP request for "." */ +TEST_F(Fhstat, lookup_dot) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + fhandle_t fhp; + struct stat sb; + const uint64_t ino = 42; + const mode_t mode = S_IFDIR | 0755; + const uid_t uid = 12345; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr.uid = uid; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + EXPECT_LOOKUP(ino, ".") + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr.uid = uid; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + ASSERT_EQ(0, getfh(FULLPATH, &fhp)) << strerror(errno); + ASSERT_EQ(0, fhstat(&fhp, &sb)) << strerror(errno); + EXPECT_EQ(uid, sb.st_uid); + EXPECT_EQ(mode, sb.st_mode); +} + +/* Use a file handle whose entry is still cached */ +TEST_F(Fhstat, cached) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = 
"some_dir"; + fhandle_t fhp; + struct stat sb; + const uint64_t ino = 42; + const mode_t mode = S_IFDIR | 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr.ino = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + ASSERT_EQ(0, getfh(FULLPATH, &fhp)) << strerror(errno); + ASSERT_EQ(0, fhstat(&fhp, &sb)) << strerror(errno); + EXPECT_EQ(ino, sb.st_ino); +} + +/* File handle entries should expire from the cache, too */ +TEST_F(Fhstat, cache_expired) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + fhandle_t fhp; + struct stat sb; + const uint64_t ino = 42; + const mode_t mode = S_IFDIR | 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr.ino = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid_nsec = NAP_NS / 2; + }))); + + EXPECT_LOOKUP(ino, ".") + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr.ino = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + ASSERT_EQ(0, getfh(FULLPATH, &fhp)) << strerror(errno); + ASSERT_EQ(0, fhstat(&fhp, &sb)) << strerror(errno); + EXPECT_EQ(ino, sb.st_ino); + + nap(); + + /* Cache should be expired; fuse should issue a FUSE_LOOKUP */ + ASSERT_EQ(0, fhstat(&fhp, &sb)) << strerror(errno); + EXPECT_EQ(ino, sb.st_ino); +} + +/* + * If the server doesn't set 
FUSE_EXPORT_SUPPORT, then we can't do NFS-style + * lookups + */ +TEST_F(FhstatNotExportable, lookup_dot) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + fhandle_t fhp; + const uint64_t ino = 42; + const mode_t mode = S_IFDIR | 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + ASSERT_EQ(-1, getfh(FULLPATH, &fhp)); + ASSERT_EQ(EOPNOTSUPP, errno); +} + +/* FreeBSD's fid struct doesn't have enough space for 64-bit generations */ +TEST_F(Getfh, eoverflow) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + fhandle_t fhp; + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = ino; + out.body.entry.generation = (uint64_t)UINT32_MAX + 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + ASSERT_NE(0, getfh(FULLPATH, &fhp)); + EXPECT_EQ(EOVERFLOW, errno); +} + +/* Get an NFS file handle */ +TEST_F(Getfh, ok) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + fhandle_t fhp; + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + ASSERT_EQ(0, getfh(FULLPATH, &fhp)) << strerror(errno); +} + +/* + * Call readdir via a file handle. 
+ * + * This is how a userspace nfs server like nfs-ganesha or unfs3 would call + * readdir. The in-kernel NFS server never does any equivalent of open. I + * haven't discovered a way to mimic nfsd's behavior short of actually running + * nfsd. + */ +TEST_F(Readdir, getdirentries) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + uint64_t ino = 42; + mode_t mode = S_IFDIR | 0755; + fhandle_t fhp; + int fd; + char buf[8192]; + ssize_t r; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + EXPECT_LOOKUP(ino, ".") + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.generation = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = 0; + }))); + + expect_opendir(ino); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_READDIR && + in.header.nodeid == ino && + in.body.readdir.size == sizeof(buf)); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + out.header.error = 0; + out.header.len = sizeof(out.header); + }))); + + ASSERT_EQ(0, getfh(FULLPATH, &fhp)) << strerror(errno); + fd = fhopen(&fhp, O_DIRECTORY); + ASSERT_LE(0, fd) << strerror(errno); + r = getdirentries(fd, buf, sizeof(buf), 0); + ASSERT_EQ(0, r) << strerror(errno); + + leak(fd); +} Property changes on: head/tests/sys/fs/fusefs/nfs.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property 
Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/open.cc =================================================================== --- head/tests/sys/fs/fusefs/open.cc (nonexistent) +++ head/tests/sys/fs/fusefs/open.cc (revision 350665) @@ -0,0 +1,262 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Open: public FuseTest { + +public: + +/* Test an OK open of a file with the given flags */ +void test_ok(int os_flags, int fuse_flags) { + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + FuseTest::expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.body.open.flags == (uint32_t)fuse_flags && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + }))); + + fd = open(FULLPATH, os_flags); + EXPECT_LE(0, fd) << strerror(errno); + leak(fd); +} +}; + + +/* + * fusefs(5) does not support I/O on device nodes (neither does UFS). But it + * shouldn't crash + */ +TEST_F(Open, chr) +{ + const char FULLPATH[] = "mountpoint/zero"; + const char RELPATH[] = "zero"; + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFCHR | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.rdev = 44; /* /dev/zero's rdev */ + }))); + + ASSERT_EQ(-1, open(FULLPATH, O_RDONLY)); + EXPECT_EQ(EOPNOTSUPP, errno); +} + +/* + * The fuse daemon fails the request with enoent. 
This usually indicates a + * race condition: some other FUSE client removed the file in between when the + * kernel checked for it with lookup and tried to open it + */ +TEST_F(Open, enoent) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_NE(0, open(FULLPATH, O_RDONLY)); + EXPECT_EQ(ENOENT, errno); +} + +/* + * The daemon is responsible for checking file permissions (unless the + * default_permissions mount option was used) + */ +TEST_F(Open, eperm) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EPERM))); + EXPECT_NE(0, open(FULLPATH, O_RDONLY)); + EXPECT_EQ(EPERM, errno); +} + +/* + * fusefs must issue multiple FUSE_OPEN operations if clients with different + * credentials open the same file, even if they use the same mode. This is + * necessary so that the daemon can validate each set of credentials. 
+ */ +TEST_F(Open, multiple_creds) +{ + const static char FULLPATH[] = "mountpoint/some_file.txt"; + const static char RELPATH[] = "some_file.txt"; + int fd1, status; + const static uint64_t ino = 42; + const static uint64_t fh0 = 100, fh1 = 200; + + /* Fork a child to open the file with different credentials */ + fork(false, &status, [&] { + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.pid == (uint32_t)getpid() && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke( + ReturnImmediate([](auto in __unused, auto& out) { + out.body.open.fh = fh0; + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + }))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_OPEN && + in.header.pid != (uint32_t)getpid() && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke( + ReturnImmediate([](auto in __unused, auto& out) { + out.body.open.fh = fh1; + out.header.len = sizeof(out.header); + SET_OUT_HEADER_LEN(out, open); + }))); + expect_flush(ino, 2, ReturnErrno(0)); + expect_release(ino, fh0); + expect_release(ino, fh1); + + fd1 = open(FULLPATH, O_RDONLY); + EXPECT_LE(0, fd1) << strerror(errno); + }, [] { + int fd0; + + fd0 = open(FULLPATH, O_RDONLY); + if (fd0 < 0) { + perror("open"); + return(1); + } + return 0; + } + ); + ASSERT_EQ(0, WEXITSTATUS(status)); + + close(fd1); +} + +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236340 */ +TEST_F(Open, DISABLED_o_append) +{ + test_ok(O_WRONLY | O_APPEND, O_WRONLY | O_APPEND); +} + +/* The kernel is supposed to filter out this flag */ +TEST_F(Open, o_creat) +{ + test_ok(O_WRONLY | O_CREAT, O_WRONLY); +} + +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236340 */ +TEST_F(Open, DISABLED_o_direct) +{ + test_ok(O_WRONLY | O_DIRECT, O_WRONLY | O_DIRECT); +} + +/* The kernel is supposed to filter out this flag */ 
+TEST_F(Open, o_excl) +{ + test_ok(O_WRONLY | O_EXCL, O_WRONLY); +} + +TEST_F(Open, o_exec) +{ + test_ok(O_EXEC, O_EXEC); +} + +/* The kernel is supposed to filter out this flag */ +TEST_F(Open, o_noctty) +{ + test_ok(O_WRONLY | O_NOCTTY, O_WRONLY); +} + +TEST_F(Open, o_rdonly) +{ + test_ok(O_RDONLY, O_RDONLY); +} + +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236340 */ +TEST_F(Open, DISABLED_o_trunc) +{ + test_ok(O_WRONLY | O_TRUNC, O_WRONLY | O_TRUNC); +} + +TEST_F(Open, o_wronly) +{ + test_ok(O_WRONLY, O_WRONLY); +} + +TEST_F(Open, o_rdwr) +{ + test_ok(O_RDWR, O_RDWR); +} + Property changes on: head/tests/sys/fs/fusefs/open.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/bmap.cc =================================================================== --- head/tests/sys/fs/fusefs/bmap.cc (nonexistent) +++ head/tests/sys/fs/fusefs/bmap.cc (revision 350665) @@ -0,0 +1,159 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include + +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +const static char FULLPATH[] = "mountpoint/foo"; +const static char RELPATH[] = "foo"; + +class Bmap: public FuseTest { +public: +virtual void SetUp() { + m_maxreadahead = UINT32_MAX; + FuseTest::SetUp(); +} +void expect_bmap(uint64_t ino, uint64_t lbn, uint32_t blocksize, uint64_t pbn) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_BMAP && + in.header.nodeid == ino && + in.body.bmap.block == lbn && + in.body.bmap.blocksize == blocksize); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, bmap); + out.body.bmap.block = pbn; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino, off_t size) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, size, 1, + UINT64_MAX); +} +}; + +/* + * Test FUSE_BMAP + * XXX The FUSE protocol does not include the runp and runb variables, so those + * must be guessed in-kernel. 
+ */ +TEST_F(Bmap, bmap) +{ + struct fiobmap2_arg arg; + const off_t filesize = 1 << 20; + const ino_t ino = 42; + int64_t lbn = 10; + int64_t pbn = 12345; + int fd; + + expect_lookup(RELPATH, 42, filesize); + expect_open(ino, 0, 1); + expect_bmap(ino, lbn, m_maxbcachebuf, pbn); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + + arg.bn = lbn; + arg.runp = -1; + arg.runb = -1; + ASSERT_EQ(0, ioctl(fd, FIOBMAP2, &arg)) << strerror(errno); + EXPECT_EQ(arg.bn, pbn); + EXPECT_EQ(arg.runp, m_maxphys / m_maxbcachebuf - 1); + EXPECT_EQ(arg.runb, m_maxphys / m_maxbcachebuf - 1); +} + +/* + * If the daemon does not implement VOP_BMAP, fusefs should return sensible + * defaults. + */ +TEST_F(Bmap, default_) +{ + struct fiobmap2_arg arg; + const off_t filesize = 1 << 20; + const ino_t ino = 42; + int64_t lbn; + int fd; + + expect_lookup(RELPATH, 42, filesize); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_BMAP); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(ENOSYS))); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + + /* First block */ + lbn = 0; + arg.bn = lbn; + arg.runp = -1; + arg.runb = -1; + ASSERT_EQ(0, ioctl(fd, FIOBMAP2, &arg)) << strerror(errno); + EXPECT_EQ(arg.bn, 0); + EXPECT_EQ(arg.runp, m_maxphys / m_maxbcachebuf - 1); + EXPECT_EQ(arg.runb, 0); + + /* In the middle */ + lbn = filesize / m_maxbcachebuf / 2; + arg.bn = lbn; + arg.runp = -1; + arg.runb = -1; + ASSERT_EQ(0, ioctl(fd, FIOBMAP2, &arg)) << strerror(errno); + EXPECT_EQ(arg.bn, lbn * m_maxbcachebuf / DEV_BSIZE); + EXPECT_EQ(arg.runp, m_maxphys / m_maxbcachebuf - 1); + EXPECT_EQ(arg.runb, m_maxphys / m_maxbcachebuf - 1); + + /* Last block */ + lbn = filesize / m_maxbcachebuf - 1; + arg.bn = lbn; + arg.runp = -1; + arg.runb = -1; + ASSERT_EQ(0, ioctl(fd, FIOBMAP2, &arg)) << strerror(errno); + EXPECT_EQ(arg.bn, lbn * m_maxbcachebuf / DEV_BSIZE); + EXPECT_EQ(arg.runp, 0); + 
EXPECT_EQ(arg.runb, m_maxphys / m_maxbcachebuf - 1); +} Property changes on: head/tests/sys/fs/fusefs/bmap.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/mknod.cc =================================================================== --- head/tests/sys/fs/fusefs/mknod.cc (nonexistent) +++ head/tests/sys/fs/fusefs/mknod.cc (revision 350665) @@ -0,0 +1,238 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +#ifndef VNOVAL +#define VNOVAL (-1) /* Defined in sys/vnode.h */ +#endif + +const char FULLPATH[] = "mountpoint/some_file.txt"; +const char RELPATH[] = "some_file.txt"; + +class Mknod: public FuseTest { + +mode_t m_oldmask; +const static mode_t c_umask = 022; + +public: + +virtual void SetUp() { + m_oldmask = umask(c_umask); + if (geteuid() != 0) { + GTEST_SKIP() << "Only root may use most mknod(2) variations"; + } + FuseTest::SetUp(); +} + +virtual void TearDown() { + FuseTest::TearDown(); + (void)umask(m_oldmask); +} + +/* Test an OK creation of a file with the given mode and device number */ +void expect_mknod(mode_t mode, dev_t dev) { + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mknod_in); + return (in.header.opcode == FUSE_MKNOD && + in.body.mknod.mode == mode && + in.body.mknod.rdev == (uint32_t)dev && + in.body.mknod.umask == c_umask && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = 
UINT64_MAX; + out.body.entry.attr.rdev = dev; + }))); +} + +}; + +class Mknod_7_11: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 11; + if (geteuid() != 0) { + GTEST_SKIP() << "Only root may use most mknod(2) variations"; + } + FuseTest::SetUp(); +} + +void expect_lookup(const char *relpath, uint64_t ino, uint64_t size) +{ + FuseTest::expect_lookup_7_8(relpath, ino, S_IFREG | 0644, size, 1); +} + +/* Test an OK creation of a file with the given mode and device number */ +void expect_mknod(mode_t mode, dev_t dev) { + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + FUSE_COMPAT_MKNOD_IN_SIZE; + return (in.header.opcode == FUSE_MKNOD && + in.body.mknod.mode == mode && + in.body.mknod.rdev == (uint32_t)dev && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.rdev = dev; + }))); +} + +}; + +/* + * mknod(2) should be able to create block devices on a FUSE filesystem. Even + * though FreeBSD doesn't use block devices, this is useful when copying media + * from or preparing media for other operating systems. 
+ */ +TEST_F(Mknod, blk) +{ + mode_t mode = S_IFBLK | 0755; + dev_t rdev = 0xfe00; /* /dev/vda's device number on Linux */ + expect_mknod(mode, rdev); + EXPECT_EQ(0, mknod(FULLPATH, mode, rdev)) << strerror(errno); +} + +TEST_F(Mknod, chr) +{ + mode_t mode = S_IFCHR | 0755; + dev_t rdev = 54; /* /dev/fuse's device number */ + expect_mknod(mode, rdev); + EXPECT_EQ(0, mknod(FULLPATH, mode, rdev)) << strerror(errno); +} + +/* + * The daemon is responsible for checking file permissions (unless the + * default_permissions mount option was used) + */ +TEST_F(Mknod, eperm) +{ + mode_t mode = S_IFIFO | 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mknod_in); + return (in.header.opcode == FUSE_MKNOD && + in.body.mknod.mode == mode && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EPERM))); + EXPECT_NE(0, mkfifo(FULLPATH, mode)); + EXPECT_EQ(EPERM, errno); +} + +TEST_F(Mknod, fifo) +{ + mode_t mode = S_IFIFO | 0755; + dev_t rdev = VNOVAL; /* Fifos don't have device numbers */ + expect_mknod(mode, rdev); + EXPECT_EQ(0, mkfifo(FULLPATH, mode)) << strerror(errno); +} + +/* + * Create a unix-domain socket. + * + * This test case doesn't actually need root privileges. + */ +TEST_F(Mknod, socket) +{ + mode_t mode = S_IFSOCK | 0755; + struct sockaddr_un sa; + int fd; + dev_t rdev = -1; /* Really it's a don't care */ + + expect_mknod(mode, rdev); + + fd = socket(AF_UNIX, SOCK_STREAM, 0); + ASSERT_LE(0, fd) << strerror(errno); + sa.sun_family = AF_UNIX; + strlcpy(sa.sun_path, FULLPATH, sizeof(sa.sun_path)); + ASSERT_EQ(0, bind(fd, (struct sockaddr*)&sa, sizeof(sa))) + << strerror(errno); +} + +/* + * fusefs(5) lacks VOP_WHITEOUT support. 
No bugzilla entry, because that's a + * feature, not a bug + */ +TEST_F(Mknod, DISABLED_whiteout) +{ + mode_t mode = S_IFWHT | 0755; + dev_t rdev = VNOVAL; /* whiteouts don't have device numbers */ + expect_mknod(mode, rdev); + EXPECT_EQ(0, mknod(FULLPATH, mode, 0)) << strerror(errno); +} + +/* A server built at protocol version 7.11 or earlier can still use mknod */ +TEST_F(Mknod_7_11, fifo) +{ + mode_t mode = S_IFIFO | 0755; + dev_t rdev = VNOVAL; + expect_mknod(mode, rdev); + EXPECT_EQ(0, mkfifo(FULLPATH, mode)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/mknod.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/access.cc =================================================================== --- head/tests/sys/fs/fusefs/access.cc (nonexistent) +++ head/tests/sys/fs/fusefs/access.cc (revision 350665) @@ -0,0 +1,119 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Access: public FuseTest { +public: +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, 0, 1); +} +}; + +class RofsAccess: public Access { +public: +virtual void SetUp() { + m_ro = true; + Access::SetUp(); +} +}; + +/* The error case of FUSE_ACCESS. */ +TEST_F(Access, eaccess) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = X_OK; + + expect_access(FUSE_ROOT_ID, X_OK, 0); + expect_lookup(RELPATH, ino); + expect_access(ino, access_mode, EACCES); + + ASSERT_NE(0, access(FULLPATH, access_mode)); + ASSERT_EQ(EACCES, errno); +} + +/* + * If the filesystem returns ENOSYS, then it is treated as a permanent success, + * and subsequent VOP_ACCESS calls will succeed automatically without querying + * the daemon. 
+ */ +TEST_F(Access, enosys) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = R_OK; + + expect_access(FUSE_ROOT_ID, X_OK, ENOSYS); + FuseTest::expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 2); + + ASSERT_EQ(0, access(FULLPATH, access_mode)) << strerror(errno); + ASSERT_EQ(0, access(FULLPATH, access_mode)) << strerror(errno); +} + +TEST_F(RofsAccess, erofs) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = W_OK; + + expect_access(FUSE_ROOT_ID, X_OK, 0); + expect_lookup(RELPATH, ino); + + ASSERT_NE(0, access(FULLPATH, access_mode)); + ASSERT_EQ(EROFS, errno); +} + +/* The successful case of FUSE_ACCESS. */ +TEST_F(Access, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + mode_t access_mode = R_OK; + + expect_access(FUSE_ROOT_ID, X_OK, 0); + expect_lookup(RELPATH, ino); + expect_access(ino, access_mode, 0); + + ASSERT_EQ(0, access(FULLPATH, access_mode)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/access.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/getattr.cc =================================================================== --- head/tests/sys/fs/fusefs/getattr.cc (nonexistent) +++ head/tests/sys/fs/fusefs/getattr.cc (revision 350665) @@ -0,0 +1,300 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Getattr : public FuseTest { +public: +void expect_lookup(const char *relpath, uint64_t ino, mode_t mode, + uint64_t size, int times, uint64_t attr_valid, uint32_t attr_valid_nsec) +{ + EXPECT_LOOKUP(FUSE_ROOT_ID, relpath) + .Times(times) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = attr_valid; + out.body.entry.attr_valid_nsec = attr_valid_nsec; + out.body.entry.attr.size = size; + out.body.entry.entry_valid = UINT64_MAX; + }))); +} +}; + +class Getattr_7_8: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + FuseTest::SetUp(); +} +}; + +/* + * If getattr returns a non-zero cache timeout, then subsequent VOP_GETATTRs + * should use the cached attributes, rather than query the daemon + */ +TEST_F(Getattr, attr_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr_valid = UINT64_MAX; + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + }))); + EXPECT_EQ(0, stat(FULLPATH, &sb)); + /* The second stat(2) should use cached attributes */ + EXPECT_EQ(0, 
stat(FULLPATH, &sb)); +} + +/* + * If getattr returns a finite but non-zero cache timeout, then we should + * discard the cached attributes and requery the daemon after the timeout + * period passes. + */ +TEST_F(Getattr, attr_cache_timeout) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1, 0, 0); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr_valid_nsec = NAP_NS / 2; + out.body.attr.attr_valid = 0; + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + }))); + + EXPECT_EQ(0, stat(FULLPATH, &sb)); + nap(); + /* Timeout has expired. stat(2) should requery the daemon */ + EXPECT_EQ(0, stat(FULLPATH, &sb)); +} + +/* + * If attr.blksize is zero, then the kernel should use a default value for + * st_blksize + */ +TEST_F(Getattr, blksize_zero) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 1, 1, 0, 0); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.blksize = 0; + }))); + + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ((blksize_t)PAGE_SIZE, sb.st_blksize); +} + +TEST_F(Getattr, enoent) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char 
RELPATH[] = "some_file.txt"; + struct stat sb; + const uint64_t ino = 42; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 0, 1, 0, 0); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_NE(0, stat(FULLPATH, &sb)); + EXPECT_EQ(ENOENT, errno); +} + +TEST_F(Getattr, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + + expect_lookup(RELPATH, ino, S_IFREG | 0644, 1, 1, 0, 0); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.size = 1; + out.body.attr.attr.blocks = 2; + out.body.attr.attr.atime = 3; + out.body.attr.attr.mtime = 4; + out.body.attr.attr.ctime = 5; + out.body.attr.attr.atimensec = 6; + out.body.attr.attr.mtimensec = 7; + out.body.attr.attr.ctimensec = 8; + out.body.attr.attr.nlink = 9; + out.body.attr.attr.uid = 10; + out.body.attr.attr.gid = 11; + out.body.attr.attr.rdev = 12; + out.body.attr.attr.blksize = 12345; + }))); + + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(1, sb.st_size); + EXPECT_EQ(2, sb.st_blocks); + EXPECT_EQ(3, sb.st_atim.tv_sec); + EXPECT_EQ(6, sb.st_atim.tv_nsec); + EXPECT_EQ(4, sb.st_mtim.tv_sec); + EXPECT_EQ(7, sb.st_mtim.tv_nsec); + EXPECT_EQ(5, sb.st_ctim.tv_sec); + EXPECT_EQ(8, sb.st_ctim.tv_nsec); + EXPECT_EQ(9ull, sb.st_nlink); + EXPECT_EQ(10ul, sb.st_uid); + EXPECT_EQ(11ul, sb.st_gid); + EXPECT_EQ(12ul, sb.st_rdev); + EXPECT_EQ((blksize_t)12345, sb.st_blksize); + EXPECT_EQ(ino, sb.st_ino); + EXPECT_EQ(S_IFREG | 0644, sb.st_mode); + + 
//st_birthtim and st_flags are not supported by protocol 7.8. They're + //only supported as OS-specific extensions to OSX. + //EXPECT_EQ(, sb.st_birthtim); + //EXPECT_EQ(, sb.st_flags); + + //FUSE can't set st_blksize until protocol 7.9 +} + +TEST_F(Getattr_7_8, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr.size = 1; + }))); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr_7_8); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = S_IFREG | 0644; + out.body.attr.attr.size = 1; + out.body.attr.attr.blocks = 2; + out.body.attr.attr.atime = 3; + out.body.attr.attr.mtime = 4; + out.body.attr.attr.ctime = 5; + out.body.attr.attr.atimensec = 6; + out.body.attr.attr.mtimensec = 7; + out.body.attr.attr.ctimensec = 8; + out.body.attr.attr.nlink = 9; + out.body.attr.attr.uid = 10; + out.body.attr.attr.gid = 11; + out.body.attr.attr.rdev = 12; + }))); + + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(1, sb.st_size); + EXPECT_EQ(2, sb.st_blocks); + EXPECT_EQ(3, sb.st_atim.tv_sec); + EXPECT_EQ(6, sb.st_atim.tv_nsec); + EXPECT_EQ(4, sb.st_mtim.tv_sec); + EXPECT_EQ(7, sb.st_mtim.tv_nsec); + EXPECT_EQ(5, sb.st_ctim.tv_sec); + EXPECT_EQ(8, sb.st_ctim.tv_nsec); + EXPECT_EQ(9ull, sb.st_nlink); + EXPECT_EQ(10ul, sb.st_uid); + EXPECT_EQ(11ul, sb.st_gid); + EXPECT_EQ(12ul, sb.st_rdev); + EXPECT_EQ(ino, sb.st_ino); + EXPECT_EQ(S_IFREG | 0644, sb.st_mode); + + 
//st_birthtim and st_flags are not supported by protocol 7.8. They're + //only supported as OS-specific extensions to OSX. +} Property changes on: head/tests/sys/fs/fusefs/getattr.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/link.cc =================================================================== --- head/tests/sys/fs/fusefs/link.cc (nonexistent) +++ head/tests/sys/fs/fusefs/link.cc (revision 350665) @@ -0,0 +1,233 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Link: public FuseTest { +public: +void expect_link(uint64_t ino, const char *relpath, mode_t mode, uint32_t nlink) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(struct fuse_link_in); + return (in.header.opcode == FUSE_LINK && + in.body.link.oldnodeid == ino && + (0 == strcmp(name, relpath))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.nodeid = ino; + out.body.entry.attr.mode = mode; + out.body.entry.attr.nlink = nlink; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, 0, 1); +} +}; + +class Link_7_8: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + FuseTest::SetUp(); +} + +void expect_link(uint64_t ino, const char *relpath, mode_t mode, uint32_t nlink) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(struct fuse_link_in); + return (in.header.opcode == FUSE_LINK && + in.body.link.oldnodeid == ino && + (0 == strcmp(name, relpath))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto 
in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.nodeid = ino; + out.body.entry.attr.mode = mode; + out.body.entry.attr.nlink = nlink; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino) +{ + FuseTest::expect_lookup_7_8(relpath, ino, S_IFREG | 0644, 0, 1); +} +}; + +/* + * A successful link should clear the parent directory's attribute cache, + * because the fuse daemon should update its mtime and ctime + */ +TEST_F(Link, clear_attr_cache) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const uint64_t ino = 42; + mode_t mode = S_IFREG | 0644; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == FUSE_ROOT_ID); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = FUSE_ROOT_ID; + out.body.attr.attr.mode = S_IFDIR | 0755; + out.body.attr.attr_valid = UINT64_MAX; + }))); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + expect_link(ino, RELPATH, mode, 2); + + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); + EXPECT_EQ(0, link(FULLDST, FULLPATH)) << strerror(errno); + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); +} + +TEST_F(Link, emlink) +{ + const char FULLPATH[] = "mountpoint/lnk"; + const char RELPATH[] = "lnk"; + const char FULLDST[] = 
"mountpoint/dst"; + const char RELDST[] = "dst"; + uint64_t dst_ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_lookup(RELDST, dst_ino); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(struct fuse_link_in); + return (in.header.opcode == FUSE_LINK && + in.body.link.oldnodeid == dst_ino && + (0 == strcmp(name, RELPATH))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EMLINK))); + + EXPECT_EQ(-1, link(FULLDST, FULLPATH)); + EXPECT_EQ(EMLINK, errno); +} + +TEST_F(Link, ok) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const uint64_t ino = 42; + mode_t mode = S_IFREG | 0644; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + expect_link(ino, RELPATH, mode, 2); + + ASSERT_EQ(0, link(FULLDST, FULLPATH)) << strerror(errno); + // Check that the original file's nlink count has increased. 
+ ASSERT_EQ(0, stat(FULLDST, &sb)) << strerror(errno); + EXPECT_EQ(2ul, sb.st_nlink); +} + +TEST_F(Link_7_8, ok) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const uint64_t ino = 42; + mode_t mode = S_IFREG | 0644; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.attr.mode = mode; + out.body.entry.nodeid = ino; + out.body.entry.attr.nlink = 1; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + expect_link(ino, RELPATH, mode, 2); + + ASSERT_EQ(0, link(FULLDST, FULLPATH)) << strerror(errno); + // Check that the original file's nlink count has increased. + ASSERT_EQ(0, stat(FULLDST, &sb)) << strerror(errno); + EXPECT_EQ(2ul, sb.st_nlink); +} Property changes on: head/tests/sys/fs/fusefs/link.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/lookup.cc =================================================================== --- head/tests/sys/fs/fusefs/lookup.cc (nonexistent) +++ head/tests/sys/fs/fusefs/lookup.cc (revision 350665) @@ -0,0 +1,381 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Lookup: public FuseTest {}; +class Lookup_7_8: public Lookup { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + Lookup::SetUp(); +} +}; + +/* + * If lookup returns a non-zero cache timeout, then subsequent VOP_GETATTRs + * should use the cached attributes, rather than query the daemon + */ +TEST_F(Lookup, attr_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const uint64_t generation = 13; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.ino = ino; // Must match nodeid + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.attr.size = 1; + out.body.entry.attr.blocks = 2; + out.body.entry.attr.atime = 3; + out.body.entry.attr.mtime = 4; + out.body.entry.attr.ctime = 5; + out.body.entry.attr.atimensec = 6; + out.body.entry.attr.mtimensec = 7; + out.body.entry.attr.ctimensec = 8; + out.body.entry.attr.nlink = 9; + out.body.entry.attr.uid = 10; + out.body.entry.attr.gid = 11; + out.body.entry.attr.rdev = 12; + out.body.entry.generation = generation; + }))); + /* stat(2) issues a VOP_LOOKUP followed by a VOP_GETATTR */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); + EXPECT_EQ(1, sb.st_size); + EXPECT_EQ(2, sb.st_blocks); + EXPECT_EQ(3, sb.st_atim.tv_sec); + EXPECT_EQ(6, sb.st_atim.tv_nsec); + EXPECT_EQ(4, sb.st_mtim.tv_sec); + EXPECT_EQ(7, sb.st_mtim.tv_nsec); + EXPECT_EQ(5, sb.st_ctim.tv_sec); + EXPECT_EQ(8, sb.st_ctim.tv_nsec); + EXPECT_EQ(9ull, sb.st_nlink); + EXPECT_EQ(10ul, sb.st_uid); + EXPECT_EQ(11ul, sb.st_gid); + EXPECT_EQ(12ul, sb.st_rdev); + EXPECT_EQ(ino, sb.st_ino); + EXPECT_EQ(S_IFREG | 0644, sb.st_mode); + + // fuse(4) 
does not _yet_ support inode generations + //EXPECT_EQ(generation, sb.st_gen); + + //st_birthtim and st_flags are not supported by protocol 7.8. They're + //only supported as OS-specific extensions to OSX. + //EXPECT_EQ(, sb.st_birthtim); + //EXPECT_EQ(, sb.st_flags); + + //FUSE can't set st_blksize until protocol 7.9 +} + +/* + * If lookup returns a finite but non-zero cache timeout, then we should discard + * the cached attributes and requery the daemon once that timeout expires. + */ +TEST_F(Lookup, attr_cache_timeout) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.nodeid = ino; + out.body.entry.attr_valid_nsec = NAP_NS / 2; + out.body.entry.attr.ino = ino; // Must match nodeid + out.body.entry.attr.mode = S_IFREG | 0644; + }))); + + /* access(2) will issue a VOP_LOOKUP and fill the attr cache */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + /* Sleep until the cached attributes have expired */ + nap(); + /* The cache has timed out; VOP_GETATTR should query the daemon */ + ASSERT_EQ(0, stat(FULLPATH, &sb)) << strerror(errno); +} + +TEST_F(Lookup, dot) +{ + const char FULLPATH[] = "mountpoint/some_dir/."; + const char RELDIRPATH[] = "some_dir"; + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + /* + * access(2) is one of the few syscalls that will not (always) follow + * up a successful VOP_LOOKUP with another VOP. 
+ */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +TEST_F(Lookup, dotdot) +{ + const char FULLPATH[] = "mountpoint/some_dir/.."; + const char RELDIRPATH[] = "some_dir"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = 14; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.entry_valid = UINT64_MAX; + }))); + + /* + * access(2) is one of the few syscalls that will not (always) follow + * up a successful VOP_LOOKUP with another VOP. + */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +TEST_F(Lookup, enoent) +{ + const char FULLPATH[] = "mountpoint/does_not_exist"; + const char RELPATH[] = "does_not_exist"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_NE(0, access(FULLPATH, F_OK)); + EXPECT_EQ(ENOENT, errno); +} + +TEST_F(Lookup, enotdir) +{ + const char FULLPATH[] = "mountpoint/not_a_dir/some_file.txt"; + const char RELPATH[] = "not_a_dir"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = 42; + }))); + + ASSERT_EQ(-1, access(FULLPATH, F_OK)); + ASSERT_EQ(ENOTDIR, errno); +} + +/* + * If lookup returns a non-zero entry timeout, then subsequent VOP_LOOKUPs + * should use the cached inode rather than requery the daemon + */ +TEST_F(Lookup, entry_cache) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = 14; + 
}))); + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + /* The second access(2) should use the cache */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +/* + * If the daemon returns an error of 0 and an inode of 0, that's a flag for + * "ENOENT and cache it" with the given entry_timeout + */ +TEST_F(Lookup, entry_cache_negative) +{ + struct timespec entry_valid = {.tv_sec = TIME_T_MAX, .tv_nsec = 0}; + + EXPECT_LOOKUP(FUSE_ROOT_ID, "does_not_exist") + .Times(1) + .WillOnce(Invoke(ReturnNegativeCache(&entry_valid))); + + EXPECT_NE(0, access("mountpoint/does_not_exist", F_OK)); + EXPECT_EQ(ENOENT, errno); + EXPECT_NE(0, access("mountpoint/does_not_exist", F_OK)); + EXPECT_EQ(ENOENT, errno); +} + +/* Negative entry caches should time out, too */ +TEST_F(Lookup, entry_cache_negative_timeout) +{ + const char *RELPATH = "does_not_exist"; + const char *FULLPATH = "mountpoint/does_not_exist"; + struct timespec entry_valid = {.tv_sec = 0, .tv_nsec = NAP_NS / 2}; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .Times(2) + .WillRepeatedly(Invoke(ReturnNegativeCache(&entry_valid))); + + EXPECT_NE(0, access(FULLPATH, F_OK)); + EXPECT_EQ(ENOENT, errno); + + nap(); + + /* The cache has timed out; VOP_LOOKUP should requery the daemon */ + EXPECT_NE(0, access(FULLPATH, F_OK)); + EXPECT_EQ(ENOENT, errno); +} + +/* + * If lookup returns a finite but non-zero entry cache timeout, then we should + * discard the cached inode and requery the daemon once that timeout expires + */ +TEST_F(Lookup, entry_cache_timeout) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.entry_valid_nsec = NAP_NS / 2; + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = 14; + }))); + + /* access(2) will issue a VOP_LOOKUP and fill the entry cache */ + ASSERT_EQ(0, 
access(FULLPATH, F_OK)) << strerror(errno); + /* Next access(2) will use the cached entry */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); + nap(); + /* The cache has timed out; VOP_LOOKUP should requery the daemon*/ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +TEST_F(Lookup, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = 14; + }))); + /* + * access(2) is one of the few syscalls that will not (always) follow + * up a successful VOP_LOOKUP with another VOP. + */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +// Lookup in a subdirectory of the fuse mount +TEST_F(Lookup, subdir) +{ + const char FULLPATH[] = "mountpoint/some_dir/some_file.txt"; + const char DIRPATH[] = "some_dir"; + const char RELPATH[] = "some_file.txt"; + uint64_t dir_ino = 2; + uint64_t file_ino = 3; + + EXPECT_LOOKUP(FUSE_ROOT_ID, DIRPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = dir_ino; + }))); + EXPECT_LOOKUP(dir_ino, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = file_ino; + }))); + /* + * access(2) is one of the few syscalls that will not (always) follow + * up a successful VOP_LOOKUP with another VOP. + */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +/* + * The server returns two different vtypes for the same nodeid. This is a bad + * server! But we shouldn't crash. 
+ */ +TEST_F(Lookup, vtype_conflict) +{ + const char FIRSTFULLPATH[] = "mountpoint/foo"; + const char SECONDFULLPATH[] = "mountpoint/bar"; + const char FIRSTRELPATH[] = "foo"; + const char SECONDRELPATH[] = "bar"; + uint64_t ino = 42; + + expect_lookup(FIRSTRELPATH, ino, S_IFREG | 0644, 0, 1, UINT64_MAX); + expect_lookup(SECONDRELPATH, ino, S_IFDIR | 0755, 0, 1, UINT64_MAX); + + ASSERT_EQ(0, access(FIRSTFULLPATH, F_OK)) << strerror(errno); + ASSERT_EQ(-1, access(SECONDFULLPATH, F_OK)); + ASSERT_EQ(EAGAIN, errno); +} + +TEST_F(Lookup_7_8, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.attr.mode = S_IFREG | 0644; + out.body.entry.nodeid = 14; + }))); + /* + * access(2) is one of the few syscalls that will not (always) follow + * up a successful VOP_LOOKUP with another VOP. + */ + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + + Property changes on: head/tests/sys/fs/fusefs/lookup.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/mkdir.cc =================================================================== --- head/tests/sys/fs/fusefs/mkdir.cc (nonexistent) +++ head/tests/sys/fs/fusefs/mkdir.cc (revision 350665) @@ -0,0 +1,222 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Mkdir: public FuseTest {}; +class Mkdir_7_8: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + FuseTest::SetUp(); +} +}; + +/* + * EMLINK is possible on filesystems that limit the number of hard links to a + * single file, like early versions of BtrFS + */ +TEST_F(Mkdir, emlink) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + mode_t mode = 0755; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mkdir_in); + return (in.header.opcode == FUSE_MKDIR && + in.body.mkdir.mode == (S_IFDIR | mode) && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(EMLINK))); + + ASSERT_EQ(-1, mkdir(FULLPATH, mode)); + ASSERT_EQ(EMLINK, errno); +} + +/* + * Creating a new directory after FUSE_LOOKUP returned a negative cache entry + */ +TEST_F(Mkdir, entry_cache_negative) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = 0755; + uint64_t ino = 42; + /* + * Set entry_valid = 0 because this test isn't concerned with whether + * or not we actually cache negative entries, only with whether we + * interpret negative cache responses correctly. 
+ */ + struct timespec entry_valid = {.tv_sec = 0, .tv_nsec = 0}; + + /* mkdir will first do a LOOKUP, adding a negative cache entry */ + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(ReturnNegativeCache(&entry_valid)); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mkdir_in); + return (in.header.opcode == FUSE_MKDIR && + in.body.mkdir.mode == (S_IFDIR | mode) && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + }))); + + ASSERT_EQ(0, mkdir(FULLPATH, mode)) << strerror(errno); +} + +/* + * Creating a new directory should purge any negative namecache entries + */ +TEST_F(Mkdir, entry_cache_negative_purge) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + mode_t mode = 0755; + uint64_t ino = 42; + struct timespec entry_valid = {.tv_sec = TIME_T_MAX, .tv_nsec = 0}; + + /* mkdir will first do a LOOKUP, adding a negative cache entry */ + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .Times(1) + .WillOnce(Invoke(ReturnNegativeCache(&entry_valid))) + .RetiresOnSaturation(); + + /* Then the MKDIR should purge the negative cache entry */ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mkdir_in); + return (in.header.opcode == FUSE_MKDIR && + in.body.mkdir.mode == (S_IFDIR | mode) && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | mode; + out.body.entry.nodeid = ino; + out.body.entry.attr_valid = UINT64_MAX; + }))); + + ASSERT_EQ(0, 
mkdir(FULLPATH, mode)) << strerror(errno); + + /* Finally, a subsequent lookup should query the daemon */ + expect_lookup(RELPATH, ino, S_IFDIR | mode, 0, 1); + + ASSERT_EQ(0, access(FULLPATH, F_OK)) << strerror(errno); +} + +TEST_F(Mkdir, ok) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + mode_t mode = 0755; + uint64_t ino = 42; + mode_t mask; + + mask = umask(0); + (void)umask(mask); + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mkdir_in); + return (in.header.opcode == FUSE_MKDIR && + in.body.mkdir.mode == (S_IFDIR | mode) && + in.body.mkdir.umask == mask && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.create.entry.attr.mode = S_IFDIR | mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = UINT64_MAX; + }))); + + ASSERT_EQ(0, mkdir(FULLPATH, mode)) << strerror(errno); +} + +TEST_F(Mkdir_7_8, ok) +{ + const char FULLPATH[] = "mountpoint/some_dir"; + const char RELPATH[] = "some_dir"; + mode_t mode = 0755; + uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes + + sizeof(fuse_mkdir_in); + return (in.header.opcode == FUSE_MKDIR && + in.body.mkdir.mode == (S_IFDIR | mode) && + (0 == strcmp(RELPATH, name))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.create.entry.attr.mode = S_IFDIR | mode; + out.body.create.entry.nodeid = ino; + out.body.create.entry.entry_valid = UINT64_MAX; + out.body.create.entry.attr_valid = 
UINT64_MAX; + }))); + + ASSERT_EQ(0, mkdir(FULLPATH, mode)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/mkdir.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/rename.cc =================================================================== --- head/tests/sys/fs/fusefs/rename.cc (nonexistent) +++ head/tests/sys/fs/fusefs/rename.cc (revision 350665) @@ -0,0 +1,321 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Rename: public FuseTest { + public: + int tmpfd = -1; + char tmpfile[80] = "/tmp/fuse.rename.XXXXXX"; + + virtual void TearDown() { + if (tmpfd >= 0) { + close(tmpfd); + unlink(tmpfile); + } + + FuseTest::TearDown(); + } + + void expect_getattr(uint64_t ino, mode_t mode) + { + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).WillOnce(Invoke( + ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = mode; + out.body.attr.attr_valid = UINT64_MAX; + }))); + } + +}; + +// EINVAL, dst is subdir of src +TEST_F(Rename, einval) +{ + const char FULLDST[] = "mountpoint/src/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t src_ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELSRC, src_ino, S_IFDIR | 0755, 0, 2); + EXPECT_LOOKUP(src_ino, RELDST).WillOnce(Invoke(ReturnErrno(ENOENT))); + + ASSERT_NE(0, rename(FULLSRC, FULLDST)); + ASSERT_EQ(EINVAL, errno); +} + +// source does not exist +TEST_F(Rename, enoent) +{ + const char FULLDST[] = "mountpoint/dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = 
"src"; + // FUSE hardcodes the mountpoint to inode 1 + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELSRC) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + ASSERT_NE(0, rename(FULLSRC, FULLDST)); + ASSERT_EQ(ENOENT, errno); +} + +/* + * Renaming a file after FUSE_LOOKUP returned a negative cache entry for dst + */ +TEST_F(Rename, entry_cache_negative) +{ + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t dst_dir_ino = FUSE_ROOT_ID; + uint64_t ino = 42; + /* + * Set entry_valid = 0 because this test isn't concerned with whether + * or not we actually cache negative entries, only with whether we + * interpret negative cache responses correctly. + */ + struct timespec entry_valid = {.tv_sec = 0, .tv_nsec = 0}; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELSRC, ino, S_IFREG | 0644, 0, 1); + /* LOOKUP returns a negative cache entry for dst */ + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(ReturnNegativeCache(&entry_valid)); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *src = (const char*)in.body.bytes + + sizeof(fuse_rename_in); + const char *dst = src + strlen(src) + 1; + return (in.header.opcode == FUSE_RENAME && + in.body.rename.newdir == dst_dir_ino && + (0 == strcmp(RELDST, dst)) && + (0 == strcmp(RELSRC, src))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); +} + +/* + * Renaming a file should purge any negative namecache entries for the dst + */ +TEST_F(Rename, entry_cache_negative_purge) +{ + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t dst_dir_ino = FUSE_ROOT_ID; + uint64_t ino = 42; + struct timespec entry_valid = {.tv_sec = TIME_T_MAX, .tv_nsec = 0}; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELSRC, ino, 
S_IFREG | 0644, 0, 1); + /* LOOKUP returns a negative cache entry for dst */ + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(ReturnNegativeCache(&entry_valid)) + .RetiresOnSaturation(); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *src = (const char*)in.body.bytes + + sizeof(fuse_rename_in); + const char *dst = src + strlen(src) + 1; + return (in.header.opcode == FUSE_RENAME && + in.body.rename.newdir == dst_dir_ino && + (0 == strcmp(RELDST, dst)) && + (0 == strcmp(RELSRC, src))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); + + /* Finally, a subsequent lookup should query the daemon */ + expect_lookup(RELDST, ino, S_IFREG | 0644, 0, 1); + + ASSERT_EQ(0, access(FULLDST, F_OK)) << strerror(errno); +} + +TEST_F(Rename, exdev) +{ + const char FULLB[] = "mountpoint/src"; + const char RELB[] = "src"; + // FUSE hardcodes the mountpoint to inode 1 + uint64_t b_ino = 42; + + tmpfd = mkstemp(tmpfile); + ASSERT_LE(0, tmpfd) << strerror(errno); + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELB, b_ino, S_IFREG | 0644, 0, 2); + + ASSERT_NE(0, rename(tmpfile, FULLB)); + ASSERT_EQ(EXDEV, errno); + + ASSERT_NE(0, rename(FULLB, tmpfile)); + ASSERT_EQ(EXDEV, errno); +} + +TEST_F(Rename, ok) +{ + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + uint64_t dst_dir_ino = FUSE_ROOT_ID; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELSRC, ino, S_IFREG | 0644, 0, 1); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDST) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *src = (const char*)in.body.bytes + + sizeof(fuse_rename_in); + const char *dst = src + strlen(src) + 1; + return (in.header.opcode == FUSE_RENAME && + in.body.rename.newdir == dst_dir_ino && + (0 == 
strcmp(RELDST, dst)) && + (0 == strcmp(RELSRC, src))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); +} + +/* When moving a file to a new directory, update its parent */ +TEST_F(Rename, parent) +{ + const char FULLDST[] = "mountpoint/dstdir/dst"; + const char RELDSTDIR[] = "dstdir"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + const char FULLDSTPARENT[] = "mountpoint/dstdir/dst/.."; + Sequence seq; + uint64_t dst_dir_ino = 43; + uint64_t ino = 42; + struct stat sb; + + expect_lookup(RELSRC, ino, S_IFDIR | 0755, 0, 1); + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + EXPECT_LOOKUP(FUSE_ROOT_ID, RELDSTDIR) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.nodeid = dst_dir_ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.attr.ino = dst_dir_ino; + }))); + EXPECT_LOOKUP(dst_dir_ino, RELDST) + .InSequence(seq) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *src = (const char*)in.body.bytes + + sizeof(fuse_rename_in); + const char *dst = src + strlen(src) + 1; + return (in.header.opcode == FUSE_RENAME && + in.body.rename.newdir == dst_dir_ino && + (0 == strcmp(RELDST, dst)) && + (0 == strcmp(RELSRC, src))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + EXPECT_LOOKUP(dst_dir_ino, RELDST) + .InSequence(seq) + .WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFDIR | 0755; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + }))); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); + ASSERT_EQ(0, stat(FULLDSTPARENT, &sb)) << 
strerror(errno); + ASSERT_EQ(dst_dir_ino, sb.st_ino); +} + +// Rename overwrites an existing destination file +TEST_F(Rename, overwrite) +{ + const char FULLDST[] = "mountpoint/dst"; + const char RELDST[] = "dst"; + const char FULLSRC[] = "mountpoint/src"; + const char RELSRC[] = "src"; + // The inode of the already-existing destination file + uint64_t dst_ino = 2; + uint64_t dst_dir_ino = FUSE_ROOT_ID; + uint64_t ino = 42; + + expect_getattr(FUSE_ROOT_ID, S_IFDIR | 0755); + expect_lookup(RELSRC, ino, S_IFREG | 0644, 0, 1); + expect_lookup(RELDST, dst_ino, S_IFREG | 0644, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *src = (const char*)in.body.bytes + + sizeof(fuse_rename_in); + const char *dst = src + strlen(src) + 1; + return (in.header.opcode == FUSE_RENAME && + in.body.rename.newdir == dst_dir_ino && + (0 == strcmp(RELDST, dst)) && + (0 == strcmp(RELSRC, src))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + + ASSERT_EQ(0, rename(FULLSRC, FULLDST)) << strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/rename.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/symlink.cc =================================================================== --- head/tests/sys/fs/fusefs/symlink.cc (nonexistent) +++ head/tests/sys/fs/fusefs/symlink.cc (revision 350665) @@ -0,0 +1,178 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Symlink: public FuseTest { +public: + +void expect_symlink(uint64_t ino, const char *target, const char *relpath) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes; + const char *linkname = name + strlen(name) + 1; + return (in.header.opcode == FUSE_SYMLINK && + (0 == strcmp(linkname, target)) && + (0 == strcmp(name, relpath))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry); + out.body.entry.attr.mode = S_IFLNK | 0777; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + }))); +} + +}; + +class Symlink_7_8: public FuseTest { +public: +virtual void SetUp() { + m_kernel_minor_version = 8; + FuseTest::SetUp(); +} + +void expect_symlink(uint64_t ino, const char *target, const char *relpath) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes; + const char *linkname = name + strlen(name) + 1; + return (in.header.opcode == FUSE_SYMLINK && + (0 == strcmp(linkname, target)) && + (0 == strcmp(name, relpath))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, entry_7_8); + out.body.entry.attr.mode = S_IFLNK | 0777; + out.body.entry.nodeid = ino; + out.body.entry.entry_valid = UINT64_MAX; + out.body.entry.attr_valid = UINT64_MAX; + }))); +} + +}; + +/* + * A successful symlink should clear the parent directory's attribute cache, + * because the fuse daemon should update its mtime and ctime + */ +TEST_F(Symlink, clear_attr_cache) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char dst[] = "dst"; + const uint64_t ino = 42; + struct stat sb; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + 
.WillOnce(Invoke(ReturnErrno(ENOENT))); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == FUSE_ROOT_ID); + }, Eq(true)), + _) + ).Times(2) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = FUSE_ROOT_ID; + out.body.attr.attr.mode = S_IFDIR | 0755; + out.body.attr.attr_valid = UINT64_MAX; + }))); + expect_symlink(ino, dst, RELPATH); + + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); + EXPECT_EQ(0, symlink(dst, FULLPATH)) << strerror(errno); + EXPECT_EQ(0, stat("mountpoint", &sb)) << strerror(errno); +} + +TEST_F(Symlink, enospc) +{ + const char FULLPATH[] = "mountpoint/lnk"; + const char RELPATH[] = "lnk"; + const char dst[] = "dst"; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + const char *name = (const char*)in.body.bytes; + const char *linkname = name + strlen(name) + 1; + return (in.header.opcode == FUSE_SYMLINK && + (0 == strcmp(linkname, dst)) && + (0 == strcmp(name, RELPATH))); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(ENOSPC))); + + EXPECT_EQ(-1, symlink(dst, FULLPATH)); + EXPECT_EQ(ENOSPC, errno); +} + +TEST_F(Symlink, ok) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char dst[] = "dst"; + const uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_symlink(ino, dst, RELPATH); + + EXPECT_EQ(0, symlink(dst, FULLPATH)) << strerror(errno); +} + +TEST_F(Symlink_7_8, ok) +{ + const char FULLPATH[] = "mountpoint/src"; + const char RELPATH[] = "src"; + const char dst[] = "dst"; + const uint64_t ino = 42; + + EXPECT_LOOKUP(FUSE_ROOT_ID, RELPATH) + .WillOnce(Invoke(ReturnErrno(ENOENT))); + expect_symlink(ino, dst, RELPATH); + + EXPECT_EQ(0, symlink(dst, FULLPATH)) << strerror(errno); +} Property 
changes on: head/tests/sys/fs/fusefs/symlink.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/default_permissions_privileged.cc =================================================================== --- head/tests/sys/fs/fusefs/default_permissions_privileged.cc (nonexistent) +++ head/tests/sys/fs/fusefs/default_permissions_privileged.cc (revision 350665) @@ -0,0 +1,124 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +/* + * Tests for the "default_permissions" mount option that require a privileged + * user. + */ + +extern "C" { +#include +#include + +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class DefaultPermissionsPrivileged: public FuseTest { +virtual void SetUp() { + m_default_permissions = true; + FuseTest::SetUp(); + if (HasFatalFailure() || IsSkipped()) + return; + + if (geteuid() != 0) { + GTEST_SKIP() << "This test requires a privileged user"; + } + + /* With -o default_permissions, FUSE_ACCESS should never be called */ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_ACCESS); + }, Eq(true)), + _) + ).Times(0); +} + +public: +void expect_getattr(uint64_t ino, mode_t mode, uint64_t attr_valid, int times, + uid_t uid = 0, gid_t gid = 0) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_GETATTR && + in.header.nodeid == ino); + }, Eq(true)), + _) + ).Times(times) + .WillRepeatedly(Invoke(ReturnImmediate([=](auto i __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.ino = ino; // Must match nodeid + out.body.attr.attr.mode = mode; + out.body.attr.attr.size = 0; + out.body.attr.attr.uid = uid; + out.body.attr.attr.gid = gid; + out.body.attr.attr_valid = attr_valid; + }))); +} + +void expect_lookup(const char *relpath, uint64_t ino, mode_t mode, + uint64_t attr_valid, uid_t uid
= 0, gid_t gid = 0) +{ + FuseTest::expect_lookup(relpath, ino, mode, 0, 1, attr_valid, uid, gid); +} + +}; + +class Setattr: public DefaultPermissionsPrivileged {}; + +TEST_F(Setattr, sticky_regular_file) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + const uint64_t ino = 42; + const mode_t oldmode = 0644; + const mode_t newmode = 01644; + + expect_getattr(1, S_IFDIR | 0755, UINT64_MAX, 1); + expect_lookup(RELPATH, ino, S_IFREG | oldmode, UINT64_MAX, geteuid()); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_SETATTR); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnImmediate([](auto in __unused, auto& out) { + SET_OUT_HEADER_LEN(out, attr); + out.body.attr.attr.mode = S_IFREG | newmode; + }))); + + EXPECT_EQ(0, chmod(FULLPATH, newmode)) << strerror(errno); +} + + Property changes on: head/tests/sys/fs/fusefs/default_permissions_privileged.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/mount.cc =================================================================== --- head/tests/sys/fs/fusefs/mount.cc (nonexistent) +++ head/tests/sys/fs/fusefs/mount.cc (revision 350665) @@ -0,0 +1,152 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ */ + +extern "C" { +#include +#include +#include + +#include "mntopts.h" // for build_iovec +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class UpdateOk: public FuseTest, public WithParamInterface {}; +class UpdateErr: public FuseTest, public WithParamInterface {}; + +int mntflag_from_string(const char *s) +{ + if (0 == strcmp("MNT_RDONLY", s)) + return MNT_RDONLY; + else if (0 == strcmp("MNT_NOEXEC", s)) + return MNT_NOEXEC; + else if (0 == strcmp("MNT_NOSUID", s)) + return MNT_NOSUID; + else if (0 == strcmp("MNT_NOATIME", s)) + return MNT_NOATIME; + else if (0 == strcmp("MNT_SUIDDIR", s)) + return MNT_SUIDDIR; + else if (0 == strcmp("MNT_USER", s)) + return MNT_USER; + else + return 0; +} + +/* Some mount options can be changed by mount -u */ +TEST_P(UpdateOk, update) +{ + struct statfs statbuf; + struct iovec *iov = NULL; + int iovlen = 0; + int flag; + int newflags = MNT_UPDATE | MNT_SYNCHRONOUS; + + flag = mntflag_from_string(GetParam()); + if (flag == MNT_NOSUID && 0 != geteuid()) + GTEST_SKIP() << "Only root may clear MNT_NOSUID"; + if (flag == MNT_SUIDDIR && 0 != geteuid()) + GTEST_SKIP() << "Only root may set MNT_SUIDDIR"; + + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + /* + * All of the fields except f_flags are don't care, and f_flags is set by + * the VFS + */ + SET_OUT_HEADER_LEN(out, statfs); + }))); + + ASSERT_EQ(0, statfs("mountpoint", &statbuf)) << strerror(errno); + newflags = (statbuf.f_flags | MNT_UPDATE) ^ flag; + + build_iovec(&iov, &iovlen, "fstype", (void*)statbuf.f_fstypename, -1); + build_iovec(&iov, &iovlen, "fspath", (void*)statbuf.f_mntonname, -1); + build_iovec(&iov, &iovlen, "from", __DECONST(void *, "/dev/fuse"), -1); + ASSERT_EQ(0, nmount(iov, iovlen, newflags)) << strerror(errno); + + ASSERT_EQ(0, statfs("mountpoint", &statbuf)) << 
strerror(errno); + EXPECT_FALSE((newflags ^ statbuf.f_flags) & flag); +} + +/* Some mount options cannot be changed by mount -u */ +TEST_P(UpdateErr, update) +{ + struct statfs statbuf; + struct iovec *iov = NULL; + int iovlen = 0; + int flag; + int newflags = MNT_UPDATE | MNT_SYNCHRONOUS; + + flag = mntflag_from_string(GetParam()); + EXPECT_CALL(*m_mock, process( + ResultOf([](auto in) { + return (in.header.opcode == FUSE_STATFS); + }, Eq(true)), + _) + ).WillRepeatedly(Invoke(ReturnImmediate([=](auto in __unused, auto& out) { + /* + * All of the fields except f_flags are don't care, and f_flags is set by + * the VFS + */ + SET_OUT_HEADER_LEN(out, statfs); + }))); + + ASSERT_EQ(0, statfs("mountpoint", &statbuf)) << strerror(errno); + newflags = (statbuf.f_flags | MNT_UPDATE) ^ flag; + + build_iovec(&iov, &iovlen, "fstype", (void*)statbuf.f_fstypename, -1); + build_iovec(&iov, &iovlen, "fspath", (void*)statbuf.f_mntonname, -1); + build_iovec(&iov, &iovlen, "from", __DECONST(void *, "/dev/fuse"), -1); + /* + * Don't check nmount's return value, because vfs_domount may "fix" the + * options for us. The important thing is to check the final value of + * statbuf.f_flags below. 
+ */ + (void)nmount(iov, iovlen, newflags); + + ASSERT_EQ(0, statfs("mountpoint", &statbuf)) << strerror(errno); + EXPECT_TRUE((newflags ^ statbuf.f_flags) & flag); +} + +INSTANTIATE_TEST_CASE_P(Mount, UpdateOk, + ::testing::Values("MNT_RDONLY", "MNT_NOEXEC", "MNT_NOSUID", "MNT_NOATIME", + "MNT_SUIDDIR") +); + +INSTANTIATE_TEST_CASE_P(Mount, UpdateErr, + ::testing::Values("MNT_USER") +); Property changes on: head/tests/sys/fs/fusefs/mount.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head/tests/sys/fs/fusefs/release.cc =================================================================== --- head/tests/sys/fs/fusefs/release.cc (nonexistent) +++ head/tests/sys/fs/fusefs/release.cc (revision 350665) @@ -0,0 +1,224 @@ +/*- + * SPDX-License-Identifier: BSD-2-Clause-FreeBSD + * + * Copyright (c) 2019 The FreeBSD Foundation + * + * This software was developed by BFF Storage Systems, LLC under sponsorship + * from the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +extern "C" { +#include +#include +} + +#include "mockfs.hh" +#include "utils.hh" + +using namespace testing; + +class Release: public FuseTest { + +public: +void expect_lookup(const char *relpath, uint64_t ino, int times) +{ + FuseTest::expect_lookup(relpath, ino, S_IFREG | 0644, 0, times); +} + +void expect_release(uint64_t ino, uint64_t lock_owner, + uint32_t flags, int error) +{ + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_RELEASE && + in.header.nodeid == ino && + in.body.release.lock_owner == lock_owner && + in.body.release.fh == FH && + in.body.release.flags == flags); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(error))) + .RetiresOnSaturation(); +} +}; + +class ReleaseWithLocks: public Release { + virtual void SetUp() { + m_init_flags = FUSE_POSIX_LOCKS; + Release::SetUp(); + } +}; + + +/* If a file descriptor is duplicated, only the last close causes RELEASE */ +TEST_F(Release, dup) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd, fd2; + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, getpid(), O_RDONLY, 0); + + fd = open(FULLPATH, O_RDONLY); + EXPECT_LE(0, fd) << strerror(errno); + + fd2 = dup(fd); + + ASSERT_EQ(0, close(fd2)) << strerror(errno); + ASSERT_EQ(0, close(fd)) << strerror(errno); +} + +/* + * Some FUSE 
filesystems cache data internally and flush it on release. Such + * filesystems may generate errors during release. On Linux, these get + * returned by close(2). However, POSIX does not require close(2) to return + * this error. FreeBSD's fuse(4) should return EIO if it returns an error at + * all. + */ +/* http://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html */ +TEST_F(Release, eio) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, getpid(), O_WRONLY, EIO); + + fd = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_TRUE(0 == close(fd) || errno == EIO) << strerror(errno); +} + +/* + * FUSE_RELEASE should contain the same flags used for FUSE_OPEN + */ +/* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236340 */ +TEST_F(Release, DISABLED_flags) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, getpid(), O_RDWR | O_APPEND, 0); + + fd = open(FULLPATH, O_RDWR | O_APPEND); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(0, close(fd)) << strerror(errno); +} + +/* + * fuse(4) will issue multiple FUSE_OPEN operations for the same file if it's + * opened with different modes. Each FUSE_OPEN should get its own + * FUSE_RELEASE. 
+ */ +TEST_F(Release, multiple_opens) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd, fd2; + + expect_lookup(RELPATH, ino, 2); + expect_open(ino, 0, 2); + expect_flush(ino, 2, ReturnErrno(0)); + expect_release(ino, getpid(), O_RDONLY, 0); + + fd = open(FULLPATH, O_RDONLY); + EXPECT_LE(0, fd) << strerror(errno); + + expect_release(ino, getpid(), O_WRONLY, 0); + fd2 = open(FULLPATH, O_WRONLY); + EXPECT_LE(0, fd2) << strerror(errno); + + ASSERT_EQ(0, close(fd2)) << strerror(errno); + ASSERT_EQ(0, close(fd)) << strerror(errno); +} + +TEST_F(Release, ok) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, getpid(), O_RDONLY, 0); + + fd = open(FULLPATH, O_RDONLY); + EXPECT_LE(0, fd) << strerror(errno); + + ASSERT_EQ(0, close(fd)) << strerror(errno); +} + +/* When closing a file with a POSIX file lock, release should release the lock*/ +TEST_F(ReleaseWithLocks, unlock_on_close) +{ + const char FULLPATH[] = "mountpoint/some_file.txt"; + const char RELPATH[] = "some_file.txt"; + uint64_t ino = 42; + int fd; + struct flock fl; + pid_t pid = getpid(); + + expect_lookup(RELPATH, ino, 1); + expect_open(ino, 0, 1); + EXPECT_CALL(*m_mock, process( + ResultOf([=](auto in) { + return (in.header.opcode == FUSE_SETLK && + in.header.nodeid == ino && + in.body.setlk.fh == FH); + }, Eq(true)), + _) + ).WillOnce(Invoke(ReturnErrno(0))); + expect_flush(ino, 1, ReturnErrno(0)); + expect_release(ino, static_cast(pid), O_RDWR, 0); + + fd = open(FULLPATH, O_RDWR); + ASSERT_LE(0, fd) << strerror(errno); + fl.l_start = 0; + fl.l_len = 0; + fl.l_pid = pid; + fl.l_type = F_RDLCK; + fl.l_whence = SEEK_SET; + fl.l_sysid = 0; + ASSERT_NE(-1, fcntl(fd, F_SETLKW, &fl)) << strerror(errno); + + ASSERT_EQ(0, close(fd)) << 
strerror(errno); +} Property changes on: head/tests/sys/fs/fusefs/release.cc ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: head =================================================================== --- head (revision 350664) +++ head (revision 350665) Property changes on: head ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /projects/fuse2:r344558-344663,344665-344702,344704-344913,344915-345303,345305-345330,345332-345345,345347-345420,345422-348006,348008-348053,348055-350092,350094-350143,350145-350621