Index: projects/clang1000-import/UPDATING =================================================================== --- projects/clang1000-import/UPDATING (revision 358238) +++ projects/clang1000-import/UPDATING (revision 358239) @@ -1,2221 +1,2221 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE TO PEOPLE WHO THINK THAT FreeBSD 13.x IS SLOW: FreeBSD 13.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS- related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely disable the most expensive debugging functionality run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) 2020mmdd: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 10.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20200220: - ncurses has been updated to a newer version (6.1-20200118). Given the ABI + ncurses has been updated to a newer version (6.2-20200215). Given the ABI has changed, users will have to rebuild all the ports that are linked to ncurses. 20200217: The size of struct vnet and the magic cookie have changed. Users need to recompile libkvm and all modules using VIMAGE together with their new kernel. 20200212: Defining the long deprecated NO_CTF, NO_DEBUG_FILES, NO_INSTALLLIB, NO_MAN, NO_PROFILE, and NO_WARNS variables is now an error. Update your Makefiles and scripts to define MK_=no instead as required. One exception to this is that program or library Makefiles should define MAN to empty rather than setting MK_MAN=no. 20200108: Clang/LLVM is now the default compiler and LLD the default linker for riscv64. 20200107: make universe no longer uses GCC 4.2.1 on any architectures. Architectures not supported by in-tree Clang/LLVM require an external toolchain package. 20200104: GCC 4.2.1 is now not built by default, as part of the GCC 4.2.1 retirement plan. Specifically, the GCC, GCC_BOOTSTRAP, and GNUCXX options default to off for all supported CPU architectures. As a short-term transition aid they may be enabled via WITH_* options. GCC 4.2.1 is expected to be removed from the tree on 2020-03-31. 20200102: Support for armv5 has been disconnected and is being removed. The machine combination MACHINE=arm MACHINE_ARCH=arm is no longer valid. You must now use a MACHINE_ARCH of armv6 or armv7. The default MACHINE_ARCH for MACHINE=arm is now armv7. 20191226: Clang/LLVM is now the default compiler for all powerpc architectures. LLD is now the default linker for powerpc64. The change for powerpc64 also includes a change to the ELFv2 ABI, incompatible with the existing ABI. 20191226: Kernel-loadable random(4) modules are no longer unloadable. 20191222: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 9.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20191212: r355677 has modified the internal interface used between the NFS modules in the kernel. As such, they must all be upgraded simultaneously. I will do a version bump for this. 20191205: The root certificates of the Mozilla CA Certificate Store have been imported into the base system and can be managed with the certctl(8) utility. If you have installed the security/ca_root_nss port or package with the ETCSYMLINK option (the default), be advised that there may be differences between those included in the port and those included in base due to differences in nss branch used as well as general update frequency. Note also that certctl(8) cannot manage certs in the format used by the security/ca_root_nss port. 20191120: The amd(8) automount daemon has been disabled by default, and will be removed in the future. As of FreeBSD 10.1 the autofs(5) is available for automounting. 20191107: The nctgpio and wbwd drivers have been moved to the superio bus. If you have one of these drivers in a kernel configuration, then you should add device superio to it. If you use one of these drivers as a module and you compile a custom set of modules, then you should add superio to the set. 20191021: KPIs for network drivers to access interface addresses have changed. Users need to recompile NIC driver modules together with kernel. 20191021: The net.link.tap.user_open sysctl no longer prevents user opening of already created /dev/tapNN devices. Access is still controlled by node permissions, just like tun devices. The net.link.tap.user_open sysctl is now used only to allow users to perform devfs cloning of tap devices, and the subsequent open may not succeed if the user is not in the appropriate group. This sysctl may be deprecated/removed completely in the future. 20191009: mips, powerpc, and sparc64 are no longer built as part of universe / tinderbox unless MAKE_OBSOLETE_GCC is defined. If not defined, mips, powerpc, and sparc64 builds will look for the xtoolchain binaries and if installed use them for universe builds. As llvm 9.0 becomes vetted for these architectures, they will be removed from the list. 20191009: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 9.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20191003: The hpt27xx, hptmv, hptnr, and hptrr drivers have been removed from GENERIC. They are available as modules and can be loaded by adding to /boot/loader.conf hpt27xx_load="YES", hptmv_load="YES", hptnr_load="YES", or hptrr_load="YES", respectively. 20190913: ntpd no longer by default locks its pages in memory, allowing them to be paged out by the kernel. Use rlimit memlock to restore historic BSD behaviour. For example, add "rlimit memlock 32" to ntp.conf to lock up to 32 MB of ntpd address space in memory. 20190823: Several of ping6's options have been renamed for better consistency with ping. If you use any of -ARWXaghmrtwx, you must update your scripts. See ping6(8) for details. 20190727: The vfs.fusefs.sync_unmount and vfs.fusefs.init_backgrounded sysctls and the "-o sync_unmount" and "-o init_backgrounded" mount options have been removed from mount_fusefs(8). You can safely remove them from your scripts, because they had no effect. The vfs.fusefs.fix_broken_io, vfs.fusefs.sync_resize, vfs.fusefs.refresh_size, vfs.fusefs.mmap_enable, vfs.fusefs.reclaim_revoked, and vfs.fusefs.data_cache_invalidate sysctls have been removed. If you felt the need to set any of them to a non-default value, please tell asomers@FreeBSD.org why. 20190713: Default permissions on the /var/account/acct file (and copies of it rotated by periodic daily scripts) are changed from 0644 to 0640 because the file contains sensitive information that should not be world-readable. If the /var/account directory must be created by rc.d/accounting, the mode used is now 0750. Admins who use the accounting feature are encouraged to change the mode of an existing /var/account directory to 0750 or 0700. 20190620: Entropy collection and the /dev/random device are no longer optional components. The "device random" option has been removed. Implementations of distilling algorithms can still be made loadable with "options RANDOM_LOADABLE" (e.g., random_fortuna.ko). 20190612: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 8.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20190608: A fix was applied to i386 kernel modules to avoid panics with dpcpu or vnet. Users need to recompile i386 kernel modules having pcpu or vnet sections or they will refuse to load. 20190513: User-wired pages now have their own counter, vm.stats.vm.v_user_wire_count. The vm.max_wired sysctl was renamed to vm.max_user_wired and changed from an unsigned int to an unsigned long. bhyve VMs wired with the -S are now subject to the user wiring limit; the vm.max_user_wired sysctl may need to be tuned to avoid running into the limit. 20190507: The IPSEC option has been removed from GENERIC. Users requiring ipsec(4) must now load the ipsec(4) kernel module. 20190507: The tap(4) driver has been folded into tun(4), and the module has been renamed to tuntap. You should update any kld_list="if_tap" or kld_list="if_tun" entries in /etc/rc.conf, if_tap_load="YES" or if_tun_load="YES" entries in /boot/loader.conf to load the if_tuntap module instead, and "device tap" or "device tun" entries in kernel config files to select the tuntap device instead. 20190418: The following knobs have been added related to tradeoffs between safe use of the random device and availability in the absence of entropy: kern.random.initial_seeding.bypass_before_seeding: tunable; set non-zero to bypass the random device prior to seeding, or zero to block random requests until the random device is initially seeded. For now, set to 1 (unsafe) by default to restore pre-r346250 boot availability properties. kern.random.initial_seeding.read_random_bypassed_before_seeding: read-only diagnostic sysctl that is set when bypass is enabled and read_random(9) is bypassed, to enable programmatic handling of this initial condition, if desired. kern.random.initial_seeding.arc4random_bypassed_before_seeding: Similar to the above, but for for arc4random(9) initial seeding. kern.random.initial_seeding.disable_bypass_warnings: tunable; set non-zero to disable warnings in dmesg when the same conditions are met as for the diagnostic sysctls above. Defaults to zero, i.e., produce warnings in dmesg when the conditions are met. 20190416: The loadable random module KPI has changed; the random_infra_init() routine now requires a 3rd function pointer for a bool (*)(void) method that returns true if the random device is seeded (and therefore unblocked). 20190404: r345895 reverts r320698. This implies that an nfsuserd(8) daemon built from head sources between r320757 (July 6, 2017) and r338192 (Aug. 22, 2018) will not work unless the "-use-udpsock" is added to the command line. nfsuserd daemons built from head sources that are post-r338192 are not affected and should continue to work. 20190320: The fuse(4) module has been renamed to fusefs(4) for consistency with other filesystems. You should update any kld_load="fuse" entries in /etc/rc.conf, fuse_load="YES" entries in /boot/loader.conf, and "options FUSE" entries in kernel config files. 20190304: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 8.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20190226: geom_uzip(4) depends on the new module xz. If geom_uzip is statically compiled into your custom kernel, add 'device xz' statement to the kernel config. 20190219: drm and drm2 have been removed from the tree. Please see https://wiki.freebsd.org/Graphics for the latest information on migrating to the drm ports. 20190131: Iflib is no longer unconditionally compiled into the kernel. Drivers using iflib and statically compiled into the kernel, now require the 'device iflib' config option. For the same drivers loaded as modules on kernels not having 'device iflib', the iflib.ko module is loaded automatically. 20190125: The IEEE80211_AMPDU_AGE and AH_SUPPORT_AR5416 kernel configuration options no longer exist since r343219 and r343427 respectively; nothing uses them, so they should be just removed from custom kernel config files. 20181230: r342635 changes the way efibootmgr(8) works by requiring users to add the -b (bootnum) parameter for commands where the bootnum was previously specified with each option. For example 'efibootmgr -B 0001' is now 'efibootmgr -B -b 0001'. 20181220: r342286 modifies the NFSv4 server so that it obeys vfs.nfsd.nfs_privport in the same as it is applied to NFSv2 and 3. This implies that NFSv4 servers that have vfs.nfsd.nfs_privport set will only allow mounts from clients using a reserved port#. Since both the FreeBSD and Linux NFSv4 clients use reserved port#s by default, this should not affect most NFSv4 mounts. 20181219: The XLP config has been removed. We can't support 64-bit atomics in this kernel because it is running in 32-bit mode. XLP users must transition to running a 64-bit kernel (XLP64 or XLPN32). The mips GXEMUL support has been removed from FreeBSD. MALTA* + qemu is the preferred emulator today and we don't need two different ones. The old sibyte / swarm / Broadcom BCM1250 support has been removed from the mips port. 20181211: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 7.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20181211: Remove the timed and netdate programs from the base tree. Setting the time with these daemons has been obsolete for over a decade. 20181126: On amd64, arm64 and armv7 (architectures that install LLVM's ld.lld linker as /usr/bin/ld) GNU ld is no longer installed as ld.bfd, as it produces broken binaries when ifuncs are in use. Users needing GNU ld should install the binutils port or package. 20181123: The BSD crtbegin and crtend code has been enabled by default. It has had extensive testing on amd64, arm64, and i386. It can be disabled by building a world with -DWITHOUT_BSD_CRTBEGIN. 20181115: The set of CTM commands (ctm, ctm_smail, ctm_rmail, ctm_dequeue) has been converted to a port (misc/ctm) and will be removed from FreeBSD-13. It is available as a package (ctm) for all supported FreeBSD versions. 20181110: The default newsyslog.conf(5) file has been changed to only include files in /etc/newsyslog.conf.d/ and /usr/local/etc/newsyslog.conf.d/ if the filenames end in '.conf' and do not begin with a '.'. You should check the configuration files in these two directories match this naming convention. You can verify which configuration files are being included using the command: $ newsyslog -Nrv 20181015: Ports for the DRM modules have been simplified. Now, amd64 users should just install the drm-kmod port. All others should install drm-legacy-kmod. Graphics hardware that's newer than about 2010 usually works with drm-kmod. For hardware older than 2013, however, some users will need to use drm-legacy-kmod if drm-kmod doesn't work for them. Hardware older than 2008 usually only works in drm-legacy-kmod. The graphics team can only commit to hardware made since 2013 due to the complexity of the market and difficulty to test all the older cards effectively. If you have hardware supported by drm-kmod, you are strongly encouraged to use that as you will get better support. Other than KPI chasing, drm-legacy-kmod will not be updated. As outlined elsewhere, the drm and drm2 modules will be eliminated from the src base soon (with a limited exception for arm). Please update to the package asap and report any issues to x11@freebsd.org. Generally, anybody using the drm*-kmod packages should add WITHOUT_DRM_MODULE=t and WITHOUT_DRM2_MODULE=t to avoid nasty cross-threading surprises, especially with automatic driver loading from X11 startup. These will become the defaults in 13-current shortly. 20181012: The ixlv(4) driver has been renamed to iavf(4). As a consequence, custom kernel and module loading configuration files must be updated accordingly. Moreover, interfaces previous presented as ixlvN to the system are now exposed as iavfN and network configuration files must be adjusted as necessary. 20181009: OpenSSL has been updated to version 1.1.1. This update included additional various API changes throughout the base system. It is important to rebuild third-party software after upgrading. The value of __FreeBSD_version has been bumped accordingly. 20181006: The legacy DRM modules and drivers have now been added to the loader's module blacklist, in favor of loading them with kld_list in rc.conf(5). The module blacklist may be overridden with the loader.conf(5) 'module_blacklist' variable, but loading them via rc.conf(5) is strongly encouraged. 20181002: The cam(4) based nda(4) driver will be used over nvd(4) by default on powerpc64. You may set 'options NVME_USE_NVD=1' in your kernel conf or loader tunable 'hw.nvme.use_nvd=1' if you wish to use the existing driver. Make sure to edit /boot/etc/kboot.conf and fstab to use the nda device name. 20180913: Reproducible build mode is now on by default, in preparation for FreeBSD 12.0. This eliminates build metadata such as the user, host, and time from the kernel (and uname), unless the working tree corresponds to a modified checkout from a version control system. The previous behavior can be obtained by setting the /etc/src.conf knob WITHOUT_REPRODUCIBLE_BUILD. 20180826: The Yarrow CSPRNG has been removed from the kernel as it has not been supported by its designers since at least 2003. Fortuna has been the default since FreeBSD-11. 20180822: devctl freeze/thaw have gone into the tree, the rc scripts have been updated to use them and devmatch has been changed. You should update kernel, userland and rc scripts all at the same time. 20180818: The default interpreter has been switched from 4th to Lua. LOADER_DEFAULT_INTERP, documented in build(7), will override the default interpreter. If you have custom FORTH code you will need to set LOADER_DEFAULT_INTERP=4th (valid values are 4th, lua or simp) in src.conf for the build. This will create default hard links between loader and loader_4th instead of loader and loader_lua, the new default. If you are using UEFI it will create the proper hard link to loader.efi. bhyve uses userboot.so. It remains 4th-only until some issues are solved regarding coexisting with multiple versions of FreeBSD are resolved. 20180815: ls(1) now respects the COLORTERM environment variable used in other systems and software to indicate that a colored terminal is both supported and desired. If ls(1) is suddenly emitting colors, they may be disabled again by either removing the unwanted COLORTERM from your environment, or using `ls --color=never`. The ls(1) specific CLICOLOR may not be observed in a future release. 20180808: The default pager for most commands has been changed to "less". To restore the old behavior, set PAGER="more" and MANPAGER="more -s" in your environment. 20180731: The jedec_ts(4) driver has been removed. A superset of its functionality is available in the jedec_dimm(4) driver, and the manpage for that driver includes migration instructions. If you have "device jedec_ts" in your kernel configuration file, it must be removed. 20180730: amd64/GENERIC now has EFI runtime services, EFIRT, enabled by default. This should have no effect if the kernel is booted via BIOS/legacy boot. EFIRT may be disabled via a loader tunable, efi.rt.disabled, if a system has a buggy firmware that prevents a successful boot due to use of runtime services. 20180727: Atmel AT91RM9200 and AT91SAM9, Cavium CNS 11xx and XScale support has been removed from the tree. These ports were obsolete and/or known to be broken for many years. 20180723: loader.efi has been augmented to participate more fully in the UEFI boot manager protocol. loader.efi will now look at the BootXXXX environment variable to determine if a specific kernel or root partition was specified. XXXX is derived from BootCurrent. efibootmgr(8) manages these standard UEFI variables. 20180720: zfsloader's functionality has now been folded into loader. zfsloader is no longer necessary once you've updated your boot blocks. For a transition period, we will install a hardlink for zfsloader to loader to allow a smooth transition until the boot blocks can be updated (hard link because old zfs boot blocks don't understand symlinks). 20180719: ARM64 now have efifb support, if you want to have serial console on your arm64 board when an screen is connected and the bootloader setup a frame buffer for us to use, just add : boot_serial=YES boot_multicons=YES in /boot/loader.conf For Raspberry Pi 3 (RPI) users, this is needed even if you don't have an screen connected as the firmware will setup a frame buffer are that u-boot will expose as an EFI frame buffer. 20180719: New uid:gid added, ntpd:ntpd (123:123). Be sure to run mergemaster or take steps to update /etc/passwd before doing installworld on existing systems. Do not skip the "mergemaster -Fp" step before installworld, as described in the update procedures near the bottom of this document. Also, rc.d/ntpd now starts ntpd(8) as user ntpd if the new mac_ntpd(4) policy is available, unless ntpd_flags or the ntp config file contain options that change file/dir locations. When such options (e.g., "statsdir" or "crypto") are used, ntpd can still be run as non-root by setting ntpd_user=ntpd in rc.conf, after taking steps to ensure that all required files/dirs are accessible by the ntpd user. 20180717: Big endian arm support has been removed. 20180711: The static environment setup in kernel configs is no longer mutually exclusive with the loader(8) environment by default. In order to restore the previous default behavior of disabling the loader(8) environment if a static environment is present, you must specify loader_env.disabled=1 in the static environment. 20180705: The ABI of syscalls used by management tools like sockstat and netstat has been broken to allow 32-bit binaries to work on 64-bit kernels without modification. These programs will need to match the kernel in order to function. External programs may require minor modifications to accommodate a change of type in structures from pointers to 64-bit virtual addresses. 20180702: On i386 and amd64 atomics are now inlined. Out of tree modules using atomics will need to be rebuilt. 20180701: The '%I' format in the kern.corefile sysctl limits the number of core files that a process can generate to the number stored in the debug.ncores sysctl. The '%I' format is replaced by the single digit index. Previously, if all indexes were taken the kernel would overwrite only a core file with the highest index in a filename. Currently the system will create a new core file if there is a free index or if all slots are taken it will overwrite the oldest one. 20180630: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180628: r335753 introduced a new quoting method. However, etc/devd/devmatch.conf needed to be changed to work with it. This change was made with r335763 and requires a mergemaster / etcupdate / etc to update the installed file. 20180612: r334930 changed the interface between the NFS modules, so they all need to be rebuilt. r335018 did a __FreeBSD_version bump for this. 20180530: As of r334391 lld is the default amd64 system linker; it is installed as /usr/bin/ld. Kernel build workarounds (see 20180510 entry) are no longer necessary. 20180530: The kernel / userland interface for devinfo changed, so you'll need a new kernel and userland as a pair for it to work (rebuilding lib/libdevinfo is all that's required). devinfo and devmatch will not work, but everything else will when there's a mismatch. 20180523: The on-disk format for hwpmc callchain records has changed to include threadid corresponding to a given record. This changes the field offsets and thus requires that libpmcstat be rebuilt before using a kernel later than r334108. 20180517: The vxge(4) driver has been removed. This driver was introduced into HEAD one week before the Exar left the Ethernet market and is not known to be used. If you have device vxge in your kernel config file it must be removed. 20180510: The amd64 kernel now requires a ld that supports ifunc to produce a working kernel, either lld or a newer binutils. lld is built by default on amd64, and the 'buildkernel' target uses it automatically. However, it is not the default linker, so building the kernel the traditional way requires LD=ld.lld on the command line (or LD=/usr/local/bin/ld for binutils port/package). lld will soon be default, and this requirement will go away. NOTE: As of r334391 lld is the default system linker on amd64, and no workaround is necessary. 20180508: The nxge(4) driver has been removed. This driver was for PCI-X 10g cards made by s2io/Neterion. The company was acquired by Exar and no longer sells or supports Ethernet products. If you have device nxge in your kernel config file it must be removed. 20180504: The tz database (tzdb) has been updated to 2018e. This version more correctly models time stamps in time zones with negative DST such as Europe/Dublin (from 1971 on), Europe/Prague (1946/7), and Africa/Windhoek (1994/2017). This does not affect the UT offsets, only time zone abbreviations and the tm_isdst flag. 20180502: The ixgb(4) driver has been removed. This driver was for an early and uncommon legacy PCI 10GbE for a single ASIC, Intel 82597EX. Intel quickly shifted to the long lived ixgbe family. If you have device ixgb in your kernel config file it must be removed. 20180501: The lmc(4) driver has been removed. This was a WAN interface card that was already reportedly rare in 2003, and had an ambiguous license. If you have device lmc in your kernel config file it must be removed. 20180413: Support for Arcnet networks has been removed. If you have device arcnet or device cm in your kernel config file they must be removed. 20180411: Support for FDDI networks has been removed. If you have device fddi or device fpa in your kernel config file they must be removed. 20180406: In addition to supporting RFC 3164 formatted messages, the syslogd(8) service is now capable of parsing RFC 5424 formatted log messages. The main benefit of using RFC 5424 is that clients may now send log messages with timestamps containing year numbers, microseconds and time zone offsets. Similarly, the syslog(3) C library function has been altered to send RFC 5424 formatted messages to the local system logging daemon. On systems using syslogd(8), this change should have no negative impact, as long as syslogd(8) and the C library are updated at the same time. On systems using a different system logging daemon, it may be necessary to make configuration adjustments, depending on the software used. When using syslog-ng, add the 'syslog-protocol' flag to local input sources to enable parsing of RFC 5424 formatted messages: source src { unix-dgram("/var/run/log" flags(syslog-protocol)); } When using rsyslog, disable the 'SysSock.UseSpecialParser' option of the 'imuxsock' module to let messages be processed by the regular RFC 3164/5424 parsing pipeline: module(load="imuxsock" SysSock.UseSpecialParser="off") Do note that these changes only affect communication between local applications and syslogd(8). The format that syslogd(8) uses to store messages on disk or forward messages to other systems remains unchanged. syslogd(8) still uses RFC 3164 for these purposes. Options to customize this behaviour will be added in the future. Utilities that process log files stored in /var/log are thus expected to continue to function as before. __FreeBSD_version has been incremented to 1200061 to denote this change. 20180328: Support for token ring networks has been removed. If you have "device token" in your kernel config you should remove it. No device drivers supported token ring. 20180323: makefs was modified to be able to tag ISO9660 El Torito boot catalog entries as EFI instead of overloading the i386 tag as done previously. The amd64 mkisoimages.sh script used to build amd64 ISO images for release was updated to use this. This may mean that makefs must be updated before "make cdrom" can be run in the release directory. This should be as simple as: $ cd $SRCDIR/usr.sbin/makefs $ make depend all install 20180212: FreeBSD boot loader enhanced with Lua scripting. It's purely opt-in for now by building WITH_LOADER_LUA and WITHOUT_FORTH in /etc/src.conf. Co-existence for the transition period will come shortly. Booting is a complex environment and test coverage for Lua-enabled loaders has been thin, so it would be prudent to assume it might not work and make provisions for backup boot methods. 20180211: devmatch functionality has been turned on in devd. It will automatically load drivers for unattached devices. This may cause unexpected drivers to be loaded. Please report any problems to current@ and imp@freebsd.org. 20180114: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180110: LLVM's lld linker is now used as the FreeBSD/amd64 bootstrap linker. This means it is used to link the kernel and userland libraries and executables, but is not yet installed as /usr/bin/ld by default. To revert to ld.bfd as the bootstrap linker, in /etc/src.conf set WITHOUT_LLD_BOOTSTRAP=yes 20180110: On i386, pmtimer has been removed. Its functionality has been folded into apm. It was a no-op on ACPI in current for a while now (but was still needed on i386 in FreeBSD 11 and earlier). Users may need to remove it from kernel config files. 20180104: The use of RSS hash from the network card aka flowid has been disabled by default for lagg(4) as it's currently incompatible with the lacp and loadbalance protocols. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" 20180102: The SW_WATCHDOG option is no longer necessary to enable the hardclock-based software watchdog if no hardware watchdog is configured. As before, SW_WATCHDOG will cause the software watchdog to be enabled even if a hardware watchdog is configured. 20171215: r326887 fixes the issue described in the 20171214 UPDATING entry. r326888 flips the switch back to building GELI support always. 20171214: r362593 broke ZFS + GELI support for reasons unknown. However, it also broke ZFS support generally, so GELI has been turned off by default as the lesser evil in r326857. If you boot off ZFS and/or GELI, it might not be a good time to update. 20171125: PowerPC users must update loader(8) by rebuilding world before installing a new kernel, as the protocol connecting them has changed. Without the update, loader metadata will not be passed successfully to the kernel and users will have to enter their root partition at the kernel mountroot prompt to continue booting. Newer versions of loader can boot old kernels without issue. 20171110: The LOADER_FIREWIRE_SUPPORT build variable as been renamed to WITH/OUT_LOADER_FIREWIRE. LOADER_{NO_,}GELI_SUPPORT has been renamed to WITH/OUT_LOADER_GELI. 20171106: The naive and non-compliant support of posix_fallocate(2) in ZFS has been removed as of r325320. The system call now returns EINVAL when used on a ZFS file. Although the new behavior complies with the standard, some consumers are not prepared to cope with it. One known victim is lld prior to r325420. 20171102: Building in a FreeBSD src checkout will automatically create object directories now rather than store files in the current directory if 'make obj' was not ran. Calling 'make obj' is no longer necessary. This feature can be disabled by setting WITHOUT_AUTO_OBJ=yes in /etc/src-env.conf (not /etc/src.conf), or passing the option in the environment. 20171101: The default MAKEOBJDIR has changed from /usr/obj/ for native builds, and /usr/obj// for cross-builds, to a unified /usr/obj//. This behavior can be changed to the old format by setting WITHOUT_UNIFIED_OBJDIR=yes in /etc/src-env.conf, the environment, or with -DWITHOUT_UNIFIED_OBJDIR when building. The UNIFIED_OBJDIR option is a transitional feature that will be removed for 12.0 release; please migrate to the new format for any tools by looking up the OBJDIR used by 'make -V .OBJDIR' means rather than hardcoding paths. 20171028: The native-xtools target no longer installs the files by default to the OBJDIR. Use the native-xtools-install target with a DESTDIR to install to ${DESTDIR}/${NXTP} where NXTP defaults to /nxb-bin. 20171021: As part of the boot loader infrastructure cleanup, LOADER_*_SUPPORT options are changing from controlling the build if defined / undefined to controlling the build with explicit 'yes' or 'no' values. They will shift to WITH/WITHOUT options to match other options in the system. 20171010: libstand has turned into a private library for sys/boot use only. It is no longer supported as a public interface outside of sys/boot. 20171005: The arm port has split armv6 into armv6 and armv7. armv7 is now a valid TARGET_ARCH/MACHINE_ARCH setting. If you have an armv7 system and are running a kernel from before r324363, you will need to add MACHINE_ARCH=armv7 to 'make buildworld' to do a native build. 20171003: When building multiple kernels using KERNCONF, non-existent KERNCONF files will produce an error and buildkernel will fail. Previously missing KERNCONF files silently failed giving no indication as to why, only to subsequently discover during installkernel that the desired kernel was never built in the first place. 20170912: The default serial number format for CTL LUNs has changed. This will affect users who use /dev/diskid/* device nodes, or whose FibreChannel or iSCSI clients care about their LUNs' serial numbers. Users who require serial number stability should hardcode serial numbers in /etc/ctl.conf . 20170912: For 32-bit arm compiled for hard-float support, soft-floating point binaries now always get their shared libraries from LD_SOFT_LIBRARY_PATH (in the past, this was only used if /usr/libsoft also existed). Only users with a hard-float ld.so, but soft-float everything else should be affected. 20170826: The geli password typed at boot is now hidden. To restore the previous behavior, see geli(8) for configuration options. 20170825: Move PMTUD blackhole counters to TCPSTATS and remove them from bare sysctl values. Minor nit, but requires a rebuild of both world/kernel to complete. 20170814: "make check" behavior (made in ^/head@r295380) has been changed to execute from a limited sandbox, as opposed to executing from ${TESTSDIR}. Behavioral changes: - The "beforecheck" and "aftercheck" targets are now specified. - ${CHECKDIR} (added in commit noted above) has been removed. - Legacy behavior can be enabled by setting WITHOUT_MAKE_CHECK_USE_SANDBOX in src.conf(5) or the environment. If the limited sandbox mode is enabled, "make check" will execute "make distribution", then install, execute the tests, and clean up the sandbox if successful. The "make distribution" and "make install" targets are typically run as root to set appropriate permissions and ownership at installation time. The end-user should set "WITH_INSTALL_AS_USER" in src.conf(5) or the environment if executing "make check" with limited sandbox mode using an unprivileged user. 20170808: Since the switch to GPT disk labels, fsck for UFS/FFS has been unable to automatically find alternate superblocks. As of r322297, the information needed to find alternate superblocks has been moved to the end of the area reserved for the boot block. Filesystems created with a newfs of this vintage or later will create the recovery information. If you have a filesystem created prior to this change and wish to have a recovery block created for your filesystem, you can do so by running fsck in foreground mode (i.e., do not use the -p or -y options). As it starts, fsck will ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should answer yes. 20170728: As of r321665, an NFSv4 server configuration that services Kerberos mounts or clients that do not support the uid/gid in owner/owner_group string capability, must explicitly enable the nfsuserd daemon by adding nfsuserd_enable="YES" to the machine's /etc/rc.conf file. 20170722: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 5.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170701: WITHOUT_RCMDS is now the default. Set WITH_RCMDS if you need the r-commands (rlogin, rsh, etc.) to be built with the base system. 20170625: The FreeBSD/powerpc platform now uses a 64-bit type for time_t. This is a very major ABI incompatible change, so users of FreeBSD/powerpc must be careful when performing source upgrades. It is best to run 'make installworld' from an alternate root system, either a live CD/memory stick, or a temporary root partition. Additionally, all ports must be recompiled. powerpc64 is largely unaffected, except in the case of 32-bit compatibility. All 32-bit binaries will be affected. 20170623: Forward compatibility for the "ino64" project have been committed. This will allow most new binaries to run on older kernels in a limited fashion. This prevents many of the common foot-shooting actions in the upgrade as well as the limited ability to roll back the kernel across the ino64 upgrade. Complicated use cases may not work properly, though enough simpler ones work to allow recovery in most situations. 20170620: Switch back to the BSDL dtc (Device Tree Compiler). Set WITH_GPL_DTC if you require the GPL compiler. 20170618: The internal ABI used for communication between the NFS kernel modules was changed by r320085, so __FreeBSD_version was bumped to ensure all the NFS related modules are updated together. 20170617: The ABI of struct event was changed by extending the data member to 64bit and adding ext fields. For upgrade, same precautions as for the entry 20170523 "ino64" must be followed. 20170531: The GNU roff toolchain has been removed from base. To render manpages which are not supported by mandoc(1), man(1) can fallback on GNU roff from ports (and recommends to install it). To render roff(7) documents, consider using GNU roff from ports or the heirloom doctools roff toolchain from ports via pkg install groff or via pkg install heirloom-doctools. 20170524: The ath(4) and ath_hal(4) modules now build piecemeal to allow for smaller runtime footprint builds. This is useful for embedded systems which only require one chipset support. If you load it as a module, make sure this is in /boot/loader.conf: if_ath_load="YES" This will load the HAL, all chip/RF backends and if_ath_pci. If you have if_ath_pci in /boot/loader.conf, ensure it is after if_ath or it will not load any HAL chipset support. If you want to selectively load things (eg on ye cheape ARM/MIPS platforms where RAM is at a premium) you should: * load ath_hal * load the chip modules in question * load ath_rate, ath_dfs * load ath_main * load if_ath_pci and/or if_ath_ahb depending upon your particular bus bind type - this is where probe/attach is done. For further comments/feedback, poke adrian@ . 20170523: The "ino64" 64-bit inode project has been committed, which extends a number of types to 64 bits. Upgrading in place requires care and adherence to the documented upgrade procedure. If using a custom kernel configuration ensure that the COMPAT_FREEBSD11 option is included (as during the upgrade the system will be running the ino64 kernel with the existing world). For the safest in-place upgrade begin by removing previous build artifacts via "rm -rf /usr/obj/*". Then, carefully follow the full procedure documented below under the heading "To rebuild everything and install it on the current system." Specifically, a reboot is required after installing the new kernel before installing world. While an installworld normally works by accident from multiuser after rebooting the proper kernel, there are many cases where this will fail across this upgrade and installworld from single user is required. 20170424: The NATM framework including the en(4), fatm(4), hatm(4), and patm(4) devices has been removed. Consumers should plan a migration before the end-of-life date for FreeBSD 11. 20170420: GNU diff has been replaced by a BSD licensed diff. Some features of GNU diff has not been implemented, if those are needed a newer version of GNU diff is available via the diffutils package under the gdiff name. 20170413: As of r316810 for ipfilter, keep frags is no longer assumed when keep state is specified in a rule. r316810 aligns ipfilter with documentation in man pages separating keep frags from keep state. This allows keep state to be specified without forcing keep frags and allows keep frags to be specified independently of keep state. To maintain previous behaviour, also specify keep frags with keep state (as documented in ipf.conf.5). 20170407: arm64 builds now use the base system LLD 4.0.0 linker by default, instead of requiring that the aarch64-binutils port or package be installed. To continue using aarch64-binutils, set CROSS_BINUTILS_PREFIX=/usr/local/aarch64-freebsd/bin . 20170405: The UDP optimization in entry 20160818 that added the sysctl net.inet.udp.require_l2_bcast has been reverted. L2 broadcast packets will no longer be treated as L3 broadcast packets. 20170331: Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail instead of using the first assigned address of the jail. 20170329: The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. If building cfiscsi.ko as a kernel module, the module can be loaded via one of the following methods: - `cfiscsi_load="YES"` in loader.conf(5). - Add `cfiscsi` to `$kld_list` in rc.conf(5). - ctladm(8)/ctld(8), when compiled with iSCSI support (`WITH_ISCSI=yes` in src.conf(5)) Please see cfiscsi(4) for more details. 20170316: The mmcsd.ko module now additionally depends on geom_flashmap.ko. Also, mmc.ko and mmcsd.ko need to be a matching pair built from the same source (previously, the dependency of mmcsd.ko on mmc.ko was missing, but mmcsd.ko now will refuse to load if it is incompatible with mmc.ko). 20170315: The syntax of ipfw(8) named states was changed to avoid ambiguity. If you have used named states in the firewall rules, you need to modify them after installworld and before rebooting. Now named states must be prefixed with colon. 20170311: The old drm (sys/dev/drm/) drivers for i915 and radeon have been removed as the userland we provide cannot use them. The KMS version (sys/dev/drm2) supports the same hardware. 20170302: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170221: The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It's not possible now to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change. 20170216: EISA bus support has been removed. The WITH_EISA option is no longer valid. 20170215: MCA bus support has been removed. 20170127: The WITH_LLD_AS_LD / WITHOUT_LLD_AS_LD build knobs have been renamed WITH_LLD_IS_LD / WITHOUT_LLD_IS_LD, for consistency with CLANG_IS_CC. 20170112: The EM_MULTIQUEUE kernel configuration option is deprecated now that the em(4) driver conforms to iflib specifications. 20170109: The igb(4), em(4) and lem(4) ethernet drivers are now implemented via IFLIB. If you have a custom kernel configuration that excludes em(4) but you use igb(4), you need to re-add em(4) to your custom configuration. 20161217: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161124: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161119: The layout of the pmap structure has changed for powerpc to put the pmap statistics at the front for all CPU variations. libkvm(3) and all tools that link against it need to be recompiled. 20161030: isl(4) and cyapa(4) drivers now require a new driver, chromebook_platform(4), to work properly on Chromebook-class hardware. On other types of hardware the drivers may need to be configured using device hints. Please see the corresponding manual pages for details. 20161017: The urtwn(4) driver was merged into rtwn(4) and now consists of rtwn(4) main module + rtwn_usb(4) and rtwn_pci(4) bus-specific parts. Also, firmware for RTL8188CE was renamed due to possible name conflict (rtwnrtl8192cU(B) -> rtwnrtl8192cE(B)) 20161015: GNU rcs has been removed from base. It is available as packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) before it was removed from base. 20161008: Use of the cc_cdg, cc_chd, cc_hd, or cc_vegas congestion control modules now requires that the kernel configuration contain the TCP_HHOOK option. (This option is included in the GENERIC kernel.) 20161003: The WITHOUT_ELFCOPY_AS_OBJCOPY src.conf(5) knob has been retired. ELF Tool Chain's elfcopy is always installed as /usr/bin/objcopy. 20160924: Relocatable object files with the extension of .So have been renamed to use an extension of .pico instead. The purpose of this change is to avoid a name clash with shared libraries on case-insensitive file systems. On those file systems, foo.So is the same file as foo.so. 20160918: GNU rcs has been turned off by default. It can (temporarily) be built again by adding WITH_RCS knob in src.conf. Otherwise, GNU rcs is available from packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) from base. 20160918: The backup_uses_rcs functionality has been removed from rc.subr. 20160908: The queue(3) debugging macro, QUEUE_MACRO_DEBUG, has been split into two separate components, QUEUE_MACRO_DEBUG_TRACE and QUEUE_MACRO_DEBUG_TRASH. Define both for the original QUEUE_MACRO_DEBUG behavior. 20160824: r304787 changed some ioctl interfaces between the iSCSI userspace programs and the kernel. ctladm, ctld, iscsictl, and iscsid must be rebuilt to work with new kernels. __FreeBSD_version has been bumped to 1200005. 20160818: The UDP receive code has been updated to only treat incoming UDP packets that were addressed to an L2 broadcast address as L3 broadcast packets. It is not expected that this will affect any standards-conforming UDP application. The new behaviour can be disabled by setting the sysctl net.inet.udp.require_l2_bcast to 0. 20160818: Remove the openbsd_poll system call. __FreeBSD_version has been bumped because of this. 20160708: The stable/11 branch has been created from head@r302406. 20160622: The libc stub for the pipe(2) system call has been replaced with a wrapper that calls the pipe2(2) system call and the pipe(2) system call is now only implemented by the kernels that include "options COMPAT_FREEBSD10" in their config file (this is the default). Users should ensure that this option is enabled in their kernel or upgrade userspace to r302092 before upgrading their kernel. 20160527: CAM will now strip leading spaces from SCSI disks' serial numbers. This will affect users who create UFS filesystems on SCSI disks using those disk's diskid device nodes. For example, if /etc/fstab previously contained a line like "/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should change it to "/dev/diskid/DISK-ABCDEFG0123456". Users of geom transforms like gmirror may also be affected. ZFS users should generally be fine. 20160523: The bitstring(3) API has been updated with new functionality and improved performance. But it is binary-incompatible with the old API. Objects built with the new headers may not be linked against objects built with the old headers. 20160520: The brk and sbrk functions have been removed from libc on arm64. Binutils from ports has been updated to not link to these functions and should be updated to the latest version before installing a new libc. 20160517: The armv6 port now defaults to hard float ABI. Limited support for running both hardfloat and soft float on the same system is available using the libraries installed with -DWITH_LIBSOFT. This has only been tested as an upgrade path for installworld and packages may fail or need manual intervention to run. New packages will be needed. To update an existing self-hosted armv6hf system, you must add TARGET_ARCH=armv6 on the make command line for both the build and the install steps. 20160510: Kernel modules compiled outside of a kernel build now default to installing to /boot/modules instead of /boot/kernel. Many kernel modules built this way (such as those in ports) already overrode KMODDIR explicitly to install into /boot/modules. However, manually building and installing a module from /sys/modules will now install to /boot/modules instead of /boot/kernel. 20160414: The CAM I/O scheduler has been committed to the kernel. There should be no user visible impact. This does enable NCQ Trim on ada SSDs. While the list of known rogues that claim support for this but actually corrupt data is believed to be complete, be on the lookout for data corruption. The known rogue list is believed to be complete: o Crucial MX100, M550 drives with MU01 firmware. o Micron M510 and M550 drives with MU01 firmware. o Micron M500 prior to MU07 firmware o Samsung 830, 840, and 850 all firmwares o FCCT M500 all firmwares Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware with working NCQ TRIM. For Micron branded drives, see your sales rep for updated firmware. Black listed drives will work correctly because these drives work correctly so long as no NCQ TRIMs are sent to them. Given this list is the same as found in Linux, it's believed there are no other rogues in the market place. All other models from the above vendors work. To be safe, if you are at all concerned, you can quirk each of your drives to prevent NCQ from being sent by setting: kern.cam.ada.X.quirks="0x2" in loader.conf. If the drive requires the 4k sector quirk, set the quirks entry to 0x3. 20160330: The FAST_DEPEND build option has been removed and its functionality is now the one true way. The old mkdep(1) style of 'make depend' has been removed. See 20160311 for further details. 20160317: Resource range types have grown from unsigned long to uintmax_t. All drivers, and anything using libdevinfo, need to be recompiled. 20160311: WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree builds. It no longer runs mkdep(1) during 'make depend', and the 'make depend' stage can safely be skipped now as it is auto ran when building 'make all' and will generate all SRCS and DPSRCS before building anything else. Dependencies are gathered at compile time with -MF flags kept in separate .depend files per object file. Users should run 'make cleandepend' once if using -DNO_CLEAN to clean out older stale .depend files. 20160306: On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND into kernel modules. Therefore, if you load any kernel modules at boot time, please install the boot loaders after you install the kernel, but before rebooting, e.g.: make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installkernel KERNCONF=YOUR_KERNEL_HERE make -C sys/boot install Then follow the usual steps, described in the General Notes section, below. 20160305: Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20160301: The AIO subsystem is now a standard part of the kernel. The VFS_AIO kernel option and aio.ko kernel module have been removed. Due to stability concerns, asynchronous I/O requests are only permitted on sockets and raw disks by default. To enable asynchronous I/O requests on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero value. 20160226: The ELF object manipulation tool objcopy is now provided by the ELF Tool Chain project rather than by GNU binutils. It should be a drop-in replacement, with the addition of arm64 support. The (temporary) src.conf knob WITHOUT_ELFCOPY_AS_OBJCOPY knob may be set to obtain the GNU version if necessary. 20160129: Building ZFS pools on top of zvols is prohibited by default. That feature has never worked safely; it's always been prone to deadlocks. Using a zvol as the backing store for a VM guest's virtual disk will still work, even if the guest is using ZFS. Legacy behavior can be restored by setting vfs.zfs.vol.recursive=1. 20160119: The NONE and HPN patches has been removed from OpenSSH. They are still available in the security/openssh-portable port. 20160113: With the addition of ypldap(8), a new _ypldap user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook. 20151216: The tftp loader (pxeboot) now uses the option root-path directive. As a consequence it no longer looks for a pxeboot.4th file on the tftp server. Instead it uses the regular /boot infrastructure as with the other loaders. 20151211: The code to start recording plug and play data into the modules has been committed. While the old tools will properly build a new kernel, a number of warnings about "unknown metadata record 4" will be produced for an older kldxref. To avoid such warnings, make sure to rebuild the kernel toolchain (or world). Make sure that you have r292078 or later when trying to build 292077 or later before rebuilding. 20151207: Debug data files are now built by default with 'make buildworld' and installed with 'make installworld'. This facilitates debugging but requires more disk space both during the build and for the installed world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes in src.conf(5). 20151130: r291527 changed the internal interface between the nfsd.ko and nfscommon.ko modules. As such, they must both be upgraded to-gether. __FreeBSD_version has been bumped because of this. 20151108: Add support for unicode collation strings leads to a change of order of files listed by ls(1) for example. To get back to the old behaviour, set LC_COLLATE environment variable to "C". Databases administrators will need to reindex their databases given collation results will be different. Due to a bug in install(1) it is recommended to remove the ancient locales before running make installworld. rm -rf /usr/share/locale/* 20151030: The OpenSSL has been upgraded to 1.0.2d. Any binaries requiring libcrypto.so.7 or libssl.so.7 must be recompiled. 20151020: Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0. Kernel modules isp_2400_multi and isp_2500_multi were removed and should be replaced with isp_2400 and isp_2500 modules respectively. 20151017: The build previously allowed using 'make -n' to not recurse into sub-directories while showing what commands would be executed, and 'make -n -n' to recursively show commands. Now 'make -n' will recurse and 'make -N' will not. 20151012: If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster and etcupdate will now use this file. A custom sendmail.cf is now updated via this mechanism rather than via installworld. If you had excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want to remove the exclusion or change it to "always install". /etc/mail/sendmail.cf is now managed the same way regardless of whether SENDMAIL_MC/SENDMAIL_CF is used. If you are not using SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior. 20151011: Compatibility shims for legacy ATA device names have been removed. It includes ATA_STATIC_ID kernel option, kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader tunables, kern.devalias.* environment variables, /dev/ad* and /dev/ar* symbolic links. 20151006: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.7.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20150924: Kernel debug files have been moved to /usr/lib/debug/boot/kernel/, and renamed from .symbols to .debug. This reduces the size requirements on the boot partition or file system and provides consistency with userland debug files. When using the supported kernel installation method the /usr/lib/debug/boot/kernel directory will be renamed (to kernel.old) as is done with /boot/kernel. Developers wishing to maintain the historical behavior of installing debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5). 20150827: The wireless drivers had undergone changes that remove the 'parent interface' from the ifconfig -l output. The rc.d network scripts used to check presence of a parent interface in the list, so old scripts would fail to start wireless networking. Thus, etcupdate(3) or mergemaster(8) run is required after kernel update, to update your rc.d scripts in /etc. 20150827: pf no longer supports 'scrub fragment crop' or 'scrub fragment drop-ovl' These configurations are now automatically interpreted as 'scrub fragment reassemble'. 20150817: Kernel-loadable modules for the random(4) device are back. To use them, the kernel must have device random options RANDOM_LOADABLE kldload(8) can then be used to load random_fortuna.ko or random_yarrow.ko. Please note that due to the indirect function calls that the loadable modules need to provide, the build-in variants will be slightly more efficient. The random(4) kernel option RANDOM_DUMMY has been retired due to unpopularity. It was not all that useful anyway. 20150813: The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired. Control over building the ELF Tool Chain tools is now provided by the WITHOUT_TOOLCHAIN knob. 20150810: The polarity of Pulse Per Second (PPS) capture events with the uart(4) driver has been corrected. Prior to this change the PPS "assert" event corresponded to the trailing edge of a positive PPS pulse and the "clear" event was the leading edge of the next pulse. As the width of a PPS pulse in a typical GPS receiver is on the order of 1 millisecond, most users will not notice any significant difference with this change. Anyone who has compensated for the historical polarity reversal by configuring a negative offset equal to the pulse width will need to remove that workaround. 20150809: The default group assigned to /dev/dri entries has been changed from 'wheel' to 'video' with the id of '44'. If you want to have access to the dri devices please add yourself to the video group with: # pw groupmod video -m $USER 20150806: The menu.rc and loader.rc files will now be replaced during upgrades. Please migrate local changes to menu.rc.local and loader.rc.local instead. 20150805: GNU Binutils versions of addr2line, c++filt, nm, readelf, size, strings and strip have been removed. The src.conf(5) knob WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools. 20150728: As ZFS requires more kernel stack pages than is the default on some architectures e.g. i386, it now warns if KSTACK_PAGES is less than ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing). Please consider using 'options KSTACK_PAGES=X' where X is greater than or equal to ZFS_MIN_KSTACK_PAGES i.e. 4 in such configurations. 20150706: sendmail has been updated to 8.15.2. Starting with FreeBSD 11.0 and sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default, i.e., they will not contain "::". For example, instead of ::1, it will be 0:0:0:0:0:0:0:1. This permits a zero subnet to have a more specific match, such as different map entries for IPv6:0:0 vs IPv6:0. This change requires that configuration data (including maps, files, classes, custom ruleset, etc.) must use the same format, so make certain such configuration data is upgrading. As a very simple check search for patterns like 'IPv6:[0-9a-fA-F:]*::' and 'IPv6::'. To return to the old behavior, set the m4 option confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option UseCompressedIPv6Addresses. 20150630: The default kernel entropy-processing algorithm is now Fortuna, replacing Yarrow. Assuming you have 'device random' in your kernel config file, the configurations allow a kernel option to override this default. You may choose *ONE* of: options RANDOM_YARROW # Legacy /dev/random algorithm. options RANDOM_DUMMY # Blocking-only driver. If you have neither, you get Fortuna. For most people, read no further, Fortuna will give a /dev/random that works like it always used to, and the difference will be irrelevant. If you remove 'device random', you get *NO* kernel-processed entropy at all. This may be acceptable to folks building embedded systems, but has complications. Carry on reading, and it is assumed you know what you need. *PLEASE* read random(4) and random(9) if you are in the habit of tweaking kernel configs, and/or if you are a member of the embedded community, wanting specific and not-usual behaviour from your security subsystems. NOTE!! If you use RANDOM_DUMMY and/or have no 'device random', you will NOT have a functioning /dev/random, and many cryptographic features will not work, including SSH. You may also find strange behaviour from the random(3) set of library functions, in particular sranddev(3), srandomdev(3) and arc4random(3). The reason for this is that the KERN_ARND sysctl only returns entropy if it thinks it has some to share, and with RANDOM_DUMMY or no 'device random' this will never happen. 20150623: An additional fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284717. 20150616: FreeBSD's old make (fmake) has been removed from the system. It is available as the devel/fmake port or via pkg install fmake. 20150615: The fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284436. The work around described in that entry is no longer needed unless the default setting is overridden by a confDH_PARAMETERS configuration setting of '5' or pointing to a 512 bit DH parameter file. 20150614: ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf and devel/kyua to version 0.20+ and adjust any calling code to work with Kyuafile and kyua. 20150614: The import of openssl to address the FreeBSD-SA-15:10.openssl security advisory includes a change which rejects handshakes with DH parameters below 768 bits. sendmail releases prior to 8.15.2 (not yet released), defaulted to a 512 bit DH parameter setting for client connections. To work around this interoperability, sendmail can be configured to use a 2048 bit DH parameter by: 1. Edit /etc/mail/`hostname`.mc 2. If a setting for confDH_PARAMETERS does not exist or exists and is set to a string beginning with '5', replace it with '2'. 3. If a setting for confDH_PARAMETERS exists and is set to a file path, create a new file with: openssl dhparam -out /path/to/file 2048 4. Rebuild the .cf file: cd /etc/mail/; make; make install 5. Restart sendmail: cd /etc/mail/; make restart A sendmail patch is coming, at which time this file will be updated. 20150604: Generation of legacy formatted entries have been disabled by default in pwd_mkdb(8), as all base system consumers of the legacy formatted entries were converted to use the new format by default when the new, machine independent format have been added and supported since FreeBSD 5.x. Please see the pwd_mkdb(8) manual page for further details. 20150525: Clang and llvm have been upgraded to 3.6.1 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150521: TI platform code switched to using vendor DTS files and this update may break existing systems running on Beaglebone, Beaglebone Black, and Pandaboard: - dtb files should be regenerated/reinstalled. Filenames are the same but content is different now - GPIO addressing was changed, now each GPIO bank (32 pins per bank) has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in old addressing scheme is now pin 25 on /dev/gpioc3. - Pandaboard: /etc/ttys should be updated, serial console device is now /dev/ttyu2, not /dev/ttyu0 20150501: soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim. If you need the GNU extension from groff soelim(1), install groff from package: pkg install groff, or via ports: textproc/groff. 20150423: chmod, chflags, chown and chgrp now affect symlinks in -R mode as defined in symlink(7); previously symlinks were silently ignored. 20150415: The const qualifier has been removed from iconv(3) to comply with POSIX. The ports tree is aware of this from r384038 onwards. 20150416: Libraries specified by LIBADD in Makefiles must have a corresponding DPADD_ variable to ensure correct dependencies. This is now enforced in src.libnames.mk. 20150324: From legacy ata(4) driver was removed support for SATA controllers supported by more functional drivers ahci(4), siis(4) and mvs(4). Kernel modules ataahci and ataadaptec were removed completely, replaced by ahci and mvs modules respectively. 20150315: Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher. 20150307: The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting. 20150217: If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated w/ a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected. 20150210: The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel. 20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. See 20150805 for updated information. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away. 20141222: The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors. 20141121: The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS. 20141109: faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time. 20141104: vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers. You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do. vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf: kern.vty=sc 20141102: pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it. 20141009: gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port. 20140923: pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest . 20140922: At svn r271982, The default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle. 20140729: The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer. 20140723: The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH. 20140719: The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration. 20140709: The GNU texinfo and GNU info pages are not built and installed anymore, WITH_INFO knob has been added to allow to built and install them again. UPDATE: see 20150102 entry on texinfo's removal 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: Maximal length of the serial number in CTL was increased from 16 to 64 chars, that breaks ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you found you need identical output adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The stable/10 branch has been created in subversion from head revision r256279. COMMON ITEMS: General Notes ------------- Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. Occasionally a build failure will occur with "make -j" due to a race condition. If this happens try building again without -j, and please report a bug if it happens consistently. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach if you encounter problems with a major version upgrade. Since the stable 4.x branch point, one has generally been able to upgrade from anywhere in the most recent stable branch to head / current (or even the last couple of stable branches). See the top of this file when there's an exception. The update process will emit an error on an attempt to perform a build or install from a FreeBSD version below the earliest supported version. When updating from an older version the update should be performed one major release at a time, including running `make delete-old` at each step. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. This file should be read as a log of events. When a later event changes information of a prior event, the prior event should not be deleted. Instead, a pointer to the entry with the new information should be placed in the old entry. Readers of this file should also sanity check older entries before relying on them blindly. Authors of new entries should write them with this in mind. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installkernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make buildkernel KERNCONF=YOUR_KERNEL_HERE [8] make installkernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a sh /etc/rc.d/zfs start # mount zfs filesystem, if needed cd src # full path to source adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a no-op. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] The new kernel must be able to run existing binaries used by an installworld. When upgrading across major versions, the new kernel's configuration must include the correct COMPAT_FREEBSD option for existing binaries (e.g. COMPAT_FREEBSD11 to run 11.x binaries). Failure to do so may leave you with a system that is hard to boot to recover. A GENERIC kernel will include suitable compatibility options to run binaries from older branches. Note that the ability to run binaries from unsupported branches is not guaranteed. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. Options also change over time, so you may need to adjust your custom kernels for these as well. [9] If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: projects/clang1000-import/bin/sh/miscbltin.c =================================================================== --- projects/clang1000-import/bin/sh/miscbltin.c (revision 358238) +++ projects/clang1000-import/bin/sh/miscbltin.c (revision 358239) @@ -1,606 +1,606 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Kenneth Almquist. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef lint #if 0 static char sccsid[] = "@(#)miscbltin.c 8.4 (Berkeley) 5/4/95"; #endif #endif /* not lint */ #include __FBSDID("$FreeBSD$"); /* * Miscellaneous builtins. */ #include #include #include #include #include #include #include #include #include #include "shell.h" #include "options.h" #include "var.h" #include "output.h" #include "memalloc.h" #include "error.h" #include "mystring.h" #include "syntax.h" #include "trap.h" #undef eflag #define READ_BUFLEN 1024 struct fdctx { int fd; size_t off; /* offset in buf */ size_t buflen; char *ep; /* tail pointer */ char buf[READ_BUFLEN]; }; static void fdctx_init(int, struct fdctx *); static void fdctx_destroy(struct fdctx *); static ssize_t fdgetc(struct fdctx *, char *); int readcmd(int, char **); int umaskcmd(int, char **); int ulimitcmd(int, char **); static void fdctx_init(int fd, struct fdctx *fdc) { off_t cur; /* Check if fd is seekable. */ cur = lseek(fd, 0, SEEK_CUR); *fdc = (struct fdctx){ .fd = fd, .buflen = (cur != -1) ? READ_BUFLEN : 1, .ep = &fdc->buf[0], /* No data */ }; } static ssize_t fdgetc(struct fdctx *fdc, char *c) { ssize_t nread; if (&fdc->buf[fdc->off] == fdc->ep) { nread = read(fdc->fd, fdc->buf, fdc->buflen); if (nread > 0) { fdc->off = 0; fdc->ep = fdc->buf + nread; } else return (nread); } *c = fdc->buf[fdc->off++]; return (1); } static void fdctx_destroy(struct fdctx *fdc) { - size_t residue; + off_t residue; if (fdc->buflen > 1) { /* * Reposition the file offset. Here is the layout of buf: * * | off * v * |*****************|-------| * buf ep buf+buflen * |<- residue ->| * * off: current character * ep: offset just after read(2) * residue: length for reposition */ residue = (fdc->ep - fdc->buf) - fdc->off; if (residue > 0) (void) lseek(fdc->fd, -residue, SEEK_CUR); } } /* * The read builtin. The -r option causes backslashes to be treated like * ordinary characters. * * This uses unbuffered input, which may be avoidable in some cases. * * Note that if IFS=' :' then read x y should work so that: * 'a b' x='a', y='b' * ' a b ' x='a', y='b' * ':b' x='', y='b' * ':' x='', y='' * '::' x='', y='' * ': :' x='', y='' * ':::' x='', y='::' * ':b c:' x='', y='b c:' */ int readcmd(int argc __unused, char **argv __unused) { char **ap; int backslash; char c; int rflag; char *prompt; const char *ifs; char *p; int startword; int status; int i; int is_ifs; int saveall = 0; ptrdiff_t lastnonifs, lastnonifsws; struct timeval tv; char *tvptr; fd_set ifds; ssize_t nread; int sig; struct fdctx fdctx; rflag = 0; prompt = NULL; tv.tv_sec = -1; tv.tv_usec = 0; while ((i = nextopt("erp:t:")) != '\0') { switch(i) { case 'p': prompt = shoptarg; break; case 'e': break; case 'r': rflag = 1; break; case 't': tv.tv_sec = strtol(shoptarg, &tvptr, 0); if (tvptr == shoptarg) error("timeout value"); switch(*tvptr) { case 0: case 's': break; case 'h': tv.tv_sec *= 60; /* FALLTHROUGH */ case 'm': tv.tv_sec *= 60; break; default: error("timeout unit"); } break; } } if (prompt && isatty(0)) { out2str(prompt); flushall(); } if (*(ap = argptr) == NULL) error("arg count"); if ((ifs = bltinlookup("IFS", 1)) == NULL) ifs = " \t\n"; if (tv.tv_sec >= 0) { /* * Wait for something to become available. */ FD_ZERO(&ifds); FD_SET(0, &ifds); status = select(1, &ifds, NULL, NULL, &tv); /* * If there's nothing ready, return an error. */ if (status <= 0) { sig = pendingsig; return (128 + (sig != 0 ? sig : SIGALRM)); } } status = 0; startword = 2; backslash = 0; STARTSTACKSTR(p); lastnonifs = lastnonifsws = -1; fdctx_init(STDIN_FILENO, &fdctx); for (;;) { nread = fdgetc(&fdctx, &c); if (nread == -1) { if (errno == EINTR) { sig = pendingsig; if (sig == 0) continue; status = 128 + sig; break; } warning("read error: %s", strerror(errno)); status = 2; break; } else if (nread != 1) { status = 1; break; } if (c == '\0') continue; CHECKSTRSPACE(1, p); if (backslash) { backslash = 0; if (c != '\n') { startword = 0; lastnonifs = lastnonifsws = p - stackblock(); USTPUTC(c, p); } continue; } if (!rflag && c == '\\') { backslash++; continue; } if (c == '\n') break; if (strchr(ifs, c)) is_ifs = strchr(" \t\n", c) ? 1 : 2; else is_ifs = 0; if (startword != 0) { if (is_ifs == 1) { /* Ignore leading IFS whitespace */ if (saveall) USTPUTC(c, p); continue; } if (is_ifs == 2 && startword == 1) { /* Only one non-whitespace IFS per word */ startword = 2; if (saveall) { lastnonifsws = p - stackblock(); USTPUTC(c, p); } continue; } } if (is_ifs == 0) { /* append this character to the current variable */ startword = 0; if (saveall) /* Not just a spare terminator */ saveall++; lastnonifs = lastnonifsws = p - stackblock(); USTPUTC(c, p); continue; } /* end of variable... */ startword = is_ifs; if (ap[1] == NULL) { /* Last variable needs all IFS chars */ saveall++; if (is_ifs == 2) lastnonifsws = p - stackblock(); USTPUTC(c, p); continue; } STACKSTRNUL(p); setvar(*ap, stackblock(), 0); ap++; STARTSTACKSTR(p); lastnonifs = lastnonifsws = -1; } fdctx_destroy(&fdctx); STACKSTRNUL(p); /* * Remove trailing IFS chars: always remove whitespace, don't remove * non-whitespace unless it was naked */ if (saveall <= 1) lastnonifsws = lastnonifs; stackblock()[lastnonifsws + 1] = '\0'; setvar(*ap, stackblock(), 0); /* Set any remaining args to "" */ while (*++ap != NULL) setvar(*ap, "", 0); return status; } int umaskcmd(int argc __unused, char **argv __unused) { char *ap; int mask; int i; int symbolic_mode = 0; while ((i = nextopt("S")) != '\0') { symbolic_mode = 1; } INTOFF; mask = umask(0); umask(mask); INTON; if ((ap = *argptr) == NULL) { if (symbolic_mode) { char u[4], g[4], o[4]; i = 0; if ((mask & S_IRUSR) == 0) u[i++] = 'r'; if ((mask & S_IWUSR) == 0) u[i++] = 'w'; if ((mask & S_IXUSR) == 0) u[i++] = 'x'; u[i] = '\0'; i = 0; if ((mask & S_IRGRP) == 0) g[i++] = 'r'; if ((mask & S_IWGRP) == 0) g[i++] = 'w'; if ((mask & S_IXGRP) == 0) g[i++] = 'x'; g[i] = '\0'; i = 0; if ((mask & S_IROTH) == 0) o[i++] = 'r'; if ((mask & S_IWOTH) == 0) o[i++] = 'w'; if ((mask & S_IXOTH) == 0) o[i++] = 'x'; o[i] = '\0'; out1fmt("u=%s,g=%s,o=%s\n", u, g, o); } else { out1fmt("%.4o\n", mask); } } else { if (is_digit(*ap)) { mask = 0; do { if (*ap >= '8' || *ap < '0') error("Illegal number: %s", *argptr); mask = (mask << 3) + (*ap - '0'); } while (*++ap != '\0'); umask(mask); } else { void *set; INTOFF; if ((set = setmode (ap)) == NULL) error("Illegal number: %s", ap); mask = getmode (set, ~mask & 0777); umask(~mask & 0777); free(set); INTON; } } return 0; } /* * ulimit builtin * * This code, originally by Doug Gwyn, Doug Kingston, Eric Gisin, and * Michael Rendell was ripped from pdksh 5.0.8 and hacked for use with * ash by J.T. Conklin. * * Public domain. */ struct limits { const char *name; const char *units; int cmd; short factor; /* multiply by to get rlim_{cur,max} values */ char option; }; static const struct limits limits[] = { #ifdef RLIMIT_CPU { "cpu time", "seconds", RLIMIT_CPU, 1, 't' }, #endif #ifdef RLIMIT_FSIZE { "file size", "512-blocks", RLIMIT_FSIZE, 512, 'f' }, #endif #ifdef RLIMIT_DATA { "data seg size", "kbytes", RLIMIT_DATA, 1024, 'd' }, #endif #ifdef RLIMIT_STACK { "stack size", "kbytes", RLIMIT_STACK, 1024, 's' }, #endif #ifdef RLIMIT_CORE { "core file size", "512-blocks", RLIMIT_CORE, 512, 'c' }, #endif #ifdef RLIMIT_RSS { "max memory size", "kbytes", RLIMIT_RSS, 1024, 'm' }, #endif #ifdef RLIMIT_MEMLOCK { "locked memory", "kbytes", RLIMIT_MEMLOCK, 1024, 'l' }, #endif #ifdef RLIMIT_NPROC { "max user processes", (char *)0, RLIMIT_NPROC, 1, 'u' }, #endif #ifdef RLIMIT_NOFILE { "open files", (char *)0, RLIMIT_NOFILE, 1, 'n' }, #endif #ifdef RLIMIT_VMEM { "virtual mem size", "kbytes", RLIMIT_VMEM, 1024, 'v' }, #endif #ifdef RLIMIT_SWAP { "swap limit", "kbytes", RLIMIT_SWAP, 1024, 'w' }, #endif #ifdef RLIMIT_SBSIZE { "socket buffer size", "bytes", RLIMIT_SBSIZE, 1, 'b' }, #endif #ifdef RLIMIT_NPTS { "pseudo-terminals", (char *)0, RLIMIT_NPTS, 1, 'p' }, #endif #ifdef RLIMIT_KQUEUES { "kqueues", (char *)0, RLIMIT_KQUEUES, 1, 'k' }, #endif #ifdef RLIMIT_UMTXP { "umtx shared locks", (char *)0, RLIMIT_UMTXP, 1, 'o' }, #endif { (char *) 0, (char *)0, 0, 0, '\0' } }; enum limithow { SOFT = 0x1, HARD = 0x2 }; static void printlimit(enum limithow how, const struct rlimit *limit, const struct limits *l) { rlim_t val = 0; if (how & SOFT) val = limit->rlim_cur; else if (how & HARD) val = limit->rlim_max; if (val == RLIM_INFINITY) out1str("unlimited\n"); else { val /= l->factor; out1fmt("%jd\n", (intmax_t)val); } } int ulimitcmd(int argc __unused, char **argv __unused) { rlim_t val = 0; enum limithow how = SOFT | HARD; const struct limits *l; int set, all = 0; int optc, what; struct rlimit limit; what = 'f'; while ((optc = nextopt("HSatfdsmcnuvlbpwko")) != '\0') switch (optc) { case 'H': how = HARD; break; case 'S': how = SOFT; break; case 'a': all = 1; break; default: what = optc; } for (l = limits; l->name && l->option != what; l++) ; if (!l->name) error("internal error (%c)", what); set = *argptr ? 1 : 0; if (set) { char *p = *argptr; if (all || argptr[1]) error("too many arguments"); if (strcmp(p, "unlimited") == 0) val = RLIM_INFINITY; else { char *end; uintmax_t uval; if (*p < '0' || *p > '9') error("bad number"); errno = 0; uval = strtoumax(p, &end, 10); if (errno != 0 || *end != '\0') error("bad number"); if (uval > UINTMAX_MAX / l->factor) error("bad number"); uval *= l->factor; val = (rlim_t)uval; if (val < 0 || (uintmax_t)val != uval || val == RLIM_INFINITY) error("bad number"); } } if (all) { for (l = limits; l->name; l++) { char optbuf[40]; if (getrlimit(l->cmd, &limit) < 0) error("can't get limit: %s", strerror(errno)); if (l->units) snprintf(optbuf, sizeof(optbuf), "(%s, -%c) ", l->units, l->option); else snprintf(optbuf, sizeof(optbuf), "(-%c) ", l->option); out1fmt("%-18s %18s ", l->name, optbuf); printlimit(how, &limit, l); } return 0; } if (getrlimit(l->cmd, &limit) < 0) error("can't get limit: %s", strerror(errno)); if (set) { if (how & SOFT) limit.rlim_cur = val; if (how & HARD) limit.rlim_max = val; if (setrlimit(l->cmd, &limit) < 0) error("bad limit: %s", strerror(errno)); } else printlimit(how, &limit, l); return 0; } Index: projects/clang1000-import/lib/libc/sys/truncate.2 =================================================================== --- projects/clang1000-import/lib/libc/sys/truncate.2 (revision 358238) +++ projects/clang1000-import/lib/libc/sys/truncate.2 (revision 358239) @@ -1,165 +1,168 @@ .\" Copyright (c) 1983, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)truncate.2 8.1 (Berkeley) 6/4/93 .\" $FreeBSD$ .\" -.Dd May 4, 2015 +.Dd January 24, 2020 .Dt TRUNCATE 2 .Os .Sh NAME .Nm truncate , .Nm ftruncate .Nd truncate or extend a file to a specified length .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In unistd.h .Ft int .Fn truncate "const char *path" "off_t length" .Ft int .Fn ftruncate "int fd" "off_t length" .Sh DESCRIPTION The .Fn truncate system call causes the file named by .Fa path or referenced by .Fa fd to be truncated or extended to .Fa length bytes in size. If the file was larger than this size, the extra data is lost. If the file was smaller than this size, it will be extended as if by writing bytes with the value zero. .Pp The .Fn ftruncate system call causes the file or shared memory object backing the file descriptor .Fa fd to be truncated or extended to .Fa length bytes in size. The file descriptor must be a valid file descriptor open for writing. The file position pointer associated with the file descriptor .Fa fd will not be modified. .Sh RETURN VALUES .Rv -std If the file to be modified is not a directory or a regular file, the .Fn truncate call has no effect and returns the value 0. .Sh ERRORS The .Fn truncate system call succeeds unless: .Bl -tag -width Er .It Bq Er ENOTDIR A component of the path prefix is not a directory. .It Bq Er ENAMETOOLONG A component of a pathname exceeded 255 characters, or an entire path name exceeded 1023 characters. .It Bq Er ENOENT The named file does not exist. .It Bq Er EACCES Search permission is denied for a component of the path prefix. .It Bq Er EACCES The named file is not writable by the user. .It Bq Er ELOOP Too many symbolic links were encountered in translating the pathname. .It Bq Er EPERM The named file has its immutable or append-only flag set, see the .Xr chflags 2 manual page for more information. .It Bq Er EISDIR The named file is a directory. .It Bq Er EROFS The named file resides on a read-only file system. .It Bq Er ETXTBSY The file is a pure procedure (shared text) file that is being executed. .It Bq Er EFBIG The .Fa length argument was greater than the maximum file size. .It Bq Er EINVAL The .Fa length argument was less than 0. .It Bq Er EIO An I/O error occurred updating the inode. .It Bq Er EFAULT The .Fa path argument points outside the process's allocated address space. .El .Pp The .Fn ftruncate system call succeeds unless: .Bl -tag -width Er .It Bq Er EBADF The .Fa fd argument is not a valid descriptor. .It Bq Er EINVAL The .Fa fd argument references a file descriptor that is not a regular file or shared memory object. .It Bq Er EINVAL The .Fa fd descriptor is not open for writing. .El .Sh SEE ALSO .Xr chflags 2 , .Xr open 2 , .Xr shm_open 2 .Sh HISTORY The .Fn truncate and .Fn ftruncate system calls appeared in .Bx 4.2 . .Sh BUGS These calls should be generalized to allow ranges of bytes in a file to be discarded. .Pp -Use of +Historically, the use of .Fn truncate -to extend a file is not portable. +or +.Fn ftruncate +to extend a file was not portable, but this behavior became required in +.St -p1003.1-2008 . Index: projects/clang1000-import/lib/libfetch/common.c =================================================================== --- projects/clang1000-import/lib/libfetch/common.c (revision 358238) +++ projects/clang1000-import/lib/libfetch/common.c (revision 358239) @@ -1,1800 +1,1804 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1998-2016 Dag-Erling Smørgrav * Copyright (c) 2013 Michael Gmelin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef WITH_SSL #include #endif #include "fetch.h" #include "common.h" /*** Local data **************************************************************/ /* * Error messages for resolver errors */ static struct fetcherr netdb_errlist[] = { #ifdef EAI_NODATA { EAI_NODATA, FETCH_RESOLV, "Host not found" }, #endif { EAI_AGAIN, FETCH_TEMP, "Transient resolver failure" }, { EAI_FAIL, FETCH_RESOLV, "Non-recoverable resolver failure" }, { EAI_NONAME, FETCH_RESOLV, "No address record" }, { -1, FETCH_UNKNOWN, "Unknown resolver error" } }; /* * SOCKS5 error enumerations */ enum SOCKS5_ERR { /* Protocol errors */ SOCKS5_ERR_SELECTION, SOCKS5_ERR_READ_METHOD, SOCKS5_ERR_VER5_ONLY, SOCKS5_ERR_NOMETHODS, SOCKS5_ERR_NOTIMPLEMENTED, SOCKS5_ERR_HOSTNAME_SIZE, SOCKS5_ERR_REQUEST, SOCKS5_ERR_REPLY, SOCKS5_ERR_NON_VER5_RESP, SOCKS5_ERR_GENERAL, SOCKS5_ERR_NOT_ALLOWED, SOCKS5_ERR_NET_UNREACHABLE, SOCKS5_ERR_HOST_UNREACHABLE, SOCKS5_ERR_CONN_REFUSED, SOCKS5_ERR_TTL_EXPIRED, SOCKS5_ERR_COM_UNSUPPORTED, SOCKS5_ERR_ADDR_UNSUPPORTED, SOCKS5_ERR_UNSPECIFIED, /* Configuration errors */ SOCKS5_ERR_BAD_HOST, SOCKS5_ERR_BAD_PROXY_FORMAT, SOCKS5_ERR_BAD_PORT }; /* * Error messages for SOCKS5 errors */ static struct fetcherr socks5_errlist[] = { /* SOCKS5 protocol errors */ { SOCKS5_ERR_SELECTION, FETCH_ABORT, "SOCKS5: Failed to send selection method" }, { SOCKS5_ERR_READ_METHOD, FETCH_ABORT, "SOCKS5: Failed to read method" }, { SOCKS5_ERR_VER5_ONLY, FETCH_PROTO, "SOCKS5: Only version 5 is implemented" }, { SOCKS5_ERR_NOMETHODS, FETCH_PROTO, "SOCKS5: No acceptable methods" }, { SOCKS5_ERR_NOTIMPLEMENTED, FETCH_PROTO, "SOCKS5: Method currently not implemented" }, { SOCKS5_ERR_HOSTNAME_SIZE, FETCH_PROTO, "SOCKS5: Hostname size is above 256 bytes" }, { SOCKS5_ERR_REQUEST, FETCH_PROTO, "SOCKS5: Failed to request" }, { SOCKS5_ERR_REPLY, FETCH_PROTO, "SOCKS5: Failed to receive reply" }, { SOCKS5_ERR_NON_VER5_RESP, FETCH_PROTO, "SOCKS5: Server responded with a non-version 5 response" }, { SOCKS5_ERR_GENERAL, FETCH_ABORT, "SOCKS5: General server failure" }, { SOCKS5_ERR_NOT_ALLOWED, FETCH_AUTH, "SOCKS5: Connection not allowed by ruleset" }, { SOCKS5_ERR_NET_UNREACHABLE, FETCH_NETWORK, "SOCKS5: Network unreachable" }, { SOCKS5_ERR_HOST_UNREACHABLE, FETCH_ABORT, "SOCKS5: Host unreachable" }, { SOCKS5_ERR_CONN_REFUSED, FETCH_ABORT, "SOCKS5: Connection refused" }, { SOCKS5_ERR_TTL_EXPIRED, FETCH_TIMEOUT, "SOCKS5: TTL expired" }, { SOCKS5_ERR_COM_UNSUPPORTED, FETCH_PROTO, "SOCKS5: Command not supported" }, { SOCKS5_ERR_ADDR_UNSUPPORTED, FETCH_ABORT, "SOCKS5: Address type not supported" }, { SOCKS5_ERR_UNSPECIFIED, FETCH_UNKNOWN, "SOCKS5: Unspecified error" }, /* Configuration error */ { SOCKS5_ERR_BAD_HOST, FETCH_ABORT, "SOCKS5: Bad proxy host" }, { SOCKS5_ERR_BAD_PROXY_FORMAT, FETCH_ABORT, "SOCKS5: Bad proxy format" }, { SOCKS5_ERR_BAD_PORT, FETCH_ABORT, "SOCKS5: Bad port" } }; /* End-of-Line */ static const char ENDL[2] = "\r\n"; /*** Error-reporting functions ***********************************************/ /* * Map error code to string */ static struct fetcherr * fetch_finderr(struct fetcherr *p, int e) { while (p->num != -1 && p->num != e) p++; return (p); } /* * Set error code */ void fetch_seterr(struct fetcherr *p, int e) { p = fetch_finderr(p, e); fetchLastErrCode = p->cat; snprintf(fetchLastErrString, MAXERRSTRING, "%s", p->string); } /* * Set error code according to errno */ void fetch_syserr(void) { switch (errno) { case 0: fetchLastErrCode = FETCH_OK; break; case EPERM: case EACCES: case EROFS: case EAUTH: case ENEEDAUTH: fetchLastErrCode = FETCH_AUTH; break; case ENOENT: case EISDIR: /* XXX */ fetchLastErrCode = FETCH_UNAVAIL; break; case ENOMEM: fetchLastErrCode = FETCH_MEMORY; break; case EBUSY: case EAGAIN: fetchLastErrCode = FETCH_TEMP; break; case EEXIST: fetchLastErrCode = FETCH_EXISTS; break; case ENOSPC: fetchLastErrCode = FETCH_FULL; break; case EADDRINUSE: case EADDRNOTAVAIL: case ENETDOWN: case ENETUNREACH: case ENETRESET: case EHOSTUNREACH: fetchLastErrCode = FETCH_NETWORK; break; case ECONNABORTED: case ECONNRESET: fetchLastErrCode = FETCH_ABORT; break; case ETIMEDOUT: fetchLastErrCode = FETCH_TIMEOUT; break; case ECONNREFUSED: case EHOSTDOWN: fetchLastErrCode = FETCH_DOWN; break; default: fetchLastErrCode = FETCH_UNKNOWN; } snprintf(fetchLastErrString, MAXERRSTRING, "%s", strerror(errno)); } /* * Emit status message */ void fetch_info(const char *fmt, ...) { va_list ap; va_start(ap, fmt); vfprintf(stderr, fmt, ap); va_end(ap); fputc('\n', stderr); } /*** Network-related utility functions ***************************************/ /* * Return the default port for a scheme */ int fetch_default_port(const char *scheme) { struct servent *se; if ((se = getservbyname(scheme, "tcp")) != NULL) return (ntohs(se->s_port)); if (strcmp(scheme, SCHEME_FTP) == 0) return (FTP_DEFAULT_PORT); if (strcmp(scheme, SCHEME_HTTP) == 0) return (HTTP_DEFAULT_PORT); return (0); } /* * Return the default proxy port for a scheme */ int fetch_default_proxy_port(const char *scheme) { if (strcmp(scheme, SCHEME_FTP) == 0) return (FTP_DEFAULT_PROXY_PORT); if (strcmp(scheme, SCHEME_HTTP) == 0) return (HTTP_DEFAULT_PROXY_PORT); return (0); } /* * Create a connection for an existing descriptor. */ conn_t * fetch_reopen(int sd) { conn_t *conn; int opt = 1; /* allocate and fill connection structure */ if ((conn = calloc(1, sizeof(*conn))) == NULL) return (NULL); fcntl(sd, F_SETFD, FD_CLOEXEC); setsockopt(sd, SOL_SOCKET, SO_NOSIGPIPE, &opt, sizeof opt); conn->sd = sd; ++conn->ref; return (conn); } /* * Bump a connection's reference count. */ conn_t * fetch_ref(conn_t *conn) { ++conn->ref; return (conn); } /* * Resolve an address */ struct addrinfo * fetch_resolve(const char *addr, int port, int af) { char hbuf[256], sbuf[8]; struct addrinfo hints, *res; const char *hb, *he, *sep; const char *host, *service; int err, len; /* first, check for a bracketed IPv6 address */ if (*addr == '[') { hb = addr + 1; if ((sep = strchr(hb, ']')) == NULL) { errno = EINVAL; goto syserr; } he = sep++; } else { hb = addr; sep = strchrnul(hb, ':'); he = sep; } /* see if we need to copy the host name */ if (*he != '\0') { len = snprintf(hbuf, sizeof(hbuf), "%.*s", (int)(he - hb), hb); if (len < 0) goto syserr; if (len >= (int)sizeof(hbuf)) { errno = ENAMETOOLONG; goto syserr; } host = hbuf; } else { host = hb; } /* was it followed by a service name? */ if (*sep == '\0' && port != 0) { if (port < 1 || port > 65535) { errno = EINVAL; goto syserr; } if (snprintf(sbuf, sizeof(sbuf), "%d", port) < 0) goto syserr; service = sbuf; } else if (*sep != '\0') { service = sep + 1; } else { service = NULL; } /* resolve */ memset(&hints, 0, sizeof(hints)); hints.ai_family = af; hints.ai_socktype = SOCK_STREAM; hints.ai_flags = AI_ADDRCONFIG; if ((err = getaddrinfo(host, service, &hints, &res)) != 0) { netdb_seterr(err); return (NULL); } return (res); syserr: fetch_syserr(); return (NULL); } /* * Bind a socket to a specific local address */ int fetch_bind(int sd, int af, const char *addr) { struct addrinfo *cliai, *ai; int err; if ((cliai = fetch_resolve(addr, 0, af)) == NULL) return (-1); for (ai = cliai; ai != NULL; ai = ai->ai_next) if ((err = bind(sd, ai->ai_addr, ai->ai_addrlen)) == 0) break; if (err != 0) fetch_syserr(); freeaddrinfo(cliai); return (err == 0 ? 0 : -1); } /* * SOCKS5 connection initiation, based on RFC 1928 * Default DNS resolution over SOCKS5 */ int fetch_socks5_init(conn_t *conn, const char *host, int port, int verbose) { /* * Size is based on largest packet prefix (4 bytes) + * Largest FQDN (256) + one byte size (1) + * Port (2) */ unsigned char buf[BUFF_SIZE]; unsigned char *ptr; int ret = 1; if (verbose) fetch_info("Initializing SOCKS5 connection: %s:%d", host, port); /* Connection initialization */ ptr = buf; *ptr++ = SOCKS_VERSION_5; *ptr++ = SOCKS_CONNECTION; *ptr++ = SOCKS_RSV; if (fetch_write(conn, buf, 3) != 3) { ret = SOCKS5_ERR_SELECTION; goto fail; } /* Verify response from SOCKS5 server */ if (fetch_read(conn, buf, 2) != 2) { ret = SOCKS5_ERR_READ_METHOD; goto fail; } ptr = buf; if (ptr[0] != SOCKS_VERSION_5) { ret = SOCKS5_ERR_VER5_ONLY; goto fail; } if (ptr[1] == SOCKS_NOMETHODS) { ret = SOCKS5_ERR_NOMETHODS; goto fail; } else if (ptr[1] != SOCKS5_NOTIMPLEMENTED) { ret = SOCKS5_ERR_NOTIMPLEMENTED; goto fail; } /* Send Request */ *ptr++ = SOCKS_VERSION_5; *ptr++ = SOCKS_CONNECTION; *ptr++ = SOCKS_RSV; /* Encode all targets as a hostname to avoid DNS leaks */ *ptr++ = SOCKS_ATYP_DOMAINNAME; if (strlen(host) > FQDN_SIZE) { ret = SOCKS5_ERR_HOSTNAME_SIZE; goto fail; } *ptr++ = strlen(host); strncpy(ptr, host, strlen(host)); ptr = ptr + strlen(host); port = htons(port); *ptr++ = port & 0x00ff; *ptr++ = (port & 0xff00) >> 8; if (fetch_write(conn, buf, ptr - buf) != ptr - buf) { ret = SOCKS5_ERR_REQUEST; goto fail; } /* BND.ADDR is variable length, read the largest on non-blocking socket */ if (!fetch_read(conn, buf, BUFF_SIZE)) { ret = SOCKS5_ERR_REPLY; goto fail; } ptr = buf; if (*ptr++ != SOCKS_VERSION_5) { ret = SOCKS5_ERR_NON_VER5_RESP; goto fail; } switch(*ptr++) { case SOCKS_SUCCESS: break; case SOCKS_GENERAL_FAILURE: ret = SOCKS5_ERR_GENERAL; goto fail; case SOCKS_CONNECTION_NOT_ALLOWED: ret = SOCKS5_ERR_NOT_ALLOWED; goto fail; case SOCKS_NETWORK_UNREACHABLE: ret = SOCKS5_ERR_NET_UNREACHABLE; goto fail; case SOCKS_HOST_UNREACHABLE: ret = SOCKS5_ERR_HOST_UNREACHABLE; goto fail; case SOCKS_CONNECTION_REFUSED: ret = SOCKS5_ERR_CONN_REFUSED; goto fail; case SOCKS_TTL_EXPIRED: ret = SOCKS5_ERR_TTL_EXPIRED; goto fail; case SOCKS_COMMAND_NOT_SUPPORTED: ret = SOCKS5_ERR_COM_UNSUPPORTED; goto fail; case SOCKS_ADDRESS_NOT_SUPPORTED: ret = SOCKS5_ERR_ADDR_UNSUPPORTED; goto fail; default: ret = SOCKS5_ERR_UNSPECIFIED; goto fail; } return (ret); fail: socks5_seterr(ret); return (0); } /* * Perform SOCKS5 initialization */ int fetch_socks5_getenv(char **host, int *port) { char *socks5env, *endptr, *ext; const char *portDelim; size_t slen; portDelim = ":"; if ((socks5env = getenv("SOCKS5_PROXY")) == NULL || *socks5env == '\0') { *host = NULL; *port = -1; return (-1); } /* * IPv6 addresses begin and end in brackets. Set the port delimiter * accordingly and search for it so we can do appropriate validation. */ if (socks5env[0] == '[') portDelim = "]:"; slen = strlen(socks5env); ext = strstr(socks5env, portDelim); if (socks5env[0] == '[') { if (socks5env[slen - 1] == ']') { *host = strndup(socks5env, slen); } else if (ext != NULL) { *host = strndup(socks5env, ext - socks5env + 1); } else { socks5_seterr(SOCKS5_ERR_BAD_PROXY_FORMAT); return (0); } } else { *host = strndup(socks5env, ext - socks5env); } if (*host == NULL) { fprintf(stderr, "Failure to allocate memory, exiting.\n"); return (-1); } if (ext == NULL) { *port = 1080; /* Default port as defined in RFC1928 */ } else { ext += strlen(portDelim); errno = 0; *port = strtoimax(ext, (char **)&endptr, 10); if (*endptr != '\0' || errno != 0 || *port < 0 || *port > 65535) { free(*host); *host = NULL; socks5_seterr(SOCKS5_ERR_BAD_PORT); return (0); } } return (2); } /* * Establish a TCP connection to the specified port on the specified host. */ conn_t * fetch_connect(const char *host, int port, int af, int verbose) { struct addrinfo *cais = NULL, *sais = NULL, *cai, *sai; const char *bindaddr; conn_t *conn = NULL; int err = 0, sd = -1; char *sockshost; int socksport; DEBUGF("---> %s:%d\n", host, port); /* * Check if SOCKS5_PROXY env variable is set. fetch_socks5_getenv * will either set sockshost = NULL or allocate memory in all cases. */ sockshost = NULL; if (!fetch_socks5_getenv(&sockshost, &socksport)) goto fail; /* Not using SOCKS5 proxy */ if (sockshost == NULL) { /* resolve server address */ if (verbose) fetch_info("resolving server address: %s:%d", host, port); if ((sais = fetch_resolve(host, port, af)) == NULL) goto fail; /* resolve client address */ bindaddr = getenv("FETCH_BIND_ADDRESS"); if (bindaddr != NULL && *bindaddr != '\0') { if (verbose) fetch_info("resolving client address: %s", bindaddr); if ((cais = fetch_resolve(bindaddr, 0, af)) == NULL) goto fail; } } else { /* resolve socks5 proxy address */ if (verbose) fetch_info("resolving SOCKS5 server address: %s:%d", sockshost, socksport); if ((sais = fetch_resolve(sockshost, socksport, af)) == NULL) { socks5_seterr(SOCKS5_ERR_BAD_HOST); goto fail; } } /* try each server address in turn */ for (err = 0, sai = sais; sai != NULL; sai = sai->ai_next) { /* open socket */ if ((sd = socket(sai->ai_family, SOCK_STREAM, 0)) < 0) goto syserr; /* attempt to bind to client address */ for (err = 0, cai = cais; cai != NULL; cai = cai->ai_next) { if (cai->ai_family != sai->ai_family) continue; if ((err = bind(sd, cai->ai_addr, cai->ai_addrlen)) == 0) break; } if (err != 0) { if (verbose) fetch_info("failed to bind to %s", bindaddr); goto syserr; } /* attempt to connect to server address */ if ((err = connect(sd, sai->ai_addr, sai->ai_addrlen)) == 0) break; /* clean up before next attempt */ close(sd); sd = -1; } if (err != 0) { if (verbose && sockshost == NULL) { fetch_info("failed to connect to %s:%d", host, port); goto syserr; } else if (sockshost != NULL) { if (verbose) fetch_info( "failed to connect to SOCKS5 server %s:%d", sockshost, socksport); socks5_seterr(SOCKS5_ERR_CONN_REFUSED); goto fail; } goto syserr; } if ((conn = fetch_reopen(sd)) == NULL) goto syserr; if (sockshost) if (!fetch_socks5_init(conn, host, port, verbose)) goto fail; + free(sockshost); if (cais != NULL) freeaddrinfo(cais); if (sais != NULL) freeaddrinfo(sais); return (conn); syserr: fetch_syserr(); fail: free(sockshost); - if (sd >= 0) + /* Fully close if it was opened; otherwise just don't leak the fd. */ + if (conn != NULL) + fetch_close(conn); + else if (sd >= 0) close(sd); if (cais != NULL) freeaddrinfo(cais); if (sais != NULL) freeaddrinfo(sais); return (NULL); } #ifdef WITH_SSL /* * Convert characters A-Z to lowercase (intentionally avoid any locale * specific conversions). */ static char fetch_ssl_tolower(char in) { if (in >= 'A' && in <= 'Z') return (in + 32); else return (in); } /* * isalpha implementation that intentionally avoids any locale specific * conversions. */ static int fetch_ssl_isalpha(char in) { return ((in >= 'A' && in <= 'Z') || (in >= 'a' && in <= 'z')); } /* * Check if passed hostnames a and b are equal. */ static int fetch_ssl_hname_equal(const char *a, size_t alen, const char *b, size_t blen) { size_t i; if (alen != blen) return (0); for (i = 0; i < alen; ++i) { if (fetch_ssl_tolower(a[i]) != fetch_ssl_tolower(b[i])) return (0); } return (1); } /* * Check if domain label is traditional, meaning that only A-Z, a-z, 0-9 * and '-' (hyphen) are allowed. Hyphens have to be surrounded by alpha- * numeric characters. Double hyphens (like they're found in IDN a-labels * 'xn--') are not allowed. Empty labels are invalid. */ static int fetch_ssl_is_trad_domain_label(const char *l, size_t len, int wcok) { size_t i; if (!len || l[0] == '-' || l[len-1] == '-') return (0); for (i = 0; i < len; ++i) { if (!isdigit(l[i]) && !fetch_ssl_isalpha(l[i]) && !(l[i] == '*' && wcok) && !(l[i] == '-' && l[i - 1] != '-')) return (0); } return (1); } /* * Check if host name consists only of numbers. This might indicate an IP * address, which is not a good idea for CN wildcard comparison. */ static int fetch_ssl_hname_is_only_numbers(const char *hostname, size_t len) { size_t i; for (i = 0; i < len; ++i) { if (!((hostname[i] >= '0' && hostname[i] <= '9') || hostname[i] == '.')) return (0); } return (1); } /* * Check if the host name h passed matches the pattern passed in m which * is usually part of subjectAltName or CN of a certificate presented to * the client. This includes wildcard matching. The algorithm is based on * RFC6125, sections 6.4.3 and 7.2, which clarifies RFC2818 and RFC3280. */ static int fetch_ssl_hname_match(const char *h, size_t hlen, const char *m, size_t mlen) { int delta, hdotidx, mdot1idx, wcidx; const char *hdot, *mdot1, *mdot2; const char *wc; /* wildcard */ if (!(h && *h && m && *m)) return (0); if ((wc = strnstr(m, "*", mlen)) == NULL) return (fetch_ssl_hname_equal(h, hlen, m, mlen)); wcidx = wc - m; /* hostname should not be just dots and numbers */ if (fetch_ssl_hname_is_only_numbers(h, hlen)) return (0); /* only one wildcard allowed in pattern */ if (strnstr(wc + 1, "*", mlen - wcidx - 1) != NULL) return (0); /* * there must be at least two more domain labels and * wildcard has to be in the leftmost label (RFC6125) */ mdot1 = strnstr(m, ".", mlen); if (mdot1 == NULL || mdot1 < wc || (mlen - (mdot1 - m)) < 4) return (0); mdot1idx = mdot1 - m; mdot2 = strnstr(mdot1 + 1, ".", mlen - mdot1idx - 1); if (mdot2 == NULL || (mlen - (mdot2 - m)) < 2) return (0); /* hostname must contain a dot and not be the 1st char */ hdot = strnstr(h, ".", hlen); if (hdot == NULL || hdot == h) return (0); hdotidx = hdot - h; /* * host part of hostname must be at least as long as * pattern it's supposed to match */ if (hdotidx < mdot1idx) return (0); /* * don't allow wildcards in non-traditional domain names * (IDN, A-label, U-label...) */ if (!fetch_ssl_is_trad_domain_label(h, hdotidx, 0) || !fetch_ssl_is_trad_domain_label(m, mdot1idx, 1)) return (0); /* match domain part (part after first dot) */ if (!fetch_ssl_hname_equal(hdot, hlen - hdotidx, mdot1, mlen - mdot1idx)) return (0); /* match part left of wildcard */ if (!fetch_ssl_hname_equal(h, wcidx, m, wcidx)) return (0); /* match part right of wildcard */ delta = mdot1idx - wcidx - 1; if (!fetch_ssl_hname_equal(hdot - delta, delta, mdot1 - delta, delta)) return (0); /* all tests succeeded, it's a match */ return (1); } /* * Get numeric host address info - returns NULL if host was not an IP * address. The caller is responsible for deallocation using * freeaddrinfo(3). */ static struct addrinfo * fetch_ssl_get_numeric_addrinfo(const char *hostname, size_t len) { struct addrinfo hints, *res; char *host; host = (char *)malloc(len + 1); memcpy(host, hostname, len); host[len] = '\0'; memset(&hints, 0, sizeof(hints)); hints.ai_family = PF_UNSPEC; hints.ai_socktype = SOCK_STREAM; hints.ai_protocol = 0; hints.ai_flags = AI_NUMERICHOST; /* port is not relevant for this purpose */ if (getaddrinfo(host, "443", &hints, &res) != 0) res = NULL; free(host); return res; } /* * Compare ip address in addrinfo with address passes. */ static int fetch_ssl_ipaddr_match_bin(const struct addrinfo *lhost, const char *rhost, size_t rhostlen) { const void *left; if (lhost->ai_family == AF_INET && rhostlen == 4) { left = (void *)&((struct sockaddr_in*)(void *) lhost->ai_addr)->sin_addr.s_addr; #ifdef INET6 } else if (lhost->ai_family == AF_INET6 && rhostlen == 16) { left = (void *)&((struct sockaddr_in6 *)(void *) lhost->ai_addr)->sin6_addr; #endif } else return (0); return (!memcmp(left, (const void *)rhost, rhostlen) ? 1 : 0); } /* * Compare ip address in addrinfo with host passed. If host is not an IP * address, comparison will fail. */ static int fetch_ssl_ipaddr_match(const struct addrinfo *laddr, const char *r, size_t rlen) { struct addrinfo *raddr; int ret; char *rip; ret = 0; if ((raddr = fetch_ssl_get_numeric_addrinfo(r, rlen)) == NULL) return 0; /* not a numeric host */ if (laddr->ai_family == raddr->ai_family) { if (laddr->ai_family == AF_INET) { rip = (char *)&((struct sockaddr_in *)(void *) raddr->ai_addr)->sin_addr.s_addr; ret = fetch_ssl_ipaddr_match_bin(laddr, rip, 4); #ifdef INET6 } else if (laddr->ai_family == AF_INET6) { rip = (char *)&((struct sockaddr_in6 *)(void *) raddr->ai_addr)->sin6_addr; ret = fetch_ssl_ipaddr_match_bin(laddr, rip, 16); #endif } } freeaddrinfo(raddr); return (ret); } /* * Verify server certificate by subjectAltName. */ static int fetch_ssl_verify_altname(STACK_OF(GENERAL_NAME) *altnames, const char *host, struct addrinfo *ip) { const GENERAL_NAME *name; size_t nslen; int i; const char *ns; for (i = 0; i < sk_GENERAL_NAME_num(altnames); ++i) { #if OPENSSL_VERSION_NUMBER < 0x10000000L /* * This is a workaround, since the following line causes * alignment issues in clang: * name = sk_GENERAL_NAME_value(altnames, i); * OpenSSL explicitly warns not to use those macros * directly, but there isn't much choice (and there * shouldn't be any ill side effects) */ name = (GENERAL_NAME *)SKM_sk_value(void, altnames, i); #else name = sk_GENERAL_NAME_value(altnames, i); #endif #if OPENSSL_VERSION_NUMBER < 0x10100000L ns = (const char *)ASN1_STRING_data(name->d.ia5); #else ns = (const char *)ASN1_STRING_get0_data(name->d.ia5); #endif nslen = (size_t)ASN1_STRING_length(name->d.ia5); if (name->type == GEN_DNS && ip == NULL && fetch_ssl_hname_match(host, strlen(host), ns, nslen)) return (1); else if (name->type == GEN_IPADD && ip != NULL && fetch_ssl_ipaddr_match_bin(ip, ns, nslen)) return (1); } return (0); } /* * Verify server certificate by CN. */ static int fetch_ssl_verify_cn(X509_NAME *subject, const char *host, struct addrinfo *ip) { ASN1_STRING *namedata; X509_NAME_ENTRY *nameentry; int cnlen, lastpos, loc, ret; unsigned char *cn; ret = 0; lastpos = -1; loc = -1; cn = NULL; /* get most specific CN (last entry in list) and compare */ while ((lastpos = X509_NAME_get_index_by_NID(subject, NID_commonName, lastpos)) != -1) loc = lastpos; if (loc > -1) { nameentry = X509_NAME_get_entry(subject, loc); namedata = X509_NAME_ENTRY_get_data(nameentry); cnlen = ASN1_STRING_to_UTF8(&cn, namedata); if (ip == NULL && fetch_ssl_hname_match(host, strlen(host), cn, cnlen)) ret = 1; else if (ip != NULL && fetch_ssl_ipaddr_match(ip, cn, cnlen)) ret = 1; OPENSSL_free(cn); } return (ret); } /* * Verify that server certificate subjectAltName/CN matches * hostname. First check, if there are alternative subject names. If yes, * those have to match. Only if those don't exist it falls back to * checking the subject's CN. */ static int fetch_ssl_verify_hname(X509 *cert, const char *host) { struct addrinfo *ip; STACK_OF(GENERAL_NAME) *altnames; X509_NAME *subject; int ret; ret = 0; ip = fetch_ssl_get_numeric_addrinfo(host, strlen(host)); altnames = X509_get_ext_d2i(cert, NID_subject_alt_name, NULL, NULL); if (altnames != NULL) { ret = fetch_ssl_verify_altname(altnames, host, ip); } else { subject = X509_get_subject_name(cert); if (subject != NULL) ret = fetch_ssl_verify_cn(subject, host, ip); } if (ip != NULL) freeaddrinfo(ip); if (altnames != NULL) GENERAL_NAMES_free(altnames); return (ret); } /* * Configure transport security layer based on environment. */ static void fetch_ssl_setup_transport_layer(SSL_CTX *ctx, int verbose) { long ssl_ctx_options; ssl_ctx_options = SSL_OP_ALL | SSL_OP_NO_SSLv2 | SSL_OP_NO_TICKET; if (getenv("SSL_ALLOW_SSL3") == NULL) ssl_ctx_options |= SSL_OP_NO_SSLv3; if (getenv("SSL_NO_TLS1") != NULL) ssl_ctx_options |= SSL_OP_NO_TLSv1; if (getenv("SSL_NO_TLS1_1") != NULL) ssl_ctx_options |= SSL_OP_NO_TLSv1_1; if (getenv("SSL_NO_TLS1_2") != NULL) ssl_ctx_options |= SSL_OP_NO_TLSv1_2; if (verbose) fetch_info("SSL options: %lx", ssl_ctx_options); SSL_CTX_set_options(ctx, ssl_ctx_options); } /* * Configure peer verification based on environment. */ #define LOCAL_CERT_FILE "/usr/local/etc/ssl/cert.pem" #define BASE_CERT_FILE "/etc/ssl/cert.pem" static int fetch_ssl_setup_peer_verification(SSL_CTX *ctx, int verbose) { X509_LOOKUP *crl_lookup; X509_STORE *crl_store; const char *ca_cert_file, *ca_cert_path, *crl_file; if (getenv("SSL_NO_VERIFY_PEER") == NULL) { ca_cert_file = getenv("SSL_CA_CERT_FILE"); if (ca_cert_file == NULL && access(LOCAL_CERT_FILE, R_OK) == 0) ca_cert_file = LOCAL_CERT_FILE; if (ca_cert_file == NULL && access(BASE_CERT_FILE, R_OK) == 0) ca_cert_file = BASE_CERT_FILE; ca_cert_path = getenv("SSL_CA_CERT_PATH"); if (verbose) { fetch_info("Peer verification enabled"); if (ca_cert_file != NULL) fetch_info("Using CA cert file: %s", ca_cert_file); if (ca_cert_path != NULL) fetch_info("Using CA cert path: %s", ca_cert_path); if (ca_cert_file == NULL && ca_cert_path == NULL) fetch_info("Using OpenSSL default " "CA cert file and path"); } SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER, fetch_ssl_cb_verify_crt); if (ca_cert_file != NULL || ca_cert_path != NULL) SSL_CTX_load_verify_locations(ctx, ca_cert_file, ca_cert_path); else SSL_CTX_set_default_verify_paths(ctx); if ((crl_file = getenv("SSL_CRL_FILE")) != NULL) { if (verbose) fetch_info("Using CRL file: %s", crl_file); crl_store = SSL_CTX_get_cert_store(ctx); crl_lookup = X509_STORE_add_lookup(crl_store, X509_LOOKUP_file()); if (crl_lookup == NULL || !X509_load_crl_file(crl_lookup, crl_file, X509_FILETYPE_PEM)) { fprintf(stderr, "Could not load CRL file %s\n", crl_file); return (0); } X509_STORE_set_flags(crl_store, X509_V_FLAG_CRL_CHECK | X509_V_FLAG_CRL_CHECK_ALL); } } return (1); } /* * Configure client certificate based on environment. */ static int fetch_ssl_setup_client_certificate(SSL_CTX *ctx, int verbose) { const char *client_cert_file, *client_key_file; if ((client_cert_file = getenv("SSL_CLIENT_CERT_FILE")) != NULL) { client_key_file = getenv("SSL_CLIENT_KEY_FILE") != NULL ? getenv("SSL_CLIENT_KEY_FILE") : client_cert_file; if (verbose) { fetch_info("Using client cert file: %s", client_cert_file); fetch_info("Using client key file: %s", client_key_file); } if (SSL_CTX_use_certificate_chain_file(ctx, client_cert_file) != 1) { fprintf(stderr, "Could not load client certificate %s\n", client_cert_file); return (0); } if (SSL_CTX_use_PrivateKey_file(ctx, client_key_file, SSL_FILETYPE_PEM) != 1) { fprintf(stderr, "Could not load client key %s\n", client_key_file); return (0); } } return (1); } /* * Callback for SSL certificate verification, this is called on server * cert verification. It takes no decision, but informs the user in case * verification failed. */ int fetch_ssl_cb_verify_crt(int verified, X509_STORE_CTX *ctx) { X509 *crt; X509_NAME *name; char *str; str = NULL; if (!verified) { if ((crt = X509_STORE_CTX_get_current_cert(ctx)) != NULL && (name = X509_get_subject_name(crt)) != NULL) str = X509_NAME_oneline(name, 0, 0); fprintf(stderr, "Certificate verification failed for %s\n", str != NULL ? str : "no relevant certificate"); OPENSSL_free(str); } return (verified); } #endif /* * Enable SSL on a connection. */ int fetch_ssl(conn_t *conn, const struct url *URL, int verbose) { #ifdef WITH_SSL int ret, ssl_err; X509_NAME *name; char *str; /* Init the SSL library and context */ if (!SSL_library_init()){ fprintf(stderr, "SSL library init failed\n"); return (-1); } SSL_load_error_strings(); conn->ssl_meth = SSLv23_client_method(); conn->ssl_ctx = SSL_CTX_new(conn->ssl_meth); SSL_CTX_set_mode(conn->ssl_ctx, SSL_MODE_AUTO_RETRY); fetch_ssl_setup_transport_layer(conn->ssl_ctx, verbose); if (!fetch_ssl_setup_peer_verification(conn->ssl_ctx, verbose)) return (-1); if (!fetch_ssl_setup_client_certificate(conn->ssl_ctx, verbose)) return (-1); conn->ssl = SSL_new(conn->ssl_ctx); if (conn->ssl == NULL) { fprintf(stderr, "SSL context creation failed\n"); return (-1); } SSL_set_fd(conn->ssl, conn->sd); #if OPENSSL_VERSION_NUMBER >= 0x0090806fL && !defined(OPENSSL_NO_TLSEXT) if (!SSL_set_tlsext_host_name(conn->ssl, __DECONST(struct url *, URL)->host)) { fprintf(stderr, "TLS server name indication extension failed for host %s\n", URL->host); return (-1); } #endif while ((ret = SSL_connect(conn->ssl)) == -1) { ssl_err = SSL_get_error(conn->ssl, ret); if (ssl_err != SSL_ERROR_WANT_READ && ssl_err != SSL_ERROR_WANT_WRITE) { ERR_print_errors_fp(stderr); return (-1); } } conn->ssl_cert = SSL_get_peer_certificate(conn->ssl); if (conn->ssl_cert == NULL) { fprintf(stderr, "No server SSL certificate\n"); return (-1); } if (getenv("SSL_NO_VERIFY_HOSTNAME") == NULL) { if (verbose) fetch_info("Verify hostname"); if (!fetch_ssl_verify_hname(conn->ssl_cert, URL->host)) { fprintf(stderr, "SSL certificate subject doesn't match host %s\n", URL->host); return (-1); } } if (verbose) { fetch_info("%s connection established using %s", SSL_get_version(conn->ssl), SSL_get_cipher(conn->ssl)); name = X509_get_subject_name(conn->ssl_cert); str = X509_NAME_oneline(name, 0, 0); fetch_info("Certificate subject: %s", str); OPENSSL_free(str); name = X509_get_issuer_name(conn->ssl_cert); str = X509_NAME_oneline(name, 0, 0); fetch_info("Certificate issuer: %s", str); OPENSSL_free(str); } return (0); #else (void)conn; (void)verbose; (void)URL; fprintf(stderr, "SSL support disabled\n"); return (-1); #endif } #define FETCH_READ_WAIT -2 #define FETCH_READ_ERROR -1 #define FETCH_READ_DONE 0 #ifdef WITH_SSL static ssize_t fetch_ssl_read(SSL *ssl, char *buf, size_t len) { ssize_t rlen; int ssl_err; rlen = SSL_read(ssl, buf, len); if (rlen < 0) { ssl_err = SSL_get_error(ssl, rlen); if (ssl_err == SSL_ERROR_WANT_READ || ssl_err == SSL_ERROR_WANT_WRITE) { return (FETCH_READ_WAIT); } else { ERR_print_errors_fp(stderr); return (FETCH_READ_ERROR); } } return (rlen); } #endif static ssize_t fetch_socket_read(int sd, char *buf, size_t len) { ssize_t rlen; rlen = read(sd, buf, len); if (rlen < 0) { if (errno == EAGAIN || (errno == EINTR && fetchRestartCalls)) return (FETCH_READ_WAIT); else return (FETCH_READ_ERROR); } return (rlen); } /* * Read a character from a connection w/ timeout */ ssize_t fetch_read(conn_t *conn, char *buf, size_t len) { struct timeval now, timeout, delta; struct pollfd pfd; ssize_t rlen; int deltams; if (fetchTimeout > 0) { gettimeofday(&timeout, NULL); timeout.tv_sec += fetchTimeout; } deltams = INFTIM; memset(&pfd, 0, sizeof pfd); pfd.fd = conn->sd; pfd.events = POLLIN | POLLERR; for (;;) { /* * The socket is non-blocking. Instead of the canonical * poll() -> read(), we do the following: * * 1) call read() or SSL_read(). * 2) if we received some data, return it. * 3) if an error occurred, return -1. * 4) if read() or SSL_read() signaled EOF, return. * 5) if we did not receive any data but we're not at EOF, * call poll(). * * In the SSL case, this is necessary because if we * receive a close notification, we have to call * SSL_read() one additional time after we've read * everything we received. * * In the non-SSL case, it may improve performance (very * slightly) when reading small amounts of data. */ #ifdef WITH_SSL if (conn->ssl != NULL) rlen = fetch_ssl_read(conn->ssl, buf, len); else #endif rlen = fetch_socket_read(conn->sd, buf, len); if (rlen >= 0) { break; } else if (rlen == FETCH_READ_ERROR) { fetch_syserr(); return (-1); } // assert(rlen == FETCH_READ_WAIT); if (fetchTimeout > 0) { gettimeofday(&now, NULL); if (!timercmp(&timeout, &now, >)) { errno = ETIMEDOUT; fetch_syserr(); return (-1); } timersub(&timeout, &now, &delta); deltams = delta.tv_sec * 1000 + delta.tv_usec / 1000;; } errno = 0; pfd.revents = 0; if (poll(&pfd, 1, deltams) < 0) { if (errno == EINTR && fetchRestartCalls) continue; fetch_syserr(); return (-1); } } return (rlen); } /* * Read a line of text from a connection w/ timeout */ #define MIN_BUF_SIZE 1024 int fetch_getln(conn_t *conn) { char *tmp; size_t tmpsize; ssize_t len; char c; if (conn->buf == NULL) { if ((conn->buf = malloc(MIN_BUF_SIZE)) == NULL) { errno = ENOMEM; return (-1); } conn->bufsize = MIN_BUF_SIZE; } conn->buf[0] = '\0'; conn->buflen = 0; do { len = fetch_read(conn, &c, 1); if (len == -1) return (-1); if (len == 0) break; conn->buf[conn->buflen++] = c; if (conn->buflen == conn->bufsize) { tmp = conn->buf; tmpsize = conn->bufsize * 2 + 1; if ((tmp = realloc(tmp, tmpsize)) == NULL) { errno = ENOMEM; return (-1); } conn->buf = tmp; conn->bufsize = tmpsize; } } while (c != '\n'); conn->buf[conn->buflen] = '\0'; DEBUGF("<<< %s", conn->buf); return (0); } /* * Write to a connection w/ timeout */ ssize_t fetch_write(conn_t *conn, const char *buf, size_t len) { struct iovec iov; iov.iov_base = __DECONST(char *, buf); iov.iov_len = len; return fetch_writev(conn, &iov, 1); } /* * Write a vector to a connection w/ timeout * Note: can modify the iovec. */ ssize_t fetch_writev(conn_t *conn, struct iovec *iov, int iovcnt) { struct timeval now, timeout, delta; struct pollfd pfd; ssize_t wlen, total; int deltams; memset(&pfd, 0, sizeof pfd); if (fetchTimeout) { pfd.fd = conn->sd; pfd.events = POLLOUT | POLLERR; gettimeofday(&timeout, NULL); timeout.tv_sec += fetchTimeout; } total = 0; while (iovcnt > 0) { while (fetchTimeout && pfd.revents == 0) { gettimeofday(&now, NULL); if (!timercmp(&timeout, &now, >)) { errno = ETIMEDOUT; fetch_syserr(); return (-1); } timersub(&timeout, &now, &delta); deltams = delta.tv_sec * 1000 + delta.tv_usec / 1000; errno = 0; pfd.revents = 0; if (poll(&pfd, 1, deltams) < 0) { /* POSIX compliance */ if (errno == EAGAIN) continue; if (errno == EINTR && fetchRestartCalls) continue; return (-1); } } errno = 0; #ifdef WITH_SSL if (conn->ssl != NULL) wlen = SSL_write(conn->ssl, iov->iov_base, iov->iov_len); else #endif wlen = writev(conn->sd, iov, iovcnt); if (wlen == 0) { /* we consider a short write a failure */ /* XXX perhaps we shouldn't in the SSL case */ errno = EPIPE; fetch_syserr(); return (-1); } if (wlen < 0) { if (errno == EINTR && fetchRestartCalls) continue; return (-1); } total += wlen; while (iovcnt > 0 && wlen >= (ssize_t)iov->iov_len) { wlen -= iov->iov_len; iov++; iovcnt--; } if (iovcnt > 0) { iov->iov_len -= wlen; iov->iov_base = __DECONST(char *, iov->iov_base) + wlen; } } return (total); } /* * Write a line of text to a connection w/ timeout */ int fetch_putln(conn_t *conn, const char *str, size_t len) { struct iovec iov[2]; int ret; DEBUGF(">>> %s\n", str); iov[0].iov_base = __DECONST(char *, str); iov[0].iov_len = len; iov[1].iov_base = __DECONST(char *, ENDL); iov[1].iov_len = sizeof(ENDL); if (len == 0) ret = fetch_writev(conn, &iov[1], 1); else ret = fetch_writev(conn, iov, 2); if (ret == -1) return (-1); return (0); } /* * Close connection */ int fetch_close(conn_t *conn) { int ret; if (--conn->ref > 0) return (0); #ifdef WITH_SSL if (conn->ssl) { SSL_shutdown(conn->ssl); SSL_set_connect_state(conn->ssl); SSL_free(conn->ssl); conn->ssl = NULL; } if (conn->ssl_ctx) { SSL_CTX_free(conn->ssl_ctx); conn->ssl_ctx = NULL; } if (conn->ssl_cert) { X509_free(conn->ssl_cert); conn->ssl_cert = NULL; } #endif ret = close(conn->sd); free(conn->buf); free(conn); return (ret); } /*** Directory-related utility functions *************************************/ int fetch_add_entry(struct url_ent **p, int *size, int *len, const char *name, struct url_stat *us) { struct url_ent *tmp; if (*p == NULL) { *size = 0; *len = 0; } if (*len >= *size - 1) { tmp = reallocarray(*p, *size * 2 + 1, sizeof(**p)); if (tmp == NULL) { errno = ENOMEM; fetch_syserr(); return (-1); } *size = (*size * 2 + 1); *p = tmp; } tmp = *p + *len; snprintf(tmp->name, PATH_MAX, "%s", name); memcpy(&tmp->stat, us, sizeof(*us)); (*len)++; (++tmp)->name[0] = 0; return (0); } /*** Authentication-related utility functions ********************************/ static const char * fetch_read_word(FILE *f) { static char word[1024]; if (fscanf(f, " %1023s ", word) != 1) return (NULL); return (word); } static int fetch_netrc_open(void) { struct passwd *pwd; char fn[PATH_MAX]; const char *p; int fd, serrno; if ((p = getenv("NETRC")) != NULL) { DEBUGF("NETRC=%s\n", p); if (snprintf(fn, sizeof(fn), "%s", p) >= (int)sizeof(fn)) { fetch_info("$NETRC specifies a file name " "longer than PATH_MAX"); return (-1); } } else { if ((p = getenv("HOME")) == NULL) { if ((pwd = getpwuid(getuid())) == NULL || (p = pwd->pw_dir) == NULL) return (-1); } if (snprintf(fn, sizeof(fn), "%s/.netrc", p) >= (int)sizeof(fn)) return (-1); } if ((fd = open(fn, O_RDONLY)) < 0) { serrno = errno; DEBUGF("%s: %s\n", fn, strerror(serrno)); errno = serrno; } return (fd); } /* * Get authentication data for a URL from .netrc */ int fetch_netrc_auth(struct url *url) { const char *word; int serrno; FILE *f; if (url->netrcfd < 0) url->netrcfd = fetch_netrc_open(); if (url->netrcfd < 0) return (-1); if ((f = fdopen(url->netrcfd, "r")) == NULL) { serrno = errno; DEBUGF("fdopen(netrcfd): %s", strerror(errno)); close(url->netrcfd); url->netrcfd = -1; errno = serrno; return (-1); } rewind(f); DEBUGF("searching netrc for %s\n", url->host); while ((word = fetch_read_word(f)) != NULL) { if (strcmp(word, "default") == 0) { DEBUGF("using default netrc settings\n"); break; } if (strcmp(word, "machine") == 0 && (word = fetch_read_word(f)) != NULL && strcasecmp(word, url->host) == 0) { DEBUGF("using netrc settings for %s\n", word); break; } } if (word == NULL) goto ferr; while ((word = fetch_read_word(f)) != NULL) { if (strcmp(word, "login") == 0) { if ((word = fetch_read_word(f)) == NULL) goto ferr; if (snprintf(url->user, sizeof(url->user), "%s", word) > (int)sizeof(url->user)) { fetch_info("login name in .netrc is too long"); url->user[0] = '\0'; } } else if (strcmp(word, "password") == 0) { if ((word = fetch_read_word(f)) == NULL) goto ferr; if (snprintf(url->pwd, sizeof(url->pwd), "%s", word) > (int)sizeof(url->pwd)) { fetch_info("password in .netrc is too long"); url->pwd[0] = '\0'; } } else if (strcmp(word, "account") == 0) { if ((word = fetch_read_word(f)) == NULL) goto ferr; /* XXX not supported! */ } else { break; } } fclose(f); url->netrcfd = -1; return (0); ferr: serrno = errno; fclose(f); url->netrcfd = -1; errno = serrno; return (-1); } /* * The no_proxy environment variable specifies a set of domains for * which the proxy should not be consulted; the contents is a comma-, * or space-separated list of domain names. A single asterisk will * override all proxy variables and no transactions will be proxied * (for compatibility with lynx and curl, see the discussion at * ). */ int fetch_no_proxy_match(const char *host) { const char *no_proxy, *p, *q; size_t h_len, d_len; if ((no_proxy = getenv("NO_PROXY")) == NULL && (no_proxy = getenv("no_proxy")) == NULL) return (0); /* asterisk matches any hostname */ if (strcmp(no_proxy, "*") == 0) return (1); h_len = strlen(host); p = no_proxy; do { /* position p at the beginning of a domain suffix */ while (*p == ',' || isspace((unsigned char)*p)) p++; /* position q at the first separator character */ for (q = p; *q; ++q) if (*q == ',' || isspace((unsigned char)*q)) break; d_len = q - p; if (d_len > 0 && h_len >= d_len && strncasecmp(host + h_len - d_len, p, d_len) == 0) { /* domain name matches */ return (1); } p = q + 1; } while (*q); return (0); } Index: projects/clang1000-import/sys/cam/scsi/scsi_da.c =================================================================== --- projects/clang1000-import/sys/cam/scsi/scsi_da.c (revision 358238) +++ projects/clang1000-import/sys/cam/scsi/scsi_da.c (revision 358239) @@ -1,6632 +1,6632 @@ /*- * Implementation of SCSI Direct Access Peripheral driver for CAM. * * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 1997 Justin T. Gibbs. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions, and the following disclaimer, * without modification, immediately at the beginning of the file. * 2. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #ifdef _KERNEL #include "opt_da.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #endif /* _KERNEL */ #ifndef _KERNEL #include #include #endif /* _KERNEL */ #include #include #include #include #ifdef _KERNEL #include #endif /* _KERNEL */ #include #include #include #include #ifdef _KERNEL /* * Note that there are probe ordering dependencies here. The order isn't * controlled by this enumeration, but by explicit state transitions in * dastart() and dadone(). Here are some of the dependencies: * * 1. RC should come first, before RC16, unless there is evidence that RC16 * is supported. * 2. BDC needs to come before any of the ATA probes, or the ZONE probe. * 3. The ATA probes should go in this order: * ATA -> LOGDIR -> IDDIR -> SUP -> ATA_ZONE */ typedef enum { DA_STATE_PROBE_WP, DA_STATE_PROBE_RC, DA_STATE_PROBE_RC16, DA_STATE_PROBE_LBP, DA_STATE_PROBE_BLK_LIMITS, DA_STATE_PROBE_BDC, DA_STATE_PROBE_ATA, DA_STATE_PROBE_ATA_LOGDIR, DA_STATE_PROBE_ATA_IDDIR, DA_STATE_PROBE_ATA_SUP, DA_STATE_PROBE_ATA_ZONE, DA_STATE_PROBE_ZONE, DA_STATE_NORMAL } da_state; typedef enum { DA_FLAG_PACK_INVALID = 0x000001, DA_FLAG_NEW_PACK = 0x000002, DA_FLAG_PACK_LOCKED = 0x000004, DA_FLAG_PACK_REMOVABLE = 0x000008, DA_FLAG_ROTATING = 0x000010, DA_FLAG_NEED_OTAG = 0x000020, DA_FLAG_WAS_OTAG = 0x000040, DA_FLAG_RETRY_UA = 0x000080, DA_FLAG_OPEN = 0x000100, DA_FLAG_SCTX_INIT = 0x000200, DA_FLAG_CAN_RC16 = 0x000400, DA_FLAG_PROBED = 0x000800, DA_FLAG_DIRTY = 0x001000, DA_FLAG_ANNOUNCED = 0x002000, DA_FLAG_CAN_ATA_DMA = 0x004000, DA_FLAG_CAN_ATA_LOG = 0x008000, DA_FLAG_CAN_ATA_IDLOG = 0x010000, DA_FLAG_CAN_ATA_SUPCAP = 0x020000, DA_FLAG_CAN_ATA_ZONE = 0x040000, DA_FLAG_TUR_PENDING = 0x080000, DA_FLAG_UNMAPPEDIO = 0x100000 } da_flags; #define DA_FLAG_STRING \ "\020" \ "\001PACK_INVALID" \ "\002NEW_PACK" \ "\003PACK_LOCKED" \ "\004PACK_REMOVABLE" \ "\005ROTATING" \ "\006NEED_OTAG" \ "\007WAS_OTAG" \ "\010RETRY_UA" \ "\011OPEN" \ "\012SCTX_INIT" \ "\013CAN_RC16" \ "\014PROBED" \ "\015DIRTY" \ "\016ANNOUCNED" \ "\017CAN_ATA_DMA" \ "\020CAN_ATA_LOG" \ "\021CAN_ATA_IDLOG" \ "\022CAN_ATA_SUPACP" \ "\023CAN_ATA_ZONE" \ "\024TUR_PENDING" \ "\025UNMAPPEDIO" typedef enum { DA_Q_NONE = 0x00, DA_Q_NO_SYNC_CACHE = 0x01, DA_Q_NO_6_BYTE = 0x02, DA_Q_NO_PREVENT = 0x04, DA_Q_4K = 0x08, DA_Q_NO_RC16 = 0x10, DA_Q_NO_UNMAP = 0x20, DA_Q_RETRY_BUSY = 0x40, DA_Q_SMR_DM = 0x80, DA_Q_STRICT_UNMAP = 0x100, DA_Q_128KB = 0x200 } da_quirks; #define DA_Q_BIT_STRING \ "\020" \ "\001NO_SYNC_CACHE" \ "\002NO_6_BYTE" \ "\003NO_PREVENT" \ "\0044K" \ "\005NO_RC16" \ "\006NO_UNMAP" \ "\007RETRY_BUSY" \ "\010SMR_DM" \ "\011STRICT_UNMAP" \ "\012128KB" typedef enum { DA_CCB_PROBE_RC = 0x01, DA_CCB_PROBE_RC16 = 0x02, DA_CCB_PROBE_LBP = 0x03, DA_CCB_PROBE_BLK_LIMITS = 0x04, DA_CCB_PROBE_BDC = 0x05, DA_CCB_PROBE_ATA = 0x06, DA_CCB_BUFFER_IO = 0x07, DA_CCB_DUMP = 0x0A, DA_CCB_DELETE = 0x0B, DA_CCB_TUR = 0x0C, DA_CCB_PROBE_ZONE = 0x0D, DA_CCB_PROBE_ATA_LOGDIR = 0x0E, DA_CCB_PROBE_ATA_IDDIR = 0x0F, DA_CCB_PROBE_ATA_SUP = 0x10, DA_CCB_PROBE_ATA_ZONE = 0x11, DA_CCB_PROBE_WP = 0x12, DA_CCB_TYPE_MASK = 0x1F, DA_CCB_RETRY_UA = 0x20 } da_ccb_state; /* * Order here is important for method choice * * We prefer ATA_TRIM as tests run against a Sandforce 2281 SSD attached to * LSI 2008 (mps) controller (FW: v12, Drv: v14) resulted 20% quicker deletes * using ATA_TRIM than the corresponding UNMAP results for a real world mysql * import taking 5mins. * */ typedef enum { DA_DELETE_NONE, DA_DELETE_DISABLE, DA_DELETE_ATA_TRIM, DA_DELETE_UNMAP, DA_DELETE_WS16, DA_DELETE_WS10, DA_DELETE_ZERO, DA_DELETE_MIN = DA_DELETE_ATA_TRIM, DA_DELETE_MAX = DA_DELETE_ZERO } da_delete_methods; /* * For SCSI, host managed drives show up as a separate device type. For * ATA, host managed drives also have a different device signature. * XXX KDM figure out the ATA host managed signature. */ typedef enum { DA_ZONE_NONE = 0x00, DA_ZONE_DRIVE_MANAGED = 0x01, DA_ZONE_HOST_AWARE = 0x02, DA_ZONE_HOST_MANAGED = 0x03 } da_zone_mode; /* * We distinguish between these interface cases in addition to the drive type: * o ATA drive behind a SCSI translation layer that knows about ZBC/ZAC * o ATA drive behind a SCSI translation layer that does not know about * ZBC/ZAC, and so needs to be managed via ATA passthrough. In this * case, we would need to share the ATA code with the ada(4) driver. * o SCSI drive. */ typedef enum { DA_ZONE_IF_SCSI, DA_ZONE_IF_ATA_PASS, DA_ZONE_IF_ATA_SAT, } da_zone_interface; typedef enum { DA_ZONE_FLAG_RZ_SUP = 0x0001, DA_ZONE_FLAG_OPEN_SUP = 0x0002, DA_ZONE_FLAG_CLOSE_SUP = 0x0004, DA_ZONE_FLAG_FINISH_SUP = 0x0008, DA_ZONE_FLAG_RWP_SUP = 0x0010, DA_ZONE_FLAG_SUP_MASK = (DA_ZONE_FLAG_RZ_SUP | DA_ZONE_FLAG_OPEN_SUP | DA_ZONE_FLAG_CLOSE_SUP | DA_ZONE_FLAG_FINISH_SUP | DA_ZONE_FLAG_RWP_SUP), DA_ZONE_FLAG_URSWRZ = 0x0020, DA_ZONE_FLAG_OPT_SEQ_SET = 0x0040, DA_ZONE_FLAG_OPT_NONSEQ_SET = 0x0080, DA_ZONE_FLAG_MAX_SEQ_SET = 0x0100, DA_ZONE_FLAG_SET_MASK = (DA_ZONE_FLAG_OPT_SEQ_SET | DA_ZONE_FLAG_OPT_NONSEQ_SET | DA_ZONE_FLAG_MAX_SEQ_SET) } da_zone_flags; static struct da_zone_desc { da_zone_flags value; const char *desc; } da_zone_desc_table[] = { {DA_ZONE_FLAG_RZ_SUP, "Report Zones" }, {DA_ZONE_FLAG_OPEN_SUP, "Open" }, {DA_ZONE_FLAG_CLOSE_SUP, "Close" }, {DA_ZONE_FLAG_FINISH_SUP, "Finish" }, {DA_ZONE_FLAG_RWP_SUP, "Reset Write Pointer" }, }; typedef void da_delete_func_t (struct cam_periph *periph, union ccb *ccb, struct bio *bp); static da_delete_func_t da_delete_trim; static da_delete_func_t da_delete_unmap; static da_delete_func_t da_delete_ws; static const void * da_delete_functions[] = { NULL, NULL, da_delete_trim, da_delete_unmap, da_delete_ws, da_delete_ws, da_delete_ws }; static const char *da_delete_method_names[] = { "NONE", "DISABLE", "ATA_TRIM", "UNMAP", "WS16", "WS10", "ZERO" }; static const char *da_delete_method_desc[] = { "NONE", "DISABLED", "ATA TRIM", "UNMAP", "WRITE SAME(16) with UNMAP", "WRITE SAME(10) with UNMAP", "ZERO" }; /* Offsets into our private area for storing information */ #define ccb_state ppriv_field0 #define ccb_bp ppriv_ptr1 struct disk_params { u_int8_t heads; u_int32_t cylinders; u_int8_t secs_per_track; u_int32_t secsize; /* Number of bytes/sector */ u_int64_t sectors; /* total number sectors */ u_int stripesize; u_int stripeoffset; }; #define UNMAP_RANGE_MAX 0xffffffff #define UNMAP_HEAD_SIZE 8 #define UNMAP_RANGE_SIZE 16 #define UNMAP_MAX_RANGES 2048 /* Protocol Max is 4095 */ #define UNMAP_BUF_SIZE ((UNMAP_MAX_RANGES * UNMAP_RANGE_SIZE) + \ UNMAP_HEAD_SIZE) #define WS10_MAX_BLKS 0xffff #define WS16_MAX_BLKS 0xffffffff #define ATA_TRIM_MAX_RANGES ((UNMAP_BUF_SIZE / \ (ATA_DSM_RANGE_SIZE * ATA_DSM_BLK_SIZE)) * ATA_DSM_BLK_SIZE) #define DA_WORK_TUR (1 << 16) typedef enum { DA_REF_OPEN = 1, DA_REF_OPEN_HOLD, DA_REF_CLOSE_HOLD, DA_REF_PROBE_HOLD, DA_REF_TUR, DA_REF_GEOM, DA_REF_SYSCTL, DA_REF_REPROBE, DA_REF_MAX /* KEEP LAST */ } da_ref_token; struct da_softc { struct cam_iosched_softc *cam_iosched; struct bio_queue_head delete_run_queue; LIST_HEAD(, ccb_hdr) pending_ccbs; int refcount; /* Active xpt_action() calls */ da_state state; - da_flags flags; + u_int flags; da_quirks quirks; int minimum_cmd_size; int error_inject; int trim_max_ranges; int delete_available; /* Delete methods possibly available */ da_zone_mode zone_mode; da_zone_interface zone_interface; da_zone_flags zone_flags; struct ata_gp_log_dir ata_logdir; int valid_logdir_len; struct ata_identify_log_pages ata_iddir; int valid_iddir_len; uint64_t optimal_seq_zones; uint64_t optimal_nonseq_zones; uint64_t max_seq_zones; u_int maxio; uint32_t unmap_max_ranges; uint32_t unmap_max_lba; /* Max LBAs in UNMAP req */ uint32_t unmap_gran; uint32_t unmap_gran_align; uint64_t ws_max_blks; uint64_t trim_count; uint64_t trim_ranges; uint64_t trim_lbas; da_delete_methods delete_method_pref; da_delete_methods delete_method; da_delete_func_t *delete_func; int p_type; struct disk_params params; struct disk *disk; union ccb saved_ccb; struct task sysctl_task; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; struct callout sendordered_c; uint64_t wwpn; uint8_t unmap_buf[UNMAP_BUF_SIZE]; struct scsi_read_capacity_data_long rcaplong; struct callout mediapoll_c; int ref_flags[DA_REF_MAX]; #ifdef CAM_IO_STATS struct sysctl_ctx_list sysctl_stats_ctx; struct sysctl_oid *sysctl_stats_tree; u_int errors; u_int timeouts; u_int invalidations; #endif #define DA_ANNOUNCETMP_SZ 160 char announce_temp[DA_ANNOUNCETMP_SZ]; #define DA_ANNOUNCE_SZ 400 char announcebuf[DA_ANNOUNCE_SZ]; }; #define dadeleteflag(softc, delete_method, enable) \ if (enable) { \ softc->delete_available |= (1 << delete_method); \ } else { \ softc->delete_available &= ~(1 << delete_method); \ } struct da_quirk_entry { struct scsi_inquiry_pattern inq_pat; da_quirks quirks; }; static const char quantum[] = "QUANTUM"; static const char microp[] = "MICROP"; static struct da_quirk_entry da_quirk_table[] = { /* SPI, FC devices */ { /* * Fujitsu M2513A MO drives. * Tested devices: M2513A2 firmware versions 1200 & 1300. * (dip switch selects whether T_DIRECT or T_OPTICAL device) * Reported by: W.Scholten */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "FUJITSU", "M2513A", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* See above. */ {T_OPTICAL, SIP_MEDIA_REMOVABLE, "FUJITSU", "M2513A", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * This particular Fujitsu drive doesn't like the * synchronize cache command. * Reported by: Tom Jackson */ {T_DIRECT, SIP_MEDIA_FIXED, "FUJITSU", "M2954*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * This drive doesn't like the synchronize cache command * either. Reported by: Matthew Jacob * in NetBSD PR kern/6027, August 24, 1998. */ {T_DIRECT, SIP_MEDIA_FIXED, microp, "2217*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * This drive doesn't like the synchronize cache command * either. Reported by: Hellmuth Michaelis (hm@kts.org) * (PR 8882). */ {T_DIRECT, SIP_MEDIA_FIXED, microp, "2112*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. * Reported by: Blaz Zupan */ {T_DIRECT, SIP_MEDIA_FIXED, "NEC", "D3847*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. * Reported by: Blaz Zupan */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "MAVERICK 540S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "LPS525S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't like the synchronize cache command. * Reported by: walter@pelissero.de */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "LPS540S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Doesn't work correctly with 6 byte reads/writes. * Returns illegal request, and points to byte 9 of the * 6-byte CDB. * Reported by: Adam McDougall */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "VIKING 4*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE }, { /* See above. */ {T_DIRECT, SIP_MEDIA_FIXED, quantum, "VIKING 2*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE }, { /* * Doesn't like the synchronize cache command. * Reported by: walter@pelissero.de */ {T_DIRECT, SIP_MEDIA_FIXED, "CONNER", "CP3500*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * The CISS RAID controllers do not support SYNC_CACHE */ {T_DIRECT, SIP_MEDIA_FIXED, "COMPAQ", "RAID*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * The STEC SSDs sometimes hang on UNMAP. */ {T_DIRECT, SIP_MEDIA_FIXED, "STEC", "*", "*"}, /*quirks*/ DA_Q_NO_UNMAP }, { /* * VMware returns BUSY status when storage has transient * connectivity problems, so better wait. * Also VMware returns odd errors on misaligned UNMAPs. */ {T_DIRECT, SIP_MEDIA_FIXED, "VMware*", "*", "*"}, /*quirks*/ DA_Q_RETRY_BUSY | DA_Q_STRICT_UNMAP }, /* USB mass storage devices supported by umass(4) */ { /* * EXATELECOM (Sigmatel) i-Bead 100/105 USB Flash MP3 Player * PR: kern/51675 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "EXATEL", "i-BEAD10*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Power Quotient Int. (PQI) USB flash key * PR: kern/53067 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "USB Flash Disk*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Creative Nomad MUVO mp3 player (USB) * PR: kern/53094 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "CREATIVE", "NOMAD_MUVO", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * Jungsoft NEXDISK USB flash key * PR: kern/54737 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "JUNGSOFT", "NEXDISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * FreeDik USB Mini Data Drive * PR: kern/54786 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "FreeDik*", "Mini Data Drive", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Sigmatel USB Flash MP3 Player * PR: kern/57046 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "SigmaTel", "MSCN", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * Neuros USB Digital Audio Computer * PR: kern/63645 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "NEUROS", "dig. audio comp.", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * SEAGRAND NP-900 MP3 Player * PR: kern/64563 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "SEAGRAND", "NP-900*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * iRiver iFP MP3 player (with UMS Firmware) * PR: kern/54881, i386/63941, kern/66124 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "iRiver", "iFP*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Frontier Labs NEX IA+ Digital Audio Player, rev 1.10/0.01 * PR: kern/70158 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "FL" , "Nex*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * ZICPlay USB MP3 Player with FM * PR: kern/75057 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "ACTIONS*" , "USB DISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * TEAC USB floppy mechanisms */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "TEAC" , "FD-05*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Kingston DataTraveler II+ USB Pen-Drive. * Reported by: Pawel Jakub Dawidek */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Kingston" , "DataTraveler II+", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * USB DISK Pro PMAP * Reported by: jhs * PR: usb/96381 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, " ", "USB DISK Pro", "PMAP"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Motorola E398 Mobile Phone (TransFlash memory card). * Reported by: Wojciech A. Koszek * PR: usb/89889 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Motorola" , "Motorola Phone", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Qware BeatZkey! Pro * PR: usb/79164 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "GENERIC", "USB DISK DEVICE", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Time DPA20B 1GB MP3 Player * PR: usb/81846 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB2.0*", "(FS) FLASH DISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Samsung USB key 128Mb * PR: usb/90081 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB-DISK", "FreeDik-FlashUsb", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Kingston DataTraveler 2.0 USB Flash memory. * PR: usb/89196 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Kingston", "DataTraveler 2.0", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Creative MUVO Slim mp3 player (USB) * PR: usb/86131 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "CREATIVE", "MuVo Slim", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE|DA_Q_NO_PREVENT }, { /* * United MP5512 Portable MP3 Player (2-in-1 USB DISK/MP3) * PR: usb/80487 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "MUSIC DISK", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * SanDisk Micro Cruzer 128MB * PR: usb/75970 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "SanDisk" , "Micro Cruzer", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * TOSHIBA TransMemory USB sticks * PR: kern/94660 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "TOSHIBA", "TransMemory", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * PNY USB 3.0 Flash Drives */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "PNY", "USB 3.0 FD*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_NO_RC16 }, { /* * PNY USB Flash keys * PR: usb/75578, usb/72344, usb/65436 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "*" , "USB DISK*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Genesys GL3224 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "STORAGE DEVICE*", "120?"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_4K | DA_Q_NO_RC16 }, { /* * Genesys 6-in-1 Card Reader * PR: usb/94647 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Generic*", "STORAGE DEVICE*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Rekam Digital CAMERA * PR: usb/98713 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "CAMERA*", "4MP-9J6*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * iRiver H10 MP3 player * PR: usb/102547 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "iriver", "H10*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * iRiver U10 MP3 player * PR: usb/92306 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "iriver", "U10*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * X-Micro Flash Disk * PR: usb/96901 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "X-Micro", "Flash Disk", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * EasyMP3 EM732X USB 2.0 Flash MP3 Player * PR: usb/96546 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "EM732X", "MP3 Player*", "1.00"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Denver MP3 player * PR: usb/107101 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "DENVER", "MP3 PLAYER", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Philips USB Key Audio KEY013 * PR: usb/68412 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "PHILIPS", "Key*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_NO_PREVENT }, { /* * JNC MP3 Player * PR: usb/94439 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "JNC*" , "MP3 Player*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * SAMSUNG MP0402H * PR: usb/108427 */ {T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "MP0402H", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * I/O Magic USB flash - Giga Bank * PR: usb/108810 */ {T_DIRECT, SIP_MEDIA_FIXED, "GS-Magic", "stor*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * JoyFly 128mb USB Flash Drive * PR: 96133 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB 2.0", "Flash Disk*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * ChipsBnk usb stick * PR: 103702 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "ChipsBnk", "USB*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Storcase (Kingston) InfoStation IFS FC2/SATA-R 201A * PR: 129858 */ {T_DIRECT, SIP_MEDIA_FIXED, "IFS", "FC2/SATA-R*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Samsung YP-U3 mp3-player * PR: 125398 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Samsung", "YP-U3", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { {T_DIRECT, SIP_MEDIA_REMOVABLE, "Netac", "OnlyDisk*", "2000"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Sony Cyber-Shot DSC cameras * PR: usb/137035 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "Sony", "Sony DSC", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE | DA_Q_NO_PREVENT }, { {T_DIRECT, SIP_MEDIA_REMOVABLE, "Kingston", "DataTraveler G3", "1.00"}, /*quirks*/ DA_Q_NO_PREVENT }, { /* At least several Transcent USB sticks lie on RC16. */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "JetFlash", "Transcend*", "*"}, /*quirks*/ DA_Q_NO_RC16 }, { /* * I-O Data USB Flash Disk * PR: usb/211716 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "I-O DATA", "USB Flash Disk*", "*"}, /*quirks*/ DA_Q_NO_RC16 }, { /* * SLC CHIPFANCIER USB drives * PR: usb/234503 (RC10 right, RC16 wrong) * 16GB, 32GB and 128GB confirmed to have same issue */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "*SLC", "CHIPFANCIER", "*"}, /*quirks*/ DA_Q_NO_RC16 }, /* ATA/SATA devices over SAS/USB/... */ { /* Sandisk X400 */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SanDisk SD8SB8U1*", "*" }, /*quirks*/DA_Q_128KB }, { /* Hitachi Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "Hitachi", "H??????????E3*", "*" }, /*quirks*/DA_Q_4K }, { /* Micron Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Micron 5100 MTFDDAK*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG HD155UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HD155UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG HD204UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Samsung Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HD204UI*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST????DL*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST????DL", "*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST???DM*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST???DM*", "*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST????DM*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Barracuda Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST????DM", "*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9500423AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST950042", "3AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9500424AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST950042", "4AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9640423AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST964042", "3AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9640424AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST964042", "4AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9750420AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST975042", "0AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9750422AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST975042", "2AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST9750423AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST975042", "3AS*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Thin Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST???LT*", "*" }, /*quirks*/DA_Q_4K }, { /* Seagate Momentus Thin Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ST???LT*", "*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD????RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "??RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD????RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "??RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD??????RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "????RS*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD??????RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Caviar Green Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "????RX*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD???PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "?PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD?????PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Black Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "???PKT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD???PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "?PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "WDC WD?????PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* WDC Scorpio Blue Advanced Format (4k) drives */ { T_DIRECT, SIP_MEDIA_FIXED, "WDC WD??", "???PVT*", "*" }, /*quirks*/DA_Q_4K }, { /* * Olympus digital cameras (C-3040ZOOM, C-2040ZOOM, C-1) * PR: usb/97472 */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "OLYMPUS", "C*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE | DA_Q_NO_SYNC_CACHE }, { /* * Olympus digital cameras (D-370) * PR: usb/97472 */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "OLYMPUS", "D*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE }, { /* * Olympus digital cameras (E-100RS, E-10). * PR: usb/97472 */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "OLYMPUS", "E*", "*"}, /*quirks*/ DA_Q_NO_6_BYTE | DA_Q_NO_SYNC_CACHE }, { /* * Olympus FE-210 camera */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "OLYMPUS", "FE210*", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Pentax Digital Camera * PR: usb/93389 */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "PENTAX", "DIGITAL CAMERA", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * LG UP3S MP3 player */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "LG", "UP3S", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * Laser MP3-2GA13 MP3 player */ {T_DIRECT, SIP_MEDIA_REMOVABLE, "USB 2.0", "(HS) Flash Disk", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, { /* * LaCie external 250GB Hard drive des by Porsche * Submitted by: Ben Stuyts * PR: 121474 */ {T_DIRECT, SIP_MEDIA_FIXED, "SAMSUNG", "HM250JI", "*"}, /*quirks*/ DA_Q_NO_SYNC_CACHE }, /* SATA SSDs */ { /* * Corsair Force 2 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Corsair CSSD-F*", "*" }, /*quirks*/DA_Q_4K }, { /* * Corsair Force 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Corsair Force 3*", "*" }, /*quirks*/DA_Q_4K }, { /* * Corsair Neutron GTX SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "Corsair Neutron GTX*", "*" }, /*quirks*/DA_Q_4K }, { /* * Corsair Force GT & GS SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Corsair Force G*", "*" }, /*quirks*/DA_Q_4K }, { /* * Crucial M4 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "M4-CT???M4SSD2*", "*" }, /*quirks*/DA_Q_4K }, { /* * Crucial RealSSD C300 SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "C300-CTFDDAC???MAG*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 320 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSA2CW*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 330 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2CT*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 510 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2MH*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel 520 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2BW*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel S3610 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSC2BX*", "*" }, /*quirks*/DA_Q_4K }, { /* * Intel X25-M Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "INTEL SSDSA2M*", "*" }, /*quirks*/DA_Q_4K }, { /* * Kingston E100 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "KINGSTON SE100S3*", "*" }, /*quirks*/DA_Q_4K }, { /* * Kingston HyperX 3k SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "KINGSTON SH103S3*", "*" }, /*quirks*/DA_Q_4K }, { /* * Marvell SSDs (entry taken from OpenSolaris) * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "MARVELL SD88SA02*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Agility 2 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "*", "OCZ-AGILITY2*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Agility 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ-AGILITY3*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Deneva R Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "DENRSTE251M45*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Vertex 2 SSDs (inc pro series) * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ?VERTEX2*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Vertex 3 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ-VERTEX3*", "*" }, /*quirks*/DA_Q_4K }, { /* * OCZ Vertex 4 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "OCZ-VERTEX4*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 750 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 750*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 830 Series SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG SSD 830 Series*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 840 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 840*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 845 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 845*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 850 SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "Samsung SSD 850*", "*" }, /*quirks*/DA_Q_4K }, { /* * Samsung 843T Series SSDs (MZ7WD*) * Samsung PM851 Series SSDs (MZ7TE*) * Samsung PM853T Series SSDs (MZ7GE*) * Samsung SM863 Series SSDs (MZ7KM*) * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SAMSUNG MZ7*", "*" }, /*quirks*/DA_Q_4K }, { /* * Same as for SAMSUNG MZ7* but enable the quirks for SSD * starting with MZ7* too */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "MZ7*", "*" }, /*quirks*/DA_Q_4K }, { /* * SuperTalent TeraDrive CT SSDs * 4k optimised & trim only works in 4k requests + 4k aligned */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "FTM??CT25H*", "*" }, /*quirks*/DA_Q_4K }, { /* * XceedIOPS SATA SSDs * 4k optimised */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "SG9XCS2D*", "*" }, /*quirks*/DA_Q_4K }, { /* * Hama Innostor USB-Stick */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "Innostor", "Innostor*", "*" }, /*quirks*/DA_Q_NO_RC16 }, { /* * Seagate Lamarr 8TB Shingled Magnetic Recording (SMR) * Drive Managed SATA hard drive. This drive doesn't report * in firmware that it is a drive managed SMR drive. */ { T_DIRECT, SIP_MEDIA_FIXED, "ATA", "ST8000AS000[23]*", "*" }, /*quirks*/DA_Q_SMR_DM }, { /* * MX-ES USB Drive by Mach Xtreme */ { T_DIRECT, SIP_MEDIA_REMOVABLE, "MX", "MXUB3*", "*"}, /*quirks*/DA_Q_NO_RC16 }, }; static disk_strategy_t dastrategy; static dumper_t dadump; static periph_init_t dainit; static void daasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg); static void dasysctlinit(void *context, int pending); static int dasysctlsofttimeout(SYSCTL_HANDLER_ARGS); static int dacmdsizesysctl(SYSCTL_HANDLER_ARGS); static int dadeletemethodsysctl(SYSCTL_HANDLER_ARGS); static int dabitsysctl(SYSCTL_HANDLER_ARGS); static int daflagssysctl(SYSCTL_HANDLER_ARGS); static int dazonemodesysctl(SYSCTL_HANDLER_ARGS); static int dazonesupsysctl(SYSCTL_HANDLER_ARGS); static int dadeletemaxsysctl(SYSCTL_HANDLER_ARGS); static void dadeletemethodset(struct da_softc *softc, da_delete_methods delete_method); static off_t dadeletemaxsize(struct da_softc *softc, da_delete_methods delete_method); static void dadeletemethodchoose(struct da_softc *softc, da_delete_methods default_method); static void daprobedone(struct cam_periph *periph, union ccb *ccb); static periph_ctor_t daregister; static periph_dtor_t dacleanup; static periph_start_t dastart; static periph_oninv_t daoninvalidate; static void dazonedone(struct cam_periph *periph, union ccb *ccb); static void dadone(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probewp(struct cam_periph *periph, union ccb *done_ccb); static void dadone_proberc(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probelbp(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probeblklimits(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probebdc(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probeata(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probeatalogdir(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probeataiddir(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probeatasup(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probeatazone(struct cam_periph *periph, union ccb *done_ccb); static void dadone_probezone(struct cam_periph *periph, union ccb *done_ccb); static void dadone_tur(struct cam_periph *periph, union ccb *done_ccb); static int daerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags); static void daprevent(struct cam_periph *periph, int action); static void dareprobe(struct cam_periph *periph); static void dasetgeom(struct cam_periph *periph, uint32_t block_len, uint64_t maxsector, struct scsi_read_capacity_data_long *rcaplong, size_t rcap_size); static callout_func_t dasendorderedtag; static void dashutdown(void *arg, int howto); static callout_func_t damediapoll; #ifndef DA_DEFAULT_POLL_PERIOD #define DA_DEFAULT_POLL_PERIOD 3 #endif #ifndef DA_DEFAULT_TIMEOUT #define DA_DEFAULT_TIMEOUT 60 /* Timeout in seconds */ #endif #ifndef DA_DEFAULT_SOFTTIMEOUT #define DA_DEFAULT_SOFTTIMEOUT 0 #endif #ifndef DA_DEFAULT_RETRY #define DA_DEFAULT_RETRY 4 #endif #ifndef DA_DEFAULT_SEND_ORDERED #define DA_DEFAULT_SEND_ORDERED 1 #endif static int da_poll_period = DA_DEFAULT_POLL_PERIOD; static int da_retry_count = DA_DEFAULT_RETRY; static int da_default_timeout = DA_DEFAULT_TIMEOUT; static sbintime_t da_default_softtimeout = DA_DEFAULT_SOFTTIMEOUT; static int da_send_ordered = DA_DEFAULT_SEND_ORDERED; static int da_disable_wp_detection = 0; static int da_enable_biospeedup = 1; static SYSCTL_NODE(_kern_cam, OID_AUTO, da, CTLFLAG_RD, 0, "CAM Direct Access Disk driver"); SYSCTL_INT(_kern_cam_da, OID_AUTO, poll_period, CTLFLAG_RWTUN, &da_poll_period, 0, "Media polling period in seconds"); SYSCTL_INT(_kern_cam_da, OID_AUTO, retry_count, CTLFLAG_RWTUN, &da_retry_count, 0, "Normal I/O retry count"); SYSCTL_INT(_kern_cam_da, OID_AUTO, default_timeout, CTLFLAG_RWTUN, &da_default_timeout, 0, "Normal I/O timeout (in seconds)"); SYSCTL_INT(_kern_cam_da, OID_AUTO, send_ordered, CTLFLAG_RWTUN, &da_send_ordered, 0, "Send Ordered Tags"); SYSCTL_INT(_kern_cam_da, OID_AUTO, disable_wp_detection, CTLFLAG_RWTUN, &da_disable_wp_detection, 0, "Disable detection of write-protected disks"); SYSCTL_INT(_kern_cam_da, OID_AUTO, enable_biospeedup, CTLFLAG_RDTUN, &da_enable_biospeedup, 0, "Enable BIO_SPEEDUP processing"); SYSCTL_PROC(_kern_cam_da, OID_AUTO, default_softtimeout, CTLTYPE_UINT | CTLFLAG_RW, NULL, 0, dasysctlsofttimeout, "I", "Soft I/O timeout (ms)"); TUNABLE_INT64("kern.cam.da.default_softtimeout", &da_default_softtimeout); /* * DA_ORDEREDTAG_INTERVAL determines how often, relative * to the default timeout, we check to see whether an ordered * tagged transaction is appropriate to prevent simple tag * starvation. Since we'd like to ensure that there is at least * 1/2 of the timeout length left for a starved transaction to * complete after we've sent an ordered tag, we must poll at least * four times in every timeout period. This takes care of the worst * case where a starved transaction starts during an interval that * meets the requirement "don't send an ordered tag" test so it takes * us two intervals to determine that a tag must be sent. */ #ifndef DA_ORDEREDTAG_INTERVAL #define DA_ORDEREDTAG_INTERVAL 4 #endif static struct periph_driver dadriver = { dainit, "da", TAILQ_HEAD_INITIALIZER(dadriver.units), /* generation */ 0 }; PERIPHDRIVER_DECLARE(da, dadriver); static MALLOC_DEFINE(M_SCSIDA, "scsi_da", "scsi_da buffers"); /* * This driver takes out references / holds in well defined pairs, never * recursively. These macros / inline functions enforce those rules. They * are only enabled with DA_TRACK_REFS or INVARIANTS. If DA_TRACK_REFS is * defined to be 2 or larger, the tracking also includes debug printfs. */ #if defined(DA_TRACK_REFS) || defined(INVARIANTS) #ifndef DA_TRACK_REFS #define DA_TRACK_REFS 1 #endif #if DA_TRACK_REFS > 1 static const char *da_ref_text[] = { "bogus", "open", "open hold", "close hold", "reprobe hold", "Test Unit Ready", "Geom", "sysctl", "reprobe", "max -- also bogus" }; #define DA_PERIPH_PRINT(periph, msg, args...) \ CAM_PERIPH_PRINT(periph, msg, ##args) #else #define DA_PERIPH_PRINT(periph, msg, args...) #endif static inline void token_sanity(da_ref_token token) { if ((unsigned)token >= DA_REF_MAX) panic("Bad token value passed in %d\n", token); } static inline int da_periph_hold(struct cam_periph *periph, int priority, da_ref_token token) { int err = cam_periph_hold(periph, priority); token_sanity(token); DA_PERIPH_PRINT(periph, "Holding device %s (%d): %d\n", da_ref_text[token], token, err); if (err == 0) { int cnt; struct da_softc *softc = periph->softc; cnt = atomic_fetchadd_int(&softc->ref_flags[token], 1); if (cnt != 0) panic("Re-holding for reason %d, cnt = %d", token, cnt); } return (err); } static inline void da_periph_unhold(struct cam_periph *periph, da_ref_token token) { int cnt; struct da_softc *softc = periph->softc; token_sanity(token); DA_PERIPH_PRINT(periph, "Unholding device %s (%d)\n", da_ref_text[token], token); cnt = atomic_fetchadd_int(&softc->ref_flags[token], -1); if (cnt != 1) panic("Unholding %d with cnt = %d", token, cnt); cam_periph_unhold(periph); } static inline int da_periph_acquire(struct cam_periph *periph, da_ref_token token) { int err = cam_periph_acquire(periph); token_sanity(token); DA_PERIPH_PRINT(periph, "acquiring device %s (%d): %d\n", da_ref_text[token], token, err); if (err == 0) { int cnt; struct da_softc *softc = periph->softc; cnt = atomic_fetchadd_int(&softc->ref_flags[token], 1); if (cnt != 0) panic("Re-refing for reason %d, cnt = %d", token, cnt); } return (err); } static inline void da_periph_release(struct cam_periph *periph, da_ref_token token) { int cnt; struct da_softc *softc = periph->softc; token_sanity(token); DA_PERIPH_PRINT(periph, "releasing device %s (%d)\n", da_ref_text[token], token); cnt = atomic_fetchadd_int(&softc->ref_flags[token], -1); if (cnt != 1) panic("Releasing %d with cnt = %d", token, cnt); cam_periph_release(periph); } static inline void da_periph_release_locked(struct cam_periph *periph, da_ref_token token) { int cnt; struct da_softc *softc = periph->softc; token_sanity(token); DA_PERIPH_PRINT(periph, "releasing device (locked) %s (%d)\n", da_ref_text[token], token); cnt = atomic_fetchadd_int(&softc->ref_flags[token], -1); if (cnt != 1) panic("releasing (locked) %d with cnt = %d", token, cnt); cam_periph_release_locked(periph); } #define cam_periph_hold POISON #define cam_periph_unhold POISON #define cam_periph_acquire POISON #define cam_periph_release POISON #define cam_periph_release_locked POISON #else #define da_periph_hold(periph, prio, token) cam_periph_hold((periph), (prio)) #define da_periph_unhold(periph, token) cam_periph_unhold((periph)) #define da_periph_acquire(periph, token) cam_periph_acquire((periph)) #define da_periph_release(periph, token) cam_periph_release((periph)) #define da_periph_release_locked(periph, token) cam_periph_release_locked((periph)) #endif static int daopen(struct disk *dp) { struct cam_periph *periph; struct da_softc *softc; int error; periph = (struct cam_periph *)dp->d_drv1; if (da_periph_acquire(periph, DA_REF_OPEN) != 0) { return (ENXIO); } cam_periph_lock(periph); if ((error = da_periph_hold(periph, PRIBIO|PCATCH, DA_REF_OPEN_HOLD)) != 0) { cam_periph_unlock(periph); da_periph_release(periph, DA_REF_OPEN); return (error); } CAM_DEBUG(periph->path, CAM_DEBUG_TRACE | CAM_DEBUG_PERIPH, ("daopen\n")); softc = (struct da_softc *)periph->softc; dareprobe(periph); /* Wait for the disk size update. */ error = cam_periph_sleep(periph, &softc->disk->d_mediasize, PRIBIO, "dareprobe", 0); if (error != 0) xpt_print(periph->path, "unable to retrieve capacity data\n"); if (periph->flags & CAM_PERIPH_INVALID) error = ENXIO; if (error == 0 && (softc->flags & DA_FLAG_PACK_REMOVABLE) != 0 && (softc->quirks & DA_Q_NO_PREVENT) == 0) daprevent(periph, PR_PREVENT); if (error == 0) { softc->flags &= ~DA_FLAG_PACK_INVALID; softc->flags |= DA_FLAG_OPEN; } da_periph_unhold(periph, DA_REF_OPEN_HOLD); cam_periph_unlock(periph); if (error != 0) da_periph_release(periph, DA_REF_OPEN); return (error); } static int daclose(struct disk *dp) { struct cam_periph *periph; struct da_softc *softc; union ccb *ccb; periph = (struct cam_periph *)dp->d_drv1; softc = (struct da_softc *)periph->softc; cam_periph_lock(periph); CAM_DEBUG(periph->path, CAM_DEBUG_TRACE | CAM_DEBUG_PERIPH, ("daclose\n")); if (da_periph_hold(periph, PRIBIO, DA_REF_CLOSE_HOLD) == 0) { /* Flush disk cache. */ if ((softc->flags & DA_FLAG_DIRTY) != 0 && (softc->quirks & DA_Q_NO_SYNC_CACHE) == 0 && (softc->flags & DA_FLAG_PACK_INVALID) == 0) { ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); scsi_synchronize_cache(&ccb->csio, /*retries*/1, /*cbfcnp*/NULL, MSG_SIMPLE_Q_TAG, /*begin_lba*/0, /*lb_count*/0, SSD_FULL_SIZE, 5 * 60 * 1000); cam_periph_runccb(ccb, daerror, /*cam_flags*/0, /*sense_flags*/SF_RETRY_UA | SF_QUIET_IR, softc->disk->d_devstat); softc->flags &= ~DA_FLAG_DIRTY; xpt_release_ccb(ccb); } /* Allow medium removal. */ if ((softc->flags & DA_FLAG_PACK_REMOVABLE) != 0 && (softc->quirks & DA_Q_NO_PREVENT) == 0) daprevent(periph, PR_ALLOW); da_periph_unhold(periph, DA_REF_CLOSE_HOLD); } /* * If we've got removable media, mark the blocksize as * unavailable, since it could change when new media is * inserted. */ if ((softc->flags & DA_FLAG_PACK_REMOVABLE) != 0) softc->disk->d_devstat->flags |= DEVSTAT_BS_UNAVAILABLE; softc->flags &= ~DA_FLAG_OPEN; while (softc->refcount != 0) cam_periph_sleep(periph, &softc->refcount, PRIBIO, "daclose", 1); cam_periph_unlock(periph); da_periph_release(periph, DA_REF_OPEN); return (0); } static void daschedule(struct cam_periph *periph) { struct da_softc *softc = (struct da_softc *)periph->softc; if (softc->state != DA_STATE_NORMAL) return; cam_iosched_schedule(softc->cam_iosched, periph); } /* * Actually translate the requested transfer into one the physical driver * can understand. The transfer is described by a buf and will include * only one physical transfer. */ static void dastrategy(struct bio *bp) { struct cam_periph *periph; struct da_softc *softc; periph = (struct cam_periph *)bp->bio_disk->d_drv1; softc = (struct da_softc *)periph->softc; cam_periph_lock(periph); /* * If the device has been made invalid, error out */ if ((softc->flags & DA_FLAG_PACK_INVALID)) { cam_periph_unlock(periph); biofinish(bp, NULL, ENXIO); return; } CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dastrategy(%p)\n", bp)); /* * Zone commands must be ordered, because they can depend on the * effects of previously issued commands, and they may affect * commands after them. */ if (bp->bio_cmd == BIO_ZONE) bp->bio_flags |= BIO_ORDERED; /* * Place it in the queue of disk activities for this disk */ cam_iosched_queue_work(softc->cam_iosched, bp); /* * Schedule ourselves for performing the work. */ daschedule(periph); cam_periph_unlock(periph); return; } static int dadump(void *arg, void *virtual, vm_offset_t physical, off_t offset, size_t length) { struct cam_periph *periph; struct da_softc *softc; u_int secsize; struct ccb_scsiio csio; struct disk *dp; int error = 0; dp = arg; periph = dp->d_drv1; softc = (struct da_softc *)periph->softc; secsize = softc->params.secsize; if ((softc->flags & DA_FLAG_PACK_INVALID) != 0) return (ENXIO); memset(&csio, 0, sizeof(csio)); if (length > 0) { xpt_setup_ccb(&csio.ccb_h, periph->path, CAM_PRIORITY_NORMAL); csio.ccb_h.ccb_state = DA_CCB_DUMP; scsi_read_write(&csio, /*retries*/0, /*cbfcnp*/NULL, MSG_ORDERED_Q_TAG, /*read*/SCSI_RW_WRITE, /*byte2*/0, /*minimum_cmd_size*/ softc->minimum_cmd_size, offset / secsize, length / secsize, /*data_ptr*/(u_int8_t *) virtual, /*dxfer_len*/length, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); error = cam_periph_runccb((union ccb *)&csio, cam_periph_error, 0, SF_NO_RECOVERY | SF_NO_RETRY, NULL); if (error != 0) printf("Aborting dump due to I/O error.\n"); return (error); } /* * Sync the disk cache contents to the physical media. */ if ((softc->quirks & DA_Q_NO_SYNC_CACHE) == 0) { xpt_setup_ccb(&csio.ccb_h, periph->path, CAM_PRIORITY_NORMAL); csio.ccb_h.ccb_state = DA_CCB_DUMP; scsi_synchronize_cache(&csio, /*retries*/0, /*cbfcnp*/NULL, MSG_SIMPLE_Q_TAG, /*begin_lba*/0,/* Cover the whole disk */ /*lb_count*/0, SSD_FULL_SIZE, 5 * 1000); error = cam_periph_runccb((union ccb *)&csio, cam_periph_error, 0, SF_NO_RECOVERY | SF_NO_RETRY, NULL); if (error != 0) xpt_print(periph->path, "Synchronize cache failed\n"); } return (error); } static int dagetattr(struct bio *bp) { int ret; struct cam_periph *periph; if (g_handleattr_int(bp, "GEOM::canspeedup", da_enable_biospeedup)) return (EJUSTRETURN); periph = (struct cam_periph *)bp->bio_disk->d_drv1; cam_periph_lock(periph); ret = xpt_getattr(bp->bio_data, bp->bio_length, bp->bio_attribute, periph->path); cam_periph_unlock(periph); if (ret == 0) bp->bio_completed = bp->bio_length; return ret; } static void dainit(void) { cam_status status; /* * Install a global async callback. This callback will * receive async callbacks like "new device found". */ status = xpt_register_async(AC_FOUND_DEVICE, daasync, NULL, NULL); if (status != CAM_REQ_CMP) { printf("da: Failed to attach master async callback " "due to status 0x%x!\n", status); } else if (da_send_ordered) { /* Register our shutdown event handler */ if ((EVENTHANDLER_REGISTER(shutdown_post_sync, dashutdown, NULL, SHUTDOWN_PRI_DEFAULT)) == NULL) printf("dainit: shutdown event registration failed!\n"); } } /* * Callback from GEOM, called when it has finished cleaning up its * resources. */ static void dadiskgonecb(struct disk *dp) { struct cam_periph *periph; periph = (struct cam_periph *)dp->d_drv1; da_periph_release(periph, DA_REF_GEOM); } static void daoninvalidate(struct cam_periph *periph) { struct da_softc *softc; cam_periph_assert(periph, MA_OWNED); softc = (struct da_softc *)periph->softc; /* * De-register any async callbacks. */ xpt_register_async(0, daasync, periph, periph->path); softc->flags |= DA_FLAG_PACK_INVALID; #ifdef CAM_IO_STATS softc->invalidations++; #endif /* * Return all queued I/O with ENXIO. * XXX Handle any transactions queued to the card * with XPT_ABORT_CCB. */ cam_iosched_flush(softc->cam_iosched, NULL, ENXIO); /* * Tell GEOM that we've gone away, we'll get a callback when it is * done cleaning up its resources. */ disk_gone(softc->disk); } static void dacleanup(struct cam_periph *periph) { struct da_softc *softc; softc = (struct da_softc *)periph->softc; cam_periph_unlock(periph); cam_iosched_fini(softc->cam_iosched); /* * If we can't free the sysctl tree, oh well... */ if ((softc->flags & DA_FLAG_SCTX_INIT) != 0) { #ifdef CAM_IO_STATS if (sysctl_ctx_free(&softc->sysctl_stats_ctx) != 0) xpt_print(periph->path, "can't remove sysctl stats context\n"); #endif if (sysctl_ctx_free(&softc->sysctl_ctx) != 0) xpt_print(periph->path, "can't remove sysctl context\n"); } callout_drain(&softc->mediapoll_c); disk_destroy(softc->disk); callout_drain(&softc->sendordered_c); free(softc, M_DEVBUF); cam_periph_lock(periph); } static void daasync(void *callback_arg, u_int32_t code, struct cam_path *path, void *arg) { struct cam_periph *periph; struct da_softc *softc; periph = (struct cam_periph *)callback_arg; switch (code) { case AC_FOUND_DEVICE: /* callback to create periph, no locking yet */ { struct ccb_getdev *cgd; cam_status status; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) break; if (cgd->protocol != PROTO_SCSI) break; if (SID_QUAL(&cgd->inq_data) != SID_QUAL_LU_CONNECTED) break; if (SID_TYPE(&cgd->inq_data) != T_DIRECT && SID_TYPE(&cgd->inq_data) != T_RBC && SID_TYPE(&cgd->inq_data) != T_OPTICAL && SID_TYPE(&cgd->inq_data) != T_ZBC_HM) break; /* * Allocate a peripheral instance for * this device and start the probe * process. */ status = cam_periph_alloc(daregister, daoninvalidate, dacleanup, dastart, "da", CAM_PERIPH_BIO, path, daasync, AC_FOUND_DEVICE, cgd); if (status != CAM_REQ_CMP && status != CAM_REQ_INPROG) printf("daasync: Unable to attach to new device " "due to status 0x%x\n", status); return; } case AC_ADVINFO_CHANGED: /* Doesn't touch periph */ { uintptr_t buftype; buftype = (uintptr_t)arg; if (buftype == CDAI_TYPE_PHYS_PATH) { struct da_softc *softc; softc = periph->softc; disk_attr_changed(softc->disk, "GEOM::physpath", M_NOWAIT); } break; } case AC_UNIT_ATTENTION: { union ccb *ccb; int error_code, sense_key, asc, ascq; softc = (struct da_softc *)periph->softc; ccb = (union ccb *)arg; /* * Handle all UNIT ATTENTIONs except our own, as they will be * handled by daerror(). Since this comes from a different periph, * that periph's lock is held, not ours, so we have to take it ours * out to touch softc flags. */ if (xpt_path_periph(ccb->ccb_h.path) != periph && scsi_extract_sense_ccb(ccb, &error_code, &sense_key, &asc, &ascq)) { if (asc == 0x2A && ascq == 0x09) { xpt_print(ccb->ccb_h.path, "Capacity data has changed\n"); cam_periph_lock(periph); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); cam_periph_unlock(periph); } else if (asc == 0x28 && ascq == 0x00) { cam_periph_lock(periph); softc->flags &= ~DA_FLAG_PROBED; cam_periph_unlock(periph); disk_media_changed(softc->disk, M_NOWAIT); } else if (asc == 0x3F && ascq == 0x03) { xpt_print(ccb->ccb_h.path, "INQUIRY data has changed\n"); cam_periph_lock(periph); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); cam_periph_unlock(periph); } } break; } case AC_SCSI_AEN: /* Called for this path: periph locked */ /* * Appears to be currently unused for SCSI devices, only ata SIMs * generate this. */ cam_periph_assert(periph, MA_OWNED); softc = (struct da_softc *)periph->softc; if (!cam_iosched_has_work_flags(softc->cam_iosched, DA_WORK_TUR) && (softc->flags & DA_FLAG_TUR_PENDING) == 0) { if (da_periph_acquire(periph, DA_REF_TUR) == 0) { cam_iosched_set_work_flags(softc->cam_iosched, DA_WORK_TUR); daschedule(periph); } } /* FALLTHROUGH */ case AC_SENT_BDR: /* Called for this path: periph locked */ case AC_BUS_RESET: /* Called for this path: periph locked */ { struct ccb_hdr *ccbh; cam_periph_assert(periph, MA_OWNED); softc = (struct da_softc *)periph->softc; /* * Don't fail on the expected unit attention * that will occur. */ softc->flags |= DA_FLAG_RETRY_UA; LIST_FOREACH(ccbh, &softc->pending_ccbs, periph_links.le) ccbh->ccb_state |= DA_CCB_RETRY_UA; break; } case AC_INQ_CHANGED: /* Called for this path: periph locked */ cam_periph_assert(periph, MA_OWNED); softc = (struct da_softc *)periph->softc; softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); break; default: break; } cam_periph_async(periph, code, path, arg); } static void dasysctlinit(void *context, int pending) { struct cam_periph *periph; struct da_softc *softc; char tmpstr[32], tmpstr2[16]; struct ccb_trans_settings cts; periph = (struct cam_periph *)context; /* * periph was held for us when this task was enqueued */ if (periph->flags & CAM_PERIPH_INVALID) { da_periph_release(periph, DA_REF_SYSCTL); return; } softc = (struct da_softc *)periph->softc; snprintf(tmpstr, sizeof(tmpstr), "CAM DA unit %d", periph->unit_number); snprintf(tmpstr2, sizeof(tmpstr2), "%d", periph->unit_number); sysctl_ctx_init(&softc->sysctl_ctx); cam_periph_lock(periph); softc->flags |= DA_FLAG_SCTX_INIT; cam_periph_unlock(periph); softc->sysctl_tree = SYSCTL_ADD_NODE_WITH_LABEL(&softc->sysctl_ctx, SYSCTL_STATIC_CHILDREN(_kern_cam_da), OID_AUTO, tmpstr2, CTLFLAG_RD, 0, tmpstr, "device_index"); if (softc->sysctl_tree == NULL) { printf("dasysctlinit: unable to allocate sysctl tree\n"); da_periph_release(periph, DA_REF_SYSCTL); return; } /* * Now register the sysctl handler, so the user can change the value on * the fly. */ SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "delete_method", CTLTYPE_STRING | CTLFLAG_RWTUN, softc, 0, dadeletemethodsysctl, "A", "BIO_DELETE execution method"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "delete_max", CTLTYPE_U64 | CTLFLAG_RW, softc, 0, dadeletemaxsysctl, "Q", "Maximum BIO_DELETE size"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "minimum_cmd_size", CTLTYPE_INT | CTLFLAG_RW, &softc->minimum_cmd_size, 0, dacmdsizesysctl, "I", "Minimum CDB size"); SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "trim_count", CTLFLAG_RD, &softc->trim_count, "Total number of unmap/dsm commands sent"); SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "trim_ranges", CTLFLAG_RD, &softc->trim_ranges, "Total number of ranges in unmap/dsm commands"); SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "trim_lbas", CTLFLAG_RD, &softc->trim_lbas, "Total lbas in the unmap/dsm commands sent"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "zone_mode", CTLTYPE_STRING | CTLFLAG_RD, softc, 0, dazonemodesysctl, "A", "Zone Mode"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "zone_support", CTLTYPE_STRING | CTLFLAG_RD, softc, 0, dazonesupsysctl, "A", "Zone Support"); SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "optimal_seq_zones", CTLFLAG_RD, &softc->optimal_seq_zones, "Optimal Number of Open Sequential Write Preferred Zones"); SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "optimal_nonseq_zones", CTLFLAG_RD, &softc->optimal_nonseq_zones, "Optimal Number of Non-Sequentially Written Sequential Write " "Preferred Zones"); SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "max_seq_zones", CTLFLAG_RD, &softc->max_seq_zones, "Maximum Number of Open Sequential Write Required Zones"); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "error_inject", CTLFLAG_RW, &softc->error_inject, 0, "error_inject leaf"); SYSCTL_ADD_INT(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "p_type", CTLFLAG_RD, &softc->p_type, 0, "DIF protection type"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "flags", CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, softc, 0, daflagssysctl, "A", "Flags for drive"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "rotating", CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, - &softc->flags, DA_FLAG_ROTATING, dabitsysctl, "I", + &softc->flags, (u_int)DA_FLAG_ROTATING, dabitsysctl, "I", "Rotating media *DEPRECATED* gone in FreeBSD 14"); SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "unmapped_io", CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, - &softc->flags, DA_FLAG_UNMAPPEDIO, dabitsysctl, "I", + &softc->flags, (u_int)DA_FLAG_UNMAPPEDIO, dabitsysctl, "I", "Unmapped I/O support *DEPRECATED* gone in FreeBSD 14"); #ifdef CAM_TEST_FAILURE SYSCTL_ADD_PROC(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "invalidate", CTLTYPE_U64 | CTLFLAG_RW | CTLFLAG_MPSAFE, periph, 0, cam_periph_invalidate_sysctl, "I", "Write 1 to invalidate the drive immediately"); #endif /* * Add some addressing info. */ memset(&cts, 0, sizeof (cts)); xpt_setup_ccb(&cts.ccb_h, periph->path, CAM_PRIORITY_NONE); cts.ccb_h.func_code = XPT_GET_TRAN_SETTINGS; cts.type = CTS_TYPE_CURRENT_SETTINGS; cam_periph_lock(periph); xpt_action((union ccb *)&cts); cam_periph_unlock(periph); if (cts.ccb_h.status != CAM_REQ_CMP) { da_periph_release(periph, DA_REF_SYSCTL); return; } if (cts.protocol == PROTO_SCSI && cts.transport == XPORT_FC) { struct ccb_trans_settings_fc *fc = &cts.xport_specific.fc; if (fc->valid & CTS_FC_VALID_WWPN) { softc->wwpn = fc->wwpn; SYSCTL_ADD_UQUAD(&softc->sysctl_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "wwpn", CTLFLAG_RD, &softc->wwpn, "World Wide Port Name"); } } #ifdef CAM_IO_STATS /* * Now add some useful stats. * XXX These should live in cam_periph and be common to all periphs */ softc->sysctl_stats_tree = SYSCTL_ADD_NODE(&softc->sysctl_stats_ctx, SYSCTL_CHILDREN(softc->sysctl_tree), OID_AUTO, "stats", CTLFLAG_RD, 0, "Statistics"); SYSCTL_ADD_INT(&softc->sysctl_stats_ctx, SYSCTL_CHILDREN(softc->sysctl_stats_tree), OID_AUTO, "errors", CTLFLAG_RD, &softc->errors, 0, "Transport errors reported by the SIM"); SYSCTL_ADD_INT(&softc->sysctl_stats_ctx, SYSCTL_CHILDREN(softc->sysctl_stats_tree), OID_AUTO, "timeouts", CTLFLAG_RD, &softc->timeouts, 0, "Device timeouts reported by the SIM"); SYSCTL_ADD_INT(&softc->sysctl_stats_ctx, SYSCTL_CHILDREN(softc->sysctl_stats_tree), OID_AUTO, "pack_invalidations", CTLFLAG_RD, &softc->invalidations, 0, "Device pack invalidations"); #endif cam_iosched_sysctl_init(softc->cam_iosched, &softc->sysctl_ctx, softc->sysctl_tree); da_periph_release(periph, DA_REF_SYSCTL); } static int dadeletemaxsysctl(SYSCTL_HANDLER_ARGS) { int error; uint64_t value; struct da_softc *softc; softc = (struct da_softc *)arg1; value = softc->disk->d_delmaxsize; error = sysctl_handle_64(oidp, &value, 0, req); if ((error != 0) || (req->newptr == NULL)) return (error); /* only accept values smaller than the calculated value */ if (value > dadeletemaxsize(softc, softc->delete_method)) { return (EINVAL); } softc->disk->d_delmaxsize = value; return (0); } static int dacmdsizesysctl(SYSCTL_HANDLER_ARGS) { int error, value; value = *(int *)arg1; error = sysctl_handle_int(oidp, &value, 0, req); if ((error != 0) || (req->newptr == NULL)) return (error); /* * Acceptable values here are 6, 10, 12 or 16. */ if (value < 6) value = 6; else if ((value > 6) && (value <= 10)) value = 10; else if ((value > 10) && (value <= 12)) value = 12; else if (value > 12) value = 16; *(int *)arg1 = value; return (0); } static int dasysctlsofttimeout(SYSCTL_HANDLER_ARGS) { sbintime_t value; int error; value = da_default_softtimeout / SBT_1MS; error = sysctl_handle_int(oidp, (int *)&value, 0, req); if ((error != 0) || (req->newptr == NULL)) return (error); /* XXX Should clip this to a reasonable level */ if (value > da_default_timeout * 1000) return (EINVAL); da_default_softtimeout = value * SBT_1MS; return (0); } static void dadeletemethodset(struct da_softc *softc, da_delete_methods delete_method) { softc->delete_method = delete_method; softc->disk->d_delmaxsize = dadeletemaxsize(softc, delete_method); softc->delete_func = da_delete_functions[delete_method]; if (softc->delete_method > DA_DELETE_DISABLE) softc->disk->d_flags |= DISKFLAG_CANDELETE; else softc->disk->d_flags &= ~DISKFLAG_CANDELETE; } static off_t dadeletemaxsize(struct da_softc *softc, da_delete_methods delete_method) { off_t sectors; switch(delete_method) { case DA_DELETE_UNMAP: sectors = (off_t)softc->unmap_max_lba; break; case DA_DELETE_ATA_TRIM: sectors = (off_t)ATA_DSM_RANGE_MAX * softc->trim_max_ranges; break; case DA_DELETE_WS16: sectors = omin(softc->ws_max_blks, WS16_MAX_BLKS); break; case DA_DELETE_ZERO: case DA_DELETE_WS10: sectors = omin(softc->ws_max_blks, WS10_MAX_BLKS); break; default: return 0; } return (off_t)softc->params.secsize * omin(sectors, softc->params.sectors); } static void daprobedone(struct cam_periph *periph, union ccb *ccb) { struct da_softc *softc; softc = (struct da_softc *)periph->softc; cam_periph_assert(periph, MA_OWNED); dadeletemethodchoose(softc, DA_DELETE_NONE); if (bootverbose && (softc->flags & DA_FLAG_ANNOUNCED) == 0) { char buf[80]; int i, sep; snprintf(buf, sizeof(buf), "Delete methods: <"); sep = 0; for (i = 0; i <= DA_DELETE_MAX; i++) { if ((softc->delete_available & (1 << i)) == 0 && i != softc->delete_method) continue; if (sep) strlcat(buf, ",", sizeof(buf)); strlcat(buf, da_delete_method_names[i], sizeof(buf)); if (i == softc->delete_method) strlcat(buf, "(*)", sizeof(buf)); sep = 1; } strlcat(buf, ">", sizeof(buf)); printf("%s%d: %s\n", periph->periph_name, periph->unit_number, buf); } if ((softc->disk->d_flags & DISKFLAG_WRITE_PROTECT) != 0 && (softc->flags & DA_FLAG_ANNOUNCED) == 0) { printf("%s%d: Write Protected\n", periph->periph_name, periph->unit_number); } /* * Since our peripheral may be invalidated by an error * above or an external event, we must release our CCB * before releasing the probe lock on the peripheral. * The peripheral will only go away once the last lock * is removed, and we need it around for the CCB release * operation. */ xpt_release_ccb(ccb); softc->state = DA_STATE_NORMAL; softc->flags |= DA_FLAG_PROBED; daschedule(periph); wakeup(&softc->disk->d_mediasize); if ((softc->flags & DA_FLAG_ANNOUNCED) == 0) { softc->flags |= DA_FLAG_ANNOUNCED; da_periph_unhold(periph, DA_REF_PROBE_HOLD); } else da_periph_release_locked(periph, DA_REF_REPROBE); } static void dadeletemethodchoose(struct da_softc *softc, da_delete_methods default_method) { int i, methods; /* If available, prefer the method requested by user. */ i = softc->delete_method_pref; methods = softc->delete_available | (1 << DA_DELETE_DISABLE); if (methods & (1 << i)) { dadeletemethodset(softc, i); return; } /* Use the pre-defined order to choose the best performing delete. */ for (i = DA_DELETE_MIN; i <= DA_DELETE_MAX; i++) { if (i == DA_DELETE_ZERO) continue; if (softc->delete_available & (1 << i)) { dadeletemethodset(softc, i); return; } } /* Fallback to default. */ dadeletemethodset(softc, default_method); } static int dabitsysctl(SYSCTL_HANDLER_ARGS) { - int flags = (intptr_t)arg1; - int test = arg2; + u_int *flags = arg1; + u_int test = arg2; int tmpout, error; - tmpout = !!(flags & test); + tmpout = !!(*flags & test); error = SYSCTL_OUT(req, &tmpout, sizeof(tmpout)); if (error || !req->newptr) return (error); return (EPERM); } static int daflagssysctl(SYSCTL_HANDLER_ARGS) { struct sbuf sbuf; struct da_softc *softc = arg1; int error; sbuf_new_for_sysctl(&sbuf, NULL, 0, req); if (softc->flags != 0) sbuf_printf(&sbuf, "0x%b", softc->flags, DA_FLAG_STRING); else sbuf_printf(&sbuf, "0"); error = sbuf_finish(&sbuf); sbuf_delete(&sbuf); return (error); } static int dadeletemethodsysctl(SYSCTL_HANDLER_ARGS) { char buf[16]; const char *p; struct da_softc *softc; int i, error, value; softc = (struct da_softc *)arg1; value = softc->delete_method; if (value < 0 || value > DA_DELETE_MAX) p = "UNKNOWN"; else p = da_delete_method_names[value]; strncpy(buf, p, sizeof(buf)); error = sysctl_handle_string(oidp, buf, sizeof(buf), req); if (error != 0 || req->newptr == NULL) return (error); for (i = 0; i <= DA_DELETE_MAX; i++) { if (strcmp(buf, da_delete_method_names[i]) == 0) break; } if (i > DA_DELETE_MAX) return (EINVAL); softc->delete_method_pref = i; dadeletemethodchoose(softc, DA_DELETE_NONE); return (0); } static int dazonemodesysctl(SYSCTL_HANDLER_ARGS) { char tmpbuf[40]; struct da_softc *softc; int error; softc = (struct da_softc *)arg1; switch (softc->zone_mode) { case DA_ZONE_DRIVE_MANAGED: snprintf(tmpbuf, sizeof(tmpbuf), "Drive Managed"); break; case DA_ZONE_HOST_AWARE: snprintf(tmpbuf, sizeof(tmpbuf), "Host Aware"); break; case DA_ZONE_HOST_MANAGED: snprintf(tmpbuf, sizeof(tmpbuf), "Host Managed"); break; case DA_ZONE_NONE: default: snprintf(tmpbuf, sizeof(tmpbuf), "Not Zoned"); break; } error = sysctl_handle_string(oidp, tmpbuf, sizeof(tmpbuf), req); return (error); } static int dazonesupsysctl(SYSCTL_HANDLER_ARGS) { char tmpbuf[180]; struct da_softc *softc; struct sbuf sb; int error, first; unsigned int i; softc = (struct da_softc *)arg1; error = 0; first = 1; sbuf_new(&sb, tmpbuf, sizeof(tmpbuf), 0); for (i = 0; i < sizeof(da_zone_desc_table) / sizeof(da_zone_desc_table[0]); i++) { if (softc->zone_flags & da_zone_desc_table[i].value) { if (first == 0) sbuf_printf(&sb, ", "); else first = 0; sbuf_cat(&sb, da_zone_desc_table[i].desc); } } if (first == 1) sbuf_printf(&sb, "None"); sbuf_finish(&sb); error = sysctl_handle_string(oidp, sbuf_data(&sb), sbuf_len(&sb), req); return (error); } static cam_status daregister(struct cam_periph *periph, void *arg) { struct da_softc *softc; struct ccb_pathinq cpi; struct ccb_getdev *cgd; char tmpstr[80]; caddr_t match; int quirks; cgd = (struct ccb_getdev *)arg; if (cgd == NULL) { printf("daregister: no getdev CCB, can't register device\n"); return(CAM_REQ_CMP_ERR); } softc = (struct da_softc *)malloc(sizeof(*softc), M_DEVBUF, M_NOWAIT|M_ZERO); if (softc == NULL) { printf("daregister: Unable to probe new device. " "Unable to allocate softc\n"); return(CAM_REQ_CMP_ERR); } if (cam_iosched_init(&softc->cam_iosched, periph) != 0) { printf("daregister: Unable to probe new device. " "Unable to allocate iosched memory\n"); free(softc, M_DEVBUF); return(CAM_REQ_CMP_ERR); } LIST_INIT(&softc->pending_ccbs); softc->state = DA_STATE_PROBE_WP; bioq_init(&softc->delete_run_queue); if (SID_IS_REMOVABLE(&cgd->inq_data)) softc->flags |= DA_FLAG_PACK_REMOVABLE; softc->unmap_max_ranges = UNMAP_MAX_RANGES; softc->unmap_max_lba = UNMAP_RANGE_MAX; softc->unmap_gran = 0; softc->unmap_gran_align = 0; softc->ws_max_blks = WS16_MAX_BLKS; softc->trim_max_ranges = ATA_TRIM_MAX_RANGES; softc->flags |= DA_FLAG_ROTATING; periph->softc = softc; /* * See if this device has any quirks. */ match = cam_quirkmatch((caddr_t)&cgd->inq_data, (caddr_t)da_quirk_table, nitems(da_quirk_table), sizeof(*da_quirk_table), scsi_inquiry_match); if (match != NULL) softc->quirks = ((struct da_quirk_entry *)match)->quirks; else softc->quirks = DA_Q_NONE; /* Check if the SIM does not want 6 byte commands */ xpt_path_inq(&cpi, periph->path); if (cpi.ccb_h.status == CAM_REQ_CMP && (cpi.hba_misc & PIM_NO_6_BYTE)) softc->quirks |= DA_Q_NO_6_BYTE; /* Override quirks if tunable is set */ snprintf(tmpstr, sizeof(tmpstr), "kern.cam.da.%d.quirks", periph->unit_number); quirks = softc->quirks; TUNABLE_INT_FETCH(tmpstr, &quirks); softc->quirks = quirks; if (SID_TYPE(&cgd->inq_data) == T_ZBC_HM) softc->zone_mode = DA_ZONE_HOST_MANAGED; else if (softc->quirks & DA_Q_SMR_DM) softc->zone_mode = DA_ZONE_DRIVE_MANAGED; else softc->zone_mode = DA_ZONE_NONE; if (softc->zone_mode != DA_ZONE_NONE) { if (scsi_vpd_supported_page(periph, SVPD_ATA_INFORMATION)) { if (scsi_vpd_supported_page(periph, SVPD_ZONED_BDC)) softc->zone_interface = DA_ZONE_IF_ATA_SAT; else softc->zone_interface = DA_ZONE_IF_ATA_PASS; } else softc->zone_interface = DA_ZONE_IF_SCSI; } TASK_INIT(&softc->sysctl_task, 0, dasysctlinit, periph); /* * Take an exclusive section lock qon the periph while dastart is called * to finish the probe. The lock will be dropped in dadone at the end * of probe. This locks out daopen and daclose from racing with the * probe. * * XXX if cam_periph_hold returns an error, we don't hold a refcount. */ (void)da_periph_hold(periph, PRIBIO, DA_REF_PROBE_HOLD); /* * Schedule a periodic event to occasionally send an * ordered tag to a device. */ callout_init_mtx(&softc->sendordered_c, cam_periph_mtx(periph), 0); callout_reset(&softc->sendordered_c, (da_default_timeout * hz) / DA_ORDEREDTAG_INTERVAL, dasendorderedtag, periph); cam_periph_unlock(periph); /* * RBC devices don't have to support READ(6), only READ(10). */ if (softc->quirks & DA_Q_NO_6_BYTE || SID_TYPE(&cgd->inq_data) == T_RBC) softc->minimum_cmd_size = 10; else softc->minimum_cmd_size = 6; /* * Load the user's default, if any. */ snprintf(tmpstr, sizeof(tmpstr), "kern.cam.da.%d.minimum_cmd_size", periph->unit_number); TUNABLE_INT_FETCH(tmpstr, &softc->minimum_cmd_size); /* * 6, 10, 12 and 16 are the currently permissible values. */ if (softc->minimum_cmd_size > 12) softc->minimum_cmd_size = 16; else if (softc->minimum_cmd_size > 10) softc->minimum_cmd_size = 12; else if (softc->minimum_cmd_size > 6) softc->minimum_cmd_size = 10; else softc->minimum_cmd_size = 6; /* Predict whether device may support READ CAPACITY(16). */ if (SID_ANSI_REV(&cgd->inq_data) >= SCSI_REV_SPC3 && (softc->quirks & DA_Q_NO_RC16) == 0) { softc->flags |= DA_FLAG_CAN_RC16; } /* * Register this media as a disk. */ softc->disk = disk_alloc(); softc->disk->d_devstat = devstat_new_entry(periph->periph_name, periph->unit_number, 0, DEVSTAT_BS_UNAVAILABLE, SID_TYPE(&cgd->inq_data) | XPORT_DEVSTAT_TYPE(cpi.transport), DEVSTAT_PRIORITY_DISK); softc->disk->d_open = daopen; softc->disk->d_close = daclose; softc->disk->d_strategy = dastrategy; softc->disk->d_dump = dadump; softc->disk->d_getattr = dagetattr; softc->disk->d_gone = dadiskgonecb; softc->disk->d_name = "da"; softc->disk->d_drv1 = periph; if (cpi.maxio == 0) softc->maxio = DFLTPHYS; /* traditional default */ else if (cpi.maxio > MAXPHYS) softc->maxio = MAXPHYS; /* for safety */ else softc->maxio = cpi.maxio; if (softc->quirks & DA_Q_128KB) softc->maxio = min(softc->maxio, 128 * 1024); softc->disk->d_maxsize = softc->maxio; softc->disk->d_unit = periph->unit_number; softc->disk->d_flags = DISKFLAG_DIRECT_COMPLETION | DISKFLAG_CANZONE; if ((softc->quirks & DA_Q_NO_SYNC_CACHE) == 0) softc->disk->d_flags |= DISKFLAG_CANFLUSHCACHE; if ((cpi.hba_misc & PIM_UNMAPPED) != 0) { softc->flags |= DA_FLAG_UNMAPPEDIO; softc->disk->d_flags |= DISKFLAG_UNMAPPED_BIO; } cam_strvis(softc->disk->d_descr, cgd->inq_data.vendor, sizeof(cgd->inq_data.vendor), sizeof(softc->disk->d_descr)); strlcat(softc->disk->d_descr, " ", sizeof(softc->disk->d_descr)); cam_strvis(&softc->disk->d_descr[strlen(softc->disk->d_descr)], cgd->inq_data.product, sizeof(cgd->inq_data.product), sizeof(softc->disk->d_descr) - strlen(softc->disk->d_descr)); softc->disk->d_hba_vendor = cpi.hba_vendor; softc->disk->d_hba_device = cpi.hba_device; softc->disk->d_hba_subvendor = cpi.hba_subvendor; softc->disk->d_hba_subdevice = cpi.hba_subdevice; snprintf(softc->disk->d_attachment, sizeof(softc->disk->d_attachment), "%s%d", cpi.dev_name, cpi.unit_number); /* * Acquire a reference to the periph before we register with GEOM. * We'll release this reference once GEOM calls us back (via * dadiskgonecb()) telling us that our provider has been freed. */ if (da_periph_acquire(periph, DA_REF_GEOM) != 0) { xpt_print(periph->path, "%s: lost periph during " "registration!\n", __func__); cam_periph_lock(periph); return (CAM_REQ_CMP_ERR); } disk_create(softc->disk, DISK_VERSION); cam_periph_lock(periph); /* * Add async callbacks for events of interest. * I don't bother checking if this fails as, * in most cases, the system will function just * fine without them and the only alternative * would be to not attach the device on failure. */ xpt_register_async(AC_SENT_BDR | AC_BUS_RESET | AC_LOST_DEVICE | AC_ADVINFO_CHANGED | AC_SCSI_AEN | AC_UNIT_ATTENTION | AC_INQ_CHANGED, daasync, periph, periph->path); /* * Emit an attribute changed notification just in case * physical path information arrived before our async * event handler was registered, but after anyone attaching * to our disk device polled it. */ disk_attr_changed(softc->disk, "GEOM::physpath", M_NOWAIT); /* * Schedule a periodic media polling events. */ callout_init_mtx(&softc->mediapoll_c, cam_periph_mtx(periph), 0); if ((softc->flags & DA_FLAG_PACK_REMOVABLE) && (cgd->inq_flags & SID_AEN) == 0 && da_poll_period != 0) callout_reset(&softc->mediapoll_c, da_poll_period * hz, damediapoll, periph); xpt_schedule(periph, CAM_PRIORITY_DEV); return(CAM_REQ_CMP); } static int da_zone_bio_to_scsi(int disk_zone_cmd) { switch (disk_zone_cmd) { case DISK_ZONE_OPEN: return ZBC_OUT_SA_OPEN; case DISK_ZONE_CLOSE: return ZBC_OUT_SA_CLOSE; case DISK_ZONE_FINISH: return ZBC_OUT_SA_FINISH; case DISK_ZONE_RWP: return ZBC_OUT_SA_RWP; } return -1; } static int da_zone_cmd(struct cam_periph *periph, union ccb *ccb, struct bio *bp, int *queue_ccb) { struct da_softc *softc; int error; error = 0; if (bp->bio_cmd != BIO_ZONE) { error = EINVAL; goto bailout; } softc = periph->softc; switch (bp->bio_zone.zone_cmd) { case DISK_ZONE_OPEN: case DISK_ZONE_CLOSE: case DISK_ZONE_FINISH: case DISK_ZONE_RWP: { int zone_flags; int zone_sa; uint64_t lba; zone_sa = da_zone_bio_to_scsi(bp->bio_zone.zone_cmd); if (zone_sa == -1) { xpt_print(periph->path, "Cannot translate zone " "cmd %#x to SCSI\n", bp->bio_zone.zone_cmd); error = EINVAL; goto bailout; } zone_flags = 0; lba = bp->bio_zone.zone_params.rwp.id; if (bp->bio_zone.zone_params.rwp.flags & DISK_ZONE_RWP_FLAG_ALL) zone_flags |= ZBC_OUT_ALL; if (softc->zone_interface != DA_ZONE_IF_ATA_PASS) { scsi_zbc_out(&ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*service_action*/ zone_sa, /*zone_id*/ lba, /*zone_flags*/ zone_flags, /*data_ptr*/ NULL, /*dxfer_len*/ 0, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); } else { /* * Note that in this case, even though we can * technically use NCQ, we don't bother for several * reasons: * 1. It hasn't been tested on a SAT layer that * supports it. This is new as of SAT-4. * 2. Even when there is a SAT layer that supports * it, that SAT layer will also probably support * ZBC -> ZAC translation, since they are both * in the SAT-4 spec. * 3. Translation will likely be preferable to ATA * passthrough. LSI / Avago at least single * steps ATA passthrough commands in the HBA, * regardless of protocol, so unless that * changes, there is a performance penalty for * doing ATA passthrough no matter whether * you're using NCQ/FPDMA, DMA or PIO. * 4. It requires a 32-byte CDB, which at least at * this point in CAM requires a CDB pointer, which * would require us to allocate an additional bit * of storage separate from the CCB. */ error = scsi_ata_zac_mgmt_out(&ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*use_ncq*/ 0, /*zm_action*/ zone_sa, /*zone_id*/ lba, /*zone_flags*/ zone_flags, /*data_ptr*/ NULL, /*dxfer_len*/ 0, /*cdb_storage*/ NULL, /*cdb_storage_len*/ 0, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); if (error != 0) { error = EINVAL; xpt_print(periph->path, "scsi_ata_zac_mgmt_out() returned an " "error!"); goto bailout; } } *queue_ccb = 1; break; } case DISK_ZONE_REPORT_ZONES: { uint8_t *rz_ptr; uint32_t num_entries, alloc_size; struct disk_zone_report *rep; rep = &bp->bio_zone.zone_params.report; num_entries = rep->entries_allocated; if (num_entries == 0) { xpt_print(periph->path, "No entries allocated for " "Report Zones request\n"); error = EINVAL; goto bailout; } alloc_size = sizeof(struct scsi_report_zones_hdr) + (sizeof(struct scsi_report_zones_desc) * num_entries); alloc_size = min(alloc_size, softc->disk->d_maxsize); rz_ptr = malloc(alloc_size, M_SCSIDA, M_NOWAIT | M_ZERO); if (rz_ptr == NULL) { xpt_print(periph->path, "Unable to allocate memory " "for Report Zones request\n"); error = ENOMEM; goto bailout; } if (softc->zone_interface != DA_ZONE_IF_ATA_PASS) { scsi_zbc_in(&ccb->csio, /*retries*/ da_retry_count, /*cbcfnp*/ dadone, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*service_action*/ ZBC_IN_SA_REPORT_ZONES, /*zone_start_lba*/ rep->starting_id, /*zone_options*/ rep->rep_options, /*data_ptr*/ rz_ptr, /*dxfer_len*/ alloc_size, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); } else { /* * Note that in this case, even though we can * technically use NCQ, we don't bother for several * reasons: * 1. It hasn't been tested on a SAT layer that * supports it. This is new as of SAT-4. * 2. Even when there is a SAT layer that supports * it, that SAT layer will also probably support * ZBC -> ZAC translation, since they are both * in the SAT-4 spec. * 3. Translation will likely be preferable to ATA * passthrough. LSI / Avago at least single * steps ATA passthrough commands in the HBA, * regardless of protocol, so unless that * changes, there is a performance penalty for * doing ATA passthrough no matter whether * you're using NCQ/FPDMA, DMA or PIO. * 4. It requires a 32-byte CDB, which at least at * this point in CAM requires a CDB pointer, which * would require us to allocate an additional bit * of storage separate from the CCB. */ error = scsi_ata_zac_mgmt_in(&ccb->csio, /*retries*/ da_retry_count, /*cbcfnp*/ dadone, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*use_ncq*/ 0, /*zm_action*/ ATA_ZM_REPORT_ZONES, /*zone_id*/ rep->starting_id, /*zone_flags*/ rep->rep_options, /*data_ptr*/ rz_ptr, /*dxfer_len*/ alloc_size, /*cdb_storage*/ NULL, /*cdb_storage_len*/ 0, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); if (error != 0) { error = EINVAL; xpt_print(periph->path, "scsi_ata_zac_mgmt_in() returned an " "error!"); goto bailout; } } /* * For BIO_ZONE, this isn't normally needed. However, it * is used by devstat_end_transaction_bio() to determine * how much data was transferred. */ /* * XXX KDM we have a problem. But I'm not sure how to fix * it. devstat uses bio_bcount - bio_resid to calculate * the amount of data transferred. The GEOM disk code * uses bio_length - bio_resid to calculate the amount of * data in bio_completed. We have different structure * sizes above and below the ada(4) driver. So, if we * use the sizes above, the amount transferred won't be * quite accurate for devstat. If we use different sizes * for bio_bcount and bio_length (above and below * respectively), then the residual needs to match one or * the other. Everything is calculated after the bio * leaves the driver, so changing the values around isn't * really an option. For now, just set the count to the * passed in length. This means that the calculations * above (e.g. bio_completed) will be correct, but the * amount of data reported to devstat will be slightly * under or overstated. */ bp->bio_bcount = bp->bio_length; *queue_ccb = 1; break; } case DISK_ZONE_GET_PARAMS: { struct disk_zone_disk_params *params; params = &bp->bio_zone.zone_params.disk_params; bzero(params, sizeof(*params)); switch (softc->zone_mode) { case DA_ZONE_DRIVE_MANAGED: params->zone_mode = DISK_ZONE_MODE_DRIVE_MANAGED; break; case DA_ZONE_HOST_AWARE: params->zone_mode = DISK_ZONE_MODE_HOST_AWARE; break; case DA_ZONE_HOST_MANAGED: params->zone_mode = DISK_ZONE_MODE_HOST_MANAGED; break; default: case DA_ZONE_NONE: params->zone_mode = DISK_ZONE_MODE_NONE; break; } if (softc->zone_flags & DA_ZONE_FLAG_URSWRZ) params->flags |= DISK_ZONE_DISK_URSWRZ; if (softc->zone_flags & DA_ZONE_FLAG_OPT_SEQ_SET) { params->optimal_seq_zones = softc->optimal_seq_zones; params->flags |= DISK_ZONE_OPT_SEQ_SET; } if (softc->zone_flags & DA_ZONE_FLAG_OPT_NONSEQ_SET) { params->optimal_nonseq_zones = softc->optimal_nonseq_zones; params->flags |= DISK_ZONE_OPT_NONSEQ_SET; } if (softc->zone_flags & DA_ZONE_FLAG_MAX_SEQ_SET) { params->max_seq_zones = softc->max_seq_zones; params->flags |= DISK_ZONE_MAX_SEQ_SET; } if (softc->zone_flags & DA_ZONE_FLAG_RZ_SUP) params->flags |= DISK_ZONE_RZ_SUP; if (softc->zone_flags & DA_ZONE_FLAG_OPEN_SUP) params->flags |= DISK_ZONE_OPEN_SUP; if (softc->zone_flags & DA_ZONE_FLAG_CLOSE_SUP) params->flags |= DISK_ZONE_CLOSE_SUP; if (softc->zone_flags & DA_ZONE_FLAG_FINISH_SUP) params->flags |= DISK_ZONE_FINISH_SUP; if (softc->zone_flags & DA_ZONE_FLAG_RWP_SUP) params->flags |= DISK_ZONE_RWP_SUP; break; } default: break; } bailout: return (error); } static void dastart(struct cam_periph *periph, union ccb *start_ccb) { struct da_softc *softc; cam_periph_assert(periph, MA_OWNED); softc = (struct da_softc *)periph->softc; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dastart\n")); skipstate: switch (softc->state) { case DA_STATE_NORMAL: { struct bio *bp; uint8_t tag_code; more: bp = cam_iosched_next_bio(softc->cam_iosched); if (bp == NULL) { if (cam_iosched_has_work_flags(softc->cam_iosched, DA_WORK_TUR)) { softc->flags |= DA_FLAG_TUR_PENDING; cam_iosched_clr_work_flags(softc->cam_iosched, DA_WORK_TUR); scsi_test_unit_ready(&start_ccb->csio, /*retries*/ da_retry_count, dadone_tur, MSG_SIMPLE_Q_TAG, SSD_FULL_SIZE, da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_TUR; xpt_action(start_ccb); } else xpt_release_ccb(start_ccb); break; } if (bp->bio_cmd == BIO_DELETE) { if (softc->delete_func != NULL) { softc->delete_func(periph, start_ccb, bp); goto out; } else { /* * Not sure this is possible, but failsafe by * lying and saying "sure, done." */ biofinish(bp, NULL, 0); goto more; } } if (cam_iosched_has_work_flags(softc->cam_iosched, DA_WORK_TUR)) { cam_iosched_clr_work_flags(softc->cam_iosched, DA_WORK_TUR); da_periph_release_locked(periph, DA_REF_TUR); } if ((bp->bio_flags & BIO_ORDERED) != 0 || (softc->flags & DA_FLAG_NEED_OTAG) != 0) { softc->flags &= ~DA_FLAG_NEED_OTAG; softc->flags |= DA_FLAG_WAS_OTAG; tag_code = MSG_ORDERED_Q_TAG; } else { tag_code = MSG_SIMPLE_Q_TAG; } switch (bp->bio_cmd) { case BIO_WRITE: case BIO_READ: { void *data_ptr; int rw_op; biotrack(bp, __func__); if (bp->bio_cmd == BIO_WRITE) { softc->flags |= DA_FLAG_DIRTY; rw_op = SCSI_RW_WRITE; } else { rw_op = SCSI_RW_READ; } data_ptr = bp->bio_data; if ((bp->bio_flags & (BIO_UNMAPPED|BIO_VLIST)) != 0) { rw_op |= SCSI_RW_BIO; data_ptr = bp; } scsi_read_write(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/tag_code, rw_op, /*byte2*/0, softc->minimum_cmd_size, /*lba*/bp->bio_pblkno, /*block_count*/bp->bio_bcount / softc->params.secsize, data_ptr, /*dxfer_len*/ bp->bio_bcount, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); #if defined(BUF_TRACKING) || defined(FULL_BUF_TRACKING) start_ccb->csio.bio = bp; #endif break; } case BIO_FLUSH: /* * If we don't support sync cache, or the disk * isn't dirty, FLUSH is a no-op. Use the * allocated CCB for the next bio if one is * available. */ if ((softc->quirks & DA_Q_NO_SYNC_CACHE) != 0 || (softc->flags & DA_FLAG_DIRTY) == 0) { biodone(bp); goto skipstate; } /* * BIO_FLUSH doesn't currently communicate * range data, so we synchronize the cache * over the whole disk. */ scsi_synchronize_cache(&start_ccb->csio, /*retries*/1, /*cbfcnp*/dadone, /*tag_action*/tag_code, /*begin_lba*/0, /*lb_count*/0, SSD_FULL_SIZE, da_default_timeout*1000); /* * Clear the dirty flag before sending the command. * Either this sync cache will be successful, or it * will fail after a retry. If it fails, it is * unlikely to be successful if retried later, so * we'll save ourselves time by just marking the * device clean. */ softc->flags &= ~DA_FLAG_DIRTY; break; case BIO_ZONE: { int error, queue_ccb; queue_ccb = 0; error = da_zone_cmd(periph, start_ccb, bp,&queue_ccb); if ((error != 0) || (queue_ccb == 0)) { biofinish(bp, NULL, error); xpt_release_ccb(start_ccb); return; } break; } default: biofinish(bp, NULL, EOPNOTSUPP); xpt_release_ccb(start_ccb); return; } start_ccb->ccb_h.ccb_state = DA_CCB_BUFFER_IO; start_ccb->ccb_h.flags |= CAM_UNLOCKED; start_ccb->ccb_h.softtimeout = sbttotv(da_default_softtimeout); out: LIST_INSERT_HEAD(&softc->pending_ccbs, &start_ccb->ccb_h, periph_links.le); /* We expect a unit attention from this device */ if ((softc->flags & DA_FLAG_RETRY_UA) != 0) { start_ccb->ccb_h.ccb_state |= DA_CCB_RETRY_UA; softc->flags &= ~DA_FLAG_RETRY_UA; } start_ccb->ccb_h.ccb_bp = bp; softc->refcount++; cam_periph_unlock(periph); xpt_action(start_ccb); cam_periph_lock(periph); /* May have more work to do, so ensure we stay scheduled */ daschedule(periph); break; } case DA_STATE_PROBE_WP: { void *mode_buf; int mode_buf_len; if (da_disable_wp_detection) { if ((softc->flags & DA_FLAG_CAN_RC16) != 0) softc->state = DA_STATE_PROBE_RC16; else softc->state = DA_STATE_PROBE_RC; goto skipstate; } mode_buf_len = 192; mode_buf = malloc(mode_buf_len, M_SCSIDA, M_NOWAIT); if (mode_buf == NULL) { xpt_print(periph->path, "Unable to send mode sense - " "malloc failure\n"); if ((softc->flags & DA_FLAG_CAN_RC16) != 0) softc->state = DA_STATE_PROBE_RC16; else softc->state = DA_STATE_PROBE_RC; goto skipstate; } scsi_mode_sense_len(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone_probewp, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*dbd*/ FALSE, /*pc*/ SMS_PAGE_CTRL_CURRENT, /*page*/ SMS_ALL_PAGES_PAGE, /*param_buf*/ mode_buf, /*param_len*/ mode_buf_len, /*minimum_cmd_size*/ softc->minimum_cmd_size, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_WP; xpt_action(start_ccb); break; } case DA_STATE_PROBE_RC: { struct scsi_read_capacity_data *rcap; rcap = (struct scsi_read_capacity_data *) malloc(sizeof(*rcap), M_SCSIDA, M_NOWAIT|M_ZERO); if (rcap == NULL) { printf("dastart: Couldn't malloc read_capacity data\n"); /* da_free_periph??? */ break; } scsi_read_capacity(&start_ccb->csio, /*retries*/da_retry_count, dadone_proberc, MSG_SIMPLE_Q_TAG, rcap, SSD_FULL_SIZE, /*timeout*/5000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_RC; xpt_action(start_ccb); break; } case DA_STATE_PROBE_RC16: { struct scsi_read_capacity_data_long *rcaplong; rcaplong = (struct scsi_read_capacity_data_long *) malloc(sizeof(*rcaplong), M_SCSIDA, M_NOWAIT|M_ZERO); if (rcaplong == NULL) { printf("dastart: Couldn't malloc read_capacity data\n"); /* da_free_periph??? */ break; } scsi_read_capacity_16(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone_proberc, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*lba*/ 0, /*reladr*/ 0, /*pmi*/ 0, /*rcap_buf*/ (uint8_t *)rcaplong, /*rcap_buf_len*/ sizeof(*rcaplong), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_RC16; xpt_action(start_ccb); break; } case DA_STATE_PROBE_LBP: { struct scsi_vpd_logical_block_prov *lbp; if (!scsi_vpd_supported_page(periph, SVPD_LBP)) { /* * If we get here we don't support any SBC-3 delete * methods with UNMAP as the Logical Block Provisioning * VPD page support is required for devices which * support it according to T10/1799-D Revision 31 * however older revisions of the spec don't mandate * this so we currently don't remove these methods * from the available set. */ softc->state = DA_STATE_PROBE_BLK_LIMITS; goto skipstate; } lbp = (struct scsi_vpd_logical_block_prov *) malloc(sizeof(*lbp), M_SCSIDA, M_NOWAIT|M_ZERO); if (lbp == NULL) { printf("dastart: Couldn't malloc lbp data\n"); /* da_free_periph??? */ break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone_probelbp, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)lbp, /*inq_len*/sizeof(*lbp), /*evpd*/TRUE, /*page_code*/SVPD_LBP, /*sense_len*/SSD_MIN_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_LBP; xpt_action(start_ccb); break; } case DA_STATE_PROBE_BLK_LIMITS: { struct scsi_vpd_block_limits *block_limits; if (!scsi_vpd_supported_page(periph, SVPD_BLOCK_LIMITS)) { /* Not supported skip to next probe */ softc->state = DA_STATE_PROBE_BDC; goto skipstate; } block_limits = (struct scsi_vpd_block_limits *) malloc(sizeof(*block_limits), M_SCSIDA, M_NOWAIT|M_ZERO); if (block_limits == NULL) { printf("dastart: Couldn't malloc block_limits data\n"); /* da_free_periph??? */ break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone_probeblklimits, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)block_limits, /*inq_len*/sizeof(*block_limits), /*evpd*/TRUE, /*page_code*/SVPD_BLOCK_LIMITS, /*sense_len*/SSD_MIN_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_BLK_LIMITS; xpt_action(start_ccb); break; } case DA_STATE_PROBE_BDC: { struct scsi_vpd_block_characteristics *bdc; if (!scsi_vpd_supported_page(periph, SVPD_BDC)) { softc->state = DA_STATE_PROBE_ATA; goto skipstate; } bdc = (struct scsi_vpd_block_characteristics *) malloc(sizeof(*bdc), M_SCSIDA, M_NOWAIT|M_ZERO); if (bdc == NULL) { printf("dastart: Couldn't malloc bdc data\n"); /* da_free_periph??? */ break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone_probebdc, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)bdc, /*inq_len*/sizeof(*bdc), /*evpd*/TRUE, /*page_code*/SVPD_BDC, /*sense_len*/SSD_MIN_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_BDC; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ATA: { struct ata_params *ata_params; if (!scsi_vpd_supported_page(periph, SVPD_ATA_INFORMATION)) { if ((softc->zone_mode == DA_ZONE_HOST_AWARE) || (softc->zone_mode == DA_ZONE_HOST_MANAGED)) { /* * Note that if the ATA VPD page isn't * supported, we aren't talking to an ATA * device anyway. Support for that VPD * page is mandatory for SCSI to ATA (SAT) * translation layers. */ softc->state = DA_STATE_PROBE_ZONE; goto skipstate; } daprobedone(periph, start_ccb); break; } ata_params = &periph->path->device->ident_data; scsi_ata_identify(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone_probeata, /*tag_action*/MSG_SIMPLE_Q_TAG, /*data_ptr*/(u_int8_t *)ata_params, /*dxfer_len*/sizeof(*ata_params), /*sense_len*/SSD_FULL_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ATA; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ATA_LOGDIR: { struct ata_gp_log_dir *log_dir; int retval; retval = 0; if ((softc->flags & DA_FLAG_CAN_ATA_LOG) == 0) { /* * If we don't have log support, not much point in * trying to probe zone support. */ daprobedone(periph, start_ccb); break; } /* * If we have an ATA device (the SCSI ATA Information VPD * page should be present and the ATA identify should have * succeeded) and it supports logs, ask for the log directory. */ log_dir = malloc(sizeof(*log_dir), M_SCSIDA, M_NOWAIT|M_ZERO); if (log_dir == NULL) { xpt_print(periph->path, "Couldn't malloc log_dir " "data\n"); daprobedone(periph, start_ccb); break; } retval = scsi_ata_read_log(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone_probeatalogdir, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*log_address*/ ATA_LOG_DIRECTORY, /*page_number*/ 0, /*block_count*/ 1, /*protocol*/ softc->flags & DA_FLAG_CAN_ATA_DMA ? AP_PROTO_DMA : AP_PROTO_PIO_IN, /*data_ptr*/ (uint8_t *)log_dir, /*dxfer_len*/ sizeof(*log_dir), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); if (retval != 0) { xpt_print(periph->path, "scsi_ata_read_log() failed!"); free(log_dir, M_SCSIDA); daprobedone(periph, start_ccb); break; } start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ATA_LOGDIR; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ATA_IDDIR: { struct ata_identify_log_pages *id_dir; int retval; retval = 0; /* * Check here to see whether the Identify Device log is * supported in the directory of logs. If so, continue * with requesting the log of identify device pages. */ if ((softc->flags & DA_FLAG_CAN_ATA_IDLOG) == 0) { daprobedone(periph, start_ccb); break; } id_dir = malloc(sizeof(*id_dir), M_SCSIDA, M_NOWAIT | M_ZERO); if (id_dir == NULL) { xpt_print(periph->path, "Couldn't malloc id_dir " "data\n"); daprobedone(periph, start_ccb); break; } retval = scsi_ata_read_log(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone_probeataiddir, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*log_address*/ ATA_IDENTIFY_DATA_LOG, /*page_number*/ ATA_IDL_PAGE_LIST, /*block_count*/ 1, /*protocol*/ softc->flags & DA_FLAG_CAN_ATA_DMA ? AP_PROTO_DMA : AP_PROTO_PIO_IN, /*data_ptr*/ (uint8_t *)id_dir, /*dxfer_len*/ sizeof(*id_dir), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); if (retval != 0) { xpt_print(periph->path, "scsi_ata_read_log() failed!"); free(id_dir, M_SCSIDA); daprobedone(periph, start_ccb); break; } start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ATA_IDDIR; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ATA_SUP: { struct ata_identify_log_sup_cap *sup_cap; int retval; retval = 0; /* * Check here to see whether the Supported Capabilities log * is in the list of Identify Device logs. */ if ((softc->flags & DA_FLAG_CAN_ATA_SUPCAP) == 0) { daprobedone(periph, start_ccb); break; } sup_cap = malloc(sizeof(*sup_cap), M_SCSIDA, M_NOWAIT|M_ZERO); if (sup_cap == NULL) { xpt_print(periph->path, "Couldn't malloc sup_cap " "data\n"); daprobedone(periph, start_ccb); break; } retval = scsi_ata_read_log(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone_probeatasup, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*log_address*/ ATA_IDENTIFY_DATA_LOG, /*page_number*/ ATA_IDL_SUP_CAP, /*block_count*/ 1, /*protocol*/ softc->flags & DA_FLAG_CAN_ATA_DMA ? AP_PROTO_DMA : AP_PROTO_PIO_IN, /*data_ptr*/ (uint8_t *)sup_cap, /*dxfer_len*/ sizeof(*sup_cap), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); if (retval != 0) { xpt_print(periph->path, "scsi_ata_read_log() failed!"); free(sup_cap, M_SCSIDA); daprobedone(periph, start_ccb); break; } start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ATA_SUP; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ATA_ZONE: { struct ata_zoned_info_log *ata_zone; int retval; retval = 0; /* * Check here to see whether the zoned device information * page is supported. If so, continue on to request it. * If not, skip to DA_STATE_PROBE_LOG or done. */ if ((softc->flags & DA_FLAG_CAN_ATA_ZONE) == 0) { daprobedone(periph, start_ccb); break; } ata_zone = malloc(sizeof(*ata_zone), M_SCSIDA, M_NOWAIT|M_ZERO); if (ata_zone == NULL) { xpt_print(periph->path, "Couldn't malloc ata_zone " "data\n"); daprobedone(periph, start_ccb); break; } retval = scsi_ata_read_log(&start_ccb->csio, /*retries*/ da_retry_count, /*cbfcnp*/ dadone_probeatazone, /*tag_action*/ MSG_SIMPLE_Q_TAG, /*log_address*/ ATA_IDENTIFY_DATA_LOG, /*page_number*/ ATA_IDL_ZDI, /*block_count*/ 1, /*protocol*/ softc->flags & DA_FLAG_CAN_ATA_DMA ? AP_PROTO_DMA : AP_PROTO_PIO_IN, /*data_ptr*/ (uint8_t *)ata_zone, /*dxfer_len*/ sizeof(*ata_zone), /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ da_default_timeout * 1000); if (retval != 0) { xpt_print(periph->path, "scsi_ata_read_log() failed!"); free(ata_zone, M_SCSIDA); daprobedone(periph, start_ccb); break; } start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ATA_ZONE; xpt_action(start_ccb); break; } case DA_STATE_PROBE_ZONE: { struct scsi_vpd_zoned_bdc *bdc; /* * Note that this page will be supported for SCSI protocol * devices that support ZBC (SMR devices), as well as ATA * protocol devices that are behind a SAT (SCSI to ATA * Translation) layer that supports converting ZBC commands * to their ZAC equivalents. */ if (!scsi_vpd_supported_page(periph, SVPD_ZONED_BDC)) { daprobedone(periph, start_ccb); break; } bdc = (struct scsi_vpd_zoned_bdc *) malloc(sizeof(*bdc), M_SCSIDA, M_NOWAIT|M_ZERO); if (bdc == NULL) { xpt_release_ccb(start_ccb); xpt_print(periph->path, "Couldn't malloc zone VPD " "data\n"); break; } scsi_inquiry(&start_ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone_probezone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*inq_buf*/(u_int8_t *)bdc, /*inq_len*/sizeof(*bdc), /*evpd*/TRUE, /*page_code*/SVPD_ZONED_BDC, /*sense_len*/SSD_FULL_SIZE, /*timeout*/da_default_timeout * 1000); start_ccb->ccb_h.ccb_bp = NULL; start_ccb->ccb_h.ccb_state = DA_CCB_PROBE_ZONE; xpt_action(start_ccb); break; } } } /* * In each of the methods below, while its the caller's * responsibility to ensure the request will fit into a * single device request, we might have changed the delete * method due to the device incorrectly advertising either * its supported methods or limits. * * To prevent this causing further issues we validate the * against the methods limits, and warn which would * otherwise be unnecessary. */ static void da_delete_unmap(struct cam_periph *periph, union ccb *ccb, struct bio *bp) { struct da_softc *softc = (struct da_softc *)periph->softc;; struct bio *bp1; uint8_t *buf = softc->unmap_buf; struct scsi_unmap_desc *d = (void *)&buf[UNMAP_HEAD_SIZE]; uint64_t lba, lastlba = (uint64_t)-1; uint64_t totalcount = 0; uint64_t count; uint32_t c, lastcount = 0, ranges = 0; /* * Currently this doesn't take the UNMAP * Granularity and Granularity Alignment * fields into account. * * This could result in both unoptimal unmap * requests as as well as UNMAP calls unmapping * fewer LBA's than requested. */ bzero(softc->unmap_buf, sizeof(softc->unmap_buf)); bp1 = bp; do { /* * Note: ada and da are different in how they store the * pending bp's in a trim. ada stores all of them in the * trim_req.bps. da stores all but the first one in the * delete_run_queue. ada then completes all the bps in * its adadone() loop. da completes all the bps in the * delete_run_queue in dadone, and relies on the biodone * after to complete. This should be reconciled since there's * no real reason to do it differently. XXX */ if (bp1 != bp) bioq_insert_tail(&softc->delete_run_queue, bp1); lba = bp1->bio_pblkno; count = bp1->bio_bcount / softc->params.secsize; /* Try to extend the previous range. */ if (lba == lastlba) { c = omin(count, UNMAP_RANGE_MAX - lastcount); lastlba += c; lastcount += c; scsi_ulto4b(lastcount, d[ranges - 1].length); count -= c; lba += c; totalcount += c; } else if ((softc->quirks & DA_Q_STRICT_UNMAP) && softc->unmap_gran != 0) { /* Align length of the previous range. */ if ((c = lastcount % softc->unmap_gran) != 0) { if (lastcount <= c) { totalcount -= lastcount; lastlba = (uint64_t)-1; lastcount = 0; ranges--; } else { totalcount -= c; lastlba -= c; lastcount -= c; scsi_ulto4b(lastcount, d[ranges - 1].length); } } /* Align beginning of the new range. */ c = (lba - softc->unmap_gran_align) % softc->unmap_gran; if (c != 0) { c = softc->unmap_gran - c; if (count <= c) { count = 0; } else { lba += c; count -= c; } } } while (count > 0) { c = omin(count, UNMAP_RANGE_MAX); if (totalcount + c > softc->unmap_max_lba || ranges >= softc->unmap_max_ranges) { xpt_print(periph->path, "%s issuing short delete %ld > %ld" "|| %d >= %d", da_delete_method_desc[softc->delete_method], totalcount + c, softc->unmap_max_lba, ranges, softc->unmap_max_ranges); break; } scsi_u64to8b(lba, d[ranges].lba); scsi_ulto4b(c, d[ranges].length); lba += c; totalcount += c; ranges++; count -= c; lastlba = lba; lastcount = c; } bp1 = cam_iosched_next_trim(softc->cam_iosched); if (bp1 == NULL) break; if (ranges >= softc->unmap_max_ranges || totalcount + bp1->bio_bcount / softc->params.secsize > softc->unmap_max_lba) { cam_iosched_put_back_trim(softc->cam_iosched, bp1); break; } } while (1); /* Align length of the last range. */ if ((softc->quirks & DA_Q_STRICT_UNMAP) && softc->unmap_gran != 0 && (c = lastcount % softc->unmap_gran) != 0) { if (lastcount <= c) ranges--; else scsi_ulto4b(lastcount - c, d[ranges - 1].length); } scsi_ulto2b(ranges * 16 + 6, &buf[0]); scsi_ulto2b(ranges * 16, &buf[2]); scsi_unmap(&ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*byte2*/0, /*data_ptr*/ buf, /*dxfer_len*/ ranges * 16 + 8, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; softc->trim_count++; softc->trim_ranges += ranges; softc->trim_lbas += totalcount; cam_iosched_submit_trim(softc->cam_iosched); } static void da_delete_trim(struct cam_periph *periph, union ccb *ccb, struct bio *bp) { struct da_softc *softc = (struct da_softc *)periph->softc; struct bio *bp1; uint8_t *buf = softc->unmap_buf; uint64_t lastlba = (uint64_t)-1; uint64_t count; uint64_t lba; uint32_t lastcount = 0, c, requestcount; int ranges = 0, off, block_count; bzero(softc->unmap_buf, sizeof(softc->unmap_buf)); bp1 = bp; do { if (bp1 != bp)//XXX imp XXX bioq_insert_tail(&softc->delete_run_queue, bp1); lba = bp1->bio_pblkno; count = bp1->bio_bcount / softc->params.secsize; requestcount = count; /* Try to extend the previous range. */ if (lba == lastlba) { c = omin(count, ATA_DSM_RANGE_MAX - lastcount); lastcount += c; off = (ranges - 1) * 8; buf[off + 6] = lastcount & 0xff; buf[off + 7] = (lastcount >> 8) & 0xff; count -= c; lba += c; } while (count > 0) { c = omin(count, ATA_DSM_RANGE_MAX); off = ranges * 8; buf[off + 0] = lba & 0xff; buf[off + 1] = (lba >> 8) & 0xff; buf[off + 2] = (lba >> 16) & 0xff; buf[off + 3] = (lba >> 24) & 0xff; buf[off + 4] = (lba >> 32) & 0xff; buf[off + 5] = (lba >> 40) & 0xff; buf[off + 6] = c & 0xff; buf[off + 7] = (c >> 8) & 0xff; lba += c; ranges++; count -= c; lastcount = c; if (count != 0 && ranges == softc->trim_max_ranges) { xpt_print(periph->path, "%s issuing short delete %ld > %ld\n", da_delete_method_desc[softc->delete_method], requestcount, (softc->trim_max_ranges - ranges) * ATA_DSM_RANGE_MAX); break; } } lastlba = lba; bp1 = cam_iosched_next_trim(softc->cam_iosched); if (bp1 == NULL) break; if (bp1->bio_bcount / softc->params.secsize > (softc->trim_max_ranges - ranges) * ATA_DSM_RANGE_MAX) { cam_iosched_put_back_trim(softc->cam_iosched, bp1); break; } } while (1); block_count = howmany(ranges, ATA_DSM_BLK_RANGES); scsi_ata_trim(&ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, block_count, /*data_ptr*/buf, /*dxfer_len*/block_count * ATA_DSM_BLK_SIZE, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; cam_iosched_submit_trim(softc->cam_iosched); } /* * We calculate ws_max_blks here based off d_delmaxsize instead * of using softc->ws_max_blks as it is absolute max for the * device not the protocol max which may well be lower. */ static void da_delete_ws(struct cam_periph *periph, union ccb *ccb, struct bio *bp) { struct da_softc *softc; struct bio *bp1; uint64_t ws_max_blks; uint64_t lba; uint64_t count; /* forward compat with WS32 */ softc = (struct da_softc *)periph->softc; ws_max_blks = softc->disk->d_delmaxsize / softc->params.secsize; lba = bp->bio_pblkno; count = 0; bp1 = bp; do { if (bp1 != bp)//XXX imp XXX bioq_insert_tail(&softc->delete_run_queue, bp1); count += bp1->bio_bcount / softc->params.secsize; if (count > ws_max_blks) { xpt_print(periph->path, "%s issuing short delete %ld > %ld\n", da_delete_method_desc[softc->delete_method], count, ws_max_blks); count = omin(count, ws_max_blks); break; } bp1 = cam_iosched_next_trim(softc->cam_iosched); if (bp1 == NULL) break; if (lba + count != bp1->bio_pblkno || count + bp1->bio_bcount / softc->params.secsize > ws_max_blks) { cam_iosched_put_back_trim(softc->cam_iosched, bp1); break; } } while (1); scsi_write_same(&ccb->csio, /*retries*/da_retry_count, /*cbfcnp*/dadone, /*tag_action*/MSG_SIMPLE_Q_TAG, /*byte2*/softc->delete_method == DA_DELETE_ZERO ? 0 : SWS_UNMAP, softc->delete_method == DA_DELETE_WS16 ? 16 : 10, /*lba*/lba, /*block_count*/count, /*data_ptr*/ __DECONST(void *, zero_region), /*dxfer_len*/ softc->params.secsize, /*sense_len*/SSD_FULL_SIZE, da_default_timeout * 1000); ccb->ccb_h.ccb_state = DA_CCB_DELETE; ccb->ccb_h.flags |= CAM_UNLOCKED; cam_iosched_submit_trim(softc->cam_iosched); } static int cmd6workaround(union ccb *ccb) { struct scsi_rw_6 cmd6; struct scsi_rw_10 *cmd10; struct da_softc *softc; u_int8_t *cdb; struct bio *bp; int frozen; cdb = ccb->csio.cdb_io.cdb_bytes; softc = (struct da_softc *)xpt_path_periph(ccb->ccb_h.path)->softc; if (ccb->ccb_h.ccb_state == DA_CCB_DELETE) { da_delete_methods old_method = softc->delete_method; /* * Typically there are two reasons for failure here * 1. Delete method was detected as supported but isn't * 2. Delete failed due to invalid params e.g. too big * * While we will attempt to choose an alternative delete method * this may result in short deletes if the existing delete * requests from geom are big for the new method chosen. * * This method assumes that the error which triggered this * will not retry the io otherwise a panic will occur */ dadeleteflag(softc, old_method, 0); dadeletemethodchoose(softc, DA_DELETE_DISABLE); if (softc->delete_method == DA_DELETE_DISABLE) xpt_print(ccb->ccb_h.path, "%s failed, disabling BIO_DELETE\n", da_delete_method_desc[old_method]); else xpt_print(ccb->ccb_h.path, "%s failed, switching to %s BIO_DELETE\n", da_delete_method_desc[old_method], da_delete_method_desc[softc->delete_method]); while ((bp = bioq_takefirst(&softc->delete_run_queue)) != NULL) cam_iosched_queue_work(softc->cam_iosched, bp); cam_iosched_queue_work(softc->cam_iosched, (struct bio *)ccb->ccb_h.ccb_bp); ccb->ccb_h.ccb_bp = NULL; return (0); } /* Detect unsupported PREVENT ALLOW MEDIUM REMOVAL. */ if ((ccb->ccb_h.flags & CAM_CDB_POINTER) == 0 && (*cdb == PREVENT_ALLOW) && (softc->quirks & DA_Q_NO_PREVENT) == 0) { if (bootverbose) xpt_print(ccb->ccb_h.path, "PREVENT ALLOW MEDIUM REMOVAL not supported.\n"); softc->quirks |= DA_Q_NO_PREVENT; return (0); } /* Detect unsupported SYNCHRONIZE CACHE(10). */ if ((ccb->ccb_h.flags & CAM_CDB_POINTER) == 0 && (*cdb == SYNCHRONIZE_CACHE) && (softc->quirks & DA_Q_NO_SYNC_CACHE) == 0) { if (bootverbose) xpt_print(ccb->ccb_h.path, "SYNCHRONIZE CACHE(10) not supported.\n"); softc->quirks |= DA_Q_NO_SYNC_CACHE; softc->disk->d_flags &= ~DISKFLAG_CANFLUSHCACHE; return (0); } /* Translation only possible if CDB is an array and cmd is R/W6 */ if ((ccb->ccb_h.flags & CAM_CDB_POINTER) != 0 || (*cdb != READ_6 && *cdb != WRITE_6)) return 0; xpt_print(ccb->ccb_h.path, "READ(6)/WRITE(6) not supported, " "increasing minimum_cmd_size to 10.\n"); softc->minimum_cmd_size = 10; bcopy(cdb, &cmd6, sizeof(struct scsi_rw_6)); cmd10 = (struct scsi_rw_10 *)cdb; cmd10->opcode = (cmd6.opcode == READ_6) ? READ_10 : WRITE_10; cmd10->byte2 = 0; scsi_ulto4b(scsi_3btoul(cmd6.addr), cmd10->addr); cmd10->reserved = 0; scsi_ulto2b(cmd6.length, cmd10->length); cmd10->control = cmd6.control; ccb->csio.cdb_len = sizeof(*cmd10); /* Requeue request, unfreezing queue if necessary */ frozen = (ccb->ccb_h.status & CAM_DEV_QFRZN) != 0; ccb->ccb_h.status = CAM_REQUEUE_REQ; xpt_action(ccb); if (frozen) { cam_release_devq(ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } return (ERESTART); } static void dazonedone(struct cam_periph *periph, union ccb *ccb) { struct da_softc *softc; struct bio *bp; softc = periph->softc; bp = (struct bio *)ccb->ccb_h.ccb_bp; switch (bp->bio_zone.zone_cmd) { case DISK_ZONE_OPEN: case DISK_ZONE_CLOSE: case DISK_ZONE_FINISH: case DISK_ZONE_RWP: break; case DISK_ZONE_REPORT_ZONES: { uint32_t avail_len; struct disk_zone_report *rep; struct scsi_report_zones_hdr *hdr; struct scsi_report_zones_desc *desc; struct disk_zone_rep_entry *entry; uint32_t hdr_len, num_avail; uint32_t num_to_fill, i; int ata; rep = &bp->bio_zone.zone_params.report; avail_len = ccb->csio.dxfer_len - ccb->csio.resid; /* * Note that bio_resid isn't normally used for zone * commands, but it is used by devstat_end_transaction_bio() * to determine how much data was transferred. Because * the size of the SCSI/ATA data structures is different * than the size of the BIO interface structures, the * amount of data actually transferred from the drive will * be different than the amount of data transferred to * the user. */ bp->bio_resid = ccb->csio.resid; hdr = (struct scsi_report_zones_hdr *)ccb->csio.data_ptr; if (avail_len < sizeof(*hdr)) { /* * Is there a better error than EIO here? We asked * for at least the header, and we got less than * that. */ bp->bio_error = EIO; bp->bio_flags |= BIO_ERROR; bp->bio_resid = bp->bio_bcount; break; } if (softc->zone_interface == DA_ZONE_IF_ATA_PASS) ata = 1; else ata = 0; hdr_len = ata ? le32dec(hdr->length) : scsi_4btoul(hdr->length); if (hdr_len > 0) rep->entries_available = hdr_len / sizeof(*desc); else rep->entries_available = 0; /* * NOTE: using the same values for the BIO version of the * same field as the SCSI/ATA values. This means we could * get some additional values that aren't defined in bio.h * if more values of the same field are defined later. */ rep->header.same = hdr->byte4 & SRZ_SAME_MASK; rep->header.maximum_lba = ata ? le64dec(hdr->maximum_lba) : scsi_8btou64(hdr->maximum_lba); /* * If the drive reports no entries that match the query, * we're done. */ if (hdr_len == 0) { rep->entries_filled = 0; break; } num_avail = min((avail_len - sizeof(*hdr)) / sizeof(*desc), hdr_len / sizeof(*desc)); /* * If the drive didn't return any data, then we're done. */ if (num_avail == 0) { rep->entries_filled = 0; break; } num_to_fill = min(num_avail, rep->entries_allocated); /* * If the user didn't allocate any entries for us to fill, * we're done. */ if (num_to_fill == 0) { rep->entries_filled = 0; break; } for (i = 0, desc = &hdr->desc_list[0], entry=&rep->entries[0]; i < num_to_fill; i++, desc++, entry++) { /* * NOTE: we're mapping the values here directly * from the SCSI/ATA bit definitions to the bio.h * definitons. There is also a warning in * disk_zone.h, but the impact is that if * additional values are added in the SCSI/ATA * specs these will be visible to consumers of * this interface. */ entry->zone_type = desc->zone_type & SRZ_TYPE_MASK; entry->zone_condition = (desc->zone_flags & SRZ_ZONE_COND_MASK) >> SRZ_ZONE_COND_SHIFT; entry->zone_flags |= desc->zone_flags & (SRZ_ZONE_NON_SEQ|SRZ_ZONE_RESET); entry->zone_length = ata ? le64dec(desc->zone_length) : scsi_8btou64(desc->zone_length); entry->zone_start_lba = ata ? le64dec(desc->zone_start_lba) : scsi_8btou64(desc->zone_start_lba); entry->write_pointer_lba = ata ? le64dec(desc->write_pointer_lba) : scsi_8btou64(desc->write_pointer_lba); } rep->entries_filled = num_to_fill; break; } case DISK_ZONE_GET_PARAMS: default: /* * In theory we should not get a GET_PARAMS bio, since it * should be handled without queueing the command to the * drive. */ panic("%s: Invalid zone command %d", __func__, bp->bio_zone.zone_cmd); break; } if (bp->bio_zone.zone_cmd == DISK_ZONE_REPORT_ZONES) free(ccb->csio.data_ptr, M_SCSIDA); } static void dadone(struct cam_periph *periph, union ccb *done_ccb) { struct bio *bp, *bp1; struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; da_ccb_state state; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; #if defined(BUF_TRACKING) || defined(FULL_BUF_TRACKING) if (csio->bio != NULL) biotrack(csio->bio, __func__); #endif state = csio->ccb_h.ccb_state & DA_CCB_TYPE_MASK; cam_periph_lock(periph); bp = (struct bio *)done_ccb->ccb_h.ccb_bp; if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { int error; int sf; if ((csio->ccb_h.ccb_state & DA_CCB_RETRY_UA) != 0) sf = SF_RETRY_UA; else sf = 0; error = daerror(done_ccb, CAM_RETRY_SELTO, sf); if (error == ERESTART) { /* A retry was scheduled, so just return. */ cam_periph_unlock(periph); return; } bp = (struct bio *)done_ccb->ccb_h.ccb_bp; if (error != 0) { int queued_error; /* * return all queued I/O with EIO, so that * the client can retry these I/Os in the * proper order should it attempt to recover. */ queued_error = EIO; if (error == ENXIO && (softc->flags & DA_FLAG_PACK_INVALID)== 0) { /* * Catastrophic error. Mark our pack as * invalid. * * XXX See if this is really a media * XXX change first? */ xpt_print(periph->path, "Invalidating pack\n"); softc->flags |= DA_FLAG_PACK_INVALID; #ifdef CAM_IO_STATS softc->invalidations++; #endif queued_error = ENXIO; } cam_iosched_flush(softc->cam_iosched, NULL, queued_error); if (bp != NULL) { bp->bio_error = error; bp->bio_resid = bp->bio_bcount; bp->bio_flags |= BIO_ERROR; } } else if (bp != NULL) { if (state == DA_CCB_DELETE) bp->bio_resid = 0; else bp->bio_resid = csio->resid; bp->bio_error = 0; if (bp->bio_resid != 0) bp->bio_flags |= BIO_ERROR; } if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } else if (bp != NULL) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) panic("REQ_CMP with QFRZN"); if (bp->bio_cmd == BIO_ZONE) dazonedone(periph, done_ccb); else if (state == DA_CCB_DELETE) bp->bio_resid = 0; else bp->bio_resid = csio->resid; if ((csio->resid > 0) && (bp->bio_cmd != BIO_ZONE)) bp->bio_flags |= BIO_ERROR; if (softc->error_inject != 0) { bp->bio_error = softc->error_inject; bp->bio_resid = bp->bio_bcount; bp->bio_flags |= BIO_ERROR; softc->error_inject = 0; } } if (bp != NULL) biotrack(bp, __func__); LIST_REMOVE(&done_ccb->ccb_h, periph_links.le); if (LIST_EMPTY(&softc->pending_ccbs)) softc->flags |= DA_FLAG_WAS_OTAG; /* * We need to call cam_iosched before we call biodone so that we don't * measure any activity that happens in the completion routine, which in * the case of sendfile can be quite extensive. Release the periph * refcount taken in dastart() for each CCB. */ cam_iosched_bio_complete(softc->cam_iosched, bp, done_ccb); xpt_release_ccb(done_ccb); KASSERT(softc->refcount >= 1, ("dadone softc %p refcount %d", softc, softc->refcount)); softc->refcount--; if (state == DA_CCB_DELETE) { TAILQ_HEAD(, bio) queue; TAILQ_INIT(&queue); TAILQ_CONCAT(&queue, &softc->delete_run_queue.queue, bio_queue); softc->delete_run_queue.insert_point = NULL; /* * Normally, the xpt_release_ccb() above would make sure * that when we have more work to do, that work would * get kicked off. However, we specifically keep * delete_running set to 0 before the call above to * allow other I/O to progress when many BIO_DELETE * requests are pushed down. We set delete_running to 0 * and call daschedule again so that we don't stall if * there are no other I/Os pending apart from BIO_DELETEs. */ cam_iosched_trim_done(softc->cam_iosched); daschedule(periph); cam_periph_unlock(periph); while ((bp1 = TAILQ_FIRST(&queue)) != NULL) { TAILQ_REMOVE(&queue, bp1, bio_queue); bp1->bio_error = bp->bio_error; if (bp->bio_flags & BIO_ERROR) { bp1->bio_flags |= BIO_ERROR; bp1->bio_resid = bp1->bio_bcount; } else bp1->bio_resid = 0; biodone(bp1); } } else { daschedule(periph); cam_periph_unlock(periph); } if (bp != NULL) biodone(bp); return; } static void dadone_probewp(struct cam_periph *periph, union ccb *done_ccb) { struct scsi_mode_header_6 *mode_hdr6; struct scsi_mode_header_10 *mode_hdr10; struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; uint8_t dev_spec; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probewp\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); KASSERT(softc->state == DA_STATE_PROBE_WP, ("State (%d) not PROBE_WP in dadone_probewp, periph %p ccb %p", softc->state, periph, done_ccb)); KASSERT((csio->ccb_h.ccb_state & DA_CCB_TYPE_MASK) == DA_CCB_PROBE_WP, ("CCB State (%lu) not PROBE_WP in dadone_probewp, periph %p ccb %p", (unsigned long)csio->ccb_h.ccb_state & DA_CCB_TYPE_MASK, periph, done_ccb)); if (softc->minimum_cmd_size > 6) { mode_hdr10 = (struct scsi_mode_header_10 *)csio->data_ptr; dev_spec = mode_hdr10->dev_spec; } else { mode_hdr6 = (struct scsi_mode_header_6 *)csio->data_ptr; dev_spec = mode_hdr6->dev_spec; } if (cam_ccb_status(done_ccb) == CAM_REQ_CMP) { if ((dev_spec & 0x80) != 0) softc->disk->d_flags |= DISKFLAG_WRITE_PROTECT; else softc->disk->d_flags &= ~DISKFLAG_WRITE_PROTECT; } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(csio->data_ptr, M_SCSIDA); if ((softc->flags & DA_FLAG_CAN_RC16) != 0) softc->state = DA_STATE_PROBE_RC16; else softc->state = DA_STATE_PROBE_RC; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } static void dadone_proberc(struct cam_periph *periph, union ccb *done_ccb) { struct scsi_read_capacity_data *rdcap; struct scsi_read_capacity_data_long *rcaplong; struct da_softc *softc; struct ccb_scsiio *csio; da_ccb_state state; char *announce_buf; u_int32_t priority; int lbp, n; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_proberc\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; state = csio->ccb_h.ccb_state & DA_CCB_TYPE_MASK; KASSERT(softc->state == DA_STATE_PROBE_RC || softc->state == DA_STATE_PROBE_RC16, ("State (%d) not PROBE_RC* in dadone_proberc, periph %p ccb %p", softc->state, periph, done_ccb)); KASSERT(state == DA_CCB_PROBE_RC || state == DA_CCB_PROBE_RC16, ("CCB State (%lu) not PROBE_RC* in dadone_probewp, periph %p ccb %p", (unsigned long)state, periph, done_ccb)); lbp = 0; rdcap = NULL; rcaplong = NULL; /* XXX TODO: can this be a malloc? */ announce_buf = softc->announce_temp; bzero(announce_buf, DA_ANNOUNCETMP_SZ); if (state == DA_CCB_PROBE_RC) rdcap =(struct scsi_read_capacity_data *)csio->data_ptr; else rcaplong = (struct scsi_read_capacity_data_long *) csio->data_ptr; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { struct disk_params *dp; uint32_t block_size; uint64_t maxsector; u_int lalba; /* Lowest aligned LBA. */ if (state == DA_CCB_PROBE_RC) { block_size = scsi_4btoul(rdcap->length); maxsector = scsi_4btoul(rdcap->addr); lalba = 0; /* * According to SBC-2, if the standard 10 * byte READ CAPACITY command returns 2^32, * we should issue the 16 byte version of * the command, since the device in question * has more sectors than can be represented * with the short version of the command. */ if (maxsector == 0xffffffff) { free(rdcap, M_SCSIDA); softc->state = DA_STATE_PROBE_RC16; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } } else { block_size = scsi_4btoul(rcaplong->length); maxsector = scsi_8btou64(rcaplong->addr); lalba = scsi_2btoul(rcaplong->lalba_lbp); } /* * Because GEOM code just will panic us if we * give them an 'illegal' value we'll avoid that * here. */ if (block_size == 0) { block_size = 512; if (maxsector == 0) maxsector = -1; } if (block_size >= MAXPHYS) { xpt_print(periph->path, "unsupportable block size %ju\n", (uintmax_t) block_size); announce_buf = NULL; cam_periph_invalidate(periph); } else { /* * We pass rcaplong into dasetgeom(), * because it will only use it if it is * non-NULL. */ dasetgeom(periph, block_size, maxsector, rcaplong, sizeof(*rcaplong)); lbp = (lalba & SRC16_LBPME_A); dp = &softc->params; n = snprintf(announce_buf, DA_ANNOUNCETMP_SZ, "%juMB (%ju %u byte sectors", ((uintmax_t)dp->secsize * dp->sectors) / (1024 * 1024), (uintmax_t)dp->sectors, dp->secsize); if (softc->p_type != 0) { n += snprintf(announce_buf + n, DA_ANNOUNCETMP_SZ - n, ", DIF type %d", softc->p_type); } snprintf(announce_buf + n, DA_ANNOUNCETMP_SZ - n, ")"); } } else { int error; /* * Retry any UNIT ATTENTION type errors. They * are expected at boot. */ error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) { /* * A retry was scheuled, so * just return. */ return; } else if (error != 0) { int asc, ascq; int sense_key, error_code; int have_sense; cam_status status; struct ccb_getdev cgd; /* Don't wedge this device's queue */ status = done_ccb->ccb_h.status; if ((status & CAM_DEV_QFRZN) != 0) cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); xpt_setup_ccb(&cgd.ccb_h, done_ccb->ccb_h.path, CAM_PRIORITY_NORMAL); cgd.ccb_h.func_code = XPT_GDEV_TYPE; xpt_action((union ccb *)&cgd); if (scsi_extract_sense_ccb(done_ccb, &error_code, &sense_key, &asc, &ascq)) have_sense = TRUE; else have_sense = FALSE; /* * If we tried READ CAPACITY(16) and failed, * fallback to READ CAPACITY(10). */ if ((state == DA_CCB_PROBE_RC16) && (softc->flags & DA_FLAG_CAN_RC16) && (((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INVALID) || ((have_sense) && (error_code == SSD_CURRENT_ERROR || error_code == SSD_DESC_CURRENT_ERROR) && (sense_key == SSD_KEY_ILLEGAL_REQUEST)))) { cam_periph_assert(periph, MA_OWNED); softc->flags &= ~DA_FLAG_CAN_RC16; free(rdcap, M_SCSIDA); softc->state = DA_STATE_PROBE_RC; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } /* * Attach to anything that claims to be a * direct access or optical disk device, * as long as it doesn't return a "Logical * unit not supported" (0x25) error. * "Internal Target Failure" (0x44) is also * special and typically means that the * device is a SATA drive behind a SATL * translation that's fallen into a * terminally fatal state. */ if ((have_sense) && (asc != 0x25) && (asc != 0x44) && (error_code == SSD_CURRENT_ERROR || error_code == SSD_DESC_CURRENT_ERROR)) { const char *sense_key_desc; const char *asc_desc; dasetgeom(periph, 512, -1, NULL, 0); scsi_sense_desc(sense_key, asc, ascq, &cgd.inq_data, &sense_key_desc, &asc_desc); snprintf(announce_buf, DA_ANNOUNCETMP_SZ, "Attempt to query device " "size failed: %s, %s", sense_key_desc, asc_desc); } else { if (have_sense) scsi_sense_print(&done_ccb->csio); else { xpt_print(periph->path, "got CAM status %#x\n", done_ccb->ccb_h.status); } xpt_print(periph->path, "fatal error, " "failed to attach to device\n"); announce_buf = NULL; /* * Free up resources. */ cam_periph_invalidate(periph); } } } free(csio->data_ptr, M_SCSIDA); if (announce_buf != NULL && ((softc->flags & DA_FLAG_ANNOUNCED) == 0)) { struct sbuf sb; sbuf_new(&sb, softc->announcebuf, DA_ANNOUNCE_SZ, SBUF_FIXEDLEN); xpt_announce_periph_sbuf(periph, &sb, announce_buf); xpt_announce_quirks_sbuf(periph, &sb, softc->quirks, DA_Q_BIT_STRING); sbuf_finish(&sb); sbuf_putbuf(&sb); /* * Create our sysctl variables, now that we know * we have successfully attached. */ /* increase the refcount */ if (da_periph_acquire(periph, DA_REF_SYSCTL) == 0) { taskqueue_enqueue(taskqueue_thread, &softc->sysctl_task); } else { /* XXX This message is useless! */ xpt_print(periph->path, "fatal error, " "could not acquire reference count\n"); } } /* We already probed the device. */ if (softc->flags & DA_FLAG_PROBED) { daprobedone(periph, done_ccb); return; } /* Ensure re-probe doesn't see old delete. */ softc->delete_available = 0; dadeleteflag(softc, DA_DELETE_ZERO, 1); if (lbp && (softc->quirks & DA_Q_NO_UNMAP) == 0) { /* * Based on older SBC-3 spec revisions * any of the UNMAP methods "may" be * available via LBP given this flag so * we flag all of them as available and * then remove those which further * probes confirm aren't available * later. * * We could also check readcap(16) p_type * flag to exclude one or more invalid * write same (X) types here */ dadeleteflag(softc, DA_DELETE_WS16, 1); dadeleteflag(softc, DA_DELETE_WS10, 1); dadeleteflag(softc, DA_DELETE_UNMAP, 1); softc->state = DA_STATE_PROBE_LBP; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } softc->state = DA_STATE_PROBE_BDC; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } static void dadone_probelbp(struct cam_periph *periph, union ccb *done_ccb) { struct scsi_vpd_logical_block_prov *lbp; struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probelbp\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; lbp = (struct scsi_vpd_logical_block_prov *)csio->data_ptr; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { /* * T10/1799-D Revision 31 states at least one of these * must be supported but we don't currently enforce this. */ dadeleteflag(softc, DA_DELETE_WS16, (lbp->flags & SVPD_LBP_WS16)); dadeleteflag(softc, DA_DELETE_WS10, (lbp->flags & SVPD_LBP_WS10)); dadeleteflag(softc, DA_DELETE_UNMAP, (lbp->flags & SVPD_LBP_UNMAP)); } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } /* * Failure indicates we don't support any SBC-3 * delete methods with UNMAP */ } } free(lbp, M_SCSIDA); softc->state = DA_STATE_PROBE_BLK_LIMITS; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } static void dadone_probeblklimits(struct cam_periph *periph, union ccb *done_ccb) { struct scsi_vpd_block_limits *block_limits; struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probeblklimits\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; block_limits = (struct scsi_vpd_block_limits *)csio->data_ptr; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint32_t max_txfer_len = scsi_4btoul( block_limits->max_txfer_len); uint32_t max_unmap_lba_cnt = scsi_4btoul( block_limits->max_unmap_lba_cnt); uint32_t max_unmap_blk_cnt = scsi_4btoul( block_limits->max_unmap_blk_cnt); uint32_t unmap_gran = scsi_4btoul( block_limits->opt_unmap_grain); uint32_t unmap_gran_align = scsi_4btoul( block_limits->unmap_grain_align); uint64_t ws_max_blks = scsi_8btou64( block_limits->max_write_same_length); if (max_txfer_len != 0) { softc->disk->d_maxsize = MIN(softc->maxio, (off_t)max_txfer_len * softc->params.secsize); } /* * We should already support UNMAP but we check lba * and block count to be sure */ if (max_unmap_lba_cnt != 0x00L && max_unmap_blk_cnt != 0x00L) { softc->unmap_max_lba = max_unmap_lba_cnt; softc->unmap_max_ranges = min(max_unmap_blk_cnt, UNMAP_MAX_RANGES); if (unmap_gran > 1) { softc->unmap_gran = unmap_gran; if (unmap_gran_align & 0x80000000) { softc->unmap_gran_align = unmap_gran_align & 0x7fffffff; } } } else { /* * Unexpected UNMAP limits which means the * device doesn't actually support UNMAP */ dadeleteflag(softc, DA_DELETE_UNMAP, 0); } if (ws_max_blks != 0x00L) softc->ws_max_blks = ws_max_blks; } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } /* * Failure here doesn't mean UNMAP is not * supported as this is an optional page. */ softc->unmap_max_lba = 1; softc->unmap_max_ranges = 1; } } free(block_limits, M_SCSIDA); softc->state = DA_STATE_PROBE_BDC; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } static void dadone_probebdc(struct cam_periph *periph, union ccb *done_ccb) { struct scsi_vpd_block_device_characteristics *bdc; struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probebdc\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; bdc = (struct scsi_vpd_block_device_characteristics *)csio->data_ptr; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint32_t valid_len; /* * Disable queue sorting for non-rotational media * by default. */ u_int16_t old_rate = softc->disk->d_rotation_rate; valid_len = csio->dxfer_len - csio->resid; if (SBDC_IS_PRESENT(bdc, valid_len, medium_rotation_rate)) { softc->disk->d_rotation_rate = scsi_2btoul(bdc->medium_rotation_rate); if (softc->disk->d_rotation_rate == SVPD_BDC_RATE_NON_ROTATING) { cam_iosched_set_sort_queue( softc->cam_iosched, 0); softc->flags &= ~DA_FLAG_ROTATING; } if (softc->disk->d_rotation_rate != old_rate) { disk_attr_changed(softc->disk, "GEOM::rotation_rate", M_NOWAIT); } } if ((SBDC_IS_PRESENT(bdc, valid_len, flags)) && (softc->zone_mode == DA_ZONE_NONE)) { int ata_proto; if (scsi_vpd_supported_page(periph, SVPD_ATA_INFORMATION)) ata_proto = 1; else ata_proto = 0; /* * The Zoned field will only be set for * Drive Managed and Host Aware drives. If * they are Host Managed, the device type * in the standard INQUIRY data should be * set to T_ZBC_HM (0x14). */ if ((bdc->flags & SVPD_ZBC_MASK) == SVPD_HAW_ZBC) { softc->zone_mode = DA_ZONE_HOST_AWARE; softc->zone_interface = (ata_proto) ? DA_ZONE_IF_ATA_SAT : DA_ZONE_IF_SCSI; } else if ((bdc->flags & SVPD_ZBC_MASK) == SVPD_DM_ZBC) { softc->zone_mode =DA_ZONE_DRIVE_MANAGED; softc->zone_interface = (ata_proto) ? DA_ZONE_IF_ATA_SAT : DA_ZONE_IF_SCSI; } else if ((bdc->flags & SVPD_ZBC_MASK) != SVPD_ZBC_NR) { xpt_print(periph->path, "Unknown zoned " "type %#x", bdc->flags & SVPD_ZBC_MASK); } } } else { int error; error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(bdc, M_SCSIDA); softc->state = DA_STATE_PROBE_ATA; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } static void dadone_probeata(struct cam_periph *periph, union ccb *done_ccb) { struct ata_params *ata_params; struct ccb_scsiio *csio; struct da_softc *softc; u_int32_t priority; int continue_probe; int error; int16_t *ptr; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probeata\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; ata_params = (struct ata_params *)csio->data_ptr; ptr = (uint16_t *)ata_params; continue_probe = 0; error = 0; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint16_t old_rate; ata_param_fixup(ata_params); if (ata_params->support_dsm & ATA_SUPPORT_DSM_TRIM && (softc->quirks & DA_Q_NO_UNMAP) == 0) { dadeleteflag(softc, DA_DELETE_ATA_TRIM, 1); if (ata_params->max_dsm_blocks != 0) softc->trim_max_ranges = min( softc->trim_max_ranges, ata_params->max_dsm_blocks * ATA_DSM_BLK_RANGES); } /* * Disable queue sorting for non-rotational media * by default. */ old_rate = softc->disk->d_rotation_rate; softc->disk->d_rotation_rate = ata_params->media_rotation_rate; if (softc->disk->d_rotation_rate == ATA_RATE_NON_ROTATING) { cam_iosched_set_sort_queue(softc->cam_iosched, 0); softc->flags &= ~DA_FLAG_ROTATING; } if (softc->disk->d_rotation_rate != old_rate) { disk_attr_changed(softc->disk, "GEOM::rotation_rate", M_NOWAIT); } cam_periph_assert(periph, MA_OWNED); if (ata_params->capabilities1 & ATA_SUPPORT_DMA) softc->flags |= DA_FLAG_CAN_ATA_DMA; if (ata_params->support.extension & ATA_SUPPORT_GENLOG) softc->flags |= DA_FLAG_CAN_ATA_LOG; /* * At this point, if we have a SATA host aware drive, * we communicate via ATA passthrough unless the * SAT layer supports ZBC -> ZAC translation. In * that case, * * XXX KDM figure out how to detect a host managed * SATA drive. */ if (softc->zone_mode == DA_ZONE_NONE) { /* * Note that we don't override the zone * mode or interface if it has already been * set. This is because it has either been * set as a quirk, or when we probed the * SCSI Block Device Characteristics page, * the zoned field was set. The latter * means that the SAT layer supports ZBC to * ZAC translation, and we would prefer to * use that if it is available. */ if ((ata_params->support3 & ATA_SUPPORT_ZONE_MASK) == ATA_SUPPORT_ZONE_HOST_AWARE) { softc->zone_mode = DA_ZONE_HOST_AWARE; softc->zone_interface = DA_ZONE_IF_ATA_PASS; } else if ((ata_params->support3 & ATA_SUPPORT_ZONE_MASK) == ATA_SUPPORT_ZONE_DEV_MANAGED) { softc->zone_mode =DA_ZONE_DRIVE_MANAGED; softc->zone_interface = DA_ZONE_IF_ATA_PASS; } } } else { error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } if ((softc->zone_mode == DA_ZONE_HOST_AWARE) || (softc->zone_mode == DA_ZONE_HOST_MANAGED)) { /* * If the ATA IDENTIFY failed, we could be talking * to a SCSI drive, although that seems unlikely, * since the drive did report that it supported the * ATA Information VPD page. If the ATA IDENTIFY * succeeded, and the SAT layer doesn't support * ZBC -> ZAC translation, continue on to get the * directory of ATA logs, and complete the rest of * the ZAC probe. If the SAT layer does support * ZBC -> ZAC translation, we want to use that, * and we'll probe the SCSI Zoned Block Device * Characteristics VPD page next. */ if ((error == 0) && (softc->flags & DA_FLAG_CAN_ATA_LOG) && (softc->zone_interface == DA_ZONE_IF_ATA_PASS)) softc->state = DA_STATE_PROBE_ATA_LOGDIR; else softc->state = DA_STATE_PROBE_ZONE; continue_probe = 1; } if (continue_probe != 0) { xpt_schedule(periph, priority); xpt_release_ccb(done_ccb); return; } else daprobedone(periph, done_ccb); return; } static void dadone_probeatalogdir(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; int error; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probeatalogdir\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { error = 0; softc->valid_logdir_len = 0; bzero(&softc->ata_logdir, sizeof(softc->ata_logdir)); softc->valid_logdir_len = csio->dxfer_len - csio->resid; if (softc->valid_logdir_len > 0) bcopy(csio->data_ptr, &softc->ata_logdir, min(softc->valid_logdir_len, sizeof(softc->ata_logdir))); /* * Figure out whether the Identify Device log is * supported. The General Purpose log directory * has a header, and lists the number of pages * available for each GP log identified by the * offset into the list. */ if ((softc->valid_logdir_len >= ((ATA_IDENTIFY_DATA_LOG + 1) * sizeof(uint16_t))) && (le16dec(softc->ata_logdir.header) == ATA_GP_LOG_DIR_VERSION) && (le16dec(&softc->ata_logdir.num_pages[ (ATA_IDENTIFY_DATA_LOG * sizeof(uint16_t)) - sizeof(uint16_t)]) > 0)){ softc->flags |= DA_FLAG_CAN_ATA_IDLOG; } else { softc->flags &= ~DA_FLAG_CAN_ATA_IDLOG; } } else { error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { /* * If we can't get the ATA log directory, * then ATA logs are effectively not * supported even if the bit is set in the * identify data. */ softc->flags &= ~(DA_FLAG_CAN_ATA_LOG | DA_FLAG_CAN_ATA_IDLOG); if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(csio->data_ptr, M_SCSIDA); if ((error == 0) && (softc->flags & DA_FLAG_CAN_ATA_IDLOG)) { softc->state = DA_STATE_PROBE_ATA_IDDIR; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } daprobedone(periph, done_ccb); return; } static void dadone_probeataiddir(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; int error; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probeataiddir\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { off_t entries_offset, max_entries; error = 0; softc->valid_iddir_len = 0; bzero(&softc->ata_iddir, sizeof(softc->ata_iddir)); softc->flags &= ~(DA_FLAG_CAN_ATA_SUPCAP | DA_FLAG_CAN_ATA_ZONE); softc->valid_iddir_len = csio->dxfer_len - csio->resid; if (softc->valid_iddir_len > 0) bcopy(csio->data_ptr, &softc->ata_iddir, min(softc->valid_iddir_len, sizeof(softc->ata_iddir))); entries_offset = __offsetof(struct ata_identify_log_pages,entries); max_entries = softc->valid_iddir_len - entries_offset; if ((softc->valid_iddir_len > (entries_offset + 1)) && (le64dec(softc->ata_iddir.header) == ATA_IDLOG_REVISION) && (softc->ata_iddir.entry_count > 0)) { int num_entries, i; num_entries = softc->ata_iddir.entry_count; num_entries = min(num_entries, softc->valid_iddir_len - entries_offset); for (i = 0; i < num_entries && i < max_entries; i++) { if (softc->ata_iddir.entries[i] == ATA_IDL_SUP_CAP) softc->flags |= DA_FLAG_CAN_ATA_SUPCAP; else if (softc->ata_iddir.entries[i] == ATA_IDL_ZDI) softc->flags |= DA_FLAG_CAN_ATA_ZONE; if ((softc->flags & DA_FLAG_CAN_ATA_SUPCAP) && (softc->flags & DA_FLAG_CAN_ATA_ZONE)) break; } } } else { error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { /* * If we can't get the ATA Identify Data log * directory, then it effectively isn't * supported even if the ATA Log directory * a non-zero number of pages present for * this log. */ softc->flags &= ~DA_FLAG_CAN_ATA_IDLOG; if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(csio->data_ptr, M_SCSIDA); if ((error == 0) && (softc->flags & DA_FLAG_CAN_ATA_SUPCAP)) { softc->state = DA_STATE_PROBE_ATA_SUP; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } daprobedone(periph, done_ccb); return; } static void dadone_probeatasup(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; u_int32_t priority; int error; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probeatasup\n")); softc = (struct da_softc *)periph->softc; priority = done_ccb->ccb_h.pinfo.priority; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint32_t valid_len; size_t needed_size; struct ata_identify_log_sup_cap *sup_cap; error = 0; sup_cap = (struct ata_identify_log_sup_cap *)csio->data_ptr; valid_len = csio->dxfer_len - csio->resid; needed_size = __offsetof(struct ata_identify_log_sup_cap, sup_zac_cap) + 1 + sizeof(sup_cap->sup_zac_cap); if (valid_len >= needed_size) { uint64_t zoned, zac_cap; zoned = le64dec(sup_cap->zoned_cap); if (zoned & ATA_ZONED_VALID) { /* * This should have already been * set, because this is also in the * ATA identify data. */ if ((zoned & ATA_ZONED_MASK) == ATA_SUPPORT_ZONE_HOST_AWARE) softc->zone_mode = DA_ZONE_HOST_AWARE; else if ((zoned & ATA_ZONED_MASK) == ATA_SUPPORT_ZONE_DEV_MANAGED) softc->zone_mode = DA_ZONE_DRIVE_MANAGED; } zac_cap = le64dec(sup_cap->sup_zac_cap); if (zac_cap & ATA_SUP_ZAC_CAP_VALID) { if (zac_cap & ATA_REPORT_ZONES_SUP) softc->zone_flags |= DA_ZONE_FLAG_RZ_SUP; if (zac_cap & ATA_ND_OPEN_ZONE_SUP) softc->zone_flags |= DA_ZONE_FLAG_OPEN_SUP; if (zac_cap & ATA_ND_CLOSE_ZONE_SUP) softc->zone_flags |= DA_ZONE_FLAG_CLOSE_SUP; if (zac_cap & ATA_ND_FINISH_ZONE_SUP) softc->zone_flags |= DA_ZONE_FLAG_FINISH_SUP; if (zac_cap & ATA_ND_RWP_SUP) softc->zone_flags |= DA_ZONE_FLAG_RWP_SUP; } else { /* * This field was introduced in * ACS-4, r08 on April 28th, 2015. * If the drive firmware was written * to an earlier spec, it won't have * the field. So, assume all * commands are supported. */ softc->zone_flags |= DA_ZONE_FLAG_SUP_MASK; } } } else { error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { /* * If we can't get the ATA Identify Data * Supported Capabilities page, clear the * flag... */ softc->flags &= ~DA_FLAG_CAN_ATA_SUPCAP; /* * And clear zone capabilities. */ softc->zone_flags &= ~DA_ZONE_FLAG_SUP_MASK; if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(csio->data_ptr, M_SCSIDA); if ((error == 0) && (softc->flags & DA_FLAG_CAN_ATA_ZONE)) { softc->state = DA_STATE_PROBE_ATA_ZONE; xpt_release_ccb(done_ccb); xpt_schedule(periph, priority); return; } daprobedone(periph, done_ccb); return; } static void dadone_probeatazone(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; int error; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probeatazone\n")); softc = (struct da_softc *)periph->softc; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { struct ata_zoned_info_log *zi_log; uint32_t valid_len; size_t needed_size; zi_log = (struct ata_zoned_info_log *)csio->data_ptr; valid_len = csio->dxfer_len - csio->resid; needed_size = __offsetof(struct ata_zoned_info_log, version_info) + 1 + sizeof(zi_log->version_info); if (valid_len >= needed_size) { uint64_t tmpvar; tmpvar = le64dec(zi_log->zoned_cap); if (tmpvar & ATA_ZDI_CAP_VALID) { if (tmpvar & ATA_ZDI_CAP_URSWRZ) softc->zone_flags |= DA_ZONE_FLAG_URSWRZ; else softc->zone_flags &= ~DA_ZONE_FLAG_URSWRZ; } tmpvar = le64dec(zi_log->optimal_seq_zones); if (tmpvar & ATA_ZDI_OPT_SEQ_VALID) { softc->zone_flags |= DA_ZONE_FLAG_OPT_SEQ_SET; softc->optimal_seq_zones = (tmpvar & ATA_ZDI_OPT_SEQ_MASK); } else { softc->zone_flags &= ~DA_ZONE_FLAG_OPT_SEQ_SET; softc->optimal_seq_zones = 0; } tmpvar =le64dec(zi_log->optimal_nonseq_zones); if (tmpvar & ATA_ZDI_OPT_NS_VALID) { softc->zone_flags |= DA_ZONE_FLAG_OPT_NONSEQ_SET; softc->optimal_nonseq_zones = (tmpvar & ATA_ZDI_OPT_NS_MASK); } else { softc->zone_flags &= ~DA_ZONE_FLAG_OPT_NONSEQ_SET; softc->optimal_nonseq_zones = 0; } tmpvar = le64dec(zi_log->max_seq_req_zones); if (tmpvar & ATA_ZDI_MAX_SEQ_VALID) { softc->zone_flags |= DA_ZONE_FLAG_MAX_SEQ_SET; softc->max_seq_zones = (tmpvar & ATA_ZDI_MAX_SEQ_MASK); } else { softc->zone_flags &= ~DA_ZONE_FLAG_MAX_SEQ_SET; softc->max_seq_zones = 0; } } } else { error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { softc->flags &= ~DA_FLAG_CAN_ATA_ZONE; softc->flags &= ~DA_ZONE_FLAG_SET_MASK; if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(csio->data_ptr, M_SCSIDA); daprobedone(periph, done_ccb); return; } static void dadone_probezone(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; int error; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_probezone\n")); softc = (struct da_softc *)periph->softc; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); if ((csio->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_CMP) { uint32_t valid_len; size_t needed_len; struct scsi_vpd_zoned_bdc *zoned_bdc; error = 0; zoned_bdc = (struct scsi_vpd_zoned_bdc *)csio->data_ptr; valid_len = csio->dxfer_len - csio->resid; needed_len = __offsetof(struct scsi_vpd_zoned_bdc, max_seq_req_zones) + 1 + sizeof(zoned_bdc->max_seq_req_zones); if ((valid_len >= needed_len) && (scsi_2btoul(zoned_bdc->page_length) >= SVPD_ZBDC_PL)) { if (zoned_bdc->flags & SVPD_ZBDC_URSWRZ) softc->zone_flags |= DA_ZONE_FLAG_URSWRZ; else softc->zone_flags &= ~DA_ZONE_FLAG_URSWRZ; softc->optimal_seq_zones = scsi_4btoul(zoned_bdc->optimal_seq_zones); softc->zone_flags |= DA_ZONE_FLAG_OPT_SEQ_SET; softc->optimal_nonseq_zones = scsi_4btoul( zoned_bdc->optimal_nonseq_zones); softc->zone_flags |= DA_ZONE_FLAG_OPT_NONSEQ_SET; softc->max_seq_zones = scsi_4btoul(zoned_bdc->max_seq_req_zones); softc->zone_flags |= DA_ZONE_FLAG_MAX_SEQ_SET; } /* * All of the zone commands are mandatory for SCSI * devices. * * XXX KDM this is valid as of September 2015. * Re-check this assumption once the SAT spec is * updated to support SCSI ZBC to ATA ZAC mapping. * Since ATA allows zone commands to be reported * as supported or not, this may not necessarily * be true for an ATA device behind a SAT (SCSI to * ATA Translation) layer. */ softc->zone_flags |= DA_ZONE_FLAG_SUP_MASK; } else { error = daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA|SF_NO_PRINT); if (error == ERESTART) return; else if (error != 0) { if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) { /* Don't wedge this device's queue */ cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } } } free(csio->data_ptr, M_SCSIDA); daprobedone(periph, done_ccb); return; } static void dadone_tur(struct cam_periph *periph, union ccb *done_ccb) { struct da_softc *softc; struct ccb_scsiio *csio; CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("dadone_tur\n")); softc = (struct da_softc *)periph->softc; csio = &done_ccb->csio; cam_periph_assert(periph, MA_OWNED); if ((done_ccb->ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { if (daerror(done_ccb, CAM_RETRY_SELTO, SF_RETRY_UA | SF_NO_RECOVERY | SF_NO_PRINT) == ERESTART) return; /* Will complete again, keep reference */ if ((done_ccb->ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(done_ccb->ccb_h.path, /*relsim_flags*/0, /*reduction*/0, /*timeout*/0, /*getcount_only*/0); } softc->flags &= ~DA_FLAG_TUR_PENDING; xpt_release_ccb(done_ccb); da_periph_release_locked(periph, DA_REF_TUR); return; } static void dareprobe(struct cam_periph *periph) { struct da_softc *softc; int status; softc = (struct da_softc *)periph->softc; cam_periph_assert(periph, MA_OWNED); /* Probe in progress; don't interfere. */ if (softc->state != DA_STATE_NORMAL) return; status = da_periph_acquire(periph, DA_REF_REPROBE); KASSERT(status == 0, ("dareprobe: cam_periph_acquire failed")); softc->state = DA_STATE_PROBE_WP; xpt_schedule(periph, CAM_PRIORITY_DEV); } static int daerror(union ccb *ccb, u_int32_t cam_flags, u_int32_t sense_flags) { struct da_softc *softc; struct cam_periph *periph; int error, error_code, sense_key, asc, ascq; #if defined(BUF_TRACKING) || defined(FULL_BUF_TRACKING) if (ccb->csio.bio != NULL) biotrack(ccb->csio.bio, __func__); #endif periph = xpt_path_periph(ccb->ccb_h.path); softc = (struct da_softc *)periph->softc; cam_periph_assert(periph, MA_OWNED); /* * Automatically detect devices that do not support * READ(6)/WRITE(6) and upgrade to using 10 byte cdbs. */ error = 0; if ((ccb->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INVALID) { error = cmd6workaround(ccb); } else if (scsi_extract_sense_ccb(ccb, &error_code, &sense_key, &asc, &ascq)) { if (sense_key == SSD_KEY_ILLEGAL_REQUEST) error = cmd6workaround(ccb); /* * If the target replied with CAPACITY DATA HAS CHANGED UA, * query the capacity and notify upper layers. */ else if (sense_key == SSD_KEY_UNIT_ATTENTION && asc == 0x2A && ascq == 0x09) { xpt_print(periph->path, "Capacity data has changed\n"); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); sense_flags |= SF_NO_PRINT; } else if (sense_key == SSD_KEY_UNIT_ATTENTION && asc == 0x28 && ascq == 0x00) { softc->flags &= ~DA_FLAG_PROBED; disk_media_changed(softc->disk, M_NOWAIT); } else if (sense_key == SSD_KEY_UNIT_ATTENTION && asc == 0x3F && ascq == 0x03) { xpt_print(periph->path, "INQUIRY data has changed\n"); softc->flags &= ~DA_FLAG_PROBED; dareprobe(periph); sense_flags |= SF_NO_PRINT; } else if (sense_key == SSD_KEY_NOT_READY && asc == 0x3a && (softc->flags & DA_FLAG_PACK_INVALID) == 0) { softc->flags |= DA_FLAG_PACK_INVALID; disk_media_gone(softc->disk, M_NOWAIT); } } if (error == ERESTART) return (ERESTART); #ifdef CAM_IO_STATS switch (ccb->ccb_h.status & CAM_STATUS_MASK) { case CAM_CMD_TIMEOUT: softc->timeouts++; break; case CAM_REQ_ABORTED: case CAM_REQ_CMP_ERR: case CAM_REQ_TERMIO: case CAM_UNREC_HBA_ERROR: case CAM_DATA_RUN_ERR: softc->errors++; break; default: break; } #endif /* * XXX * Until we have a better way of doing pack validation, * don't treat UAs as errors. */ sense_flags |= SF_RETRY_UA; if (softc->quirks & DA_Q_RETRY_BUSY) sense_flags |= SF_RETRY_BUSY; return(cam_periph_error(ccb, cam_flags, sense_flags)); } static void damediapoll(void *arg) { struct cam_periph *periph = arg; struct da_softc *softc = periph->softc; if (!cam_iosched_has_work_flags(softc->cam_iosched, DA_WORK_TUR) && (softc->flags & DA_FLAG_TUR_PENDING) == 0 && softc->state == DA_STATE_NORMAL && LIST_EMPTY(&softc->pending_ccbs)) { if (da_periph_acquire(periph, DA_REF_TUR) == 0) { cam_iosched_set_work_flags(softc->cam_iosched, DA_WORK_TUR); daschedule(periph); } } /* Queue us up again */ if (da_poll_period != 0) callout_schedule(&softc->mediapoll_c, da_poll_period * hz); } static void daprevent(struct cam_periph *periph, int action) { struct da_softc *softc; union ccb *ccb; int error; cam_periph_assert(periph, MA_OWNED); softc = (struct da_softc *)periph->softc; if (((action == PR_ALLOW) && (softc->flags & DA_FLAG_PACK_LOCKED) == 0) || ((action == PR_PREVENT) && (softc->flags & DA_FLAG_PACK_LOCKED) != 0)) { return; } ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); scsi_prevent(&ccb->csio, /*retries*/1, /*cbcfp*/NULL, MSG_SIMPLE_Q_TAG, action, SSD_FULL_SIZE, 5000); error = cam_periph_runccb(ccb, daerror, CAM_RETRY_SELTO, SF_RETRY_UA | SF_NO_PRINT, softc->disk->d_devstat); if (error == 0) { if (action == PR_ALLOW) softc->flags &= ~DA_FLAG_PACK_LOCKED; else softc->flags |= DA_FLAG_PACK_LOCKED; } xpt_release_ccb(ccb); } static void dasetgeom(struct cam_periph *periph, uint32_t block_len, uint64_t maxsector, struct scsi_read_capacity_data_long *rcaplong, size_t rcap_len) { struct ccb_calc_geometry ccg; struct da_softc *softc; struct disk_params *dp; u_int lbppbe, lalba; int error; softc = (struct da_softc *)periph->softc; dp = &softc->params; dp->secsize = block_len; dp->sectors = maxsector + 1; if (rcaplong != NULL) { lbppbe = rcaplong->prot_lbppbe & SRC16_LBPPBE; lalba = scsi_2btoul(rcaplong->lalba_lbp); lalba &= SRC16_LALBA_A; if (rcaplong->prot & SRC16_PROT_EN) softc->p_type = ((rcaplong->prot & SRC16_P_TYPE) >> SRC16_P_TYPE_SHIFT) + 1; else softc->p_type = 0; } else { lbppbe = 0; lalba = 0; softc->p_type = 0; } if (lbppbe > 0) { dp->stripesize = block_len << lbppbe; dp->stripeoffset = (dp->stripesize - block_len * lalba) % dp->stripesize; } else if (softc->quirks & DA_Q_4K) { dp->stripesize = 4096; dp->stripeoffset = 0; } else if (softc->unmap_gran != 0) { dp->stripesize = block_len * softc->unmap_gran; dp->stripeoffset = (dp->stripesize - block_len * softc->unmap_gran_align) % dp->stripesize; } else { dp->stripesize = 0; dp->stripeoffset = 0; } /* * Have the controller provide us with a geometry * for this disk. The only time the geometry * matters is when we boot and the controller * is the only one knowledgeable enough to come * up with something that will make this a bootable * device. */ xpt_setup_ccb(&ccg.ccb_h, periph->path, CAM_PRIORITY_NORMAL); ccg.ccb_h.func_code = XPT_CALC_GEOMETRY; ccg.block_size = dp->secsize; ccg.volume_size = dp->sectors; ccg.heads = 0; ccg.secs_per_track = 0; ccg.cylinders = 0; xpt_action((union ccb*)&ccg); if ((ccg.ccb_h.status & CAM_STATUS_MASK) != CAM_REQ_CMP) { /* * We don't know what went wrong here- but just pick * a geometry so we don't have nasty things like divide * by zero. */ dp->heads = 255; dp->secs_per_track = 255; dp->cylinders = dp->sectors / (255 * 255); if (dp->cylinders == 0) { dp->cylinders = 1; } } else { dp->heads = ccg.heads; dp->secs_per_track = ccg.secs_per_track; dp->cylinders = ccg.cylinders; } /* * If the user supplied a read capacity buffer, and if it is * different than the previous buffer, update the data in the EDT. * If it's the same, we don't bother. This avoids sending an * update every time someone opens this device. */ if ((rcaplong != NULL) && (bcmp(rcaplong, &softc->rcaplong, min(sizeof(softc->rcaplong), rcap_len)) != 0)) { struct ccb_dev_advinfo cdai; xpt_setup_ccb(&cdai.ccb_h, periph->path, CAM_PRIORITY_NORMAL); cdai.ccb_h.func_code = XPT_DEV_ADVINFO; cdai.buftype = CDAI_TYPE_RCAPLONG; cdai.flags = CDAI_FLAG_STORE; cdai.bufsiz = rcap_len; cdai.buf = (uint8_t *)rcaplong; xpt_action((union ccb *)&cdai); if ((cdai.ccb_h.status & CAM_DEV_QFRZN) != 0) cam_release_devq(cdai.ccb_h.path, 0, 0, 0, FALSE); if (cdai.ccb_h.status != CAM_REQ_CMP) { xpt_print(periph->path, "%s: failed to set read " "capacity advinfo\n", __func__); /* Use cam_error_print() to decode the status */ cam_error_print((union ccb *)&cdai, CAM_ESF_CAM_STATUS, CAM_EPF_ALL); } else { bcopy(rcaplong, &softc->rcaplong, min(sizeof(softc->rcaplong), rcap_len)); } } softc->disk->d_sectorsize = softc->params.secsize; softc->disk->d_mediasize = softc->params.secsize * (off_t)softc->params.sectors; softc->disk->d_stripesize = softc->params.stripesize; softc->disk->d_stripeoffset = softc->params.stripeoffset; /* XXX: these are not actually "firmware" values, so they may be wrong */ softc->disk->d_fwsectors = softc->params.secs_per_track; softc->disk->d_fwheads = softc->params.heads; softc->disk->d_devstat->block_size = softc->params.secsize; softc->disk->d_devstat->flags &= ~DEVSTAT_BS_UNAVAILABLE; error = disk_resize(softc->disk, M_NOWAIT); if (error != 0) xpt_print(periph->path, "disk_resize(9) failed, error = %d\n", error); } static void dasendorderedtag(void *arg) { struct cam_periph *periph = arg; struct da_softc *softc = periph->softc; cam_periph_assert(periph, MA_OWNED); if (da_send_ordered) { if (!LIST_EMPTY(&softc->pending_ccbs)) { if ((softc->flags & DA_FLAG_WAS_OTAG) == 0) softc->flags |= DA_FLAG_NEED_OTAG; softc->flags &= ~DA_FLAG_WAS_OTAG; } } /* Queue us up again */ callout_reset(&softc->sendordered_c, (da_default_timeout * hz) / DA_ORDEREDTAG_INTERVAL, dasendorderedtag, periph); } /* * Step through all DA peripheral drivers, and if the device is still open, * sync the disk cache to physical media. */ static void dashutdown(void * arg, int howto) { struct cam_periph *periph; struct da_softc *softc; union ccb *ccb; int error; CAM_PERIPH_FOREACH(periph, &dadriver) { softc = (struct da_softc *)periph->softc; if (SCHEDULER_STOPPED()) { /* If we paniced with the lock held, do not recurse. */ if (!cam_periph_owned(periph) && (softc->flags & DA_FLAG_OPEN)) { dadump(softc->disk, NULL, 0, 0, 0); } continue; } cam_periph_lock(periph); /* * We only sync the cache if the drive is still open, and * if the drive is capable of it.. */ if (((softc->flags & DA_FLAG_OPEN) == 0) || (softc->quirks & DA_Q_NO_SYNC_CACHE)) { cam_periph_unlock(periph); continue; } ccb = cam_periph_getccb(periph, CAM_PRIORITY_NORMAL); scsi_synchronize_cache(&ccb->csio, /*retries*/0, /*cbfcnp*/NULL, MSG_SIMPLE_Q_TAG, /*begin_lba*/0, /* whole disk */ /*lb_count*/0, SSD_FULL_SIZE, 60 * 60 * 1000); error = cam_periph_runccb(ccb, daerror, /*cam_flags*/0, /*sense_flags*/ SF_NO_RECOVERY | SF_NO_RETRY | SF_QUIET_IR, softc->disk->d_devstat); if (error != 0) xpt_print(periph->path, "Synchronize cache failed\n"); xpt_release_ccb(ccb); cam_periph_unlock(periph); } } #else /* !_KERNEL */ /* * XXX These are only left out of the kernel build to silence warnings. If, * for some reason these functions are used in the kernel, the ifdefs should * be moved so they are included both in the kernel and userland. */ void scsi_format_unit(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int8_t tag_action, u_int8_t byte2, u_int16_t ileave, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int32_t timeout) { struct scsi_format_unit *scsi_cmd; scsi_cmd = (struct scsi_format_unit *)&csio->cdb_io.cdb_bytes; scsi_cmd->opcode = FORMAT_UNIT; scsi_cmd->byte2 = byte2; scsi_ulto2b(ileave, scsi_cmd->interleave); cam_fill_csio(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_OUT : CAM_DIR_NONE, tag_action, data_ptr, dxfer_len, sense_len, sizeof(*scsi_cmd), timeout); } void scsi_read_defects(struct ccb_scsiio *csio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint8_t tag_action, uint8_t list_format, uint32_t addr_desc_index, uint8_t *data_ptr, uint32_t dxfer_len, int minimum_cmd_size, uint8_t sense_len, uint32_t timeout) { uint8_t cdb_len; /* * These conditions allow using the 10 byte command. Otherwise we * need to use the 12 byte command. */ if ((minimum_cmd_size <= 10) && (addr_desc_index == 0) && (dxfer_len <= SRDD10_MAX_LENGTH)) { struct scsi_read_defect_data_10 *cdb10; cdb10 = (struct scsi_read_defect_data_10 *) &csio->cdb_io.cdb_bytes; cdb_len = sizeof(*cdb10); bzero(cdb10, cdb_len); cdb10->opcode = READ_DEFECT_DATA_10; cdb10->format = list_format; scsi_ulto2b(dxfer_len, cdb10->alloc_length); } else { struct scsi_read_defect_data_12 *cdb12; cdb12 = (struct scsi_read_defect_data_12 *) &csio->cdb_io.cdb_bytes; cdb_len = sizeof(*cdb12); bzero(cdb12, cdb_len); cdb12->opcode = READ_DEFECT_DATA_12; cdb12->format = list_format; scsi_ulto4b(dxfer_len, cdb12->alloc_length); scsi_ulto4b(addr_desc_index, cdb12->address_descriptor_index); } cam_fill_csio(csio, retries, cbfcnp, /*flags*/ CAM_DIR_IN, tag_action, data_ptr, dxfer_len, sense_len, cdb_len, timeout); } void scsi_sanitize(struct ccb_scsiio *csio, u_int32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), u_int8_t tag_action, u_int8_t byte2, u_int16_t control, u_int8_t *data_ptr, u_int32_t dxfer_len, u_int8_t sense_len, u_int32_t timeout) { struct scsi_sanitize *scsi_cmd; scsi_cmd = (struct scsi_sanitize *)&csio->cdb_io.cdb_bytes; scsi_cmd->opcode = SANITIZE; scsi_cmd->byte2 = byte2; scsi_cmd->control = control; scsi_ulto2b(dxfer_len, scsi_cmd->length); cam_fill_csio(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_OUT : CAM_DIR_NONE, tag_action, data_ptr, dxfer_len, sense_len, sizeof(*scsi_cmd), timeout); } #endif /* _KERNEL */ void scsi_zbc_out(struct ccb_scsiio *csio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint8_t tag_action, uint8_t service_action, uint64_t zone_id, uint8_t zone_flags, uint8_t *data_ptr, uint32_t dxfer_len, uint8_t sense_len, uint32_t timeout) { struct scsi_zbc_out *scsi_cmd; scsi_cmd = (struct scsi_zbc_out *)&csio->cdb_io.cdb_bytes; scsi_cmd->opcode = ZBC_OUT; scsi_cmd->service_action = service_action; scsi_u64to8b(zone_id, scsi_cmd->zone_id); scsi_cmd->zone_flags = zone_flags; cam_fill_csio(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_OUT : CAM_DIR_NONE, tag_action, data_ptr, dxfer_len, sense_len, sizeof(*scsi_cmd), timeout); } void scsi_zbc_in(struct ccb_scsiio *csio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint8_t tag_action, uint8_t service_action, uint64_t zone_start_lba, uint8_t zone_options, uint8_t *data_ptr, uint32_t dxfer_len, uint8_t sense_len, uint32_t timeout) { struct scsi_zbc_in *scsi_cmd; scsi_cmd = (struct scsi_zbc_in *)&csio->cdb_io.cdb_bytes; scsi_cmd->opcode = ZBC_IN; scsi_cmd->service_action = service_action; scsi_ulto4b(dxfer_len, scsi_cmd->length); scsi_u64to8b(zone_start_lba, scsi_cmd->zone_start_lba); scsi_cmd->zone_options = zone_options; cam_fill_csio(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_IN : CAM_DIR_NONE, tag_action, data_ptr, dxfer_len, sense_len, sizeof(*scsi_cmd), timeout); } int scsi_ata_zac_mgmt_out(struct ccb_scsiio *csio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint8_t tag_action, int use_ncq, uint8_t zm_action, uint64_t zone_id, uint8_t zone_flags, uint8_t *data_ptr, uint32_t dxfer_len, uint8_t *cdb_storage, size_t cdb_storage_len, uint8_t sense_len, uint32_t timeout) { uint8_t command_out, protocol, ata_flags; uint16_t features_out; uint32_t sectors_out, auxiliary; int retval; retval = 0; if (use_ncq == 0) { command_out = ATA_ZAC_MANAGEMENT_OUT; features_out = (zm_action & 0xf) | (zone_flags << 8); ata_flags = AP_FLAG_BYT_BLOK_BLOCKS; if (dxfer_len == 0) { protocol = AP_PROTO_NON_DATA; ata_flags |= AP_FLAG_TLEN_NO_DATA; sectors_out = 0; } else { protocol = AP_PROTO_DMA; ata_flags |= AP_FLAG_TLEN_SECT_CNT | AP_FLAG_TDIR_TO_DEV; sectors_out = ((dxfer_len >> 9) & 0xffff); } auxiliary = 0; } else { ata_flags = AP_FLAG_BYT_BLOK_BLOCKS; if (dxfer_len == 0) { command_out = ATA_NCQ_NON_DATA; features_out = ATA_NCQ_ZAC_MGMT_OUT; /* * We're assuming the SCSI to ATA translation layer * will set the NCQ tag number in the tag field. * That isn't clear from the SAT-4 spec (as of rev 05). */ sectors_out = 0; ata_flags |= AP_FLAG_TLEN_NO_DATA; } else { command_out = ATA_SEND_FPDMA_QUEUED; /* * Note that we're defaulting to normal priority, * and assuming that the SCSI to ATA translation * layer will insert the NCQ tag number in the tag * field. That isn't clear in the SAT-4 spec (as * of rev 05). */ sectors_out = ATA_SFPDMA_ZAC_MGMT_OUT << 8; ata_flags |= AP_FLAG_TLEN_FEAT | AP_FLAG_TDIR_TO_DEV; /* * For SEND FPDMA QUEUED, the transfer length is * encoded in the FEATURE register, and 0 means * that 65536 512 byte blocks are to be tranferred. * In practice, it seems unlikely that we'll see * a transfer that large, and it may confuse the * the SAT layer, because generally that means that * 0 bytes should be transferred. */ if (dxfer_len == (65536 * 512)) { features_out = 0; } else if (dxfer_len <= (65535 * 512)) { features_out = ((dxfer_len >> 9) & 0xffff); } else { /* The transfer is too big. */ retval = 1; goto bailout; } } auxiliary = (zm_action & 0xf) | (zone_flags << 8); protocol = AP_PROTO_FPDMA; } protocol |= AP_EXTEND; retval = scsi_ata_pass(csio, retries, cbfcnp, /*flags*/ (dxfer_len > 0) ? CAM_DIR_OUT : CAM_DIR_NONE, tag_action, /*protocol*/ protocol, /*ata_flags*/ ata_flags, /*features*/ features_out, /*sector_count*/ sectors_out, /*lba*/ zone_id, /*command*/ command_out, /*device*/ 0, /*icc*/ 0, /*auxiliary*/ auxiliary, /*control*/ 0, /*data_ptr*/ data_ptr, /*dxfer_len*/ dxfer_len, /*cdb_storage*/ cdb_storage, /*cdb_storage_len*/ cdb_storage_len, /*minimum_cmd_size*/ 0, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ timeout); bailout: return (retval); } int scsi_ata_zac_mgmt_in(struct ccb_scsiio *csio, uint32_t retries, void (*cbfcnp)(struct cam_periph *, union ccb *), uint8_t tag_action, int use_ncq, uint8_t zm_action, uint64_t zone_id, uint8_t zone_flags, uint8_t *data_ptr, uint32_t dxfer_len, uint8_t *cdb_storage, size_t cdb_storage_len, uint8_t sense_len, uint32_t timeout) { uint8_t command_out, protocol; uint16_t features_out, sectors_out; uint32_t auxiliary; int ata_flags; int retval; retval = 0; ata_flags = AP_FLAG_TDIR_FROM_DEV | AP_FLAG_BYT_BLOK_BLOCKS; if (use_ncq == 0) { command_out = ATA_ZAC_MANAGEMENT_IN; /* XXX KDM put a macro here */ features_out = (zm_action & 0xf) | (zone_flags << 8); sectors_out = dxfer_len >> 9; /* XXX KDM macro */ protocol = AP_PROTO_DMA; ata_flags |= AP_FLAG_TLEN_SECT_CNT; auxiliary = 0; } else { ata_flags |= AP_FLAG_TLEN_FEAT; command_out = ATA_RECV_FPDMA_QUEUED; sectors_out = ATA_RFPDMA_ZAC_MGMT_IN << 8; /* * For RECEIVE FPDMA QUEUED, the transfer length is * encoded in the FEATURE register, and 0 means * that 65536 512 byte blocks are to be tranferred. * In practice, it seems unlikely that we'll see * a transfer that large, and it may confuse the * the SAT layer, because generally that means that * 0 bytes should be transferred. */ if (dxfer_len == (65536 * 512)) { features_out = 0; } else if (dxfer_len <= (65535 * 512)) { features_out = ((dxfer_len >> 9) & 0xffff); } else { /* The transfer is too big. */ retval = 1; goto bailout; } auxiliary = (zm_action & 0xf) | (zone_flags << 8), protocol = AP_PROTO_FPDMA; } protocol |= AP_EXTEND; retval = scsi_ata_pass(csio, retries, cbfcnp, /*flags*/ CAM_DIR_IN, tag_action, /*protocol*/ protocol, /*ata_flags*/ ata_flags, /*features*/ features_out, /*sector_count*/ sectors_out, /*lba*/ zone_id, /*command*/ command_out, /*device*/ 0, /*icc*/ 0, /*auxiliary*/ auxiliary, /*control*/ 0, /*data_ptr*/ data_ptr, /*dxfer_len*/ (dxfer_len >> 9) * 512, /* XXX KDM */ /*cdb_storage*/ cdb_storage, /*cdb_storage_len*/ cdb_storage_len, /*minimum_cmd_size*/ 0, /*sense_len*/ SSD_FULL_SIZE, /*timeout*/ timeout); bailout: return (retval); } Index: projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/fs.h =================================================================== --- projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/fs.h (revision 358238) +++ projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/fs.h (revision 358239) @@ -1,326 +1,305 @@ /*- * Copyright (c) 2010 Isilon Systems, Inc. * Copyright (c) 2010 iX Systems, Inc. * Copyright (c) 2010 Panasas, Inc. * Copyright (c) 2013-2018 Mellanox Technologies, Ltd. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _LINUX_FS_H_ #define _LINUX_FS_H_ #include #include #include #include #include #include #include #include #include #include #include #include struct module; struct kiocb; struct iovec; struct dentry; struct page; struct file_lock; struct pipe_inode_info; struct vm_area_struct; struct poll_table_struct; struct files_struct; struct pfs_node; struct linux_cdev; #define inode vnode #define i_cdev v_rdev #define i_private v_data #define S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH) #define S_IWUGO (S_IWUSR | S_IWGRP | S_IWOTH) typedef struct files_struct *fl_owner_t; struct file_operations; struct linux_file_wait_queue { struct wait_queue wq; struct wait_queue_head *wqh; atomic_t state; #define LINUX_FWQ_STATE_INIT 0 #define LINUX_FWQ_STATE_NOT_READY 1 #define LINUX_FWQ_STATE_QUEUED 2 #define LINUX_FWQ_STATE_READY 3 #define LINUX_FWQ_STATE_MAX 4 }; struct linux_file { struct file *_file; const struct file_operations *f_op; void *private_data; int f_flags; int f_mode; /* Just starting mode. */ struct dentry *f_dentry; struct dentry f_dentry_store; struct selinfo f_selinfo; struct sigio *f_sigio; struct vnode *f_vnode; #define f_inode f_vnode volatile u_int f_count; /* anonymous shmem object */ vm_object_t f_shmem; /* kqfilter support */ int f_kqflags; #define LINUX_KQ_FLAG_HAS_READ (1 << 0) #define LINUX_KQ_FLAG_HAS_WRITE (1 << 1) #define LINUX_KQ_FLAG_NEED_READ (1 << 2) #define LINUX_KQ_FLAG_NEED_WRITE (1 << 3) /* protects f_selinfo.si_note */ spinlock_t f_kqlock; struct linux_file_wait_queue f_wait_queue; /* pointer to associated character device, if any */ struct linux_cdev *f_cdev; }; #define file linux_file #define fasync_struct sigio * #define fasync_helper(fd, filp, on, queue) \ ({ \ if ((on)) \ *(queue) = &(filp)->f_sigio; \ else \ *(queue) = NULL; \ 0; \ }) #define kill_fasync(queue, sig, pollstat) \ do { \ if (*(queue) != NULL) \ pgsigio(*(queue), (sig), 0); \ } while (0) typedef int (*filldir_t)(void *, const char *, int, off_t, u64, unsigned); struct file_operations { struct module *owner; ssize_t (*read)(struct linux_file *, char __user *, size_t, off_t *); ssize_t (*write)(struct linux_file *, const char __user *, size_t, off_t *); unsigned int (*poll) (struct linux_file *, struct poll_table_struct *); long (*unlocked_ioctl)(struct linux_file *, unsigned int, unsigned long); long (*compat_ioctl)(struct linux_file *, unsigned int, unsigned long); int (*mmap)(struct linux_file *, struct vm_area_struct *); int (*open)(struct inode *, struct file *); int (*release)(struct inode *, struct linux_file *); int (*fasync)(int, struct linux_file *, int); /* Although not supported in FreeBSD, to align with Linux code * we are adding llseek() only when it is mapped to no_llseek which returns * an illegal seek error */ off_t (*llseek)(struct linux_file *, off_t, int); #if 0 /* We do not support these methods. Don't permit them to compile. */ loff_t (*llseek)(struct file *, loff_t, int); ssize_t (*aio_read)(struct kiocb *, const struct iovec *, unsigned long, loff_t); ssize_t (*aio_write)(struct kiocb *, const struct iovec *, unsigned long, loff_t); int (*readdir)(struct file *, void *, filldir_t); int (*ioctl)(struct inode *, struct file *, unsigned int, unsigned long); int (*flush)(struct file *, fl_owner_t id); int (*fsync)(struct file *, struct dentry *, int datasync); int (*aio_fsync)(struct kiocb *, int datasync); int (*lock)(struct file *, int, struct file_lock *); ssize_t (*sendpage)(struct file *, struct page *, int, size_t, loff_t *, int); unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long); int (*check_flags)(int); int (*flock)(struct file *, int, struct file_lock *); ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int); ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int); int (*setlease)(struct file *, long, struct file_lock **); #endif }; #define fops_get(fops) (fops) #define replace_fops(f, fops) ((f)->f_op = (fops)) #define FMODE_READ FREAD #define FMODE_WRITE FWRITE #define FMODE_EXEC FEXEC int __register_chrdev(unsigned int major, unsigned int baseminor, unsigned int count, const char *name, const struct file_operations *fops); int __register_chrdev_p(unsigned int major, unsigned int baseminor, unsigned int count, const char *name, const struct file_operations *fops, uid_t uid, gid_t gid, int mode); void __unregister_chrdev(unsigned int major, unsigned int baseminor, unsigned int count, const char *name); static inline void unregister_chrdev(unsigned int major, const char *name) { __unregister_chrdev(major, 0, 256, name); } static inline int register_chrdev(unsigned int major, const char *name, const struct file_operations *fops) { return (__register_chrdev(major, 0, 256, name, fops)); } static inline int register_chrdev_p(unsigned int major, const char *name, const struct file_operations *fops, uid_t uid, gid_t gid, int mode) { return (__register_chrdev_p(major, 0, 256, name, fops, uid, gid, mode)); } static inline int register_chrdev_region(dev_t dev, unsigned range, const char *name) { return 0; } static inline void unregister_chrdev_region(dev_t dev, unsigned range) { return; } static inline int alloc_chrdev_region(dev_t *dev, unsigned baseminor, unsigned count, const char *name) { return 0; } /* No current support for seek op in FreeBSD */ static inline int nonseekable_open(struct inode *inode, struct file *filp) { return 0; } extern unsigned int linux_iminor(struct inode *); #define iminor(...) linux_iminor(__VA_ARGS__) static inline struct linux_file * get_file(struct linux_file *f) { refcount_acquire(f->_file == NULL ? &f->f_count : &f->_file->f_count); return (f); } static inline struct inode * igrab(struct inode *inode) { int error; error = vget(inode, 0, curthread); if (error) return (NULL); return (inode); } static inline void iput(struct inode *inode) { vrele(inode); } static inline loff_t no_llseek(struct file *file, loff_t offset, int whence) { return (-ESPIPE); } static inline loff_t noop_llseek(struct linux_file *file, loff_t offset, int whence) { return (file->_file->f_offset); } static inline struct vnode * file_inode(const struct linux_file *file) { return (file->f_vnode); } static inline int call_mmap(struct linux_file *file, struct vm_area_struct *vma) { return (file->f_op->mmap(file, vma)); } -/* Shared memory support */ -unsigned long linux_invalidate_mapping_pages(vm_object_t, pgoff_t, pgoff_t); -struct page *linux_shmem_read_mapping_page_gfp(vm_object_t, int, gfp_t); -struct linux_file *linux_shmem_file_setup(const char *, loff_t, unsigned long); -void linux_shmem_truncate_range(vm_object_t, loff_t, loff_t); - -#define invalidate_mapping_pages(...) \ - linux_invalidate_mapping_pages(__VA_ARGS__) - -#define shmem_read_mapping_page(...) \ - linux_shmem_read_mapping_page_gfp(__VA_ARGS__, 0) - -#define shmem_read_mapping_page_gfp(...) \ - linux_shmem_read_mapping_page_gfp(__VA_ARGS__) - -#define shmem_file_setup(...) \ - linux_shmem_file_setup(__VA_ARGS__) - -#define shmem_truncate_range(...) \ - linux_shmem_truncate_range(__VA_ARGS__) - #endif /* _LINUX_FS_H_ */ Index: projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/shmem_fs.h =================================================================== --- projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/shmem_fs.h (nonexistent) +++ projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/shmem_fs.h (revision 358239) @@ -0,0 +1,55 @@ +/*- + * Copyright (c) 2010 Isilon Systems, Inc. + * Copyright (c) 2010 iX Systems, Inc. + * Copyright (c) 2010 Panasas, Inc. + * Copyright (c) 2013-2018 Mellanox Technologies, Ltd. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice unmodified, this list of conditions, and the following + * disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD$ + */ +#ifndef _LINUX_SHMEM_FS_H_ +#define _LINUX_SHMEM_FS_H_ + +/* Shared memory support */ +unsigned long linux_invalidate_mapping_pages(vm_object_t, pgoff_t, pgoff_t); +struct page *linux_shmem_read_mapping_page_gfp(vm_object_t, int, gfp_t); +struct linux_file *linux_shmem_file_setup(const char *, loff_t, unsigned long); +void linux_shmem_truncate_range(vm_object_t, loff_t, loff_t); + +#define invalidate_mapping_pages(...) \ + linux_invalidate_mapping_pages(__VA_ARGS__) + +#define shmem_read_mapping_page(...) \ + linux_shmem_read_mapping_page_gfp(__VA_ARGS__, 0) + +#define shmem_read_mapping_page_gfp(...) \ + linux_shmem_read_mapping_page_gfp(__VA_ARGS__) + +#define shmem_file_setup(...) \ + linux_shmem_file_setup(__VA_ARGS__) + +#define shmem_truncate_range(...) \ + linux_shmem_truncate_range(__VA_ARGS__) + +#endif /* _LINUX_SHMEM_FS_H_ */ Property changes on: projects/clang1000-import/sys/compat/linuxkpi/common/include/linux/shmem_fs.h ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_page.c =================================================================== --- projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_page.c (revision 358238) +++ projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_page.c (revision 358239) @@ -1,360 +1,278 @@ /*- * Copyright (c) 2010 Isilon Systems, Inc. * Copyright (c) 2016 Matthew Macy (mmacy@mattmacy.io) * Copyright (c) 2017 Mellanox Technologies, Ltd. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include void si_meminfo(struct sysinfo *si) { si->totalram = physmem; si->totalhigh = 0; si->mem_unit = PAGE_SIZE; } void * linux_page_address(struct page *page) { if (page->object != kmem_object && page->object != kernel_object) { return (PMAP_HAS_DMAP ? ((void *)(uintptr_t)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(page))) : NULL); } return ((void *)(uintptr_t)(VM_MIN_KERNEL_ADDRESS + IDX_TO_OFF(page->pindex))); } vm_page_t linux_alloc_pages(gfp_t flags, unsigned int order) { vm_page_t page; if (PMAP_HAS_DMAP) { unsigned long npages = 1UL << order; int req = VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_NORMAL; if ((flags & M_ZERO) != 0) req |= VM_ALLOC_ZERO; if (order == 0 && (flags & GFP_DMA32) == 0) { page = vm_page_alloc(NULL, 0, req); if (page == NULL) return (NULL); } else { vm_paddr_t pmax = (flags & GFP_DMA32) ? BUS_SPACE_MAXADDR_32BIT : BUS_SPACE_MAXADDR; retry: page = vm_page_alloc_contig(NULL, 0, req, npages, 0, pmax, PAGE_SIZE, 0, VM_MEMATTR_DEFAULT); if (page == NULL) { if (flags & M_WAITOK) { if (!vm_page_reclaim_contig(req, npages, 0, pmax, PAGE_SIZE, 0)) { vm_wait(NULL); } flags &= ~M_WAITOK; goto retry; } return (NULL); } } if (flags & M_ZERO) { unsigned long x; for (x = 0; x != npages; x++) { vm_page_t pgo = page + x; if ((pgo->flags & PG_ZERO) == 0) pmap_zero_page(pgo); } } } else { vm_offset_t vaddr; vaddr = linux_alloc_kmem(flags, order); if (vaddr == 0) return (NULL); page = PHYS_TO_VM_PAGE(vtophys((void *)vaddr)); KASSERT(vaddr == (vm_offset_t)page_address(page), ("Page address mismatch")); } return (page); } void linux_free_pages(vm_page_t page, unsigned int order) { if (PMAP_HAS_DMAP) { unsigned long npages = 1UL << order; unsigned long x; for (x = 0; x != npages; x++) { vm_page_t pgo = page + x; if (vm_page_unwire_noq(pgo)) vm_page_free(pgo); } } else { vm_offset_t vaddr; vaddr = (vm_offset_t)page_address(page); linux_free_kmem(vaddr, order); } } vm_offset_t linux_alloc_kmem(gfp_t flags, unsigned int order) { size_t size = ((size_t)PAGE_SIZE) << order; vm_offset_t addr; if ((flags & GFP_DMA32) == 0) { addr = kmem_malloc(size, flags & GFP_NATIVE_MASK); } else { addr = kmem_alloc_contig(size, flags & GFP_NATIVE_MASK, 0, BUS_SPACE_MAXADDR_32BIT, PAGE_SIZE, 0, VM_MEMATTR_DEFAULT); } return (addr); } void linux_free_kmem(vm_offset_t addr, unsigned int order) { size_t size = ((size_t)PAGE_SIZE) << order; kmem_free(addr, size); } static int linux_get_user_pages_internal(vm_map_t map, unsigned long start, int nr_pages, int write, struct page **pages) { vm_prot_t prot; size_t len; int count; prot = write ? (VM_PROT_READ | VM_PROT_WRITE) : VM_PROT_READ; len = ((size_t)nr_pages) << PAGE_SHIFT; count = vm_fault_quick_hold_pages(map, start, len, prot, pages, nr_pages); return (count == -1 ? -EFAULT : nr_pages); } int __get_user_pages_fast(unsigned long start, int nr_pages, int write, struct page **pages) { vm_map_t map; vm_page_t *mp; vm_offset_t va; vm_offset_t end; vm_prot_t prot; int count; if (nr_pages == 0 || in_interrupt()) return (0); MPASS(pages != NULL); va = start; map = &curthread->td_proc->p_vmspace->vm_map; end = start + (((size_t)nr_pages) << PAGE_SHIFT); if (start < vm_map_min(map) || end > vm_map_max(map)) return (-EINVAL); prot = write ? (VM_PROT_READ | VM_PROT_WRITE) : VM_PROT_READ; for (count = 0, mp = pages, va = start; va < end; mp++, va += PAGE_SIZE, count++) { *mp = pmap_extract_and_hold(map->pmap, va, prot); if (*mp == NULL) break; if ((prot & VM_PROT_WRITE) != 0 && (*mp)->dirty != VM_PAGE_BITS_ALL) { /* * Explicitly dirty the physical page. Otherwise, the * caller's changes may go unnoticed because they are * performed through an unmanaged mapping or by a DMA * operation. * * The object lock is not held here. * See vm_page_clear_dirty_mask(). */ vm_page_dirty(*mp); } } return (count); } long get_user_pages_remote(struct task_struct *task, struct mm_struct *mm, unsigned long start, unsigned long nr_pages, int gup_flags, struct page **pages, struct vm_area_struct **vmas) { vm_map_t map; map = &task->task_thread->td_proc->p_vmspace->vm_map; return (linux_get_user_pages_internal(map, start, nr_pages, !!(gup_flags & FOLL_WRITE), pages)); } long get_user_pages(unsigned long start, unsigned long nr_pages, int gup_flags, struct page **pages, struct vm_area_struct **vmas) { vm_map_t map; map = &curthread->td_proc->p_vmspace->vm_map; return (linux_get_user_pages_internal(map, start, nr_pages, !!(gup_flags & FOLL_WRITE), pages)); } int is_vmalloc_addr(const void *addr) { return (vtoslab((vm_offset_t)addr & ~UMA_SLAB_MASK) != NULL); -} - -struct page * -linux_shmem_read_mapping_page_gfp(vm_object_t obj, int pindex, gfp_t gfp) -{ - vm_page_t page; - int rv; - - if ((gfp & GFP_NOWAIT) != 0) - panic("GFP_NOWAIT is unimplemented"); - - VM_OBJECT_WLOCK(obj); - rv = vm_page_grab_valid(&page, obj, pindex, VM_ALLOC_NORMAL | - VM_ALLOC_NOBUSY | VM_ALLOC_WIRED); - VM_OBJECT_WUNLOCK(obj); - if (rv != VM_PAGER_OK) - return (ERR_PTR(-EINVAL)); - return (page); -} - -struct linux_file * -linux_shmem_file_setup(const char *name, loff_t size, unsigned long flags) -{ - struct fileobj { - struct linux_file file __aligned(sizeof(void *)); - struct vnode vnode __aligned(sizeof(void *)); - }; - struct fileobj *fileobj; - struct linux_file *filp; - struct vnode *vp; - int error; - - fileobj = kzalloc(sizeof(*fileobj), GFP_KERNEL); - if (fileobj == NULL) { - error = -ENOMEM; - goto err_0; - } - filp = &fileobj->file; - vp = &fileobj->vnode; - - filp->f_count = 1; - filp->f_vnode = vp; - filp->f_shmem = vm_pager_allocate(OBJT_DEFAULT, NULL, size, - VM_PROT_READ | VM_PROT_WRITE, 0, curthread->td_ucred); - if (filp->f_shmem == NULL) { - error = -ENOMEM; - goto err_1; - } - return (filp); -err_1: - kfree(filp); -err_0: - return (ERR_PTR(error)); -} - -static vm_ooffset_t -linux_invalidate_mapping_pages_sub(vm_object_t obj, vm_pindex_t start, - vm_pindex_t end, int flags) -{ - int start_count, end_count; - - VM_OBJECT_WLOCK(obj); - start_count = obj->resident_page_count; - vm_object_page_remove(obj, start, end, flags); - end_count = obj->resident_page_count; - VM_OBJECT_WUNLOCK(obj); - return (start_count - end_count); -} - -unsigned long -linux_invalidate_mapping_pages(vm_object_t obj, pgoff_t start, pgoff_t end) -{ - - return (linux_invalidate_mapping_pages_sub(obj, start, end, OBJPR_CLEANONLY)); -} - -void -linux_shmem_truncate_range(vm_object_t obj, loff_t lstart, loff_t lend) -{ - vm_pindex_t start = OFF_TO_IDX(lstart + PAGE_SIZE - 1); - vm_pindex_t end = OFF_TO_IDX(lend + 1); - - (void) linux_invalidate_mapping_pages_sub(obj, start, end, 0); } Index: projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_shmemfs.c =================================================================== --- projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_shmemfs.c (nonexistent) +++ projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_shmemfs.c (revision 358239) @@ -0,0 +1,128 @@ +/*- + * Copyright (c) 2010 Isilon Systems, Inc. + * Copyright (c) 2016 Matthew Macy (mmacy@mattmacy.io) + * Copyright (c) 2017 Mellanox Technologies, Ltd. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice unmodified, this list of conditions, and the following + * disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +struct page * +linux_shmem_read_mapping_page_gfp(vm_object_t obj, int pindex, gfp_t gfp) +{ + vm_page_t page; + int rv; + + if ((gfp & GFP_NOWAIT) != 0) + panic("GFP_NOWAIT is unimplemented"); + + VM_OBJECT_WLOCK(obj); + rv = vm_page_grab_valid(&page, obj, pindex, VM_ALLOC_NORMAL | + VM_ALLOC_NOBUSY | VM_ALLOC_WIRED); + VM_OBJECT_WUNLOCK(obj); + if (rv != VM_PAGER_OK) + return (ERR_PTR(-EINVAL)); + return (page); +} + +struct linux_file * +linux_shmem_file_setup(const char *name, loff_t size, unsigned long flags) +{ + struct fileobj { + struct linux_file file __aligned(sizeof(void *)); + struct vnode vnode __aligned(sizeof(void *)); + }; + struct fileobj *fileobj; + struct linux_file *filp; + struct vnode *vp; + int error; + + fileobj = kzalloc(sizeof(*fileobj), GFP_KERNEL); + if (fileobj == NULL) { + error = -ENOMEM; + goto err_0; + } + filp = &fileobj->file; + vp = &fileobj->vnode; + + filp->f_count = 1; + filp->f_vnode = vp; + filp->f_shmem = vm_pager_allocate(OBJT_DEFAULT, NULL, size, + VM_PROT_READ | VM_PROT_WRITE, 0, curthread->td_ucred); + if (filp->f_shmem == NULL) { + error = -ENOMEM; + goto err_1; + } + return (filp); +err_1: + kfree(filp); +err_0: + return (ERR_PTR(error)); +} + +static vm_ooffset_t +linux_invalidate_mapping_pages_sub(vm_object_t obj, vm_pindex_t start, + vm_pindex_t end, int flags) +{ + int start_count, end_count; + + VM_OBJECT_WLOCK(obj); + start_count = obj->resident_page_count; + vm_object_page_remove(obj, start, end, flags); + end_count = obj->resident_page_count; + VM_OBJECT_WUNLOCK(obj); + return (start_count - end_count); +} + +unsigned long +linux_invalidate_mapping_pages(vm_object_t obj, pgoff_t start, pgoff_t end) +{ + + return (linux_invalidate_mapping_pages_sub(obj, start, end, OBJPR_CLEANONLY)); +} + +void +linux_shmem_truncate_range(vm_object_t obj, loff_t lstart, loff_t lend) +{ + vm_pindex_t start = OFF_TO_IDX(lstart + PAGE_SIZE - 1); + vm_pindex_t end = OFF_TO_IDX(lend + 1); + + (void) linux_invalidate_mapping_pages_sub(obj, start, end, 0); +} Property changes on: projects/clang1000-import/sys/compat/linuxkpi/common/src/linux_shmemfs.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang1000-import/sys/conf/files =================================================================== --- projects/clang1000-import/sys/conf/files (revision 358238) +++ projects/clang1000-import/sys/conf/files (revision 358239) @@ -1,4991 +1,4993 @@ # $FreeBSD$ # # The long compile-with and dependency lines are required because of # limitations in config: backslash-newline doesn't work in strings, and # dependency lines other than the first are silently ignored. # acpi_quirks.h optional acpi \ dependency "$S/tools/acpi_quirks2h.awk $S/dev/acpica/acpi_quirks" \ compile-with "${AWK} -f $S/tools/acpi_quirks2h.awk $S/dev/acpica/acpi_quirks" \ no-obj no-implicit-rule before-depend \ clean "acpi_quirks.h" bhnd_nvram_map.h optional bhnd \ dependency "$S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/tools/nvram_map_gen.awk $S/dev/bhnd/nvram/nvram_map" \ compile-with "sh $S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/nvram/nvram_map -h" \ no-obj no-implicit-rule before-depend \ clean "bhnd_nvram_map.h" bhnd_nvram_map_data.h optional bhnd \ dependency "$S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/tools/nvram_map_gen.awk $S/dev/bhnd/nvram/nvram_map" \ compile-with "sh $S/dev/bhnd/tools/nvram_map_gen.sh $S/dev/bhnd/nvram/nvram_map -d" \ no-obj no-implicit-rule before-depend \ clean "bhnd_nvram_map_data.h" fdt_static_dtb.h optional fdt fdt_dtb_static \ compile-with "sh -c 'MACHINE=${MACHINE} $S/tools/fdt/make_dtbh.sh ${FDT_DTS_FILE} ${.CURDIR}'" \ dependency "${FDT_DTS_FILE:T:R}.dtb" \ no-obj no-implicit-rule before-depend \ clean "fdt_static_dtb.h" feeder_eq_gen.h optional sound \ dependency "$S/tools/sound/feeder_eq_mkfilter.awk" \ compile-with "${AWK} -f $S/tools/sound/feeder_eq_mkfilter.awk -- ${FEEDER_EQ_PRESETS} > feeder_eq_gen.h" \ no-obj no-implicit-rule before-depend \ clean "feeder_eq_gen.h" feeder_rate_gen.h optional sound \ dependency "$S/tools/sound/feeder_rate_mkfilter.awk" \ compile-with "${AWK} -f $S/tools/sound/feeder_rate_mkfilter.awk -- ${FEEDER_RATE_PRESETS} > feeder_rate_gen.h" \ no-obj no-implicit-rule before-depend \ clean "feeder_rate_gen.h" font.h optional sc_dflt_font \ compile-with "uudecode < ${SRCTOP}/share/syscons/fonts/${SC_DFLT_FONT}-8x16.fnt && file2c 'u_char dflt_font_16[16*256] = {' '};' < ${SC_DFLT_FONT}-8x16 > font.h && uudecode < ${SRCTOP}/share/syscons/fonts/${SC_DFLT_FONT}-8x14.fnt && file2c 'u_char dflt_font_14[14*256] = {' '};' < ${SC_DFLT_FONT}-8x14 >> font.h && uudecode < ${SRCTOP}/share/syscons/fonts/${SC_DFLT_FONT}-8x8.fnt && file2c 'u_char dflt_font_8[8*256] = {' '};' < ${SC_DFLT_FONT}-8x8 >> font.h" \ no-obj no-implicit-rule before-depend \ clean "font.h ${SC_DFLT_FONT}-8x14 ${SC_DFLT_FONT}-8x16 ${SC_DFLT_FONT}-8x8" snd_fxdiv_gen.h optional sound \ dependency "$S/tools/sound/snd_fxdiv_gen.awk" \ compile-with "${AWK} -f $S/tools/sound/snd_fxdiv_gen.awk -- > snd_fxdiv_gen.h" \ no-obj no-implicit-rule before-depend \ clean "snd_fxdiv_gen.h" miidevs.h optional miibus | mii \ dependency "$S/tools/miidevs2h.awk $S/dev/mii/miidevs" \ compile-with "${AWK} -f $S/tools/miidevs2h.awk $S/dev/mii/miidevs" \ no-obj no-implicit-rule before-depend \ clean "miidevs.h" pccarddevs.h standard \ dependency "$S/tools/pccarddevs2h.awk $S/dev/pccard/pccarddevs" \ compile-with "${AWK} -f $S/tools/pccarddevs2h.awk $S/dev/pccard/pccarddevs" \ no-obj no-implicit-rule before-depend \ clean "pccarddevs.h" kbdmuxmap.h optional kbdmux_dflt_keymap \ compile-with "${KEYMAP} -L ${KBDMUX_DFLT_KEYMAP} | ${KEYMAP_FIX} > ${.TARGET}" \ no-obj no-implicit-rule before-depend \ clean "kbdmuxmap.h" teken_state.h optional sc | vt \ dependency "$S/teken/gensequences $S/teken/sequences" \ compile-with "${AWK} -f $S/teken/gensequences $S/teken/sequences > teken_state.h" \ no-obj no-implicit-rule before-depend \ clean "teken_state.h" ukbdmap.h optional ukbd_dflt_keymap \ compile-with "${KEYMAP} -L ${UKBD_DFLT_KEYMAP} | ${KEYMAP_FIX} > ${.TARGET}" \ no-obj no-implicit-rule before-depend \ clean "ukbdmap.h" usbdevs.h optional usb \ dependency "$S/tools/usbdevs2h.awk $S/dev/usb/usbdevs" \ compile-with "${AWK} -f $S/tools/usbdevs2h.awk $S/dev/usb/usbdevs -h" \ no-obj no-implicit-rule before-depend \ clean "usbdevs.h" usbdevs_data.h optional usb \ dependency "$S/tools/usbdevs2h.awk $S/dev/usb/usbdevs" \ compile-with "${AWK} -f $S/tools/usbdevs2h.awk $S/dev/usb/usbdevs -d" \ no-obj no-implicit-rule before-depend \ clean "usbdevs_data.h" sdiodevs.h optional mmccam \ dependency "$S/tools/sdiodevs2h.awk $S/dev/sdio/sdiodevs" \ compile-with "${AWK} -f $S/tools/sdiodevs2h.awk $S/dev/sdio/sdiodevs -h" \ no-obj no-implicit-rule before-depend \ clean "sdiodevs.h" sdiodevs_data.h optional mmccam \ dependency "$S/tools/sdiodevs2h.awk $S/dev/sdio/sdiodevs" \ compile-with "${AWK} -f $S/tools/sdiodevs2h.awk $S/dev/sdio/sdiodevs -d" \ no-obj no-implicit-rule before-depend \ clean "sdiodevs_data.h" cam/cam.c optional scbus cam/cam_compat.c optional scbus cam/cam_iosched.c optional scbus cam/cam_periph.c optional scbus cam/cam_queue.c optional scbus cam/cam_sim.c optional scbus cam/cam_xpt.c optional scbus cam/ata/ata_all.c optional scbus cam/ata/ata_xpt.c optional scbus cam/ata/ata_pmp.c optional scbus cam/nvme/nvme_all.c optional scbus cam/nvme/nvme_da.c optional nda | da cam/nvme/nvme_xpt.c optional scbus cam/scsi/scsi_xpt.c optional scbus cam/scsi/scsi_all.c optional scbus cam/scsi/scsi_cd.c optional cd cam/scsi/scsi_ch.c optional ch cam/ata/ata_da.c optional ada | da cam/ctl/ctl.c optional ctl cam/ctl/ctl_backend.c optional ctl cam/ctl/ctl_backend_block.c optional ctl cam/ctl/ctl_backend_ramdisk.c optional ctl cam/ctl/ctl_cmd_table.c optional ctl cam/ctl/ctl_frontend.c optional ctl cam/ctl/ctl_frontend_cam_sim.c optional ctl cam/ctl/ctl_frontend_ioctl.c optional ctl cam/ctl/ctl_frontend_iscsi.c optional ctl cfiscsi cam/ctl/ctl_ha.c optional ctl cam/ctl/ctl_scsi_all.c optional ctl cam/ctl/ctl_tpc.c optional ctl cam/ctl/ctl_tpc_local.c optional ctl cam/ctl/ctl_error.c optional ctl cam/ctl/ctl_util.c optional ctl cam/ctl/scsi_ctl.c optional ctl cam/mmc/mmc_xpt.c optional scbus mmccam cam/mmc/mmc_da.c optional scbus mmccam da cam/scsi/scsi_da.c optional da cam/scsi/scsi_pass.c optional pass cam/scsi/scsi_pt.c optional pt cam/scsi/scsi_sa.c optional sa cam/scsi/scsi_enc.c optional ses cam/scsi/scsi_enc_ses.c optional ses cam/scsi/scsi_enc_safte.c optional ses cam/scsi/scsi_sg.c optional sg cam/scsi/scsi_targ_bh.c optional targbh cam/scsi/scsi_target.c optional targ cam/scsi/smp_all.c optional scbus # shared between zfs and dtrace cddl/compat/opensolaris/kern/opensolaris.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_cmn_err.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_kmem.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_misc.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_proc.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_sunddi.c optional zfs | dtrace compile-with "${CDDL_C}" cddl/compat/opensolaris/kern/opensolaris_taskq.c optional zfs | dtrace compile-with "${CDDL_C}" # zfs specific cddl/compat/opensolaris/kern/opensolaris_acl.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_dtrace.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_kobj.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_kstat.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_lookup.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_policy.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_string.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_sysevent.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_uio.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_vfs.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_vm.c optional zfs compile-with "${ZFS_C}" cddl/compat/opensolaris/kern/opensolaris_zone.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/acl/acl_common.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/avl/avl.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/lz4/lz4.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/nvpair/opensolaris_fnvpair.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/nvpair/opensolaris_nvpair_alloc_fixed.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/unicode/u8_textprep.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfeature_common.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_comutil.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_deleg.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_fletcher.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_ioctl_compat.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_namecheck.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zfs_prop.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zpool_prop.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/common/zfs/zprop_common.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/vnode.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/aggsum.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/blkptr.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bplist.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bpobj.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bptree.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/bqueue.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/cityhash.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf_stats.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/ddt.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/ddt_zap.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_diff.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_traverse.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_zfetch.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c optional zfs compile-with "${ZFS_C}" \ warning "kernel contains CDDL licensed ZFS filesystem" cddl/contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_bookmark.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deadlist.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_deleg.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_userhold.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/gzip.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lzjb.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/mmp.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/multilist.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/range_tree.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/refcount.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/rrwlock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/sa.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/sha256.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/skein_zfs.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_checkpoint.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_config.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_errlog.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_history.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/space_reftree.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/uberblock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/unique.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_births.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_indirect_mapping.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_initialize.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_removal.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zap.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zap_leaf.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zcp.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zcp_get.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zcp_global.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zcp_iter.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zcp_synctask.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfeature.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_acl.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_byteswap.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_debug.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fm.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_fuid.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_log.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_onexit.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_sa.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio_checksum.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio_compress.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zio_inject.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zle.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zrlock.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zthr.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/callb.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/fm.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/list.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/os/nvpair_alloc_system.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/zmod/zmod.c optional zfs compile-with "${ZFS_C}" # zfs lua support cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lapi.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lauxlib.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lbaselib.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lbitlib.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lcode.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lcompat.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lcorolib.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lctype.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/ldebug.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/ldo.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/ldump.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lfunc.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lgc.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/llex.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lmem.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lobject.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lopcodes.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lparser.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lstate.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lstring.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lstrlib.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/ltable.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/ltablib.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/ltm.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lundump.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lvm.c optional zfs compile-with "${ZFS_C}" cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lzio.c optional zfs compile-with "${ZFS_C}" # dtrace specific cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c optional dtrace compile-with "${DTRACE_C}" \ warning "kernel contains CDDL licensed DTRACE" cddl/contrib/opensolaris/uts/common/dtrace/dtrace_xoroshiro128_plus.c optional dtrace compile-with "${DTRACE_C}" cddl/dev/dtmalloc/dtmalloc.c optional dtmalloc | dtraceall compile-with "${CDDL_C}" cddl/dev/profile/profile.c optional dtrace_profile | dtraceall compile-with "${CDDL_C}" cddl/dev/sdt/sdt.c optional dtrace_sdt | dtraceall compile-with "${CDDL_C}" cddl/dev/fbt/fbt.c optional dtrace_fbt | dtraceall compile-with "${FBT_C}" cddl/dev/systrace/systrace.c optional dtrace_systrace | dtraceall compile-with "${CDDL_C}" cddl/dev/prototype.c optional dtrace_prototype | dtraceall compile-with "${CDDL_C}" fs/nfsclient/nfs_clkdtrace.c optional dtnfscl nfscl | dtraceall nfscl compile-with "${CDDL_C}" compat/cloudabi/cloudabi_clock.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_errno.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_fd.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_file.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_futex.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_mem.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_proc.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_random.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_sock.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_thread.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi/cloudabi_vdso.c optional compat_cloudabi32 | compat_cloudabi64 compat/cloudabi32/cloudabi32_fd.c optional compat_cloudabi32 compat/cloudabi32/cloudabi32_module.c optional compat_cloudabi32 compat/cloudabi32/cloudabi32_poll.c optional compat_cloudabi32 compat/cloudabi32/cloudabi32_sock.c optional compat_cloudabi32 compat/cloudabi32/cloudabi32_syscalls.c optional compat_cloudabi32 compat/cloudabi32/cloudabi32_sysent.c optional compat_cloudabi32 compat/cloudabi32/cloudabi32_thread.c optional compat_cloudabi32 compat/cloudabi64/cloudabi64_fd.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_module.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_poll.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_sock.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_syscalls.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_sysent.c optional compat_cloudabi64 compat/cloudabi64/cloudabi64_thread.c optional compat_cloudabi64 compat/freebsd32/freebsd32_capability.c optional compat_freebsd32 compat/freebsd32/freebsd32_ioctl.c optional compat_freebsd32 compat/freebsd32/freebsd32_misc.c optional compat_freebsd32 compat/freebsd32/freebsd32_syscalls.c optional compat_freebsd32 compat/freebsd32/freebsd32_sysent.c optional compat_freebsd32 contrib/ck/src/ck_array.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_barrier_centralized.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_barrier_combining.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_barrier_dissemination.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_barrier_mcs.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_barrier_tournament.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_epoch.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_hp.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_hs.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_ht.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/ck/src/ck_rhs.c standard compile-with "${NORMAL_C} -I$S/contrib/ck/include" contrib/dev/acpica/common/ahids.c optional acpi acpi_debug contrib/dev/acpica/common/ahuuids.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbcmds.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbconvert.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbdisply.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbexec.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbhistry.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbinput.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbmethod.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbnames.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbobject.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbstats.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbtest.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbutils.c optional acpi acpi_debug contrib/dev/acpica/components/debugger/dbxface.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmbuffer.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmcstyle.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmdeferred.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmnames.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmopcode.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrc.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrcl.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrcl2.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmresrcs.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmutils.c optional acpi acpi_debug contrib/dev/acpica/components/disassembler/dmwalk.c optional acpi acpi_debug contrib/dev/acpica/components/dispatcher/dsargs.c optional acpi contrib/dev/acpica/components/dispatcher/dscontrol.c optional acpi contrib/dev/acpica/components/dispatcher/dsdebug.c optional acpi contrib/dev/acpica/components/dispatcher/dsfield.c optional acpi contrib/dev/acpica/components/dispatcher/dsinit.c optional acpi contrib/dev/acpica/components/dispatcher/dsmethod.c optional acpi contrib/dev/acpica/components/dispatcher/dsmthdat.c optional acpi contrib/dev/acpica/components/dispatcher/dsobject.c optional acpi contrib/dev/acpica/components/dispatcher/dsopcode.c optional acpi contrib/dev/acpica/components/dispatcher/dspkginit.c optional acpi contrib/dev/acpica/components/dispatcher/dsutils.c optional acpi contrib/dev/acpica/components/dispatcher/dswexec.c optional acpi contrib/dev/acpica/components/dispatcher/dswload.c optional acpi contrib/dev/acpica/components/dispatcher/dswload2.c optional acpi contrib/dev/acpica/components/dispatcher/dswscope.c optional acpi contrib/dev/acpica/components/dispatcher/dswstate.c optional acpi contrib/dev/acpica/components/events/evevent.c optional acpi contrib/dev/acpica/components/events/evglock.c optional acpi contrib/dev/acpica/components/events/evgpe.c optional acpi contrib/dev/acpica/components/events/evgpeblk.c optional acpi contrib/dev/acpica/components/events/evgpeinit.c optional acpi contrib/dev/acpica/components/events/evgpeutil.c optional acpi contrib/dev/acpica/components/events/evhandler.c optional acpi contrib/dev/acpica/components/events/evmisc.c optional acpi contrib/dev/acpica/components/events/evregion.c optional acpi contrib/dev/acpica/components/events/evrgnini.c optional acpi contrib/dev/acpica/components/events/evsci.c optional acpi contrib/dev/acpica/components/events/evxface.c optional acpi contrib/dev/acpica/components/events/evxfevnt.c optional acpi contrib/dev/acpica/components/events/evxfgpe.c optional acpi contrib/dev/acpica/components/events/evxfregn.c optional acpi contrib/dev/acpica/components/executer/exconcat.c optional acpi contrib/dev/acpica/components/executer/exconfig.c optional acpi contrib/dev/acpica/components/executer/exconvrt.c optional acpi contrib/dev/acpica/components/executer/excreate.c optional acpi contrib/dev/acpica/components/executer/exdebug.c optional acpi contrib/dev/acpica/components/executer/exdump.c optional acpi contrib/dev/acpica/components/executer/exfield.c optional acpi contrib/dev/acpica/components/executer/exfldio.c optional acpi contrib/dev/acpica/components/executer/exmisc.c optional acpi contrib/dev/acpica/components/executer/exmutex.c optional acpi contrib/dev/acpica/components/executer/exnames.c optional acpi contrib/dev/acpica/components/executer/exoparg1.c optional acpi contrib/dev/acpica/components/executer/exoparg2.c optional acpi contrib/dev/acpica/components/executer/exoparg3.c optional acpi contrib/dev/acpica/components/executer/exoparg6.c optional acpi contrib/dev/acpica/components/executer/exprep.c optional acpi contrib/dev/acpica/components/executer/exregion.c optional acpi contrib/dev/acpica/components/executer/exresnte.c optional acpi contrib/dev/acpica/components/executer/exresolv.c optional acpi contrib/dev/acpica/components/executer/exresop.c optional acpi contrib/dev/acpica/components/executer/exserial.c optional acpi contrib/dev/acpica/components/executer/exstore.c optional acpi contrib/dev/acpica/components/executer/exstoren.c optional acpi contrib/dev/acpica/components/executer/exstorob.c optional acpi contrib/dev/acpica/components/executer/exsystem.c optional acpi contrib/dev/acpica/components/executer/extrace.c optional acpi contrib/dev/acpica/components/executer/exutils.c optional acpi contrib/dev/acpica/components/hardware/hwacpi.c optional acpi contrib/dev/acpica/components/hardware/hwesleep.c optional acpi contrib/dev/acpica/components/hardware/hwgpe.c optional acpi contrib/dev/acpica/components/hardware/hwpci.c optional acpi contrib/dev/acpica/components/hardware/hwregs.c optional acpi contrib/dev/acpica/components/hardware/hwsleep.c optional acpi contrib/dev/acpica/components/hardware/hwtimer.c optional acpi contrib/dev/acpica/components/hardware/hwvalid.c optional acpi contrib/dev/acpica/components/hardware/hwxface.c optional acpi contrib/dev/acpica/components/hardware/hwxfsleep.c optional acpi contrib/dev/acpica/components/namespace/nsaccess.c optional acpi contrib/dev/acpica/components/namespace/nsalloc.c optional acpi contrib/dev/acpica/components/namespace/nsarguments.c optional acpi contrib/dev/acpica/components/namespace/nsconvert.c optional acpi contrib/dev/acpica/components/namespace/nsdump.c optional acpi contrib/dev/acpica/components/namespace/nseval.c optional acpi contrib/dev/acpica/components/namespace/nsinit.c optional acpi contrib/dev/acpica/components/namespace/nsload.c optional acpi contrib/dev/acpica/components/namespace/nsnames.c optional acpi contrib/dev/acpica/components/namespace/nsobject.c optional acpi contrib/dev/acpica/components/namespace/nsparse.c optional acpi contrib/dev/acpica/components/namespace/nspredef.c optional acpi contrib/dev/acpica/components/namespace/nsprepkg.c optional acpi contrib/dev/acpica/components/namespace/nsrepair.c optional acpi contrib/dev/acpica/components/namespace/nsrepair2.c optional acpi contrib/dev/acpica/components/namespace/nssearch.c optional acpi contrib/dev/acpica/components/namespace/nsutils.c optional acpi contrib/dev/acpica/components/namespace/nswalk.c optional acpi contrib/dev/acpica/components/namespace/nsxfeval.c optional acpi contrib/dev/acpica/components/namespace/nsxfname.c optional acpi contrib/dev/acpica/components/namespace/nsxfobj.c optional acpi contrib/dev/acpica/components/parser/psargs.c optional acpi contrib/dev/acpica/components/parser/psloop.c optional acpi contrib/dev/acpica/components/parser/psobject.c optional acpi contrib/dev/acpica/components/parser/psopcode.c optional acpi contrib/dev/acpica/components/parser/psopinfo.c optional acpi contrib/dev/acpica/components/parser/psparse.c optional acpi contrib/dev/acpica/components/parser/psscope.c optional acpi contrib/dev/acpica/components/parser/pstree.c optional acpi contrib/dev/acpica/components/parser/psutils.c optional acpi contrib/dev/acpica/components/parser/pswalk.c optional acpi contrib/dev/acpica/components/parser/psxface.c optional acpi contrib/dev/acpica/components/resources/rsaddr.c optional acpi contrib/dev/acpica/components/resources/rscalc.c optional acpi contrib/dev/acpica/components/resources/rscreate.c optional acpi contrib/dev/acpica/components/resources/rsdump.c optional acpi acpi_debug contrib/dev/acpica/components/resources/rsdumpinfo.c optional acpi contrib/dev/acpica/components/resources/rsinfo.c optional acpi contrib/dev/acpica/components/resources/rsio.c optional acpi contrib/dev/acpica/components/resources/rsirq.c optional acpi contrib/dev/acpica/components/resources/rslist.c optional acpi contrib/dev/acpica/components/resources/rsmemory.c optional acpi contrib/dev/acpica/components/resources/rsmisc.c optional acpi contrib/dev/acpica/components/resources/rsserial.c optional acpi contrib/dev/acpica/components/resources/rsutils.c optional acpi contrib/dev/acpica/components/resources/rsxface.c optional acpi contrib/dev/acpica/components/tables/tbdata.c optional acpi contrib/dev/acpica/components/tables/tbfadt.c optional acpi contrib/dev/acpica/components/tables/tbfind.c optional acpi contrib/dev/acpica/components/tables/tbinstal.c optional acpi contrib/dev/acpica/components/tables/tbprint.c optional acpi contrib/dev/acpica/components/tables/tbutils.c optional acpi contrib/dev/acpica/components/tables/tbxface.c optional acpi contrib/dev/acpica/components/tables/tbxfload.c optional acpi contrib/dev/acpica/components/tables/tbxfroot.c optional acpi contrib/dev/acpica/components/utilities/utaddress.c optional acpi contrib/dev/acpica/components/utilities/utalloc.c optional acpi contrib/dev/acpica/components/utilities/utascii.c optional acpi contrib/dev/acpica/components/utilities/utbuffer.c optional acpi contrib/dev/acpica/components/utilities/utcache.c optional acpi contrib/dev/acpica/components/utilities/utcopy.c optional acpi contrib/dev/acpica/components/utilities/utdebug.c optional acpi contrib/dev/acpica/components/utilities/utdecode.c optional acpi contrib/dev/acpica/components/utilities/utdelete.c optional acpi contrib/dev/acpica/components/utilities/uterror.c optional acpi contrib/dev/acpica/components/utilities/uteval.c optional acpi contrib/dev/acpica/components/utilities/utexcep.c optional acpi contrib/dev/acpica/components/utilities/utglobal.c optional acpi contrib/dev/acpica/components/utilities/uthex.c optional acpi contrib/dev/acpica/components/utilities/utids.c optional acpi contrib/dev/acpica/components/utilities/utinit.c optional acpi contrib/dev/acpica/components/utilities/utlock.c optional acpi contrib/dev/acpica/components/utilities/utmath.c optional acpi contrib/dev/acpica/components/utilities/utmisc.c optional acpi contrib/dev/acpica/components/utilities/utmutex.c optional acpi contrib/dev/acpica/components/utilities/utnonansi.c optional acpi contrib/dev/acpica/components/utilities/utobject.c optional acpi contrib/dev/acpica/components/utilities/utosi.c optional acpi contrib/dev/acpica/components/utilities/utownerid.c optional acpi contrib/dev/acpica/components/utilities/utpredef.c optional acpi contrib/dev/acpica/components/utilities/utresdecode.c optional acpi acpi_debug contrib/dev/acpica/components/utilities/utresrc.c optional acpi contrib/dev/acpica/components/utilities/utstate.c optional acpi contrib/dev/acpica/components/utilities/utstring.c optional acpi contrib/dev/acpica/components/utilities/utstrsuppt.c optional acpi contrib/dev/acpica/components/utilities/utstrtoul64.c optional acpi contrib/dev/acpica/components/utilities/utuuid.c optional acpi acpi_debug contrib/dev/acpica/components/utilities/utxface.c optional acpi contrib/dev/acpica/components/utilities/utxferror.c optional acpi contrib/dev/acpica/components/utilities/utxfinit.c optional acpi contrib/dev/acpica/os_specific/service_layers/osgendbg.c optional acpi acpi_debug contrib/ipfilter/netinet/fil.c optional ipfilter inet \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_auth.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_fil_freebsd.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_frag.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_log.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_nat.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_proxy.c optional ipfilter inet \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_state.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_lookup.c optional ipfilter inet \ compile-with "${NORMAL_C} ${NO_WSELF_ASSIGN} -Wno-unused -Wno-error -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_pool.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_htable.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter ${NO_WTAUTOLOGICAL_POINTER_COMPARE}" contrib/ipfilter/netinet/ip_sync.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/mlfk_ipl.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_nat6.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_rules.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_scan.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/ip_dstlist.c optional ipfilter inet \ compile-with "${NORMAL_C} -Wno-unused -I$S/contrib/ipfilter" contrib/ipfilter/netinet/radix_ipf.c optional ipfilter inet \ compile-with "${NORMAL_C} -I$S/contrib/ipfilter" contrib/libfdt/fdt.c optional fdt contrib/libfdt/fdt_ro.c optional fdt contrib/libfdt/fdt_rw.c optional fdt contrib/libfdt/fdt_strerror.c optional fdt contrib/libfdt/fdt_sw.c optional fdt contrib/libfdt/fdt_wip.c optional fdt contrib/libnv/cnvlist.c standard contrib/libnv/dnvlist.c standard contrib/libnv/nvlist.c standard contrib/libnv/nvpair.c standard contrib/ngatm/netnatm/api/cc_conn.c optional ngatm_ccatm \ compile-with "${NORMAL_C_NOWERROR} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_data.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_dump.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_port.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_sig.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/cc_user.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/api/unisap.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/misc/straddr.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/misc/unimsg_common.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/traffic.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/uni_ie.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/msg/uni_msg.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/saal/saal_sscfu.c optional ngatm_sscfu \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/saal/saal_sscop.c optional ngatm_sscop \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_call.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_coord.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_party.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_print.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_reset.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_uni.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_unimsgcpy.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" contrib/ngatm/netnatm/sig/sig_verify.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" # xz dev/xz/xz_mod.c optional xz \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_crc32.c optional xz \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_dec_bcj.c optional xz \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_dec_lzma2.c optional xz \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" contrib/xz-embedded/linux/lib/xz/xz_dec_stream.c optional xz \ compile-with "${NORMAL_C} -I$S/contrib/xz-embedded/freebsd/ -I$S/contrib/xz-embedded/linux/lib/xz/ -I$S/contrib/xz-embedded/linux/include/linux/" # Zstd contrib/zstd/lib/freebsd/zstd_kmalloc.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/common/zstd_common.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/common/fse_decompress.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/common/entropy_common.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/common/error_private.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/common/xxhash.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_compress.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_compress_literals.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_compress_sequences.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/fse_compress.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/hist.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/huf_compress.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_double_fast.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_fast.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_lazy.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_ldm.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_opt.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/decompress/zstd_ddict.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/decompress/zstd_decompress.c optional zstdio compile-with ${ZSTD_C} # See comment in sys/conf/kern.pre.mk contrib/zstd/lib/decompress/zstd_decompress_block.c optional zstdio \ compile-with "${ZSTD_C} ${ZSTD_DECOMPRESS_BLOCK_FLAGS}" contrib/zstd/lib/decompress/huf_decompress.c optional zstdio compile-with ${ZSTD_C} # Blake 2 contrib/libb2/blake2b-ref.c optional crypto | ipsec | ipsec_support \ compile-with "${NORMAL_C} -I$S/crypto/blake2 -Wno-cast-qual -DSUFFIX=_ref -Wno-unused-function" contrib/libb2/blake2s-ref.c optional crypto | ipsec | ipsec_support \ compile-with "${NORMAL_C} -I$S/crypto/blake2 -Wno-cast-qual -DSUFFIX=_ref -Wno-unused-function" crypto/blake2/blake2-sw.c optional crypto | ipsec | ipsec_support \ compile-with "${NORMAL_C} -I$S/crypto/blake2 -Wno-cast-qual" crypto/blowfish/bf_ecb.c optional ipsec | ipsec_support crypto/blowfish/bf_skey.c optional crypto | ipsec | ipsec_support crypto/camellia/camellia.c optional crypto | ipsec | ipsec_support crypto/camellia/camellia-api.c optional crypto | ipsec | ipsec_support crypto/chacha20/chacha.c standard crypto/chacha20/chacha-sw.c optional crypto | ipsec | ipsec_support crypto/des/des_ecb.c optional crypto | ipsec | ipsec_support | netsmb crypto/des/des_setkey.c optional crypto | ipsec | ipsec_support | netsmb crypto/rc4/rc4.c optional netgraph_mppc_encryption | kgssapi crypto/rijndael/rijndael-alg-fst.c optional crypto | ekcd | geom_bde | \ ipsec | ipsec_support | !random_loadable | wlan_ccmp crypto/rijndael/rijndael-api-fst.c optional ekcd | geom_bde | !random_loadable crypto/rijndael/rijndael-api.c optional crypto | ipsec | ipsec_support | \ wlan_ccmp crypto/sha1.c optional carp | crypto | ether | ipsec | \ ipsec_support | netgraph_mppc_encryption | sctp crypto/sha2/sha256c.c optional crypto | ekcd | geom_bde | ipsec | \ ipsec_support | !random_loadable | sctp | zfs crypto/sha2/sha512c.c optional crypto | geom_bde | ipsec | \ ipsec_support | zfs crypto/skein/skein.c optional crypto | zfs crypto/skein/skein_block.c optional crypto | zfs crypto/siphash/siphash.c optional inet | inet6 crypto/siphash/siphash_test.c optional inet | inet6 ddb/db_access.c optional ddb ddb/db_break.c optional ddb ddb/db_capture.c optional ddb ddb/db_command.c optional ddb ddb/db_examine.c optional ddb ddb/db_expr.c optional ddb ddb/db_input.c optional ddb ddb/db_lex.c optional ddb ddb/db_main.c optional ddb ddb/db_output.c optional ddb ddb/db_print.c optional ddb ddb/db_ps.c optional ddb ddb/db_run.c optional ddb ddb/db_script.c optional ddb ddb/db_sym.c optional ddb ddb/db_thread.c optional ddb ddb/db_textdump.c optional ddb ddb/db_variables.c optional ddb ddb/db_watch.c optional ddb ddb/db_write_cmd.c optional ddb dev/aac/aac.c optional aac dev/aac/aac_cam.c optional aacp aac dev/aac/aac_debug.c optional aac dev/aac/aac_disk.c optional aac dev/aac/aac_linux.c optional aac compat_linux dev/aac/aac_pci.c optional aac pci dev/aacraid/aacraid.c optional aacraid dev/aacraid/aacraid_cam.c optional aacraid scbus dev/aacraid/aacraid_debug.c optional aacraid dev/aacraid/aacraid_linux.c optional aacraid compat_linux dev/aacraid/aacraid_pci.c optional aacraid pci dev/acpi_support/acpi_wmi.c optional acpi_wmi acpi dev/acpi_support/acpi_asus.c optional acpi_asus acpi dev/acpi_support/acpi_asus_wmi.c optional acpi_asus_wmi acpi dev/acpi_support/acpi_fujitsu.c optional acpi_fujitsu acpi dev/acpi_support/acpi_hp.c optional acpi_hp acpi dev/acpi_support/acpi_ibm.c optional acpi_ibm acpi dev/acpi_support/acpi_panasonic.c optional acpi_panasonic acpi dev/acpi_support/acpi_sony.c optional acpi_sony acpi dev/acpi_support/acpi_toshiba.c optional acpi_toshiba acpi dev/acpi_support/atk0110.c optional aibs acpi dev/acpica/Osd/OsdDebug.c optional acpi dev/acpica/Osd/OsdHardware.c optional acpi dev/acpica/Osd/OsdInterrupt.c optional acpi dev/acpica/Osd/OsdMemory.c optional acpi dev/acpica/Osd/OsdSchedule.c optional acpi dev/acpica/Osd/OsdStream.c optional acpi dev/acpica/Osd/OsdSynch.c optional acpi dev/acpica/Osd/OsdTable.c optional acpi dev/acpica/acpi.c optional acpi dev/acpica/acpi_acad.c optional acpi dev/acpica/acpi_battery.c optional acpi dev/acpica/acpi_button.c optional acpi dev/acpica/acpi_cmbat.c optional acpi dev/acpica/acpi_cpu.c optional acpi dev/acpica/acpi_ec.c optional acpi dev/acpica/acpi_isab.c optional acpi isa dev/acpica/acpi_lid.c optional acpi dev/acpica/acpi_package.c optional acpi dev/acpica/acpi_perf.c optional acpi dev/acpica/acpi_powerres.c optional acpi dev/acpica/acpi_quirk.c optional acpi dev/acpica/acpi_resource.c optional acpi dev/acpica/acpi_container.c optional acpi dev/acpica/acpi_smbat.c optional acpi dev/acpica/acpi_thermal.c optional acpi dev/acpica/acpi_throttle.c optional acpi dev/acpica/acpi_video.c optional acpi_video acpi dev/acpica/acpi_dock.c optional acpi_dock acpi dev/adlink/adlink.c optional adlink dev/ae/if_ae.c optional ae pci dev/age/if_age.c optional age pci dev/agp/agp.c optional agp pci dev/agp/agp_if.m optional agp pci dev/ahci/ahci.c optional ahci dev/ahci/ahciem.c optional ahci dev/ahci/ahci_pci.c optional ahci pci dev/aic7xxx/ahc_isa.c optional ahc isa dev/aic7xxx/ahc_pci.c optional ahc pci \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/aic7xxx/ahd_pci.c optional ahd pci \ compile-with "${NORMAL_C} ${NO_WCONSTANT_CONVERSION}" dev/aic7xxx/aic7770.c optional ahc dev/aic7xxx/aic79xx.c optional ahd pci dev/aic7xxx/aic79xx_osm.c optional ahd pci dev/aic7xxx/aic79xx_pci.c optional ahd pci dev/aic7xxx/aic79xx_reg_print.c optional ahd pci ahd_reg_pretty_print dev/aic7xxx/aic7xxx.c optional ahc dev/aic7xxx/aic7xxx_93cx6.c optional ahc dev/aic7xxx/aic7xxx_osm.c optional ahc dev/aic7xxx/aic7xxx_pci.c optional ahc pci dev/aic7xxx/aic7xxx_reg_print.c optional ahc ahc_reg_pretty_print dev/al_eth/al_eth.c optional al_eth fdt \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" dev/al_eth/al_init_eth_lm.c optional al_eth fdt \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" dev/al_eth/al_init_eth_kr.c optional al_eth fdt \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_iofic.c optional al_iofic \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_serdes_25g.c optional al_serdes \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_serdes_hssp.c optional al_serdes \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_udma_config.c optional al_udma \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_udma_debug.c optional al_udma \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_udma_iofic.c optional al_udma \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_hal_udma_main.c optional al_udma \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/al_serdes.c optional al_serdes \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/eth/al_hal_eth_kr.c optional al_eth \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" contrib/alpine-hal/eth/al_hal_eth_main.c optional al_eth \ no-depend \ compile-with "${CC} -c -o ${.TARGET} ${CFLAGS} -I$S/contrib/alpine-hal -I$S/contrib/alpine-hal/eth ${PROF} ${.IMPSRC}" dev/alc/if_alc.c optional alc pci dev/ale/if_ale.c optional ale pci dev/alpm/alpm.c optional alpm pci dev/altera/avgen/altera_avgen.c optional altera_avgen dev/altera/avgen/altera_avgen_fdt.c optional altera_avgen fdt dev/altera/avgen/altera_avgen_nexus.c optional altera_avgen dev/altera/msgdma/msgdma.c optional altera_msgdma xdma dev/altera/sdcard/altera_sdcard.c optional altera_sdcard dev/altera/sdcard/altera_sdcard_disk.c optional altera_sdcard dev/altera/sdcard/altera_sdcard_io.c optional altera_sdcard dev/altera/sdcard/altera_sdcard_fdt.c optional altera_sdcard fdt dev/altera/sdcard/altera_sdcard_nexus.c optional altera_sdcard dev/altera/softdma/softdma.c optional altera_softdma xdma fdt dev/altera/pio/pio.c optional altera_pio dev/altera/pio/pio_if.m optional altera_pio dev/amdpm/amdpm.c optional amdpm pci | nfpm pci dev/amdsmb/amdsmb.c optional amdsmb pci dev/amr/amr.c optional amr dev/amr/amr_cam.c optional amrp amr dev/amr/amr_disk.c optional amr dev/amr/amr_linux.c optional amr compat_linux dev/amr/amr_pci.c optional amr pci dev/an/if_an.c optional an dev/an/if_an_isa.c optional an isa dev/an/if_an_pccard.c optional an pccard dev/an/if_an_pci.c optional an pci # dev/ata/ata_if.m optional ata | atacore dev/ata/ata-all.c optional ata | atacore dev/ata/ata-dma.c optional ata | atacore dev/ata/ata-lowlevel.c optional ata | atacore dev/ata/ata-sata.c optional ata | atacore dev/ata/ata-card.c optional ata pccard | atapccard dev/ata/ata-isa.c optional ata isa | ataisa dev/ata/ata-pci.c optional ata pci | atapci dev/ata/chipsets/ata-acard.c optional ata pci | ataacard dev/ata/chipsets/ata-acerlabs.c optional ata pci | ataacerlabs dev/ata/chipsets/ata-amd.c optional ata pci | ataamd dev/ata/chipsets/ata-ati.c optional ata pci | ataati dev/ata/chipsets/ata-cenatek.c optional ata pci | atacenatek dev/ata/chipsets/ata-cypress.c optional ata pci | atacypress dev/ata/chipsets/ata-cyrix.c optional ata pci | atacyrix dev/ata/chipsets/ata-highpoint.c optional ata pci | atahighpoint dev/ata/chipsets/ata-intel.c optional ata pci | ataintel dev/ata/chipsets/ata-ite.c optional ata pci | ataite dev/ata/chipsets/ata-jmicron.c optional ata pci | atajmicron dev/ata/chipsets/ata-marvell.c optional ata pci | atamarvell dev/ata/chipsets/ata-micron.c optional ata pci | atamicron dev/ata/chipsets/ata-national.c optional ata pci | atanational dev/ata/chipsets/ata-netcell.c optional ata pci | atanetcell dev/ata/chipsets/ata-nvidia.c optional ata pci | atanvidia dev/ata/chipsets/ata-promise.c optional ata pci | atapromise dev/ata/chipsets/ata-serverworks.c optional ata pci | ataserverworks dev/ata/chipsets/ata-siliconimage.c optional ata pci | atasiliconimage | ataati dev/ata/chipsets/ata-sis.c optional ata pci | atasis dev/ata/chipsets/ata-via.c optional ata pci | atavia # dev/ath/if_ath_pci.c optional ath_pci pci \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/ath/if_ath_ahb.c optional ath_ahb \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/ath/if_ath.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_alq.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_beacon.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_btcoex.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_btcoex_mci.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_debug.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_descdma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_keycache.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_ioctl.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_led.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_lna_div.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tx.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tx_edma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tx_ht.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_tdma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_sysctl.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_rx.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_rx_edma.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/if_ath_spectral.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ah_osdep.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/ath/ath_hal/ah.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v1.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v3.c optional ath_hal | ath_ar5211 | ath_ar5212 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v14.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_v4k.c \ optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_eeprom_9287.c \ optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_hal/ah_regdomain.c optional ath \ compile-with "${NORMAL_C} ${NO_WSHIFT_COUNT_NEGATIVE} ${NO_WSHIFT_COUNT_OVERFLOW} -I$S/dev/ath" # ar5210 dev/ath/ath_hal/ar5210/ar5210_attach.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_beacon.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_interrupts.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_keycache.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_misc.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_phy.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_power.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_recv.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_reset.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5210/ar5210_xmit.c optional ath_hal | ath_ar5210 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar5211 dev/ath/ath_hal/ar5211/ar5211_attach.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_beacon.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_interrupts.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_keycache.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_misc.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_phy.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_power.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_recv.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_reset.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5211/ar5211_xmit.c optional ath_hal | ath_ar5211 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar5212 dev/ath/ath_hal/ar5212/ar5212_ani.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_attach.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_beacon.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_eeprom.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_gpio.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_interrupts.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_keycache.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_misc.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_phy.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_power.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_recv.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_reset.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_rfgain.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5212_xmit.c \ optional ath_hal | ath_ar5212 | ath_ar5416 | ath_ar9160 | ath_ar9280 | \ ath_ar9285 ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar5416 (depends on ar5212) dev/ath/ath_hal/ar5416/ar5416_ani.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_attach.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_beacon.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_btcoex.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_iq.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_adcgain.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_cal_adcdc.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_eeprom.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_gpio.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_interrupts.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_keycache.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_misc.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_phy.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_power.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_radar.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_recv.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_reset.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_spectral.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar5416_xmit.c \ optional ath_hal | ath_ar5416 | ath_ar9160 | ath_ar9280 | ath_ar9285 | \ ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9130 (depends upon ar5416) - also requires AH_SUPPORT_AR9130 # # Since this is an embedded MAC SoC, there's no need to compile it into the # default HAL. dev/ath/ath_hal/ar9001/ar9130_attach.c optional ath_ar9130 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9001/ar9130_phy.c optional ath_ar9130 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9001/ar9130_eeprom.c optional ath_ar9130 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9160 (depends on ar5416) dev/ath/ath_hal/ar9001/ar9160_attach.c optional ath_hal | ath_ar9160 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9280 (depends on ar5416) dev/ath/ath_hal/ar9002/ar9280_attach.c optional ath_hal | ath_ar9280 | \ ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9280_olc.c optional ath_hal | ath_ar9280 | \ ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9285 (depends on ar5416 and ar9280) dev/ath/ath_hal/ar9002/ar9285_attach.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_btcoex.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_reset.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_cal.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_phy.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285_diversity.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9287 (depends on ar5416) dev/ath/ath_hal/ar9002/ar9287_attach.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287_reset.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287_cal.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287_olc.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ar9300 contrib/dev/ath/ath_hal/ar9300/ar9300_ani.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_attach.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_beacon.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_eeprom.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal ${NO_WCONSTANT_CONVERSION}" contrib/dev/ath/ath_hal/ar9300/ar9300_freebsd.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_gpio.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_interrupts.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_keycache.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_mci.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_misc.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_paprd.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_phy.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_power.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_radar.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_radio.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_recv.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_recv_ds.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_reset.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal ${NO_WSOMETIMES_UNINITIALIZED} -Wno-unused-function" contrib/dev/ath/ath_hal/ar9300/ar9300_stub.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_stub_funcs.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_spectral.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_timer.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_xmit.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" contrib/dev/ath/ath_hal/ar9300/ar9300_xmit_ds.c optional ath_hal | ath_ar9300 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal -I$S/contrib/dev/ath/ath_hal" # rf backends dev/ath/ath_hal/ar5212/ar2316.c optional ath_rf2316 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2317.c optional ath_rf2317 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2413.c optional ath_hal | ath_rf2413 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar2425.c optional ath_hal | ath_rf2425 | ath_rf2417 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5111.c optional ath_hal | ath_rf5111 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5112.c optional ath_hal | ath_rf5112 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5212/ar5413.c optional ath_hal | ath_rf5413 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar5416/ar2133.c optional ath_hal | ath_ar5416 | \ ath_ar9130 | ath_ar9160 | ath_ar9280 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9280.c optional ath_hal | ath_ar9280 | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9285.c optional ath_hal | ath_ar9285 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" dev/ath/ath_hal/ar9002/ar9287.c optional ath_hal | ath_ar9287 \ compile-with "${NORMAL_C} -I$S/dev/ath -I$S/dev/ath/ath_hal" # ath rate control algorithms dev/ath/ath_rate/amrr/amrr.c optional ath_rate_amrr \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_rate/onoe/onoe.c optional ath_rate_onoe \ compile-with "${NORMAL_C} -I$S/dev/ath" dev/ath/ath_rate/sample/sample.c optional ath_rate_sample \ compile-with "${NORMAL_C} -I$S/dev/ath" # ath DFS modules dev/ath/ath_dfs/null/dfs_null.c optional ath \ compile-with "${NORMAL_C} -I$S/dev/ath" # dev/bce/if_bce.c optional bce dev/bfe/if_bfe.c optional bfe dev/bge/if_bge.c optional bge dev/bhnd/bhnd.c optional bhnd dev/bhnd/bhnd_erom.c optional bhnd dev/bhnd/bhnd_erom_if.m optional bhnd dev/bhnd/bhnd_subr.c optional bhnd dev/bhnd/bhnd_bus_if.m optional bhnd dev/bhnd/bhndb/bhnd_bhndb.c optional bhndb bhnd dev/bhnd/bhndb/bhndb.c optional bhndb bhnd dev/bhnd/bhndb/bhndb_bus_if.m optional bhndb bhnd dev/bhnd/bhndb/bhndb_hwdata.c optional bhndb bhnd dev/bhnd/bhndb/bhndb_if.m optional bhndb bhnd dev/bhnd/bhndb/bhndb_pci.c optional bhndb_pci bhndb bhnd pci dev/bhnd/bhndb/bhndb_pci_hwdata.c optional bhndb_pci bhndb bhnd pci dev/bhnd/bhndb/bhndb_pci_sprom.c optional bhndb_pci bhndb bhnd pci dev/bhnd/bhndb/bhndb_subr.c optional bhndb bhnd dev/bhnd/bcma/bcma.c optional bcma bhnd dev/bhnd/bcma/bcma_bhndb.c optional bcma bhnd bhndb dev/bhnd/bcma/bcma_erom.c optional bcma bhnd dev/bhnd/bcma/bcma_subr.c optional bcma bhnd dev/bhnd/cores/chipc/bhnd_chipc_if.m optional bhnd dev/bhnd/cores/chipc/bhnd_sprom_chipc.c optional bhnd dev/bhnd/cores/chipc/bhnd_pmu_chipc.c optional bhnd dev/bhnd/cores/chipc/chipc.c optional bhnd dev/bhnd/cores/chipc/chipc_cfi.c optional bhnd cfi dev/bhnd/cores/chipc/chipc_gpio.c optional bhnd gpio dev/bhnd/cores/chipc/chipc_slicer.c optional bhnd cfi | bhnd spibus dev/bhnd/cores/chipc/chipc_spi.c optional bhnd spibus dev/bhnd/cores/chipc/chipc_subr.c optional bhnd dev/bhnd/cores/chipc/pwrctl/bhnd_pwrctl.c optional bhnd dev/bhnd/cores/chipc/pwrctl/bhnd_pwrctl_if.m optional bhnd dev/bhnd/cores/chipc/pwrctl/bhnd_pwrctl_hostb_if.m optional bhnd dev/bhnd/cores/chipc/pwrctl/bhnd_pwrctl_subr.c optional bhnd dev/bhnd/cores/pci/bhnd_pci.c optional bhnd pci dev/bhnd/cores/pci/bhnd_pci_hostb.c optional bhndb bhnd pci dev/bhnd/cores/pci/bhnd_pcib.c optional bhnd_pcib bhnd pci dev/bhnd/cores/pcie2/bhnd_pcie2.c optional bhnd pci dev/bhnd/cores/pcie2/bhnd_pcie2_hostb.c optional bhndb bhnd pci dev/bhnd/cores/pcie2/bhnd_pcie2b.c optional bhnd_pcie2b bhnd pci dev/bhnd/cores/pmu/bhnd_pmu.c optional bhnd dev/bhnd/cores/pmu/bhnd_pmu_core.c optional bhnd dev/bhnd/cores/pmu/bhnd_pmu_if.m optional bhnd dev/bhnd/cores/pmu/bhnd_pmu_subr.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data_bcm.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data_bcmraw.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data_btxt.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data_sprom.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data_sprom_subr.c optional bhnd dev/bhnd/nvram/bhnd_nvram_data_tlv.c optional bhnd dev/bhnd/nvram/bhnd_nvram_if.m optional bhnd dev/bhnd/nvram/bhnd_nvram_io.c optional bhnd dev/bhnd/nvram/bhnd_nvram_iobuf.c optional bhnd dev/bhnd/nvram/bhnd_nvram_ioptr.c optional bhnd dev/bhnd/nvram/bhnd_nvram_iores.c optional bhnd dev/bhnd/nvram/bhnd_nvram_plist.c optional bhnd dev/bhnd/nvram/bhnd_nvram_store.c optional bhnd dev/bhnd/nvram/bhnd_nvram_store_subr.c optional bhnd dev/bhnd/nvram/bhnd_nvram_subr.c optional bhnd dev/bhnd/nvram/bhnd_nvram_value.c optional bhnd dev/bhnd/nvram/bhnd_nvram_value_fmts.c optional bhnd dev/bhnd/nvram/bhnd_nvram_value_prf.c optional bhnd dev/bhnd/nvram/bhnd_nvram_value_subr.c optional bhnd dev/bhnd/nvram/bhnd_sprom.c optional bhnd dev/bhnd/siba/siba.c optional siba bhnd dev/bhnd/siba/siba_bhndb.c optional siba bhnd bhndb dev/bhnd/siba/siba_erom.c optional siba bhnd dev/bhnd/siba/siba_subr.c optional siba bhnd # dev/bktr/bktr_audio.c optional bktr pci dev/bktr/bktr_card.c optional bktr pci dev/bktr/bktr_core.c optional bktr pci dev/bktr/bktr_i2c.c optional bktr pci smbus dev/bktr/bktr_os.c optional bktr pci dev/bktr/bktr_tuner.c optional bktr pci dev/bktr/msp34xx.c optional bktr pci dev/bnxt/bnxt_hwrm.c optional bnxt iflib pci dev/bnxt/bnxt_sysctl.c optional bnxt iflib pci dev/bnxt/bnxt_txrx.c optional bnxt iflib pci dev/bnxt/if_bnxt.c optional bnxt iflib pci dev/bwi/bwimac.c optional bwi dev/bwi/bwiphy.c optional bwi dev/bwi/bwirf.c optional bwi dev/bwi/if_bwi.c optional bwi dev/bwi/if_bwi_pci.c optional bwi pci dev/bwn/if_bwn.c optional bwn bhnd dev/bwn/if_bwn_pci.c optional bwn pci bhnd bhndb bhndb_pci dev/bwn/if_bwn_phy_common.c optional bwn bhnd dev/bwn/if_bwn_phy_g.c optional bwn bhnd dev/bwn/if_bwn_phy_lp.c optional bwn bhnd dev/bwn/if_bwn_phy_n.c optional bwn bhnd dev/bwn/if_bwn_util.c optional bwn bhnd dev/cardbus/cardbus.c optional cardbus dev/cardbus/cardbus_cis.c optional cardbus dev/cardbus/cardbus_device.c optional cardbus dev/cas/if_cas.c optional cas dev/cfi/cfi_bus_fdt.c optional cfi fdt dev/cfi/cfi_bus_nexus.c optional cfi dev/cfi/cfi_core.c optional cfi dev/cfi/cfi_dev.c optional cfi dev/cfi/cfi_disk.c optional cfid dev/chromebook_platform/chromebook_platform.c optional chromebook_platform dev/ciss/ciss.c optional ciss dev/cmx/cmx.c optional cmx dev/cmx/cmx_pccard.c optional cmx pccard dev/cpufreq/ichss.c optional cpufreq pci dev/cxgb/cxgb_main.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_sge.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_mc5.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_vsc7323.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_vsc8211.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_ael1002.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_aq100x.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_mv88e1xxx.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_xgmac.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_t3_hw.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/common/cxgb_tn1010.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/sys/uipc_mvec.c optional cxgb pci \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgb/cxgb_t3fw.c optional cxgb cxgb_t3fw \ compile-with "${NORMAL_C} -I$S/dev/cxgb" dev/cxgbe/t4_clip.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_filter.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_if.m optional cxgbe pci dev/cxgbe/t4_iov.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_mp_ring.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_main.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_netmap.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_sched.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_sge.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_smt.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_l2t.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_tracer.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/t4_vf.c optional cxgbev pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/common/t4_hw.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/common/t4vf_hw.c optional cxgbev pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/crypto/t4_kern_tls.c optional cxgbe pci kern_tls \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/crypto/t4_keyctx.c optional cxgbe pci \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/cudbg/cudbg_common.c optional cxgbe \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/cudbg/cudbg_flash_utils.c optional cxgbe \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/cudbg/cudbg_lib.c optional cxgbe \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/cudbg/cudbg_wtp.c optional cxgbe \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/cudbg/fastlz.c optional cxgbe \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cxgbe/cudbg/fastlz_api.c optional cxgbe \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" t4fw_cfg.c optional cxgbe \ compile-with "${AWK} -f $S/tools/fw_stub.awk t4fw_cfg.fw:t4fw_cfg t4fw_cfg_uwire.fw:t4fw_cfg_uwire t4fw.fw:t4fw -mt4fw_cfg -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "t4fw_cfg.c" t4fw_cfg.fwo optional cxgbe \ dependency "t4fw_cfg.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t4fw_cfg.fwo" t4fw_cfg.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t4fw_cfg.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t4fw_cfg.fw" t4fw_cfg_uwire.fwo optional cxgbe \ dependency "t4fw_cfg_uwire.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t4fw_cfg_uwire.fwo" t4fw_cfg_uwire.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t4fw_cfg_uwire.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t4fw_cfg_uwire.fw" t4fw.fwo optional cxgbe \ dependency "t4fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t4fw.fwo" t4fw.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t4fw-1.24.12.0.bin" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t4fw.fw" t5fw_cfg.c optional cxgbe \ compile-with "${AWK} -f $S/tools/fw_stub.awk t5fw_cfg.fw:t5fw_cfg t5fw_cfg_uwire.fw:t5fw_cfg_uwire t5fw.fw:t5fw -mt5fw_cfg -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "t5fw_cfg.c" t5fw_cfg.fwo optional cxgbe \ dependency "t5fw_cfg.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t5fw_cfg.fwo" t5fw_cfg.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t5fw_cfg.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t5fw_cfg.fw" t5fw_cfg_uwire.fwo optional cxgbe \ dependency "t5fw_cfg_uwire.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t5fw_cfg_uwire.fwo" t5fw_cfg_uwire.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t5fw_cfg_uwire.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t5fw_cfg_uwire.fw" t5fw.fwo optional cxgbe \ dependency "t5fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t5fw.fwo" t5fw.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t5fw-1.24.12.0.bin" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t5fw.fw" t6fw_cfg.c optional cxgbe \ compile-with "${AWK} -f $S/tools/fw_stub.awk t6fw_cfg.fw:t6fw_cfg t6fw_cfg_uwire.fw:t6fw_cfg_uwire t6fw.fw:t6fw -mt6fw_cfg -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "t6fw_cfg.c" t6fw_cfg.fwo optional cxgbe \ dependency "t6fw_cfg.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t6fw_cfg.fwo" t6fw_cfg.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t6fw_cfg.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t6fw_cfg.fw" t6fw_cfg_uwire.fwo optional cxgbe \ dependency "t6fw_cfg_uwire.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t6fw_cfg_uwire.fwo" t6fw_cfg_uwire.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t6fw_cfg_uwire.txt" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t6fw_cfg_uwire.fw" t6fw.fwo optional cxgbe \ dependency "t6fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "t6fw.fwo" t6fw.fw optional cxgbe \ dependency "$S/dev/cxgbe/firmware/t6fw-1.24.12.0.bin" \ compile-with "${CP} ${.ALLSRC} ${.TARGET}" \ no-obj no-implicit-rule \ clean "t6fw.fw" dev/cxgbe/crypto/t4_crypto.c optional ccr \ compile-with "${NORMAL_C} -I$S/dev/cxgbe" dev/cy/cy.c optional cy dev/cy/cy_isa.c optional cy isa dev/cy/cy_pci.c optional cy pci dev/cyapa/cyapa.c optional cyapa iicbus dev/dc/if_dc.c optional dc pci dev/dc/dcphy.c optional dc pci dev/dc/pnphy.c optional dc pci dev/dcons/dcons.c optional dcons dev/dcons/dcons_crom.c optional dcons_crom dev/dcons/dcons_os.c optional dcons dev/dme/if_dme.c optional dme dev/drm2/drm_agpsupport.c optional drm2 dev/drm2/drm_auth.c optional drm2 dev/drm2/drm_bufs.c optional drm2 dev/drm2/drm_buffer.c optional drm2 dev/drm2/drm_context.c optional drm2 dev/drm2/drm_crtc.c optional drm2 dev/drm2/drm_crtc_helper.c optional drm2 dev/drm2/drm_dma.c optional drm2 dev/drm2/drm_dp_helper.c optional drm2 dev/drm2/drm_dp_iic_helper.c optional drm2 dev/drm2/drm_drv.c optional drm2 dev/drm2/drm_edid.c optional drm2 dev/drm2/drm_fb_helper.c optional drm2 dev/drm2/drm_fops.c optional drm2 dev/drm2/drm_gem.c optional drm2 dev/drm2/drm_gem_names.c optional drm2 dev/drm2/drm_global.c optional drm2 dev/drm2/drm_hashtab.c optional drm2 dev/drm2/drm_ioctl.c optional drm2 dev/drm2/drm_irq.c optional drm2 dev/drm2/drm_linux_list_sort.c optional drm2 dev/drm2/drm_lock.c optional drm2 dev/drm2/drm_memory.c optional drm2 dev/drm2/drm_mm.c optional drm2 dev/drm2/drm_modes.c optional drm2 dev/drm2/drm_pci.c optional drm2 dev/drm2/drm_platform.c optional drm2 dev/drm2/drm_scatter.c optional drm2 dev/drm2/drm_stub.c optional drm2 dev/drm2/drm_sysctl.c optional drm2 dev/drm2/drm_vm.c optional drm2 dev/drm2/drm_os_freebsd.c optional drm2 dev/drm2/ttm/ttm_agp_backend.c optional drm2 dev/drm2/ttm/ttm_lock.c optional drm2 dev/drm2/ttm/ttm_object.c optional drm2 dev/drm2/ttm/ttm_tt.c optional drm2 dev/drm2/ttm/ttm_bo_util.c optional drm2 dev/drm2/ttm/ttm_bo.c optional drm2 dev/drm2/ttm/ttm_bo_manager.c optional drm2 dev/drm2/ttm/ttm_execbuf_util.c optional drm2 dev/drm2/ttm/ttm_memory.c optional drm2 dev/drm2/ttm/ttm_page_alloc.c optional drm2 dev/drm2/ttm/ttm_bo_vm.c optional drm2 dev/efidev/efidev.c optional efirt dev/efidev/efirt.c optional efirt dev/efidev/efirtc.c optional efirt dev/e1000/if_em.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/em_txrx.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/igb_txrx.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_80003es2lan.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82540.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82541.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82542.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82543.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82571.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_82575.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_ich8lan.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_i210.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_api.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_mac.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_manage.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_nvm.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_phy.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_vf.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_mbx.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/e1000/e1000_osdep.c optional em \ compile-with "${NORMAL_C} -I$S/dev/e1000" dev/et/if_et.c optional et dev/ena/ena.c optional ena \ compile-with "${NORMAL_C} -I$S/contrib" dev/ena/ena_sysctl.c optional ena \ compile-with "${NORMAL_C} -I$S/contrib" contrib/ena-com/ena_com.c optional ena contrib/ena-com/ena_eth_com.c optional ena dev/esp/esp_pci.c optional esp pci dev/esp/ncr53c9x.c optional esp dev/etherswitch/arswitch/arswitch.c optional arswitch dev/etherswitch/arswitch/arswitch_reg.c optional arswitch dev/etherswitch/arswitch/arswitch_phy.c optional arswitch dev/etherswitch/arswitch/arswitch_8216.c optional arswitch dev/etherswitch/arswitch/arswitch_8226.c optional arswitch dev/etherswitch/arswitch/arswitch_8316.c optional arswitch dev/etherswitch/arswitch/arswitch_8327.c optional arswitch dev/etherswitch/arswitch/arswitch_7240.c optional arswitch dev/etherswitch/arswitch/arswitch_9340.c optional arswitch dev/etherswitch/arswitch/arswitch_vlans.c optional arswitch dev/etherswitch/etherswitch.c optional etherswitch dev/etherswitch/etherswitch_if.m optional etherswitch dev/etherswitch/ip17x/ip17x.c optional ip17x dev/etherswitch/ip17x/ip175c.c optional ip17x dev/etherswitch/ip17x/ip175d.c optional ip17x dev/etherswitch/ip17x/ip17x_phy.c optional ip17x dev/etherswitch/ip17x/ip17x_vlans.c optional ip17x dev/etherswitch/miiproxy.c optional miiproxy dev/etherswitch/rtl8366/rtl8366rb.c optional rtl8366rb dev/etherswitch/e6000sw/e6000sw.c optional e6000sw dev/etherswitch/e6000sw/e6060sw.c optional e6060sw dev/etherswitch/infineon/adm6996fc.c optional adm6996fc dev/etherswitch/micrel/ksz8995ma.c optional ksz8995ma dev/etherswitch/ukswitch/ukswitch.c optional ukswitch dev/evdev/cdev.c optional evdev dev/evdev/evdev.c optional evdev dev/evdev/evdev_mt.c optional evdev dev/evdev/evdev_utils.c optional evdev dev/evdev/uinput.c optional evdev uinput dev/exca/exca.c optional cbb dev/extres/clk/clk.c optional ext_resources clk fdt dev/extres/clk/clkdev_if.m optional ext_resources clk fdt dev/extres/clk/clknode_if.m optional ext_resources clk fdt dev/extres/clk/clk_bus.c optional ext_resources clk fdt dev/extres/clk/clk_div.c optional ext_resources clk fdt dev/extres/clk/clk_fixed.c optional ext_resources clk fdt dev/extres/clk/clk_gate.c optional ext_resources clk fdt dev/extres/clk/clk_link.c optional ext_resources clk fdt dev/extres/clk/clk_mux.c optional ext_resources clk fdt dev/extres/phy/phy.c optional ext_resources phy fdt dev/extres/phy/phydev_if.m optional ext_resources phy fdt dev/extres/phy/phynode_if.m optional ext_resources phy fdt dev/extres/phy/phy_usb.c optional ext_resources phy fdt dev/extres/phy/phynode_usb_if.m optional ext_resources phy fdt dev/extres/hwreset/hwreset.c optional ext_resources hwreset fdt dev/extres/hwreset/hwreset_if.m optional ext_resources hwreset fdt dev/extres/nvmem/nvmem.c optional ext_resources nvmem fdt dev/extres/nvmem/nvmem_if.m optional ext_resources nvmem fdt dev/extres/regulator/regdev_if.m optional ext_resources regulator fdt dev/extres/regulator/regnode_if.m optional ext_resources regulator fdt dev/extres/regulator/regulator.c optional ext_resources regulator fdt dev/extres/regulator/regulator_bus.c optional ext_resources regulator fdt dev/extres/regulator/regulator_fixed.c optional ext_resources regulator fdt dev/extres/syscon/syscon.c optional ext_resources syscon dev/extres/syscon/syscon_generic.c optional ext_resources syscon fdt dev/extres/syscon/syscon_if.m optional ext_resources syscon dev/fb/fbd.c optional fbd | vt dev/fb/fb_if.m standard dev/fb/splash.c optional sc splash dev/fdt/fdt_clock.c optional fdt fdt_clock dev/fdt/fdt_clock_if.m optional fdt fdt_clock dev/fdt/fdt_common.c optional fdt dev/fdt/fdt_pinctrl.c optional fdt fdt_pinctrl dev/fdt/fdt_pinctrl_if.m optional fdt fdt_pinctrl dev/fdt/fdt_slicer.c optional fdt cfi | fdt mx25l | fdt n25q | fdt at45d dev/fdt/fdt_static_dtb.S optional fdt fdt_dtb_static \ dependency "${FDT_DTS_FILE:T:R}.dtb" dev/fdt/simplebus.c optional fdt dev/fdt/simple_mfd.c optional syscon fdt dev/filemon/filemon.c optional filemon dev/firewire/firewire.c optional firewire dev/firewire/fwcrom.c optional firewire dev/firewire/fwdev.c optional firewire dev/firewire/fwdma.c optional firewire dev/firewire/fwmem.c optional firewire dev/firewire/fwohci.c optional firewire dev/firewire/fwohci_pci.c optional firewire pci dev/firewire/if_fwe.c optional fwe dev/firewire/if_fwip.c optional fwip dev/firewire/sbp.c optional sbp dev/firewire/sbp_targ.c optional sbp_targ dev/flash/at45d.c optional at45d dev/flash/cqspi.c optional cqspi fdt xdma dev/flash/mx25l.c optional mx25l dev/flash/n25q.c optional n25q fdt dev/flash/qspi_if.m optional cqspi fdt | n25q fdt dev/fxp/if_fxp.c optional fxp dev/fxp/inphy.c optional fxp dev/gem/if_gem.c optional gem dev/gem/if_gem_pci.c optional gem pci dev/gpio/dwgpio/dwgpio.c optional gpio dwgpio fdt dev/gpio/dwgpio/dwgpio_bus.c optional gpio dwgpio fdt dev/gpio/dwgpio/dwgpio_if.m optional gpio dwgpio fdt dev/gpio/gpiobacklight.c optional gpiobacklight fdt dev/gpio/gpiokeys.c optional gpiokeys fdt dev/gpio/gpiokeys_codes.c optional gpiokeys fdt dev/gpio/gpiobus.c optional gpio \ dependency "gpiobus_if.h" dev/gpio/gpioc.c optional gpio \ dependency "gpio_if.h" dev/gpio/gpioiic.c optional gpioiic dev/gpio/gpioled.c optional gpioled !fdt dev/gpio/gpioled_fdt.c optional gpioled fdt dev/gpio/gpiomdio.c optional gpiomdio mii_bitbang dev/gpio/gpiopower.c optional gpiopower fdt dev/gpio/gpioregulator.c optional gpioregulator fdt ext_resources dev/gpio/gpiospi.c optional gpiospi dev/gpio/gpioths.c optional gpioths dev/gpio/gpio_if.m optional gpio dev/gpio/gpiobus_if.m optional gpio dev/gpio/gpiopps.c optional gpiopps fdt dev/gpio/ofw_gpiobus.c optional fdt gpio dev/hifn/hifn7751.c optional hifn dev/hme/if_hme.c optional hme dev/hme/if_hme_pci.c optional hme pci dev/hptiop/hptiop.c optional hptiop scbus dev/hwpmc/hwpmc_logging.c optional hwpmc dev/hwpmc/hwpmc_mod.c optional hwpmc dev/hwpmc/hwpmc_soft.c optional hwpmc dev/ichiic/ig4_acpi.c optional ig4 acpi iicbus dev/ichiic/ig4_iic.c optional ig4 iicbus dev/ichiic/ig4_pci.c optional ig4 pci iicbus dev/ichsmb/ichsmb.c optional ichsmb dev/ichsmb/ichsmb_pci.c optional ichsmb pci dev/ida/ida.c optional ida dev/ida/ida_disk.c optional ida dev/ida/ida_pci.c optional ida pci dev/iicbus/ad7418.c optional ad7418 dev/iicbus/ads111x.c optional ads111x dev/iicbus/ds1307.c optional ds1307 dev/iicbus/ds13rtc.c optional ds13rtc | ds133x | ds1374 dev/iicbus/ds1672.c optional ds1672 dev/iicbus/ds3231.c optional ds3231 dev/iicbus/syr827.c optional syr827 ext_resources fdt dev/iicbus/icee.c optional icee dev/iicbus/if_ic.c optional ic dev/iicbus/iic.c optional iic dev/iicbus/iic_recover_bus.c optional iicbus dev/iicbus/iicbb.c optional iicbb dev/iicbus/iicbb_if.m optional iicbb dev/iicbus/iicbus.c optional iicbus dev/iicbus/iicbus_if.m optional iicbus dev/iicbus/iiconf.c optional iicbus dev/iicbus/iicsmb.c optional iicsmb \ dependency "iicbus_if.h" dev/iicbus/iicoc.c optional iicoc dev/iicbus/iicoc_fdt.c optional iicoc ext_resources fdt dev/iicbus/iicoc_pci.c optional iicoc pci dev/iicbus/isl12xx.c optional isl12xx dev/iicbus/lm75.c optional lm75 dev/iicbus/mux/iicmux.c optional iicmux dev/iicbus/mux/iicmux_if.m optional iicmux dev/iicbus/mux/iic_gpiomux.c optional iic_gpiomux fdt dev/iicbus/mux/ltc430x.c optional ltc430x dev/iicbus/nxprtc.c optional nxprtc | pcf8563 dev/iicbus/ofw_iicbus.c optional fdt iicbus dev/iicbus/rtc8583.c optional rtc8583 dev/iicbus/s35390a.c optional s35390a dev/iicbus/sy8106a.c optional sy8106a ext_resources fdt dev/iir/iir.c optional iir dev/iir/iir_ctrl.c optional iir dev/iir/iir_pci.c optional iir pci dev/intpm/intpm.c optional intpm pci # XXX Work around clang warning, until maintainer approves fix. dev/ips/ips.c optional ips \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/ips/ips_commands.c optional ips dev/ips/ips_disk.c optional ips dev/ips/ips_ioctl.c optional ips dev/ips/ips_pci.c optional ips pci dev/ipw/if_ipw.c optional ipw ipwbssfw.c optional ipwbssfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_bss.fw:ipw_bss:130 -lintel_ipw -mipw_bss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwbssfw.c" ipw_bss.fwo optional ipwbssfw | ipwfw \ dependency "ipw_bss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "ipw_bss.fwo" ipw_bss.fw optional ipwbssfw | ipwfw \ dependency "$S/contrib/dev/ipw/ipw2100-1.3.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "ipw_bss.fw" ipwibssfw.c optional ipwibssfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_ibss.fw:ipw_ibss:130 -lintel_ipw -mipw_ibss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwibssfw.c" ipw_ibss.fwo optional ipwibssfw | ipwfw \ dependency "ipw_ibss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "ipw_ibss.fwo" ipw_ibss.fw optional ipwibssfw | ipwfw \ dependency "$S/contrib/dev/ipw/ipw2100-1.3-i.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "ipw_ibss.fw" ipwmonitorfw.c optional ipwmonitorfw | ipwfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk ipw_monitor.fw:ipw_monitor:130 -lintel_ipw -mipw_monitor -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "ipwmonitorfw.c" ipw_monitor.fwo optional ipwmonitorfw | ipwfw \ dependency "ipw_monitor.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "ipw_monitor.fwo" ipw_monitor.fw optional ipwmonitorfw | ipwfw \ dependency "$S/contrib/dev/ipw/ipw2100-1.3-p.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "ipw_monitor.fw" dev/iscsi/icl.c optional iscsi dev/iscsi/icl_conn_if.m optional cfiscsi | iscsi dev/iscsi/icl_soft.c optional iscsi dev/iscsi/icl_soft_proxy.c optional iscsi dev/iscsi/iscsi.c optional iscsi scbus dev/iscsi_initiator/iscsi.c optional iscsi_initiator scbus dev/iscsi_initiator/iscsi_subr.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_cam.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_soc.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_sm.c optional iscsi_initiator scbus dev/iscsi_initiator/isc_subr.c optional iscsi_initiator scbus dev/ismt/ismt.c optional ismt dev/isl/isl.c optional isl iicbus dev/isp/isp.c optional isp dev/isp/isp_freebsd.c optional isp dev/isp/isp_library.c optional isp dev/isp/isp_pci.c optional isp pci dev/isp/isp_target.c optional isp dev/ispfw/ispfw.c optional ispfw dev/iwi/if_iwi.c optional iwi iwibssfw.c optional iwibssfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_bss.fw:iwi_bss:300 -lintel_iwi -miwi_bss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwibssfw.c" iwi_bss.fwo optional iwibssfw | iwifw \ dependency "iwi_bss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwi_bss.fwo" iwi_bss.fw optional iwibssfw | iwifw \ dependency "$S/contrib/dev/iwi/ipw2200-bss.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwi_bss.fw" iwiibssfw.c optional iwiibssfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_ibss.fw:iwi_ibss:300 -lintel_iwi -miwi_ibss -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwiibssfw.c" iwi_ibss.fwo optional iwiibssfw | iwifw \ dependency "iwi_ibss.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwi_ibss.fwo" iwi_ibss.fw optional iwiibssfw | iwifw \ dependency "$S/contrib/dev/iwi/ipw2200-ibss.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwi_ibss.fw" iwimonitorfw.c optional iwimonitorfw | iwifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwi_monitor.fw:iwi_monitor:300 -lintel_iwi -miwi_monitor -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwimonitorfw.c" iwi_monitor.fwo optional iwimonitorfw | iwifw \ dependency "iwi_monitor.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwi_monitor.fwo" iwi_monitor.fw optional iwimonitorfw | iwifw \ dependency "$S/contrib/dev/iwi/ipw2200-sniffer.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwi_monitor.fw" dev/iwm/if_iwm.c optional iwm dev/iwm/if_iwm_7000.c optional iwm dev/iwm/if_iwm_8000.c optional iwm dev/iwm/if_iwm_9000.c optional iwm dev/iwm/if_iwm_9260.c optional iwm dev/iwm/if_iwm_binding.c optional iwm dev/iwm/if_iwm_fw.c optional iwm dev/iwm/if_iwm_led.c optional iwm dev/iwm/if_iwm_mac_ctxt.c optional iwm dev/iwm/if_iwm_notif_wait.c optional iwm dev/iwm/if_iwm_pcie_trans.c optional iwm dev/iwm/if_iwm_phy_ctxt.c optional iwm dev/iwm/if_iwm_phy_db.c optional iwm dev/iwm/if_iwm_power.c optional iwm dev/iwm/if_iwm_scan.c optional iwm dev/iwm/if_iwm_sf.c optional iwm dev/iwm/if_iwm_sta.c optional iwm dev/iwm/if_iwm_time_event.c optional iwm dev/iwm/if_iwm_util.c optional iwm iwm3160fw.c optional iwm3160fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm3160.fw:iwm3160fw -miwm3160fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm3160fw.c" iwm3160fw.fwo optional iwm3160fw | iwmfw \ dependency "iwm3160.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm3160fw.fwo" iwm3160.fw optional iwm3160fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-3160-17.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm3160.fw" iwm3168fw.c optional iwm3168fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm3168.fw:iwm3168fw -miwm3168fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm3168fw.c" iwm3168fw.fwo optional iwm3168fw | iwmfw \ dependency "iwm3168.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm3168fw.fwo" iwm3168.fw optional iwm3168fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-3168-22.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm3168.fw" iwm7260fw.c optional iwm7260fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm7260.fw:iwm7260fw -miwm7260fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm7260fw.c" iwm7260fw.fwo optional iwm7260fw | iwmfw \ dependency "iwm7260.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm7260fw.fwo" iwm7260.fw optional iwm7260fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-7260-17.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm7260.fw" iwm7265fw.c optional iwm7265fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm7265.fw:iwm7265fw -miwm7265fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm7265fw.c" iwm7265fw.fwo optional iwm7265fw | iwmfw \ dependency "iwm7265.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm7265fw.fwo" iwm7265.fw optional iwm7265fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-7265-17.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm7265.fw" iwm7265Dfw.c optional iwm7265Dfw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm7265D.fw:iwm7265Dfw -miwm7265Dfw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm7265Dfw.c" iwm7265Dfw.fwo optional iwm7265Dfw | iwmfw \ dependency "iwm7265D.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm7265Dfw.fwo" iwm7265D.fw optional iwm7265Dfw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-7265D-17.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm7265D.fw" iwm8000Cfw.c optional iwm8000Cfw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm8000C.fw:iwm8000Cfw -miwm8000Cfw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm8000Cfw.c" iwm8000Cfw.fwo optional iwm8000Cfw | iwmfw \ dependency "iwm8000C.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm8000Cfw.fwo" iwm8000C.fw optional iwm8000Cfw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-8000C-16.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm8000C.fw" iwm8265.fw optional iwm8265fw | iwmfw \ dependency "$S/contrib/dev/iwm/iwm-8265-22.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwm8265.fw" iwm8265fw.c optional iwm8265fw | iwmfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwm8265.fw:iwm8265fw -miwm8265fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwm8265fw.c" iwm8265fw.fwo optional iwm8265fw | iwmfw \ dependency "iwm8265.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwm8265fw.fwo" dev/iwn/if_iwn.c optional iwn iwn1000fw.c optional iwn1000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn1000.fw:iwn1000fw -miwn1000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn1000fw.c" iwn1000fw.fwo optional iwn1000fw | iwnfw \ dependency "iwn1000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn1000fw.fwo" iwn1000.fw optional iwn1000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-1000-39.31.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn1000.fw" iwn100fw.c optional iwn100fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn100.fw:iwn100fw -miwn100fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn100fw.c" iwn100fw.fwo optional iwn100fw | iwnfw \ dependency "iwn100.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn100fw.fwo" iwn100.fw optional iwn100fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-100-39.31.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn100.fw" iwn105fw.c optional iwn105fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn105.fw:iwn105fw -miwn105fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn105fw.c" iwn105fw.fwo optional iwn105fw | iwnfw \ dependency "iwn105.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn105fw.fwo" iwn105.fw optional iwn105fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-105-6-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn105.fw" iwn135fw.c optional iwn135fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn135.fw:iwn135fw -miwn135fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn135fw.c" iwn135fw.fwo optional iwn135fw | iwnfw \ dependency "iwn135.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn135fw.fwo" iwn135.fw optional iwn135fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-135-6-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn135.fw" iwn2000fw.c optional iwn2000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn2000.fw:iwn2000fw -miwn2000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn2000fw.c" iwn2000fw.fwo optional iwn2000fw | iwnfw \ dependency "iwn2000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn2000fw.fwo" iwn2000.fw optional iwn2000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-2000-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn2000.fw" iwn2030fw.c optional iwn2030fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn2030.fw:iwn2030fw -miwn2030fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn2030fw.c" iwn2030fw.fwo optional iwn2030fw | iwnfw \ dependency "iwn2030.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn2030fw.fwo" iwn2030.fw optional iwn2030fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwnwifi-2030-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn2030.fw" iwn4965fw.c optional iwn4965fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn4965.fw:iwn4965fw -miwn4965fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn4965fw.c" iwn4965fw.fwo optional iwn4965fw | iwnfw \ dependency "iwn4965.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn4965fw.fwo" iwn4965.fw optional iwn4965fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-4965-228.61.2.24.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn4965.fw" iwn5000fw.c optional iwn5000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn5000.fw:iwn5000fw -miwn5000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn5000fw.c" iwn5000fw.fwo optional iwn5000fw | iwnfw \ dependency "iwn5000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn5000fw.fwo" iwn5000.fw optional iwn5000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-5000-8.83.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn5000.fw" iwn5150fw.c optional iwn5150fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn5150.fw:iwn5150fw -miwn5150fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn5150fw.c" iwn5150fw.fwo optional iwn5150fw | iwnfw \ dependency "iwn5150.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn5150fw.fwo" iwn5150.fw optional iwn5150fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-5150-8.24.2.2.fw.uu"\ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn5150.fw" iwn6000fw.c optional iwn6000fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6000.fw:iwn6000fw -miwn6000fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6000fw.c" iwn6000fw.fwo optional iwn6000fw | iwnfw \ dependency "iwn6000.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6000fw.fwo" iwn6000.fw optional iwn6000fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6000-9.221.4.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6000.fw" iwn6000g2afw.c optional iwn6000g2afw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6000g2a.fw:iwn6000g2afw -miwn6000g2afw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6000g2afw.c" iwn6000g2afw.fwo optional iwn6000g2afw | iwnfw \ dependency "iwn6000g2a.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6000g2afw.fwo" iwn6000g2a.fw optional iwn6000g2afw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6000g2a-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6000g2a.fw" iwn6000g2bfw.c optional iwn6000g2bfw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6000g2b.fw:iwn6000g2bfw -miwn6000g2bfw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6000g2bfw.c" iwn6000g2bfw.fwo optional iwn6000g2bfw | iwnfw \ dependency "iwn6000g2b.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6000g2bfw.fwo" iwn6000g2b.fw optional iwn6000g2bfw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6000g2b-18.168.6.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6000g2b.fw" iwn6050fw.c optional iwn6050fw | iwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk iwn6050.fw:iwn6050fw -miwn6050fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "iwn6050fw.c" iwn6050fw.fwo optional iwn6050fw | iwnfw \ dependency "iwn6050.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "iwn6050fw.fwo" iwn6050.fw optional iwn6050fw | iwnfw \ dependency "$S/contrib/dev/iwn/iwlwifi-6050-41.28.5.1.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "iwn6050.fw" dev/ixgbe/if_ix.c optional ix inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe -DSMP" dev/ixgbe/if_ixv.c optional ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe -DSMP" dev/ixgbe/if_bypass.c optional ix inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/if_fdir.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/if_sriov.c optional ix inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ix_txrx.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_osdep.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_phy.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_api.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_common.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_mbx.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_vf.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_82598.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_82599.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_x540.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_x550.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_dcb.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_dcb_82598.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/ixgbe/ixgbe_dcb_82599.c optional ix inet | ixv inet \ compile-with "${NORMAL_C} -I$S/dev/ixgbe" dev/jedec_dimm/jedec_dimm.c optional jedec_dimm smbus dev/jme/if_jme.c optional jme pci dev/kbd/kbd.c optional atkbd | pckbd | sc | ukbd | vt dev/kbdmux/kbdmux.c optional kbdmux dev/ksyms/ksyms.c optional ksyms dev/le/am7990.c optional le dev/le/am79900.c optional le dev/le/if_le_pci.c optional le pci dev/le/lance.c optional le dev/led/led.c standard dev/lge/if_lge.c optional lge dev/liquidio/base/cn23xx_pf_device.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_console.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_ctrl.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_device.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_droq.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_mem_ops.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_request_manager.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/base/lio_response_manager.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/lio_core.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/lio_ioctl.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/lio_main.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/lio_rss.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/lio_rxtx.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" dev/liquidio/lio_sysctl.c optional lio \ compile-with "${NORMAL_C} \ -I$S/dev/liquidio -I$S/dev/liquidio/base -DSMP" lio.c optional lio \ compile-with "${AWK} -f $S/tools/fw_stub.awk lio_23xx_nic.bin.fw:lio_23xx_nic.bin -mlio_23xx_nic.bin -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "lio.c" lio_23xx_nic.bin.fw.fwo optional lio \ dependency "lio_23xx_nic.bin.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "lio_23xx_nic.bin.fw.fwo" lio_23xx_nic.bin.fw optional lio \ dependency "$S/contrib/dev/liquidio/lio_23xx_nic.bin.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "lio_23xx_nic.bin.fw" dev/malo/if_malo.c optional malo dev/malo/if_malohal.c optional malo dev/malo/if_malo_pci.c optional malo pci dev/mc146818/mc146818.c optional mc146818 dev/md/md.c optional md dev/mdio/mdio_if.m optional miiproxy | mdio dev/mdio/mdio.c optional miiproxy | mdio dev/mem/memdev.c optional mem dev/mem/memutil.c optional mem dev/mfi/mfi.c optional mfi dev/mfi/mfi_debug.c optional mfi dev/mfi/mfi_pci.c optional mfi pci dev/mfi/mfi_disk.c optional mfi dev/mfi/mfi_syspd.c optional mfi dev/mfi/mfi_tbolt.c optional mfi dev/mfi/mfi_linux.c optional mfi compat_linux dev/mfi/mfi_cam.c optional mfip scbus dev/mii/acphy.c optional miibus | acphy dev/mii/amphy.c optional miibus | amphy dev/mii/atphy.c optional miibus | atphy dev/mii/axphy.c optional miibus | axphy dev/mii/bmtphy.c optional miibus | bmtphy dev/mii/brgphy.c optional miibus | brgphy dev/mii/ciphy.c optional miibus | ciphy dev/mii/e1000phy.c optional miibus | e1000phy dev/mii/gentbi.c optional miibus | gentbi dev/mii/icsphy.c optional miibus | icsphy dev/mii/ip1000phy.c optional miibus | ip1000phy dev/mii/jmphy.c optional miibus | jmphy dev/mii/lxtphy.c optional miibus | lxtphy dev/mii/micphy.c optional miibus fdt | micphy fdt dev/mii/mii.c optional miibus | mii dev/mii/mii_bitbang.c optional miibus | mii_bitbang dev/mii/mii_physubr.c optional miibus | mii dev/mii/mii_fdt.c optional miibus fdt | mii fdt dev/mii/miibus_if.m optional miibus | mii dev/mii/mlphy.c optional miibus | mlphy dev/mii/nsgphy.c optional miibus | nsgphy dev/mii/nsphy.c optional miibus | nsphy dev/mii/nsphyter.c optional miibus | nsphyter dev/mii/pnaphy.c optional miibus | pnaphy dev/mii/qsphy.c optional miibus | qsphy dev/mii/rdcphy.c optional miibus | rdcphy dev/mii/rgephy.c optional miibus | rgephy dev/mii/rlphy.c optional miibus | rlphy dev/mii/rlswitch.c optional rlswitch dev/mii/smcphy.c optional miibus | smcphy dev/mii/smscphy.c optional miibus | smscphy dev/mii/tdkphy.c optional miibus | tdkphy dev/mii/tlphy.c optional miibus | tlphy dev/mii/truephy.c optional miibus | truephy dev/mii/ukphy.c optional miibus | mii dev/mii/ukphy_subr.c optional miibus | mii dev/mii/vscphy.c optional miibus | vscphy dev/mii/xmphy.c optional miibus | xmphy dev/mk48txx/mk48txx.c optional mk48txx dev/mlxfw/mlxfw_fsm.c optional mlxfw \ compile-with "${MLXFW_C}" dev/mlxfw/mlxfw_mfa2.c optional mlxfw \ compile-with "${MLXFW_C}" dev/mlxfw/mlxfw_mfa2_tlv_multi.c optional mlxfw \ compile-with "${MLXFW_C}" dev/mlx/mlx.c optional mlx dev/mlx/mlx_disk.c optional mlx dev/mlx/mlx_pci.c optional mlx pci dev/mly/mly.c optional mly dev/mmc/mmc_subr.c optional mmc | mmcsd !mmccam dev/mmc/mmc.c optional mmc !mmccam dev/mmc/mmcbr_if.m standard dev/mmc/mmcbus_if.m standard dev/mmc/mmcsd.c optional mmcsd !mmccam dev/mmcnull/mmcnull.c optional mmcnull dev/mn/if_mn.c optional mn pci dev/mpr/mpr.c optional mpr dev/mpr/mpr_config.c optional mpr # XXX Work around clang warning, until maintainer approves fix. dev/mpr/mpr_mapping.c optional mpr \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/mpr/mpr_pci.c optional mpr pci dev/mpr/mpr_sas.c optional mpr \ compile-with "${NORMAL_C} ${NO_WUNNEEDED_INTERNAL_DECL}" dev/mpr/mpr_sas_lsi.c optional mpr dev/mpr/mpr_table.c optional mpr dev/mpr/mpr_user.c optional mpr dev/mps/mps.c optional mps dev/mps/mps_config.c optional mps # XXX Work around clang warning, until maintainer approves fix. dev/mps/mps_mapping.c optional mps \ compile-with "${NORMAL_C} ${NO_WSOMETIMES_UNINITIALIZED}" dev/mps/mps_pci.c optional mps pci dev/mps/mps_sas.c optional mps \ compile-with "${NORMAL_C} ${NO_WUNNEEDED_INTERNAL_DECL}" dev/mps/mps_sas_lsi.c optional mps dev/mps/mps_table.c optional mps dev/mps/mps_user.c optional mps dev/mpt/mpt.c optional mpt dev/mpt/mpt_cam.c optional mpt dev/mpt/mpt_debug.c optional mpt dev/mpt/mpt_pci.c optional mpt pci dev/mpt/mpt_raid.c optional mpt dev/mpt/mpt_user.c optional mpt dev/mrsas/mrsas.c optional mrsas dev/mrsas/mrsas_cam.c optional mrsas dev/mrsas/mrsas_ioctl.c optional mrsas dev/mrsas/mrsas_fp.c optional mrsas dev/msk/if_msk.c optional msk dev/mvs/mvs.c optional mvs dev/mvs/mvs_if.m optional mvs dev/mvs/mvs_pci.c optional mvs pci dev/mwl/if_mwl.c optional mwl dev/mwl/if_mwl_pci.c optional mwl pci dev/mwl/mwlhal.c optional mwl mwlfw.c optional mwlfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk mw88W8363.fw:mw88W8363fw mwlboot.fw:mwlboot -mmwl -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "mwlfw.c" mw88W8363.fwo optional mwlfw \ dependency "mw88W8363.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "mw88W8363.fwo" mw88W8363.fw optional mwlfw \ dependency "$S/contrib/dev/mwl/mw88W8363.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "mw88W8363.fw" mwlboot.fwo optional mwlfw \ dependency "mwlboot.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "mwlboot.fwo" mwlboot.fw optional mwlfw \ dependency "$S/contrib/dev/mwl/mwlboot.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "mwlboot.fw" dev/mxge/if_mxge.c optional mxge pci dev/mxge/mxge_eth_z8e.c optional mxge pci dev/mxge/mxge_ethp_z8e.c optional mxge pci dev/mxge/mxge_rss_eth_z8e.c optional mxge pci dev/mxge/mxge_rss_ethp_z8e.c optional mxge pci dev/my/if_my.c optional my dev/netmap/if_ptnet.c optional netmap inet dev/netmap/netmap.c optional netmap dev/netmap/netmap_bdg.c optional netmap dev/netmap/netmap_freebsd.c optional netmap dev/netmap/netmap_generic.c optional netmap dev/netmap/netmap_kloop.c optional netmap dev/netmap/netmap_legacy.c optional netmap dev/netmap/netmap_mbq.c optional netmap dev/netmap/netmap_mem2.c optional netmap dev/netmap/netmap_monitor.c optional netmap dev/netmap/netmap_null.c optional netmap dev/netmap/netmap_offloadings.c optional netmap dev/netmap/netmap_pipe.c optional netmap dev/netmap/netmap_vale.c optional netmap # compile-with "${NORMAL_C} -Wconversion -Wextra" dev/nfsmb/nfsmb.c optional nfsmb pci dev/nge/if_nge.c optional nge dev/nmdm/nmdm.c optional nmdm dev/null/null.c standard dev/nvd/nvd.c optional nvd nvme dev/nvme/nvme.c optional nvme dev/nvme/nvme_ahci.c optional nvme ahci dev/nvme/nvme_ctrlr.c optional nvme dev/nvme/nvme_ctrlr_cmd.c optional nvme dev/nvme/nvme_ns.c optional nvme dev/nvme/nvme_ns_cmd.c optional nvme dev/nvme/nvme_pci.c optional nvme pci dev/nvme/nvme_qpair.c optional nvme dev/nvme/nvme_sim.c optional nvme scbus dev/nvme/nvme_sysctl.c optional nvme dev/nvme/nvme_test.c optional nvme dev/nvme/nvme_util.c optional nvme dev/oce/oce_hw.c optional oce pci dev/oce/oce_if.c optional oce pci dev/oce/oce_mbox.c optional oce pci dev/oce/oce_queue.c optional oce pci dev/oce/oce_sysctl.c optional oce pci dev/oce/oce_util.c optional oce pci dev/ocs_fc/ocs_pci.c optional ocs_fc pci dev/ocs_fc/ocs_ioctl.c optional ocs_fc pci dev/ocs_fc/ocs_os.c optional ocs_fc pci dev/ocs_fc/ocs_utils.c optional ocs_fc pci dev/ocs_fc/ocs_hw.c optional ocs_fc pci dev/ocs_fc/ocs_hw_queues.c optional ocs_fc pci dev/ocs_fc/sli4.c optional ocs_fc pci dev/ocs_fc/ocs_sm.c optional ocs_fc pci dev/ocs_fc/ocs_device.c optional ocs_fc pci dev/ocs_fc/ocs_xport.c optional ocs_fc pci dev/ocs_fc/ocs_domain.c optional ocs_fc pci dev/ocs_fc/ocs_sport.c optional ocs_fc pci dev/ocs_fc/ocs_els.c optional ocs_fc pci dev/ocs_fc/ocs_fabric.c optional ocs_fc pci dev/ocs_fc/ocs_io.c optional ocs_fc pci dev/ocs_fc/ocs_node.c optional ocs_fc pci dev/ocs_fc/ocs_scsi.c optional ocs_fc pci dev/ocs_fc/ocs_unsol.c optional ocs_fc pci dev/ocs_fc/ocs_ddump.c optional ocs_fc pci dev/ocs_fc/ocs_mgmt.c optional ocs_fc pci dev/ocs_fc/ocs_cam.c optional ocs_fc pci dev/ofw/ofw_bus_if.m optional fdt dev/ofw/ofw_bus_subr.c optional fdt dev/ofw/ofw_cpu.c optional fdt dev/ofw/ofw_fdt.c optional fdt dev/ofw/ofw_if.m optional fdt dev/ofw/ofw_graph.c optional fdt dev/ofw/ofw_subr.c optional fdt dev/ofw/ofwbus.c optional fdt dev/ofw/openfirm.c optional fdt dev/ofw/openfirmio.c optional fdt dev/ow/ow.c optional ow \ dependency "owll_if.h" \ dependency "own_if.h" dev/ow/owll_if.m optional ow dev/ow/own_if.m optional ow dev/ow/ow_temp.c optional ow_temp dev/ow/owc_gpiobus.c optional owc gpio dev/pbio/pbio.c optional pbio isa dev/pccard/card_if.m standard dev/pccard/pccard.c optional pccard dev/pccard/pccard_cis.c optional pccard dev/pccard/pccard_cis_quirks.c optional pccard dev/pccard/pccard_device.c optional pccard dev/pccard/power_if.m standard dev/pccbb/pccbb.c optional cbb dev/pccbb/pccbb_pci.c optional cbb pci dev/pcf/pcf.c optional pcf dev/pci/fixup_pci.c optional pci dev/pci/hostb_pci.c optional pci dev/pci/ignore_pci.c optional pci dev/pci/isa_pci.c optional pci isa dev/pci/pci.c optional pci dev/pci/pci_if.m standard dev/pci/pci_iov.c optional pci pci_iov dev/pci/pci_iov_if.m standard dev/pci/pci_iov_schema.c optional pci pci_iov dev/pci/pci_pci.c optional pci dev/pci/pci_subr.c optional pci dev/pci/pci_user.c optional pci dev/pci/pcib_if.m standard dev/pci/pcib_support.c standard dev/pci/vga_pci.c optional pci dev/pms/freebsd/driver/ini/src/agtiapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sadisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/mpi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saframe.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sahw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sainit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saint.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sampicmd.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sampirsp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saphy.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saport.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sasata.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sasmp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sassp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/satimer.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/sautil.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/saioctlcmd.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sallsdk/spc/mpidebug.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dminit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmsmp.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmdisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmport.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmtimer.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/discovery/dm/dmmisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/sminit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smmisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smsat.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smsatcb.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smsathw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/sat/src/smtimer.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdinit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdmisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdesgl.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdport.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdint.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdioctl.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdhw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/ossacmnapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tddmcmnapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdsmcmnapi.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/common/tdtimers.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itdio.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itdcb.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itdinit.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sas/ini/itddisc.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sata/host/sat.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sata/host/ossasat.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/pms/RefTisa/tisa/sassata/sata/host/sathw.c optional pmspcv \ compile-with "${NORMAL_C} -Wunused-variable -Woverflow -Wparentheses -w" dev/ppbus/if_plip.c optional plip dev/ppbus/lpbb.c optional lpbb dev/ppbus/lpt.c optional lpt dev/ppbus/pcfclock.c optional pcfclock dev/ppbus/ppb_1284.c optional ppbus dev/ppbus/ppb_base.c optional ppbus dev/ppbus/ppb_msq.c optional ppbus dev/ppbus/ppbconf.c optional ppbus dev/ppbus/ppbus_if.m optional ppbus dev/ppbus/ppi.c optional ppi dev/ppbus/pps.c optional pps dev/ppc/ppc.c optional ppc dev/ppc/ppc_acpi.c optional ppc acpi dev/ppc/ppc_isa.c optional ppc isa dev/ppc/ppc_pci.c optional ppc pci dev/ppc/ppc_puc.c optional ppc puc dev/proto/proto_bus_isa.c optional proto acpi | proto isa dev/proto/proto_bus_pci.c optional proto pci dev/proto/proto_busdma.c optional proto dev/proto/proto_core.c optional proto dev/pst/pst-iop.c optional pst dev/pst/pst-pci.c optional pst pci dev/pst/pst-raid.c optional pst dev/pty/pty.c optional pty dev/puc/puc.c optional puc dev/puc/puc_cfg.c optional puc dev/puc/puc_pccard.c optional puc pccard dev/puc/puc_pci.c optional puc pci dev/pwm/pwmc.c optional pwm | pwmc dev/pwm/pwmbus.c optional pwm | pwmbus dev/pwm/pwmbus_if.m optional pwm | pwmbus dev/pwm/ofw_pwm.c optional pwm fdt | pwmbus fdt dev/pwm/ofw_pwmbus.c optional pwm fdt | pwmbus fdt dev/quicc/quicc_core.c optional quicc dev/ral/rt2560.c optional ral dev/ral/rt2661.c optional ral dev/ral/rt2860.c optional ral dev/ral/if_ral_pci.c optional ral pci rt2561fw.c optional rt2561fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2561.fw:rt2561fw -mrt2561 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2561fw.c" rt2561fw.fwo optional rt2561fw | ralfw \ dependency "rt2561.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2561fw.fwo" rt2561.fw optional rt2561fw | ralfw \ dependency "$S/contrib/dev/ral/rt2561.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2561.fw" rt2561sfw.c optional rt2561sfw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2561s.fw:rt2561sfw -mrt2561s -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2561sfw.c" rt2561sfw.fwo optional rt2561sfw | ralfw \ dependency "rt2561s.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2561sfw.fwo" rt2561s.fw optional rt2561sfw | ralfw \ dependency "$S/contrib/dev/ral/rt2561s.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2561s.fw" rt2661fw.c optional rt2661fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2661.fw:rt2661fw -mrt2661 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2661fw.c" rt2661fw.fwo optional rt2661fw | ralfw \ dependency "rt2661.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2661fw.fwo" rt2661.fw optional rt2661fw | ralfw \ dependency "$S/contrib/dev/ral/rt2661.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2661.fw" rt2860fw.c optional rt2860fw | ralfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rt2860.fw:rt2860fw -mrt2860 -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rt2860fw.c" rt2860fw.fwo optional rt2860fw | ralfw \ dependency "rt2860.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rt2860fw.fwo" rt2860.fw optional rt2860fw | ralfw \ dependency "$S/contrib/dev/ral/rt2860.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rt2860.fw" dev/random/random_infra.c standard dev/random/random_harvestq.c standard dev/random/randomdev.c optional !random_loadable dev/random/fortuna.c optional !random_loadable dev/random/hash.c optional !random_loadable dev/rc/rc.c optional rc dev/rccgpio/rccgpio.c optional rccgpio gpio dev/re/if_re.c optional re dev/rl/if_rl.c optional rl pci dev/rndtest/rndtest.c optional rndtest dev/rp/rp.c optional rp dev/rp/rp_isa.c optional rp isa dev/rp/rp_pci.c optional rp pci # dev/rtwn/if_rtwn.c optional rtwn dev/rtwn/if_rtwn_beacon.c optional rtwn dev/rtwn/if_rtwn_calib.c optional rtwn dev/rtwn/if_rtwn_cam.c optional rtwn dev/rtwn/if_rtwn_efuse.c optional rtwn dev/rtwn/if_rtwn_fw.c optional rtwn dev/rtwn/if_rtwn_rx.c optional rtwn dev/rtwn/if_rtwn_task.c optional rtwn dev/rtwn/if_rtwn_tx.c optional rtwn # dev/rtwn/pci/rtwn_pci_attach.c optional rtwn_pci pci dev/rtwn/pci/rtwn_pci_reg.c optional rtwn_pci pci dev/rtwn/pci/rtwn_pci_rx.c optional rtwn_pci pci dev/rtwn/pci/rtwn_pci_tx.c optional rtwn_pci pci # dev/rtwn/usb/rtwn_usb_attach.c optional rtwn_usb dev/rtwn/usb/rtwn_usb_ep.c optional rtwn_usb dev/rtwn/usb/rtwn_usb_reg.c optional rtwn_usb dev/rtwn/usb/rtwn_usb_rx.c optional rtwn_usb dev/rtwn/usb/rtwn_usb_tx.c optional rtwn_usb # RTL8188E dev/rtwn/rtl8188e/r88e_beacon.c optional rtwn dev/rtwn/rtl8188e/r88e_calib.c optional rtwn dev/rtwn/rtl8188e/r88e_chan.c optional rtwn dev/rtwn/rtl8188e/r88e_fw.c optional rtwn dev/rtwn/rtl8188e/r88e_init.c optional rtwn dev/rtwn/rtl8188e/r88e_led.c optional rtwn dev/rtwn/rtl8188e/r88e_tx.c optional rtwn dev/rtwn/rtl8188e/r88e_rf.c optional rtwn dev/rtwn/rtl8188e/r88e_rom.c optional rtwn dev/rtwn/rtl8188e/r88e_rx.c optional rtwn dev/rtwn/rtl8188e/pci/r88ee_attach.c optional rtwn_pci pci dev/rtwn/rtl8188e/pci/r88ee_init.c optional rtwn_pci pci dev/rtwn/rtl8188e/pci/r88ee_rx.c optional rtwn_pci pci dev/rtwn/rtl8188e/usb/r88eu_attach.c optional rtwn_usb dev/rtwn/rtl8188e/usb/r88eu_init.c optional rtwn_usb # RTL8192C dev/rtwn/rtl8192c/r92c_attach.c optional rtwn dev/rtwn/rtl8192c/r92c_beacon.c optional rtwn dev/rtwn/rtl8192c/r92c_calib.c optional rtwn dev/rtwn/rtl8192c/r92c_chan.c optional rtwn dev/rtwn/rtl8192c/r92c_fw.c optional rtwn dev/rtwn/rtl8192c/r92c_init.c optional rtwn dev/rtwn/rtl8192c/r92c_llt.c optional rtwn dev/rtwn/rtl8192c/r92c_rf.c optional rtwn dev/rtwn/rtl8192c/r92c_rom.c optional rtwn dev/rtwn/rtl8192c/r92c_rx.c optional rtwn dev/rtwn/rtl8192c/r92c_tx.c optional rtwn dev/rtwn/rtl8192c/pci/r92ce_attach.c optional rtwn_pci pci dev/rtwn/rtl8192c/pci/r92ce_calib.c optional rtwn_pci pci dev/rtwn/rtl8192c/pci/r92ce_fw.c optional rtwn_pci pci dev/rtwn/rtl8192c/pci/r92ce_init.c optional rtwn_pci pci dev/rtwn/rtl8192c/pci/r92ce_led.c optional rtwn_pci pci dev/rtwn/rtl8192c/pci/r92ce_rx.c optional rtwn_pci pci dev/rtwn/rtl8192c/pci/r92ce_tx.c optional rtwn_pci pci dev/rtwn/rtl8192c/usb/r92cu_attach.c optional rtwn_usb dev/rtwn/rtl8192c/usb/r92cu_init.c optional rtwn_usb dev/rtwn/rtl8192c/usb/r92cu_led.c optional rtwn_usb dev/rtwn/rtl8192c/usb/r92cu_rx.c optional rtwn_usb dev/rtwn/rtl8192c/usb/r92cu_tx.c optional rtwn_usb # RTL8192E dev/rtwn/rtl8192e/r92e_chan.c optional rtwn dev/rtwn/rtl8192e/r92e_fw.c optional rtwn dev/rtwn/rtl8192e/r92e_init.c optional rtwn dev/rtwn/rtl8192e/r92e_led.c optional rtwn dev/rtwn/rtl8192e/r92e_rf.c optional rtwn dev/rtwn/rtl8192e/r92e_rom.c optional rtwn dev/rtwn/rtl8192e/r92e_rx.c optional rtwn dev/rtwn/rtl8192e/usb/r92eu_attach.c optional rtwn_usb dev/rtwn/rtl8192e/usb/r92eu_init.c optional rtwn_usb # RTL8812A dev/rtwn/rtl8812a/r12a_beacon.c optional rtwn dev/rtwn/rtl8812a/r12a_calib.c optional rtwn dev/rtwn/rtl8812a/r12a_caps.c optional rtwn dev/rtwn/rtl8812a/r12a_chan.c optional rtwn dev/rtwn/rtl8812a/r12a_fw.c optional rtwn dev/rtwn/rtl8812a/r12a_init.c optional rtwn dev/rtwn/rtl8812a/r12a_led.c optional rtwn dev/rtwn/rtl8812a/r12a_rf.c optional rtwn dev/rtwn/rtl8812a/r12a_rom.c optional rtwn dev/rtwn/rtl8812a/r12a_rx.c optional rtwn dev/rtwn/rtl8812a/r12a_tx.c optional rtwn dev/rtwn/rtl8812a/usb/r12au_attach.c optional rtwn_usb dev/rtwn/rtl8812a/usb/r12au_init.c optional rtwn_usb dev/rtwn/rtl8812a/usb/r12au_rx.c optional rtwn_usb dev/rtwn/rtl8812a/usb/r12au_tx.c optional rtwn_usb # RTL8821A dev/rtwn/rtl8821a/r21a_beacon.c optional rtwn dev/rtwn/rtl8821a/r21a_calib.c optional rtwn dev/rtwn/rtl8821a/r21a_chan.c optional rtwn dev/rtwn/rtl8821a/r21a_fw.c optional rtwn dev/rtwn/rtl8821a/r21a_init.c optional rtwn dev/rtwn/rtl8821a/r21a_led.c optional rtwn dev/rtwn/rtl8821a/r21a_rom.c optional rtwn dev/rtwn/rtl8821a/r21a_rx.c optional rtwn dev/rtwn/rtl8821a/usb/r21au_attach.c optional rtwn_usb dev/rtwn/rtl8821a/usb/r21au_dfs.c optional rtwn_usb dev/rtwn/rtl8821a/usb/r21au_init.c optional rtwn_usb rtwn-rtl8188eefw.c optional rtwn-rtl8188eefw | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8188eefw.fw:rtwn-rtl8188eefw:111 -mrtwn-rtl8188eefw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8188eefw.c" rtwn-rtl8188eefw.fwo optional rtwn-rtl8188eefw | rtwnfw \ dependency "rtwn-rtl8188eefw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8188eefw.fwo" rtwn-rtl8188eefw.fw optional rtwn-rtl8188eefw | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8188eefw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8188eefw.fw" rtwn-rtl8188eufw.c optional rtwn-rtl8188eufw | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8188eufw.fw:rtwn-rtl8188eufw:111 -mrtwn-rtl8188eufw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8188eufw.c" rtwn-rtl8188eufw.fwo optional rtwn-rtl8188eufw | rtwnfw \ dependency "rtwn-rtl8188eufw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8188eufw.fwo" rtwn-rtl8188eufw.fw optional rtwn-rtl8188eufw | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8188eufw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8188eufw.fw" rtwn-rtl8192cfwE.c optional rtwn-rtl8192cfwE | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192cfwE.fw:rtwn-rtl8192cfwE:111 -mrtwn-rtl8192cfwE -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192cfwE.c" rtwn-rtl8192cfwE.fwo optional rtwn-rtl8192cfwE | rtwnfw \ dependency "rtwn-rtl8192cfwE.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192cfwE.fwo" rtwn-rtl8192cfwE.fw optional rtwn-rtl8192cfwE | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192cfwE.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192cfwE.fw" rtwn-rtl8192cfwE_B.c optional rtwn-rtl8192cfwE_B | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192cfwE_B.fw:rtwn-rtl8192cfwE_B:111 -mrtwn-rtl8192cfwE_B -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192cfwE_B.c" rtwn-rtl8192cfwE_B.fwo optional rtwn-rtl8192cfwE_B | rtwnfw \ dependency "rtwn-rtl8192cfwE_B.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192cfwE_B.fwo" rtwn-rtl8192cfwE_B.fw optional rtwn-rtl8192cfwE_B | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192cfwE_B.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192cfwE_B.fw" rtwn-rtl8192cfwT.c optional rtwn-rtl8192cfwT | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192cfwT.fw:rtwn-rtl8192cfwT:111 -mrtwn-rtl8192cfwT -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192cfwT.c" rtwn-rtl8192cfwT.fwo optional rtwn-rtl8192cfwT | rtwnfw \ dependency "rtwn-rtl8192cfwT.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192cfwT.fwo" rtwn-rtl8192cfwT.fw optional rtwn-rtl8192cfwT | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192cfwT.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192cfwT.fw" rtwn-rtl8192cfwU.c optional rtwn-rtl8192cfwU | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192cfwU.fw:rtwn-rtl8192cfwU:111 -mrtwn-rtl8192cfwU -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192cfwU.c" rtwn-rtl8192cfwU.fwo optional rtwn-rtl8192cfwU | rtwnfw \ dependency "rtwn-rtl8192cfwU.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192cfwU.fwo" rtwn-rtl8192cfwU.fw optional rtwn-rtl8192cfwU | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192cfwU.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192cfwU.fw" rtwn-rtl8192eufw.c optional rtwn-rtl8192eufw | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8192eufw.fw:rtwn-rtl8192eufw:111 -mrtwn-rtl8192eufw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8192eufw.c" rtwn-rtl8192eufw.fwo optional rtwn-rtl8192eufw | rtwnfw \ dependency "rtwn-rtl8192eufw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8192eufw.fwo" rtwn-rtl8192eufw.fw optional rtwn-rtl8192eufw | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8192eufw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8192eufw.fw" rtwn-rtl8812aufw.c optional rtwn-rtl8812aufw | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8812aufw.fw:rtwn-rtl8812aufw:111 -mrtwn-rtl8812aufw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8812aufw.c" rtwn-rtl8812aufw.fwo optional rtwn-rtl8812aufw | rtwnfw \ dependency "rtwn-rtl8812aufw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8812aufw.fwo" rtwn-rtl8812aufw.fw optional rtwn-rtl8812aufw | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8812aufw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8812aufw.fw" rtwn-rtl8821aufw.c optional rtwn-rtl8821aufw | rtwnfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rtwn-rtl8821aufw.fw:rtwn-rtl8821aufw:111 -mrtwn-rtl8821aufw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rtwn-rtl8821aufw.c" rtwn-rtl8821aufw.fwo optional rtwn-rtl8821aufw | rtwnfw \ dependency "rtwn-rtl8821aufw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rtwn-rtl8821aufw.fwo" rtwn-rtl8821aufw.fw optional rtwn-rtl8821aufw | rtwnfw \ dependency "$S/contrib/dev/rtwn/rtwn-rtl8821aufw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rtwn-rtl8821aufw.fw" dev/safe/safe.c optional safe dev/scc/scc_if.m optional scc dev/scc/scc_bfe_quicc.c optional scc quicc dev/scc/scc_core.c optional scc dev/scc/scc_dev_quicc.c optional scc quicc dev/scc/scc_dev_sab82532.c optional scc dev/scc/scc_dev_z8530.c optional scc dev/sdhci/sdhci.c optional sdhci dev/sdhci/sdhci_fdt.c optional sdhci fdt dev/sdhci/sdhci_fdt_gpio.c optional sdhci fdt gpio dev/sdhci/sdhci_if.m optional sdhci dev/sdhci/sdhci_acpi.c optional sdhci acpi dev/sdhci/sdhci_pci.c optional sdhci pci dev/sdio/sdio_if.m optional mmccam dev/sdio/sdio_subr.c optional mmccam dev/sdio/sdiob.c optional mmccam dev/sge/if_sge.c optional sge pci dev/siis/siis.c optional siis pci dev/sis/if_sis.c optional sis pci dev/sk/if_sk.c optional sk pci dev/smbus/smb.c optional smb dev/smbus/smbconf.c optional smbus dev/smbus/smbus.c optional smbus dev/smbus/smbus_if.m optional smbus dev/smc/if_smc.c optional smc dev/smc/if_smc_fdt.c optional smc fdt dev/snp/snp.c optional snp dev/sound/clone.c optional sound dev/sound/unit.c optional sound dev/sound/isa/ad1816.c optional snd_ad1816 isa dev/sound/isa/ess.c optional snd_ess isa dev/sound/isa/gusc.c optional snd_gusc isa dev/sound/isa/mss.c optional snd_mss isa dev/sound/isa/sb16.c optional snd_sb16 isa dev/sound/isa/sb8.c optional snd_sb8 isa dev/sound/isa/sbc.c optional snd_sbc isa dev/sound/isa/sndbuf_dma.c optional sound isa dev/sound/pci/als4000.c optional snd_als4000 pci dev/sound/pci/atiixp.c optional snd_atiixp pci dev/sound/pci/cmi.c optional snd_cmi pci dev/sound/pci/cs4281.c optional snd_cs4281 pci dev/sound/pci/csa.c optional snd_csa pci dev/sound/pci/csapcm.c optional snd_csa pci dev/sound/pci/ds1.c optional snd_ds1 pci dev/sound/pci/emu10k1.c optional snd_emu10k1 pci dev/sound/pci/emu10kx.c optional snd_emu10kx pci dev/sound/pci/emu10kx-pcm.c optional snd_emu10kx pci dev/sound/pci/emu10kx-midi.c optional snd_emu10kx pci dev/sound/pci/envy24.c optional snd_envy24 pci dev/sound/pci/envy24ht.c optional snd_envy24ht pci dev/sound/pci/es137x.c optional snd_es137x pci dev/sound/pci/fm801.c optional snd_fm801 pci dev/sound/pci/ich.c optional snd_ich pci dev/sound/pci/maestro.c optional snd_maestro pci dev/sound/pci/maestro3.c optional snd_maestro3 pci dev/sound/pci/neomagic.c optional snd_neomagic pci dev/sound/pci/solo.c optional snd_solo pci dev/sound/pci/spicds.c optional snd_spicds pci dev/sound/pci/t4dwave.c optional snd_t4dwave pci dev/sound/pci/via8233.c optional snd_via8233 pci dev/sound/pci/via82c686.c optional snd_via82c686 pci dev/sound/pci/vibes.c optional snd_vibes pci dev/sound/pci/hda/hdaa.c optional snd_hda pci dev/sound/pci/hda/hdaa_patches.c optional snd_hda pci dev/sound/pci/hda/hdac.c optional snd_hda pci dev/sound/pci/hda/hdac_if.m optional snd_hda pci dev/sound/pci/hda/hdacc.c optional snd_hda pci dev/sound/pci/hdspe.c optional snd_hdspe pci dev/sound/pci/hdspe-pcm.c optional snd_hdspe pci dev/sound/pcm/ac97.c optional sound dev/sound/pcm/ac97_if.m optional sound dev/sound/pcm/ac97_patch.c optional sound dev/sound/pcm/buffer.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/channel.c optional sound dev/sound/pcm/channel_if.m optional sound dev/sound/pcm/dsp.c optional sound dev/sound/pcm/feeder.c optional sound dev/sound/pcm/feeder_chain.c optional sound dev/sound/pcm/feeder_eq.c optional sound \ dependency "feeder_eq_gen.h" \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_if.m optional sound dev/sound/pcm/feeder_format.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_matrix.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_mixer.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_rate.c optional sound \ dependency "feeder_rate_gen.h" \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/feeder_volume.c optional sound \ dependency "snd_fxdiv_gen.h" dev/sound/pcm/mixer.c optional sound dev/sound/pcm/mixer_if.m optional sound dev/sound/pcm/sndstat.c optional sound dev/sound/pcm/sound.c optional sound dev/sound/pcm/vchan.c optional sound dev/sound/usb/uaudio.c optional snd_uaudio usb dev/sound/usb/uaudio_pcm.c optional snd_uaudio usb dev/sound/midi/midi.c optional sound dev/sound/midi/mpu401.c optional sound dev/sound/midi/mpu_if.m optional sound dev/sound/midi/mpufoi_if.m optional sound dev/sound/midi/sequencer.c optional sound dev/sound/midi/synth_if.m optional sound dev/spibus/ofw_spibus.c optional fdt spibus dev/spibus/spibus.c optional spibus \ dependency "spibus_if.h" dev/spibus/spigen.c optional spigen dev/spibus/spibus_if.m optional spibus dev/ste/if_ste.c optional ste pci dev/stge/if_stge.c optional stge dev/sym/sym_hipd.c optional sym \ dependency "$S/dev/sym/sym_{conf,defs}.h" dev/syscons/blank/blank_saver.c optional blank_saver dev/syscons/daemon/daemon_saver.c optional daemon_saver dev/syscons/dragon/dragon_saver.c optional dragon_saver dev/syscons/fade/fade_saver.c optional fade_saver dev/syscons/fire/fire_saver.c optional fire_saver dev/syscons/green/green_saver.c optional green_saver dev/syscons/logo/logo.c optional logo_saver dev/syscons/logo/logo_saver.c optional logo_saver dev/syscons/rain/rain_saver.c optional rain_saver dev/syscons/schistory.c optional sc dev/syscons/scmouse.c optional sc dev/syscons/scterm.c optional sc dev/syscons/scterm-dumb.c optional sc !SC_NO_TERM_DUMB dev/syscons/scterm-sc.c optional sc !SC_NO_TERM_SC dev/syscons/scterm-teken.c optional sc !SC_NO_TERM_TEKEN dev/syscons/scvidctl.c optional sc dev/syscons/scvtb.c optional sc dev/syscons/snake/snake_saver.c optional snake_saver dev/syscons/star/star_saver.c optional star_saver dev/syscons/syscons.c optional sc dev/syscons/sysmouse.c optional sc dev/syscons/warp/warp_saver.c optional warp_saver dev/tcp_log/tcp_log_dev.c optional tcp_blackbox inet | tcp_blackbox inet6 dev/tdfx/tdfx_linux.c optional tdfx_linux tdfx compat_linux dev/tdfx/tdfx_pci.c optional tdfx pci dev/ti/if_ti.c optional ti pci dev/twa/tw_cl_init.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_intr.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_io.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_cl_misc.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_osl_cam.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twa/tw_osl_freebsd.c optional twa \ compile-with "${NORMAL_C} -I$S/dev/twa" dev/twe/twe.c optional twe dev/twe/twe_freebsd.c optional twe dev/tws/tws.c optional tws dev/tws/tws_cam.c optional tws dev/tws/tws_hdm.c optional tws dev/tws/tws_services.c optional tws dev/tws/tws_user.c optional tws dev/uart/uart_bus_acpi.c optional uart acpi dev/uart/uart_bus_fdt.c optional uart fdt dev/uart/uart_bus_isa.c optional uart isa dev/uart/uart_bus_pccard.c optional uart pccard dev/uart/uart_bus_pci.c optional uart pci dev/uart/uart_bus_puc.c optional uart puc dev/uart/uart_bus_scc.c optional uart scc dev/uart/uart_core.c optional uart dev/uart/uart_cpu_acpi.c optional uart acpi dev/uart/uart_dbg.c optional uart gdb dev/uart/uart_dev_msm.c optional uart uart_msm fdt dev/uart/uart_dev_mvebu.c optional uart uart_mvebu dev/uart/uart_dev_ns8250.c optional uart uart_ns8250 | uart uart_snps dev/uart/uart_dev_pl011.c optional uart pl011 dev/uart/uart_dev_quicc.c optional uart quicc dev/uart/uart_dev_sab82532.c optional uart uart_sab82532 | uart scc dev/uart/uart_dev_snps.c optional uart uart_snps fdt dev/uart/uart_dev_z8530.c optional uart uart_z8530 | uart scc dev/uart/uart_if.m optional uart dev/uart/uart_subr.c optional uart dev/uart/uart_tty.c optional uart dev/ubsec/ubsec.c optional ubsec # # USB controller drivers # dev/usb/controller/musb_otg.c optional musb dev/usb/controller/dwc_otg.c optional dwcotg dev/usb/controller/dwc_otg_fdt.c optional dwcotg fdt dev/usb/controller/ehci.c optional ehci dev/usb/controller/ehci_msm.c optional ehci_msm fdt dev/usb/controller/ehci_pci.c optional ehci pci dev/usb/controller/ohci.c optional ohci dev/usb/controller/ohci_pci.c optional ohci pci dev/usb/controller/uhci.c optional uhci dev/usb/controller/uhci_pci.c optional uhci pci dev/usb/controller/xhci.c optional xhci dev/usb/controller/xhci_pci.c optional xhci pci dev/usb/controller/saf1761_otg.c optional saf1761otg dev/usb/controller/saf1761_otg_fdt.c optional saf1761otg fdt dev/usb/controller/uss820dci.c optional uss820dci dev/usb/controller/usb_controller.c optional usb # # USB storage drivers # dev/usb/storage/cfumass.c optional cfumass ctl dev/usb/storage/umass.c optional umass dev/usb/storage/urio.c optional urio dev/usb/storage/ustorage_fs.c optional usfs # # USB core # dev/usb/usb_busdma.c optional usb dev/usb/usb_core.c optional usb dev/usb/usb_debug.c optional usb dev/usb/usb_dev.c optional usb dev/usb/usb_device.c optional usb dev/usb/usb_dynamic.c optional usb dev/usb/usb_error.c optional usb dev/usb/usb_fdt_support.c optional usb fdt dev/usb/usb_generic.c optional usb dev/usb/usb_handle_request.c optional usb dev/usb/usb_hid.c optional usb dev/usb/usb_hub.c optional usb dev/usb/usb_hub_acpi.c optional uacpi acpi dev/usb/usb_if.m optional usb dev/usb/usb_lookup.c optional usb dev/usb/usb_mbuf.c optional usb dev/usb/usb_msctest.c optional usb dev/usb/usb_parse.c optional usb dev/usb/usb_pf.c optional usb dev/usb/usb_process.c optional usb dev/usb/usb_request.c optional usb dev/usb/usb_transfer.c optional usb dev/usb/usb_util.c optional usb # # USB network drivers # dev/usb/net/if_aue.c optional aue dev/usb/net/if_axe.c optional axe dev/usb/net/if_axge.c optional axge dev/usb/net/if_cdce.c optional cdce dev/usb/net/if_cdceem.c optional cdceem dev/usb/net/if_cue.c optional cue dev/usb/net/if_ipheth.c optional ipheth dev/usb/net/if_kue.c optional kue dev/usb/net/if_mos.c optional mos dev/usb/net/if_muge.c optional muge dev/usb/net/if_rue.c optional rue dev/usb/net/if_smsc.c optional smsc dev/usb/net/if_udav.c optional udav dev/usb/net/if_ure.c optional ure dev/usb/net/if_usie.c optional usie dev/usb/net/if_urndis.c optional urndis dev/usb/net/ruephy.c optional rue dev/usb/net/usb_ethernet.c optional uether | aue | axe | axge | cdce | \ cdceem | cue | ipheth | kue | mos | \ rue | smsc | udav | ure | urndis | muge dev/usb/net/uhso.c optional uhso # # USB WLAN drivers # dev/usb/wlan/if_rsu.c optional rsu rsu-rtl8712fw.c optional rsu-rtl8712fw | rsufw \ compile-with "${AWK} -f $S/tools/fw_stub.awk rsu-rtl8712fw.fw:rsu-rtl8712fw:120 -mrsu-rtl8712fw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "rsu-rtl8712fw.c" rsu-rtl8712fw.fwo optional rsu-rtl8712fw | rsufw \ dependency "rsu-rtl8712fw.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "rsu-rtl8712fw.fwo" rsu-rtl8712fw.fw optional rsu-rtl8712.fw | rsufw \ dependency "$S/contrib/dev/rsu/rsu-rtl8712fw.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "rsu-rtl8712fw.fw" dev/usb/wlan/if_rum.c optional rum dev/usb/wlan/if_run.c optional run runfw.c optional runfw \ compile-with "${AWK} -f $S/tools/fw_stub.awk run.fw:runfw -mrunfw -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "runfw.c" runfw.fwo optional runfw \ dependency "run.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "runfw.fwo" run.fw optional runfw \ dependency "$S/contrib/dev/run/rt2870.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "run.fw" dev/usb/wlan/if_uath.c optional uath dev/usb/wlan/if_upgt.c optional upgt dev/usb/wlan/if_ural.c optional ural dev/usb/wlan/if_urtw.c optional urtw dev/usb/wlan/if_zyd.c optional zyd # # USB serial and parallel port drivers # dev/usb/serial/u3g.c optional u3g dev/usb/serial/uark.c optional uark dev/usb/serial/ubsa.c optional ubsa dev/usb/serial/ubser.c optional ubser dev/usb/serial/uchcom.c optional uchcom dev/usb/serial/ucycom.c optional ucycom dev/usb/serial/ufoma.c optional ufoma dev/usb/serial/uftdi.c optional uftdi dev/usb/serial/ugensa.c optional ugensa dev/usb/serial/uipaq.c optional uipaq dev/usb/serial/ulpt.c optional ulpt dev/usb/serial/umcs.c optional umcs dev/usb/serial/umct.c optional umct dev/usb/serial/umodem.c optional umodem dev/usb/serial/umoscom.c optional umoscom dev/usb/serial/uplcom.c optional uplcom dev/usb/serial/uslcom.c optional uslcom dev/usb/serial/uvisor.c optional uvisor dev/usb/serial/uvscom.c optional uvscom dev/usb/serial/usb_serial.c optional ucom | u3g | uark | ubsa | ubser | \ uchcom | ucycom | ufoma | uftdi | \ ugensa | uipaq | umcs | umct | \ umodem | umoscom | uplcom | usie | \ uslcom | uvisor | uvscom # # USB misc drivers # dev/usb/misc/ufm.c optional ufm dev/usb/misc/udbp.c optional udbp dev/usb/misc/ugold.c optional ugold dev/usb/misc/uled.c optional uled # # USB input drivers # dev/usb/input/atp.c optional atp dev/usb/input/uep.c optional uep dev/usb/input/uhid.c optional uhid dev/usb/input/uhid_snes.c optional uhid_snes dev/usb/input/ukbd.c optional ukbd dev/usb/input/ums.c optional ums dev/usb/input/wmt.c optional wmt dev/usb/input/wsp.c optional wsp # # USB quirks # dev/usb/quirk/usb_quirk.c optional usb # # USB templates # dev/usb/template/usb_template.c optional usb_template dev/usb/template/usb_template_audio.c optional usb_template dev/usb/template/usb_template_cdce.c optional usb_template dev/usb/template/usb_template_kbd.c optional usb_template dev/usb/template/usb_template_modem.c optional usb_template dev/usb/template/usb_template_mouse.c optional usb_template dev/usb/template/usb_template_msc.c optional usb_template dev/usb/template/usb_template_mtp.c optional usb_template dev/usb/template/usb_template_phone.c optional usb_template dev/usb/template/usb_template_serialnet.c optional usb_template dev/usb/template/usb_template_midi.c optional usb_template dev/usb/template/usb_template_multi.c optional usb_template dev/usb/template/usb_template_cdceem.c optional usb_template # # USB video drivers # dev/usb/video/udl.c optional udl # # USB END # dev/videomode/videomode.c optional videomode dev/videomode/edid.c optional videomode dev/videomode/pickmode.c optional videomode dev/videomode/vesagtf.c optional videomode dev/veriexec/verified_exec.c optional veriexec mac_veriexec dev/vge/if_vge.c optional vge dev/viapm/viapm.c optional viapm pci dev/virtio/virtio.c optional virtio dev/virtio/virtqueue.c optional virtio dev/virtio/virtio_bus_if.m optional virtio dev/virtio/virtio_if.m optional virtio dev/virtio/pci/virtio_pci.c optional virtio_pci dev/virtio/mmio/virtio_mmio.c optional virtio_mmio dev/virtio/mmio/virtio_mmio_acpi.c optional virtio_mmio acpi dev/virtio/mmio/virtio_mmio_fdt.c optional virtio_mmio fdt dev/virtio/mmio/virtio_mmio_if.m optional virtio_mmio dev/virtio/network/if_vtnet.c optional vtnet dev/virtio/block/virtio_blk.c optional virtio_blk dev/virtio/balloon/virtio_balloon.c optional virtio_balloon dev/virtio/scsi/virtio_scsi.c optional virtio_scsi dev/virtio/random/virtio_random.c optional virtio_random dev/virtio/console/virtio_console.c optional virtio_console dev/vkbd/vkbd.c optional vkbd dev/vmgenc/vmgenc_acpi.c optional acpi dev/vr/if_vr.c optional vr pci dev/vt/colors/vt_termcolors.c optional vt dev/vt/font/vt_font_default.c optional vt dev/vt/font/vt_mouse_cursor.c optional vt dev/vt/hw/efifb/efifb.c optional vt_efifb dev/vt/hw/fb/vt_fb.c optional vt dev/vt/hw/vga/vt_vga.c optional vt vt_vga dev/vt/logo/logo_freebsd.c optional vt splash dev/vt/logo/logo_beastie.c optional vt splash dev/vt/vt_buf.c optional vt dev/vt/vt_consolectl.c optional vt dev/vt/vt_core.c optional vt dev/vt/vt_cpulogos.c optional vt splash dev/vt/vt_font.c optional vt dev/vt/vt_sysmouse.c optional vt dev/vte/if_vte.c optional vte pci dev/watchdog/watchdog.c standard dev/wi/if_wi.c optional wi dev/wi/if_wi_pccard.c optional wi pccard dev/wi/if_wi_pci.c optional wi pci dev/wpi/if_wpi.c optional wpi pci wpifw.c optional wpifw \ compile-with "${AWK} -f $S/tools/fw_stub.awk wpi.fw:wpifw:153229 -mwpi -c${.TARGET}" \ no-implicit-rule before-depend local \ clean "wpifw.c" wpifw.fwo optional wpifw \ dependency "wpi.fw" \ compile-with "${NORMAL_FWO}" \ no-implicit-rule \ clean "wpifw.fwo" wpi.fw optional wpifw \ dependency "$S/contrib/dev/wpi/iwlwifi-3945-15.32.2.9.fw.uu" \ compile-with "${NORMAL_FW}" \ no-obj no-implicit-rule \ clean "wpi.fw" dev/xdma/controller/pl330.c optional xdma pl330 dev/xdma/xdma.c optional xdma dev/xdma/xdma_bank.c optional xdma dev/xdma/xdma_bio.c optional xdma dev/xdma/xdma_fdt_test.c optional xdma xdma_test fdt dev/xdma/xdma_if.m optional xdma dev/xdma/xdma_iommu.c optional xdma dev/xdma/xdma_mbuf.c optional xdma dev/xdma/xdma_queue.c optional xdma dev/xdma/xdma_sg.c optional xdma dev/xdma/xdma_sglist.c optional xdma dev/xen/balloon/balloon.c optional xenhvm dev/xen/blkfront/blkfront.c optional xenhvm dev/xen/blkback/blkback.c optional xenhvm dev/xen/console/xen_console.c optional xenhvm dev/xen/control/control.c optional xenhvm dev/xen/grant_table/grant_table.c optional xenhvm dev/xen/netback/netback.c optional xenhvm dev/xen/netfront/netfront.c optional xenhvm dev/xen/xenpci/xenpci.c optional xenpci dev/xen/timer/timer.c optional xenhvm dev/xen/pvcpu/pvcpu.c optional xenhvm dev/xen/xenstore/xenstore.c optional xenhvm dev/xen/xenstore/xenstore_dev.c optional xenhvm dev/xen/xenstore/xenstored_dev.c optional xenhvm dev/xen/evtchn/evtchn_dev.c optional xenhvm dev/xen/privcmd/privcmd.c optional xenhvm dev/xen/gntdev/gntdev.c optional xenhvm dev/xen/debug/debug.c optional xenhvm dev/xl/if_xl.c optional xl pci dev/xl/xlphy.c optional xl pci fs/autofs/autofs.c optional autofs fs/autofs/autofs_vfsops.c optional autofs fs/autofs/autofs_vnops.c optional autofs fs/deadfs/dead_vnops.c standard fs/devfs/devfs_devs.c standard fs/devfs/devfs_dir.c standard fs/devfs/devfs_rule.c standard fs/devfs/devfs_vfsops.c standard fs/devfs/devfs_vnops.c standard fs/fdescfs/fdesc_vfsops.c optional fdescfs fs/fdescfs/fdesc_vnops.c optional fdescfs fs/fifofs/fifo_vnops.c standard fs/cuse/cuse.c optional cuse fs/fuse/fuse_device.c optional fusefs fs/fuse/fuse_file.c optional fusefs fs/fuse/fuse_internal.c optional fusefs fs/fuse/fuse_io.c optional fusefs fs/fuse/fuse_ipc.c optional fusefs fs/fuse/fuse_main.c optional fusefs fs/fuse/fuse_node.c optional fusefs fs/fuse/fuse_vfsops.c optional fusefs fs/fuse/fuse_vnops.c optional fusefs fs/msdosfs/msdosfs_conv.c optional msdosfs fs/msdosfs/msdosfs_denode.c optional msdosfs fs/msdosfs/msdosfs_fat.c optional msdosfs fs/msdosfs/msdosfs_iconv.c optional msdosfs_iconv fs/msdosfs/msdosfs_lookup.c optional msdosfs fs/msdosfs/msdosfs_vfsops.c optional msdosfs fs/msdosfs/msdosfs_vnops.c optional msdosfs fs/nfs/nfs_commonkrpc.c optional nfscl | nfsd fs/nfs/nfs_commonsubs.c optional nfscl | nfsd fs/nfs/nfs_commonport.c optional nfscl | nfsd fs/nfs/nfs_commonacl.c optional nfscl | nfsd fs/nfsclient/nfs_clcomsubs.c optional nfscl fs/nfsclient/nfs_clsubs.c optional nfscl fs/nfsclient/nfs_clstate.c optional nfscl fs/nfsclient/nfs_clkrpc.c optional nfscl fs/nfsclient/nfs_clrpcops.c optional nfscl fs/nfsclient/nfs_clvnops.c optional nfscl fs/nfsclient/nfs_clnode.c optional nfscl fs/nfsclient/nfs_clvfsops.c optional nfscl fs/nfsclient/nfs_clport.c optional nfscl fs/nfsclient/nfs_clbio.c optional nfscl fs/nfsclient/nfs_clnfsiod.c optional nfscl fs/nfsserver/nfs_fha_new.c optional nfsd inet fs/nfsserver/nfs_nfsdsocket.c optional nfsd inet fs/nfsserver/nfs_nfsdsubs.c optional nfsd inet fs/nfsserver/nfs_nfsdstate.c optional nfsd inet fs/nfsserver/nfs_nfsdkrpc.c optional nfsd inet fs/nfsserver/nfs_nfsdserv.c optional nfsd inet fs/nfsserver/nfs_nfsdport.c optional nfsd inet fs/nfsserver/nfs_nfsdcache.c optional nfsd inet fs/nullfs/null_subr.c optional nullfs fs/nullfs/null_vfsops.c optional nullfs fs/nullfs/null_vnops.c optional nullfs fs/procfs/procfs.c optional procfs fs/procfs/procfs_dbregs.c optional procfs fs/procfs/procfs_fpregs.c optional procfs fs/procfs/procfs_ioctl.c optional procfs fs/procfs/procfs_map.c optional procfs fs/procfs/procfs_mem.c optional procfs fs/procfs/procfs_note.c optional procfs fs/procfs/procfs_osrel.c optional procfs fs/procfs/procfs_regs.c optional procfs fs/procfs/procfs_rlimit.c optional procfs fs/procfs/procfs_status.c optional procfs fs/procfs/procfs_type.c optional procfs fs/pseudofs/pseudofs.c optional pseudofs fs/pseudofs/pseudofs_fileno.c optional pseudofs fs/pseudofs/pseudofs_vncache.c optional pseudofs fs/pseudofs/pseudofs_vnops.c optional pseudofs fs/smbfs/smbfs_io.c optional smbfs fs/smbfs/smbfs_node.c optional smbfs fs/smbfs/smbfs_smb.c optional smbfs fs/smbfs/smbfs_subr.c optional smbfs fs/smbfs/smbfs_vfsops.c optional smbfs fs/smbfs/smbfs_vnops.c optional smbfs fs/udf/osta.c optional udf fs/udf/udf_iconv.c optional udf_iconv fs/udf/udf_vfsops.c optional udf fs/udf/udf_vnops.c optional udf fs/unionfs/union_subr.c optional unionfs fs/unionfs/union_vfsops.c optional unionfs fs/unionfs/union_vnops.c optional unionfs fs/tmpfs/tmpfs_vnops.c optional tmpfs fs/tmpfs/tmpfs_fifoops.c optional tmpfs fs/tmpfs/tmpfs_vfsops.c optional tmpfs fs/tmpfs/tmpfs_subr.c optional tmpfs gdb/gdb_cons.c optional gdb gdb/gdb_main.c optional gdb gdb/gdb_packet.c optional gdb gdb/netgdb.c optional ddb debugnet gdb netgdb inet geom/bde/g_bde.c optional geom_bde geom/bde/g_bde_crypt.c optional geom_bde geom/bde/g_bde_lock.c optional geom_bde geom/bde/g_bde_work.c optional geom_bde geom/cache/g_cache.c optional geom_cache geom/concat/g_concat.c optional geom_concat geom/eli/g_eli.c optional geom_eli geom/eli/g_eli_crypto.c optional geom_eli geom/eli/g_eli_ctl.c optional geom_eli geom/eli/g_eli_hmac.c optional geom_eli geom/eli/g_eli_integrity.c optional geom_eli geom/eli/g_eli_key.c optional geom_eli geom/eli/g_eli_key_cache.c optional geom_eli geom/eli/g_eli_privacy.c optional geom_eli geom/eli/pkcs5v2.c optional geom_eli geom/gate/g_gate.c optional geom_gate geom/geom_bsd_enc.c optional geom_part_bsd geom/geom_ccd.c optional ccd | geom_ccd geom/geom_ctl.c standard geom/geom_dev.c standard geom/geom_disk.c standard geom/geom_dump.c standard geom/geom_event.c standard geom/geom_flashmap.c optional fdt cfi | fdt mx25l | mmcsd | fdt n25q | fdt at45d geom/geom_io.c standard geom/geom_kern.c standard geom/geom_map.c optional geom_map geom/geom_redboot.c optional geom_redboot geom/geom_slice.c standard geom/geom_subr.c standard geom/geom_vfs.c standard geom/journal/g_journal.c optional geom_journal geom/journal/g_journal_ufs.c optional geom_journal geom/label/g_label.c optional geom_label | geom_label_gpt geom/label/g_label_ext2fs.c optional geom_label geom/label/g_label_flashmap.c optional geom_label geom/label/g_label_iso9660.c optional geom_label geom/label/g_label_msdosfs.c optional geom_label geom/label/g_label_ntfs.c optional geom_label geom/label/g_label_reiserfs.c optional geom_label geom/label/g_label_ufs.c optional geom_label geom/label/g_label_gpt.c optional geom_label | geom_label_gpt geom/label/g_label_disk_ident.c optional geom_label geom/linux_lvm/g_linux_lvm.c optional geom_linux_lvm geom/mirror/g_mirror.c optional geom_mirror geom/mirror/g_mirror_ctl.c optional geom_mirror geom/mountver/g_mountver.c optional geom_mountver geom/multipath/g_multipath.c optional geom_multipath geom/nop/g_nop.c optional geom_nop geom/part/g_part.c standard geom/part/g_part_if.m standard geom/part/g_part_apm.c optional geom_part_apm geom/part/g_part_bsd.c optional geom_part_bsd geom/part/g_part_bsd64.c optional geom_part_bsd64 geom/part/g_part_ebr.c optional geom_part_ebr geom/part/g_part_gpt.c optional geom_part_gpt geom/part/g_part_ldm.c optional geom_part_ldm geom/part/g_part_mbr.c optional geom_part_mbr geom/part/g_part_vtoc8.c optional geom_part_vtoc8 geom/raid/g_raid.c optional geom_raid geom/raid/g_raid_ctl.c optional geom_raid geom/raid/g_raid_md_if.m optional geom_raid geom/raid/g_raid_tr_if.m optional geom_raid geom/raid/md_ddf.c optional geom_raid geom/raid/md_intel.c optional geom_raid geom/raid/md_jmicron.c optional geom_raid geom/raid/md_nvidia.c optional geom_raid geom/raid/md_promise.c optional geom_raid geom/raid/md_sii.c optional geom_raid geom/raid/tr_concat.c optional geom_raid geom/raid/tr_raid0.c optional geom_raid geom/raid/tr_raid1.c optional geom_raid geom/raid/tr_raid1e.c optional geom_raid geom/raid/tr_raid5.c optional geom_raid geom/raid3/g_raid3.c optional geom_raid3 geom/raid3/g_raid3_ctl.c optional geom_raid3 geom/shsec/g_shsec.c optional geom_shsec geom/stripe/g_stripe.c optional geom_stripe geom/uzip/g_uzip.c optional geom_uzip geom/uzip/g_uzip_lzma.c optional geom_uzip geom/uzip/g_uzip_wrkthr.c optional geom_uzip geom/uzip/g_uzip_zlib.c optional geom_uzip geom/uzip/g_uzip_zstd.c optional geom_uzip zstdio \ compile-with "${NORMAL_C} -I$S/contrib/zstd/lib/freebsd" geom/vinum/geom_vinum.c optional geom_vinum geom/vinum/geom_vinum_create.c optional geom_vinum geom/vinum/geom_vinum_drive.c optional geom_vinum geom/vinum/geom_vinum_plex.c optional geom_vinum geom/vinum/geom_vinum_volume.c optional geom_vinum geom/vinum/geom_vinum_subr.c optional geom_vinum geom/vinum/geom_vinum_raid5.c optional geom_vinum geom/vinum/geom_vinum_share.c optional geom_vinum geom/vinum/geom_vinum_list.c optional geom_vinum geom/vinum/geom_vinum_rm.c optional geom_vinum geom/vinum/geom_vinum_init.c optional geom_vinum geom/vinum/geom_vinum_state.c optional geom_vinum geom/vinum/geom_vinum_rename.c optional geom_vinum geom/vinum/geom_vinum_move.c optional geom_vinum geom/vinum/geom_vinum_events.c optional geom_vinum geom/virstor/binstream.c optional geom_virstor geom/virstor/g_virstor.c optional geom_virstor geom/virstor/g_virstor_md.c optional geom_virstor geom/zero/g_zero.c optional geom_zero fs/ext2fs/ext2_acl.c optional ext2fs fs/ext2fs/ext2_alloc.c optional ext2fs fs/ext2fs/ext2_balloc.c optional ext2fs fs/ext2fs/ext2_bmap.c optional ext2fs fs/ext2fs/ext2_csum.c optional ext2fs fs/ext2fs/ext2_extattr.c optional ext2fs fs/ext2fs/ext2_extents.c optional ext2fs fs/ext2fs/ext2_inode.c optional ext2fs fs/ext2fs/ext2_inode_cnv.c optional ext2fs fs/ext2fs/ext2_hash.c optional ext2fs fs/ext2fs/ext2_htree.c optional ext2fs fs/ext2fs/ext2_lookup.c optional ext2fs fs/ext2fs/ext2_subr.c optional ext2fs fs/ext2fs/ext2_vfsops.c optional ext2fs fs/ext2fs/ext2_vnops.c optional ext2fs # isa/isa_if.m standard isa/isa_common.c optional isa isa/isahint.c optional isa isa/pnp.c optional isa isapnp isa/pnpparse.c optional isa isapnp fs/cd9660/cd9660_bmap.c optional cd9660 fs/cd9660/cd9660_lookup.c optional cd9660 fs/cd9660/cd9660_node.c optional cd9660 fs/cd9660/cd9660_rrip.c optional cd9660 fs/cd9660/cd9660_util.c optional cd9660 fs/cd9660/cd9660_vfsops.c optional cd9660 fs/cd9660/cd9660_vnops.c optional cd9660 fs/cd9660/cd9660_iconv.c optional cd9660_iconv gnu/gcov/gcc_4_7.c optional gcov \ warning "kernel contains GPL licensed gcov support" gnu/gcov/gcov_fs.c optional gcov lindebugfs \ compile-with "${LINUXKPI_C}" gnu/gcov/gcov_subr.c optional gcov kern/bus_if.m standard kern/clock_if.m standard kern/cpufreq_if.m standard kern/device_if.m standard kern/imgact_binmisc.c optional imagact_binmisc kern/imgact_elf.c standard kern/imgact_elf32.c optional compat_freebsd32 kern/imgact_shell.c standard kern/init_main.c standard kern/init_sysent.c standard kern/ksched.c optional _kposix_priority_scheduling kern/kern_acct.c standard kern/kern_alq.c optional alq kern/kern_clock.c standard kern/kern_condvar.c standard kern/kern_conf.c standard kern/kern_cons.c standard kern/kern_cpu.c standard kern/kern_cpuset.c standard kern/kern_context.c standard kern/kern_descrip.c standard kern/kern_dtrace.c optional kdtrace_hooks kern/kern_dump.c standard kern/kern_environment.c standard kern/kern_et.c standard kern/kern_event.c standard kern/kern_exec.c standard kern/kern_exit.c standard kern/kern_fail.c standard kern/kern_ffclock.c standard kern/kern_fork.c standard kern/kern_hhook.c standard kern/kern_idle.c standard kern/kern_intr.c standard kern/kern_jail.c standard kern/kern_kcov.c optional kcov \ compile-with "${NORMAL_C:N-fsanitize*}" kern/kern_khelp.c standard kern/kern_kthread.c standard kern/kern_ktr.c optional ktr kern/kern_ktrace.c standard kern/kern_linker.c standard kern/kern_lock.c standard kern/kern_lockf.c standard kern/kern_lockstat.c optional kdtrace_hooks kern/kern_loginclass.c standard kern/kern_malloc.c standard kern/kern_mbuf.c standard kern/kern_mib.c standard kern/kern_module.c standard kern/kern_mtxpool.c standard kern/kern_mutex.c standard kern/kern_ntptime.c standard kern/kern_osd.c standard kern/kern_physio.c standard kern/kern_pmc.c standard kern/kern_poll.c optional device_polling kern/kern_priv.c standard kern/kern_proc.c standard kern/kern_procctl.c standard kern/kern_prot.c standard kern/kern_racct.c standard kern/kern_rangelock.c standard kern/kern_rctl.c standard kern/kern_resource.c standard kern/kern_rmlock.c standard kern/kern_rwlock.c standard kern/kern_sdt.c optional kdtrace_hooks kern/kern_sema.c standard kern/kern_sendfile.c standard kern/kern_sharedpage.c standard kern/kern_shutdown.c standard kern/kern_sig.c standard kern/kern_switch.c standard kern/kern_sx.c standard kern/kern_synch.c standard kern/kern_syscalls.c standard kern/kern_sysctl.c standard kern/kern_tc.c standard kern/kern_thr.c standard kern/kern_thread.c standard kern/kern_time.c standard kern/kern_timeout.c standard kern/kern_tslog.c optional tslog kern/kern_ubsan.c optional kubsan kern/kern_umtx.c standard kern/kern_uuid.c standard kern/kern_xxx.c standard kern/link_elf.c standard kern/linker_if.m standard kern/md4c.c optional netsmb kern/md5c.c standard kern/p1003_1b.c standard kern/posix4_mib.c standard kern/sched_4bsd.c optional sched_4bsd kern/sched_ule.c optional sched_ule kern/serdev_if.m standard kern/stack_protector.c standard \ compile-with "${NORMAL_C:N-fstack-protector*}" kern/subr_acl_nfs4.c optional ufs_acl | zfs kern/subr_acl_posix1e.c optional ufs_acl kern/subr_autoconf.c standard kern/subr_blist.c standard kern/subr_boot.c standard kern/subr_bus.c standard kern/subr_bus_dma.c standard kern/subr_bufring.c standard kern/subr_capability.c standard kern/subr_clock.c standard kern/subr_compressor.c standard \ compile-with "${NORMAL_C} -I$S/contrib/zstd/lib/freebsd" kern/subr_coverage.c optional coverage \ compile-with "${NORMAL_C:N-fsanitize*}" kern/subr_counter.c standard kern/subr_csan.c optional kcsan \ compile-with "${NORMAL_C:N-fsanitize*}" kern/subr_devstat.c standard kern/subr_disk.c standard kern/subr_early.c standard kern/subr_epoch.c standard kern/subr_eventhandler.c standard kern/subr_fattime.c standard kern/subr_firmware.c optional firmware kern/subr_filter.c standard kern/subr_gtaskqueue.c standard kern/subr_hash.c standard kern/subr_hints.c standard kern/subr_kdb.c standard kern/subr_kobj.c standard kern/subr_lock.c standard kern/subr_log.c standard kern/subr_mchain.c optional libmchain kern/subr_module.c standard kern/subr_msgbuf.c standard kern/subr_param.c standard kern/subr_pcpu.c standard kern/subr_pctrie.c standard kern/subr_pidctrl.c standard kern/subr_power.c standard kern/subr_prf.c standard kern/subr_prof.c standard kern/subr_rangeset.c standard kern/subr_rman.c standard kern/subr_rtc.c standard kern/subr_sbuf.c standard kern/subr_scanf.c standard kern/subr_sglist.c standard kern/subr_sleepqueue.c standard kern/subr_smp.c standard kern/subr_smr.c standard kern/subr_stack.c optional ddb | stack | ktr kern/subr_stats.c optional stats kern/subr_taskqueue.c standard kern/subr_terminal.c optional vt kern/subr_trap.c standard kern/subr_turnstile.c standard kern/subr_uio.c standard kern/subr_unit.c standard kern/subr_vmem.c standard kern/subr_witness.c optional witness kern/sys_capability.c standard kern/sys_generic.c standard kern/sys_getrandom.c standard kern/sys_pipe.c standard kern/sys_procdesc.c standard kern/sys_process.c standard kern/sys_socket.c standard kern/syscalls.c standard kern/sysv_ipc.c standard kern/sysv_msg.c optional sysvmsg kern/sysv_sem.c optional sysvsem kern/sysv_shm.c optional sysvshm kern/tty.c standard kern/tty_compat.c optional compat_43tty kern/tty_info.c standard kern/tty_inq.c standard kern/tty_outq.c standard kern/tty_pts.c standard kern/tty_tty.c standard kern/tty_ttydisc.c standard kern/uipc_accf.c standard kern/uipc_debug.c optional ddb kern/uipc_domain.c standard kern/uipc_ktls.c optional kern_tls kern/uipc_mbuf.c standard kern/uipc_mbuf2.c standard kern/uipc_mbufhash.c standard kern/uipc_mqueue.c optional p1003_1b_mqueue kern/uipc_sem.c optional p1003_1b_semaphores kern/uipc_shm.c standard kern/uipc_sockbuf.c standard kern/uipc_socket.c standard kern/uipc_syscalls.c standard kern/uipc_usrreq.c standard kern/vfs_acl.c standard kern/vfs_aio.c standard kern/vfs_bio.c standard kern/vfs_cache.c standard kern/vfs_cluster.c standard kern/vfs_default.c standard kern/vfs_export.c standard kern/vfs_extattr.c standard kern/vfs_hash.c standard kern/vfs_init.c standard kern/vfs_lookup.c standard kern/vfs_mount.c standard kern/vfs_mountroot.c standard kern/vfs_subr.c standard kern/vfs_syscalls.c standard kern/vfs_vnops.c standard # # Kernel GSS-API # gssd.h optional kgssapi \ dependency "$S/kgssapi/gssd.x" \ compile-with "RPCGEN_CPP='${CPP}' rpcgen -hM $S/kgssapi/gssd.x | grep -v pthread.h > gssd.h" \ no-obj no-implicit-rule before-depend local \ clean "gssd.h" gssd_xdr.c optional kgssapi \ dependency "$S/kgssapi/gssd.x gssd.h" \ compile-with "RPCGEN_CPP='${CPP}' rpcgen -c $S/kgssapi/gssd.x -o gssd_xdr.c" \ no-implicit-rule before-depend local \ clean "gssd_xdr.c" gssd_clnt.c optional kgssapi \ dependency "$S/kgssapi/gssd.x gssd.h" \ compile-with "RPCGEN_CPP='${CPP}' rpcgen -lM $S/kgssapi/gssd.x | grep -v string.h > gssd_clnt.c" \ no-implicit-rule before-depend local \ clean "gssd_clnt.c" kgssapi/gss_accept_sec_context.c optional kgssapi kgssapi/gss_add_oid_set_member.c optional kgssapi kgssapi/gss_acquire_cred.c optional kgssapi kgssapi/gss_canonicalize_name.c optional kgssapi kgssapi/gss_create_empty_oid_set.c optional kgssapi kgssapi/gss_delete_sec_context.c optional kgssapi kgssapi/gss_display_status.c optional kgssapi kgssapi/gss_export_name.c optional kgssapi kgssapi/gss_get_mic.c optional kgssapi kgssapi/gss_init_sec_context.c optional kgssapi kgssapi/gss_impl.c optional kgssapi kgssapi/gss_import_name.c optional kgssapi kgssapi/gss_names.c optional kgssapi kgssapi/gss_pname_to_uid.c optional kgssapi kgssapi/gss_release_buffer.c optional kgssapi kgssapi/gss_release_cred.c optional kgssapi kgssapi/gss_release_name.c optional kgssapi kgssapi/gss_release_oid_set.c optional kgssapi kgssapi/gss_set_cred_option.c optional kgssapi kgssapi/gss_test_oid_set_member.c optional kgssapi kgssapi/gss_unwrap.c optional kgssapi kgssapi/gss_verify_mic.c optional kgssapi kgssapi/gss_wrap.c optional kgssapi kgssapi/gss_wrap_size_limit.c optional kgssapi kgssapi/gssd_prot.c optional kgssapi kgssapi/krb5/krb5_mech.c optional kgssapi kgssapi/krb5/kcrypto.c optional kgssapi kgssapi/krb5/kcrypto_aes.c optional kgssapi kgssapi/krb5/kcrypto_arcfour.c optional kgssapi kgssapi/krb5/kcrypto_des.c optional kgssapi kgssapi/krb5/kcrypto_des3.c optional kgssapi kgssapi/kgss_if.m optional kgssapi kgssapi/gsstest.c optional kgssapi_debug # These files in libkern/ are those needed by all architectures. Some # of the files in libkern/ are only needed on some architectures, e.g., # libkern/divdi3.c is needed by i386 but not alpha. Also, some of these # routines may be optimized for a particular platform. In either case, # the file should be moved to conf/files. from here. # libkern/arc4random.c standard libkern/asprintf.c standard libkern/bcd.c standard libkern/bsearch.c standard libkern/explicit_bzero.c standard libkern/fnmatch.c standard libkern/gsb_crc32.c standard libkern/iconv.c optional libiconv libkern/iconv_converter_if.m optional libiconv libkern/iconv_ucs.c optional libiconv libkern/iconv_xlat.c optional libiconv libkern/iconv_xlat16.c optional libiconv libkern/inet_aton.c standard libkern/inet_ntoa.c standard libkern/inet_ntop.c standard libkern/inet_pton.c standard libkern/jenkins_hash.c standard libkern/murmur3_32.c standard libkern/mcount.c optional profiling-routine libkern/memcchr.c standard libkern/memchr.c standard libkern/memmem.c optional gdb libkern/qsort.c standard libkern/qsort_r.c standard libkern/random.c standard libkern/scanc.c standard libkern/strcasecmp.c standard libkern/strcat.c standard libkern/strchr.c standard libkern/strchrnul.c optional gdb libkern/strcmp.c standard libkern/strcpy.c standard libkern/strcspn.c standard libkern/strdup.c standard libkern/strndup.c standard libkern/strlcat.c standard libkern/strlcpy.c standard libkern/strlen.c standard libkern/strncat.c standard libkern/strncmp.c standard libkern/strncpy.c standard libkern/strnlen.c standard libkern/strrchr.c standard libkern/strsep.c standard libkern/strspn.c standard libkern/strstr.c standard libkern/strtol.c standard libkern/strtoq.c standard libkern/strtoul.c standard libkern/strtouq.c standard libkern/strvalid.c standard libkern/timingsafe_bcmp.c standard contrib/zlib/adler32.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib contrib/zlib/compress.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib \ compile-with "${NORMAL_C} -Wno-cast-qual" contrib/zlib/crc32.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib contrib/zlib/deflate.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib \ compile-with "${NORMAL_C} -Wno-cast-qual" contrib/zlib/inffast.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib contrib/zlib/inflate.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib contrib/zlib/inftrees.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib contrib/zlib/trees.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib contrib/zlib/uncompr.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib \ compile-with "${NORMAL_C} -Wno-cast-qual" contrib/zlib/zutil.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib dev/zlib/zlib_mod.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib dev/zlib/zcalloc.c optional crypto | geom_uzip | ipsec | \ ipsec_support | mxge | ddb_ctf | gzio | zfs | zlib net/altq/altq_cbq.c optional altq net/altq/altq_codel.c optional altq net/altq/altq_hfsc.c optional altq net/altq/altq_fairq.c optional altq net/altq/altq_priq.c optional altq net/altq/altq_red.c optional altq net/altq/altq_rio.c optional altq net/altq/altq_rmclass.c optional altq net/altq/altq_subr.c optional altq net/bpf.c standard net/bpf_buffer.c optional bpf net/bpf_jitter.c optional bpf_jitter net/bpf_filter.c optional bpf | netgraph_bpf net/bpf_zerocopy.c optional bpf net/bridgestp.c optional bridge | if_bridge net/ieee8023ad_lacp.c optional lagg net/if.c standard net/if_bridge.c optional bridge inet | if_bridge inet net/if_clone.c standard net/if_dead.c standard net/if_debug.c optional ddb net/if_disc.c optional disc net/if_edsc.c optional edsc net/if_enc.c optional enc inet | enc inet6 net/if_epair.c optional epair net/if_ethersubr.c optional ether net/if_fwsubr.c optional fwip net/if_gif.c optional gif inet | gif inet6 | \ netgraph_gif inet | netgraph_gif inet6 net/if_gre.c optional gre inet | gre inet6 net/if_ipsec.c optional inet ipsec | inet6 ipsec net/if_lagg.c optional lagg net/if_loop.c optional loop net/if_llatbl.c standard net/if_me.c optional me inet net/if_media.c standard net/if_mib.c standard net/if_spppfr.c optional sppp | netgraph_sppp net/if_spppsubr.c optional sppp | netgraph_sppp net/if_stf.c optional stf inet inet6 net/if_tuntap.c optional tuntap net/if_vlan.c optional vlan net/if_vxlan.c optional vxlan inet | vxlan inet6 net/ifdi_if.m optional ether pci iflib net/iflib.c optional ether pci iflib net/iflib_clone.c optional ether pci iflib net/mp_ring.c optional ether iflib net/mppcc.c optional netgraph_mppc_compression net/mppcd.c optional netgraph_mppc_compression net/netisr.c standard net/debugnet.c optional inet debugnet net/debugnet_inet.c optional inet debugnet net/pfil.c optional ether | inet net/radix.c standard net/radix_mpath.c standard net/raw_cb.c standard net/raw_usrreq.c standard net/route.c standard net/route_temporal.c standard net/rss_config.c optional inet rss | inet6 rss net/rtsock.c standard net/slcompress.c optional netgraph_vjc | sppp | \ netgraph_sppp net/toeplitz.c optional inet rss | inet6 rss net/vnet.c optional vimage net80211/ieee80211.c optional wlan net80211/ieee80211_acl.c optional wlan wlan_acl net80211/ieee80211_action.c optional wlan net80211/ieee80211_adhoc.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_ageq.c optional wlan net80211/ieee80211_amrr.c optional wlan | wlan_amrr net80211/ieee80211_crypto.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_crypto_ccmp.c optional wlan wlan_ccmp net80211/ieee80211_crypto_none.c optional wlan net80211/ieee80211_crypto_tkip.c optional wlan wlan_tkip net80211/ieee80211_crypto_wep.c optional wlan wlan_wep net80211/ieee80211_ddb.c optional wlan ddb net80211/ieee80211_dfs.c optional wlan net80211/ieee80211_freebsd.c optional wlan net80211/ieee80211_hostap.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_ht.c optional wlan net80211/ieee80211_hwmp.c optional wlan ieee80211_support_mesh net80211/ieee80211_input.c optional wlan net80211/ieee80211_ioctl.c optional wlan net80211/ieee80211_mesh.c optional wlan ieee80211_support_mesh \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_monitor.c optional wlan net80211/ieee80211_node.c optional wlan net80211/ieee80211_output.c optional wlan net80211/ieee80211_phy.c optional wlan net80211/ieee80211_power.c optional wlan net80211/ieee80211_proto.c optional wlan net80211/ieee80211_radiotap.c optional wlan net80211/ieee80211_ratectl.c optional wlan net80211/ieee80211_ratectl_none.c optional wlan net80211/ieee80211_regdomain.c optional wlan net80211/ieee80211_rssadapt.c optional wlan wlan_rssadapt net80211/ieee80211_scan.c optional wlan net80211/ieee80211_scan_sta.c optional wlan net80211/ieee80211_sta.c optional wlan \ compile-with "${NORMAL_C} -Wno-unused-function" net80211/ieee80211_superg.c optional wlan ieee80211_support_superg net80211/ieee80211_scan_sw.c optional wlan net80211/ieee80211_tdma.c optional wlan ieee80211_support_tdma net80211/ieee80211_vht.c optional wlan net80211/ieee80211_wds.c optional wlan net80211/ieee80211_xauth.c optional wlan wlan_xauth net80211/ieee80211_alq.c optional wlan ieee80211_alq netgraph/atm/ccatm/ng_ccatm.c optional ngatm_ccatm \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/ngatmbase.c optional ngatm_atmbase \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/sscfu/ng_sscfu.c optional ngatm_sscfu \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/sscop/ng_sscop.c optional ngatm_sscop \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/atm/uni/ng_uni.c optional ngatm_uni \ compile-with "${NORMAL_C} -I$S/contrib/ngatm" netgraph/bluetooth/common/ng_bluetooth.c optional netgraph_bluetooth netgraph/bluetooth/drivers/bt3c/ng_bt3c_pccard.c optional netgraph_bluetooth_bt3c netgraph/bluetooth/drivers/h4/ng_h4.c optional netgraph_bluetooth_h4 netgraph/bluetooth/drivers/ubt/ng_ubt.c optional netgraph_bluetooth_ubt usb netgraph/bluetooth/drivers/ubt/ng_ubt_intel.c optional netgraph_bluetooth_ubt usb netgraph/bluetooth/drivers/ubtbcmfw/ubtbcmfw.c optional netgraph_bluetooth_ubtbcmfw usb netgraph/bluetooth/hci/ng_hci_cmds.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_evnt.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_main.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_misc.c optional netgraph_bluetooth_hci netgraph/bluetooth/hci/ng_hci_ulpi.c optional netgraph_bluetooth_hci netgraph/bluetooth/l2cap/ng_l2cap_cmds.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_evnt.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_llpi.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_main.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_misc.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/l2cap/ng_l2cap_ulpi.c optional netgraph_bluetooth_l2cap netgraph/bluetooth/socket/ng_btsocket.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_hci_raw.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_l2cap.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_l2cap_raw.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_rfcomm.c optional netgraph_bluetooth_socket netgraph/bluetooth/socket/ng_btsocket_sco.c optional netgraph_bluetooth_socket netgraph/netflow/netflow.c optional netgraph_netflow netgraph/netflow/netflow_v9.c optional netgraph_netflow netgraph/netflow/ng_netflow.c optional netgraph_netflow netgraph/ng_UI.c optional netgraph_UI netgraph/ng_async.c optional netgraph_async netgraph/ng_atmllc.c optional netgraph_atmllc netgraph/ng_base.c optional netgraph netgraph/ng_bpf.c optional netgraph_bpf netgraph/ng_bridge.c optional netgraph_bridge netgraph/ng_car.c optional netgraph_car netgraph/ng_checksum.c optional netgraph_checksum netgraph/ng_cisco.c optional netgraph_cisco netgraph/ng_deflate.c optional netgraph_deflate netgraph/ng_device.c optional netgraph_device netgraph/ng_echo.c optional netgraph_echo netgraph/ng_eiface.c optional netgraph_eiface netgraph/ng_ether.c optional netgraph_ether netgraph/ng_ether_echo.c optional netgraph_ether_echo netgraph/ng_frame_relay.c optional netgraph_frame_relay netgraph/ng_gif.c optional netgraph_gif inet6 | netgraph_gif inet netgraph/ng_gif_demux.c optional netgraph_gif_demux netgraph/ng_hole.c optional netgraph_hole netgraph/ng_iface.c optional netgraph_iface netgraph/ng_ip_input.c optional netgraph_ip_input netgraph/ng_ipfw.c optional netgraph_ipfw inet ipfirewall netgraph/ng_ksocket.c optional netgraph_ksocket netgraph/ng_l2tp.c optional netgraph_l2tp netgraph/ng_lmi.c optional netgraph_lmi netgraph/ng_mppc.c optional netgraph_mppc_compression | \ netgraph_mppc_encryption netgraph/ng_nat.c optional netgraph_nat inet libalias netgraph/ng_one2many.c optional netgraph_one2many netgraph/ng_parse.c optional netgraph netgraph/ng_patch.c optional netgraph_patch netgraph/ng_pipe.c optional netgraph_pipe netgraph/ng_ppp.c optional netgraph_ppp netgraph/ng_pppoe.c optional netgraph_pppoe netgraph/ng_pptpgre.c optional netgraph_pptpgre netgraph/ng_pred1.c optional netgraph_pred1 netgraph/ng_rfc1490.c optional netgraph_rfc1490 netgraph/ng_socket.c optional netgraph_socket netgraph/ng_split.c optional netgraph_split netgraph/ng_sppp.c optional netgraph_sppp netgraph/ng_tag.c optional netgraph_tag netgraph/ng_tcpmss.c optional netgraph_tcpmss netgraph/ng_tee.c optional netgraph_tee netgraph/ng_tty.c optional netgraph_tty netgraph/ng_vjc.c optional netgraph_vjc netgraph/ng_vlan.c optional netgraph_vlan netinet/accf_data.c optional accept_filter_data inet netinet/accf_dns.c optional accept_filter_dns inet netinet/accf_http.c optional accept_filter_http inet netinet/if_ether.c optional inet ether netinet/igmp.c optional inet netinet/in.c optional inet netinet/in_debug.c optional inet ddb netinet/in_kdtrace.c optional inet | inet6 netinet/ip_carp.c optional inet carp | inet6 carp netinet/in_fib.c optional inet netinet/in_gif.c optional gif inet | netgraph_gif inet netinet/ip_gre.c optional gre inet netinet/ip_id.c optional inet netinet/in_jail.c optional inet netinet/in_mcast.c optional inet netinet/in_pcb.c optional inet | inet6 netinet/in_pcbgroup.c optional inet pcbgroup | inet6 pcbgroup netinet/in_prot.c optional inet | inet6 netinet/in_proto.c optional inet | inet6 netinet/in_rmx.c optional inet netinet/in_rss.c optional inet rss netinet/ip_divert.c optional inet ipdivert ipfirewall netinet/ip_ecn.c optional inet | inet6 netinet/ip_encap.c optional inet | inet6 netinet/ip_fastfwd.c optional inet netinet/ip_icmp.c optional inet | inet6 netinet/ip_input.c optional inet netinet/ip_mroute.c optional mrouting inet netinet/ip_options.c optional inet netinet/ip_output.c optional inet netinet/ip_reass.c optional inet netinet/raw_ip.c optional inet | inet6 netinet/cc/cc.c optional inet | inet6 netinet/cc/cc_newreno.c optional inet | inet6 netinet/sctp_asconf.c optional inet sctp | inet6 sctp netinet/sctp_auth.c optional inet sctp | inet6 sctp netinet/sctp_bsd_addr.c optional inet sctp | inet6 sctp netinet/sctp_cc_functions.c optional inet sctp | inet6 sctp netinet/sctp_crc32.c optional inet | inet6 netinet/sctp_indata.c optional inet sctp | inet6 sctp netinet/sctp_input.c optional inet sctp | inet6 sctp netinet/sctp_kdtrace.c optional inet sctp | inet6 sctp netinet/sctp_output.c optional inet sctp | inet6 sctp netinet/sctp_pcb.c optional inet sctp | inet6 sctp netinet/sctp_peeloff.c optional inet sctp | inet6 sctp netinet/sctp_ss_functions.c optional inet sctp | inet6 sctp netinet/sctp_syscalls.c optional inet sctp | inet6 sctp netinet/sctp_sysctl.c optional inet sctp | inet6 sctp netinet/sctp_timer.c optional inet sctp | inet6 sctp netinet/sctp_usrreq.c optional inet sctp | inet6 sctp netinet/sctputil.c optional inet sctp | inet6 sctp netinet/siftr.c optional inet siftr alq | inet6 siftr alq netinet/tcp_debug.c optional tcpdebug netinet/tcp_fastopen.c optional inet tcp_rfc7413 | inet6 tcp_rfc7413 netinet/tcp_hostcache.c optional inet | inet6 netinet/tcp_input.c optional inet | inet6 netinet/tcp_log_buf.c optional tcp_blackbox inet | tcp_blackbox inet6 netinet/tcp_lro.c optional inet | inet6 netinet/tcp_output.c optional inet | inet6 netinet/tcp_offload.c optional tcp_offload inet | tcp_offload inet6 netinet/tcp_hpts.c optional tcphpts inet | tcphpts inet6 netinet/tcp_ratelimit.c optional ratelimit inet | ratelimit inet6 netinet/tcp_pcap.c optional inet tcppcap | inet6 tcppcap \ compile-with "${NORMAL_C} ${NO_WNONNULL}" netinet/tcp_reass.c optional inet | inet6 netinet/tcp_sack.c optional inet | inet6 netinet/tcp_stats.c optional stats inet | stats inet6 netinet/tcp_subr.c optional inet | inet6 netinet/tcp_syncache.c optional inet | inet6 netinet/tcp_timer.c optional inet | inet6 netinet/tcp_timewait.c optional inet | inet6 netinet/tcp_usrreq.c optional inet | inet6 netinet/udp_usrreq.c optional inet | inet6 netinet/libalias/alias.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_db.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_mod.c optional libalias | netgraph_nat netinet/libalias/alias_proxy.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_util.c optional libalias inet | netgraph_nat inet netinet/libalias/alias_sctp.c optional libalias inet | netgraph_nat inet netinet/netdump/netdump_client.c optional inet debugnet netdump netinet6/dest6.c optional inet6 netinet6/frag6.c optional inet6 netinet6/icmp6.c optional inet6 netinet6/in6.c optional inet6 netinet6/in6_cksum.c optional inet6 netinet6/in6_fib.c optional inet6 netinet6/in6_gif.c optional gif inet6 | netgraph_gif inet6 netinet6/in6_ifattach.c optional inet6 netinet6/in6_jail.c optional inet6 netinet6/in6_mcast.c optional inet6 netinet6/in6_pcb.c optional inet6 netinet6/in6_pcbgroup.c optional inet6 pcbgroup netinet6/in6_proto.c optional inet6 netinet6/in6_rmx.c optional inet6 netinet6/in6_rss.c optional inet6 rss netinet6/in6_src.c optional inet6 netinet6/ip6_fastfwd.c optional inet6 netinet6/ip6_forward.c optional inet6 netinet6/ip6_gre.c optional gre inet6 netinet6/ip6_id.c optional inet6 netinet6/ip6_input.c optional inet6 netinet6/ip6_mroute.c optional mrouting inet6 netinet6/ip6_output.c optional inet6 netinet6/mld6.c optional inet6 netinet6/nd6.c optional inet6 netinet6/nd6_nbr.c optional inet6 netinet6/nd6_rtr.c optional inet6 netinet6/raw_ip6.c optional inet6 netinet6/route6.c optional inet6 netinet6/scope6.c optional inet6 netinet6/sctp6_usrreq.c optional inet6 sctp netinet6/udp6_usrreq.c optional inet6 netipsec/ipsec.c optional ipsec inet | ipsec inet6 netipsec/ipsec_input.c optional ipsec inet | ipsec inet6 netipsec/ipsec_mbuf.c optional ipsec inet | ipsec inet6 netipsec/ipsec_mod.c optional ipsec inet | ipsec inet6 netipsec/ipsec_output.c optional ipsec inet | ipsec inet6 netipsec/ipsec_pcb.c optional ipsec inet | ipsec inet6 | \ ipsec_support inet | ipsec_support inet6 netipsec/key.c optional ipsec inet | ipsec inet6 | \ ipsec_support inet | ipsec_support inet6 netipsec/key_debug.c optional ipsec inet | ipsec inet6 | \ ipsec_support inet | ipsec_support inet6 netipsec/keysock.c optional ipsec inet | ipsec inet6 | \ ipsec_support inet | ipsec_support inet6 netipsec/subr_ipsec.c optional ipsec inet | ipsec inet6 | \ ipsec_support inet | ipsec_support inet6 netipsec/udpencap.c optional ipsec inet netipsec/xform_ah.c optional ipsec inet | ipsec inet6 netipsec/xform_esp.c optional ipsec inet | ipsec inet6 netipsec/xform_ipcomp.c optional ipsec inet | ipsec inet6 netipsec/xform_tcp.c optional ipsec inet tcp_signature | \ ipsec inet6 tcp_signature | ipsec_support inet tcp_signature | \ ipsec_support inet6 tcp_signature netpfil/ipfw/dn_aqm_codel.c optional inet dummynet netpfil/ipfw/dn_aqm_pie.c optional inet dummynet netpfil/ipfw/dn_heap.c optional inet dummynet netpfil/ipfw/dn_sched_fifo.c optional inet dummynet netpfil/ipfw/dn_sched_fq_codel.c optional inet dummynet netpfil/ipfw/dn_sched_fq_pie.c optional inet dummynet netpfil/ipfw/dn_sched_prio.c optional inet dummynet netpfil/ipfw/dn_sched_qfq.c optional inet dummynet netpfil/ipfw/dn_sched_rr.c optional inet dummynet netpfil/ipfw/dn_sched_wf2q.c optional inet dummynet netpfil/ipfw/ip_dummynet.c optional inet dummynet netpfil/ipfw/ip_dn_io.c optional inet dummynet netpfil/ipfw/ip_dn_glue.c optional inet dummynet netpfil/ipfw/ip_fw2.c optional inet ipfirewall netpfil/ipfw/ip_fw_bpf.c optional inet ipfirewall netpfil/ipfw/ip_fw_dynamic.c optional inet ipfirewall \ compile-with "${NORMAL_C} -I$S/contrib/ck/include" netpfil/ipfw/ip_fw_eaction.c optional inet ipfirewall netpfil/ipfw/ip_fw_log.c optional inet ipfirewall netpfil/ipfw/ip_fw_pfil.c optional inet ipfirewall netpfil/ipfw/ip_fw_sockopt.c optional inet ipfirewall netpfil/ipfw/ip_fw_table.c optional inet ipfirewall netpfil/ipfw/ip_fw_table_algo.c optional inet ipfirewall netpfil/ipfw/ip_fw_table_value.c optional inet ipfirewall netpfil/ipfw/ip_fw_iface.c optional inet ipfirewall netpfil/ipfw/ip_fw_nat.c optional inet ipfirewall_nat netpfil/ipfw/nat64/ip_fw_nat64.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64clat.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64clat_control.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64lsn.c optional inet inet6 ipfirewall \ ipfirewall_nat64 compile-with "${NORMAL_C} -I$S/contrib/ck/include" netpfil/ipfw/nat64/nat64lsn_control.c optional inet inet6 ipfirewall \ ipfirewall_nat64 compile-with "${NORMAL_C} -I$S/contrib/ck/include" netpfil/ipfw/nat64/nat64stl.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64stl_control.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nat64/nat64_translate.c optional inet inet6 ipfirewall \ ipfirewall_nat64 netpfil/ipfw/nptv6/ip_fw_nptv6.c optional inet inet6 ipfirewall \ ipfirewall_nptv6 netpfil/ipfw/nptv6/nptv6.c optional inet inet6 ipfirewall \ ipfirewall_nptv6 netpfil/ipfw/pmod/ip_fw_pmod.c optional inet ipfirewall_pmod netpfil/ipfw/pmod/tcpmod.c optional inet ipfirewall_pmod netpfil/pf/if_pflog.c optional pflog pf inet netpfil/pf/if_pfsync.c optional pfsync pf inet netpfil/pf/pf.c optional pf inet netpfil/pf/pf_if.c optional pf inet netpfil/pf/pf_ioctl.c optional pf inet netpfil/pf/pf_lb.c optional pf inet netpfil/pf/pf_norm.c optional pf inet netpfil/pf/pf_osfp.c optional pf inet netpfil/pf/pf_ruleset.c optional pf inet netpfil/pf/pf_table.c optional pf inet netpfil/pf/in4_cksum.c optional pf inet netsmb/smb_conn.c optional netsmb netsmb/smb_crypt.c optional netsmb netsmb/smb_dev.c optional netsmb netsmb/smb_iod.c optional netsmb netsmb/smb_rq.c optional netsmb netsmb/smb_smb.c optional netsmb netsmb/smb_subr.c optional netsmb netsmb/smb_trantcp.c optional netsmb netsmb/smb_usr.c optional netsmb nfs/bootp_subr.c optional bootp nfscl nfs/krpc_subr.c optional bootp nfscl nfs/nfs_diskless.c optional nfscl nfs_root nfs/nfs_fha.c optional nfsd nfs/nfs_lock.c optional nfscl | nfslockd | nfsd nfs/nfs_nfssvc.c optional nfscl | nfsd nlm/nlm_advlock.c optional nfslockd | nfsd nlm/nlm_prot_clnt.c optional nfslockd | nfsd nlm/nlm_prot_impl.c optional nfslockd | nfsd nlm/nlm_prot_server.c optional nfslockd | nfsd nlm/nlm_prot_svc.c optional nfslockd | nfsd nlm/nlm_prot_xdr.c optional nfslockd | nfsd nlm/sm_inter_xdr.c optional nfslockd | nfsd # Linux Kernel Programming Interface compat/linuxkpi/common/src/linux_kmod.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_compat.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_current.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_hrtimer.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_kthread.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_lock.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_page.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_pci.c optional compat_linuxkpi pci \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_tasklet.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_idr.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_radix.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_rcu.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C} -I$S/contrib/ck/include" compat/linuxkpi/common/src/linux_schedule.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" +compat/linuxkpi/common/src/linux_shmemfs.c optional compat_linuxkpi \ + compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_slab.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_usb.c optional compat_linuxkpi usb \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_work.c optional compat_linuxkpi \ compile-with "${LINUXKPI_C}" compat/linuxkpi/common/src/linux_seq_file.c optional compat_linuxkpi | lindebugfs \ compile-with "${LINUXKPI_C}" compat/lindebugfs/lindebugfs.c optional lindebugfs \ compile-with "${LINUXKPI_C}" # OpenFabrics Enterprise Distribution (Infiniband) ofed/drivers/infiniband/core/ib_addr.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_agent.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_cache.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_cm.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_cma.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_cq.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_device.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_fmr_pool.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_iwcm.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_iwpm_msg.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_iwpm_util.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_mad.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_mad_rmpp.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_multicast.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_packer.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_roce_gid_mgmt.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_sa_query.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_smi.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_sysfs.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_ucm.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_ucma.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_ud_header.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_umem.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_user_mad.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_uverbs_cmd.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_uverbs_main.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_uverbs_marshall.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/core/ib_verbs.c optional ofed \ compile-with "${OFED_C}" ofed/drivers/infiniband/ulp/ipoib/ipoib_cm.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" #ofed/drivers/infiniband/ulp/ipoib/ipoib_fs.c optional ipoib \ # compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_ib.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_main.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_multicast.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/ipoib/ipoib_verbs.c optional ipoib \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" #ofed/drivers/infiniband/ulp/ipoib/ipoib_vlan.c optional ipoib \ # compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/ipoib/" ofed/drivers/infiniband/ulp/sdp/sdp_bcopy.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_main.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_rx.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_cma.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" ofed/drivers/infiniband/ulp/sdp/sdp_tx.c optional sdp inet \ compile-with "${OFED_C} -I$S/ofed/drivers/infiniband/ulp/sdp/" dev/mthca/mthca_allocator.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_av.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_catas.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_cmd.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_cq.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_eq.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_mad.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_main.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_mcg.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_memfree.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_mr.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_pd.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_profile.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_provider.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_qp.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_reset.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_srq.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mthca/mthca_uar.c optional mthca pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_alias_GUID.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_mcg.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_sysfs.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_cm.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_ah.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_cq.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_doorbell.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_mad.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_main.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_mr.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_qp.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_srq.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_ib/mlx4_ib_wc.c optional mlx4ib pci ofed \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_alloc.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_catas.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_cmd.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_cq.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_eq.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_fw.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_fw_qos.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_icm.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_intf.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_main.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_mcg.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_mr.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_pd.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_port.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_profile.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_qp.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_reset.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_sense.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_srq.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_core/mlx4_resource_tracker.c optional mlx4 pci \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_cq.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_main.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_netdev.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_port.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_resources.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_rx.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx4/mlx4_en/mlx4_en_tx.c optional mlx4en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_ah.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_cong.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_cq.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_doorbell.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_gsi.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_mad.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_main.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_mem.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_mr.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_qp.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_srq.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_ib/mlx5_ib_virt.c optional mlx5ib pci ofed \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_alloc.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_cmd.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_cq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_diagnostics.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_eq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_fs_cmd.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_fs_tree.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_fw.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_fwdump.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_health.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mad.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_main.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mcg.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mpfs.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_mr.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_pagealloc.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_pd.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_port.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_qp.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_rl.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_srq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_tls.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_transobj.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_uar.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_vport.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_vsc.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_core/mlx5_wq.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_lib/mlx5_gid.c optional mlx5 pci \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_dim.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_ethtool.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_main.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_tx.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_flow_table.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_hw_tls.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_rx.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_rl.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_txrx.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" dev/mlx5/mlx5_en/mlx5_en_port_buffer.c optional mlx5en pci inet inet6 \ compile-with "${OFED_C}" # crypto support opencrypto/cast.c optional crypto | ipsec | ipsec_support opencrypto/criov.c optional crypto | ipsec | ipsec_support opencrypto/crypto.c optional crypto | ipsec | ipsec_support opencrypto/cryptodev.c optional cryptodev opencrypto/cryptodev_if.m optional crypto | ipsec | ipsec_support opencrypto/cryptosoft.c optional crypto | ipsec | ipsec_support opencrypto/cryptodeflate.c optional crypto | ipsec | ipsec_support opencrypto/gmac.c optional crypto | ipsec | ipsec_support opencrypto/gfmult.c optional crypto | ipsec | ipsec_support opencrypto/rmd160.c optional crypto | ipsec | ipsec_support opencrypto/skipjack.c optional crypto | ipsec | ipsec_support opencrypto/xform.c optional crypto | ipsec | ipsec_support opencrypto/xform_poly1305.c optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_onetimeauth/poly1305/onetimeauth_poly1305.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_onetimeauth/poly1305/donna/poly1305_donna.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_verify/sodium/verify.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" crypto/libsodium/randombytes.c optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include -I$S/crypto/libsodium" crypto/libsodium/utils.c optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_core/hchacha20/core_hchacha20.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_stream/chacha20/stream_chacha20.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_stream/chacha20/ref/chacha20_ref.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_aead/chacha20poly1305/sodium/aead_chacha20poly1305.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" contrib/libsodium/src/libsodium/crypto_aead/xchacha20poly1305/sodium/aead_xchacha20poly1305.c \ optional crypto \ compile-with "${NORMAL_C} -I$S/contrib/libsodium/src/libsodium/include/sodium -I$S/crypto/libsodium" opencrypto/cbc_mac.c optional crypto opencrypto/xform_cbc_mac.c optional crypto rpc/auth_none.c optional krpc | nfslockd | nfscl | nfsd rpc/auth_unix.c optional krpc | nfslockd | nfscl | nfsd rpc/authunix_prot.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_bck.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_dg.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_rc.c optional krpc | nfslockd | nfscl | nfsd rpc/clnt_vc.c optional krpc | nfslockd | nfscl | nfsd rpc/getnetconfig.c optional krpc | nfslockd | nfscl | nfsd rpc/replay.c optional krpc | nfslockd | nfscl | nfsd rpc/rpc_callmsg.c optional krpc | nfslockd | nfscl | nfsd rpc/rpc_generic.c optional krpc | nfslockd | nfscl | nfsd rpc/rpc_prot.c optional krpc | nfslockd | nfscl | nfsd rpc/rpcb_clnt.c optional krpc | nfslockd | nfscl | nfsd rpc/rpcb_prot.c optional krpc | nfslockd | nfscl | nfsd rpc/svc.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_auth.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_auth_unix.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_dg.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_generic.c optional krpc | nfslockd | nfscl | nfsd rpc/svc_vc.c optional krpc | nfslockd | nfscl | nfsd rpc/rpcsec_gss/rpcsec_gss.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/rpcsec_gss_conf.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/rpcsec_gss_misc.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/rpcsec_gss_prot.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi rpc/rpcsec_gss/svc_rpcsec_gss.c optional krpc kgssapi | nfslockd kgssapi | nfscl kgssapi | nfsd kgssapi security/audit/audit.c optional audit security/audit/audit_arg.c optional audit security/audit/audit_bsm.c optional audit security/audit/audit_bsm_db.c optional audit security/audit/audit_bsm_klib.c optional audit security/audit/audit_dtrace.c optional dtaudit audit | dtraceall audit compile-with "${CDDL_C}" security/audit/audit_pipe.c optional audit security/audit/audit_syscalls.c standard security/audit/audit_trigger.c optional audit security/audit/audit_worker.c optional audit security/audit/bsm_domain.c optional audit security/audit/bsm_errno.c optional audit security/audit/bsm_fcntl.c optional audit security/audit/bsm_socket_type.c optional audit security/audit/bsm_token.c optional audit security/mac/mac_audit.c optional mac audit security/mac/mac_cred.c optional mac security/mac/mac_framework.c optional mac security/mac/mac_inet.c optional mac inet | mac inet6 security/mac/mac_inet6.c optional mac inet6 security/mac/mac_label.c optional mac security/mac/mac_net.c optional mac security/mac/mac_pipe.c optional mac security/mac/mac_posix_sem.c optional mac security/mac/mac_posix_shm.c optional mac security/mac/mac_priv.c optional mac security/mac/mac_process.c optional mac security/mac/mac_socket.c optional mac security/mac/mac_syscalls.c standard security/mac/mac_system.c optional mac security/mac/mac_sysv_msg.c optional mac security/mac/mac_sysv_sem.c optional mac security/mac/mac_sysv_shm.c optional mac security/mac/mac_vfs.c optional mac security/mac_biba/mac_biba.c optional mac_biba security/mac_bsdextended/mac_bsdextended.c optional mac_bsdextended security/mac_bsdextended/ugidfw_system.c optional mac_bsdextended security/mac_bsdextended/ugidfw_vnode.c optional mac_bsdextended security/mac_ifoff/mac_ifoff.c optional mac_ifoff security/mac_lomac/mac_lomac.c optional mac_lomac security/mac_mls/mac_mls.c optional mac_mls security/mac_none/mac_none.c optional mac_none security/mac_ntpd/mac_ntpd.c optional mac_ntpd security/mac_partition/mac_partition.c optional mac_partition security/mac_portacl/mac_portacl.c optional mac_portacl security/mac_seeotheruids/mac_seeotheruids.c optional mac_seeotheruids security/mac_stub/mac_stub.c optional mac_stub security/mac_test/mac_test.c optional mac_test security/mac_veriexec/mac_veriexec.c optional mac_veriexec security/mac_veriexec/veriexec_fingerprint.c optional mac_veriexec security/mac_veriexec/veriexec_metadata.c optional mac_veriexec security/mac_veriexec_parser/mac_veriexec_parser.c optional mac_veriexec mac_veriexec_parser security/mac_veriexec/mac_veriexec_rmd160.c optional mac_veriexec_rmd160 security/mac_veriexec/mac_veriexec_sha1.c optional mac_veriexec_sha1 security/mac_veriexec/mac_veriexec_sha256.c optional mac_veriexec_sha256 security/mac_veriexec/mac_veriexec_sha384.c optional mac_veriexec_sha384 security/mac_veriexec/mac_veriexec_sha512.c optional mac_veriexec_sha512 teken/teken.c optional sc !SC_NO_TERM_TEKEN | vt ufs/ffs/ffs_alloc.c optional ffs ufs/ffs/ffs_balloc.c optional ffs ufs/ffs/ffs_inode.c optional ffs ufs/ffs/ffs_snapshot.c optional ffs ufs/ffs/ffs_softdep.c optional ffs ufs/ffs/ffs_subr.c optional ffs | geom_label ufs/ffs/ffs_tables.c optional ffs | geom_label ufs/ffs/ffs_vfsops.c optional ffs ufs/ffs/ffs_vnops.c optional ffs ufs/ffs/ffs_rawread.c optional ffs directio ufs/ffs/ffs_suspend.c optional ffs ufs/ufs/ufs_acl.c optional ffs ufs/ufs/ufs_bmap.c optional ffs ufs/ufs/ufs_dirhash.c optional ffs ufs/ufs/ufs_extattr.c optional ffs ufs/ufs/ufs_gjournal.c optional ffs UFS_GJOURNAL ufs/ufs/ufs_inode.c optional ffs ufs/ufs/ufs_lookup.c optional ffs ufs/ufs/ufs_quota.c optional ffs ufs/ufs/ufs_vfsops.c optional ffs ufs/ufs/ufs_vnops.c optional ffs vm/default_pager.c standard vm/device_pager.c standard vm/phys_pager.c standard vm/redzone.c optional DEBUG_REDZONE vm/sg_pager.c standard vm/swap_pager.c standard vm/uma_core.c standard vm/uma_dbg.c standard vm/memguard.c optional DEBUG_MEMGUARD vm/vm_domainset.c standard vm/vm_fault.c standard vm/vm_glue.c standard vm/vm_init.c standard vm/vm_kern.c standard vm/vm_map.c standard vm/vm_meter.c standard vm/vm_mmap.c standard vm/vm_object.c standard vm/vm_page.c standard vm/vm_pageout.c standard vm/vm_pager.c standard vm/vm_phys.c standard vm/vm_radix.c standard vm/vm_reserv.c standard vm/vm_swapout.c optional !NO_SWAPPING vm/vm_swapout_dummy.c optional NO_SWAPPING vm/vm_unix.c standard vm/vnode_pager.c standard xen/features.c optional xenhvm xen/xenbus/xenbus_if.m optional xenhvm xen/xenbus/xenbus.c optional xenhvm xen/xenbus/xenbusb_if.m optional xenhvm xen/xenbus/xenbusb.c optional xenhvm xen/xenbus/xenbusb_front.c optional xenhvm xen/xenbus/xenbusb_back.c optional xenhvm xen/xenmem/xenmem_if.m optional xenhvm xdr/xdr.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_array.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_mbuf.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_mem.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_reference.c optional krpc | nfslockd | nfscl | nfsd xdr/xdr_sizeof.c optional krpc | nfslockd | nfscl | nfsd Index: projects/clang1000-import/sys/dev/acpica/acpi_lid.c =================================================================== --- projects/clang1000-import/sys/dev/acpica/acpi_lid.c (revision 358238) +++ projects/clang1000-import/sys/dev/acpica/acpi_lid.c (revision 358239) @@ -1,212 +1,222 @@ /*- * Copyright (c) 2000 Takanori Watanabe * Copyright (c) 2000 Mitsuru IWASAKI * Copyright (c) 2000 Michael Smith * Copyright (c) 2000 BSDi * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_acpi.h" #include #include #include #include #include #include #include #include #include /* Hooks for the ACPI CA debugging infrastructure */ #define _COMPONENT ACPI_BUTTON ACPI_MODULE_NAME("LID") struct acpi_lid_softc { device_t lid_dev; ACPI_HANDLE lid_handle; int lid_status; /* open or closed */ }; ACPI_HANDLE acpi_lid_handle; ACPI_SERIAL_DECL(lid, "ACPI lid"); static int acpi_lid_probe(device_t dev); static int acpi_lid_attach(device_t dev); static int acpi_lid_suspend(device_t dev); static int acpi_lid_resume(device_t dev); static void acpi_lid_notify_status_changed(void *arg); static void acpi_lid_notify_handler(ACPI_HANDLE h, UINT32 notify, void *context); static device_method_t acpi_lid_methods[] = { /* Device interface */ DEVMETHOD(device_probe, acpi_lid_probe), DEVMETHOD(device_attach, acpi_lid_attach), DEVMETHOD(device_suspend, acpi_lid_suspend), DEVMETHOD(device_resume, acpi_lid_resume), DEVMETHOD_END }; static driver_t acpi_lid_driver = { "acpi_lid", acpi_lid_methods, sizeof(struct acpi_lid_softc), }; static devclass_t acpi_lid_devclass; DRIVER_MODULE(acpi_lid, acpi, acpi_lid_driver, acpi_lid_devclass, 0, 0); MODULE_DEPEND(acpi_lid, acpi, 1, 1, 1); static int acpi_lid_probe(device_t dev) { static char *lid_ids[] = { "PNP0C0D", NULL }; int rv; if (acpi_disabled("lid")) return (ENXIO); rv = ACPI_ID_PROBE(device_get_parent(dev), dev, lid_ids, NULL); if (rv <= 0) device_set_desc(dev, "Control Method Lid Switch"); return (rv); } static int acpi_lid_attach(device_t dev) { struct acpi_prw_data prw; struct acpi_lid_softc *sc; ACPI_FUNCTION_TRACE((char *)(uintptr_t)__func__); sc = device_get_softc(dev); sc->lid_dev = dev; acpi_lid_handle = sc->lid_handle = acpi_get_handle(dev); /* * If a system does not get lid events, it may make sense to change * the type to ACPI_ALL_NOTIFY. Some systems generate both a wake and * runtime notify in that case though. */ AcpiInstallNotifyHandler(sc->lid_handle, ACPI_DEVICE_NOTIFY, acpi_lid_notify_handler, sc); /* Enable the GPE for wake/runtime. */ acpi_wake_set_enable(dev, 1); if (acpi_parse_prw(sc->lid_handle, &prw) == 0) AcpiEnableGpe(prw.gpe_handle, prw.gpe_bit); + /* Get the initial lid status, ignore failures */ + (void) acpi_GetInteger(sc->lid_handle, "_LID", &sc->lid_status); + /* * Export the lid status */ SYSCTL_ADD_INT(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "state", CTLFLAG_RD, &sc->lid_status, 0, - "Device set to wake the system"); + "Device state (0 = closed, 1 = open)"); return (0); } static int acpi_lid_suspend(device_t dev) { return (0); } static int acpi_lid_resume(device_t dev) { + struct acpi_lid_softc *sc; + + sc = device_get_softc(dev); + + /* Get lid status after resume, ignore failures */ + (void) acpi_GetInteger(sc->lid_handle, "_LID", &sc->lid_status); + return (0); } static void acpi_lid_notify_status_changed(void *arg) { struct acpi_lid_softc *sc; struct acpi_softc *acpi_sc; ACPI_STATUS status; ACPI_FUNCTION_TRACE((char *)(uintptr_t)__func__); sc = (struct acpi_lid_softc *)arg; ACPI_SERIAL_BEGIN(lid); /* * Evaluate _LID and check the return value, update lid status. * Zero: The lid is closed * Non-zero: The lid is open */ status = acpi_GetInteger(sc->lid_handle, "_LID", &sc->lid_status); if (ACPI_FAILURE(status)) goto out; acpi_sc = acpi_device_get_parent_softc(sc->lid_dev); if (acpi_sc == NULL) goto out; ACPI_VPRINT(sc->lid_dev, acpi_sc, "Lid %s\n", sc->lid_status ? "opened" : "closed"); acpi_UserNotify("Lid", sc->lid_handle, sc->lid_status); if (sc->lid_status == 0) EVENTHANDLER_INVOKE(acpi_sleep_event, acpi_sc->acpi_lid_switch_sx); else EVENTHANDLER_INVOKE(acpi_wakeup_event, acpi_sc->acpi_lid_switch_sx); out: ACPI_SERIAL_END(lid); return_VOID; } /* XXX maybe not here */ #define ACPI_NOTIFY_STATUS_CHANGED 0x80 static void acpi_lid_notify_handler(ACPI_HANDLE h, UINT32 notify, void *context) { struct acpi_lid_softc *sc; ACPI_FUNCTION_TRACE_U32((char *)(uintptr_t)__func__, notify); sc = (struct acpi_lid_softc *)context; switch (notify) { case ACPI_NOTIFY_STATUS_CHANGED: AcpiOsExecute(OSL_NOTIFY_HANDLER, acpi_lid_notify_status_changed, sc); break; default: device_printf(sc->lid_dev, "unknown notify %#x\n", notify); break; } return_VOID; } Index: projects/clang1000-import/sys/dev/ath/ah_osdep.c =================================================================== --- projects/clang1000-import/sys/dev/ath/ah_osdep.c (revision 358238) +++ projects/clang1000-import/sys/dev/ath/ah_osdep.c (revision 358239) @@ -1,456 +1,459 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2002-2008 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification. * 2. Redistributions in binary form must reproduce at minimum a disclaimer * similar to the "NO WARRANTY" disclaimer below ("Disclaimer") and any * redistribution must be conditioned upon including a substantially * similar Disclaimer requirement for further binary redistribution. * * NO WARRANTY * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTIBILITY * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL * THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGES. * * $FreeBSD$ */ #include "opt_ah.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX for ether_sprintf */ #include #include /* * WiSoC boards overload the bus tag with information about the * board layout. We must extract the bus space tag from that * indirect structure. For everyone else the tag is passed in * directly. * XXX cache indirect ref privately */ #ifdef AH_SUPPORT_AR5312 #define BUSTAG(ah) \ ((bus_space_tag_t) ((struct ar531x_config *)((ah)->ah_st))->tag) #else #define BUSTAG(ah) ((ah)->ah_st) #endif /* * This lock is used to seralise register access for chips which have * problems w/ SMP CPUs issuing concurrent PCI transactions. * * XXX This is a global lock for now; it should be pushed to * a per-device lock in some platform-independent fashion. */ struct mtx ah_regser_mtx; MTX_SYSINIT(ah_regser, &ah_regser_mtx, "Atheros register access mutex", MTX_SPIN); extern void ath_hal_printf(struct ath_hal *, const char*, ...) __printflike(2,3); extern void ath_hal_vprintf(struct ath_hal *, const char*, __va_list) __printflike(2, 0); extern const char* ath_hal_ether_sprintf(const u_int8_t *mac); extern void *ath_hal_malloc(size_t); extern void ath_hal_free(void *); #ifdef AH_ASSERT extern void ath_hal_assert_failed(const char* filename, int lineno, const char* msg); #endif #ifdef AH_DEBUG extern void DO_HALDEBUG(struct ath_hal *ah, u_int mask, const char* fmt, ...); #endif /* AH_DEBUG */ /* NB: put this here instead of the driver to avoid circular references */ -SYSCTL_NODE(_hw, OID_AUTO, ath, CTLFLAG_RD, 0, "Atheros driver parameters"); -static SYSCTL_NODE(_hw_ath, OID_AUTO, hal, CTLFLAG_RD, 0, +SYSCTL_NODE(_hw, OID_AUTO, ath, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, + "Atheros driver parameters"); +static SYSCTL_NODE(_hw_ath, OID_AUTO, hal, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, "Atheros HAL parameters"); #ifdef AH_DEBUG int ath_hal_debug = 0; SYSCTL_INT(_hw_ath_hal, OID_AUTO, debug, CTLFLAG_RWTUN, &ath_hal_debug, 0, "Atheros HAL debugging printfs"); #endif /* AH_DEBUG */ static MALLOC_DEFINE(M_ATH_HAL, "ath_hal", "ath hal data"); void* ath_hal_malloc(size_t size) { return malloc(size, M_ATH_HAL, M_NOWAIT | M_ZERO); } void ath_hal_free(void* p) { free(p, M_ATH_HAL); } void ath_hal_vprintf(struct ath_hal *ah, const char* fmt, va_list ap) { vprintf(fmt, ap); } void ath_hal_printf(struct ath_hal *ah, const char* fmt, ...) { va_list ap; va_start(ap, fmt); ath_hal_vprintf(ah, fmt, ap); va_end(ap); } const char* ath_hal_ether_sprintf(const u_int8_t *mac) { return ether_sprintf(mac); } #ifdef AH_DEBUG /* * XXX This is highly relevant only for the AR5416 and later * PCI/PCIe NICs. It'll need adjustment for other hardware * variations. */ static int ath_hal_reg_whilst_asleep(struct ath_hal *ah, uint32_t reg) { if (reg >= 0x4000 && reg < 0x5000) return (1); if (reg >= 0x6000 && reg < 0x7000) return (1); if (reg >= 0x7000 && reg < 0x8000) return (1); return (0); } void DO_HALDEBUG(struct ath_hal *ah, u_int mask, const char* fmt, ...) { if ((mask == HAL_DEBUG_UNMASKABLE) || (ah != NULL && ah->ah_config.ah_debug & mask) || (ath_hal_debug & mask)) { __va_list ap; va_start(ap, fmt); ath_hal_vprintf(ah, fmt, ap); va_end(ap); } } #undef HAL_DEBUG_UNMASKABLE #endif /* AH_DEBUG */ #ifdef AH_DEBUG_ALQ /* * ALQ register tracing support. * * Setting hw.ath.hal.alq=1 enables tracing of all register reads and * writes to the file /tmp/ath_hal.log. The file format is a simple * fixed-size array of records. When done logging set hw.ath.hal.alq=0 * and then decode the file with the arcode program (that is part of the * HAL). If you start+stop tracing the data will be appended to an * existing file. * * NB: doesn't handle multiple devices properly; only one DEVICE record * is emitted and the different devices are not identified. */ #include #include #include static struct alq *ath_hal_alq; static int ath_hal_alq_emitdev; /* need to emit DEVICE record */ static u_int ath_hal_alq_lost; /* count of lost records */ static char ath_hal_logfile[MAXPATHLEN] = "/tmp/ath_hal.log"; SYSCTL_STRING(_hw_ath_hal, OID_AUTO, alq_logfile, CTLFLAG_RW, &ath_hal_logfile, sizeof(kernelname), "Name of ALQ logfile"); static u_int ath_hal_alq_qsize = 64*1024; static int ath_hal_setlogging(int enable) { int error; if (enable) { error = alq_open(&ath_hal_alq, ath_hal_logfile, curthread->td_ucred, ALQ_DEFAULT_CMODE, sizeof (struct athregrec), ath_hal_alq_qsize); ath_hal_alq_lost = 0; ath_hal_alq_emitdev = 1; printf("ath_hal: logging to %s enabled\n", ath_hal_logfile); } else { if (ath_hal_alq) alq_close(ath_hal_alq); ath_hal_alq = NULL; printf("ath_hal: logging disabled\n"); error = 0; } return (error); } static int sysctl_hw_ath_hal_log(SYSCTL_HANDLER_ARGS) { int error, enable; enable = (ath_hal_alq != NULL); error = sysctl_handle_int(oidp, &enable, 0, req); if (error || !req->newptr) return (error); else return (ath_hal_setlogging(enable)); } -SYSCTL_PROC(_hw_ath_hal, OID_AUTO, alq, CTLTYPE_INT|CTLFLAG_RW, - 0, 0, sysctl_hw_ath_hal_log, "I", "Enable HAL register logging"); +SYSCTL_PROC(_hw_ath_hal, OID_AUTO, alq, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + 0, 0, sysctl_hw_ath_hal_log, "I", + "Enable HAL register logging"); SYSCTL_INT(_hw_ath_hal, OID_AUTO, alq_size, CTLFLAG_RW, &ath_hal_alq_qsize, 0, "In-memory log size (#records)"); SYSCTL_INT(_hw_ath_hal, OID_AUTO, alq_lost, CTLFLAG_RW, &ath_hal_alq_lost, 0, "Register operations not logged"); static struct ale * ath_hal_alq_get(struct ath_hal *ah) { struct ale *ale; if (ath_hal_alq_emitdev) { ale = alq_get(ath_hal_alq, ALQ_NOWAIT); if (ale) { struct athregrec *r = (struct athregrec *) ale->ae_data; r->op = OP_DEVICE; r->reg = 0; r->val = ah->ah_devid; alq_post(ath_hal_alq, ale); ath_hal_alq_emitdev = 0; } else ath_hal_alq_lost++; } ale = alq_get(ath_hal_alq, ALQ_NOWAIT); if (!ale) ath_hal_alq_lost++; return ale; } void ath_hal_reg_write(struct ath_hal *ah, u_int32_t reg, u_int32_t val) { bus_space_tag_t tag = BUSTAG(ah); bus_space_handle_t h = ah->ah_sh; #ifdef AH_DEBUG /* Debug - complain if we haven't fully waken things up */ if (! ath_hal_reg_whilst_asleep(ah, reg) && ah->ah_powerMode != HAL_PM_AWAKE) { ath_hal_printf(ah, "%s: reg=0x%08x, val=0x%08x, pm=%d\n", __func__, reg, val, ah->ah_powerMode); } #endif if (ath_hal_alq) { struct ale *ale = ath_hal_alq_get(ah); if (ale) { struct athregrec *r = (struct athregrec *) ale->ae_data; r->threadid = curthread->td_tid; r->op = OP_WRITE; r->reg = reg; r->val = val; alq_post(ath_hal_alq, ale); } } if (ah->ah_config.ah_serialise_reg_war) mtx_lock_spin(&ah_regser_mtx); bus_space_write_4(tag, h, reg, val); OS_BUS_BARRIER_REG(ah, reg, OS_BUS_BARRIER_WRITE); if (ah->ah_config.ah_serialise_reg_war) mtx_unlock_spin(&ah_regser_mtx); } u_int32_t ath_hal_reg_read(struct ath_hal *ah, u_int32_t reg) { bus_space_tag_t tag = BUSTAG(ah); bus_space_handle_t h = ah->ah_sh; u_int32_t val; #ifdef AH_DEBUG /* Debug - complain if we haven't fully waken things up */ if (! ath_hal_reg_whilst_asleep(ah, reg) && ah->ah_powerMode != HAL_PM_AWAKE) { ath_hal_printf(ah, "%s: reg=0x%08x, pm=%d\n", __func__, reg, ah->ah_powerMode); } #endif if (ah->ah_config.ah_serialise_reg_war) mtx_lock_spin(&ah_regser_mtx); OS_BUS_BARRIER_REG(ah, reg, OS_BUS_BARRIER_READ); val = bus_space_read_4(tag, h, reg); if (ah->ah_config.ah_serialise_reg_war) mtx_unlock_spin(&ah_regser_mtx); if (ath_hal_alq) { struct ale *ale = ath_hal_alq_get(ah); if (ale) { struct athregrec *r = (struct athregrec *) ale->ae_data; r->threadid = curthread->td_tid; r->op = OP_READ; r->reg = reg; r->val = val; alq_post(ath_hal_alq, ale); } } return val; } void OS_MARK(struct ath_hal *ah, u_int id, u_int32_t v) { if (ath_hal_alq) { struct ale *ale = ath_hal_alq_get(ah); if (ale) { struct athregrec *r = (struct athregrec *) ale->ae_data; r->threadid = curthread->td_tid; r->op = OP_MARK; r->reg = id; r->val = v; alq_post(ath_hal_alq, ale); } } } #else /* AH_DEBUG_ALQ */ /* * Memory-mapped device register read/write. These are here * as routines when debugging support is enabled and/or when * explicitly configured to use function calls. The latter is * for architectures that might need to do something before * referencing memory (e.g. remap an i/o window). * * NB: see the comments in ah_osdep.h about byte-swapping register * reads and writes to understand what's going on below. */ void ath_hal_reg_write(struct ath_hal *ah, u_int32_t reg, u_int32_t val) { bus_space_tag_t tag = BUSTAG(ah); bus_space_handle_t h = ah->ah_sh; #ifdef AH_DEBUG /* Debug - complain if we haven't fully waken things up */ if (! ath_hal_reg_whilst_asleep(ah, reg) && ah->ah_powerMode != HAL_PM_AWAKE) { ath_hal_printf(ah, "%s: reg=0x%08x, val=0x%08x, pm=%d\n", __func__, reg, val, ah->ah_powerMode); } #endif if (ah->ah_config.ah_serialise_reg_war) mtx_lock_spin(&ah_regser_mtx); bus_space_write_4(tag, h, reg, val); OS_BUS_BARRIER_REG(ah, reg, OS_BUS_BARRIER_WRITE); if (ah->ah_config.ah_serialise_reg_war) mtx_unlock_spin(&ah_regser_mtx); } u_int32_t ath_hal_reg_read(struct ath_hal *ah, u_int32_t reg) { bus_space_tag_t tag = BUSTAG(ah); bus_space_handle_t h = ah->ah_sh; u_int32_t val; #ifdef AH_DEBUG /* Debug - complain if we haven't fully waken things up */ if (! ath_hal_reg_whilst_asleep(ah, reg) && ah->ah_powerMode != HAL_PM_AWAKE) { ath_hal_printf(ah, "%s: reg=0x%08x, pm=%d\n", __func__, reg, ah->ah_powerMode); } #endif if (ah->ah_config.ah_serialise_reg_war) mtx_lock_spin(&ah_regser_mtx); OS_BUS_BARRIER_REG(ah, reg, OS_BUS_BARRIER_READ); val = bus_space_read_4(tag, h, reg); if (ah->ah_config.ah_serialise_reg_war) mtx_unlock_spin(&ah_regser_mtx); return val; } #endif /* AH_DEBUG_ALQ */ #ifdef AH_ASSERT void ath_hal_assert_failed(const char* filename, int lineno, const char *msg) { printf("Atheros HAL assertion failure: %s: line %u: %s\n", filename, lineno, msg); panic("ath_hal_assert"); } #endif /* AH_ASSERT */ static int ath_hal_modevent(module_t mod __unused, int type, void *data __unused) { int error = 0; switch (type) { case MOD_LOAD: printf("[ath_hal] loaded\n"); break; case MOD_UNLOAD: printf("[ath_hal] unloaded\n"); break; case MOD_SHUTDOWN: break; default: error = EOPNOTSUPP; break; } return (error); } DEV_MODULE(ath_hal, ath_hal_modevent, NULL); MODULE_VERSION(ath_hal, 1); #if defined(AH_DEBUG_ALQ) MODULE_DEPEND(ath_hal, alq, 1, 1, 1); #endif Index: projects/clang1000-import/sys/dev/ath/ath_rate/sample/sample.c =================================================================== --- projects/clang1000-import/sys/dev/ath/ath_rate/sample/sample.c (revision 358238) +++ projects/clang1000-import/sys/dev/ath/ath_rate/sample/sample.c (revision 358239) @@ -1,1405 +1,1405 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2005 John Bicket * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification. * 2. Redistributions in binary form must reproduce at minimum a disclaimer * similar to the "NO WARRANTY" disclaimer below ("Disclaimer") and any * redistribution must be conditioned upon including a substantially * similar Disclaimer requirement for further binary redistribution. * 3. Neither the names of the above-listed copyright holders nor the names * of any contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * Alternatively, this software may be distributed under the terms of the * GNU General Public License ("GPL") version 2 as published by the Free * Software Foundation. * * NO WARRANTY * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTIBILITY * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL * THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGES. * */ #include __FBSDID("$FreeBSD$"); /* * John Bicket's SampleRate control algorithm. */ #include "opt_ath.h" #include "opt_inet.h" #include "opt_wlan.h" #include "opt_ah.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* XXX for ether_sprintf */ #include #include #ifdef INET #include #include #endif #include #include #include #include /* * This file is an implementation of the SampleRate algorithm * in "Bit-rate Selection in Wireless Networks" * (http://www.pdos.lcs.mit.edu/papers/jbicket-ms.ps) * * SampleRate chooses the bit-rate it predicts will provide the most * throughput based on estimates of the expected per-packet * transmission time for each bit-rate. SampleRate periodically sends * packets at bit-rates other than the current one to estimate when * another bit-rate will provide better performance. SampleRate * switches to another bit-rate when its estimated per-packet * transmission time becomes smaller than the current bit-rate's. * SampleRate reduces the number of bit-rates it must sample by * eliminating those that could not perform better than the one * currently being used. SampleRate also stops probing at a bit-rate * if it experiences several successive losses. * * The difference between the algorithm in the thesis and the one in this * file is that the one in this file uses a ewma instead of a window. * * Also, this implementation tracks the average transmission time for * a few different packet sizes independently for each link. */ static void ath_rate_ctl_reset(struct ath_softc *, struct ieee80211_node *); static __inline int size_to_bin(int size) { #if NUM_PACKET_SIZE_BINS > 1 if (size <= packet_size_bins[0]) return 0; #endif #if NUM_PACKET_SIZE_BINS > 2 if (size <= packet_size_bins[1]) return 1; #endif #if NUM_PACKET_SIZE_BINS > 3 if (size <= packet_size_bins[2]) return 2; #endif #if NUM_PACKET_SIZE_BINS > 4 #error "add support for more packet sizes" #endif return NUM_PACKET_SIZE_BINS-1; } void ath_rate_node_init(struct ath_softc *sc, struct ath_node *an) { /* NB: assumed to be zero'd by caller */ } void ath_rate_node_cleanup(struct ath_softc *sc, struct ath_node *an) { } static int dot11rate(const HAL_RATE_TABLE *rt, int rix) { if (rix < 0) return -1; return rt->info[rix].phy == IEEE80211_T_HT ? rt->info[rix].dot11Rate : (rt->info[rix].dot11Rate & IEEE80211_RATE_VAL) / 2; } static const char * dot11rate_label(const HAL_RATE_TABLE *rt, int rix) { if (rix < 0) return ""; return rt->info[rix].phy == IEEE80211_T_HT ? "MCS" : "Mb "; } /* * Return the rix with the lowest average_tx_time, * or -1 if all the average_tx_times are 0. */ static __inline int pick_best_rate(struct ath_node *an, const HAL_RATE_TABLE *rt, int size_bin, int require_acked_before) { struct sample_node *sn = ATH_NODE_SAMPLE(an); int best_rate_rix, best_rate_tt, best_rate_pct; uint64_t mask; int rix, tt, pct; best_rate_rix = 0; best_rate_tt = 0; best_rate_pct = 0; for (mask = sn->ratemask, rix = 0; mask != 0; mask >>= 1, rix++) { if ((mask & 1) == 0) /* not a supported rate */ continue; /* Don't pick a non-HT rate for a HT node */ if ((an->an_node.ni_flags & IEEE80211_NODE_HT) && (rt->info[rix].phy != IEEE80211_T_HT)) { continue; } tt = sn->stats[size_bin][rix].average_tx_time; if (tt <= 0 || (require_acked_before && !sn->stats[size_bin][rix].packets_acked)) continue; /* Calculate percentage if possible */ if (sn->stats[size_bin][rix].total_packets > 0) { pct = sn->stats[size_bin][rix].ewma_pct; } else { /* XXX for now, assume 95% ok */ pct = 95; } /* don't use a bit-rate that has been failing */ if (sn->stats[size_bin][rix].successive_failures > 3) continue; /* * For HT, Don't use a bit rate that is much more * lossy than the best. * * XXX this isn't optimal; it's just designed to * eliminate rates that are going to be obviously * worse. */ if (an->an_node.ni_flags & IEEE80211_NODE_HT) { if (best_rate_pct > (pct + 50)) continue; } /* * For non-MCS rates, use the current average txtime for * comparison. */ if (! (an->an_node.ni_flags & IEEE80211_NODE_HT)) { if (best_rate_tt == 0 || tt <= best_rate_tt) { best_rate_tt = tt; best_rate_rix = rix; best_rate_pct = pct; } } /* * Since 2 stream rates have slightly higher TX times, * allow a little bit of leeway. This should later * be abstracted out and properly handled. */ if (an->an_node.ni_flags & IEEE80211_NODE_HT) { if (best_rate_tt == 0 || (tt * 8 <= best_rate_tt * 10)) { best_rate_tt = tt; best_rate_rix = rix; best_rate_pct = pct; } } } return (best_rate_tt ? best_rate_rix : -1); } /* * Pick a good "random" bit-rate to sample other than the current one. */ static __inline int pick_sample_rate(struct sample_softc *ssc , struct ath_node *an, const HAL_RATE_TABLE *rt, int size_bin) { #define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) #define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) struct sample_node *sn = ATH_NODE_SAMPLE(an); int current_rix, rix; unsigned current_tt; uint64_t mask; current_rix = sn->current_rix[size_bin]; if (current_rix < 0) { /* no successes yet, send at the lowest bit-rate */ /* XXX should return MCS0 if HT */ return 0; } current_tt = sn->stats[size_bin][current_rix].average_tx_time; rix = sn->last_sample_rix[size_bin]+1; /* next sample rate */ mask = sn->ratemask &~ ((uint64_t) 1<= rt->rateCount) rix = 0; continue; } /* * The following code stops trying to sample * non-MCS rates when speaking to an MCS node. * However, at least for CCK rates in 2.4GHz mode, * the non-MCS rates MAY actually provide better * PER at the very far edge of reception. * * However! Until ath_rate_form_aggr() grows * some logic to not form aggregates if the * selected rate is non-MCS, this won't work. * * So don't disable this code until you've taught * ath_rate_form_aggr() to drop out if any of * the selected rates are non-MCS. */ #if 1 /* if the node is HT and the rate isn't HT, don't bother sample */ if ((an->an_node.ni_flags & IEEE80211_NODE_HT) && (rt->info[rix].phy != IEEE80211_T_HT)) { mask &= ~((uint64_t) 1<stats[size_bin][rix].perfect_tx_time > current_tt) { mask &= ~((uint64_t) 1<stats[size_bin][rix].successive_failures > ssc->max_successive_failures && ticks - sn->stats[size_bin][rix].last_tx < ssc->stale_failure_timeout) { mask &= ~((uint64_t) 1<an_node.ni_flags & IEEE80211_NODE_HT) { if (rix < (current_rix - 3) || rix > (current_rix + 3)) { mask &= ~((uint64_t) 1< 11M for non-HT rates */ if (! (an->an_node.ni_flags & IEEE80211_NODE_HT)) { if (DOT11RATE(rix) > 2*11 && rix > current_rix + 2) { mask &= ~((uint64_t) 1<last_sample_rix[size_bin] = rix; return rix; } return current_rix; #undef DOT11RATE #undef MCS } static int ath_rate_get_static_rix(struct ath_softc *sc, const struct ieee80211_node *ni) { #define RATE(_ix) (ni->ni_rates.rs_rates[(_ix)] & IEEE80211_RATE_VAL) #define DOT11RATE(_ix) (rt->info[(_ix)].dot11Rate & IEEE80211_RATE_VAL) #define MCS(_ix) (ni->ni_htrates.rs_rates[_ix] | IEEE80211_RATE_MCS) const struct ieee80211_txparam *tp = ni->ni_txparms; int srate; /* Check MCS rates */ for (srate = ni->ni_htrates.rs_nrates - 1; srate >= 0; srate--) { if (MCS(srate) == tp->ucastrate) return sc->sc_rixmap[tp->ucastrate]; } /* Check legacy rates */ for (srate = ni->ni_rates.rs_nrates - 1; srate >= 0; srate--) { if (RATE(srate) == tp->ucastrate) return sc->sc_rixmap[tp->ucastrate]; } return -1; #undef RATE #undef DOT11RATE #undef MCS } static void ath_rate_update_static_rix(struct ath_softc *sc, struct ieee80211_node *ni) { struct ath_node *an = ATH_NODE(ni); const struct ieee80211_txparam *tp = ni->ni_txparms; struct sample_node *sn = ATH_NODE_SAMPLE(an); if (tp != NULL && tp->ucastrate != IEEE80211_FIXED_RATE_NONE) { /* * A fixed rate is to be used; ucastrate is the IEEE code * for this rate (sans basic bit). Check this against the * negotiated rate set for the node. Note the fixed rate * may not be available for various reasons so we only * setup the static rate index if the lookup is successful. */ sn->static_rix = ath_rate_get_static_rix(sc, ni); } else { sn->static_rix = -1; } } /* * Pick a non-HT rate to begin using. */ static int ath_rate_pick_seed_rate_legacy(struct ath_softc *sc, struct ath_node *an, int frameLen) { #define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) #define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) #define RATE(ix) (DOT11RATE(ix) / 2) int rix = -1; const HAL_RATE_TABLE *rt = sc->sc_currates; struct sample_node *sn = ATH_NODE_SAMPLE(an); const int size_bin = size_to_bin(frameLen); /* no packet has been sent successfully yet */ for (rix = rt->rateCount-1; rix > 0; rix--) { if ((sn->ratemask & ((uint64_t) 1<info[rix].phy == IEEE80211_T_HT) continue; /* * Pick the highest rate <= 36 Mbps * that hasn't failed. */ if (DOT11RATE(rix) <= 72 && sn->stats[size_bin][rix].successive_failures == 0) { break; } } return rix; #undef RATE #undef MCS #undef DOT11RATE } /* * Pick a HT rate to begin using. * * Don't use any non-HT rates; only consider HT rates. */ static int ath_rate_pick_seed_rate_ht(struct ath_softc *sc, struct ath_node *an, int frameLen) { #define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) #define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) #define RATE(ix) (DOT11RATE(ix) / 2) int rix = -1, ht_rix = -1; const HAL_RATE_TABLE *rt = sc->sc_currates; struct sample_node *sn = ATH_NODE_SAMPLE(an); const int size_bin = size_to_bin(frameLen); /* no packet has been sent successfully yet */ for (rix = rt->rateCount-1; rix > 0; rix--) { /* Skip rates we can't use */ if ((sn->ratemask & ((uint64_t) 1<info[rix].phy == IEEE80211_T_HT) ht_rix = rix; /* Skip non-HT rates */ if (rt->info[rix].phy != IEEE80211_T_HT) continue; /* * Pick a medium-speed rate regardless of stream count * which has not seen any failures. Higher rates may fail; * we'll try them later. */ if (((MCS(rix) & 0x7) <= 4) && sn->stats[size_bin][rix].successive_failures == 0) { break; } } /* * If all the MCS rates have successive failures, rix should be * > 0; otherwise use the lowest MCS rix (hopefully MCS 0.) */ return MAX(rix, ht_rix); #undef RATE #undef MCS #undef DOT11RATE } void ath_rate_findrate(struct ath_softc *sc, struct ath_node *an, int shortPreamble, size_t frameLen, u_int8_t *rix0, int *try0, u_int8_t *txrate) { #define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) #define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) #define RATE(ix) (DOT11RATE(ix) / 2) struct sample_node *sn = ATH_NODE_SAMPLE(an); struct sample_softc *ssc = ATH_SOFTC_SAMPLE(sc); struct ieee80211com *ic = &sc->sc_ic; const HAL_RATE_TABLE *rt = sc->sc_currates; const int size_bin = size_to_bin(frameLen); int rix, mrr, best_rix, change_rates; unsigned average_tx_time; ath_rate_update_static_rix(sc, &an->an_node); if (sn->currates != sc->sc_currates) { device_printf(sc->sc_dev, "%s: currates != sc_currates!\n", __func__); rix = 0; *try0 = ATH_TXMAXTRY; goto done; } if (sn->static_rix != -1) { rix = sn->static_rix; *try0 = ATH_TXMAXTRY; goto done; } mrr = sc->sc_mrretry; /* XXX check HT protmode too */ if (mrr && (ic->ic_flags & IEEE80211_F_USEPROT && !sc->sc_mrrprot)) mrr = 0; best_rix = pick_best_rate(an, rt, size_bin, !mrr); if (best_rix >= 0) { average_tx_time = sn->stats[size_bin][best_rix].average_tx_time; } else { average_tx_time = 0; } /* * Limit the time measuring the performance of other tx * rates to sample_rate% of the total transmission time. */ if (sn->sample_tt[size_bin] < average_tx_time * (sn->packets_since_sample[size_bin]*ssc->sample_rate/100)) { rix = pick_sample_rate(ssc, an, rt, size_bin); IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "att %d sample_tt %d size %u sample rate %d %s current rate %d %s", average_tx_time, sn->sample_tt[size_bin], bin_to_size(size_bin), dot11rate(rt, rix), dot11rate_label(rt, rix), dot11rate(rt, sn->current_rix[size_bin]), dot11rate_label(rt, sn->current_rix[size_bin])); if (rix != sn->current_rix[size_bin]) { sn->current_sample_rix[size_bin] = rix; } else { sn->current_sample_rix[size_bin] = -1; } sn->packets_since_sample[size_bin] = 0; } else { change_rates = 0; if (!sn->packets_sent[size_bin] || best_rix == -1) { /* no packet has been sent successfully yet */ change_rates = 1; if (an->an_node.ni_flags & IEEE80211_NODE_HT) best_rix = ath_rate_pick_seed_rate_ht(sc, an, frameLen); else best_rix = ath_rate_pick_seed_rate_legacy(sc, an, frameLen); } else if (sn->packets_sent[size_bin] < 20) { /* let the bit-rate switch quickly during the first few packets */ IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: switching quickly..", __func__); change_rates = 1; } else if (ticks - ssc->min_switch > sn->ticks_since_switch[size_bin]) { /* min_switch seconds have gone by */ IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: min_switch %d > ticks_since_switch %d..", __func__, ticks - ssc->min_switch, sn->ticks_since_switch[size_bin]); change_rates = 1; } else if ((! (an->an_node.ni_flags & IEEE80211_NODE_HT)) && (2*average_tx_time < sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time)) { /* the current bit-rate is twice as slow as the best one */ IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: 2x att (= %d) < cur_rix att %d", __func__, 2 * average_tx_time, sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time); change_rates = 1; } else if ((an->an_node.ni_flags & IEEE80211_NODE_HT)) { int cur_rix = sn->current_rix[size_bin]; int cur_att = sn->stats[size_bin][cur_rix].average_tx_time; /* * If the node is HT, upgrade it if the MCS rate is * higher and the average tx time is within 20% of * the current rate. It can fail a little. * * This is likely not optimal! */ #if 0 printf("cur rix/att %x/%d, best rix/att %x/%d\n", MCS(cur_rix), cur_att, MCS(best_rix), average_tx_time); #endif if ((MCS(best_rix) > MCS(cur_rix)) && (average_tx_time * 8) <= (cur_att * 10)) { IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: HT: best_rix 0x%d > cur_rix 0x%x, average_tx_time %d, cur_att %d", __func__, MCS(best_rix), MCS(cur_rix), average_tx_time, cur_att); change_rates = 1; } } sn->packets_since_sample[size_bin]++; if (change_rates) { if (best_rix != sn->current_rix[size_bin]) { IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: size %d switch rate %d (%d/%d) -> %d (%d/%d) after %d packets mrr %d", __func__, bin_to_size(size_bin), RATE(sn->current_rix[size_bin]), sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time, sn->stats[size_bin][sn->current_rix[size_bin]].perfect_tx_time, RATE(best_rix), sn->stats[size_bin][best_rix].average_tx_time, sn->stats[size_bin][best_rix].perfect_tx_time, sn->packets_since_switch[size_bin], mrr); } sn->packets_since_switch[size_bin] = 0; sn->current_rix[size_bin] = best_rix; sn->ticks_since_switch[size_bin] = ticks; /* * Set the visible txrate for this node. */ an->an_node.ni_txrate = (rt->info[best_rix].phy == IEEE80211_T_HT) ? MCS(best_rix) : DOT11RATE(best_rix); } rix = sn->current_rix[size_bin]; sn->packets_since_switch[size_bin]++; } *try0 = mrr ? sn->sched[rix].t0 : ATH_TXMAXTRY; done: /* * This bug totally sucks and should be fixed. * * For now though, let's not panic, so we can start to figure * out how to better reproduce it. */ if (rix < 0 || rix >= rt->rateCount) { printf("%s: ERROR: rix %d out of bounds (rateCount=%d)\n", __func__, rix, rt->rateCount); rix = 0; /* XXX just default for now */ } KASSERT(rix >= 0 && rix < rt->rateCount, ("rix is %d", rix)); *rix0 = rix; *txrate = rt->info[rix].rateCode | (shortPreamble ? rt->info[rix].shortPreamble : 0); sn->packets_sent[size_bin]++; #undef DOT11RATE #undef MCS #undef RATE } /* * Get the TX rates. Don't fiddle with short preamble flags for them; * the caller can do that. */ void ath_rate_getxtxrates(struct ath_softc *sc, struct ath_node *an, uint8_t rix0, struct ath_rc_series *rc) { struct sample_node *sn = ATH_NODE_SAMPLE(an); const struct txschedule *sched = &sn->sched[rix0]; KASSERT(rix0 == sched->r0, ("rix0 (%x) != sched->r0 (%x)!\n", rix0, sched->r0)); rc[0].flags = rc[1].flags = rc[2].flags = rc[3].flags = 0; rc[0].rix = sched->r0; rc[1].rix = sched->r1; rc[2].rix = sched->r2; rc[3].rix = sched->r3; rc[0].tries = sched->t0; rc[1].tries = sched->t1; rc[2].tries = sched->t2; rc[3].tries = sched->t3; } void ath_rate_setupxtxdesc(struct ath_softc *sc, struct ath_node *an, struct ath_desc *ds, int shortPreamble, u_int8_t rix) { struct sample_node *sn = ATH_NODE_SAMPLE(an); const struct txschedule *sched = &sn->sched[rix]; const HAL_RATE_TABLE *rt = sc->sc_currates; uint8_t rix1, s1code, rix2, s2code, rix3, s3code; /* XXX precalculate short preamble tables */ rix1 = sched->r1; s1code = rt->info[rix1].rateCode | (shortPreamble ? rt->info[rix1].shortPreamble : 0); rix2 = sched->r2; s2code = rt->info[rix2].rateCode | (shortPreamble ? rt->info[rix2].shortPreamble : 0); rix3 = sched->r3; s3code = rt->info[rix3].rateCode | (shortPreamble ? rt->info[rix3].shortPreamble : 0); ath_hal_setupxtxdesc(sc->sc_ah, ds, s1code, sched->t1, /* series 1 */ s2code, sched->t2, /* series 2 */ s3code, sched->t3); /* series 3 */ } static void update_stats(struct ath_softc *sc, struct ath_node *an, int frame_size, int rix0, int tries0, int rix1, int tries1, int rix2, int tries2, int rix3, int tries3, int short_tries, int tries, int status, int nframes, int nbad) { struct sample_node *sn = ATH_NODE_SAMPLE(an); struct sample_softc *ssc = ATH_SOFTC_SAMPLE(sc); #ifdef IEEE80211_DEBUG const HAL_RATE_TABLE *rt = sc->sc_currates; #endif const int size_bin = size_to_bin(frame_size); const int size = bin_to_size(size_bin); int tt, tries_so_far; int is_ht40 = (an->an_node.ni_chw == 40); int pct; if (!IS_RATE_DEFINED(sn, rix0)) return; tt = calc_usecs_unicast_packet(sc, size, rix0, short_tries, MIN(tries0, tries) - 1, is_ht40); tries_so_far = tries0; if (tries1 && tries_so_far < tries) { if (!IS_RATE_DEFINED(sn, rix1)) return; tt += calc_usecs_unicast_packet(sc, size, rix1, short_tries, MIN(tries1 + tries_so_far, tries) - tries_so_far - 1, is_ht40); tries_so_far += tries1; } if (tries2 && tries_so_far < tries) { if (!IS_RATE_DEFINED(sn, rix2)) return; tt += calc_usecs_unicast_packet(sc, size, rix2, short_tries, MIN(tries2 + tries_so_far, tries) - tries_so_far - 1, is_ht40); tries_so_far += tries2; } if (tries3 && tries_so_far < tries) { if (!IS_RATE_DEFINED(sn, rix3)) return; tt += calc_usecs_unicast_packet(sc, size, rix3, short_tries, MIN(tries3 + tries_so_far, tries) - tries_so_far - 1, is_ht40); } if (sn->stats[size_bin][rix0].total_packets < ssc->smoothing_minpackets) { /* just average the first few packets */ int avg_tx = sn->stats[size_bin][rix0].average_tx_time; int packets = sn->stats[size_bin][rix0].total_packets; sn->stats[size_bin][rix0].average_tx_time = (tt+(avg_tx*packets))/(packets+nframes); } else { /* use a ewma */ sn->stats[size_bin][rix0].average_tx_time = ((sn->stats[size_bin][rix0].average_tx_time * ssc->smoothing_rate) + (tt * (100 - ssc->smoothing_rate))) / 100; } /* * XXX Don't mark the higher bit rates as also having failed; as this * unfortunately stops those rates from being tasted when trying to * TX. This happens with 11n aggregation. * * This is valid for higher CCK rates, higher OFDM rates, and higher * HT rates within the current number of streams (eg MCS0..7, 8..15, * etc.) */ if (nframes == nbad) { #if 0 int y; #endif sn->stats[size_bin][rix0].successive_failures += nbad; #if 0 for (y = size_bin+1; y < NUM_PACKET_SIZE_BINS; y++) { /* * Also say larger packets failed since we * assume if a small packet fails at a * bit-rate then a larger one will also. */ sn->stats[y][rix0].successive_failures += nbad; sn->stats[y][rix0].last_tx = ticks; sn->stats[y][rix0].tries += tries; sn->stats[y][rix0].total_packets += nframes; } #endif } else { sn->stats[size_bin][rix0].packets_acked += (nframes - nbad); sn->stats[size_bin][rix0].successive_failures = 0; } sn->stats[size_bin][rix0].tries += tries; sn->stats[size_bin][rix0].last_tx = ticks; sn->stats[size_bin][rix0].total_packets += nframes; /* update EWMA for this rix */ /* Calculate percentage based on current rate */ if (nframes == 0) nframes = nbad = 1; pct = ((nframes - nbad) * 1000) / nframes; if (sn->stats[size_bin][rix0].total_packets < ssc->smoothing_minpackets) { /* just average the first few packets */ int a_pct = (sn->stats[size_bin][rix0].packets_acked * 1000) / (sn->stats[size_bin][rix0].total_packets); sn->stats[size_bin][rix0].ewma_pct = a_pct; } else { /* use a ewma */ sn->stats[size_bin][rix0].ewma_pct = ((sn->stats[size_bin][rix0].ewma_pct * ssc->smoothing_rate) + (pct * (100 - ssc->smoothing_rate))) / 100; } if (rix0 == sn->current_sample_rix[size_bin]) { IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: size %d %s sample rate %d %s tries (%d/%d) tt %d avg_tt (%d/%d) nfrm %d nbad %d", __func__, size, status ? "FAIL" : "OK", dot11rate(rt, rix0), dot11rate_label(rt, rix0), short_tries, tries, tt, sn->stats[size_bin][rix0].average_tx_time, sn->stats[size_bin][rix0].perfect_tx_time, nframes, nbad); sn->sample_tt[size_bin] = tt; sn->current_sample_rix[size_bin] = -1; } } static void badrate(struct ath_softc *sc, int series, int hwrate, int tries, int status) { device_printf(sc->sc_dev, "bad series%d hwrate 0x%x, tries %u ts_status 0x%x\n", series, hwrate, tries, status); } void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, const struct ath_rc_series *rc, const struct ath_tx_status *ts, int frame_size, int nframes, int nbad) { struct ieee80211com *ic = &sc->sc_ic; struct sample_node *sn = ATH_NODE_SAMPLE(an); int final_rix, short_tries, long_tries; const HAL_RATE_TABLE *rt = sc->sc_currates; int status = ts->ts_status; int mrr; final_rix = rt->rateCodeToIndex[ts->ts_rate]; short_tries = ts->ts_shortretry; long_tries = ts->ts_longretry + 1; if (nframes == 0) { device_printf(sc->sc_dev, "%s: nframes=0?\n", __func__); return; } if (frame_size == 0) /* NB: should not happen */ frame_size = 1500; if (sn->ratemask == 0) { IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: size %d %s rate/try %d/%d no rates yet", __func__, bin_to_size(size_to_bin(frame_size)), status ? "FAIL" : "OK", short_tries, long_tries); return; } mrr = sc->sc_mrretry; /* XXX check HT protmode too */ if (mrr && (ic->ic_flags & IEEE80211_F_USEPROT && !sc->sc_mrrprot)) mrr = 0; if (!mrr || ts->ts_finaltsi == 0) { if (!IS_RATE_DEFINED(sn, final_rix)) { device_printf(sc->sc_dev, "%s: ts_rate=%d ts_finaltsi=%d, final_rix=%d\n", __func__, ts->ts_rate, ts->ts_finaltsi, final_rix); badrate(sc, 0, ts->ts_rate, long_tries, status); return; } /* * Only one rate was used; optimize work. */ IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: size %d (%d bytes) %s rate/short/long %d %s/%d/%d nframes/nbad [%d/%d]", __func__, bin_to_size(size_to_bin(frame_size)), frame_size, status ? "FAIL" : "OK", dot11rate(rt, final_rix), dot11rate_label(rt, final_rix), short_tries, long_tries, nframes, nbad); update_stats(sc, an, frame_size, final_rix, long_tries, 0, 0, 0, 0, 0, 0, short_tries, long_tries, status, nframes, nbad); } else { int finalTSIdx = ts->ts_finaltsi; int i; /* * Process intermediate rates that failed. */ IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, "%s: size %d (%d bytes) finaltsidx %d short %d long %d %s rate/try [%d %s/%d %d %s/%d %d %s/%d %d %s/%d] nframes/nbad [%d/%d]", __func__, bin_to_size(size_to_bin(frame_size)), frame_size, finalTSIdx, short_tries, long_tries, status ? "FAIL" : "OK", dot11rate(rt, rc[0].rix), dot11rate_label(rt, rc[0].rix), rc[0].tries, dot11rate(rt, rc[1].rix), dot11rate_label(rt, rc[1].rix), rc[1].tries, dot11rate(rt, rc[2].rix), dot11rate_label(rt, rc[2].rix), rc[2].tries, dot11rate(rt, rc[3].rix), dot11rate_label(rt, rc[3].rix), rc[3].tries, nframes, nbad); for (i = 0; i < 4; i++) { if (rc[i].tries && !IS_RATE_DEFINED(sn, rc[i].rix)) badrate(sc, 0, rc[i].ratecode, rc[i].tries, status); } /* * NB: series > 0 are not penalized for failure * based on the try counts under the assumption * that losses are often bursty and since we * sample higher rates 1 try at a time doing so * may unfairly penalize them. */ if (rc[0].tries) { update_stats(sc, an, frame_size, rc[0].rix, rc[0].tries, rc[1].rix, rc[1].tries, rc[2].rix, rc[2].tries, rc[3].rix, rc[3].tries, short_tries, long_tries, long_tries > rc[0].tries, nframes, nbad); long_tries -= rc[0].tries; } if (rc[1].tries && finalTSIdx > 0) { update_stats(sc, an, frame_size, rc[1].rix, rc[1].tries, rc[2].rix, rc[2].tries, rc[3].rix, rc[3].tries, 0, 0, short_tries, long_tries, status, nframes, nbad); long_tries -= rc[1].tries; } if (rc[2].tries && finalTSIdx > 1) { update_stats(sc, an, frame_size, rc[2].rix, rc[2].tries, rc[3].rix, rc[3].tries, 0, 0, 0, 0, short_tries, long_tries, status, nframes, nbad); long_tries -= rc[2].tries; } if (rc[3].tries && finalTSIdx > 2) { update_stats(sc, an, frame_size, rc[3].rix, rc[3].tries, 0, 0, 0, 0, 0, 0, short_tries, long_tries, status, nframes, nbad); } } } void ath_rate_newassoc(struct ath_softc *sc, struct ath_node *an, int isnew) { if (isnew) ath_rate_ctl_reset(sc, &an->an_node); } void ath_rate_update_rx_rssi(struct ath_softc *sc, struct ath_node *an, int rssi) { } static const struct txschedule *mrr_schedules[IEEE80211_MODE_MAX+2] = { NULL, /* IEEE80211_MODE_AUTO */ series_11a, /* IEEE80211_MODE_11A */ series_11g, /* IEEE80211_MODE_11B */ series_11g, /* IEEE80211_MODE_11G */ NULL, /* IEEE80211_MODE_FH */ series_11a, /* IEEE80211_MODE_TURBO_A */ series_11g, /* IEEE80211_MODE_TURBO_G */ series_11a, /* IEEE80211_MODE_STURBO_A */ series_11na, /* IEEE80211_MODE_11NA */ series_11ng, /* IEEE80211_MODE_11NG */ series_half, /* IEEE80211_MODE_HALF */ series_quarter, /* IEEE80211_MODE_QUARTER */ }; /* * Initialize the tables for a node. */ static void ath_rate_ctl_reset(struct ath_softc *sc, struct ieee80211_node *ni) { #define RATE(_ix) (ni->ni_rates.rs_rates[(_ix)] & IEEE80211_RATE_VAL) #define DOT11RATE(_ix) (rt->info[(_ix)].dot11Rate & IEEE80211_RATE_VAL) #define MCS(_ix) (ni->ni_htrates.rs_rates[_ix] | IEEE80211_RATE_MCS) struct ath_node *an = ATH_NODE(ni); struct sample_node *sn = ATH_NODE_SAMPLE(an); const HAL_RATE_TABLE *rt = sc->sc_currates; int x, y, rix; KASSERT(rt != NULL, ("no rate table, mode %u", sc->sc_curmode)); KASSERT(sc->sc_curmode < IEEE80211_MODE_MAX+2, ("curmode %u", sc->sc_curmode)); sn->sched = mrr_schedules[sc->sc_curmode]; KASSERT(sn->sched != NULL, ("no mrr schedule for mode %u", sc->sc_curmode)); sn->static_rix = -1; ath_rate_update_static_rix(sc, ni); sn->currates = sc->sc_currates; /* * Construct a bitmask of usable rates. This has all * negotiated rates minus those marked by the hal as * to be ignored for doing rate control. */ sn->ratemask = 0; /* MCS rates */ if (ni->ni_flags & IEEE80211_NODE_HT) { for (x = 0; x < ni->ni_htrates.rs_nrates; x++) { rix = sc->sc_rixmap[MCS(x)]; if (rix == 0xff) continue; /* skip rates marked broken by hal */ if (!rt->info[rix].valid) continue; KASSERT(rix < SAMPLE_MAXRATES, ("mcs %u has rix %d", MCS(x), rix)); sn->ratemask |= (uint64_t) 1<ni_rates.rs_nrates; x++) { rix = sc->sc_rixmap[RATE(x)]; if (rix == 0xff) continue; /* skip rates marked broken by hal */ if (!rt->info[rix].valid) continue; KASSERT(rix < SAMPLE_MAXRATES, ("rate %u has rix %d", RATE(x), rix)); sn->ratemask |= (uint64_t) 1<ni_vap, IEEE80211_MSG_RATECTL)) { uint64_t mask; ieee80211_note(ni->ni_vap, "[%6D] %s: size 1600 rate/tt", ni->ni_macaddr, ":", __func__); for (mask = sn->ratemask, rix = 0; mask != 0; mask >>= 1, rix++) { if ((mask & 1) == 0) continue; printf(" %d %s/%d", dot11rate(rt, rix), dot11rate_label(rt, rix), calc_usecs_unicast_packet(sc, 1600, rix, 0,0, (ni->ni_chw == 40))); } printf("\n"); } #endif for (y = 0; y < NUM_PACKET_SIZE_BINS; y++) { int size = bin_to_size(y); uint64_t mask; sn->packets_sent[y] = 0; sn->current_sample_rix[y] = -1; sn->last_sample_rix[y] = 0; /* XXX start with first valid rate */ sn->current_rix[y] = ffs(sn->ratemask)-1; /* * Initialize the statistics buckets; these are * indexed by the rate code index. */ for (rix = 0, mask = sn->ratemask; mask != 0; rix++, mask >>= 1) { if ((mask & 1) == 0) /* not a valid rate */ continue; sn->stats[y][rix].successive_failures = 0; sn->stats[y][rix].tries = 0; sn->stats[y][rix].total_packets = 0; sn->stats[y][rix].packets_acked = 0; sn->stats[y][rix].last_tx = 0; sn->stats[y][rix].ewma_pct = 0; sn->stats[y][rix].perfect_tx_time = calc_usecs_unicast_packet(sc, size, rix, 0, 0, (ni->ni_chw == 40)); sn->stats[y][rix].average_tx_time = sn->stats[y][rix].perfect_tx_time; } } #if 0 /* XXX 0, num_rates-1 are wrong */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "%s: %d rates %d%sMbps (%dus)- %d%sMbps (%dus)", __func__, sn->num_rates, DOT11RATE(0)/2, DOT11RATE(0) % 1 ? ".5" : "", sn->stats[1][0].perfect_tx_time, DOT11RATE(sn->num_rates-1)/2, DOT11RATE(sn->num_rates-1) % 1 ? ".5" : "", sn->stats[1][sn->num_rates-1].perfect_tx_time ); #endif /* set the visible bit-rate */ if (sn->static_rix != -1) ni->ni_txrate = DOT11RATE(sn->static_rix); else ni->ni_txrate = RATE(0); #undef RATE #undef DOT11RATE } /* * Fetch the statistics for the given node. * * The ieee80211 node must be referenced and unlocked, however the ath_node * must be locked. * * The main difference here is that we convert the rate indexes * to 802.11 rates, or the userland output won't make much sense * as it has no access to the rix table. */ int ath_rate_fetch_node_stats(struct ath_softc *sc, struct ath_node *an, struct ath_rateioctl *rs) { struct sample_node *sn = ATH_NODE_SAMPLE(an); const HAL_RATE_TABLE *rt = sc->sc_currates; struct ath_rateioctl_tlv av; struct ath_rateioctl_rt *tv; int y; int o = 0; ATH_NODE_LOCK_ASSERT(an); /* * Ensure there's enough space for the statistics. */ if (rs->len < sizeof(struct ath_rateioctl_tlv) + sizeof(struct ath_rateioctl_rt) + sizeof(struct ath_rateioctl_tlv) + sizeof(struct sample_node)) { device_printf(sc->sc_dev, "%s: len=%d, too short\n", __func__, rs->len); return (EINVAL); } /* * Take a temporary copy of the sample node state so we can * modify it before we copy it. */ tv = malloc(sizeof(struct ath_rateioctl_rt), M_TEMP, M_NOWAIT | M_ZERO); if (tv == NULL) { return (ENOMEM); } /* * Populate the rate table mapping TLV. */ tv->nentries = rt->rateCount; for (y = 0; y < rt->rateCount; y++) { tv->ratecode[y] = rt->info[y].dot11Rate & IEEE80211_RATE_VAL; if (rt->info[y].phy == IEEE80211_T_HT) tv->ratecode[y] |= IEEE80211_RATE_MCS; } o = 0; /* * First TLV - rate code mapping */ av.tlv_id = ATH_RATE_TLV_RATETABLE; av.tlv_len = sizeof(struct ath_rateioctl_rt); copyout(&av, rs->buf + o, sizeof(struct ath_rateioctl_tlv)); o += sizeof(struct ath_rateioctl_tlv); copyout(tv, rs->buf + o, sizeof(struct ath_rateioctl_rt)); o += sizeof(struct ath_rateioctl_rt); /* * Second TLV - sample node statistics */ av.tlv_id = ATH_RATE_TLV_SAMPLENODE; av.tlv_len = sizeof(struct sample_node); copyout(&av, rs->buf + o, sizeof(struct ath_rateioctl_tlv)); o += sizeof(struct ath_rateioctl_tlv); /* * Copy the statistics over to the provided buffer. */ copyout(sn, rs->buf + o, sizeof(struct sample_node)); o += sizeof(struct sample_node); free(tv, M_TEMP); return (0); } static void sample_stats(void *arg, struct ieee80211_node *ni) { struct ath_softc *sc = arg; const HAL_RATE_TABLE *rt = sc->sc_currates; struct sample_node *sn = ATH_NODE_SAMPLE(ATH_NODE(ni)); uint64_t mask; int rix, y; printf("\n[%s] refcnt %d static_rix (%d %s) ratemask 0x%jx\n", ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni), dot11rate(rt, sn->static_rix), dot11rate_label(rt, sn->static_rix), (uintmax_t)sn->ratemask); for (y = 0; y < NUM_PACKET_SIZE_BINS; y++) { printf("[%4u] cur rix %d (%d %s) since switch: packets %d ticks %u\n", bin_to_size(y), sn->current_rix[y], dot11rate(rt, sn->current_rix[y]), dot11rate_label(rt, sn->current_rix[y]), sn->packets_since_switch[y], sn->ticks_since_switch[y]); printf("[%4u] last sample (%d %s) cur sample (%d %s) packets sent %d\n", bin_to_size(y), dot11rate(rt, sn->last_sample_rix[y]), dot11rate_label(rt, sn->last_sample_rix[y]), dot11rate(rt, sn->current_sample_rix[y]), dot11rate_label(rt, sn->current_sample_rix[y]), sn->packets_sent[y]); printf("[%4u] packets since sample %d sample tt %u\n", bin_to_size(y), sn->packets_since_sample[y], sn->sample_tt[y]); } for (mask = sn->ratemask, rix = 0; mask != 0; mask >>= 1, rix++) { if ((mask & 1) == 0) continue; for (y = 0; y < NUM_PACKET_SIZE_BINS; y++) { if (sn->stats[y][rix].total_packets == 0) continue; printf("[%2u %s:%4u] %8ju:%-8ju (%3d%%) (EWMA %3d.%1d%%) T %8ju F %4d avg %5u last %u\n", dot11rate(rt, rix), dot11rate_label(rt, rix), bin_to_size(y), (uintmax_t) sn->stats[y][rix].total_packets, (uintmax_t) sn->stats[y][rix].packets_acked, (int) ((sn->stats[y][rix].packets_acked * 100ULL) / sn->stats[y][rix].total_packets), sn->stats[y][rix].ewma_pct / 10, sn->stats[y][rix].ewma_pct % 10, (uintmax_t) sn->stats[y][rix].tries, sn->stats[y][rix].successive_failures, sn->stats[y][rix].average_tx_time, ticks - sn->stats[y][rix].last_tx); } } } static int ath_rate_sysctl_stats(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; struct ieee80211com *ic = &sc->sc_ic; int error, v; v = 0; error = sysctl_handle_int(oidp, &v, 0, req); if (error || !req->newptr) return error; ieee80211_iterate_nodes(&ic->ic_sta, sample_stats, sc); return 0; } static int ath_rate_sysctl_smoothing_rate(SYSCTL_HANDLER_ARGS) { struct sample_softc *ssc = arg1; int rate, error; rate = ssc->smoothing_rate; error = sysctl_handle_int(oidp, &rate, 0, req); if (error || !req->newptr) return error; if (!(0 <= rate && rate < 100)) return EINVAL; ssc->smoothing_rate = rate; ssc->smoothing_minpackets = 100 / (100 - rate); return 0; } static int ath_rate_sysctl_sample_rate(SYSCTL_HANDLER_ARGS) { struct sample_softc *ssc = arg1; int rate, error; rate = ssc->sample_rate; error = sysctl_handle_int(oidp, &rate, 0, req); if (error || !req->newptr) return error; if (!(2 <= rate && rate <= 100)) return EINVAL; ssc->sample_rate = rate; return 0; } static void ath_rate_sysctlattach(struct ath_softc *sc, struct sample_softc *ssc) { struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "smoothing_rate", CTLTYPE_INT | CTLFLAG_RW, ssc, 0, - ath_rate_sysctl_smoothing_rate, "I", + "smoothing_rate", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + ssc, 0, ath_rate_sysctl_smoothing_rate, "I", "sample: smoothing rate for avg tx time (%%)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "sample_rate", CTLTYPE_INT | CTLFLAG_RW, ssc, 0, - ath_rate_sysctl_sample_rate, "I", + "sample_rate", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + ssc, 0, ath_rate_sysctl_sample_rate, "I", "sample: percent air time devoted to sampling new rates (%%)"); /* XXX max_successive_failures, stale_failure_timeout, min_switch */ SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "sample_stats", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_rate_sysctl_stats, "I", "sample: print statistics"); + "sample_stats", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + sc, 0, ath_rate_sysctl_stats, "I", "sample: print statistics"); } struct ath_ratectrl * ath_rate_attach(struct ath_softc *sc) { struct sample_softc *ssc; ssc = malloc(sizeof(struct sample_softc), M_DEVBUF, M_NOWAIT|M_ZERO); if (ssc == NULL) return NULL; ssc->arc.arc_space = sizeof(struct sample_node); ssc->smoothing_rate = 75; /* ewma percentage ([0..99]) */ ssc->smoothing_minpackets = 100 / (100 - ssc->smoothing_rate); ssc->sample_rate = 10; /* %time to try diff tx rates */ ssc->max_successive_failures = 3; /* threshold for rate sampling*/ ssc->stale_failure_timeout = 10 * hz; /* 10 seconds */ ssc->min_switch = hz; /* 1 second */ ath_rate_sysctlattach(sc, ssc); return &ssc->arc; } void ath_rate_detach(struct ath_ratectrl *arc) { struct sample_softc *ssc = (struct sample_softc *) arc; free(ssc, M_DEVBUF); } Index: projects/clang1000-import/sys/dev/ath/if_ath_sysctl.c =================================================================== --- projects/clang1000-import/sys/dev/ath/if_ath_sysctl.c (revision 358238) +++ projects/clang1000-import/sys/dev/ath/if_ath_sysctl.c (revision 358239) @@ -1,1357 +1,1359 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer, * without modification. * 2. Redistributions in binary form must reproduce at minimum a disclaimer * similar to the "NO WARRANTY" disclaimer below ("Disclaimer") and any * redistribution must be conditioned upon including a substantially * similar Disclaimer requirement for further binary redistribution. * * NO WARRANTY * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTIBILITY * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL * THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGES. */ #include __FBSDID("$FreeBSD$"); /* * Driver for the Atheros Wireless LAN controller. * * This software is derived from work of Atsushi Onoe; his contribution * is greatly appreciated. */ #include "opt_inet.h" #include "opt_ath.h" #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IEEE80211_SUPPORT_SUPERG #include #endif #ifdef IEEE80211_SUPPORT_TDMA #include #endif #include #ifdef INET #include #include #endif #include #include /* XXX for softled */ #include #include #include #include #include #include #ifdef ATH_TX99_DIAG #include #endif #ifdef ATH_DEBUG_ALQ #include #endif static int ath_sysctl_slottime(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int slottime; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); slottime = ath_hal_getslottime(sc->sc_ah); ATH_UNLOCK(sc); error = sysctl_handle_int(oidp, &slottime, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_setslottime(sc->sc_ah, slottime) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return error; } static int ath_sysctl_acktimeout(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int acktimeout; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); acktimeout = ath_hal_getacktimeout(sc->sc_ah); ATH_UNLOCK(sc); error = sysctl_handle_int(oidp, &acktimeout, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_setacktimeout(sc->sc_ah, acktimeout) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_ctstimeout(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int ctstimeout; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ctstimeout = ath_hal_getctstimeout(sc->sc_ah); ATH_UNLOCK(sc); error = sysctl_handle_int(oidp, &ctstimeout, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_setctstimeout(sc->sc_ah, ctstimeout) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_softled(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int softled = sc->sc_softled; int error; error = sysctl_handle_int(oidp, &softled, 0, req); if (error || !req->newptr) return error; softled = (softled != 0); if (softled != sc->sc_softled) { if (softled) { /* NB: handle any sc_ledpin change */ ath_led_config(sc); } sc->sc_softled = softled; } return 0; } static int ath_sysctl_ledpin(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int ledpin = sc->sc_ledpin; int error; error = sysctl_handle_int(oidp, &ledpin, 0, req); if (error || !req->newptr) return error; if (ledpin != sc->sc_ledpin) { sc->sc_ledpin = ledpin; if (sc->sc_softled) { ath_led_config(sc); } } return 0; } static int ath_sysctl_hardled(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int hardled = sc->sc_hardled; int error; error = sysctl_handle_int(oidp, &hardled, 0, req); if (error || !req->newptr) return error; hardled = (hardled != 0); if (hardled != sc->sc_hardled) { if (hardled) { /* NB: handle any sc_ledpin change */ ath_led_config(sc); } sc->sc_hardled = hardled; } return 0; } static int ath_sysctl_txantenna(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int txantenna; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); txantenna = ath_hal_getantennaswitch(sc->sc_ah); error = sysctl_handle_int(oidp, &txantenna, 0, req); if (!error && req->newptr) { /* XXX assumes 2 antenna ports */ if (txantenna < HAL_ANT_VARIABLE || txantenna > HAL_ANT_FIXED_B) { error = EINVAL; goto finish; } ath_hal_setantennaswitch(sc->sc_ah, txantenna); /* * NB: with the switch locked this isn't meaningful, * but set it anyway so things like radiotap get * consistent info in their data. */ sc->sc_txantenna = txantenna; } finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_rxantenna(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int defantenna; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); defantenna = ath_hal_getdefantenna(sc->sc_ah); ATH_UNLOCK(sc); error = sysctl_handle_int(oidp, &defantenna, 0, req); if (!error && req->newptr) ath_hal_setdefantenna(sc->sc_ah, defantenna); ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_diversity(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int diversity; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); diversity = ath_hal_getdiversity(sc->sc_ah); error = sysctl_handle_int(oidp, &diversity, 0, req); if (error || !req->newptr) goto finish; if (!ath_hal_setdiversity(sc->sc_ah, diversity)) { error = EINVAL; goto finish; } sc->sc_diversity = diversity; error = 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_diag(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int32_t diag; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); if (!ath_hal_getdiag(sc->sc_ah, &diag)) { error = EINVAL; goto finish; } error = sysctl_handle_int(oidp, &diag, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_setdiag(sc->sc_ah, diag) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_tpscale(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int32_t scale; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); (void) ath_hal_gettpscale(sc->sc_ah, &scale); error = sysctl_handle_int(oidp, &scale, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_settpscale(sc->sc_ah, scale) ? EINVAL : (sc->sc_running) ? ath_reset(sc, ATH_RESET_NOLOSS) : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_tpc(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int tpc; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); tpc = ath_hal_gettpc(sc->sc_ah); error = sysctl_handle_int(oidp, &tpc, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_settpc(sc->sc_ah, tpc) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_rfkill(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; struct ath_hal *ah = sc->sc_ah; u_int rfkill; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); rfkill = ath_hal_getrfkill(ah); error = sysctl_handle_int(oidp, &rfkill, 0, req); if (error || !req->newptr) goto finish; if (rfkill == ath_hal_getrfkill(ah)) { /* unchanged */ error = 0; goto finish; } if (!ath_hal_setrfkill(ah, rfkill)) { error = EINVAL; goto finish; } error = sc->sc_running ? ath_reset(sc, ATH_RESET_FULL) : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_txagg(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int i, t, param = 0; int error; struct ath_buf *bf; error = sysctl_handle_int(oidp, ¶m, 0, req); if (error || !req->newptr) return error; if (param != 1) return 0; printf("no tx bufs (empty list): %d\n", sc->sc_stats.ast_tx_getnobuf); printf("no tx bufs (was busy): %d\n", sc->sc_stats.ast_tx_getbusybuf); printf("aggr single packet: %d\n", sc->sc_aggr_stats.aggr_single_pkt); printf("aggr single packet w/ BAW closed: %d\n", sc->sc_aggr_stats.aggr_baw_closed_single_pkt); printf("aggr non-baw packet: %d\n", sc->sc_aggr_stats.aggr_nonbaw_pkt); printf("aggr aggregate packet: %d\n", sc->sc_aggr_stats.aggr_aggr_pkt); printf("aggr single packet low hwq: %d\n", sc->sc_aggr_stats.aggr_low_hwq_single_pkt); printf("aggr single packet RTS aggr limited: %d\n", sc->sc_aggr_stats.aggr_rts_aggr_limited); printf("aggr sched, no work: %d\n", sc->sc_aggr_stats.aggr_sched_nopkt); for (i = 0; i < 64; i++) { printf("%2d: %10d ", i, sc->sc_aggr_stats.aggr_pkts[i]); if (i % 4 == 3) printf("\n"); } printf("\n"); for (i = 0; i < HAL_NUM_TX_QUEUES; i++) { if (ATH_TXQ_SETUP(sc, i)) { printf("HW TXQ %d: axq_depth=%d, axq_aggr_depth=%d, " "axq_fifo_depth=%d, holdingbf=%p\n", i, sc->sc_txq[i].axq_depth, sc->sc_txq[i].axq_aggr_depth, sc->sc_txq[i].axq_fifo_depth, sc->sc_txq[i].axq_holdingbf); } } i = t = 0; ATH_TXBUF_LOCK(sc); TAILQ_FOREACH(bf, &sc->sc_txbuf, bf_list) { if (bf->bf_flags & ATH_BUF_BUSY) { printf("Busy: %d\n", t); i++; } t++; } ATH_TXBUF_UNLOCK(sc); printf("Total TX buffers: %d; Total TX buffers busy: %d (%d)\n", t, i, sc->sc_txbuf_cnt); i = t = 0; ATH_TXBUF_LOCK(sc); TAILQ_FOREACH(bf, &sc->sc_txbuf_mgmt, bf_list) { if (bf->bf_flags & ATH_BUF_BUSY) { printf("Busy: %d\n", t); i++; } t++; } ATH_TXBUF_UNLOCK(sc); printf("Total mgmt TX buffers: %d; Total mgmt TX buffers busy: %d\n", t, i); ATH_RX_LOCK(sc); for (i = 0; i < 2; i++) { printf("%d: fifolen: %d/%d; head=%d; tail=%d; m_pending=%p, m_holdbf=%p\n", i, sc->sc_rxedma[i].m_fifo_depth, sc->sc_rxedma[i].m_fifolen, sc->sc_rxedma[i].m_fifo_head, sc->sc_rxedma[i].m_fifo_tail, sc->sc_rxedma[i].m_rxpending, sc->sc_rxedma[i].m_holdbf); } i = 0; TAILQ_FOREACH(bf, &sc->sc_rxbuf, bf_list) { i++; } printf("Total RX buffers in free list: %d buffers\n", i); ATH_RX_UNLOCK(sc); return 0; } static int ath_sysctl_rfsilent(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int rfsilent; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); (void) ath_hal_getrfsilent(sc->sc_ah, &rfsilent); error = sysctl_handle_int(oidp, &rfsilent, 0, req); if (error || !req->newptr) goto finish; if (!ath_hal_setrfsilent(sc->sc_ah, rfsilent)) { error = EINVAL; goto finish; } /* * Earlier chips (< AR5212) have up to 8 GPIO * pins exposed. * * AR5416 and later chips have many more GPIO * pins (up to 16) so the mask is expanded to * four bits. */ sc->sc_rfsilentpin = rfsilent & 0x3c; sc->sc_rfsilentpol = (rfsilent & 0x2) != 0; error = 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_tpack(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int32_t tpack; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); (void) ath_hal_gettpack(sc->sc_ah, &tpack); error = sysctl_handle_int(oidp, &tpack, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_settpack(sc->sc_ah, tpack) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_tpcts(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; u_int32_t tpcts; int error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); (void) ath_hal_gettpcts(sc->sc_ah, &tpcts); error = sysctl_handle_int(oidp, &tpcts, 0, req); if (error || !req->newptr) goto finish; error = !ath_hal_settpcts(sc->sc_ah, tpcts) ? EINVAL : 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } static int ath_sysctl_intmit(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int intmit, error; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); intmit = ath_hal_getintmit(sc->sc_ah); error = sysctl_handle_int(oidp, &intmit, 0, req); if (error || !req->newptr) goto finish; /* reusing error; 1 here means "good"; 0 means "fail" */ error = ath_hal_setintmit(sc->sc_ah, intmit); if (! error) { error = EINVAL; goto finish; } /* * Reset the hardware here - disabling ANI in the HAL * doesn't reset ANI related registers, so it'll leave * things in an inconsistent state. */ if (sc->sc_running) ath_reset(sc, ATH_RESET_NOLOSS); error = 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } #ifdef IEEE80211_SUPPORT_TDMA static int ath_sysctl_setcca(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int setcca, error; setcca = sc->sc_setcca; error = sysctl_handle_int(oidp, &setcca, 0, req); if (error || !req->newptr) return error; sc->sc_setcca = (setcca != 0); return 0; } #endif /* IEEE80211_SUPPORT_TDMA */ static int ath_sysctl_forcebstuck(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int val = 0; int error; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr) return error; if (val == 0) return 0; taskqueue_enqueue(sc->sc_tq, &sc->sc_bstucktask); val = 0; return 0; } static int ath_sysctl_hangcheck(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int val = 0; int error; uint32_t mask = 0xffffffff; uint32_t *sp; uint32_t rsize; struct ath_hal *ah = sc->sc_ah; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr) return error; if (val == 0) return 0; ATH_LOCK(sc); ath_power_set_power_state(sc, HAL_PM_AWAKE); ATH_UNLOCK(sc); /* Do a hang check */ if (!ath_hal_getdiagstate(ah, HAL_DIAG_CHECK_HANGS, &mask, sizeof(mask), (void *) &sp, &rsize)) { error = 0; goto finish; } device_printf(sc->sc_dev, "%s: sp=0x%08x\n", __func__, *sp); val = 0; error = 0; finish: ATH_LOCK(sc); ath_power_restore_power_state(sc); ATH_UNLOCK(sc); return (error); } #ifdef ATH_DEBUG_ALQ static int ath_sysctl_alq_log(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int error, enable; enable = (sc->sc_alq.sc_alq_isactive); error = sysctl_handle_int(oidp, &enable, 0, req); if (error || !req->newptr) return (error); else if (enable) error = if_ath_alq_start(&sc->sc_alq); else error = if_ath_alq_stop(&sc->sc_alq); return (error); } /* * Attach the ALQ debugging if required. */ static void ath_sysctl_alq_attach(struct ath_softc *sc) { struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); - tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "alq", CTLFLAG_RD, - NULL, "Atheros ALQ logging parameters"); + tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "alq", + CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, + "Atheros ALQ logging parameters"); child = SYSCTL_CHILDREN(tree); SYSCTL_ADD_STRING(ctx, child, OID_AUTO, "filename", CTLFLAG_RW, sc->sc_alq.sc_alq_filename, 0, "ALQ filename"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "enable", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_alq_log, "I", ""); + "enable", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_alq_log, "I", ""); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "debugmask", CTLFLAG_RW, &sc->sc_alq.sc_alq_debug, 0, "ALQ debug mask"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "numlost", CTLFLAG_RW, &sc->sc_alq.sc_alq_numlost, 0, "number lost"); } #endif /* ATH_DEBUG_ALQ */ void ath_sysctlattach(struct ath_softc *sc) { struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); struct ath_hal *ah = sc->sc_ah; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "countrycode", CTLFLAG_RD, &sc->sc_eecc, 0, "EEPROM country code"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "regdomain", CTLFLAG_RD, &sc->sc_eerd, 0, "EEPROM regdomain code"); #ifdef ATH_DEBUG SYSCTL_ADD_QUAD(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "debug", CTLFLAG_RW, &sc->sc_debug, "control debugging printfs"); #endif #ifdef ATH_DEBUG_ALQ SYSCTL_ADD_QUAD(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "ktrdebug", CTLFLAG_RW, &sc->sc_ktrdebug, "control debugging KTR"); #endif /* ATH_DEBUG_ALQ */ SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "slottime", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_slottime, "I", "802.11 slot time (us)"); + "slottime", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_slottime, "I", "802.11 slot time (us)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "acktimeout", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_acktimeout, "I", "802.11 ACK timeout (us)"); + "acktimeout", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_acktimeout, "I", "802.11 ACK timeout (us)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "ctstimeout", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_ctstimeout, "I", "802.11 CTS timeout (us)"); + "ctstimeout", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_ctstimeout, "I", "802.11 CTS timeout (us)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "softled", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_softled, "I", "enable/disable software LED support"); + "softled", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_softled, "I", "enable/disable software LED support"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "ledpin", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_ledpin, "I", "GPIO pin connected to LED"); + "ledpin", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_ledpin, "I", "GPIO pin connected to LED"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "ledon", CTLFLAG_RW, &sc->sc_ledon, 0, "setting to turn LED on"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "ledidle", CTLFLAG_RW, &sc->sc_ledidle, 0, "idle time for inactivity LED (ticks)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "hardled", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_hardled, "I", "enable/disable hardware LED support"); + "hardled", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_hardled, "I", "enable/disable hardware LED support"); /* XXX Laziness - configure pins, then flip hardled off/on */ SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "led_net_pin", CTLFLAG_RW, &sc->sc_led_net_pin, 0, "MAC Network LED pin, or -1 to disable"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "led_pwr_pin", CTLFLAG_RW, &sc->sc_led_pwr_pin, 0, "MAC Power LED pin, or -1 to disable"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "txantenna", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_txantenna, "I", "antenna switch"); + "txantenna", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_txantenna, "I", "antenna switch"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "rxantenna", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_rxantenna, "I", "default/rx antenna"); + "rxantenna", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_rxantenna, "I", "default/rx antenna"); if (ath_hal_hasdiversity(ah)) SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "diversity", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_diversity, "I", "antenna diversity"); + "diversity", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + sc, 0, ath_sysctl_diversity, "I", "antenna diversity"); sc->sc_txintrperiod = ATH_TXINTR_PERIOD; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "txintrperiod", CTLFLAG_RW, &sc->sc_txintrperiod, 0, "tx descriptor batching"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "diag", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_diag, "I", "h/w diagnostic control"); + "diag", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_diag, "I", "h/w diagnostic control"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "tpscale", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_tpscale, "I", "tx power scaling"); + "tpscale", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_tpscale, "I", "tx power scaling"); if (ath_hal_hastpc(ah)) { SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "tpc", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_tpc, "I", "enable/disable per-packet TPC"); + "tpc", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_tpc, "I", "enable/disable per-packet TPC"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "tpack", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_tpack, "I", "tx power for ack frames"); + "tpack", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, + 0, ath_sysctl_tpack, "I", "tx power for ack frames"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "tpcts", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_tpcts, "I", "tx power for cts frames"); + "tpcts", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, + 0, ath_sysctl_tpcts, "I", "tx power for cts frames"); } if (ath_hal_hasrfsilent(ah)) { SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "rfsilent", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_rfsilent, "I", "h/w RF silent config"); + "rfsilent", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + sc, 0, ath_sysctl_rfsilent, "I", "h/w RF silent config"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "rfkill", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_rfkill, "I", "enable/disable RF kill switch"); + "rfkill", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, + 0, ath_sysctl_rfkill, "I", "enable/disable RF kill switch"); } SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "txagg", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_txagg, "I", ""); + "txagg", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_txagg, "I", ""); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "forcebstuck", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_forcebstuck, "I", ""); + "forcebstuck", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, + 0, ath_sysctl_forcebstuck, "I", ""); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "hangcheck", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_hangcheck, "I", ""); + "hangcheck", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, 0, + ath_sysctl_hangcheck, "I", ""); if (ath_hal_hasintmit(ah)) { SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "intmit", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_intmit, "I", "interference mitigation"); + "intmit", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, + 0, ath_sysctl_intmit, "I", "interference mitigation"); } sc->sc_monpass = HAL_RXERR_DECRYPT | HAL_RXERR_MIC; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "monpass", CTLFLAG_RW, &sc->sc_monpass, 0, "mask of error frames to pass when monitoring"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "hwq_limit_nonaggr", CTLFLAG_RW, &sc->sc_hwq_limit_nonaggr, 0, "Hardware non-AMPDU queue depth before software-queuing TX frames"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "hwq_limit_aggr", CTLFLAG_RW, &sc->sc_hwq_limit_aggr, 0, "Hardware AMPDU queue depth before software-queuing TX frames"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "tid_hwq_lo", CTLFLAG_RW, &sc->sc_tid_hwq_lo, 0, ""); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "tid_hwq_hi", CTLFLAG_RW, &sc->sc_tid_hwq_hi, 0, ""); /* Aggregate length twiddles */ SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "aggr_limit", CTLFLAG_RW, &sc->sc_aggr_limit, 0, "Maximum A-MPDU size, or 0 for 'default'"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "rts_aggr_limit", CTLFLAG_RW, &sc->sc_rts_aggr_limit, 0, "Maximum A-MPDU size for RTS-protected frames, or '0' " "for default"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "delim_min_pad", CTLFLAG_RW, &sc->sc_delim_min_pad, 0, "Enforce a minimum number of delimiters per A-MPDU " " sub-frame"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "txq_data_minfree", CTLFLAG_RW, &sc->sc_txq_data_minfree, 0, "Minimum free buffers before adding a data frame" " to the TX queue"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "txq_mcastq_maxdepth", CTLFLAG_RW, &sc->sc_txq_mcastq_maxdepth, 0, "Maximum buffer depth for multicast/broadcast frames"); SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "txq_node_maxdepth", CTLFLAG_RW, &sc->sc_txq_node_maxdepth, 0, "Maximum buffer depth for a single node"); #if 0 SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "cabq_enable", CTLFLAG_RW, &sc->sc_cabq_enable, 0, "Whether to transmit on the CABQ or not"); #endif #ifdef IEEE80211_SUPPORT_TDMA if (ath_hal_macversion(ah) > 0x78) { sc->sc_tdmadbaprep = 2; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "dbaprep", CTLFLAG_RW, &sc->sc_tdmadbaprep, 0, "TDMA DBA preparation time"); sc->sc_tdmaswbaprep = 10; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "swbaprep", CTLFLAG_RW, &sc->sc_tdmaswbaprep, 0, "TDMA SWBA preparation time"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "guardtime", CTLFLAG_RW, &sc->sc_tdmaguard, 0, "TDMA slot guard time"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "superframe", CTLFLAG_RD, &sc->sc_tdmabintval, 0, "TDMA calculated super frame"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "setcca", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_setcca, "I", "enable CCA control"); + "setcca", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + sc, 0, ath_sysctl_setcca, "I", "enable CCA control"); } #endif #ifdef ATH_DEBUG_ALQ ath_sysctl_alq_attach(sc); #endif } static int ath_sysctl_clearstats(SYSCTL_HANDLER_ARGS) { struct ath_softc *sc = arg1; int val = 0; int error; error = sysctl_handle_int(oidp, &val, 0, req); if (error || !req->newptr) return error; if (val == 0) return 0; /* Not clearing the stats is still valid */ memset(&sc->sc_stats, 0, sizeof(sc->sc_stats)); memset(&sc->sc_aggr_stats, 0, sizeof(sc->sc_aggr_stats)); memset(&sc->sc_intr_stats, 0, sizeof(sc->sc_intr_stats)); val = 0; return 0; } static void ath_sysctl_stats_attach_rxphyerr(struct ath_softc *sc, struct sysctl_oid_list *parent) { struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); int i; char sn[8]; - tree = SYSCTL_ADD_NODE(ctx, parent, OID_AUTO, "rx_phy_err", CTLFLAG_RD, NULL, "Per-code RX PHY Errors"); + tree = SYSCTL_ADD_NODE(ctx, parent, OID_AUTO, "rx_phy_err", + CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, "Per-code RX PHY Errors"); child = SYSCTL_CHILDREN(tree); for (i = 0; i < 64; i++) { snprintf(sn, sizeof(sn), "%d", i); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, sn, CTLFLAG_RD, &sc->sc_stats.ast_rx_phy[i], 0, ""); } } static void ath_sysctl_stats_attach_intr(struct ath_softc *sc, struct sysctl_oid_list *parent) { struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); int i; char sn[8]; tree = SYSCTL_ADD_NODE(ctx, parent, OID_AUTO, "sync_intr", - CTLFLAG_RD, NULL, "Sync interrupt statistics"); + CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, "Sync interrupt statistics"); child = SYSCTL_CHILDREN(tree); for (i = 0; i < 32; i++) { snprintf(sn, sizeof(sn), "%d", i); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, sn, CTLFLAG_RD, &sc->sc_intr_stats.sync_intr[i], 0, ""); } } void ath_sysctl_stats_attach(struct ath_softc *sc) { struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); /* Create "clear" node */ SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "clear_stats", CTLTYPE_INT | CTLFLAG_RW, sc, 0, - ath_sysctl_clearstats, "I", "clear stats"); + "clear_stats", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, sc, + 0, ath_sysctl_clearstats, "I", "clear stats"); /* Create stats node */ - tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "stats", CTLFLAG_RD, - NULL, "Statistics"); + tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "stats", + CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, "Statistics"); child = SYSCTL_CHILDREN(tree); /* This was generated from if_athioctl.h */ SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_watchdog", CTLFLAG_RD, &sc->sc_stats.ast_watchdog, 0, "device reset by watchdog"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_hardware", CTLFLAG_RD, &sc->sc_stats.ast_hardware, 0, "fatal hardware error interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_bmiss", CTLFLAG_RD, &sc->sc_stats.ast_bmiss, 0, "beacon miss interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_bmiss_phantom", CTLFLAG_RD, &sc->sc_stats.ast_bmiss_phantom, 0, "beacon miss interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_bstuck", CTLFLAG_RD, &sc->sc_stats.ast_bstuck, 0, "beacon stuck interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rxorn", CTLFLAG_RD, &sc->sc_stats.ast_rxorn, 0, "rx overrun interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rxeol", CTLFLAG_RD, &sc->sc_stats.ast_rxeol, 0, "rx eol interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_txurn", CTLFLAG_RD, &sc->sc_stats.ast_txurn, 0, "tx underrun interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_mib", CTLFLAG_RD, &sc->sc_stats.ast_mib, 0, "mib interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_intrcoal", CTLFLAG_RD, &sc->sc_stats.ast_intrcoal, 0, "interrupts coalesced"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_packets", CTLFLAG_RD, &sc->sc_stats.ast_tx_packets, 0, "packet sent on the interface"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_mgmt", CTLFLAG_RD, &sc->sc_stats.ast_tx_mgmt, 0, "management frames transmitted"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_discard", CTLFLAG_RD, &sc->sc_stats.ast_tx_discard, 0, "frames discarded prior to assoc"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_qstop", CTLFLAG_RD, &sc->sc_stats.ast_tx_qstop, 0, "output stopped 'cuz no buffer"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_encap", CTLFLAG_RD, &sc->sc_stats.ast_tx_encap, 0, "tx encapsulation failed"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nonode", CTLFLAG_RD, &sc->sc_stats.ast_tx_nonode, 0, "tx failed 'cuz no node"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nombuf", CTLFLAG_RD, &sc->sc_stats.ast_tx_nombuf, 0, "tx failed 'cuz no mbuf"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nomcl", CTLFLAG_RD, &sc->sc_stats.ast_tx_nomcl, 0, "tx failed 'cuz no cluster"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_linear", CTLFLAG_RD, &sc->sc_stats.ast_tx_linear, 0, "tx linearized to cluster"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nodata", CTLFLAG_RD, &sc->sc_stats.ast_tx_nodata, 0, "tx discarded empty frame"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_busdma", CTLFLAG_RD, &sc->sc_stats.ast_tx_busdma, 0, "tx failed for dma resrcs"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_xretries", CTLFLAG_RD, &sc->sc_stats.ast_tx_xretries, 0, "tx failed 'cuz too many retries"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_fifoerr", CTLFLAG_RD, &sc->sc_stats.ast_tx_fifoerr, 0, "tx failed 'cuz FIFO underrun"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_filtered", CTLFLAG_RD, &sc->sc_stats.ast_tx_filtered, 0, "tx failed 'cuz xmit filtered"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_shortretry", CTLFLAG_RD, &sc->sc_stats.ast_tx_shortretry, 0, "tx on-chip retries (short)"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_longretry", CTLFLAG_RD, &sc->sc_stats.ast_tx_longretry, 0, "tx on-chip retries (long)"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_badrate", CTLFLAG_RD, &sc->sc_stats.ast_tx_badrate, 0, "tx failed 'cuz bogus xmit rate"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_noack", CTLFLAG_RD, &sc->sc_stats.ast_tx_noack, 0, "tx frames with no ack marked"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_rts", CTLFLAG_RD, &sc->sc_stats.ast_tx_rts, 0, "tx frames with rts enabled"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_cts", CTLFLAG_RD, &sc->sc_stats.ast_tx_cts, 0, "tx frames with cts enabled"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_shortpre", CTLFLAG_RD, &sc->sc_stats.ast_tx_shortpre, 0, "tx frames with short preamble"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_altrate", CTLFLAG_RD, &sc->sc_stats.ast_tx_altrate, 0, "tx frames with alternate rate"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_protect", CTLFLAG_RD, &sc->sc_stats.ast_tx_protect, 0, "tx frames with protection"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_ctsburst", CTLFLAG_RD, &sc->sc_stats.ast_tx_ctsburst, 0, "tx frames with cts and bursting"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_ctsext", CTLFLAG_RD, &sc->sc_stats.ast_tx_ctsext, 0, "tx frames with cts extension"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_nombuf", CTLFLAG_RD, &sc->sc_stats.ast_rx_nombuf, 0, "rx setup failed 'cuz no mbuf"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_busdma", CTLFLAG_RD, &sc->sc_stats.ast_rx_busdma, 0, "rx setup failed for dma resrcs"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_orn", CTLFLAG_RD, &sc->sc_stats.ast_rx_orn, 0, "rx failed 'cuz of desc overrun"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_crcerr", CTLFLAG_RD, &sc->sc_stats.ast_rx_crcerr, 0, "rx failed 'cuz of bad CRC"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_fifoerr", CTLFLAG_RD, &sc->sc_stats.ast_rx_fifoerr, 0, "rx failed 'cuz of FIFO overrun"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_badcrypt", CTLFLAG_RD, &sc->sc_stats.ast_rx_badcrypt, 0, "rx failed 'cuz decryption"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_badmic", CTLFLAG_RD, &sc->sc_stats.ast_rx_badmic, 0, "rx failed 'cuz MIC failure"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_phyerr", CTLFLAG_RD, &sc->sc_stats.ast_rx_phyerr, 0, "rx failed 'cuz of PHY err"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_tooshort", CTLFLAG_RD, &sc->sc_stats.ast_rx_tooshort, 0, "rx discarded 'cuz frame too short"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_toobig", CTLFLAG_RD, &sc->sc_stats.ast_rx_toobig, 0, "rx discarded 'cuz frame too large"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_packets", CTLFLAG_RD, &sc->sc_stats.ast_rx_packets, 0, "packet recv on the interface"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_mgt", CTLFLAG_RD, &sc->sc_stats.ast_rx_mgt, 0, "management frames received"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_ctl", CTLFLAG_RD, &sc->sc_stats.ast_rx_ctl, 0, "rx discarded 'cuz ctl frame"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_be_xmit", CTLFLAG_RD, &sc->sc_stats.ast_be_xmit, 0, "beacons transmitted"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_be_nombuf", CTLFLAG_RD, &sc->sc_stats.ast_be_nombuf, 0, "beacon setup failed 'cuz no mbuf"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_per_cal", CTLFLAG_RD, &sc->sc_stats.ast_per_cal, 0, "periodic calibration calls"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_per_calfail", CTLFLAG_RD, &sc->sc_stats.ast_per_calfail, 0, "periodic calibration failed"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_per_rfgain", CTLFLAG_RD, &sc->sc_stats.ast_per_rfgain, 0, "periodic calibration rfgain reset"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rate_calls", CTLFLAG_RD, &sc->sc_stats.ast_rate_calls, 0, "rate control checks"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rate_raise", CTLFLAG_RD, &sc->sc_stats.ast_rate_raise, 0, "rate control raised xmit rate"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rate_drop", CTLFLAG_RD, &sc->sc_stats.ast_rate_drop, 0, "rate control dropped xmit rate"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ant_defswitch", CTLFLAG_RD, &sc->sc_stats.ast_ant_defswitch, 0, "rx/default antenna switches"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ant_txswitch", CTLFLAG_RD, &sc->sc_stats.ast_ant_txswitch, 0, "tx antenna switches"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_cabq_xmit", CTLFLAG_RD, &sc->sc_stats.ast_cabq_xmit, 0, "cabq frames transmitted"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_cabq_busy", CTLFLAG_RD, &sc->sc_stats.ast_cabq_busy, 0, "cabq found busy"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_raw", CTLFLAG_RD, &sc->sc_stats.ast_tx_raw, 0, "tx frames through raw api"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ff_txok", CTLFLAG_RD, &sc->sc_stats.ast_ff_txok, 0, "fast frames tx'd successfully"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ff_txerr", CTLFLAG_RD, &sc->sc_stats.ast_ff_txerr, 0, "fast frames tx'd w/ error"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ff_rx", CTLFLAG_RD, &sc->sc_stats.ast_ff_rx, 0, "fast frames rx'd"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ff_flush", CTLFLAG_RD, &sc->sc_stats.ast_ff_flush, 0, "fast frames flushed from staging q"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_qfull", CTLFLAG_RD, &sc->sc_stats.ast_tx_qfull, 0, "tx dropped 'cuz of queue limit"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nobuf", CTLFLAG_RD, &sc->sc_stats.ast_tx_nobuf, 0, "tx dropped 'cuz no ath buffer"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tdma_update", CTLFLAG_RD, &sc->sc_stats.ast_tdma_update, 0, "TDMA slot timing updates"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tdma_timers", CTLFLAG_RD, &sc->sc_stats.ast_tdma_timers, 0, "TDMA slot update set beacon timers"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tdma_tsf", CTLFLAG_RD, &sc->sc_stats.ast_tdma_tsf, 0, "TDMA slot update set TSF"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tdma_ack", CTLFLAG_RD, &sc->sc_stats.ast_tdma_ack, 0, "TDMA tx failed 'cuz ACK required"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_raw_fail", CTLFLAG_RD, &sc->sc_stats.ast_tx_raw_fail, 0, "raw tx failed 'cuz h/w down"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nofrag", CTLFLAG_RD, &sc->sc_stats.ast_tx_nofrag, 0, "tx dropped 'cuz no ath frag buffer"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_be_missed", CTLFLAG_RD, &sc->sc_stats.ast_be_missed, 0, "number of -missed- beacons"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_ani_cal", CTLFLAG_RD, &sc->sc_stats.ast_ani_cal, 0, "number of ANI polls"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_agg", CTLFLAG_RD, &sc->sc_stats.ast_rx_agg, 0, "number of aggregate frames received"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_halfgi", CTLFLAG_RD, &sc->sc_stats.ast_rx_halfgi, 0, "number of frames received with half-GI"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_2040", CTLFLAG_RD, &sc->sc_stats.ast_rx_2040, 0, "number of HT/40 frames received"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_pre_crc_err", CTLFLAG_RD, &sc->sc_stats.ast_rx_pre_crc_err, 0, "number of delimeter-CRC errors detected"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_post_crc_err", CTLFLAG_RD, &sc->sc_stats.ast_rx_post_crc_err, 0, "number of post-delimiter CRC errors detected"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_decrypt_busy_err", CTLFLAG_RD, &sc->sc_stats.ast_rx_decrypt_busy_err, 0, "number of frames received w/ busy decrypt engine"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_hi_rx_chain", CTLFLAG_RD, &sc->sc_stats.ast_rx_hi_rx_chain, 0, ""); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_htprotect", CTLFLAG_RD, &sc->sc_stats.ast_tx_htprotect, 0, "HT tx frames with protection"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_hitqueueend", CTLFLAG_RD, &sc->sc_stats.ast_rx_hitqueueend, 0, "RX hit queue end"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_timeout", CTLFLAG_RD, &sc->sc_stats.ast_tx_timeout, 0, "TX Global Timeout"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_cst", CTLFLAG_RD, &sc->sc_stats.ast_tx_cst, 0, "TX Carrier Sense Timeout"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_xtxop", CTLFLAG_RD, &sc->sc_stats.ast_tx_xtxop, 0, "TX exceeded TXOP"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_timerexpired", CTLFLAG_RD, &sc->sc_stats.ast_tx_timerexpired, 0, "TX exceeded TX_TIMER register"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_desccfgerr", CTLFLAG_RD, &sc->sc_stats.ast_tx_desccfgerr, 0, "TX Descriptor Cfg Error"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_swretries", CTLFLAG_RD, &sc->sc_stats.ast_tx_swretries, 0, "TX software retry count"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_swretrymax", CTLFLAG_RD, &sc->sc_stats.ast_tx_swretrymax, 0, "TX software retry max reached"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_data_underrun", CTLFLAG_RD, &sc->sc_stats.ast_tx_data_underrun, 0, ""); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_delim_underrun", CTLFLAG_RD, &sc->sc_stats.ast_tx_delim_underrun, 0, ""); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_aggr_failall", CTLFLAG_RD, &sc->sc_stats.ast_tx_aggr_failall, 0, "Number of aggregate TX failures (whole frame)"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_aggr_ok", CTLFLAG_RD, &sc->sc_stats.ast_tx_aggr_ok, 0, "Number of aggregate TX OK completions (subframe)"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_aggr_fail", CTLFLAG_RD, &sc->sc_stats.ast_tx_aggr_fail, 0, "Number of aggregate TX failures (subframe)"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_intr", CTLFLAG_RD, &sc->sc_stats.ast_rx_intr, 0, "RX interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_intr", CTLFLAG_RD, &sc->sc_stats.ast_tx_intr, 0, "TX interrupts"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_mcastq_overflow", CTLFLAG_RD, &sc->sc_stats.ast_tx_mcastq_overflow, 0, "Number of multicast frames exceeding maximum mcast queue depth"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_keymiss", CTLFLAG_RD, &sc->sc_stats.ast_rx_keymiss, 0, ""); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_swfiltered", CTLFLAG_RD, &sc->sc_stats.ast_tx_swfiltered, 0, ""); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_nodeq_overflow", CTLFLAG_RD, &sc->sc_stats.ast_tx_nodeq_overflow, 0, "tx dropped 'cuz nodeq overflow"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_rx_stbc", CTLFLAG_RD, &sc->sc_stats.ast_rx_stbc, 0, "Number of STBC frames received"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_stbc", CTLFLAG_RD, &sc->sc_stats.ast_tx_stbc, 0, "Number of STBC frames transmitted"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "ast_tx_ldpc", CTLFLAG_RD, &sc->sc_stats.ast_tx_ldpc, 0, "Number of LDPC frames transmitted"); /* Attach the RX phy error array */ ath_sysctl_stats_attach_rxphyerr(sc, child); /* Attach the interrupt statistics array */ ath_sysctl_stats_attach_intr(sc, child); } /* * This doesn't necessarily belong here (because it's HAL related, not * driver related). */ void ath_sysctl_hal_attach(struct ath_softc *sc) { struct sysctl_oid *tree = device_get_sysctl_tree(sc->sc_dev); struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(sc->sc_dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); - tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "hal", CTLFLAG_RD, - NULL, "Atheros HAL parameters"); + tree = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "hal", + CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, "Atheros HAL parameters"); child = SYSCTL_CHILDREN(tree); sc->sc_ah->ah_config.ah_debug = 0; SYSCTL_ADD_INT(ctx, child, OID_AUTO, "debug", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_debug, 0, "Atheros HAL debugging printfs"); sc->sc_ah->ah_config.ah_ar5416_biasadj = 0; SYSCTL_ADD_INT(ctx, child, OID_AUTO, "ar5416_biasadj", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_ar5416_biasadj, 0, "Enable 2GHz AR5416 direction sensitivity bias adjust"); sc->sc_ah->ah_config.ah_dma_beacon_response_time = 2; SYSCTL_ADD_INT(ctx, child, OID_AUTO, "dma_brt", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_dma_beacon_response_time, 0, "Atheros HAL DMA beacon response time"); sc->sc_ah->ah_config.ah_sw_beacon_response_time = 10; SYSCTL_ADD_INT(ctx, child, OID_AUTO, "sw_brt", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_sw_beacon_response_time, 0, "Atheros HAL software beacon response time"); sc->sc_ah->ah_config.ah_additional_swba_backoff = 0; SYSCTL_ADD_INT(ctx, child, OID_AUTO, "swba_backoff", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_additional_swba_backoff, 0, "Atheros HAL additional SWBA backoff time"); sc->sc_ah->ah_config.ah_force_full_reset = 0; SYSCTL_ADD_INT(ctx, child, OID_AUTO, "force_full_reset", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_force_full_reset, 0, "Force full chip reset rather than a warm reset"); /* * This is initialised by the driver. */ SYSCTL_ADD_INT(ctx, child, OID_AUTO, "serialise_reg_war", CTLFLAG_RW, &sc->sc_ah->ah_config.ah_serialise_reg_war, 0, "Force register access serialisation"); } Index: projects/clang1000-import/sys/dev/mlx5/mlx5_ib/mlx5_ib.h =================================================================== --- projects/clang1000-import/sys/dev/mlx5/mlx5_ib/mlx5_ib.h (revision 358238) +++ projects/clang1000-import/sys/dev/mlx5/mlx5_ib/mlx5_ib.h (revision 358239) @@ -1,1051 +1,1055 @@ /*- * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS `AS IS' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef MLX5_IB_H #define MLX5_IB_H #include #include #include #include #include #include #include #include #include #include #include #include #define mlx5_ib_dbg(dev, format, arg...) \ pr_debug("%s:%s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__, \ __LINE__, current->pid, ##arg) #define mlx5_ib_err(dev, format, arg...) \ pr_err("%s: ERR: %s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__, \ __LINE__, current->pid, ##arg) #define mlx5_ib_warn(dev, format, arg...) \ pr_warn("%s: WARN: %s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__, \ __LINE__, current->pid, ##arg) #define field_avail(type, fld, sz) (offsetof(type, fld) + \ sizeof(((type *)0)->fld) <= (sz)) #define MLX5_IB_DEFAULT_UIDX 0xffffff #define MLX5_USER_ASSIGNED_UIDX_MASK __mlx5_mask(qpc, user_index) enum { MLX5_IB_MMAP_CMD_SHIFT = 8, MLX5_IB_MMAP_CMD_MASK = 0xff, }; enum mlx5_ib_mmap_cmd { MLX5_IB_MMAP_REGULAR_PAGE = 0, MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES = 1, MLX5_IB_MMAP_WC_PAGE = 2, MLX5_IB_MMAP_NC_PAGE = 3, /* 5 is chosen in order to be compatible with old versions of libmlx5 */ MLX5_IB_MMAP_CORE_CLOCK = 5, }; enum { MLX5_RES_SCAT_DATA32_CQE = 0x1, MLX5_RES_SCAT_DATA64_CQE = 0x2, MLX5_REQ_SCAT_DATA32_CQE = 0x11, MLX5_REQ_SCAT_DATA64_CQE = 0x22, }; enum mlx5_ib_latency_class { MLX5_IB_LATENCY_CLASS_LOW, MLX5_IB_LATENCY_CLASS_MEDIUM, MLX5_IB_LATENCY_CLASS_HIGH, MLX5_IB_LATENCY_CLASS_FAST_PATH }; enum mlx5_ib_mad_ifc_flags { MLX5_MAD_IFC_IGNORE_MKEY = 1, MLX5_MAD_IFC_IGNORE_BKEY = 2, MLX5_MAD_IFC_NET_VIEW = 4, }; enum { MLX5_CROSS_CHANNEL_UUAR = 0, }; enum { MLX5_CQE_VERSION_V0, MLX5_CQE_VERSION_V1, }; struct mlx5_ib_vma_private_data { struct list_head list; struct vm_area_struct *vma; }; struct mlx5_ib_ucontext { struct ib_ucontext ibucontext; struct list_head db_page_list; /* protect doorbell record alloc/free */ struct mutex db_page_mutex; struct mlx5_uuar_info uuari; u8 cqe_version; /* Transport Domain number */ u32 tdn; struct list_head vma_private_list; }; static inline struct mlx5_ib_ucontext *to_mucontext(struct ib_ucontext *ibucontext) { return container_of(ibucontext, struct mlx5_ib_ucontext, ibucontext); } struct mlx5_ib_pd { struct ib_pd ibpd; u32 pdn; }; #define MLX5_IB_FLOW_MCAST_PRIO (MLX5_BY_PASS_NUM_PRIOS - 1) #define MLX5_IB_FLOW_LAST_PRIO (MLX5_BY_PASS_NUM_REGULAR_PRIOS - 1) #if (MLX5_IB_FLOW_LAST_PRIO <= 0) #error "Invalid number of bypass priorities" #endif #define MLX5_IB_FLOW_LEFTOVERS_PRIO (MLX5_IB_FLOW_MCAST_PRIO + 1) #define MLX5_IB_NUM_FLOW_FT (MLX5_IB_FLOW_LEFTOVERS_PRIO + 1) #define MLX5_IB_NUM_SNIFFER_FTS 2 struct mlx5_ib_flow_prio { struct mlx5_flow_table *flow_table; unsigned int refcount; }; struct mlx5_ib_flow_handler { struct list_head list; struct ib_flow ibflow; struct mlx5_ib_flow_prio *prio; struct mlx5_flow_rule *rule; }; struct mlx5_ib_flow_db { struct mlx5_ib_flow_prio prios[MLX5_IB_NUM_FLOW_FT]; struct mlx5_ib_flow_prio sniffer[MLX5_IB_NUM_SNIFFER_FTS]; struct mlx5_flow_table *lag_demux_ft; /* Protect flow steering bypass flow tables * when add/del flow rules. * only single add/removal of flow steering rule could be done * simultaneously. */ struct mutex lock; }; /* Use macros here so that don't have to duplicate * enum ib_send_flags and enum ib_qp_type for low-level driver */ #define MLX5_IB_SEND_UMR_UNREG IB_SEND_RESERVED_START #define MLX5_IB_SEND_UMR_FAIL_IF_FREE (IB_SEND_RESERVED_START << 1) #define MLX5_IB_SEND_UMR_UPDATE_MTT (IB_SEND_RESERVED_START << 2) #define MLX5_IB_SEND_UMR_UPDATE_TRANSLATION (IB_SEND_RESERVED_START << 3) #define MLX5_IB_SEND_UMR_UPDATE_PD (IB_SEND_RESERVED_START << 4) #define MLX5_IB_SEND_UMR_UPDATE_ACCESS IB_SEND_RESERVED_END #define MLX5_IB_QPT_REG_UMR IB_QPT_RESERVED1 /* * IB_QPT_GSI creates the software wrapper around GSI, and MLX5_IB_QPT_HW_GSI * creates the actual hardware QP. */ #define MLX5_IB_QPT_HW_GSI IB_QPT_RESERVED2 #define MLX5_IB_WR_UMR IB_WR_RESERVED1 /* Private QP creation flags to be passed in ib_qp_init_attr.create_flags. * * These flags are intended for internal use by the mlx5_ib driver, and they * rely on the range reserved for that use in the ib_qp_create_flags enum. */ /* Create a UD QP whose source QP number is 1 */ static inline enum ib_qp_create_flags mlx5_ib_create_qp_sqpn_qp1(void) { return IB_QP_CREATE_RESERVED_START; } struct wr_list { u16 opcode; u16 next; }; struct mlx5_ib_wq { u64 *wrid; u32 *wr_data; struct wr_list *w_list; unsigned *wqe_head; u16 unsig_count; /* serialize post to the work queue */ spinlock_t lock; int wqe_cnt; int max_post; int max_gs; int offset; int wqe_shift; unsigned head; unsigned tail; u16 cur_post; u16 last_poll; void *qend; }; struct mlx5_ib_rwq { struct ib_wq ibwq; struct mlx5_core_qp core_qp; u32 rq_num_pas; u32 log_rq_stride; u32 log_rq_size; u32 rq_page_offset; u32 log_page_size; struct ib_umem *umem; size_t buf_size; unsigned int page_shift; int create_type; struct mlx5_db db; u32 user_index; u32 wqe_count; u32 wqe_shift; int wq_sig; }; enum { MLX5_QP_USER, MLX5_QP_KERNEL, MLX5_QP_EMPTY }; enum { MLX5_WQ_USER, MLX5_WQ_KERNEL }; struct mlx5_ib_rwq_ind_table { struct ib_rwq_ind_table ib_rwq_ind_tbl; u32 rqtn; }; /* * Connect-IB can trigger up to four concurrent pagefaults * per-QP. */ enum mlx5_ib_pagefault_context { MLX5_IB_PAGEFAULT_RESPONDER_READ, MLX5_IB_PAGEFAULT_REQUESTOR_READ, MLX5_IB_PAGEFAULT_RESPONDER_WRITE, MLX5_IB_PAGEFAULT_REQUESTOR_WRITE, MLX5_IB_PAGEFAULT_CONTEXTS }; static inline enum mlx5_ib_pagefault_context mlx5_ib_get_pagefault_context(struct mlx5_pagefault *pagefault) { return pagefault->flags & (MLX5_PFAULT_REQUESTOR | MLX5_PFAULT_WRITE); } struct mlx5_ib_pfault { struct work_struct work; struct mlx5_pagefault mpfault; }; struct mlx5_ib_ubuffer { struct ib_umem *umem; int buf_size; u64 buf_addr; }; struct mlx5_ib_qp_base { struct mlx5_ib_qp *container_mibqp; struct mlx5_core_qp mqp; struct mlx5_ib_ubuffer ubuffer; }; struct mlx5_ib_qp_trans { struct mlx5_ib_qp_base base; u16 xrcdn; u8 alt_port; u8 atomic_rd_en; u8 resp_depth; }; struct mlx5_ib_rss_qp { u32 tirn; }; struct mlx5_ib_rq { struct mlx5_ib_qp_base base; struct mlx5_ib_wq *rq; struct mlx5_ib_ubuffer ubuffer; struct mlx5_db *doorbell; u32 tirn; u8 state; }; struct mlx5_ib_sq { struct mlx5_ib_qp_base base; struct mlx5_ib_wq *sq; struct mlx5_ib_ubuffer ubuffer; struct mlx5_db *doorbell; u32 tisn; u8 state; }; struct mlx5_ib_raw_packet_qp { struct mlx5_ib_sq sq; struct mlx5_ib_rq rq; }; struct mlx5_ib_qp { struct ib_qp ibqp; union { struct mlx5_ib_qp_trans trans_qp; struct mlx5_ib_raw_packet_qp raw_packet_qp; struct mlx5_ib_rss_qp rss_qp; }; struct mlx5_buf buf; struct mlx5_db db; struct mlx5_ib_wq rq; u8 sq_signal_bits; u8 fm_cache; struct mlx5_ib_wq sq; /* serialize qp state modifications */ struct mutex mutex; u32 flags; u8 port; u8 state; int wq_sig; int scat_cqe; int max_inline_data; struct mlx5_bf *bf; int has_rq; /* only for user space QPs. For kernel * we have it from the bf object */ int uuarn; int create_type; /* Store signature errors */ bool signature_en; #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING /* * A flag that is true for QP's that are in a state that doesn't * allow page faults, and shouldn't schedule any more faults. */ int disable_page_faults; /* * The disable_page_faults_lock protects a QP's disable_page_faults * field, allowing for a thread to atomically check whether the QP * allows page faults, and if so schedule a page fault. */ spinlock_t disable_page_faults_lock; struct mlx5_ib_pfault pagefaults[MLX5_IB_PAGEFAULT_CONTEXTS]; #endif struct list_head qps_list; struct list_head cq_recv_list; struct list_head cq_send_list; }; struct mlx5_ib_cq_buf { struct mlx5_buf buf; struct ib_umem *umem; int cqe_size; int nent; }; enum mlx5_ib_qp_flags { MLX5_IB_QP_LSO = IB_QP_CREATE_IPOIB_UD_LSO, MLX5_IB_QP_BLOCK_MULTICAST_LOOPBACK = IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK, MLX5_IB_QP_CROSS_CHANNEL = IB_QP_CREATE_CROSS_CHANNEL, MLX5_IB_QP_MANAGED_SEND = IB_QP_CREATE_MANAGED_SEND, MLX5_IB_QP_MANAGED_RECV = IB_QP_CREATE_MANAGED_RECV, MLX5_IB_QP_SIGNATURE_HANDLING = 1 << 5, /* QP uses 1 as its source QP number */ MLX5_IB_QP_SQPN_QP1 = 1 << 6, MLX5_IB_QP_CAP_SCATTER_FCS = 1 << 7, MLX5_IB_QP_RSS = 1 << 8, }; struct mlx5_umr_wr { struct ib_send_wr wr; union { u64 virt_addr; u64 offset; } target; struct ib_pd *pd; unsigned int page_shift; unsigned int npages; u32 length; int access_flags; u32 mkey; }; static inline struct mlx5_umr_wr *umr_wr(struct ib_send_wr *wr) { return container_of(wr, struct mlx5_umr_wr, wr); } struct mlx5_shared_mr_info { int mr_id; struct ib_umem *umem; }; struct mlx5_ib_cq { struct ib_cq ibcq; struct mlx5_core_cq mcq; struct mlx5_ib_cq_buf buf; struct mlx5_db db; /* serialize access to the CQ */ spinlock_t lock; /* protect resize cq */ struct mutex resize_mutex; struct mlx5_ib_cq_buf *resize_buf; struct ib_umem *resize_umem; int cqe_size; struct list_head list_send_qp; struct list_head list_recv_qp; u32 create_flags; struct list_head wc_list; enum ib_cq_notify_flags notify_flags; struct work_struct notify_work; }; struct mlx5_ib_wc { struct ib_wc wc; struct list_head list; }; struct mlx5_ib_srq { struct ib_srq ibsrq; struct mlx5_core_srq msrq; struct mlx5_buf buf; struct mlx5_db db; u64 *wrid; /* protect SRQ hanlding */ spinlock_t lock; int head; int tail; u16 wqe_ctr; struct ib_umem *umem; /* serialize arming a SRQ */ struct mutex mutex; int wq_sig; }; struct mlx5_ib_xrcd { struct ib_xrcd ibxrcd; u32 xrcdn; }; enum mlx5_ib_mtt_access_flags { MLX5_IB_MTT_READ = (1 << 0), MLX5_IB_MTT_WRITE = (1 << 1), }; #define MLX5_IB_MTT_PRESENT (MLX5_IB_MTT_READ | MLX5_IB_MTT_WRITE) struct mlx5_ib_mr { struct ib_mr ibmr; void *descs; dma_addr_t desc_map; int ndescs; int max_descs; int desc_size; int access_mode; struct mlx5_core_mr mmkey; struct ib_umem *umem; struct mlx5_shared_mr_info *smr_info; struct list_head list; int order; int umred; int npages; struct mlx5_ib_dev *dev; u32 out[MLX5_ST_SZ_DW(create_mkey_out)]; struct mlx5_core_sig_ctx *sig; int live; void *descs_alloc; int access_flags; /* Needed for rereg MR */ }; struct mlx5_ib_mw { struct ib_mw ibmw; struct mlx5_core_mr mmkey; }; struct mlx5_ib_umr_context { struct ib_cqe cqe; enum ib_wc_status status; struct completion done; }; struct umr_common { struct ib_pd *pd; struct ib_cq *cq; struct ib_qp *qp; /* control access to UMR QP */ struct semaphore sem; }; enum { MLX5_FMR_INVALID, MLX5_FMR_VALID, MLX5_FMR_BUSY, }; struct mlx5_cache_ent { struct list_head head; /* sync access to the cahce entry */ spinlock_t lock; struct dentry *dir; char name[4]; u32 order; u32 size; u32 cur; u32 miss; u32 limit; struct dentry *fsize; struct dentry *fcur; struct dentry *fmiss; struct dentry *flimit; struct mlx5_ib_dev *dev; struct work_struct work; struct delayed_work dwork; int pending; }; struct mlx5_mr_cache { struct workqueue_struct *wq; struct mlx5_cache_ent ent[MAX_MR_CACHE_ENTRIES]; int stopped; struct dentry *root; unsigned long last_add; }; struct mlx5_ib_gsi_qp; struct mlx5_ib_port_resources { struct mlx5_ib_resources *devr; struct mlx5_ib_gsi_qp *gsi; struct work_struct pkey_change_work; }; struct mlx5_ib_resources { struct ib_cq *c0; struct ib_xrcd *x0; struct ib_xrcd *x1; struct ib_pd *p0; struct ib_srq *s0; struct ib_srq *s1; struct mlx5_ib_port_resources ports[2]; /* Protects changes to the port resources */ struct mutex mutex; }; struct mlx5_ib_port { u16 q_cnt_id; }; struct mlx5_roce { /* Protect mlx5_ib_get_netdev from invoking dev_hold() with a NULL * netdev pointer */ rwlock_t netdev_lock; struct net_device *netdev; struct notifier_block nb; atomic_t next_port; }; #define MLX5_IB_STATS_COUNT(a,b,c,d) a #define MLX5_IB_STATS_VAR(a,b,c,d) b; #define MLX5_IB_STATS_DESC(a,b,c,d) c, d, #define MLX5_IB_CONG_PARAMS(m) \ /* ECN RP */ \ m(+1, u64 rp_clamp_tgt_rate, "rp_clamp_tgt_rate", "If set, whenever a CNP is processed, the target rate is updated to be the current rate") \ m(+1, u64 rp_clamp_tgt_rate_ati, "rp_clamp_tgt_rate_ati", "If set, when receiving a CNP, the target rate should be updated if the transission rate was increased due to the timer, and not only due to the byte counter") \ m(+1, u64 rp_time_reset, "rp_time_reset", "Time in microseconds between rate increases if no CNPs are received") \ m(+1, u64 rp_byte_reset, "rp_byte_reset", "Transmitted data in bytes between rate increases if no CNP's are received. A value of zero means disabled.") \ m(+1, u64 rp_threshold, "rp_threshold", "The number of times rpByteStage or rpTimeStage can count before the RP rate control state machine advances states") \ m(+1, u64 rp_ai_rate, "rp_ai_rate", "The rate, in Mbits per second, used to increase rpTargetRate in the active increase state") \ m(+1, u64 rp_hai_rate, "rp_hai_rate", "The rate, in Mbits per second, used to increase rpTargetRate in the hyper increase state") \ m(+1, u64 rp_min_dec_fac, "rp_min_dec_fac", "The minimum factor by which the current transmit rate can be changed when processing a CNP. Value is given as a percentage, [1 .. 100]") \ m(+1, u64 rp_min_rate, "rp_min_rate", "The minimum value, in Mbps per second, for rate to limit") \ m(+1, u64 rp_rate_to_set_on_first_cnp, "rp_rate_to_set_on_first_cnp", "The rate that is set for the flow when a rate limiter is allocated to it upon first CNP received, in Mbps. A value of zero means use full port speed") \ m(+1, u64 rp_dce_tcp_g, "rp_dce_tcp_g", "Used to update the congestion estimator, alpha, once every dce_tcp_rtt once every dce_tcp_rtt microseconds") \ m(+1, u64 rp_dce_tcp_rtt, "rp_dce_tcp_rtt", "The time between updates of the aolpha value, in microseconds") \ m(+1, u64 rp_rate_reduce_monitor_period, "rp_rate_reduce_monitor_period", "The minimum time between two consecutive rate reductions for a single flow") \ m(+1, u64 rp_initial_alpha_value, "rp_initial_alpha_value", "The initial value of alpha to use when receiving the first CNP for a flow") \ m(+1, u64 rp_gd, "rp_gd", "If a CNP is received, the flow rate is reduced at the beginning of the next rate_reduce_monitor_period interval") \ /* ECN NP */ \ m(+1, u64 np_cnp_dscp, "np_cnp_dscp", "The DiffServ Code Point of the generated CNP for this port") \ m(+1, u64 np_cnp_prio_mode, "np_cnp_prio_mode", "The 802.1p priority value of the generated CNP for this port") \ m(+1, u64 np_cnp_prio, "np_cnp_prio", "The 802.1p priority value of the generated CNP for this port") #define MLX5_IB_CONG_PARAMS_NUM (0 MLX5_IB_CONG_PARAMS(MLX5_IB_STATS_COUNT)) #define MLX5_IB_CONG_STATS(m) \ m(+1, u64 syndrome, "syndrome", "Syndrome number") \ m(+1, u64 rp_cur_flows, "rp_cur_flows", "Number of flows limited") \ m(+1, u64 sum_flows, "sum_flows", "Sum of the number of flows limited over time") \ m(+1, u64 rp_cnp_ignored, "rp_cnp_ignored", "Number of CNPs and CNMs ignored") \ m(+1, u64 rp_cnp_handled, "rp_cnp_handled", "Number of CNPs and CNMs successfully handled") \ m(+1, u64 time_stamp, "time_stamp", "Time stamp in microseconds") \ m(+1, u64 accumulators_period, "accumulators_period", "The value of X variable for accumulating counters") \ m(+1, u64 np_ecn_marked_roce_packets, "np_ecn_marked_roce_packets", "Number of ECN marked packets seen") \ m(+1, u64 np_cnp_sent, "np_cnp_sent", "Number of CNPs sent") #define MLX5_IB_CONG_STATS_NUM (0 MLX5_IB_CONG_STATS(MLX5_IB_STATS_COUNT)) struct mlx5_ib_congestion { struct sysctl_ctx_list ctx; struct sx lock; struct delayed_work dwork; - u64 arg [0]; - MLX5_IB_CONG_PARAMS(MLX5_IB_STATS_VAR) - MLX5_IB_CONG_STATS(MLX5_IB_STATS_VAR) + union { + u64 arg[1]; + struct { + MLX5_IB_CONG_PARAMS(MLX5_IB_STATS_VAR) + MLX5_IB_CONG_STATS(MLX5_IB_STATS_VAR) + }; + }; }; struct mlx5_ib_dev { struct ib_device ib_dev; struct mlx5_core_dev *mdev; struct mlx5_roce roce; MLX5_DECLARE_DOORBELL_LOCK(uar_lock); int num_ports; /* serialize update of capability mask */ struct mutex cap_mask_mutex; bool ib_active; struct umr_common umrc; /* sync used page count stats */ struct mlx5_ib_resources devr; struct mlx5_mr_cache cache; struct timer_list delay_timer; /* Prevents soft lock on massive reg MRs */ struct mutex slow_path_mutex; int fill_delay; #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING struct ib_odp_caps odp_caps; /* * Sleepable RCU that prevents destruction of MRs while they are still * being used by a page fault handler. */ struct srcu_struct mr_srcu; #endif struct mlx5_ib_flow_db flow_db; /* protect resources needed as part of reset flow */ spinlock_t reset_flow_resource_lock; struct list_head qp_list; /* Array with num_ports elements */ struct mlx5_ib_port *port; struct mlx5_ib_congestion congestion; }; static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq) { return container_of(mcq, struct mlx5_ib_cq, mcq); } static inline struct mlx5_ib_xrcd *to_mxrcd(struct ib_xrcd *ibxrcd) { return container_of(ibxrcd, struct mlx5_ib_xrcd, ibxrcd); } static inline struct mlx5_ib_dev *to_mdev(struct ib_device *ibdev) { return container_of(ibdev, struct mlx5_ib_dev, ib_dev); } static inline struct mlx5_ib_cq *to_mcq(struct ib_cq *ibcq) { return container_of(ibcq, struct mlx5_ib_cq, ibcq); } static inline struct mlx5_ib_qp *to_mibqp(struct mlx5_core_qp *mqp) { return container_of(mqp, struct mlx5_ib_qp_base, mqp)->container_mibqp; } static inline struct mlx5_ib_rwq *to_mibrwq(struct mlx5_core_qp *core_qp) { return container_of(core_qp, struct mlx5_ib_rwq, core_qp); } static inline struct mlx5_ib_mr *to_mibmr(struct mlx5_core_mr *mmkey) { return container_of(mmkey, struct mlx5_ib_mr, mmkey); } static inline struct mlx5_ib_pd *to_mpd(struct ib_pd *ibpd) { return container_of(ibpd, struct mlx5_ib_pd, ibpd); } static inline struct mlx5_ib_srq *to_msrq(struct ib_srq *ibsrq) { return container_of(ibsrq, struct mlx5_ib_srq, ibsrq); } static inline struct mlx5_ib_qp *to_mqp(struct ib_qp *ibqp) { return container_of(ibqp, struct mlx5_ib_qp, ibqp); } static inline struct mlx5_ib_rwq *to_mrwq(struct ib_wq *ibwq) { return container_of(ibwq, struct mlx5_ib_rwq, ibwq); } static inline struct mlx5_ib_rwq_ind_table *to_mrwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl) { return container_of(ib_rwq_ind_tbl, struct mlx5_ib_rwq_ind_table, ib_rwq_ind_tbl); } static inline struct mlx5_ib_srq *to_mibsrq(struct mlx5_core_srq *msrq) { return container_of(msrq, struct mlx5_ib_srq, msrq); } static inline struct mlx5_ib_mr *to_mmr(struct ib_mr *ibmr) { return container_of(ibmr, struct mlx5_ib_mr, ibmr); } static inline struct mlx5_ib_mw *to_mmw(struct ib_mw *ibmw) { return container_of(ibmw, struct mlx5_ib_mw, ibmw); } struct mlx5_ib_ah { struct ib_ah ibah; struct mlx5_av av; }; static inline struct mlx5_ib_ah *to_mah(struct ib_ah *ibah) { return container_of(ibah, struct mlx5_ib_ah, ibah); } int mlx5_ib_db_map_user(struct mlx5_ib_ucontext *context, unsigned long virt, struct mlx5_db *db); void mlx5_ib_db_unmap_user(struct mlx5_ib_ucontext *context, struct mlx5_db *db); void __mlx5_ib_cq_clean(struct mlx5_ib_cq *cq, u32 qpn, struct mlx5_ib_srq *srq); void mlx5_ib_cq_clean(struct mlx5_ib_cq *cq, u32 qpn, struct mlx5_ib_srq *srq); void mlx5_ib_free_srq_wqe(struct mlx5_ib_srq *srq, int wqe_index); int mlx5_MAD_IFC(struct mlx5_ib_dev *dev, int ignore_mkey, int ignore_bkey, u8 port, const struct ib_wc *in_wc, const struct ib_grh *in_grh, const void *in_mad, void *response_mad); struct ib_ah *mlx5_ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr, struct ib_udata *udata); int mlx5_ib_query_ah(struct ib_ah *ibah, struct ib_ah_attr *ah_attr); int mlx5_ib_destroy_ah(struct ib_ah *ah); struct ib_srq *mlx5_ib_create_srq(struct ib_pd *pd, struct ib_srq_init_attr *init_attr, struct ib_udata *udata); int mlx5_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, enum ib_srq_attr_mask attr_mask, struct ib_udata *udata); int mlx5_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr); int mlx5_ib_destroy_srq(struct ib_srq *srq); int mlx5_ib_post_srq_recv(struct ib_srq *ibsrq, struct ib_recv_wr *wr, struct ib_recv_wr **bad_wr); struct ib_qp *mlx5_ib_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *init_attr, struct ib_udata *udata); int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, struct ib_udata *udata); int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr); int mlx5_ib_destroy_qp(struct ib_qp *qp); int mlx5_ib_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, struct ib_send_wr **bad_wr); int mlx5_ib_post_recv(struct ib_qp *ibqp, struct ib_recv_wr *wr, struct ib_recv_wr **bad_wr); void *mlx5_get_send_wqe(struct mlx5_ib_qp *qp, int n); int mlx5_ib_read_user_wqe(struct mlx5_ib_qp *qp, int send, int wqe_index, void *buffer, u32 length, struct mlx5_ib_qp_base *base); struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, const struct ib_cq_init_attr *attr, struct ib_ucontext *context, struct ib_udata *udata); int mlx5_ib_destroy_cq(struct ib_cq *cq); int mlx5_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); int mlx5_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); int mlx5_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); int mlx5_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata); struct ib_mr *mlx5_ib_get_dma_mr(struct ib_pd *pd, int acc); struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, u64 virt_addr, int access_flags, struct ib_udata *udata); struct ib_mw *mlx5_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, struct ib_udata *udata); int mlx5_ib_dealloc_mw(struct ib_mw *mw); int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages, int zap); int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start, u64 length, u64 virt_addr, int access_flags, struct ib_pd *pd, struct ib_udata *udata); int mlx5_ib_dereg_mr(struct ib_mr *ibmr); struct ib_mr *mlx5_ib_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type, u32 max_num_sg); int mlx5_ib_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents, unsigned int *sg_offset); int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num, const struct ib_wc *in_wc, const struct ib_grh *in_grh, const struct ib_mad_hdr *in, size_t in_mad_size, struct ib_mad_hdr *out, size_t *out_mad_size, u16 *out_mad_pkey_index); struct ib_xrcd *mlx5_ib_alloc_xrcd(struct ib_device *ibdev, struct ib_ucontext *context, struct ib_udata *udata); int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd); int mlx5_ib_get_buf_offset(u64 addr, int page_shift, u32 *offset); int mlx5_query_ext_port_caps(struct mlx5_ib_dev *dev, u8 port); int mlx5_query_mad_ifc_smp_attr_node_info(struct ib_device *ibdev, struct ib_smp *out_mad); int mlx5_query_mad_ifc_system_image_guid(struct ib_device *ibdev, __be64 *sys_image_guid); int mlx5_query_mad_ifc_max_pkeys(struct ib_device *ibdev, u16 *max_pkeys); int mlx5_query_mad_ifc_vendor_id(struct ib_device *ibdev, u32 *vendor_id); int mlx5_query_mad_ifc_node_desc(struct mlx5_ib_dev *dev, char *node_desc); int mlx5_query_mad_ifc_node_guid(struct mlx5_ib_dev *dev, __be64 *node_guid); int mlx5_query_mad_ifc_pkey(struct ib_device *ibdev, u8 port, u16 index, u16 *pkey); int mlx5_query_mad_ifc_gids(struct ib_device *ibdev, u8 port, int index, union ib_gid *gid); int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *props); int mlx5_ib_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr *props); int mlx5_ib_init_fmr(struct mlx5_ib_dev *dev); void mlx5_ib_cleanup_fmr(struct mlx5_ib_dev *dev); void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr, int *count, int *shift, int *ncont, int *order); void __mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem, int page_shift, size_t offset, size_t num_pages, __be64 *pas, int access_flags); void mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem, int page_shift, __be64 *pas, int access_flags); void mlx5_ib_copy_pas(u64 *old, u64 *new, int step, int num); int mlx5_ib_get_cqe_size(struct mlx5_ib_dev *dev, struct ib_cq *ibcq); int mlx5_mr_cache_init(struct mlx5_ib_dev *dev); int mlx5_mr_cache_cleanup(struct mlx5_ib_dev *dev); int mlx5_mr_ib_cont_pages(struct ib_umem *umem, u64 addr, int *count, int *shift); int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask, struct ib_mr_status *mr_status); struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd, struct ib_wq_init_attr *init_attr, struct ib_udata *udata); int mlx5_ib_destroy_wq(struct ib_wq *wq); int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr, u32 wq_attr_mask, struct ib_udata *udata); struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device, struct ib_rwq_ind_table_init_attr *init_attr, struct ib_udata *udata); int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table); #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING extern struct workqueue_struct *mlx5_ib_page_fault_wq; void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev); void mlx5_ib_mr_pfault_handler(struct mlx5_ib_qp *qp, struct mlx5_ib_pfault *pfault); void mlx5_ib_odp_create_qp(struct mlx5_ib_qp *qp); int mlx5_ib_odp_init_one(struct mlx5_ib_dev *ibdev); void mlx5_ib_odp_remove_one(struct mlx5_ib_dev *ibdev); int __init mlx5_ib_odp_init(void); void mlx5_ib_odp_cleanup(void); void mlx5_ib_qp_disable_pagefaults(struct mlx5_ib_qp *qp); void mlx5_ib_qp_enable_pagefaults(struct mlx5_ib_qp *qp); void mlx5_ib_invalidate_range(struct ib_umem *umem, unsigned long start, unsigned long end); #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */ static inline void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev) { return; } static inline void mlx5_ib_odp_create_qp(struct mlx5_ib_qp *qp) {} static inline int mlx5_ib_odp_init_one(struct mlx5_ib_dev *ibdev) { return 0; } static inline void mlx5_ib_odp_remove_one(struct mlx5_ib_dev *ibdev) {} static inline int mlx5_ib_odp_init(void) { return 0; } static inline void mlx5_ib_odp_cleanup(void) {} static inline void mlx5_ib_qp_disable_pagefaults(struct mlx5_ib_qp *qp) {} static inline void mlx5_ib_qp_enable_pagefaults(struct mlx5_ib_qp *qp) {} #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */ int mlx5_ib_get_vf_config(struct ib_device *device, int vf, u8 port, struct ifla_vf_info *info); int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf, u8 port, int state); int mlx5_ib_get_vf_stats(struct ib_device *device, int vf, u8 port, struct ifla_vf_stats *stats); int mlx5_ib_set_vf_guid(struct ib_device *device, int vf, u8 port, u64 guid, int type); __be16 mlx5_get_roce_udp_sport(struct mlx5_ib_dev *dev, u8 port_num, int index); int mlx5_get_roce_gid_type(struct mlx5_ib_dev *dev, u8 port_num, int index, enum ib_gid_type *gid_type); /* GSI QP helper functions */ struct ib_qp *mlx5_ib_gsi_create_qp(struct ib_pd *pd, struct ib_qp_init_attr *init_attr); int mlx5_ib_gsi_destroy_qp(struct ib_qp *qp); int mlx5_ib_gsi_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr, int attr_mask); int mlx5_ib_gsi_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr, int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr); int mlx5_ib_gsi_post_send(struct ib_qp *qp, struct ib_send_wr *wr, struct ib_send_wr **bad_wr); int mlx5_ib_gsi_post_recv(struct ib_qp *qp, struct ib_recv_wr *wr, struct ib_recv_wr **bad_wr); void mlx5_ib_gsi_pkey_change(struct mlx5_ib_gsi_qp *gsi); int mlx5_ib_generate_wc(struct ib_cq *ibcq, struct ib_wc *wc); static inline void init_query_mad(struct ib_smp *mad) { mad->base_version = 1; mad->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; mad->class_version = 1; mad->method = IB_MGMT_METHOD_GET; } static inline u8 convert_access(int acc) { return (acc & IB_ACCESS_REMOTE_ATOMIC ? MLX5_PERM_ATOMIC : 0) | (acc & IB_ACCESS_REMOTE_WRITE ? MLX5_PERM_REMOTE_WRITE : 0) | (acc & IB_ACCESS_REMOTE_READ ? MLX5_PERM_REMOTE_READ : 0) | (acc & IB_ACCESS_LOCAL_WRITE ? MLX5_PERM_LOCAL_WRITE : 0) | MLX5_PERM_LOCAL_READ; } static inline int is_qp1(enum ib_qp_type qp_type) { return qp_type == MLX5_IB_QPT_HW_GSI; } #define MLX5_MAX_UMR_SHIFT 16 #define MLX5_MAX_UMR_PAGES (1 << MLX5_MAX_UMR_SHIFT) static inline u32 check_cq_create_flags(u32 flags) { /* * It returns non-zero value for unsupported CQ * create flags, otherwise it returns zero. */ return (flags & ~(IB_CQ_FLAGS_IGNORE_OVERRUN | IB_CQ_FLAGS_TIMESTAMP_COMPLETION)); } static inline int verify_assign_uidx(u8 cqe_version, u32 cmd_uidx, u32 *user_index) { if (cqe_version) { if ((cmd_uidx == MLX5_IB_DEFAULT_UIDX) || (cmd_uidx & ~MLX5_USER_ASSIGNED_UIDX_MASK)) return -EINVAL; *user_index = cmd_uidx; } else { *user_index = MLX5_IB_DEFAULT_UIDX; } return 0; } static inline int get_qp_user_index(struct mlx5_ib_ucontext *ucontext, struct mlx5_ib_create_qp *ucmd, int inlen, u32 *user_index) { u8 cqe_version = ucontext->cqe_version; if (field_avail(struct mlx5_ib_create_qp, uidx, inlen) && !cqe_version && (ucmd->uidx == MLX5_IB_DEFAULT_UIDX)) return 0; if (!!(field_avail(struct mlx5_ib_create_qp, uidx, inlen) != !!cqe_version)) return -EINVAL; return verify_assign_uidx(cqe_version, ucmd->uidx, user_index); } static inline int get_srq_user_index(struct mlx5_ib_ucontext *ucontext, struct mlx5_ib_create_srq *ucmd, int inlen, u32 *user_index) { u8 cqe_version = ucontext->cqe_version; if (field_avail(struct mlx5_ib_create_srq, uidx, inlen) && !cqe_version && (ucmd->uidx == MLX5_IB_DEFAULT_UIDX)) return 0; if (!!(field_avail(struct mlx5_ib_create_srq, uidx, inlen) != !!cqe_version)) return -EINVAL; return verify_assign_uidx(cqe_version, ucmd->uidx, user_index); } void mlx5_ib_cleanup_congestion(struct mlx5_ib_dev *); int mlx5_ib_init_congestion(struct mlx5_ib_dev *); #endif /* MLX5_IB_H */ Index: projects/clang1000-import/sys/dev/mlx5/mlx5_ib/mlx5_ib_cong.c =================================================================== --- projects/clang1000-import/sys/dev/mlx5/mlx5_ib/mlx5_ib_cong.c (revision 358238) +++ projects/clang1000-import/sys/dev/mlx5/mlx5_ib/mlx5_ib_cong.c (revision 358239) @@ -1,461 +1,463 @@ /*- - * Copyright (c) 2013-2015, Mellanox Technologies, Ltd. All rights reserved. + * Copyright (c) 2013-2020, Mellanox Technologies, Ltd. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS `AS IS' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include "mlx5_ib.h" #include static const char *mlx5_ib_cong_params_desc[] = { MLX5_IB_CONG_PARAMS(MLX5_IB_STATS_DESC) }; static const char *mlx5_ib_cong_stats_desc[] = { MLX5_IB_CONG_STATS(MLX5_IB_STATS_DESC) }; -#define MLX5_IB_INDEX(field) (__offsetof(struct mlx5_ib_congestion, field) / sizeof(u64)) +#define MLX5_IB_INDEX(field) ( \ + (__offsetof(struct mlx5_ib_congestion, field) - \ + __offsetof(struct mlx5_ib_congestion, arg[0])) / sizeof(u64)) #define MLX5_IB_FLD_MAX(type, field) ((1ULL << __mlx5_bit_sz(type, field)) - 1ULL) #define MLX5_IB_SET_CLIPPED(type, ptr, field, var) do { \ /* rangecheck */ \ if ((var) > MLX5_IB_FLD_MAX(type, field)) \ (var) = MLX5_IB_FLD_MAX(type, field); \ /* set value */ \ MLX5_SET(type, ptr, field, var); \ } while (0) #define CONG_LOCK(dev) sx_xlock(&(dev)->congestion.lock) #define CONG_UNLOCK(dev) sx_xunlock(&(dev)->congestion.lock) #define CONG_LOCKED(dev) sx_xlocked(&(dev)->congestion.lock) #define MLX5_IB_RP_CLAMP_TGT_RATE_ATTR BIT(1) #define MLX5_IB_RP_CLAMP_TGT_RATE_ATI_ATTR BIT(2) #define MLX5_IB_RP_TIME_RESET_ATTR BIT(3) #define MLX5_IB_RP_BYTE_RESET_ATTR BIT(4) #define MLX5_IB_RP_THRESHOLD_ATTR BIT(5) #define MLX5_IB_RP_AI_RATE_ATTR BIT(7) #define MLX5_IB_RP_HAI_RATE_ATTR BIT(8) #define MLX5_IB_RP_MIN_DEC_FAC_ATTR BIT(9) #define MLX5_IB_RP_MIN_RATE_ATTR BIT(10) #define MLX5_IB_RP_RATE_TO_SET_ON_FIRST_CNP_ATTR BIT(11) #define MLX5_IB_RP_DCE_TCP_G_ATTR BIT(12) #define MLX5_IB_RP_DCE_TCP_RTT_ATTR BIT(13) #define MLX5_IB_RP_RATE_REDUCE_MONITOR_PERIOD_ATTR BIT(14) #define MLX5_IB_RP_INITIAL_ALPHA_VALUE_ATTR BIT(15) #define MLX5_IB_RP_GD_ATTR BIT(16) #define MLX5_IB_NP_CNP_DSCP_ATTR BIT(3) #define MLX5_IB_NP_CNP_PRIO_MODE_ATTR BIT(4) enum mlx5_ib_cong_node_type { MLX5_IB_RROCE_ECN_RP = 1, MLX5_IB_RROCE_ECN_NP = 2, }; static enum mlx5_ib_cong_node_type mlx5_ib_param_to_node(u32 index) { if (index >= MLX5_IB_INDEX(rp_clamp_tgt_rate) && index <= MLX5_IB_INDEX(rp_gd)) return MLX5_IB_RROCE_ECN_RP; else return MLX5_IB_RROCE_ECN_NP; } static u64 mlx5_get_cc_param_val(void *field, u32 index) { switch (index) { case MLX5_IB_INDEX(rp_clamp_tgt_rate): return MLX5_GET(cong_control_r_roce_ecn_rp, field, clamp_tgt_rate); case MLX5_IB_INDEX(rp_clamp_tgt_rate_ati): return MLX5_GET(cong_control_r_roce_ecn_rp, field, clamp_tgt_rate_after_time_inc); case MLX5_IB_INDEX(rp_time_reset): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_time_reset); case MLX5_IB_INDEX(rp_byte_reset): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_byte_reset); case MLX5_IB_INDEX(rp_threshold): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_threshold); case MLX5_IB_INDEX(rp_ai_rate): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_ai_rate); case MLX5_IB_INDEX(rp_hai_rate): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_hai_rate); case MLX5_IB_INDEX(rp_min_dec_fac): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_min_dec_fac); case MLX5_IB_INDEX(rp_min_rate): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_min_rate); case MLX5_IB_INDEX(rp_rate_to_set_on_first_cnp): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rate_to_set_on_first_cnp); case MLX5_IB_INDEX(rp_dce_tcp_g): return MLX5_GET(cong_control_r_roce_ecn_rp, field, dce_tcp_g); case MLX5_IB_INDEX(rp_dce_tcp_rtt): return MLX5_GET(cong_control_r_roce_ecn_rp, field, dce_tcp_rtt); case MLX5_IB_INDEX(rp_rate_reduce_monitor_period): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rate_reduce_monitor_period); case MLX5_IB_INDEX(rp_initial_alpha_value): return MLX5_GET(cong_control_r_roce_ecn_rp, field, initial_alpha_value); case MLX5_IB_INDEX(rp_gd): return MLX5_GET(cong_control_r_roce_ecn_rp, field, rpg_gd); case MLX5_IB_INDEX(np_cnp_dscp): return MLX5_GET(cong_control_r_roce_ecn_np, field, cnp_dscp); case MLX5_IB_INDEX(np_cnp_prio_mode): return MLX5_GET(cong_control_r_roce_ecn_np, field, cnp_prio_mode); case MLX5_IB_INDEX(np_cnp_prio): return MLX5_GET(cong_control_r_roce_ecn_np, field, cnp_802p_prio); default: return 0; } } static void mlx5_ib_set_cc_param_mask_val(void *field, u32 index, u64 var, u32 *attr_mask) { switch (index) { case MLX5_IB_INDEX(rp_clamp_tgt_rate): *attr_mask |= MLX5_IB_RP_CLAMP_TGT_RATE_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, clamp_tgt_rate, var); break; case MLX5_IB_INDEX(rp_clamp_tgt_rate_ati): *attr_mask |= MLX5_IB_RP_CLAMP_TGT_RATE_ATI_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, clamp_tgt_rate_after_time_inc, var); break; case MLX5_IB_INDEX(rp_time_reset): *attr_mask |= MLX5_IB_RP_TIME_RESET_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_time_reset, var); break; case MLX5_IB_INDEX(rp_byte_reset): *attr_mask |= MLX5_IB_RP_BYTE_RESET_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_byte_reset, var); break; case MLX5_IB_INDEX(rp_threshold): *attr_mask |= MLX5_IB_RP_THRESHOLD_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_threshold, var); break; case MLX5_IB_INDEX(rp_ai_rate): *attr_mask |= MLX5_IB_RP_AI_RATE_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_ai_rate, var); break; case MLX5_IB_INDEX(rp_hai_rate): *attr_mask |= MLX5_IB_RP_HAI_RATE_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_hai_rate, var); break; case MLX5_IB_INDEX(rp_min_dec_fac): *attr_mask |= MLX5_IB_RP_MIN_DEC_FAC_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_min_dec_fac, var); break; case MLX5_IB_INDEX(rp_min_rate): *attr_mask |= MLX5_IB_RP_MIN_RATE_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_min_rate, var); break; case MLX5_IB_INDEX(rp_rate_to_set_on_first_cnp): *attr_mask |= MLX5_IB_RP_RATE_TO_SET_ON_FIRST_CNP_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rate_to_set_on_first_cnp, var); break; case MLX5_IB_INDEX(rp_dce_tcp_g): *attr_mask |= MLX5_IB_RP_DCE_TCP_G_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, dce_tcp_g, var); break; case MLX5_IB_INDEX(rp_dce_tcp_rtt): *attr_mask |= MLX5_IB_RP_DCE_TCP_RTT_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, dce_tcp_rtt, var); break; case MLX5_IB_INDEX(rp_rate_reduce_monitor_period): *attr_mask |= MLX5_IB_RP_RATE_REDUCE_MONITOR_PERIOD_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rate_reduce_monitor_period, var); break; case MLX5_IB_INDEX(rp_initial_alpha_value): *attr_mask |= MLX5_IB_RP_INITIAL_ALPHA_VALUE_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, initial_alpha_value, var); break; case MLX5_IB_INDEX(rp_gd): *attr_mask |= MLX5_IB_RP_GD_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_rp, field, rpg_gd, var); break; case MLX5_IB_INDEX(np_cnp_dscp): *attr_mask |= MLX5_IB_NP_CNP_DSCP_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_np, field, cnp_dscp, var); break; case MLX5_IB_INDEX(np_cnp_prio_mode): *attr_mask |= MLX5_IB_NP_CNP_PRIO_MODE_ATTR; MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_np, field, cnp_prio_mode, var); break; case MLX5_IB_INDEX(np_cnp_prio): *attr_mask |= MLX5_IB_NP_CNP_PRIO_MODE_ATTR; MLX5_SET(cong_control_r_roce_ecn_np, field, cnp_prio_mode, 0); MLX5_IB_SET_CLIPPED(cong_control_r_roce_ecn_np, field, cnp_802p_prio, var); break; default: break; } } static int mlx5_ib_get_all_cc_params(struct mlx5_ib_dev *dev) { int outlen = MLX5_ST_SZ_BYTES(query_cong_params_out); enum mlx5_ib_cong_node_type node = 0; void *out; void *field; u32 x; int err = 0; out = kzalloc(outlen, GFP_KERNEL); if (!out) return -ENOMEM; /* get the current values */ for (x = 0; x != MLX5_IB_CONG_PARAMS_NUM; x++) { if (node != mlx5_ib_param_to_node(x)) { node = mlx5_ib_param_to_node(x); err = mlx5_cmd_query_cong_params(dev->mdev, node, out, outlen); if (err) break; } field = MLX5_ADDR_OF(query_cong_params_out, out, congestion_parameters); dev->congestion.arg[x] = mlx5_get_cc_param_val(field, x); } kfree(out); return err; } static int mlx5_ib_set_cc_params(struct mlx5_ib_dev *dev, u32 index, u64 var) { int inlen = MLX5_ST_SZ_BYTES(modify_cong_params_in); enum mlx5_ib_cong_node_type node; u32 attr_mask = 0; void *field; void *in; int err; in = kzalloc(inlen, GFP_KERNEL); if (!in) return -ENOMEM; MLX5_SET(modify_cong_params_in, in, opcode, MLX5_CMD_OP_MODIFY_CONG_PARAMS); node = mlx5_ib_param_to_node(index); MLX5_SET(modify_cong_params_in, in, cong_protocol, node); field = MLX5_ADDR_OF(modify_cong_params_in, in, congestion_parameters); mlx5_ib_set_cc_param_mask_val(field, index, var, &attr_mask); field = MLX5_ADDR_OF(modify_cong_params_in, in, field_select); MLX5_SET(field_select_r_roce_rp, field, field_select_r_roce_rp, attr_mask); err = mlx5_cmd_modify_cong_params(dev->mdev, in, inlen); kfree(in); return err; } static int mlx5_ib_cong_params_handler(SYSCTL_HANDLER_ARGS) { struct mlx5_ib_dev *dev = arg1; u64 value; int error; CONG_LOCK(dev); value = dev->congestion.arg[arg2]; if (req != NULL) { error = sysctl_handle_64(oidp, &value, 0, req); if (error || req->newptr == NULL || value == dev->congestion.arg[arg2]) goto done; /* assign new value */ dev->congestion.arg[arg2] = value; } else { error = 0; } if (!MLX5_CAP_GEN(dev->mdev, cc_modify_allowed)) error = EPERM; else { error = -mlx5_ib_set_cc_params(dev, MLX5_IB_INDEX(arg[arg2]), dev->congestion.arg[arg2]); } done: CONG_UNLOCK(dev); return (error); } #define MLX5_GET_UNALIGNED_64(t,p,f) \ (((u64)MLX5_GET(t,p,f##_high) << 32) | MLX5_GET(t,p,f##_low)) static void mlx5_ib_read_cong_stats(struct work_struct *work) { struct mlx5_ib_dev *dev = container_of(work, struct mlx5_ib_dev, congestion.dwork.work); const int outlen = MLX5_ST_SZ_BYTES(query_cong_statistics_out); void *out; out = kzalloc(outlen, GFP_KERNEL); if (!out) goto done; CONG_LOCK(dev); if (mlx5_cmd_query_cong_counter(dev->mdev, 0, out, outlen)) memset(out, 0, outlen); dev->congestion.syndrome = MLX5_GET(query_cong_statistics_out, out, syndrome); dev->congestion.rp_cur_flows = MLX5_GET(query_cong_statistics_out, out, rp_cur_flows); dev->congestion.sum_flows = MLX5_GET(query_cong_statistics_out, out, sum_flows); dev->congestion.rp_cnp_ignored = MLX5_GET_UNALIGNED_64(query_cong_statistics_out, out, rp_cnp_ignored); dev->congestion.rp_cnp_handled = MLX5_GET_UNALIGNED_64(query_cong_statistics_out, out, rp_cnp_handled); dev->congestion.time_stamp = MLX5_GET_UNALIGNED_64(query_cong_statistics_out, out, time_stamp); dev->congestion.accumulators_period = MLX5_GET(query_cong_statistics_out, out, accumulators_period); dev->congestion.np_ecn_marked_roce_packets = MLX5_GET_UNALIGNED_64(query_cong_statistics_out, out, np_ecn_marked_roce_packets); dev->congestion.np_cnp_sent = MLX5_GET_UNALIGNED_64(query_cong_statistics_out, out, np_cnp_sent); CONG_UNLOCK(dev); kfree(out); done: schedule_delayed_work(&dev->congestion.dwork, hz); } void mlx5_ib_cleanup_congestion(struct mlx5_ib_dev *dev) { while (cancel_delayed_work_sync(&dev->congestion.dwork)) ; sysctl_ctx_free(&dev->congestion.ctx); sx_destroy(&dev->congestion.lock); } int mlx5_ib_init_congestion(struct mlx5_ib_dev *dev) { struct sysctl_ctx_list *ctx; struct sysctl_oid *parent; struct sysctl_oid *node; int err; u32 x; ctx = &dev->congestion.ctx; sysctl_ctx_init(ctx); sx_init(&dev->congestion.lock, "mlx5ibcong"); INIT_DELAYED_WORK(&dev->congestion.dwork, mlx5_ib_read_cong_stats); if (!MLX5_CAP_GEN(dev->mdev, cc_query_allowed)) return (0); err = mlx5_ib_get_all_cc_params(dev); if (err) return (err); parent = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(dev->ib_dev.dev.kobj.oidp), OID_AUTO, "cong", CTLFLAG_RW, NULL, "Congestion control"); if (parent == NULL) return (-ENOMEM); node = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(parent), OID_AUTO, "conf", CTLFLAG_RW, NULL, "Configuration"); if (node == NULL) { sysctl_ctx_free(&dev->congestion.ctx); return (-ENOMEM); } for (x = 0; x != MLX5_IB_CONG_PARAMS_NUM; x++) { SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(node), OID_AUTO, mlx5_ib_cong_params_desc[2 * x], CTLTYPE_U64 | CTLFLAG_RWTUN | CTLFLAG_MPSAFE, dev, x, &mlx5_ib_cong_params_handler, "QU", mlx5_ib_cong_params_desc[2 * x + 1]); } node = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(parent), OID_AUTO, "stats", CTLFLAG_RD, NULL, "Statistics"); if (node == NULL) { sysctl_ctx_free(&dev->congestion.ctx); return (-ENOMEM); } for (x = 0; x != MLX5_IB_CONG_STATS_NUM; x++) { /* read-only SYSCTLs */ SYSCTL_ADD_U64(ctx, SYSCTL_CHILDREN(node), OID_AUTO, mlx5_ib_cong_stats_desc[2 * x], CTLFLAG_RD | CTLFLAG_MPSAFE, &dev->congestion.arg[x + MLX5_IB_CONG_PARAMS_NUM], 0, mlx5_ib_cong_stats_desc[2 * x + 1]); } schedule_delayed_work(&dev->congestion.dwork, hz); return (0); } Index: projects/clang1000-import/sys/dev/otus/if_otus.c =================================================================== --- projects/clang1000-import/sys/dev/otus/if_otus.c (revision 358238) +++ projects/clang1000-import/sys/dev/otus/if_otus.c (revision 358239) @@ -1,3244 +1,3245 @@ /* $OpenBSD: if_otus.c,v 1.49 2015/11/24 13:33:18 mpi Exp $ */ /*- * Copyright (c) 2009 Damien Bergamini * Copyright (c) 2015 Adrian Chadd * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ /* * Driver for Atheros AR9001U chipset. */ #include __FBSDID("$FreeBSD$"); #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IEEE80211_SUPPORT_SUPERG #include #endif #include #include #include "usbdevs.h" #define USB_DEBUG_VAR otus_debug #include #include "if_otusreg.h" static int otus_debug = 0; -static SYSCTL_NODE(_hw_usb, OID_AUTO, otus, CTLFLAG_RW, 0, "USB otus"); +static SYSCTL_NODE(_hw_usb, OID_AUTO, otus, CTLFLAG_RW | CTLFLAG_MPSAFE, 0, + "USB otus"); SYSCTL_INT(_hw_usb_otus, OID_AUTO, debug, CTLFLAG_RWTUN, &otus_debug, 0, "Debug level"); #define OTUS_DEBUG_XMIT 0x00000001 #define OTUS_DEBUG_RECV 0x00000002 #define OTUS_DEBUG_TXDONE 0x00000004 #define OTUS_DEBUG_RXDONE 0x00000008 #define OTUS_DEBUG_CMD 0x00000010 #define OTUS_DEBUG_CMDDONE 0x00000020 #define OTUS_DEBUG_RESET 0x00000040 #define OTUS_DEBUG_STATE 0x00000080 #define OTUS_DEBUG_CMDNOTIFY 0x00000100 #define OTUS_DEBUG_REGIO 0x00000200 #define OTUS_DEBUG_IRQ 0x00000400 #define OTUS_DEBUG_TXCOMP 0x00000800 #define OTUS_DEBUG_ANY 0xffffffff #define OTUS_DPRINTF(sc, dm, ...) \ do { \ if ((dm == OTUS_DEBUG_ANY) || (dm & otus_debug)) \ device_printf(sc->sc_dev, __VA_ARGS__); \ } while (0) #define OTUS_DEV(v, p) { USB_VPI(v, p, 0) } static const STRUCT_USB_HOST_ID otus_devs[] = { OTUS_DEV(USB_VENDOR_ACCTON, USB_PRODUCT_ACCTON_WN7512), OTUS_DEV(USB_VENDOR_ATHEROS2, USB_PRODUCT_ATHEROS2_3CRUSBN275), OTUS_DEV(USB_VENDOR_ATHEROS2, USB_PRODUCT_ATHEROS2_TG121N), OTUS_DEV(USB_VENDOR_ATHEROS2, USB_PRODUCT_ATHEROS2_AR9170), OTUS_DEV(USB_VENDOR_ATHEROS2, USB_PRODUCT_ATHEROS2_WN612), OTUS_DEV(USB_VENDOR_ATHEROS2, USB_PRODUCT_ATHEROS2_WN821NV2), OTUS_DEV(USB_VENDOR_AVM, USB_PRODUCT_AVM_FRITZWLAN), OTUS_DEV(USB_VENDOR_CACE, USB_PRODUCT_CACE_AIRPCAPNX), OTUS_DEV(USB_VENDOR_DLINK2, USB_PRODUCT_DLINK2_DWA130D1), OTUS_DEV(USB_VENDOR_DLINK2, USB_PRODUCT_DLINK2_DWA160A1), OTUS_DEV(USB_VENDOR_DLINK2, USB_PRODUCT_DLINK2_DWA160A2), OTUS_DEV(USB_VENDOR_IODATA, USB_PRODUCT_IODATA_WNGDNUS2), OTUS_DEV(USB_VENDOR_NEC, USB_PRODUCT_NEC_WL300NUG), OTUS_DEV(USB_VENDOR_NETGEAR, USB_PRODUCT_NETGEAR_WN111V2), OTUS_DEV(USB_VENDOR_NETGEAR, USB_PRODUCT_NETGEAR_WNA1000), OTUS_DEV(USB_VENDOR_NETGEAR, USB_PRODUCT_NETGEAR_WNDA3100), OTUS_DEV(USB_VENDOR_PLANEX2, USB_PRODUCT_PLANEX2_GW_US300), OTUS_DEV(USB_VENDOR_WISTRONNEWEB, USB_PRODUCT_WISTRONNEWEB_O8494), OTUS_DEV(USB_VENDOR_WISTRONNEWEB, USB_PRODUCT_WISTRONNEWEB_WNC0600), OTUS_DEV(USB_VENDOR_ZCOM, USB_PRODUCT_ZCOM_UB81), OTUS_DEV(USB_VENDOR_ZCOM, USB_PRODUCT_ZCOM_UB82), OTUS_DEV(USB_VENDOR_ZYDAS, USB_PRODUCT_ZYDAS_ZD1221), OTUS_DEV(USB_VENDOR_ZYXEL, USB_PRODUCT_ZYXEL_NWD271N), }; static device_probe_t otus_match; static device_attach_t otus_attach; static device_detach_t otus_detach; static int otus_attachhook(struct otus_softc *); void otus_get_chanlist(struct otus_softc *); static void otus_getradiocaps(struct ieee80211com *, int, int *, struct ieee80211_channel[]); int otus_load_firmware(struct otus_softc *, const char *, uint32_t); int otus_open_pipes(struct otus_softc *); void otus_close_pipes(struct otus_softc *); static int otus_alloc_tx_cmd_list(struct otus_softc *); static void otus_free_tx_cmd_list(struct otus_softc *); static int otus_alloc_rx_list(struct otus_softc *); static void otus_free_rx_list(struct otus_softc *); static int otus_alloc_tx_list(struct otus_softc *); static void otus_free_tx_list(struct otus_softc *); static void otus_free_list(struct otus_softc *, struct otus_data [], int); static struct otus_data *_otus_getbuf(struct otus_softc *); static struct otus_data *otus_getbuf(struct otus_softc *); static void otus_freebuf(struct otus_softc *, struct otus_data *); static struct otus_tx_cmd *_otus_get_txcmd(struct otus_softc *); static struct otus_tx_cmd *otus_get_txcmd(struct otus_softc *); static void otus_free_txcmd(struct otus_softc *, struct otus_tx_cmd *); void otus_next_scan(void *, int); static void otus_tx_task(void *, int pending); void otus_do_async(struct otus_softc *, void (*)(struct otus_softc *, void *), void *, int); int otus_newstate(struct ieee80211vap *, enum ieee80211_state, int); int otus_cmd(struct otus_softc *, uint8_t, const void *, int, void *, int); void otus_write(struct otus_softc *, uint32_t, uint32_t); int otus_write_barrier(struct otus_softc *); static struct ieee80211_node *otus_node_alloc(struct ieee80211vap *vap, const uint8_t mac[IEEE80211_ADDR_LEN]); int otus_media_change(struct ifnet *); int otus_read_eeprom(struct otus_softc *); void otus_newassoc(struct ieee80211_node *, int); void otus_cmd_rxeof(struct otus_softc *, uint8_t *, int); void otus_sub_rxeof(struct otus_softc *, uint8_t *, int, struct mbufq *); static int otus_tx(struct otus_softc *, struct ieee80211_node *, struct mbuf *, struct otus_data *, const struct ieee80211_bpf_params *); int otus_ioctl(struct ifnet *, u_long, caddr_t); int otus_set_multi(struct otus_softc *); static int otus_updateedca(struct ieee80211com *); static void otus_updateedca_locked(struct otus_softc *); static void otus_updateslot(struct otus_softc *); static void otus_set_operating_mode(struct otus_softc *sc); static void otus_set_rx_filter(struct otus_softc *sc); int otus_init_mac(struct otus_softc *); uint32_t otus_phy_get_def(struct otus_softc *, uint32_t); int otus_set_board_values(struct otus_softc *, struct ieee80211_channel *); int otus_program_phy(struct otus_softc *, struct ieee80211_channel *); int otus_set_rf_bank4(struct otus_softc *, struct ieee80211_channel *); void otus_get_delta_slope(uint32_t, uint32_t *, uint32_t *); static int otus_set_chan(struct otus_softc *, struct ieee80211_channel *, int); int otus_set_key(struct ieee80211com *, struct ieee80211_node *, struct ieee80211_key *); void otus_set_key_cb(struct otus_softc *, void *); void otus_delete_key(struct ieee80211com *, struct ieee80211_node *, struct ieee80211_key *); void otus_delete_key_cb(struct otus_softc *, void *); void otus_calibrate_to(void *, int); int otus_set_bssid(struct otus_softc *, const uint8_t *); int otus_set_macaddr(struct otus_softc *, const uint8_t *); void otus_led_newstate_type1(struct otus_softc *); void otus_led_newstate_type2(struct otus_softc *); void otus_led_newstate_type3(struct otus_softc *); int otus_init(struct otus_softc *sc); void otus_stop(struct otus_softc *sc); static device_method_t otus_methods[] = { DEVMETHOD(device_probe, otus_match), DEVMETHOD(device_attach, otus_attach), DEVMETHOD(device_detach, otus_detach), DEVMETHOD_END }; static driver_t otus_driver = { .name = "otus", .methods = otus_methods, .size = sizeof(struct otus_softc) }; static devclass_t otus_devclass; DRIVER_MODULE(otus, uhub, otus_driver, otus_devclass, NULL, 0); MODULE_DEPEND(otus, wlan, 1, 1, 1); MODULE_DEPEND(otus, usb, 1, 1, 1); MODULE_DEPEND(otus, firmware, 1, 1, 1); MODULE_VERSION(otus, 1); static usb_callback_t otus_bulk_tx_callback; static usb_callback_t otus_bulk_rx_callback; static usb_callback_t otus_bulk_irq_callback; static usb_callback_t otus_bulk_cmd_callback; static const struct usb_config otus_config[OTUS_N_XFER] = { [OTUS_BULK_TX] = { .type = UE_BULK, .endpoint = UE_ADDR_ANY, .direction = UE_DIR_OUT, .bufsize = 0x200, .flags = {.pipe_bof = 1,.force_short_xfer = 1,}, .callback = otus_bulk_tx_callback, .timeout = 5000, /* ms */ }, [OTUS_BULK_RX] = { .type = UE_BULK, .endpoint = UE_ADDR_ANY, .direction = UE_DIR_IN, .bufsize = OTUS_RXBUFSZ, .flags = { .ext_buffer = 1, .pipe_bof = 1,.short_xfer_ok = 1,}, .callback = otus_bulk_rx_callback, }, [OTUS_BULK_IRQ] = { .type = UE_INTERRUPT, .endpoint = UE_ADDR_ANY, .direction = UE_DIR_IN, .bufsize = OTUS_MAX_CTRLSZ, .flags = {.pipe_bof = 1,.short_xfer_ok = 1,}, .callback = otus_bulk_irq_callback, }, [OTUS_BULK_CMD] = { .type = UE_INTERRUPT, .endpoint = UE_ADDR_ANY, .direction = UE_DIR_OUT, .bufsize = OTUS_MAX_CTRLSZ, .flags = {.pipe_bof = 1,.force_short_xfer = 1,}, .callback = otus_bulk_cmd_callback, .timeout = 5000, /* ms */ }, }; static int otus_match(device_t self) { struct usb_attach_arg *uaa = device_get_ivars(self); if (uaa->usb_mode != USB_MODE_HOST || uaa->info.bIfaceIndex != 0 || uaa->info.bConfigIndex != 0) return (ENXIO); return (usbd_lookup_id_by_uaa(otus_devs, sizeof(otus_devs), uaa)); } static int otus_attach(device_t self) { struct usb_attach_arg *uaa = device_get_ivars(self); struct otus_softc *sc = device_get_softc(self); int error; uint8_t iface_index; device_set_usb_desc(self); sc->sc_udev = uaa->device; sc->sc_dev = self; mtx_init(&sc->sc_mtx, device_get_nameunit(self), MTX_NETWORK_LOCK, MTX_DEF); TIMEOUT_TASK_INIT(taskqueue_thread, &sc->scan_to, 0, otus_next_scan, sc); TIMEOUT_TASK_INIT(taskqueue_thread, &sc->calib_to, 0, otus_calibrate_to, sc); TASK_INIT(&sc->tx_task, 0, otus_tx_task, sc); mbufq_init(&sc->sc_snd, ifqmaxlen); iface_index = 0; error = usbd_transfer_setup(uaa->device, &iface_index, sc->sc_xfer, otus_config, OTUS_N_XFER, sc, &sc->sc_mtx); if (error) { device_printf(sc->sc_dev, "could not allocate USB transfers, err=%s\n", usbd_errstr(error)); goto fail_usb; } if ((error = otus_open_pipes(sc)) != 0) { device_printf(sc->sc_dev, "%s: could not open pipes\n", __func__); goto fail; } /* XXX check return status; fail out if appropriate */ if (otus_attachhook(sc) != 0) goto fail; return (0); fail: otus_close_pipes(sc); fail_usb: mtx_destroy(&sc->sc_mtx); return (ENXIO); } static int otus_detach(device_t self) { struct otus_softc *sc = device_get_softc(self); struct ieee80211com *ic = &sc->sc_ic; otus_stop(sc); usbd_transfer_unsetup(sc->sc_xfer, OTUS_N_XFER); taskqueue_drain_timeout(taskqueue_thread, &sc->scan_to); taskqueue_drain_timeout(taskqueue_thread, &sc->calib_to); taskqueue_drain(taskqueue_thread, &sc->tx_task); otus_close_pipes(sc); #if 0 /* Wait for all queued asynchronous commands to complete. */ usb_rem_wait_task(sc->sc_udev, &sc->sc_task); usbd_ref_wait(sc->sc_udev); #endif ieee80211_ifdetach(ic); mtx_destroy(&sc->sc_mtx); return 0; } static void otus_delay_ms(struct otus_softc *sc, int ms) { DELAY(1000 * ms); } static struct ieee80211vap * otus_vap_create(struct ieee80211com *ic, const char name[IFNAMSIZ], int unit, enum ieee80211_opmode opmode, int flags, const uint8_t bssid[IEEE80211_ADDR_LEN], const uint8_t mac[IEEE80211_ADDR_LEN]) { struct otus_vap *uvp; struct ieee80211vap *vap; if (!TAILQ_EMPTY(&ic->ic_vaps)) /* only one at a time */ return (NULL); uvp = malloc(sizeof(struct otus_vap), M_80211_VAP, M_WAITOK | M_ZERO); vap = &uvp->vap; if (ieee80211_vap_setup(ic, vap, name, unit, opmode, flags, bssid) != 0) { /* out of memory */ free(uvp, M_80211_VAP); return (NULL); } /* override state transition machine */ uvp->newstate = vap->iv_newstate; vap->iv_newstate = otus_newstate; /* XXX TODO: double-check */ vap->iv_ampdu_density = IEEE80211_HTCAP_MPDUDENSITY_16; vap->iv_ampdu_rxmax = IEEE80211_HTCAP_MAXRXAMPDU_32K; ieee80211_ratectl_init(vap); /* complete setup */ ieee80211_vap_attach(vap, ieee80211_media_change, ieee80211_media_status, mac); ic->ic_opmode = opmode; return (vap); } static void otus_vap_delete(struct ieee80211vap *vap) { struct otus_vap *uvp = OTUS_VAP(vap); ieee80211_ratectl_deinit(vap); ieee80211_vap_detach(vap); free(uvp, M_80211_VAP); } static void otus_parent(struct ieee80211com *ic) { struct otus_softc *sc = ic->ic_softc; int startall = 0; if (ic->ic_nrunning > 0) { if (!sc->sc_running) { otus_init(sc); startall = 1; } else { (void) otus_set_multi(sc); } } else if (sc->sc_running) otus_stop(sc); if (startall) ieee80211_start_all(ic); } static void otus_drain_mbufq(struct otus_softc *sc) { struct mbuf *m; struct ieee80211_node *ni; OTUS_LOCK_ASSERT(sc); while ((m = mbufq_dequeue(&sc->sc_snd)) != NULL) { ni = (struct ieee80211_node *) m->m_pkthdr.rcvif; m->m_pkthdr.rcvif = NULL; ieee80211_free_node(ni); m_freem(m); } } static void otus_tx_start(struct otus_softc *sc) { taskqueue_enqueue(taskqueue_thread, &sc->tx_task); } static int otus_transmit(struct ieee80211com *ic, struct mbuf *m) { struct otus_softc *sc = ic->ic_softc; int error; OTUS_LOCK(sc); if (! sc->sc_running) { OTUS_UNLOCK(sc); return (ENXIO); } /* XXX TODO: handle fragments */ error = mbufq_enqueue(&sc->sc_snd, m); if (error) { OTUS_DPRINTF(sc, OTUS_DEBUG_XMIT, "%s: mbufq_enqueue failed: %d\n", __func__, error); OTUS_UNLOCK(sc); return (error); } OTUS_UNLOCK(sc); /* Kick TX */ otus_tx_start(sc); return (0); } static void _otus_start(struct otus_softc *sc) { struct ieee80211_node *ni; struct otus_data *bf; struct mbuf *m; OTUS_LOCK_ASSERT(sc); while ((m = mbufq_dequeue(&sc->sc_snd)) != NULL) { bf = otus_getbuf(sc); if (bf == NULL) { OTUS_DPRINTF(sc, OTUS_DEBUG_XMIT, "%s: failed to get buffer\n", __func__); mbufq_prepend(&sc->sc_snd, m); break; } ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; m->m_pkthdr.rcvif = NULL; if (otus_tx(sc, ni, m, bf, NULL) != 0) { OTUS_DPRINTF(sc, OTUS_DEBUG_XMIT, "%s: failed to transmit\n", __func__); if_inc_counter(ni->ni_vap->iv_ifp, IFCOUNTER_OERRORS, 1); otus_freebuf(sc, bf); ieee80211_free_node(ni); m_freem(m); break; } } } static void otus_tx_task(void *arg, int pending) { struct otus_softc *sc = arg; OTUS_LOCK(sc); _otus_start(sc); OTUS_UNLOCK(sc); } static int otus_raw_xmit(struct ieee80211_node *ni, struct mbuf *m, const struct ieee80211_bpf_params *params) { struct ieee80211com *ic= ni->ni_ic; struct otus_softc *sc = ic->ic_softc; struct otus_data *bf = NULL; int error = 0; /* Don't transmit if we're not running */ OTUS_LOCK(sc); if (! sc->sc_running) { error = ENETDOWN; goto error; } bf = otus_getbuf(sc); if (bf == NULL) { error = ENOBUFS; goto error; } if (otus_tx(sc, ni, m, bf, params) != 0) { error = EIO; goto error; } OTUS_UNLOCK(sc); return (0); error: if (bf) otus_freebuf(sc, bf); OTUS_UNLOCK(sc); m_freem(m); return (ENXIO); } static void otus_update_chw(struct ieee80211com *ic) { printf("%s: TODO\n", __func__); } static void otus_set_channel(struct ieee80211com *ic) { struct otus_softc *sc = ic->ic_softc; OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "%s: set channel: %d\n", __func__, ic->ic_curchan->ic_freq); OTUS_LOCK(sc); (void) otus_set_chan(sc, ic->ic_curchan, 0); OTUS_UNLOCK(sc); } static int otus_ampdu_enable(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap) { /* For now, no A-MPDU TX support in the driver */ return (0); } static void otus_scan_start(struct ieee80211com *ic) { // printf("%s: TODO\n", __func__); } static void otus_scan_end(struct ieee80211com *ic) { // printf("%s: TODO\n", __func__); } static void otus_update_mcast(struct ieee80211com *ic) { struct otus_softc *sc = ic->ic_softc; (void) otus_set_multi(sc); } static int otus_attachhook(struct otus_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; usb_device_request_t req; uint32_t in, out; int error; /* Not locked */ error = otus_load_firmware(sc, "otusfw_init", AR_FW_INIT_ADDR); if (error != 0) { device_printf(sc->sc_dev, "%s: could not load %s firmware\n", __func__, "init"); return (ENXIO); } /* XXX not locked? */ otus_delay_ms(sc, 1000); /* Not locked */ error = otus_load_firmware(sc, "otusfw_main", AR_FW_MAIN_ADDR); if (error != 0) { device_printf(sc->sc_dev, "%s: could not load %s firmware\n", __func__, "main"); return (ENXIO); } OTUS_LOCK(sc); /* Tell device that firmware transfer is complete. */ req.bmRequestType = UT_WRITE_VENDOR_DEVICE; req.bRequest = AR_FW_DOWNLOAD_COMPLETE; USETW(req.wValue, 0); USETW(req.wIndex, 0); USETW(req.wLength, 0); if (usbd_do_request_flags(sc->sc_udev, &sc->sc_mtx, &req, NULL, 0, NULL, 250) != 0) { OTUS_UNLOCK(sc); device_printf(sc->sc_dev, "%s: firmware initialization failed\n", __func__); return (ENXIO); } /* Send an ECHO command to check that everything is settled. */ in = 0xbadc0ffe; if (otus_cmd(sc, AR_CMD_ECHO, &in, sizeof in, &out, sizeof(out)) != 0) { OTUS_UNLOCK(sc); device_printf(sc->sc_dev, "%s: echo command failed\n", __func__); return (ENXIO); } if (in != out) { OTUS_UNLOCK(sc); device_printf(sc->sc_dev, "%s: echo reply mismatch: 0x%08x!=0x%08x\n", __func__, in, out); return (ENXIO); } /* Read entire EEPROM. */ if (otus_read_eeprom(sc) != 0) { OTUS_UNLOCK(sc); device_printf(sc->sc_dev, "%s: could not read EEPROM\n", __func__); return (ENXIO); } OTUS_UNLOCK(sc); sc->txmask = sc->eeprom.baseEepHeader.txMask; sc->rxmask = sc->eeprom.baseEepHeader.rxMask; sc->capflags = sc->eeprom.baseEepHeader.opCapFlags; IEEE80211_ADDR_COPY(ic->ic_macaddr, sc->eeprom.baseEepHeader.macAddr); sc->sc_led_newstate = otus_led_newstate_type3; /* XXX */ device_printf(sc->sc_dev, "MAC/BBP AR9170, RF AR%X, MIMO %dT%dR, address %s\n", (sc->capflags & AR5416_OPFLAGS_11A) ? 0x9104 : ((sc->txmask == 0x5) ? 0x9102 : 0x9101), (sc->txmask == 0x5) ? 2 : 1, (sc->rxmask == 0x5) ? 2 : 1, ether_sprintf(ic->ic_macaddr)); ic->ic_softc = sc; ic->ic_name = device_get_nameunit(sc->sc_dev); ic->ic_phytype = IEEE80211_T_OFDM; /* not only, but not used */ ic->ic_opmode = IEEE80211_M_STA; /* default to BSS mode */ /* Set device capabilities. */ ic->ic_caps = IEEE80211_C_STA | /* station mode */ #if 0 IEEE80211_C_BGSCAN | /* Background scan. */ #endif IEEE80211_C_SHPREAMBLE | /* Short preamble supported. */ IEEE80211_C_WME | /* WME/QoS */ IEEE80211_C_SHSLOT | /* Short slot time supported. */ IEEE80211_C_FF | /* Atheros fast-frames supported. */ IEEE80211_C_MONITOR | IEEE80211_C_WPA; /* WPA/RSN. */ /* XXX TODO: 11n */ #if 0 if (sc->eeprom.baseEepHeader.opCapFlags & AR5416_OPFLAGS_11G) { /* Set supported .11b and .11g rates. */ ic->ic_sup_rates[IEEE80211_MODE_11B] = ieee80211_std_rateset_11b; ic->ic_sup_rates[IEEE80211_MODE_11G] = ieee80211_std_rateset_11g; } if (sc->eeprom.baseEepHeader.opCapFlags & AR5416_OPFLAGS_11A) { /* Set supported .11a rates. */ ic->ic_sup_rates[IEEE80211_MODE_11A] = ieee80211_std_rateset_11a; } #endif #if 0 /* Build the list of supported channels. */ otus_get_chanlist(sc); #else otus_getradiocaps(ic, IEEE80211_CHAN_MAX, &ic->ic_nchans, ic->ic_channels); #endif ieee80211_ifattach(ic); ic->ic_raw_xmit = otus_raw_xmit; ic->ic_scan_start = otus_scan_start; ic->ic_scan_end = otus_scan_end; ic->ic_set_channel = otus_set_channel; ic->ic_getradiocaps = otus_getradiocaps; ic->ic_vap_create = otus_vap_create; ic->ic_vap_delete = otus_vap_delete; ic->ic_update_mcast = otus_update_mcast; ic->ic_update_promisc = otus_update_mcast; ic->ic_parent = otus_parent; ic->ic_transmit = otus_transmit; ic->ic_update_chw = otus_update_chw; ic->ic_ampdu_enable = otus_ampdu_enable; ic->ic_wme.wme_update = otus_updateedca; ic->ic_newassoc = otus_newassoc; ic->ic_node_alloc = otus_node_alloc; #ifdef notyet ic->ic_set_key = otus_set_key; ic->ic_delete_key = otus_delete_key; #endif ieee80211_radiotap_attach(ic, &sc->sc_txtap.wt_ihdr, sizeof(sc->sc_txtap), OTUS_TX_RADIOTAP_PRESENT, &sc->sc_rxtap.wr_ihdr, sizeof(sc->sc_rxtap), OTUS_RX_RADIOTAP_PRESENT); return (0); } void otus_get_chanlist(struct otus_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; uint16_t domain; uint8_t chan; int i; /* XXX regulatory domain. */ domain = le16toh(sc->eeprom.baseEepHeader.regDmn[0]); OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "regdomain=0x%04x\n", domain); if (sc->eeprom.baseEepHeader.opCapFlags & AR5416_OPFLAGS_11G) { for (i = 0; i < 14; i++) { chan = ar_chans[i]; ic->ic_channels[chan].ic_freq = ieee80211_ieee2mhz(chan, IEEE80211_CHAN_2GHZ); ic->ic_channels[chan].ic_flags = IEEE80211_CHAN_CCK | IEEE80211_CHAN_OFDM | IEEE80211_CHAN_DYN | IEEE80211_CHAN_2GHZ; } } if (sc->eeprom.baseEepHeader.opCapFlags & AR5416_OPFLAGS_11A) { for (i = 14; i < nitems(ar_chans); i++) { chan = ar_chans[i]; ic->ic_channels[chan].ic_freq = ieee80211_ieee2mhz(chan, IEEE80211_CHAN_5GHZ); ic->ic_channels[chan].ic_flags = IEEE80211_CHAN_A; } } } static void otus_getradiocaps(struct ieee80211com *ic, int maxchans, int *nchans, struct ieee80211_channel chans[]) { struct otus_softc *sc = ic->ic_softc; uint8_t bands[IEEE80211_MODE_BYTES]; /* Set supported .11b and .11g rates. */ memset(bands, 0, sizeof(bands)); if (sc->eeprom.baseEepHeader.opCapFlags & AR5416_OPFLAGS_11G) { setbit(bands, IEEE80211_MODE_11B); setbit(bands, IEEE80211_MODE_11G); #if 0 if (sc->sc_ht) setbit(bands, IEEE80211_MODE_11NG); #endif ieee80211_add_channel_list_2ghz(chans, maxchans, nchans, ar_chans, 14, bands, 0); } if (sc->eeprom.baseEepHeader.opCapFlags & AR5416_OPFLAGS_11A) { setbit(bands, IEEE80211_MODE_11A); ieee80211_add_channel_list_5ghz(chans, maxchans, nchans, &ar_chans[14], nitems(ar_chans) - 14, bands, 0); } } int otus_load_firmware(struct otus_softc *sc, const char *name, uint32_t addr) { usb_device_request_t req; char *ptr; const struct firmware *fw; int mlen, error, size; error = 0; /* Read firmware image from the filesystem. */ if ((fw = firmware_get(name)) == NULL) { device_printf(sc->sc_dev, "%s: failed loadfirmware of file %s\n", __func__, name); return (ENXIO); } req.bmRequestType = UT_WRITE_VENDOR_DEVICE; req.bRequest = AR_FW_DOWNLOAD; USETW(req.wIndex, 0); OTUS_LOCK(sc); /* XXX const */ ptr = __DECONST(char *, fw->data); size = fw->datasize; addr >>= 8; while (size > 0) { mlen = MIN(size, 4096); USETW(req.wValue, addr); USETW(req.wLength, mlen); if (usbd_do_request_flags(sc->sc_udev, &sc->sc_mtx, &req, ptr, 0, NULL, 250) != 0) { error = EIO; break; } addr += mlen >> 8; ptr += mlen; size -= mlen; } OTUS_UNLOCK(sc); firmware_put(fw, FIRMWARE_UNLOAD); if (error != 0) device_printf(sc->sc_dev, "%s: %s: error=%d\n", __func__, name, error); return error; } int otus_open_pipes(struct otus_softc *sc) { #if 0 int isize, error; int i; #endif int error; OTUS_UNLOCK_ASSERT(sc); if ((error = otus_alloc_tx_cmd_list(sc)) != 0) { device_printf(sc->sc_dev, "%s: could not allocate command xfer\n", __func__); goto fail; } if ((error = otus_alloc_tx_list(sc)) != 0) { device_printf(sc->sc_dev, "%s: could not allocate Tx xfers\n", __func__); goto fail; } if ((error = otus_alloc_rx_list(sc)) != 0) { device_printf(sc->sc_dev, "%s: could not allocate Rx xfers\n", __func__); goto fail; } /* Enable RX transfers; needed for initial firmware messages */ OTUS_LOCK(sc); usbd_transfer_start(sc->sc_xfer[OTUS_BULK_RX]); usbd_transfer_start(sc->sc_xfer[OTUS_BULK_IRQ]); OTUS_UNLOCK(sc); return 0; fail: otus_close_pipes(sc); return error; } void otus_close_pipes(struct otus_softc *sc) { OTUS_LOCK(sc); otus_free_tx_cmd_list(sc); otus_free_tx_list(sc); otus_free_rx_list(sc); OTUS_UNLOCK(sc); usbd_transfer_unsetup(sc->sc_xfer, OTUS_N_XFER); } static void otus_free_cmd_list(struct otus_softc *sc, struct otus_tx_cmd cmd[], int ndata) { int i; /* XXX TODO: someone has to have waken up waiters! */ for (i = 0; i < ndata; i++) { struct otus_tx_cmd *dp = &cmd[i]; if (dp->buf != NULL) { free(dp->buf, M_USBDEV); dp->buf = NULL; } } } static int otus_alloc_cmd_list(struct otus_softc *sc, struct otus_tx_cmd cmd[], int ndata, int maxsz) { int i, error; for (i = 0; i < ndata; i++) { struct otus_tx_cmd *dp = &cmd[i]; dp->buf = malloc(maxsz, M_USBDEV, M_NOWAIT | M_ZERO); dp->odata = NULL; if (dp->buf == NULL) { device_printf(sc->sc_dev, "could not allocate buffer\n"); error = ENOMEM; goto fail; } } return (0); fail: otus_free_cmd_list(sc, cmd, ndata); return (error); } static int otus_alloc_tx_cmd_list(struct otus_softc *sc) { int error, i; error = otus_alloc_cmd_list(sc, sc->sc_cmd, OTUS_CMD_LIST_COUNT, OTUS_MAX_TXCMDSZ); if (error != 0) return (error); STAILQ_INIT(&sc->sc_cmd_active); STAILQ_INIT(&sc->sc_cmd_inactive); STAILQ_INIT(&sc->sc_cmd_pending); STAILQ_INIT(&sc->sc_cmd_waiting); for (i = 0; i < OTUS_CMD_LIST_COUNT; i++) STAILQ_INSERT_HEAD(&sc->sc_cmd_inactive, &sc->sc_cmd[i], next_cmd); return (0); } static void otus_free_tx_cmd_list(struct otus_softc *sc) { /* * XXX TODO: something needs to wake up any pending/sleeping * waiters! */ STAILQ_INIT(&sc->sc_cmd_active); STAILQ_INIT(&sc->sc_cmd_inactive); STAILQ_INIT(&sc->sc_cmd_pending); STAILQ_INIT(&sc->sc_cmd_waiting); otus_free_cmd_list(sc, sc->sc_cmd, OTUS_CMD_LIST_COUNT); } static int otus_alloc_list(struct otus_softc *sc, struct otus_data data[], int ndata, int maxsz) { int i, error; for (i = 0; i < ndata; i++) { struct otus_data *dp = &data[i]; dp->sc = sc; dp->m = NULL; dp->buf = malloc(maxsz, M_USBDEV, M_NOWAIT | M_ZERO); if (dp->buf == NULL) { device_printf(sc->sc_dev, "could not allocate buffer\n"); error = ENOMEM; goto fail; } dp->ni = NULL; } return (0); fail: otus_free_list(sc, data, ndata); return (error); } static int otus_alloc_rx_list(struct otus_softc *sc) { int error, i; error = otus_alloc_list(sc, sc->sc_rx, OTUS_RX_LIST_COUNT, OTUS_RXBUFSZ); if (error != 0) return (error); STAILQ_INIT(&sc->sc_rx_active); STAILQ_INIT(&sc->sc_rx_inactive); for (i = 0; i < OTUS_RX_LIST_COUNT; i++) STAILQ_INSERT_HEAD(&sc->sc_rx_inactive, &sc->sc_rx[i], next); return (0); } static int otus_alloc_tx_list(struct otus_softc *sc) { int error, i; error = otus_alloc_list(sc, sc->sc_tx, OTUS_TX_LIST_COUNT, OTUS_TXBUFSZ); if (error != 0) return (error); STAILQ_INIT(&sc->sc_tx_inactive); for (i = 0; i != OTUS_N_XFER; i++) { STAILQ_INIT(&sc->sc_tx_active[i]); STAILQ_INIT(&sc->sc_tx_pending[i]); } for (i = 0; i < OTUS_TX_LIST_COUNT; i++) { STAILQ_INSERT_HEAD(&sc->sc_tx_inactive, &sc->sc_tx[i], next); } return (0); } static void otus_free_tx_list(struct otus_softc *sc) { int i; /* prevent further allocations from TX list(s) */ STAILQ_INIT(&sc->sc_tx_inactive); for (i = 0; i != OTUS_N_XFER; i++) { STAILQ_INIT(&sc->sc_tx_active[i]); STAILQ_INIT(&sc->sc_tx_pending[i]); } otus_free_list(sc, sc->sc_tx, OTUS_TX_LIST_COUNT); } static void otus_free_rx_list(struct otus_softc *sc) { /* prevent further allocations from RX list(s) */ STAILQ_INIT(&sc->sc_rx_inactive); STAILQ_INIT(&sc->sc_rx_active); otus_free_list(sc, sc->sc_rx, OTUS_RX_LIST_COUNT); } static void otus_free_list(struct otus_softc *sc, struct otus_data data[], int ndata) { int i; for (i = 0; i < ndata; i++) { struct otus_data *dp = &data[i]; if (dp->buf != NULL) { free(dp->buf, M_USBDEV); dp->buf = NULL; } if (dp->ni != NULL) { ieee80211_free_node(dp->ni); dp->ni = NULL; } } } static struct otus_data * _otus_getbuf(struct otus_softc *sc) { struct otus_data *bf; bf = STAILQ_FIRST(&sc->sc_tx_inactive); if (bf != NULL) STAILQ_REMOVE_HEAD(&sc->sc_tx_inactive, next); else bf = NULL; /* XXX bzero? */ return (bf); } static struct otus_data * otus_getbuf(struct otus_softc *sc) { struct otus_data *bf; OTUS_LOCK_ASSERT(sc); bf = _otus_getbuf(sc); return (bf); } static void otus_freebuf(struct otus_softc *sc, struct otus_data *bf) { OTUS_LOCK_ASSERT(sc); STAILQ_INSERT_TAIL(&sc->sc_tx_inactive, bf, next); } static struct otus_tx_cmd * _otus_get_txcmd(struct otus_softc *sc) { struct otus_tx_cmd *bf; bf = STAILQ_FIRST(&sc->sc_cmd_inactive); if (bf != NULL) STAILQ_REMOVE_HEAD(&sc->sc_cmd_inactive, next_cmd); else bf = NULL; return (bf); } static struct otus_tx_cmd * otus_get_txcmd(struct otus_softc *sc) { struct otus_tx_cmd *bf; OTUS_LOCK_ASSERT(sc); bf = _otus_get_txcmd(sc); if (bf == NULL) { device_printf(sc->sc_dev, "%s: no tx cmd buffers\n", __func__); } return (bf); } static void otus_free_txcmd(struct otus_softc *sc, struct otus_tx_cmd *bf) { OTUS_LOCK_ASSERT(sc); STAILQ_INSERT_TAIL(&sc->sc_cmd_inactive, bf, next_cmd); } void otus_next_scan(void *arg, int pending) { #if 0 struct otus_softc *sc = arg; if (usbd_is_dying(sc->sc_udev)) return; usbd_ref_incr(sc->sc_udev); if (sc->sc_ic.ic_state == IEEE80211_S_SCAN) ieee80211_next_scan(&sc->sc_ic.ic_if); usbd_ref_decr(sc->sc_udev); #endif } int otus_newstate(struct ieee80211vap *vap, enum ieee80211_state nstate, int arg) { struct otus_vap *uvp = OTUS_VAP(vap); struct ieee80211com *ic = vap->iv_ic; struct otus_softc *sc = ic->ic_softc; enum ieee80211_state ostate; ostate = vap->iv_state; OTUS_DPRINTF(sc, OTUS_DEBUG_STATE, "%s: %s -> %s\n", __func__, ieee80211_state_name[ostate], ieee80211_state_name[nstate]); IEEE80211_UNLOCK(ic); OTUS_LOCK(sc); /* XXX TODO: more fleshing out! */ switch (nstate) { case IEEE80211_S_INIT: otus_set_operating_mode(sc); otus_set_rx_filter(sc); break; case IEEE80211_S_RUN: if (ic->ic_opmode == IEEE80211_M_STA) { otus_updateslot(sc); otus_set_operating_mode(sc); otus_set_rx_filter(sc); /* Start calibration timer. */ taskqueue_enqueue_timeout(taskqueue_thread, &sc->calib_to, hz); } break; default: break; } /* XXX TODO: calibration? */ sc->sc_led_newstate(sc); OTUS_UNLOCK(sc); IEEE80211_LOCK(ic); return (uvp->newstate(vap, nstate, arg)); } int otus_cmd(struct otus_softc *sc, uint8_t code, const void *idata, int ilen, void *odata, int odatalen) { struct otus_tx_cmd *cmd; struct ar_cmd_hdr *hdr; int xferlen, error; OTUS_LOCK_ASSERT(sc); /* Always bulk-out a multiple of 4 bytes. */ xferlen = (sizeof (*hdr) + ilen + 3) & ~3; if (xferlen > OTUS_MAX_TXCMDSZ) { device_printf(sc->sc_dev, "%s: command (0x%02x) size (%d) > %d\n", __func__, code, xferlen, OTUS_MAX_TXCMDSZ); return (EIO); } cmd = otus_get_txcmd(sc); if (cmd == NULL) { device_printf(sc->sc_dev, "%s: failed to get buf\n", __func__); return (EIO); } hdr = (struct ar_cmd_hdr *)cmd->buf; hdr->code = code; hdr->len = ilen; hdr->token = ++sc->token; /* Don't care about endianness. */ cmd->token = hdr->token; /* XXX TODO: check max cmd length? */ memcpy((uint8_t *)&hdr[1], idata, ilen); OTUS_DPRINTF(sc, OTUS_DEBUG_CMD, "%s: sending command code=0x%02x len=%d token=%d\n", __func__, code, ilen, hdr->token); cmd->odata = odata; cmd->odatalen = odatalen; cmd->buflen = xferlen; /* Queue the command to the endpoint */ STAILQ_INSERT_TAIL(&sc->sc_cmd_pending, cmd, next_cmd); usbd_transfer_start(sc->sc_xfer[OTUS_BULK_CMD]); /* Sleep on the command; wait for it to complete */ error = msleep(cmd, &sc->sc_mtx, PCATCH, "otuscmd", hz); /* * At this point we don't own cmd any longer; it'll be * freed by the cmd bulk path or the RX notification * path. If the data is made available then it'll be copied * to the caller. All that is left to do is communicate * status back to the caller. */ if (error != 0) { device_printf(sc->sc_dev, "%s: timeout waiting for command 0x%02x reply\n", __func__, code); } return error; } void otus_write(struct otus_softc *sc, uint32_t reg, uint32_t val) { OTUS_LOCK_ASSERT(sc); sc->write_buf[sc->write_idx].reg = htole32(reg); sc->write_buf[sc->write_idx].val = htole32(val); if (++sc->write_idx > (AR_MAX_WRITE_IDX-1)) (void)otus_write_barrier(sc); } int otus_write_barrier(struct otus_softc *sc) { int error; OTUS_LOCK_ASSERT(sc); if (sc->write_idx == 0) return 0; /* Nothing to flush. */ OTUS_DPRINTF(sc, OTUS_DEBUG_REGIO, "%s: called; %d updates\n", __func__, sc->write_idx); error = otus_cmd(sc, AR_CMD_WREG, sc->write_buf, sizeof (sc->write_buf[0]) * sc->write_idx, NULL, 0); sc->write_idx = 0; return error; } static struct ieee80211_node * otus_node_alloc(struct ieee80211vap *vap, const uint8_t mac[IEEE80211_ADDR_LEN]) { return malloc(sizeof (struct otus_node), M_80211_NODE, M_NOWAIT | M_ZERO); } #if 0 int otus_media_change(struct ifnet *ifp) { struct otus_softc *sc = ifp->if_softc; struct ieee80211com *ic = &sc->sc_ic; uint8_t rate, ridx; int error; error = ieee80211_media_change(ifp); if (error != ENETRESET) return error; if (ic->ic_fixed_rate != -1) { rate = ic->ic_sup_rates[ic->ic_curmode]. rs_rates[ic->ic_fixed_rate] & IEEE80211_RATE_VAL; for (ridx = 0; ridx <= OTUS_RIDX_MAX; ridx++) if (otus_rates[ridx].rate == rate) break; sc->fixed_ridx = ridx; } if ((ifp->if_flags & (IFF_UP | IFF_RUNNING)) == (IFF_UP | IFF_RUNNING)) error = otus_init(sc); return error; } #endif int otus_read_eeprom(struct otus_softc *sc) { uint32_t regs[8], reg; uint8_t *eep; int i, j, error; OTUS_LOCK_ASSERT(sc); /* Read EEPROM by blocks of 32 bytes. */ eep = (uint8_t *)&sc->eeprom; reg = AR_EEPROM_OFFSET; for (i = 0; i < sizeof (sc->eeprom) / 32; i++) { for (j = 0; j < 8; j++, reg += 4) regs[j] = htole32(reg); error = otus_cmd(sc, AR_CMD_RREG, regs, sizeof regs, eep, 32); if (error != 0) break; eep += 32; } return error; } void otus_newassoc(struct ieee80211_node *ni, int isnew) { struct ieee80211com *ic = ni->ni_ic; struct otus_softc *sc = ic->ic_softc; struct otus_node *on = OTUS_NODE(ni); OTUS_DPRINTF(sc, OTUS_DEBUG_STATE, "new assoc isnew=%d addr=%s\n", isnew, ether_sprintf(ni->ni_macaddr)); on->tx_done = 0; on->tx_err = 0; on->tx_retries = 0; } static void otus_cmd_handle_response(struct otus_softc *sc, struct ar_cmd_hdr *hdr) { struct otus_tx_cmd *cmd; OTUS_LOCK_ASSERT(sc); OTUS_DPRINTF(sc, OTUS_DEBUG_CMDDONE, "%s: received reply code=0x%02x len=%d token=%d\n", __func__, hdr->code, hdr->len, hdr->token); /* * Walk the list, freeing items that aren't ours, * stopping when we hit our token. */ while ((cmd = STAILQ_FIRST(&sc->sc_cmd_waiting)) != NULL) { STAILQ_REMOVE_HEAD(&sc->sc_cmd_waiting, next_cmd); OTUS_DPRINTF(sc, OTUS_DEBUG_CMDDONE, "%s: cmd=%p; hdr.token=%d, cmd.token=%d\n", __func__, cmd, (int) hdr->token, (int) cmd->token); if (hdr->token == cmd->token) { /* Copy answer into caller's supplied buffer. */ if (cmd->odata != NULL) { if (hdr->len != cmd->odatalen) { device_printf(sc->sc_dev, "%s: code 0x%02x, len=%d, olen=%d\n", __func__, (int) hdr->code, (int) hdr->len, (int) cmd->odatalen); } memcpy(cmd->odata, &hdr[1], MIN(cmd->odatalen, hdr->len)); } wakeup(cmd); } STAILQ_INSERT_TAIL(&sc->sc_cmd_inactive, cmd, next_cmd); } } void otus_cmd_rxeof(struct otus_softc *sc, uint8_t *buf, int len) { struct ieee80211com *ic = &sc->sc_ic; struct ar_cmd_hdr *hdr; OTUS_LOCK_ASSERT(sc); if (__predict_false(len < sizeof (*hdr))) { OTUS_DPRINTF(sc, OTUS_DEBUG_CMDDONE, "cmd too small %d\n", len); return; } hdr = (struct ar_cmd_hdr *)buf; if (__predict_false(sizeof (*hdr) + hdr->len > len || sizeof (*hdr) + hdr->len > 64)) { OTUS_DPRINTF(sc, OTUS_DEBUG_CMDDONE, "cmd too large %d\n", hdr->len); return; } OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "%s: code=%.02x\n", __func__, hdr->code); /* * This has to reach into the cmd queue "waiting for * an RX response" list, grab the head entry and check * if we need to wake anyone up. */ if ((hdr->code & 0xc0) != 0xc0) { otus_cmd_handle_response(sc, hdr); return; } /* Received unsolicited notification. */ switch (hdr->code & 0x3f) { case AR_EVT_BEACON: break; case AR_EVT_TX_COMP: { struct ar_evt_tx_comp *tx = (struct ar_evt_tx_comp *)&hdr[1]; struct ieee80211_node *ni; ni = ieee80211_find_node(&ic->ic_sta, tx->macaddr); if (ni == NULL) { device_printf(sc->sc_dev, "%s: txcomp on unknown node (%s)\n", __func__, ether_sprintf(tx->macaddr)); break; } OTUS_DPRINTF(sc, OTUS_DEBUG_TXCOMP, "tx completed %s status=%d phy=0x%x\n", ether_sprintf(tx->macaddr), le16toh(tx->status), le32toh(tx->phy)); switch (le16toh(tx->status)) { case AR_TX_STATUS_COMP: #if 0 ackfailcnt = 0; ieee80211_ratectl_tx_complete(ni->ni_vap, ni, IEEE80211_RATECTL_TX_SUCCESS, &ackfailcnt, NULL); #endif /* * We don't get the above; only error notifications. * Sigh. So, don't worry about this. */ break; case AR_TX_STATUS_RETRY_COMP: OTUS_NODE(ni)->tx_retries++; break; case AR_TX_STATUS_FAILED: OTUS_NODE(ni)->tx_err++; break; } ieee80211_free_node(ni); break; } case AR_EVT_TBTT: break; case AR_EVT_DO_BB_RESET: /* * This is "tell driver to reset baseband" from ar9170-fw. * * I'm not sure what we should do here, so I'm going to * fall through; it gets generated when RTSRetryCnt internally * reaches '5' - I guess the firmware authors thought that * meant that the BB may have gone deaf or something. */ default: device_printf(sc->sc_dev, "%s: received notification code=0x%02x len=%d\n", __func__, hdr->code, hdr->len); } } void otus_sub_rxeof(struct otus_softc *sc, uint8_t *buf, int len, struct mbufq *rxq) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211_rx_stats rxs; #if 0 struct ieee80211_node *ni; #endif struct ar_rx_tail *tail; struct ieee80211_frame *wh; struct mbuf *m; uint8_t *plcp; // int s; int mlen; if (__predict_false(len < AR_PLCP_HDR_LEN)) { OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "sub-xfer too short %d\n", len); return; } plcp = buf; /* All bits in the PLCP header are set to 1 for non-MPDU. */ if (memcmp(plcp, AR_PLCP_HDR_INTR, AR_PLCP_HDR_LEN) == 0) { otus_cmd_rxeof(sc, plcp + AR_PLCP_HDR_LEN, len - AR_PLCP_HDR_LEN); return; } /* Received MPDU. */ if (__predict_false(len < AR_PLCP_HDR_LEN + sizeof (*tail))) { OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "MPDU too short %d\n", len); counter_u64_add(ic->ic_ierrors, 1); return; } tail = (struct ar_rx_tail *)(plcp + len - sizeof (*tail)); /* Discard error frames; don't discard BAD_RA (eg monitor mode); let net80211 do that */ if (__predict_false((tail->error & ~AR_RX_ERROR_BAD_RA) != 0)) { OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "error frame 0x%02x\n", tail->error); if (tail->error & AR_RX_ERROR_FCS) { OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "bad FCS\n"); } else if (tail->error & AR_RX_ERROR_MMIC) { /* Report Michael MIC failures to net80211. */ #if 0 ieee80211_notify_michael_failure(ni->ni_vap, wh, keyidx); #endif device_printf(sc->sc_dev, "%s: MIC failure\n", __func__); } counter_u64_add(ic->ic_ierrors, 1); return; } /* Compute MPDU's length. */ mlen = len - AR_PLCP_HDR_LEN - sizeof (*tail); /* Make sure there's room for an 802.11 header + FCS. */ if (__predict_false(mlen < IEEE80211_MIN_LEN)) { counter_u64_add(ic->ic_ierrors, 1); return; } mlen -= IEEE80211_CRC_LEN; /* strip 802.11 FCS */ wh = (struct ieee80211_frame *)(plcp + AR_PLCP_HDR_LEN); /* * TODO: I see > 2KiB buffers in this path; is it A-MSDU or something? */ m = m_get2(mlen, M_NOWAIT, MT_DATA, M_PKTHDR); if (m == NULL) { device_printf(sc->sc_dev, "%s: failed m_get2() (mlen=%d)\n", __func__, mlen); counter_u64_add(ic->ic_ierrors, 1); return; } /* Finalize mbuf. */ memcpy(mtod(m, uint8_t *), wh, mlen); m->m_pkthdr.len = m->m_len = mlen; #if 0 if (__predict_false(sc->sc_drvbpf != NULL)) { struct otus_rx_radiotap_header *tap = &sc->sc_rxtap; struct mbuf mb; tap->wr_flags = 0; tap->wr_antsignal = tail->rssi; tap->wr_rate = 2; /* In case it can't be found below. */ switch (tail->status & AR_RX_STATUS_MT_MASK) { case AR_RX_STATUS_MT_CCK: switch (plcp[0]) { case 10: tap->wr_rate = 2; break; case 20: tap->wr_rate = 4; break; case 55: tap->wr_rate = 11; break; case 110: tap->wr_rate = 22; break; } if (tail->status & AR_RX_STATUS_SHPREAMBLE) tap->wr_flags |= IEEE80211_RADIOTAP_F_SHORTPRE; break; case AR_RX_STATUS_MT_OFDM: switch (plcp[0] & 0xf) { case 0xb: tap->wr_rate = 12; break; case 0xf: tap->wr_rate = 18; break; case 0xa: tap->wr_rate = 24; break; case 0xe: tap->wr_rate = 36; break; case 0x9: tap->wr_rate = 48; break; case 0xd: tap->wr_rate = 72; break; case 0x8: tap->wr_rate = 96; break; case 0xc: tap->wr_rate = 108; break; } break; } mb.m_data = (caddr_t)tap; mb.m_next = m; mb.m_nextpkt = NULL; mb.m_type = 0; mb.m_flags = 0; bpf_mtap(sc->sc_drvbpf, &mb, BPF_DIRECTION_IN); } #endif /* Add RSSI/NF to this mbuf */ bzero(&rxs, sizeof(rxs)); rxs.r_flags = IEEE80211_R_NF | IEEE80211_R_RSSI; rxs.c_nf = sc->sc_nf[0]; /* XXX chain 0 != combined rssi/nf */ rxs.c_rssi = tail->rssi; /* XXX TODO: add MIMO RSSI/NF as well */ if (ieee80211_add_rx_params(m, &rxs) == 0) { counter_u64_add(ic->ic_ierrors, 1); return; } /* XXX make a method */ STAILQ_INSERT_TAIL(&rxq->mq_head, m, m_stailqpkt); #if 0 OTUS_UNLOCK(sc); ni = ieee80211_find_rxnode(ic, wh); rxi.rxi_flags = 0; rxi.rxi_rssi = tail->rssi; rxi.rxi_tstamp = 0; /* unused */ ieee80211_input(ifp, m, ni, &rxi); /* Node is no longer needed. */ ieee80211_release_node(ic, ni); OTUS_LOCK(sc); #endif } static void otus_rxeof(struct usb_xfer *xfer, struct otus_data *data, struct mbufq *rxq) { struct otus_softc *sc = usbd_xfer_softc(xfer); caddr_t buf = data->buf; struct ar_rx_head *head; uint16_t hlen; int len; usbd_xfer_status(xfer, &len, NULL, NULL, NULL); while (len >= sizeof (*head)) { head = (struct ar_rx_head *)buf; if (__predict_false(head->tag != htole16(AR_RX_HEAD_TAG))) { OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "tag not valid 0x%x\n", le16toh(head->tag)); break; } hlen = le16toh(head->len); if (__predict_false(sizeof (*head) + hlen > len)) { OTUS_DPRINTF(sc, OTUS_DEBUG_RXDONE, "xfer too short %d/%d\n", len, hlen); break; } /* Process sub-xfer. */ otus_sub_rxeof(sc, (uint8_t *)&head[1], hlen, rxq); /* Next sub-xfer is aligned on a 32-bit boundary. */ hlen = (sizeof (*head) + hlen + 3) & ~3; buf += hlen; len -= hlen; } } static void otus_bulk_rx_callback(struct usb_xfer *xfer, usb_error_t error) { struct epoch_tracker et; struct otus_softc *sc = usbd_xfer_softc(xfer); struct ieee80211com *ic = &sc->sc_ic; struct ieee80211_frame *wh; struct ieee80211_node *ni; struct mbuf *m; struct mbufq scrx; struct otus_data *data; OTUS_LOCK_ASSERT(sc); mbufq_init(&scrx, 1024); #if 0 device_printf(sc->sc_dev, "%s: called; state=%d; error=%d\n", __func__, USB_GET_STATE(xfer), error); #endif switch (USB_GET_STATE(xfer)) { case USB_ST_TRANSFERRED: data = STAILQ_FIRST(&sc->sc_rx_active); if (data == NULL) goto tr_setup; STAILQ_REMOVE_HEAD(&sc->sc_rx_active, next); otus_rxeof(xfer, data, &scrx); STAILQ_INSERT_TAIL(&sc->sc_rx_inactive, data, next); /* FALLTHROUGH */ case USB_ST_SETUP: tr_setup: /* * XXX TODO: what if sc_rx isn't empty, but data * is empty? Then we leak mbufs. */ data = STAILQ_FIRST(&sc->sc_rx_inactive); if (data == NULL) { //KASSERT(m == NULL, ("mbuf isn't NULL")); return; } STAILQ_REMOVE_HEAD(&sc->sc_rx_inactive, next); STAILQ_INSERT_TAIL(&sc->sc_rx_active, data, next); usbd_xfer_set_frame_data(xfer, 0, data->buf, usbd_xfer_max_len(xfer)); usbd_transfer_submit(xfer); /* * To avoid LOR we should unlock our private mutex here to call * ieee80211_input() because here is at the end of a USB * callback and safe to unlock. */ OTUS_UNLOCK(sc); NET_EPOCH_ENTER(et); while ((m = mbufq_dequeue(&scrx)) != NULL) { wh = mtod(m, struct ieee80211_frame *); ni = ieee80211_find_rxnode(ic, (struct ieee80211_frame_min *)wh); if (ni != NULL) { if (ni->ni_flags & IEEE80211_NODE_HT) m->m_flags |= M_AMPDU; (void)ieee80211_input_mimo(ni, m); ieee80211_free_node(ni); } else (void)ieee80211_input_mimo_all(ic, m); } NET_EPOCH_EXIT(et); #ifdef IEEE80211_SUPPORT_SUPERG ieee80211_ff_age_all(ic, 100); #endif OTUS_LOCK(sc); break; default: /* needs it to the inactive queue due to a error. */ data = STAILQ_FIRST(&sc->sc_rx_active); if (data != NULL) { STAILQ_REMOVE_HEAD(&sc->sc_rx_active, next); STAILQ_INSERT_TAIL(&sc->sc_rx_inactive, data, next); } if (error != USB_ERR_CANCELLED) { usbd_xfer_set_stall(xfer); counter_u64_add(ic->ic_ierrors, 1); goto tr_setup; } break; } } static void otus_txeof(struct usb_xfer *xfer, struct otus_data *data) { struct otus_softc *sc = usbd_xfer_softc(xfer); OTUS_DPRINTF(sc, OTUS_DEBUG_TXDONE, "%s: called; data=%p\n", __func__, data); OTUS_LOCK_ASSERT(sc); if (sc->sc_tx_n_active == 0) { device_printf(sc->sc_dev, "%s: completed but tx_active=0\n", __func__); } else { sc->sc_tx_n_active--; } if (data->m) { /* XXX status? */ /* XXX we get TX status via the RX path.. */ ieee80211_tx_complete(data->ni, data->m, 0); data->m = NULL; data->ni = NULL; } } static void otus_txcmdeof(struct usb_xfer *xfer, struct otus_tx_cmd *cmd) { struct otus_softc *sc = usbd_xfer_softc(xfer); OTUS_LOCK_ASSERT(sc); OTUS_DPRINTF(sc, OTUS_DEBUG_CMDDONE, "%s: called; data=%p; odata=%p\n", __func__, cmd, cmd->odata); /* * Non-response commands still need wakeup so the caller * knows it was submitted and completed OK; response commands should * wait until they're ACKed by the firmware with a response. */ if (cmd->odata) { STAILQ_INSERT_TAIL(&sc->sc_cmd_waiting, cmd, next_cmd); } else { wakeup(cmd); otus_free_txcmd(sc, cmd); } } static void otus_bulk_tx_callback(struct usb_xfer *xfer, usb_error_t error) { uint8_t which = OTUS_BULK_TX; struct otus_softc *sc = usbd_xfer_softc(xfer); struct ieee80211com *ic = &sc->sc_ic; struct otus_data *data; OTUS_LOCK_ASSERT(sc); switch (USB_GET_STATE(xfer)) { case USB_ST_TRANSFERRED: data = STAILQ_FIRST(&sc->sc_tx_active[which]); if (data == NULL) goto tr_setup; OTUS_DPRINTF(sc, OTUS_DEBUG_TXDONE, "%s: transfer done %p\n", __func__, data); STAILQ_REMOVE_HEAD(&sc->sc_tx_active[which], next); otus_txeof(xfer, data); otus_freebuf(sc, data); /* FALLTHROUGH */ case USB_ST_SETUP: tr_setup: data = STAILQ_FIRST(&sc->sc_tx_pending[which]); if (data == NULL) { OTUS_DPRINTF(sc, OTUS_DEBUG_XMIT, "%s: empty pending queue sc %p\n", __func__, sc); sc->sc_tx_n_active = 0; goto finish; } STAILQ_REMOVE_HEAD(&sc->sc_tx_pending[which], next); STAILQ_INSERT_TAIL(&sc->sc_tx_active[which], data, next); usbd_xfer_set_frame_data(xfer, 0, data->buf, data->buflen); OTUS_DPRINTF(sc, OTUS_DEBUG_XMIT, "%s: submitting transfer %p\n", __func__, data); usbd_transfer_submit(xfer); sc->sc_tx_n_active++; break; default: data = STAILQ_FIRST(&sc->sc_tx_active[which]); if (data != NULL) { STAILQ_REMOVE_HEAD(&sc->sc_tx_active[which], next); otus_txeof(xfer, data); otus_freebuf(sc, data); } counter_u64_add(ic->ic_oerrors, 1); if (error != USB_ERR_CANCELLED) { usbd_xfer_set_stall(xfer); goto tr_setup; } break; } finish: #ifdef IEEE80211_SUPPORT_SUPERG /* * If the TX active queue drops below a certain * threshold, ensure we age fast-frames out so they're * transmitted. */ if (sc->sc_tx_n_active < 2) { /* XXX ew - net80211 should defer this for us! */ OTUS_UNLOCK(sc); ieee80211_ff_flush(ic, WME_AC_VO); ieee80211_ff_flush(ic, WME_AC_VI); ieee80211_ff_flush(ic, WME_AC_BE); ieee80211_ff_flush(ic, WME_AC_BK); OTUS_LOCK(sc); } #endif /* Kick TX */ otus_tx_start(sc); } static void otus_bulk_cmd_callback(struct usb_xfer *xfer, usb_error_t error) { struct otus_softc *sc = usbd_xfer_softc(xfer); #if 0 struct ieee80211com *ic = &sc->sc_ic; #endif struct otus_tx_cmd *cmd; OTUS_LOCK_ASSERT(sc); switch (USB_GET_STATE(xfer)) { case USB_ST_TRANSFERRED: cmd = STAILQ_FIRST(&sc->sc_cmd_active); if (cmd == NULL) goto tr_setup; OTUS_DPRINTF(sc, OTUS_DEBUG_CMDDONE, "%s: transfer done %p\n", __func__, cmd); STAILQ_REMOVE_HEAD(&sc->sc_cmd_active, next_cmd); otus_txcmdeof(xfer, cmd); /* FALLTHROUGH */ case USB_ST_SETUP: tr_setup: cmd = STAILQ_FIRST(&sc->sc_cmd_pending); if (cmd == NULL) { OTUS_DPRINTF(sc, OTUS_DEBUG_CMD, "%s: empty pending queue sc %p\n", __func__, sc); return; } STAILQ_REMOVE_HEAD(&sc->sc_cmd_pending, next_cmd); STAILQ_INSERT_TAIL(&sc->sc_cmd_active, cmd, next_cmd); usbd_xfer_set_frame_data(xfer, 0, cmd->buf, cmd->buflen); OTUS_DPRINTF(sc, OTUS_DEBUG_CMD, "%s: submitting transfer %p; buf=%p, buflen=%d\n", __func__, cmd, cmd->buf, cmd->buflen); usbd_transfer_submit(xfer); break; default: cmd = STAILQ_FIRST(&sc->sc_cmd_active); if (cmd != NULL) { STAILQ_REMOVE_HEAD(&sc->sc_cmd_active, next_cmd); otus_txcmdeof(xfer, cmd); } if (error != USB_ERR_CANCELLED) { usbd_xfer_set_stall(xfer); goto tr_setup; } break; } } /* * This isn't used by carl9170; it however may be used by the * initial bootloader. */ static void otus_bulk_irq_callback(struct usb_xfer *xfer, usb_error_t error) { struct otus_softc *sc = usbd_xfer_softc(xfer); int actlen; int sumlen; usbd_xfer_status(xfer, &actlen, &sumlen, NULL, NULL); OTUS_DPRINTF(sc, OTUS_DEBUG_IRQ, "%s: called; state=%d\n", __func__, USB_GET_STATE(xfer)); switch (USB_GET_STATE(xfer)) { case USB_ST_TRANSFERRED: /* * Read usb frame data, if any. * "actlen" has the total length for all frames * transferred. */ OTUS_DPRINTF(sc, OTUS_DEBUG_IRQ, "%s: comp; %d bytes\n", __func__, actlen); #if 0 pc = usbd_xfer_get_frame(xfer, 0); otus_dump_usb_rx_page(sc, pc, actlen); #endif /* XXX fallthrough */ case USB_ST_SETUP: /* * Setup xfer frame lengths/count and data */ OTUS_DPRINTF(sc, OTUS_DEBUG_IRQ, "%s: setup\n", __func__); usbd_xfer_set_frame_len(xfer, 0, usbd_xfer_max_len(xfer)); usbd_transfer_submit(xfer); break; default: /* Error */ /* * Print error message and clear stall * for example. */ OTUS_DPRINTF(sc, OTUS_DEBUG_IRQ, "%s: ERROR?\n", __func__); break; } } /* * Map net80211 rate to hw rate for otus MAC/PHY. */ static uint8_t otus_rate_to_hw_rate(struct otus_softc *sc, uint8_t rate) { int is_2ghz; is_2ghz = !! (IEEE80211_IS_CHAN_2GHZ(sc->sc_ic.ic_curchan)); switch (rate) { /* CCK */ case 2: return (0x0); case 4: return (0x1); case 11: return (0x2); case 22: return (0x3); /* OFDM */ case 12: return (0xb); case 18: return (0xf); case 24: return (0xa); case 36: return (0xe); case 48: return (0x9); case 72: return (0xd); case 96: return (0x8); case 108: return (0xc); default: device_printf(sc->sc_dev, "%s: unknown rate '%d'\n", __func__, (int) rate); case 0: if (is_2ghz) return (0x0); /* 1MB CCK */ else return (0xb); /* 6MB OFDM */ /* XXX TODO: HT */ } } static int otus_hw_rate_is_ofdm(struct otus_softc *sc, uint8_t hw_rate) { switch (hw_rate) { case 0x0: case 0x1: case 0x2: case 0x3: return (0); default: return (1); } } static void otus_tx_update_ratectl(struct otus_softc *sc, struct ieee80211_node *ni) { struct ieee80211_ratectl_tx_stats *txs = &sc->sc_txs; struct otus_node *on = OTUS_NODE(ni); txs->flags = IEEE80211_RATECTL_TX_STATS_NODE | IEEE80211_RATECTL_TX_STATS_RETRIES; txs->ni = ni; txs->nframes = on->tx_done; txs->nsuccess = on->tx_done - on->tx_err; txs->nretries = on->tx_retries; ieee80211_ratectl_tx_update(ni->ni_vap, txs); on->tx_done = on->tx_err = on->tx_retries = 0; } /* * XXX TODO: support tx bpf parameters for configuration! * * Relevant pieces: * * ac = params->ibp_pri & 3; * rate = params->ibp_rate0; * params->ibp_flags & IEEE80211_BPF_NOACK * params->ibp_flags & IEEE80211_BPF_RTS * params->ibp_flags & IEEE80211_BPF_CTS * tx->rts_ntries = params->ibp_try1; * tx->data_ntries = params->ibp_try0; */ static int otus_tx(struct otus_softc *sc, struct ieee80211_node *ni, struct mbuf *m, struct otus_data *data, const struct ieee80211_bpf_params *params) { const struct ieee80211_txparam *tp = ni->ni_txparms; struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_frame *wh; struct ieee80211_key *k; struct ar_tx_head *head; uint32_t phyctl; uint16_t macctl, qos; uint8_t qid, rate; int hasqos, xferlen, type, ismcast; wh = mtod(m, struct ieee80211_frame *); if (wh->i_fc[1] & IEEE80211_FC1_PROTECTED) { k = ieee80211_crypto_encap(ni, m); if (k == NULL) { device_printf(sc->sc_dev, "%s: m=%p: ieee80211_crypto_encap returns NULL\n", __func__, m); return (ENOBUFS); } wh = mtod(m, struct ieee80211_frame *); } /* Calculate transfer length; ensure data buffer is large enough */ xferlen = sizeof (*head) + m->m_pkthdr.len; if (xferlen > OTUS_TXBUFSZ) { device_printf(sc->sc_dev, "%s: 802.11 TX frame is %d bytes, max %d bytes\n", __func__, xferlen, OTUS_TXBUFSZ); return (ENOBUFS); } hasqos = !! IEEE80211_QOS_HAS_SEQ(wh); if (hasqos) { uint8_t tid; qos = ((const struct ieee80211_qosframe *)wh)->i_qos[0]; tid = qos & IEEE80211_QOS_TID; qid = TID_TO_WME_AC(tid); } else { qos = 0; qid = WME_AC_BE; } type = wh->i_fc[0] & IEEE80211_FC0_TYPE_MASK; ismcast = IEEE80211_IS_MULTICAST(wh->i_addr1); /* Pickup a rate index. */ if (params != NULL) rate = otus_rate_to_hw_rate(sc, params->ibp_rate0); else if (!!(m->m_flags & M_EAPOL) || type != IEEE80211_FC0_TYPE_DATA) rate = otus_rate_to_hw_rate(sc, tp->mgmtrate); else if (ismcast) rate = otus_rate_to_hw_rate(sc, tp->mcastrate); else if (tp->ucastrate != IEEE80211_FIXED_RATE_NONE) rate = otus_rate_to_hw_rate(sc, tp->ucastrate); else { (void) ieee80211_ratectl_rate(ni, NULL, 0); rate = otus_rate_to_hw_rate(sc, ni->ni_txrate); } phyctl = 0; macctl = AR_TX_MAC_BACKOFF | AR_TX_MAC_HW_DUR | AR_TX_MAC_QID(qid); /* * XXX TODO: params for NOACK, ACK, RTS, CTS, etc */ if (ismcast || (hasqos && ((qos & IEEE80211_QOS_ACKPOLICY) == IEEE80211_QOS_ACKPOLICY_NOACK))) macctl |= AR_TX_MAC_NOACK; if (!ismcast) { if (m->m_pkthdr.len + IEEE80211_CRC_LEN >= vap->iv_rtsthreshold) macctl |= AR_TX_MAC_RTS; else if (ic->ic_flags & IEEE80211_F_USEPROT) { if (ic->ic_protmode == IEEE80211_PROT_CTSONLY) macctl |= AR_TX_MAC_CTS; else if (ic->ic_protmode == IEEE80211_PROT_RTSCTS) macctl |= AR_TX_MAC_RTS; } } phyctl |= AR_TX_PHY_MCS(rate); if (otus_hw_rate_is_ofdm(sc, rate)) { phyctl |= AR_TX_PHY_MT_OFDM; /* Always use all tx antennas for now, just to be safe */ phyctl |= AR_TX_PHY_ANTMSK(sc->txmask); } else { /* CCK */ phyctl |= AR_TX_PHY_MT_CCK; phyctl |= AR_TX_PHY_ANTMSK(sc->txmask); } /* Update net80211 with the current counters */ otus_tx_update_ratectl(sc, ni); /* Update rate control stats for frames that are ACK'ed. */ if (!(macctl & AR_TX_MAC_NOACK)) OTUS_NODE(ni)->tx_done++; /* Fill Tx descriptor. */ head = (struct ar_tx_head *)data->buf; head->len = htole16(m->m_pkthdr.len + IEEE80211_CRC_LEN); head->macctl = htole16(macctl); head->phyctl = htole32(phyctl); m_copydata(m, 0, m->m_pkthdr.len, (caddr_t)&head[1]); data->buflen = xferlen; data->ni = ni; data->m = m; OTUS_DPRINTF(sc, OTUS_DEBUG_XMIT, "%s: tx: m=%p; data=%p; len=%d mac=0x%04x phy=0x%08x rate=0x%02x, ni_txrate=%d\n", __func__, m, data, le16toh(head->len), macctl, phyctl, (int) rate, (int) ni->ni_txrate); /* Submit transfer */ STAILQ_INSERT_TAIL(&sc->sc_tx_pending[OTUS_BULK_TX], data, next); usbd_transfer_start(sc->sc_xfer[OTUS_BULK_TX]); return 0; } static u_int otus_hash_maddr(void *arg, struct sockaddr_dl *sdl, u_int cnt) { uint32_t val, *hashes = arg; val = le32dec(LLADDR(sdl) + 4); /* Get address byte 5 */ val = val & 0x0000ff00; val = val >> 8; /* As per below, shift it >> 2 to get only 6 bits */ val = val >> 2; if (val < 32) hashes[0] |= 1 << val; else hashes[1] |= 1 << (val - 32); return (1); } int otus_set_multi(struct otus_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; uint32_t hashes[2]; int r; if (ic->ic_allmulti > 0 || ic->ic_promisc > 0 || ic->ic_opmode == IEEE80211_M_MONITOR) { hashes[0] = 0xffffffff; hashes[1] = 0xffffffff; } else { struct ieee80211vap *vap; hashes[0] = hashes[1] = 0; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) if_foreach_llmaddr(vap->iv_ifp, otus_hash_maddr, hashes); } #if 0 /* XXX openbsd code */ while (enm != NULL) { bit = enm->enm_addrlo[5] >> 2; if (bit < 32) hashes[0] |= 1 << bit; else hashes[1] |= 1 << (bit - 32); ETHER_NEXT_MULTI(step, enm); } #endif hashes[1] |= 1U << 31; /* Make sure the broadcast bit is set. */ OTUS_LOCK(sc); otus_write(sc, AR_MAC_REG_GROUP_HASH_TBL_L, hashes[0]); otus_write(sc, AR_MAC_REG_GROUP_HASH_TBL_H, hashes[1]); r = otus_write_barrier(sc); /* XXX operating mode? filter? */ OTUS_UNLOCK(sc); return (r); } static int otus_updateedca(struct ieee80211com *ic) { struct otus_softc *sc = ic->ic_softc; OTUS_LOCK(sc); /* * XXX TODO: take temporary copy of EDCA information * when scheduling this so we have a more time-correct view * of things. * XXX TODO: this can be done on the net80211 level */ otus_updateedca_locked(sc); OTUS_UNLOCK(sc); return (0); } static void otus_updateedca_locked(struct otus_softc *sc) { #define EXP2(val) ((1 << (val)) - 1) #define AIFS(val) ((val) * 9 + 10) struct chanAccParams chp; struct ieee80211com *ic = &sc->sc_ic; const struct wmeParams *edca; ieee80211_wme_ic_getparams(ic, &chp); OTUS_LOCK_ASSERT(sc); edca = chp.cap_wmeParams; /* Set CWmin/CWmax values. */ otus_write(sc, AR_MAC_REG_AC0_CW, EXP2(edca[WME_AC_BE].wmep_logcwmax) << 16 | EXP2(edca[WME_AC_BE].wmep_logcwmin)); otus_write(sc, AR_MAC_REG_AC1_CW, EXP2(edca[WME_AC_BK].wmep_logcwmax) << 16 | EXP2(edca[WME_AC_BK].wmep_logcwmin)); otus_write(sc, AR_MAC_REG_AC2_CW, EXP2(edca[WME_AC_VI].wmep_logcwmax) << 16 | EXP2(edca[WME_AC_VI].wmep_logcwmin)); otus_write(sc, AR_MAC_REG_AC3_CW, EXP2(edca[WME_AC_VO].wmep_logcwmax) << 16 | EXP2(edca[WME_AC_VO].wmep_logcwmin)); otus_write(sc, AR_MAC_REG_AC4_CW, /* Special TXQ. */ EXP2(edca[WME_AC_VO].wmep_logcwmax) << 16 | EXP2(edca[WME_AC_VO].wmep_logcwmin)); /* Set AIFSN values. */ otus_write(sc, AR_MAC_REG_AC1_AC0_AIFS, AIFS(edca[WME_AC_VI].wmep_aifsn) << 24 | AIFS(edca[WME_AC_BK].wmep_aifsn) << 12 | AIFS(edca[WME_AC_BE].wmep_aifsn)); otus_write(sc, AR_MAC_REG_AC3_AC2_AIFS, AIFS(edca[WME_AC_VO].wmep_aifsn) << 16 | /* Special TXQ. */ AIFS(edca[WME_AC_VO].wmep_aifsn) << 4 | AIFS(edca[WME_AC_VI].wmep_aifsn) >> 8); /* Set TXOP limit. */ otus_write(sc, AR_MAC_REG_AC1_AC0_TXOP, edca[WME_AC_BK].wmep_txopLimit << 16 | edca[WME_AC_BE].wmep_txopLimit); otus_write(sc, AR_MAC_REG_AC3_AC2_TXOP, edca[WME_AC_VO].wmep_txopLimit << 16 | edca[WME_AC_VI].wmep_txopLimit); /* XXX ACK policy? */ (void)otus_write_barrier(sc); #undef AIFS #undef EXP2 } static void otus_updateslot(struct otus_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; uint32_t slottime; OTUS_LOCK_ASSERT(sc); slottime = IEEE80211_GET_SLOTTIME(ic); otus_write(sc, AR_MAC_REG_SLOT_TIME, slottime << 10); (void)otus_write_barrier(sc); } int otus_init_mac(struct otus_softc *sc) { int error; OTUS_LOCK_ASSERT(sc); otus_write(sc, AR_MAC_REG_ACK_EXTENSION, 0x40); otus_write(sc, AR_MAC_REG_RETRY_MAX, 0); otus_write(sc, AR_MAC_REG_RX_THRESHOLD, 0xc1f80); otus_write(sc, AR_MAC_REG_RX_PE_DELAY, 0x70); otus_write(sc, AR_MAC_REG_EIFS_AND_SIFS, 0xa144000); otus_write(sc, AR_MAC_REG_SLOT_TIME, 9 << 10); otus_write(sc, AR_MAC_REG_TID_CFACK_CFEND_RATE, 0x19000000); /* NAV protects ACK only (in TXOP). */ otus_write(sc, AR_MAC_REG_TXOP_DURATION, 0x201); /* Set beacon Tx power to 0x7. */ otus_write(sc, AR_MAC_REG_BCN_HT1, 0x8000170); otus_write(sc, AR_MAC_REG_BACKOFF_PROTECT, 0x105); otus_write(sc, AR_MAC_REG_AMPDU_FACTOR, 0x10000a); otus_set_rx_filter(sc); otus_write(sc, AR_MAC_REG_BASIC_RATE, 0x150f); otus_write(sc, AR_MAC_REG_MANDATORY_RATE, 0x150f); otus_write(sc, AR_MAC_REG_RTS_CTS_RATE, 0x10b01bb); otus_write(sc, AR_MAC_REG_ACK_TPC, 0x4003c1e); /* Enable LED0 and LED1. */ otus_write(sc, AR_GPIO_REG_PORT_TYPE, 0x3); otus_write(sc, AR_GPIO_REG_PORT_DATA, 0x3); /* Switch MAC to OTUS interface. */ otus_write(sc, 0x1c3600, 0x3); otus_write(sc, AR_MAC_REG_AMPDU_RX_THRESH, 0xffff); otus_write(sc, AR_MAC_REG_MISC_680, 0xf00008); /* Disable Rx timeout (workaround). */ otus_write(sc, AR_MAC_REG_RX_TIMEOUT, 0); /* Set USB Rx stream mode maximum frame number to 2. */ otus_write(sc, 0x1e1110, 0x4); /* Set USB Rx stream mode timeout to 10us. */ otus_write(sc, 0x1e1114, 0x80); /* Set clock frequency to 88/80MHz. */ otus_write(sc, AR_PWR_REG_CLOCK_SEL, 0x73); /* Set WLAN DMA interrupt mode: generate intr per packet. */ otus_write(sc, AR_MAC_REG_TXRX_MPI, 0x110011); otus_write(sc, AR_MAC_REG_FCS_SELECT, 0x4); otus_write(sc, AR_MAC_REG_TXOP_NOT_ENOUGH_INDICATION, 0x141e0f48); /* Disable HW decryption for now. */ otus_write(sc, AR_MAC_REG_ENCRYPTION, 0x78); if ((error = otus_write_barrier(sc)) != 0) return error; /* Set default EDCA parameters. */ otus_updateedca_locked(sc); return 0; } /* * Return default value for PHY register based on current operating mode. */ uint32_t otus_phy_get_def(struct otus_softc *sc, uint32_t reg) { int i; for (i = 0; i < nitems(ar5416_phy_regs); i++) if (AR_PHY(ar5416_phy_regs[i]) == reg) return sc->phy_vals[i]; return 0; /* Register not found. */ } /* * Update PHY's programming based on vendor-specific data stored in EEPROM. * This is for FEM-type devices only. */ int otus_set_board_values(struct otus_softc *sc, struct ieee80211_channel *c) { const struct ModalEepHeader *eep; uint32_t tmp, offset; if (IEEE80211_IS_CHAN_5GHZ(c)) eep = &sc->eeprom.modalHeader[0]; else eep = &sc->eeprom.modalHeader[1]; /* Offset of chain 2. */ offset = 2 * 0x1000; tmp = le32toh(eep->antCtrlCommon); otus_write(sc, AR_PHY_SWITCH_COM, tmp); tmp = le32toh(eep->antCtrlChain[0]); otus_write(sc, AR_PHY_SWITCH_CHAIN_0, tmp); tmp = le32toh(eep->antCtrlChain[1]); otus_write(sc, AR_PHY_SWITCH_CHAIN_0 + offset, tmp); if (1 /* sc->sc_sco == AR_SCO_SCN */) { tmp = otus_phy_get_def(sc, AR_PHY_SETTLING); tmp &= ~(0x7f << 7); tmp |= (eep->switchSettling & 0x7f) << 7; otus_write(sc, AR_PHY_SETTLING, tmp); } tmp = otus_phy_get_def(sc, AR_PHY_DESIRED_SZ); tmp &= ~0xffff; tmp |= eep->pgaDesiredSize << 8 | eep->adcDesiredSize; otus_write(sc, AR_PHY_DESIRED_SZ, tmp); tmp = eep->txEndToXpaOff << 24 | eep->txEndToXpaOff << 16 | eep->txFrameToXpaOn << 8 | eep->txFrameToXpaOn; otus_write(sc, AR_PHY_RF_CTL4, tmp); tmp = otus_phy_get_def(sc, AR_PHY_RF_CTL3); tmp &= ~(0xff << 16); tmp |= eep->txEndToRxOn << 16; otus_write(sc, AR_PHY_RF_CTL3, tmp); tmp = otus_phy_get_def(sc, AR_PHY_CCA); tmp &= ~(0x7f << 12); tmp |= (eep->thresh62 & 0x7f) << 12; otus_write(sc, AR_PHY_CCA, tmp); tmp = otus_phy_get_def(sc, AR_PHY_RXGAIN); tmp &= ~(0x3f << 12); tmp |= (eep->txRxAttenCh[0] & 0x3f) << 12; otus_write(sc, AR_PHY_RXGAIN, tmp); tmp = otus_phy_get_def(sc, AR_PHY_RXGAIN + offset); tmp &= ~(0x3f << 12); tmp |= (eep->txRxAttenCh[1] & 0x3f) << 12; otus_write(sc, AR_PHY_RXGAIN + offset, tmp); tmp = otus_phy_get_def(sc, AR_PHY_GAIN_2GHZ); tmp &= ~(0x3f << 18); tmp |= (eep->rxTxMarginCh[0] & 0x3f) << 18; if (IEEE80211_IS_CHAN_5GHZ(c)) { tmp &= ~(0xf << 10); tmp |= (eep->bswMargin[0] & 0xf) << 10; } otus_write(sc, AR_PHY_GAIN_2GHZ, tmp); tmp = otus_phy_get_def(sc, AR_PHY_GAIN_2GHZ + offset); tmp &= ~(0x3f << 18); tmp |= (eep->rxTxMarginCh[1] & 0x3f) << 18; otus_write(sc, AR_PHY_GAIN_2GHZ + offset, tmp); tmp = otus_phy_get_def(sc, AR_PHY_TIMING_CTRL4); tmp &= ~(0x3f << 5 | 0x1f); tmp |= (eep->iqCalICh[0] & 0x3f) << 5 | (eep->iqCalQCh[0] & 0x1f); otus_write(sc, AR_PHY_TIMING_CTRL4, tmp); tmp = otus_phy_get_def(sc, AR_PHY_TIMING_CTRL4 + offset); tmp &= ~(0x3f << 5 | 0x1f); tmp |= (eep->iqCalICh[1] & 0x3f) << 5 | (eep->iqCalQCh[1] & 0x1f); otus_write(sc, AR_PHY_TIMING_CTRL4 + offset, tmp); tmp = otus_phy_get_def(sc, AR_PHY_TPCRG1); tmp &= ~(0xf << 16); tmp |= (eep->xpd & 0xf) << 16; otus_write(sc, AR_PHY_TPCRG1, tmp); return otus_write_barrier(sc); } int otus_program_phy(struct otus_softc *sc, struct ieee80211_channel *c) { const uint32_t *vals; int error, i; /* Select PHY programming based on band and bandwidth. */ if (IEEE80211_IS_CHAN_2GHZ(c)) vals = ar5416_phy_vals_2ghz_20mhz; else vals = ar5416_phy_vals_5ghz_20mhz; for (i = 0; i < nitems(ar5416_phy_regs); i++) otus_write(sc, AR_PHY(ar5416_phy_regs[i]), vals[i]); sc->phy_vals = vals; if (sc->eeprom.baseEepHeader.deviceType == 0x80) /* FEM */ if ((error = otus_set_board_values(sc, c)) != 0) return error; /* Initial Tx power settings. */ otus_write(sc, AR_PHY_POWER_TX_RATE_MAX, 0x7f); otus_write(sc, AR_PHY_POWER_TX_RATE1, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE2, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE3, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE4, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE5, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE6, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE7, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE8, 0x3f3f3f3f); otus_write(sc, AR_PHY_POWER_TX_RATE9, 0x3f3f3f3f); if (IEEE80211_IS_CHAN_2GHZ(c)) otus_write(sc, AR_PWR_REG_PLL_ADDAC, 0x5163); else otus_write(sc, AR_PWR_REG_PLL_ADDAC, 0x5143); return otus_write_barrier(sc); } static __inline uint8_t otus_reverse_bits(uint8_t v) { v = ((v >> 1) & 0x55) | ((v & 0x55) << 1); v = ((v >> 2) & 0x33) | ((v & 0x33) << 2); v = ((v >> 4) & 0x0f) | ((v & 0x0f) << 4); return v; } int otus_set_rf_bank4(struct otus_softc *sc, struct ieee80211_channel *c) { uint8_t chansel, d0, d1; uint16_t data; int error; OTUS_LOCK_ASSERT(sc); d0 = 0; if (IEEE80211_IS_CHAN_5GHZ(c)) { chansel = (c->ic_freq - 4800) / 5; if (chansel & 1) d0 |= AR_BANK4_AMODE_REFSEL(2); else d0 |= AR_BANK4_AMODE_REFSEL(1); } else { d0 |= AR_BANK4_AMODE_REFSEL(2); if (c->ic_freq == 2484) { /* CH 14 */ d0 |= AR_BANK4_BMODE_LF_SYNTH_FREQ; chansel = 10 + (c->ic_freq - 2274) / 5; } else chansel = 16 + (c->ic_freq - 2272) / 5; chansel <<= 2; } d0 |= AR_BANK4_ADDR(1) | AR_BANK4_CHUP; d1 = otus_reverse_bits(chansel); /* Write bits 0-4 of d0 and d1. */ data = (d1 & 0x1f) << 5 | (d0 & 0x1f); otus_write(sc, AR_PHY(44), data); /* Write bits 5-7 of d0 and d1. */ data = (d1 >> 5) << 5 | (d0 >> 5); otus_write(sc, AR_PHY(58), data); if ((error = otus_write_barrier(sc)) == 0) otus_delay_ms(sc, 10); return error; } void otus_get_delta_slope(uint32_t coeff, uint32_t *exponent, uint32_t *mantissa) { #define COEFF_SCALE_SHIFT 24 uint32_t exp, man; /* exponent = 14 - floor(log2(coeff)) */ for (exp = 31; exp > 0; exp--) if (coeff & (1 << exp)) break; KASSERT(exp != 0, ("exp")); exp = 14 - (exp - COEFF_SCALE_SHIFT); /* mantissa = floor(coeff * 2^exponent + 0.5) */ man = coeff + (1 << (COEFF_SCALE_SHIFT - exp - 1)); *mantissa = man >> (COEFF_SCALE_SHIFT - exp); *exponent = exp - 16; #undef COEFF_SCALE_SHIFT } static int otus_set_chan(struct otus_softc *sc, struct ieee80211_channel *c, int assoc) { struct ieee80211com *ic = &sc->sc_ic; struct ar_cmd_frequency cmd; struct ar_rsp_frequency rsp; const uint32_t *vals; uint32_t coeff, exp, man, tmp; uint8_t code; int error, chan, i; error = 0; chan = ieee80211_chan2ieee(ic, c); OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "setting channel %d (%dMHz)\n", chan, c->ic_freq); tmp = IEEE80211_IS_CHAN_2GHZ(c) ? 0x105 : 0x104; otus_write(sc, AR_MAC_REG_DYNAMIC_SIFS_ACK, tmp); if ((error = otus_write_barrier(sc)) != 0) goto finish; /* Disable BB Heavy Clip. */ otus_write(sc, AR_PHY_HEAVY_CLIP_ENABLE, 0x200); if ((error = otus_write_barrier(sc)) != 0) goto finish; /* XXX Is that FREQ_START ? */ error = otus_cmd(sc, AR_CMD_FREQ_STRAT, NULL, 0, NULL, 0); if (error != 0) goto finish; /* Reprogram PHY and RF on channel band or bandwidth changes. */ if (sc->bb_reset || c->ic_flags != sc->sc_curchan->ic_flags) { OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "band switch\n"); /* Cold/Warm reset BB/ADDA. */ otus_write(sc, AR_PWR_REG_RESET, sc->bb_reset ? 0x800 : 0x400); if ((error = otus_write_barrier(sc)) != 0) goto finish; otus_write(sc, AR_PWR_REG_RESET, 0); if ((error = otus_write_barrier(sc)) != 0) goto finish; sc->bb_reset = 0; if ((error = otus_program_phy(sc, c)) != 0) { device_printf(sc->sc_dev, "%s: could not program PHY\n", __func__); goto finish; } /* Select RF programming based on band. */ if (IEEE80211_IS_CHAN_5GHZ(c)) vals = ar5416_banks_vals_5ghz; else vals = ar5416_banks_vals_2ghz; for (i = 0; i < nitems(ar5416_banks_regs); i++) otus_write(sc, AR_PHY(ar5416_banks_regs[i]), vals[i]); if ((error = otus_write_barrier(sc)) != 0) { device_printf(sc->sc_dev, "%s: could not program RF\n", __func__); goto finish; } code = AR_CMD_RF_INIT; } else { code = AR_CMD_FREQUENCY; } if ((error = otus_set_rf_bank4(sc, c)) != 0) goto finish; tmp = (sc->txmask == 0x5) ? 0x340 : 0x240; otus_write(sc, AR_PHY_TURBO, tmp); if ((error = otus_write_barrier(sc)) != 0) goto finish; /* Send firmware command to set channel. */ cmd.freq = htole32((uint32_t)c->ic_freq * 1000); cmd.dynht2040 = htole32(0); cmd.htena = htole32(1); /* Set Delta Slope (exponent and mantissa). */ coeff = (100 << 24) / c->ic_freq; otus_get_delta_slope(coeff, &exp, &man); cmd.dsc_exp = htole32(exp); cmd.dsc_man = htole32(man); OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "ds coeff=%u exp=%u man=%u\n", coeff, exp, man); /* For Short GI, coeff is 9/10 that of normal coeff. */ coeff = (9 * coeff) / 10; otus_get_delta_slope(coeff, &exp, &man); cmd.dsc_shgi_exp = htole32(exp); cmd.dsc_shgi_man = htole32(man); OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "ds shgi coeff=%u exp=%u man=%u\n", coeff, exp, man); /* Set wait time for AGC and noise calibration (100 or 200ms). */ cmd.check_loop_count = assoc ? htole32(2000) : htole32(1000); OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "%s\n", (code == AR_CMD_RF_INIT) ? "RF_INIT" : "FREQUENCY"); error = otus_cmd(sc, code, &cmd, sizeof cmd, &rsp, sizeof(rsp)); if (error != 0) goto finish; if ((rsp.status & htole32(AR_CAL_ERR_AGC | AR_CAL_ERR_NF_VAL)) != 0) { OTUS_DPRINTF(sc, OTUS_DEBUG_RESET, "status=0x%x\n", le32toh(rsp.status)); /* Force cold reset on next channel. */ sc->bb_reset = 1; } #ifdef USB_DEBUG if (otus_debug & OTUS_DEBUG_RESET) { device_printf(sc->sc_dev, "calibration status=0x%x\n", le32toh(rsp.status)); for (i = 0; i < 2; i++) { /* 2 Rx chains */ /* Sign-extend 9-bit NF values. */ device_printf(sc->sc_dev, "noisefloor chain %d=%d\n", i, (((int32_t)le32toh(rsp.nf[i])) << 4) >> 23); device_printf(sc->sc_dev, "noisefloor ext chain %d=%d\n", i, ((int32_t)le32toh(rsp.nf_ext[i])) >> 23); } } #endif for (i = 0; i < OTUS_NUM_CHAINS; i++) { sc->sc_nf[i] = ((((int32_t)le32toh(rsp.nf[i])) << 4) >> 23); } sc->sc_curchan = c; finish: return (error); } #ifdef notyet int otus_set_key(struct ieee80211com *ic, struct ieee80211_node *ni, struct ieee80211_key *k) { struct otus_softc *sc = ic->ic_softc; struct otus_cmd_key cmd; /* Defer setting of WEP keys until interface is brought up. */ if ((ic->ic_if.if_flags & (IFF_UP | IFF_RUNNING)) != (IFF_UP | IFF_RUNNING)) return 0; /* Do it in a process context. */ cmd.key = *k; cmd.associd = (ni != NULL) ? ni->ni_associd : 0; otus_do_async(sc, otus_set_key_cb, &cmd, sizeof cmd); return 0; } void otus_set_key_cb(struct otus_softc *sc, void *arg) { struct otus_cmd_key *cmd = arg; struct ieee80211_key *k = &cmd->key; struct ar_cmd_ekey key; uint16_t cipher; int error; memset(&key, 0, sizeof key); if (k->k_flags & IEEE80211_KEY_GROUP) { key.uid = htole16(k->k_id); IEEE80211_ADDR_COPY(key.macaddr, sc->sc_ic.ic_myaddr); key.macaddr[0] |= 0x80; } else { key.uid = htole16(OTUS_UID(cmd->associd)); IEEE80211_ADDR_COPY(key.macaddr, ni->ni_macaddr); } key.kix = htole16(0); /* Map net80211 cipher to hardware. */ switch (k->k_cipher) { case IEEE80211_CIPHER_WEP40: cipher = AR_CIPHER_WEP64; break; case IEEE80211_CIPHER_WEP104: cipher = AR_CIPHER_WEP128; break; case IEEE80211_CIPHER_TKIP: cipher = AR_CIPHER_TKIP; break; case IEEE80211_CIPHER_CCMP: cipher = AR_CIPHER_AES; break; default: return; } key.cipher = htole16(cipher); memcpy(key.key, k->k_key, MIN(k->k_len, 16)); error = otus_cmd(sc, AR_CMD_EKEY, &key, sizeof key, NULL, 0); if (error != 0 || k->k_cipher != IEEE80211_CIPHER_TKIP) return; /* TKIP: set Tx/Rx MIC Key. */ key.kix = htole16(1); memcpy(key.key, k->k_key + 16, 16); (void)otus_cmd(sc, AR_CMD_EKEY, &key, sizeof key, NULL, 0); } void otus_delete_key(struct ieee80211com *ic, struct ieee80211_node *ni, struct ieee80211_key *k) { struct otus_softc *sc = ic->ic_softc; struct otus_cmd_key cmd; if (!(ic->ic_if.if_flags & IFF_RUNNING) || ic->ic_state != IEEE80211_S_RUN) return; /* Nothing to do. */ /* Do it in a process context. */ cmd.key = *k; cmd.associd = (ni != NULL) ? ni->ni_associd : 0; otus_do_async(sc, otus_delete_key_cb, &cmd, sizeof cmd); } void otus_delete_key_cb(struct otus_softc *sc, void *arg) { struct otus_cmd_key *cmd = arg; struct ieee80211_key *k = &cmd->key; uint32_t uid; if (k->k_flags & IEEE80211_KEY_GROUP) uid = htole32(k->k_id); else uid = htole32(OTUS_UID(cmd->associd)); (void)otus_cmd(sc, AR_CMD_DKEY, &uid, sizeof uid, NULL, 0); } #endif /* * XXX TODO: check if we have to be doing any calibration in the host * or whether it's purely a firmware thing. */ void otus_calibrate_to(void *arg, int pending) { #if 0 struct otus_softc *sc = arg; device_printf(sc->sc_dev, "%s: called\n", __func__); struct ieee80211com *ic = &sc->sc_ic; struct ieee80211_node *ni; int s; if (usbd_is_dying(sc->sc_udev)) return; usbd_ref_incr(sc->sc_udev); s = splnet(); ni = ic->ic_bss; ieee80211_amrr_choose(&sc->amrr, ni, &((struct otus_node *)ni)->amn); splx(s); if (!usbd_is_dying(sc->sc_udev)) timeout_add_sec(&sc->calib_to, 1); usbd_ref_decr(sc->sc_udev); #endif } int otus_set_bssid(struct otus_softc *sc, const uint8_t *bssid) { OTUS_LOCK_ASSERT(sc); otus_write(sc, AR_MAC_REG_BSSID_L, bssid[0] | bssid[1] << 8 | bssid[2] << 16 | bssid[3] << 24); otus_write(sc, AR_MAC_REG_BSSID_H, bssid[4] | bssid[5] << 8); return otus_write_barrier(sc); } int otus_set_macaddr(struct otus_softc *sc, const uint8_t *addr) { OTUS_LOCK_ASSERT(sc); otus_write(sc, AR_MAC_REG_MAC_ADDR_L, addr[0] | addr[1] << 8 | addr[2] << 16 | addr[3] << 24); otus_write(sc, AR_MAC_REG_MAC_ADDR_H, addr[4] | addr[5] << 8); return otus_write_barrier(sc); } /* Default single-LED. */ void otus_led_newstate_type1(struct otus_softc *sc) { /* TBD */ device_printf(sc->sc_dev, "%s: TODO\n", __func__); } /* NETGEAR, dual-LED. */ void otus_led_newstate_type2(struct otus_softc *sc) { /* TBD */ device_printf(sc->sc_dev, "%s: TODO\n", __func__); } /* NETGEAR, single-LED/3 colors (blue, red, purple.) */ void otus_led_newstate_type3(struct otus_softc *sc) { #if 0 struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap = TAILQ_FIRST(&ic->ic_vaps); uint32_t state = sc->led_state; OTUS_LOCK_ASSERT(sc); if (!vap) { state = 0; /* led off */ } else if (vap->iv_state == IEEE80211_S_INIT) { state = 0; /* LED off. */ } else if (vap->iv_state == IEEE80211_S_RUN) { /* Associated, LED always on. */ if (IEEE80211_IS_CHAN_2GHZ(sc->sc_curchan)) state = AR_LED0_ON; /* 2GHz=>Red. */ else state = AR_LED1_ON; /* 5GHz=>Blue. */ } else { /* Scanning, blink LED. */ state ^= AR_LED0_ON | AR_LED1_ON; if (IEEE80211_IS_CHAN_2GHZ(sc->sc_curchan)) state &= ~AR_LED1_ON; else state &= ~AR_LED0_ON; } if (state != sc->led_state) { otus_write(sc, AR_GPIO_REG_PORT_DATA, state); if (otus_write_barrier(sc) == 0) sc->led_state = state; } #endif } static uint8_t zero_macaddr[IEEE80211_ADDR_LEN] = { 0,0,0,0,0,0 }; /* * Set up operating mode, MAC/BSS address and RX filter. */ static void otus_set_operating_mode(struct otus_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; struct ieee80211vap *vap; uint32_t cam_mode = AR_MAC_CAM_DEFAULTS; uint32_t rx_ctrl = AR_MAC_RX_CTRL_DEAGG | AR_MAC_RX_CTRL_SHORT_FILTER; uint32_t sniffer = AR_MAC_SNIFFER_DEFAULTS; uint32_t enc_mode = 0x78; /* XXX */ const uint8_t *macaddr; uint8_t bssid[IEEE80211_ADDR_LEN]; struct ieee80211_node *ni; OTUS_LOCK_ASSERT(sc); /* * If we're in sniffer mode or we don't have a MAC * address assigned, ensure it gets reset to all-zero. */ IEEE80211_ADDR_COPY(bssid, zero_macaddr); vap = TAILQ_FIRST(&ic->ic_vaps); macaddr = vap ? vap->iv_myaddr : ic->ic_macaddr; switch (ic->ic_opmode) { case IEEE80211_M_STA: if (vap) { ni = ieee80211_ref_node(vap->iv_bss); IEEE80211_ADDR_COPY(bssid, ni->ni_bssid); ieee80211_free_node(ni); } cam_mode |= AR_MAC_CAM_STA; rx_ctrl |= AR_MAC_RX_CTRL_PASS_TO_HOST; break; case IEEE80211_M_MONITOR: /* * Note: monitor mode ends up causing the MAC to * generate ACK frames for everything it sees. * So don't do that; instead just put it in STA mode * and disable RX filters. */ default: cam_mode |= AR_MAC_CAM_STA; rx_ctrl |= AR_MAC_RX_CTRL_PASS_TO_HOST; break; } /* * TODO: if/when we do hardware encryption, ensure it's * disabled if the NIC is in monitor mode. */ otus_write(sc, AR_MAC_REG_SNIFFER, sniffer); otus_write(sc, AR_MAC_REG_CAM_MODE, cam_mode); otus_write(sc, AR_MAC_REG_ENCRYPTION, enc_mode); otus_write(sc, AR_MAC_REG_RX_CONTROL, rx_ctrl); otus_set_macaddr(sc, macaddr); otus_set_bssid(sc, bssid); /* XXX barrier? */ } static void otus_set_rx_filter(struct otus_softc *sc) { // struct ieee80211com *ic = &sc->sc_ic; OTUS_LOCK_ASSERT(sc); #if 0 if (ic->ic_allmulti > 0 || ic->ic_promisc > 0 || ic->ic_opmode == IEEE80211_M_MONITOR) { otus_write(sc, AR_MAC_REG_FRAMETYPE_FILTER, 0xff00ffff); } else { #endif /* Filter any control frames, BAR is bit 24. */ otus_write(sc, AR_MAC_REG_FRAMETYPE_FILTER, 0x0500ffff); #if 0 } #endif } int otus_init(struct otus_softc *sc) { struct ieee80211com *ic = &sc->sc_ic; int error; OTUS_UNLOCK_ASSERT(sc); OTUS_LOCK(sc); /* Drain any pending TX frames */ otus_drain_mbufq(sc); /* Init MAC */ if ((error = otus_init_mac(sc)) != 0) { OTUS_UNLOCK(sc); device_printf(sc->sc_dev, "%s: could not initialize MAC\n", __func__); return error; } otus_set_operating_mode(sc); otus_set_rx_filter(sc); (void) otus_set_operating_mode(sc); sc->bb_reset = 1; /* Force cold reset. */ if ((error = otus_set_chan(sc, ic->ic_curchan, 0)) != 0) { OTUS_UNLOCK(sc); device_printf(sc->sc_dev, "%s: could not set channel\n", __func__); return error; } /* Start Rx. */ otus_write(sc, AR_MAC_REG_DMA_TRIGGER, 0x100); (void)otus_write_barrier(sc); sc->sc_running = 1; OTUS_UNLOCK(sc); return 0; } void otus_stop(struct otus_softc *sc) { #if 0 int s; #endif OTUS_UNLOCK_ASSERT(sc); OTUS_LOCK(sc); sc->sc_running = 0; sc->sc_tx_timer = 0; OTUS_UNLOCK(sc); taskqueue_drain_timeout(taskqueue_thread, &sc->scan_to); taskqueue_drain_timeout(taskqueue_thread, &sc->calib_to); taskqueue_drain(taskqueue_thread, &sc->tx_task); OTUS_LOCK(sc); sc->sc_running = 0; /* Stop Rx. */ otus_write(sc, AR_MAC_REG_DMA_TRIGGER, 0); (void)otus_write_barrier(sc); /* Drain any pending TX frames */ otus_drain_mbufq(sc); OTUS_UNLOCK(sc); } Index: projects/clang1000-import/sys/kern/subr_compressor.c =================================================================== --- projects/clang1000-import/sys/kern/subr_compressor.c (revision 358238) +++ projects/clang1000-import/sys/kern/subr_compressor.c (revision 358239) @@ -1,558 +1,565 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2014, 2017 Mark Johnston * Copyright (c) 2017 Conrad Meyer * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are * met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Subroutines used for writing compressed user process and kernel core dumps. */ #include __FBSDID("$FreeBSD$"); #include "opt_gzio.h" #include "opt_zstdio.h" #include #include #include #include #include #include MALLOC_DEFINE(M_COMPRESS, "compressor", "kernel compression subroutines"); struct compressor_methods { int format; void *(* const init)(size_t, int); void (* const reset)(void *); int (* const write)(void *, void *, size_t, compressor_cb_t, void *); void (* const fini)(void *); }; struct compressor { const struct compressor_methods *methods; compressor_cb_t cb; void *priv; void *arg; }; SET_DECLARE(compressors, struct compressor_methods); #ifdef GZIO #include struct gz_stream { uint8_t *gz_buffer; /* output buffer */ size_t gz_bufsz; /* output buffer size */ off_t gz_off; /* offset into the output stream */ uint32_t gz_crc; /* stream CRC32 */ z_stream gz_stream; /* zlib state */ }; static void *gz_init(size_t maxiosize, int level); static void gz_reset(void *stream); static int gz_write(void *stream, void *data, size_t len, compressor_cb_t, void *); static void gz_fini(void *stream); static void * gz_alloc(void *arg __unused, u_int n, u_int sz) { /* * Memory for zlib state is allocated using M_NODUMP since it may be * used to compress a kernel dump, and we don't want zlib to attempt to * compress its own state. */ return (malloc(n * sz, M_COMPRESS, M_WAITOK | M_ZERO | M_NODUMP)); } static void gz_free(void *arg __unused, void *ptr) { free(ptr, M_COMPRESS); } static void * gz_init(size_t maxiosize, int level) { struct gz_stream *s; int error; s = gz_alloc(NULL, 1, roundup2(sizeof(*s), PAGE_SIZE)); s->gz_buffer = gz_alloc(NULL, 1, maxiosize); s->gz_bufsz = maxiosize; s->gz_stream.zalloc = gz_alloc; s->gz_stream.zfree = gz_free; s->gz_stream.opaque = NULL; s->gz_stream.next_in = Z_NULL; s->gz_stream.avail_in = 0; + if (level != Z_DEFAULT_COMPRESSION) { + if (level < Z_BEST_SPEED) + level = Z_BEST_SPEED; + else if (level > Z_BEST_COMPRESSION) + level = Z_BEST_COMPRESSION; + } + error = deflateInit2(&s->gz_stream, level, Z_DEFLATED, -MAX_WBITS, DEF_MEM_LEVEL, Z_DEFAULT_STRATEGY); if (error != 0) goto fail; gz_reset(s); return (s); fail: gz_free(NULL, s); return (NULL); } static void gz_reset(void *stream) { struct gz_stream *s; uint8_t *hdr; const size_t hdrlen = 10; s = stream; s->gz_off = 0; s->gz_crc = crc32(0L, Z_NULL, 0); (void)deflateReset(&s->gz_stream); s->gz_stream.avail_out = s->gz_bufsz; s->gz_stream.next_out = s->gz_buffer; /* Write the gzip header to the output buffer. */ hdr = s->gz_buffer; memset(hdr, 0, hdrlen); hdr[0] = 0x1f; hdr[1] = 0x8b; hdr[2] = Z_DEFLATED; hdr[9] = OS_CODE; s->gz_stream.next_out += hdrlen; s->gz_stream.avail_out -= hdrlen; } static int gz_write(void *stream, void *data, size_t len, compressor_cb_t cb, void *arg) { struct gz_stream *s; uint8_t trailer[8]; size_t room; int error, zerror, zflag; s = stream; zflag = data == NULL ? Z_FINISH : Z_NO_FLUSH; if (len > 0) { s->gz_stream.avail_in = len; s->gz_stream.next_in = data; s->gz_crc = crc32(s->gz_crc, data, len); } error = 0; do { zerror = deflate(&s->gz_stream, zflag); if (zerror != Z_OK && zerror != Z_STREAM_END) { error = EIO; break; } if (s->gz_stream.avail_out == 0 || zerror == Z_STREAM_END) { /* * Our output buffer is full or there's nothing left * to produce, so we're flushing the buffer. */ len = s->gz_bufsz - s->gz_stream.avail_out; if (zerror == Z_STREAM_END) { /* * Try to pack as much of the trailer into the * output buffer as we can. */ ((uint32_t *)trailer)[0] = s->gz_crc; ((uint32_t *)trailer)[1] = s->gz_stream.total_in; room = MIN(sizeof(trailer), s->gz_bufsz - len); memcpy(s->gz_buffer + len, trailer, room); len += room; } error = cb(s->gz_buffer, len, s->gz_off, arg); if (error != 0) break; s->gz_off += len; s->gz_stream.next_out = s->gz_buffer; s->gz_stream.avail_out = s->gz_bufsz; /* * If we couldn't pack the trailer into the output * buffer, write it out now. */ if (zerror == Z_STREAM_END && room < sizeof(trailer)) error = cb(trailer + room, sizeof(trailer) - room, s->gz_off, arg); } } while (zerror != Z_STREAM_END && (zflag == Z_FINISH || s->gz_stream.avail_in > 0)); return (error); } static void gz_fini(void *stream) { struct gz_stream *s; s = stream; (void)deflateEnd(&s->gz_stream); gz_free(NULL, s->gz_buffer); gz_free(NULL, s); } struct compressor_methods gzip_methods = { .format = COMPRESS_GZIP, .init = gz_init, .reset = gz_reset, .write = gz_write, .fini = gz_fini, }; DATA_SET(compressors, gzip_methods); #endif /* GZIO */ #ifdef ZSTDIO #define ZSTD_STATIC_LINKING_ONLY #include struct zstdio_stream { ZSTD_CCtx *zst_stream; ZSTD_inBuffer zst_inbuffer; ZSTD_outBuffer zst_outbuffer; uint8_t * zst_buffer; /* output buffer */ size_t zst_maxiosz; /* Max output IO size */ off_t zst_off; /* offset into the output stream */ void * zst_static_wkspc; }; static void *zstdio_init(size_t maxiosize, int level); static void zstdio_reset(void *stream); static int zstdio_write(void *stream, void *data, size_t len, compressor_cb_t, void *); static void zstdio_fini(void *stream); static void * zstdio_init(size_t maxiosize, int level) { ZSTD_CCtx *dump_compressor; struct zstdio_stream *s; void *wkspc, *owkspc, *buffer; size_t wkspc_size, buf_size, rc; s = NULL; wkspc_size = ZSTD_estimateCStreamSize(level); owkspc = wkspc = malloc(wkspc_size + 8, M_COMPRESS, M_WAITOK | M_NODUMP); /* Zstd API requires 8-byte alignment. */ if ((uintptr_t)wkspc % 8 != 0) wkspc = (void *)roundup2((uintptr_t)wkspc, 8); dump_compressor = ZSTD_initStaticCCtx(wkspc, wkspc_size); if (dump_compressor == NULL) { printf("%s: workspace too small.\n", __func__); goto out; } rc = ZSTD_CCtx_setParameter(dump_compressor, ZSTD_c_checksumFlag, 1); if (ZSTD_isError(rc)) { printf("%s: error setting checksumFlag: %s\n", __func__, ZSTD_getErrorName(rc)); goto out; } rc = ZSTD_CCtx_setParameter(dump_compressor, ZSTD_c_compressionLevel, level); if (ZSTD_isError(rc)) { printf("%s: error setting compressLevel: %s\n", __func__, ZSTD_getErrorName(rc)); goto out; } buf_size = ZSTD_CStreamOutSize() * 2; buffer = malloc(buf_size, M_COMPRESS, M_WAITOK | M_NODUMP); s = malloc(sizeof(*s), M_COMPRESS, M_NODUMP | M_WAITOK); s->zst_buffer = buffer; s->zst_outbuffer.dst = buffer; s->zst_outbuffer.size = buf_size; s->zst_maxiosz = maxiosize; s->zst_stream = dump_compressor; s->zst_static_wkspc = owkspc; zstdio_reset(s); out: if (s == NULL) free(owkspc, M_COMPRESS); return (s); } static void zstdio_reset(void *stream) { struct zstdio_stream *s; size_t res; s = stream; res = ZSTD_resetCStream(s->zst_stream, 0); if (ZSTD_isError(res)) panic("%s: could not reset stream %p: %s\n", __func__, s, ZSTD_getErrorName(res)); s->zst_off = 0; s->zst_inbuffer.src = NULL; s->zst_inbuffer.size = 0; s->zst_inbuffer.pos = 0; s->zst_outbuffer.pos = 0; } static int zst_flush_intermediate(struct zstdio_stream *s, compressor_cb_t cb, void *arg) { size_t bytes_to_dump; int error; /* Flush as many full output blocks as possible. */ /* XXX: 4096 is arbitrary safe HDD block size for kernel dumps */ while (s->zst_outbuffer.pos >= 4096) { bytes_to_dump = rounddown(s->zst_outbuffer.pos, 4096); if (bytes_to_dump > s->zst_maxiosz) bytes_to_dump = s->zst_maxiosz; error = cb(s->zst_buffer, bytes_to_dump, s->zst_off, arg); if (error != 0) return (error); /* * Shift any non-full blocks up to the front of the output * buffer. */ s->zst_outbuffer.pos -= bytes_to_dump; memmove(s->zst_outbuffer.dst, (char *)s->zst_outbuffer.dst + bytes_to_dump, s->zst_outbuffer.pos); s->zst_off += bytes_to_dump; } return (0); } static int zstdio_flush(struct zstdio_stream *s, compressor_cb_t cb, void *arg) { size_t rc, lastpos; int error; /* * Positive return indicates unflushed data remaining; need to call * endStream again after clearing out room in output buffer. */ rc = 1; lastpos = s->zst_outbuffer.pos; while (rc > 0) { rc = ZSTD_endStream(s->zst_stream, &s->zst_outbuffer); if (ZSTD_isError(rc)) { printf("%s: ZSTD_endStream failed (%s)\n", __func__, ZSTD_getErrorName(rc)); return (EIO); } if (lastpos == s->zst_outbuffer.pos) { printf("%s: did not make forward progress endStream %zu\n", __func__, lastpos); return (EIO); } error = zst_flush_intermediate(s, cb, arg); if (error != 0) return (error); lastpos = s->zst_outbuffer.pos; } /* * We've already done an intermediate flush, so all full blocks have * been written. Only a partial block remains. Padding happens in a * higher layer. */ if (s->zst_outbuffer.pos != 0) { error = cb(s->zst_buffer, s->zst_outbuffer.pos, s->zst_off, arg); if (error != 0) return (error); } return (0); } static int zstdio_write(void *stream, void *data, size_t len, compressor_cb_t cb, void *arg) { struct zstdio_stream *s; size_t lastpos, rc; int error; s = stream; if (data == NULL) return (zstdio_flush(s, cb, arg)); s->zst_inbuffer.src = data; s->zst_inbuffer.size = len; s->zst_inbuffer.pos = 0; lastpos = 0; while (s->zst_inbuffer.pos < s->zst_inbuffer.size) { rc = ZSTD_compressStream(s->zst_stream, &s->zst_outbuffer, &s->zst_inbuffer); if (ZSTD_isError(rc)) { printf("%s: Compress failed on %p! (%s)\n", __func__, data, ZSTD_getErrorName(rc)); return (EIO); } if (lastpos == s->zst_inbuffer.pos) { /* * XXX: May need flushStream to make forward progress */ printf("ZSTD: did not make forward progress @pos %zu\n", lastpos); return (EIO); } lastpos = s->zst_inbuffer.pos; error = zst_flush_intermediate(s, cb, arg); if (error != 0) return (error); } return (0); } static void zstdio_fini(void *stream) { struct zstdio_stream *s; s = stream; if (s->zst_static_wkspc != NULL) free(s->zst_static_wkspc, M_COMPRESS); else ZSTD_freeCCtx(s->zst_stream); free(s->zst_buffer, M_COMPRESS); free(s, M_COMPRESS); } static struct compressor_methods zstd_methods = { .format = COMPRESS_ZSTD, .init = zstdio_init, .reset = zstdio_reset, .write = zstdio_write, .fini = zstdio_fini, }; DATA_SET(compressors, zstd_methods); #endif /* ZSTDIO */ bool compressor_avail(int format) { struct compressor_methods **iter; SET_FOREACH(iter, compressors) { if ((*iter)->format == format) return (true); } return (false); } struct compressor * compressor_init(compressor_cb_t cb, int format, size_t maxiosize, int level, void *arg) { struct compressor_methods **iter; struct compressor *s; void *priv; SET_FOREACH(iter, compressors) { if ((*iter)->format == format) break; } if (iter == SET_LIMIT(compressors)) return (NULL); priv = (*iter)->init(maxiosize, level); if (priv == NULL) return (NULL); s = malloc(sizeof(*s), M_COMPRESS, M_WAITOK | M_ZERO); s->methods = (*iter); s->priv = priv; s->cb = cb; s->arg = arg; return (s); } void compressor_reset(struct compressor *stream) { stream->methods->reset(stream->priv); } int compressor_write(struct compressor *stream, void *data, size_t len) { return (stream->methods->write(stream->priv, data, len, stream->cb, stream->arg)); } int compressor_flush(struct compressor *stream) { return (stream->methods->write(stream->priv, NULL, 0, stream->cb, stream->arg)); } void compressor_fini(struct compressor *stream) { stream->methods->fini(stream->priv); } Index: projects/clang1000-import/sys/kern/subr_smr.c =================================================================== --- projects/clang1000-import/sys/kern/subr_smr.c (revision 358238) +++ projects/clang1000-import/sys/kern/subr_smr.c (revision 358239) @@ -1,464 +1,670 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019,2020 Jeffrey Roberson * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include /* + * Global Unbounded Sequences (GUS) + * * This is a novel safe memory reclamation technique inspired by * epoch based reclamation from Samy Al Bahra's concurrency kit which * in turn was based on work described in: * Fraser, K. 2004. Practical Lock-Freedom. PhD Thesis, University * of Cambridge Computing Laboratory. * And shares some similarities with: * Wang, Stamler, Parmer. 2016 Parallel Sections: Scaling System-Level * Data-Structures * * This is not an implementation of hazard pointers or related * techniques. The term safe memory reclamation is used as a * generic descriptor for algorithms that defer frees to avoid - * use-after-free errors with lockless datastructures. + * use-after-free errors with lockless datastructures or as + * a mechanism to detect quiescence for writer synchronization. * * The basic approach is to maintain a monotonic write sequence * number that is updated on some application defined granularity. * Readers record the most recent write sequence number they have * observed. A shared read sequence number records the lowest * sequence number observed by any reader as of the last poll. Any * write older than this value has been observed by all readers * and memory can be reclaimed. Like Epoch we also detect idle * readers by storing an invalid sequence number in the per-cpu * state when the read section exits. Like Parsec we establish * a global write clock that is used to mark memory on free. * * The write and read sequence numbers can be thought of as a two - * handed clock with readers always advancing towards writers. SMR + * handed clock with readers always advancing towards writers. GUS * maintains the invariant that all readers can safely access memory * that was visible at the time they loaded their copy of the sequence * number. Periodically the read sequence or hand is polled and * advanced as far towards the write sequence as active readers allow. * Memory which was freed between the old and new global read sequence * number can now be reclaimed. When the system is idle the two hands * meet and no deferred memory is outstanding. Readers never advance * any sequence number, they only observe them. The shared read * sequence number is consequently never higher than the write sequence. * A stored sequence number that falls outside of this range has expired * and needs no scan to reclaim. * - * A notable distinction between this SMR and Epoch, qsbr, rcu, etc. is + * A notable distinction between GUS and Epoch, qsbr, rcu, etc. is * that advancing the sequence number is decoupled from detecting its - * observation. This results in a more granular assignment of sequence + * observation. That is to say, the delta between read and write + * sequence numbers is not bound. This can be thought of as a more + * generalized form of epoch which requires them at most one step + * apart. This results in a more granular assignment of sequence * numbers even as read latencies prohibit all or some expiration. * It also allows writers to advance the sequence number and save the * poll for expiration until a later time when it is likely to * complete without waiting. The batch granularity and free-to-use * latency is dynamic and can be significantly smaller than in more * strict systems. * * This mechanism is primarily intended to be used in coordination with * UMA. By integrating with the allocator we avoid all of the callout * queue machinery and are provided with an efficient way to batch * sequence advancement and waiting. The allocator accumulates a full * per-cpu cache of memory before advancing the sequence. It then * delays waiting for this sequence to expire until the memory is * selected for reuse. In this way we only increment the sequence * value once for n=cache-size frees and the waits are done long * after the sequence has been expired so they need only be verified * to account for pathological conditions and to advance the read * sequence. Tying the sequence number to the bucket size has the * nice property that as the zone gets busier the buckets get larger * and the sequence writes become fewer. If the coherency of advancing * the write sequence number becomes too costly we can advance * it for every N buckets in exchange for higher free-to-use * latency and consequently higher memory consumption. * * If the read overhead of accessing the shared cacheline becomes * especially burdensome an invariant TSC could be used in place of the * sequence. The algorithm would then only need to maintain the minimum * observed tsc. This would trade potential cache synchronization * overhead for local serialization and cpu timestamp overhead. */ /* * A simplified diagram: * * 0 UINT_MAX * | -------------------- sequence number space -------------------- | * ^ rd seq ^ wr seq * | ----- valid sequence numbers ---- | * ^cpuA ^cpuC * | -- free -- | --------- deferred frees -------- | ---- free ---- | * * * In this example cpuA has the lowest sequence number and poll can * advance rd seq. cpuB is not running and is considered to observe * wr seq. * * Freed memory that is tagged with a sequence number between rd seq and * wr seq can not be safely reclaimed because cpuA may hold a reference to * it. Any other memory is guaranteed to be unreferenced. * * Any writer is free to advance wr seq at any time however it may busy * poll in pathological cases. */ static uma_zone_t smr_shared_zone; static uma_zone_t smr_zone; #ifndef INVARIANTS #define SMR_SEQ_INIT 1 /* All valid sequence numbers are odd. */ #define SMR_SEQ_INCR 2 /* * SMR_SEQ_MAX_DELTA is the maximum distance allowed between rd_seq and * wr_seq. For the modular arithmetic to work a value of UNIT_MAX / 2 * would be possible but it is checked after we increment the wr_seq so * a safety margin is left to prevent overflow. * * We will block until SMR_SEQ_MAX_ADVANCE sequence numbers have progressed * to prevent integer wrapping. See smr_advance() for more details. */ #define SMR_SEQ_MAX_DELTA (UINT_MAX / 4) #define SMR_SEQ_MAX_ADVANCE (SMR_SEQ_MAX_DELTA - 1024) #else /* We want to test the wrapping feature in invariants kernels. */ #define SMR_SEQ_INCR (UINT_MAX / 10000) #define SMR_SEQ_INIT (UINT_MAX - 100000) /* Force extra polls to test the integer overflow detection. */ #define SMR_SEQ_MAX_DELTA (SMR_SEQ_INCR * 32) #define SMR_SEQ_MAX_ADVANCE SMR_SEQ_MAX_DELTA / 2 #endif +/* + * The grace period for lazy (tick based) SMR. + * + * Hardclock is responsible for advancing ticks on a single CPU while every + * CPU receives a regular clock interrupt. The clock interrupts are flushing + * the store buffers and any speculative loads that may violate our invariants. + * Because these interrupts are not synchronized we must wait one additional + * tick in the future to be certain that all processors have had their state + * synchronized by an interrupt. + * + * This assumes that the clock interrupt will only be delayed by other causes + * that will flush the store buffer or prevent access to the section protected + * data. For example, an idle processor, or an system management interrupt, + * or a vm exit. + * + * We must wait one additional tick if we are around the wrap condition + * because the write seq will move forward by two with one interrupt. + */ +#define SMR_LAZY_GRACE 2 +#define SMR_LAZY_GRACE_MAX (SMR_LAZY_GRACE + 1) + +/* + * The maximum sequence number ahead of wr_seq that may still be valid. The + * sequence may not be advanced on write for lazy or deferred SMRs. In this + * case poll needs to attempt to forward the sequence number if the goal is + * within wr_seq + SMR_SEQ_ADVANCE. + */ +#define SMR_SEQ_ADVANCE MAX(SMR_SEQ_INCR, SMR_LAZY_GRACE_MAX) + static SYSCTL_NODE(_debug, OID_AUTO, smr, CTLFLAG_RW, NULL, "SMR Stats"); static counter_u64_t advance = EARLY_COUNTER; -SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, advance, CTLFLAG_RD, &advance, ""); +SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, advance, CTLFLAG_RW, &advance, ""); static counter_u64_t advance_wait = EARLY_COUNTER; -SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, advance_wait, CTLFLAG_RD, &advance_wait, ""); +SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, advance_wait, CTLFLAG_RW, &advance_wait, ""); static counter_u64_t poll = EARLY_COUNTER; -SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, poll, CTLFLAG_RD, &poll, ""); +SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, poll, CTLFLAG_RW, &poll, ""); static counter_u64_t poll_scan = EARLY_COUNTER; -SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, poll_scan, CTLFLAG_RD, &poll_scan, ""); +SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, poll_scan, CTLFLAG_RW, &poll_scan, ""); +static counter_u64_t poll_fail = EARLY_COUNTER; +SYSCTL_COUNTER_U64(_debug_smr, OID_AUTO, poll_fail, CTLFLAG_RW, &poll_fail, ""); - /* - * Advance the write sequence and return the new value for use as the - * wait goal. This guarantees that any changes made by the calling - * thread prior to this call will be visible to all threads after - * rd_seq meets or exceeds the return value. + * Advance a lazy write sequence number. These move forward at the rate of + * ticks. Grace is two ticks in the future. lazy write sequence numbers can + * be even but not SMR_SEQ_INVALID so we pause time for a tick when we wrap. * - * This function may busy loop if the readers are roughly 1 billion - * sequence numbers behind the writers. + * This returns the _current_ write sequence number. The lazy goal sequence + * number is SMR_LAZY_GRACE ticks ahead. */ -smr_seq_t -smr_advance(smr_t smr) +static smr_seq_t +smr_lazy_advance(smr_t smr, smr_shared_t s) { - smr_shared_t s; - smr_seq_t goal, s_rd_seq; + smr_seq_t s_rd_seq, s_wr_seq, goal; + int t; + CRITICAL_ASSERT(curthread); + /* - * It is illegal to enter while in an smr section. + * Load s_wr_seq prior to ticks to ensure that the thread that + * observes the largest value wins. */ - SMR_ASSERT_NOT_ENTERED(smr); + s_wr_seq = atomic_load_acq_int(&s->s_wr_seq); /* - * Modifications not done in a smr section need to be visible - * before advancing the seq. + * We must not allow a zero tick value. We go back in time one tick + * and advance the grace period forward one tick around zero. */ - atomic_thread_fence_rel(); + t = ticks; + if (t == SMR_SEQ_INVALID) + t--; /* - * Load the current read seq before incrementing the goal so - * we are guaranteed it is always < goal. + * The most probable condition that the update already took place. */ - s = zpcpu_get(smr)->c_shared; - s_rd_seq = atomic_load_acq_int(&s->s_rd_seq); + if (__predict_true(t == s_wr_seq)) + goto out; /* - * Increment the shared write sequence by 2. Since it is - * initialized to 1 this means the only valid values are - * odd and an observed value of 0 in a particular CPU means - * it is not currently in a read section. + * After long idle periods the read sequence may fall too far + * behind write. Prevent poll from ever seeing this condition + * by updating the stale rd_seq. This assumes that there can + * be no valid section 2bn ticks old. The rd_seq update must + * be visible before wr_seq to avoid races with other advance + * callers. */ - goal = atomic_fetchadd_int(&s->s_wr_seq, SMR_SEQ_INCR) + SMR_SEQ_INCR; + s_rd_seq = atomic_load_int(&s->s_rd_seq); + if (SMR_SEQ_GT(s_rd_seq, t)) + atomic_cmpset_rel_int(&s->s_rd_seq, s_rd_seq, t); + + /* + * Release to synchronize with the wr_seq load above. Ignore + * cmpset failures from simultaneous updates. + */ + atomic_cmpset_rel_int(&s->s_wr_seq, s_wr_seq, t); counter_u64_add(advance, 1); + /* If we lost either update race another thread did it. */ + s_wr_seq = t; +out: + goal = s_wr_seq + SMR_LAZY_GRACE; + /* Skip over the SMR_SEQ_INVALID tick. */ + if (goal < SMR_LAZY_GRACE) + goal++; + return (goal); +} +/* + * Increment the shared write sequence by 2. Since it is initialized + * to 1 this means the only valid values are odd and an observed value + * of 0 in a particular CPU means it is not currently in a read section. + */ +static smr_seq_t +smr_shared_advance(smr_shared_t s) +{ + + return (atomic_fetchadd_int(&s->s_wr_seq, SMR_SEQ_INCR) + SMR_SEQ_INCR); +} + +/* + * Advance the write sequence number for a normal smr section. If the + * write sequence is too far behind the read sequence we have to poll + * to advance rd_seq and prevent undetectable wraps. + */ +static smr_seq_t +smr_default_advance(smr_t smr, smr_shared_t s) +{ + smr_seq_t goal, s_rd_seq; + + CRITICAL_ASSERT(curthread); + KASSERT((zpcpu_get(smr)->c_flags & SMR_LAZY) == 0, + ("smr_default_advance: called with lazy smr.")); + /* + * Load the current read seq before incrementing the goal so + * we are guaranteed it is always < goal. + */ + s_rd_seq = atomic_load_acq_int(&s->s_rd_seq); + goal = smr_shared_advance(s); + + /* * Force a synchronization here if the goal is getting too * far ahead of the read sequence number. This keeps the * wrap detecting arithmetic working in pathological cases. */ if (SMR_SEQ_DELTA(goal, s_rd_seq) >= SMR_SEQ_MAX_DELTA) { counter_u64_add(advance_wait, 1); smr_wait(smr, goal - SMR_SEQ_MAX_ADVANCE); } + counter_u64_add(advance, 1); return (goal); } +/* + * Deferred SMRs conditionally update s_wr_seq based on an + * cpu local interval count. + */ +static smr_seq_t +smr_deferred_advance(smr_t smr, smr_shared_t s, smr_t self) +{ + + if (++self->c_deferred < self->c_limit) + return (smr_shared_current(s) + SMR_SEQ_INCR); + self->c_deferred = 0; + return (smr_default_advance(smr, s)); +} + +/* + * Advance the write sequence and return the value for use as the + * wait goal. This guarantees that any changes made by the calling + * thread prior to this call will be visible to all threads after + * rd_seq meets or exceeds the return value. + * + * This function may busy loop if the readers are roughly 1 billion + * sequence numbers behind the writers. + * + * Lazy SMRs will not busy loop and the wrap happens every 49.6 days + * at 1khz and 119 hours at 10khz. Readers can block for no longer + * than half of this for SMR_SEQ_ macros to continue working. + */ smr_seq_t -smr_advance_deferred(smr_t smr, int limit) +smr_advance(smr_t smr) { + smr_t self; + smr_shared_t s; smr_seq_t goal; - smr_t csmr; + int flags; + /* + * It is illegal to enter while in an smr section. + */ SMR_ASSERT_NOT_ENTERED(smr); + /* + * Modifications not done in a smr section need to be visible + * before advancing the seq. + */ + atomic_thread_fence_rel(); + critical_enter(); - csmr = zpcpu_get(smr); - if (++csmr->c_deferred >= limit) { - goal = SMR_SEQ_INVALID; - csmr->c_deferred = 0; - } else - goal = smr_shared_current(csmr->c_shared) + SMR_SEQ_INCR; + /* Try to touch the line once. */ + self = zpcpu_get(smr); + s = self->c_shared; + flags = self->c_flags; + goal = SMR_SEQ_INVALID; + if ((flags & (SMR_LAZY | SMR_DEFERRED)) == 0) + goal = smr_default_advance(smr, s); + else if ((flags & SMR_LAZY) != 0) + goal = smr_lazy_advance(smr, s); + else if ((flags & SMR_DEFERRED) != 0) + goal = smr_deferred_advance(smr, s, self); critical_exit(); - if (goal != SMR_SEQ_INVALID) - return (goal); - return (smr_advance(smr)); + return (goal); } /* + * Poll to determine the currently observed sequence number on a cpu + * and spinwait if the 'wait' argument is true. + */ +static smr_seq_t +smr_poll_cpu(smr_t c, smr_seq_t s_rd_seq, smr_seq_t goal, bool wait) +{ + smr_seq_t c_seq; + + c_seq = SMR_SEQ_INVALID; + for (;;) { + c_seq = atomic_load_int(&c->c_seq); + if (c_seq == SMR_SEQ_INVALID) + break; + + /* + * There is a race described in smr.h:smr_enter that + * can lead to a stale seq value but not stale data + * access. If we find a value out of range here we + * pin it to the current min to prevent it from + * advancing until that stale section has expired. + * + * The race is created when a cpu loads the s_wr_seq + * value in a local register and then another thread + * advances s_wr_seq and calls smr_poll() which will + * oberve no value yet in c_seq and advance s_rd_seq + * up to s_wr_seq which is beyond the register + * cached value. This is only likely to happen on + * hypervisor or with a system management interrupt. + */ + if (SMR_SEQ_LT(c_seq, s_rd_seq)) + c_seq = s_rd_seq; + + /* + * If the sequence number meets the goal we are done + * with this cpu. + */ + if (SMR_SEQ_LEQ(goal, c_seq)) + break; + + if (!wait) + break; + cpu_spinwait(); + } + + return (c_seq); +} + +/* + * Loop until all cores have observed the goal sequence or have + * gone inactive. Returns the oldest sequence currently active; + * + * This function assumes a snapshot of sequence values has + * been obtained and validated by smr_poll(). + */ +static smr_seq_t +smr_poll_scan(smr_t smr, smr_shared_t s, smr_seq_t s_rd_seq, + smr_seq_t s_wr_seq, smr_seq_t goal, bool wait) +{ + smr_seq_t rd_seq, c_seq; + int i; + + CRITICAL_ASSERT(curthread); + counter_u64_add_protected(poll_scan, 1); + + /* + * The read sequence can be no larger than the write sequence at + * the start of the poll. + */ + rd_seq = s_wr_seq; + CPU_FOREACH(i) { + /* + * Query the active sequence on this cpu. If we're not + * waiting and we don't meet the goal we will still scan + * the rest of the cpus to update s_rd_seq before returning + * failure. + */ + c_seq = smr_poll_cpu(zpcpu_get_cpu(smr, i), s_rd_seq, goal, + wait); + + /* + * Limit the minimum observed rd_seq whether we met the goal + * or not. + */ + if (c_seq != SMR_SEQ_INVALID) + rd_seq = SMR_SEQ_MIN(rd_seq, c_seq); + } + + /* + * Advance the rd_seq as long as we observed a more recent value. + */ + s_rd_seq = atomic_load_int(&s->s_rd_seq); + if (SMR_SEQ_GEQ(rd_seq, s_rd_seq)) { + atomic_cmpset_int(&s->s_rd_seq, s_rd_seq, rd_seq); + s_rd_seq = rd_seq; + } + + return (s_rd_seq); +} + +/* * Poll to determine whether all readers have observed the 'goal' write * sequence number. * * If wait is true this will spin until the goal is met. * * This routine will updated the minimum observed read sequence number in * s_rd_seq if it does a scan. It may not do a scan if another call has * advanced s_rd_seq beyond the callers goal already. * * Returns true if the goal is met and false if not. */ bool smr_poll(smr_t smr, smr_seq_t goal, bool wait) { smr_shared_t s; - smr_t c; - smr_seq_t s_wr_seq, s_rd_seq, rd_seq, c_seq; - int i; + smr_t self; + smr_seq_t s_wr_seq, s_rd_seq; + smr_delta_t delta; + int flags; bool success; /* * It is illegal to enter while in an smr section. */ KASSERT(!wait || !SMR_ENTERED(smr), ("smr_poll: Blocking not allowed in a SMR section.")); + KASSERT(!wait || (zpcpu_get(smr)->c_flags & SMR_LAZY) == 0, + ("smr_poll: Blocking not allowed on lazy smrs.")); /* * Use a critical section so that we can avoid ABA races * caused by long preemption sleeps. */ success = true; critical_enter(); - s = zpcpu_get(smr)->c_shared; + /* Attempt to load from self only once. */ + self = zpcpu_get(smr); + s = self->c_shared; + flags = self->c_flags; counter_u64_add_protected(poll, 1); /* + * Conditionally advance the lazy write clock on any writer + * activity. This may reset s_rd_seq. + */ + if ((flags & SMR_LAZY) != 0) + smr_lazy_advance(smr, s); + + /* * Acquire barrier loads s_wr_seq after s_rd_seq so that we can not * observe an updated read sequence that is larger than write. */ s_rd_seq = atomic_load_acq_int(&s->s_rd_seq); /* - * wr_seq must be loaded prior to any c_seq value so that a stale - * c_seq can only reference time after this wr_seq. + * If we have already observed the sequence number we can immediately + * return success. Most polls should meet this criterion. */ + if (SMR_SEQ_LEQ(goal, s_rd_seq)) + goto out; + + /* + * wr_seq must be loaded prior to any c_seq value so that a + * stale c_seq can only reference time after this wr_seq. + */ s_wr_seq = atomic_load_acq_int(&s->s_wr_seq); /* - * This may have come from a deferred advance. Consider one - * increment past the current wr_seq valid and make sure we - * have advanced far enough to succeed. We simply add to avoid - * an additional fence. + * This is the distance from s_wr_seq to goal. Positive values + * are in the future. */ - if (goal == s_wr_seq + SMR_SEQ_INCR) { - atomic_add_int(&s->s_wr_seq, SMR_SEQ_INCR); - s_wr_seq = goal; + delta = SMR_SEQ_DELTA(goal, s_wr_seq); + + /* + * Detect a stale wr_seq. + * + * This goal may have come from a deferred advance or a lazy + * smr. If we are not blocking we can not succeed but the + * sequence number is valid. + */ + if (delta > 0 && delta <= SMR_SEQ_MAX_ADVANCE && + (flags & (SMR_LAZY | SMR_DEFERRED)) != 0) { + if (!wait) { + success = false; + goto out; + } + /* LAZY is always !wait. */ + s_wr_seq = smr_shared_advance(s); + delta = 0; } /* - * Detect whether the goal is valid and has already been observed. + * Detect an invalid goal. * * The goal must be in the range of s_wr_seq >= goal >= s_rd_seq for * it to be valid. If it is not then the caller held on to it and * the integer wrapped. If we wrapped back within range the caller * will harmlessly scan. - * - * A valid goal must be greater than s_rd_seq or we have not verified - * that it has been observed and must fall through to polling. */ - if (SMR_SEQ_GEQ(s_rd_seq, goal) || SMR_SEQ_LT(s_wr_seq, goal)) + if (delta > 0) goto out; - /* - * Loop until all cores have observed the goal sequence or have - * gone inactive. Keep track of the oldest sequence currently - * active as rd_seq. - */ - counter_u64_add_protected(poll_scan, 1); - rd_seq = s_wr_seq; - CPU_FOREACH(i) { - c = zpcpu_get_cpu(smr, i); - c_seq = SMR_SEQ_INVALID; - for (;;) { - c_seq = atomic_load_int(&c->c_seq); - if (c_seq == SMR_SEQ_INVALID) - break; - - /* - * There is a race described in smr.h:smr_enter that - * can lead to a stale seq value but not stale data - * access. If we find a value out of range here we - * pin it to the current min to prevent it from - * advancing until that stale section has expired. - * - * The race is created when a cpu loads the s_wr_seq - * value in a local register and then another thread - * advances s_wr_seq and calls smr_poll() which will - * oberve no value yet in c_seq and advance s_rd_seq - * up to s_wr_seq which is beyond the register - * cached value. This is only likely to happen on - * hypervisor or with a system management interrupt. - */ - if (SMR_SEQ_LT(c_seq, s_rd_seq)) - c_seq = s_rd_seq; - - /* - * If the sequence number meets the goal we are - * done with this cpu. - */ - if (SMR_SEQ_GEQ(c_seq, goal)) - break; - - /* - * If we're not waiting we will still scan the rest - * of the cpus and update s_rd_seq before returning - * an error. - */ - if (!wait) { - success = false; - break; - } - cpu_spinwait(); - } - - /* - * Limit the minimum observed rd_seq whether we met the goal - * or not. - */ - if (c_seq != SMR_SEQ_INVALID && SMR_SEQ_GT(rd_seq, c_seq)) - rd_seq = c_seq; - } - - /* - * Advance the rd_seq as long as we observed the most recent one. - */ - s_rd_seq = atomic_load_int(&s->s_rd_seq); - do { - if (SMR_SEQ_LEQ(rd_seq, s_rd_seq)) - goto out; - } while (atomic_fcmpset_int(&s->s_rd_seq, &s_rd_seq, rd_seq) == 0); - + /* Determine the lowest visible sequence number. */ + s_rd_seq = smr_poll_scan(smr, s, s_rd_seq, s_wr_seq, goal, wait); + success = SMR_SEQ_LEQ(goal, s_rd_seq); out: + if (!success) + counter_u64_add_protected(poll_fail, 1); critical_exit(); /* * Serialize with smr_advance()/smr_exit(). The caller is now free * to modify memory as expected. */ atomic_thread_fence_acq(); return (success); } smr_t -smr_create(const char *name) +smr_create(const char *name, int limit, int flags) { smr_t smr, c; smr_shared_t s; int i; s = uma_zalloc(smr_shared_zone, M_WAITOK); smr = uma_zalloc_pcpu(smr_zone, M_WAITOK); s->s_name = name; - s->s_rd_seq = s->s_wr_seq = SMR_SEQ_INIT; + if ((flags & SMR_LAZY) == 0) + s->s_rd_seq = s->s_wr_seq = SMR_SEQ_INIT; + else + s->s_rd_seq = s->s_wr_seq = ticks; /* Initialize all CPUS, not just those running. */ for (i = 0; i <= mp_maxid; i++) { c = zpcpu_get_cpu(smr, i); c->c_seq = SMR_SEQ_INVALID; c->c_shared = s; + c->c_deferred = 0; + c->c_limit = limit; + c->c_flags = flags; } atomic_thread_fence_seq_cst(); return (smr); } void smr_destroy(smr_t smr) { smr_synchronize(smr); uma_zfree(smr_shared_zone, smr->c_shared); uma_zfree_pcpu(smr_zone, smr); } /* * Initialize the UMA slab zone. */ void smr_init(void) { smr_shared_zone = uma_zcreate("SMR SHARED", sizeof(struct smr_shared), NULL, NULL, NULL, NULL, (CACHE_LINE_SIZE * 2) - 1, 0); smr_zone = uma_zcreate("SMR CPU", sizeof(struct smr), NULL, NULL, NULL, NULL, (CACHE_LINE_SIZE * 2) - 1, UMA_ZONE_PCPU); } static void smr_init_counters(void *unused) { advance = counter_u64_alloc(M_WAITOK); advance_wait = counter_u64_alloc(M_WAITOK); poll = counter_u64_alloc(M_WAITOK); poll_scan = counter_u64_alloc(M_WAITOK); + poll_fail = counter_u64_alloc(M_WAITOK); } SYSINIT(smr_counters, SI_SUB_CPU, SI_ORDER_ANY, smr_init_counters, NULL); Index: projects/clang1000-import/sys/kern/subr_trap.c =================================================================== --- projects/clang1000-import/sys/kern/subr_trap.c (revision 358238) +++ projects/clang1000-import/sys/kern/subr_trap.c (revision 358239) @@ -1,381 +1,383 @@ /*- * SPDX-License-Identifier: BSD-4-Clause * * Copyright (C) 1994, David Greenman * Copyright (c) 1990, 1993 * The Regents of the University of California. All rights reserved. * Copyright (c) 2007 The FreeBSD Foundation * * This code is derived from software contributed to Berkeley by * the University of Utah, and William Jolitz. * * Portions of this software were developed by A. Joseph Koshy under * sponsorship from the FreeBSD Foundation and Google, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)trap.c 7.4 (Berkeley) 5/13/91 */ #include __FBSDID("$FreeBSD$"); #include "opt_hwpmc_hooks.h" #include "opt_ktrace.h" #include "opt_sched.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef KTRACE #include #include #endif #include #include #ifdef VIMAGE #include #endif #ifdef HWPMC_HOOKS #include #endif #include void (*softdep_ast_cleanup)(struct thread *); /* * Define the code needed before returning to user mode, for trap and * syscall. */ void userret(struct thread *td, struct trapframe *frame) { struct proc *p = td->td_proc; CTR3(KTR_SYSC, "userret: thread %p (pid %d, %s)", td, p->p_pid, td->td_name); KASSERT((p->p_flag & P_WEXIT) == 0, ("Exiting process returns to usermode")); #ifdef DIAGNOSTIC /* * Check that we called signotify() enough. For * multi-threaded processes, where signal distribution might * change due to other threads changing sigmask, the check is * racy and cannot be performed reliably. * If current process is vfork child, indicated by P_PPWAIT, then * issignal() ignores stops, so we block the check to avoid * classifying pending signals. */ if (p->p_numthreads == 1) { PROC_LOCK(p); thread_lock(td); if ((p->p_flag & P_PPWAIT) == 0 && (td->td_pflags & TDP_SIGFASTBLOCK) == 0) { if (SIGPENDING(td) && (td->td_flags & (TDF_NEEDSIGCHK | TDF_ASTPENDING)) != (TDF_NEEDSIGCHK | TDF_ASTPENDING)) { thread_unlock(td); panic( "failed to set signal flags for ast p %p td %p fl %x", p, td, td->td_flags); } } thread_unlock(td); PROC_UNLOCK(p); } #endif #ifdef KTRACE KTRUSERRET(td); #endif td_softdep_cleanup(td); MPASS(td->td_su == NULL); /* * If this thread tickled GEOM, we need to wait for the giggling to * stop before we return to userland */ if (__predict_false(td->td_pflags & TDP_GEOM)) g_waitidle(); /* * Charge system time if profiling. */ if (__predict_false(p->p_flag & P_PROFIL)) addupc_task(td, TRAPF_PC(frame), td->td_pticks * psratio); #ifdef HWPMC_HOOKS if (PMC_THREAD_HAS_SAMPLES(td)) PMC_CALL_HOOK(td, PMC_FN_THR_USERRET, NULL); #endif /* * Let the scheduler adjust our priority etc. */ sched_userret(td); /* * Check for misbehavior. * * In case there is a callchain tracing ongoing because of * hwpmc(4), skip the scheduler pinning check. * hwpmc(4) subsystem, infact, will collect callchain informations * at ast() checkpoint, which is past userret(). */ WITNESS_WARN(WARN_PANIC, NULL, "userret: returning"); KASSERT(td->td_critnest == 0, ("userret: Returning in a critical section")); KASSERT(td->td_locks == 0, ("userret: Returning with %d locks held", td->td_locks)); KASSERT(td->td_rw_rlocks == 0, ("userret: Returning with %d rwlocks held in read mode", td->td_rw_rlocks)); KASSERT(td->td_sx_slocks == 0, ("userret: Returning with %d sx locks held in shared mode", td->td_sx_slocks)); KASSERT(td->td_lk_slocks == 0, ("userret: Returning with %d lockmanager locks held in shared mode", td->td_lk_slocks)); KASSERT((td->td_pflags & TDP_NOFAULTING) == 0, ("userret: Returning with pagefaults disabled")); if (__predict_false(!THREAD_CAN_SLEEP())) { #ifdef EPOCH_TRACE epoch_trace_list(curthread); #endif KASSERT(1, ("userret: Returning with sleep disabled")); } KASSERT(td->td_pinned == 0 || (td->td_pflags & TDP_CALLCHAIN) != 0, ("userret: Returning with with pinned thread")); KASSERT(td->td_vp_reserved == NULL, ("userret: Returning with preallocated vnode")); KASSERT((td->td_flags & (TDF_SBDRY | TDF_SEINTR | TDF_SERESTART)) == 0, ("userret: Returning with stop signals deferred")); KASSERT(td->td_su == NULL, ("userret: Returning with SU cleanup request not handled")); KASSERT(td->td_vslock_sz == 0, ("userret: Returning with vslock-wired space")); #ifdef VIMAGE /* Unfortunately td_vnet_lpush needs VNET_DEBUG. */ VNET_ASSERT(curvnet == NULL, ("%s: Returning on td %p (pid %d, %s) with vnet %p set in %s", __func__, td, p->p_pid, td->td_name, curvnet, (td->td_vnet_lpush != NULL) ? td->td_vnet_lpush : "N/A")); #endif #ifdef RACCT if (__predict_false(racct_enable && p->p_throttled != 0)) racct_proc_throttled(p); #endif } /* * Process an asynchronous software trap. * This is relatively easy. * This function will return with preemption disabled. */ void ast(struct trapframe *framep) { struct thread *td; struct proc *p; int flags, sig; td = curthread; p = td->td_proc; CTR3(KTR_SYSC, "ast: thread %p (pid %d, %s)", td, p->p_pid, p->p_comm); KASSERT(TRAPF_USERMODE(framep), ("ast in kernel mode")); WITNESS_WARN(WARN_PANIC, NULL, "Returning to user mode"); mtx_assert(&Giant, MA_NOTOWNED); THREAD_LOCK_ASSERT(td, MA_NOTOWNED); td->td_frame = framep; td->td_pticks = 0; /* * This updates the td_flag's for the checks below in one * "atomic" operation with turning off the astpending flag. * If another AST is triggered while we are handling the * AST's saved in flags, the astpending flag will be set and * ast() will be called again. */ thread_lock(td); flags = td->td_flags; td->td_flags &= ~(TDF_ASTPENDING | TDF_NEEDSIGCHK | TDF_NEEDSUSPCHK | TDF_NEEDRESCHED | TDF_ALRMPEND | TDF_PROFPEND | TDF_MACPEND); thread_unlock(td); VM_CNT_INC(v_trap); if (td->td_cowgen != p->p_cowgen) thread_cow_update(td); if (td->td_pflags & TDP_OWEUPC && p->p_flag & P_PROFIL) { addupc_task(td, td->td_profil_addr, td->td_profil_ticks); td->td_profil_ticks = 0; td->td_pflags &= ~TDP_OWEUPC; } #ifdef HWPMC_HOOKS /* Handle Software PMC callchain capture. */ if (PMC_IS_PENDING_CALLCHAIN(td)) PMC_CALL_HOOK_UNLOCKED(td, PMC_FN_USER_CALLCHAIN_SOFT, (void *) framep); #endif if (flags & TDF_ALRMPEND) { PROC_LOCK(p); kern_psignal(p, SIGVTALRM); PROC_UNLOCK(p); } if (flags & TDF_PROFPEND) { PROC_LOCK(p); kern_psignal(p, SIGPROF); PROC_UNLOCK(p); } #ifdef MAC if (flags & TDF_MACPEND) mac_thread_userret(td); #endif if (flags & TDF_NEEDRESCHED) { #ifdef KTRACE if (KTRPOINT(td, KTR_CSW)) ktrcsw(1, 1, __func__); #endif thread_lock(td); sched_prio(td, td->td_user_pri); mi_switch(SW_INVOL | SWT_NEEDRESCHED); #ifdef KTRACE if (KTRPOINT(td, KTR_CSW)) ktrcsw(0, 1, __func__); #endif } #ifdef DIAGNOSTIC if (p->p_numthreads == 1 && (flags & TDF_NEEDSIGCHK) == 0) { PROC_LOCK(p); thread_lock(td); /* * Note that TDF_NEEDSIGCHK should be re-read from * td_flags, since signal might have been delivered * after we cleared td_flags above. This is one of * the reason for looping check for AST condition. * See comment in userret() about P_PPWAIT. */ if ((p->p_flag & P_PPWAIT) == 0 && (td->td_pflags & TDP_SIGFASTBLOCK) == 0) { if (SIGPENDING(td) && (td->td_flags & (TDF_NEEDSIGCHK | TDF_ASTPENDING)) != (TDF_NEEDSIGCHK | TDF_ASTPENDING)) { thread_unlock(td); /* fix dumps */ panic( "failed2 to set signal flags for ast p %p td %p fl %x %x", p, td, flags, td->td_flags); } } thread_unlock(td); PROC_UNLOCK(p); } #endif /* * Check for signals. Unlocked reads of p_pendingcnt or * p_siglist might cause process-directed signal to be handled * later. */ if (flags & TDF_NEEDSIGCHK || p->p_pendingcnt > 0 || !SIGISEMPTY(p->p_siglist)) { sigfastblock_fetch(td); - PROC_LOCK(p); - mtx_lock(&p->p_sigacts->ps_mtx); if ((td->td_pflags & TDP_SIGFASTBLOCK) != 0 && td->td_sigblock_val != 0) { sigfastblock_setpend(td); + PROC_LOCK(p); reschedule_signals(p, fastblock_mask, - SIGPROCMASK_PS_LOCKED | SIGPROCMASK_FASTBLK); + SIGPROCMASK_FASTBLK); + PROC_UNLOCK(p); } else { + PROC_LOCK(p); + mtx_lock(&p->p_sigacts->ps_mtx); while ((sig = cursig(td)) != 0) { KASSERT(sig >= 0, ("sig %d", sig)); postsig(sig); } + mtx_unlock(&p->p_sigacts->ps_mtx); + PROC_UNLOCK(p); } - mtx_unlock(&p->p_sigacts->ps_mtx); - PROC_UNLOCK(p); } /* * Handle deferred update of the fast sigblock value, after * the postsig() loop was performed. */ if (td->td_pflags & TDP_SIGFASTPENDING) sigfastblock_setpend(td); /* * We need to check to see if we have to exit or wait due to a * single threading requirement or some other STOP condition. */ if (flags & TDF_NEEDSUSPCHK) { PROC_LOCK(p); thread_suspend_check(0); PROC_UNLOCK(p); } if (td->td_pflags & TDP_OLDMASK) { td->td_pflags &= ~TDP_OLDMASK; kern_sigprocmask(td, SIG_SETMASK, &td->td_oldsigmask, NULL, 0); } userret(td, framep); } const char * syscallname(struct proc *p, u_int code) { static const char unknown[] = "unknown"; struct sysentvec *sv; sv = p->p_sysent; if (sv->sv_syscallnames == NULL || code >= sv->sv_size) return (unknown); return (sv->sv_syscallnames[code]); } Index: projects/clang1000-import/sys/kern/vfs_lookup.c =================================================================== --- projects/clang1000-import/sys/kern/vfs_lookup.c (revision 358238) +++ projects/clang1000-import/sys/kern/vfs_lookup.c (revision 358239) @@ -1,1515 +1,1514 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)vfs_lookup.c 8.4 (Berkeley) 2/16/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_capsicum.h" #include "opt_ktrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef KTRACE #include #endif #include #include #include #define NAMEI_DIAGNOSTIC 1 #undef NAMEI_DIAGNOSTIC SDT_PROVIDER_DECLARE(vfs); SDT_PROBE_DEFINE3(vfs, namei, lookup, entry, "struct vnode *", "char *", "unsigned long"); SDT_PROBE_DEFINE2(vfs, namei, lookup, return, "int", "struct vnode *"); /* Allocation zone for namei. */ uma_zone_t namei_zone; /* Placeholder vnode for mp traversal. */ static struct vnode *vp_crossmp; static int crossmp_vop_islocked(struct vop_islocked_args *ap) { return (LK_SHARED); } static int crossmp_vop_lock1(struct vop_lock1_args *ap) { struct vnode *vp; struct lock *lk __unused; const char *file __unused; int flags, line __unused; vp = ap->a_vp; lk = vp->v_vnlock; flags = ap->a_flags; file = ap->a_file; line = ap->a_line; if ((flags & LK_SHARED) == 0) panic("invalid lock request for crossmp"); WITNESS_CHECKORDER(&lk->lock_object, LOP_NEWORDER, file, line, flags & LK_INTERLOCK ? &VI_MTX(vp)->lock_object : NULL); WITNESS_LOCK(&lk->lock_object, 0, file, line); if ((flags & LK_INTERLOCK) != 0) VI_UNLOCK(vp); LOCK_LOG_LOCK("SLOCK", &lk->lock_object, 0, 0, ap->a_file, line); return (0); } static int crossmp_vop_unlock(struct vop_unlock_args *ap) { struct vnode *vp; struct lock *lk __unused; vp = ap->a_vp; lk = vp->v_vnlock; WITNESS_UNLOCK(&lk->lock_object, 0, LOCK_FILE, LOCK_LINE); LOCK_LOG_LOCK("SUNLOCK", &lk->lock_object, 0, 0, LOCK_FILE, LOCK_LINE); return (0); } static struct vop_vector crossmp_vnodeops = { .vop_default = &default_vnodeops, .vop_islocked = crossmp_vop_islocked, .vop_lock1 = crossmp_vop_lock1, .vop_unlock = crossmp_vop_unlock, }; /* * VFS_VOP_VECTOR_REGISTER(crossmp_vnodeops) is not used here since the vnode * gets allocated early. See nameiinit for the direct call below. */ struct nameicap_tracker { struct vnode *dp; TAILQ_ENTRY(nameicap_tracker) nm_link; }; /* Zone for cap mode tracker elements used for dotdot capability checks. */ static uma_zone_t nt_zone; static void nameiinit(void *dummy __unused) { namei_zone = uma_zcreate("NAMEI", MAXPATHLEN, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); nt_zone = uma_zcreate("rentr", sizeof(struct nameicap_tracker), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); vfs_vector_op_register(&crossmp_vnodeops); getnewvnode("crossmp", NULL, &crossmp_vnodeops, &vp_crossmp); } SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_SECOND, nameiinit, NULL); static int lookup_cap_dotdot = 1; SYSCTL_INT(_vfs, OID_AUTO, lookup_cap_dotdot, CTLFLAG_RWTUN, &lookup_cap_dotdot, 0, "enables \"..\" components in path lookup in capability mode"); static int lookup_cap_dotdot_nonlocal = 1; SYSCTL_INT(_vfs, OID_AUTO, lookup_cap_dotdot_nonlocal, CTLFLAG_RWTUN, &lookup_cap_dotdot_nonlocal, 0, "enables \"..\" components in path lookup in capability mode " "on non-local mount"); static void nameicap_tracker_add(struct nameidata *ndp, struct vnode *dp) { struct nameicap_tracker *nt; if ((ndp->ni_lcf & NI_LCF_CAP_DOTDOT) == 0 || dp->v_type != VDIR) return; if ((ndp->ni_lcf & (NI_LCF_BENEATH_ABS | NI_LCF_BENEATH_LATCHED)) == NI_LCF_BENEATH_ABS) { MPASS((ndp->ni_lcf & NI_LCF_LATCH) != 0); if (dp != ndp->ni_beneath_latch) return; ndp->ni_lcf |= NI_LCF_BENEATH_LATCHED; } nt = uma_zalloc(nt_zone, M_WAITOK); vhold(dp); nt->dp = dp; TAILQ_INSERT_TAIL(&ndp->ni_cap_tracker, nt, nm_link); } static void nameicap_cleanup(struct nameidata *ndp, bool clean_latch) { struct nameicap_tracker *nt, *nt1; KASSERT(TAILQ_EMPTY(&ndp->ni_cap_tracker) || (ndp->ni_lcf & NI_LCF_CAP_DOTDOT) != 0, ("not strictrelative")); TAILQ_FOREACH_SAFE(nt, &ndp->ni_cap_tracker, nm_link, nt1) { TAILQ_REMOVE(&ndp->ni_cap_tracker, nt, nm_link); vdrop(nt->dp); uma_zfree(nt_zone, nt); } if (clean_latch && (ndp->ni_lcf & NI_LCF_LATCH) != 0) { ndp->ni_lcf &= ~NI_LCF_LATCH; vrele(ndp->ni_beneath_latch); } } /* * For dotdot lookups in capability mode, only allow the component * lookup to succeed if the resulting directory was already traversed * during the operation. Also fail dotdot lookups for non-local * filesystems, where external agents might assist local lookups to * escape the compartment. */ static int nameicap_check_dotdot(struct nameidata *ndp, struct vnode *dp) { struct nameicap_tracker *nt; struct mount *mp; if ((ndp->ni_lcf & NI_LCF_CAP_DOTDOT) == 0 || dp == NULL || dp->v_type != VDIR) return (0); mp = dp->v_mount; if (lookup_cap_dotdot_nonlocal == 0 && mp != NULL && (mp->mnt_flag & MNT_LOCAL) == 0) return (ENOTCAPABLE); TAILQ_FOREACH_REVERSE(nt, &ndp->ni_cap_tracker, nameicap_tracker_head, nm_link) { if (dp == nt->dp) return (0); } if ((ndp->ni_lcf & NI_LCF_BENEATH_ABS) != 0) { ndp->ni_lcf &= ~NI_LCF_BENEATH_LATCHED; nameicap_cleanup(ndp, false); return (0); } return (ENOTCAPABLE); } static void namei_cleanup_cnp(struct componentname *cnp) { uma_zfree(namei_zone, cnp->cn_pnbuf); #ifdef DIAGNOSTIC cnp->cn_pnbuf = NULL; cnp->cn_nameptr = NULL; #endif } static int namei_handle_root(struct nameidata *ndp, struct vnode **dpp, u_int n) { struct componentname *cnp; cnp = &ndp->ni_cnd; if ((ndp->ni_lcf & NI_LCF_STRICTRELATIVE) != 0) { #ifdef KTRACE if (KTRPOINT(curthread, KTR_CAPFAIL)) ktrcapfail(CAPFAIL_LOOKUP, NULL, NULL); #endif return (ENOTCAPABLE); } if ((cnp->cn_flags & BENEATH) != 0) { ndp->ni_lcf |= NI_LCF_BENEATH_ABS; ndp->ni_lcf &= ~NI_LCF_BENEATH_LATCHED; nameicap_cleanup(ndp, false); } while (*(cnp->cn_nameptr) == '/') { cnp->cn_nameptr++; ndp->ni_pathlen--; } *dpp = ndp->ni_rootdir; vrefactn(*dpp, n); return (0); } /* * Convert a pathname into a pointer to a locked vnode. * * The FOLLOW flag is set when symbolic links are to be followed * when they occur at the end of the name translation process. * Symbolic links are always followed for all other pathname * components other than the last. * * The segflg defines whether the name is to be copied from user * space or kernel space. * * Overall outline of namei: * * copy in name * get starting directory * while (!done && !error) { * call lookup to search path. * if symbolic link, massage name in buffer and continue * } */ int namei(struct nameidata *ndp) { struct filedesc *fdp; /* pointer to file descriptor state */ char *cp; /* pointer into pathname argument */ struct vnode *dp; /* the directory we are searching */ struct iovec aiov; /* uio for reading symbolic links */ struct componentname *cnp; struct file *dfp; struct thread *td; struct proc *p; cap_rights_t rights; struct filecaps dirfd_caps; struct uio auio; int error, linklen, startdir_used; cnp = &ndp->ni_cnd; td = cnp->cn_thread; p = td->td_proc; ndp->ni_cnd.cn_cred = ndp->ni_cnd.cn_thread->td_ucred; KASSERT(cnp->cn_cred && p, ("namei: bad cred/proc")); KASSERT((cnp->cn_nameiop & (~OPMASK)) == 0, ("namei: nameiop contaminated with flags")); KASSERT((cnp->cn_flags & OPMASK) == 0, ("namei: flags contaminated with nameiops")); MPASS(ndp->ni_startdir == NULL || ndp->ni_startdir->v_type == VDIR || ndp->ni_startdir->v_type == VBAD); fdp = p->p_fd; TAILQ_INIT(&ndp->ni_cap_tracker); ndp->ni_lcf = 0; /* We will set this ourselves if we need it. */ cnp->cn_flags &= ~TRAILINGSLASH; /* * Get a buffer for the name to be translated, and copy the * name into the buffer. */ if ((cnp->cn_flags & HASBUF) == 0) cnp->cn_pnbuf = uma_zalloc(namei_zone, M_WAITOK); if (ndp->ni_segflg == UIO_SYSSPACE) error = copystr(ndp->ni_dirp, cnp->cn_pnbuf, MAXPATHLEN, &ndp->ni_pathlen); else error = copyinstr(ndp->ni_dirp, cnp->cn_pnbuf, MAXPATHLEN, &ndp->ni_pathlen); /* * Don't allow empty pathnames. */ if (error == 0 && *cnp->cn_pnbuf == '\0') error = ENOENT; #ifdef CAPABILITY_MODE /* * In capability mode, lookups must be restricted to happen in * the subtree with the root specified by the file descriptor: * - The root must be real file descriptor, not the pseudo-descriptor * AT_FDCWD. * - The passed path must be relative and not absolute. * - If lookup_cap_dotdot is disabled, path must not contain the * '..' components. * - If lookup_cap_dotdot is enabled, we verify that all '..' * components lookups result in the directories which were * previously walked by us, which prevents an escape from * the relative root. */ if (error == 0 && IN_CAPABILITY_MODE(td) && (cnp->cn_flags & NOCAPCHECK) == 0) { ndp->ni_lcf |= NI_LCF_STRICTRELATIVE; if (ndp->ni_dirfd == AT_FDCWD) { #ifdef KTRACE if (KTRPOINT(td, KTR_CAPFAIL)) ktrcapfail(CAPFAIL_LOOKUP, NULL, NULL); #endif error = ECAPMODE; } } #endif if (error != 0) { namei_cleanup_cnp(cnp); ndp->ni_vp = NULL; return (error); } ndp->ni_loopcnt = 0; #ifdef KTRACE if (KTRPOINT(td, KTR_NAMEI)) { KASSERT(cnp->cn_thread == curthread, ("namei not using curthread")); ktrnamei(cnp->cn_pnbuf); } #endif /* * Get starting point for the translation. */ FILEDESC_SLOCK(fdp); /* * The reference on ni_rootdir is acquired in the block below to avoid * back-to-back atomics for absolute lookups. */ ndp->ni_rootdir = fdp->fd_rdir; ndp->ni_topdir = fdp->fd_jdir; - /* - * If we are auditing the kernel pathname, save the user pathname. - */ - if (cnp->cn_flags & AUDITVNODE1) - AUDIT_ARG_UPATH1(td, ndp->ni_dirfd, cnp->cn_pnbuf); - if (cnp->cn_flags & AUDITVNODE2) - AUDIT_ARG_UPATH2(td, ndp->ni_dirfd, cnp->cn_pnbuf); - startdir_used = 0; dp = NULL; cnp->cn_nameptr = cnp->cn_pnbuf; if (cnp->cn_pnbuf[0] == '/') { ndp->ni_resflags |= NIRES_ABS; error = namei_handle_root(ndp, &dp, 2); if (error != 0) { /* * Simplify error handling, we should almost never be * here. */ vrefact(ndp->ni_rootdir); } } else { if (ndp->ni_startdir != NULL) { vrefact(ndp->ni_rootdir); dp = ndp->ni_startdir; startdir_used = 1; } else if (ndp->ni_dirfd == AT_FDCWD) { dp = fdp->fd_cdir; if (dp == ndp->ni_rootdir) { vrefactn(dp, 2); } else { vrefact(ndp->ni_rootdir); vrefact(dp); } } else { vrefact(ndp->ni_rootdir); rights = ndp->ni_rightsneeded; cap_rights_set_one(&rights, CAP_LOOKUP); if (cnp->cn_flags & AUDITVNODE1) AUDIT_ARG_ATFD1(ndp->ni_dirfd); if (cnp->cn_flags & AUDITVNODE2) AUDIT_ARG_ATFD2(ndp->ni_dirfd); /* * Effectively inlined fgetvp_rights, because we need to * inspect the file as well as grabbing the vnode. */ error = fget_cap_locked(fdp, ndp->ni_dirfd, &rights, &dfp, &ndp->ni_filecaps); if (error != 0) { /* * Preserve the error; it should either be EBADF * or capability-related, both of which can be * safely returned to the caller. */ } else if (dfp->f_ops == &badfileops) { error = EBADF; } else if (dfp->f_vnode == NULL) { error = ENOTDIR; } else { dp = dfp->f_vnode; vrefact(dp); if ((dfp->f_flag & FSEARCH) != 0) cnp->cn_flags |= NOEXECCHECK; } #ifdef CAPABILITIES /* * If file descriptor doesn't have all rights, * all lookups relative to it must also be * strictly relative. */ CAP_ALL(&rights); if (!cap_rights_contains(&ndp->ni_filecaps.fc_rights, &rights) || ndp->ni_filecaps.fc_fcntls != CAP_FCNTL_ALL || ndp->ni_filecaps.fc_nioctls != -1) { ndp->ni_lcf |= NI_LCF_STRICTRELATIVE; } #endif } if (error == 0 && dp->v_type != VDIR) error = ENOTDIR; } if (error == 0 && (cnp->cn_flags & BENEATH) != 0) { if (ndp->ni_dirfd == AT_FDCWD) { ndp->ni_beneath_latch = fdp->fd_cdir; vrefact(ndp->ni_beneath_latch); } else { rights = ndp->ni_rightsneeded; cap_rights_set_one(&rights, CAP_LOOKUP); error = fgetvp_rights(td, ndp->ni_dirfd, &rights, &dirfd_caps, &ndp->ni_beneath_latch); if (error == 0 && dp->v_type != VDIR) { vrele(ndp->ni_beneath_latch); error = ENOTDIR; } } if (error == 0) ndp->ni_lcf |= NI_LCF_LATCH; } FILEDESC_SUNLOCK(fdp); + /* + * If we are auditing the kernel pathname, save the user pathname. + */ + if (cnp->cn_flags & AUDITVNODE1) + AUDIT_ARG_UPATH1_VP(td, ndp->ni_rootdir, dp, cnp->cn_pnbuf); + if (cnp->cn_flags & AUDITVNODE2) + AUDIT_ARG_UPATH2_VP(td, ndp->ni_rootdir, dp, cnp->cn_pnbuf); if (ndp->ni_startdir != NULL && !startdir_used) vrele(ndp->ni_startdir); if (error != 0) { if (dp != NULL) vrele(dp); goto out; } MPASS((ndp->ni_lcf & (NI_LCF_BENEATH_ABS | NI_LCF_LATCH)) != NI_LCF_BENEATH_ABS); if (((ndp->ni_lcf & NI_LCF_STRICTRELATIVE) != 0 && lookup_cap_dotdot != 0) || ((ndp->ni_lcf & NI_LCF_STRICTRELATIVE) == 0 && (cnp->cn_flags & BENEATH) != 0)) ndp->ni_lcf |= NI_LCF_CAP_DOTDOT; SDT_PROBE3(vfs, namei, lookup, entry, dp, cnp->cn_pnbuf, cnp->cn_flags); for (;;) { ndp->ni_startdir = dp; error = lookup(ndp); if (error != 0) goto out; /* * If not a symbolic link, we're done. */ if ((cnp->cn_flags & ISSYMLINK) == 0) { vrele(ndp->ni_rootdir); if ((cnp->cn_flags & (SAVENAME | SAVESTART)) == 0) { namei_cleanup_cnp(cnp); } else cnp->cn_flags |= HASBUF; if ((ndp->ni_lcf & (NI_LCF_BENEATH_ABS | NI_LCF_BENEATH_LATCHED)) == NI_LCF_BENEATH_ABS) { NDFREE(ndp, 0); error = ENOTCAPABLE; } nameicap_cleanup(ndp, true); SDT_PROBE2(vfs, namei, lookup, return, error, (error == 0 ? ndp->ni_vp : NULL)); return (error); } if (ndp->ni_loopcnt++ >= MAXSYMLINKS) { error = ELOOP; break; } #ifdef MAC if ((cnp->cn_flags & NOMACCHECK) == 0) { error = mac_vnode_check_readlink(td->td_ucred, ndp->ni_vp); if (error != 0) break; } #endif if (ndp->ni_pathlen > 1) cp = uma_zalloc(namei_zone, M_WAITOK); else cp = cnp->cn_pnbuf; aiov.iov_base = cp; aiov.iov_len = MAXPATHLEN; auio.uio_iov = &aiov; auio.uio_iovcnt = 1; auio.uio_offset = 0; auio.uio_rw = UIO_READ; auio.uio_segflg = UIO_SYSSPACE; auio.uio_td = td; auio.uio_resid = MAXPATHLEN; error = VOP_READLINK(ndp->ni_vp, &auio, cnp->cn_cred); if (error != 0) { if (ndp->ni_pathlen > 1) uma_zfree(namei_zone, cp); break; } linklen = MAXPATHLEN - auio.uio_resid; if (linklen == 0) { if (ndp->ni_pathlen > 1) uma_zfree(namei_zone, cp); error = ENOENT; break; } if (linklen + ndp->ni_pathlen > MAXPATHLEN) { if (ndp->ni_pathlen > 1) uma_zfree(namei_zone, cp); error = ENAMETOOLONG; break; } if (ndp->ni_pathlen > 1) { bcopy(ndp->ni_next, cp + linklen, ndp->ni_pathlen); uma_zfree(namei_zone, cnp->cn_pnbuf); cnp->cn_pnbuf = cp; } else cnp->cn_pnbuf[linklen] = '\0'; ndp->ni_pathlen += linklen; vput(ndp->ni_vp); dp = ndp->ni_dvp; /* * Check if root directory should replace current directory. */ cnp->cn_nameptr = cnp->cn_pnbuf; if (*(cnp->cn_nameptr) == '/') { vrele(dp); error = namei_handle_root(ndp, &dp, 1); if (error != 0) goto out; } } vput(ndp->ni_vp); ndp->ni_vp = NULL; vrele(ndp->ni_dvp); out: vrele(ndp->ni_rootdir); MPASS(error != 0); namei_cleanup_cnp(cnp); nameicap_cleanup(ndp, true); SDT_PROBE2(vfs, namei, lookup, return, error, NULL); return (error); } static int compute_cn_lkflags(struct mount *mp, int lkflags, int cnflags) { if (mp == NULL || ((lkflags & LK_SHARED) && (!(mp->mnt_kern_flag & MNTK_LOOKUP_SHARED) || ((cnflags & ISDOTDOT) && (mp->mnt_kern_flag & MNTK_LOOKUP_EXCL_DOTDOT))))) { lkflags &= ~LK_SHARED; lkflags |= LK_EXCLUSIVE; } lkflags |= LK_NODDLKTREAT; return (lkflags); } static __inline int needs_exclusive_leaf(struct mount *mp, int flags) { /* * Intermediate nodes can use shared locks, we only need to * force an exclusive lock for leaf nodes. */ if ((flags & (ISLASTCN | LOCKLEAF)) != (ISLASTCN | LOCKLEAF)) return (0); /* Always use exclusive locks if LOCKSHARED isn't set. */ if (!(flags & LOCKSHARED)) return (1); /* * For lookups during open(), if the mount point supports * extended shared operations, then use a shared lock for the * leaf node, otherwise use an exclusive lock. */ if ((flags & ISOPEN) != 0) return (!MNT_EXTENDED_SHARED(mp)); /* * Lookup requests outside of open() that specify LOCKSHARED * only need a shared lock on the leaf vnode. */ return (0); } /* * Search a pathname. * This is a very central and rather complicated routine. * * The pathname is pointed to by ni_ptr and is of length ni_pathlen. * The starting directory is taken from ni_startdir. The pathname is * descended until done, or a symbolic link is encountered. The variable * ni_more is clear if the path is completed; it is set to one if a * symbolic link needing interpretation is encountered. * * The flag argument is LOOKUP, CREATE, RENAME, or DELETE depending on * whether the name is to be looked up, created, renamed, or deleted. * When CREATE, RENAME, or DELETE is specified, information usable in * creating, renaming, or deleting a directory entry may be calculated. * If flag has LOCKPARENT or'ed into it, the parent directory is returned * locked. If flag has WANTPARENT or'ed into it, the parent directory is * returned unlocked. Otherwise the parent directory is not returned. If * the target of the pathname exists and LOCKLEAF is or'ed into the flag * the target is returned locked, otherwise it is returned unlocked. * When creating or renaming and LOCKPARENT is specified, the target may not * be ".". When deleting and LOCKPARENT is specified, the target may be ".". * * Overall outline of lookup: * * dirloop: * identify next component of name at ndp->ni_ptr * handle degenerate case where name is null string * if .. and crossing mount points and on mounted filesys, find parent * call VOP_LOOKUP routine for next component name * directory vnode returned in ni_dvp, unlocked unless LOCKPARENT set * component vnode returned in ni_vp (if it exists), locked. * if result vnode is mounted on and crossing mount points, * find mounted on vnode * if more components of name, do next level at dirloop * return the answer in ni_vp, locked if LOCKLEAF set * if LOCKPARENT set, return locked parent in ni_dvp * if WANTPARENT set, return unlocked parent in ni_dvp */ int lookup(struct nameidata *ndp) { char *cp; /* pointer into pathname argument */ char *prev_ni_next; /* saved ndp->ni_next */ struct vnode *dp = NULL; /* the directory we are searching */ struct vnode *tdp; /* saved dp */ struct mount *mp; /* mount table entry */ struct prison *pr; size_t prev_ni_pathlen; /* saved ndp->ni_pathlen */ int docache; /* == 0 do not cache last component */ int wantparent; /* 1 => wantparent or lockparent flag */ int rdonly; /* lookup read-only flag bit */ int error = 0; int dpunlocked = 0; /* dp has already been unlocked */ int relookup = 0; /* do not consume the path component */ struct componentname *cnp = &ndp->ni_cnd; int lkflags_save; int ni_dvp_unlocked; /* * Setup: break out flag bits into variables. */ ni_dvp_unlocked = 0; wantparent = cnp->cn_flags & (LOCKPARENT | WANTPARENT); KASSERT(cnp->cn_nameiop == LOOKUP || wantparent, ("CREATE, DELETE, RENAME require LOCKPARENT or WANTPARENT.")); docache = (cnp->cn_flags & NOCACHE) ^ NOCACHE; if (cnp->cn_nameiop == DELETE || (wantparent && cnp->cn_nameiop != CREATE && cnp->cn_nameiop != LOOKUP)) docache = 0; rdonly = cnp->cn_flags & RDONLY; cnp->cn_flags &= ~ISSYMLINK; ndp->ni_dvp = NULL; /* * We use shared locks until we hit the parent of the last cn then * we adjust based on the requesting flags. */ cnp->cn_lkflags = LK_SHARED; dp = ndp->ni_startdir; ndp->ni_startdir = NULLVP; vn_lock(dp, compute_cn_lkflags(dp->v_mount, cnp->cn_lkflags | LK_RETRY, cnp->cn_flags)); dirloop: /* * Search a new directory. * * The last component of the filename is left accessible via * cnp->cn_nameptr for callers that need the name. Callers needing * the name set the SAVENAME flag. When done, they assume * responsibility for freeing the pathname buffer. */ for (cp = cnp->cn_nameptr; *cp != 0 && *cp != '/'; cp++) continue; cnp->cn_namelen = cp - cnp->cn_nameptr; if (cnp->cn_namelen > NAME_MAX) { error = ENAMETOOLONG; goto bad; } #ifdef NAMEI_DIAGNOSTIC { char c = *cp; *cp = '\0'; printf("{%s}: ", cnp->cn_nameptr); *cp = c; } #endif prev_ni_pathlen = ndp->ni_pathlen; ndp->ni_pathlen -= cnp->cn_namelen; KASSERT(ndp->ni_pathlen <= PATH_MAX, ("%s: ni_pathlen underflow to %zd\n", __func__, ndp->ni_pathlen)); prev_ni_next = ndp->ni_next; ndp->ni_next = cp; /* * Replace multiple slashes by a single slash and trailing slashes * by a null. This must be done before VOP_LOOKUP() because some * fs's don't know about trailing slashes. Remember if there were * trailing slashes to handle symlinks, existing non-directories * and non-existing files that won't be directories specially later. */ while (*cp == '/' && (cp[1] == '/' || cp[1] == '\0')) { cp++; ndp->ni_pathlen--; if (*cp == '\0') { *ndp->ni_next = '\0'; cnp->cn_flags |= TRAILINGSLASH; } } ndp->ni_next = cp; cnp->cn_flags |= MAKEENTRY; if (*cp == '\0' && docache == 0) cnp->cn_flags &= ~MAKEENTRY; if (cnp->cn_namelen == 2 && cnp->cn_nameptr[1] == '.' && cnp->cn_nameptr[0] == '.') cnp->cn_flags |= ISDOTDOT; else cnp->cn_flags &= ~ISDOTDOT; if (*ndp->ni_next == 0) cnp->cn_flags |= ISLASTCN; else cnp->cn_flags &= ~ISLASTCN; if ((cnp->cn_flags & ISLASTCN) != 0 && cnp->cn_namelen == 1 && cnp->cn_nameptr[0] == '.' && (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME)) { error = EINVAL; goto bad; } nameicap_tracker_add(ndp, dp); /* * Check for degenerate name (e.g. / or "") * which is a way of talking about a directory, * e.g. like "/." or ".". */ if (cnp->cn_nameptr[0] == '\0') { if (dp->v_type != VDIR) { error = ENOTDIR; goto bad; } if (cnp->cn_nameiop != LOOKUP) { error = EISDIR; goto bad; } if (wantparent) { ndp->ni_dvp = dp; VREF(dp); } ndp->ni_vp = dp; if (cnp->cn_flags & AUDITVNODE1) AUDIT_ARG_VNODE1(dp); else if (cnp->cn_flags & AUDITVNODE2) AUDIT_ARG_VNODE2(dp); if (!(cnp->cn_flags & (LOCKPARENT | LOCKLEAF))) VOP_UNLOCK(dp); /* XXX This should probably move to the top of function. */ if (cnp->cn_flags & SAVESTART) panic("lookup: SAVESTART"); goto success; } /* * Handle "..": five special cases. * 0. If doing a capability lookup and lookup_cap_dotdot is * disabled, return ENOTCAPABLE. * 1. Return an error if this is the last component of * the name and the operation is DELETE or RENAME. * 2. If at root directory (e.g. after chroot) * or at absolute root directory * then ignore it so can't get out. * 3. If this vnode is the root of a mounted * filesystem, then replace it with the * vnode which was mounted on so we take the * .. in the other filesystem. * 4. If the vnode is the top directory of * the jail or chroot, don't let them out. * 5. If doing a capability lookup and lookup_cap_dotdot is * enabled, return ENOTCAPABLE if the lookup would escape * from the initial file descriptor directory. Checks are * done by ensuring that namei() already traversed the * result of dotdot lookup. */ if (cnp->cn_flags & ISDOTDOT) { if ((ndp->ni_lcf & (NI_LCF_STRICTRELATIVE | NI_LCF_CAP_DOTDOT)) == NI_LCF_STRICTRELATIVE) { #ifdef KTRACE if (KTRPOINT(curthread, KTR_CAPFAIL)) ktrcapfail(CAPFAIL_LOOKUP, NULL, NULL); #endif error = ENOTCAPABLE; goto bad; } if ((cnp->cn_flags & ISLASTCN) != 0 && (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME)) { error = EINVAL; goto bad; } for (;;) { for (pr = cnp->cn_cred->cr_prison; pr != NULL; pr = pr->pr_parent) if (dp == pr->pr_root) break; if (dp == ndp->ni_rootdir || dp == ndp->ni_topdir || dp == rootvnode || pr != NULL || ((dp->v_vflag & VV_ROOT) != 0 && (cnp->cn_flags & NOCROSSMOUNT) != 0)) { ndp->ni_dvp = dp; ndp->ni_vp = dp; VREF(dp); goto nextname; } if ((dp->v_vflag & VV_ROOT) == 0) break; if (VN_IS_DOOMED(dp)) { /* forced unmount */ error = ENOENT; goto bad; } tdp = dp; dp = dp->v_mount->mnt_vnodecovered; VREF(dp); vput(tdp); vn_lock(dp, compute_cn_lkflags(dp->v_mount, cnp->cn_lkflags | LK_RETRY, ISDOTDOT)); error = nameicap_check_dotdot(ndp, dp); if (error != 0) { #ifdef KTRACE if (KTRPOINT(curthread, KTR_CAPFAIL)) ktrcapfail(CAPFAIL_LOOKUP, NULL, NULL); #endif goto bad; } } } /* * We now have a segment name to search for, and a directory to search. */ unionlookup: #ifdef MAC error = mac_vnode_check_lookup(cnp->cn_thread->td_ucred, dp, cnp); if (error) goto bad; #endif ndp->ni_dvp = dp; ndp->ni_vp = NULL; ASSERT_VOP_LOCKED(dp, "lookup"); /* * If we have a shared lock we may need to upgrade the lock for the * last operation. */ if ((cnp->cn_flags & LOCKPARENT) && (cnp->cn_flags & ISLASTCN) && dp != vp_crossmp && VOP_ISLOCKED(dp) == LK_SHARED) vn_lock(dp, LK_UPGRADE|LK_RETRY); if (VN_IS_DOOMED(dp)) { error = ENOENT; goto bad; } /* * If we're looking up the last component and we need an exclusive * lock, adjust our lkflags. */ if (needs_exclusive_leaf(dp->v_mount, cnp->cn_flags)) cnp->cn_lkflags = LK_EXCLUSIVE; #ifdef NAMEI_DIAGNOSTIC vn_printf(dp, "lookup in "); #endif lkflags_save = cnp->cn_lkflags; cnp->cn_lkflags = compute_cn_lkflags(dp->v_mount, cnp->cn_lkflags, cnp->cn_flags); error = VOP_LOOKUP(dp, &ndp->ni_vp, cnp); cnp->cn_lkflags = lkflags_save; if (error != 0) { KASSERT(ndp->ni_vp == NULL, ("leaf should be empty")); #ifdef NAMEI_DIAGNOSTIC printf("not found\n"); #endif if ((error == ENOENT) && (dp->v_vflag & VV_ROOT) && (dp->v_mount != NULL) && (dp->v_mount->mnt_flag & MNT_UNION)) { tdp = dp; dp = dp->v_mount->mnt_vnodecovered; VREF(dp); vput(tdp); vn_lock(dp, compute_cn_lkflags(dp->v_mount, cnp->cn_lkflags | LK_RETRY, cnp->cn_flags)); nameicap_tracker_add(ndp, dp); goto unionlookup; } if (error == ERELOOKUP) { vref(dp); ndp->ni_vp = dp; error = 0; relookup = 1; goto good; } if (error != EJUSTRETURN) goto bad; /* * At this point, we know we're at the end of the * pathname. If creating / renaming, we can consider * allowing the file or directory to be created / renamed, * provided we're not on a read-only filesystem. */ if (rdonly) { error = EROFS; goto bad; } /* trailing slash only allowed for directories */ if ((cnp->cn_flags & TRAILINGSLASH) && !(cnp->cn_flags & WILLBEDIR)) { error = ENOENT; goto bad; } if ((cnp->cn_flags & LOCKPARENT) == 0) VOP_UNLOCK(dp); /* * We return with ni_vp NULL to indicate that the entry * doesn't currently exist, leaving a pointer to the * (possibly locked) directory vnode in ndp->ni_dvp. */ if (cnp->cn_flags & SAVESTART) { ndp->ni_startdir = ndp->ni_dvp; VREF(ndp->ni_startdir); } goto success; } good: #ifdef NAMEI_DIAGNOSTIC printf("found\n"); #endif dp = ndp->ni_vp; /* * Check to see if the vnode has been mounted on; * if so find the root of the mounted filesystem. */ while (dp->v_type == VDIR && (mp = dp->v_mountedhere) && (cnp->cn_flags & NOCROSSMOUNT) == 0) { if (vfs_busy(mp, 0)) continue; vput(dp); if (dp != ndp->ni_dvp) vput(ndp->ni_dvp); else vrele(ndp->ni_dvp); vrefact(vp_crossmp); ndp->ni_dvp = vp_crossmp; error = VFS_ROOT(mp, compute_cn_lkflags(mp, cnp->cn_lkflags, cnp->cn_flags), &tdp); vfs_unbusy(mp); if (vn_lock(vp_crossmp, LK_SHARED | LK_NOWAIT)) panic("vp_crossmp exclusively locked or reclaimed"); if (error) { dpunlocked = 1; goto bad2; } ndp->ni_vp = dp = tdp; } /* * Check for symbolic link */ if ((dp->v_type == VLNK) && ((cnp->cn_flags & FOLLOW) || (cnp->cn_flags & TRAILINGSLASH) || *ndp->ni_next == '/')) { cnp->cn_flags |= ISSYMLINK; if (VN_IS_DOOMED(dp)) { /* * We can't know whether the directory was mounted with * NOSYMFOLLOW, so we can't follow safely. */ error = ENOENT; goto bad2; } if (dp->v_mount->mnt_flag & MNT_NOSYMFOLLOW) { error = EACCES; goto bad2; } /* * Symlink code always expects an unlocked dvp. */ if (ndp->ni_dvp != ndp->ni_vp) { VOP_UNLOCK(ndp->ni_dvp); ni_dvp_unlocked = 1; } goto success; } nextname: /* * Not a symbolic link that we will follow. Continue with the * next component if there is any; otherwise, we're done. */ KASSERT((cnp->cn_flags & ISLASTCN) || *ndp->ni_next == '/', ("lookup: invalid path state.")); if (relookup) { relookup = 0; ndp->ni_pathlen = prev_ni_pathlen; ndp->ni_next = prev_ni_next; if (ndp->ni_dvp != dp) vput(ndp->ni_dvp); else vrele(ndp->ni_dvp); goto dirloop; } if (cnp->cn_flags & ISDOTDOT) { error = nameicap_check_dotdot(ndp, ndp->ni_vp); if (error != 0) { #ifdef KTRACE if (KTRPOINT(curthread, KTR_CAPFAIL)) ktrcapfail(CAPFAIL_LOOKUP, NULL, NULL); #endif goto bad2; } } if (*ndp->ni_next == '/') { cnp->cn_nameptr = ndp->ni_next; while (*cnp->cn_nameptr == '/') { cnp->cn_nameptr++; ndp->ni_pathlen--; } if (ndp->ni_dvp != dp) vput(ndp->ni_dvp); else vrele(ndp->ni_dvp); goto dirloop; } /* * If we're processing a path with a trailing slash, * check that the end result is a directory. */ if ((cnp->cn_flags & TRAILINGSLASH) && dp->v_type != VDIR) { error = ENOTDIR; goto bad2; } /* * Disallow directory write attempts on read-only filesystems. */ if (rdonly && (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME)) { error = EROFS; goto bad2; } if (cnp->cn_flags & SAVESTART) { ndp->ni_startdir = ndp->ni_dvp; VREF(ndp->ni_startdir); } if (!wantparent) { ni_dvp_unlocked = 2; if (ndp->ni_dvp != dp) vput(ndp->ni_dvp); else vrele(ndp->ni_dvp); } else if ((cnp->cn_flags & LOCKPARENT) == 0 && ndp->ni_dvp != dp) { VOP_UNLOCK(ndp->ni_dvp); ni_dvp_unlocked = 1; } if (cnp->cn_flags & AUDITVNODE1) AUDIT_ARG_VNODE1(dp); else if (cnp->cn_flags & AUDITVNODE2) AUDIT_ARG_VNODE2(dp); if ((cnp->cn_flags & LOCKLEAF) == 0) VOP_UNLOCK(dp); success: /* * Because of shared lookup we may have the vnode shared locked, but * the caller may want it to be exclusively locked. */ if (needs_exclusive_leaf(dp->v_mount, cnp->cn_flags) && VOP_ISLOCKED(dp) != LK_EXCLUSIVE) { vn_lock(dp, LK_UPGRADE | LK_RETRY); if (VN_IS_DOOMED(dp)) { error = ENOENT; goto bad2; } } return (0); bad2: if (ni_dvp_unlocked != 2) { if (dp != ndp->ni_dvp && !ni_dvp_unlocked) vput(ndp->ni_dvp); else vrele(ndp->ni_dvp); } bad: if (!dpunlocked) vput(dp); ndp->ni_vp = NULL; return (error); } /* * relookup - lookup a path name component * Used by lookup to re-acquire things. */ int relookup(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp) { struct vnode *dp = NULL; /* the directory we are searching */ int wantparent; /* 1 => wantparent or lockparent flag */ int rdonly; /* lookup read-only flag bit */ int error = 0; KASSERT(cnp->cn_flags & ISLASTCN, ("relookup: Not given last component.")); /* * Setup: break out flag bits into variables. */ wantparent = cnp->cn_flags & (LOCKPARENT|WANTPARENT); KASSERT(wantparent, ("relookup: parent not wanted.")); rdonly = cnp->cn_flags & RDONLY; cnp->cn_flags &= ~ISSYMLINK; dp = dvp; cnp->cn_lkflags = LK_EXCLUSIVE; vn_lock(dp, LK_EXCLUSIVE | LK_RETRY); /* * Search a new directory. * * The last component of the filename is left accessible via * cnp->cn_nameptr for callers that need the name. Callers needing * the name set the SAVENAME flag. When done, they assume * responsibility for freeing the pathname buffer. */ #ifdef NAMEI_DIAGNOSTIC printf("{%s}: ", cnp->cn_nameptr); #endif /* * Check for "" which represents the root directory after slash * removal. */ if (cnp->cn_nameptr[0] == '\0') { /* * Support only LOOKUP for "/" because lookup() * can't succeed for CREATE, DELETE and RENAME. */ KASSERT(cnp->cn_nameiop == LOOKUP, ("nameiop must be LOOKUP")); KASSERT(dp->v_type == VDIR, ("dp is not a directory")); if (!(cnp->cn_flags & LOCKLEAF)) VOP_UNLOCK(dp); *vpp = dp; /* XXX This should probably move to the top of function. */ if (cnp->cn_flags & SAVESTART) panic("lookup: SAVESTART"); return (0); } if (cnp->cn_flags & ISDOTDOT) panic ("relookup: lookup on dot-dot"); /* * We now have a segment name to search for, and a directory to search. */ #ifdef NAMEI_DIAGNOSTIC vn_printf(dp, "search in "); #endif if ((error = VOP_LOOKUP(dp, vpp, cnp)) != 0) { KASSERT(*vpp == NULL, ("leaf should be empty")); if (error != EJUSTRETURN) goto bad; /* * If creating and at end of pathname, then can consider * allowing file to be created. */ if (rdonly) { error = EROFS; goto bad; } /* ASSERT(dvp == ndp->ni_startdir) */ if (cnp->cn_flags & SAVESTART) VREF(dvp); if ((cnp->cn_flags & LOCKPARENT) == 0) VOP_UNLOCK(dp); /* * We return with ni_vp NULL to indicate that the entry * doesn't currently exist, leaving a pointer to the * (possibly locked) directory vnode in ndp->ni_dvp. */ return (0); } dp = *vpp; /* * Disallow directory write attempts on read-only filesystems. */ if (rdonly && (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME)) { if (dvp == dp) vrele(dvp); else vput(dvp); error = EROFS; goto bad; } /* * Set the parent lock/ref state to the requested state. */ if ((cnp->cn_flags & LOCKPARENT) == 0 && dvp != dp) { if (wantparent) VOP_UNLOCK(dvp); else vput(dvp); } else if (!wantparent) vrele(dvp); /* * Check for symbolic link */ KASSERT(dp->v_type != VLNK || !(cnp->cn_flags & FOLLOW), ("relookup: symlink found.\n")); /* ASSERT(dvp == ndp->ni_startdir) */ if (cnp->cn_flags & SAVESTART) VREF(dvp); if ((cnp->cn_flags & LOCKLEAF) == 0) VOP_UNLOCK(dp); return (0); bad: vput(dp); *vpp = NULL; return (error); } void NDINIT_ALL(struct nameidata *ndp, u_long op, u_long flags, enum uio_seg segflg, const char *namep, int dirfd, struct vnode *startdir, cap_rights_t *rightsp, struct thread *td) { ndp->ni_cnd.cn_nameiop = op; ndp->ni_cnd.cn_flags = flags; ndp->ni_segflg = segflg; ndp->ni_dirp = namep; ndp->ni_dirfd = dirfd; ndp->ni_startdir = startdir; ndp->ni_resflags = 0; filecaps_init(&ndp->ni_filecaps); ndp->ni_cnd.cn_thread = td; if (rightsp != NULL) ndp->ni_rightsneeded = *rightsp; else cap_rights_init_zero(&ndp->ni_rightsneeded); } /* * Free data allocated by namei(); see namei(9) for details. */ void NDFREE(struct nameidata *ndp, const u_int flags) { int unlock_dvp; int unlock_vp; unlock_dvp = 0; unlock_vp = 0; if (!(flags & NDF_NO_FREE_PNBUF) && (ndp->ni_cnd.cn_flags & HASBUF)) { uma_zfree(namei_zone, ndp->ni_cnd.cn_pnbuf); ndp->ni_cnd.cn_flags &= ~HASBUF; } if (!(flags & NDF_NO_VP_UNLOCK) && (ndp->ni_cnd.cn_flags & LOCKLEAF) && ndp->ni_vp) unlock_vp = 1; if (!(flags & NDF_NO_DVP_UNLOCK) && (ndp->ni_cnd.cn_flags & LOCKPARENT) && ndp->ni_dvp != ndp->ni_vp) unlock_dvp = 1; if (!(flags & NDF_NO_VP_RELE) && ndp->ni_vp) { if (unlock_vp) { vput(ndp->ni_vp); unlock_vp = 0; } else vrele(ndp->ni_vp); ndp->ni_vp = NULL; } if (unlock_vp) VOP_UNLOCK(ndp->ni_vp); if (!(flags & NDF_NO_DVP_RELE) && (ndp->ni_cnd.cn_flags & (LOCKPARENT|WANTPARENT))) { if (unlock_dvp) { vput(ndp->ni_dvp); unlock_dvp = 0; } else vrele(ndp->ni_dvp); ndp->ni_dvp = NULL; } if (unlock_dvp) VOP_UNLOCK(ndp->ni_dvp); if (!(flags & NDF_NO_STARTDIR_RELE) && (ndp->ni_cnd.cn_flags & SAVESTART)) { vrele(ndp->ni_startdir); ndp->ni_startdir = NULL; } } /* * Determine if there is a suitable alternate filename under the specified * prefix for the specified path. If the create flag is set, then the * alternate prefix will be used so long as the parent directory exists. * This is used by the various compatibility ABIs so that Linux binaries prefer * files under /compat/linux for example. The chosen path (whether under * the prefix or under /) is returned in a kernel malloc'd buffer pointed * to by pathbuf. The caller is responsible for free'ing the buffer from * the M_TEMP bucket if one is returned. */ int kern_alternate_path(struct thread *td, const char *prefix, const char *path, enum uio_seg pathseg, char **pathbuf, int create, int dirfd) { struct nameidata nd, ndroot; char *ptr, *buf, *cp; size_t len, sz; int error; buf = (char *) malloc(MAXPATHLEN, M_TEMP, M_WAITOK); *pathbuf = buf; /* Copy the prefix into the new pathname as a starting point. */ len = strlcpy(buf, prefix, MAXPATHLEN); if (len >= MAXPATHLEN) { *pathbuf = NULL; free(buf, M_TEMP); return (EINVAL); } sz = MAXPATHLEN - len; ptr = buf + len; /* Append the filename to the prefix. */ if (pathseg == UIO_SYSSPACE) error = copystr(path, ptr, sz, &len); else error = copyinstr(path, ptr, sz, &len); if (error) { *pathbuf = NULL; free(buf, M_TEMP); return (error); } /* Only use a prefix with absolute pathnames. */ if (*ptr != '/') { error = EINVAL; goto keeporig; } if (dirfd != AT_FDCWD) { /* * We want the original because the "prefix" is * included in the already opened dirfd. */ bcopy(ptr, buf, len); return (0); } /* * We know that there is a / somewhere in this pathname. * Search backwards for it, to find the file's parent dir * to see if it exists in the alternate tree. If it does, * and we want to create a file (cflag is set). We don't * need to worry about the root comparison in this case. */ if (create) { for (cp = &ptr[len] - 1; *cp != '/'; cp--); *cp = '\0'; NDINIT(&nd, LOOKUP, NOFOLLOW, UIO_SYSSPACE, buf, td); error = namei(&nd); *cp = '/'; if (error != 0) goto keeporig; } else { NDINIT(&nd, LOOKUP, NOFOLLOW, UIO_SYSSPACE, buf, td); error = namei(&nd); if (error != 0) goto keeporig; /* * We now compare the vnode of the prefix to the one * vnode asked. If they resolve to be the same, then we * ignore the match so that the real root gets used. * This avoids the problem of traversing "../.." to find the * root directory and never finding it, because "/" resolves * to the emulation root directory. This is expensive :-( */ NDINIT(&ndroot, LOOKUP, FOLLOW, UIO_SYSSPACE, prefix, td); /* We shouldn't ever get an error from this namei(). */ error = namei(&ndroot); if (error == 0) { if (nd.ni_vp == ndroot.ni_vp) error = ENOENT; NDFREE(&ndroot, NDF_ONLY_PNBUF); vrele(ndroot.ni_vp); } } NDFREE(&nd, NDF_ONLY_PNBUF); vrele(nd.ni_vp); keeporig: /* If there was an error, use the original path name. */ if (error) bcopy(ptr, buf, len); return (error); } Index: projects/clang1000-import/sys/modules/linuxkpi/Makefile =================================================================== --- projects/clang1000-import/sys/modules/linuxkpi/Makefile (revision 358238) +++ projects/clang1000-import/sys/modules/linuxkpi/Makefile (revision 358239) @@ -1,34 +1,35 @@ # $FreeBSD$ .PATH: ${SRCTOP}/sys/compat/linuxkpi/common/src KMOD= linuxkpi SRCS= linux_compat.c \ linux_current.c \ linux_hrtimer.c \ linux_idr.c \ linux_kmod.c \ linux_kthread.c \ linux_lock.c \ linux_page.c \ linux_pci.c \ linux_radix.c \ linux_rcu.c \ linux_seq_file.c \ linux_schedule.c \ + linux_shmemfs.c \ linux_slab.c \ linux_tasklet.c \ linux_usb.c \ linux_work.c SRCS+= bus_if.h \ device_if.h \ pci_if.h \ vnode_if.h \ usb_if.h \ opt_usb.h \ opt_stack.h CFLAGS+= -I${SRCTOP}/sys/compat/linuxkpi/common/include CFLAGS+= -I${SRCTOP}/sys/contrib/ck/include .include Index: projects/clang1000-import/sys/net80211/ieee80211_alq.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_alq.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_alq.c (revision 358239) @@ -1,177 +1,179 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2011 Adrian Chadd, Xenion Lty Ltd * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #ifdef __FreeBSD__ __FBSDID("$FreeBSD$"); #endif /* * net80211 fast-logging support, primarily for debugging. * * This implements a single debugging queue which includes * per-device enumeration where needed. */ #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static struct alq *ieee80211_alq; static int ieee80211_alq_lost; static int ieee80211_alq_logged; static char ieee80211_alq_logfile[MAXPATHLEN] = "/tmp/net80211.log"; static unsigned int ieee80211_alq_qsize = 64*1024; static int ieee80211_alq_setlogging(int enable) { int error; if (enable) { if (ieee80211_alq) alq_close(ieee80211_alq); error = alq_open(&ieee80211_alq, ieee80211_alq_logfile, curthread->td_ucred, ALQ_DEFAULT_CMODE, ieee80211_alq_qsize, 0); ieee80211_alq_lost = 0; ieee80211_alq_logged = 0; printf("net80211: logging to %s enabled; " "struct size %d bytes\n", ieee80211_alq_logfile, (int) sizeof(struct ieee80211_alq_rec)); } else { if (ieee80211_alq) alq_close(ieee80211_alq); ieee80211_alq = NULL; printf("net80211: logging disabled\n"); error = 0; } return (error); } static int sysctl_ieee80211_alq_log(SYSCTL_HANDLER_ARGS) { int error, enable; enable = (ieee80211_alq != NULL); error = sysctl_handle_int(oidp, &enable, 0, req); if (error || !req->newptr) return (error); else return (ieee80211_alq_setlogging(enable)); } -SYSCTL_PROC(_net_wlan, OID_AUTO, alq, CTLTYPE_INT|CTLFLAG_RW, - 0, 0, sysctl_ieee80211_alq_log, "I", "Enable net80211 alq logging"); +SYSCTL_PROC(_net_wlan, OID_AUTO, alq, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, 0, 0, + sysctl_ieee80211_alq_log, "I", + "Enable net80211 alq logging"); SYSCTL_INT(_net_wlan, OID_AUTO, alq_size, CTLFLAG_RW, &ieee80211_alq_qsize, 0, "In-memory log size (bytes)"); SYSCTL_INT(_net_wlan, OID_AUTO, alq_lost, CTLFLAG_RW, &ieee80211_alq_lost, 0, "Debugging operations not logged"); SYSCTL_INT(_net_wlan, OID_AUTO, alq_logged, CTLFLAG_RW, &ieee80211_alq_logged, 0, "Debugging operations logged"); static struct ale * ieee80211_alq_get(size_t len) { struct ale *ale; ale = alq_getn(ieee80211_alq, len + sizeof(struct ieee80211_alq_rec), ALQ_NOWAIT); if (!ale) ieee80211_alq_lost++; else ieee80211_alq_logged++; return ale; } int ieee80211_alq_log(struct ieee80211com *ic, struct ieee80211vap *vap, uint32_t op, uint32_t flags, uint16_t srcid, const uint8_t *src, size_t len) { struct ale *ale; struct ieee80211_alq_rec *r; char *dst; /* Don't log if we're disabled */ if (ieee80211_alq == NULL) return (0); if (len > IEEE80211_ALQ_MAX_PAYLOAD) return (ENOMEM); ale = ieee80211_alq_get(len); if (! ale) return (ENOMEM); r = (struct ieee80211_alq_rec *) ale->ae_data; dst = ((char *) r) + sizeof(struct ieee80211_alq_rec); r->r_timestamp = htobe64(ticks); if (vap != NULL) { r->r_wlan = htobe16(vap->iv_ifp->if_dunit); } else { r->r_wlan = 0xffff; } r->r_src = htobe16(srcid); r->r_flags = htobe32(flags); r->r_op = htobe32(op); r->r_len = htobe32(len + sizeof(struct ieee80211_alq_rec)); r->r_threadid = htobe32((uint32_t) curthread->td_tid); if (src != NULL) memcpy(dst, src, len); alq_post(ieee80211_alq, ale); return (0); } Index: projects/clang1000-import/sys/net80211/ieee80211_amrr.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_amrr.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_amrr.c (revision 358239) @@ -1,514 +1,514 @@ /* $OpenBSD: ieee80211_amrr.c,v 1.1 2006/06/17 19:07:19 damien Exp $ */ /*- * Copyright (c) 2010 Rui Paulo * Copyright (c) 2006 * Damien Bergamini * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #include __FBSDID("$FreeBSD$"); /*- * Naive implementation of the Adaptive Multi Rate Retry algorithm: * * "IEEE 802.11 Rate Adaptation: A Practical Approach" * Mathieu Lacage, Hossein Manshaei, Thierry Turletti * INRIA Sophia - Projet Planete * http://www-sop.inria.fr/rapports/sophia/RR-5208.html */ #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #ifdef INET #include #include #endif #include #include #include #include #define is_success(amn) \ ((amn)->amn_retrycnt < (amn)->amn_txcnt / 10) #define is_failure(amn) \ ((amn)->amn_retrycnt > (amn)->amn_txcnt / 3) #define is_enough(amn) \ ((amn)->amn_txcnt > 10) static void amrr_setinterval(const struct ieee80211vap *, int); static void amrr_init(struct ieee80211vap *); static void amrr_deinit(struct ieee80211vap *); static void amrr_node_init(struct ieee80211_node *); static void amrr_node_deinit(struct ieee80211_node *); static int amrr_update(struct ieee80211_amrr *, struct ieee80211_amrr_node *, struct ieee80211_node *); static int amrr_rate(struct ieee80211_node *, void *, uint32_t); static void amrr_tx_complete(const struct ieee80211_node *, const struct ieee80211_ratectl_tx_status *); static void amrr_tx_update_cb(void *, struct ieee80211_node *); static void amrr_tx_update(struct ieee80211vap *vap, struct ieee80211_ratectl_tx_stats *); static void amrr_sysctlattach(struct ieee80211vap *, struct sysctl_ctx_list *, struct sysctl_oid *); static void amrr_node_stats(struct ieee80211_node *ni, struct sbuf *s); /* number of references from net80211 layer */ static int nrefs = 0; static const struct ieee80211_ratectl amrr = { .ir_name = "amrr", .ir_attach = NULL, .ir_detach = NULL, .ir_init = amrr_init, .ir_deinit = amrr_deinit, .ir_node_init = amrr_node_init, .ir_node_deinit = amrr_node_deinit, .ir_rate = amrr_rate, .ir_tx_complete = amrr_tx_complete, .ir_tx_update = amrr_tx_update, .ir_setinterval = amrr_setinterval, .ir_node_stats = amrr_node_stats, }; IEEE80211_RATECTL_MODULE(amrr, 1); IEEE80211_RATECTL_ALG(amrr, IEEE80211_RATECTL_AMRR, amrr); static void amrr_setinterval(const struct ieee80211vap *vap, int msecs) { struct ieee80211_amrr *amrr = vap->iv_rs; if (!amrr) return; if (msecs < 100) msecs = 100; amrr->amrr_interval = msecs_to_ticks(msecs); } static void amrr_init(struct ieee80211vap *vap) { struct ieee80211_amrr *amrr; KASSERT(vap->iv_rs == NULL, ("%s called multiple times", __func__)); nrefs++; /* XXX locking */ amrr = vap->iv_rs = IEEE80211_MALLOC(sizeof(struct ieee80211_amrr), M_80211_RATECTL, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (amrr == NULL) { if_printf(vap->iv_ifp, "couldn't alloc ratectl structure\n"); return; } amrr->amrr_min_success_threshold = IEEE80211_AMRR_MIN_SUCCESS_THRESHOLD; amrr->amrr_max_success_threshold = IEEE80211_AMRR_MAX_SUCCESS_THRESHOLD; amrr_setinterval(vap, 500 /* ms */); amrr_sysctlattach(vap, vap->iv_sysctl, vap->iv_oid); } static void amrr_deinit(struct ieee80211vap *vap) { IEEE80211_FREE(vap->iv_rs, M_80211_RATECTL); KASSERT(nrefs > 0, ("imbalanced attach/detach")); nrefs--; /* XXX locking */ } /* * Return whether 11n rates are possible. * * Some 11n devices may return HT information but no HT rates. * Thus, we shouldn't treat them as an 11n node. */ static int amrr_node_is_11n(struct ieee80211_node *ni) { if (ni->ni_chan == NULL) return (0); if (ni->ni_chan == IEEE80211_CHAN_ANYC) return (0); if (IEEE80211_IS_CHAN_HT(ni->ni_chan) && ni->ni_htrates.rs_nrates == 0) return (0); return (IEEE80211_IS_CHAN_HT(ni->ni_chan)); } static void amrr_node_init(struct ieee80211_node *ni) { const struct ieee80211_rateset *rs = NULL; struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_amrr *amrr = vap->iv_rs; struct ieee80211_amrr_node *amn; uint8_t rate; if (!amrr) { if_printf(vap->iv_ifp, "ratectl structure was not allocated, " "per-node structure allocation skipped\n"); return; } if (ni->ni_rctls == NULL) { ni->ni_rctls = amn = IEEE80211_MALLOC(sizeof(struct ieee80211_amrr_node), M_80211_RATECTL, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (amn == NULL) { if_printf(vap->iv_ifp, "couldn't alloc per-node ratectl " "structure\n"); return; } } else amn = ni->ni_rctls; amn->amn_amrr = amrr; amn->amn_success = 0; amn->amn_recovery = 0; amn->amn_txcnt = amn->amn_retrycnt = 0; amn->amn_success_threshold = amrr->amrr_min_success_threshold; /* 11n or not? Pick the right rateset */ if (amrr_node_is_11n(ni)) { /* XXX ew */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "%s: 11n node", __func__); rs = (struct ieee80211_rateset *) &ni->ni_htrates; } else { IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "%s: non-11n node", __func__); rs = &ni->ni_rates; } /* Initial rate - lowest */ rate = rs->rs_rates[0]; /* XXX clear the basic rate flag if it's not 11n */ if (! amrr_node_is_11n(ni)) rate &= IEEE80211_RATE_VAL; /* pick initial rate from the rateset - HT or otherwise */ /* Pick something low that's likely to succeed */ for (amn->amn_rix = rs->rs_nrates - 1; amn->amn_rix > 0; amn->amn_rix--) { /* legacy - anything < 36mbit, stop searching */ /* 11n - stop at MCS4 */ if (amrr_node_is_11n(ni)) { if ((rs->rs_rates[amn->amn_rix] & 0x1f) < 4) break; } else if ((rs->rs_rates[amn->amn_rix] & IEEE80211_RATE_VAL) <= 72) break; } rate = rs->rs_rates[amn->amn_rix] & IEEE80211_RATE_VAL; /* if the rate is an 11n rate, ensure the MCS bit is set */ if (amrr_node_is_11n(ni)) rate |= IEEE80211_RATE_MCS; /* Assign initial rate from the rateset */ ni->ni_txrate = rate; amn->amn_ticks = ticks; /* XXX TODO: we really need a rate-to-string method */ /* XXX TODO: non-11n rate should be divided by two.. */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "AMRR: nrates=%d, initial rate %s%d", rs->rs_nrates, amrr_node_is_11n(ni) ? "MCS " : "", rate & IEEE80211_RATE_VAL); } static void amrr_node_deinit(struct ieee80211_node *ni) { IEEE80211_FREE(ni->ni_rctls, M_80211_RATECTL); } static int amrr_update(struct ieee80211_amrr *amrr, struct ieee80211_amrr_node *amn, struct ieee80211_node *ni) { int rix = amn->amn_rix; const struct ieee80211_rateset *rs = NULL; KASSERT(is_enough(amn), ("txcnt %d", amn->amn_txcnt)); /* 11n or not? Pick the right rateset */ if (amrr_node_is_11n(ni)) { /* XXX ew */ rs = (struct ieee80211_rateset *) &ni->ni_htrates; } else { rs = &ni->ni_rates; } /* XXX TODO: we really need a rate-to-string method */ /* XXX TODO: non-11n rate should be divided by two.. */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "AMRR: current rate %d, txcnt=%d, retrycnt=%d", rs->rs_rates[rix] & IEEE80211_RATE_VAL, amn->amn_txcnt, amn->amn_retrycnt); /* * XXX This is totally bogus for 11n, as although high MCS * rates for each stream may be failing, the next stream * should be checked. * * Eg, if MCS5 is ok but MCS6/7 isn't, and we can go up to * MCS23, we should skip 6/7 and try 8 onwards. */ if (is_success(amn)) { amn->amn_success++; if (amn->amn_success >= amn->amn_success_threshold && rix + 1 < rs->rs_nrates) { amn->amn_recovery = 1; amn->amn_success = 0; rix++; /* XXX TODO: we really need a rate-to-string method */ /* XXX TODO: non-11n rate should be divided by two.. */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "AMRR increasing rate %d (txcnt=%d retrycnt=%d)", rs->rs_rates[rix] & IEEE80211_RATE_VAL, amn->amn_txcnt, amn->amn_retrycnt); } else { amn->amn_recovery = 0; } } else if (is_failure(amn)) { amn->amn_success = 0; if (rix > 0) { if (amn->amn_recovery) { amn->amn_success_threshold *= 2; if (amn->amn_success_threshold > amrr->amrr_max_success_threshold) amn->amn_success_threshold = amrr->amrr_max_success_threshold; } else { amn->amn_success_threshold = amrr->amrr_min_success_threshold; } rix--; /* XXX TODO: we really need a rate-to-string method */ /* XXX TODO: non-11n rate should be divided by two.. */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "AMRR decreasing rate %d (txcnt=%d retrycnt=%d)", rs->rs_rates[rix] & IEEE80211_RATE_VAL, amn->amn_txcnt, amn->amn_retrycnt); } amn->amn_recovery = 0; } /* reset counters */ amn->amn_txcnt = 0; amn->amn_retrycnt = 0; return rix; } /* * Return the rate index to use in sending a data frame. * Update our internal state if it's been long enough. * If the rate changes we also update ni_txrate to match. */ static int amrr_rate(struct ieee80211_node *ni, void *arg __unused, uint32_t iarg __unused) { struct ieee80211_amrr_node *amn = ni->ni_rctls; struct ieee80211_amrr *amrr; const struct ieee80211_rateset *rs = NULL; int rix; /* XXX should return -1 here, but drivers may not expect this... */ if (!amn) { ni->ni_txrate = ni->ni_rates.rs_rates[0]; return 0; } amrr = amn->amn_amrr; /* 11n or not? Pick the right rateset */ if (amrr_node_is_11n(ni)) { /* XXX ew */ rs = (struct ieee80211_rateset *) &ni->ni_htrates; } else { rs = &ni->ni_rates; } if (is_enough(amn) && (ticks - amn->amn_ticks) > amrr->amrr_interval) { rix = amrr_update(amrr, amn, ni); if (rix != amn->amn_rix) { /* update public rate */ ni->ni_txrate = rs->rs_rates[rix]; /* XXX strip basic rate flag from txrate, if non-11n */ if (amrr_node_is_11n(ni)) ni->ni_txrate |= IEEE80211_RATE_MCS; else ni->ni_txrate &= IEEE80211_RATE_VAL; amn->amn_rix = rix; } amn->amn_ticks = ticks; } else rix = amn->amn_rix; return rix; } /* * Update statistics with tx complete status. Ok is non-zero * if the packet is known to be ACK'd. Retries has the number * retransmissions (i.e. xmit attempts - 1). */ static void amrr_tx_complete(const struct ieee80211_node *ni, const struct ieee80211_ratectl_tx_status *status) { struct ieee80211_amrr_node *amn = ni->ni_rctls; int retries; if (!amn) return; retries = 0; if (status->flags & IEEE80211_RATECTL_STATUS_LONG_RETRY) retries = status->long_retries; amn->amn_txcnt++; if (status->status == IEEE80211_RATECTL_TX_SUCCESS) amn->amn_success++; amn->amn_retrycnt += retries; } static void amrr_tx_update_cb(void *arg, struct ieee80211_node *ni) { struct ieee80211_ratectl_tx_stats *stats = arg; struct ieee80211_amrr_node *amn = ni->ni_rctls; int txcnt, success, retrycnt; if (!amn) return; txcnt = stats->nframes; success = stats->nsuccess; retrycnt = 0; if (stats->flags & IEEE80211_RATECTL_TX_STATS_RETRIES) retrycnt = stats->nretries; amn->amn_txcnt += txcnt; amn->amn_success += success; amn->amn_retrycnt += retrycnt; } /* * Set tx count/retry statistics explicitly. Intended for * drivers that poll the device for statistics maintained * in the device. */ static void amrr_tx_update(struct ieee80211vap *vap, struct ieee80211_ratectl_tx_stats *stats) { if (stats->flags & IEEE80211_RATECTL_TX_STATS_NODE) amrr_tx_update_cb(stats, stats->ni); else { ieee80211_iterate_nodes_vap(&vap->iv_ic->ic_sta, vap, amrr_tx_update_cb, stats); } } static int amrr_sysctl_interval(SYSCTL_HANDLER_ARGS) { struct ieee80211vap *vap = arg1; struct ieee80211_amrr *amrr = vap->iv_rs; int msecs, error; if (!amrr) return ENOMEM; msecs = ticks_to_msecs(amrr->amrr_interval); error = sysctl_handle_int(oidp, &msecs, 0, req); if (error || !req->newptr) return error; amrr_setinterval(vap, msecs); return 0; } static void amrr_sysctlattach(struct ieee80211vap *vap, struct sysctl_ctx_list *ctx, struct sysctl_oid *tree) { struct ieee80211_amrr *amrr = vap->iv_rs; if (!amrr) return; SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "amrr_rate_interval", CTLTYPE_INT | CTLFLAG_RW, vap, - 0, amrr_sysctl_interval, "I", "amrr operation interval (ms)"); + "amrr_rate_interval", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + vap, 0, amrr_sysctl_interval, "I", "amrr operation interval (ms)"); /* XXX bounds check values */ SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "amrr_max_sucess_threshold", CTLFLAG_RW, &amrr->amrr_max_success_threshold, 0, ""); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, "amrr_min_sucess_threshold", CTLFLAG_RW, &amrr->amrr_min_success_threshold, 0, ""); } static void amrr_print_node_rate(struct ieee80211_amrr_node *amn, struct ieee80211_node *ni, struct sbuf *s) { int rate; struct ieee80211_rateset *rs; if (amrr_node_is_11n(ni)) { rs = (struct ieee80211_rateset *) &ni->ni_htrates; rate = rs->rs_rates[amn->amn_rix] & IEEE80211_RATE_VAL; sbuf_printf(s, "rate: MCS %d\n", rate); } else { rs = &ni->ni_rates; rate = rs->rs_rates[amn->amn_rix] & IEEE80211_RATE_VAL; sbuf_printf(s, "rate: %d Mbit\n", rate / 2); } } static void amrr_node_stats(struct ieee80211_node *ni, struct sbuf *s) { struct ieee80211_amrr_node *amn = ni->ni_rctls; /* XXX TODO: check locking? */ if (!amn) return; amrr_print_node_rate(amn, ni, s); sbuf_printf(s, "ticks: %d\n", amn->amn_ticks); sbuf_printf(s, "txcnt: %u\n", amn->amn_txcnt); sbuf_printf(s, "success: %u\n", amn->amn_success); sbuf_printf(s, "success_threshold: %u\n", amn->amn_success_threshold); sbuf_printf(s, "recovery: %u\n", amn->amn_recovery); sbuf_printf(s, "retry_cnt: %u\n", amn->amn_retrycnt); } Index: projects/clang1000-import/sys/net80211/ieee80211_freebsd.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_freebsd.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_freebsd.c (revision 358239) @@ -1,1058 +1,1059 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2003-2009 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * IEEE 802.11 support (FreeBSD-specific code) */ #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include -SYSCTL_NODE(_net, OID_AUTO, wlan, CTLFLAG_RD, 0, "IEEE 80211 parameters"); +SYSCTL_NODE(_net, OID_AUTO, wlan, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, + "IEEE 80211 parameters"); #ifdef IEEE80211_DEBUG static int ieee80211_debug = 0; SYSCTL_INT(_net_wlan, OID_AUTO, debug, CTLFLAG_RW, &ieee80211_debug, 0, "debugging printfs"); #endif static const char wlanname[] = "wlan"; static struct if_clone *wlan_cloner; static int wlan_clone_create(struct if_clone *ifc, int unit, caddr_t params) { struct ieee80211_clone_params cp; struct ieee80211vap *vap; struct ieee80211com *ic; int error; error = copyin(params, &cp, sizeof(cp)); if (error) return error; ic = ieee80211_find_com(cp.icp_parent); if (ic == NULL) return ENXIO; if (cp.icp_opmode >= IEEE80211_OPMODE_MAX) { ic_printf(ic, "%s: invalid opmode %d\n", __func__, cp.icp_opmode); return EINVAL; } if ((ic->ic_caps & ieee80211_opcap[cp.icp_opmode]) == 0) { ic_printf(ic, "%s mode not supported\n", ieee80211_opmode_name[cp.icp_opmode]); return EOPNOTSUPP; } if ((cp.icp_flags & IEEE80211_CLONE_TDMA) && #ifdef IEEE80211_SUPPORT_TDMA (ic->ic_caps & IEEE80211_C_TDMA) == 0 #else (1) #endif ) { ic_printf(ic, "TDMA not supported\n"); return EOPNOTSUPP; } vap = ic->ic_vap_create(ic, wlanname, unit, cp.icp_opmode, cp.icp_flags, cp.icp_bssid, cp.icp_flags & IEEE80211_CLONE_MACADDR ? cp.icp_macaddr : ic->ic_macaddr); return (vap == NULL ? EIO : 0); } static void wlan_clone_destroy(struct ifnet *ifp) { struct ieee80211vap *vap = ifp->if_softc; struct ieee80211com *ic = vap->iv_ic; ic->ic_vap_delete(vap); } void ieee80211_vap_destroy(struct ieee80211vap *vap) { CURVNET_SET(vap->iv_ifp->if_vnet); if_clone_destroyif(wlan_cloner, vap->iv_ifp); CURVNET_RESTORE(); } int ieee80211_sysctl_msecs_ticks(SYSCTL_HANDLER_ARGS) { int msecs = ticks_to_msecs(*(int *)arg1); int error; error = sysctl_handle_int(oidp, &msecs, 0, req); if (error || !req->newptr) return error; *(int *)arg1 = msecs_to_ticks(msecs); return 0; } static int ieee80211_sysctl_inact(SYSCTL_HANDLER_ARGS) { int inact = (*(int *)arg1) * IEEE80211_INACT_WAIT; int error; error = sysctl_handle_int(oidp, &inact, 0, req); if (error || !req->newptr) return error; *(int *)arg1 = inact / IEEE80211_INACT_WAIT; return 0; } static int ieee80211_sysctl_parent(SYSCTL_HANDLER_ARGS) { struct ieee80211com *ic = arg1; return SYSCTL_OUT_STR(req, ic->ic_name); } static int ieee80211_sysctl_radar(SYSCTL_HANDLER_ARGS) { struct ieee80211com *ic = arg1; int t = 0, error; error = sysctl_handle_int(oidp, &t, 0, req); if (error || !req->newptr) return error; IEEE80211_LOCK(ic); ieee80211_dfs_notify_radar(ic, ic->ic_curchan); IEEE80211_UNLOCK(ic); return 0; } /* * For now, just restart everything. * * Later on, it'd be nice to have a separate VAP restart to * full-device restart. */ static int ieee80211_sysctl_vap_restart(SYSCTL_HANDLER_ARGS) { struct ieee80211vap *vap = arg1; int t = 0, error; error = sysctl_handle_int(oidp, &t, 0, req); if (error || !req->newptr) return error; ieee80211_restart_all(vap->iv_ic); return 0; } void ieee80211_sysctl_attach(struct ieee80211com *ic) { } void ieee80211_sysctl_detach(struct ieee80211com *ic) { } void ieee80211_sysctl_vattach(struct ieee80211vap *vap) { struct ifnet *ifp = vap->iv_ifp; struct sysctl_ctx_list *ctx; struct sysctl_oid *oid; char num[14]; /* sufficient for 32 bits */ ctx = (struct sysctl_ctx_list *) IEEE80211_MALLOC(sizeof(struct sysctl_ctx_list), M_DEVBUF, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (ctx == NULL) { if_printf(ifp, "%s: cannot allocate sysctl context!\n", __func__); return; } sysctl_ctx_init(ctx); snprintf(num, sizeof(num), "%u", ifp->if_dunit); oid = SYSCTL_ADD_NODE(ctx, &SYSCTL_NODE_CHILDREN(_net, wlan), - OID_AUTO, num, CTLFLAG_RD, NULL, ""); + OID_AUTO, num, CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, ""); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "%parent", CTLTYPE_STRING | CTLFLAG_RD, vap->iv_ic, 0, - ieee80211_sysctl_parent, "A", "parent device"); + "%parent", CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_NEEDGIANT, + vap->iv_ic, 0, ieee80211_sysctl_parent, "A", "parent device"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "driver_caps", CTLFLAG_RW, &vap->iv_caps, 0, "driver capabilities"); #ifdef IEEE80211_DEBUG vap->iv_debug = ieee80211_debug; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "debug", CTLFLAG_RW, &vap->iv_debug, 0, "control debugging printfs"); #endif SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "bmiss_max", CTLFLAG_RW, &vap->iv_bmiss_max, 0, "consecutive beacon misses before scanning"); /* XXX inherit from tunables */ SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "inact_run", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_run, 0, - ieee80211_sysctl_inact, "I", - "station inactivity timeout (sec)"); + "inact_run", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &vap->iv_inact_run, 0, ieee80211_sysctl_inact, "I", + "station inactivity timeout (sec)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "inact_probe", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_probe, 0, - ieee80211_sysctl_inact, "I", - "station inactivity probe timeout (sec)"); + "inact_probe", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &vap->iv_inact_probe, 0, ieee80211_sysctl_inact, "I", + "station inactivity probe timeout (sec)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "inact_auth", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_auth, 0, - ieee80211_sysctl_inact, "I", - "station authentication timeout (sec)"); + "inact_auth", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &vap->iv_inact_auth, 0, ieee80211_sysctl_inact, "I", + "station authentication timeout (sec)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "inact_init", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_init, 0, - ieee80211_sysctl_inact, "I", - "station initial state timeout (sec)"); + "inact_init", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &vap->iv_inact_init, 0, ieee80211_sysctl_inact, "I", + "station initial state timeout (sec)"); if (vap->iv_htcaps & IEEE80211_HTC_HT) { SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_bk", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_BK], 0, "BK traffic tx aggr threshold (pps)"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_be", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_BE], 0, "BE traffic tx aggr threshold (pps)"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_vo", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_VO], 0, "VO traffic tx aggr threshold (pps)"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_vi", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_VI], 0, "VI traffic tx aggr threshold (pps)"); } SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "force_restart", CTLTYPE_INT | CTLFLAG_RW, vap, 0, - ieee80211_sysctl_vap_restart, "I", - "force a VAP restart"); + "force_restart", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + vap, 0, ieee80211_sysctl_vap_restart, "I", "force a VAP restart"); if (vap->iv_caps & IEEE80211_C_DFS) { SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, - "radar", CTLTYPE_INT | CTLFLAG_RW, vap->iv_ic, 0, - ieee80211_sysctl_radar, "I", "simulate radar event"); + "radar", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + vap->iv_ic, 0, ieee80211_sysctl_radar, "I", + "simulate radar event"); } vap->iv_sysctl = ctx; vap->iv_oid = oid; } void ieee80211_sysctl_vdetach(struct ieee80211vap *vap) { if (vap->iv_sysctl != NULL) { sysctl_ctx_free(vap->iv_sysctl); IEEE80211_FREE(vap->iv_sysctl, M_DEVBUF); vap->iv_sysctl = NULL; } } #define MS(_v, _f) (((_v) & _f##_M) >> _f##_S) int ieee80211_com_vincref(struct ieee80211vap *vap) { uint32_t ostate; ostate = atomic_fetchadd_32(&vap->iv_com_state, IEEE80211_COM_REF_ADD); if (ostate & IEEE80211_COM_DETACHED) { atomic_subtract_32(&vap->iv_com_state, IEEE80211_COM_REF_ADD); return (ENETDOWN); } if (MS(ostate, IEEE80211_COM_REF) == IEEE80211_COM_REF_MAX) { atomic_subtract_32(&vap->iv_com_state, IEEE80211_COM_REF_ADD); return (EOVERFLOW); } return (0); } void ieee80211_com_vdecref(struct ieee80211vap *vap) { uint32_t ostate; ostate = atomic_fetchadd_32(&vap->iv_com_state, -IEEE80211_COM_REF_ADD); KASSERT(MS(ostate, IEEE80211_COM_REF) != 0, ("com reference counter underflow")); (void) ostate; } void ieee80211_com_vdetach(struct ieee80211vap *vap) { int sleep_time; sleep_time = msecs_to_ticks(250); atomic_set_32(&vap->iv_com_state, IEEE80211_COM_DETACHED); while (MS(atomic_load_32(&vap->iv_com_state), IEEE80211_COM_REF) != 0) pause("comref", sleep_time); } #undef MS int ieee80211_node_dectestref(struct ieee80211_node *ni) { /* XXX need equivalent of atomic_dec_and_test */ atomic_subtract_int(&ni->ni_refcnt, 1); return atomic_cmpset_int(&ni->ni_refcnt, 0, 1); } void ieee80211_drain_ifq(struct ifqueue *ifq) { struct ieee80211_node *ni; struct mbuf *m; for (;;) { IF_DEQUEUE(ifq, m); if (m == NULL) break; ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; KASSERT(ni != NULL, ("frame w/o node")); ieee80211_free_node(ni); m->m_pkthdr.rcvif = NULL; m_freem(m); } } void ieee80211_flush_ifq(struct ifqueue *ifq, struct ieee80211vap *vap) { struct ieee80211_node *ni; struct mbuf *m, **mprev; IF_LOCK(ifq); mprev = &ifq->ifq_head; while ((m = *mprev) != NULL) { ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; if (ni != NULL && ni->ni_vap == vap) { *mprev = m->m_nextpkt; /* remove from list */ ifq->ifq_len--; m_freem(m); ieee80211_free_node(ni); /* reclaim ref */ } else mprev = &m->m_nextpkt; } /* recalculate tail ptr */ m = ifq->ifq_head; for (; m != NULL && m->m_nextpkt != NULL; m = m->m_nextpkt) ; ifq->ifq_tail = m; IF_UNLOCK(ifq); } /* * As above, for mbufs allocated with m_gethdr/MGETHDR * or initialized by M_COPY_PKTHDR. */ #define MC_ALIGN(m, len) \ do { \ (m)->m_data += rounddown2(MCLBYTES - (len), sizeof(long)); \ } while (/* CONSTCOND */ 0) /* * Allocate and setup a management frame of the specified * size. We return the mbuf and a pointer to the start * of the contiguous data area that's been reserved based * on the packet length. The data area is forced to 32-bit * alignment and the buffer length to a multiple of 4 bytes. * This is done mainly so beacon frames (that require this) * can use this interface too. */ struct mbuf * ieee80211_getmgtframe(uint8_t **frm, int headroom, int pktlen) { struct mbuf *m; u_int len; /* * NB: we know the mbuf routines will align the data area * so we don't need to do anything special. */ len = roundup2(headroom + pktlen, 4); KASSERT(len <= MCLBYTES, ("802.11 mgt frame too large: %u", len)); if (len < MINCLSIZE) { m = m_gethdr(M_NOWAIT, MT_DATA); /* * Align the data in case additional headers are added. * This should only happen when a WEP header is added * which only happens for shared key authentication mgt * frames which all fit in MHLEN. */ if (m != NULL) M_ALIGN(m, len); } else { m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR); if (m != NULL) MC_ALIGN(m, len); } if (m != NULL) { m->m_data += headroom; *frm = m->m_data; } return m; } #ifndef __NO_STRICT_ALIGNMENT /* * Re-align the payload in the mbuf. This is mainly used (right now) * to handle IP header alignment requirements on certain architectures. */ struct mbuf * ieee80211_realign(struct ieee80211vap *vap, struct mbuf *m, size_t align) { int pktlen, space; struct mbuf *n; pktlen = m->m_pkthdr.len; space = pktlen + align; if (space < MINCLSIZE) n = m_gethdr(M_NOWAIT, MT_DATA); else { n = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, space <= MCLBYTES ? MCLBYTES : #if MJUMPAGESIZE != MCLBYTES space <= MJUMPAGESIZE ? MJUMPAGESIZE : #endif space <= MJUM9BYTES ? MJUM9BYTES : MJUM16BYTES); } if (__predict_true(n != NULL)) { m_move_pkthdr(n, m); n->m_data = (caddr_t)(ALIGN(n->m_data + align) - align); m_copydata(m, 0, pktlen, mtod(n, caddr_t)); n->m_len = pktlen; } else { IEEE80211_DISCARD(vap, IEEE80211_MSG_ANY, mtod(m, const struct ieee80211_frame *), NULL, "%s", "no mbuf to realign"); vap->iv_stats.is_rx_badalign++; } m_freem(m); return n; } #endif /* !__NO_STRICT_ALIGNMENT */ int ieee80211_add_callback(struct mbuf *m, void (*func)(struct ieee80211_node *, void *, int), void *arg) { struct m_tag *mtag; struct ieee80211_cb *cb; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_CALLBACK, sizeof(struct ieee80211_cb), M_NOWAIT); if (mtag == NULL) return 0; cb = (struct ieee80211_cb *)(mtag+1); cb->func = func; cb->arg = arg; m_tag_prepend(m, mtag); m->m_flags |= M_TXCB; return 1; } int ieee80211_add_xmit_params(struct mbuf *m, const struct ieee80211_bpf_params *params) { struct m_tag *mtag; struct ieee80211_tx_params *tx; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_XMIT_PARAMS, sizeof(struct ieee80211_tx_params), M_NOWAIT); if (mtag == NULL) return (0); tx = (struct ieee80211_tx_params *)(mtag+1); memcpy(&tx->params, params, sizeof(struct ieee80211_bpf_params)); m_tag_prepend(m, mtag); return (1); } int ieee80211_get_xmit_params(struct mbuf *m, struct ieee80211_bpf_params *params) { struct m_tag *mtag; struct ieee80211_tx_params *tx; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_XMIT_PARAMS, NULL); if (mtag == NULL) return (-1); tx = (struct ieee80211_tx_params *)(mtag + 1); memcpy(params, &tx->params, sizeof(struct ieee80211_bpf_params)); return (0); } void ieee80211_process_callback(struct ieee80211_node *ni, struct mbuf *m, int status) { struct m_tag *mtag; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_CALLBACK, NULL); if (mtag != NULL) { struct ieee80211_cb *cb = (struct ieee80211_cb *)(mtag+1); cb->func(ni, cb->arg, status); } } /* * Add RX parameters to the given mbuf. * * Returns 1 if OK, 0 on error. */ int ieee80211_add_rx_params(struct mbuf *m, const struct ieee80211_rx_stats *rxs) { struct m_tag *mtag; struct ieee80211_rx_params *rx; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_RECV_PARAMS, sizeof(struct ieee80211_rx_stats), M_NOWAIT); if (mtag == NULL) return (0); rx = (struct ieee80211_rx_params *)(mtag + 1); memcpy(&rx->params, rxs, sizeof(*rxs)); m_tag_prepend(m, mtag); return (1); } int ieee80211_get_rx_params(struct mbuf *m, struct ieee80211_rx_stats *rxs) { struct m_tag *mtag; struct ieee80211_rx_params *rx; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_RECV_PARAMS, NULL); if (mtag == NULL) return (-1); rx = (struct ieee80211_rx_params *)(mtag + 1); memcpy(rxs, &rx->params, sizeof(*rxs)); return (0); } const struct ieee80211_rx_stats * ieee80211_get_rx_params_ptr(struct mbuf *m) { struct m_tag *mtag; struct ieee80211_rx_params *rx; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_RECV_PARAMS, NULL); if (mtag == NULL) return (NULL); rx = (struct ieee80211_rx_params *)(mtag + 1); return (&rx->params); } /* * Add TOA parameters to the given mbuf. */ int ieee80211_add_toa_params(struct mbuf *m, const struct ieee80211_toa_params *p) { struct m_tag *mtag; struct ieee80211_toa_params *rp; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_TOA_PARAMS, sizeof(struct ieee80211_toa_params), M_NOWAIT); if (mtag == NULL) return (0); rp = (struct ieee80211_toa_params *)(mtag + 1); memcpy(rp, p, sizeof(*rp)); m_tag_prepend(m, mtag); return (1); } int ieee80211_get_toa_params(struct mbuf *m, struct ieee80211_toa_params *p) { struct m_tag *mtag; struct ieee80211_toa_params *rp; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_TOA_PARAMS, NULL); if (mtag == NULL) return (0); rp = (struct ieee80211_toa_params *)(mtag + 1); if (p != NULL) memcpy(p, rp, sizeof(*p)); return (1); } /* * Transmit a frame to the parent interface. */ int ieee80211_parent_xmitpkt(struct ieee80211com *ic, struct mbuf *m) { int error; /* * Assert the IC TX lock is held - this enforces the * processing -> queuing order is maintained */ IEEE80211_TX_LOCK_ASSERT(ic); error = ic->ic_transmit(ic, m); if (error) { struct ieee80211_node *ni; ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; /* XXX number of fragments */ if_inc_counter(ni->ni_vap->iv_ifp, IFCOUNTER_OERRORS, 1); ieee80211_free_node(ni); ieee80211_free_mbuf(m); } return (error); } /* * Transmit a frame to the VAP interface. */ int ieee80211_vap_xmitpkt(struct ieee80211vap *vap, struct mbuf *m) { struct ifnet *ifp = vap->iv_ifp; /* * When transmitting via the VAP, we shouldn't hold * any IC TX lock as the VAP TX path will acquire it. */ IEEE80211_TX_UNLOCK_ASSERT(vap->iv_ic); return (ifp->if_transmit(ifp, m)); } #include void get_random_bytes(void *p, size_t n) { uint8_t *dp = p; while (n > 0) { uint32_t v = arc4random(); size_t nb = n > sizeof(uint32_t) ? sizeof(uint32_t) : n; bcopy(&v, dp, n > sizeof(uint32_t) ? sizeof(uint32_t) : n); dp += sizeof(uint32_t), n -= nb; } } /* * Helper function for events that pass just a single mac address. */ static void notify_macaddr(struct ifnet *ifp, int op, const uint8_t mac[IEEE80211_ADDR_LEN]) { struct ieee80211_join_event iev; CURVNET_SET(ifp->if_vnet); memset(&iev, 0, sizeof(iev)); IEEE80211_ADDR_COPY(iev.iev_addr, mac); rt_ieee80211msg(ifp, op, &iev, sizeof(iev)); CURVNET_RESTORE(); } void ieee80211_notify_node_join(struct ieee80211_node *ni, int newassoc) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; CURVNET_SET_QUIET(ifp->if_vnet); IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%snode join", (ni == vap->iv_bss) ? "bss " : ""); if (ni == vap->iv_bss) { notify_macaddr(ifp, newassoc ? RTM_IEEE80211_ASSOC : RTM_IEEE80211_REASSOC, ni->ni_bssid); if_link_state_change(ifp, LINK_STATE_UP); } else { notify_macaddr(ifp, newassoc ? RTM_IEEE80211_JOIN : RTM_IEEE80211_REJOIN, ni->ni_macaddr); } CURVNET_RESTORE(); } void ieee80211_notify_node_leave(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; CURVNET_SET_QUIET(ifp->if_vnet); IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%snode leave", (ni == vap->iv_bss) ? "bss " : ""); if (ni == vap->iv_bss) { rt_ieee80211msg(ifp, RTM_IEEE80211_DISASSOC, NULL, 0); if_link_state_change(ifp, LINK_STATE_DOWN); } else { /* fire off wireless event station leaving */ notify_macaddr(ifp, RTM_IEEE80211_LEAVE, ni->ni_macaddr); } CURVNET_RESTORE(); } void ieee80211_notify_scan_done(struct ieee80211vap *vap) { struct ifnet *ifp = vap->iv_ifp; IEEE80211_DPRINTF(vap, IEEE80211_MSG_SCAN, "%s\n", "notify scan done"); /* dispatch wireless event indicating scan completed */ CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_SCAN, NULL, 0); CURVNET_RESTORE(); } void ieee80211_notify_replay_failure(struct ieee80211vap *vap, const struct ieee80211_frame *wh, const struct ieee80211_key *k, u_int64_t rsc, int tid) { struct ifnet *ifp = vap->iv_ifp; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_CRYPTO, wh->i_addr2, "%s replay detected tid %d ", k->wk_cipher->ic_name, tid, (intmax_t) rsc, (intmax_t) k->wk_keyrsc[tid], k->wk_keyix, k->wk_rxkeyix); if (ifp != NULL) { /* NB: for cipher test modules */ struct ieee80211_replay_event iev; IEEE80211_ADDR_COPY(iev.iev_dst, wh->i_addr1); IEEE80211_ADDR_COPY(iev.iev_src, wh->i_addr2); iev.iev_cipher = k->wk_cipher->ic_cipher; if (k->wk_rxkeyix != IEEE80211_KEYIX_NONE) iev.iev_keyix = k->wk_rxkeyix; else iev.iev_keyix = k->wk_keyix; iev.iev_keyrsc = k->wk_keyrsc[tid]; iev.iev_rsc = rsc; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_REPLAY, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_notify_michael_failure(struct ieee80211vap *vap, const struct ieee80211_frame *wh, u_int keyix) { struct ifnet *ifp = vap->iv_ifp; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_CRYPTO, wh->i_addr2, "michael MIC verification failed ", keyix); vap->iv_stats.is_rx_tkipmic++; if (ifp != NULL) { /* NB: for cipher test modules */ struct ieee80211_michael_event iev; IEEE80211_ADDR_COPY(iev.iev_dst, wh->i_addr1); IEEE80211_ADDR_COPY(iev.iev_src, wh->i_addr2); iev.iev_cipher = IEEE80211_CIPHER_TKIP; iev.iev_keyix = keyix; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_MICHAEL, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_notify_wds_discover(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; notify_macaddr(ifp, RTM_IEEE80211_WDS, ni->ni_macaddr); } void ieee80211_notify_csa(struct ieee80211com *ic, const struct ieee80211_channel *c, int mode, int count) { struct ieee80211_csa_event iev; struct ieee80211vap *vap; struct ifnet *ifp; memset(&iev, 0, sizeof(iev)); iev.iev_flags = c->ic_flags; iev.iev_freq = c->ic_freq; iev.iev_ieee = c->ic_ieee; iev.iev_mode = mode; iev.iev_count = count; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_CSA, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_notify_radar(struct ieee80211com *ic, const struct ieee80211_channel *c) { struct ieee80211_radar_event iev; struct ieee80211vap *vap; struct ifnet *ifp; memset(&iev, 0, sizeof(iev)); iev.iev_flags = c->ic_flags; iev.iev_freq = c->ic_freq; iev.iev_ieee = c->ic_ieee; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_RADAR, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_notify_cac(struct ieee80211com *ic, const struct ieee80211_channel *c, enum ieee80211_notify_cac_event type) { struct ieee80211_cac_event iev; struct ieee80211vap *vap; struct ifnet *ifp; memset(&iev, 0, sizeof(iev)); iev.iev_flags = c->ic_flags; iev.iev_freq = c->ic_freq; iev.iev_ieee = c->ic_ieee; iev.iev_type = type; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_CAC, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_notify_node_deauth(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%s", "node deauth"); notify_macaddr(ifp, RTM_IEEE80211_DEAUTH, ni->ni_macaddr); } void ieee80211_notify_node_auth(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%s", "node auth"); notify_macaddr(ifp, RTM_IEEE80211_AUTH, ni->ni_macaddr); } void ieee80211_notify_country(struct ieee80211vap *vap, const uint8_t bssid[IEEE80211_ADDR_LEN], const uint8_t cc[2]) { struct ifnet *ifp = vap->iv_ifp; struct ieee80211_country_event iev; memset(&iev, 0, sizeof(iev)); IEEE80211_ADDR_COPY(iev.iev_addr, bssid); iev.iev_cc[0] = cc[0]; iev.iev_cc[1] = cc[1]; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_COUNTRY, &iev, sizeof(iev)); CURVNET_RESTORE(); } void ieee80211_notify_radio(struct ieee80211com *ic, int state) { struct ieee80211_radio_event iev; struct ieee80211vap *vap; struct ifnet *ifp; memset(&iev, 0, sizeof(iev)); iev.iev_state = state; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_RADIO, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_load_module(const char *modname) { #ifdef notyet (void)kern_kldload(curthread, modname, NULL); #else printf("%s: load the %s module by hand for now.\n", __func__, modname); #endif } static eventhandler_tag wlan_bpfevent; static eventhandler_tag wlan_ifllevent; static void bpf_track(void *arg, struct ifnet *ifp, int dlt, int attach) { /* NB: identify vap's by if_init */ if (dlt == DLT_IEEE802_11_RADIO && ifp->if_init == ieee80211_init) { struct ieee80211vap *vap = ifp->if_softc; /* * Track bpf radiotap listener state. We mark the vap * to indicate if any listener is present and the com * to indicate if any listener exists on any associated * vap. This flag is used by drivers to prepare radiotap * state only when needed. */ if (attach) { ieee80211_syncflag_ext(vap, IEEE80211_FEXT_BPF); if (vap->iv_opmode == IEEE80211_M_MONITOR) atomic_add_int(&vap->iv_ic->ic_montaps, 1); } else if (!bpf_peers_present(vap->iv_rawbpf)) { ieee80211_syncflag_ext(vap, -IEEE80211_FEXT_BPF); if (vap->iv_opmode == IEEE80211_M_MONITOR) atomic_subtract_int(&vap->iv_ic->ic_montaps, 1); } } } /* * Change MAC address on the vap (if was not started). */ static void wlan_iflladdr(void *arg __unused, struct ifnet *ifp) { /* NB: identify vap's by if_init */ if (ifp->if_init == ieee80211_init && (ifp->if_flags & IFF_UP) == 0) { struct ieee80211vap *vap = ifp->if_softc; IEEE80211_ADDR_COPY(vap->iv_myaddr, IF_LLADDR(ifp)); } } /* * Module glue. * * NB: the module name is "wlan" for compatibility with NetBSD. */ static int wlan_modevent(module_t mod, int type, void *unused) { switch (type) { case MOD_LOAD: if (bootverbose) printf("wlan: <802.11 Link Layer>\n"); wlan_bpfevent = EVENTHANDLER_REGISTER(bpf_track, bpf_track, 0, EVENTHANDLER_PRI_ANY); wlan_ifllevent = EVENTHANDLER_REGISTER(iflladdr_event, wlan_iflladdr, NULL, EVENTHANDLER_PRI_ANY); wlan_cloner = if_clone_simple(wlanname, wlan_clone_create, wlan_clone_destroy, 0); return 0; case MOD_UNLOAD: if_clone_detach(wlan_cloner); EVENTHANDLER_DEREGISTER(bpf_track, wlan_bpfevent); EVENTHANDLER_DEREGISTER(iflladdr_event, wlan_ifllevent); return 0; } return EINVAL; } static moduledata_t wlan_mod = { wlanname, wlan_modevent, 0 }; DECLARE_MODULE(wlan, wlan_mod, SI_SUB_DRIVERS, SI_ORDER_FIRST); MODULE_VERSION(wlan, 1); MODULE_DEPEND(wlan, ether, 1, 1, 1); #ifdef IEEE80211_ALQ MODULE_DEPEND(wlan, alq, 1, 1, 1); #endif /* IEEE80211_ALQ */ Index: projects/clang1000-import/sys/net80211/ieee80211_ht.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_ht.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_ht.c (revision 358239) @@ -1,3378 +1,3381 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2007-2008 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #ifdef __FreeBSD__ __FBSDID("$FreeBSD$"); #endif /* * IEEE 802.11n protocol support. */ #include "opt_inet.h" #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include /* define here, used throughout file */ #define MS(_v, _f) (((_v) & _f) >> _f##_S) #define SM(_v, _f) (((_v) << _f##_S) & _f) const struct ieee80211_mcs_rates ieee80211_htrates[IEEE80211_HTRATE_MAXSIZE] = { { 13, 14, 27, 30 }, /* MCS 0 */ { 26, 29, 54, 60 }, /* MCS 1 */ { 39, 43, 81, 90 }, /* MCS 2 */ { 52, 58, 108, 120 }, /* MCS 3 */ { 78, 87, 162, 180 }, /* MCS 4 */ { 104, 116, 216, 240 }, /* MCS 5 */ { 117, 130, 243, 270 }, /* MCS 6 */ { 130, 144, 270, 300 }, /* MCS 7 */ { 26, 29, 54, 60 }, /* MCS 8 */ { 52, 58, 108, 120 }, /* MCS 9 */ { 78, 87, 162, 180 }, /* MCS 10 */ { 104, 116, 216, 240 }, /* MCS 11 */ { 156, 173, 324, 360 }, /* MCS 12 */ { 208, 231, 432, 480 }, /* MCS 13 */ { 234, 260, 486, 540 }, /* MCS 14 */ { 260, 289, 540, 600 }, /* MCS 15 */ { 39, 43, 81, 90 }, /* MCS 16 */ { 78, 87, 162, 180 }, /* MCS 17 */ { 117, 130, 243, 270 }, /* MCS 18 */ { 156, 173, 324, 360 }, /* MCS 19 */ { 234, 260, 486, 540 }, /* MCS 20 */ { 312, 347, 648, 720 }, /* MCS 21 */ { 351, 390, 729, 810 }, /* MCS 22 */ { 390, 433, 810, 900 }, /* MCS 23 */ { 52, 58, 108, 120 }, /* MCS 24 */ { 104, 116, 216, 240 }, /* MCS 25 */ { 156, 173, 324, 360 }, /* MCS 26 */ { 208, 231, 432, 480 }, /* MCS 27 */ { 312, 347, 648, 720 }, /* MCS 28 */ { 416, 462, 864, 960 }, /* MCS 29 */ { 468, 520, 972, 1080 }, /* MCS 30 */ { 520, 578, 1080, 1200 }, /* MCS 31 */ { 0, 0, 12, 13 }, /* MCS 32 */ { 78, 87, 162, 180 }, /* MCS 33 */ { 104, 116, 216, 240 }, /* MCS 34 */ { 130, 144, 270, 300 }, /* MCS 35 */ { 117, 130, 243, 270 }, /* MCS 36 */ { 156, 173, 324, 360 }, /* MCS 37 */ { 195, 217, 405, 450 }, /* MCS 38 */ { 104, 116, 216, 240 }, /* MCS 39 */ { 130, 144, 270, 300 }, /* MCS 40 */ { 130, 144, 270, 300 }, /* MCS 41 */ { 156, 173, 324, 360 }, /* MCS 42 */ { 182, 202, 378, 420 }, /* MCS 43 */ { 182, 202, 378, 420 }, /* MCS 44 */ { 208, 231, 432, 480 }, /* MCS 45 */ { 156, 173, 324, 360 }, /* MCS 46 */ { 195, 217, 405, 450 }, /* MCS 47 */ { 195, 217, 405, 450 }, /* MCS 48 */ { 234, 260, 486, 540 }, /* MCS 49 */ { 273, 303, 567, 630 }, /* MCS 50 */ { 273, 303, 567, 630 }, /* MCS 51 */ { 312, 347, 648, 720 }, /* MCS 52 */ { 130, 144, 270, 300 }, /* MCS 53 */ { 156, 173, 324, 360 }, /* MCS 54 */ { 182, 202, 378, 420 }, /* MCS 55 */ { 156, 173, 324, 360 }, /* MCS 56 */ { 182, 202, 378, 420 }, /* MCS 57 */ { 208, 231, 432, 480 }, /* MCS 58 */ { 234, 260, 486, 540 }, /* MCS 59 */ { 208, 231, 432, 480 }, /* MCS 60 */ { 234, 260, 486, 540 }, /* MCS 61 */ { 260, 289, 540, 600 }, /* MCS 62 */ { 260, 289, 540, 600 }, /* MCS 63 */ { 286, 318, 594, 660 }, /* MCS 64 */ { 195, 217, 405, 450 }, /* MCS 65 */ { 234, 260, 486, 540 }, /* MCS 66 */ { 273, 303, 567, 630 }, /* MCS 67 */ { 234, 260, 486, 540 }, /* MCS 68 */ { 273, 303, 567, 630 }, /* MCS 69 */ { 312, 347, 648, 720 }, /* MCS 70 */ { 351, 390, 729, 810 }, /* MCS 71 */ { 312, 347, 648, 720 }, /* MCS 72 */ { 351, 390, 729, 810 }, /* MCS 73 */ { 390, 433, 810, 900 }, /* MCS 74 */ { 390, 433, 810, 900 }, /* MCS 75 */ { 429, 477, 891, 990 }, /* MCS 76 */ }; static int ieee80211_ampdu_age = -1; /* threshold for ampdu reorder q (ms) */ -SYSCTL_PROC(_net_wlan, OID_AUTO, ampdu_age, CTLTYPE_INT | CTLFLAG_RW, - &ieee80211_ampdu_age, 0, ieee80211_sysctl_msecs_ticks, "I", - "AMPDU max reorder age (ms)"); +SYSCTL_PROC(_net_wlan, OID_AUTO, ampdu_age, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &ieee80211_ampdu_age, 0, ieee80211_sysctl_msecs_ticks, "I", + "AMPDU max reorder age (ms)"); static int ieee80211_recv_bar_ena = 1; SYSCTL_INT(_net_wlan, OID_AUTO, recv_bar, CTLFLAG_RW, &ieee80211_recv_bar_ena, 0, "BAR frame processing (ena/dis)"); static int ieee80211_addba_timeout = -1;/* timeout for ADDBA response */ -SYSCTL_PROC(_net_wlan, OID_AUTO, addba_timeout, CTLTYPE_INT | CTLFLAG_RW, - &ieee80211_addba_timeout, 0, ieee80211_sysctl_msecs_ticks, "I", - "ADDBA request timeout (ms)"); +SYSCTL_PROC(_net_wlan, OID_AUTO, addba_timeout, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &ieee80211_addba_timeout, 0, ieee80211_sysctl_msecs_ticks, "I", + "ADDBA request timeout (ms)"); static int ieee80211_addba_backoff = -1;/* backoff after max ADDBA requests */ -SYSCTL_PROC(_net_wlan, OID_AUTO, addba_backoff, CTLTYPE_INT | CTLFLAG_RW, - &ieee80211_addba_backoff, 0, ieee80211_sysctl_msecs_ticks, "I", - "ADDBA request backoff (ms)"); +SYSCTL_PROC(_net_wlan, OID_AUTO, addba_backoff, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &ieee80211_addba_backoff, 0, ieee80211_sysctl_msecs_ticks, "I", + "ADDBA request backoff (ms)"); static int ieee80211_addba_maxtries = 3;/* max ADDBA requests before backoff */ SYSCTL_INT(_net_wlan, OID_AUTO, addba_maxtries, CTLFLAG_RW, &ieee80211_addba_maxtries, 0, "max ADDBA requests sent before backoff"); static int ieee80211_bar_timeout = -1; /* timeout waiting for BAR response */ static int ieee80211_bar_maxtries = 50;/* max BAR requests before DELBA */ static ieee80211_recv_action_func ht_recv_action_ba_addba_request; static ieee80211_recv_action_func ht_recv_action_ba_addba_response; static ieee80211_recv_action_func ht_recv_action_ba_delba; static ieee80211_recv_action_func ht_recv_action_ht_mimopwrsave; static ieee80211_recv_action_func ht_recv_action_ht_txchwidth; static ieee80211_send_action_func ht_send_action_ba_addba; static ieee80211_send_action_func ht_send_action_ba_delba; static ieee80211_send_action_func ht_send_action_ht_txchwidth; static void ieee80211_ht_init(void) { /* * Setup HT parameters that depends on the clock frequency. */ ieee80211_ampdu_age = msecs_to_ticks(500); ieee80211_addba_timeout = msecs_to_ticks(250); ieee80211_addba_backoff = msecs_to_ticks(10*1000); ieee80211_bar_timeout = msecs_to_ticks(250); /* * Register action frame handlers. */ ieee80211_recv_action_register(IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_ADDBA_REQUEST, ht_recv_action_ba_addba_request); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_ADDBA_RESPONSE, ht_recv_action_ba_addba_response); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_DELBA, ht_recv_action_ba_delba); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_HT, IEEE80211_ACTION_HT_MIMOPWRSAVE, ht_recv_action_ht_mimopwrsave); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_HT, IEEE80211_ACTION_HT_TXCHWIDTH, ht_recv_action_ht_txchwidth); ieee80211_send_action_register(IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_ADDBA_REQUEST, ht_send_action_ba_addba); ieee80211_send_action_register(IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_ADDBA_RESPONSE, ht_send_action_ba_addba); ieee80211_send_action_register(IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_DELBA, ht_send_action_ba_delba); ieee80211_send_action_register(IEEE80211_ACTION_CAT_HT, IEEE80211_ACTION_HT_TXCHWIDTH, ht_send_action_ht_txchwidth); } SYSINIT(wlan_ht, SI_SUB_DRIVERS, SI_ORDER_FIRST, ieee80211_ht_init, NULL); static int ieee80211_ampdu_enable(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap); static int ieee80211_addba_request(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int dialogtoken, int baparamset, int batimeout); static int ieee80211_addba_response(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int code, int baparamset, int batimeout); static void ieee80211_addba_stop(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap); static void null_addba_response_timeout(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap); static void ieee80211_bar_response(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int status); static void ampdu_tx_stop(struct ieee80211_tx_ampdu *tap); static void bar_stop_timer(struct ieee80211_tx_ampdu *tap); static int ampdu_rx_start(struct ieee80211_node *, struct ieee80211_rx_ampdu *, int baparamset, int batimeout, int baseqctl); static void ampdu_rx_stop(struct ieee80211_node *, struct ieee80211_rx_ampdu *); void ieee80211_ht_attach(struct ieee80211com *ic) { /* setup default aggregation policy */ ic->ic_recv_action = ieee80211_recv_action; ic->ic_send_action = ieee80211_send_action; ic->ic_ampdu_enable = ieee80211_ampdu_enable; ic->ic_addba_request = ieee80211_addba_request; ic->ic_addba_response = ieee80211_addba_response; ic->ic_addba_response_timeout = null_addba_response_timeout; ic->ic_addba_stop = ieee80211_addba_stop; ic->ic_bar_response = ieee80211_bar_response; ic->ic_ampdu_rx_start = ampdu_rx_start; ic->ic_ampdu_rx_stop = ampdu_rx_stop; ic->ic_htprotmode = IEEE80211_PROT_RTSCTS; ic->ic_curhtprotmode = IEEE80211_HTINFO_OPMODE_PURE; } void ieee80211_ht_detach(struct ieee80211com *ic) { } void ieee80211_ht_vattach(struct ieee80211vap *vap) { /* driver can override defaults */ vap->iv_ampdu_rxmax = IEEE80211_HTCAP_MAXRXAMPDU_8K; vap->iv_ampdu_density = IEEE80211_HTCAP_MPDUDENSITY_NA; vap->iv_ampdu_limit = vap->iv_ampdu_rxmax; vap->iv_amsdu_limit = vap->iv_htcaps & IEEE80211_HTCAP_MAXAMSDU; /* tx aggregation traffic thresholds */ vap->iv_ampdu_mintraffic[WME_AC_BK] = 128; vap->iv_ampdu_mintraffic[WME_AC_BE] = 64; vap->iv_ampdu_mintraffic[WME_AC_VO] = 32; vap->iv_ampdu_mintraffic[WME_AC_VI] = 32; if (vap->iv_htcaps & IEEE80211_HTC_HT) { /* * Device is HT capable; enable all HT-related * facilities by default. * XXX these choices may be too aggressive. */ vap->iv_flags_ht |= IEEE80211_FHT_HT | IEEE80211_FHT_HTCOMPAT ; if (vap->iv_htcaps & IEEE80211_HTCAP_SHORTGI20) vap->iv_flags_ht |= IEEE80211_FHT_SHORTGI20; /* XXX infer from channel list? */ if (vap->iv_htcaps & IEEE80211_HTCAP_CHWIDTH40) { vap->iv_flags_ht |= IEEE80211_FHT_USEHT40; if (vap->iv_htcaps & IEEE80211_HTCAP_SHORTGI40) vap->iv_flags_ht |= IEEE80211_FHT_SHORTGI40; } /* enable RIFS if capable */ if (vap->iv_htcaps & IEEE80211_HTC_RIFS) vap->iv_flags_ht |= IEEE80211_FHT_RIFS; /* NB: A-MPDU and A-MSDU rx are mandated, these are tx only */ vap->iv_flags_ht |= IEEE80211_FHT_AMPDU_RX; if (vap->iv_htcaps & IEEE80211_HTC_AMPDU) vap->iv_flags_ht |= IEEE80211_FHT_AMPDU_TX; vap->iv_flags_ht |= IEEE80211_FHT_AMSDU_RX; if (vap->iv_htcaps & IEEE80211_HTC_AMSDU) vap->iv_flags_ht |= IEEE80211_FHT_AMSDU_TX; if (vap->iv_htcaps & IEEE80211_HTCAP_TXSTBC) vap->iv_flags_ht |= IEEE80211_FHT_STBC_TX; if (vap->iv_htcaps & IEEE80211_HTCAP_RXSTBC) vap->iv_flags_ht |= IEEE80211_FHT_STBC_RX; if (vap->iv_htcaps & IEEE80211_HTCAP_LDPC) vap->iv_flags_ht |= IEEE80211_FHT_LDPC_RX; if (vap->iv_htcaps & IEEE80211_HTC_TXLDPC) vap->iv_flags_ht |= IEEE80211_FHT_LDPC_TX; } /* NB: disable default legacy WDS, too many issues right now */ if (vap->iv_flags_ext & IEEE80211_FEXT_WDSLEGACY) vap->iv_flags_ht &= ~IEEE80211_FHT_HT; } void ieee80211_ht_vdetach(struct ieee80211vap *vap) { } static int ht_getrate(struct ieee80211com *ic, int index, enum ieee80211_phymode mode, int ratetype) { int mword, rate; mword = ieee80211_rate2media(ic, index | IEEE80211_RATE_MCS, mode); if (IFM_SUBTYPE(mword) != IFM_IEEE80211_MCS) return (0); switch (ratetype) { case 0: rate = ieee80211_htrates[index].ht20_rate_800ns; break; case 1: rate = ieee80211_htrates[index].ht20_rate_400ns; break; case 2: rate = ieee80211_htrates[index].ht40_rate_800ns; break; default: rate = ieee80211_htrates[index].ht40_rate_400ns; break; } return (rate); } static struct printranges { int minmcs; int maxmcs; int txstream; int ratetype; int htcapflags; } ranges[] = { { 0, 7, 1, 0, 0 }, { 8, 15, 2, 0, 0 }, { 16, 23, 3, 0, 0 }, { 24, 31, 4, 0, 0 }, { 32, 0, 1, 2, IEEE80211_HTC_TXMCS32 }, { 33, 38, 2, 0, IEEE80211_HTC_TXUNEQUAL }, { 39, 52, 3, 0, IEEE80211_HTC_TXUNEQUAL }, { 53, 76, 4, 0, IEEE80211_HTC_TXUNEQUAL }, { 0, 0, 0, 0, 0 }, }; static void ht_rateprint(struct ieee80211com *ic, enum ieee80211_phymode mode, int ratetype) { int minrate, maxrate; struct printranges *range; for (range = ranges; range->txstream != 0; range++) { if (ic->ic_txstream < range->txstream) continue; if (range->htcapflags && (ic->ic_htcaps & range->htcapflags) == 0) continue; if (ratetype < range->ratetype) continue; minrate = ht_getrate(ic, range->minmcs, mode, ratetype); maxrate = ht_getrate(ic, range->maxmcs, mode, ratetype); if (range->maxmcs) { ic_printf(ic, "MCS %d-%d: %d%sMbps - %d%sMbps\n", range->minmcs, range->maxmcs, minrate/2, ((minrate & 0x1) != 0 ? ".5" : ""), maxrate/2, ((maxrate & 0x1) != 0 ? ".5" : "")); } else { ic_printf(ic, "MCS %d: %d%sMbps\n", range->minmcs, minrate/2, ((minrate & 0x1) != 0 ? ".5" : "")); } } } static void ht_announce(struct ieee80211com *ic, enum ieee80211_phymode mode) { const char *modestr = ieee80211_phymode_name[mode]; ic_printf(ic, "%s MCS 20MHz\n", modestr); ht_rateprint(ic, mode, 0); if (ic->ic_htcaps & IEEE80211_HTCAP_SHORTGI20) { ic_printf(ic, "%s MCS 20MHz SGI\n", modestr); ht_rateprint(ic, mode, 1); } if (ic->ic_htcaps & IEEE80211_HTCAP_CHWIDTH40) { ic_printf(ic, "%s MCS 40MHz:\n", modestr); ht_rateprint(ic, mode, 2); } if ((ic->ic_htcaps & IEEE80211_HTCAP_CHWIDTH40) && (ic->ic_htcaps & IEEE80211_HTCAP_SHORTGI40)) { ic_printf(ic, "%s MCS 40MHz SGI:\n", modestr); ht_rateprint(ic, mode, 3); } } void ieee80211_ht_announce(struct ieee80211com *ic) { if (isset(ic->ic_modecaps, IEEE80211_MODE_11NA) || isset(ic->ic_modecaps, IEEE80211_MODE_11NG)) ic_printf(ic, "%dT%dR\n", ic->ic_txstream, ic->ic_rxstream); if (isset(ic->ic_modecaps, IEEE80211_MODE_11NA)) ht_announce(ic, IEEE80211_MODE_11NA); if (isset(ic->ic_modecaps, IEEE80211_MODE_11NG)) ht_announce(ic, IEEE80211_MODE_11NG); } void ieee80211_init_suphtrates(struct ieee80211com *ic) { #define ADDRATE(x) do { \ htrateset->rs_rates[htrateset->rs_nrates] = x; \ htrateset->rs_nrates++; \ } while (0) struct ieee80211_htrateset *htrateset = &ic->ic_sup_htrates; int i; memset(htrateset, 0, sizeof(struct ieee80211_htrateset)); for (i = 0; i < ic->ic_txstream * 8; i++) ADDRATE(i); if ((ic->ic_htcaps & IEEE80211_HTCAP_CHWIDTH40) && (ic->ic_htcaps & IEEE80211_HTC_TXMCS32)) ADDRATE(32); if (ic->ic_htcaps & IEEE80211_HTC_TXUNEQUAL) { if (ic->ic_txstream >= 2) { for (i = 33; i <= 38; i++) ADDRATE(i); } if (ic->ic_txstream >= 3) { for (i = 39; i <= 52; i++) ADDRATE(i); } if (ic->ic_txstream == 4) { for (i = 53; i <= 76; i++) ADDRATE(i); } } #undef ADDRATE } /* * Receive processing. */ /* * Decap the encapsulated A-MSDU frames and dispatch all but * the last for delivery. The last frame is returned for * delivery via the normal path. */ struct mbuf * ieee80211_decap_amsdu(struct ieee80211_node *ni, struct mbuf *m) { struct ieee80211vap *vap = ni->ni_vap; int framelen; struct mbuf *n; /* discard 802.3 header inserted by ieee80211_decap */ m_adj(m, sizeof(struct ether_header)); vap->iv_stats.is_amsdu_decap++; for (;;) { /* * Decap the first frame, bust it apart from the * remainder and deliver. We leave the last frame * delivery to the caller (for consistency with other * code paths, could also do it here). */ m = ieee80211_decap1(m, &framelen); if (m == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "a-msdu", "%s", "decap failed"); vap->iv_stats.is_amsdu_tooshort++; return NULL; } if (m->m_pkthdr.len == framelen) break; n = m_split(m, framelen, M_NOWAIT); if (n == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "a-msdu", "%s", "unable to split encapsulated frames"); vap->iv_stats.is_amsdu_split++; m_freem(m); /* NB: must reclaim */ return NULL; } vap->iv_deliver_data(vap, ni, m); /* * Remove frame contents; each intermediate frame * is required to be aligned to a 4-byte boundary. */ m = n; m_adj(m, roundup2(framelen, 4) - framelen); /* padding */ } return m; /* last delivered by caller */ } /* * Add the given frame to the current RX reorder slot. * * For future offloaded A-MSDU handling where multiple frames with * the same sequence number show up here, this routine will append * those frames as long as they're appropriately tagged. */ static int ampdu_rx_add_slot(struct ieee80211_rx_ampdu *rap, int off, int tid, ieee80211_seq rxseq, struct ieee80211_node *ni, struct mbuf *m) { struct ieee80211vap *vap = ni->ni_vap; if (rap->rxa_m[off] == NULL) { rap->rxa_m[off] = m; rap->rxa_qframes++; rap->rxa_qbytes += m->m_pkthdr.len; vap->iv_stats.is_ampdu_rx_reorder++; return (0); } else { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT | IEEE80211_MSG_11N, ni->ni_macaddr, "a-mpdu duplicate", "seqno %u tid %u BA win <%u:%u>", rxseq, tid, rap->rxa_start, IEEE80211_SEQ_ADD(rap->rxa_start, rap->rxa_wnd-1)); vap->iv_stats.is_rx_dup++; IEEE80211_NODE_STAT(ni, rx_dup); m_freem(m); return (-1); } } static void ampdu_rx_purge_slot(struct ieee80211_rx_ampdu *rap, int i) { struct mbuf *m; m = rap->rxa_m[i]; if (m == NULL) return; rap->rxa_m[i] = NULL; rap->rxa_qbytes -= m->m_pkthdr.len; rap->rxa_qframes--; m_freem(m); } /* * Purge all frames in the A-MPDU re-order queue. */ static void ampdu_rx_purge(struct ieee80211_rx_ampdu *rap) { int i; for (i = 0; i < rap->rxa_wnd; i++) { ampdu_rx_purge_slot(rap, i); if (rap->rxa_qframes == 0) break; } KASSERT(rap->rxa_qbytes == 0 && rap->rxa_qframes == 0, ("lost %u data, %u frames on ampdu rx q", rap->rxa_qbytes, rap->rxa_qframes)); } /* * Start A-MPDU rx/re-order processing for the specified TID. */ static int ampdu_rx_start(struct ieee80211_node *ni, struct ieee80211_rx_ampdu *rap, int baparamset, int batimeout, int baseqctl) { int bufsiz = MS(baparamset, IEEE80211_BAPS_BUFSIZ); if (rap->rxa_flags & IEEE80211_AGGR_RUNNING) { /* * AMPDU previously setup and not terminated with a DELBA, * flush the reorder q's in case anything remains. */ ampdu_rx_purge(rap); } memset(rap, 0, sizeof(*rap)); rap->rxa_wnd = (bufsiz == 0) ? IEEE80211_AGGR_BAWMAX : min(bufsiz, IEEE80211_AGGR_BAWMAX); rap->rxa_start = MS(baseqctl, IEEE80211_BASEQ_START); rap->rxa_flags |= IEEE80211_AGGR_RUNNING | IEEE80211_AGGR_XCHGPEND; return 0; } /* * Public function; manually setup the RX ampdu state. */ int ieee80211_ampdu_rx_start_ext(struct ieee80211_node *ni, int tid, int seq, int baw) { struct ieee80211_rx_ampdu *rap; /* XXX TODO: sanity check tid, seq, baw */ rap = &ni->ni_rx_ampdu[tid]; if (rap->rxa_flags & IEEE80211_AGGR_RUNNING) { /* * AMPDU previously setup and not terminated with a DELBA, * flush the reorder q's in case anything remains. */ ampdu_rx_purge(rap); } memset(rap, 0, sizeof(*rap)); rap->rxa_wnd = (baw== 0) ? IEEE80211_AGGR_BAWMAX : min(baw, IEEE80211_AGGR_BAWMAX); if (seq == -1) { /* Wait for the first RX frame, use that as BAW */ rap->rxa_start = 0; rap->rxa_flags |= IEEE80211_AGGR_WAITRX; } else { rap->rxa_start = seq; } rap->rxa_flags |= IEEE80211_AGGR_RUNNING | IEEE80211_AGGR_XCHGPEND; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: tid=%d, start=%d, wnd=%d, flags=0x%08x", __func__, tid, seq, rap->rxa_wnd, rap->rxa_flags); return 0; } /* * Public function; manually stop the RX AMPDU state. */ void ieee80211_ampdu_rx_stop_ext(struct ieee80211_node *ni, int tid) { struct ieee80211_rx_ampdu *rap; /* XXX TODO: sanity check tid, seq, baw */ rap = &ni->ni_rx_ampdu[tid]; ampdu_rx_stop(ni, rap); } /* * Stop A-MPDU rx processing for the specified TID. */ static void ampdu_rx_stop(struct ieee80211_node *ni, struct ieee80211_rx_ampdu *rap) { ampdu_rx_purge(rap); rap->rxa_flags &= ~(IEEE80211_AGGR_RUNNING | IEEE80211_AGGR_XCHGPEND | IEEE80211_AGGR_WAITRX); } /* * Dispatch a frame from the A-MPDU reorder queue. The * frame is fed back into ieee80211_input marked with an * M_AMPDU_MPDU flag so it doesn't come back to us (it also * permits ieee80211_input to optimize re-processing). */ static __inline void ampdu_dispatch(struct ieee80211_node *ni, struct mbuf *m) { m->m_flags |= M_AMPDU_MPDU; /* bypass normal processing */ /* NB: rssi and noise are ignored w/ M_AMPDU_MPDU set */ (void) ieee80211_input(ni, m, 0, 0); } static int ampdu_dispatch_slot(struct ieee80211_rx_ampdu *rap, struct ieee80211_node *ni, int i) { struct mbuf *m; if (rap->rxa_m[i] == NULL) return (0); m = rap->rxa_m[i]; rap->rxa_m[i] = NULL; rap->rxa_qbytes -= m->m_pkthdr.len; rap->rxa_qframes--; ampdu_dispatch(ni, m); return (1); } static void ampdu_rx_moveup(struct ieee80211_rx_ampdu *rap, struct ieee80211_node *ni, int i, int winstart) { struct ieee80211vap *vap = ni->ni_vap; if (rap->rxa_qframes != 0) { int n = rap->rxa_qframes, j; if (winstart != -1) { /* * NB: in window-sliding mode, loop assumes i > 0 * and/or rxa_m[0] is NULL */ KASSERT(rap->rxa_m[0] == NULL, ("%s: BA window slot 0 occupied", __func__)); } for (j = i+1; j < rap->rxa_wnd; j++) { if (rap->rxa_m[j] != NULL) { rap->rxa_m[j-i] = rap->rxa_m[j]; rap->rxa_m[j] = NULL; if (--n == 0) break; } } KASSERT(n == 0, ("%s: lost %d frames, qframes %d off %d " "BA win <%d:%d> winstart %d", __func__, n, rap->rxa_qframes, i, rap->rxa_start, IEEE80211_SEQ_ADD(rap->rxa_start, rap->rxa_wnd-1), winstart)); vap->iv_stats.is_ampdu_rx_copy += rap->rxa_qframes; } } /* * Dispatch as many frames as possible from the re-order queue. * Frames will always be "at the front"; we process all frames * up to the first empty slot in the window. On completion we * cleanup state if there are still pending frames in the current * BA window. We assume the frame at slot 0 is already handled * by the caller; we always start at slot 1. */ static void ampdu_rx_dispatch(struct ieee80211_rx_ampdu *rap, struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; int i; /* flush run of frames */ for (i = 1; i < rap->rxa_wnd; i++) { if (ampdu_dispatch_slot(rap, ni, i) == 0) break; } /* * If frames remain, copy the mbuf pointers down so * they correspond to the offsets in the new window. */ ampdu_rx_moveup(rap, ni, i, -1); /* * Adjust the start of the BA window to * reflect the frames just dispatched. */ rap->rxa_start = IEEE80211_SEQ_ADD(rap->rxa_start, i); vap->iv_stats.is_ampdu_rx_oor += i; } /* * Dispatch all frames in the A-MPDU re-order queue. */ static void ampdu_rx_flush(struct ieee80211_node *ni, struct ieee80211_rx_ampdu *rap) { struct ieee80211vap *vap = ni->ni_vap; int i, r; for (i = 0; i < rap->rxa_wnd; i++) { r = ampdu_dispatch_slot(rap, ni, i); if (r == 0) continue; vap->iv_stats.is_ampdu_rx_oor += r; if (rap->rxa_qframes == 0) break; } } /* * Dispatch all frames in the A-MPDU re-order queue * preceding the specified sequence number. This logic * handles window moves due to a received MSDU or BAR. */ static void ampdu_rx_flush_upto(struct ieee80211_node *ni, struct ieee80211_rx_ampdu *rap, ieee80211_seq winstart) { struct ieee80211vap *vap = ni->ni_vap; ieee80211_seq seqno; int i, r; /* * Flush any complete MSDU's with a sequence number lower * than winstart. Gaps may exist. Note that we may actually * dispatch frames past winstart if a run continues; this is * an optimization that avoids having to do a separate pass * to dispatch frames after moving the BA window start. */ seqno = rap->rxa_start; for (i = 0; i < rap->rxa_wnd; i++) { r = ampdu_dispatch_slot(rap, ni, i); if (r == 0) { if (!IEEE80211_SEQ_BA_BEFORE(seqno, winstart)) break; } vap->iv_stats.is_ampdu_rx_oor += r; seqno = IEEE80211_SEQ_INC(seqno); } /* * If frames remain, copy the mbuf pointers down so * they correspond to the offsets in the new window. */ ampdu_rx_moveup(rap, ni, i, winstart); /* * Move the start of the BA window; we use the * sequence number of the last MSDU that was * passed up the stack+1 or winstart if stopped on * a gap in the reorder buffer. */ rap->rxa_start = seqno; } /* * Process a received QoS data frame for an HT station. Handle * A-MPDU reordering: if this frame is received out of order * and falls within the BA window hold onto it. Otherwise if * this frame completes a run, flush any pending frames. We * return 1 if the frame is consumed. A 0 is returned if * the frame should be processed normally by the caller. */ int ieee80211_ampdu_reorder(struct ieee80211_node *ni, struct mbuf *m, const struct ieee80211_rx_stats *rxs) { #define PROCESS 0 /* caller should process frame */ #define CONSUMED 1 /* frame consumed, caller does nothing */ struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_qosframe *wh; struct ieee80211_rx_ampdu *rap; ieee80211_seq rxseq; uint8_t tid; int off; KASSERT((m->m_flags & (M_AMPDU | M_AMPDU_MPDU)) == M_AMPDU, ("!a-mpdu or already re-ordered, flags 0x%x", m->m_flags)); KASSERT(ni->ni_flags & IEEE80211_NODE_HT, ("not an HT sta")); /* NB: m_len known to be sufficient */ wh = mtod(m, struct ieee80211_qosframe *); if (wh->i_fc[0] != IEEE80211_FC0_QOSDATA) { /* * Not QoS data, shouldn't get here but just * return it to the caller for processing. */ return PROCESS; } /* * 802.11-2012 9.3.2.10 - Duplicate detection and recovery. * * Multicast QoS data frames are checked against a different * counter, not the per-TID counter. */ if (IEEE80211_IS_MULTICAST(wh->i_addr1)) return PROCESS; tid = ieee80211_getqos(wh)[0]; tid &= IEEE80211_QOS_TID; rap = &ni->ni_rx_ampdu[tid]; if ((rap->rxa_flags & IEEE80211_AGGR_XCHGPEND) == 0) { /* * No ADDBA request yet, don't touch. */ return PROCESS; } rxseq = le16toh(*(uint16_t *)wh->i_seq); if ((rxseq & IEEE80211_SEQ_FRAG_MASK) != 0) { /* * Fragments are not allowed; toss. */ IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT | IEEE80211_MSG_11N, ni->ni_macaddr, "A-MPDU", "fragment, rxseq 0x%x tid %u%s", rxseq, tid, wh->i_fc[1] & IEEE80211_FC1_RETRY ? " (retransmit)" : ""); vap->iv_stats.is_ampdu_rx_drop++; IEEE80211_NODE_STAT(ni, rx_drop); m_freem(m); return CONSUMED; } rxseq >>= IEEE80211_SEQ_SEQ_SHIFT; rap->rxa_nframes++; /* * Handle waiting for the first frame to define the BAW. * Some firmware doesn't provide the RX of the starting point * of the BAW and we have to cope. */ if (rap->rxa_flags & IEEE80211_AGGR_WAITRX) { rap->rxa_flags &= ~IEEE80211_AGGR_WAITRX; rap->rxa_start = rxseq; } again: if (rxseq == rap->rxa_start) { /* * First frame in window. */ if (rap->rxa_qframes != 0) { /* * Dispatch as many packets as we can. */ KASSERT(rap->rxa_m[0] == NULL, ("unexpected dup")); ampdu_dispatch(ni, m); ampdu_rx_dispatch(rap, ni); return CONSUMED; } else { /* * In order; advance window and notify * caller to dispatch directly. */ rap->rxa_start = IEEE80211_SEQ_INC(rxseq); return PROCESS; } } /* * Frame is out of order; store if in the BA window. */ /* calculate offset in BA window */ off = IEEE80211_SEQ_SUB(rxseq, rap->rxa_start); if (off < rap->rxa_wnd) { /* * Common case (hopefully): in the BA window. * Sec 9.10.7.6.2 a) (p.137) */ /* * Check for frames sitting too long in the reorder queue. * This should only ever happen if frames are not delivered * without the sender otherwise notifying us (e.g. with a * BAR to move the window). Typically this happens because * of vendor bugs that cause the sequence number to jump. * When this happens we get a gap in the reorder queue that * leaves frame sitting on the queue until they get pushed * out due to window moves. When the vendor does not send * BAR this move only happens due to explicit packet sends * * NB: we only track the time of the oldest frame in the * reorder q; this means that if we flush we might push * frames that still "new"; if this happens then subsequent * frames will result in BA window moves which cost something * but is still better than a big throughput dip. */ if (rap->rxa_qframes != 0) { /* XXX honor batimeout? */ if (ticks - rap->rxa_age > ieee80211_ampdu_age) { /* * Too long since we received the first * frame; flush the reorder buffer. */ if (rap->rxa_qframes != 0) { vap->iv_stats.is_ampdu_rx_age += rap->rxa_qframes; ampdu_rx_flush(ni, rap); } rap->rxa_start = IEEE80211_SEQ_INC(rxseq); return PROCESS; } } else { /* * First frame, start aging timer. */ rap->rxa_age = ticks; } /* save packet - this consumes, no matter what */ ampdu_rx_add_slot(rap, off, tid, rxseq, ni, m); return CONSUMED; } if (off < IEEE80211_SEQ_BA_RANGE) { /* * Outside the BA window, but within range; * flush the reorder q and move the window. * Sec 9.10.7.6.2 b) (p.138) */ IEEE80211_NOTE(vap, IEEE80211_MSG_11N, ni, "move BA win <%u:%u> (%u frames) rxseq %u tid %u", rap->rxa_start, IEEE80211_SEQ_ADD(rap->rxa_start, rap->rxa_wnd-1), rap->rxa_qframes, rxseq, tid); vap->iv_stats.is_ampdu_rx_move++; /* * The spec says to flush frames up to but not including: * WinStart_B = rxseq - rap->rxa_wnd + 1 * Then insert the frame or notify the caller to process * it immediately. We can safely do this by just starting * over again because we know the frame will now be within * the BA window. */ /* NB: rxa_wnd known to be >0 */ ampdu_rx_flush_upto(ni, rap, IEEE80211_SEQ_SUB(rxseq, rap->rxa_wnd-1)); goto again; } else { /* * Outside the BA window and out of range; toss. * Sec 9.10.7.6.2 c) (p.138) */ IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT | IEEE80211_MSG_11N, ni->ni_macaddr, "MPDU", "BA win <%u:%u> (%u frames) rxseq %u tid %u%s", rap->rxa_start, IEEE80211_SEQ_ADD(rap->rxa_start, rap->rxa_wnd-1), rap->rxa_qframes, rxseq, tid, wh->i_fc[1] & IEEE80211_FC1_RETRY ? " (retransmit)" : ""); vap->iv_stats.is_ampdu_rx_drop++; IEEE80211_NODE_STAT(ni, rx_drop); m_freem(m); return CONSUMED; } #undef CONSUMED #undef PROCESS } /* * Process a BAR ctl frame. Dispatch all frames up to * the sequence number of the frame. If this frame is * out of range it's discarded. */ void ieee80211_recv_bar(struct ieee80211_node *ni, struct mbuf *m0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_frame_bar *wh; struct ieee80211_rx_ampdu *rap; ieee80211_seq rxseq; int tid, off; if (!ieee80211_recv_bar_ena) { #if 0 IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_11N, ni->ni_macaddr, "BAR", "%s", "processing disabled"); #endif vap->iv_stats.is_ampdu_bar_bad++; return; } wh = mtod(m0, struct ieee80211_frame_bar *); /* XXX check basic BAR */ tid = MS(le16toh(wh->i_ctl), IEEE80211_BAR_TID); rap = &ni->ni_rx_ampdu[tid]; if ((rap->rxa_flags & IEEE80211_AGGR_XCHGPEND) == 0) { /* * No ADDBA request yet, don't touch. */ IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT | IEEE80211_MSG_11N, ni->ni_macaddr, "BAR", "no BA stream, tid %u", tid); vap->iv_stats.is_ampdu_bar_bad++; return; } vap->iv_stats.is_ampdu_bar_rx++; rxseq = le16toh(wh->i_seq) >> IEEE80211_SEQ_SEQ_SHIFT; if (rxseq == rap->rxa_start) return; /* calculate offset in BA window */ off = IEEE80211_SEQ_SUB(rxseq, rap->rxa_start); if (off < IEEE80211_SEQ_BA_RANGE) { /* * Flush the reorder q up to rxseq and move the window. * Sec 9.10.7.6.3 a) (p.138) */ IEEE80211_NOTE(vap, IEEE80211_MSG_11N, ni, "BAR moves BA win <%u:%u> (%u frames) rxseq %u tid %u", rap->rxa_start, IEEE80211_SEQ_ADD(rap->rxa_start, rap->rxa_wnd-1), rap->rxa_qframes, rxseq, tid); vap->iv_stats.is_ampdu_bar_move++; ampdu_rx_flush_upto(ni, rap, rxseq); if (off >= rap->rxa_wnd) { /* * BAR specifies a window start to the right of BA * window; we must move it explicitly since * ampdu_rx_flush_upto will not. */ rap->rxa_start = rxseq; } } else { /* * Out of range; toss. * Sec 9.10.7.6.3 b) (p.138) */ IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT | IEEE80211_MSG_11N, ni->ni_macaddr, "BAR", "BA win <%u:%u> (%u frames) rxseq %u tid %u%s", rap->rxa_start, IEEE80211_SEQ_ADD(rap->rxa_start, rap->rxa_wnd-1), rap->rxa_qframes, rxseq, tid, wh->i_fc[1] & IEEE80211_FC1_RETRY ? " (retransmit)" : ""); vap->iv_stats.is_ampdu_bar_oow++; IEEE80211_NODE_STAT(ni, rx_drop); } } /* * Setup HT-specific state in a node. Called only * when HT use is negotiated so we don't do extra * work for temporary and/or legacy sta's. */ void ieee80211_ht_node_init(struct ieee80211_node *ni) { struct ieee80211_tx_ampdu *tap; int tid; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: called (%p)", __func__, ni); if (ni->ni_flags & IEEE80211_NODE_HT) { /* * Clean AMPDU state on re-associate. This handles the case * where a station leaves w/o notifying us and then returns * before node is reaped for inactivity. */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: calling cleanup (%p)", __func__, ni); ieee80211_ht_node_cleanup(ni); } for (tid = 0; tid < WME_NUM_TID; tid++) { tap = &ni->ni_tx_ampdu[tid]; tap->txa_tid = tid; tap->txa_ni = ni; ieee80211_txampdu_init_pps(tap); /* NB: further initialization deferred */ } ni->ni_flags |= IEEE80211_NODE_HT | IEEE80211_NODE_AMPDU; } /* * Cleanup HT-specific state in a node. Called only * when HT use has been marked. */ void ieee80211_ht_node_cleanup(struct ieee80211_node *ni) { struct ieee80211com *ic = ni->ni_ic; int i; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: called (%p)", __func__, ni); KASSERT(ni->ni_flags & IEEE80211_NODE_HT, ("not an HT node")); /* XXX optimize this */ for (i = 0; i < WME_NUM_TID; i++) { struct ieee80211_tx_ampdu *tap = &ni->ni_tx_ampdu[i]; if (tap->txa_flags & IEEE80211_AGGR_SETUP) ampdu_tx_stop(tap); } for (i = 0; i < WME_NUM_TID; i++) ic->ic_ampdu_rx_stop(ni, &ni->ni_rx_ampdu[i]); ni->ni_htcap = 0; ni->ni_flags &= ~IEEE80211_NODE_HT_ALL; } /* * Age out HT resources for a station. */ void ieee80211_ht_node_age(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; uint8_t tid; KASSERT(ni->ni_flags & IEEE80211_NODE_HT, ("not an HT sta")); for (tid = 0; tid < WME_NUM_TID; tid++) { struct ieee80211_rx_ampdu *rap; rap = &ni->ni_rx_ampdu[tid]; if ((rap->rxa_flags & IEEE80211_AGGR_XCHGPEND) == 0) continue; if (rap->rxa_qframes == 0) continue; /* * Check for frames sitting too long in the reorder queue. * See above for more details on what's happening here. */ /* XXX honor batimeout? */ if (ticks - rap->rxa_age > ieee80211_ampdu_age) { /* * Too long since we received the first * frame; flush the reorder buffer. */ vap->iv_stats.is_ampdu_rx_age += rap->rxa_qframes; ampdu_rx_flush(ni, rap); } } } static struct ieee80211_channel * findhtchan(struct ieee80211com *ic, struct ieee80211_channel *c, int htflags) { return ieee80211_find_channel(ic, c->ic_freq, (c->ic_flags &~ IEEE80211_CHAN_HT) | htflags); } /* * Adjust a channel to be HT/non-HT according to the vap's configuration. */ struct ieee80211_channel * ieee80211_ht_adjust_channel(struct ieee80211com *ic, struct ieee80211_channel *chan, int flags) { struct ieee80211_channel *c; if (flags & IEEE80211_FHT_HT) { /* promote to HT if possible */ if (flags & IEEE80211_FHT_USEHT40) { if (!IEEE80211_IS_CHAN_HT40(chan)) { /* NB: arbitrarily pick ht40+ over ht40- */ c = findhtchan(ic, chan, IEEE80211_CHAN_HT40U); if (c == NULL) c = findhtchan(ic, chan, IEEE80211_CHAN_HT40D); if (c == NULL) c = findhtchan(ic, chan, IEEE80211_CHAN_HT20); if (c != NULL) chan = c; } } else if (!IEEE80211_IS_CHAN_HT20(chan)) { c = findhtchan(ic, chan, IEEE80211_CHAN_HT20); if (c != NULL) chan = c; } } else if (IEEE80211_IS_CHAN_HT(chan)) { /* demote to legacy, HT use is disabled */ c = ieee80211_find_channel(ic, chan->ic_freq, chan->ic_flags &~ IEEE80211_CHAN_HT); if (c != NULL) chan = c; } return chan; } /* * Setup HT-specific state for a legacy WDS peer. */ void ieee80211_ht_wds_init(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_tx_ampdu *tap; int tid; KASSERT(vap->iv_flags_ht & IEEE80211_FHT_HT, ("no HT requested")); /* XXX check scan cache in case peer has an ap and we have info */ /* * If setup with a legacy channel; locate an HT channel. * Otherwise if the inherited channel (from a companion * AP) is suitable use it so we use the same location * for the extension channel). */ ni->ni_chan = ieee80211_ht_adjust_channel(ni->ni_ic, ni->ni_chan, ieee80211_htchanflags(ni->ni_chan)); ni->ni_htcap = 0; if (vap->iv_flags_ht & IEEE80211_FHT_SHORTGI20) ni->ni_htcap |= IEEE80211_HTCAP_SHORTGI20; if (IEEE80211_IS_CHAN_HT40(ni->ni_chan)) { ni->ni_htcap |= IEEE80211_HTCAP_CHWIDTH40; ni->ni_chw = 40; if (IEEE80211_IS_CHAN_HT40U(ni->ni_chan)) ni->ni_ht2ndchan = IEEE80211_HTINFO_2NDCHAN_ABOVE; else if (IEEE80211_IS_CHAN_HT40D(ni->ni_chan)) ni->ni_ht2ndchan = IEEE80211_HTINFO_2NDCHAN_BELOW; if (vap->iv_flags_ht & IEEE80211_FHT_SHORTGI40) ni->ni_htcap |= IEEE80211_HTCAP_SHORTGI40; } else { ni->ni_chw = 20; ni->ni_ht2ndchan = IEEE80211_HTINFO_2NDCHAN_NONE; } ni->ni_htctlchan = ni->ni_chan->ic_ieee; if (vap->iv_flags_ht & IEEE80211_FHT_RIFS) ni->ni_flags |= IEEE80211_NODE_RIFS; /* XXX does it make sense to enable SMPS? */ ni->ni_htopmode = 0; /* XXX need protection state */ ni->ni_htstbc = 0; /* XXX need info */ for (tid = 0; tid < WME_NUM_TID; tid++) { tap = &ni->ni_tx_ampdu[tid]; tap->txa_tid = tid; ieee80211_txampdu_init_pps(tap); } /* NB: AMPDU tx/rx governed by IEEE80211_FHT_AMPDU_{TX,RX} */ ni->ni_flags |= IEEE80211_NODE_HT | IEEE80211_NODE_AMPDU; } /* * Notify hostap vaps of a change in the HTINFO ie. */ static void htinfo_notify(struct ieee80211com *ic) { struct ieee80211vap *vap; int first = 1; IEEE80211_LOCK_ASSERT(ic); TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { if (vap->iv_opmode != IEEE80211_M_HOSTAP) continue; if (vap->iv_state != IEEE80211_S_RUN || !IEEE80211_IS_CHAN_HT(vap->iv_bss->ni_chan)) continue; if (first) { IEEE80211_NOTE(vap, IEEE80211_MSG_ASSOC | IEEE80211_MSG_11N, vap->iv_bss, "HT bss occupancy change: %d sta, %d ht, " "%d ht40%s, HT protmode now 0x%x" , ic->ic_sta_assoc , ic->ic_ht_sta_assoc , ic->ic_ht40_sta_assoc , (ic->ic_flags_ht & IEEE80211_FHT_NONHT_PR) ? ", non-HT sta present" : "" , ic->ic_curhtprotmode); first = 0; } ieee80211_beacon_notify(vap, IEEE80211_BEACON_HTINFO); } } /* * Calculate HT protection mode from current * state and handle updates. */ static void htinfo_update(struct ieee80211com *ic) { uint8_t protmode; if (ic->ic_sta_assoc != ic->ic_ht_sta_assoc) { protmode = IEEE80211_HTINFO_OPMODE_MIXED | IEEE80211_HTINFO_NONHT_PRESENT; } else if (ic->ic_flags_ht & IEEE80211_FHT_NONHT_PR) { protmode = IEEE80211_HTINFO_OPMODE_PROTOPT | IEEE80211_HTINFO_NONHT_PRESENT; } else if (ic->ic_bsschan != IEEE80211_CHAN_ANYC && IEEE80211_IS_CHAN_HT40(ic->ic_bsschan) && ic->ic_sta_assoc != ic->ic_ht40_sta_assoc) { protmode = IEEE80211_HTINFO_OPMODE_HT20PR; } else { protmode = IEEE80211_HTINFO_OPMODE_PURE; } if (protmode != ic->ic_curhtprotmode) { ic->ic_curhtprotmode = protmode; htinfo_notify(ic); } } /* * Handle an HT station joining a BSS. */ void ieee80211_ht_node_join(struct ieee80211_node *ni) { struct ieee80211com *ic = ni->ni_ic; IEEE80211_LOCK_ASSERT(ic); if (ni->ni_flags & IEEE80211_NODE_HT) { ic->ic_ht_sta_assoc++; if (ni->ni_chw == 40) ic->ic_ht40_sta_assoc++; } htinfo_update(ic); } /* * Handle an HT station leaving a BSS. */ void ieee80211_ht_node_leave(struct ieee80211_node *ni) { struct ieee80211com *ic = ni->ni_ic; IEEE80211_LOCK_ASSERT(ic); if (ni->ni_flags & IEEE80211_NODE_HT) { ic->ic_ht_sta_assoc--; if (ni->ni_chw == 40) ic->ic_ht40_sta_assoc--; } htinfo_update(ic); } /* * Public version of htinfo_update; used for processing * beacon frames from overlapping bss. * * Caller can specify either IEEE80211_HTINFO_OPMODE_MIXED * (on receipt of a beacon that advertises MIXED) or * IEEE80211_HTINFO_OPMODE_PROTOPT (on receipt of a beacon * from an overlapping legacy bss). We treat MIXED with * a higher precedence than PROTOPT (i.e. we will not change * change PROTOPT -> MIXED; only MIXED -> PROTOPT). This * corresponds to how we handle things in htinfo_update. */ void ieee80211_htprot_update(struct ieee80211com *ic, int protmode) { #define OPMODE(x) SM(x, IEEE80211_HTINFO_OPMODE) IEEE80211_LOCK(ic); /* track non-HT station presence */ KASSERT(protmode & IEEE80211_HTINFO_NONHT_PRESENT, ("protmode 0x%x", protmode)); ic->ic_flags_ht |= IEEE80211_FHT_NONHT_PR; ic->ic_lastnonht = ticks; if (protmode != ic->ic_curhtprotmode && (OPMODE(ic->ic_curhtprotmode) != IEEE80211_HTINFO_OPMODE_MIXED || OPMODE(protmode) == IEEE80211_HTINFO_OPMODE_PROTOPT)) { /* push beacon update */ ic->ic_curhtprotmode = protmode; htinfo_notify(ic); } IEEE80211_UNLOCK(ic); #undef OPMODE } /* * Time out presence of an overlapping bss with non-HT * stations. When operating in hostap mode we listen for * beacons from other stations and if we identify a non-HT * station is present we update the opmode field of the * HTINFO ie. To identify when all non-HT stations are * gone we time out this condition. */ void ieee80211_ht_timeout(struct ieee80211com *ic) { IEEE80211_LOCK_ASSERT(ic); if ((ic->ic_flags_ht & IEEE80211_FHT_NONHT_PR) && ieee80211_time_after(ticks, ic->ic_lastnonht + IEEE80211_NONHT_PRESENT_AGE)) { #if 0 IEEE80211_NOTE(vap, IEEE80211_MSG_11N, ni, "%s", "time out non-HT STA present on channel"); #endif ic->ic_flags_ht &= ~IEEE80211_FHT_NONHT_PR; htinfo_update(ic); } } /* * Process an 802.11n HT capabilities ie. */ void ieee80211_parse_htcap(struct ieee80211_node *ni, const uint8_t *ie) { if (ie[0] == IEEE80211_ELEMID_VENDOR) { /* * Station used Vendor OUI ie to associate; * mark the node so when we respond we'll use * the Vendor OUI's and not the standard ie's. */ ni->ni_flags |= IEEE80211_NODE_HTCOMPAT; ie += 4; } else ni->ni_flags &= ~IEEE80211_NODE_HTCOMPAT; ni->ni_htcap = le16dec(ie + __offsetof(struct ieee80211_ie_htcap, hc_cap)); ni->ni_htparam = ie[__offsetof(struct ieee80211_ie_htcap, hc_param)]; } static void htinfo_parse(struct ieee80211_node *ni, const struct ieee80211_ie_htinfo *htinfo) { uint16_t w; ni->ni_htctlchan = htinfo->hi_ctrlchannel; ni->ni_ht2ndchan = SM(htinfo->hi_byte1, IEEE80211_HTINFO_2NDCHAN); w = le16dec(&htinfo->hi_byte2); ni->ni_htopmode = SM(w, IEEE80211_HTINFO_OPMODE); w = le16dec(&htinfo->hi_byte45); ni->ni_htstbc = SM(w, IEEE80211_HTINFO_BASIC_STBCMCS); } /* * Parse an 802.11n HT info ie and save useful information * to the node state. Note this does not effect any state * changes such as for channel width change. */ void ieee80211_parse_htinfo(struct ieee80211_node *ni, const uint8_t *ie) { if (ie[0] == IEEE80211_ELEMID_VENDOR) ie += 4; htinfo_parse(ni, (const struct ieee80211_ie_htinfo *) ie); } /* * Handle 11n/11ac channel switch. * * Use the received HT/VHT ie's to identify the right channel to use. * If we cannot locate it in the channel table then fallback to * legacy operation. * * Note that we use this information to identify the node's * channel only; the caller is responsible for insuring any * required channel change is done (e.g. in sta mode when * parsing the contents of a beacon frame). */ static int htinfo_update_chw(struct ieee80211_node *ni, int htflags, int vhtflags) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211_channel *c; int chanflags; int ret = 0; /* * First step - do HT/VHT only channel lookup based on operating mode * flags. This involves masking out the VHT flags as well. * Otherwise we end up doing the full channel walk each time * we trigger this, which is expensive. */ chanflags = (ni->ni_chan->ic_flags &~ (IEEE80211_CHAN_HT | IEEE80211_CHAN_VHT)) | htflags | vhtflags; if (chanflags == ni->ni_chan->ic_flags) goto done; /* * If HT /or/ VHT flags have changed then check both. * We need to start by picking a HT channel anyway. */ c = NULL; chanflags = (ni->ni_chan->ic_flags &~ (IEEE80211_CHAN_HT | IEEE80211_CHAN_VHT)) | htflags; /* XXX not right for ht40- */ c = ieee80211_find_channel(ic, ni->ni_chan->ic_freq, chanflags); if (c == NULL && (htflags & IEEE80211_CHAN_HT40)) { /* * No HT40 channel entry in our table; fall back * to HT20 operation. This should not happen. */ c = findhtchan(ic, ni->ni_chan, IEEE80211_CHAN_HT20); #if 0 IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ASSOC | IEEE80211_MSG_11N, ni, "no HT40 channel (freq %u), falling back to HT20", ni->ni_chan->ic_freq); #endif /* XXX stat */ } /* Nothing found - leave it alone; move onto VHT */ if (c == NULL) c = ni->ni_chan; /* * If it's non-HT, then bail out now. */ if (! IEEE80211_IS_CHAN_HT(c)) { IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ASSOC | IEEE80211_MSG_11N, ni, "not HT; skipping VHT check (%u/0x%x)", c->ic_freq, c->ic_flags); goto done; } /* * Next step - look at the current VHT flags and determine * if we need to upgrade. Mask out the VHT and HT flags since * the vhtflags field will already have the correct HT * flags to use. */ if (IEEE80211_CONF_VHT(ic) && ni->ni_vhtcap != 0 && vhtflags != 0) { chanflags = (c->ic_flags &~ (IEEE80211_CHAN_HT | IEEE80211_CHAN_VHT)) | vhtflags; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ASSOC | IEEE80211_MSG_11N, ni, "%s: VHT; chanwidth=0x%02x; vhtflags=0x%08x", __func__, ni->ni_vht_chanwidth, vhtflags); IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ASSOC | IEEE80211_MSG_11N, ni, "%s: VHT; trying lookup for %d/0x%08x", __func__, c->ic_freq, chanflags); c = ieee80211_find_channel(ic, c->ic_freq, chanflags); } /* Finally, if it's changed */ if (c != NULL && c != ni->ni_chan) { IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ASSOC | IEEE80211_MSG_11N, ni, "switch station to %s%d channel %u/0x%x", IEEE80211_IS_CHAN_VHT(c) ? "VHT" : "HT", IEEE80211_IS_CHAN_VHT80(c) ? 80 : (IEEE80211_IS_CHAN_HT40(c) ? 40 : 20), c->ic_freq, c->ic_flags); ni->ni_chan = c; ret = 1; } /* NB: caller responsible for forcing any channel change */ done: /* update node's (11n) tx channel width */ ni->ni_chw = IEEE80211_IS_CHAN_HT40(ni->ni_chan)? 40 : 20; return (ret); } /* * Update 11n MIMO PS state according to received htcap. */ static __inline int htcap_update_mimo_ps(struct ieee80211_node *ni) { uint16_t oflags = ni->ni_flags; switch (ni->ni_htcap & IEEE80211_HTCAP_SMPS) { case IEEE80211_HTCAP_SMPS_DYNAMIC: ni->ni_flags |= IEEE80211_NODE_MIMO_PS; ni->ni_flags |= IEEE80211_NODE_MIMO_RTS; break; case IEEE80211_HTCAP_SMPS_ENA: ni->ni_flags |= IEEE80211_NODE_MIMO_PS; ni->ni_flags &= ~IEEE80211_NODE_MIMO_RTS; break; case IEEE80211_HTCAP_SMPS_OFF: default: /* disable on rx of reserved value */ ni->ni_flags &= ~IEEE80211_NODE_MIMO_PS; ni->ni_flags &= ~IEEE80211_NODE_MIMO_RTS; break; } return (oflags ^ ni->ni_flags); } /* * Update short GI state according to received htcap * and local settings. */ static __inline void htcap_update_shortgi(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; ni->ni_flags &= ~(IEEE80211_NODE_SGI20|IEEE80211_NODE_SGI40); if ((ni->ni_htcap & IEEE80211_HTCAP_SHORTGI20) && (vap->iv_flags_ht & IEEE80211_FHT_SHORTGI20)) ni->ni_flags |= IEEE80211_NODE_SGI20; if ((ni->ni_htcap & IEEE80211_HTCAP_SHORTGI40) && (vap->iv_flags_ht & IEEE80211_FHT_SHORTGI40)) ni->ni_flags |= IEEE80211_NODE_SGI40; } /* * Update LDPC state according to received htcap * and local settings. */ static __inline void htcap_update_ldpc(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; if ((ni->ni_htcap & IEEE80211_HTCAP_LDPC) && (vap->iv_flags_ht & IEEE80211_FHT_LDPC_TX)) ni->ni_flags |= IEEE80211_NODE_LDPC; } /* * Parse and update HT-related state extracted from * the HT cap and info ie's. * * This is called from the STA management path and * the ieee80211_node_join() path. It will take into * account the IEs discovered during scanning and * adjust things accordingly. */ void ieee80211_ht_updateparams(struct ieee80211_node *ni, const uint8_t *htcapie, const uint8_t *htinfoie) { struct ieee80211vap *vap = ni->ni_vap; const struct ieee80211_ie_htinfo *htinfo; ieee80211_parse_htcap(ni, htcapie); if (vap->iv_htcaps & IEEE80211_HTC_SMPS) htcap_update_mimo_ps(ni); htcap_update_shortgi(ni); htcap_update_ldpc(ni); if (htinfoie[0] == IEEE80211_ELEMID_VENDOR) htinfoie += 4; htinfo = (const struct ieee80211_ie_htinfo *) htinfoie; htinfo_parse(ni, htinfo); /* * Defer the node channel change; we need to now * update VHT parameters before we do it. */ if ((htinfo->hi_byte1 & IEEE80211_HTINFO_RIFSMODE_PERM) && (vap->iv_flags_ht & IEEE80211_FHT_RIFS)) ni->ni_flags |= IEEE80211_NODE_RIFS; else ni->ni_flags &= ~IEEE80211_NODE_RIFS; } static uint32_t ieee80211_vht_get_vhtflags(struct ieee80211_node *ni, uint32_t htflags) { struct ieee80211vap *vap = ni->ni_vap; uint32_t vhtflags = 0; vhtflags = 0; if (ni->ni_flags & IEEE80211_NODE_VHT && vap->iv_flags_vht & IEEE80211_FVHT_VHT) { if ((ni->ni_vht_chanwidth == IEEE80211_VHT_CHANWIDTH_160MHZ) && /* XXX 2 means "160MHz and 80+80MHz", 1 means "160MHz" */ (MS(vap->iv_vhtcaps, IEEE80211_VHTCAP_SUPP_CHAN_WIDTH_MASK) >= 1) && (vap->iv_flags_vht & IEEE80211_FVHT_USEVHT160)) { vhtflags = IEEE80211_CHAN_VHT160; /* Mirror the HT40 flags */ if (htflags == IEEE80211_CHAN_HT40U) { vhtflags |= IEEE80211_CHAN_HT40U; } else if (htflags == IEEE80211_CHAN_HT40D) { vhtflags |= IEEE80211_CHAN_HT40D; } } else if ((ni->ni_vht_chanwidth == IEEE80211_VHT_CHANWIDTH_80P80MHZ) && /* XXX 2 means "160MHz and 80+80MHz" */ (MS(vap->iv_vhtcaps, IEEE80211_VHTCAP_SUPP_CHAN_WIDTH_MASK) == 2) && (vap->iv_flags_vht & IEEE80211_FVHT_USEVHT80P80)) { vhtflags = IEEE80211_CHAN_VHT80_80; /* Mirror the HT40 flags */ if (htflags == IEEE80211_CHAN_HT40U) { vhtflags |= IEEE80211_CHAN_HT40U; } else if (htflags == IEEE80211_CHAN_HT40D) { vhtflags |= IEEE80211_CHAN_HT40D; } } else if ((ni->ni_vht_chanwidth == IEEE80211_VHT_CHANWIDTH_80MHZ) && (vap->iv_flags_vht & IEEE80211_FVHT_USEVHT80)) { vhtflags = IEEE80211_CHAN_VHT80; /* Mirror the HT40 flags */ if (htflags == IEEE80211_CHAN_HT40U) { vhtflags |= IEEE80211_CHAN_HT40U; } else if (htflags == IEEE80211_CHAN_HT40D) { vhtflags |= IEEE80211_CHAN_HT40D; } } else if (ni->ni_vht_chanwidth == IEEE80211_VHT_CHANWIDTH_USE_HT) { /* Mirror the HT40 flags */ /* * XXX TODO: if ht40 is disabled, but vht40 isn't * disabled then this logic will get very, very sad. * It's quite possible the only sane thing to do is * to not have vht40 as an option, and just obey * 'ht40' as that flag. */ if ((htflags == IEEE80211_CHAN_HT40U) && (vap->iv_flags_vht & IEEE80211_FVHT_USEVHT40)) { vhtflags = IEEE80211_CHAN_VHT40U | IEEE80211_CHAN_HT40U; } else if (htflags == IEEE80211_CHAN_HT40D && (vap->iv_flags_vht & IEEE80211_FVHT_USEVHT40)) { vhtflags = IEEE80211_CHAN_VHT40D | IEEE80211_CHAN_HT40D; } else if (htflags == IEEE80211_CHAN_HT20) { vhtflags = IEEE80211_CHAN_VHT20 | IEEE80211_CHAN_HT20; } } else { vhtflags = IEEE80211_CHAN_VHT20; } } return (vhtflags); } /* * Final part of updating the HT parameters. * * This is called from the STA management path and * the ieee80211_node_join() path. It will take into * account the IEs discovered during scanning and * adjust things accordingly. * * This is done after a call to ieee80211_ht_updateparams() * because it (and the upcoming VHT version of updateparams) * needs to ensure everything is parsed before htinfo_update_chw() * is called - which will change the channel config for the * node for us. */ int ieee80211_ht_updateparams_final(struct ieee80211_node *ni, const uint8_t *htcapie, const uint8_t *htinfoie) { struct ieee80211vap *vap = ni->ni_vap; const struct ieee80211_ie_htinfo *htinfo; int htflags, vhtflags; int ret = 0; htinfo = (const struct ieee80211_ie_htinfo *) htinfoie; htflags = (vap->iv_flags_ht & IEEE80211_FHT_HT) ? IEEE80211_CHAN_HT20 : 0; /* NB: honor operating mode constraint */ if ((htinfo->hi_byte1 & IEEE80211_HTINFO_TXWIDTH_2040) && (vap->iv_flags_ht & IEEE80211_FHT_USEHT40)) { if (ni->ni_ht2ndchan == IEEE80211_HTINFO_2NDCHAN_ABOVE) htflags = IEEE80211_CHAN_HT40U; else if (ni->ni_ht2ndchan == IEEE80211_HTINFO_2NDCHAN_BELOW) htflags = IEEE80211_CHAN_HT40D; } /* * VHT flags - do much the same; check whether VHT is available * and if so, what our ideal channel use would be based on our * capabilities and the (pre-parsed) VHT info IE. */ vhtflags = ieee80211_vht_get_vhtflags(ni, htflags); if (htinfo_update_chw(ni, htflags, vhtflags)) ret = 1; return (ret); } /* * Parse and update HT-related state extracted from the HT cap ie * for a station joining an HT BSS. * * This is called from the hostap path for each station. */ void ieee80211_ht_updatehtcap(struct ieee80211_node *ni, const uint8_t *htcapie) { struct ieee80211vap *vap = ni->ni_vap; ieee80211_parse_htcap(ni, htcapie); if (vap->iv_htcaps & IEEE80211_HTC_SMPS) htcap_update_mimo_ps(ni); htcap_update_shortgi(ni); htcap_update_ldpc(ni); } /* * Called once HT and VHT capabilities are parsed in hostap mode - * this will adjust the channel configuration of the given node * based on the configuration and capabilities. */ void ieee80211_ht_updatehtcap_final(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; int htflags; int vhtflags; /* NB: honor operating mode constraint */ /* XXX 40 MHz intolerant */ htflags = (vap->iv_flags_ht & IEEE80211_FHT_HT) ? IEEE80211_CHAN_HT20 : 0; if ((ni->ni_htcap & IEEE80211_HTCAP_CHWIDTH40) && (vap->iv_flags_ht & IEEE80211_FHT_USEHT40)) { if (IEEE80211_IS_CHAN_HT40U(vap->iv_bss->ni_chan)) htflags = IEEE80211_CHAN_HT40U; else if (IEEE80211_IS_CHAN_HT40D(vap->iv_bss->ni_chan)) htflags = IEEE80211_CHAN_HT40D; } /* * VHT flags - do much the same; check whether VHT is available * and if so, what our ideal channel use would be based on our * capabilities and the (pre-parsed) VHT info IE. */ vhtflags = ieee80211_vht_get_vhtflags(ni, htflags); (void) htinfo_update_chw(ni, htflags, vhtflags); } /* * Install received HT rate set by parsing the HT cap ie. */ int ieee80211_setup_htrates(struct ieee80211_node *ni, const uint8_t *ie, int flags) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211vap *vap = ni->ni_vap; const struct ieee80211_ie_htcap *htcap; struct ieee80211_htrateset *rs; int i, maxequalmcs, maxunequalmcs; maxequalmcs = ic->ic_txstream * 8 - 1; maxunequalmcs = 0; if (ic->ic_htcaps & IEEE80211_HTC_TXUNEQUAL) { if (ic->ic_txstream >= 2) maxunequalmcs = 38; if (ic->ic_txstream >= 3) maxunequalmcs = 52; if (ic->ic_txstream >= 4) maxunequalmcs = 76; } rs = &ni->ni_htrates; memset(rs, 0, sizeof(*rs)); if (ie != NULL) { if (ie[0] == IEEE80211_ELEMID_VENDOR) ie += 4; htcap = (const struct ieee80211_ie_htcap *) ie; for (i = 0; i < IEEE80211_HTRATE_MAXSIZE; i++) { if (isclr(htcap->hc_mcsset, i)) continue; if (rs->rs_nrates == IEEE80211_HTRATE_MAXSIZE) { IEEE80211_NOTE(vap, IEEE80211_MSG_XRATE | IEEE80211_MSG_11N, ni, "WARNING, HT rate set too large; only " "using %u rates", IEEE80211_HTRATE_MAXSIZE); vap->iv_stats.is_rx_rstoobig++; break; } if (i <= 31 && i > maxequalmcs) continue; if (i == 32 && (ic->ic_htcaps & IEEE80211_HTC_TXMCS32) == 0) continue; if (i > 32 && i > maxunequalmcs) continue; rs->rs_rates[rs->rs_nrates++] = i; } } return ieee80211_fix_rate(ni, (struct ieee80211_rateset *) rs, flags); } /* * Mark rates in a node's HT rate set as basic according * to the information in the supplied HT info ie. */ void ieee80211_setup_basic_htrates(struct ieee80211_node *ni, const uint8_t *ie) { const struct ieee80211_ie_htinfo *htinfo; struct ieee80211_htrateset *rs; int i, j; if (ie[0] == IEEE80211_ELEMID_VENDOR) ie += 4; htinfo = (const struct ieee80211_ie_htinfo *) ie; rs = &ni->ni_htrates; if (rs->rs_nrates == 0) { IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_XRATE | IEEE80211_MSG_11N, ni, "%s", "WARNING, empty HT rate set"); return; } for (i = 0; i < IEEE80211_HTRATE_MAXSIZE; i++) { if (isclr(htinfo->hi_basicmcsset, i)) continue; for (j = 0; j < rs->rs_nrates; j++) if ((rs->rs_rates[j] & IEEE80211_RATE_VAL) == i) rs->rs_rates[j] |= IEEE80211_RATE_BASIC; } } static void ampdu_tx_setup(struct ieee80211_tx_ampdu *tap) { callout_init(&tap->txa_timer, 1); tap->txa_flags |= IEEE80211_AGGR_SETUP; tap->txa_lastsample = ticks; } static void ampdu_tx_stop(struct ieee80211_tx_ampdu *tap) { struct ieee80211_node *ni = tap->txa_ni; struct ieee80211com *ic = ni->ni_ic; IEEE80211_NOTE(tap->txa_ni->ni_vap, IEEE80211_MSG_11N, tap->txa_ni, "%s: called", __func__); KASSERT(tap->txa_flags & IEEE80211_AGGR_SETUP, ("txa_flags 0x%x tid %d ac %d", tap->txa_flags, tap->txa_tid, TID_TO_WME_AC(tap->txa_tid))); /* * Stop BA stream if setup so driver has a chance * to reclaim any resources it might have allocated. */ ic->ic_addba_stop(ni, tap); /* * Stop any pending BAR transmit. */ bar_stop_timer(tap); /* * Reset packet estimate. */ ieee80211_txampdu_init_pps(tap); /* NB: clearing NAK means we may re-send ADDBA */ tap->txa_flags &= ~(IEEE80211_AGGR_SETUP | IEEE80211_AGGR_NAK); } /* * ADDBA response timeout. * * If software aggregation and per-TID queue management was done here, * that queue would be unpaused after the ADDBA timeout occurs. */ static void addba_timeout(void *arg) { struct ieee80211_tx_ampdu *tap = arg; struct ieee80211_node *ni = tap->txa_ni; struct ieee80211com *ic = ni->ni_ic; /* XXX ? */ tap->txa_flags &= ~IEEE80211_AGGR_XCHGPEND; tap->txa_attempts++; ic->ic_addba_response_timeout(ni, tap); } static void addba_start_timeout(struct ieee80211_tx_ampdu *tap) { /* XXX use CALLOUT_PENDING instead? */ callout_reset(&tap->txa_timer, ieee80211_addba_timeout, addba_timeout, tap); tap->txa_flags |= IEEE80211_AGGR_XCHGPEND; tap->txa_nextrequest = ticks + ieee80211_addba_timeout; } static void addba_stop_timeout(struct ieee80211_tx_ampdu *tap) { /* XXX use CALLOUT_PENDING instead? */ if (tap->txa_flags & IEEE80211_AGGR_XCHGPEND) { callout_stop(&tap->txa_timer); tap->txa_flags &= ~IEEE80211_AGGR_XCHGPEND; } } static void null_addba_response_timeout(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap) { } /* * Default method for requesting A-MPDU tx aggregation. * We setup the specified state block and start a timer * to wait for an ADDBA response frame. */ static int ieee80211_addba_request(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int dialogtoken, int baparamset, int batimeout) { int bufsiz; /* XXX locking */ tap->txa_token = dialogtoken; tap->txa_flags |= IEEE80211_AGGR_IMMEDIATE; bufsiz = MS(baparamset, IEEE80211_BAPS_BUFSIZ); tap->txa_wnd = (bufsiz == 0) ? IEEE80211_AGGR_BAWMAX : min(bufsiz, IEEE80211_AGGR_BAWMAX); addba_start_timeout(tap); return 1; } /* * Called by drivers that wish to request an ADDBA session be * setup. This brings it up and starts the request timer. */ int ieee80211_ampdu_tx_request_ext(struct ieee80211_node *ni, int tid) { struct ieee80211_tx_ampdu *tap; if (tid < 0 || tid > 15) return (0); tap = &ni->ni_tx_ampdu[tid]; /* XXX locking */ if ((tap->txa_flags & IEEE80211_AGGR_SETUP) == 0) { /* do deferred setup of state */ ampdu_tx_setup(tap); } /* XXX hack for not doing proper locking */ tap->txa_flags &= ~IEEE80211_AGGR_NAK; addba_start_timeout(tap); return (1); } /* * Called by drivers that have marked a session as active. */ int ieee80211_ampdu_tx_request_active_ext(struct ieee80211_node *ni, int tid, int status) { struct ieee80211_tx_ampdu *tap; if (tid < 0 || tid > 15) return (0); tap = &ni->ni_tx_ampdu[tid]; /* XXX locking */ addba_stop_timeout(tap); if (status == 1) { tap->txa_flags |= IEEE80211_AGGR_RUNNING; tap->txa_attempts = 0; } else { /* mark tid so we don't try again */ tap->txa_flags |= IEEE80211_AGGR_NAK; } return (1); } /* * Default method for processing an A-MPDU tx aggregation * response. We shutdown any pending timer and update the * state block according to the reply. */ static int ieee80211_addba_response(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int status, int baparamset, int batimeout) { int bufsiz, tid; /* XXX locking */ addba_stop_timeout(tap); if (status == IEEE80211_STATUS_SUCCESS) { bufsiz = MS(baparamset, IEEE80211_BAPS_BUFSIZ); /* XXX override our request? */ tap->txa_wnd = (bufsiz == 0) ? IEEE80211_AGGR_BAWMAX : min(bufsiz, IEEE80211_AGGR_BAWMAX); /* XXX AC/TID */ tid = MS(baparamset, IEEE80211_BAPS_TID); tap->txa_flags |= IEEE80211_AGGR_RUNNING; tap->txa_attempts = 0; } else { /* mark tid so we don't try again */ tap->txa_flags |= IEEE80211_AGGR_NAK; } return 1; } /* * Default method for stopping A-MPDU tx aggregation. * Any timer is cleared and we drain any pending frames. */ static void ieee80211_addba_stop(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap) { /* XXX locking */ addba_stop_timeout(tap); if (tap->txa_flags & IEEE80211_AGGR_RUNNING) { /* XXX clear aggregation queue */ tap->txa_flags &= ~IEEE80211_AGGR_RUNNING; } tap->txa_attempts = 0; } /* * Process a received action frame using the default aggregation * policy. We intercept ADDBA-related frames and use them to * update our aggregation state. All other frames are passed up * for processing by ieee80211_recv_action. */ static int ht_recv_action_ba_addba_request(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_rx_ampdu *rap; uint8_t dialogtoken; uint16_t baparamset, batimeout, baseqctl; uint16_t args[5]; int tid; dialogtoken = frm[2]; baparamset = le16dec(frm+3); batimeout = le16dec(frm+5); baseqctl = le16dec(frm+7); tid = MS(baparamset, IEEE80211_BAPS_TID); IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "recv ADDBA request: dialogtoken %u baparamset 0x%x " "(tid %d bufsiz %d) batimeout %d baseqctl %d:%d", dialogtoken, baparamset, tid, MS(baparamset, IEEE80211_BAPS_BUFSIZ), batimeout, MS(baseqctl, IEEE80211_BASEQ_START), MS(baseqctl, IEEE80211_BASEQ_FRAG)); rap = &ni->ni_rx_ampdu[tid]; /* Send ADDBA response */ args[0] = dialogtoken; /* * NB: We ack only if the sta associated with HT and * the ap is configured to do AMPDU rx (the latter * violates the 11n spec and is mostly for testing). */ if ((ni->ni_flags & IEEE80211_NODE_AMPDU_RX) && (vap->iv_flags_ht & IEEE80211_FHT_AMPDU_RX)) { /* XXX handle ampdu_rx_start failure */ ic->ic_ampdu_rx_start(ni, rap, baparamset, batimeout, baseqctl); args[1] = IEEE80211_STATUS_SUCCESS; } else { IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "reject ADDBA request: %s", ni->ni_flags & IEEE80211_NODE_AMPDU_RX ? "administratively disabled" : "not negotiated for station"); vap->iv_stats.is_addba_reject++; args[1] = IEEE80211_STATUS_UNSPECIFIED; } /* XXX honor rap flags? */ args[2] = IEEE80211_BAPS_POLICY_IMMEDIATE | SM(tid, IEEE80211_BAPS_TID) | SM(rap->rxa_wnd, IEEE80211_BAPS_BUFSIZ) ; args[3] = 0; args[4] = 0; ic->ic_send_action(ni, IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_ADDBA_RESPONSE, args); return 0; } static int ht_recv_action_ba_addba_response(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_tx_ampdu *tap; uint8_t dialogtoken, policy; uint16_t baparamset, batimeout, code; int tid, bufsiz; dialogtoken = frm[2]; code = le16dec(frm+3); baparamset = le16dec(frm+5); tid = MS(baparamset, IEEE80211_BAPS_TID); bufsiz = MS(baparamset, IEEE80211_BAPS_BUFSIZ); policy = MS(baparamset, IEEE80211_BAPS_POLICY); batimeout = le16dec(frm+7); tap = &ni->ni_tx_ampdu[tid]; if ((tap->txa_flags & IEEE80211_AGGR_XCHGPEND) == 0) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni->ni_macaddr, "ADDBA response", "no pending ADDBA, tid %d dialogtoken %u " "code %d", tid, dialogtoken, code); vap->iv_stats.is_addba_norequest++; return 0; } if (dialogtoken != tap->txa_token) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni->ni_macaddr, "ADDBA response", "dialogtoken mismatch: waiting for %d, " "received %d, tid %d code %d", tap->txa_token, dialogtoken, tid, code); vap->iv_stats.is_addba_badtoken++; return 0; } /* NB: assumes IEEE80211_AGGR_IMMEDIATE is 1 */ if (policy != (tap->txa_flags & IEEE80211_AGGR_IMMEDIATE)) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni->ni_macaddr, "ADDBA response", "policy mismatch: expecting %s, " "received %s, tid %d code %d", tap->txa_flags & IEEE80211_AGGR_IMMEDIATE, policy, tid, code); vap->iv_stats.is_addba_badpolicy++; return 0; } #if 0 /* XXX we take MIN in ieee80211_addba_response */ if (bufsiz > IEEE80211_AGGR_BAWMAX) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni->ni_macaddr, "ADDBA response", "BA window too large: max %d, " "received %d, tid %d code %d", bufsiz, IEEE80211_AGGR_BAWMAX, tid, code); vap->iv_stats.is_addba_badbawinsize++; return 0; } #endif IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "recv ADDBA response: dialogtoken %u code %d " "baparamset 0x%x (tid %d bufsiz %d) batimeout %d", dialogtoken, code, baparamset, tid, bufsiz, batimeout); ic->ic_addba_response(ni, tap, code, baparamset, batimeout); return 0; } static int ht_recv_action_ba_delba(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211_rx_ampdu *rap; struct ieee80211_tx_ampdu *tap; uint16_t baparamset, code; int tid; baparamset = le16dec(frm+2); code = le16dec(frm+4); tid = MS(baparamset, IEEE80211_DELBAPS_TID); IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "recv DELBA: baparamset 0x%x (tid %d initiator %d) " "code %d", baparamset, tid, MS(baparamset, IEEE80211_DELBAPS_INIT), code); if ((baparamset & IEEE80211_DELBAPS_INIT) == 0) { tap = &ni->ni_tx_ampdu[tid]; ic->ic_addba_stop(ni, tap); } else { rap = &ni->ni_rx_ampdu[tid]; ic->ic_ampdu_rx_stop(ni, rap); } return 0; } static int ht_recv_action_ht_txchwidth(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { int chw; chw = (frm[2] == IEEE80211_A_HT_TXCHWIDTH_2040) ? 40 : 20; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "%s: HT txchwidth, width %d%s", __func__, chw, ni->ni_chw != chw ? "*" : ""); if (chw != ni->ni_chw) { /* XXX does this need to change the ht40 station count? */ ni->ni_chw = chw; /* XXX notify on change */ } return 0; } static int ht_recv_action_ht_mimopwrsave(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { const struct ieee80211_action_ht_mimopowersave *mps = (const struct ieee80211_action_ht_mimopowersave *) frm; /* XXX check iv_htcaps */ if (mps->am_control & IEEE80211_A_HT_MIMOPWRSAVE_ENA) ni->ni_flags |= IEEE80211_NODE_MIMO_PS; else ni->ni_flags &= ~IEEE80211_NODE_MIMO_PS; if (mps->am_control & IEEE80211_A_HT_MIMOPWRSAVE_MODE) ni->ni_flags |= IEEE80211_NODE_MIMO_RTS; else ni->ni_flags &= ~IEEE80211_NODE_MIMO_RTS; /* XXX notify on change */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "%s: HT MIMO PS (%s%s)", __func__, (ni->ni_flags & IEEE80211_NODE_MIMO_PS) ? "on" : "off", (ni->ni_flags & IEEE80211_NODE_MIMO_RTS) ? "+rts" : "" ); return 0; } /* * Transmit processing. */ /* * Check if A-MPDU should be requested/enabled for a stream. * We require a traffic rate above a per-AC threshold and we * also handle backoff from previous failed attempts. * * Drivers may override this method to bring in information * such as link state conditions in making the decision. */ static int ieee80211_ampdu_enable(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap) { struct ieee80211vap *vap = ni->ni_vap; if (tap->txa_avgpps < vap->iv_ampdu_mintraffic[TID_TO_WME_AC(tap->txa_tid)]) return 0; /* XXX check rssi? */ if (tap->txa_attempts >= ieee80211_addba_maxtries && ieee80211_time_after(ticks, tap->txa_nextrequest)) { /* * Don't retry too often; txa_nextrequest is set * to the minimum interval we'll retry after * ieee80211_addba_maxtries failed attempts are made. */ return 0; } IEEE80211_NOTE(vap, IEEE80211_MSG_11N, ni, "enable AMPDU on tid %d (%s), avgpps %d pkts %d attempt %d", tap->txa_tid, ieee80211_wme_acnames[TID_TO_WME_AC(tap->txa_tid)], tap->txa_avgpps, tap->txa_pkts, tap->txa_attempts); return 1; } /* * Request A-MPDU tx aggregation. Setup local state and * issue an ADDBA request. BA use will only happen after * the other end replies with ADDBA response. */ int ieee80211_ampdu_request(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap) { struct ieee80211com *ic = ni->ni_ic; uint16_t args[5]; int tid, dialogtoken; static int tokens = 0; /* XXX */ /* XXX locking */ if ((tap->txa_flags & IEEE80211_AGGR_SETUP) == 0) { /* do deferred setup of state */ ampdu_tx_setup(tap); } /* XXX hack for not doing proper locking */ tap->txa_flags &= ~IEEE80211_AGGR_NAK; dialogtoken = (tokens+1) % 63; /* XXX */ tid = tap->txa_tid; /* * XXX TODO: This is racy with any other parallel TX going on. :( */ tap->txa_start = ni->ni_txseqs[tid]; args[0] = dialogtoken; args[1] = 0; /* NB: status code not used */ args[2] = IEEE80211_BAPS_POLICY_IMMEDIATE | SM(tid, IEEE80211_BAPS_TID) | SM(IEEE80211_AGGR_BAWMAX, IEEE80211_BAPS_BUFSIZ) ; args[3] = 0; /* batimeout */ /* NB: do first so there's no race against reply */ if (!ic->ic_addba_request(ni, tap, dialogtoken, args[2], args[3])) { /* unable to setup state, don't make request */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: could not setup BA stream for TID %d AC %d", __func__, tap->txa_tid, TID_TO_WME_AC(tap->txa_tid)); /* defer next try so we don't slam the driver with requests */ tap->txa_attempts = ieee80211_addba_maxtries; /* NB: check in case driver wants to override */ if (tap->txa_nextrequest <= ticks) tap->txa_nextrequest = ticks + ieee80211_addba_backoff; return 0; } tokens = dialogtoken; /* allocate token */ /* NB: after calling ic_addba_request so driver can set txa_start */ args[4] = SM(tap->txa_start, IEEE80211_BASEQ_START) | SM(0, IEEE80211_BASEQ_FRAG) ; return ic->ic_send_action(ni, IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_ADDBA_REQUEST, args); } /* * Terminate an AMPDU tx stream. State is reclaimed * and the peer notified with a DelBA Action frame. */ void ieee80211_ampdu_stop(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int reason) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211vap *vap = ni->ni_vap; uint16_t args[4]; /* XXX locking */ tap->txa_flags &= ~IEEE80211_AGGR_BARPEND; if (IEEE80211_AMPDU_RUNNING(tap)) { IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "%s: stop BA stream for TID %d (reason: %d (%s))", __func__, tap->txa_tid, reason, ieee80211_reason_to_string(reason)); vap->iv_stats.is_ampdu_stop++; ic->ic_addba_stop(ni, tap); args[0] = tap->txa_tid; args[1] = IEEE80211_DELBAPS_INIT; args[2] = reason; /* XXX reason code */ ic->ic_send_action(ni, IEEE80211_ACTION_CAT_BA, IEEE80211_ACTION_BA_DELBA, args); } else { IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "%s: BA stream for TID %d not running " "(reason: %d (%s))", __func__, tap->txa_tid, reason, ieee80211_reason_to_string(reason)); vap->iv_stats.is_ampdu_stop_failed++; } } /* XXX */ static void bar_start_timer(struct ieee80211_tx_ampdu *tap); static void bar_timeout(void *arg) { struct ieee80211_tx_ampdu *tap = arg; struct ieee80211_node *ni = tap->txa_ni; KASSERT((tap->txa_flags & IEEE80211_AGGR_XCHGPEND) == 0, ("bar/addba collision, flags 0x%x", tap->txa_flags)); IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: tid %u flags 0x%x attempts %d", __func__, tap->txa_tid, tap->txa_flags, tap->txa_attempts); /* guard against race with bar_tx_complete */ if ((tap->txa_flags & IEEE80211_AGGR_BARPEND) == 0) return; /* XXX ? */ if (tap->txa_attempts >= ieee80211_bar_maxtries) { struct ieee80211com *ic = ni->ni_ic; ni->ni_vap->iv_stats.is_ampdu_bar_tx_fail++; /* * If (at least) the last BAR TX timeout was due to * an ieee80211_send_bar() failures, then we need * to make sure we notify the driver that a BAR * TX did occur and fail. This gives the driver * a chance to undo any queue pause that may * have occurred. */ ic->ic_bar_response(ni, tap, 1); ieee80211_ampdu_stop(ni, tap, IEEE80211_REASON_TIMEOUT); } else { ni->ni_vap->iv_stats.is_ampdu_bar_tx_retry++; if (ieee80211_send_bar(ni, tap, tap->txa_seqpending) != 0) { IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: failed to TX, starting timer\n", __func__); /* * If ieee80211_send_bar() fails here, the * timer may have stopped and/or the pending * flag may be clear. Because of this, * fake the BARPEND and reset the timer. * A retransmission attempt will then occur * during the next timeout. */ /* XXX locking */ tap->txa_flags |= IEEE80211_AGGR_BARPEND; bar_start_timer(tap); } } } static void bar_start_timer(struct ieee80211_tx_ampdu *tap) { IEEE80211_NOTE(tap->txa_ni->ni_vap, IEEE80211_MSG_11N, tap->txa_ni, "%s: called", __func__); callout_reset(&tap->txa_timer, ieee80211_bar_timeout, bar_timeout, tap); } static void bar_stop_timer(struct ieee80211_tx_ampdu *tap) { IEEE80211_NOTE(tap->txa_ni->ni_vap, IEEE80211_MSG_11N, tap->txa_ni, "%s: called", __func__); callout_stop(&tap->txa_timer); } static void bar_tx_complete(struct ieee80211_node *ni, void *arg, int status) { struct ieee80211_tx_ampdu *tap = arg; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "%s: tid %u flags 0x%x pending %d status %d", __func__, tap->txa_tid, tap->txa_flags, callout_pending(&tap->txa_timer), status); ni->ni_vap->iv_stats.is_ampdu_bar_tx++; /* XXX locking */ if ((tap->txa_flags & IEEE80211_AGGR_BARPEND) && callout_pending(&tap->txa_timer)) { struct ieee80211com *ic = ni->ni_ic; if (status == 0) /* ACK'd */ bar_stop_timer(tap); ic->ic_bar_response(ni, tap, status); /* NB: just let timer expire so we pace requests */ } } static void ieee80211_bar_response(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, int status) { IEEE80211_NOTE(tap->txa_ni->ni_vap, IEEE80211_MSG_11N, tap->txa_ni, "%s: called", __func__); if (status == 0) { /* got ACK */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_11N, ni, "BAR moves BA win <%u:%u> (%u frames) txseq %u tid %u", tap->txa_start, IEEE80211_SEQ_ADD(tap->txa_start, tap->txa_wnd-1), tap->txa_qframes, tap->txa_seqpending, tap->txa_tid); /* NB: timer already stopped in bar_tx_complete */ tap->txa_start = tap->txa_seqpending; tap->txa_flags &= ~IEEE80211_AGGR_BARPEND; } } /* * Transmit a BAR frame to the specified node. The * BAR contents are drawn from the supplied aggregation * state associated with the node. * * NB: we only handle immediate ACK w/ compressed bitmap. */ int ieee80211_send_bar(struct ieee80211_node *ni, struct ieee80211_tx_ampdu *tap, ieee80211_seq seq) { #define senderr(_x, _v) do { vap->iv_stats._v++; ret = _x; goto bad; } while (0) struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct ieee80211_frame_bar *bar; struct mbuf *m; uint16_t barctl, barseqctl; uint8_t *frm; int tid, ret; IEEE80211_NOTE(tap->txa_ni->ni_vap, IEEE80211_MSG_11N, tap->txa_ni, "%s: called", __func__); if ((tap->txa_flags & IEEE80211_AGGR_RUNNING) == 0) { /* no ADDBA response, should not happen */ /* XXX stat+msg */ return EINVAL; } /* XXX locking */ bar_stop_timer(tap); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom, sizeof(*bar)); if (m == NULL) senderr(ENOMEM, is_tx_nobuf); if (!ieee80211_add_callback(m, bar_tx_complete, tap)) { m_freem(m); senderr(ENOMEM, is_tx_nobuf); /* XXX */ /* NOTREACHED */ } bar = mtod(m, struct ieee80211_frame_bar *); bar->i_fc[0] = IEEE80211_FC0_VERSION_0 | IEEE80211_FC0_TYPE_CTL | IEEE80211_FC0_SUBTYPE_BAR; bar->i_fc[1] = 0; IEEE80211_ADDR_COPY(bar->i_ra, ni->ni_macaddr); IEEE80211_ADDR_COPY(bar->i_ta, vap->iv_myaddr); tid = tap->txa_tid; barctl = (tap->txa_flags & IEEE80211_AGGR_IMMEDIATE ? 0 : IEEE80211_BAR_NOACK) | IEEE80211_BAR_COMP | SM(tid, IEEE80211_BAR_TID) ; barseqctl = SM(seq, IEEE80211_BAR_SEQ_START); /* NB: known to have proper alignment */ bar->i_ctl = htole16(barctl); bar->i_seq = htole16(barseqctl); m->m_pkthdr.len = m->m_len = sizeof(struct ieee80211_frame_bar); M_WME_SETAC(m, WME_AC_VO); IEEE80211_NODE_STAT(ni, tx_mgmt); /* XXX tx_ctl? */ /* XXX locking */ /* init/bump attempts counter */ if ((tap->txa_flags & IEEE80211_AGGR_BARPEND) == 0) tap->txa_attempts = 1; else tap->txa_attempts++; tap->txa_seqpending = seq; tap->txa_flags |= IEEE80211_AGGR_BARPEND; IEEE80211_NOTE(vap, IEEE80211_MSG_DEBUG | IEEE80211_MSG_11N, ni, "send BAR: tid %u ctl 0x%x start %u (attempt %d)", tid, barctl, seq, tap->txa_attempts); /* * ic_raw_xmit will free the node reference * regardless of queue/TX success or failure. */ IEEE80211_TX_LOCK(ic); ret = ieee80211_raw_output(vap, ni, m, NULL); IEEE80211_TX_UNLOCK(ic); if (ret != 0) { IEEE80211_NOTE(vap, IEEE80211_MSG_DEBUG | IEEE80211_MSG_11N, ni, "send BAR: failed: (ret = %d)\n", ret); /* xmit failed, clear state flag */ tap->txa_flags &= ~IEEE80211_AGGR_BARPEND; vap->iv_stats.is_ampdu_bar_tx_fail++; return ret; } /* XXX hack against tx complete happening before timer is started */ if (tap->txa_flags & IEEE80211_AGGR_BARPEND) bar_start_timer(tap); return 0; bad: IEEE80211_NOTE(tap->txa_ni->ni_vap, IEEE80211_MSG_11N, tap->txa_ni, "%s: bad! ret=%d", __func__, ret); vap->iv_stats.is_ampdu_bar_tx_fail++; ieee80211_free_node(ni); return ret; #undef senderr } static int ht_action_output(struct ieee80211_node *ni, struct mbuf *m) { struct ieee80211_bpf_params params; memset(¶ms, 0, sizeof(params)); params.ibp_pri = WME_AC_VO; params.ibp_rate0 = ni->ni_txparms->mgmtrate; /* NB: we know all frames are unicast */ params.ibp_try0 = ni->ni_txparms->maxretry; params.ibp_power = ni->ni_txpower; return ieee80211_mgmt_output(ni, m, IEEE80211_FC0_SUBTYPE_ACTION, ¶ms); } #define ADDSHORT(frm, v) do { \ frm[0] = (v) & 0xff; \ frm[1] = (v) >> 8; \ frm += 2; \ } while (0) /* * Send an action management frame. The arguments are stuff * into a frame without inspection; the caller is assumed to * prepare them carefully (e.g. based on the aggregation state). */ static int ht_send_action_ba_addba(struct ieee80211_node *ni, int category, int action, void *arg0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; uint16_t *args = arg0; struct mbuf *m; uint8_t *frm; IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "send ADDBA %s: dialogtoken %d status %d " "baparamset 0x%x (tid %d) batimeout 0x%x baseqctl 0x%x", (action == IEEE80211_ACTION_BA_ADDBA_REQUEST) ? "request" : "response", args[0], args[1], args[2], MS(args[2], IEEE80211_BAPS_TID), args[3], args[4]); IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) /* action+category */ /* XXX may action payload */ + sizeof(struct ieee80211_action_ba_addbaresponse) ); if (m != NULL) { *frm++ = category; *frm++ = action; *frm++ = args[0]; /* dialog token */ if (action == IEEE80211_ACTION_BA_ADDBA_RESPONSE) ADDSHORT(frm, args[1]); /* status code */ ADDSHORT(frm, args[2]); /* baparamset */ ADDSHORT(frm, args[3]); /* batimeout */ if (action == IEEE80211_ACTION_BA_ADDBA_REQUEST) ADDSHORT(frm, args[4]); /* baseqctl */ m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return ht_action_output(ni, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static int ht_send_action_ba_delba(struct ieee80211_node *ni, int category, int action, void *arg0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; uint16_t *args = arg0; struct mbuf *m; uint16_t baparamset; uint8_t *frm; baparamset = SM(args[0], IEEE80211_DELBAPS_TID) | args[1] ; IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "send DELBA action: tid %d, initiator %d reason %d (%s)", args[0], args[1], args[2], ieee80211_reason_to_string(args[2])); IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) /* action+category */ /* XXX may action payload */ + sizeof(struct ieee80211_action_ba_addbaresponse) ); if (m != NULL) { *frm++ = category; *frm++ = action; ADDSHORT(frm, baparamset); ADDSHORT(frm, args[2]); /* reason code */ m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return ht_action_output(ni, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static int ht_send_action_ht_txchwidth(struct ieee80211_node *ni, int category, int action, void *arg0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct mbuf *m; uint8_t *frm; IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_11N, ni, "send HT txchwidth: width %d", IEEE80211_IS_CHAN_HT40(ni->ni_chan) ? 40 : 20); IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) /* action+category */ /* XXX may action payload */ + sizeof(struct ieee80211_action_ba_addbaresponse) ); if (m != NULL) { *frm++ = category; *frm++ = action; *frm++ = IEEE80211_IS_CHAN_HT40(ni->ni_chan) ? IEEE80211_A_HT_TXCHWIDTH_2040 : IEEE80211_A_HT_TXCHWIDTH_20; m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return ht_action_output(ni, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } #undef ADDSHORT /* * Construct the MCS bit mask for inclusion in an HT capabilities * information element. */ static void ieee80211_set_mcsset(struct ieee80211com *ic, uint8_t *frm) { int i; uint8_t txparams; KASSERT((ic->ic_rxstream > 0 && ic->ic_rxstream <= 4), ("ic_rxstream %d out of range", ic->ic_rxstream)); KASSERT((ic->ic_txstream > 0 && ic->ic_txstream <= 4), ("ic_txstream %d out of range", ic->ic_txstream)); for (i = 0; i < ic->ic_rxstream * 8; i++) setbit(frm, i); if ((ic->ic_htcaps & IEEE80211_HTCAP_CHWIDTH40) && (ic->ic_htcaps & IEEE80211_HTC_RXMCS32)) setbit(frm, 32); if (ic->ic_htcaps & IEEE80211_HTC_RXUNEQUAL) { if (ic->ic_rxstream >= 2) { for (i = 33; i <= 38; i++) setbit(frm, i); } if (ic->ic_rxstream >= 3) { for (i = 39; i <= 52; i++) setbit(frm, i); } if (ic->ic_txstream >= 4) { for (i = 53; i <= 76; i++) setbit(frm, i); } } if (ic->ic_rxstream != ic->ic_txstream) { txparams = 0x1; /* TX MCS set defined */ txparams |= 0x2; /* TX RX MCS not equal */ txparams |= (ic->ic_txstream - 1) << 2; /* num TX streams */ if (ic->ic_htcaps & IEEE80211_HTC_TXUNEQUAL) txparams |= 0x16; /* TX unequal modulation sup */ } else txparams = 0; frm[12] = txparams; } /* * Add body of an HTCAP information element. */ static uint8_t * ieee80211_add_htcap_body(uint8_t *frm, struct ieee80211_node *ni) { #define ADDSHORT(frm, v) do { \ frm[0] = (v) & 0xff; \ frm[1] = (v) >> 8; \ frm += 2; \ } while (0) struct ieee80211com *ic = ni->ni_ic; struct ieee80211vap *vap = ni->ni_vap; uint16_t caps, extcaps; int rxmax, density; /* HT capabilities */ caps = vap->iv_htcaps & 0xffff; /* * Note channel width depends on whether we are operating as * a sta or not. When operating as a sta we are generating * a request based on our desired configuration. Otherwise * we are operational and the channel attributes identify * how we've been setup (which might be different if a fixed * channel is specified). */ if (vap->iv_opmode == IEEE80211_M_STA) { /* override 20/40 use based on config */ if (vap->iv_flags_ht & IEEE80211_FHT_USEHT40) caps |= IEEE80211_HTCAP_CHWIDTH40; else caps &= ~IEEE80211_HTCAP_CHWIDTH40; /* Start by using the advertised settings */ rxmax = MS(ni->ni_htparam, IEEE80211_HTCAP_MAXRXAMPDU); density = MS(ni->ni_htparam, IEEE80211_HTCAP_MPDUDENSITY); IEEE80211_DPRINTF(vap, IEEE80211_MSG_11N, "%s: advertised rxmax=%d, density=%d, vap rxmax=%d, density=%d\n", __func__, rxmax, density, vap->iv_ampdu_rxmax, vap->iv_ampdu_density); /* Cap at VAP rxmax */ if (rxmax > vap->iv_ampdu_rxmax) rxmax = vap->iv_ampdu_rxmax; /* * If the VAP ampdu density value greater, use that. * * (Larger density value == larger minimum gap between A-MPDU * subframes.) */ if (vap->iv_ampdu_density > density) density = vap->iv_ampdu_density; /* * NB: Hardware might support HT40 on some but not all * channels. We can't determine this earlier because only * after association the channel is upgraded to HT based * on the negotiated capabilities. */ if (ni->ni_chan != IEEE80211_CHAN_ANYC && findhtchan(ic, ni->ni_chan, IEEE80211_CHAN_HT40U) == NULL && findhtchan(ic, ni->ni_chan, IEEE80211_CHAN_HT40D) == NULL) caps &= ~IEEE80211_HTCAP_CHWIDTH40; } else { /* override 20/40 use based on current channel */ if (IEEE80211_IS_CHAN_HT40(ni->ni_chan)) caps |= IEEE80211_HTCAP_CHWIDTH40; else caps &= ~IEEE80211_HTCAP_CHWIDTH40; /* XXX TODO should it start by using advertised settings? */ rxmax = vap->iv_ampdu_rxmax; density = vap->iv_ampdu_density; } /* adjust short GI based on channel and config */ if ((vap->iv_flags_ht & IEEE80211_FHT_SHORTGI20) == 0) caps &= ~IEEE80211_HTCAP_SHORTGI20; if ((vap->iv_flags_ht & IEEE80211_FHT_SHORTGI40) == 0 || (caps & IEEE80211_HTCAP_CHWIDTH40) == 0) caps &= ~IEEE80211_HTCAP_SHORTGI40; /* adjust STBC based on receive capabilities */ if ((vap->iv_flags_ht & IEEE80211_FHT_STBC_RX) == 0) caps &= ~IEEE80211_HTCAP_RXSTBC; /* adjust LDPC based on receive capabilites */ if ((vap->iv_flags_ht & IEEE80211_FHT_LDPC_RX) == 0) caps &= ~IEEE80211_HTCAP_LDPC; ADDSHORT(frm, caps); /* HT parameters */ *frm = SM(rxmax, IEEE80211_HTCAP_MAXRXAMPDU) | SM(density, IEEE80211_HTCAP_MPDUDENSITY) ; frm++; /* pre-zero remainder of ie */ memset(frm, 0, sizeof(struct ieee80211_ie_htcap) - __offsetof(struct ieee80211_ie_htcap, hc_mcsset)); /* supported MCS set */ /* * XXX: For sta mode the rate set should be restricted based * on the AP's capabilities, but ni_htrates isn't setup when * we're called to form an AssocReq frame so for now we're * restricted to the device capabilities. */ ieee80211_set_mcsset(ni->ni_ic, frm); frm += __offsetof(struct ieee80211_ie_htcap, hc_extcap) - __offsetof(struct ieee80211_ie_htcap, hc_mcsset); /* HT extended capabilities */ extcaps = vap->iv_htextcaps & 0xffff; ADDSHORT(frm, extcaps); frm += sizeof(struct ieee80211_ie_htcap) - __offsetof(struct ieee80211_ie_htcap, hc_txbf); return frm; #undef ADDSHORT } /* * Add 802.11n HT capabilities information element */ uint8_t * ieee80211_add_htcap(uint8_t *frm, struct ieee80211_node *ni) { frm[0] = IEEE80211_ELEMID_HTCAP; frm[1] = sizeof(struct ieee80211_ie_htcap) - 2; return ieee80211_add_htcap_body(frm + 2, ni); } /* * Non-associated probe request - add HT capabilities based on * the current channel configuration. */ static uint8_t * ieee80211_add_htcap_body_ch(uint8_t *frm, struct ieee80211vap *vap, struct ieee80211_channel *c) { #define ADDSHORT(frm, v) do { \ frm[0] = (v) & 0xff; \ frm[1] = (v) >> 8; \ frm += 2; \ } while (0) struct ieee80211com *ic = vap->iv_ic; uint16_t caps, extcaps; int rxmax, density; /* HT capabilities */ caps = vap->iv_htcaps & 0xffff; /* * We don't use this in STA mode; only in IBSS mode. * So in IBSS mode we base our HTCAP flags on the * given channel. */ /* override 20/40 use based on current channel */ if (IEEE80211_IS_CHAN_HT40(c)) caps |= IEEE80211_HTCAP_CHWIDTH40; else caps &= ~IEEE80211_HTCAP_CHWIDTH40; /* Use the currently configured values */ rxmax = vap->iv_ampdu_rxmax; density = vap->iv_ampdu_density; /* adjust short GI based on channel and config */ if ((vap->iv_flags_ht & IEEE80211_FHT_SHORTGI20) == 0) caps &= ~IEEE80211_HTCAP_SHORTGI20; if ((vap->iv_flags_ht & IEEE80211_FHT_SHORTGI40) == 0 || (caps & IEEE80211_HTCAP_CHWIDTH40) == 0) caps &= ~IEEE80211_HTCAP_SHORTGI40; ADDSHORT(frm, caps); /* HT parameters */ *frm = SM(rxmax, IEEE80211_HTCAP_MAXRXAMPDU) | SM(density, IEEE80211_HTCAP_MPDUDENSITY) ; frm++; /* pre-zero remainder of ie */ memset(frm, 0, sizeof(struct ieee80211_ie_htcap) - __offsetof(struct ieee80211_ie_htcap, hc_mcsset)); /* supported MCS set */ /* * XXX: For sta mode the rate set should be restricted based * on the AP's capabilities, but ni_htrates isn't setup when * we're called to form an AssocReq frame so for now we're * restricted to the device capabilities. */ ieee80211_set_mcsset(ic, frm); frm += __offsetof(struct ieee80211_ie_htcap, hc_extcap) - __offsetof(struct ieee80211_ie_htcap, hc_mcsset); /* HT extended capabilities */ extcaps = vap->iv_htextcaps & 0xffff; ADDSHORT(frm, extcaps); frm += sizeof(struct ieee80211_ie_htcap) - __offsetof(struct ieee80211_ie_htcap, hc_txbf); return frm; #undef ADDSHORT } /* * Add 802.11n HT capabilities information element */ uint8_t * ieee80211_add_htcap_ch(uint8_t *frm, struct ieee80211vap *vap, struct ieee80211_channel *c) { frm[0] = IEEE80211_ELEMID_HTCAP; frm[1] = sizeof(struct ieee80211_ie_htcap) - 2; return ieee80211_add_htcap_body_ch(frm + 2, vap, c); } /* * Add Broadcom OUI wrapped standard HTCAP ie; this is * used for compatibility w/ pre-draft implementations. */ uint8_t * ieee80211_add_htcap_vendor(uint8_t *frm, struct ieee80211_node *ni) { frm[0] = IEEE80211_ELEMID_VENDOR; frm[1] = 4 + sizeof(struct ieee80211_ie_htcap) - 2; frm[2] = (BCM_OUI >> 0) & 0xff; frm[3] = (BCM_OUI >> 8) & 0xff; frm[4] = (BCM_OUI >> 16) & 0xff; frm[5] = BCM_OUI_HTCAP; return ieee80211_add_htcap_body(frm + 6, ni); } /* * Construct the MCS bit mask of basic rates * for inclusion in an HT information element. */ static void ieee80211_set_basic_htrates(uint8_t *frm, const struct ieee80211_htrateset *rs) { int i; for (i = 0; i < rs->rs_nrates; i++) { int r = rs->rs_rates[i] & IEEE80211_RATE_VAL; if ((rs->rs_rates[i] & IEEE80211_RATE_BASIC) && r < IEEE80211_HTRATE_MAXSIZE) { /* NB: this assumes a particular implementation */ setbit(frm, r); } } } /* * Update the HTINFO ie for a beacon frame. */ void ieee80211_ht_update_beacon(struct ieee80211vap *vap, struct ieee80211_beacon_offsets *bo) { #define PROTMODE (IEEE80211_HTINFO_OPMODE|IEEE80211_HTINFO_NONHT_PRESENT) struct ieee80211_node *ni; const struct ieee80211_channel *bsschan; struct ieee80211com *ic = vap->iv_ic; struct ieee80211_ie_htinfo *ht = (struct ieee80211_ie_htinfo *) bo->bo_htinfo; ni = ieee80211_ref_node(vap->iv_bss); bsschan = ni->ni_chan; /* XXX only update on channel change */ ht->hi_ctrlchannel = ieee80211_chan2ieee(ic, bsschan); if (vap->iv_flags_ht & IEEE80211_FHT_RIFS) ht->hi_byte1 = IEEE80211_HTINFO_RIFSMODE_PERM; else ht->hi_byte1 = IEEE80211_HTINFO_RIFSMODE_PROH; if (IEEE80211_IS_CHAN_HT40U(bsschan)) ht->hi_byte1 |= IEEE80211_HTINFO_2NDCHAN_ABOVE; else if (IEEE80211_IS_CHAN_HT40D(bsschan)) ht->hi_byte1 |= IEEE80211_HTINFO_2NDCHAN_BELOW; else ht->hi_byte1 |= IEEE80211_HTINFO_2NDCHAN_NONE; if (IEEE80211_IS_CHAN_HT40(bsschan)) ht->hi_byte1 |= IEEE80211_HTINFO_TXWIDTH_2040; /* protection mode */ ht->hi_byte2 = (ht->hi_byte2 &~ PROTMODE) | ic->ic_curhtprotmode; ieee80211_free_node(ni); /* XXX propagate to vendor ie's */ #undef PROTMODE } /* * Add body of an HTINFO information element. * * NB: We don't use struct ieee80211_ie_htinfo because we can * be called to fillin both a standard ie and a compat ie that * has a vendor OUI at the front. */ static uint8_t * ieee80211_add_htinfo_body(uint8_t *frm, struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; /* pre-zero remainder of ie */ memset(frm, 0, sizeof(struct ieee80211_ie_htinfo) - 2); /* primary/control channel center */ *frm++ = ieee80211_chan2ieee(ic, ni->ni_chan); if (vap->iv_flags_ht & IEEE80211_FHT_RIFS) frm[0] = IEEE80211_HTINFO_RIFSMODE_PERM; else frm[0] = IEEE80211_HTINFO_RIFSMODE_PROH; if (IEEE80211_IS_CHAN_HT40U(ni->ni_chan)) frm[0] |= IEEE80211_HTINFO_2NDCHAN_ABOVE; else if (IEEE80211_IS_CHAN_HT40D(ni->ni_chan)) frm[0] |= IEEE80211_HTINFO_2NDCHAN_BELOW; else frm[0] |= IEEE80211_HTINFO_2NDCHAN_NONE; if (IEEE80211_IS_CHAN_HT40(ni->ni_chan)) frm[0] |= IEEE80211_HTINFO_TXWIDTH_2040; frm[1] = ic->ic_curhtprotmode; frm += 5; /* basic MCS set */ ieee80211_set_basic_htrates(frm, &ni->ni_htrates); frm += sizeof(struct ieee80211_ie_htinfo) - __offsetof(struct ieee80211_ie_htinfo, hi_basicmcsset); return frm; } /* * Add 802.11n HT information element. */ uint8_t * ieee80211_add_htinfo(uint8_t *frm, struct ieee80211_node *ni) { frm[0] = IEEE80211_ELEMID_HTINFO; frm[1] = sizeof(struct ieee80211_ie_htinfo) - 2; return ieee80211_add_htinfo_body(frm + 2, ni); } /* * Add Broadcom OUI wrapped standard HTINFO ie; this is * used for compatibility w/ pre-draft implementations. */ uint8_t * ieee80211_add_htinfo_vendor(uint8_t *frm, struct ieee80211_node *ni) { frm[0] = IEEE80211_ELEMID_VENDOR; frm[1] = 4 + sizeof(struct ieee80211_ie_htinfo) - 2; frm[2] = (BCM_OUI >> 0) & 0xff; frm[3] = (BCM_OUI >> 8) & 0xff; frm[4] = (BCM_OUI >> 16) & 0xff; frm[5] = BCM_OUI_HTINFO; return ieee80211_add_htinfo_body(frm + 6, ni); } Index: projects/clang1000-import/sys/net80211/ieee80211_hwmp.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_hwmp.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_hwmp.c (revision 358239) @@ -1,2085 +1,2093 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2009 The FreeBSD Foundation * All rights reserved. * * This software was developed by Rui Paulo under sponsorship from the * FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #ifdef __FreeBSD__ __FBSDID("$FreeBSD$"); #endif /* * IEEE 802.11s Hybrid Wireless Mesh Protocol, HWMP. * * Based on March 2009, D3.0 802.11s draft spec. */ #include "opt_inet.h" #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static void hwmp_vattach(struct ieee80211vap *); static void hwmp_vdetach(struct ieee80211vap *); static int hwmp_newstate(struct ieee80211vap *, enum ieee80211_state, int); static int hwmp_send_action(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], uint8_t *, size_t); static uint8_t * hwmp_add_meshpreq(uint8_t *, const struct ieee80211_meshpreq_ie *); static uint8_t * hwmp_add_meshprep(uint8_t *, const struct ieee80211_meshprep_ie *); static uint8_t * hwmp_add_meshperr(uint8_t *, const struct ieee80211_meshperr_ie *); static uint8_t * hwmp_add_meshrann(uint8_t *, const struct ieee80211_meshrann_ie *); static void hwmp_rootmode_setup(struct ieee80211vap *); static void hwmp_rootmode_cb(void *); static void hwmp_rootmode_rann_cb(void *); static void hwmp_recv_preq(struct ieee80211vap *, struct ieee80211_node *, const struct ieee80211_frame *, const struct ieee80211_meshpreq_ie *); static int hwmp_send_preq(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], struct ieee80211_meshpreq_ie *, struct timeval *, struct timeval *); static void hwmp_recv_prep(struct ieee80211vap *, struct ieee80211_node *, const struct ieee80211_frame *, const struct ieee80211_meshprep_ie *); static int hwmp_send_prep(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], struct ieee80211_meshprep_ie *); static void hwmp_recv_perr(struct ieee80211vap *, struct ieee80211_node *, const struct ieee80211_frame *, const struct ieee80211_meshperr_ie *); static int hwmp_send_perr(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], struct ieee80211_meshperr_ie *); static void hwmp_senderror(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], struct ieee80211_mesh_route *, int); static void hwmp_recv_rann(struct ieee80211vap *, struct ieee80211_node *, const struct ieee80211_frame *, const struct ieee80211_meshrann_ie *); static int hwmp_send_rann(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], struct ieee80211_meshrann_ie *); static struct ieee80211_node * hwmp_discover(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], struct mbuf *); static void hwmp_peerdown(struct ieee80211_node *); static struct timeval ieee80211_hwmp_preqminint = { 0, 100000 }; static struct timeval ieee80211_hwmp_perrminint = { 0, 100000 }; /* NB: the Target Address set in a Proactive PREQ is the broadcast address. */ static const uint8_t broadcastaddr[IEEE80211_ADDR_LEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; typedef uint32_t ieee80211_hwmp_seq; #define HWMP_SEQ_LT(a, b) ((int32_t)((a)-(b)) < 0) #define HWMP_SEQ_LEQ(a, b) ((int32_t)((a)-(b)) <= 0) #define HWMP_SEQ_EQ(a, b) ((int32_t)((a)-(b)) == 0) #define HWMP_SEQ_GT(a, b) ((int32_t)((a)-(b)) > 0) #define HWMP_SEQ_MAX(a, b) (a > b ? a : b) /* * Private extension of ieee80211_mesh_route. */ struct ieee80211_hwmp_route { ieee80211_hwmp_seq hr_seq; /* last HWMP seq seen from dst*/ ieee80211_hwmp_seq hr_preqid; /* last PREQ ID seen from dst */ ieee80211_hwmp_seq hr_origseq; /* seq. no. on our latest PREQ*/ struct timeval hr_lastpreq; /* last time we sent a PREQ */ struct timeval hr_lastrootconf; /* last sent PREQ root conf */ int hr_preqretries; /* number of discoveries */ int hr_lastdiscovery; /* last discovery in ticks */ }; struct ieee80211_hwmp_state { ieee80211_hwmp_seq hs_seq; /* next seq to be used */ ieee80211_hwmp_seq hs_preqid; /* next PREQ ID to be used */ int hs_rootmode; /* proactive HWMP */ struct timeval hs_lastperr; /* last time we sent a PERR */ struct callout hs_roottimer; uint8_t hs_maxhops; /* max hop count */ }; -static SYSCTL_NODE(_net_wlan, OID_AUTO, hwmp, CTLFLAG_RD, 0, +static SYSCTL_NODE(_net_wlan, OID_AUTO, hwmp, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, "IEEE 802.11s HWMP parameters"); static int ieee80211_hwmp_targetonly = 0; SYSCTL_INT(_net_wlan_hwmp, OID_AUTO, targetonly, CTLFLAG_RW, &ieee80211_hwmp_targetonly, 0, "Set TO bit on generated PREQs"); static int ieee80211_hwmp_pathtimeout = -1; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, pathlifetime, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, pathlifetime, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, &ieee80211_hwmp_pathtimeout, 0, ieee80211_sysctl_msecs_ticks, "I", "path entry lifetime (ms)"); static int ieee80211_hwmp_maxpreq_retries = -1; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, maxpreq_retries, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, maxpreq_retries, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, &ieee80211_hwmp_maxpreq_retries, 0, ieee80211_sysctl_msecs_ticks, "I", "maximum number of preq retries"); static int ieee80211_hwmp_net_diameter_traversaltime = -1; SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, net_diameter_traversal_time, - CTLTYPE_INT | CTLFLAG_RW, &ieee80211_hwmp_net_diameter_traversaltime, 0, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + &ieee80211_hwmp_net_diameter_traversaltime, 0, ieee80211_sysctl_msecs_ticks, "I", "estimate travelse time across the MBSS (ms)"); static int ieee80211_hwmp_roottimeout = -1; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, roottimeout, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, roottimeout, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, &ieee80211_hwmp_roottimeout, 0, ieee80211_sysctl_msecs_ticks, "I", "root PREQ timeout (ms)"); static int ieee80211_hwmp_rootint = -1; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, rootint, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, rootint, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, &ieee80211_hwmp_rootint, 0, ieee80211_sysctl_msecs_ticks, "I", "root interval (ms)"); static int ieee80211_hwmp_rannint = -1; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, rannint, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, rannint, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, &ieee80211_hwmp_rannint, 0, ieee80211_sysctl_msecs_ticks, "I", "root announcement interval (ms)"); static struct timeval ieee80211_hwmp_rootconfint = { 0, 0 }; static int ieee80211_hwmp_rootconfint_internal = -1; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, rootconfint, CTLTYPE_INT | CTLFLAG_RD, +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, rootconfint, + CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, &ieee80211_hwmp_rootconfint_internal, 0, ieee80211_sysctl_msecs_ticks, "I", "root confirmation interval (ms) (read-only)"); #define IEEE80211_HWMP_DEFAULT_MAXHOPS 31 static ieee80211_recv_action_func hwmp_recv_action_meshpath; static struct ieee80211_mesh_proto_path mesh_proto_hwmp = { .mpp_descr = "HWMP", .mpp_ie = IEEE80211_MESHCONF_PATH_HWMP, .mpp_discover = hwmp_discover, .mpp_peerdown = hwmp_peerdown, .mpp_senderror = hwmp_senderror, .mpp_vattach = hwmp_vattach, .mpp_vdetach = hwmp_vdetach, .mpp_newstate = hwmp_newstate, .mpp_privlen = sizeof(struct ieee80211_hwmp_route), }; -SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, inact, CTLTYPE_INT | CTLFLAG_RW, - &mesh_proto_hwmp.mpp_inact, 0, ieee80211_sysctl_msecs_ticks, "I", - "mesh route inactivity timeout (ms)"); +SYSCTL_PROC(_net_wlan_hwmp, OID_AUTO, inact, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &mesh_proto_hwmp.mpp_inact, 0, ieee80211_sysctl_msecs_ticks, "I", + "mesh route inactivity timeout (ms)"); static void ieee80211_hwmp_init(void) { /* Default values as per amendment */ ieee80211_hwmp_pathtimeout = msecs_to_ticks(5*1000); ieee80211_hwmp_roottimeout = msecs_to_ticks(5*1000); ieee80211_hwmp_rootint = msecs_to_ticks(2*1000); ieee80211_hwmp_rannint = msecs_to_ticks(1*1000); ieee80211_hwmp_rootconfint_internal = msecs_to_ticks(2*1000); ieee80211_hwmp_maxpreq_retries = 3; /* * (TU): A measurement of time equal to 1024 μs, * 500 TU is 512 ms. */ ieee80211_hwmp_net_diameter_traversaltime = msecs_to_ticks(512); /* * NB: I dont know how to make SYSCTL_PROC that calls ms to ticks * and return a struct timeval... */ ieee80211_hwmp_rootconfint.tv_usec = ieee80211_hwmp_rootconfint_internal * 1000; /* * Register action frame handler. */ ieee80211_recv_action_register(IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_HWMP, hwmp_recv_action_meshpath); /* NB: default is 5 secs per spec */ mesh_proto_hwmp.mpp_inact = msecs_to_ticks(5*1000); /* * Register HWMP. */ ieee80211_mesh_register_proto_path(&mesh_proto_hwmp); } SYSINIT(wlan_hwmp, SI_SUB_DRIVERS, SI_ORDER_SECOND, ieee80211_hwmp_init, NULL); static void hwmp_vattach(struct ieee80211vap *vap) { struct ieee80211_hwmp_state *hs; KASSERT(vap->iv_opmode == IEEE80211_M_MBSS, ("not a mesh vap, opmode %d", vap->iv_opmode)); hs = IEEE80211_MALLOC(sizeof(struct ieee80211_hwmp_state), M_80211_VAP, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (hs == NULL) { printf("%s: couldn't alloc HWMP state\n", __func__); return; } hs->hs_maxhops = IEEE80211_HWMP_DEFAULT_MAXHOPS; callout_init(&hs->hs_roottimer, 1); vap->iv_hwmp = hs; } static void hwmp_vdetach(struct ieee80211vap *vap) { struct ieee80211_hwmp_state *hs = vap->iv_hwmp; callout_drain(&hs->hs_roottimer); IEEE80211_FREE(vap->iv_hwmp, M_80211_VAP); vap->iv_hwmp = NULL; } static int hwmp_newstate(struct ieee80211vap *vap, enum ieee80211_state ostate, int arg) { enum ieee80211_state nstate = vap->iv_state; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; IEEE80211_DPRINTF(vap, IEEE80211_MSG_STATE, "%s: %s -> %s (%d)\n", __func__, ieee80211_state_name[ostate], ieee80211_state_name[nstate], arg); if (nstate != IEEE80211_S_RUN && ostate == IEEE80211_S_RUN) callout_drain(&hs->hs_roottimer); if (nstate == IEEE80211_S_RUN) hwmp_rootmode_setup(vap); return 0; } /* * Verify the length of an HWMP PREQ and return the number * of destinations >= 1, if verification fails -1 is returned. */ static int verify_mesh_preq_len(struct ieee80211vap *vap, const struct ieee80211_frame *wh, const uint8_t *iefrm) { int alloc_sz = -1; int ndest = -1; if (iefrm[2] & IEEE80211_MESHPREQ_FLAGS_AE) { /* Originator External Address present */ alloc_sz = IEEE80211_MESHPREQ_BASE_SZ_AE; ndest = iefrm[IEEE80211_MESHPREQ_TCNT_OFFSET_AE]; } else { /* w/o Originator External Address */ alloc_sz = IEEE80211_MESHPREQ_BASE_SZ; ndest = iefrm[IEEE80211_MESHPREQ_TCNT_OFFSET]; } alloc_sz += ndest * IEEE80211_MESHPREQ_TRGT_SZ; if(iefrm[1] != (alloc_sz)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_HWMP, wh, NULL, "PREQ (AE=%s) with wrong len", iefrm[2] & IEEE80211_MESHPREQ_FLAGS_AE ? "1" : "0"); return (-1); } return ndest; } /* * Verify the length of an HWMP PREP and returns 1 on success, * otherwise -1. */ static int verify_mesh_prep_len(struct ieee80211vap *vap, const struct ieee80211_frame *wh, const uint8_t *iefrm) { int alloc_sz = -1; if (iefrm[2] & IEEE80211_MESHPREP_FLAGS_AE) { if (iefrm[1] == IEEE80211_MESHPREP_BASE_SZ_AE) alloc_sz = IEEE80211_MESHPREP_BASE_SZ_AE; } else if (iefrm[1] == IEEE80211_MESHPREP_BASE_SZ) alloc_sz = IEEE80211_MESHPREP_BASE_SZ; if(alloc_sz < 0) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_HWMP, wh, NULL, "PREP (AE=%s) with wrong len", iefrm[2] & IEEE80211_MESHPREP_FLAGS_AE ? "1" : "0"); return (-1); } return (1); } /* * Verify the length of an HWMP PERR and return the number * of destinations >= 1, if verification fails -1 is returned. */ static int verify_mesh_perr_len(struct ieee80211vap *vap, const struct ieee80211_frame *wh, const uint8_t *iefrm) { int alloc_sz = -1; const uint8_t *iefrm_t = iefrm; uint8_t ndest = iefrm_t[IEEE80211_MESHPERR_NDEST_OFFSET]; int i; if(ndest > IEEE80211_MESHPERR_MAXDEST) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_HWMP, wh, NULL, "PERR with wrong number of destionat (>19), %u", ndest); return (-1); } iefrm_t += IEEE80211_MESHPERR_NDEST_OFFSET + 1; /* flag is next field */ /* We need to check each destionation flag to know size */ for(i = 0; ini_vap; struct ieee80211_meshpreq_ie *preq; struct ieee80211_meshprep_ie *prep; struct ieee80211_meshperr_ie *perr; struct ieee80211_meshrann_ie rann; const uint8_t *iefrm = frm + 2; /* action + code */ const uint8_t *iefrm_t = iefrm; /* temporary pointer */ int ndest = -1; int found = 0; while (efrm - iefrm > 1) { IEEE80211_VERIFY_LENGTH(efrm - iefrm, iefrm[1] + 2, return 0); switch (*iefrm) { case IEEE80211_ELEMID_MESHPREQ: { int i = 0; iefrm_t = iefrm; ndest = verify_mesh_preq_len(vap, wh, iefrm_t); if (ndest < 0) { vap->iv_stats.is_rx_mgtdiscard++; break; } preq = IEEE80211_MALLOC(sizeof(*preq) + (ndest - 1) * sizeof(*preq->preq_targets), M_80211_MESH_PREQ, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); KASSERT(preq != NULL, ("preq == NULL")); preq->preq_ie = *iefrm_t++; preq->preq_len = *iefrm_t++; preq->preq_flags = *iefrm_t++; preq->preq_hopcount = *iefrm_t++; preq->preq_ttl = *iefrm_t++; preq->preq_id = le32dec(iefrm_t); iefrm_t += 4; IEEE80211_ADDR_COPY(preq->preq_origaddr, iefrm_t); iefrm_t += 6; preq->preq_origseq = le32dec(iefrm_t); iefrm_t += 4; /* NB: may have Originator Proxied Address */ if (preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_AE) { IEEE80211_ADDR_COPY( preq->preq_orig_ext_addr, iefrm_t); iefrm_t += 6; } preq->preq_lifetime = le32dec(iefrm_t); iefrm_t += 4; preq->preq_metric = le32dec(iefrm_t); iefrm_t += 4; preq->preq_tcount = *iefrm_t++; for (i = 0; i < preq->preq_tcount; i++) { preq->preq_targets[i].target_flags = *iefrm_t++; IEEE80211_ADDR_COPY( preq->preq_targets[i].target_addr, iefrm_t); iefrm_t += 6; preq->preq_targets[i].target_seq = le32dec(iefrm_t); iefrm_t += 4; } hwmp_recv_preq(vap, ni, wh, preq); IEEE80211_FREE(preq, M_80211_MESH_PREQ); found++; break; } case IEEE80211_ELEMID_MESHPREP: { iefrm_t = iefrm; ndest = verify_mesh_prep_len(vap, wh, iefrm_t); if (ndest < 0) { vap->iv_stats.is_rx_mgtdiscard++; break; } prep = IEEE80211_MALLOC(sizeof(*prep), M_80211_MESH_PREP, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); KASSERT(prep != NULL, ("prep == NULL")); prep->prep_ie = *iefrm_t++; prep->prep_len = *iefrm_t++; prep->prep_flags = *iefrm_t++; prep->prep_hopcount = *iefrm_t++; prep->prep_ttl = *iefrm_t++; IEEE80211_ADDR_COPY(prep->prep_targetaddr, iefrm_t); iefrm_t += 6; prep->prep_targetseq = le32dec(iefrm_t); iefrm_t += 4; /* NB: May have Target Proxied Address */ if (prep->prep_flags & IEEE80211_MESHPREP_FLAGS_AE) { IEEE80211_ADDR_COPY( prep->prep_target_ext_addr, iefrm_t); iefrm_t += 6; } prep->prep_lifetime = le32dec(iefrm_t); iefrm_t += 4; prep->prep_metric = le32dec(iefrm_t); iefrm_t += 4; IEEE80211_ADDR_COPY(prep->prep_origaddr, iefrm_t); iefrm_t += 6; prep->prep_origseq = le32dec(iefrm_t); iefrm_t += 4; hwmp_recv_prep(vap, ni, wh, prep); IEEE80211_FREE(prep, M_80211_MESH_PREP); found++; break; } case IEEE80211_ELEMID_MESHPERR: { int i = 0; iefrm_t = iefrm; ndest = verify_mesh_perr_len(vap, wh, iefrm_t); if (ndest < 0) { vap->iv_stats.is_rx_mgtdiscard++; break; } perr = IEEE80211_MALLOC(sizeof(*perr) + (ndest - 1) * sizeof(*perr->perr_dests), M_80211_MESH_PERR, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); KASSERT(perr != NULL, ("perr == NULL")); perr->perr_ie = *iefrm_t++; perr->perr_len = *iefrm_t++; perr->perr_ttl = *iefrm_t++; perr->perr_ndests = *iefrm_t++; for (i = 0; iperr_ndests; i++) { perr->perr_dests[i].dest_flags = *iefrm_t++; IEEE80211_ADDR_COPY( perr->perr_dests[i].dest_addr, iefrm_t); iefrm_t += 6; perr->perr_dests[i].dest_seq = le32dec(iefrm_t); iefrm_t += 4; /* NB: May have Target Proxied Address */ if (perr->perr_dests[i].dest_flags & IEEE80211_MESHPERR_FLAGS_AE) { IEEE80211_ADDR_COPY( perr->perr_dests[i].dest_ext_addr, iefrm_t); iefrm_t += 6; } perr->perr_dests[i].dest_rcode = le16dec(iefrm_t); iefrm_t += 2; } hwmp_recv_perr(vap, ni, wh, perr); IEEE80211_FREE(perr, M_80211_MESH_PERR); found++; break; } case IEEE80211_ELEMID_MESHRANN: { const struct ieee80211_meshrann_ie *mrann = (const struct ieee80211_meshrann_ie *) iefrm; if (mrann->rann_len != sizeof(struct ieee80211_meshrann_ie) - 2) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_HWMP, wh, NULL, "%s", "RAN with wrong len"); vap->iv_stats.is_rx_mgtdiscard++; return 1; } memcpy(&rann, mrann, sizeof(rann)); rann.rann_seq = le32dec(&mrann->rann_seq); rann.rann_interval = le32dec(&mrann->rann_interval); rann.rann_metric = le32dec(&mrann->rann_metric); hwmp_recv_rann(vap, ni, wh, &rann); found++; break; } } iefrm += iefrm[1] + 2; } if (!found) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_HWMP, wh, NULL, "%s", "PATH SEL action without IE"); vap->iv_stats.is_rx_mgtdiscard++; } return 0; } static int hwmp_send_action(struct ieee80211vap *vap, const uint8_t da[IEEE80211_ADDR_LEN], uint8_t *ie, size_t len) { struct ieee80211_node *ni; struct ieee80211com *ic; struct ieee80211_bpf_params params; struct mbuf *m; uint8_t *frm; int ret; if (IEEE80211_IS_MULTICAST(da)) { ni = ieee80211_ref_node(vap->iv_bss); #ifdef IEEE80211_DEBUG_REFCNT IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); #endif ieee80211_ref_node(ni); } else ni = ieee80211_mesh_find_txnode(vap, da); if (vap->iv_state == IEEE80211_S_CAC) { IEEE80211_NOTE(vap, IEEE80211_MSG_OUTPUT, ni, "block %s frame in CAC state", "HWMP action"); vap->iv_stats.is_tx_badstate++; return EIO; /* XXX */ } KASSERT(ni != NULL, ("null node")); ic = ni->ni_ic; m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(struct ieee80211_action) + len ); if (m == NULL) { ieee80211_free_node(ni); vap->iv_stats.is_tx_nobuf++; return ENOMEM; } *frm++ = IEEE80211_ACTION_CAT_MESH; *frm++ = IEEE80211_ACTION_MESH_HWMP; switch (*ie) { case IEEE80211_ELEMID_MESHPREQ: frm = hwmp_add_meshpreq(frm, (struct ieee80211_meshpreq_ie *)ie); break; case IEEE80211_ELEMID_MESHPREP: frm = hwmp_add_meshprep(frm, (struct ieee80211_meshprep_ie *)ie); break; case IEEE80211_ELEMID_MESHPERR: frm = hwmp_add_meshperr(frm, (struct ieee80211_meshperr_ie *)ie); break; case IEEE80211_ELEMID_MESHRANN: frm = hwmp_add_meshrann(frm, (struct ieee80211_meshrann_ie *)ie); break; } m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); M_PREPEND(m, sizeof(struct ieee80211_frame), M_NOWAIT); if (m == NULL) { ieee80211_free_node(ni); vap->iv_stats.is_tx_nobuf++; return ENOMEM; } IEEE80211_TX_LOCK(ic); ieee80211_send_setup(ni, m, IEEE80211_FC0_TYPE_MGT | IEEE80211_FC0_SUBTYPE_ACTION, IEEE80211_NONQOS_TID, vap->iv_myaddr, da, vap->iv_myaddr); m->m_flags |= M_ENCAP; /* mark encapsulated */ IEEE80211_NODE_STAT(ni, tx_mgmt); memset(¶ms, 0, sizeof(params)); params.ibp_pri = WME_AC_VO; params.ibp_rate0 = ni->ni_txparms->mgmtrate; if (IEEE80211_IS_MULTICAST(da)) params.ibp_try0 = 1; else params.ibp_try0 = ni->ni_txparms->maxretry; params.ibp_power = ni->ni_txpower; ret = ieee80211_raw_output(vap, ni, m, ¶ms); IEEE80211_TX_UNLOCK(ic); return (ret); } #define ADDSHORT(frm, v) do { \ le16enc(frm, v); \ frm += 2; \ } while (0) #define ADDWORD(frm, v) do { \ le32enc(frm, v); \ frm += 4; \ } while (0) /* * Add a Mesh Path Request IE to a frame. */ #define PREQ_TFLAGS(n) preq->preq_targets[n].target_flags #define PREQ_TADDR(n) preq->preq_targets[n].target_addr #define PREQ_TSEQ(n) preq->preq_targets[n].target_seq static uint8_t * hwmp_add_meshpreq(uint8_t *frm, const struct ieee80211_meshpreq_ie *preq) { int i; *frm++ = IEEE80211_ELEMID_MESHPREQ; *frm++ = preq->preq_len; /* len already calculated */ *frm++ = preq->preq_flags; *frm++ = preq->preq_hopcount; *frm++ = preq->preq_ttl; ADDWORD(frm, preq->preq_id); IEEE80211_ADDR_COPY(frm, preq->preq_origaddr); frm += 6; ADDWORD(frm, preq->preq_origseq); if (preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_AE) { IEEE80211_ADDR_COPY(frm, preq->preq_orig_ext_addr); frm += 6; } ADDWORD(frm, preq->preq_lifetime); ADDWORD(frm, preq->preq_metric); *frm++ = preq->preq_tcount; for (i = 0; i < preq->preq_tcount; i++) { *frm++ = PREQ_TFLAGS(i); IEEE80211_ADDR_COPY(frm, PREQ_TADDR(i)); frm += 6; ADDWORD(frm, PREQ_TSEQ(i)); } return frm; } #undef PREQ_TFLAGS #undef PREQ_TADDR #undef PREQ_TSEQ /* * Add a Mesh Path Reply IE to a frame. */ static uint8_t * hwmp_add_meshprep(uint8_t *frm, const struct ieee80211_meshprep_ie *prep) { *frm++ = IEEE80211_ELEMID_MESHPREP; *frm++ = prep->prep_len; /* len already calculated */ *frm++ = prep->prep_flags; *frm++ = prep->prep_hopcount; *frm++ = prep->prep_ttl; IEEE80211_ADDR_COPY(frm, prep->prep_targetaddr); frm += 6; ADDWORD(frm, prep->prep_targetseq); if (prep->prep_flags & IEEE80211_MESHPREP_FLAGS_AE) { IEEE80211_ADDR_COPY(frm, prep->prep_target_ext_addr); frm += 6; } ADDWORD(frm, prep->prep_lifetime); ADDWORD(frm, prep->prep_metric); IEEE80211_ADDR_COPY(frm, prep->prep_origaddr); frm += 6; ADDWORD(frm, prep->prep_origseq); return frm; } /* * Add a Mesh Path Error IE to a frame. */ #define PERR_DFLAGS(n) perr->perr_dests[n].dest_flags #define PERR_DADDR(n) perr->perr_dests[n].dest_addr #define PERR_DSEQ(n) perr->perr_dests[n].dest_seq #define PERR_EXTADDR(n) perr->perr_dests[n].dest_ext_addr #define PERR_DRCODE(n) perr->perr_dests[n].dest_rcode static uint8_t * hwmp_add_meshperr(uint8_t *frm, const struct ieee80211_meshperr_ie *perr) { int i; *frm++ = IEEE80211_ELEMID_MESHPERR; *frm++ = perr->perr_len; /* len already calculated */ *frm++ = perr->perr_ttl; *frm++ = perr->perr_ndests; for (i = 0; i < perr->perr_ndests; i++) { *frm++ = PERR_DFLAGS(i); IEEE80211_ADDR_COPY(frm, PERR_DADDR(i)); frm += 6; ADDWORD(frm, PERR_DSEQ(i)); if (PERR_DFLAGS(i) & IEEE80211_MESHPERR_FLAGS_AE) { IEEE80211_ADDR_COPY(frm, PERR_EXTADDR(i)); frm += 6; } ADDSHORT(frm, PERR_DRCODE(i)); } return frm; } #undef PERR_DFLAGS #undef PERR_DADDR #undef PERR_DSEQ #undef PERR_EXTADDR #undef PERR_DRCODE /* * Add a Root Annoucement IE to a frame. */ static uint8_t * hwmp_add_meshrann(uint8_t *frm, const struct ieee80211_meshrann_ie *rann) { *frm++ = IEEE80211_ELEMID_MESHRANN; *frm++ = rann->rann_len; *frm++ = rann->rann_flags; *frm++ = rann->rann_hopcount; *frm++ = rann->rann_ttl; IEEE80211_ADDR_COPY(frm, rann->rann_addr); frm += 6; ADDWORD(frm, rann->rann_seq); ADDWORD(frm, rann->rann_interval); ADDWORD(frm, rann->rann_metric); return frm; } static void hwmp_rootmode_setup(struct ieee80211vap *vap) { struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_state *ms = vap->iv_mesh; switch (hs->hs_rootmode) { case IEEE80211_HWMP_ROOTMODE_DISABLED: callout_drain(&hs->hs_roottimer); ms->ms_flags &= ~IEEE80211_MESHFLAGS_ROOT; break; case IEEE80211_HWMP_ROOTMODE_NORMAL: case IEEE80211_HWMP_ROOTMODE_PROACTIVE: callout_reset(&hs->hs_roottimer, ieee80211_hwmp_rootint, hwmp_rootmode_cb, vap); ms->ms_flags |= IEEE80211_MESHFLAGS_ROOT; break; case IEEE80211_HWMP_ROOTMODE_RANN: callout_reset(&hs->hs_roottimer, ieee80211_hwmp_rannint, hwmp_rootmode_rann_cb, vap); ms->ms_flags |= IEEE80211_MESHFLAGS_ROOT; break; } } /* * Send a broadcast Path Request to find all nodes on the mesh. We are * called when the vap is configured as a HWMP root node. */ #define PREQ_TFLAGS(n) preq.preq_targets[n].target_flags #define PREQ_TADDR(n) preq.preq_targets[n].target_addr #define PREQ_TSEQ(n) preq.preq_targets[n].target_seq static void hwmp_rootmode_cb(void *arg) { struct ieee80211vap *vap = (struct ieee80211vap *)arg; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_meshpreq_ie preq; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, vap->iv_bss, "%s", "send broadcast PREQ"); preq.preq_flags = 0; if (ms->ms_flags & IEEE80211_MESHFLAGS_GATE) preq.preq_flags |= IEEE80211_MESHPREQ_FLAGS_GATE; if (hs->hs_rootmode == IEEE80211_HWMP_ROOTMODE_PROACTIVE) preq.preq_flags |= IEEE80211_MESHPREQ_FLAGS_PP; preq.preq_hopcount = 0; preq.preq_ttl = ms->ms_ttl; preq.preq_id = ++hs->hs_preqid; IEEE80211_ADDR_COPY(preq.preq_origaddr, vap->iv_myaddr); preq.preq_origseq = ++hs->hs_seq; preq.preq_lifetime = ticks_to_msecs(ieee80211_hwmp_roottimeout); preq.preq_metric = IEEE80211_MESHLMETRIC_INITIALVAL; preq.preq_tcount = 1; IEEE80211_ADDR_COPY(PREQ_TADDR(0), broadcastaddr); PREQ_TFLAGS(0) = IEEE80211_MESHPREQ_TFLAGS_TO | IEEE80211_MESHPREQ_TFLAGS_USN; PREQ_TSEQ(0) = 0; vap->iv_stats.is_hwmp_rootreqs++; /* NB: we enforce rate check ourself */ hwmp_send_preq(vap, broadcastaddr, &preq, NULL, NULL); hwmp_rootmode_setup(vap); } #undef PREQ_TFLAGS #undef PREQ_TADDR #undef PREQ_TSEQ /* * Send a Root Annoucement (RANN) to find all the nodes on the mesh. We are * called when the vap is configured as a HWMP RANN root node. */ static void hwmp_rootmode_rann_cb(void *arg) { struct ieee80211vap *vap = (struct ieee80211vap *)arg; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_meshrann_ie rann; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, vap->iv_bss, "%s", "send broadcast RANN"); rann.rann_flags = 0; if (ms->ms_flags & IEEE80211_MESHFLAGS_GATE) rann.rann_flags |= IEEE80211_MESHFLAGS_GATE; rann.rann_hopcount = 0; rann.rann_ttl = ms->ms_ttl; IEEE80211_ADDR_COPY(rann.rann_addr, vap->iv_myaddr); rann.rann_seq = ++hs->hs_seq; rann.rann_interval = ieee80211_hwmp_rannint; rann.rann_metric = IEEE80211_MESHLMETRIC_INITIALVAL; vap->iv_stats.is_hwmp_rootrann++; hwmp_send_rann(vap, broadcastaddr, &rann); hwmp_rootmode_setup(vap); } /* * Update forwarding information to TA if metric improves. */ static void hwmp_update_transmitter(struct ieee80211vap *vap, struct ieee80211_node *ni, const char *hwmp_frame) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rttran = NULL; /* Transmitter */ int metric = 0; rttran = ieee80211_mesh_rt_find(vap, ni->ni_macaddr); if (rttran == NULL) { rttran = ieee80211_mesh_rt_add(vap, ni->ni_macaddr); if (rttran == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "unable to add path to transmitter %6D of %s", ni->ni_macaddr, ":", hwmp_frame); vap->iv_stats.is_mesh_rtaddfailed++; return; } } metric = ms->ms_pmetric->mpm_metric(ni); if (!(rttran->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) || rttran->rt_metric > metric) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "%s path to transmitter %6D of %s, metric %d:%d", rttran->rt_flags & IEEE80211_MESHRT_FLAGS_VALID ? "prefer" : "update", ni->ni_macaddr, ":", hwmp_frame, rttran->rt_metric, metric); IEEE80211_ADDR_COPY(rttran->rt_nexthop, ni->ni_macaddr); rttran->rt_metric = metric; rttran->rt_nhops = 1; ieee80211_mesh_rt_update(rttran, ms->ms_ppath->mpp_inact); rttran->rt_flags = IEEE80211_MESHRT_FLAGS_VALID; } } #define PREQ_TFLAGS(n) preq->preq_targets[n].target_flags #define PREQ_TADDR(n) preq->preq_targets[n].target_addr #define PREQ_TSEQ(n) preq->preq_targets[n].target_seq static void hwmp_recv_preq(struct ieee80211vap *vap, struct ieee80211_node *ni, const struct ieee80211_frame *wh, const struct ieee80211_meshpreq_ie *preq) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rtorig = NULL; struct ieee80211_mesh_route *rtorig_ext = NULL; struct ieee80211_mesh_route *rttarg = NULL; struct ieee80211_hwmp_route *hrorig = NULL; struct ieee80211_hwmp_route *hrtarg = NULL; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; ieee80211_hwmp_seq preqid; /* last seen preqid for orig */ uint32_t metric = 0; /* * Ignore PREQs from us. Could happen because someone forward it * back to us. */ if (IEEE80211_ADDR_EQ(vap->iv_myaddr, preq->preq_origaddr)) return; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "received PREQ, orig %6D, targ(0) %6D", preq->preq_origaddr, ":", PREQ_TADDR(0), ":"); /* * Acceptance criteria: (if the PREQ is not for us or not broadcast, * or an external mac address not proxied by us), * AND forwarding is disabled, discard this PREQ. */ rttarg = ieee80211_mesh_rt_find(vap, PREQ_TADDR(0)); if (!(ms->ms_flags & IEEE80211_MESHFLAGS_FWD) && (!IEEE80211_ADDR_EQ(vap->iv_myaddr, PREQ_TADDR(0)) || !IEEE80211_IS_MULTICAST(PREQ_TADDR(0)) || (rttarg != NULL && rttarg->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY && IEEE80211_ADDR_EQ(vap->iv_myaddr, rttarg->rt_mesh_gate)))) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_HWMP, preq->preq_origaddr, NULL, "%s", "not accepting PREQ"); return; } /* * Acceptance criteria: if unicast addressed * AND no valid forwarding for Target of PREQ, discard this PREQ. */ if(rttarg != NULL) hrtarg = IEEE80211_MESH_ROUTE_PRIV(rttarg, struct ieee80211_hwmp_route); /* Address mode: ucast */ if(preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_AM && rttarg == NULL && !IEEE80211_ADDR_EQ(vap->iv_myaddr, PREQ_TADDR(0))) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_HWMP, preq->preq_origaddr, NULL, "unicast addressed PREQ of unknown target %6D", PREQ_TADDR(0), ":"); return; } /* PREQ ACCEPTED */ rtorig = ieee80211_mesh_rt_find(vap, preq->preq_origaddr); if (rtorig == NULL) { rtorig = ieee80211_mesh_rt_add(vap, preq->preq_origaddr); if (rtorig == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "unable to add orig path to %6D", preq->preq_origaddr, ":"); vap->iv_stats.is_mesh_rtaddfailed++; return; } IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "adding originator %6D", preq->preq_origaddr, ":"); } hrorig = IEEE80211_MESH_ROUTE_PRIV(rtorig, struct ieee80211_hwmp_route); /* record last seen preqid */ preqid = hrorig->hr_preqid; hrorig->hr_preqid = HWMP_SEQ_MAX(hrorig->hr_preqid, preq->preq_id); /* Data creation and update of forwarding information * according to Table 11C-8 for originator mesh STA. */ metric = preq->preq_metric + ms->ms_pmetric->mpm_metric(ni); if (HWMP_SEQ_GT(preq->preq_origseq, hrorig->hr_seq) || (HWMP_SEQ_EQ(preq->preq_origseq, hrorig->hr_seq) && metric < rtorig->rt_metric)) { hrorig->hr_seq = preq->preq_origseq; IEEE80211_ADDR_COPY(rtorig->rt_nexthop, wh->i_addr2); rtorig->rt_metric = metric; rtorig->rt_nhops = preq->preq_hopcount + 1; ieee80211_mesh_rt_update(rtorig, preq->preq_lifetime); /* Path to orig is valid now. * NB: we know it can't be Proxy, and if it is GATE * it will be marked below. */ rtorig->rt_flags = IEEE80211_MESHRT_FLAGS_VALID; } else if ((hrtarg != NULL && !HWMP_SEQ_EQ(hrtarg->hr_seq, PREQ_TSEQ(0))) || (rtorig->rt_flags & IEEE80211_MESHRT_FLAGS_VALID && preqid >= preq->preq_id)) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "discard PREQ from %6D, old seqno %u <= %u," " or old preqid %u < %u", preq->preq_origaddr, ":", preq->preq_origseq, hrorig->hr_seq, preq->preq_id, preqid); return; } /* Update forwarding information to TA if metric improves. */ hwmp_update_transmitter(vap, ni, "PREQ"); /* * Check if the PREQ is addressed to us. * or a Proxy currently gated by us. */ if (IEEE80211_ADDR_EQ(vap->iv_myaddr, PREQ_TADDR(0)) || (ms->ms_flags & IEEE80211_MESHFLAGS_GATE && rttarg != NULL && IEEE80211_ADDR_EQ(vap->iv_myaddr, rttarg->rt_mesh_gate) && rttarg->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY && rttarg->rt_flags & IEEE80211_MESHRT_FLAGS_VALID)) { struct ieee80211_meshprep_ie prep; /* * When we are the target we shall update our own HWMP seq * number with max of (current and preq->seq) + 1 */ hs->hs_seq = HWMP_SEQ_MAX(hs->hs_seq, PREQ_TSEQ(0)) + 1; prep.prep_flags = 0; prep.prep_hopcount = 0; prep.prep_metric = IEEE80211_MESHLMETRIC_INITIALVAL; IEEE80211_ADDR_COPY(prep.prep_targetaddr, vap->iv_myaddr); if (rttarg != NULL && /* if NULL it means we are the target */ rttarg->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "reply for proxy %6D", rttarg->rt_dest, ":"); prep.prep_flags |= IEEE80211_MESHPREP_FLAGS_AE; IEEE80211_ADDR_COPY(prep.prep_target_ext_addr, rttarg->rt_dest); /* update proxy seqno to HWMP seqno */ rttarg->rt_ext_seq = hs->hs_seq; prep.prep_hopcount = rttarg->rt_nhops; prep.prep_metric = rttarg->rt_metric; IEEE80211_ADDR_COPY(prep.prep_targetaddr, rttarg->rt_mesh_gate); } /* * Build and send a PREP frame. */ prep.prep_ttl = ms->ms_ttl; prep.prep_targetseq = hs->hs_seq; prep.prep_lifetime = preq->preq_lifetime; IEEE80211_ADDR_COPY(prep.prep_origaddr, preq->preq_origaddr); prep.prep_origseq = preq->preq_origseq; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "reply to %6D", preq->preq_origaddr, ":"); hwmp_send_prep(vap, wh->i_addr2, &prep); return; } /* we may update our proxy information for the orig external */ else if (preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_AE) { rtorig_ext = ieee80211_mesh_rt_find(vap, preq->preq_orig_ext_addr); if (rtorig_ext == NULL) { rtorig_ext = ieee80211_mesh_rt_add(vap, preq->preq_orig_ext_addr); if (rtorig_ext == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "unable to add orig ext proxy to %6D", preq->preq_orig_ext_addr, ":"); vap->iv_stats.is_mesh_rtaddfailed++; return; } IEEE80211_ADDR_COPY(rtorig_ext->rt_mesh_gate, preq->preq_origaddr); } rtorig_ext->rt_ext_seq = preq->preq_origseq; ieee80211_mesh_rt_update(rtorig_ext, preq->preq_lifetime); } /* * Proactive PREQ: reply with a proactive PREP to the * root STA if requested. */ if (IEEE80211_ADDR_EQ(PREQ_TADDR(0), broadcastaddr) && (PREQ_TFLAGS(0) & IEEE80211_MESHPREQ_TFLAGS_TO)) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "root mesh station @ %6D", preq->preq_origaddr, ":"); /* Check if root is a mesh gate, mark it */ if (preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_GATE) { struct ieee80211_mesh_gate_route *gr; rtorig->rt_flags |= IEEE80211_MESHRT_FLAGS_GATE; gr = ieee80211_mesh_mark_gate(vap, preq->preq_origaddr, rtorig); gr->gr_lastseq = 0; /* NOT GANN */ } /* * Reply with a PREP if we don't have a path to the root * or if the root sent us a proactive PREQ. */ if ((rtorig->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0 || (preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_PP)) { struct ieee80211_meshprep_ie prep; prep.prep_flags = 0; prep.prep_hopcount = 0; prep.prep_ttl = ms->ms_ttl; IEEE80211_ADDR_COPY(prep.prep_origaddr, preq->preq_origaddr); prep.prep_origseq = preq->preq_origseq; prep.prep_lifetime = preq->preq_lifetime; prep.prep_metric = IEEE80211_MESHLMETRIC_INITIALVAL; IEEE80211_ADDR_COPY(prep.prep_targetaddr, vap->iv_myaddr); prep.prep_targetseq = ++hs->hs_seq; hwmp_send_prep(vap, rtorig->rt_nexthop, &prep); } } /* * Forwarding and Intermediate reply for PREQs with 1 target. */ if ((preq->preq_tcount == 1) && (preq->preq_ttl > 1) && (ms->ms_flags & IEEE80211_MESHFLAGS_FWD)) { struct ieee80211_meshpreq_ie ppreq; /* propagated PREQ */ memcpy(&ppreq, preq, sizeof(ppreq)); /* * We have a valid route to this node. * NB: if target is proxy dont reply. */ if (rttarg != NULL && rttarg->rt_flags & IEEE80211_MESHRT_FLAGS_VALID && !(rttarg->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY)) { /* * Check if we can send an intermediate Path Reply, * i.e., Target Only bit is not set and target is not * the MAC broadcast address. */ if (!(PREQ_TFLAGS(0) & IEEE80211_MESHPREQ_TFLAGS_TO) && !IEEE80211_ADDR_EQ(PREQ_TADDR(0), broadcastaddr)) { struct ieee80211_meshprep_ie prep; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "intermediate reply for PREQ from %6D", preq->preq_origaddr, ":"); prep.prep_flags = 0; prep.prep_hopcount = rttarg->rt_nhops; prep.prep_ttl = ms->ms_ttl; IEEE80211_ADDR_COPY(&prep.prep_targetaddr, PREQ_TADDR(0)); prep.prep_targetseq = hrtarg->hr_seq; prep.prep_lifetime = preq->preq_lifetime; prep.prep_metric =rttarg->rt_metric; IEEE80211_ADDR_COPY(&prep.prep_origaddr, preq->preq_origaddr); prep.prep_origseq = hrorig->hr_seq; hwmp_send_prep(vap, rtorig->rt_nexthop, &prep); /* * Set TO and unset RF bits because we have * sent a PREP. */ ppreq.preq_targets[0].target_flags |= IEEE80211_MESHPREQ_TFLAGS_TO; } } IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "forward PREQ from %6D", preq->preq_origaddr, ":"); ppreq.preq_hopcount += 1; ppreq.preq_ttl -= 1; ppreq.preq_metric += ms->ms_pmetric->mpm_metric(ni); /* don't do PREQ ratecheck when we propagate */ hwmp_send_preq(vap, broadcastaddr, &ppreq, NULL, NULL); } } #undef PREQ_TFLAGS #undef PREQ_TADDR #undef PREQ_TSEQ static int hwmp_send_preq(struct ieee80211vap *vap, const uint8_t da[IEEE80211_ADDR_LEN], struct ieee80211_meshpreq_ie *preq, struct timeval *last, struct timeval *minint) { /* * Enforce PREQ interval. * NB: Proactive ROOT PREQs rate is handled by cb task. */ if (last != NULL && minint != NULL) { if (ratecheck(last, minint) == 0) return EALREADY; /* XXX: we should postpone */ getmicrouptime(last); } /* * mesh preq action frame format * [6] da * [6] sa * [6] addr3 = sa * [1] action * [1] category * [tlv] mesh path request */ preq->preq_ie = IEEE80211_ELEMID_MESHPREQ; preq->preq_len = (preq->preq_flags & IEEE80211_MESHPREQ_FLAGS_AE ? IEEE80211_MESHPREQ_BASE_SZ_AE : IEEE80211_MESHPREQ_BASE_SZ) + preq->preq_tcount * IEEE80211_MESHPREQ_TRGT_SZ; return hwmp_send_action(vap, da, (uint8_t *)preq, preq->preq_len+2); } static void hwmp_recv_prep(struct ieee80211vap *vap, struct ieee80211_node *ni, const struct ieee80211_frame *wh, const struct ieee80211_meshprep_ie *prep) { #define IS_PROXY(rt) (rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY) #define PROXIED_BY_US(rt) \ (IEEE80211_ADDR_EQ(vap->iv_myaddr, rt->rt_mesh_gate)) struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_route *rt = NULL; struct ieee80211_mesh_route *rtorig = NULL; struct ieee80211_mesh_route *rtext = NULL; struct ieee80211_hwmp_route *hr; struct ieee80211com *ic = vap->iv_ic; struct mbuf *m, *next; uint32_t metric = 0; const uint8_t *addr; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "received PREP, orig %6D, targ %6D", prep->prep_origaddr, ":", prep->prep_targetaddr, ":"); /* * Acceptance criteria: (If the corresponding PREP was not generated * by us OR not generated by an external mac that is not proxied by us) * AND forwarding is disabled, discard this PREP. */ rtorig = ieee80211_mesh_rt_find(vap, prep->prep_origaddr); if ((!IEEE80211_ADDR_EQ(vap->iv_myaddr, prep->prep_origaddr) || (rtorig != NULL && IS_PROXY(rtorig) && !PROXIED_BY_US(rtorig))) && !(ms->ms_flags & IEEE80211_MESHFLAGS_FWD)){ IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "discard PREP, orig(%6D) not proxied or generated by us", prep->prep_origaddr, ":"); return; } /* PREP ACCEPTED */ /* * If accepted shall create or update the active forwarding information * it maintains for the target mesh STA of the PREP (according to the * rules defined in 13.10.8.4). If the conditions for creating or * updating the forwarding information have not been met in those * rules, no further steps are applied to the PREP. */ rt = ieee80211_mesh_rt_find(vap, prep->prep_targetaddr); if (rt == NULL) { rt = ieee80211_mesh_rt_add(vap, prep->prep_targetaddr); if (rt == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "unable to add PREP path to %6D", prep->prep_targetaddr, ":"); vap->iv_stats.is_mesh_rtaddfailed++; return; } IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "adding target %6D", prep->prep_targetaddr, ":"); } hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); /* update path metric */ metric = prep->prep_metric + ms->ms_pmetric->mpm_metric(ni); if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID)) { if (HWMP_SEQ_LT(prep->prep_targetseq, hr->hr_seq)) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "discard PREP from %6D, old seq no %u < %u", prep->prep_targetaddr, ":", prep->prep_targetseq, hr->hr_seq); return; } else if (HWMP_SEQ_LEQ(prep->prep_targetseq, hr->hr_seq) && metric > rt->rt_metric) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "discard PREP from %6D, new metric %u > %u", prep->prep_targetaddr, ":", metric, rt->rt_metric); return; } } IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "%s path to %6D, hopcount %d:%d metric %d:%d", rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID ? "prefer" : "update", prep->prep_targetaddr, ":", rt->rt_nhops, prep->prep_hopcount + 1, rt->rt_metric, metric); hr->hr_seq = prep->prep_targetseq; hr->hr_preqretries = 0; IEEE80211_ADDR_COPY(rt->rt_nexthop, ni->ni_macaddr); rt->rt_metric = metric; rt->rt_nhops = prep->prep_hopcount + 1; ieee80211_mesh_rt_update(rt, prep->prep_lifetime); if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_DISCOVER) { /* discovery complete */ rt->rt_flags &= ~IEEE80211_MESHRT_FLAGS_DISCOVER; } rt->rt_flags |= IEEE80211_MESHRT_FLAGS_VALID; /* mark valid */ /* Update forwarding information to TA if metric improves */ hwmp_update_transmitter(vap, ni, "PREP"); /* * If it's NOT for us, propagate the PREP */ if (!IEEE80211_ADDR_EQ(vap->iv_myaddr, prep->prep_origaddr) && prep->prep_ttl > 1 && prep->prep_hopcount < hs->hs_maxhops) { struct ieee80211_meshprep_ie pprep; /* propagated PREP */ /* * NB: We should already have setup the path to orig * mesh STA when we propagated PREQ to target mesh STA, * no PREP is generated without a corresponding PREQ. * XXX: for now just ignore. */ if (rtorig == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "received PREP for an unknown orig(%6D)", prep->prep_origaddr, ":"); return; } IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "propagate PREP from %6D", prep->prep_targetaddr, ":"); memcpy(&pprep, prep, sizeof(pprep)); pprep.prep_hopcount += 1; pprep.prep_ttl -= 1; pprep.prep_metric += ms->ms_pmetric->mpm_metric(ni); hwmp_send_prep(vap, rtorig->rt_nexthop, &pprep); /* precursor list for the Target Mesh STA Address is updated */ } /* * Check if we received a PREP w/ AE and store target external address. * We may store target external address if recevied PREP w/ AE * and we are not final destination */ if (prep->prep_flags & IEEE80211_MESHPREP_FLAGS_AE) { rtext = ieee80211_mesh_rt_find(vap, prep->prep_target_ext_addr); if (rtext == NULL) { rtext = ieee80211_mesh_rt_add(vap, prep->prep_target_ext_addr); if (rtext == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "unable to add PREP path to proxy %6D", prep->prep_targetaddr, ":"); vap->iv_stats.is_mesh_rtaddfailed++; return; } } IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "%s path to %6D, hopcount %d:%d metric %d:%d", rtext->rt_flags & IEEE80211_MESHRT_FLAGS_VALID ? "prefer" : "update", prep->prep_target_ext_addr, ":", rtext->rt_nhops, prep->prep_hopcount + 1, rtext->rt_metric, metric); rtext->rt_flags = IEEE80211_MESHRT_FLAGS_PROXY | IEEE80211_MESHRT_FLAGS_VALID; IEEE80211_ADDR_COPY(rtext->rt_dest, prep->prep_target_ext_addr); IEEE80211_ADDR_COPY(rtext->rt_mesh_gate, prep->prep_targetaddr); IEEE80211_ADDR_COPY(rtext->rt_nexthop, wh->i_addr2); rtext->rt_metric = metric; rtext->rt_lifetime = prep->prep_lifetime; rtext->rt_nhops = prep->prep_hopcount + 1; rtext->rt_ext_seq = prep->prep_origseq; /* new proxy seq */ /* * XXX: proxy entries have no HWMP priv data, * nullify them to be sure? */ } /* * Check for frames queued awaiting path discovery. * XXX probably can tell exactly and avoid remove call * NB: hash may have false matches, if so they will get * stuck back on the stageq because there won't be * a path. */ addr = prep->prep_flags & IEEE80211_MESHPREP_FLAGS_AE ? prep->prep_target_ext_addr : prep->prep_targetaddr; m = ieee80211_ageq_remove(&ic->ic_stageq, (struct ieee80211_node *)(uintptr_t) ieee80211_mac_hash(ic, addr)); /* either dest or ext_dest */ /* * All frames in the stageq here should be non-M_ENCAP; or things * will get very unhappy. */ for (; m != NULL; m = next) { next = m->m_nextpkt; m->m_nextpkt = NULL; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "flush queued frame %p len %d", m, m->m_pkthdr.len); /* * If the mbuf has M_ENCAP set, ensure we free it. * Note that after if_transmit() is called, m is invalid. */ (void) ieee80211_vap_xmitpkt(vap, m); } #undef IS_PROXY #undef PROXIED_BY_US } static int hwmp_send_prep(struct ieee80211vap *vap, const uint8_t da[IEEE80211_ADDR_LEN], struct ieee80211_meshprep_ie *prep) { /* NB: there's no PREP minimum interval. */ /* * mesh prep action frame format * [6] da * [6] sa * [6] addr3 = sa * [1] action * [1] category * [tlv] mesh path reply */ prep->prep_ie = IEEE80211_ELEMID_MESHPREP; prep->prep_len = prep->prep_flags & IEEE80211_MESHPREP_FLAGS_AE ? IEEE80211_MESHPREP_BASE_SZ_AE : IEEE80211_MESHPREP_BASE_SZ; return hwmp_send_action(vap, da, (uint8_t *)prep, prep->prep_len + 2); } #define PERR_DFLAGS(n) perr.perr_dests[n].dest_flags #define PERR_DADDR(n) perr.perr_dests[n].dest_addr #define PERR_DSEQ(n) perr.perr_dests[n].dest_seq #define PERR_DRCODE(n) perr.perr_dests[n].dest_rcode static void hwmp_peerdown(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_meshperr_ie perr; struct ieee80211_mesh_route *rt; struct ieee80211_hwmp_route *hr; rt = ieee80211_mesh_rt_find(vap, ni->ni_macaddr); if (rt == NULL) return; hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "%s", "delete route entry"); perr.perr_ttl = ms->ms_ttl; perr.perr_ndests = 1; PERR_DFLAGS(0) = 0; if (hr->hr_seq == 0) PERR_DFLAGS(0) |= IEEE80211_MESHPERR_DFLAGS_USN; PERR_DFLAGS(0) |= IEEE80211_MESHPERR_DFLAGS_RC; IEEE80211_ADDR_COPY(PERR_DADDR(0), rt->rt_dest); PERR_DSEQ(0) = ++hr->hr_seq; PERR_DRCODE(0) = IEEE80211_REASON_MESH_PERR_DEST_UNREACH; /* NB: flush everything passing through peer */ ieee80211_mesh_rt_flush_peer(vap, ni->ni_macaddr); hwmp_send_perr(vap, broadcastaddr, &perr); } #undef PERR_DFLAGS #undef PERR_DADDR #undef PERR_DSEQ #undef PERR_DRCODE #define PERR_DFLAGS(n) perr->perr_dests[n].dest_flags #define PERR_DADDR(n) perr->perr_dests[n].dest_addr #define PERR_DSEQ(n) perr->perr_dests[n].dest_seq #define PERR_DEXTADDR(n) perr->perr_dests[n].dest_ext_addr static void hwmp_recv_perr(struct ieee80211vap *vap, struct ieee80211_node *ni, const struct ieee80211_frame *wh, const struct ieee80211_meshperr_ie *perr) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt = NULL; struct ieee80211_mesh_route *rt_ext = NULL; struct ieee80211_hwmp_route *hr; struct ieee80211_meshperr_ie *pperr = NULL; int i, j = 0, forward = 0; IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "received PERR from %6D", wh->i_addr2, ":"); /* * if forwarding is true, prepare pperr */ if (ms->ms_flags & IEEE80211_MESHFLAGS_FWD) { forward = 1; pperr = IEEE80211_MALLOC(sizeof(*perr) + 31*sizeof(*perr->perr_dests), M_80211_MESH_PERR, IEEE80211_M_NOWAIT); /* XXX: magic number, 32 err dests */ } /* * Acceptance criteria: check if we have forwarding information * stored about destination, and that nexthop == TA of this PERR. * NB: we also build a new PERR to propagate in case we should forward. */ for (i = 0; i < perr->perr_ndests; i++) { rt = ieee80211_mesh_rt_find(vap, PERR_DADDR(i)); if (rt == NULL) continue; if (!IEEE80211_ADDR_EQ(rt->rt_nexthop, wh->i_addr2)) continue; /* found and accepted a PERR ndest element, process it... */ if (forward) memcpy(&pperr->perr_dests[j], &perr->perr_dests[i], sizeof(*perr->perr_dests)); hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); switch(PERR_DFLAGS(i)) { case (IEEE80211_REASON_MESH_PERR_NO_FI): if (PERR_DSEQ(i) == 0) { hr->hr_seq++; if (forward) { pperr->perr_dests[j].dest_seq = hr->hr_seq; } } else { hr->hr_seq = PERR_DSEQ(i); } rt->rt_flags &= ~IEEE80211_MESHRT_FLAGS_VALID; j++; break; case (IEEE80211_REASON_MESH_PERR_DEST_UNREACH): if(HWMP_SEQ_GT(PERR_DSEQ(i), hr->hr_seq)) { hr->hr_seq = PERR_DSEQ(i); rt->rt_flags &= ~IEEE80211_MESHRT_FLAGS_VALID; j++; } break; case (IEEE80211_REASON_MESH_PERR_NO_PROXY): rt_ext = ieee80211_mesh_rt_find(vap, PERR_DEXTADDR(i)); if (rt_ext != NULL) { rt_ext->rt_flags &= ~IEEE80211_MESHRT_FLAGS_VALID; j++; } break; default: IEEE80211_DISCARD(vap, IEEE80211_MSG_HWMP, wh, NULL, "PERR, unknown reason code %u\n", PERR_DFLAGS(i)); goto done; /* XXX: stats?? */ } ieee80211_mesh_rt_flush_peer(vap, PERR_DADDR(i)); KASSERT(j < 32, ("PERR, error ndest >= 32 (%u)", j)); } if (j == 0) { IEEE80211_DISCARD(vap, IEEE80211_MSG_HWMP, wh, NULL, "%s", "PERR not accepted"); goto done; /* XXX: stats?? */ } /* * Propagate the PERR if we previously found it on our routing table. */ if (forward && perr->perr_ttl > 1) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "propagate PERR from %6D", wh->i_addr2, ":"); pperr->perr_ndests = j; pperr->perr_ttl--; hwmp_send_perr(vap, broadcastaddr, pperr); } done: if (pperr != NULL) IEEE80211_FREE(pperr, M_80211_MESH_PERR); } #undef PERR_DFLAGS #undef PERR_DADDR #undef PERR_DSEQ #undef PERR_DEXTADDR static int hwmp_send_perr(struct ieee80211vap *vap, const uint8_t da[IEEE80211_ADDR_LEN], struct ieee80211_meshperr_ie *perr) { struct ieee80211_hwmp_state *hs = vap->iv_hwmp; int i; uint8_t length = 0; /* * Enforce PERR interval. */ if (ratecheck(&hs->hs_lastperr, &ieee80211_hwmp_perrminint) == 0) return EALREADY; getmicrouptime(&hs->hs_lastperr); /* * mesh perr action frame format * [6] da * [6] sa * [6] addr3 = sa * [1] action * [1] category * [tlv] mesh path error */ perr->perr_ie = IEEE80211_ELEMID_MESHPERR; length = IEEE80211_MESHPERR_BASE_SZ; for (i = 0; iperr_ndests; i++) { if (perr->perr_dests[i].dest_flags & IEEE80211_MESHPERR_FLAGS_AE) { length += IEEE80211_MESHPERR_DEST_SZ_AE; continue ; } length += IEEE80211_MESHPERR_DEST_SZ; } perr->perr_len =length; return hwmp_send_action(vap, da, (uint8_t *)perr, perr->perr_len+2); } /* * Called from the rest of the net80211 code (mesh code for example). * NB: IEEE80211_REASON_MESH_PERR_DEST_UNREACH can be trigger by the fact that * a mesh STA is unable to forward an MSDU/MMPDU to a next-hop mesh STA. */ #define PERR_DFLAGS(n) perr.perr_dests[n].dest_flags #define PERR_DADDR(n) perr.perr_dests[n].dest_addr #define PERR_DSEQ(n) perr.perr_dests[n].dest_seq #define PERR_DEXTADDR(n) perr.perr_dests[n].dest_ext_addr #define PERR_DRCODE(n) perr.perr_dests[n].dest_rcode static void hwmp_senderror(struct ieee80211vap *vap, const uint8_t addr[IEEE80211_ADDR_LEN], struct ieee80211_mesh_route *rt, int rcode) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_hwmp_route *hr = NULL; struct ieee80211_meshperr_ie perr; if (rt != NULL) hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); perr.perr_ndests = 1; perr.perr_ttl = ms->ms_ttl; PERR_DFLAGS(0) = 0; PERR_DRCODE(0) = rcode; switch (rcode) { case IEEE80211_REASON_MESH_PERR_NO_FI: IEEE80211_ADDR_COPY(PERR_DADDR(0), addr); PERR_DSEQ(0) = 0; /* reserved */ break; case IEEE80211_REASON_MESH_PERR_NO_PROXY: KASSERT(rt != NULL, ("no proxy info for sending PERR")); KASSERT(rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY, ("route is not marked proxy")); PERR_DFLAGS(0) |= IEEE80211_MESHPERR_FLAGS_AE; IEEE80211_ADDR_COPY(PERR_DADDR(0), vap->iv_myaddr); PERR_DSEQ(0) = rt->rt_ext_seq; IEEE80211_ADDR_COPY(PERR_DEXTADDR(0), addr); break; case IEEE80211_REASON_MESH_PERR_DEST_UNREACH: KASSERT(rt != NULL, ("no route info for sending PERR")); IEEE80211_ADDR_COPY(PERR_DADDR(0), addr); PERR_DSEQ(0) = hr->hr_seq; break; default: KASSERT(0, ("unknown reason code for HWMP PERR (%u)", rcode)); } hwmp_send_perr(vap, broadcastaddr, &perr); } #undef PERR_DFLAGS #undef PEER_DADDR #undef PERR_DSEQ #undef PERR_DEXTADDR #undef PERR_DRCODE static void hwmp_recv_rann(struct ieee80211vap *vap, struct ieee80211_node *ni, const struct ieee80211_frame *wh, const struct ieee80211_meshrann_ie *rann) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_route *rt = NULL; struct ieee80211_hwmp_route *hr; struct ieee80211_meshpreq_ie preq; struct ieee80211_meshrann_ie prann; if (IEEE80211_ADDR_EQ(rann->rann_addr, vap->iv_myaddr)) return; rt = ieee80211_mesh_rt_find(vap, rann->rann_addr); if (rt != NULL && rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) { hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); /* Acceptance criteria: if RANN.seq < stored seq, discard RANN */ if (HWMP_SEQ_LT(rann->rann_seq, hr->hr_seq)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_HWMP, wh, NULL, "RANN seq %u < %u", rann->rann_seq, hr->hr_seq); return; } /* Acceptance criteria: if RANN.seq == stored seq AND * RANN.metric > stored metric, discard RANN */ if (HWMP_SEQ_EQ(rann->rann_seq, hr->hr_seq) && rann->rann_metric > rt->rt_metric) { IEEE80211_DISCARD(vap, IEEE80211_MSG_HWMP, wh, NULL, "RANN metric %u > %u", rann->rann_metric, rt->rt_metric); return; } } /* RANN ACCEPTED */ ieee80211_hwmp_rannint = rann->rann_interval; /* XXX: mtx lock? */ if (rt == NULL) { rt = ieee80211_mesh_rt_add(vap, rann->rann_addr); if (rt == NULL) { IEEE80211_DISCARD(vap, IEEE80211_MSG_HWMP, wh, NULL, "unable to add mac for RANN root %6D", rann->rann_addr, ":"); vap->iv_stats.is_mesh_rtaddfailed++; return; } } hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); /* Check if root is a mesh gate, mark it */ if (rann->rann_flags & IEEE80211_MESHRANN_FLAGS_GATE) { struct ieee80211_mesh_gate_route *gr; rt->rt_flags |= IEEE80211_MESHRT_FLAGS_GATE; gr = ieee80211_mesh_mark_gate(vap, rann->rann_addr, rt); gr->gr_lastseq = 0; /* NOT GANN */ } /* discovery timeout */ ieee80211_mesh_rt_update(rt, ticks_to_msecs(ieee80211_hwmp_roottimeout)); preq.preq_flags = IEEE80211_MESHPREQ_FLAGS_AM; preq.preq_hopcount = 0; preq.preq_ttl = ms->ms_ttl; preq.preq_id = 0; /* reserved */ IEEE80211_ADDR_COPY(preq.preq_origaddr, vap->iv_myaddr); preq.preq_origseq = ++hs->hs_seq; preq.preq_lifetime = ieee80211_hwmp_roottimeout; preq.preq_metric = IEEE80211_MESHLMETRIC_INITIALVAL; preq.preq_tcount = 1; preq.preq_targets[0].target_flags = IEEE80211_MESHPREQ_TFLAGS_TO; /* NB: IEEE80211_MESHPREQ_TFLAGS_USN = 0 implicitly implied */ IEEE80211_ADDR_COPY(preq.preq_targets[0].target_addr, rann->rann_addr); preq.preq_targets[0].target_seq = rann->rann_seq; /* XXX: if rootconfint have not passed, we built this preq in vain */ hwmp_send_preq(vap, wh->i_addr2, &preq, &hr->hr_lastrootconf, &ieee80211_hwmp_rootconfint); /* propagate a RANN */ if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID && rann->rann_ttl > 1 && ms->ms_flags & IEEE80211_MESHFLAGS_FWD) { hr->hr_seq = rann->rann_seq; memcpy(&prann, rann, sizeof(prann)); prann.rann_hopcount += 1; prann.rann_ttl -= 1; prann.rann_metric += ms->ms_pmetric->mpm_metric(ni); hwmp_send_rann(vap, broadcastaddr, &prann); } } static int hwmp_send_rann(struct ieee80211vap *vap, const uint8_t da[IEEE80211_ADDR_LEN], struct ieee80211_meshrann_ie *rann) { /* * mesh rann action frame format * [6] da * [6] sa * [6] addr3 = sa * [1] action * [1] category * [tlv] root annoucement */ rann->rann_ie = IEEE80211_ELEMID_MESHRANN; rann->rann_len = IEEE80211_MESHRANN_BASE_SZ; return hwmp_send_action(vap, da, (uint8_t *)rann, rann->rann_len + 2); } #define PREQ_TFLAGS(n) preq.preq_targets[n].target_flags #define PREQ_TADDR(n) preq.preq_targets[n].target_addr #define PREQ_TSEQ(n) preq.preq_targets[n].target_seq static void hwmp_rediscover_cb(void *arg) { struct ieee80211_mesh_route *rt = arg; struct ieee80211vap *vap = rt->rt_vap; struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_hwmp_route *hr; struct ieee80211_meshpreq_ie preq; /* Optimize: storing first preq? */ if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID)) return ; /* nothing to do */ hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); if (hr->hr_preqretries >= ieee80211_hwmp_maxpreq_retries) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_ANY, rt->rt_dest, "%s", "max number of discovery, send queued frames to GATE"); ieee80211_mesh_forward_to_gates(vap, rt); vap->iv_stats.is_mesh_fwd_nopath++; return ; /* XXX: flush queue? */ } hr->hr_preqretries++; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_HWMP, rt->rt_dest, "start path rediscovery , target seq %u", hr->hr_seq); /* * Try to discover the path for this node. * Group addressed PREQ Case A */ preq.preq_flags = 0; preq.preq_hopcount = 0; preq.preq_ttl = ms->ms_ttl; preq.preq_id = ++hs->hs_preqid; IEEE80211_ADDR_COPY(preq.preq_origaddr, vap->iv_myaddr); preq.preq_origseq = hr->hr_origseq; preq.preq_lifetime = ticks_to_msecs(ieee80211_hwmp_pathtimeout); preq.preq_metric = IEEE80211_MESHLMETRIC_INITIALVAL; preq.preq_tcount = 1; IEEE80211_ADDR_COPY(PREQ_TADDR(0), rt->rt_dest); PREQ_TFLAGS(0) = 0; if (ieee80211_hwmp_targetonly) PREQ_TFLAGS(0) |= IEEE80211_MESHPREQ_TFLAGS_TO; PREQ_TFLAGS(0) |= IEEE80211_MESHPREQ_TFLAGS_USN; PREQ_TSEQ(0) = 0; /* RESERVED when USN flag is set */ /* XXX check return value */ hwmp_send_preq(vap, broadcastaddr, &preq, &hr->hr_lastpreq, &ieee80211_hwmp_preqminint); callout_reset(&rt->rt_discovery, ieee80211_hwmp_net_diameter_traversaltime * 2, hwmp_rediscover_cb, rt); } static struct ieee80211_node * hwmp_discover(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN], struct mbuf *m) { struct ieee80211_hwmp_state *hs = vap->iv_hwmp; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt = NULL; struct ieee80211_hwmp_route *hr; struct ieee80211_meshpreq_ie preq; struct ieee80211_node *ni; int sendpreq = 0; KASSERT(vap->iv_opmode == IEEE80211_M_MBSS, ("not a mesh vap, opmode %d", vap->iv_opmode)); KASSERT(!IEEE80211_ADDR_EQ(vap->iv_myaddr, dest), ("%s: discovering self!", __func__)); ni = NULL; if (!IEEE80211_IS_MULTICAST(dest)) { rt = ieee80211_mesh_rt_find(vap, dest); if (rt == NULL) { rt = ieee80211_mesh_rt_add(vap, dest); if (rt == NULL) { IEEE80211_NOTE(vap, IEEE80211_MSG_HWMP, ni, "unable to add discovery path to %6D", dest, ":"); vap->iv_stats.is_mesh_rtaddfailed++; goto done; } } hr = IEEE80211_MESH_ROUTE_PRIV(rt, struct ieee80211_hwmp_route); if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_DISCOVER) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_HWMP, dest, "%s", "already discovering queue frame until path found"); sendpreq = 1; goto done; } if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0) { if (hr->hr_lastdiscovery != 0 && (ticks - hr->hr_lastdiscovery < (ieee80211_hwmp_net_diameter_traversaltime * 2))) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, dest, NULL, "%s", "too frequent discovery requeust"); sendpreq = 1; goto done; } hr->hr_lastdiscovery = ticks; if (hr->hr_preqretries >= ieee80211_hwmp_maxpreq_retries) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, dest, NULL, "%s", "no valid path , max number of discovery"); vap->iv_stats.is_mesh_fwd_nopath++; goto done; } rt->rt_flags = IEEE80211_MESHRT_FLAGS_DISCOVER; hr->hr_preqretries++; if (hr->hr_origseq == 0) hr->hr_origseq = ++hs->hs_seq; rt->rt_metric = IEEE80211_MESHLMETRIC_INITIALVAL; sendpreq = 1; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_HWMP, dest, "start path discovery (src %s), target seq %u", m == NULL ? "" : ether_sprintf( mtod(m, struct ether_header *)->ether_shost), hr->hr_seq); /* * Try to discover the path for this node. * Group addressed PREQ Case A */ preq.preq_flags = 0; preq.preq_hopcount = 0; preq.preq_ttl = ms->ms_ttl; preq.preq_id = ++hs->hs_preqid; IEEE80211_ADDR_COPY(preq.preq_origaddr, vap->iv_myaddr); preq.preq_origseq = hr->hr_origseq; preq.preq_lifetime = ticks_to_msecs(ieee80211_hwmp_pathtimeout); preq.preq_metric = IEEE80211_MESHLMETRIC_INITIALVAL; preq.preq_tcount = 1; IEEE80211_ADDR_COPY(PREQ_TADDR(0), dest); PREQ_TFLAGS(0) = 0; if (ieee80211_hwmp_targetonly) PREQ_TFLAGS(0) |= IEEE80211_MESHPREQ_TFLAGS_TO; PREQ_TFLAGS(0) |= IEEE80211_MESHPREQ_TFLAGS_USN; PREQ_TSEQ(0) = 0; /* RESERVED when USN flag is set */ /* XXX check return value */ hwmp_send_preq(vap, broadcastaddr, &preq, &hr->hr_lastpreq, &ieee80211_hwmp_preqminint); callout_reset(&rt->rt_discovery, ieee80211_hwmp_net_diameter_traversaltime * 2, hwmp_rediscover_cb, rt); } if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) ni = ieee80211_find_txnode(vap, rt->rt_nexthop); } else { ni = ieee80211_find_txnode(vap, dest); /* NB: if null then we leak mbuf */ KASSERT(ni != NULL, ("leak mcast frame")); return ni; } done: if (ni == NULL && m != NULL) { if (sendpreq) { struct ieee80211com *ic = vap->iv_ic; /* * Queue packet for transmit when path discovery * completes. If discovery never completes the * frame will be flushed by way of the aging timer. */ IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_HWMP, dest, "%s", "queue frame until path found"); MPASS((m->m_pkthdr.csum_flags & CSUM_SND_TAG) == 0); m->m_pkthdr.rcvif = (void *)(uintptr_t) ieee80211_mac_hash(ic, dest); /* XXX age chosen randomly */ ieee80211_ageq_append(&ic->ic_stageq, m, IEEE80211_INACT_WAIT); } else { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_HWMP, dest, NULL, "%s", "no valid path to this node"); m_freem(m); } } return ni; } #undef PREQ_TFLAGS #undef PREQ_TADDR #undef PREQ_TSEQ static int hwmp_ioctl_get80211(struct ieee80211vap *vap, struct ieee80211req *ireq) { struct ieee80211_hwmp_state *hs = vap->iv_hwmp; int error; if (vap->iv_opmode != IEEE80211_M_MBSS) return ENOSYS; error = 0; switch (ireq->i_type) { case IEEE80211_IOC_HWMP_ROOTMODE: ireq->i_val = hs->hs_rootmode; break; case IEEE80211_IOC_HWMP_MAXHOPS: ireq->i_val = hs->hs_maxhops; break; default: return ENOSYS; } return error; } IEEE80211_IOCTL_GET(hwmp, hwmp_ioctl_get80211); static int hwmp_ioctl_set80211(struct ieee80211vap *vap, struct ieee80211req *ireq) { struct ieee80211_hwmp_state *hs = vap->iv_hwmp; int error; if (vap->iv_opmode != IEEE80211_M_MBSS) return ENOSYS; error = 0; switch (ireq->i_type) { case IEEE80211_IOC_HWMP_ROOTMODE: if (ireq->i_val < 0 || ireq->i_val > 3) return EINVAL; hs->hs_rootmode = ireq->i_val; hwmp_rootmode_setup(vap); break; case IEEE80211_IOC_HWMP_MAXHOPS: if (ireq->i_val <= 0 || ireq->i_val > 255) return EINVAL; hs->hs_maxhops = ireq->i_val; break; default: return ENOSYS; } return error; } IEEE80211_IOCTL_SET(hwmp, hwmp_ioctl_set80211); Index: projects/clang1000-import/sys/net80211/ieee80211_mesh.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_mesh.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_mesh.c (revision 358239) @@ -1,3609 +1,3614 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2009 The FreeBSD Foundation * All rights reserved. * * This software was developed by Rui Paulo under sponsorship from the * FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #ifdef __FreeBSD__ __FBSDID("$FreeBSD$"); #endif /* * IEEE 802.11s Mesh Point (MBSS) support. * * Based on March 2009, D3.0 802.11s draft spec. */ #include "opt_inet.h" #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef IEEE80211_SUPPORT_SUPERG #include #endif #include #include static void mesh_rt_flush_invalid(struct ieee80211vap *); static int mesh_select_proto_path(struct ieee80211vap *, const char *); static int mesh_select_proto_metric(struct ieee80211vap *, const char *); static void mesh_vattach(struct ieee80211vap *); static int mesh_newstate(struct ieee80211vap *, enum ieee80211_state, int); static void mesh_rt_cleanup_cb(void *); static void mesh_gatemode_setup(struct ieee80211vap *); static void mesh_gatemode_cb(void *); static void mesh_linkchange(struct ieee80211_node *, enum ieee80211_mesh_mlstate); static void mesh_checkid(void *, struct ieee80211_node *); static uint32_t mesh_generateid(struct ieee80211vap *); static int mesh_checkpseq(struct ieee80211vap *, const uint8_t [IEEE80211_ADDR_LEN], uint32_t); static void mesh_transmit_to_gate(struct ieee80211vap *, struct mbuf *, struct ieee80211_mesh_route *); static void mesh_forward(struct ieee80211vap *, struct mbuf *, const struct ieee80211_meshcntl *); static int mesh_input(struct ieee80211_node *, struct mbuf *, const struct ieee80211_rx_stats *rxs, int, int); static void mesh_recv_mgmt(struct ieee80211_node *, struct mbuf *, int, const struct ieee80211_rx_stats *rxs, int, int); static void mesh_recv_ctl(struct ieee80211_node *, struct mbuf *, int); static void mesh_peer_timeout_setup(struct ieee80211_node *); static void mesh_peer_timeout_backoff(struct ieee80211_node *); static void mesh_peer_timeout_cb(void *); static __inline void mesh_peer_timeout_stop(struct ieee80211_node *); static int mesh_verify_meshid(struct ieee80211vap *, const uint8_t *); static int mesh_verify_meshconf(struct ieee80211vap *, const uint8_t *); static int mesh_verify_meshpeer(struct ieee80211vap *, uint8_t, const uint8_t *); uint32_t mesh_airtime_calc(struct ieee80211_node *); /* * Timeout values come from the specification and are in milliseconds. */ -static SYSCTL_NODE(_net_wlan, OID_AUTO, mesh, CTLFLAG_RD, 0, +static SYSCTL_NODE(_net_wlan, OID_AUTO, mesh, CTLFLAG_RD | CTLFLAG_MPSAFE, 0, "IEEE 802.11s parameters"); static int ieee80211_mesh_gateint = -1; -SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, gateint, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, gateint, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, &ieee80211_mesh_gateint, 0, ieee80211_sysctl_msecs_ticks, "I", "mesh gate interval (ms)"); static int ieee80211_mesh_retrytimeout = -1; -SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, retrytimeout, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, retrytimeout, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, &ieee80211_mesh_retrytimeout, 0, ieee80211_sysctl_msecs_ticks, "I", "Retry timeout (msec)"); static int ieee80211_mesh_holdingtimeout = -1; -SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, holdingtimeout, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, holdingtimeout, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, &ieee80211_mesh_holdingtimeout, 0, ieee80211_sysctl_msecs_ticks, "I", "Holding state timeout (msec)"); static int ieee80211_mesh_confirmtimeout = -1; -SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, confirmtimeout, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, confirmtimeout, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, &ieee80211_mesh_confirmtimeout, 0, ieee80211_sysctl_msecs_ticks, "I", "Confirm state timeout (msec)"); static int ieee80211_mesh_backofftimeout = -1; -SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, backofftimeout, CTLTYPE_INT | CTLFLAG_RW, +SYSCTL_PROC(_net_wlan_mesh, OID_AUTO, backofftimeout, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, &ieee80211_mesh_backofftimeout, 0, ieee80211_sysctl_msecs_ticks, "I", "Backoff timeout (msec). This is to throutles peering forever when " "not receiving answer or is rejected by a neighbor"); static int ieee80211_mesh_maxretries = 2; SYSCTL_INT(_net_wlan_mesh, OID_AUTO, maxretries, CTLFLAG_RW, &ieee80211_mesh_maxretries, 0, "Maximum retries during peer link establishment"); static int ieee80211_mesh_maxholding = 2; SYSCTL_INT(_net_wlan_mesh, OID_AUTO, maxholding, CTLFLAG_RW, &ieee80211_mesh_maxholding, 0, "Maximum times we are allowed to transition to HOLDING state before " "backinoff during peer link establishment"); static const uint8_t broadcastaddr[IEEE80211_ADDR_LEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; static ieee80211_recv_action_func mesh_recv_action_meshpeering_open; static ieee80211_recv_action_func mesh_recv_action_meshpeering_confirm; static ieee80211_recv_action_func mesh_recv_action_meshpeering_close; static ieee80211_recv_action_func mesh_recv_action_meshlmetric; static ieee80211_recv_action_func mesh_recv_action_meshgate; static ieee80211_send_action_func mesh_send_action_meshpeering_open; static ieee80211_send_action_func mesh_send_action_meshpeering_confirm; static ieee80211_send_action_func mesh_send_action_meshpeering_close; static ieee80211_send_action_func mesh_send_action_meshlmetric; static ieee80211_send_action_func mesh_send_action_meshgate; static const struct ieee80211_mesh_proto_metric mesh_metric_airtime = { .mpm_descr = "AIRTIME", .mpm_ie = IEEE80211_MESHCONF_METRIC_AIRTIME, .mpm_metric = mesh_airtime_calc, }; static struct ieee80211_mesh_proto_path mesh_proto_paths[4]; static struct ieee80211_mesh_proto_metric mesh_proto_metrics[4]; MALLOC_DEFINE(M_80211_MESH_PREQ, "80211preq", "802.11 MESH Path Request frame"); MALLOC_DEFINE(M_80211_MESH_PREP, "80211prep", "802.11 MESH Path Reply frame"); MALLOC_DEFINE(M_80211_MESH_PERR, "80211perr", "802.11 MESH Path Error frame"); /* The longer one of the lifetime should be stored as new lifetime */ #define MESH_ROUTE_LIFETIME_MAX(a, b) (a > b ? a : b) MALLOC_DEFINE(M_80211_MESH_RT, "80211mesh_rt", "802.11s routing table"); MALLOC_DEFINE(M_80211_MESH_GT_RT, "80211mesh_gt", "802.11s known gates table"); /* * Helper functions to manipulate the Mesh routing table. */ static struct ieee80211_mesh_route * mesh_rt_find_locked(struct ieee80211_mesh_state *ms, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_route *rt; MESH_RT_LOCK_ASSERT(ms); TAILQ_FOREACH(rt, &ms->ms_routes, rt_next) { if (IEEE80211_ADDR_EQ(dest, rt->rt_dest)) return rt; } return NULL; } static struct ieee80211_mesh_route * mesh_rt_add_locked(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt; KASSERT(!IEEE80211_ADDR_EQ(broadcastaddr, dest), ("%s: adding broadcast to the routing table", __func__)); MESH_RT_LOCK_ASSERT(ms); rt = IEEE80211_MALLOC(ALIGN(sizeof(struct ieee80211_mesh_route)) + ms->ms_ppath->mpp_privlen, M_80211_MESH_RT, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (rt != NULL) { rt->rt_vap = vap; IEEE80211_ADDR_COPY(rt->rt_dest, dest); rt->rt_priv = (void *)ALIGN(&rt[1]); MESH_RT_ENTRY_LOCK_INIT(rt, "MBSS_RT"); callout_init(&rt->rt_discovery, 1); rt->rt_updtime = ticks; /* create time */ TAILQ_INSERT_TAIL(&ms->ms_routes, rt, rt_next); } return rt; } struct ieee80211_mesh_route * ieee80211_mesh_rt_find(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt; MESH_RT_LOCK(ms); rt = mesh_rt_find_locked(ms, dest); MESH_RT_UNLOCK(ms); return rt; } struct ieee80211_mesh_route * ieee80211_mesh_rt_add(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt; KASSERT(ieee80211_mesh_rt_find(vap, dest) == NULL, ("%s: duplicate entry in the routing table", __func__)); KASSERT(!IEEE80211_ADDR_EQ(vap->iv_myaddr, dest), ("%s: adding self to the routing table", __func__)); MESH_RT_LOCK(ms); rt = mesh_rt_add_locked(vap, dest); MESH_RT_UNLOCK(ms); return rt; } /* * Update the route lifetime and returns the updated lifetime. * If new_lifetime is zero and route is timedout it will be invalidated. * new_lifetime is in msec */ int ieee80211_mesh_rt_update(struct ieee80211_mesh_route *rt, int new_lifetime) { int timesince, now; uint32_t lifetime = 0; KASSERT(rt != NULL, ("route is NULL")); now = ticks; MESH_RT_ENTRY_LOCK(rt); /* dont clobber a proxy entry gated by us */ if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY && rt->rt_nhops == 0) { MESH_RT_ENTRY_UNLOCK(rt); return rt->rt_lifetime; } timesince = ticks_to_msecs(now - rt->rt_updtime); rt->rt_updtime = now; if (timesince >= rt->rt_lifetime) { if (new_lifetime != 0) { rt->rt_lifetime = new_lifetime; } else { rt->rt_flags &= ~IEEE80211_MESHRT_FLAGS_VALID; rt->rt_lifetime = 0; } } else { /* update what is left of lifetime */ rt->rt_lifetime = rt->rt_lifetime - timesince; rt->rt_lifetime = MESH_ROUTE_LIFETIME_MAX( new_lifetime, rt->rt_lifetime); } lifetime = rt->rt_lifetime; MESH_RT_ENTRY_UNLOCK(rt); return lifetime; } /* * Add a proxy route (as needed) for the specified destination. */ void ieee80211_mesh_proxy_check(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt; MESH_RT_LOCK(ms); rt = mesh_rt_find_locked(ms, dest); if (rt == NULL) { rt = mesh_rt_add_locked(vap, dest); if (rt == NULL) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, dest, "%s", "unable to add proxy entry"); vap->iv_stats.is_mesh_rtaddfailed++; } else { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, dest, "%s", "add proxy entry"); IEEE80211_ADDR_COPY(rt->rt_mesh_gate, vap->iv_myaddr); IEEE80211_ADDR_COPY(rt->rt_nexthop, vap->iv_myaddr); rt->rt_flags |= IEEE80211_MESHRT_FLAGS_VALID | IEEE80211_MESHRT_FLAGS_PROXY; } } else if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0) { KASSERT(rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY, ("no proxy flag for poxy entry")); struct ieee80211com *ic = vap->iv_ic; /* * Fix existing entry created by received frames from * stations that have some memory of dest. We also * flush any frames held on the staging queue; delivering * them is too much trouble right now. */ IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, dest, "%s", "fix proxy entry"); IEEE80211_ADDR_COPY(rt->rt_nexthop, vap->iv_myaddr); rt->rt_flags |= IEEE80211_MESHRT_FLAGS_VALID | IEEE80211_MESHRT_FLAGS_PROXY; /* XXX belongs in hwmp */ ieee80211_ageq_drain_node(&ic->ic_stageq, (void *)(uintptr_t) ieee80211_mac_hash(ic, dest)); /* XXX stat? */ } MESH_RT_UNLOCK(ms); } static __inline void mesh_rt_del(struct ieee80211_mesh_state *ms, struct ieee80211_mesh_route *rt) { TAILQ_REMOVE(&ms->ms_routes, rt, rt_next); /* * Grab the lock before destroying it, to be sure no one else * is holding the route. */ MESH_RT_ENTRY_LOCK(rt); callout_drain(&rt->rt_discovery); MESH_RT_ENTRY_LOCK_DESTROY(rt); IEEE80211_FREE(rt, M_80211_MESH_RT); } void ieee80211_mesh_rt_del(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt, *next; MESH_RT_LOCK(ms); TAILQ_FOREACH_SAFE(rt, &ms->ms_routes, rt_next, next) { if (IEEE80211_ADDR_EQ(rt->rt_dest, dest)) { if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY) { ms->ms_ppath->mpp_senderror(vap, dest, rt, IEEE80211_REASON_MESH_PERR_NO_PROXY); } else { ms->ms_ppath->mpp_senderror(vap, dest, rt, IEEE80211_REASON_MESH_PERR_DEST_UNREACH); } mesh_rt_del(ms, rt); MESH_RT_UNLOCK(ms); return; } } MESH_RT_UNLOCK(ms); } void ieee80211_mesh_rt_flush(struct ieee80211vap *vap) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt, *next; if (ms == NULL) return; MESH_RT_LOCK(ms); TAILQ_FOREACH_SAFE(rt, &ms->ms_routes, rt_next, next) mesh_rt_del(ms, rt); MESH_RT_UNLOCK(ms); } void ieee80211_mesh_rt_flush_peer(struct ieee80211vap *vap, const uint8_t peer[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt, *next; MESH_RT_LOCK(ms); TAILQ_FOREACH_SAFE(rt, &ms->ms_routes, rt_next, next) { if (IEEE80211_ADDR_EQ(rt->rt_nexthop, peer)) mesh_rt_del(ms, rt); } MESH_RT_UNLOCK(ms); } /* * Flush expired routing entries, i.e. those in invalid state for * some time. */ static void mesh_rt_flush_invalid(struct ieee80211vap *vap) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt, *next; if (ms == NULL) return; MESH_RT_LOCK(ms); TAILQ_FOREACH_SAFE(rt, &ms->ms_routes, rt_next, next) { /* Discover paths will be deleted by their own callout */ if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_DISCOVER) continue; ieee80211_mesh_rt_update(rt, 0); if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0) mesh_rt_del(ms, rt); } MESH_RT_UNLOCK(ms); } int ieee80211_mesh_register_proto_path(const struct ieee80211_mesh_proto_path *mpp) { int i, firstempty = -1; for (i = 0; i < nitems(mesh_proto_paths); i++) { if (strncmp(mpp->mpp_descr, mesh_proto_paths[i].mpp_descr, IEEE80211_MESH_PROTO_DSZ) == 0) return EEXIST; if (!mesh_proto_paths[i].mpp_active && firstempty == -1) firstempty = i; } if (firstempty < 0) return ENOSPC; memcpy(&mesh_proto_paths[firstempty], mpp, sizeof(*mpp)); mesh_proto_paths[firstempty].mpp_active = 1; return 0; } int ieee80211_mesh_register_proto_metric(const struct ieee80211_mesh_proto_metric *mpm) { int i, firstempty = -1; for (i = 0; i < nitems(mesh_proto_metrics); i++) { if (strncmp(mpm->mpm_descr, mesh_proto_metrics[i].mpm_descr, IEEE80211_MESH_PROTO_DSZ) == 0) return EEXIST; if (!mesh_proto_metrics[i].mpm_active && firstempty == -1) firstempty = i; } if (firstempty < 0) return ENOSPC; memcpy(&mesh_proto_metrics[firstempty], mpm, sizeof(*mpm)); mesh_proto_metrics[firstempty].mpm_active = 1; return 0; } static int mesh_select_proto_path(struct ieee80211vap *vap, const char *name) { struct ieee80211_mesh_state *ms = vap->iv_mesh; int i; for (i = 0; i < nitems(mesh_proto_paths); i++) { if (strcasecmp(mesh_proto_paths[i].mpp_descr, name) == 0) { ms->ms_ppath = &mesh_proto_paths[i]; return 0; } } return ENOENT; } static int mesh_select_proto_metric(struct ieee80211vap *vap, const char *name) { struct ieee80211_mesh_state *ms = vap->iv_mesh; int i; for (i = 0; i < nitems(mesh_proto_metrics); i++) { if (strcasecmp(mesh_proto_metrics[i].mpm_descr, name) == 0) { ms->ms_pmetric = &mesh_proto_metrics[i]; return 0; } } return ENOENT; } static void mesh_gatemode_setup(struct ieee80211vap *vap) { struct ieee80211_mesh_state *ms = vap->iv_mesh; /* * NB: When a mesh gate is running as a ROOT it shall * not send out periodic GANNs but instead mark the * mesh gate flag for the corresponding proactive PREQ * and RANN frames. */ if (ms->ms_flags & IEEE80211_MESHFLAGS_ROOT || (ms->ms_flags & IEEE80211_MESHFLAGS_GATE) == 0) { callout_drain(&ms->ms_gatetimer); return ; } callout_reset(&ms->ms_gatetimer, ieee80211_mesh_gateint, mesh_gatemode_cb, vap); } static void mesh_gatemode_cb(void *arg) { struct ieee80211vap *vap = (struct ieee80211vap *)arg; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_meshgann_ie gann; gann.gann_flags = 0; /* Reserved */ gann.gann_hopcount = 0; gann.gann_ttl = ms->ms_ttl; IEEE80211_ADDR_COPY(gann.gann_addr, vap->iv_myaddr); gann.gann_seq = ms->ms_gateseq++; gann.gann_interval = ieee80211_mesh_gateint; IEEE80211_NOTE(vap, IEEE80211_MSG_MESH, vap->iv_bss, "send broadcast GANN (seq %u)", gann.gann_seq); ieee80211_send_action(vap->iv_bss, IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_GANN, &gann); mesh_gatemode_setup(vap); } static void ieee80211_mesh_init(void) { memset(mesh_proto_paths, 0, sizeof(mesh_proto_paths)); memset(mesh_proto_metrics, 0, sizeof(mesh_proto_metrics)); /* * Setup mesh parameters that depends on the clock frequency. */ ieee80211_mesh_gateint = msecs_to_ticks(10000); ieee80211_mesh_retrytimeout = msecs_to_ticks(40); ieee80211_mesh_holdingtimeout = msecs_to_ticks(40); ieee80211_mesh_confirmtimeout = msecs_to_ticks(40); ieee80211_mesh_backofftimeout = msecs_to_ticks(5000); /* * Register action frame handlers. */ ieee80211_recv_action_register(IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_OPEN, mesh_recv_action_meshpeering_open); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, mesh_recv_action_meshpeering_confirm); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, mesh_recv_action_meshpeering_close); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_LMETRIC, mesh_recv_action_meshlmetric); ieee80211_recv_action_register(IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_GANN, mesh_recv_action_meshgate); ieee80211_send_action_register(IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_OPEN, mesh_send_action_meshpeering_open); ieee80211_send_action_register(IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, mesh_send_action_meshpeering_confirm); ieee80211_send_action_register(IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, mesh_send_action_meshpeering_close); ieee80211_send_action_register(IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_LMETRIC, mesh_send_action_meshlmetric); ieee80211_send_action_register(IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_GANN, mesh_send_action_meshgate); /* * Register Airtime Link Metric. */ ieee80211_mesh_register_proto_metric(&mesh_metric_airtime); } SYSINIT(wlan_mesh, SI_SUB_DRIVERS, SI_ORDER_FIRST, ieee80211_mesh_init, NULL); void ieee80211_mesh_attach(struct ieee80211com *ic) { ic->ic_vattach[IEEE80211_M_MBSS] = mesh_vattach; } void ieee80211_mesh_detach(struct ieee80211com *ic) { } static void mesh_vdetach_peers(void *arg, struct ieee80211_node *ni) { struct ieee80211com *ic = ni->ni_ic; uint16_t args[3]; if (ni->ni_mlstate == IEEE80211_NODE_MESH_ESTABLISHED) { args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); } callout_drain(&ni->ni_mltimer); /* XXX belongs in hwmp */ ieee80211_ageq_drain_node(&ic->ic_stageq, (void *)(uintptr_t) ieee80211_mac_hash(ic, ni->ni_macaddr)); } static void mesh_vdetach(struct ieee80211vap *vap) { struct ieee80211_mesh_state *ms = vap->iv_mesh; callout_drain(&ms->ms_cleantimer); ieee80211_iterate_nodes(&vap->iv_ic->ic_sta, mesh_vdetach_peers, NULL); ieee80211_mesh_rt_flush(vap); MESH_RT_LOCK_DESTROY(ms); ms->ms_ppath->mpp_vdetach(vap); IEEE80211_FREE(vap->iv_mesh, M_80211_VAP); vap->iv_mesh = NULL; } static void mesh_vattach(struct ieee80211vap *vap) { struct ieee80211_mesh_state *ms; vap->iv_newstate = mesh_newstate; vap->iv_input = mesh_input; vap->iv_opdetach = mesh_vdetach; vap->iv_recv_mgmt = mesh_recv_mgmt; vap->iv_recv_ctl = mesh_recv_ctl; ms = IEEE80211_MALLOC(sizeof(struct ieee80211_mesh_state), M_80211_VAP, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (ms == NULL) { printf("%s: couldn't alloc MBSS state\n", __func__); return; } vap->iv_mesh = ms; ms->ms_seq = 0; ms->ms_flags = (IEEE80211_MESHFLAGS_AP | IEEE80211_MESHFLAGS_FWD); ms->ms_ttl = IEEE80211_MESH_DEFAULT_TTL; TAILQ_INIT(&ms->ms_known_gates); TAILQ_INIT(&ms->ms_routes); MESH_RT_LOCK_INIT(ms, "MBSS"); callout_init(&ms->ms_cleantimer, 1); callout_init(&ms->ms_gatetimer, 1); ms->ms_gateseq = 0; mesh_select_proto_metric(vap, "AIRTIME"); KASSERT(ms->ms_pmetric, ("ms_pmetric == NULL")); mesh_select_proto_path(vap, "HWMP"); KASSERT(ms->ms_ppath, ("ms_ppath == NULL")); ms->ms_ppath->mpp_vattach(vap); } /* * IEEE80211_M_MBSS vap state machine handler. */ static int mesh_newstate(struct ieee80211vap *vap, enum ieee80211_state nstate, int arg) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211com *ic = vap->iv_ic; struct ieee80211_node *ni; enum ieee80211_state ostate; IEEE80211_LOCK_ASSERT(ic); ostate = vap->iv_state; IEEE80211_DPRINTF(vap, IEEE80211_MSG_STATE, "%s: %s -> %s (%d)\n", __func__, ieee80211_state_name[ostate], ieee80211_state_name[nstate], arg); vap->iv_state = nstate; /* state transition */ if (ostate != IEEE80211_S_SCAN) ieee80211_cancel_scan(vap); /* background scan */ ni = vap->iv_bss; /* NB: no reference held */ if (nstate != IEEE80211_S_RUN && ostate == IEEE80211_S_RUN) { callout_drain(&ms->ms_cleantimer); callout_drain(&ms->ms_gatetimer); } switch (nstate) { case IEEE80211_S_INIT: switch (ostate) { case IEEE80211_S_SCAN: ieee80211_cancel_scan(vap); break; case IEEE80211_S_CAC: ieee80211_dfs_cac_stop(vap); break; case IEEE80211_S_RUN: ieee80211_iterate_nodes(&ic->ic_sta, mesh_vdetach_peers, NULL); break; default: break; } if (ostate != IEEE80211_S_INIT) { /* NB: optimize INIT -> INIT case */ ieee80211_reset_bss(vap); ieee80211_mesh_rt_flush(vap); } break; case IEEE80211_S_SCAN: switch (ostate) { case IEEE80211_S_INIT: if (vap->iv_des_chan != IEEE80211_CHAN_ANYC && !IEEE80211_IS_CHAN_RADAR(vap->iv_des_chan) && ms->ms_idlen != 0) { /* * Already have a channel and a mesh ID; bypass * the scan and startup immediately. */ ieee80211_create_ibss(vap, vap->iv_des_chan); break; } /* * Initiate a scan. We can come here as a result * of an IEEE80211_IOC_SCAN_REQ too in which case * the vap will be marked with IEEE80211_FEXT_SCANREQ * and the scan request parameters will be present * in iv_scanreq. Otherwise we do the default. */ if (vap->iv_flags_ext & IEEE80211_FEXT_SCANREQ) { ieee80211_check_scan(vap, vap->iv_scanreq_flags, vap->iv_scanreq_duration, vap->iv_scanreq_mindwell, vap->iv_scanreq_maxdwell, vap->iv_scanreq_nssid, vap->iv_scanreq_ssid); vap->iv_flags_ext &= ~IEEE80211_FEXT_SCANREQ; } else ieee80211_check_scan_current(vap); break; default: break; } break; case IEEE80211_S_CAC: /* * Start CAC on a DFS channel. We come here when starting * a bss on a DFS channel (see ieee80211_create_ibss). */ ieee80211_dfs_cac_start(vap); break; case IEEE80211_S_RUN: switch (ostate) { case IEEE80211_S_INIT: /* * Already have a channel; bypass the * scan and startup immediately. * Note that ieee80211_create_ibss will call * back to do a RUN->RUN state change. */ ieee80211_create_ibss(vap, ieee80211_ht_adjust_channel(ic, ic->ic_curchan, vap->iv_flags_ht)); /* NB: iv_bss is changed on return */ break; case IEEE80211_S_CAC: /* * NB: This is the normal state change when CAC * expires and no radar was detected; no need to * clear the CAC timer as it's already expired. */ /* fall thru... */ case IEEE80211_S_CSA: #if 0 /* * Shorten inactivity timer of associated stations * to weed out sta's that don't follow a CSA. */ ieee80211_iterate_nodes(&ic->ic_sta, sta_csa, vap); #endif /* * Update bss node channel to reflect where * we landed after CSA. */ ieee80211_node_set_chan(ni, ieee80211_ht_adjust_channel(ic, ic->ic_curchan, ieee80211_htchanflags(ni->ni_chan))); /* XXX bypass debug msgs */ break; case IEEE80211_S_SCAN: case IEEE80211_S_RUN: #ifdef IEEE80211_DEBUG if (ieee80211_msg_debug(vap)) { ieee80211_note(vap, "synchronized with %s meshid ", ether_sprintf(ni->ni_meshid)); ieee80211_print_essid(ni->ni_meshid, ni->ni_meshidlen); /* XXX MCS/HT */ printf(" channel %d\n", ieee80211_chan2ieee(ic, ic->ic_curchan)); } #endif break; default: break; } ieee80211_node_authorize(ni); callout_reset(&ms->ms_cleantimer, ms->ms_ppath->mpp_inact, mesh_rt_cleanup_cb, vap); mesh_gatemode_setup(vap); break; default: break; } /* NB: ostate not nstate */ ms->ms_ppath->mpp_newstate(vap, ostate, arg); return 0; } static void mesh_rt_cleanup_cb(void *arg) { struct ieee80211vap *vap = arg; struct ieee80211_mesh_state *ms = vap->iv_mesh; mesh_rt_flush_invalid(vap); callout_reset(&ms->ms_cleantimer, ms->ms_ppath->mpp_inact, mesh_rt_cleanup_cb, vap); } /* * Mark a mesh STA as gate and return a pointer to it. * If this is first time, we create a new gate route. * Always update the path route to this mesh gate. */ struct ieee80211_mesh_gate_route * ieee80211_mesh_mark_gate(struct ieee80211vap *vap, const uint8_t *addr, struct ieee80211_mesh_route *rt) { struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_gate_route *gr = NULL, *next; int found = 0; MESH_RT_LOCK(ms); TAILQ_FOREACH_SAFE(gr, &ms->ms_known_gates, gr_next, next) { if (IEEE80211_ADDR_EQ(gr->gr_addr, addr)) { found = 1; break; } } if (!found) { /* New mesh gate add it to known table. */ IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, addr, "%s", "stored new gate information from pro-PREQ."); gr = IEEE80211_MALLOC(ALIGN(sizeof(struct ieee80211_mesh_gate_route)), M_80211_MESH_GT_RT, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); IEEE80211_ADDR_COPY(gr->gr_addr, addr); TAILQ_INSERT_TAIL(&ms->ms_known_gates, gr, gr_next); } gr->gr_route = rt; /* TODO: link from path route to gate route */ MESH_RT_UNLOCK(ms); return gr; } /* * Helper function to note the Mesh Peer Link FSM change. */ static void mesh_linkchange(struct ieee80211_node *ni, enum ieee80211_mesh_mlstate state) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_mesh_state *ms = vap->iv_mesh; #ifdef IEEE80211_DEBUG static const char *meshlinkstates[] = { [IEEE80211_NODE_MESH_IDLE] = "IDLE", [IEEE80211_NODE_MESH_OPENSNT] = "OPEN SENT", [IEEE80211_NODE_MESH_OPENRCV] = "OPEN RECEIVED", [IEEE80211_NODE_MESH_CONFIRMRCV] = "CONFIRM RECEIVED", [IEEE80211_NODE_MESH_ESTABLISHED] = "ESTABLISHED", [IEEE80211_NODE_MESH_HOLDING] = "HOLDING" }; #endif IEEE80211_NOTE(vap, IEEE80211_MSG_MESH, ni, "peer link: %s -> %s", meshlinkstates[ni->ni_mlstate], meshlinkstates[state]); /* track neighbor count */ if (state == IEEE80211_NODE_MESH_ESTABLISHED && ni->ni_mlstate != IEEE80211_NODE_MESH_ESTABLISHED) { KASSERT(ms->ms_neighbors < 65535, ("neighbor count overflow")); ms->ms_neighbors++; ieee80211_beacon_notify(vap, IEEE80211_BEACON_MESHCONF); } else if (ni->ni_mlstate == IEEE80211_NODE_MESH_ESTABLISHED && state != IEEE80211_NODE_MESH_ESTABLISHED) { KASSERT(ms->ms_neighbors > 0, ("neighbor count 0")); ms->ms_neighbors--; ieee80211_beacon_notify(vap, IEEE80211_BEACON_MESHCONF); } ni->ni_mlstate = state; switch (state) { case IEEE80211_NODE_MESH_HOLDING: ms->ms_ppath->mpp_peerdown(ni); break; case IEEE80211_NODE_MESH_ESTABLISHED: ieee80211_mesh_discover(vap, ni->ni_macaddr, NULL); break; default: break; } } /* * Helper function to generate a unique local ID required for mesh * peer establishment. */ static void mesh_checkid(void *arg, struct ieee80211_node *ni) { uint16_t *r = arg; if (*r == ni->ni_mllid) *(uint16_t *)arg = 0; } static uint32_t mesh_generateid(struct ieee80211vap *vap) { int maxiter = 4; uint16_t r; do { get_random_bytes(&r, 2); ieee80211_iterate_nodes(&vap->iv_ic->ic_sta, mesh_checkid, &r); maxiter--; } while (r == 0 && maxiter > 0); return r; } /* * Verifies if we already received this packet by checking its * sequence number. * Returns 0 if the frame is to be accepted, 1 otherwise. */ static int mesh_checkpseq(struct ieee80211vap *vap, const uint8_t source[IEEE80211_ADDR_LEN], uint32_t seq) { struct ieee80211_mesh_route *rt; rt = ieee80211_mesh_rt_find(vap, source); if (rt == NULL) { rt = ieee80211_mesh_rt_add(vap, source); if (rt == NULL) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, source, "%s", "add mcast route failed"); vap->iv_stats.is_mesh_rtaddfailed++; return 1; } IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, source, "add mcast route, mesh seqno %d", seq); rt->rt_lastmseq = seq; return 0; } if (IEEE80211_MESH_SEQ_GEQ(rt->rt_lastmseq, seq)) { return 1; } else { rt->rt_lastmseq = seq; return 0; } } /* * Iterate the routing table and locate the next hop. */ struct ieee80211_node * ieee80211_mesh_find_txnode(struct ieee80211vap *vap, const uint8_t dest[IEEE80211_ADDR_LEN]) { struct ieee80211_mesh_route *rt; rt = ieee80211_mesh_rt_find(vap, dest); if (rt == NULL) return NULL; if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, dest, "%s: !valid, flags 0x%x", __func__, rt->rt_flags); /* XXX stat */ return NULL; } if (rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY) { rt = ieee80211_mesh_rt_find(vap, rt->rt_mesh_gate); if (rt == NULL) return NULL; if ((rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, dest, "%s: meshgate !valid, flags 0x%x", __func__, rt->rt_flags); /* XXX stat */ return NULL; } } return ieee80211_find_txnode(vap, rt->rt_nexthop); } static void mesh_transmit_to_gate(struct ieee80211vap *vap, struct mbuf *m, struct ieee80211_mesh_route *rt_gate) { struct ifnet *ifp = vap->iv_ifp; struct ieee80211_node *ni; IEEE80211_TX_UNLOCK_ASSERT(vap->iv_ic); ni = ieee80211_mesh_find_txnode(vap, rt_gate->rt_dest); if (ni == NULL) { if_inc_counter(ifp, IFCOUNTER_OERRORS, 1); m_freem(m); return; } /* * Send through the VAP packet transmit path. * This consumes the node ref grabbed above and * the mbuf, regardless of whether there's a problem * or not. */ (void) ieee80211_vap_pkt_send_dest(vap, m, ni); } /* * Forward the queued frames to known valid mesh gates. * Assume destination to be outside the MBSS (i.e. proxy entry), * If no valid mesh gates are known silently discard queued frames. * After transmitting frames to all known valid mesh gates, this route * will be marked invalid, and a new path discovery will happen in the hopes * that (at least) one of the mesh gates have a new proxy entry for us to use. */ void ieee80211_mesh_forward_to_gates(struct ieee80211vap *vap, struct ieee80211_mesh_route *rt_dest) { struct ieee80211com *ic = vap->iv_ic; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt_gate; struct ieee80211_mesh_gate_route *gr = NULL, *gr_next; struct mbuf *m, *mcopy, *next; IEEE80211_TX_UNLOCK_ASSERT(ic); KASSERT( rt_dest->rt_flags == IEEE80211_MESHRT_FLAGS_DISCOVER, ("Route is not marked with IEEE80211_MESHRT_FLAGS_DISCOVER")); /* XXX: send to more than one valid mash gate */ MESH_RT_LOCK(ms); m = ieee80211_ageq_remove(&ic->ic_stageq, (struct ieee80211_node *)(uintptr_t) ieee80211_mac_hash(ic, rt_dest->rt_dest)); TAILQ_FOREACH_SAFE(gr, &ms->ms_known_gates, gr_next, gr_next) { rt_gate = gr->gr_route; if (rt_gate == NULL) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_HWMP, rt_dest->rt_dest, "mesh gate with no path %6D", gr->gr_addr, ":"); continue; } if ((rt_gate->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) == 0) continue; KASSERT(rt_gate->rt_flags & IEEE80211_MESHRT_FLAGS_GATE, ("route not marked as a mesh gate")); KASSERT((rt_gate->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY) == 0, ("found mesh gate that is also marked porxy")); /* * convert route to a proxy route gated by the current * mesh gate, this is needed so encap can built data * frame with correct address. */ rt_dest->rt_flags = IEEE80211_MESHRT_FLAGS_PROXY | IEEE80211_MESHRT_FLAGS_VALID; rt_dest->rt_ext_seq = 1; /* random value */ IEEE80211_ADDR_COPY(rt_dest->rt_mesh_gate, rt_gate->rt_dest); IEEE80211_ADDR_COPY(rt_dest->rt_nexthop, rt_gate->rt_nexthop); rt_dest->rt_metric = rt_gate->rt_metric; rt_dest->rt_nhops = rt_gate->rt_nhops; ieee80211_mesh_rt_update(rt_dest, ms->ms_ppath->mpp_inact); MESH_RT_UNLOCK(ms); /* XXX: lock?? */ mcopy = m_dup(m, M_NOWAIT); for (; mcopy != NULL; mcopy = next) { next = mcopy->m_nextpkt; mcopy->m_nextpkt = NULL; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_HWMP, rt_dest->rt_dest, "flush queued frame %p len %d", mcopy, mcopy->m_pkthdr.len); mesh_transmit_to_gate(vap, mcopy, rt_gate); } MESH_RT_LOCK(ms); } rt_dest->rt_flags = 0; /* Mark invalid */ m_freem(m); MESH_RT_UNLOCK(ms); } /* * Forward the specified frame. * Decrement the TTL and set TA to our MAC address. */ static void mesh_forward(struct ieee80211vap *vap, struct mbuf *m, const struct ieee80211_meshcntl *mc) { struct ieee80211com *ic = vap->iv_ic; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ifnet *ifp = vap->iv_ifp; const struct ieee80211_frame *wh = mtod(m, const struct ieee80211_frame *); struct mbuf *mcopy; struct ieee80211_meshcntl *mccopy; struct ieee80211_frame *whcopy; struct ieee80211_node *ni; int err; /* This is called from the RX path - don't hold this lock */ IEEE80211_TX_UNLOCK_ASSERT(ic); /* * mesh ttl of 1 means we are the last one receiving it, * according to amendment we decrement and then check if * 0, if so we dont forward. */ if (mc->mc_ttl < 1) { IEEE80211_NOTE_FRAME(vap, IEEE80211_MSG_MESH, wh, "%s", "frame not fwd'd, ttl 1"); vap->iv_stats.is_mesh_fwd_ttl++; return; } if (!(ms->ms_flags & IEEE80211_MESHFLAGS_FWD)) { IEEE80211_NOTE_FRAME(vap, IEEE80211_MSG_MESH, wh, "%s", "frame not fwd'd, fwding disabled"); vap->iv_stats.is_mesh_fwd_disabled++; return; } mcopy = m_dup(m, M_NOWAIT); if (mcopy == NULL) { IEEE80211_NOTE_FRAME(vap, IEEE80211_MSG_MESH, wh, "%s", "frame not fwd'd, cannot dup"); vap->iv_stats.is_mesh_fwd_nobuf++; if_inc_counter(ifp, IFCOUNTER_OERRORS, 1); return; } mcopy = m_pullup(mcopy, ieee80211_hdrspace(ic, wh) + sizeof(struct ieee80211_meshcntl)); if (mcopy == NULL) { IEEE80211_NOTE_FRAME(vap, IEEE80211_MSG_MESH, wh, "%s", "frame not fwd'd, too short"); vap->iv_stats.is_mesh_fwd_tooshort++; if_inc_counter(ifp, IFCOUNTER_OERRORS, 1); m_freem(mcopy); return; } whcopy = mtod(mcopy, struct ieee80211_frame *); mccopy = (struct ieee80211_meshcntl *) (mtod(mcopy, uint8_t *) + ieee80211_hdrspace(ic, wh)); /* XXX clear other bits? */ whcopy->i_fc[1] &= ~IEEE80211_FC1_RETRY; IEEE80211_ADDR_COPY(whcopy->i_addr2, vap->iv_myaddr); if (IEEE80211_IS_MULTICAST(wh->i_addr1)) { ni = ieee80211_ref_node(vap->iv_bss); mcopy->m_flags |= M_MCAST; } else { ni = ieee80211_mesh_find_txnode(vap, whcopy->i_addr3); if (ni == NULL) { /* * [Optional] any of the following three actions: * o silently discard * o trigger a path discovery * o inform TA that meshDA is unknown. */ IEEE80211_NOTE_FRAME(vap, IEEE80211_MSG_MESH, wh, "%s", "frame not fwd'd, no path"); ms->ms_ppath->mpp_senderror(vap, whcopy->i_addr3, NULL, IEEE80211_REASON_MESH_PERR_NO_FI); vap->iv_stats.is_mesh_fwd_nopath++; m_freem(mcopy); return; } IEEE80211_ADDR_COPY(whcopy->i_addr1, ni->ni_macaddr); } KASSERT(mccopy->mc_ttl > 0, ("%s called with wrong ttl", __func__)); mccopy->mc_ttl--; /* XXX calculate priority so drivers can find the tx queue */ M_WME_SETAC(mcopy, WME_AC_BE); /* XXX do we know m_nextpkt is NULL? */ MPASS((mcopy->m_pkthdr.csum_flags & CSUM_SND_TAG) == 0); mcopy->m_pkthdr.rcvif = (void *) ni; /* * XXX this bypasses all of the VAP TX handling; it passes frames * directly to the parent interface. * * Because of this, there's no TX lock being held as there's no * encaps state being used. * * Doing a direct parent transmit may not be the correct thing * to do here; we'll have to re-think this soon. */ IEEE80211_TX_LOCK(ic); err = ieee80211_parent_xmitpkt(ic, mcopy); IEEE80211_TX_UNLOCK(ic); if (!err) if_inc_counter(ifp, IFCOUNTER_OPACKETS, 1); } static struct mbuf * mesh_decap(struct ieee80211vap *vap, struct mbuf *m, int hdrlen, int meshdrlen) { #define WHDIR(wh) ((wh)->i_fc[1] & IEEE80211_FC1_DIR_MASK) #define MC01(mc) ((const struct ieee80211_meshcntl_ae01 *)mc) uint8_t b[sizeof(struct ieee80211_qosframe_addr4) + sizeof(struct ieee80211_meshcntl_ae10)]; const struct ieee80211_qosframe_addr4 *wh; const struct ieee80211_meshcntl_ae10 *mc; struct ether_header *eh; struct llc *llc; int ae; if (m->m_len < hdrlen + sizeof(*llc) && (m = m_pullup(m, hdrlen + sizeof(*llc))) == NULL) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_ANY, "discard data frame: %s", "m_pullup failed"); vap->iv_stats.is_rx_tooshort++; return NULL; } memcpy(b, mtod(m, caddr_t), hdrlen); wh = (const struct ieee80211_qosframe_addr4 *)&b[0]; mc = (const struct ieee80211_meshcntl_ae10 *)&b[hdrlen - meshdrlen]; KASSERT(WHDIR(wh) == IEEE80211_FC1_DIR_FROMDS || WHDIR(wh) == IEEE80211_FC1_DIR_DSTODS, ("bogus dir, fc 0x%x:0x%x", wh->i_fc[0], wh->i_fc[1])); llc = (struct llc *)(mtod(m, caddr_t) + hdrlen); if (llc->llc_dsap == LLC_SNAP_LSAP && llc->llc_ssap == LLC_SNAP_LSAP && llc->llc_control == LLC_UI && llc->llc_snap.org_code[0] == 0 && llc->llc_snap.org_code[1] == 0 && llc->llc_snap.org_code[2] == 0 && /* NB: preserve AppleTalk frames that have a native SNAP hdr */ !(llc->llc_snap.ether_type == htons(ETHERTYPE_AARP) || llc->llc_snap.ether_type == htons(ETHERTYPE_IPX))) { m_adj(m, hdrlen + sizeof(struct llc) - sizeof(*eh)); llc = NULL; } else { m_adj(m, hdrlen - sizeof(*eh)); } eh = mtod(m, struct ether_header *); ae = mc->mc_flags & IEEE80211_MESH_AE_MASK; if (WHDIR(wh) == IEEE80211_FC1_DIR_FROMDS) { IEEE80211_ADDR_COPY(eh->ether_dhost, wh->i_addr1); if (ae == IEEE80211_MESH_AE_00) { IEEE80211_ADDR_COPY(eh->ether_shost, wh->i_addr3); } else if (ae == IEEE80211_MESH_AE_01) { IEEE80211_ADDR_COPY(eh->ether_shost, MC01(mc)->mc_addr4); } else { IEEE80211_DISCARD(vap, IEEE80211_MSG_ANY, (const struct ieee80211_frame *)wh, NULL, "bad AE %d", ae); vap->iv_stats.is_mesh_badae++; m_freem(m); return NULL; } } else { if (ae == IEEE80211_MESH_AE_00) { IEEE80211_ADDR_COPY(eh->ether_dhost, wh->i_addr3); IEEE80211_ADDR_COPY(eh->ether_shost, wh->i_addr4); } else if (ae == IEEE80211_MESH_AE_10) { IEEE80211_ADDR_COPY(eh->ether_dhost, mc->mc_addr5); IEEE80211_ADDR_COPY(eh->ether_shost, mc->mc_addr6); } else { IEEE80211_DISCARD(vap, IEEE80211_MSG_ANY, (const struct ieee80211_frame *)wh, NULL, "bad AE %d", ae); vap->iv_stats.is_mesh_badae++; m_freem(m); return NULL; } } #ifndef __NO_STRICT_ALIGNMENT if (!ALIGNED_POINTER(mtod(m, caddr_t) + sizeof(*eh), uint32_t)) { m = ieee80211_realign(vap, m, sizeof(*eh)); if (m == NULL) return NULL; } #endif /* !__NO_STRICT_ALIGNMENT */ if (llc != NULL) { eh = mtod(m, struct ether_header *); eh->ether_type = htons(m->m_pkthdr.len - sizeof(*eh)); } return m; #undef WDIR #undef MC01 } /* * Return non-zero if the unicast mesh data frame should be processed * locally. Frames that are not proxy'd have our address, otherwise * we need to consult the routing table to look for a proxy entry. */ static __inline int mesh_isucastforme(struct ieee80211vap *vap, const struct ieee80211_frame *wh, const struct ieee80211_meshcntl *mc) { int ae = mc->mc_flags & 3; KASSERT((wh->i_fc[1] & IEEE80211_FC1_DIR_MASK) == IEEE80211_FC1_DIR_DSTODS, ("bad dir 0x%x:0x%x", wh->i_fc[0], wh->i_fc[1])); KASSERT(ae == IEEE80211_MESH_AE_00 || ae == IEEE80211_MESH_AE_10, ("bad AE %d", ae)); if (ae == IEEE80211_MESH_AE_10) { /* ucast w/ proxy */ const struct ieee80211_meshcntl_ae10 *mc10 = (const struct ieee80211_meshcntl_ae10 *) mc; struct ieee80211_mesh_route *rt = ieee80211_mesh_rt_find(vap, mc10->mc_addr5); /* check for proxy route to ourself */ return (rt != NULL && (rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY)); } else /* ucast w/o proxy */ return IEEE80211_ADDR_EQ(wh->i_addr3, vap->iv_myaddr); } /* * Verifies transmitter, updates lifetime, precursor list and forwards data. * > 0 means we have forwarded data and no need to process locally * == 0 means we want to process locally (and we may have forwarded data * < 0 means there was an error and data should be discarded */ static int mesh_recv_indiv_data_to_fwrd(struct ieee80211vap *vap, struct mbuf *m, struct ieee80211_frame *wh, const struct ieee80211_meshcntl *mc) { struct ieee80211_qosframe_addr4 *qwh; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt_meshda, *rt_meshsa; /* This is called from the RX path - don't hold this lock */ IEEE80211_TX_UNLOCK_ASSERT(vap->iv_ic); qwh = (struct ieee80211_qosframe_addr4 *)wh; /* * TODO: * o verify addr2 is a legitimate transmitter * o lifetime of precursor of addr3 (addr2) is max(init, curr) * o lifetime of precursor of addr4 (nexthop) is max(init, curr) */ /* set lifetime of addr3 (meshDA) to initial value */ rt_meshda = ieee80211_mesh_rt_find(vap, qwh->i_addr3); if (rt_meshda == NULL) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, qwh->i_addr2, "no route to meshDA(%6D)", qwh->i_addr3, ":"); /* * [Optional] any of the following three actions: * o silently discard [X] * o trigger a path discovery [ ] * o inform TA that meshDA is unknown. [ ] */ /* XXX: stats */ return (-1); } ieee80211_mesh_rt_update(rt_meshda, ticks_to_msecs( ms->ms_ppath->mpp_inact)); /* set lifetime of addr4 (meshSA) to initial value */ rt_meshsa = ieee80211_mesh_rt_find(vap, qwh->i_addr4); KASSERT(rt_meshsa != NULL, ("no route")); ieee80211_mesh_rt_update(rt_meshsa, ticks_to_msecs( ms->ms_ppath->mpp_inact)); mesh_forward(vap, m, mc); return (1); /* dont process locally */ } /* * Verifies transmitter, updates lifetime, precursor list and process data * locally, if data is proxy with AE = 10 it could mean data should go * on another mesh path or data should be forwarded to the DS. * * > 0 means we have forwarded data and no need to process locally * == 0 means we want to process locally (and we may have forwarded data * < 0 means there was an error and data should be discarded */ static int mesh_recv_indiv_data_to_me(struct ieee80211vap *vap, struct mbuf *m, struct ieee80211_frame *wh, const struct ieee80211_meshcntl *mc) { struct ieee80211_qosframe_addr4 *qwh; const struct ieee80211_meshcntl_ae10 *mc10; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_route *rt; int ae; /* This is called from the RX path - don't hold this lock */ IEEE80211_TX_UNLOCK_ASSERT(vap->iv_ic); qwh = (struct ieee80211_qosframe_addr4 *)wh; mc10 = (const struct ieee80211_meshcntl_ae10 *)mc; /* * TODO: * o verify addr2 is a legitimate transmitter * o lifetime of precursor entry is max(init, curr) */ /* set lifetime of addr4 (meshSA) to initial value */ rt = ieee80211_mesh_rt_find(vap, qwh->i_addr4); KASSERT(rt != NULL, ("no route")); ieee80211_mesh_rt_update(rt, ticks_to_msecs(ms->ms_ppath->mpp_inact)); rt = NULL; ae = mc10->mc_flags & IEEE80211_MESH_AE_MASK; KASSERT(ae == IEEE80211_MESH_AE_00 || ae == IEEE80211_MESH_AE_10, ("bad AE %d", ae)); if (ae == IEEE80211_MESH_AE_10) { if (IEEE80211_ADDR_EQ(mc10->mc_addr5, qwh->i_addr3)) { return (0); /* process locally */ } rt = ieee80211_mesh_rt_find(vap, mc10->mc_addr5); if (rt != NULL && (rt->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) && (rt->rt_flags & IEEE80211_MESHRT_FLAGS_PROXY) == 0) { /* * Forward on another mesh-path, according to * amendment as specified in 9.32.4.1 */ IEEE80211_ADDR_COPY(qwh->i_addr3, mc10->mc_addr5); mesh_forward(vap, m, (const struct ieee80211_meshcntl *)mc10); return (1); /* dont process locally */ } /* * All other cases: forward of MSDUs from the MBSS to DS indiv. * addressed according to 13.11.3.2. */ IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_OUTPUT, qwh->i_addr2, "forward frame to DS, SA(%6D) DA(%6D)", mc10->mc_addr6, ":", mc10->mc_addr5, ":"); } return (0); /* process locally */ } /* * Try to forward the group addressed data on to other mesh STAs, and * also to the DS. * * > 0 means we have forwarded data and no need to process locally * == 0 means we want to process locally (and we may have forwarded data * < 0 means there was an error and data should be discarded */ static int mesh_recv_group_data(struct ieee80211vap *vap, struct mbuf *m, struct ieee80211_frame *wh, const struct ieee80211_meshcntl *mc) { #define MC01(mc) ((const struct ieee80211_meshcntl_ae01 *)mc) struct ieee80211_mesh_state *ms = vap->iv_mesh; /* This is called from the RX path - don't hold this lock */ IEEE80211_TX_UNLOCK_ASSERT(vap->iv_ic); mesh_forward(vap, m, mc); if(mc->mc_ttl > 0) { if (mc->mc_flags & IEEE80211_MESH_AE_01) { /* * Forward of MSDUs from the MBSS to DS group addressed * (according to 13.11.3.2) * This happens by delivering the packet, and a bridge * will sent it on another port member. */ if (ms->ms_flags & IEEE80211_MESHFLAGS_GATE && ms->ms_flags & IEEE80211_MESHFLAGS_FWD) { IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, MC01(mc)->mc_addr4, "%s", "forward from MBSS to the DS"); } } } return (0); /* process locally */ #undef MC01 } static int mesh_input(struct ieee80211_node *ni, struct mbuf *m, const struct ieee80211_rx_stats *rxs, int rssi, int nf) { #define HAS_SEQ(type) ((type & 0x4) == 0) #define MC01(mc) ((const struct ieee80211_meshcntl_ae01 *)mc) struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct ifnet *ifp = vap->iv_ifp; struct ieee80211_frame *wh; const struct ieee80211_meshcntl *mc; int hdrspace, meshdrlen, need_tap, error; uint8_t dir, type, subtype, ae; uint32_t seq; const uint8_t *addr; uint8_t qos[2]; KASSERT(ni != NULL, ("null node")); ni->ni_inact = ni->ni_inact_reload; need_tap = 1; /* mbuf need to be tapped. */ type = -1; /* undefined */ /* This is called from the RX path - don't hold this lock */ IEEE80211_TX_UNLOCK_ASSERT(ic); if (m->m_pkthdr.len < sizeof(struct ieee80211_frame_min)) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, NULL, "too short (1): len %u", m->m_pkthdr.len); vap->iv_stats.is_rx_tooshort++; goto out; } /* * Bit of a cheat here, we use a pointer for a 3-address * frame format but don't reference fields past outside * ieee80211_frame_min w/o first validating the data is * present. */ wh = mtod(m, struct ieee80211_frame *); if ((wh->i_fc[0] & IEEE80211_FC0_VERSION_MASK) != IEEE80211_FC0_VERSION_0) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, NULL, "wrong version %x", wh->i_fc[0]); vap->iv_stats.is_rx_badversion++; goto err; } dir = wh->i_fc[1] & IEEE80211_FC1_DIR_MASK; type = wh->i_fc[0] & IEEE80211_FC0_TYPE_MASK; subtype = wh->i_fc[0] & IEEE80211_FC0_SUBTYPE_MASK; if ((ic->ic_flags & IEEE80211_F_SCAN) == 0) { IEEE80211_RSSI_LPF(ni->ni_avgrssi, rssi); ni->ni_noise = nf; if (HAS_SEQ(type)) { uint8_t tid = ieee80211_gettid(wh); if (IEEE80211_QOS_HAS_SEQ(wh) && TID_TO_WME_AC(tid) >= WME_AC_VI) ic->ic_wme.wme_hipri_traffic++; if (! ieee80211_check_rxseq(ni, wh, wh->i_addr1, rxs)) goto out; } } #ifdef IEEE80211_DEBUG /* * It's easier, but too expensive, to simulate different mesh * topologies by consulting the ACL policy very early, so do this * only under DEBUG. * * NB: this check is also done upon peering link initiation. */ if (vap->iv_acl != NULL && !vap->iv_acl->iac_check(vap, wh)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACL, wh, NULL, "%s", "disallowed by ACL"); vap->iv_stats.is_rx_acl++; goto out; } #endif switch (type) { case IEEE80211_FC0_TYPE_DATA: if (ni == vap->iv_bss) goto out; if (ni->ni_mlstate != IEEE80211_NODE_MESH_ESTABLISHED) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_MESH, ni->ni_macaddr, NULL, "peer link not yet established (%d)", ni->ni_mlstate); vap->iv_stats.is_mesh_nolink++; goto out; } if (dir != IEEE80211_FC1_DIR_FROMDS && dir != IEEE80211_FC1_DIR_DSTODS) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, "data", "incorrect dir 0x%x", dir); vap->iv_stats.is_rx_wrongdir++; goto err; } /* All Mesh data frames are QoS subtype */ if (!HAS_SEQ(type)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, "data", "incorrect subtype 0x%x", subtype); vap->iv_stats.is_rx_badsubtype++; goto err; } /* * Next up, any fragmentation. * XXX: we defrag before we even try to forward, * Mesh Control field is not present in sub-sequent * fragmented frames. This is in contrast to Draft 4.0. */ hdrspace = ieee80211_hdrspace(ic, wh); if (!IEEE80211_IS_MULTICAST(wh->i_addr1)) { m = ieee80211_defrag(ni, m, hdrspace); if (m == NULL) { /* Fragment dropped or frame not complete yet */ goto out; } } wh = mtod(m, struct ieee80211_frame *); /* NB: after defrag */ /* * Now we have a complete Mesh Data frame. */ /* * Only fromDStoDS data frames use 4 address qos frames * as specified in amendment. Otherwise addr4 is located * in the Mesh Control field and a 3 address qos frame * is used. */ *(uint16_t *)qos = *(uint16_t *)ieee80211_getqos(wh); /* * NB: The mesh STA sets the Mesh Control Present * subfield to 1 in the Mesh Data frame containing * an unfragmented MSDU, an A-MSDU, or the first * fragment of an MSDU. * After defrag it should always be present. */ if (!(qos[1] & IEEE80211_QOS_MC)) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_MESH, ni->ni_macaddr, NULL, "%s", "Mesh control field not present"); vap->iv_stats.is_rx_elem_missing++; /* XXX: kinda */ goto err; } /* pull up enough to get to the mesh control */ if (m->m_len < hdrspace + sizeof(struct ieee80211_meshcntl) && (m = m_pullup(m, hdrspace + sizeof(struct ieee80211_meshcntl))) == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, NULL, "data too short: expecting %u", hdrspace); vap->iv_stats.is_rx_tooshort++; goto out; /* XXX */ } /* * Now calculate the full extent of the headers. Note * mesh_decap will pull up anything we didn't get * above when it strips the 802.11 headers. */ mc = (const struct ieee80211_meshcntl *) (mtod(m, const uint8_t *) + hdrspace); ae = mc->mc_flags & IEEE80211_MESH_AE_MASK; meshdrlen = sizeof(struct ieee80211_meshcntl) + ae * IEEE80211_ADDR_LEN; hdrspace += meshdrlen; /* pull complete hdrspace = ieee80211_hdrspace + meshcontrol */ if ((meshdrlen > sizeof(struct ieee80211_meshcntl)) && (m->m_len < hdrspace) && ((m = m_pullup(m, hdrspace)) == NULL)) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, NULL, "data too short: expecting %u", hdrspace); vap->iv_stats.is_rx_tooshort++; goto out; /* XXX */ } /* XXX: are we sure there is no reallocating after m_pullup? */ seq = le32dec(mc->mc_seq); if (IEEE80211_IS_MULTICAST(wh->i_addr1)) addr = wh->i_addr3; else if (ae == IEEE80211_MESH_AE_01) addr = MC01(mc)->mc_addr4; else addr = ((struct ieee80211_qosframe_addr4 *)wh)->i_addr4; if (IEEE80211_ADDR_EQ(vap->iv_myaddr, addr)) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT, addr, "data", "%s", "not to me"); vap->iv_stats.is_rx_wrongbss++; /* XXX kinda */ goto out; } if (mesh_checkpseq(vap, addr, seq) != 0) { vap->iv_stats.is_rx_dup++; goto out; } /* This code "routes" the frame to the right control path */ if (!IEEE80211_IS_MULTICAST(wh->i_addr1)) { if (IEEE80211_ADDR_EQ(vap->iv_myaddr, wh->i_addr3)) error = mesh_recv_indiv_data_to_me(vap, m, wh, mc); else if (IEEE80211_IS_MULTICAST(wh->i_addr3)) error = mesh_recv_group_data(vap, m, wh, mc); else error = mesh_recv_indiv_data_to_fwrd(vap, m, wh, mc); } else error = mesh_recv_group_data(vap, m, wh, mc); if (error < 0) goto err; else if (error > 0) goto out; if (ieee80211_radiotap_active_vap(vap)) ieee80211_radiotap_rx(vap, m); need_tap = 0; /* * Finally, strip the 802.11 header. */ m = mesh_decap(vap, m, hdrspace, meshdrlen); if (m == NULL) { /* XXX mask bit to check for both */ /* don't count Null data frames as errors */ if (subtype == IEEE80211_FC0_SUBTYPE_NODATA || subtype == IEEE80211_FC0_SUBTYPE_QOS_NULL) goto out; IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_INPUT, ni->ni_macaddr, "data", "%s", "decap error"); vap->iv_stats.is_rx_decap++; IEEE80211_NODE_STAT(ni, rx_decap); goto err; } if (qos[0] & IEEE80211_QOS_AMSDU) { m = ieee80211_decap_amsdu(ni, m); if (m == NULL) return IEEE80211_FC0_TYPE_DATA; } ieee80211_deliver_data(vap, ni, m); return type; case IEEE80211_FC0_TYPE_MGT: vap->iv_stats.is_rx_mgmt++; IEEE80211_NODE_STAT(ni, rx_mgmt); if (dir != IEEE80211_FC1_DIR_NODS) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, "mgt", "incorrect dir 0x%x", dir); vap->iv_stats.is_rx_wrongdir++; goto err; } if (m->m_pkthdr.len < sizeof(struct ieee80211_frame)) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "mgt", "too short: len %u", m->m_pkthdr.len); vap->iv_stats.is_rx_tooshort++; goto out; } #ifdef IEEE80211_DEBUG if ((ieee80211_msg_debug(vap) && (vap->iv_ic->ic_flags & IEEE80211_F_SCAN)) || ieee80211_msg_dumppkts(vap)) { if_printf(ifp, "received %s from %s rssi %d\n", ieee80211_mgt_subtype_name(subtype), ether_sprintf(wh->i_addr2), rssi); } #endif if (wh->i_fc[1] & IEEE80211_FC1_PROTECTED) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "%s", "WEP set but not permitted"); vap->iv_stats.is_rx_mgtdiscard++; /* XXX */ goto out; } vap->iv_recv_mgmt(ni, m, subtype, rxs, rssi, nf); goto out; case IEEE80211_FC0_TYPE_CTL: vap->iv_stats.is_rx_ctl++; IEEE80211_NODE_STAT(ni, rx_ctrl); goto out; default: IEEE80211_DISCARD(vap, IEEE80211_MSG_ANY, wh, "bad", "frame type 0x%x", type); /* should not come here */ break; } err: if_inc_counter(ifp, IFCOUNTER_IERRORS, 1); out: if (m != NULL) { if (need_tap && ieee80211_radiotap_active_vap(vap)) ieee80211_radiotap_rx(vap, m); m_freem(m); } return type; #undef HAS_SEQ #undef MC01 } static void mesh_recv_mgmt(struct ieee80211_node *ni, struct mbuf *m0, int subtype, const struct ieee80211_rx_stats *rxs, int rssi, int nf) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211com *ic = ni->ni_ic; struct ieee80211_channel *rxchan = ic->ic_curchan; struct ieee80211_frame *wh; struct ieee80211_mesh_route *rt; uint8_t *frm, *efrm; wh = mtod(m0, struct ieee80211_frame *); frm = (uint8_t *)&wh[1]; efrm = mtod(m0, uint8_t *) + m0->m_len; switch (subtype) { case IEEE80211_FC0_SUBTYPE_PROBE_RESP: case IEEE80211_FC0_SUBTYPE_BEACON: { struct ieee80211_scanparams scan; struct ieee80211_channel *c; /* * We process beacon/probe response * frames to discover neighbors. */ if (rxs != NULL) { c = ieee80211_lookup_channel_rxstatus(vap, rxs); if (c != NULL) rxchan = c; } if (ieee80211_parse_beacon(ni, m0, rxchan, &scan) != 0) return; /* * Count frame now that we know it's to be processed. */ if (subtype == IEEE80211_FC0_SUBTYPE_BEACON) { vap->iv_stats.is_rx_beacon++; /* XXX remove */ IEEE80211_NODE_STAT(ni, rx_beacons); } else IEEE80211_NODE_STAT(ni, rx_proberesp); /* * If scanning, just pass information to the scan module. */ if (ic->ic_flags & IEEE80211_F_SCAN) { if (ic->ic_flags_ext & IEEE80211_FEXT_PROBECHAN) { /* * Actively scanning a channel marked passive; * send a probe request now that we know there * is 802.11 traffic present. * * XXX check if the beacon we recv'd gives * us what we need and suppress the probe req */ ieee80211_probe_curchan(vap, 1); ic->ic_flags_ext &= ~IEEE80211_FEXT_PROBECHAN; } ieee80211_add_scan(vap, rxchan, &scan, wh, subtype, rssi, nf); return; } /* The rest of this code assumes we are running */ if (vap->iv_state != IEEE80211_S_RUN) return; /* * Ignore non-mesh STAs. */ if ((scan.capinfo & (IEEE80211_CAPINFO_ESS|IEEE80211_CAPINFO_IBSS)) || scan.meshid == NULL || scan.meshconf == NULL) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, "beacon", "%s", "not a mesh sta"); vap->iv_stats.is_mesh_wrongmesh++; return; } /* * Ignore STAs for other mesh networks. */ if (memcmp(scan.meshid+2, ms->ms_id, ms->ms_idlen) != 0 || mesh_verify_meshconf(vap, scan.meshconf)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, "beacon", "%s", "not for our mesh"); vap->iv_stats.is_mesh_wrongmesh++; return; } /* * Peer only based on the current ACL policy. */ if (vap->iv_acl != NULL && !vap->iv_acl->iac_check(vap, wh)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_ACL, wh, NULL, "%s", "disallowed by ACL"); vap->iv_stats.is_rx_acl++; return; } /* * Do neighbor discovery. */ if (!IEEE80211_ADDR_EQ(wh->i_addr2, ni->ni_macaddr)) { /* * Create a new entry in the neighbor table. */ ni = ieee80211_add_neighbor(vap, wh, &scan); } /* * Automatically peer with discovered nodes if possible. */ if (ni != vap->iv_bss && (ms->ms_flags & IEEE80211_MESHFLAGS_AP)) { switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_IDLE: { uint16_t args[1]; /* Wait for backoff callout to reset counter */ if (ni->ni_mlhcnt >= ieee80211_mesh_maxholding) return; ni->ni_mlpid = mesh_generateid(vap); if (ni->ni_mlpid == 0) return; mesh_linkchange(ni, IEEE80211_NODE_MESH_OPENSNT); args[0] = ni->ni_mlpid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_OPEN, args); ni->ni_mlrcnt = 0; mesh_peer_timeout_setup(ni); break; } case IEEE80211_NODE_MESH_ESTABLISHED: { /* * Valid beacon from a peer mesh STA * bump TA lifetime */ rt = ieee80211_mesh_rt_find(vap, wh->i_addr2); if(rt != NULL) { ieee80211_mesh_rt_update(rt, ticks_to_msecs( ms->ms_ppath->mpp_inact)); } break; } default: break; /* ignore */ } } break; } case IEEE80211_FC0_SUBTYPE_PROBE_REQ: { uint8_t *ssid, *meshid, *rates, *xrates; if (vap->iv_state != IEEE80211_S_RUN) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "wrong state %s", ieee80211_state_name[vap->iv_state]); vap->iv_stats.is_rx_mgtdiscard++; return; } if (IEEE80211_IS_MULTICAST(wh->i_addr2)) { /* frame must be directed */ IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "%s", "not unicast"); vap->iv_stats.is_rx_mgtdiscard++; /* XXX stat */ return; } /* * prreq frame format * [tlv] ssid * [tlv] supported rates * [tlv] extended supported rates * [tlv] mesh id */ ssid = meshid = rates = xrates = NULL; while (efrm - frm > 1) { IEEE80211_VERIFY_LENGTH(efrm - frm, frm[1] + 2, return); switch (*frm) { case IEEE80211_ELEMID_SSID: ssid = frm; break; case IEEE80211_ELEMID_RATES: rates = frm; break; case IEEE80211_ELEMID_XRATES: xrates = frm; break; case IEEE80211_ELEMID_MESHID: meshid = frm; break; } frm += frm[1] + 2; } IEEE80211_VERIFY_ELEMENT(ssid, IEEE80211_NWID_LEN, return); IEEE80211_VERIFY_ELEMENT(rates, IEEE80211_RATE_MAXSIZE, return); if (xrates != NULL) IEEE80211_VERIFY_ELEMENT(xrates, IEEE80211_RATE_MAXSIZE - rates[1], return); if (meshid != NULL) { IEEE80211_VERIFY_ELEMENT(meshid, IEEE80211_MESHID_LEN, return); /* NB: meshid, not ssid */ IEEE80211_VERIFY_SSID(vap->iv_bss, meshid, return); } /* XXX find a better class or define it's own */ IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_INPUT, wh->i_addr2, "%s", "recv probe req"); /* * Some legacy 11b clients cannot hack a complete * probe response frame. When the request includes * only a bare-bones rate set, communicate this to * the transmit side. */ ieee80211_send_proberesp(vap, wh->i_addr2, 0); break; } case IEEE80211_FC0_SUBTYPE_ACTION: case IEEE80211_FC0_SUBTYPE_ACTION_NOACK: if (ni == vap->iv_bss) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "%s", "unknown node"); vap->iv_stats.is_rx_mgtdiscard++; } else if (!IEEE80211_ADDR_EQ(vap->iv_myaddr, wh->i_addr1) && !IEEE80211_IS_MULTICAST(wh->i_addr1)) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "%s", "not for us"); vap->iv_stats.is_rx_mgtdiscard++; } else if (vap->iv_state != IEEE80211_S_RUN) { IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "wrong state %s", ieee80211_state_name[vap->iv_state]); vap->iv_stats.is_rx_mgtdiscard++; } else { if (ieee80211_parse_action(ni, m0) == 0) (void)ic->ic_recv_action(ni, wh, frm, efrm); } break; case IEEE80211_FC0_SUBTYPE_ASSOC_REQ: case IEEE80211_FC0_SUBTYPE_ASSOC_RESP: case IEEE80211_FC0_SUBTYPE_REASSOC_REQ: case IEEE80211_FC0_SUBTYPE_REASSOC_RESP: case IEEE80211_FC0_SUBTYPE_TIMING_ADV: case IEEE80211_FC0_SUBTYPE_ATIM: case IEEE80211_FC0_SUBTYPE_DISASSOC: case IEEE80211_FC0_SUBTYPE_AUTH: case IEEE80211_FC0_SUBTYPE_DEAUTH: IEEE80211_DISCARD(vap, IEEE80211_MSG_INPUT, wh, NULL, "%s", "not handled"); vap->iv_stats.is_rx_mgtdiscard++; break; default: IEEE80211_DISCARD(vap, IEEE80211_MSG_ANY, wh, "mgt", "subtype 0x%x not handled", subtype); vap->iv_stats.is_rx_badsubtype++; break; } } static void mesh_recv_ctl(struct ieee80211_node *ni, struct mbuf *m, int subtype) { switch (subtype) { case IEEE80211_FC0_SUBTYPE_BAR: ieee80211_recv_bar(ni, m); break; } } /* * Parse meshpeering action ie's for MPM frames */ static const struct ieee80211_meshpeer_ie * mesh_parse_meshpeering_action(struct ieee80211_node *ni, const struct ieee80211_frame *wh, /* XXX for VERIFY_LENGTH */ const uint8_t *frm, const uint8_t *efrm, struct ieee80211_meshpeer_ie *mp, uint8_t subtype) { struct ieee80211vap *vap = ni->ni_vap; const struct ieee80211_meshpeer_ie *mpie; uint16_t args[3]; const uint8_t *meshid, *meshconf; uint8_t sendclose = 0; /* 1 = MPM frame rejected, close will be sent */ meshid = meshconf = NULL; while (efrm - frm > 1) { IEEE80211_VERIFY_LENGTH(efrm - frm, frm[1] + 2, return NULL); switch (*frm) { case IEEE80211_ELEMID_MESHID: meshid = frm; break; case IEEE80211_ELEMID_MESHCONF: meshconf = frm; break; case IEEE80211_ELEMID_MESHPEER: mpie = (const struct ieee80211_meshpeer_ie *) frm; memset(mp, 0, sizeof(*mp)); mp->peer_len = mpie->peer_len; mp->peer_proto = le16dec(&mpie->peer_proto); mp->peer_llinkid = le16dec(&mpie->peer_llinkid); switch (subtype) { case IEEE80211_ACTION_MESHPEERING_CONFIRM: mp->peer_linkid = le16dec(&mpie->peer_linkid); break; case IEEE80211_ACTION_MESHPEERING_CLOSE: /* NB: peer link ID is optional */ if (mpie->peer_len == (IEEE80211_MPM_BASE_SZ + 2)) { mp->peer_linkid = 0; mp->peer_rcode = le16dec(&mpie->peer_linkid); } else { mp->peer_linkid = le16dec(&mpie->peer_linkid); mp->peer_rcode = le16dec(&mpie->peer_rcode); } break; } break; } frm += frm[1] + 2; } /* * Verify the contents of the frame. * If it fails validation, close the peer link. */ if (mesh_verify_meshpeer(vap, subtype, (const uint8_t *)mp)) { sendclose = 1; IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, wh, NULL, "%s", "MPM validation failed"); } /* If meshid is not the same reject any frames type. */ if (sendclose == 0 && mesh_verify_meshid(vap, meshid)) { sendclose = 1; IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, wh, NULL, "%s", "not for our mesh"); if (subtype == IEEE80211_ACTION_MESHPEERING_CLOSE) { /* * Standard not clear about this, if we dont ignore * there will be an endless loop between nodes sending * CLOSE frames between each other with wrong meshid. * Discard and timers will bring FSM to IDLE state. */ return NULL; } } /* * Close frames are accepted if meshid is the same. * Verify the other two types. */ if (sendclose == 0 && subtype != IEEE80211_ACTION_MESHPEERING_CLOSE && mesh_verify_meshconf(vap, meshconf)) { sendclose = 1; IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, wh, NULL, "%s", "configuration missmatch"); } if (sendclose) { vap->iv_stats.is_rx_mgtdiscard++; switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_IDLE: case IEEE80211_NODE_MESH_ESTABLISHED: case IEEE80211_NODE_MESH_HOLDING: /* ignore */ break; case IEEE80211_NODE_MESH_OPENSNT: case IEEE80211_NODE_MESH_OPENRCV: case IEEE80211_NODE_MESH_CONFIRMRCV: args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; /* Reason codes for rejection */ switch (subtype) { case IEEE80211_ACTION_MESHPEERING_OPEN: args[2] = IEEE80211_REASON_MESH_CPVIOLATION; break; case IEEE80211_ACTION_MESHPEERING_CONFIRM: args[2] = IEEE80211_REASON_MESH_INCONS_PARAMS; break; } ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); break; } return NULL; } return (const struct ieee80211_meshpeer_ie *) mp; } static int mesh_recv_action_meshpeering_open(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_meshpeer_ie ie; const struct ieee80211_meshpeer_ie *meshpeer; uint16_t args[3]; /* +2+2 for action + code + capabilites */ meshpeer = mesh_parse_meshpeering_action(ni, wh, frm+2+2, efrm, &ie, IEEE80211_ACTION_MESHPEERING_OPEN); if (meshpeer == NULL) { return 0; } /* XXX move up */ IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "recv PEER OPEN, lid 0x%x", meshpeer->peer_llinkid); switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_IDLE: /* Reject open request if reached our maximum neighbor count */ if (ms->ms_neighbors >= IEEE80211_MESH_MAX_NEIGHBORS) { args[0] = meshpeer->peer_llinkid; args[1] = 0; args[2] = IEEE80211_REASON_MESH_MAX_PEERS; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); /* stay in IDLE state */ return (0); } /* Open frame accepted */ mesh_linkchange(ni, IEEE80211_NODE_MESH_OPENRCV); ni->ni_mllid = meshpeer->peer_llinkid; ni->ni_mlpid = mesh_generateid(vap); if (ni->ni_mlpid == 0) return 0; /* XXX */ args[0] = ni->ni_mlpid; /* Announce we're open too... */ ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_OPEN, args); /* ...and confirm the link. */ args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, args); mesh_peer_timeout_setup(ni); break; case IEEE80211_NODE_MESH_OPENRCV: /* Wrong Link ID */ if (ni->ni_mllid != meshpeer->peer_llinkid) { args[0] = ni->ni_mllid; args[1] = ni->ni_mlpid; args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); break; } /* Duplicate open, confirm again. */ args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, args); break; case IEEE80211_NODE_MESH_OPENSNT: ni->ni_mllid = meshpeer->peer_llinkid; mesh_linkchange(ni, IEEE80211_NODE_MESH_OPENRCV); args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, args); /* NB: don't setup/clear any timeout */ break; case IEEE80211_NODE_MESH_CONFIRMRCV: if (ni->ni_mlpid != meshpeer->peer_linkid || ni->ni_mllid != meshpeer->peer_llinkid) { args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); break; } mesh_linkchange(ni, IEEE80211_NODE_MESH_ESTABLISHED); ni->ni_mllid = meshpeer->peer_llinkid; args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, args); mesh_peer_timeout_stop(ni); break; case IEEE80211_NODE_MESH_ESTABLISHED: if (ni->ni_mllid != meshpeer->peer_llinkid) { args[0] = ni->ni_mllid; args[1] = ni->ni_mlpid; args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); break; } args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CONFIRM, args); break; case IEEE80211_NODE_MESH_HOLDING: args[0] = ni->ni_mlpid; args[1] = meshpeer->peer_llinkid; /* Standard not clear about what the reaason code should be */ args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); break; } return 0; } static int mesh_recv_action_meshpeering_confirm(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_meshpeer_ie ie; const struct ieee80211_meshpeer_ie *meshpeer; uint16_t args[3]; /* +2+2+2+2 for action + code + capabilites + status code + AID */ meshpeer = mesh_parse_meshpeering_action(ni, wh, frm+2+2+2+2, efrm, &ie, IEEE80211_ACTION_MESHPEERING_CONFIRM); if (meshpeer == NULL) { return 0; } IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "recv PEER CONFIRM, local id 0x%x, peer id 0x%x", meshpeer->peer_llinkid, meshpeer->peer_linkid); switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_OPENRCV: mesh_linkchange(ni, IEEE80211_NODE_MESH_ESTABLISHED); mesh_peer_timeout_stop(ni); break; case IEEE80211_NODE_MESH_OPENSNT: mesh_linkchange(ni, IEEE80211_NODE_MESH_CONFIRMRCV); mesh_peer_timeout_setup(ni); break; case IEEE80211_NODE_MESH_HOLDING: args[0] = ni->ni_mlpid; args[1] = meshpeer->peer_llinkid; /* Standard not clear about what the reaason code should be */ args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); break; case IEEE80211_NODE_MESH_CONFIRMRCV: if (ni->ni_mllid != meshpeer->peer_llinkid) { args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; args[2] = IEEE80211_REASON_PEER_LINK_CANCELED; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); } break; default: IEEE80211_DISCARD(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, wh, NULL, "received confirm in invalid state %d", ni->ni_mlstate); vap->iv_stats.is_rx_mgtdiscard++; break; } return 0; } static int mesh_recv_action_meshpeering_close(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211_meshpeer_ie ie; const struct ieee80211_meshpeer_ie *meshpeer; uint16_t args[3]; /* +2 for action + code */ meshpeer = mesh_parse_meshpeering_action(ni, wh, frm+2, efrm, &ie, IEEE80211_ACTION_MESHPEERING_CLOSE); if (meshpeer == NULL) { return 0; } /* * XXX: check reason code, for example we could receive * IEEE80211_REASON_MESH_MAX_PEERS then we should not attempt * to peer again. */ IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "%s", "recv PEER CLOSE"); switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_IDLE: /* ignore */ break; case IEEE80211_NODE_MESH_OPENRCV: case IEEE80211_NODE_MESH_OPENSNT: case IEEE80211_NODE_MESH_CONFIRMRCV: case IEEE80211_NODE_MESH_ESTABLISHED: args[0] = ni->ni_mlpid; args[1] = ni->ni_mllid; args[2] = IEEE80211_REASON_MESH_CLOSE_RCVD; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); break; case IEEE80211_NODE_MESH_HOLDING: mesh_linkchange(ni, IEEE80211_NODE_MESH_IDLE); mesh_peer_timeout_stop(ni); break; } return 0; } /* * Link Metric handling. */ static int mesh_recv_action_meshlmetric(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { const struct ieee80211_meshlmetric_ie *ie = (const struct ieee80211_meshlmetric_ie *) (frm+2); /* action + code */ struct ieee80211_meshlmetric_ie lm_rep; if (ie->lm_flags & IEEE80211_MESH_LMETRIC_FLAGS_REQ) { lm_rep.lm_flags = 0; lm_rep.lm_metric = mesh_airtime_calc(ni); ieee80211_send_action(ni, IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_LMETRIC, &lm_rep); } /* XXX: else do nothing for now */ return 0; } /* * Parse meshgate action ie's for GANN frames. * Returns -1 if parsing fails, otherwise 0. */ static int mesh_parse_meshgate_action(struct ieee80211_node *ni, const struct ieee80211_frame *wh, /* XXX for VERIFY_LENGTH */ struct ieee80211_meshgann_ie *ie, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211vap *vap = ni->ni_vap; const struct ieee80211_meshgann_ie *gannie; while (efrm - frm > 1) { IEEE80211_VERIFY_LENGTH(efrm - frm, frm[1] + 2, return -1); switch (*frm) { case IEEE80211_ELEMID_MESHGANN: gannie = (const struct ieee80211_meshgann_ie *) frm; memset(ie, 0, sizeof(*ie)); ie->gann_ie = gannie->gann_ie; ie->gann_len = gannie->gann_len; ie->gann_flags = gannie->gann_flags; ie->gann_hopcount = gannie->gann_hopcount; ie->gann_ttl = gannie->gann_ttl; IEEE80211_ADDR_COPY(ie->gann_addr, gannie->gann_addr); ie->gann_seq = le32dec(&gannie->gann_seq); ie->gann_interval = le16dec(&gannie->gann_interval); break; } frm += frm[1] + 2; } return 0; } /* * Mesh Gate Announcement handling. */ static int mesh_recv_action_meshgate(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const uint8_t *frm, const uint8_t *efrm) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_mesh_state *ms = vap->iv_mesh; struct ieee80211_mesh_gate_route *gr, *next; struct ieee80211_mesh_route *rt_gate; struct ieee80211_meshgann_ie pgann; struct ieee80211_meshgann_ie ie; int found = 0; /* +2 for action + code */ if (mesh_parse_meshgate_action(ni, wh, &ie, frm+2, efrm) != 0) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_MESH, ni->ni_macaddr, NULL, "%s", "GANN parsing failed"); vap->iv_stats.is_rx_mgtdiscard++; return (0); } if (IEEE80211_ADDR_EQ(vap->iv_myaddr, ie.gann_addr)) return 0; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, ni->ni_macaddr, "received GANN, meshgate: %6D (seq %u)", ie.gann_addr, ":", ie.gann_seq); if (ms == NULL) return (0); MESH_RT_LOCK(ms); TAILQ_FOREACH_SAFE(gr, &ms->ms_known_gates, gr_next, next) { if (!IEEE80211_ADDR_EQ(gr->gr_addr, ie.gann_addr)) continue; if (ie.gann_seq <= gr->gr_lastseq) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_MESH, ni->ni_macaddr, NULL, "GANN old seqno %u <= %u", ie.gann_seq, gr->gr_lastseq); MESH_RT_UNLOCK(ms); return (0); } /* corresponding mesh gate found & GANN accepted */ found = 1; break; } if (found == 0) { /* this GANN is from a new mesh Gate add it to known table. */ IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, ie.gann_addr, "stored new GANN information, seq %u.", ie.gann_seq); gr = IEEE80211_MALLOC(ALIGN(sizeof(struct ieee80211_mesh_gate_route)), M_80211_MESH_GT_RT, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); IEEE80211_ADDR_COPY(gr->gr_addr, ie.gann_addr); TAILQ_INSERT_TAIL(&ms->ms_known_gates, gr, gr_next); } gr->gr_lastseq = ie.gann_seq; /* check if we have a path to this gate */ rt_gate = mesh_rt_find_locked(ms, gr->gr_addr); if (rt_gate != NULL && rt_gate->rt_flags & IEEE80211_MESHRT_FLAGS_VALID) { gr->gr_route = rt_gate; rt_gate->rt_flags |= IEEE80211_MESHRT_FLAGS_GATE; } MESH_RT_UNLOCK(ms); /* popagate only if decremented ttl >= 1 && forwarding is enabled */ if ((ie.gann_ttl - 1) < 1 && !(ms->ms_flags & IEEE80211_MESHFLAGS_FWD)) return 0; pgann.gann_flags = ie.gann_flags; /* Reserved */ pgann.gann_hopcount = ie.gann_hopcount + 1; pgann.gann_ttl = ie.gann_ttl - 1; IEEE80211_ADDR_COPY(pgann.gann_addr, ie.gann_addr); pgann.gann_seq = ie.gann_seq; pgann.gann_interval = ie.gann_interval; IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_MESH, ie.gann_addr, "%s", "propagate GANN"); ieee80211_send_action(vap->iv_bss, IEEE80211_ACTION_CAT_MESH, IEEE80211_ACTION_MESH_GANN, &pgann); return 0; } static int mesh_send_action(struct ieee80211_node *ni, const uint8_t sa[IEEE80211_ADDR_LEN], const uint8_t da[IEEE80211_ADDR_LEN], struct mbuf *m) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct ieee80211_bpf_params params; int ret; KASSERT(ni != NULL, ("null node")); if (vap->iv_state == IEEE80211_S_CAC) { IEEE80211_NOTE(vap, IEEE80211_MSG_OUTPUT, ni, "block %s frame in CAC state", "Mesh action"); vap->iv_stats.is_tx_badstate++; ieee80211_free_node(ni); m_freem(m); return EIO; /* XXX */ } M_PREPEND(m, sizeof(struct ieee80211_frame), M_NOWAIT); if (m == NULL) { ieee80211_free_node(ni); return ENOMEM; } IEEE80211_TX_LOCK(ic); ieee80211_send_setup(ni, m, IEEE80211_FC0_TYPE_MGT | IEEE80211_FC0_SUBTYPE_ACTION, IEEE80211_NONQOS_TID, sa, da, sa); m->m_flags |= M_ENCAP; /* mark encapsulated */ memset(¶ms, 0, sizeof(params)); params.ibp_pri = WME_AC_VO; params.ibp_rate0 = ni->ni_txparms->mgmtrate; if (IEEE80211_IS_MULTICAST(da)) params.ibp_try0 = 1; else params.ibp_try0 = ni->ni_txparms->maxretry; params.ibp_power = ni->ni_txpower; IEEE80211_NODE_STAT(ni, tx_mgmt); ret = ieee80211_raw_output(vap, ni, m, ¶ms); IEEE80211_TX_UNLOCK(ic); return (ret); } #define ADDSHORT(frm, v) do { \ frm[0] = (v) & 0xff; \ frm[1] = (v) >> 8; \ frm += 2; \ } while (0) #define ADDWORD(frm, v) do { \ frm[0] = (v) & 0xff; \ frm[1] = ((v) >> 8) & 0xff; \ frm[2] = ((v) >> 16) & 0xff; \ frm[3] = ((v) >> 24) & 0xff; \ frm += 4; \ } while (0) static int mesh_send_action_meshpeering_open(struct ieee80211_node *ni, int category, int action, void *args0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; uint16_t *args = args0; const struct ieee80211_rateset *rs; struct mbuf *m; uint8_t *frm; IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "send PEER OPEN action: localid 0x%x", args[0]); IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) /* action+category */ + sizeof(uint16_t) /* capabilites */ + 2 + IEEE80211_RATE_SIZE + 2 + (IEEE80211_RATE_MAXSIZE - IEEE80211_RATE_SIZE) + 2 + IEEE80211_MESHID_LEN + sizeof(struct ieee80211_meshconf_ie) + sizeof(struct ieee80211_meshpeer_ie) ); if (m != NULL) { /* * mesh peer open action frame format: * [1] category * [1] action * [2] capabilities * [tlv] rates * [tlv] xrates * [tlv] mesh id * [tlv] mesh conf * [tlv] mesh peer link mgmt */ *frm++ = category; *frm++ = action; ADDSHORT(frm, ieee80211_getcapinfo(vap, ni->ni_chan)); rs = ieee80211_get_suprates(ic, ic->ic_curchan); frm = ieee80211_add_rates(frm, rs); frm = ieee80211_add_xrates(frm, rs); frm = ieee80211_add_meshid(frm, vap); frm = ieee80211_add_meshconf(frm, vap); frm = ieee80211_add_meshpeer(frm, IEEE80211_ACTION_MESHPEERING_OPEN, args[0], 0, 0); m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return mesh_send_action(ni, vap->iv_myaddr, ni->ni_macaddr, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static int mesh_send_action_meshpeering_confirm(struct ieee80211_node *ni, int category, int action, void *args0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; uint16_t *args = args0; const struct ieee80211_rateset *rs; struct mbuf *m; uint8_t *frm; IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "send PEER CONFIRM action: localid 0x%x, peerid 0x%x", args[0], args[1]); IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) /* action+category */ + sizeof(uint16_t) /* capabilites */ + sizeof(uint16_t) /* status code */ + sizeof(uint16_t) /* AID */ + 2 + IEEE80211_RATE_SIZE + 2 + (IEEE80211_RATE_MAXSIZE - IEEE80211_RATE_SIZE) + 2 + IEEE80211_MESHID_LEN + sizeof(struct ieee80211_meshconf_ie) + sizeof(struct ieee80211_meshpeer_ie) ); if (m != NULL) { /* * mesh peer confirm action frame format: * [1] category * [1] action * [2] capabilities * [2] status code * [2] association id (peer ID) * [tlv] rates * [tlv] xrates * [tlv] mesh id * [tlv] mesh conf * [tlv] mesh peer link mgmt */ *frm++ = category; *frm++ = action; ADDSHORT(frm, ieee80211_getcapinfo(vap, ni->ni_chan)); ADDSHORT(frm, 0); /* status code */ ADDSHORT(frm, args[1]); /* AID */ rs = ieee80211_get_suprates(ic, ic->ic_curchan); frm = ieee80211_add_rates(frm, rs); frm = ieee80211_add_xrates(frm, rs); frm = ieee80211_add_meshid(frm, vap); frm = ieee80211_add_meshconf(frm, vap); frm = ieee80211_add_meshpeer(frm, IEEE80211_ACTION_MESHPEERING_CONFIRM, args[0], args[1], 0); m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return mesh_send_action(ni, vap->iv_myaddr, ni->ni_macaddr, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static int mesh_send_action_meshpeering_close(struct ieee80211_node *ni, int category, int action, void *args0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; uint16_t *args = args0; struct mbuf *m; uint8_t *frm; IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "send PEER CLOSE action: localid 0x%x, peerid 0x%x reason %d (%s)", args[0], args[1], args[2], ieee80211_reason_to_string(args[2])); IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) /* action+category */ + sizeof(uint16_t) /* reason code */ + 2 + IEEE80211_MESHID_LEN + sizeof(struct ieee80211_meshpeer_ie) ); if (m != NULL) { /* * mesh peer close action frame format: * [1] category * [1] action * [tlv] mesh id * [tlv] mesh peer link mgmt */ *frm++ = category; *frm++ = action; frm = ieee80211_add_meshid(frm, vap); frm = ieee80211_add_meshpeer(frm, IEEE80211_ACTION_MESHPEERING_CLOSE, args[0], args[1], args[2]); m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return mesh_send_action(ni, vap->iv_myaddr, ni->ni_macaddr, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static int mesh_send_action_meshlmetric(struct ieee80211_node *ni, int category, int action, void *arg0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct ieee80211_meshlmetric_ie *ie = arg0; struct mbuf *m; uint8_t *frm; if (ie->lm_flags & IEEE80211_MESH_LMETRIC_FLAGS_REQ) { IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "%s", "send LINK METRIC REQUEST action"); } else { IEEE80211_NOTE(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, ni, "send LINK METRIC REPLY action: metric 0x%x", ie->lm_metric); } IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) + /* action+category */ sizeof(struct ieee80211_meshlmetric_ie) ); if (m != NULL) { /* * mesh link metric * [1] category * [1] action * [tlv] mesh link metric */ *frm++ = category; *frm++ = action; frm = ieee80211_add_meshlmetric(frm, ie->lm_flags, ie->lm_metric); m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return mesh_send_action(ni, vap->iv_myaddr, ni->ni_macaddr, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static int mesh_send_action_meshgate(struct ieee80211_node *ni, int category, int action, void *arg0) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct ieee80211_meshgann_ie *ie = arg0; struct mbuf *m; uint8_t *frm; IEEE80211_DPRINTF(vap, IEEE80211_MSG_NODE, "ieee80211_ref_node (%s:%u) %p<%s> refcnt %d\n", __func__, __LINE__, ni, ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni)+1); ieee80211_ref_node(ni); m = ieee80211_getmgtframe(&frm, ic->ic_headroom + sizeof(struct ieee80211_frame), sizeof(uint16_t) + /* action+category */ IEEE80211_MESHGANN_BASE_SZ ); if (m != NULL) { /* * mesh link metric * [1] category * [1] action * [tlv] mesh gate annoucement */ *frm++ = category; *frm++ = action; frm = ieee80211_add_meshgate(frm, ie); m->m_pkthdr.len = m->m_len = frm - mtod(m, uint8_t *); return mesh_send_action(ni, vap->iv_myaddr, broadcastaddr, m); } else { vap->iv_stats.is_tx_nobuf++; ieee80211_free_node(ni); return ENOMEM; } } static void mesh_peer_timeout_setup(struct ieee80211_node *ni) { switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_HOLDING: ni->ni_mltval = ieee80211_mesh_holdingtimeout; break; case IEEE80211_NODE_MESH_CONFIRMRCV: ni->ni_mltval = ieee80211_mesh_confirmtimeout; break; case IEEE80211_NODE_MESH_IDLE: ni->ni_mltval = 0; break; default: ni->ni_mltval = ieee80211_mesh_retrytimeout; break; } if (ni->ni_mltval) callout_reset(&ni->ni_mltimer, ni->ni_mltval, mesh_peer_timeout_cb, ni); } /* * Same as above but backoffs timer statisically 50%. */ static void mesh_peer_timeout_backoff(struct ieee80211_node *ni) { uint32_t r; r = arc4random(); ni->ni_mltval += r % ni->ni_mltval; callout_reset(&ni->ni_mltimer, ni->ni_mltval, mesh_peer_timeout_cb, ni); } static __inline void mesh_peer_timeout_stop(struct ieee80211_node *ni) { callout_drain(&ni->ni_mltimer); } static void mesh_peer_backoff_cb(void *arg) { struct ieee80211_node *ni = (struct ieee80211_node *)arg; /* After backoff timeout, try to peer automatically again. */ ni->ni_mlhcnt = 0; } /* * Mesh Peer Link Management FSM timeout handling. */ static void mesh_peer_timeout_cb(void *arg) { struct ieee80211_node *ni = (struct ieee80211_node *)arg; uint16_t args[3]; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_MESH, ni, "mesh link timeout, state %d, retry counter %d", ni->ni_mlstate, ni->ni_mlrcnt); switch (ni->ni_mlstate) { case IEEE80211_NODE_MESH_IDLE: case IEEE80211_NODE_MESH_ESTABLISHED: break; case IEEE80211_NODE_MESH_OPENSNT: case IEEE80211_NODE_MESH_OPENRCV: if (ni->ni_mlrcnt == ieee80211_mesh_maxretries) { args[0] = ni->ni_mlpid; args[2] = IEEE80211_REASON_MESH_MAX_RETRIES; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); ni->ni_mlrcnt = 0; mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); } else { args[0] = ni->ni_mlpid; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_OPEN, args); ni->ni_mlrcnt++; mesh_peer_timeout_backoff(ni); } break; case IEEE80211_NODE_MESH_CONFIRMRCV: args[0] = ni->ni_mlpid; args[2] = IEEE80211_REASON_MESH_CONFIRM_TIMEOUT; ieee80211_send_action(ni, IEEE80211_ACTION_CAT_SELF_PROT, IEEE80211_ACTION_MESHPEERING_CLOSE, args); mesh_linkchange(ni, IEEE80211_NODE_MESH_HOLDING); mesh_peer_timeout_setup(ni); break; case IEEE80211_NODE_MESH_HOLDING: ni->ni_mlhcnt++; if (ni->ni_mlhcnt >= ieee80211_mesh_maxholding) callout_reset(&ni->ni_mlhtimer, ieee80211_mesh_backofftimeout, mesh_peer_backoff_cb, ni); mesh_linkchange(ni, IEEE80211_NODE_MESH_IDLE); break; } } static int mesh_verify_meshid(struct ieee80211vap *vap, const uint8_t *ie) { struct ieee80211_mesh_state *ms = vap->iv_mesh; if (ie == NULL || ie[1] != ms->ms_idlen) return 1; return memcmp(ms->ms_id, ie + 2, ms->ms_idlen); } /* * Check if we are using the same algorithms for this mesh. */ static int mesh_verify_meshconf(struct ieee80211vap *vap, const uint8_t *ie) { const struct ieee80211_meshconf_ie *meshconf = (const struct ieee80211_meshconf_ie *) ie; const struct ieee80211_mesh_state *ms = vap->iv_mesh; if (meshconf == NULL) return 1; if (meshconf->conf_pselid != ms->ms_ppath->mpp_ie) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_MESH, "unknown path selection algorithm: 0x%x\n", meshconf->conf_pselid); return 1; } if (meshconf->conf_pmetid != ms->ms_pmetric->mpm_ie) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_MESH, "unknown path metric algorithm: 0x%x\n", meshconf->conf_pmetid); return 1; } if (meshconf->conf_ccid != 0) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_MESH, "unknown congestion control algorithm: 0x%x\n", meshconf->conf_ccid); return 1; } if (meshconf->conf_syncid != IEEE80211_MESHCONF_SYNC_NEIGHOFF) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_MESH, "unknown sync algorithm: 0x%x\n", meshconf->conf_syncid); return 1; } if (meshconf->conf_authid != 0) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_MESH, "unknown auth auth algorithm: 0x%x\n", meshconf->conf_pselid); return 1; } /* Not accepting peers */ if (!(meshconf->conf_cap & IEEE80211_MESHCONF_CAP_AP)) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_MESH, "not accepting peers: 0x%x\n", meshconf->conf_cap); return 1; } return 0; } static int mesh_verify_meshpeer(struct ieee80211vap *vap, uint8_t subtype, const uint8_t *ie) { const struct ieee80211_meshpeer_ie *meshpeer = (const struct ieee80211_meshpeer_ie *) ie; if (meshpeer == NULL || meshpeer->peer_len < IEEE80211_MPM_BASE_SZ || meshpeer->peer_len > IEEE80211_MPM_MAX_SZ) return 1; if (meshpeer->peer_proto != IEEE80211_MPPID_MPM) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_ACTION | IEEE80211_MSG_MESH, "Only MPM protocol is supported (proto: 0x%02X)", meshpeer->peer_proto); return 1; } switch (subtype) { case IEEE80211_ACTION_MESHPEERING_OPEN: if (meshpeer->peer_len != IEEE80211_MPM_BASE_SZ) return 1; break; case IEEE80211_ACTION_MESHPEERING_CONFIRM: if (meshpeer->peer_len != IEEE80211_MPM_BASE_SZ + 2) return 1; break; case IEEE80211_ACTION_MESHPEERING_CLOSE: if (meshpeer->peer_len < IEEE80211_MPM_BASE_SZ + 2) return 1; if (meshpeer->peer_len == (IEEE80211_MPM_BASE_SZ + 2) && meshpeer->peer_linkid != 0) return 1; if (meshpeer->peer_rcode == 0) return 1; break; } return 0; } /* * Add a Mesh ID IE to a frame. */ uint8_t * ieee80211_add_meshid(uint8_t *frm, struct ieee80211vap *vap) { struct ieee80211_mesh_state *ms = vap->iv_mesh; KASSERT(vap->iv_opmode == IEEE80211_M_MBSS, ("not a mbss vap")); *frm++ = IEEE80211_ELEMID_MESHID; *frm++ = ms->ms_idlen; memcpy(frm, ms->ms_id, ms->ms_idlen); return frm + ms->ms_idlen; } /* * Add a Mesh Configuration IE to a frame. * For now just use HWMP routing, Airtime link metric, Null Congestion * Signaling, Null Sync Protocol and Null Authentication. */ uint8_t * ieee80211_add_meshconf(uint8_t *frm, struct ieee80211vap *vap) { const struct ieee80211_mesh_state *ms = vap->iv_mesh; uint16_t caps; KASSERT(vap->iv_opmode == IEEE80211_M_MBSS, ("not a MBSS vap")); *frm++ = IEEE80211_ELEMID_MESHCONF; *frm++ = IEEE80211_MESH_CONF_SZ; *frm++ = ms->ms_ppath->mpp_ie; /* path selection */ *frm++ = ms->ms_pmetric->mpm_ie; /* link metric */ *frm++ = IEEE80211_MESHCONF_CC_DISABLED; *frm++ = IEEE80211_MESHCONF_SYNC_NEIGHOFF; *frm++ = IEEE80211_MESHCONF_AUTH_DISABLED; /* NB: set the number of neighbors before the rest */ *frm = (ms->ms_neighbors > IEEE80211_MESH_MAX_NEIGHBORS ? IEEE80211_MESH_MAX_NEIGHBORS : ms->ms_neighbors) << 1; if (ms->ms_flags & IEEE80211_MESHFLAGS_GATE) *frm |= IEEE80211_MESHCONF_FORM_GATE; frm += 1; caps = 0; if (ms->ms_flags & IEEE80211_MESHFLAGS_AP) caps |= IEEE80211_MESHCONF_CAP_AP; if (ms->ms_flags & IEEE80211_MESHFLAGS_FWD) caps |= IEEE80211_MESHCONF_CAP_FWRD; *frm++ = caps; return frm; } /* * Add a Mesh Peer Management IE to a frame. */ uint8_t * ieee80211_add_meshpeer(uint8_t *frm, uint8_t subtype, uint16_t localid, uint16_t peerid, uint16_t reason) { KASSERT(localid != 0, ("localid == 0")); *frm++ = IEEE80211_ELEMID_MESHPEER; switch (subtype) { case IEEE80211_ACTION_MESHPEERING_OPEN: *frm++ = IEEE80211_MPM_BASE_SZ; /* length */ ADDSHORT(frm, IEEE80211_MPPID_MPM); /* proto */ ADDSHORT(frm, localid); /* local ID */ break; case IEEE80211_ACTION_MESHPEERING_CONFIRM: KASSERT(peerid != 0, ("sending peer confirm without peer id")); *frm++ = IEEE80211_MPM_BASE_SZ + 2; /* length */ ADDSHORT(frm, IEEE80211_MPPID_MPM); /* proto */ ADDSHORT(frm, localid); /* local ID */ ADDSHORT(frm, peerid); /* peer ID */ break; case IEEE80211_ACTION_MESHPEERING_CLOSE: if (peerid) *frm++ = IEEE80211_MPM_MAX_SZ; /* length */ else *frm++ = IEEE80211_MPM_BASE_SZ + 2; /* length */ ADDSHORT(frm, IEEE80211_MPPID_MPM); /* proto */ ADDSHORT(frm, localid); /* local ID */ if (peerid) ADDSHORT(frm, peerid); /* peer ID */ ADDSHORT(frm, reason); break; } return frm; } /* * Compute an Airtime Link Metric for the link with this node. * * Based on Draft 3.0 spec (11B.10, p.149). */ /* * Max 802.11s overhead. */ #define IEEE80211_MESH_MAXOVERHEAD \ (sizeof(struct ieee80211_qosframe_addr4) \ + sizeof(struct ieee80211_meshcntl_ae10) \ + sizeof(struct llc) \ + IEEE80211_ADDR_LEN \ + IEEE80211_WEP_IVLEN \ + IEEE80211_WEP_KIDLEN \ + IEEE80211_WEP_CRCLEN \ + IEEE80211_WEP_MICLEN \ + IEEE80211_CRC_LEN) uint32_t mesh_airtime_calc(struct ieee80211_node *ni) { #define M_BITS 8 #define S_FACTOR (2 * M_BITS) struct ieee80211com *ic = ni->ni_ic; struct ifnet *ifp = ni->ni_vap->iv_ifp; const static int nbits = 8192 << M_BITS; uint32_t overhead, rate, errrate; uint64_t res; /* Time to transmit a frame */ rate = ni->ni_txrate; overhead = ieee80211_compute_duration(ic->ic_rt, ifp->if_mtu + IEEE80211_MESH_MAXOVERHEAD, rate, 0) << M_BITS; /* Error rate in percentage */ /* XXX assuming small failures are ok */ errrate = (((ifp->if_get_counter(ifp, IFCOUNTER_OERRORS) + ifp->if_get_counter(ifp, IFCOUNTER_IERRORS)) / 100) << M_BITS) / 100; res = (overhead + (nbits / rate)) * ((1 << S_FACTOR) / ((1 << M_BITS) - errrate)); return (uint32_t)(res >> S_FACTOR); #undef M_BITS #undef S_FACTOR } /* * Add a Mesh Link Metric report IE to a frame. */ uint8_t * ieee80211_add_meshlmetric(uint8_t *frm, uint8_t flags, uint32_t metric) { *frm++ = IEEE80211_ELEMID_MESHLINK; *frm++ = 5; *frm++ = flags; ADDWORD(frm, metric); return frm; } /* * Add a Mesh Gate Announcement IE to a frame. */ uint8_t * ieee80211_add_meshgate(uint8_t *frm, struct ieee80211_meshgann_ie *ie) { *frm++ = IEEE80211_ELEMID_MESHGANN; /* ie */ *frm++ = IEEE80211_MESHGANN_BASE_SZ; /* len */ *frm++ = ie->gann_flags; *frm++ = ie->gann_hopcount; *frm++ = ie->gann_ttl; IEEE80211_ADDR_COPY(frm, ie->gann_addr); frm += 6; ADDWORD(frm, ie->gann_seq); ADDSHORT(frm, ie->gann_interval); return frm; } #undef ADDSHORT #undef ADDWORD /* * Initialize any mesh-specific node state. */ void ieee80211_mesh_node_init(struct ieee80211vap *vap, struct ieee80211_node *ni) { ni->ni_flags |= IEEE80211_NODE_QOS; callout_init(&ni->ni_mltimer, 1); callout_init(&ni->ni_mlhtimer, 1); } /* * Cleanup any mesh-specific node state. */ void ieee80211_mesh_node_cleanup(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_mesh_state *ms = vap->iv_mesh; callout_drain(&ni->ni_mltimer); callout_drain(&ni->ni_mlhtimer); /* NB: short-circuit callbacks after mesh_vdetach */ if (vap->iv_mesh != NULL) ms->ms_ppath->mpp_peerdown(ni); } void ieee80211_parse_meshid(struct ieee80211_node *ni, const uint8_t *ie) { ni->ni_meshidlen = ie[1]; memcpy(ni->ni_meshid, ie + 2, ie[1]); } /* * Setup mesh-specific node state on neighbor discovery. */ void ieee80211_mesh_init_neighbor(struct ieee80211_node *ni, const struct ieee80211_frame *wh, const struct ieee80211_scanparams *sp) { ieee80211_parse_meshid(ni, sp->meshid); } void ieee80211_mesh_update_beacon(struct ieee80211vap *vap, struct ieee80211_beacon_offsets *bo) { KASSERT(vap->iv_opmode == IEEE80211_M_MBSS, ("not a MBSS vap")); if (isset(bo->bo_flags, IEEE80211_BEACON_MESHCONF)) { (void)ieee80211_add_meshconf(bo->bo_meshconf, vap); clrbit(bo->bo_flags, IEEE80211_BEACON_MESHCONF); } } static int mesh_ioctl_get80211(struct ieee80211vap *vap, struct ieee80211req *ireq) { struct ieee80211_mesh_state *ms = vap->iv_mesh; uint8_t tmpmeshid[IEEE80211_NWID_LEN]; struct ieee80211_mesh_route *rt; struct ieee80211req_mesh_route *imr; size_t len, off; uint8_t *p; int error; if (vap->iv_opmode != IEEE80211_M_MBSS) return ENOSYS; error = 0; switch (ireq->i_type) { case IEEE80211_IOC_MESH_ID: ireq->i_len = ms->ms_idlen; memcpy(tmpmeshid, ms->ms_id, ireq->i_len); error = copyout(tmpmeshid, ireq->i_data, ireq->i_len); break; case IEEE80211_IOC_MESH_AP: ireq->i_val = (ms->ms_flags & IEEE80211_MESHFLAGS_AP) != 0; break; case IEEE80211_IOC_MESH_FWRD: ireq->i_val = (ms->ms_flags & IEEE80211_MESHFLAGS_FWD) != 0; break; case IEEE80211_IOC_MESH_GATE: ireq->i_val = (ms->ms_flags & IEEE80211_MESHFLAGS_GATE) != 0; break; case IEEE80211_IOC_MESH_TTL: ireq->i_val = ms->ms_ttl; break; case IEEE80211_IOC_MESH_RTCMD: switch (ireq->i_val) { case IEEE80211_MESH_RTCMD_LIST: len = 0; MESH_RT_LOCK(ms); TAILQ_FOREACH(rt, &ms->ms_routes, rt_next) { len += sizeof(*imr); } MESH_RT_UNLOCK(ms); if (len > ireq->i_len || ireq->i_len < sizeof(*imr)) { ireq->i_len = len; return ENOMEM; } ireq->i_len = len; /* XXX M_WAIT? */ p = IEEE80211_MALLOC(len, M_TEMP, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (p == NULL) return ENOMEM; off = 0; MESH_RT_LOCK(ms); TAILQ_FOREACH(rt, &ms->ms_routes, rt_next) { if (off >= len) break; imr = (struct ieee80211req_mesh_route *) (p + off); IEEE80211_ADDR_COPY(imr->imr_dest, rt->rt_dest); IEEE80211_ADDR_COPY(imr->imr_nexthop, rt->rt_nexthop); imr->imr_metric = rt->rt_metric; imr->imr_nhops = rt->rt_nhops; imr->imr_lifetime = ieee80211_mesh_rt_update(rt, 0); imr->imr_lastmseq = rt->rt_lastmseq; imr->imr_flags = rt->rt_flags; /* last */ off += sizeof(*imr); } MESH_RT_UNLOCK(ms); error = copyout(p, (uint8_t *)ireq->i_data, ireq->i_len); IEEE80211_FREE(p, M_TEMP); break; case IEEE80211_MESH_RTCMD_FLUSH: case IEEE80211_MESH_RTCMD_ADD: case IEEE80211_MESH_RTCMD_DELETE: return EINVAL; default: return ENOSYS; } break; case IEEE80211_IOC_MESH_PR_METRIC: len = strlen(ms->ms_pmetric->mpm_descr); if (ireq->i_len < len) return EINVAL; ireq->i_len = len; error = copyout(ms->ms_pmetric->mpm_descr, (uint8_t *)ireq->i_data, len); break; case IEEE80211_IOC_MESH_PR_PATH: len = strlen(ms->ms_ppath->mpp_descr); if (ireq->i_len < len) return EINVAL; ireq->i_len = len; error = copyout(ms->ms_ppath->mpp_descr, (uint8_t *)ireq->i_data, len); break; default: return ENOSYS; } return error; } IEEE80211_IOCTL_GET(mesh, mesh_ioctl_get80211); static int mesh_ioctl_set80211(struct ieee80211vap *vap, struct ieee80211req *ireq) { struct ieee80211_mesh_state *ms = vap->iv_mesh; uint8_t tmpmeshid[IEEE80211_NWID_LEN]; uint8_t tmpaddr[IEEE80211_ADDR_LEN]; char tmpproto[IEEE80211_MESH_PROTO_DSZ]; int error; if (vap->iv_opmode != IEEE80211_M_MBSS) return ENOSYS; error = 0; switch (ireq->i_type) { case IEEE80211_IOC_MESH_ID: if (ireq->i_val != 0 || ireq->i_len > IEEE80211_MESHID_LEN) return EINVAL; error = copyin(ireq->i_data, tmpmeshid, ireq->i_len); if (error != 0) break; memset(ms->ms_id, 0, IEEE80211_NWID_LEN); ms->ms_idlen = ireq->i_len; memcpy(ms->ms_id, tmpmeshid, ireq->i_len); error = ENETRESET; break; case IEEE80211_IOC_MESH_AP: if (ireq->i_val) ms->ms_flags |= IEEE80211_MESHFLAGS_AP; else ms->ms_flags &= ~IEEE80211_MESHFLAGS_AP; error = ENETRESET; break; case IEEE80211_IOC_MESH_FWRD: if (ireq->i_val) ms->ms_flags |= IEEE80211_MESHFLAGS_FWD; else ms->ms_flags &= ~IEEE80211_MESHFLAGS_FWD; mesh_gatemode_setup(vap); break; case IEEE80211_IOC_MESH_GATE: if (ireq->i_val) ms->ms_flags |= IEEE80211_MESHFLAGS_GATE; else ms->ms_flags &= ~IEEE80211_MESHFLAGS_GATE; break; case IEEE80211_IOC_MESH_TTL: ms->ms_ttl = (uint8_t) ireq->i_val; break; case IEEE80211_IOC_MESH_RTCMD: switch (ireq->i_val) { case IEEE80211_MESH_RTCMD_LIST: return EINVAL; case IEEE80211_MESH_RTCMD_FLUSH: ieee80211_mesh_rt_flush(vap); break; case IEEE80211_MESH_RTCMD_ADD: if (IEEE80211_ADDR_EQ(vap->iv_myaddr, ireq->i_data) || IEEE80211_ADDR_EQ(broadcastaddr, ireq->i_data)) return EINVAL; error = copyin(ireq->i_data, &tmpaddr, IEEE80211_ADDR_LEN); if (error == 0) ieee80211_mesh_discover(vap, tmpaddr, NULL); break; case IEEE80211_MESH_RTCMD_DELETE: ieee80211_mesh_rt_del(vap, ireq->i_data); break; default: return ENOSYS; } break; case IEEE80211_IOC_MESH_PR_METRIC: error = copyin(ireq->i_data, tmpproto, sizeof(tmpproto)); if (error == 0) { error = mesh_select_proto_metric(vap, tmpproto); if (error == 0) error = ENETRESET; } break; case IEEE80211_IOC_MESH_PR_PATH: error = copyin(ireq->i_data, tmpproto, sizeof(tmpproto)); if (error == 0) { error = mesh_select_proto_path(vap, tmpproto); if (error == 0) error = ENETRESET; } break; default: return ENOSYS; } return error; } IEEE80211_IOCTL_SET(mesh, mesh_ioctl_set80211); Index: projects/clang1000-import/sys/net80211/ieee80211_rssadapt.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_rssadapt.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_rssadapt.c (revision 358239) @@ -1,386 +1,387 @@ /* $FreeBSD$ */ /* $NetBSD: ieee80211_rssadapt.c,v 1.9 2005/02/26 22:45:09 perry Exp $ */ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2010 Rui Paulo * Copyright (c) 2003, 2004 David Young. All rights reserved. * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials provided * with the distribution. * 3. The name of David Young may not be used to endorse or promote * products derived from this software without specific prior * written permission. * * THIS SOFTWARE IS PROVIDED BY David Young ``AS IS'' AND ANY * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL David * Young BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY * OF SUCH DAMAGE. */ #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include struct rssadapt_expavgctl { /* RSS threshold decay. */ u_int rc_decay_denom; u_int rc_decay_old; /* RSS threshold update. */ u_int rc_thresh_denom; u_int rc_thresh_old; /* RSS average update. */ u_int rc_avgrssi_denom; u_int rc_avgrssi_old; }; static struct rssadapt_expavgctl master_expavgctl = { .rc_decay_denom = 16, .rc_decay_old = 15, .rc_thresh_denom = 8, .rc_thresh_old = 4, .rc_avgrssi_denom = 8, .rc_avgrssi_old = 4 }; #ifdef interpolate #undef interpolate #endif #define interpolate(parm, old, new) ((parm##_old * (old) + \ (parm##_denom - parm##_old) * (new)) / \ parm##_denom) static void rssadapt_setinterval(const struct ieee80211vap *, int); static void rssadapt_init(struct ieee80211vap *); static void rssadapt_deinit(struct ieee80211vap *); static void rssadapt_updatestats(struct ieee80211_rssadapt_node *); static void rssadapt_node_init(struct ieee80211_node *); static void rssadapt_node_deinit(struct ieee80211_node *); static int rssadapt_rate(struct ieee80211_node *, void *, uint32_t); static void rssadapt_lower_rate(struct ieee80211_rssadapt_node *, int, int); static void rssadapt_raise_rate(struct ieee80211_rssadapt_node *, int, int); static void rssadapt_tx_complete(const struct ieee80211_node *, const struct ieee80211_ratectl_tx_status *); static void rssadapt_sysctlattach(struct ieee80211vap *, struct sysctl_ctx_list *, struct sysctl_oid *); /* number of references from net80211 layer */ static int nrefs = 0; static const struct ieee80211_ratectl rssadapt = { .ir_name = "rssadapt", .ir_attach = NULL, .ir_detach = NULL, .ir_init = rssadapt_init, .ir_deinit = rssadapt_deinit, .ir_node_init = rssadapt_node_init, .ir_node_deinit = rssadapt_node_deinit, .ir_rate = rssadapt_rate, .ir_tx_complete = rssadapt_tx_complete, .ir_tx_update = NULL, .ir_setinterval = rssadapt_setinterval, }; IEEE80211_RATECTL_MODULE(rssadapt, 1); IEEE80211_RATECTL_ALG(rssadapt, IEEE80211_RATECTL_RSSADAPT, rssadapt); static void rssadapt_setinterval(const struct ieee80211vap *vap, int msecs) { struct ieee80211_rssadapt *rs = vap->iv_rs; if (!rs) return; if (msecs < 100) msecs = 100; rs->interval = msecs_to_ticks(msecs); } static void rssadapt_init(struct ieee80211vap *vap) { struct ieee80211_rssadapt *rs; KASSERT(vap->iv_rs == NULL, ("%s: iv_rs already initialized", __func__)); nrefs++; /* XXX locking */ vap->iv_rs = rs = IEEE80211_MALLOC(sizeof(struct ieee80211_rssadapt), M_80211_RATECTL, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (rs == NULL) { if_printf(vap->iv_ifp, "couldn't alloc ratectl structure\n"); return; } rs->vap = vap; rssadapt_setinterval(vap, 500 /* msecs */); rssadapt_sysctlattach(vap, vap->iv_sysctl, vap->iv_oid); } static void rssadapt_deinit(struct ieee80211vap *vap) { IEEE80211_FREE(vap->iv_rs, M_80211_RATECTL); KASSERT(nrefs > 0, ("imbalanced attach/detach")); nrefs--; /* XXX locking */ } static void rssadapt_updatestats(struct ieee80211_rssadapt_node *ra) { long interval; ra->ra_pktrate = (ra->ra_pktrate + 10*(ra->ra_nfail + ra->ra_nok))/2; ra->ra_nfail = ra->ra_nok = 0; /* * A node is eligible for its rate to be raised every 1/10 to 10 * seconds, more eligible in proportion to recent packet rates. */ interval = MAX(10*1000, 10*1000 / MAX(1, 10 * ra->ra_pktrate)); ra->ra_raise_interval = msecs_to_ticks(interval); } static void rssadapt_node_init(struct ieee80211_node *ni) { struct ieee80211_rssadapt_node *ra; struct ieee80211vap *vap = ni->ni_vap; struct ieee80211_rssadapt *rsa = vap->iv_rs; const struct ieee80211_rateset *rs = &ni->ni_rates; if (!rsa) { if_printf(vap->iv_ifp, "ratectl structure was not allocated, " "per-node structure allocation skipped\n"); return; } if (ni->ni_rctls == NULL) { ni->ni_rctls = ra = IEEE80211_MALLOC(sizeof(struct ieee80211_rssadapt_node), M_80211_RATECTL, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (ra == NULL) { if_printf(vap->iv_ifp, "couldn't alloc per-node ratectl " "structure\n"); return; } } else ra = ni->ni_rctls; ra->ra_rs = rsa; ra->ra_rates = *rs; rssadapt_updatestats(ra); /* pick initial rate */ for (ra->ra_rix = rs->rs_nrates - 1; ra->ra_rix > 0 && (rs->rs_rates[ra->ra_rix] & IEEE80211_RATE_VAL) > 72; ra->ra_rix--) ; ni->ni_txrate = rs->rs_rates[ra->ra_rix] & IEEE80211_RATE_VAL; ra->ra_ticks = ticks; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "RSSADAPT initial rate %d", ni->ni_txrate); } static void rssadapt_node_deinit(struct ieee80211_node *ni) { IEEE80211_FREE(ni->ni_rctls, M_80211_RATECTL); } static __inline int bucket(int pktlen) { int i, top, thridx; for (i = 0, top = IEEE80211_RSSADAPT_BKT0; i < IEEE80211_RSSADAPT_BKTS; i++, top <<= IEEE80211_RSSADAPT_BKTPOWER) { thridx = i; if (pktlen <= top) break; } return thridx; } static int rssadapt_rate(struct ieee80211_node *ni, void *arg __unused, uint32_t iarg) { struct ieee80211_rssadapt_node *ra = ni->ni_rctls; u_int pktlen = iarg; const struct ieee80211_rateset *rs; uint16_t (*thrs)[IEEE80211_RATE_SIZE]; int rix, rssi; /* XXX should return -1 here, but drivers may not expect this... */ if (!ra) { ni->ni_txrate = ni->ni_rates.rs_rates[0]; return 0; } rs = &ra->ra_rates; if ((ticks - ra->ra_ticks) > ra->ra_rs->interval) { rssadapt_updatestats(ra); ra->ra_ticks = ticks; } thrs = &ra->ra_rate_thresh[bucket(pktlen)]; /* XXX this is average rssi, should be using last value */ rssi = ni->ni_ic->ic_node_getrssi(ni); for (rix = rs->rs_nrates-1; rix >= 0; rix--) if ((*thrs)[rix] < (rssi << 8)) break; if (rix != ra->ra_rix) { /* update public rate */ ni->ni_txrate = ni->ni_rates.rs_rates[rix] & IEEE80211_RATE_VAL; ra->ra_rix = rix; IEEE80211_NOTE(ni->ni_vap, IEEE80211_MSG_RATECTL, ni, "RSSADAPT new rate %d (pktlen %d rssi %d)", ni->ni_txrate, pktlen, rssi); } return rix; } /* * Adapt the data rate to suit the conditions. When a transmitted * packet is dropped after RAL_RSSADAPT_RETRY_LIMIT retransmissions, * raise the RSS threshold for transmitting packets of similar length at * the same data rate. */ static void rssadapt_lower_rate(struct ieee80211_rssadapt_node *ra, int pktlen, int rssi) { uint16_t last_thr; uint16_t (*thrs)[IEEE80211_RATE_SIZE]; u_int rix; thrs = &ra->ra_rate_thresh[bucket(pktlen)]; rix = ra->ra_rix; last_thr = (*thrs)[rix]; (*thrs)[rix] = interpolate(master_expavgctl.rc_thresh, last_thr, (rssi << 8)); IEEE80211_DPRINTF(ra->ra_rs->vap, IEEE80211_MSG_RATECTL, "RSSADAPT lower threshold for rate %d (last_thr %d new thr %d rssi %d)\n", ra->ra_rates.rs_rates[rix + 1] & IEEE80211_RATE_VAL, last_thr, (*thrs)[rix], rssi); } static void rssadapt_raise_rate(struct ieee80211_rssadapt_node *ra, int pktlen, int rssi) { uint16_t (*thrs)[IEEE80211_RATE_SIZE]; uint16_t newthr, oldthr; int rix; thrs = &ra->ra_rate_thresh[bucket(pktlen)]; rix = ra->ra_rix; if ((*thrs)[rix + 1] > (*thrs)[rix]) { oldthr = (*thrs)[rix + 1]; if ((*thrs)[rix] == 0) newthr = (rssi << 8); else newthr = (*thrs)[rix]; (*thrs)[rix + 1] = interpolate(master_expavgctl.rc_decay, oldthr, newthr); IEEE80211_DPRINTF(ra->ra_rs->vap, IEEE80211_MSG_RATECTL, "RSSADAPT raise threshold for rate %d (oldthr %d newthr %d rssi %d)\n", ra->ra_rates.rs_rates[rix + 1] & IEEE80211_RATE_VAL, oldthr, newthr, rssi); ra->ra_last_raise = ticks; } } static void rssadapt_tx_complete(const struct ieee80211_node *ni, const struct ieee80211_ratectl_tx_status *status) { struct ieee80211_rssadapt_node *ra = ni->ni_rctls; int pktlen, rssi; if (!ra) return; if ((status->flags & (IEEE80211_RATECTL_STATUS_PKTLEN|IEEE80211_RATECTL_STATUS_RSSI)) != (IEEE80211_RATECTL_STATUS_PKTLEN|IEEE80211_RATECTL_STATUS_RSSI)) return; pktlen = status->pktlen; rssi = status->rssi; if (status->status == IEEE80211_RATECTL_TX_SUCCESS) { ra->ra_nok++; if ((ra->ra_rix + 1) < ra->ra_rates.rs_nrates && (ticks - ra->ra_last_raise) >= ra->ra_raise_interval) rssadapt_raise_rate(ra, pktlen, rssi); } else { ra->ra_nfail++; rssadapt_lower_rate(ra, pktlen, rssi); } } static int rssadapt_sysctl_interval(SYSCTL_HANDLER_ARGS) { struct ieee80211vap *vap = arg1; struct ieee80211_rssadapt *rs = vap->iv_rs; int msecs, error; if (!rs) return ENOMEM; msecs = ticks_to_msecs(rs->interval); error = sysctl_handle_int(oidp, &msecs, 0, req); if (error || !req->newptr) return error; rssadapt_setinterval(vap, msecs); return 0; } static void rssadapt_sysctlattach(struct ieee80211vap *vap, struct sysctl_ctx_list *ctx, struct sysctl_oid *tree) { SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(tree), OID_AUTO, - "rssadapt_rate_interval", CTLTYPE_INT | CTLFLAG_RW, vap, - 0, rssadapt_sysctl_interval, "I", "rssadapt operation interval (ms)"); + "rssadapt_rate_interval", + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, vap, 0, + rssadapt_sysctl_interval, "I", "rssadapt operation interval (ms)"); } Index: projects/clang1000-import/sys/net80211/ieee80211_superg.c =================================================================== --- projects/clang1000-import/sys/net80211/ieee80211_superg.c (revision 358238) +++ projects/clang1000-import/sys/net80211/ieee80211_superg.c (revision 358239) @@ -1,1067 +1,1068 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_wlan.h" #ifdef IEEE80211_SUPPORT_SUPERG #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Atheros fast-frame encapsulation format. * FF max payload: * 802.2 + FFHDR + HPAD + 802.3 + 802.2 + 1500 + SPAD + 802.3 + 802.2 + 1500: * 8 + 4 + 4 + 14 + 8 + 1500 + 6 + 14 + 8 + 1500 * = 3066 */ /* fast frame header is 32-bits */ #define ATH_FF_PROTO 0x0000003f /* protocol */ #define ATH_FF_PROTO_S 0 #define ATH_FF_FTYPE 0x000000c0 /* frame type */ #define ATH_FF_FTYPE_S 6 #define ATH_FF_HLEN32 0x00000300 /* optional hdr length */ #define ATH_FF_HLEN32_S 8 #define ATH_FF_SEQNUM 0x001ffc00 /* sequence number */ #define ATH_FF_SEQNUM_S 10 #define ATH_FF_OFFSET 0xffe00000 /* offset to 2nd payload */ #define ATH_FF_OFFSET_S 21 #define ATH_FF_MAX_HDR_PAD 4 #define ATH_FF_MAX_SEP_PAD 6 #define ATH_FF_MAX_HDR 30 #define ATH_FF_PROTO_L2TUNNEL 0 /* L2 tunnel protocol */ #define ATH_FF_ETH_TYPE 0x88bd /* Ether type for encapsulated frames */ #define ATH_FF_SNAP_ORGCODE_0 0x00 #define ATH_FF_SNAP_ORGCODE_1 0x03 #define ATH_FF_SNAP_ORGCODE_2 0x7f #define ATH_FF_TXQMIN 2 /* min txq depth for staging */ #define ATH_FF_TXQMAX 50 /* maximum # of queued frames allowed */ #define ATH_FF_STAGEMAX 5 /* max waiting period for staged frame*/ #define ETHER_HEADER_COPY(dst, src) \ memcpy(dst, src, sizeof(struct ether_header)) static int ieee80211_ffppsmin = 2; /* pps threshold for ff aggregation */ SYSCTL_INT(_net_wlan, OID_AUTO, ffppsmin, CTLFLAG_RW, &ieee80211_ffppsmin, 0, "min packet rate before fast-frame staging"); static int ieee80211_ffagemax = -1; /* max time frames held on stage q */ -SYSCTL_PROC(_net_wlan, OID_AUTO, ffagemax, CTLTYPE_INT | CTLFLAG_RW, - &ieee80211_ffagemax, 0, ieee80211_sysctl_msecs_ticks, "I", - "max hold time for fast-frame staging (ms)"); +SYSCTL_PROC(_net_wlan, OID_AUTO, ffagemax, + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_NEEDGIANT, + &ieee80211_ffagemax, 0, ieee80211_sysctl_msecs_ticks, "I", + "max hold time for fast-frame staging (ms)"); static void ff_age_all(void *arg, int npending) { struct ieee80211com *ic = arg; /* XXX cache timer value somewhere (racy) */ ieee80211_ff_age_all(ic, ieee80211_ffagemax + 1); } void ieee80211_superg_attach(struct ieee80211com *ic) { struct ieee80211_superg *sg; IEEE80211_FF_LOCK_INIT(ic, ic->ic_name); sg = (struct ieee80211_superg *) IEEE80211_MALLOC( sizeof(struct ieee80211_superg), M_80211_VAP, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (sg == NULL) { printf("%s: cannot allocate SuperG state block\n", __func__); return; } TIMEOUT_TASK_INIT(ic->ic_tq, &sg->ff_qtimer, 0, ff_age_all, ic); ic->ic_superg = sg; /* * Default to not being so aggressive for FF/AMSDU * aging, otherwise we may hold a frame around * for way too long before we expire it out. */ ieee80211_ffagemax = msecs_to_ticks(2); } void ieee80211_superg_detach(struct ieee80211com *ic) { if (ic->ic_superg != NULL) { struct timeout_task *qtask = &ic->ic_superg->ff_qtimer; while (taskqueue_cancel_timeout(ic->ic_tq, qtask, NULL) != 0) taskqueue_drain_timeout(ic->ic_tq, qtask); IEEE80211_FREE(ic->ic_superg, M_80211_VAP); ic->ic_superg = NULL; } IEEE80211_FF_LOCK_DESTROY(ic); } void ieee80211_superg_vattach(struct ieee80211vap *vap) { struct ieee80211com *ic = vap->iv_ic; if (ic->ic_superg == NULL) /* NB: can't do fast-frames w/o state */ vap->iv_caps &= ~IEEE80211_C_FF; if (vap->iv_caps & IEEE80211_C_FF) vap->iv_flags |= IEEE80211_F_FF; /* NB: we only implement sta mode */ if (vap->iv_opmode == IEEE80211_M_STA && (vap->iv_caps & IEEE80211_C_TURBOP)) vap->iv_flags |= IEEE80211_F_TURBOP; } void ieee80211_superg_vdetach(struct ieee80211vap *vap) { } #define ATH_OUI_BYTES 0x00, 0x03, 0x7f /* * Add a WME information element to a frame. */ uint8_t * ieee80211_add_ath(uint8_t *frm, uint8_t caps, ieee80211_keyix defkeyix) { static const struct ieee80211_ath_ie info = { .ath_id = IEEE80211_ELEMID_VENDOR, .ath_len = sizeof(struct ieee80211_ath_ie) - 2, .ath_oui = { ATH_OUI_BYTES }, .ath_oui_type = ATH_OUI_TYPE, .ath_oui_subtype= ATH_OUI_SUBTYPE, .ath_version = ATH_OUI_VERSION, }; struct ieee80211_ath_ie *ath = (struct ieee80211_ath_ie *) frm; memcpy(frm, &info, sizeof(info)); ath->ath_capability = caps; if (defkeyix != IEEE80211_KEYIX_NONE) { ath->ath_defkeyix[0] = (defkeyix & 0xff); ath->ath_defkeyix[1] = ((defkeyix >> 8) & 0xff); } else { ath->ath_defkeyix[0] = 0xff; ath->ath_defkeyix[1] = 0x7f; } return frm + sizeof(info); } #undef ATH_OUI_BYTES uint8_t * ieee80211_add_athcaps(uint8_t *frm, const struct ieee80211_node *bss) { const struct ieee80211vap *vap = bss->ni_vap; return ieee80211_add_ath(frm, vap->iv_flags & IEEE80211_F_ATHEROS, ((vap->iv_flags & IEEE80211_F_WPA) == 0 && bss->ni_authmode != IEEE80211_AUTH_8021X) ? vap->iv_def_txkey : IEEE80211_KEYIX_NONE); } void ieee80211_parse_ath(struct ieee80211_node *ni, uint8_t *ie) { const struct ieee80211_ath_ie *ath = (const struct ieee80211_ath_ie *) ie; ni->ni_ath_flags = ath->ath_capability; ni->ni_ath_defkeyix = le16dec(&ath->ath_defkeyix); } int ieee80211_parse_athparams(struct ieee80211_node *ni, uint8_t *frm, const struct ieee80211_frame *wh) { struct ieee80211vap *vap = ni->ni_vap; const struct ieee80211_ath_ie *ath; u_int len = frm[1]; int capschanged; uint16_t defkeyix; if (len < sizeof(struct ieee80211_ath_ie)-2) { IEEE80211_DISCARD_IE(vap, IEEE80211_MSG_ELEMID | IEEE80211_MSG_SUPERG, wh, "Atheros", "too short, len %u", len); return -1; } ath = (const struct ieee80211_ath_ie *)frm; capschanged = (ni->ni_ath_flags != ath->ath_capability); defkeyix = le16dec(ath->ath_defkeyix); if (capschanged || defkeyix != ni->ni_ath_defkeyix) { ni->ni_ath_flags = ath->ath_capability; ni->ni_ath_defkeyix = defkeyix; IEEE80211_NOTE(vap, IEEE80211_MSG_SUPERG, ni, "ath ie change: new caps 0x%x defkeyix 0x%x", ni->ni_ath_flags, ni->ni_ath_defkeyix); } if (IEEE80211_ATH_CAP(vap, ni, ATHEROS_CAP_TURBO_PRIME)) { uint16_t curflags, newflags; /* * Check for turbo mode switch. Calculate flags * for the new mode and effect the switch. */ newflags = curflags = vap->iv_ic->ic_bsschan->ic_flags; /* NB: BOOST is not in ic_flags, so get it from the ie */ if (ath->ath_capability & ATHEROS_CAP_BOOST) newflags |= IEEE80211_CHAN_TURBO; else newflags &= ~IEEE80211_CHAN_TURBO; if (newflags != curflags) ieee80211_dturbo_switch(vap, newflags); } return capschanged; } /* * Decap the encapsulated frame pair and dispatch the first * for delivery. The second frame is returned for delivery * via the normal path. */ struct mbuf * ieee80211_ff_decap(struct ieee80211_node *ni, struct mbuf *m) { #define FF_LLC_SIZE (sizeof(struct ether_header) + sizeof(struct llc)) #define MS(x,f) (((x) & f) >> f##_S) struct ieee80211vap *vap = ni->ni_vap; struct llc *llc; uint32_t ath; struct mbuf *n; int framelen; /* NB: we assume caller does this check for us */ KASSERT(IEEE80211_ATH_CAP(vap, ni, IEEE80211_NODE_FF), ("ff not negotiated")); /* * Check for fast-frame tunnel encapsulation. */ if (m->m_pkthdr.len < 3*FF_LLC_SIZE) return m; if (m->m_len < FF_LLC_SIZE && (m = m_pullup(m, FF_LLC_SIZE)) == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "fast-frame", "%s", "m_pullup(llc) failed"); vap->iv_stats.is_rx_tooshort++; return NULL; } llc = (struct llc *)(mtod(m, uint8_t *) + sizeof(struct ether_header)); if (llc->llc_snap.ether_type != htons(ATH_FF_ETH_TYPE)) return m; m_adj(m, FF_LLC_SIZE); m_copydata(m, 0, sizeof(uint32_t), (caddr_t) &ath); if (MS(ath, ATH_FF_PROTO) != ATH_FF_PROTO_L2TUNNEL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "fast-frame", "unsupport tunnel protocol, header 0x%x", ath); vap->iv_stats.is_ff_badhdr++; m_freem(m); return NULL; } /* NB: skip header and alignment padding */ m_adj(m, roundup(sizeof(uint32_t) - 2, 4) + 2); vap->iv_stats.is_ff_decap++; /* * Decap the first frame, bust it apart from the * second and deliver; then decap the second frame * and return it to the caller for normal delivery. */ m = ieee80211_decap1(m, &framelen); if (m == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "fast-frame", "%s", "first decap failed"); vap->iv_stats.is_ff_tooshort++; return NULL; } n = m_split(m, framelen, M_NOWAIT); if (n == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "fast-frame", "%s", "unable to split encapsulated frames"); vap->iv_stats.is_ff_split++; m_freem(m); /* NB: must reclaim */ return NULL; } /* XXX not right for WDS */ vap->iv_deliver_data(vap, ni, m); /* 1st of pair */ /* * Decap second frame. */ m_adj(n, roundup2(framelen, 4) - framelen); /* padding */ n = ieee80211_decap1(n, &framelen); if (n == NULL) { IEEE80211_DISCARD_MAC(vap, IEEE80211_MSG_ANY, ni->ni_macaddr, "fast-frame", "%s", "second decap failed"); vap->iv_stats.is_ff_tooshort++; } /* XXX verify framelen against mbuf contents */ return n; /* 2nd delivered by caller */ #undef MS #undef FF_LLC_SIZE } /* * Fast frame encapsulation. There must be two packets * chained with m_nextpkt. We do header adjustment for * each, add the tunnel encapsulation, and then concatenate * the mbuf chains to form a single frame for transmission. */ struct mbuf * ieee80211_ff_encap(struct ieee80211vap *vap, struct mbuf *m1, int hdrspace, struct ieee80211_key *key) { struct mbuf *m2; struct ether_header eh1, eh2; struct llc *llc; struct mbuf *m; int pad; m2 = m1->m_nextpkt; if (m2 == NULL) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: only one frame\n", __func__); goto bad; } m1->m_nextpkt = NULL; /* * Adjust to include 802.11 header requirement. */ KASSERT(m1->m_len >= sizeof(eh1), ("no ethernet header!")); ETHER_HEADER_COPY(&eh1, mtod(m1, caddr_t)); m1 = ieee80211_mbuf_adjust(vap, hdrspace, key, m1); if (m1 == NULL) { printf("%s: failed initial mbuf_adjust\n", __func__); /* NB: ieee80211_mbuf_adjust handles msgs+statistics */ m_freem(m2); goto bad; } /* * Copy second frame's Ethernet header out of line * and adjust for possible padding in case there isn't room * at the end of first frame. */ KASSERT(m2->m_len >= sizeof(eh2), ("no ethernet header!")); ETHER_HEADER_COPY(&eh2, mtod(m2, caddr_t)); m2 = ieee80211_mbuf_adjust(vap, 4, NULL, m2); if (m2 == NULL) { /* NB: ieee80211_mbuf_adjust handles msgs+statistics */ printf("%s: failed second \n", __func__); goto bad; } /* * Now do tunnel encapsulation. First, each * frame gets a standard encapsulation. */ m1 = ieee80211_ff_encap1(vap, m1, &eh1); if (m1 == NULL) goto bad; m2 = ieee80211_ff_encap1(vap, m2, &eh2); if (m2 == NULL) goto bad; /* * Pad leading frame to a 4-byte boundary. If there * is space at the end of the first frame, put it * there; otherwise prepend to the front of the second * frame. We know doing the second will always work * because we reserve space above. We prefer appending * as this typically has better DMA alignment properties. */ for (m = m1; m->m_next != NULL; m = m->m_next) ; pad = roundup2(m1->m_pkthdr.len, 4) - m1->m_pkthdr.len; if (pad) { if (M_TRAILINGSPACE(m) < pad) { /* prepend to second */ m2->m_data -= pad; m2->m_len += pad; m2->m_pkthdr.len += pad; } else { /* append to first */ m->m_len += pad; m1->m_pkthdr.len += pad; } } /* * A-MSDU's are just appended; the "I'm A-MSDU!" bit is in the * QoS header. * * XXX optimize by prepending together */ m->m_next = m2; /* NB: last mbuf from above */ m1->m_pkthdr.len += m2->m_pkthdr.len; M_PREPEND(m1, sizeof(uint32_t)+2, M_NOWAIT); if (m1 == NULL) { /* XXX cannot happen */ IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: no space for tunnel header\n", __func__); vap->iv_stats.is_tx_nobuf++; return NULL; } memset(mtod(m1, void *), 0, sizeof(uint32_t)+2); M_PREPEND(m1, sizeof(struct llc), M_NOWAIT); if (m1 == NULL) { /* XXX cannot happen */ IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: no space for llc header\n", __func__); vap->iv_stats.is_tx_nobuf++; return NULL; } llc = mtod(m1, struct llc *); llc->llc_dsap = llc->llc_ssap = LLC_SNAP_LSAP; llc->llc_control = LLC_UI; llc->llc_snap.org_code[0] = ATH_FF_SNAP_ORGCODE_0; llc->llc_snap.org_code[1] = ATH_FF_SNAP_ORGCODE_1; llc->llc_snap.org_code[2] = ATH_FF_SNAP_ORGCODE_2; llc->llc_snap.ether_type = htons(ATH_FF_ETH_TYPE); vap->iv_stats.is_ff_encap++; return m1; bad: vap->iv_stats.is_ff_encapfail++; if (m1 != NULL) m_freem(m1); if (m2 != NULL) m_freem(m2); return NULL; } /* * A-MSDU encapsulation. * * This assumes just two frames for now, since we're borrowing the * same queuing code and infrastructure as fast-frames. * * There must be two packets chained with m_nextpkt. * We do header adjustment for each, and then concatenate the mbuf chains * to form a single frame for transmission. */ struct mbuf * ieee80211_amsdu_encap(struct ieee80211vap *vap, struct mbuf *m1, int hdrspace, struct ieee80211_key *key) { struct mbuf *m2; struct ether_header eh1, eh2; struct mbuf *m; int pad; m2 = m1->m_nextpkt; if (m2 == NULL) { IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: only one frame\n", __func__); goto bad; } m1->m_nextpkt = NULL; /* * Include A-MSDU header in adjusting header layout. */ KASSERT(m1->m_len >= sizeof(eh1), ("no ethernet header!")); ETHER_HEADER_COPY(&eh1, mtod(m1, caddr_t)); m1 = ieee80211_mbuf_adjust(vap, hdrspace + sizeof(struct llc) + sizeof(uint32_t) + sizeof(struct ether_header), key, m1); if (m1 == NULL) { /* NB: ieee80211_mbuf_adjust handles msgs+statistics */ m_freem(m2); goto bad; } /* * Copy second frame's Ethernet header out of line * and adjust for encapsulation headers. Note that * we make room for padding in case there isn't room * at the end of first frame. */ KASSERT(m2->m_len >= sizeof(eh2), ("no ethernet header!")); ETHER_HEADER_COPY(&eh2, mtod(m2, caddr_t)); m2 = ieee80211_mbuf_adjust(vap, 4, NULL, m2); if (m2 == NULL) { /* NB: ieee80211_mbuf_adjust handles msgs+statistics */ goto bad; } /* * Now do tunnel encapsulation. First, each * frame gets a standard encapsulation. */ m1 = ieee80211_ff_encap1(vap, m1, &eh1); if (m1 == NULL) goto bad; m2 = ieee80211_ff_encap1(vap, m2, &eh2); if (m2 == NULL) goto bad; /* * Pad leading frame to a 4-byte boundary. If there * is space at the end of the first frame, put it * there; otherwise prepend to the front of the second * frame. We know doing the second will always work * because we reserve space above. We prefer appending * as this typically has better DMA alignment properties. */ for (m = m1; m->m_next != NULL; m = m->m_next) ; pad = roundup2(m1->m_pkthdr.len, 4) - m1->m_pkthdr.len; if (pad) { if (M_TRAILINGSPACE(m) < pad) { /* prepend to second */ m2->m_data -= pad; m2->m_len += pad; m2->m_pkthdr.len += pad; } else { /* append to first */ m->m_len += pad; m1->m_pkthdr.len += pad; } } /* * Now, stick 'em together. */ m->m_next = m2; /* NB: last mbuf from above */ m1->m_pkthdr.len += m2->m_pkthdr.len; vap->iv_stats.is_amsdu_encap++; return m1; bad: vap->iv_stats.is_amsdu_encapfail++; if (m1 != NULL) m_freem(m1); if (m2 != NULL) m_freem(m2); return NULL; } static void ff_transmit(struct ieee80211_node *ni, struct mbuf *m) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; IEEE80211_TX_LOCK_ASSERT(ic); /* encap and xmit */ m = ieee80211_encap(vap, ni, m); if (m != NULL) (void) ieee80211_parent_xmitpkt(ic, m); else ieee80211_free_node(ni); } /* * Flush frames to device; note we re-use the linked list * the frames were stored on and use the sentinel (unchanged) * which may be non-NULL. */ static void ff_flush(struct mbuf *head, struct mbuf *last) { struct mbuf *m, *next; struct ieee80211_node *ni; struct ieee80211vap *vap; for (m = head; m != last; m = next) { next = m->m_nextpkt; m->m_nextpkt = NULL; ni = (struct ieee80211_node *) m->m_pkthdr.rcvif; vap = ni->ni_vap; IEEE80211_NOTE(vap, IEEE80211_MSG_SUPERG, ni, "%s: flush frame, age %u", __func__, M_AGE_GET(m)); vap->iv_stats.is_ff_flush++; ff_transmit(ni, m); } } /* * Age frames on the staging queue. */ void ieee80211_ff_age(struct ieee80211com *ic, struct ieee80211_stageq *sq, int quanta) { struct mbuf *m, *head; struct ieee80211_node *ni; IEEE80211_FF_LOCK(ic); if (sq->depth == 0) { IEEE80211_FF_UNLOCK(ic); return; /* nothing to do */ } KASSERT(sq->head != NULL, ("stageq empty")); head = sq->head; while ((m = sq->head) != NULL && M_AGE_GET(m) < quanta) { int tid = WME_AC_TO_TID(M_WME_GETAC(m)); /* clear staging ref to frame */ ni = (struct ieee80211_node *) m->m_pkthdr.rcvif; KASSERT(ni->ni_tx_superg[tid] == m, ("staging queue empty")); ni->ni_tx_superg[tid] = NULL; sq->head = m->m_nextpkt; sq->depth--; } if (m == NULL) sq->tail = NULL; else M_AGE_SUB(m, quanta); IEEE80211_FF_UNLOCK(ic); IEEE80211_TX_LOCK(ic); ff_flush(head, m); IEEE80211_TX_UNLOCK(ic); } static void stageq_add(struct ieee80211com *ic, struct ieee80211_stageq *sq, struct mbuf *m) { int age = ieee80211_ffagemax; IEEE80211_FF_LOCK_ASSERT(ic); if (sq->tail != NULL) { sq->tail->m_nextpkt = m; age -= M_AGE_GET(sq->head); } else { sq->head = m; struct timeout_task *qtask = &ic->ic_superg->ff_qtimer; taskqueue_enqueue_timeout(ic->ic_tq, qtask, age); } KASSERT(age >= 0, ("age %d", age)); M_AGE_SET(m, age); m->m_nextpkt = NULL; sq->tail = m; sq->depth++; } static void stageq_remove(struct ieee80211com *ic, struct ieee80211_stageq *sq, struct mbuf *mstaged) { struct mbuf *m, *mprev; IEEE80211_FF_LOCK_ASSERT(ic); mprev = NULL; for (m = sq->head; m != NULL; m = m->m_nextpkt) { if (m == mstaged) { if (mprev == NULL) sq->head = m->m_nextpkt; else mprev->m_nextpkt = m->m_nextpkt; if (sq->tail == m) sq->tail = mprev; sq->depth--; return; } mprev = m; } printf("%s: packet not found\n", __func__); } static uint32_t ff_approx_txtime(struct ieee80211_node *ni, const struct mbuf *m1, const struct mbuf *m2) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211vap *vap = ni->ni_vap; uint32_t framelen; uint32_t frame_time; /* * Approximate the frame length to be transmitted. A swag to add * the following maximal values to the skb payload: * - 32: 802.11 encap + CRC * - 24: encryption overhead (if wep bit) * - 4 + 6: fast-frame header and padding * - 16: 2 LLC FF tunnel headers * - 14: 1 802.3 FF tunnel header (mbuf already accounts for 2nd) */ framelen = m1->m_pkthdr.len + 32 + ATH_FF_MAX_HDR_PAD + ATH_FF_MAX_SEP_PAD + ATH_FF_MAX_HDR; if (vap->iv_flags & IEEE80211_F_PRIVACY) framelen += 24; if (m2 != NULL) framelen += m2->m_pkthdr.len; /* * For now, we assume non-shortgi, 20MHz, just because I want to * at least test 802.11n. */ if (ni->ni_txrate & IEEE80211_RATE_MCS) frame_time = ieee80211_compute_duration_ht(framelen, ni->ni_txrate, IEEE80211_HT_RC_2_STREAMS(ni->ni_txrate), 0, /* isht40 */ 0); /* isshortgi */ else frame_time = ieee80211_compute_duration(ic->ic_rt, framelen, ni->ni_txrate, 0); return (frame_time); } /* * Check if the supplied frame can be partnered with an existing * or pending frame. Return a reference to any frame that should be * sent on return; otherwise return NULL. */ struct mbuf * ieee80211_ff_check(struct ieee80211_node *ni, struct mbuf *m) { struct ieee80211vap *vap = ni->ni_vap; struct ieee80211com *ic = ni->ni_ic; struct ieee80211_superg *sg = ic->ic_superg; const int pri = M_WME_GETAC(m); struct ieee80211_stageq *sq; struct ieee80211_tx_ampdu *tap; struct mbuf *mstaged; uint32_t txtime, limit; IEEE80211_TX_UNLOCK_ASSERT(ic); IEEE80211_LOCK(ic); limit = IEEE80211_TXOP_TO_US( ic->ic_wme.wme_chanParams.cap_wmeParams[pri].wmep_txopLimit); IEEE80211_UNLOCK(ic); /* * Check if the supplied frame can be aggregated. * * NB: we allow EAPOL frames to be aggregated with other ucast traffic. * Do 802.1x EAPOL frames proceed in the clear? Then they couldn't * be aggregated with other types of frames when encryption is on? */ IEEE80211_FF_LOCK(ic); tap = &ni->ni_tx_ampdu[WME_AC_TO_TID(pri)]; mstaged = ni->ni_tx_superg[WME_AC_TO_TID(pri)]; /* XXX NOTE: reusing packet counter state from A-MPDU */ /* * XXX NOTE: this means we're double-counting; it should just * be done in ieee80211_output.c once for both superg and A-MPDU. */ ieee80211_txampdu_count_packet(tap); /* * When not in station mode never aggregate a multicast * frame; this insures, for example, that a combined frame * does not require multiple encryption keys. */ if (vap->iv_opmode != IEEE80211_M_STA && ETHER_IS_MULTICAST(mtod(m, struct ether_header *)->ether_dhost)) { /* XXX flush staged frame? */ IEEE80211_FF_UNLOCK(ic); return m; } /* * If there is no frame to combine with and the pps is * too low; then do not attempt to aggregate this frame. */ if (mstaged == NULL && ieee80211_txampdu_getpps(tap) < ieee80211_ffppsmin) { IEEE80211_FF_UNLOCK(ic); return m; } sq = &sg->ff_stageq[pri]; /* * Check the txop limit to insure the aggregate fits. */ if (limit != 0 && (txtime = ff_approx_txtime(ni, m, mstaged)) > limit) { /* * Aggregate too long, return to the caller for direct * transmission. In addition, flush any pending frame * before sending this one. */ IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: txtime %u exceeds txop limit %u\n", __func__, txtime, limit); ni->ni_tx_superg[WME_AC_TO_TID(pri)] = NULL; if (mstaged != NULL) stageq_remove(ic, sq, mstaged); IEEE80211_FF_UNLOCK(ic); if (mstaged != NULL) { IEEE80211_TX_LOCK(ic); IEEE80211_NOTE(vap, IEEE80211_MSG_SUPERG, ni, "%s: flush staged frame", __func__); /* encap and xmit */ ff_transmit(ni, mstaged); IEEE80211_TX_UNLOCK(ic); } return m; /* NB: original frame */ } /* * An aggregation candidate. If there's a frame to partner * with then combine and return for processing. Otherwise * save this frame and wait for a partner to show up (or * the frame to be flushed). Note that staged frames also * hold their node reference. */ if (mstaged != NULL) { ni->ni_tx_superg[WME_AC_TO_TID(pri)] = NULL; stageq_remove(ic, sq, mstaged); IEEE80211_FF_UNLOCK(ic); IEEE80211_NOTE(vap, IEEE80211_MSG_SUPERG, ni, "%s: aggregate fast-frame", __func__); /* * Release the node reference; we only need * the one already in mstaged. */ KASSERT(mstaged->m_pkthdr.rcvif == (void *)ni, ("rcvif %p ni %p", mstaged->m_pkthdr.rcvif, ni)); ieee80211_free_node(ni); m->m_nextpkt = NULL; mstaged->m_nextpkt = m; mstaged->m_flags |= M_FF; /* NB: mark for encap work */ } else { KASSERT(ni->ni_tx_superg[WME_AC_TO_TID(pri)] == NULL, ("ni_tx_superg[]: %p", ni->ni_tx_superg[WME_AC_TO_TID(pri)])); ni->ni_tx_superg[WME_AC_TO_TID(pri)] = m; stageq_add(ic, sq, m); IEEE80211_FF_UNLOCK(ic); IEEE80211_NOTE(vap, IEEE80211_MSG_SUPERG, ni, "%s: stage frame, %u queued", __func__, sq->depth); /* NB: mstaged is NULL */ } return mstaged; } struct mbuf * ieee80211_amsdu_check(struct ieee80211_node *ni, struct mbuf *m) { /* * XXX TODO: actually enforce the node support * and HTCAP requirements for the maximum A-MSDU * size. */ /* First: software A-MSDU transmit? */ if (! ieee80211_amsdu_tx_ok(ni)) return (m); /* Next - EAPOL? Nope, don't aggregate; we don't QoS encap them */ if (m->m_flags & (M_EAPOL | M_MCAST | M_BCAST)) return (m); /* Next - needs to be a data frame, non-broadcast, etc */ if (ETHER_IS_MULTICAST(mtod(m, struct ether_header *)->ether_dhost)) return (m); return (ieee80211_ff_check(ni, m)); } void ieee80211_ff_node_init(struct ieee80211_node *ni) { /* * Clean FF state on re-associate. This handles the case * where a station leaves w/o notifying us and then returns * before node is reaped for inactivity. */ ieee80211_ff_node_cleanup(ni); } void ieee80211_ff_node_cleanup(struct ieee80211_node *ni) { struct ieee80211com *ic = ni->ni_ic; struct ieee80211_superg *sg = ic->ic_superg; struct mbuf *m, *next_m, *head; int tid; IEEE80211_FF_LOCK(ic); head = NULL; for (tid = 0; tid < WME_NUM_TID; tid++) { int ac = TID_TO_WME_AC(tid); /* * XXX Initialise the packet counter. * * This may be double-work for 11n stations; * but without it we never setup things. */ ieee80211_txampdu_init_pps(&ni->ni_tx_ampdu[tid]); m = ni->ni_tx_superg[tid]; if (m != NULL) { ni->ni_tx_superg[tid] = NULL; stageq_remove(ic, &sg->ff_stageq[ac], m); m->m_nextpkt = head; head = m; } } IEEE80211_FF_UNLOCK(ic); /* * Free mbufs, taking care to not dereference the mbuf after * we free it (hence grabbing m_nextpkt before we free it.) */ m = head; while (m != NULL) { next_m = m->m_nextpkt; m_freem(m); ieee80211_free_node(ni); m = next_m; } } /* * Switch between turbo and non-turbo operating modes. * Use the specified channel flags to locate the new * channel, update 802.11 state, and then call back into * the driver to effect the change. */ void ieee80211_dturbo_switch(struct ieee80211vap *vap, int newflags) { struct ieee80211com *ic = vap->iv_ic; struct ieee80211_channel *chan; chan = ieee80211_find_channel(ic, ic->ic_bsschan->ic_freq, newflags); if (chan == NULL) { /* XXX should not happen */ IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: no channel with freq %u flags 0x%x\n", __func__, ic->ic_bsschan->ic_freq, newflags); return; } IEEE80211_DPRINTF(vap, IEEE80211_MSG_SUPERG, "%s: %s -> %s (freq %u flags 0x%x)\n", __func__, ieee80211_phymode_name[ieee80211_chan2mode(ic->ic_bsschan)], ieee80211_phymode_name[ieee80211_chan2mode(chan)], chan->ic_freq, chan->ic_flags); ic->ic_bsschan = chan; ic->ic_prevchan = ic->ic_curchan; ic->ic_curchan = chan; ic->ic_rt = ieee80211_get_ratetable(chan); ic->ic_set_channel(ic); ieee80211_radiotap_chan_change(ic); /* NB: do not need to reset ERP state 'cuz we're in sta mode */ } /* * Return the current ``state'' of an Atheros capbility. * If associated in station mode report the negotiated * setting. Otherwise report the current setting. */ static int getathcap(struct ieee80211vap *vap, int cap) { if (vap->iv_opmode == IEEE80211_M_STA && vap->iv_state == IEEE80211_S_RUN) return IEEE80211_ATH_CAP(vap, vap->iv_bss, cap) != 0; else return (vap->iv_flags & cap) != 0; } static int superg_ioctl_get80211(struct ieee80211vap *vap, struct ieee80211req *ireq) { switch (ireq->i_type) { case IEEE80211_IOC_FF: ireq->i_val = getathcap(vap, IEEE80211_F_FF); break; case IEEE80211_IOC_TURBOP: ireq->i_val = getathcap(vap, IEEE80211_F_TURBOP); break; default: return ENOSYS; } return 0; } IEEE80211_IOCTL_GET(superg, superg_ioctl_get80211); static int superg_ioctl_set80211(struct ieee80211vap *vap, struct ieee80211req *ireq) { switch (ireq->i_type) { case IEEE80211_IOC_FF: if (ireq->i_val) { if ((vap->iv_caps & IEEE80211_C_FF) == 0) return EOPNOTSUPP; vap->iv_flags |= IEEE80211_F_FF; } else vap->iv_flags &= ~IEEE80211_F_FF; return ENETRESET; case IEEE80211_IOC_TURBOP: if (ireq->i_val) { if ((vap->iv_caps & IEEE80211_C_TURBOP) == 0) return EOPNOTSUPP; vap->iv_flags |= IEEE80211_F_TURBOP; } else vap->iv_flags &= ~IEEE80211_F_TURBOP; return ENETRESET; default: return ENOSYS; } } IEEE80211_IOCTL_SET(superg, superg_ioctl_set80211); #endif /* IEEE80211_SUPPORT_SUPERG */ Index: projects/clang1000-import/sys/netgraph/ng_socket.c =================================================================== --- projects/clang1000-import/sys/netgraph/ng_socket.c (revision 358238) +++ projects/clang1000-import/sys/netgraph/ng_socket.c (revision 358239) @@ -1,1215 +1,1215 @@ /* * ng_socket.c */ /*- * Copyright (c) 1996-1999 Whistle Communications, Inc. * All rights reserved. * * Subject to the following obligations and disclaimer of warranty, use and * redistribution of this software, in source or object code forms, with or * without modifications are expressly permitted by Whistle Communications; * provided, however, that: * 1. Any and all reproductions of the source or object code must include the * copyright notice above and the following disclaimer of warranties; and * 2. No rights are granted, in any manner or form, to use Whistle * Communications, Inc. trademarks, including the mark "WHISTLE * COMMUNICATIONS" on advertising, endorsements, or otherwise except as * such appears in the above copyright notice or in the software. * * THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND * TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO * REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE, * INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. * WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY * REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS * SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE. * IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES * RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING * WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, * PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY * OF SUCH DAMAGE. * * Author: Julian Elischer * * $FreeBSD$ * $Whistle: ng_socket.c,v 1.28 1999/11/01 09:24:52 julian Exp $ */ /* * Netgraph socket nodes * * There are two types of netgraph sockets, control and data. * Control sockets have a netgraph node, but data sockets are * parasitic on control sockets, and have no node of their own. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef NG_SEPARATE_MALLOC static MALLOC_DEFINE(M_NETGRAPH_PATH, "netgraph_path", "netgraph path info"); static MALLOC_DEFINE(M_NETGRAPH_SOCK, "netgraph_sock", "netgraph socket info"); #else #define M_NETGRAPH_PATH M_NETGRAPH #define M_NETGRAPH_SOCK M_NETGRAPH #endif /* * It's Ascii-art time! * +-------------+ +-------------+ * |socket (ctl)| |socket (data)| * +-------------+ +-------------+ * ^ ^ * | | * v v * +-----------+ +-----------+ * |pcb (ctl)| |pcb (data)| * +-----------+ +-----------+ * ^ ^ * | | * v v * +--------------------------+ * | Socket type private | * | data | * +--------------------------+ * ^ * | * v * +----------------+ * | struct ng_node | * +----------------+ */ /* Netgraph node methods */ static ng_constructor_t ngs_constructor; static ng_rcvmsg_t ngs_rcvmsg; static ng_shutdown_t ngs_shutdown; static ng_newhook_t ngs_newhook; static ng_connect_t ngs_connect; static ng_findhook_t ngs_findhook; static ng_rcvdata_t ngs_rcvdata; static ng_disconnect_t ngs_disconnect; /* Internal methods */ static int ng_attach_data(struct socket *so); static int ng_attach_cntl(struct socket *so); static int ng_attach_common(struct socket *so, int type); static void ng_detach_common(struct ngpcb *pcbp, int type); static void ng_socket_free_priv(struct ngsock *priv); static int ng_connect_data(struct sockaddr *nam, struct ngpcb *pcbp); static int ng_bind(struct sockaddr *nam, struct ngpcb *pcbp); static int ngs_mod_event(module_t mod, int event, void *data); static void ng_socket_item_applied(void *context, int error); /* Netgraph type descriptor */ static struct ng_type typestruct = { .version = NG_ABI_VERSION, .name = NG_SOCKET_NODE_TYPE, .mod_event = ngs_mod_event, .constructor = ngs_constructor, .rcvmsg = ngs_rcvmsg, .shutdown = ngs_shutdown, .newhook = ngs_newhook, .connect = ngs_connect, .findhook = ngs_findhook, .rcvdata = ngs_rcvdata, .disconnect = ngs_disconnect, }; NETGRAPH_INIT_ORDERED(socket, &typestruct, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY); /* Buffer space */ static u_long ngpdg_sendspace = 20 * 1024; /* really max datagram size */ SYSCTL_ULONG(_net_graph, OID_AUTO, maxdgram, CTLFLAG_RW, &ngpdg_sendspace , 0, "Maximum outgoing Netgraph datagram size"); static u_long ngpdg_recvspace = 20 * 1024; SYSCTL_ULONG(_net_graph, OID_AUTO, recvspace, CTLFLAG_RW, &ngpdg_recvspace , 0, "Maximum space for incoming Netgraph datagrams"); /* List of all sockets (for netstat -f netgraph) */ static LIST_HEAD(, ngpcb) ngsocklist; static struct mtx ngsocketlist_mtx; #define sotongpcb(so) ((struct ngpcb *)(so)->so_pcb) /* If getting unexplained errors returned, set this to "kdb_enter("X"); */ #ifndef TRAP_ERROR #define TRAP_ERROR #endif struct hookpriv { LIST_ENTRY(hookpriv) next; hook_p hook; }; LIST_HEAD(ngshash, hookpriv); /* Per-node private data */ struct ngsock { struct ng_node *node; /* the associated netgraph node */ struct ngpcb *datasock; /* optional data socket */ struct ngpcb *ctlsock; /* optional control socket */ struct ngshash *hash; /* hash for hook names */ u_long hmask; /* hash mask */ int flags; int refs; struct mtx mtx; /* mtx to wait on */ int error; /* place to store error */ }; #define NGS_FLAG_NOLINGER 1 /* close with last hook */ /*************************************************************** Control sockets ***************************************************************/ static int ngc_attach(struct socket *so, int proto, struct thread *td) { struct ngpcb *const pcbp = sotongpcb(so); int error; error = priv_check(td, PRIV_NETGRAPH_CONTROL); if (error) return (error); if (pcbp != NULL) return (EISCONN); return (ng_attach_cntl(so)); } static void ngc_detach(struct socket *so) { struct ngpcb *const pcbp = sotongpcb(so); KASSERT(pcbp != NULL, ("ngc_detach: pcbp == NULL")); ng_detach_common(pcbp, NG_CONTROL); } static int ngc_send(struct socket *so, int flags, struct mbuf *m, struct sockaddr *addr, struct mbuf *control, struct thread *td) { - struct epoch_tracker et; struct ngpcb *const pcbp = sotongpcb(so); struct ngsock *const priv = NG_NODE_PRIVATE(pcbp->sockdata->node); struct sockaddr_ng *const sap = (struct sockaddr_ng *) addr; struct ng_mesg *msg; struct mbuf *m0; item_p item; char *path = NULL; int len, error = 0; struct ng_apply_info apply; if (control) { error = EINVAL; goto release; } /* Require destination as there may be >= 1 hooks on this node. */ if (addr == NULL) { error = EDESTADDRREQ; goto release; } /* * Allocate an expendable buffer for the path, chop off * the sockaddr header, and make sure it's NUL terminated. */ len = sap->sg_len - 2; path = malloc(len + 1, M_NETGRAPH_PATH, M_WAITOK); bcopy(sap->sg_data, path, len); path[len] = '\0'; /* * Move the actual message out of mbufs into a linear buffer. * Start by adding up the size of the data. (could use mh_len?) */ for (len = 0, m0 = m; m0 != NULL; m0 = m0->m_next) len += m0->m_len; /* * Move the data into a linear buffer as well. * Messages are not delivered in mbufs. */ msg = malloc(len + 1, M_NETGRAPH_MSG, M_WAITOK); m_copydata(m, 0, len, (char *)msg); if (msg->header.version != NG_VERSION) { free(msg, M_NETGRAPH_MSG); error = EINVAL; goto release; } /* * Hack alert! * We look into the message and if it mkpeers a node of unknown type, we * try to load it. We need to do this now, in syscall thread, because if * message gets queued and applied later we will get panic. */ if (msg->header.typecookie == NGM_GENERIC_COOKIE && msg->header.cmd == NGM_MKPEER) { struct ngm_mkpeer *const mkp = (struct ngm_mkpeer *) msg->data; if (ng_findtype(mkp->type) == NULL) { char filename[NG_TYPESIZ + 3]; int fileid; /* Not found, try to load it as a loadable module. */ snprintf(filename, sizeof(filename), "ng_%s", mkp->type); error = kern_kldload(curthread, filename, &fileid); if (error != 0) { free(msg, M_NETGRAPH_MSG); goto release; } /* See if type has been loaded successfully. */ if (ng_findtype(mkp->type) == NULL) { free(msg, M_NETGRAPH_MSG); (void)kern_kldunload(curthread, fileid, LINKER_UNLOAD_NORMAL); error = ENXIO; goto release; } } } item = ng_package_msg(msg, NG_WAITOK); if ((error = ng_address_path((pcbp->sockdata->node), item, path, 0)) != 0) { #ifdef TRACE_MESSAGES printf("ng_address_path: errx=%d\n", error); #endif goto release; } #ifdef TRACE_MESSAGES printf("[%x]:<---------[socket]: c=<%d>cmd=%x(%s) f=%x #%d (%s)\n", item->el_dest->nd_ID, msg->header.typecookie, msg->header.cmd, msg->header.cmdstr, msg->header.flags, msg->header.token, item->el_dest->nd_type->name); #endif SAVE_LINE(item); /* * We do not want to return from syscall until the item * is processed by destination node. We register callback * on the item, which will update priv->error when item * was applied. * If ng_snd_item() has queued item, we sleep until * callback wakes us up. */ bzero(&apply, sizeof(apply)); apply.apply = ng_socket_item_applied; apply.context = priv; item->apply = &apply; priv->error = -1; - NET_EPOCH_ENTER(et); error = ng_snd_item(item, 0); - NET_EPOCH_EXIT(et); mtx_lock(&priv->mtx); if (priv->error == -1) msleep(priv, &priv->mtx, 0, "ngsock", 0); mtx_unlock(&priv->mtx); KASSERT(priv->error != -1, ("ng_socket: priv->error wasn't updated")); error = priv->error; release: if (path != NULL) free(path, M_NETGRAPH_PATH); if (control != NULL) m_freem(control); if (m != NULL) m_freem(m); return (error); } static int ngc_bind(struct socket *so, struct sockaddr *nam, struct thread *td) { struct ngpcb *const pcbp = sotongpcb(so); if (pcbp == NULL) return (EINVAL); return (ng_bind(nam, pcbp)); } static int ngc_connect(struct socket *so, struct sockaddr *nam, struct thread *td) { /* * At this time refuse to do this.. it used to * do something but it was undocumented and not used. */ printf("program tried to connect control socket to remote node\n"); return (EINVAL); } /*************************************************************** Data sockets ***************************************************************/ static int ngd_attach(struct socket *so, int proto, struct thread *td) { struct ngpcb *const pcbp = sotongpcb(so); if (pcbp != NULL) return (EISCONN); return (ng_attach_data(so)); } static void ngd_detach(struct socket *so) { struct ngpcb *const pcbp = sotongpcb(so); KASSERT(pcbp != NULL, ("ngd_detach: pcbp == NULL")); ng_detach_common(pcbp, NG_DATA); } static int ngd_send(struct socket *so, int flags, struct mbuf *m, struct sockaddr *addr, struct mbuf *control, struct thread *td) { struct epoch_tracker et; struct ngpcb *const pcbp = sotongpcb(so); struct sockaddr_ng *const sap = (struct sockaddr_ng *) addr; int len, error; hook_p hook = NULL; + item_p item; char hookname[NG_HOOKSIZ]; if ((pcbp == NULL) || (control != NULL)) { error = EINVAL; goto release; } if (pcbp->sockdata == NULL) { error = ENOTCONN; goto release; } if (sap == NULL) len = 0; /* Make compiler happy. */ else len = sap->sg_len - 2; /* * If the user used any of these ways to not specify an address * then handle specially. */ if ((sap == NULL) || (len <= 0) || (*sap->sg_data == '\0')) { if (NG_NODE_NUMHOOKS(pcbp->sockdata->node) != 1) { error = EDESTADDRREQ; goto release; } /* * If exactly one hook exists, just use it. * Special case to allow write(2) to work on an ng_socket. */ hook = LIST_FIRST(&pcbp->sockdata->node->nd_hooks); } else { if (len >= NG_HOOKSIZ) { error = EINVAL; goto release; } /* * chop off the sockaddr header, and make sure it's NUL * terminated */ bcopy(sap->sg_data, hookname, len); hookname[len] = '\0'; /* Find the correct hook from 'hookname' */ hook = ng_findhook(pcbp->sockdata->node, hookname); if (hook == NULL) { error = EHOSTUNREACH; goto release; } } /* Send data. */ + item = ng_package_data(m, NG_WAITOK); + m = NULL; NET_EPOCH_ENTER(et); - NG_SEND_DATA_FLAGS(error, hook, m, NG_WAITOK); + NG_FWD_ITEM_HOOK(error, item, hook); NET_EPOCH_EXIT(et); release: if (control != NULL) m_freem(control); if (m != NULL) m_freem(m); return (error); } static int ngd_connect(struct socket *so, struct sockaddr *nam, struct thread *td) { struct ngpcb *const pcbp = sotongpcb(so); if (pcbp == NULL) return (EINVAL); return (ng_connect_data(nam, pcbp)); } /* * Used for both data and control sockets */ static int ng_getsockaddr(struct socket *so, struct sockaddr **addr) { struct ngpcb *pcbp; struct sockaddr_ng *sg; int sg_len; int error = 0; pcbp = sotongpcb(so); if ((pcbp == NULL) || (pcbp->sockdata == NULL)) /* XXXGL: can this still happen? */ return (EINVAL); sg_len = sizeof(struct sockaddr_ng) + NG_NODESIZ - sizeof(sg->sg_data); sg = malloc(sg_len, M_SONAME, M_WAITOK | M_ZERO); mtx_lock(&pcbp->sockdata->mtx); if (pcbp->sockdata->node != NULL) { node_p node = pcbp->sockdata->node; if (NG_NODE_HAS_NAME(node)) bcopy(NG_NODE_NAME(node), sg->sg_data, strlen(NG_NODE_NAME(node))); mtx_unlock(&pcbp->sockdata->mtx); sg->sg_len = sg_len; sg->sg_family = AF_NETGRAPH; *addr = (struct sockaddr *)sg; } else { mtx_unlock(&pcbp->sockdata->mtx); free(sg, M_SONAME); error = EINVAL; } return (error); } /* * Attach a socket to it's protocol specific partner. * For a control socket, actually create a netgraph node and attach * to it as well. */ static int ng_attach_cntl(struct socket *so) { struct ngsock *priv; struct ngpcb *pcbp; node_p node; int error; /* Setup protocol control block */ if ((error = ng_attach_common(so, NG_CONTROL)) != 0) return (error); pcbp = sotongpcb(so); /* Make the generic node components */ if ((error = ng_make_node_common(&typestruct, &node)) != 0) { ng_detach_common(pcbp, NG_CONTROL); return (error); } /* * Allocate node private info and hash. We start * with 16 hash entries, however we may grow later * in ngs_newhook(). We can't predict how much hooks * does this node plan to have. */ priv = malloc(sizeof(*priv), M_NETGRAPH_SOCK, M_WAITOK | M_ZERO); priv->hash = hashinit(16, M_NETGRAPH_SOCK, &priv->hmask); /* Initialize mutex. */ mtx_init(&priv->mtx, "ng_socket", NULL, MTX_DEF); /* Link the pcb the private data. */ priv->ctlsock = pcbp; pcbp->sockdata = priv; priv->refs++; priv->node = node; pcbp->node_id = node->nd_ID; /* hint for netstat(1) */ /* Link the node and the private data. */ NG_NODE_SET_PRIVATE(priv->node, priv); NG_NODE_REF(priv->node); priv->refs++; return (0); } static int ng_attach_data(struct socket *so) { return (ng_attach_common(so, NG_DATA)); } /* * Set up a socket protocol control block. * This code is shared between control and data sockets. */ static int ng_attach_common(struct socket *so, int type) { struct ngpcb *pcbp; int error; /* Standard socket setup stuff. */ error = soreserve(so, ngpdg_sendspace, ngpdg_recvspace); if (error) return (error); /* Allocate the pcb. */ pcbp = malloc(sizeof(struct ngpcb), M_PCB, M_WAITOK | M_ZERO); pcbp->type = type; /* Link the pcb and the socket. */ so->so_pcb = (caddr_t)pcbp; pcbp->ng_socket = so; /* Add the socket to linked list */ mtx_lock(&ngsocketlist_mtx); LIST_INSERT_HEAD(&ngsocklist, pcbp, socks); mtx_unlock(&ngsocketlist_mtx); return (0); } /* * Disassociate the socket from it's protocol specific * partner. If it's attached to a node's private data structure, * then unlink from that too. If we were the last socket attached to it, * then shut down the entire node. Shared code for control and data sockets. */ static void ng_detach_common(struct ngpcb *pcbp, int which) { struct ngsock *priv = pcbp->sockdata; if (priv != NULL) { mtx_lock(&priv->mtx); switch (which) { case NG_CONTROL: priv->ctlsock = NULL; break; case NG_DATA: priv->datasock = NULL; break; default: panic("%s", __func__); } pcbp->sockdata = NULL; pcbp->node_id = 0; ng_socket_free_priv(priv); } pcbp->ng_socket->so_pcb = NULL; mtx_lock(&ngsocketlist_mtx); LIST_REMOVE(pcbp, socks); mtx_unlock(&ngsocketlist_mtx); free(pcbp, M_PCB); } /* * Remove a reference from node private data. */ static void ng_socket_free_priv(struct ngsock *priv) { mtx_assert(&priv->mtx, MA_OWNED); priv->refs--; if (priv->refs == 0) { mtx_destroy(&priv->mtx); hashdestroy(priv->hash, M_NETGRAPH_SOCK, priv->hmask); free(priv, M_NETGRAPH_SOCK); return; } if ((priv->refs == 1) && (priv->node != NULL)) { node_p node = priv->node; priv->node = NULL; mtx_unlock(&priv->mtx); NG_NODE_UNREF(node); ng_rmnode_self(node); } else mtx_unlock(&priv->mtx); } /* * Connect the data socket to a named control socket node. */ static int ng_connect_data(struct sockaddr *nam, struct ngpcb *pcbp) { struct sockaddr_ng *sap; node_p farnode; struct ngsock *priv; int error; item_p item; /* If we are already connected, don't do it again. */ if (pcbp->sockdata != NULL) return (EISCONN); /* * Find the target (victim) and check it doesn't already have * a data socket. Also check it is a 'socket' type node. * Use ng_package_data() and ng_address_path() to do this. */ sap = (struct sockaddr_ng *) nam; /* The item will hold the node reference. */ item = ng_package_data(NULL, NG_WAITOK); if ((error = ng_address_path(NULL, item, sap->sg_data, 0))) return (error); /* item is freed on failure */ /* * Extract node from item and free item. Remember we now have * a reference on the node. The item holds it for us. * when we free the item we release the reference. */ farnode = item->el_dest; /* shortcut */ if (strcmp(farnode->nd_type->name, NG_SOCKET_NODE_TYPE) != 0) { NG_FREE_ITEM(item); /* drop the reference to the node */ return (EINVAL); } priv = NG_NODE_PRIVATE(farnode); if (priv->datasock != NULL) { NG_FREE_ITEM(item); /* drop the reference to the node */ return (EADDRINUSE); } /* * Link the PCB and the private data struct. and note the extra * reference. Drop the extra reference on the node. */ mtx_lock(&priv->mtx); priv->datasock = pcbp; pcbp->sockdata = priv; pcbp->node_id = priv->node->nd_ID; /* hint for netstat(1) */ priv->refs++; mtx_unlock(&priv->mtx); NG_FREE_ITEM(item); /* drop the reference to the node */ return (0); } /* * Binding a socket means giving the corresponding node a name */ static int ng_bind(struct sockaddr *nam, struct ngpcb *pcbp) { struct ngsock *const priv = pcbp->sockdata; struct sockaddr_ng *const sap = (struct sockaddr_ng *) nam; if (priv == NULL) { TRAP_ERROR; return (EINVAL); } if ((sap->sg_len < 4) || (sap->sg_len > (NG_NODESIZ + 2)) || (sap->sg_data[0] == '\0') || (sap->sg_data[sap->sg_len - 3] != '\0')) { TRAP_ERROR; return (EINVAL); } return (ng_name_node(priv->node, sap->sg_data)); } /*************************************************************** Netgraph node ***************************************************************/ /* * You can only create new nodes from the socket end of things. */ static int ngs_constructor(node_p nodep) { return (EINVAL); } static void ngs_rehash(node_p node) { struct ngsock *priv = NG_NODE_PRIVATE(node); struct ngshash *new; struct hookpriv *hp; hook_p hook; uint32_t h; u_long hmask; new = hashinit_flags((priv->hmask + 1) * 2, M_NETGRAPH_SOCK, &hmask, HASH_NOWAIT); if (new == NULL) return; LIST_FOREACH(hook, &node->nd_hooks, hk_hooks) { hp = NG_HOOK_PRIVATE(hook); #ifdef INVARIANTS LIST_REMOVE(hp, next); #endif h = hash32_str(NG_HOOK_NAME(hook), HASHINIT) & hmask; LIST_INSERT_HEAD(&new[h], hp, next); } hashdestroy(priv->hash, M_NETGRAPH_SOCK, priv->hmask); priv->hash = new; priv->hmask = hmask; } /* * We allow any hook to be connected to the node. * There is no per-hook private information though. */ static int ngs_newhook(node_p node, hook_p hook, const char *name) { struct ngsock *const priv = NG_NODE_PRIVATE(node); struct hookpriv *hp; uint32_t h; hp = malloc(sizeof(*hp), M_NETGRAPH_SOCK, M_NOWAIT); if (hp == NULL) return (ENOMEM); if (node->nd_numhooks * 2 > priv->hmask) ngs_rehash(node); hp->hook = hook; h = hash32_str(name, HASHINIT) & priv->hmask; LIST_INSERT_HEAD(&priv->hash[h], hp, next); NG_HOOK_SET_PRIVATE(hook, hp); return (0); } /* * If only one hook, allow read(2) and write(2) to work. */ static int ngs_connect(hook_p hook) { node_p node = NG_HOOK_NODE(hook); struct ngsock *priv = NG_NODE_PRIVATE(node); if ((priv->datasock) && (priv->datasock->ng_socket)) { if (NG_NODE_NUMHOOKS(node) == 1) priv->datasock->ng_socket->so_state |= SS_ISCONNECTED; else priv->datasock->ng_socket->so_state &= ~SS_ISCONNECTED; } return (0); } /* Look up hook by name */ static hook_p ngs_findhook(node_p node, const char *name) { struct ngsock *priv = NG_NODE_PRIVATE(node); struct hookpriv *hp; uint32_t h; /* * Microoptimisation for an ng_socket with * a single hook, which is a common case. */ if (node->nd_numhooks == 1) { hook_p hook; hook = LIST_FIRST(&node->nd_hooks); if (strcmp(NG_HOOK_NAME(hook), name) == 0) return (hook); else return (NULL); } h = hash32_str(name, HASHINIT) & priv->hmask; LIST_FOREACH(hp, &priv->hash[h], next) if (strcmp(NG_HOOK_NAME(hp->hook), name) == 0) return (hp->hook); return (NULL); } /* * Incoming messages get passed up to the control socket. * Unless they are for us specifically (socket_type) */ static int ngs_rcvmsg(node_p node, item_p item, hook_p lasthook) { struct ngsock *const priv = NG_NODE_PRIVATE(node); struct ngpcb *pcbp; struct socket *so; struct sockaddr_ng addr; struct ng_mesg *msg; struct mbuf *m; ng_ID_t retaddr = NGI_RETADDR(item); int addrlen; int error = 0; NGI_GET_MSG(item, msg); NG_FREE_ITEM(item); /* * Grab priv->mtx here to prevent destroying of control socket * after checking that priv->ctlsock is not NULL. */ mtx_lock(&priv->mtx); pcbp = priv->ctlsock; /* * Only allow mesgs to be passed if we have the control socket. * Data sockets can only support the generic messages. */ if (pcbp == NULL) { mtx_unlock(&priv->mtx); TRAP_ERROR; NG_FREE_MSG(msg); return (EINVAL); } so = pcbp->ng_socket; SOCKBUF_LOCK(&so->so_rcv); /* As long as the race is handled, priv->mtx may be unlocked now. */ mtx_unlock(&priv->mtx); #ifdef TRACE_MESSAGES printf("[%x]:---------->[socket]: c=<%d>cmd=%x(%s) f=%x #%d\n", retaddr, msg->header.typecookie, msg->header.cmd, msg->header.cmdstr, msg->header.flags, msg->header.token); #endif if (msg->header.typecookie == NGM_SOCKET_COOKIE) { switch (msg->header.cmd) { case NGM_SOCK_CMD_NOLINGER: priv->flags |= NGS_FLAG_NOLINGER; break; case NGM_SOCK_CMD_LINGER: priv->flags &= ~NGS_FLAG_NOLINGER; break; default: error = EINVAL; /* unknown command */ } SOCKBUF_UNLOCK(&so->so_rcv); /* Free the message and return. */ NG_FREE_MSG(msg); return (error); } /* Get the return address into a sockaddr. */ bzero(&addr, sizeof(addr)); addr.sg_len = sizeof(addr); addr.sg_family = AF_NETGRAPH; addrlen = snprintf((char *)&addr.sg_data, sizeof(addr.sg_data), "[%x]:", retaddr); if (addrlen < 0 || addrlen > sizeof(addr.sg_data)) { SOCKBUF_UNLOCK(&so->so_rcv); printf("%s: snprintf([%x]) failed - %d\n", __func__, retaddr, addrlen); NG_FREE_MSG(msg); return (EINVAL); } /* Copy the message itself into an mbuf chain. */ m = m_devget((caddr_t)msg, sizeof(struct ng_mesg) + msg->header.arglen, 0, NULL, NULL); /* * Here we free the message. We need to do that * regardless of whether we got mbufs. */ NG_FREE_MSG(msg); if (m == NULL) { SOCKBUF_UNLOCK(&so->so_rcv); TRAP_ERROR; return (ENOBUFS); } /* Send it up to the socket. */ if (sbappendaddr_locked(&so->so_rcv, (struct sockaddr *)&addr, m, NULL) == 0) { SOCKBUF_UNLOCK(&so->so_rcv); TRAP_ERROR; m_freem(m); return (ENOBUFS); } sorwakeup_locked(so); return (error); } /* * Receive data on a hook */ static int ngs_rcvdata(hook_p hook, item_p item) { struct ngsock *const priv = NG_NODE_PRIVATE(NG_HOOK_NODE(hook)); struct ngpcb *const pcbp = priv->datasock; struct socket *so; struct sockaddr_ng *addr; char *addrbuf[NG_HOOKSIZ + 4]; int addrlen; struct mbuf *m; NGI_GET_M(item, m); NG_FREE_ITEM(item); /* If there is no data socket, black-hole it. */ if (pcbp == NULL) { NG_FREE_M(m); return (0); } so = pcbp->ng_socket; /* Get the return address into a sockaddr. */ addrlen = strlen(NG_HOOK_NAME(hook)); /* <= NG_HOOKSIZ - 1 */ addr = (struct sockaddr_ng *) addrbuf; addr->sg_len = addrlen + 3; addr->sg_family = AF_NETGRAPH; bcopy(NG_HOOK_NAME(hook), addr->sg_data, addrlen); addr->sg_data[addrlen] = '\0'; /* Try to tell the socket which hook it came in on. */ if (sbappendaddr(&so->so_rcv, (struct sockaddr *)addr, m, NULL) == 0) { m_freem(m); TRAP_ERROR; return (ENOBUFS); } sorwakeup(so); return (0); } /* * Hook disconnection * * For this type, removal of the last link destroys the node * if the NOLINGER flag is set. */ static int ngs_disconnect(hook_p hook) { node_p node = NG_HOOK_NODE(hook); struct ngsock *const priv = NG_NODE_PRIVATE(node); struct hookpriv *hp = NG_HOOK_PRIVATE(hook); LIST_REMOVE(hp, next); free(hp, M_NETGRAPH_SOCK); if ((priv->datasock) && (priv->datasock->ng_socket)) { if (NG_NODE_NUMHOOKS(node) == 1) priv->datasock->ng_socket->so_state |= SS_ISCONNECTED; else priv->datasock->ng_socket->so_state &= ~SS_ISCONNECTED; } if ((priv->flags & NGS_FLAG_NOLINGER) && (NG_NODE_NUMHOOKS(node) == 0) && (NG_NODE_IS_VALID(node))) ng_rmnode_self(node); return (0); } /* * Do local shutdown processing. * In this case, that involves making sure the socket * knows we should be shutting down. */ static int ngs_shutdown(node_p node) { struct ngsock *const priv = NG_NODE_PRIVATE(node); struct ngpcb *dpcbp, *pcbp; mtx_lock(&priv->mtx); dpcbp = priv->datasock; pcbp = priv->ctlsock; if (dpcbp != NULL) soisdisconnected(dpcbp->ng_socket); if (pcbp != NULL) soisdisconnected(pcbp->ng_socket); priv->node = NULL; NG_NODE_SET_PRIVATE(node, NULL); ng_socket_free_priv(priv); NG_NODE_UNREF(node); return (0); } static void ng_socket_item_applied(void *context, int error) { struct ngsock *const priv = (struct ngsock *)context; mtx_lock(&priv->mtx); priv->error = error; wakeup(priv); mtx_unlock(&priv->mtx); } static int dummy_disconnect(struct socket *so) { return (0); } /* * Control and data socket type descriptors * * XXXRW: Perhaps _close should do something? */ static struct pr_usrreqs ngc_usrreqs = { .pru_abort = NULL, .pru_attach = ngc_attach, .pru_bind = ngc_bind, .pru_connect = ngc_connect, .pru_detach = ngc_detach, .pru_disconnect = dummy_disconnect, .pru_peeraddr = NULL, .pru_send = ngc_send, .pru_shutdown = NULL, .pru_sockaddr = ng_getsockaddr, .pru_close = NULL, }; static struct pr_usrreqs ngd_usrreqs = { .pru_abort = NULL, .pru_attach = ngd_attach, .pru_bind = NULL, .pru_connect = ngd_connect, .pru_detach = ngd_detach, .pru_disconnect = dummy_disconnect, .pru_peeraddr = NULL, .pru_send = ngd_send, .pru_shutdown = NULL, .pru_sockaddr = ng_getsockaddr, .pru_close = NULL, }; /* * Definitions of protocols supported in the NETGRAPH domain. */ extern struct domain ngdomain; /* stop compiler warnings */ static struct protosw ngsw[] = { { .pr_type = SOCK_DGRAM, .pr_domain = &ngdomain, .pr_protocol = NG_CONTROL, .pr_flags = PR_ATOMIC | PR_ADDR /* | PR_RIGHTS */, .pr_usrreqs = &ngc_usrreqs }, { .pr_type = SOCK_DGRAM, .pr_domain = &ngdomain, .pr_protocol = NG_DATA, .pr_flags = PR_ATOMIC | PR_ADDR, .pr_usrreqs = &ngd_usrreqs } }; struct domain ngdomain = { .dom_family = AF_NETGRAPH, .dom_name = "netgraph", .dom_protosw = ngsw, .dom_protoswNPROTOSW = &ngsw[nitems(ngsw)] }; /* * Handle loading and unloading for this node type. * This is to handle auxiliary linkages (e.g protocol domain addition). */ static int ngs_mod_event(module_t mod, int event, void *data) { int error = 0; switch (event) { case MOD_LOAD: mtx_init(&ngsocketlist_mtx, "ng_socketlist", NULL, MTX_DEF); break; case MOD_UNLOAD: /* Ensure there are no open netgraph sockets. */ if (!LIST_EMPTY(&ngsocklist)) { error = EBUSY; break; } #ifdef NOTYET /* Unregister protocol domain XXX can't do this yet.. */ #endif error = EBUSY; break; default: error = EOPNOTSUPP; break; } return (error); } VNET_DOMAIN_SET(ng); SYSCTL_INT(_net_graph, OID_AUTO, family, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, AF_NETGRAPH, ""); static SYSCTL_NODE(_net_graph, OID_AUTO, data, CTLFLAG_RW, 0, "DATA"); SYSCTL_INT(_net_graph_data, OID_AUTO, proto, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, NG_DATA, ""); static SYSCTL_NODE(_net_graph, OID_AUTO, control, CTLFLAG_RW, 0, "CONTROL"); SYSCTL_INT(_net_graph_control, OID_AUTO, proto, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, NG_CONTROL, ""); Index: projects/clang1000-import/sys/netinet/ip_carp.c =================================================================== --- projects/clang1000-import/sys/netinet/ip_carp.c (revision 358238) +++ projects/clang1000-import/sys/netinet/ip_carp.c (revision 358239) @@ -1,2306 +1,2309 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2002 Michael Shalayeff. * Copyright (c) 2003 Ryan McBride. * Copyright (c) 2011 Gleb Smirnoff * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR OR HIS RELATIVES BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF MIND, USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_bpf.h" #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(INET) || defined(INET6) #include #include #include #include #include #endif #ifdef INET #include #include #endif #ifdef INET6 #include #include #include #include #include #include #endif #include static MALLOC_DEFINE(M_CARP, "CARP", "CARP addresses"); struct carp_softc { struct ifnet *sc_carpdev; /* Pointer to parent ifnet. */ struct ifaddr **sc_ifas; /* Our ifaddrs. */ struct sockaddr_dl sc_addr; /* Our link level address. */ struct callout sc_ad_tmo; /* Advertising timeout. */ #ifdef INET struct callout sc_md_tmo; /* Master down timeout. */ #endif #ifdef INET6 struct callout sc_md6_tmo; /* XXX: Master down timeout. */ #endif struct mtx sc_mtx; int sc_vhid; int sc_advskew; int sc_advbase; int sc_naddrs; int sc_naddrs6; int sc_ifasiz; enum { INIT = 0, BACKUP, MASTER } sc_state; int sc_suppress; int sc_sendad_errors; #define CARP_SENDAD_MAX_ERRORS 3 int sc_sendad_success; #define CARP_SENDAD_MIN_SUCCESS 3 int sc_init_counter; uint64_t sc_counter; /* authentication */ #define CARP_HMAC_PAD 64 unsigned char sc_key[CARP_KEY_LEN]; unsigned char sc_pad[CARP_HMAC_PAD]; SHA1_CTX sc_sha1; TAILQ_ENTRY(carp_softc) sc_list; /* On the carp_if list. */ LIST_ENTRY(carp_softc) sc_next; /* On the global list. */ }; struct carp_if { #ifdef INET int cif_naddrs; #endif #ifdef INET6 int cif_naddrs6; #endif TAILQ_HEAD(, carp_softc) cif_vrs; #ifdef INET struct ip_moptions cif_imo; #endif #ifdef INET6 struct ip6_moptions cif_im6o; #endif struct ifnet *cif_ifp; struct mtx cif_mtx; uint32_t cif_flags; #define CIF_PROMISC 0x00000001 }; #define CARP_INET 0 #define CARP_INET6 1 static int proto_reg[] = {-1, -1}; /* * Brief design of carp(4). * * Any carp-capable ifnet may have a list of carp softcs hanging off * its ifp->if_carp pointer. Each softc represents one unique virtual * host id, or vhid. The softc has a back pointer to the ifnet. All * softcs are joined in a global list, which has quite limited use. * * Any interface address that takes part in CARP negotiation has a * pointer to the softc of its vhid, ifa->ifa_carp. That could be either * AF_INET or AF_INET6 address. * * Although, one can get the softc's backpointer to ifnet and traverse * through its ifp->if_addrhead queue to find all interface addresses * involved in CARP, we keep a growable array of ifaddr pointers. This * allows us to avoid grabbing the IF_ADDR_LOCK() in many traversals that * do calls into the network stack, thus avoiding LORs. * * Locking: * * Each softc has a lock sc_mtx. It is used to synchronise carp_input_c(), * callout-driven events and ioctl()s. * * To traverse the list of softcs on an ifnet we use CIF_LOCK() or carp_sx. * To traverse the global list we use the mutex carp_mtx. * * Known issues with locking: * * - Sending ad, we put the pointer to the softc in an mtag, and no reference * counting is done on the softc. * - On module unload we may race (?) with packet processing thread * dereferencing our function pointers. */ /* Accept incoming CARP packets. */ VNET_DEFINE_STATIC(int, carp_allow) = 1; #define V_carp_allow VNET(carp_allow) /* Set DSCP in outgoing CARP packets. */ VNET_DEFINE_STATIC(int, carp_dscp) = 56; #define V_carp_dscp VNET(carp_dscp) /* Preempt slower nodes. */ VNET_DEFINE_STATIC(int, carp_preempt) = 0; #define V_carp_preempt VNET(carp_preempt) /* Log level. */ VNET_DEFINE_STATIC(int, carp_log) = 1; #define V_carp_log VNET(carp_log) /* Global advskew demotion. */ VNET_DEFINE_STATIC(int, carp_demotion) = 0; #define V_carp_demotion VNET(carp_demotion) /* Send error demotion factor. */ VNET_DEFINE_STATIC(int, carp_senderr_adj) = CARP_MAXSKEW; #define V_carp_senderr_adj VNET(carp_senderr_adj) /* Iface down demotion factor. */ VNET_DEFINE_STATIC(int, carp_ifdown_adj) = CARP_MAXSKEW; #define V_carp_ifdown_adj VNET(carp_ifdown_adj) static int carp_allow_sysctl(SYSCTL_HANDLER_ARGS); static int carp_dscp_sysctl(SYSCTL_HANDLER_ARGS); static int carp_demote_adj_sysctl(SYSCTL_HANDLER_ARGS); -SYSCTL_NODE(_net_inet, IPPROTO_CARP, carp, CTLFLAG_RW, 0, "CARP"); +SYSCTL_NODE(_net_inet, IPPROTO_CARP, carp, CTLFLAG_RW | CTLFLAG_MPSAFE, 0, + "CARP"); SYSCTL_PROC(_net_inet_carp, OID_AUTO, allow, - CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW, 0, 0, carp_allow_sysctl, "I", + CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + 0, 0, carp_allow_sysctl, "I", "Accept incoming CARP packets"); SYSCTL_PROC(_net_inet_carp, OID_AUTO, dscp, - CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW, 0, 0, carp_dscp_sysctl, "I", + CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, + 0, 0, carp_dscp_sysctl, "I", "DSCP value for carp packets"); SYSCTL_INT(_net_inet_carp, OID_AUTO, preempt, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(carp_preempt), 0, "High-priority backup preemption mode"); SYSCTL_INT(_net_inet_carp, OID_AUTO, log, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(carp_log), 0, "CARP log level"); SYSCTL_PROC(_net_inet_carp, OID_AUTO, demotion, - CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW, + CTLFLAG_VNET | CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, 0, 0, carp_demote_adj_sysctl, "I", "Adjust demotion factor (skew of advskew)"); SYSCTL_INT(_net_inet_carp, OID_AUTO, senderr_demotion_factor, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(carp_senderr_adj), 0, "Send error demotion factor adjustment"); SYSCTL_INT(_net_inet_carp, OID_AUTO, ifdown_demotion_factor, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(carp_ifdown_adj), 0, "Interface down demotion factor adjustment"); VNET_PCPUSTAT_DEFINE(struct carpstats, carpstats); VNET_PCPUSTAT_SYSINIT(carpstats); VNET_PCPUSTAT_SYSUNINIT(carpstats); #define CARPSTATS_ADD(name, val) \ counter_u64_add(VNET(carpstats)[offsetof(struct carpstats, name) / \ sizeof(uint64_t)], (val)) #define CARPSTATS_INC(name) CARPSTATS_ADD(name, 1) SYSCTL_VNET_PCPUSTAT(_net_inet_carp, OID_AUTO, stats, struct carpstats, carpstats, "CARP statistics (struct carpstats, netinet/ip_carp.h)"); #define CARP_LOCK_INIT(sc) mtx_init(&(sc)->sc_mtx, "carp_softc", \ NULL, MTX_DEF) #define CARP_LOCK_DESTROY(sc) mtx_destroy(&(sc)->sc_mtx) #define CARP_LOCK_ASSERT(sc) mtx_assert(&(sc)->sc_mtx, MA_OWNED) #define CARP_LOCK(sc) mtx_lock(&(sc)->sc_mtx) #define CARP_UNLOCK(sc) mtx_unlock(&(sc)->sc_mtx) #define CIF_LOCK_INIT(cif) mtx_init(&(cif)->cif_mtx, "carp_if", \ NULL, MTX_DEF) #define CIF_LOCK_DESTROY(cif) mtx_destroy(&(cif)->cif_mtx) #define CIF_LOCK_ASSERT(cif) mtx_assert(&(cif)->cif_mtx, MA_OWNED) #define CIF_LOCK(cif) mtx_lock(&(cif)->cif_mtx) #define CIF_UNLOCK(cif) mtx_unlock(&(cif)->cif_mtx) #define CIF_FREE(cif) do { \ CIF_LOCK(cif); \ if (TAILQ_EMPTY(&(cif)->cif_vrs)) \ carp_free_if(cif); \ else \ CIF_UNLOCK(cif); \ } while (0) #define CARP_LOG(...) do { \ if (V_carp_log > 0) \ log(LOG_INFO, "carp: " __VA_ARGS__); \ } while (0) #define CARP_DEBUG(...) do { \ if (V_carp_log > 1) \ log(LOG_DEBUG, __VA_ARGS__); \ } while (0) #define IFNET_FOREACH_IFA(ifp, ifa) \ CK_STAILQ_FOREACH((ifa), &(ifp)->if_addrhead, ifa_link) \ if ((ifa)->ifa_carp != NULL) #define CARP_FOREACH_IFA(sc, ifa) \ CARP_LOCK_ASSERT(sc); \ for (int _i = 0; \ _i < (sc)->sc_naddrs + (sc)->sc_naddrs6 && \ ((ifa) = sc->sc_ifas[_i]) != NULL; \ ++_i) #define IFNET_FOREACH_CARP(ifp, sc) \ KASSERT(mtx_owned(&ifp->if_carp->cif_mtx) || \ sx_xlocked(&carp_sx), ("cif_vrs not locked")); \ TAILQ_FOREACH((sc), &(ifp)->if_carp->cif_vrs, sc_list) #define DEMOTE_ADVSKEW(sc) \ (((sc)->sc_advskew + V_carp_demotion > CARP_MAXSKEW) ? \ CARP_MAXSKEW : ((sc)->sc_advskew + V_carp_demotion)) static void carp_input_c(struct mbuf *, struct carp_header *, sa_family_t); static struct carp_softc *carp_alloc(struct ifnet *); static void carp_destroy(struct carp_softc *); static struct carp_if *carp_alloc_if(struct ifnet *); static void carp_free_if(struct carp_if *); static void carp_set_state(struct carp_softc *, int, const char* reason); static void carp_sc_state(struct carp_softc *); static void carp_setrun(struct carp_softc *, sa_family_t); static void carp_master_down(void *); static void carp_master_down_locked(struct carp_softc *, const char* reason); static void carp_send_ad(void *); static void carp_send_ad_locked(struct carp_softc *); static void carp_addroute(struct carp_softc *); static void carp_ifa_addroute(struct ifaddr *); static void carp_delroute(struct carp_softc *); static void carp_ifa_delroute(struct ifaddr *); static void carp_send_ad_all(void *, int); static void carp_demote_adj(int, char *); static LIST_HEAD(, carp_softc) carp_list; static struct mtx carp_mtx; static struct sx carp_sx; static struct task carp_sendall_task = TASK_INITIALIZER(0, carp_send_ad_all, NULL); static void carp_hmac_prepare(struct carp_softc *sc) { uint8_t version = CARP_VERSION, type = CARP_ADVERTISEMENT; uint8_t vhid = sc->sc_vhid & 0xff; struct ifaddr *ifa; int i, found; #ifdef INET struct in_addr last, cur, in; #endif #ifdef INET6 struct in6_addr last6, cur6, in6; #endif CARP_LOCK_ASSERT(sc); /* Compute ipad from key. */ bzero(sc->sc_pad, sizeof(sc->sc_pad)); bcopy(sc->sc_key, sc->sc_pad, sizeof(sc->sc_key)); for (i = 0; i < sizeof(sc->sc_pad); i++) sc->sc_pad[i] ^= 0x36; /* Precompute first part of inner hash. */ SHA1Init(&sc->sc_sha1); SHA1Update(&sc->sc_sha1, sc->sc_pad, sizeof(sc->sc_pad)); SHA1Update(&sc->sc_sha1, (void *)&version, sizeof(version)); SHA1Update(&sc->sc_sha1, (void *)&type, sizeof(type)); SHA1Update(&sc->sc_sha1, (void *)&vhid, sizeof(vhid)); #ifdef INET cur.s_addr = 0; do { found = 0; last = cur; cur.s_addr = 0xffffffff; CARP_FOREACH_IFA(sc, ifa) { in.s_addr = ifatoia(ifa)->ia_addr.sin_addr.s_addr; if (ifa->ifa_addr->sa_family == AF_INET && ntohl(in.s_addr) > ntohl(last.s_addr) && ntohl(in.s_addr) < ntohl(cur.s_addr)) { cur.s_addr = in.s_addr; found++; } } if (found) SHA1Update(&sc->sc_sha1, (void *)&cur, sizeof(cur)); } while (found); #endif /* INET */ #ifdef INET6 memset(&cur6, 0, sizeof(cur6)); do { found = 0; last6 = cur6; memset(&cur6, 0xff, sizeof(cur6)); CARP_FOREACH_IFA(sc, ifa) { in6 = ifatoia6(ifa)->ia_addr.sin6_addr; if (IN6_IS_SCOPE_EMBED(&in6)) in6.s6_addr16[1] = 0; if (ifa->ifa_addr->sa_family == AF_INET6 && memcmp(&in6, &last6, sizeof(in6)) > 0 && memcmp(&in6, &cur6, sizeof(in6)) < 0) { cur6 = in6; found++; } } if (found) SHA1Update(&sc->sc_sha1, (void *)&cur6, sizeof(cur6)); } while (found); #endif /* INET6 */ /* convert ipad to opad */ for (i = 0; i < sizeof(sc->sc_pad); i++) sc->sc_pad[i] ^= 0x36 ^ 0x5c; } static void carp_hmac_generate(struct carp_softc *sc, uint32_t counter[2], unsigned char md[20]) { SHA1_CTX sha1ctx; CARP_LOCK_ASSERT(sc); /* fetch first half of inner hash */ bcopy(&sc->sc_sha1, &sha1ctx, sizeof(sha1ctx)); SHA1Update(&sha1ctx, (void *)counter, sizeof(sc->sc_counter)); SHA1Final(md, &sha1ctx); /* outer hash */ SHA1Init(&sha1ctx); SHA1Update(&sha1ctx, sc->sc_pad, sizeof(sc->sc_pad)); SHA1Update(&sha1ctx, md, 20); SHA1Final(md, &sha1ctx); } static int carp_hmac_verify(struct carp_softc *sc, uint32_t counter[2], unsigned char md[20]) { unsigned char md2[20]; CARP_LOCK_ASSERT(sc); carp_hmac_generate(sc, counter, md2); return (bcmp(md, md2, sizeof(md2))); } /* * process input packet. * we have rearranged checks order compared to the rfc, * but it seems more efficient this way or not possible otherwise. */ #ifdef INET int carp_input(struct mbuf **mp, int *offp, int proto) { struct mbuf *m = *mp; struct ip *ip = mtod(m, struct ip *); struct carp_header *ch; int iplen, len; iplen = *offp; *mp = NULL; CARPSTATS_INC(carps_ipackets); if (!V_carp_allow) { m_freem(m); return (IPPROTO_DONE); } /* verify that the IP TTL is 255. */ if (ip->ip_ttl != CARP_DFLTTL) { CARPSTATS_INC(carps_badttl); CARP_DEBUG("%s: received ttl %d != 255 on %s\n", __func__, ip->ip_ttl, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } iplen = ip->ip_hl << 2; if (m->m_pkthdr.len < iplen + sizeof(*ch)) { CARPSTATS_INC(carps_badlen); CARP_DEBUG("%s: received len %zd < sizeof(struct carp_header) " "on %s\n", __func__, m->m_len - sizeof(struct ip), m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } if (iplen + sizeof(*ch) < m->m_len) { if ((m = m_pullup(m, iplen + sizeof(*ch))) == NULL) { CARPSTATS_INC(carps_hdrops); CARP_DEBUG("%s: pullup failed\n", __func__); return (IPPROTO_DONE); } ip = mtod(m, struct ip *); } ch = (struct carp_header *)((char *)ip + iplen); /* * verify that the received packet length is * equal to the CARP header */ len = iplen + sizeof(*ch); if (len > m->m_pkthdr.len) { CARPSTATS_INC(carps_badlen); CARP_DEBUG("%s: packet too short %d on %s\n", __func__, m->m_pkthdr.len, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } if ((m = m_pullup(m, len)) == NULL) { CARPSTATS_INC(carps_hdrops); return (IPPROTO_DONE); } ip = mtod(m, struct ip *); ch = (struct carp_header *)((char *)ip + iplen); /* verify the CARP checksum */ m->m_data += iplen; if (in_cksum(m, len - iplen)) { CARPSTATS_INC(carps_badsum); CARP_DEBUG("%s: checksum failed on %s\n", __func__, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } m->m_data -= iplen; carp_input_c(m, ch, AF_INET); return (IPPROTO_DONE); } #endif #ifdef INET6 int carp6_input(struct mbuf **mp, int *offp, int proto) { struct mbuf *m = *mp; struct ip6_hdr *ip6 = mtod(m, struct ip6_hdr *); struct carp_header *ch; u_int len; CARPSTATS_INC(carps_ipackets6); if (!V_carp_allow) { m_freem(m); return (IPPROTO_DONE); } /* check if received on a valid carp interface */ if (m->m_pkthdr.rcvif->if_carp == NULL) { CARPSTATS_INC(carps_badif); CARP_DEBUG("%s: packet received on non-carp interface: %s\n", __func__, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } /* verify that the IP TTL is 255 */ if (ip6->ip6_hlim != CARP_DFLTTL) { CARPSTATS_INC(carps_badttl); CARP_DEBUG("%s: received ttl %d != 255 on %s\n", __func__, ip6->ip6_hlim, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } /* verify that we have a complete carp packet */ if (m->m_len < *offp + sizeof(*ch)) { len = m->m_len; m = m_pullup(m, *offp + sizeof(*ch)); if (m == NULL) { CARPSTATS_INC(carps_badlen); CARP_DEBUG("%s: packet size %u too small\n", __func__, len); return (IPPROTO_DONE); } } ch = (struct carp_header *)(mtod(m, char *) + *offp); /* verify the CARP checksum */ m->m_data += *offp; if (in_cksum(m, sizeof(*ch))) { CARPSTATS_INC(carps_badsum); CARP_DEBUG("%s: checksum failed, on %s\n", __func__, m->m_pkthdr.rcvif->if_xname); m_freem(m); return (IPPROTO_DONE); } m->m_data -= *offp; carp_input_c(m, ch, AF_INET6); return (IPPROTO_DONE); } #endif /* INET6 */ /* * This routine should not be necessary at all, but some switches * (VMWare ESX vswitches) can echo our own packets back at us, * and we must ignore them or they will cause us to drop out of * MASTER mode. * * We cannot catch all cases of network loops. Instead, what we * do here is catch any packet that arrives with a carp header * with a VHID of 0, that comes from an address that is our own. * These packets are by definition "from us" (even if they are from * a misconfigured host that is pretending to be us). * * The VHID test is outside this mini-function. */ static int carp_source_is_self(struct mbuf *m, struct ifaddr *ifa, sa_family_t af) { #ifdef INET struct ip *ip4; struct in_addr in4; #endif #ifdef INET6 struct ip6_hdr *ip6; struct in6_addr in6; #endif switch (af) { #ifdef INET case AF_INET: ip4 = mtod(m, struct ip *); in4 = ifatoia(ifa)->ia_addr.sin_addr; return (in4.s_addr == ip4->ip_src.s_addr); #endif #ifdef INET6 case AF_INET6: ip6 = mtod(m, struct ip6_hdr *); in6 = ifatoia6(ifa)->ia_addr.sin6_addr; return (memcmp(&in6, &ip6->ip6_src, sizeof(in6)) == 0); #endif default: break; } return (0); } static void carp_input_c(struct mbuf *m, struct carp_header *ch, sa_family_t af) { struct ifnet *ifp = m->m_pkthdr.rcvif; struct ifaddr *ifa, *match; struct carp_softc *sc; uint64_t tmp_counter; struct timeval sc_tv, ch_tv; int error; NET_EPOCH_ASSERT(); /* * Verify that the VHID is valid on the receiving interface. * * There should be just one match. If there are none * the VHID is not valid and we drop the packet. If * there are multiple VHID matches, take just the first * one, for compatibility with previous code. While we're * scanning, check for obvious loops in the network topology * (these should never happen, and as noted above, we may * miss real loops; this is just a double-check). */ error = 0; match = NULL; IFNET_FOREACH_IFA(ifp, ifa) { if (match == NULL && ifa->ifa_carp != NULL && ifa->ifa_addr->sa_family == af && ifa->ifa_carp->sc_vhid == ch->carp_vhid) match = ifa; if (ch->carp_vhid == 0 && carp_source_is_self(m, ifa, af)) error = ELOOP; } ifa = error ? NULL : match; if (ifa != NULL) ifa_ref(ifa); if (ifa == NULL) { if (error == ELOOP) { CARP_DEBUG("dropping looped packet on interface %s\n", ifp->if_xname); CARPSTATS_INC(carps_badif); /* ??? */ } else { CARPSTATS_INC(carps_badvhid); } m_freem(m); return; } /* verify the CARP version. */ if (ch->carp_version != CARP_VERSION) { CARPSTATS_INC(carps_badver); CARP_DEBUG("%s: invalid version %d\n", ifp->if_xname, ch->carp_version); ifa_free(ifa); m_freem(m); return; } sc = ifa->ifa_carp; CARP_LOCK(sc); ifa_free(ifa); if (carp_hmac_verify(sc, ch->carp_counter, ch->carp_md)) { CARPSTATS_INC(carps_badauth); CARP_DEBUG("%s: incorrect hash for VHID %u@%s\n", __func__, sc->sc_vhid, ifp->if_xname); goto out; } tmp_counter = ntohl(ch->carp_counter[0]); tmp_counter = tmp_counter<<32; tmp_counter += ntohl(ch->carp_counter[1]); /* XXX Replay protection goes here */ sc->sc_init_counter = 0; sc->sc_counter = tmp_counter; sc_tv.tv_sec = sc->sc_advbase; sc_tv.tv_usec = DEMOTE_ADVSKEW(sc) * 1000000 / 256; ch_tv.tv_sec = ch->carp_advbase; ch_tv.tv_usec = ch->carp_advskew * 1000000 / 256; switch (sc->sc_state) { case INIT: break; case MASTER: /* * If we receive an advertisement from a master who's going to * be more frequent than us, go into BACKUP state. */ if (timevalcmp(&sc_tv, &ch_tv, >) || timevalcmp(&sc_tv, &ch_tv, ==)) { callout_stop(&sc->sc_ad_tmo); carp_set_state(sc, BACKUP, "more frequent advertisement received"); carp_setrun(sc, 0); carp_delroute(sc); } break; case BACKUP: /* * If we're pre-empting masters who advertise slower than us, * and this one claims to be slower, treat him as down. */ if (V_carp_preempt && timevalcmp(&sc_tv, &ch_tv, <)) { carp_master_down_locked(sc, "preempting a slower master"); break; } /* * If the master is going to advertise at such a low frequency * that he's guaranteed to time out, we'd might as well just * treat him as timed out now. */ sc_tv.tv_sec = sc->sc_advbase * 3; if (timevalcmp(&sc_tv, &ch_tv, <)) { carp_master_down_locked(sc, "master will time out"); break; } /* * Otherwise, we reset the counter and wait for the next * advertisement. */ carp_setrun(sc, af); break; } out: CARP_UNLOCK(sc); m_freem(m); } static int carp_prepare_ad(struct mbuf *m, struct carp_softc *sc, struct carp_header *ch) { struct m_tag *mtag; if (sc->sc_init_counter) { /* this could also be seconds since unix epoch */ sc->sc_counter = arc4random(); sc->sc_counter = sc->sc_counter << 32; sc->sc_counter += arc4random(); } else sc->sc_counter++; ch->carp_counter[0] = htonl((sc->sc_counter>>32)&0xffffffff); ch->carp_counter[1] = htonl(sc->sc_counter&0xffffffff); carp_hmac_generate(sc, ch->carp_counter, ch->carp_md); /* Tag packet for carp_output */ if ((mtag = m_tag_get(PACKET_TAG_CARP, sizeof(struct carp_softc *), M_NOWAIT)) == NULL) { m_freem(m); CARPSTATS_INC(carps_onomem); return (ENOMEM); } bcopy(&sc, mtag + 1, sizeof(sc)); m_tag_prepend(m, mtag); return (0); } /* * To avoid LORs and possible recursions this function shouldn't * be called directly, but scheduled via taskqueue. */ static void carp_send_ad_all(void *ctx __unused, int pending __unused) { struct carp_softc *sc; mtx_lock(&carp_mtx); LIST_FOREACH(sc, &carp_list, sc_next) if (sc->sc_state == MASTER) { CARP_LOCK(sc); CURVNET_SET(sc->sc_carpdev->if_vnet); carp_send_ad_locked(sc); CURVNET_RESTORE(); CARP_UNLOCK(sc); } mtx_unlock(&carp_mtx); } /* Send a periodic advertisement, executed in callout context. */ static void carp_send_ad(void *v) { struct carp_softc *sc = v; CARP_LOCK_ASSERT(sc); CURVNET_SET(sc->sc_carpdev->if_vnet); carp_send_ad_locked(sc); CURVNET_RESTORE(); CARP_UNLOCK(sc); } static void carp_send_ad_error(struct carp_softc *sc, int error) { if (error) { if (sc->sc_sendad_errors < INT_MAX) sc->sc_sendad_errors++; if (sc->sc_sendad_errors == CARP_SENDAD_MAX_ERRORS) { static const char fmt[] = "send error %d on %s"; char msg[sizeof(fmt) + IFNAMSIZ]; sprintf(msg, fmt, error, sc->sc_carpdev->if_xname); carp_demote_adj(V_carp_senderr_adj, msg); } sc->sc_sendad_success = 0; } else { if (sc->sc_sendad_errors >= CARP_SENDAD_MAX_ERRORS && ++sc->sc_sendad_success >= CARP_SENDAD_MIN_SUCCESS) { static const char fmt[] = "send ok on %s"; char msg[sizeof(fmt) + IFNAMSIZ]; sprintf(msg, fmt, sc->sc_carpdev->if_xname); carp_demote_adj(-V_carp_senderr_adj, msg); sc->sc_sendad_errors = 0; } else sc->sc_sendad_errors = 0; } } /* * Pick the best ifaddr on the given ifp for sending CARP * advertisements. * * "Best" here is defined by ifa_preferred(). This function is much * much like ifaof_ifpforaddr() except that we just use ifa_preferred(). * * (This could be simplified to return the actual address, except that * it has a different format in AF_INET and AF_INET6.) */ static struct ifaddr * carp_best_ifa(int af, struct ifnet *ifp) { struct ifaddr *ifa, *best; NET_EPOCH_ASSERT(); if (af >= AF_MAX) return (NULL); best = NULL; CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family == af && (best == NULL || ifa_preferred(best, ifa))) best = ifa; } if (best != NULL) ifa_ref(best); return (best); } static void carp_send_ad_locked(struct carp_softc *sc) { struct carp_header ch; struct timeval tv; struct epoch_tracker et; struct ifaddr *ifa; struct carp_header *ch_ptr; struct mbuf *m; int len, advskew; CARP_LOCK_ASSERT(sc); advskew = DEMOTE_ADVSKEW(sc); tv.tv_sec = sc->sc_advbase; tv.tv_usec = advskew * 1000000 / 256; ch.carp_version = CARP_VERSION; ch.carp_type = CARP_ADVERTISEMENT; ch.carp_vhid = sc->sc_vhid; ch.carp_advbase = sc->sc_advbase; ch.carp_advskew = advskew; ch.carp_authlen = 7; /* XXX DEFINE */ ch.carp_pad1 = 0; /* must be zero */ ch.carp_cksum = 0; /* XXXGL: OpenBSD picks first ifaddr with needed family. */ #ifdef INET if (sc->sc_naddrs) { struct ip *ip; m = m_gethdr(M_NOWAIT, MT_DATA); if (m == NULL) { CARPSTATS_INC(carps_onomem); goto resched; } len = sizeof(*ip) + sizeof(ch); m->m_pkthdr.len = len; m->m_pkthdr.rcvif = NULL; m->m_len = len; M_ALIGN(m, m->m_len); m->m_flags |= M_MCAST; ip = mtod(m, struct ip *); ip->ip_v = IPVERSION; ip->ip_hl = sizeof(*ip) >> 2; ip->ip_tos = V_carp_dscp << IPTOS_DSCP_OFFSET; ip->ip_len = htons(len); ip->ip_off = htons(IP_DF); ip->ip_ttl = CARP_DFLTTL; ip->ip_p = IPPROTO_CARP; ip->ip_sum = 0; ip_fillid(ip); ifa = carp_best_ifa(AF_INET, sc->sc_carpdev); if (ifa != NULL) { ip->ip_src.s_addr = ifatoia(ifa)->ia_addr.sin_addr.s_addr; ifa_free(ifa); } else ip->ip_src.s_addr = 0; ip->ip_dst.s_addr = htonl(INADDR_CARP_GROUP); ch_ptr = (struct carp_header *)(&ip[1]); bcopy(&ch, ch_ptr, sizeof(ch)); if (carp_prepare_ad(m, sc, ch_ptr)) goto resched; m->m_data += sizeof(*ip); ch_ptr->carp_cksum = in_cksum(m, len - sizeof(*ip)); m->m_data -= sizeof(*ip); CARPSTATS_INC(carps_opackets); NET_EPOCH_ENTER(et); carp_send_ad_error(sc, ip_output(m, NULL, NULL, IP_RAWOUTPUT, &sc->sc_carpdev->if_carp->cif_imo, NULL)); NET_EPOCH_EXIT(et); } #endif /* INET */ #ifdef INET6 if (sc->sc_naddrs6) { struct ip6_hdr *ip6; m = m_gethdr(M_NOWAIT, MT_DATA); if (m == NULL) { CARPSTATS_INC(carps_onomem); goto resched; } len = sizeof(*ip6) + sizeof(ch); m->m_pkthdr.len = len; m->m_pkthdr.rcvif = NULL; m->m_len = len; M_ALIGN(m, m->m_len); m->m_flags |= M_MCAST; ip6 = mtod(m, struct ip6_hdr *); bzero(ip6, sizeof(*ip6)); ip6->ip6_vfc |= IPV6_VERSION; /* Traffic class isn't defined in ip6 struct instead * it gets offset into flowid field */ ip6->ip6_flow |= htonl(V_carp_dscp << (IPV6_FLOWLABEL_LEN + IPTOS_DSCP_OFFSET)); ip6->ip6_hlim = CARP_DFLTTL; ip6->ip6_nxt = IPPROTO_CARP; /* set the source address */ ifa = carp_best_ifa(AF_INET6, sc->sc_carpdev); if (ifa != NULL) { bcopy(IFA_IN6(ifa), &ip6->ip6_src, sizeof(struct in6_addr)); ifa_free(ifa); } else /* This should never happen with IPv6. */ bzero(&ip6->ip6_src, sizeof(struct in6_addr)); /* Set the multicast destination. */ ip6->ip6_dst.s6_addr16[0] = htons(0xff02); ip6->ip6_dst.s6_addr8[15] = 0x12; if (in6_setscope(&ip6->ip6_dst, sc->sc_carpdev, NULL) != 0) { m_freem(m); CARP_DEBUG("%s: in6_setscope failed\n", __func__); goto resched; } ch_ptr = (struct carp_header *)(&ip6[1]); bcopy(&ch, ch_ptr, sizeof(ch)); if (carp_prepare_ad(m, sc, ch_ptr)) goto resched; m->m_data += sizeof(*ip6); ch_ptr->carp_cksum = in_cksum(m, len - sizeof(*ip6)); m->m_data -= sizeof(*ip6); CARPSTATS_INC(carps_opackets6); NET_EPOCH_ENTER(et); carp_send_ad_error(sc, ip6_output(m, NULL, NULL, 0, &sc->sc_carpdev->if_carp->cif_im6o, NULL, NULL)); NET_EPOCH_EXIT(et); } #endif /* INET6 */ resched: callout_reset(&sc->sc_ad_tmo, tvtohz(&tv), carp_send_ad, sc); } static void carp_addroute(struct carp_softc *sc) { struct ifaddr *ifa; CARP_FOREACH_IFA(sc, ifa) carp_ifa_addroute(ifa); } static void carp_ifa_addroute(struct ifaddr *ifa) { switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: in_addprefix(ifatoia(ifa), RTF_UP); ifa_add_loopback_route(ifa, (struct sockaddr *)&ifatoia(ifa)->ia_addr); break; #endif #ifdef INET6 case AF_INET6: ifa_add_loopback_route(ifa, (struct sockaddr *)&ifatoia6(ifa)->ia_addr); nd6_add_ifa_lle(ifatoia6(ifa)); break; #endif } } static void carp_delroute(struct carp_softc *sc) { struct ifaddr *ifa; CARP_FOREACH_IFA(sc, ifa) carp_ifa_delroute(ifa); } static void carp_ifa_delroute(struct ifaddr *ifa) { switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: ifa_del_loopback_route(ifa, (struct sockaddr *)&ifatoia(ifa)->ia_addr); in_scrubprefix(ifatoia(ifa), LLE_STATIC); break; #endif #ifdef INET6 case AF_INET6: ifa_del_loopback_route(ifa, (struct sockaddr *)&ifatoia6(ifa)->ia_addr); nd6_rem_ifa_lle(ifatoia6(ifa), 1); break; #endif } } int carp_master(struct ifaddr *ifa) { struct carp_softc *sc = ifa->ifa_carp; return (sc->sc_state == MASTER); } #ifdef INET /* * Broadcast a gratuitous ARP request containing * the virtual router MAC address for each IP address * associated with the virtual router. */ static void carp_send_arp(struct carp_softc *sc) { struct ifaddr *ifa; struct in_addr addr; CARP_FOREACH_IFA(sc, ifa) { if (ifa->ifa_addr->sa_family != AF_INET) continue; addr = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr; arp_announce_ifaddr(sc->sc_carpdev, addr, LLADDR(&sc->sc_addr)); } } int carp_iamatch(struct ifaddr *ifa, uint8_t **enaddr) { struct carp_softc *sc = ifa->ifa_carp; if (sc->sc_state == MASTER) { *enaddr = LLADDR(&sc->sc_addr); return (1); } return (0); } #endif #ifdef INET6 static void carp_send_na(struct carp_softc *sc) { static struct in6_addr mcast = IN6ADDR_LINKLOCAL_ALLNODES_INIT; struct ifaddr *ifa; struct in6_addr *in6; CARP_FOREACH_IFA(sc, ifa) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; in6 = IFA_IN6(ifa); nd6_na_output(sc->sc_carpdev, &mcast, in6, ND_NA_FLAG_OVERRIDE, 1, NULL); DELAY(1000); /* XXX */ } } /* * Returns ifa in case it's a carp address and it is MASTER, or if the address * matches and is not a carp address. Returns NULL otherwise. */ struct ifaddr * carp_iamatch6(struct ifnet *ifp, struct in6_addr *taddr) { struct ifaddr *ifa; NET_EPOCH_ASSERT(); ifa = NULL; CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_INET6) continue; if (!IN6_ARE_ADDR_EQUAL(taddr, IFA_IN6(ifa))) continue; if (ifa->ifa_carp && ifa->ifa_carp->sc_state != MASTER) ifa = NULL; else ifa_ref(ifa); break; } return (ifa); } char * carp_macmatch6(struct ifnet *ifp, struct mbuf *m, const struct in6_addr *taddr) { struct ifaddr *ifa; NET_EPOCH_ASSERT(); IFNET_FOREACH_IFA(ifp, ifa) if (ifa->ifa_addr->sa_family == AF_INET6 && IN6_ARE_ADDR_EQUAL(taddr, IFA_IN6(ifa))) { struct carp_softc *sc = ifa->ifa_carp; struct m_tag *mtag; mtag = m_tag_get(PACKET_TAG_CARP, sizeof(struct carp_softc *), M_NOWAIT); if (mtag == NULL) /* Better a bit than nothing. */ return (LLADDR(&sc->sc_addr)); bcopy(&sc, mtag + 1, sizeof(sc)); m_tag_prepend(m, mtag); return (LLADDR(&sc->sc_addr)); } return (NULL); } #endif /* INET6 */ int carp_forus(struct ifnet *ifp, u_char *dhost) { struct carp_softc *sc; uint8_t *ena = dhost; if (ena[0] || ena[1] || ena[2] != 0x5e || ena[3] || ena[4] != 1) return (0); CIF_LOCK(ifp->if_carp); IFNET_FOREACH_CARP(ifp, sc) { /* * CARP_LOCK() is not here, since would protect nothing, but * cause deadlock with if_bridge, calling this under its lock. */ if (sc->sc_state == MASTER && !bcmp(dhost, LLADDR(&sc->sc_addr), ETHER_ADDR_LEN)) { CIF_UNLOCK(ifp->if_carp); return (1); } } CIF_UNLOCK(ifp->if_carp); return (0); } /* Master down timeout event, executed in callout context. */ static void carp_master_down(void *v) { struct carp_softc *sc = v; CARP_LOCK_ASSERT(sc); CURVNET_SET(sc->sc_carpdev->if_vnet); if (sc->sc_state == BACKUP) { carp_master_down_locked(sc, "master timed out"); } CURVNET_RESTORE(); CARP_UNLOCK(sc); } static void carp_master_down_locked(struct carp_softc *sc, const char *reason) { CARP_LOCK_ASSERT(sc); switch (sc->sc_state) { case BACKUP: carp_set_state(sc, MASTER, reason); carp_send_ad_locked(sc); #ifdef INET carp_send_arp(sc); #endif #ifdef INET6 carp_send_na(sc); #endif carp_setrun(sc, 0); carp_addroute(sc); break; case INIT: case MASTER: #ifdef INVARIANTS panic("carp: VHID %u@%s: master_down event in %s state\n", sc->sc_vhid, sc->sc_carpdev->if_xname, sc->sc_state ? "MASTER" : "INIT"); #endif break; } } /* * When in backup state, af indicates whether to reset the master down timer * for v4 or v6. If it's set to zero, reset the ones which are already pending. */ static void carp_setrun(struct carp_softc *sc, sa_family_t af) { struct timeval tv; CARP_LOCK_ASSERT(sc); if ((sc->sc_carpdev->if_flags & IFF_UP) == 0 || sc->sc_carpdev->if_link_state != LINK_STATE_UP || (sc->sc_naddrs == 0 && sc->sc_naddrs6 == 0) || !V_carp_allow) return; switch (sc->sc_state) { case INIT: carp_set_state(sc, BACKUP, "initialization complete"); carp_setrun(sc, 0); break; case BACKUP: callout_stop(&sc->sc_ad_tmo); tv.tv_sec = 3 * sc->sc_advbase; tv.tv_usec = sc->sc_advskew * 1000000 / 256; switch (af) { #ifdef INET case AF_INET: callout_reset(&sc->sc_md_tmo, tvtohz(&tv), carp_master_down, sc); break; #endif #ifdef INET6 case AF_INET6: callout_reset(&sc->sc_md6_tmo, tvtohz(&tv), carp_master_down, sc); break; #endif default: #ifdef INET if (sc->sc_naddrs) callout_reset(&sc->sc_md_tmo, tvtohz(&tv), carp_master_down, sc); #endif #ifdef INET6 if (sc->sc_naddrs6) callout_reset(&sc->sc_md6_tmo, tvtohz(&tv), carp_master_down, sc); #endif break; } break; case MASTER: tv.tv_sec = sc->sc_advbase; tv.tv_usec = sc->sc_advskew * 1000000 / 256; callout_reset(&sc->sc_ad_tmo, tvtohz(&tv), carp_send_ad, sc); break; } } /* * Setup multicast structures. */ static int carp_multicast_setup(struct carp_if *cif, sa_family_t sa) { struct ifnet *ifp = cif->cif_ifp; int error = 0; switch (sa) { #ifdef INET case AF_INET: { struct ip_moptions *imo = &cif->cif_imo; struct in_mfilter *imf; struct in_addr addr; if (ip_mfilter_first(&imo->imo_head) != NULL) return (0); imf = ip_mfilter_alloc(M_WAITOK, 0, 0); ip_mfilter_init(&imo->imo_head); imo->imo_multicast_vif = -1; addr.s_addr = htonl(INADDR_CARP_GROUP); if ((error = in_joingroup(ifp, &addr, NULL, &imf->imf_inm)) != 0) { ip_mfilter_free(imf); break; } ip_mfilter_insert(&imo->imo_head, imf); imo->imo_multicast_ifp = ifp; imo->imo_multicast_ttl = CARP_DFLTTL; imo->imo_multicast_loop = 0; break; } #endif #ifdef INET6 case AF_INET6: { struct ip6_moptions *im6o = &cif->cif_im6o; struct in6_mfilter *im6f[2]; struct in6_addr in6; if (ip6_mfilter_first(&im6o->im6o_head)) return (0); im6f[0] = ip6_mfilter_alloc(M_WAITOK, 0, 0); im6f[1] = ip6_mfilter_alloc(M_WAITOK, 0, 0); ip6_mfilter_init(&im6o->im6o_head); im6o->im6o_multicast_hlim = CARP_DFLTTL; im6o->im6o_multicast_ifp = ifp; /* Join IPv6 CARP multicast group. */ bzero(&in6, sizeof(in6)); in6.s6_addr16[0] = htons(0xff02); in6.s6_addr8[15] = 0x12; if ((error = in6_setscope(&in6, ifp, NULL)) != 0) { ip6_mfilter_free(im6f[0]); ip6_mfilter_free(im6f[1]); break; } if ((error = in6_joingroup(ifp, &in6, NULL, &im6f[0]->im6f_in6m, 0)) != 0) { ip6_mfilter_free(im6f[0]); ip6_mfilter_free(im6f[1]); break; } /* Join solicited multicast address. */ bzero(&in6, sizeof(in6)); in6.s6_addr16[0] = htons(0xff02); in6.s6_addr32[1] = 0; in6.s6_addr32[2] = htonl(1); in6.s6_addr32[3] = 0; in6.s6_addr8[12] = 0xff; if ((error = in6_setscope(&in6, ifp, NULL)) != 0) { ip6_mfilter_free(im6f[0]); ip6_mfilter_free(im6f[1]); break; } if ((error = in6_joingroup(ifp, &in6, NULL, &im6f[1]->im6f_in6m, 0)) != 0) { in6_leavegroup(im6f[0]->im6f_in6m, NULL); ip6_mfilter_free(im6f[0]); ip6_mfilter_free(im6f[1]); break; } ip6_mfilter_insert(&im6o->im6o_head, im6f[0]); ip6_mfilter_insert(&im6o->im6o_head, im6f[1]); break; } #endif } return (error); } /* * Free multicast structures. */ static void carp_multicast_cleanup(struct carp_if *cif, sa_family_t sa) { #ifdef INET struct ip_moptions *imo = &cif->cif_imo; struct in_mfilter *imf; #endif #ifdef INET6 struct ip6_moptions *im6o = &cif->cif_im6o; struct in6_mfilter *im6f; #endif sx_assert(&carp_sx, SA_XLOCKED); switch (sa) { #ifdef INET case AF_INET: if (cif->cif_naddrs != 0) break; while ((imf = ip_mfilter_first(&imo->imo_head)) != NULL) { ip_mfilter_remove(&imo->imo_head, imf); in_leavegroup(imf->imf_inm, NULL); ip_mfilter_free(imf); } break; #endif #ifdef INET6 case AF_INET6: if (cif->cif_naddrs6 != 0) break; while ((im6f = ip6_mfilter_first(&im6o->im6o_head)) != NULL) { ip6_mfilter_remove(&im6o->im6o_head, im6f); in6_leavegroup(im6f->im6f_in6m, NULL); ip6_mfilter_free(im6f); } break; #endif } } int carp_output(struct ifnet *ifp, struct mbuf *m, const struct sockaddr *sa) { struct m_tag *mtag; struct carp_softc *sc; if (!sa) return (0); switch (sa->sa_family) { #ifdef INET case AF_INET: break; #endif #ifdef INET6 case AF_INET6: break; #endif default: return (0); } mtag = m_tag_find(m, PACKET_TAG_CARP, NULL); if (mtag == NULL) return (0); bcopy(mtag + 1, &sc, sizeof(sc)); /* Set the source MAC address to the Virtual Router MAC Address. */ switch (ifp->if_type) { case IFT_ETHER: case IFT_BRIDGE: case IFT_L2VLAN: { struct ether_header *eh; eh = mtod(m, struct ether_header *); eh->ether_shost[0] = 0; eh->ether_shost[1] = 0; eh->ether_shost[2] = 0x5e; eh->ether_shost[3] = 0; eh->ether_shost[4] = 1; eh->ether_shost[5] = sc->sc_vhid; } break; default: printf("%s: carp is not supported for the %d interface type\n", ifp->if_xname, ifp->if_type); return (EOPNOTSUPP); } return (0); } static struct carp_softc* carp_alloc(struct ifnet *ifp) { struct carp_softc *sc; struct carp_if *cif; sx_assert(&carp_sx, SA_XLOCKED); if ((cif = ifp->if_carp) == NULL) cif = carp_alloc_if(ifp); sc = malloc(sizeof(*sc), M_CARP, M_WAITOK|M_ZERO); sc->sc_advbase = CARP_DFLTINTV; sc->sc_vhid = -1; /* required setting */ sc->sc_init_counter = 1; sc->sc_state = INIT; sc->sc_ifasiz = sizeof(struct ifaddr *); sc->sc_ifas = malloc(sc->sc_ifasiz, M_CARP, M_WAITOK|M_ZERO); sc->sc_carpdev = ifp; CARP_LOCK_INIT(sc); #ifdef INET callout_init_mtx(&sc->sc_md_tmo, &sc->sc_mtx, CALLOUT_RETURNUNLOCKED); #endif #ifdef INET6 callout_init_mtx(&sc->sc_md6_tmo, &sc->sc_mtx, CALLOUT_RETURNUNLOCKED); #endif callout_init_mtx(&sc->sc_ad_tmo, &sc->sc_mtx, CALLOUT_RETURNUNLOCKED); CIF_LOCK(cif); TAILQ_INSERT_TAIL(&cif->cif_vrs, sc, sc_list); CIF_UNLOCK(cif); mtx_lock(&carp_mtx); LIST_INSERT_HEAD(&carp_list, sc, sc_next); mtx_unlock(&carp_mtx); return (sc); } static void carp_grow_ifas(struct carp_softc *sc) { struct ifaddr **new; new = malloc(sc->sc_ifasiz * 2, M_CARP, M_WAITOK | M_ZERO); CARP_LOCK(sc); bcopy(sc->sc_ifas, new, sc->sc_ifasiz); free(sc->sc_ifas, M_CARP); sc->sc_ifas = new; sc->sc_ifasiz *= 2; CARP_UNLOCK(sc); } static void carp_destroy(struct carp_softc *sc) { struct ifnet *ifp = sc->sc_carpdev; struct carp_if *cif = ifp->if_carp; sx_assert(&carp_sx, SA_XLOCKED); if (sc->sc_suppress) carp_demote_adj(-V_carp_ifdown_adj, "vhid removed"); CARP_UNLOCK(sc); CIF_LOCK(cif); TAILQ_REMOVE(&cif->cif_vrs, sc, sc_list); CIF_UNLOCK(cif); mtx_lock(&carp_mtx); LIST_REMOVE(sc, sc_next); mtx_unlock(&carp_mtx); callout_drain(&sc->sc_ad_tmo); #ifdef INET callout_drain(&sc->sc_md_tmo); #endif #ifdef INET6 callout_drain(&sc->sc_md6_tmo); #endif CARP_LOCK_DESTROY(sc); free(sc->sc_ifas, M_CARP); free(sc, M_CARP); } static struct carp_if* carp_alloc_if(struct ifnet *ifp) { struct carp_if *cif; int error; cif = malloc(sizeof(*cif), M_CARP, M_WAITOK|M_ZERO); if ((error = ifpromisc(ifp, 1)) != 0) printf("%s: ifpromisc(%s) failed: %d\n", __func__, ifp->if_xname, error); else cif->cif_flags |= CIF_PROMISC; CIF_LOCK_INIT(cif); cif->cif_ifp = ifp; TAILQ_INIT(&cif->cif_vrs); IF_ADDR_WLOCK(ifp); ifp->if_carp = cif; if_ref(ifp); IF_ADDR_WUNLOCK(ifp); return (cif); } static void carp_free_if(struct carp_if *cif) { struct ifnet *ifp = cif->cif_ifp; CIF_LOCK_ASSERT(cif); KASSERT(TAILQ_EMPTY(&cif->cif_vrs), ("%s: softc list not empty", __func__)); IF_ADDR_WLOCK(ifp); ifp->if_carp = NULL; IF_ADDR_WUNLOCK(ifp); CIF_LOCK_DESTROY(cif); if (cif->cif_flags & CIF_PROMISC) ifpromisc(ifp, 0); if_rele(ifp); free(cif, M_CARP); } static void carp_carprcp(struct carpreq *carpr, struct carp_softc *sc, int priv) { CARP_LOCK(sc); carpr->carpr_state = sc->sc_state; carpr->carpr_vhid = sc->sc_vhid; carpr->carpr_advbase = sc->sc_advbase; carpr->carpr_advskew = sc->sc_advskew; if (priv) bcopy(sc->sc_key, carpr->carpr_key, sizeof(carpr->carpr_key)); else bzero(carpr->carpr_key, sizeof(carpr->carpr_key)); CARP_UNLOCK(sc); } int carp_ioctl(struct ifreq *ifr, u_long cmd, struct thread *td) { struct carpreq carpr; struct ifnet *ifp; struct carp_softc *sc = NULL; int error = 0, locked = 0; if ((error = copyin(ifr_data_get_ptr(ifr), &carpr, sizeof carpr))) return (error); ifp = ifunit_ref(ifr->ifr_name); if (ifp == NULL) return (ENXIO); switch (ifp->if_type) { case IFT_ETHER: case IFT_L2VLAN: case IFT_BRIDGE: break; default: error = EOPNOTSUPP; goto out; } if ((ifp->if_flags & IFF_MULTICAST) == 0) { error = EADDRNOTAVAIL; goto out; } sx_xlock(&carp_sx); switch (cmd) { case SIOCSVH: if ((error = priv_check(td, PRIV_NETINET_CARP))) break; if (carpr.carpr_vhid <= 0 || carpr.carpr_vhid > CARP_MAXVHID || carpr.carpr_advbase < 0 || carpr.carpr_advskew < 0) { error = EINVAL; break; } if (ifp->if_carp) { IFNET_FOREACH_CARP(ifp, sc) if (sc->sc_vhid == carpr.carpr_vhid) break; } if (sc == NULL) { sc = carp_alloc(ifp); CARP_LOCK(sc); sc->sc_vhid = carpr.carpr_vhid; LLADDR(&sc->sc_addr)[0] = 0; LLADDR(&sc->sc_addr)[1] = 0; LLADDR(&sc->sc_addr)[2] = 0x5e; LLADDR(&sc->sc_addr)[3] = 0; LLADDR(&sc->sc_addr)[4] = 1; LLADDR(&sc->sc_addr)[5] = sc->sc_vhid; } else CARP_LOCK(sc); locked = 1; if (carpr.carpr_advbase > 0) { if (carpr.carpr_advbase > 255 || carpr.carpr_advbase < CARP_DFLTINTV) { error = EINVAL; break; } sc->sc_advbase = carpr.carpr_advbase; } if (carpr.carpr_advskew >= 255) { error = EINVAL; break; } sc->sc_advskew = carpr.carpr_advskew; if (carpr.carpr_key[0] != '\0') { bcopy(carpr.carpr_key, sc->sc_key, sizeof(sc->sc_key)); carp_hmac_prepare(sc); } if (sc->sc_state != INIT && carpr.carpr_state != sc->sc_state) { switch (carpr.carpr_state) { case BACKUP: callout_stop(&sc->sc_ad_tmo); carp_set_state(sc, BACKUP, "user requested via ifconfig"); carp_setrun(sc, 0); carp_delroute(sc); break; case MASTER: carp_master_down_locked(sc, "user requested via ifconfig"); break; default: break; } } break; case SIOCGVH: { int priveleged; if (carpr.carpr_vhid < 0 || carpr.carpr_vhid > CARP_MAXVHID) { error = EINVAL; break; } if (carpr.carpr_count < 1) { error = EMSGSIZE; break; } if (ifp->if_carp == NULL) { error = ENOENT; break; } priveleged = (priv_check(td, PRIV_NETINET_CARP) == 0); if (carpr.carpr_vhid != 0) { IFNET_FOREACH_CARP(ifp, sc) if (sc->sc_vhid == carpr.carpr_vhid) break; if (sc == NULL) { error = ENOENT; break; } carp_carprcp(&carpr, sc, priveleged); error = copyout(&carpr, ifr_data_get_ptr(ifr), sizeof(carpr)); } else { int i, count; count = 0; IFNET_FOREACH_CARP(ifp, sc) count++; if (count > carpr.carpr_count) { CIF_UNLOCK(ifp->if_carp); error = EMSGSIZE; break; } i = 0; IFNET_FOREACH_CARP(ifp, sc) { carp_carprcp(&carpr, sc, priveleged); carpr.carpr_count = count; error = copyout(&carpr, (char *)ifr_data_get_ptr(ifr) + (i * sizeof(carpr)), sizeof(carpr)); if (error) { CIF_UNLOCK(ifp->if_carp); break; } i++; } } break; } default: error = EINVAL; } sx_xunlock(&carp_sx); out: if (locked) CARP_UNLOCK(sc); if_rele(ifp); return (error); } static int carp_get_vhid(struct ifaddr *ifa) { if (ifa == NULL || ifa->ifa_carp == NULL) return (0); return (ifa->ifa_carp->sc_vhid); } int carp_attach(struct ifaddr *ifa, int vhid) { struct ifnet *ifp = ifa->ifa_ifp; struct carp_if *cif = ifp->if_carp; struct carp_softc *sc; int index, error; KASSERT(ifa->ifa_carp == NULL, ("%s: ifa %p attached", __func__, ifa)); switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: #endif #ifdef INET6 case AF_INET6: #endif break; default: return (EPROTOTYPE); } sx_xlock(&carp_sx); if (ifp->if_carp == NULL) { sx_xunlock(&carp_sx); return (ENOPROTOOPT); } IFNET_FOREACH_CARP(ifp, sc) if (sc->sc_vhid == vhid) break; if (sc == NULL) { sx_xunlock(&carp_sx); return (ENOENT); } error = carp_multicast_setup(cif, ifa->ifa_addr->sa_family); if (error) { CIF_FREE(cif); sx_xunlock(&carp_sx); return (error); } index = sc->sc_naddrs + sc->sc_naddrs6 + 1; if (index > sc->sc_ifasiz / sizeof(struct ifaddr *)) carp_grow_ifas(sc); switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: cif->cif_naddrs++; sc->sc_naddrs++; break; #endif #ifdef INET6 case AF_INET6: cif->cif_naddrs6++; sc->sc_naddrs6++; break; #endif } ifa_ref(ifa); CARP_LOCK(sc); sc->sc_ifas[index - 1] = ifa; ifa->ifa_carp = sc; carp_hmac_prepare(sc); carp_sc_state(sc); CARP_UNLOCK(sc); sx_xunlock(&carp_sx); return (0); } void carp_detach(struct ifaddr *ifa, bool keep_cif) { struct ifnet *ifp = ifa->ifa_ifp; struct carp_if *cif = ifp->if_carp; struct carp_softc *sc = ifa->ifa_carp; int i, index; KASSERT(sc != NULL, ("%s: %p not attached", __func__, ifa)); sx_xlock(&carp_sx); CARP_LOCK(sc); /* Shift array. */ index = sc->sc_naddrs + sc->sc_naddrs6; for (i = 0; i < index; i++) if (sc->sc_ifas[i] == ifa) break; KASSERT(i < index, ("%s: %p no backref", __func__, ifa)); for (; i < index - 1; i++) sc->sc_ifas[i] = sc->sc_ifas[i+1]; sc->sc_ifas[index - 1] = NULL; switch (ifa->ifa_addr->sa_family) { #ifdef INET case AF_INET: cif->cif_naddrs--; sc->sc_naddrs--; break; #endif #ifdef INET6 case AF_INET6: cif->cif_naddrs6--; sc->sc_naddrs6--; break; #endif } carp_ifa_delroute(ifa); carp_multicast_cleanup(cif, ifa->ifa_addr->sa_family); ifa->ifa_carp = NULL; ifa_free(ifa); carp_hmac_prepare(sc); carp_sc_state(sc); if (!keep_cif && sc->sc_naddrs == 0 && sc->sc_naddrs6 == 0) carp_destroy(sc); else CARP_UNLOCK(sc); if (!keep_cif) CIF_FREE(cif); sx_xunlock(&carp_sx); } static void carp_set_state(struct carp_softc *sc, int state, const char *reason) { CARP_LOCK_ASSERT(sc); if (sc->sc_state != state) { const char *carp_states[] = { CARP_STATES }; char subsys[IFNAMSIZ+5]; snprintf(subsys, IFNAMSIZ+5, "%u@%s", sc->sc_vhid, sc->sc_carpdev->if_xname); CARP_LOG("%s: %s -> %s (%s)\n", subsys, carp_states[sc->sc_state], carp_states[state], reason); sc->sc_state = state; devctl_notify("CARP", subsys, carp_states[state], NULL); } } static void carp_linkstate(struct ifnet *ifp) { struct carp_softc *sc; CIF_LOCK(ifp->if_carp); IFNET_FOREACH_CARP(ifp, sc) { CARP_LOCK(sc); carp_sc_state(sc); CARP_UNLOCK(sc); } CIF_UNLOCK(ifp->if_carp); } static void carp_sc_state(struct carp_softc *sc) { CARP_LOCK_ASSERT(sc); if (sc->sc_carpdev->if_link_state != LINK_STATE_UP || !(sc->sc_carpdev->if_flags & IFF_UP) || !V_carp_allow) { callout_stop(&sc->sc_ad_tmo); #ifdef INET callout_stop(&sc->sc_md_tmo); #endif #ifdef INET6 callout_stop(&sc->sc_md6_tmo); #endif carp_set_state(sc, INIT, "hardware interface down"); carp_setrun(sc, 0); if (!sc->sc_suppress) carp_demote_adj(V_carp_ifdown_adj, "interface down"); sc->sc_suppress = 1; } else { carp_set_state(sc, INIT, "hardware interface up"); carp_setrun(sc, 0); if (sc->sc_suppress) carp_demote_adj(-V_carp_ifdown_adj, "interface up"); sc->sc_suppress = 0; } } static void carp_demote_adj(int adj, char *reason) { atomic_add_int(&V_carp_demotion, adj); CARP_LOG("demoted by %d to %d (%s)\n", adj, V_carp_demotion, reason); taskqueue_enqueue(taskqueue_swi, &carp_sendall_task); } static int carp_allow_sysctl(SYSCTL_HANDLER_ARGS) { int new, error; struct carp_softc *sc; new = V_carp_allow; error = sysctl_handle_int(oidp, &new, 0, req); if (error || !req->newptr) return (error); if (V_carp_allow != new) { V_carp_allow = new; mtx_lock(&carp_mtx); LIST_FOREACH(sc, &carp_list, sc_next) { CARP_LOCK(sc); if (curvnet == sc->sc_carpdev->if_vnet) carp_sc_state(sc); CARP_UNLOCK(sc); } mtx_unlock(&carp_mtx); } return (0); } static int carp_dscp_sysctl(SYSCTL_HANDLER_ARGS) { int new, error; new = V_carp_dscp; error = sysctl_handle_int(oidp, &new, 0, req); if (error || !req->newptr) return (error); if (new < 0 || new > 63) return (EINVAL); V_carp_dscp = new; return (0); } static int carp_demote_adj_sysctl(SYSCTL_HANDLER_ARGS) { int new, error; new = V_carp_demotion; error = sysctl_handle_int(oidp, &new, 0, req); if (error || !req->newptr) return (error); carp_demote_adj(new, "sysctl"); return (0); } #ifdef INET extern struct domain inetdomain; static struct protosw in_carp_protosw = { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_CARP, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = carp_input, .pr_output = rip_output, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }; #endif #ifdef INET6 extern struct domain inet6domain; static struct protosw in6_carp_protosw = { .pr_type = SOCK_RAW, .pr_domain = &inet6domain, .pr_protocol = IPPROTO_CARP, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = carp6_input, .pr_output = rip6_output, .pr_ctloutput = rip6_ctloutput, .pr_usrreqs = &rip6_usrreqs }; #endif static void carp_mod_cleanup(void) { #ifdef INET if (proto_reg[CARP_INET] == 0) { (void)ipproto_unregister(IPPROTO_CARP); pf_proto_unregister(PF_INET, IPPROTO_CARP, SOCK_RAW); proto_reg[CARP_INET] = -1; } carp_iamatch_p = NULL; #endif #ifdef INET6 if (proto_reg[CARP_INET6] == 0) { (void)ip6proto_unregister(IPPROTO_CARP); pf_proto_unregister(PF_INET6, IPPROTO_CARP, SOCK_RAW); proto_reg[CARP_INET6] = -1; } carp_iamatch6_p = NULL; carp_macmatch6_p = NULL; #endif carp_ioctl_p = NULL; carp_attach_p = NULL; carp_detach_p = NULL; carp_get_vhid_p = NULL; carp_linkstate_p = NULL; carp_forus_p = NULL; carp_output_p = NULL; carp_demote_adj_p = NULL; carp_master_p = NULL; mtx_unlock(&carp_mtx); taskqueue_drain(taskqueue_swi, &carp_sendall_task); mtx_destroy(&carp_mtx); sx_destroy(&carp_sx); } static int carp_mod_load(void) { int err; mtx_init(&carp_mtx, "carp_mtx", NULL, MTX_DEF); sx_init(&carp_sx, "carp_sx"); LIST_INIT(&carp_list); carp_get_vhid_p = carp_get_vhid; carp_forus_p = carp_forus; carp_output_p = carp_output; carp_linkstate_p = carp_linkstate; carp_ioctl_p = carp_ioctl; carp_attach_p = carp_attach; carp_detach_p = carp_detach; carp_demote_adj_p = carp_demote_adj; carp_master_p = carp_master; #ifdef INET6 carp_iamatch6_p = carp_iamatch6; carp_macmatch6_p = carp_macmatch6; proto_reg[CARP_INET6] = pf_proto_register(PF_INET6, (struct protosw *)&in6_carp_protosw); if (proto_reg[CARP_INET6]) { printf("carp: error %d attaching to PF_INET6\n", proto_reg[CARP_INET6]); carp_mod_cleanup(); return (proto_reg[CARP_INET6]); } err = ip6proto_register(IPPROTO_CARP); if (err) { printf("carp: error %d registering with INET6\n", err); carp_mod_cleanup(); return (err); } #endif #ifdef INET carp_iamatch_p = carp_iamatch; proto_reg[CARP_INET] = pf_proto_register(PF_INET, &in_carp_protosw); if (proto_reg[CARP_INET]) { printf("carp: error %d attaching to PF_INET\n", proto_reg[CARP_INET]); carp_mod_cleanup(); return (proto_reg[CARP_INET]); } err = ipproto_register(IPPROTO_CARP); if (err) { printf("carp: error %d registering with INET\n", err); carp_mod_cleanup(); return (err); } #endif return (0); } static int carp_modevent(module_t mod, int type, void *data) { switch (type) { case MOD_LOAD: return carp_mod_load(); /* NOTREACHED */ case MOD_UNLOAD: mtx_lock(&carp_mtx); if (LIST_EMPTY(&carp_list)) carp_mod_cleanup(); else { mtx_unlock(&carp_mtx); return (EBUSY); } break; default: return (EINVAL); } return (0); } static moduledata_t carp_mod = { "carp", carp_modevent, 0 }; DECLARE_MODULE(carp, carp_mod, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY); Index: projects/clang1000-import/sys/netpfil/pf/if_pfsync.c =================================================================== --- projects/clang1000-import/sys/netpfil/pf/if_pfsync.c (revision 358238) +++ projects/clang1000-import/sys/netpfil/pf/if_pfsync.c (revision 358239) @@ -1,2555 +1,2556 @@ /*- * SPDX-License-Identifier: (BSD-2-Clause-FreeBSD AND ISC) * * Copyright (c) 2002 Michael Shalayeff * Copyright (c) 2012 Gleb Smirnoff * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR OR HIS RELATIVES BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF MIND, USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ /*- * Copyright (c) 2009 David Gwynne * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ /* * $OpenBSD: if_pfsync.c,v 1.110 2009/02/24 05:39:19 dlg Exp $ * * Revisions picked from OpenBSD after revision 1.110 import: * 1.119 - don't m_copydata() beyond the len of mbuf in pfsync_input() * 1.118, 1.124, 1.148, 1.149, 1.151, 1.171 - fixes to bulk updates * 1.120, 1.175 - use monotonic time_uptime * 1.122 - reduce number of updates for non-TCP sessions * 1.125, 1.127 - rewrite merge or stale processing * 1.128 - cleanups * 1.146 - bzero() mbuf before sparsely filling it with data * 1.170 - SIOCSIFMTU checks * 1.126, 1.142 - deferred packets processing * 1.173 - correct expire time processing */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_pf.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define PFSYNC_MINPKT ( \ sizeof(struct ip) + \ sizeof(struct pfsync_header) + \ sizeof(struct pfsync_subheader) ) struct pfsync_bucket; struct pfsync_pkt { struct ip *ip; struct in_addr src; u_int8_t flags; }; static int pfsync_upd_tcp(struct pf_state *, struct pfsync_state_peer *, struct pfsync_state_peer *); static int pfsync_in_clr(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_ins(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_iack(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_upd(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_upd_c(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_ureq(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_del(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_del_c(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_bus(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_tdb(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_eof(struct pfsync_pkt *, struct mbuf *, int, int); static int pfsync_in_error(struct pfsync_pkt *, struct mbuf *, int, int); static int (*pfsync_acts[])(struct pfsync_pkt *, struct mbuf *, int, int) = { pfsync_in_clr, /* PFSYNC_ACT_CLR */ pfsync_in_ins, /* PFSYNC_ACT_INS */ pfsync_in_iack, /* PFSYNC_ACT_INS_ACK */ pfsync_in_upd, /* PFSYNC_ACT_UPD */ pfsync_in_upd_c, /* PFSYNC_ACT_UPD_C */ pfsync_in_ureq, /* PFSYNC_ACT_UPD_REQ */ pfsync_in_del, /* PFSYNC_ACT_DEL */ pfsync_in_del_c, /* PFSYNC_ACT_DEL_C */ pfsync_in_error, /* PFSYNC_ACT_INS_F */ pfsync_in_error, /* PFSYNC_ACT_DEL_F */ pfsync_in_bus, /* PFSYNC_ACT_BUS */ pfsync_in_tdb, /* PFSYNC_ACT_TDB */ pfsync_in_eof /* PFSYNC_ACT_EOF */ }; struct pfsync_q { void (*write)(struct pf_state *, void *); size_t len; u_int8_t action; }; /* we have one of these for every PFSYNC_S_ */ static void pfsync_out_state(struct pf_state *, void *); static void pfsync_out_iack(struct pf_state *, void *); static void pfsync_out_upd_c(struct pf_state *, void *); static void pfsync_out_del(struct pf_state *, void *); static struct pfsync_q pfsync_qs[] = { { pfsync_out_state, sizeof(struct pfsync_state), PFSYNC_ACT_INS }, { pfsync_out_iack, sizeof(struct pfsync_ins_ack), PFSYNC_ACT_INS_ACK }, { pfsync_out_state, sizeof(struct pfsync_state), PFSYNC_ACT_UPD }, { pfsync_out_upd_c, sizeof(struct pfsync_upd_c), PFSYNC_ACT_UPD_C }, { pfsync_out_del, sizeof(struct pfsync_del_c), PFSYNC_ACT_DEL_C } }; static void pfsync_q_ins(struct pf_state *, int, bool); static void pfsync_q_del(struct pf_state *, bool, struct pfsync_bucket *); static void pfsync_update_state(struct pf_state *); struct pfsync_upd_req_item { TAILQ_ENTRY(pfsync_upd_req_item) ur_entry; struct pfsync_upd_req ur_msg; }; struct pfsync_deferral { struct pfsync_softc *pd_sc; TAILQ_ENTRY(pfsync_deferral) pd_entry; u_int pd_refs; struct callout pd_tmo; struct pf_state *pd_st; struct mbuf *pd_m; }; struct pfsync_sofct; struct pfsync_bucket { int b_id; struct pfsync_softc *b_sc; struct mtx b_mtx; struct callout b_tmo; int b_flags; #define PFSYNCF_BUCKET_PUSH 0x00000001 size_t b_len; TAILQ_HEAD(, pf_state) b_qs[PFSYNC_S_COUNT]; TAILQ_HEAD(, pfsync_upd_req_item) b_upd_req_list; TAILQ_HEAD(, pfsync_deferral) b_deferrals; u_int b_deferred; void *b_plus; size_t b_pluslen; struct ifaltq b_snd; }; struct pfsync_softc { /* Configuration */ struct ifnet *sc_ifp; struct ifnet *sc_sync_if; struct ip_moptions sc_imo; struct in_addr sc_sync_peer; uint32_t sc_flags; #define PFSYNCF_OK 0x00000001 #define PFSYNCF_DEFER 0x00000002 uint8_t sc_maxupdates; struct ip sc_template; struct mtx sc_mtx; /* Queued data */ struct pfsync_bucket *sc_buckets; /* Bulk update info */ struct mtx sc_bulk_mtx; uint32_t sc_ureq_sent; int sc_bulk_tries; uint32_t sc_ureq_received; int sc_bulk_hashid; uint64_t sc_bulk_stateid; uint32_t sc_bulk_creatorid; struct callout sc_bulk_tmo; struct callout sc_bulkfail_tmo; }; #define PFSYNC_LOCK(sc) mtx_lock(&(sc)->sc_mtx) #define PFSYNC_UNLOCK(sc) mtx_unlock(&(sc)->sc_mtx) #define PFSYNC_LOCK_ASSERT(sc) mtx_assert(&(sc)->sc_mtx, MA_OWNED) #define PFSYNC_BUCKET_LOCK(b) mtx_lock(&(b)->b_mtx) #define PFSYNC_BUCKET_UNLOCK(b) mtx_unlock(&(b)->b_mtx) #define PFSYNC_BUCKET_LOCK_ASSERT(b) mtx_assert(&(b)->b_mtx, MA_OWNED) #define PFSYNC_BLOCK(sc) mtx_lock(&(sc)->sc_bulk_mtx) #define PFSYNC_BUNLOCK(sc) mtx_unlock(&(sc)->sc_bulk_mtx) #define PFSYNC_BLOCK_ASSERT(sc) mtx_assert(&(sc)->sc_bulk_mtx, MA_OWNED) static const char pfsyncname[] = "pfsync"; static MALLOC_DEFINE(M_PFSYNC, pfsyncname, "pfsync(4) data"); VNET_DEFINE_STATIC(struct pfsync_softc *, pfsyncif) = NULL; #define V_pfsyncif VNET(pfsyncif) VNET_DEFINE_STATIC(void *, pfsync_swi_cookie) = NULL; #define V_pfsync_swi_cookie VNET(pfsync_swi_cookie) VNET_DEFINE_STATIC(struct pfsyncstats, pfsyncstats); #define V_pfsyncstats VNET(pfsyncstats) VNET_DEFINE_STATIC(int, pfsync_carp_adj) = CARP_MAXSKEW; #define V_pfsync_carp_adj VNET(pfsync_carp_adj) static void pfsync_timeout(void *); static void pfsync_push(struct pfsync_bucket *); static void pfsync_push_all(struct pfsync_softc *); static void pfsyncintr(void *); static int pfsync_multicast_setup(struct pfsync_softc *, struct ifnet *, struct in_mfilter *imf); static void pfsync_multicast_cleanup(struct pfsync_softc *); static void pfsync_pointers_init(void); static void pfsync_pointers_uninit(void); static int pfsync_init(void); static void pfsync_uninit(void); static unsigned long pfsync_buckets; -SYSCTL_NODE(_net, OID_AUTO, pfsync, CTLFLAG_RW, 0, "PFSYNC"); +SYSCTL_NODE(_net, OID_AUTO, pfsync, CTLFLAG_RW | CTLFLAG_MPSAFE, 0, + "PFSYNC"); SYSCTL_STRUCT(_net_pfsync, OID_AUTO, stats, CTLFLAG_VNET | CTLFLAG_RW, &VNET_NAME(pfsyncstats), pfsyncstats, "PFSYNC statistics (struct pfsyncstats, net/if_pfsync.h)"); SYSCTL_INT(_net_pfsync, OID_AUTO, carp_demotion_factor, CTLFLAG_RW, &VNET_NAME(pfsync_carp_adj), 0, "pfsync's CARP demotion factor adjustment"); SYSCTL_ULONG(_net_pfsync, OID_AUTO, pfsync_buckets, CTLFLAG_RDTUN, &pfsync_buckets, 0, "Number of pfsync hash buckets"); static int pfsync_clone_create(struct if_clone *, int, caddr_t); static void pfsync_clone_destroy(struct ifnet *); static int pfsync_alloc_scrub_memory(struct pfsync_state_peer *, struct pf_state_peer *); static int pfsyncoutput(struct ifnet *, struct mbuf *, const struct sockaddr *, struct route *); static int pfsyncioctl(struct ifnet *, u_long, caddr_t); static int pfsync_defer(struct pf_state *, struct mbuf *); static void pfsync_undefer(struct pfsync_deferral *, int); static void pfsync_undefer_state(struct pf_state *, int); static void pfsync_defer_tmo(void *); static void pfsync_request_update(u_int32_t, u_int64_t); static bool pfsync_update_state_req(struct pf_state *); static void pfsync_drop(struct pfsync_softc *); static void pfsync_sendout(int, int); static void pfsync_send_plus(void *, size_t); static void pfsync_bulk_start(void); static void pfsync_bulk_status(u_int8_t); static void pfsync_bulk_update(void *); static void pfsync_bulk_fail(void *); static void pfsync_detach_ifnet(struct ifnet *); #ifdef IPSEC static void pfsync_update_net_tdb(struct pfsync_tdb *); #endif static struct pfsync_bucket *pfsync_get_bucket(struct pfsync_softc *, struct pf_state *); #define PFSYNC_MAX_BULKTRIES 12 VNET_DEFINE(struct if_clone *, pfsync_cloner); #define V_pfsync_cloner VNET(pfsync_cloner) static int pfsync_clone_create(struct if_clone *ifc, int unit, caddr_t param) { struct pfsync_softc *sc; struct ifnet *ifp; struct pfsync_bucket *b; int c, q; if (unit != 0) return (EINVAL); if (! pfsync_buckets) pfsync_buckets = mp_ncpus * 2; sc = malloc(sizeof(struct pfsync_softc), M_PFSYNC, M_WAITOK | M_ZERO); sc->sc_flags |= PFSYNCF_OK; sc->sc_maxupdates = 128; ifp = sc->sc_ifp = if_alloc(IFT_PFSYNC); if (ifp == NULL) { free(sc, M_PFSYNC); return (ENOSPC); } if_initname(ifp, pfsyncname, unit); ifp->if_softc = sc; ifp->if_ioctl = pfsyncioctl; ifp->if_output = pfsyncoutput; ifp->if_type = IFT_PFSYNC; ifp->if_hdrlen = sizeof(struct pfsync_header); ifp->if_mtu = ETHERMTU; mtx_init(&sc->sc_mtx, pfsyncname, NULL, MTX_DEF); mtx_init(&sc->sc_bulk_mtx, "pfsync bulk", NULL, MTX_DEF); callout_init_mtx(&sc->sc_bulk_tmo, &sc->sc_bulk_mtx, 0); callout_init_mtx(&sc->sc_bulkfail_tmo, &sc->sc_bulk_mtx, 0); if_attach(ifp); bpfattach(ifp, DLT_PFSYNC, PFSYNC_HDRLEN); sc->sc_buckets = mallocarray(pfsync_buckets, sizeof(*sc->sc_buckets), M_PFSYNC, M_ZERO | M_WAITOK); for (c = 0; c < pfsync_buckets; c++) { b = &sc->sc_buckets[c]; mtx_init(&b->b_mtx, "pfsync bucket", NULL, MTX_DEF); b->b_id = c; b->b_sc = sc; b->b_len = PFSYNC_MINPKT; for (q = 0; q < PFSYNC_S_COUNT; q++) TAILQ_INIT(&b->b_qs[q]); TAILQ_INIT(&b->b_upd_req_list); TAILQ_INIT(&b->b_deferrals); callout_init(&b->b_tmo, 1); b->b_snd.ifq_maxlen = ifqmaxlen; } V_pfsyncif = sc; return (0); } static void pfsync_clone_destroy(struct ifnet *ifp) { struct pfsync_softc *sc = ifp->if_softc; struct pfsync_bucket *b; int c; for (c = 0; c < pfsync_buckets; c++) { b = &sc->sc_buckets[c]; /* * At this stage, everything should have already been * cleared by pfsync_uninit(), and we have only to * drain callouts. */ while (b->b_deferred > 0) { struct pfsync_deferral *pd = TAILQ_FIRST(&b->b_deferrals); TAILQ_REMOVE(&b->b_deferrals, pd, pd_entry); b->b_deferred--; if (callout_stop(&pd->pd_tmo) > 0) { pf_release_state(pd->pd_st); m_freem(pd->pd_m); free(pd, M_PFSYNC); } else { pd->pd_refs++; callout_drain(&pd->pd_tmo); free(pd, M_PFSYNC); } } callout_drain(&b->b_tmo); } callout_drain(&sc->sc_bulkfail_tmo); callout_drain(&sc->sc_bulk_tmo); if (!(sc->sc_flags & PFSYNCF_OK) && carp_demote_adj_p) (*carp_demote_adj_p)(-V_pfsync_carp_adj, "pfsync destroy"); bpfdetach(ifp); if_detach(ifp); pfsync_drop(sc); if_free(ifp); pfsync_multicast_cleanup(sc); mtx_destroy(&sc->sc_mtx); mtx_destroy(&sc->sc_bulk_mtx); free(sc->sc_buckets, M_PFSYNC); free(sc, M_PFSYNC); V_pfsyncif = NULL; } static int pfsync_alloc_scrub_memory(struct pfsync_state_peer *s, struct pf_state_peer *d) { if (s->scrub.scrub_flag && d->scrub == NULL) { d->scrub = uma_zalloc(V_pf_state_scrub_z, M_NOWAIT | M_ZERO); if (d->scrub == NULL) return (ENOMEM); } return (0); } static int pfsync_state_import(struct pfsync_state *sp, u_int8_t flags) { struct pfsync_softc *sc = V_pfsyncif; #ifndef __NO_STRICT_ALIGNMENT struct pfsync_state_key key[2]; #endif struct pfsync_state_key *kw, *ks; struct pf_state *st = NULL; struct pf_state_key *skw = NULL, *sks = NULL; struct pf_rule *r = NULL; struct pfi_kif *kif; int error; PF_RULES_RASSERT(); if (sp->creatorid == 0) { if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: invalid creator id: %08x\n", __func__, ntohl(sp->creatorid)); return (EINVAL); } if ((kif = pfi_kif_find(sp->ifname)) == NULL) { if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: unknown interface: %s\n", __func__, sp->ifname); if (flags & PFSYNC_SI_IOCTL) return (EINVAL); return (0); /* skip this state */ } /* * If the ruleset checksums match or the state is coming from the ioctl, * it's safe to associate the state with the rule of that number. */ if (sp->rule != htonl(-1) && sp->anchor == htonl(-1) && (flags & (PFSYNC_SI_IOCTL | PFSYNC_SI_CKSUM)) && ntohl(sp->rule) < pf_main_ruleset.rules[PF_RULESET_FILTER].active.rcount) r = pf_main_ruleset.rules[ PF_RULESET_FILTER].active.ptr_array[ntohl(sp->rule)]; else r = &V_pf_default_rule; if ((r->max_states && counter_u64_fetch(r->states_cur) >= r->max_states)) goto cleanup; /* * XXXGL: consider M_WAITOK in ioctl path after. */ if ((st = uma_zalloc(V_pf_state_z, M_NOWAIT | M_ZERO)) == NULL) goto cleanup; if ((skw = uma_zalloc(V_pf_state_key_z, M_NOWAIT)) == NULL) goto cleanup; #ifndef __NO_STRICT_ALIGNMENT bcopy(&sp->key, key, sizeof(struct pfsync_state_key) * 2); kw = &key[PF_SK_WIRE]; ks = &key[PF_SK_STACK]; #else kw = &sp->key[PF_SK_WIRE]; ks = &sp->key[PF_SK_STACK]; #endif if (PF_ANEQ(&kw->addr[0], &ks->addr[0], sp->af) || PF_ANEQ(&kw->addr[1], &ks->addr[1], sp->af) || kw->port[0] != ks->port[0] || kw->port[1] != ks->port[1]) { sks = uma_zalloc(V_pf_state_key_z, M_NOWAIT); if (sks == NULL) goto cleanup; } else sks = skw; /* allocate memory for scrub info */ if (pfsync_alloc_scrub_memory(&sp->src, &st->src) || pfsync_alloc_scrub_memory(&sp->dst, &st->dst)) goto cleanup; /* Copy to state key(s). */ skw->addr[0] = kw->addr[0]; skw->addr[1] = kw->addr[1]; skw->port[0] = kw->port[0]; skw->port[1] = kw->port[1]; skw->proto = sp->proto; skw->af = sp->af; if (sks != skw) { sks->addr[0] = ks->addr[0]; sks->addr[1] = ks->addr[1]; sks->port[0] = ks->port[0]; sks->port[1] = ks->port[1]; sks->proto = sp->proto; sks->af = sp->af; } /* copy to state */ bcopy(&sp->rt_addr, &st->rt_addr, sizeof(st->rt_addr)); st->creation = time_uptime - ntohl(sp->creation); st->expire = time_uptime; if (sp->expire) { uint32_t timeout; timeout = r->timeout[sp->timeout]; if (!timeout) timeout = V_pf_default_rule.timeout[sp->timeout]; /* sp->expire may have been adaptively scaled by export. */ st->expire -= timeout - ntohl(sp->expire); } st->direction = sp->direction; st->log = sp->log; st->timeout = sp->timeout; st->state_flags = sp->state_flags; st->id = sp->id; st->creatorid = sp->creatorid; pf_state_peer_ntoh(&sp->src, &st->src); pf_state_peer_ntoh(&sp->dst, &st->dst); st->rule.ptr = r; st->nat_rule.ptr = NULL; st->anchor.ptr = NULL; st->rt_kif = NULL; st->pfsync_time = time_uptime; st->sync_state = PFSYNC_S_NONE; if (!(flags & PFSYNC_SI_IOCTL)) st->state_flags |= PFSTATE_NOSYNC; if ((error = pf_state_insert(kif, skw, sks, st)) != 0) goto cleanup_state; /* XXX when we have nat_rule/anchors, use STATE_INC_COUNTERS */ counter_u64_add(r->states_cur, 1); counter_u64_add(r->states_tot, 1); if (!(flags & PFSYNC_SI_IOCTL)) { st->state_flags &= ~PFSTATE_NOSYNC; if (st->state_flags & PFSTATE_ACK) { pfsync_q_ins(st, PFSYNC_S_IACK, true); pfsync_push_all(sc); } } st->state_flags &= ~PFSTATE_ACK; PF_STATE_UNLOCK(st); return (0); cleanup: error = ENOMEM; if (skw == sks) sks = NULL; if (skw != NULL) uma_zfree(V_pf_state_key_z, skw); if (sks != NULL) uma_zfree(V_pf_state_key_z, sks); cleanup_state: /* pf_state_insert() frees the state keys. */ if (st) { if (st->dst.scrub) uma_zfree(V_pf_state_scrub_z, st->dst.scrub); if (st->src.scrub) uma_zfree(V_pf_state_scrub_z, st->src.scrub); uma_zfree(V_pf_state_z, st); } return (error); } static int pfsync_input(struct mbuf **mp, int *offp __unused, int proto __unused) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_pkt pkt; struct mbuf *m = *mp; struct ip *ip = mtod(m, struct ip *); struct pfsync_header *ph; struct pfsync_subheader subh; int offset, len; int rv; uint16_t count; PF_RULES_RLOCK_TRACKER; *mp = NULL; V_pfsyncstats.pfsyncs_ipackets++; /* Verify that we have a sync interface configured. */ if (!sc || !sc->sc_sync_if || !V_pf_status.running || (sc->sc_ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) goto done; /* verify that the packet came in on the right interface */ if (sc->sc_sync_if != m->m_pkthdr.rcvif) { V_pfsyncstats.pfsyncs_badif++; goto done; } if_inc_counter(sc->sc_ifp, IFCOUNTER_IPACKETS, 1); if_inc_counter(sc->sc_ifp, IFCOUNTER_IBYTES, m->m_pkthdr.len); /* verify that the IP TTL is 255. */ if (ip->ip_ttl != PFSYNC_DFLTTL) { V_pfsyncstats.pfsyncs_badttl++; goto done; } offset = ip->ip_hl << 2; if (m->m_pkthdr.len < offset + sizeof(*ph)) { V_pfsyncstats.pfsyncs_hdrops++; goto done; } if (offset + sizeof(*ph) > m->m_len) { if (m_pullup(m, offset + sizeof(*ph)) == NULL) { V_pfsyncstats.pfsyncs_hdrops++; return (IPPROTO_DONE); } ip = mtod(m, struct ip *); } ph = (struct pfsync_header *)((char *)ip + offset); /* verify the version */ if (ph->version != PFSYNC_VERSION) { V_pfsyncstats.pfsyncs_badver++; goto done; } len = ntohs(ph->len) + offset; if (m->m_pkthdr.len < len) { V_pfsyncstats.pfsyncs_badlen++; goto done; } /* Cheaper to grab this now than having to mess with mbufs later */ pkt.ip = ip; pkt.src = ip->ip_src; pkt.flags = 0; /* * Trusting pf_chksum during packet processing, as well as seeking * in interface name tree, require holding PF_RULES_RLOCK(). */ PF_RULES_RLOCK(); if (!bcmp(&ph->pfcksum, &V_pf_status.pf_chksum, PF_MD5_DIGEST_LENGTH)) pkt.flags |= PFSYNC_SI_CKSUM; offset += sizeof(*ph); while (offset <= len - sizeof(subh)) { m_copydata(m, offset, sizeof(subh), (caddr_t)&subh); offset += sizeof(subh); if (subh.action >= PFSYNC_ACT_MAX) { V_pfsyncstats.pfsyncs_badact++; PF_RULES_RUNLOCK(); goto done; } count = ntohs(subh.count); V_pfsyncstats.pfsyncs_iacts[subh.action] += count; rv = (*pfsync_acts[subh.action])(&pkt, m, offset, count); if (rv == -1) { PF_RULES_RUNLOCK(); return (IPPROTO_DONE); } offset += rv; } PF_RULES_RUNLOCK(); done: m_freem(m); return (IPPROTO_DONE); } static int pfsync_in_clr(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct pfsync_clr *clr; struct mbuf *mp; int len = sizeof(*clr) * count; int i, offp; u_int32_t creatorid; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } clr = (struct pfsync_clr *)(mp->m_data + offp); for (i = 0; i < count; i++) { creatorid = clr[i].creatorid; if (clr[i].ifname[0] != '\0' && pfi_kif_find(clr[i].ifname) == NULL) continue; for (int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; struct pf_state *s; relock: PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { if (s->creatorid == creatorid) { s->state_flags |= PFSTATE_NOSYNC; pf_unlink_state(s, PF_ENTER_LOCKED); goto relock; } } PF_HASHROW_UNLOCK(ih); } } return (len); } static int pfsync_in_ins(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct mbuf *mp; struct pfsync_state *sa, *sp; int len = sizeof(*sp) * count; int i, offp; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } sa = (struct pfsync_state *)(mp->m_data + offp); for (i = 0; i < count; i++) { sp = &sa[i]; /* Check for invalid values. */ if (sp->timeout >= PFTM_MAX || sp->src.state > PF_TCPS_PROXY_DST || sp->dst.state > PF_TCPS_PROXY_DST || sp->direction > PF_OUT || (sp->af != AF_INET && sp->af != AF_INET6)) { if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: invalid value\n", __func__); V_pfsyncstats.pfsyncs_badval++; continue; } if (pfsync_state_import(sp, pkt->flags) == ENOMEM) /* Drop out, but process the rest of the actions. */ break; } return (len); } static int pfsync_in_iack(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct pfsync_ins_ack *ia, *iaa; struct pf_state *st; struct mbuf *mp; int len = count * sizeof(*ia); int offp, i; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } iaa = (struct pfsync_ins_ack *)(mp->m_data + offp); for (i = 0; i < count; i++) { ia = &iaa[i]; st = pf_find_state_byid(ia->id, ia->creatorid); if (st == NULL) continue; if (st->state_flags & PFSTATE_ACK) { pfsync_undefer_state(st, 0); } PF_STATE_UNLOCK(st); } /* * XXX this is not yet implemented, but we know the size of the * message so we can skip it. */ return (count * sizeof(struct pfsync_ins_ack)); } static int pfsync_upd_tcp(struct pf_state *st, struct pfsync_state_peer *src, struct pfsync_state_peer *dst) { int sync = 0; PF_STATE_LOCK_ASSERT(st); /* * The state should never go backwards except * for syn-proxy states. Neither should the * sequence window slide backwards. */ if ((st->src.state > src->state && (st->src.state < PF_TCPS_PROXY_SRC || src->state >= PF_TCPS_PROXY_SRC)) || (st->src.state == src->state && SEQ_GT(st->src.seqlo, ntohl(src->seqlo)))) sync++; else pf_state_peer_ntoh(src, &st->src); if ((st->dst.state > dst->state) || (st->dst.state >= TCPS_SYN_SENT && SEQ_GT(st->dst.seqlo, ntohl(dst->seqlo)))) sync++; else pf_state_peer_ntoh(dst, &st->dst); return (sync); } static int pfsync_in_upd(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_state *sa, *sp; struct pf_state *st; int sync; struct mbuf *mp; int len = count * sizeof(*sp); int offp, i; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } sa = (struct pfsync_state *)(mp->m_data + offp); for (i = 0; i < count; i++) { sp = &sa[i]; /* check for invalid values */ if (sp->timeout >= PFTM_MAX || sp->src.state > PF_TCPS_PROXY_DST || sp->dst.state > PF_TCPS_PROXY_DST) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pfsync_input: PFSYNC_ACT_UPD: " "invalid value\n"); } V_pfsyncstats.pfsyncs_badval++; continue; } st = pf_find_state_byid(sp->id, sp->creatorid); if (st == NULL) { /* insert the update */ if (pfsync_state_import(sp, pkt->flags)) V_pfsyncstats.pfsyncs_badstate++; continue; } if (st->state_flags & PFSTATE_ACK) { pfsync_undefer_state(st, 1); } if (st->key[PF_SK_WIRE]->proto == IPPROTO_TCP) sync = pfsync_upd_tcp(st, &sp->src, &sp->dst); else { sync = 0; /* * Non-TCP protocol state machine always go * forwards */ if (st->src.state > sp->src.state) sync++; else pf_state_peer_ntoh(&sp->src, &st->src); if (st->dst.state > sp->dst.state) sync++; else pf_state_peer_ntoh(&sp->dst, &st->dst); } if (sync < 2) { pfsync_alloc_scrub_memory(&sp->dst, &st->dst); pf_state_peer_ntoh(&sp->dst, &st->dst); st->expire = time_uptime; st->timeout = sp->timeout; } st->pfsync_time = time_uptime; if (sync) { V_pfsyncstats.pfsyncs_stale++; pfsync_update_state(st); PF_STATE_UNLOCK(st); pfsync_push_all(sc); continue; } PF_STATE_UNLOCK(st); } return (len); } static int pfsync_in_upd_c(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_upd_c *ua, *up; struct pf_state *st; int len = count * sizeof(*up); int sync; struct mbuf *mp; int offp, i; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } ua = (struct pfsync_upd_c *)(mp->m_data + offp); for (i = 0; i < count; i++) { up = &ua[i]; /* check for invalid values */ if (up->timeout >= PFTM_MAX || up->src.state > PF_TCPS_PROXY_DST || up->dst.state > PF_TCPS_PROXY_DST) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pfsync_input: " "PFSYNC_ACT_UPD_C: " "invalid value\n"); } V_pfsyncstats.pfsyncs_badval++; continue; } st = pf_find_state_byid(up->id, up->creatorid); if (st == NULL) { /* We don't have this state. Ask for it. */ PFSYNC_BUCKET_LOCK(&sc->sc_buckets[0]); pfsync_request_update(up->creatorid, up->id); PFSYNC_BUCKET_UNLOCK(&sc->sc_buckets[0]); continue; } if (st->state_flags & PFSTATE_ACK) { pfsync_undefer_state(st, 1); } if (st->key[PF_SK_WIRE]->proto == IPPROTO_TCP) sync = pfsync_upd_tcp(st, &up->src, &up->dst); else { sync = 0; /* * Non-TCP protocol state machine always go * forwards */ if (st->src.state > up->src.state) sync++; else pf_state_peer_ntoh(&up->src, &st->src); if (st->dst.state > up->dst.state) sync++; else pf_state_peer_ntoh(&up->dst, &st->dst); } if (sync < 2) { pfsync_alloc_scrub_memory(&up->dst, &st->dst); pf_state_peer_ntoh(&up->dst, &st->dst); st->expire = time_uptime; st->timeout = up->timeout; } st->pfsync_time = time_uptime; if (sync) { V_pfsyncstats.pfsyncs_stale++; pfsync_update_state(st); PF_STATE_UNLOCK(st); pfsync_push_all(sc); continue; } PF_STATE_UNLOCK(st); } return (len); } static int pfsync_in_ureq(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct pfsync_upd_req *ur, *ura; struct mbuf *mp; int len = count * sizeof(*ur); int i, offp; struct pf_state *st; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } ura = (struct pfsync_upd_req *)(mp->m_data + offp); for (i = 0; i < count; i++) { ur = &ura[i]; if (ur->id == 0 && ur->creatorid == 0) pfsync_bulk_start(); else { st = pf_find_state_byid(ur->id, ur->creatorid); if (st == NULL) { V_pfsyncstats.pfsyncs_badstate++; continue; } if (st->state_flags & PFSTATE_NOSYNC) { PF_STATE_UNLOCK(st); continue; } pfsync_update_state_req(st); PF_STATE_UNLOCK(st); } } return (len); } static int pfsync_in_del(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct mbuf *mp; struct pfsync_state *sa, *sp; struct pf_state *st; int len = count * sizeof(*sp); int offp, i; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } sa = (struct pfsync_state *)(mp->m_data + offp); for (i = 0; i < count; i++) { sp = &sa[i]; st = pf_find_state_byid(sp->id, sp->creatorid); if (st == NULL) { V_pfsyncstats.pfsyncs_badstate++; continue; } st->state_flags |= PFSTATE_NOSYNC; pf_unlink_state(st, PF_ENTER_LOCKED); } return (len); } static int pfsync_in_del_c(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct mbuf *mp; struct pfsync_del_c *sa, *sp; struct pf_state *st; int len = count * sizeof(*sp); int offp, i; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } sa = (struct pfsync_del_c *)(mp->m_data + offp); for (i = 0; i < count; i++) { sp = &sa[i]; st = pf_find_state_byid(sp->id, sp->creatorid); if (st == NULL) { V_pfsyncstats.pfsyncs_badstate++; continue; } st->state_flags |= PFSTATE_NOSYNC; pf_unlink_state(st, PF_ENTER_LOCKED); } return (len); } static int pfsync_in_bus(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_bus *bus; struct mbuf *mp; int len = count * sizeof(*bus); int offp; PFSYNC_BLOCK(sc); /* If we're not waiting for a bulk update, who cares. */ if (sc->sc_ureq_sent == 0) { PFSYNC_BUNLOCK(sc); return (len); } mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { PFSYNC_BUNLOCK(sc); V_pfsyncstats.pfsyncs_badlen++; return (-1); } bus = (struct pfsync_bus *)(mp->m_data + offp); switch (bus->status) { case PFSYNC_BUS_START: callout_reset(&sc->sc_bulkfail_tmo, 4 * hz + V_pf_limits[PF_LIMIT_STATES].limit / ((sc->sc_ifp->if_mtu - PFSYNC_MINPKT) / sizeof(struct pfsync_state)), pfsync_bulk_fail, sc); if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync: received bulk update start\n"); break; case PFSYNC_BUS_END: if (time_uptime - ntohl(bus->endtime) >= sc->sc_ureq_sent) { /* that's it, we're happy */ sc->sc_ureq_sent = 0; sc->sc_bulk_tries = 0; callout_stop(&sc->sc_bulkfail_tmo); if (!(sc->sc_flags & PFSYNCF_OK) && carp_demote_adj_p) (*carp_demote_adj_p)(-V_pfsync_carp_adj, "pfsync bulk done"); sc->sc_flags |= PFSYNCF_OK; if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync: received valid " "bulk update end\n"); } else { if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync: received invalid " "bulk update end: bad timestamp\n"); } break; } PFSYNC_BUNLOCK(sc); return (len); } static int pfsync_in_tdb(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { int len = count * sizeof(struct pfsync_tdb); #if defined(IPSEC) struct pfsync_tdb *tp; struct mbuf *mp; int offp; int i; int s; mp = m_pulldown(m, offset, len, &offp); if (mp == NULL) { V_pfsyncstats.pfsyncs_badlen++; return (-1); } tp = (struct pfsync_tdb *)(mp->m_data + offp); for (i = 0; i < count; i++) pfsync_update_net_tdb(&tp[i]); #endif return (len); } #if defined(IPSEC) /* Update an in-kernel tdb. Silently fail if no tdb is found. */ static void pfsync_update_net_tdb(struct pfsync_tdb *pt) { struct tdb *tdb; int s; /* check for invalid values */ if (ntohl(pt->spi) <= SPI_RESERVED_MAX || (pt->dst.sa.sa_family != AF_INET && pt->dst.sa.sa_family != AF_INET6)) goto bad; tdb = gettdb(pt->spi, &pt->dst, pt->sproto); if (tdb) { pt->rpl = ntohl(pt->rpl); pt->cur_bytes = (unsigned long long)be64toh(pt->cur_bytes); /* Neither replay nor byte counter should ever decrease. */ if (pt->rpl < tdb->tdb_rpl || pt->cur_bytes < tdb->tdb_cur_bytes) { goto bad; } tdb->tdb_rpl = pt->rpl; tdb->tdb_cur_bytes = pt->cur_bytes; } return; bad: if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync_insert: PFSYNC_ACT_TDB_UPD: " "invalid value\n"); V_pfsyncstats.pfsyncs_badstate++; return; } #endif static int pfsync_in_eof(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { /* check if we are at the right place in the packet */ if (offset != m->m_pkthdr.len) V_pfsyncstats.pfsyncs_badlen++; /* we're done. free and let the caller return */ m_freem(m); return (-1); } static int pfsync_in_error(struct pfsync_pkt *pkt, struct mbuf *m, int offset, int count) { V_pfsyncstats.pfsyncs_badact++; m_freem(m); return (-1); } static int pfsyncoutput(struct ifnet *ifp, struct mbuf *m, const struct sockaddr *dst, struct route *rt) { m_freem(m); return (0); } /* ARGSUSED */ static int pfsyncioctl(struct ifnet *ifp, u_long cmd, caddr_t data) { struct pfsync_softc *sc = ifp->if_softc; struct ifreq *ifr = (struct ifreq *)data; struct pfsyncreq pfsyncr; int error; int c; switch (cmd) { case SIOCSIFFLAGS: PFSYNC_LOCK(sc); if (ifp->if_flags & IFF_UP) { ifp->if_drv_flags |= IFF_DRV_RUNNING; PFSYNC_UNLOCK(sc); pfsync_pointers_init(); } else { ifp->if_drv_flags &= ~IFF_DRV_RUNNING; PFSYNC_UNLOCK(sc); pfsync_pointers_uninit(); } break; case SIOCSIFMTU: if (!sc->sc_sync_if || ifr->ifr_mtu <= PFSYNC_MINPKT || ifr->ifr_mtu > sc->sc_sync_if->if_mtu) return (EINVAL); if (ifr->ifr_mtu < ifp->if_mtu) { for (c = 0; c < pfsync_buckets; c++) { PFSYNC_BUCKET_LOCK(&sc->sc_buckets[c]); if (sc->sc_buckets[c].b_len > PFSYNC_MINPKT) pfsync_sendout(1, c); PFSYNC_BUCKET_UNLOCK(&sc->sc_buckets[c]); } } ifp->if_mtu = ifr->ifr_mtu; break; case SIOCGETPFSYNC: bzero(&pfsyncr, sizeof(pfsyncr)); PFSYNC_LOCK(sc); if (sc->sc_sync_if) { strlcpy(pfsyncr.pfsyncr_syncdev, sc->sc_sync_if->if_xname, IFNAMSIZ); } pfsyncr.pfsyncr_syncpeer = sc->sc_sync_peer; pfsyncr.pfsyncr_maxupdates = sc->sc_maxupdates; pfsyncr.pfsyncr_defer = (PFSYNCF_DEFER == (sc->sc_flags & PFSYNCF_DEFER)); PFSYNC_UNLOCK(sc); return (copyout(&pfsyncr, ifr_data_get_ptr(ifr), sizeof(pfsyncr))); case SIOCSETPFSYNC: { struct in_mfilter *imf = NULL; struct ifnet *sifp; struct ip *ip; if ((error = priv_check(curthread, PRIV_NETINET_PF)) != 0) return (error); if ((error = copyin(ifr_data_get_ptr(ifr), &pfsyncr, sizeof(pfsyncr)))) return (error); if (pfsyncr.pfsyncr_maxupdates > 255) return (EINVAL); if (pfsyncr.pfsyncr_syncdev[0] == 0) sifp = NULL; else if ((sifp = ifunit_ref(pfsyncr.pfsyncr_syncdev)) == NULL) return (EINVAL); if (sifp != NULL && ( pfsyncr.pfsyncr_syncpeer.s_addr == 0 || pfsyncr.pfsyncr_syncpeer.s_addr == htonl(INADDR_PFSYNC_GROUP))) imf = ip_mfilter_alloc(M_WAITOK, 0, 0); PFSYNC_LOCK(sc); if (pfsyncr.pfsyncr_syncpeer.s_addr == 0) sc->sc_sync_peer.s_addr = htonl(INADDR_PFSYNC_GROUP); else sc->sc_sync_peer.s_addr = pfsyncr.pfsyncr_syncpeer.s_addr; sc->sc_maxupdates = pfsyncr.pfsyncr_maxupdates; if (pfsyncr.pfsyncr_defer) { sc->sc_flags |= PFSYNCF_DEFER; V_pfsync_defer_ptr = pfsync_defer; } else { sc->sc_flags &= ~PFSYNCF_DEFER; V_pfsync_defer_ptr = NULL; } if (sifp == NULL) { if (sc->sc_sync_if) if_rele(sc->sc_sync_if); sc->sc_sync_if = NULL; pfsync_multicast_cleanup(sc); PFSYNC_UNLOCK(sc); break; } for (c = 0; c < pfsync_buckets; c++) { PFSYNC_BUCKET_LOCK(&sc->sc_buckets[c]); if (sc->sc_buckets[c].b_len > PFSYNC_MINPKT && (sifp->if_mtu < sc->sc_ifp->if_mtu || (sc->sc_sync_if != NULL && sifp->if_mtu < sc->sc_sync_if->if_mtu) || sifp->if_mtu < MCLBYTES - sizeof(struct ip))) pfsync_sendout(1, c); PFSYNC_BUCKET_UNLOCK(&sc->sc_buckets[c]); } pfsync_multicast_cleanup(sc); if (sc->sc_sync_peer.s_addr == htonl(INADDR_PFSYNC_GROUP)) { error = pfsync_multicast_setup(sc, sifp, imf); if (error) { if_rele(sifp); ip_mfilter_free(imf); PFSYNC_UNLOCK(sc); return (error); } } if (sc->sc_sync_if) if_rele(sc->sc_sync_if); sc->sc_sync_if = sifp; ip = &sc->sc_template; bzero(ip, sizeof(*ip)); ip->ip_v = IPVERSION; ip->ip_hl = sizeof(sc->sc_template) >> 2; ip->ip_tos = IPTOS_LOWDELAY; /* len and id are set later. */ ip->ip_off = htons(IP_DF); ip->ip_ttl = PFSYNC_DFLTTL; ip->ip_p = IPPROTO_PFSYNC; ip->ip_src.s_addr = INADDR_ANY; ip->ip_dst.s_addr = sc->sc_sync_peer.s_addr; /* Request a full state table update. */ if ((sc->sc_flags & PFSYNCF_OK) && carp_demote_adj_p) (*carp_demote_adj_p)(V_pfsync_carp_adj, "pfsync bulk start"); sc->sc_flags &= ~PFSYNCF_OK; if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync: requesting bulk update\n"); PFSYNC_UNLOCK(sc); PFSYNC_BUCKET_LOCK(&sc->sc_buckets[0]); pfsync_request_update(0, 0); PFSYNC_BUCKET_UNLOCK(&sc->sc_buckets[0]); PFSYNC_BLOCK(sc); sc->sc_ureq_sent = time_uptime; callout_reset(&sc->sc_bulkfail_tmo, 5 * hz, pfsync_bulk_fail, sc); PFSYNC_BUNLOCK(sc); break; } default: return (ENOTTY); } return (0); } static void pfsync_out_state(struct pf_state *st, void *buf) { struct pfsync_state *sp = buf; pfsync_state_export(sp, st); } static void pfsync_out_iack(struct pf_state *st, void *buf) { struct pfsync_ins_ack *iack = buf; iack->id = st->id; iack->creatorid = st->creatorid; } static void pfsync_out_upd_c(struct pf_state *st, void *buf) { struct pfsync_upd_c *up = buf; bzero(up, sizeof(*up)); up->id = st->id; pf_state_peer_hton(&st->src, &up->src); pf_state_peer_hton(&st->dst, &up->dst); up->creatorid = st->creatorid; up->timeout = st->timeout; } static void pfsync_out_del(struct pf_state *st, void *buf) { struct pfsync_del_c *dp = buf; dp->id = st->id; dp->creatorid = st->creatorid; st->state_flags |= PFSTATE_NOSYNC; } static void pfsync_drop(struct pfsync_softc *sc) { struct pf_state *st, *next; struct pfsync_upd_req_item *ur; struct pfsync_bucket *b; int c, q; for (c = 0; c < pfsync_buckets; c++) { b = &sc->sc_buckets[c]; for (q = 0; q < PFSYNC_S_COUNT; q++) { if (TAILQ_EMPTY(&b->b_qs[q])) continue; TAILQ_FOREACH_SAFE(st, &b->b_qs[q], sync_list, next) { KASSERT(st->sync_state == q, ("%s: st->sync_state == q", __func__)); st->sync_state = PFSYNC_S_NONE; pf_release_state(st); } TAILQ_INIT(&b->b_qs[q]); } while ((ur = TAILQ_FIRST(&b->b_upd_req_list)) != NULL) { TAILQ_REMOVE(&b->b_upd_req_list, ur, ur_entry); free(ur, M_PFSYNC); } b->b_len = PFSYNC_MINPKT; b->b_plus = NULL; } } static void pfsync_sendout(int schedswi, int c) { struct pfsync_softc *sc = V_pfsyncif; struct ifnet *ifp = sc->sc_ifp; struct mbuf *m; struct ip *ip; struct pfsync_header *ph; struct pfsync_subheader *subh; struct pf_state *st, *st_next; struct pfsync_upd_req_item *ur; struct pfsync_bucket *b = &sc->sc_buckets[c]; int offset; int q, count = 0; KASSERT(sc != NULL, ("%s: null sc", __func__)); KASSERT(b->b_len > PFSYNC_MINPKT, ("%s: sc_len %zu", __func__, b->b_len)); PFSYNC_BUCKET_LOCK_ASSERT(b); if (ifp->if_bpf == NULL && sc->sc_sync_if == NULL) { pfsync_drop(sc); return; } m = m_get2(max_linkhdr + b->b_len, M_NOWAIT, MT_DATA, M_PKTHDR); if (m == NULL) { if_inc_counter(sc->sc_ifp, IFCOUNTER_OERRORS, 1); V_pfsyncstats.pfsyncs_onomem++; return; } m->m_data += max_linkhdr; m->m_len = m->m_pkthdr.len = b->b_len; /* build the ip header */ ip = (struct ip *)m->m_data; bcopy(&sc->sc_template, ip, sizeof(*ip)); offset = sizeof(*ip); ip->ip_len = htons(m->m_pkthdr.len); ip_fillid(ip); /* build the pfsync header */ ph = (struct pfsync_header *)(m->m_data + offset); bzero(ph, sizeof(*ph)); offset += sizeof(*ph); ph->version = PFSYNC_VERSION; ph->len = htons(b->b_len - sizeof(*ip)); bcopy(V_pf_status.pf_chksum, ph->pfcksum, PF_MD5_DIGEST_LENGTH); /* walk the queues */ for (q = 0; q < PFSYNC_S_COUNT; q++) { if (TAILQ_EMPTY(&b->b_qs[q])) continue; subh = (struct pfsync_subheader *)(m->m_data + offset); offset += sizeof(*subh); count = 0; TAILQ_FOREACH_SAFE(st, &b->b_qs[q], sync_list, st_next) { KASSERT(st->sync_state == q, ("%s: st->sync_state == q", __func__)); /* * XXXGL: some of write methods do unlocked reads * of state data :( */ pfsync_qs[q].write(st, m->m_data + offset); offset += pfsync_qs[q].len; st->sync_state = PFSYNC_S_NONE; pf_release_state(st); count++; } TAILQ_INIT(&b->b_qs[q]); bzero(subh, sizeof(*subh)); subh->action = pfsync_qs[q].action; subh->count = htons(count); V_pfsyncstats.pfsyncs_oacts[pfsync_qs[q].action] += count; } if (!TAILQ_EMPTY(&b->b_upd_req_list)) { subh = (struct pfsync_subheader *)(m->m_data + offset); offset += sizeof(*subh); count = 0; while ((ur = TAILQ_FIRST(&b->b_upd_req_list)) != NULL) { TAILQ_REMOVE(&b->b_upd_req_list, ur, ur_entry); bcopy(&ur->ur_msg, m->m_data + offset, sizeof(ur->ur_msg)); offset += sizeof(ur->ur_msg); free(ur, M_PFSYNC); count++; } bzero(subh, sizeof(*subh)); subh->action = PFSYNC_ACT_UPD_REQ; subh->count = htons(count); V_pfsyncstats.pfsyncs_oacts[PFSYNC_ACT_UPD_REQ] += count; } /* has someone built a custom region for us to add? */ if (b->b_plus != NULL) { bcopy(b->b_plus, m->m_data + offset, b->b_pluslen); offset += b->b_pluslen; b->b_plus = NULL; } subh = (struct pfsync_subheader *)(m->m_data + offset); offset += sizeof(*subh); bzero(subh, sizeof(*subh)); subh->action = PFSYNC_ACT_EOF; subh->count = htons(1); V_pfsyncstats.pfsyncs_oacts[PFSYNC_ACT_EOF]++; /* we're done, let's put it on the wire */ if (ifp->if_bpf) { m->m_data += sizeof(*ip); m->m_len = m->m_pkthdr.len = b->b_len - sizeof(*ip); BPF_MTAP(ifp, m); m->m_data -= sizeof(*ip); m->m_len = m->m_pkthdr.len = b->b_len; } if (sc->sc_sync_if == NULL) { b->b_len = PFSYNC_MINPKT; m_freem(m); return; } if_inc_counter(sc->sc_ifp, IFCOUNTER_OPACKETS, 1); if_inc_counter(sc->sc_ifp, IFCOUNTER_OBYTES, m->m_pkthdr.len); b->b_len = PFSYNC_MINPKT; if (!_IF_QFULL(&b->b_snd)) _IF_ENQUEUE(&b->b_snd, m); else { m_freem(m); if_inc_counter(sc->sc_ifp, IFCOUNTER_OQDROPS, 1); } if (schedswi) swi_sched(V_pfsync_swi_cookie, 0); } static void pfsync_insert_state(struct pf_state *st) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); if (st->state_flags & PFSTATE_NOSYNC) return; if ((st->rule.ptr->rule_flag & PFRULE_NOSYNC) || st->key[PF_SK_WIRE]->proto == IPPROTO_PFSYNC) { st->state_flags |= PFSTATE_NOSYNC; return; } KASSERT(st->sync_state == PFSYNC_S_NONE, ("%s: st->sync_state %u", __func__, st->sync_state)); PFSYNC_BUCKET_LOCK(b); if (b->b_len == PFSYNC_MINPKT) callout_reset(&b->b_tmo, 1 * hz, pfsync_timeout, b); pfsync_q_ins(st, PFSYNC_S_INS, true); PFSYNC_BUCKET_UNLOCK(b); st->sync_updates = 0; } static int pfsync_defer(struct pf_state *st, struct mbuf *m) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_deferral *pd; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); if (m->m_flags & (M_BCAST|M_MCAST)) return (0); PFSYNC_LOCK(sc); if (sc == NULL || !(sc->sc_ifp->if_flags & IFF_DRV_RUNNING) || !(sc->sc_flags & PFSYNCF_DEFER)) { PFSYNC_UNLOCK(sc); return (0); } if (b->b_deferred >= 128) pfsync_undefer(TAILQ_FIRST(&b->b_deferrals), 0); pd = malloc(sizeof(*pd), M_PFSYNC, M_NOWAIT); if (pd == NULL) return (0); b->b_deferred++; m->m_flags |= M_SKIP_FIREWALL; st->state_flags |= PFSTATE_ACK; pd->pd_sc = sc; pd->pd_refs = 0; pd->pd_st = st; pf_ref_state(st); pd->pd_m = m; TAILQ_INSERT_TAIL(&b->b_deferrals, pd, pd_entry); callout_init_mtx(&pd->pd_tmo, &b->b_mtx, CALLOUT_RETURNUNLOCKED); callout_reset(&pd->pd_tmo, 10, pfsync_defer_tmo, pd); pfsync_push(b); return (1); } static void pfsync_undefer(struct pfsync_deferral *pd, int drop) { struct pfsync_softc *sc = pd->pd_sc; struct mbuf *m = pd->pd_m; struct pf_state *st = pd->pd_st; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); PFSYNC_BUCKET_LOCK_ASSERT(b); TAILQ_REMOVE(&b->b_deferrals, pd, pd_entry); b->b_deferred--; pd->pd_st->state_flags &= ~PFSTATE_ACK; /* XXX: locking! */ free(pd, M_PFSYNC); pf_release_state(st); if (drop) m_freem(m); else { _IF_ENQUEUE(&b->b_snd, m); pfsync_push(b); } } static void pfsync_defer_tmo(void *arg) { struct epoch_tracker et; struct pfsync_deferral *pd = arg; struct pfsync_softc *sc = pd->pd_sc; struct mbuf *m = pd->pd_m; struct pf_state *st = pd->pd_st; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); PFSYNC_BUCKET_LOCK_ASSERT(b); NET_EPOCH_ENTER(et); CURVNET_SET(m->m_pkthdr.rcvif->if_vnet); TAILQ_REMOVE(&b->b_deferrals, pd, pd_entry); b->b_deferred--; pd->pd_st->state_flags &= ~PFSTATE_ACK; /* XXX: locking! */ if (pd->pd_refs == 0) free(pd, M_PFSYNC); PFSYNC_UNLOCK(sc); ip_output(m, NULL, NULL, 0, NULL, NULL); pf_release_state(st); CURVNET_RESTORE(); NET_EPOCH_EXIT(et); } static void pfsync_undefer_state(struct pf_state *st, int drop) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_deferral *pd; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); PFSYNC_BUCKET_LOCK(b); TAILQ_FOREACH(pd, &b->b_deferrals, pd_entry) { if (pd->pd_st == st) { if (callout_stop(&pd->pd_tmo) > 0) pfsync_undefer(pd, drop); PFSYNC_BUCKET_UNLOCK(b); return; } } PFSYNC_BUCKET_UNLOCK(b); panic("%s: unable to find deferred state", __func__); } static struct pfsync_bucket* pfsync_get_bucket(struct pfsync_softc *sc, struct pf_state *st) { int c = PF_IDHASH(st) % pfsync_buckets; return &sc->sc_buckets[c]; } static void pfsync_update_state(struct pf_state *st) { struct pfsync_softc *sc = V_pfsyncif; bool sync = false, ref = true; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); PF_STATE_LOCK_ASSERT(st); PFSYNC_BUCKET_LOCK(b); if (st->state_flags & PFSTATE_ACK) pfsync_undefer_state(st, 0); if (st->state_flags & PFSTATE_NOSYNC) { if (st->sync_state != PFSYNC_S_NONE) pfsync_q_del(st, true, b); PFSYNC_BUCKET_UNLOCK(b); return; } if (b->b_len == PFSYNC_MINPKT) callout_reset(&b->b_tmo, 1 * hz, pfsync_timeout, b); switch (st->sync_state) { case PFSYNC_S_UPD_C: case PFSYNC_S_UPD: case PFSYNC_S_INS: /* we're already handling it */ if (st->key[PF_SK_WIRE]->proto == IPPROTO_TCP) { st->sync_updates++; if (st->sync_updates >= sc->sc_maxupdates) sync = true; } break; case PFSYNC_S_IACK: pfsync_q_del(st, false, b); ref = false; /* FALLTHROUGH */ case PFSYNC_S_NONE: pfsync_q_ins(st, PFSYNC_S_UPD_C, ref); st->sync_updates = 0; break; default: panic("%s: unexpected sync state %d", __func__, st->sync_state); } if (sync || (time_uptime - st->pfsync_time) < 2) pfsync_push(b); PFSYNC_BUCKET_UNLOCK(b); } static void pfsync_request_update(u_int32_t creatorid, u_int64_t id) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_bucket *b = &sc->sc_buckets[0]; struct pfsync_upd_req_item *item; size_t nlen = sizeof(struct pfsync_upd_req); PFSYNC_BUCKET_LOCK_ASSERT(b); /* * This code does a bit to prevent multiple update requests for the * same state being generated. It searches current subheader queue, * but it doesn't lookup into queue of already packed datagrams. */ TAILQ_FOREACH(item, &b->b_upd_req_list, ur_entry) if (item->ur_msg.id == id && item->ur_msg.creatorid == creatorid) return; item = malloc(sizeof(*item), M_PFSYNC, M_NOWAIT); if (item == NULL) return; /* XXX stats */ item->ur_msg.id = id; item->ur_msg.creatorid = creatorid; if (TAILQ_EMPTY(&b->b_upd_req_list)) nlen += sizeof(struct pfsync_subheader); if (b->b_len + nlen > sc->sc_ifp->if_mtu) { pfsync_sendout(1, 0); nlen = sizeof(struct pfsync_subheader) + sizeof(struct pfsync_upd_req); } TAILQ_INSERT_TAIL(&b->b_upd_req_list, item, ur_entry); b->b_len += nlen; } static bool pfsync_update_state_req(struct pf_state *st) { struct pfsync_softc *sc = V_pfsyncif; bool ref = true, full = false; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); PF_STATE_LOCK_ASSERT(st); PFSYNC_BUCKET_LOCK(b); if (st->state_flags & PFSTATE_NOSYNC) { if (st->sync_state != PFSYNC_S_NONE) pfsync_q_del(st, true, b); PFSYNC_BUCKET_UNLOCK(b); return (full); } switch (st->sync_state) { case PFSYNC_S_UPD_C: case PFSYNC_S_IACK: pfsync_q_del(st, false, b); ref = false; /* FALLTHROUGH */ case PFSYNC_S_NONE: pfsync_q_ins(st, PFSYNC_S_UPD, ref); pfsync_push(b); break; case PFSYNC_S_INS: case PFSYNC_S_UPD: case PFSYNC_S_DEL: /* we're already handling it */ break; default: panic("%s: unexpected sync state %d", __func__, st->sync_state); } if ((sc->sc_ifp->if_mtu - b->b_len) < sizeof(struct pfsync_state)) full = true; PFSYNC_BUCKET_UNLOCK(b); return (full); } static void pfsync_delete_state(struct pf_state *st) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); bool ref = true; PFSYNC_BUCKET_LOCK(b); if (st->state_flags & PFSTATE_ACK) pfsync_undefer_state(st, 1); if (st->state_flags & PFSTATE_NOSYNC) { if (st->sync_state != PFSYNC_S_NONE) pfsync_q_del(st, true, b); PFSYNC_BUCKET_UNLOCK(b); return; } if (b->b_len == PFSYNC_MINPKT) callout_reset(&b->b_tmo, 1 * hz, pfsync_timeout, b); switch (st->sync_state) { case PFSYNC_S_INS: /* We never got to tell the world so just forget about it. */ pfsync_q_del(st, true, b); break; case PFSYNC_S_UPD_C: case PFSYNC_S_UPD: case PFSYNC_S_IACK: pfsync_q_del(st, false, b); ref = false; /* FALLTHROUGH */ case PFSYNC_S_NONE: pfsync_q_ins(st, PFSYNC_S_DEL, ref); break; default: panic("%s: unexpected sync state %d", __func__, st->sync_state); } PFSYNC_BUCKET_UNLOCK(b); } static void pfsync_clear_states(u_int32_t creatorid, const char *ifname) { struct { struct pfsync_subheader subh; struct pfsync_clr clr; } __packed r; bzero(&r, sizeof(r)); r.subh.action = PFSYNC_ACT_CLR; r.subh.count = htons(1); V_pfsyncstats.pfsyncs_oacts[PFSYNC_ACT_CLR]++; strlcpy(r.clr.ifname, ifname, sizeof(r.clr.ifname)); r.clr.creatorid = creatorid; pfsync_send_plus(&r, sizeof(r)); } static void pfsync_q_ins(struct pf_state *st, int q, bool ref) { struct pfsync_softc *sc = V_pfsyncif; size_t nlen = pfsync_qs[q].len; struct pfsync_bucket *b = pfsync_get_bucket(sc, st); PFSYNC_BUCKET_LOCK_ASSERT(b); KASSERT(st->sync_state == PFSYNC_S_NONE, ("%s: st->sync_state %u", __func__, st->sync_state)); KASSERT(b->b_len >= PFSYNC_MINPKT, ("pfsync pkt len is too low %zu", b->b_len)); if (TAILQ_EMPTY(&b->b_qs[q])) nlen += sizeof(struct pfsync_subheader); if (b->b_len + nlen > sc->sc_ifp->if_mtu) { pfsync_sendout(1, b->b_id); nlen = sizeof(struct pfsync_subheader) + pfsync_qs[q].len; } b->b_len += nlen; TAILQ_INSERT_TAIL(&b->b_qs[q], st, sync_list); st->sync_state = q; if (ref) pf_ref_state(st); } static void pfsync_q_del(struct pf_state *st, bool unref, struct pfsync_bucket *b) { int q = st->sync_state; PFSYNC_BUCKET_LOCK_ASSERT(b); KASSERT(st->sync_state != PFSYNC_S_NONE, ("%s: st->sync_state != PFSYNC_S_NONE", __func__)); b->b_len -= pfsync_qs[q].len; TAILQ_REMOVE(&b->b_qs[q], st, sync_list); st->sync_state = PFSYNC_S_NONE; if (unref) pf_release_state(st); if (TAILQ_EMPTY(&b->b_qs[q])) b->b_len -= sizeof(struct pfsync_subheader); } static void pfsync_bulk_start(void) { struct pfsync_softc *sc = V_pfsyncif; if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync: received bulk update request\n"); PFSYNC_BLOCK(sc); sc->sc_ureq_received = time_uptime; sc->sc_bulk_hashid = 0; sc->sc_bulk_stateid = 0; pfsync_bulk_status(PFSYNC_BUS_START); callout_reset(&sc->sc_bulk_tmo, 1, pfsync_bulk_update, sc); PFSYNC_BUNLOCK(sc); } static void pfsync_bulk_update(void *arg) { struct pfsync_softc *sc = arg; struct pf_state *s; int i, sent = 0; PFSYNC_BLOCK_ASSERT(sc); CURVNET_SET(sc->sc_ifp->if_vnet); /* * Start with last state from previous invocation. * It may had gone, in this case start from the * hash slot. */ s = pf_find_state_byid(sc->sc_bulk_stateid, sc->sc_bulk_creatorid); if (s != NULL) i = PF_IDHASH(s); else i = sc->sc_bulk_hashid; for (; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; if (s != NULL) PF_HASHROW_ASSERT(ih); else { PF_HASHROW_LOCK(ih); s = LIST_FIRST(&ih->states); } for (; s; s = LIST_NEXT(s, entry)) { if (s->sync_state == PFSYNC_S_NONE && s->timeout < PFTM_MAX && s->pfsync_time <= sc->sc_ureq_received) { if (pfsync_update_state_req(s)) { /* We've filled a packet. */ sc->sc_bulk_hashid = i; sc->sc_bulk_stateid = s->id; sc->sc_bulk_creatorid = s->creatorid; PF_HASHROW_UNLOCK(ih); callout_reset(&sc->sc_bulk_tmo, 1, pfsync_bulk_update, sc); goto full; } sent++; } } PF_HASHROW_UNLOCK(ih); } /* We're done. */ pfsync_bulk_status(PFSYNC_BUS_END); full: CURVNET_RESTORE(); } static void pfsync_bulk_status(u_int8_t status) { struct { struct pfsync_subheader subh; struct pfsync_bus bus; } __packed r; struct pfsync_softc *sc = V_pfsyncif; bzero(&r, sizeof(r)); r.subh.action = PFSYNC_ACT_BUS; r.subh.count = htons(1); V_pfsyncstats.pfsyncs_oacts[PFSYNC_ACT_BUS]++; r.bus.creatorid = V_pf_status.hostid; r.bus.endtime = htonl(time_uptime - sc->sc_ureq_received); r.bus.status = status; pfsync_send_plus(&r, sizeof(r)); } static void pfsync_bulk_fail(void *arg) { struct pfsync_softc *sc = arg; struct pfsync_bucket *b = &sc->sc_buckets[0]; CURVNET_SET(sc->sc_ifp->if_vnet); PFSYNC_BLOCK_ASSERT(sc); if (sc->sc_bulk_tries++ < PFSYNC_MAX_BULKTRIES) { /* Try again */ callout_reset(&sc->sc_bulkfail_tmo, 5 * hz, pfsync_bulk_fail, V_pfsyncif); PFSYNC_BUCKET_LOCK(b); pfsync_request_update(0, 0); PFSYNC_BUCKET_UNLOCK(b); } else { /* Pretend like the transfer was ok. */ sc->sc_ureq_sent = 0; sc->sc_bulk_tries = 0; PFSYNC_LOCK(sc); if (!(sc->sc_flags & PFSYNCF_OK) && carp_demote_adj_p) (*carp_demote_adj_p)(-V_pfsync_carp_adj, "pfsync bulk fail"); sc->sc_flags |= PFSYNCF_OK; PFSYNC_UNLOCK(sc); if (V_pf_status.debug >= PF_DEBUG_MISC) printf("pfsync: failed to receive bulk update\n"); } CURVNET_RESTORE(); } static void pfsync_send_plus(void *plus, size_t pluslen) { struct pfsync_softc *sc = V_pfsyncif; struct pfsync_bucket *b = &sc->sc_buckets[0]; PFSYNC_BUCKET_LOCK(b); if (b->b_len + pluslen > sc->sc_ifp->if_mtu) pfsync_sendout(1, b->b_id); b->b_plus = plus; b->b_len += (b->b_pluslen = pluslen); pfsync_sendout(1, b->b_id); PFSYNC_BUCKET_UNLOCK(b); } static void pfsync_timeout(void *arg) { struct pfsync_bucket *b = arg; CURVNET_SET(b->b_sc->sc_ifp->if_vnet); PFSYNC_BUCKET_LOCK(b); pfsync_push(b); PFSYNC_BUCKET_UNLOCK(b); CURVNET_RESTORE(); } static void pfsync_push(struct pfsync_bucket *b) { PFSYNC_BUCKET_LOCK_ASSERT(b); b->b_flags |= PFSYNCF_BUCKET_PUSH; swi_sched(V_pfsync_swi_cookie, 0); } static void pfsync_push_all(struct pfsync_softc *sc) { int c; struct pfsync_bucket *b; for (c = 0; c < pfsync_buckets; c++) { b = &sc->sc_buckets[c]; PFSYNC_BUCKET_LOCK(b); pfsync_push(b); PFSYNC_BUCKET_UNLOCK(b); } } static void pfsyncintr(void *arg) { struct epoch_tracker et; struct pfsync_softc *sc = arg; struct pfsync_bucket *b; struct mbuf *m, *n; int c; NET_EPOCH_ENTER(et); CURVNET_SET(sc->sc_ifp->if_vnet); for (c = 0; c < pfsync_buckets; c++) { b = &sc->sc_buckets[c]; PFSYNC_BUCKET_LOCK(b); if ((b->b_flags & PFSYNCF_BUCKET_PUSH) && b->b_len > PFSYNC_MINPKT) { pfsync_sendout(0, b->b_id); b->b_flags &= ~PFSYNCF_BUCKET_PUSH; } _IF_DEQUEUE_ALL(&b->b_snd, m); PFSYNC_BUCKET_UNLOCK(b); for (; m != NULL; m = n) { n = m->m_nextpkt; m->m_nextpkt = NULL; /* * We distinguish between a deferral packet and our * own pfsync packet based on M_SKIP_FIREWALL * flag. This is XXX. */ if (m->m_flags & M_SKIP_FIREWALL) ip_output(m, NULL, NULL, 0, NULL, NULL); else if (ip_output(m, NULL, NULL, IP_RAWOUTPUT, &sc->sc_imo, NULL) == 0) V_pfsyncstats.pfsyncs_opackets++; else V_pfsyncstats.pfsyncs_oerrors++; } } CURVNET_RESTORE(); NET_EPOCH_EXIT(et); } static int pfsync_multicast_setup(struct pfsync_softc *sc, struct ifnet *ifp, struct in_mfilter *imf) { struct ip_moptions *imo = &sc->sc_imo; int error; if (!(ifp->if_flags & IFF_MULTICAST)) return (EADDRNOTAVAIL); imo->imo_multicast_vif = -1; if ((error = in_joingroup(ifp, &sc->sc_sync_peer, NULL, &imf->imf_inm)) != 0) return (error); ip_mfilter_init(&imo->imo_head); ip_mfilter_insert(&imo->imo_head, imf); imo->imo_multicast_ifp = ifp; imo->imo_multicast_ttl = PFSYNC_DFLTTL; imo->imo_multicast_loop = 0; return (0); } static void pfsync_multicast_cleanup(struct pfsync_softc *sc) { struct ip_moptions *imo = &sc->sc_imo; struct in_mfilter *imf; while ((imf = ip_mfilter_first(&imo->imo_head)) != NULL) { ip_mfilter_remove(&imo->imo_head, imf); in_leavegroup(imf->imf_inm, NULL); ip_mfilter_free(imf); } imo->imo_multicast_ifp = NULL; } void pfsync_detach_ifnet(struct ifnet *ifp) { struct pfsync_softc *sc = V_pfsyncif; if (sc == NULL) return; PFSYNC_LOCK(sc); if (sc->sc_sync_if == ifp) { /* We don't need mutlicast cleanup here, because the interface * is going away. We do need to ensure we don't try to do * cleanup later. */ ip_mfilter_init(&sc->sc_imo.imo_head); sc->sc_imo.imo_multicast_ifp = NULL; sc->sc_sync_if = NULL; } PFSYNC_UNLOCK(sc); } #ifdef INET extern struct domain inetdomain; static struct protosw in_pfsync_protosw = { .pr_type = SOCK_RAW, .pr_domain = &inetdomain, .pr_protocol = IPPROTO_PFSYNC, .pr_flags = PR_ATOMIC|PR_ADDR, .pr_input = pfsync_input, .pr_output = rip_output, .pr_ctloutput = rip_ctloutput, .pr_usrreqs = &rip_usrreqs }; #endif static void pfsync_pointers_init() { PF_RULES_WLOCK(); V_pfsync_state_import_ptr = pfsync_state_import; V_pfsync_insert_state_ptr = pfsync_insert_state; V_pfsync_update_state_ptr = pfsync_update_state; V_pfsync_delete_state_ptr = pfsync_delete_state; V_pfsync_clear_states_ptr = pfsync_clear_states; V_pfsync_defer_ptr = pfsync_defer; PF_RULES_WUNLOCK(); } static void pfsync_pointers_uninit() { PF_RULES_WLOCK(); V_pfsync_state_import_ptr = NULL; V_pfsync_insert_state_ptr = NULL; V_pfsync_update_state_ptr = NULL; V_pfsync_delete_state_ptr = NULL; V_pfsync_clear_states_ptr = NULL; V_pfsync_defer_ptr = NULL; PF_RULES_WUNLOCK(); } static void vnet_pfsync_init(const void *unused __unused) { int error; V_pfsync_cloner = if_clone_simple(pfsyncname, pfsync_clone_create, pfsync_clone_destroy, 1); error = swi_add(NULL, pfsyncname, pfsyncintr, V_pfsyncif, SWI_NET, INTR_MPSAFE, &V_pfsync_swi_cookie); if (error) { if_clone_detach(V_pfsync_cloner); log(LOG_INFO, "swi_add() failed in %s\n", __func__); } pfsync_pointers_init(); } VNET_SYSINIT(vnet_pfsync_init, SI_SUB_PROTO_FIREWALL, SI_ORDER_ANY, vnet_pfsync_init, NULL); static void vnet_pfsync_uninit(const void *unused __unused) { pfsync_pointers_uninit(); if_clone_detach(V_pfsync_cloner); swi_remove(V_pfsync_swi_cookie); } VNET_SYSUNINIT(vnet_pfsync_uninit, SI_SUB_PROTO_FIREWALL, SI_ORDER_FOURTH, vnet_pfsync_uninit, NULL); static int pfsync_init() { #ifdef INET int error; pfsync_detach_ifnet_ptr = pfsync_detach_ifnet; error = pf_proto_register(PF_INET, &in_pfsync_protosw); if (error) return (error); error = ipproto_register(IPPROTO_PFSYNC); if (error) { pf_proto_unregister(PF_INET, IPPROTO_PFSYNC, SOCK_RAW); return (error); } #endif return (0); } static void pfsync_uninit() { pfsync_detach_ifnet_ptr = NULL; #ifdef INET ipproto_unregister(IPPROTO_PFSYNC); pf_proto_unregister(PF_INET, IPPROTO_PFSYNC, SOCK_RAW); #endif } static int pfsync_modevent(module_t mod, int type, void *data) { int error = 0; switch (type) { case MOD_LOAD: error = pfsync_init(); break; case MOD_UNLOAD: pfsync_uninit(); break; default: error = EINVAL; break; } return (error); } static moduledata_t pfsync_mod = { pfsyncname, pfsync_modevent, 0 }; #define PFSYNC_MODVER 1 /* Stay on FIREWALL as we depend on pf being initialized and on inetdomain. */ DECLARE_MODULE(pfsync, pfsync_mod, SI_SUB_PROTO_FIREWALL, SI_ORDER_ANY); MODULE_VERSION(pfsync, PFSYNC_MODVER); MODULE_DEPEND(pfsync, pf, PF_MODVER, PF_MODVER, PF_MODVER); Index: projects/clang1000-import/sys/netpfil/pf/pf.c =================================================================== --- projects/clang1000-import/sys/netpfil/pf/pf.c (revision 358238) +++ projects/clang1000-import/sys/netpfil/pf/pf.c (revision 358239) @@ -1,6714 +1,6715 @@ /*- * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) 2001 Daniel Hartmeier * Copyright (c) 2002 - 2008 Henning Brauer * Copyright (c) 2012 Gleb Smirnoff * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials provided * with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * Effort sponsored in part by the Defense Advanced Research Projects * Agency (DARPA) and Air Force Research Laboratory, Air Force * Materiel Command, USAF, under agreement number F30602-01-2-0537. * * $OpenBSD: pf.c,v 1.634 2009/02/27 12:37:45 henning Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_bpf.h" #include "opt_pf.h" #include "opt_sctp.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #include #include #include #include #include #include #endif /* INET6 */ #ifdef SCTP #include #endif #include #include #define DPFPRINTF(n, x) if (V_pf_status.debug >= (n)) printf x /* * Global variables */ /* state tables */ VNET_DEFINE(struct pf_altqqueue, pf_altqs[4]); VNET_DEFINE(struct pf_palist, pf_pabuf); VNET_DEFINE(struct pf_altqqueue *, pf_altqs_active); VNET_DEFINE(struct pf_altqqueue *, pf_altq_ifs_active); VNET_DEFINE(struct pf_altqqueue *, pf_altqs_inactive); VNET_DEFINE(struct pf_altqqueue *, pf_altq_ifs_inactive); VNET_DEFINE(struct pf_kstatus, pf_status); VNET_DEFINE(u_int32_t, ticket_altqs_active); VNET_DEFINE(u_int32_t, ticket_altqs_inactive); VNET_DEFINE(int, altqs_inactive_open); VNET_DEFINE(u_int32_t, ticket_pabuf); VNET_DEFINE(MD5_CTX, pf_tcp_secret_ctx); #define V_pf_tcp_secret_ctx VNET(pf_tcp_secret_ctx) VNET_DEFINE(u_char, pf_tcp_secret[16]); #define V_pf_tcp_secret VNET(pf_tcp_secret) VNET_DEFINE(int, pf_tcp_secret_init); #define V_pf_tcp_secret_init VNET(pf_tcp_secret_init) VNET_DEFINE(int, pf_tcp_iss_off); #define V_pf_tcp_iss_off VNET(pf_tcp_iss_off) VNET_DECLARE(int, pf_vnet_active); #define V_pf_vnet_active VNET(pf_vnet_active) VNET_DEFINE_STATIC(uint32_t, pf_purge_idx); #define V_pf_purge_idx VNET(pf_purge_idx) /* * Queue for pf_intr() sends. */ static MALLOC_DEFINE(M_PFTEMP, "pf_temp", "pf(4) temporary allocations"); struct pf_send_entry { STAILQ_ENTRY(pf_send_entry) pfse_next; struct mbuf *pfse_m; enum { PFSE_IP, PFSE_IP6, PFSE_ICMP, PFSE_ICMP6, } pfse_type; struct { int type; int code; int mtu; } icmpopts; }; STAILQ_HEAD(pf_send_head, pf_send_entry); VNET_DEFINE_STATIC(struct pf_send_head, pf_sendqueue); #define V_pf_sendqueue VNET(pf_sendqueue) static struct mtx pf_sendqueue_mtx; MTX_SYSINIT(pf_sendqueue_mtx, &pf_sendqueue_mtx, "pf send queue", MTX_DEF); #define PF_SENDQ_LOCK() mtx_lock(&pf_sendqueue_mtx) #define PF_SENDQ_UNLOCK() mtx_unlock(&pf_sendqueue_mtx) /* * Queue for pf_overload_task() tasks. */ struct pf_overload_entry { SLIST_ENTRY(pf_overload_entry) next; struct pf_addr addr; sa_family_t af; uint8_t dir; struct pf_rule *rule; }; SLIST_HEAD(pf_overload_head, pf_overload_entry); VNET_DEFINE_STATIC(struct pf_overload_head, pf_overloadqueue); #define V_pf_overloadqueue VNET(pf_overloadqueue) VNET_DEFINE_STATIC(struct task, pf_overloadtask); #define V_pf_overloadtask VNET(pf_overloadtask) static struct mtx pf_overloadqueue_mtx; MTX_SYSINIT(pf_overloadqueue_mtx, &pf_overloadqueue_mtx, "pf overload/flush queue", MTX_DEF); #define PF_OVERLOADQ_LOCK() mtx_lock(&pf_overloadqueue_mtx) #define PF_OVERLOADQ_UNLOCK() mtx_unlock(&pf_overloadqueue_mtx) VNET_DEFINE(struct pf_rulequeue, pf_unlinked_rules); struct mtx pf_unlnkdrules_mtx; MTX_SYSINIT(pf_unlnkdrules_mtx, &pf_unlnkdrules_mtx, "pf unlinked rules", MTX_DEF); VNET_DEFINE_STATIC(uma_zone_t, pf_sources_z); #define V_pf_sources_z VNET(pf_sources_z) uma_zone_t pf_mtag_z; VNET_DEFINE(uma_zone_t, pf_state_z); VNET_DEFINE(uma_zone_t, pf_state_key_z); VNET_DEFINE(uint64_t, pf_stateid[MAXCPU]); #define PFID_CPUBITS 8 #define PFID_CPUSHIFT (sizeof(uint64_t) * NBBY - PFID_CPUBITS) #define PFID_CPUMASK ((uint64_t)((1 << PFID_CPUBITS) - 1) << PFID_CPUSHIFT) #define PFID_MAXID (~PFID_CPUMASK) CTASSERT((1 << PFID_CPUBITS) >= MAXCPU); static void pf_src_tree_remove_state(struct pf_state *); static void pf_init_threshold(struct pf_threshold *, u_int32_t, u_int32_t); static void pf_add_threshold(struct pf_threshold *); static int pf_check_threshold(struct pf_threshold *); static void pf_change_ap(struct mbuf *, struct pf_addr *, u_int16_t *, u_int16_t *, u_int16_t *, struct pf_addr *, u_int16_t, u_int8_t, sa_family_t); static int pf_modulate_sack(struct mbuf *, int, struct pf_pdesc *, struct tcphdr *, struct pf_state_peer *); static void pf_change_icmp(struct pf_addr *, u_int16_t *, struct pf_addr *, struct pf_addr *, u_int16_t, u_int16_t *, u_int16_t *, u_int16_t *, u_int16_t *, u_int8_t, sa_family_t); static void pf_send_tcp(struct mbuf *, const struct pf_rule *, sa_family_t, const struct pf_addr *, const struct pf_addr *, u_int16_t, u_int16_t, u_int32_t, u_int32_t, u_int8_t, u_int16_t, u_int16_t, u_int8_t, int, u_int16_t, struct ifnet *); static void pf_send_icmp(struct mbuf *, u_int8_t, u_int8_t, sa_family_t, struct pf_rule *); static void pf_detach_state(struct pf_state *); static int pf_state_key_attach(struct pf_state_key *, struct pf_state_key *, struct pf_state *); static void pf_state_key_detach(struct pf_state *, int); static int pf_state_key_ctor(void *, int, void *, int); static u_int32_t pf_tcp_iss(struct pf_pdesc *); static int pf_test_rule(struct pf_rule **, struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **, struct inpcb *); static int pf_create_state(struct pf_rule *, struct pf_rule *, struct pf_rule *, struct pf_pdesc *, struct pf_src_node *, struct pf_state_key *, struct pf_state_key *, struct mbuf *, int, u_int16_t, u_int16_t, int *, struct pfi_kif *, struct pf_state **, int, u_int16_t, u_int16_t, int); static int pf_test_fragment(struct pf_rule **, int, struct pfi_kif *, struct mbuf *, void *, struct pf_pdesc *, struct pf_rule **, struct pf_ruleset **); static int pf_tcp_track_full(struct pf_state_peer *, struct pf_state_peer *, struct pf_state **, struct pfi_kif *, struct mbuf *, int, struct pf_pdesc *, u_short *, int *); static int pf_tcp_track_sloppy(struct pf_state_peer *, struct pf_state_peer *, struct pf_state **, struct pf_pdesc *, u_short *); static int pf_test_state_tcp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, u_short *); static int pf_test_state_udp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *); static int pf_test_state_icmp(struct pf_state **, int, struct pfi_kif *, struct mbuf *, int, void *, struct pf_pdesc *, u_short *); static int pf_test_state_other(struct pf_state **, int, struct pfi_kif *, struct mbuf *, struct pf_pdesc *); static u_int8_t pf_get_wscale(struct mbuf *, int, u_int16_t, sa_family_t); static u_int16_t pf_get_mss(struct mbuf *, int, u_int16_t, sa_family_t); static u_int16_t pf_calc_mss(struct pf_addr *, sa_family_t, int, u_int16_t); static int pf_check_proto_cksum(struct mbuf *, int, int, u_int8_t, sa_family_t); static void pf_print_state_parts(struct pf_state *, struct pf_state_key *, struct pf_state_key *); static int pf_addr_wrap_neq(struct pf_addr_wrap *, struct pf_addr_wrap *); static struct pf_state *pf_find_state(struct pfi_kif *, struct pf_state_key_cmp *, u_int); static int pf_src_connlimit(struct pf_state **); static void pf_overload_task(void *v, int pending); static int pf_insert_src_node(struct pf_src_node **, struct pf_rule *, struct pf_addr *, sa_family_t); static u_int pf_purge_expired_states(u_int, int); static void pf_purge_unlinked_rules(void); static int pf_mtag_uminit(void *, int, int); static void pf_mtag_free(struct m_tag *); #ifdef INET static void pf_route(struct mbuf **, struct pf_rule *, int, struct ifnet *, struct pf_state *, struct pf_pdesc *, struct inpcb *); #endif /* INET */ #ifdef INET6 static void pf_change_a6(struct pf_addr *, u_int16_t *, struct pf_addr *, u_int8_t); static void pf_route6(struct mbuf **, struct pf_rule *, int, struct ifnet *, struct pf_state *, struct pf_pdesc *, struct inpcb *); #endif /* INET6 */ int in4_cksum(struct mbuf *m, u_int8_t nxt, int off, int len); extern int pf_end_threads; extern struct proc *pf_purge_proc; VNET_DEFINE(struct pf_limit, pf_limits[PF_LIMIT_MAX]); #define PACKET_LOOPED(pd) ((pd)->pf_mtag && \ (pd)->pf_mtag->flags & PF_PACKET_LOOPED) #define STATE_LOOKUP(i, k, d, s, pd) \ do { \ (s) = pf_find_state((i), (k), (d)); \ if ((s) == NULL) \ return (PF_DROP); \ if (PACKET_LOOPED(pd)) \ return (PF_PASS); \ if ((d) == PF_OUT && \ (((s)->rule.ptr->rt == PF_ROUTETO && \ (s)->rule.ptr->direction == PF_OUT) || \ ((s)->rule.ptr->rt == PF_REPLYTO && \ (s)->rule.ptr->direction == PF_IN)) && \ (s)->rt_kif != NULL && \ (s)->rt_kif != (i)) \ return (PF_PASS); \ } while (0) #define BOUND_IFACE(r, k) \ ((r)->rule_flag & PFRULE_IFBOUND) ? (k) : V_pfi_all #define STATE_INC_COUNTERS(s) \ do { \ counter_u64_add(s->rule.ptr->states_cur, 1); \ counter_u64_add(s->rule.ptr->states_tot, 1); \ if (s->anchor.ptr != NULL) { \ counter_u64_add(s->anchor.ptr->states_cur, 1); \ counter_u64_add(s->anchor.ptr->states_tot, 1); \ } \ if (s->nat_rule.ptr != NULL) { \ counter_u64_add(s->nat_rule.ptr->states_cur, 1);\ counter_u64_add(s->nat_rule.ptr->states_tot, 1);\ } \ } while (0) #define STATE_DEC_COUNTERS(s) \ do { \ if (s->nat_rule.ptr != NULL) \ counter_u64_add(s->nat_rule.ptr->states_cur, -1);\ if (s->anchor.ptr != NULL) \ counter_u64_add(s->anchor.ptr->states_cur, -1); \ counter_u64_add(s->rule.ptr->states_cur, -1); \ } while (0) MALLOC_DEFINE(M_PFHASH, "pf_hash", "pf(4) hash header structures"); VNET_DEFINE(struct pf_keyhash *, pf_keyhash); VNET_DEFINE(struct pf_idhash *, pf_idhash); VNET_DEFINE(struct pf_srchash *, pf_srchash); -SYSCTL_NODE(_net, OID_AUTO, pf, CTLFLAG_RW, 0, "pf(4)"); +SYSCTL_NODE(_net, OID_AUTO, pf, CTLFLAG_RW | CTLFLAG_MPSAFE, 0, + "pf(4)"); u_long pf_hashmask; u_long pf_srchashmask; static u_long pf_hashsize; static u_long pf_srchashsize; u_long pf_ioctl_maxcount = 65535; SYSCTL_ULONG(_net_pf, OID_AUTO, states_hashsize, CTLFLAG_RDTUN, &pf_hashsize, 0, "Size of pf(4) states hashtable"); SYSCTL_ULONG(_net_pf, OID_AUTO, source_nodes_hashsize, CTLFLAG_RDTUN, &pf_srchashsize, 0, "Size of pf(4) source nodes hashtable"); SYSCTL_ULONG(_net_pf, OID_AUTO, request_maxcount, CTLFLAG_RW, &pf_ioctl_maxcount, 0, "Maximum number of tables, addresses, ... in a single ioctl() call"); VNET_DEFINE(void *, pf_swi_cookie); VNET_DEFINE(uint32_t, pf_hashseed); #define V_pf_hashseed VNET(pf_hashseed) int pf_addr_cmp(struct pf_addr *a, struct pf_addr *b, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: if (a->addr32[0] > b->addr32[0]) return (1); if (a->addr32[0] < b->addr32[0]) return (-1); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (a->addr32[3] > b->addr32[3]) return (1); if (a->addr32[3] < b->addr32[3]) return (-1); if (a->addr32[2] > b->addr32[2]) return (1); if (a->addr32[2] < b->addr32[2]) return (-1); if (a->addr32[1] > b->addr32[1]) return (1); if (a->addr32[1] < b->addr32[1]) return (-1); if (a->addr32[0] > b->addr32[0]) return (1); if (a->addr32[0] < b->addr32[0]) return (-1); break; #endif /* INET6 */ default: panic("%s: unknown address family %u", __func__, af); } return (0); } static __inline uint32_t pf_hashkey(struct pf_state_key *sk) { uint32_t h; h = murmur3_32_hash32((uint32_t *)sk, sizeof(struct pf_state_key_cmp)/sizeof(uint32_t), V_pf_hashseed); return (h & pf_hashmask); } static __inline uint32_t pf_hashsrc(struct pf_addr *addr, sa_family_t af) { uint32_t h; switch (af) { case AF_INET: h = murmur3_32_hash32((uint32_t *)&addr->v4, sizeof(addr->v4)/sizeof(uint32_t), V_pf_hashseed); break; case AF_INET6: h = murmur3_32_hash32((uint32_t *)&addr->v6, sizeof(addr->v6)/sizeof(uint32_t), V_pf_hashseed); break; default: panic("%s: unknown address family %u", __func__, af); } return (h & pf_srchashmask); } #ifdef ALTQ static int pf_state_hash(struct pf_state *s) { u_int32_t hv = (intptr_t)s / sizeof(*s); hv ^= crc32(&s->src, sizeof(s->src)); hv ^= crc32(&s->dst, sizeof(s->dst)); if (hv == 0) hv = 1; return (hv); } #endif #ifdef INET6 void pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: dst->addr32[0] = src->addr32[0]; break; #endif /* INET */ case AF_INET6: dst->addr32[0] = src->addr32[0]; dst->addr32[1] = src->addr32[1]; dst->addr32[2] = src->addr32[2]; dst->addr32[3] = src->addr32[3]; break; } } #endif /* INET6 */ static void pf_init_threshold(struct pf_threshold *threshold, u_int32_t limit, u_int32_t seconds) { threshold->limit = limit * PF_THRESHOLD_MULT; threshold->seconds = seconds; threshold->count = 0; threshold->last = time_uptime; } static void pf_add_threshold(struct pf_threshold *threshold) { u_int32_t t = time_uptime, diff = t - threshold->last; if (diff >= threshold->seconds) threshold->count = 0; else threshold->count -= threshold->count * diff / threshold->seconds; threshold->count += PF_THRESHOLD_MULT; threshold->last = t; } static int pf_check_threshold(struct pf_threshold *threshold) { return (threshold->count > threshold->limit); } static int pf_src_connlimit(struct pf_state **state) { struct pf_overload_entry *pfoe; int bad = 0; PF_STATE_LOCK_ASSERT(*state); (*state)->src_node->conn++; (*state)->src.tcp_est = 1; pf_add_threshold(&(*state)->src_node->conn_rate); if ((*state)->rule.ptr->max_src_conn && (*state)->rule.ptr->max_src_conn < (*state)->src_node->conn) { counter_u64_add(V_pf_status.lcounters[LCNT_SRCCONN], 1); bad++; } if ((*state)->rule.ptr->max_src_conn_rate.limit && pf_check_threshold(&(*state)->src_node->conn_rate)) { counter_u64_add(V_pf_status.lcounters[LCNT_SRCCONNRATE], 1); bad++; } if (!bad) return (0); /* Kill this state. */ (*state)->timeout = PFTM_PURGE; (*state)->src.state = (*state)->dst.state = TCPS_CLOSED; if ((*state)->rule.ptr->overload_tbl == NULL) return (1); /* Schedule overloading and flushing task. */ pfoe = malloc(sizeof(*pfoe), M_PFTEMP, M_NOWAIT); if (pfoe == NULL) return (1); /* too bad :( */ bcopy(&(*state)->src_node->addr, &pfoe->addr, sizeof(pfoe->addr)); pfoe->af = (*state)->key[PF_SK_WIRE]->af; pfoe->rule = (*state)->rule.ptr; pfoe->dir = (*state)->direction; PF_OVERLOADQ_LOCK(); SLIST_INSERT_HEAD(&V_pf_overloadqueue, pfoe, next); PF_OVERLOADQ_UNLOCK(); taskqueue_enqueue(taskqueue_swi, &V_pf_overloadtask); return (1); } static void pf_overload_task(void *v, int pending) { struct pf_overload_head queue; struct pfr_addr p; struct pf_overload_entry *pfoe, *pfoe1; uint32_t killed = 0; CURVNET_SET((struct vnet *)v); PF_OVERLOADQ_LOCK(); queue = V_pf_overloadqueue; SLIST_INIT(&V_pf_overloadqueue); PF_OVERLOADQ_UNLOCK(); bzero(&p, sizeof(p)); SLIST_FOREACH(pfoe, &queue, next) { counter_u64_add(V_pf_status.lcounters[LCNT_OVERLOAD_TABLE], 1); if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("%s: blocking address ", __func__); pf_print_host(&pfoe->addr, 0, pfoe->af); printf("\n"); } p.pfra_af = pfoe->af; switch (pfoe->af) { #ifdef INET case AF_INET: p.pfra_net = 32; p.pfra_ip4addr = pfoe->addr.v4; break; #endif #ifdef INET6 case AF_INET6: p.pfra_net = 128; p.pfra_ip6addr = pfoe->addr.v6; break; #endif } PF_RULES_WLOCK(); pfr_insert_kentry(pfoe->rule->overload_tbl, &p, time_second); PF_RULES_WUNLOCK(); } /* * Remove those entries, that don't need flushing. */ SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1) if (pfoe->rule->flush == 0) { SLIST_REMOVE(&queue, pfoe, pf_overload_entry, next); free(pfoe, M_PFTEMP); } else counter_u64_add( V_pf_status.lcounters[LCNT_OVERLOAD_FLUSH], 1); /* If nothing to flush, return. */ if (SLIST_EMPTY(&queue)) { CURVNET_RESTORE(); return; } for (int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; struct pf_state_key *sk; struct pf_state *s; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { sk = s->key[PF_SK_WIRE]; SLIST_FOREACH(pfoe, &queue, next) if (sk->af == pfoe->af && ((pfoe->rule->flush & PF_FLUSH_GLOBAL) || pfoe->rule == s->rule.ptr) && ((pfoe->dir == PF_OUT && PF_AEQ(&pfoe->addr, &sk->addr[1], sk->af)) || (pfoe->dir == PF_IN && PF_AEQ(&pfoe->addr, &sk->addr[0], sk->af)))) { s->timeout = PFTM_PURGE; s->src.state = s->dst.state = TCPS_CLOSED; killed++; } } PF_HASHROW_UNLOCK(ih); } SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1) free(pfoe, M_PFTEMP); if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: %u states killed", __func__, killed); CURVNET_RESTORE(); } /* * Can return locked on failure, so that we can consistently * allocate and insert a new one. */ struct pf_src_node * pf_find_src_node(struct pf_addr *src, struct pf_rule *rule, sa_family_t af, int returnlocked) { struct pf_srchash *sh; struct pf_src_node *n; counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_SEARCH], 1); sh = &V_pf_srchash[pf_hashsrc(src, af)]; PF_HASHROW_LOCK(sh); LIST_FOREACH(n, &sh->nodes, entry) if (n->rule.ptr == rule && n->af == af && ((af == AF_INET && n->addr.v4.s_addr == src->v4.s_addr) || (af == AF_INET6 && bcmp(&n->addr, src, sizeof(*src)) == 0))) break; if (n != NULL) { n->states++; PF_HASHROW_UNLOCK(sh); } else if (returnlocked == 0) PF_HASHROW_UNLOCK(sh); return (n); } static int pf_insert_src_node(struct pf_src_node **sn, struct pf_rule *rule, struct pf_addr *src, sa_family_t af) { KASSERT((rule->rule_flag & PFRULE_RULESRCTRACK || rule->rpool.opts & PF_POOL_STICKYADDR), ("%s for non-tracking rule %p", __func__, rule)); if (*sn == NULL) *sn = pf_find_src_node(src, rule, af, 1); if (*sn == NULL) { struct pf_srchash *sh = &V_pf_srchash[pf_hashsrc(src, af)]; PF_HASHROW_ASSERT(sh); if (!rule->max_src_nodes || counter_u64_fetch(rule->src_nodes) < rule->max_src_nodes) (*sn) = uma_zalloc(V_pf_sources_z, M_NOWAIT | M_ZERO); else counter_u64_add(V_pf_status.lcounters[LCNT_SRCNODES], 1); if ((*sn) == NULL) { PF_HASHROW_UNLOCK(sh); return (-1); } pf_init_threshold(&(*sn)->conn_rate, rule->max_src_conn_rate.limit, rule->max_src_conn_rate.seconds); (*sn)->af = af; (*sn)->rule.ptr = rule; PF_ACPY(&(*sn)->addr, src, af); LIST_INSERT_HEAD(&sh->nodes, *sn, entry); (*sn)->creation = time_uptime; (*sn)->ruletype = rule->action; (*sn)->states = 1; if ((*sn)->rule.ptr != NULL) counter_u64_add((*sn)->rule.ptr->src_nodes, 1); PF_HASHROW_UNLOCK(sh); counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_INSERT], 1); } else { if (rule->max_src_states && (*sn)->states >= rule->max_src_states) { counter_u64_add(V_pf_status.lcounters[LCNT_SRCSTATES], 1); return (-1); } } return (0); } void pf_unlink_src_node(struct pf_src_node *src) { PF_HASHROW_ASSERT(&V_pf_srchash[pf_hashsrc(&src->addr, src->af)]); LIST_REMOVE(src, entry); if (src->rule.ptr) counter_u64_add(src->rule.ptr->src_nodes, -1); } u_int pf_free_src_nodes(struct pf_src_node_list *head) { struct pf_src_node *sn, *tmp; u_int count = 0; LIST_FOREACH_SAFE(sn, head, entry, tmp) { uma_zfree(V_pf_sources_z, sn); count++; } counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], count); return (count); } void pf_mtag_initialize() { pf_mtag_z = uma_zcreate("pf mtags", sizeof(struct m_tag) + sizeof(struct pf_mtag), NULL, NULL, pf_mtag_uminit, NULL, UMA_ALIGN_PTR, 0); } /* Per-vnet data storage structures initialization. */ void pf_initialize() { struct pf_keyhash *kh; struct pf_idhash *ih; struct pf_srchash *sh; u_int i; if (pf_hashsize == 0 || !powerof2(pf_hashsize)) pf_hashsize = PF_HASHSIZ; if (pf_srchashsize == 0 || !powerof2(pf_srchashsize)) pf_srchashsize = PF_SRCHASHSIZ; V_pf_hashseed = arc4random(); /* States and state keys storage. */ V_pf_state_z = uma_zcreate("pf states", sizeof(struct pf_state), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_limits[PF_LIMIT_STATES].zone = V_pf_state_z; uma_zone_set_max(V_pf_state_z, PFSTATE_HIWAT); uma_zone_set_warning(V_pf_state_z, "PF states limit reached"); V_pf_state_key_z = uma_zcreate("pf state keys", sizeof(struct pf_state_key), pf_state_key_ctor, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_keyhash = mallocarray(pf_hashsize, sizeof(struct pf_keyhash), M_PFHASH, M_NOWAIT | M_ZERO); V_pf_idhash = mallocarray(pf_hashsize, sizeof(struct pf_idhash), M_PFHASH, M_NOWAIT | M_ZERO); if (V_pf_keyhash == NULL || V_pf_idhash == NULL) { printf("pf: Unable to allocate memory for " "state_hashsize %lu.\n", pf_hashsize); free(V_pf_keyhash, M_PFHASH); free(V_pf_idhash, M_PFHASH); pf_hashsize = PF_HASHSIZ; V_pf_keyhash = mallocarray(pf_hashsize, sizeof(struct pf_keyhash), M_PFHASH, M_WAITOK | M_ZERO); V_pf_idhash = mallocarray(pf_hashsize, sizeof(struct pf_idhash), M_PFHASH, M_WAITOK | M_ZERO); } pf_hashmask = pf_hashsize - 1; for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= pf_hashmask; i++, kh++, ih++) { mtx_init(&kh->lock, "pf_keyhash", NULL, MTX_DEF | MTX_DUPOK); mtx_init(&ih->lock, "pf_idhash", NULL, MTX_DEF); } /* Source nodes. */ V_pf_sources_z = uma_zcreate("pf source nodes", sizeof(struct pf_src_node), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); V_pf_limits[PF_LIMIT_SRC_NODES].zone = V_pf_sources_z; uma_zone_set_max(V_pf_sources_z, PFSNODE_HIWAT); uma_zone_set_warning(V_pf_sources_z, "PF source nodes limit reached"); V_pf_srchash = mallocarray(pf_srchashsize, sizeof(struct pf_srchash), M_PFHASH, M_NOWAIT | M_ZERO); if (V_pf_srchash == NULL) { printf("pf: Unable to allocate memory for " "source_hashsize %lu.\n", pf_srchashsize); pf_srchashsize = PF_SRCHASHSIZ; V_pf_srchash = mallocarray(pf_srchashsize, sizeof(struct pf_srchash), M_PFHASH, M_WAITOK | M_ZERO); } pf_srchashmask = pf_srchashsize - 1; for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) mtx_init(&sh->lock, "pf_srchash", NULL, MTX_DEF); /* ALTQ */ TAILQ_INIT(&V_pf_altqs[0]); TAILQ_INIT(&V_pf_altqs[1]); TAILQ_INIT(&V_pf_altqs[2]); TAILQ_INIT(&V_pf_altqs[3]); TAILQ_INIT(&V_pf_pabuf); V_pf_altqs_active = &V_pf_altqs[0]; V_pf_altq_ifs_active = &V_pf_altqs[1]; V_pf_altqs_inactive = &V_pf_altqs[2]; V_pf_altq_ifs_inactive = &V_pf_altqs[3]; /* Send & overload+flush queues. */ STAILQ_INIT(&V_pf_sendqueue); SLIST_INIT(&V_pf_overloadqueue); TASK_INIT(&V_pf_overloadtask, 0, pf_overload_task, curvnet); /* Unlinked, but may be referenced rules. */ TAILQ_INIT(&V_pf_unlinked_rules); } void pf_mtag_cleanup() { uma_zdestroy(pf_mtag_z); } void pf_cleanup() { struct pf_keyhash *kh; struct pf_idhash *ih; struct pf_srchash *sh; struct pf_send_entry *pfse, *next; u_int i; for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= pf_hashmask; i++, kh++, ih++) { KASSERT(LIST_EMPTY(&kh->keys), ("%s: key hash not empty", __func__)); KASSERT(LIST_EMPTY(&ih->states), ("%s: id hash not empty", __func__)); mtx_destroy(&kh->lock); mtx_destroy(&ih->lock); } free(V_pf_keyhash, M_PFHASH); free(V_pf_idhash, M_PFHASH); for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { KASSERT(LIST_EMPTY(&sh->nodes), ("%s: source node hash not empty", __func__)); mtx_destroy(&sh->lock); } free(V_pf_srchash, M_PFHASH); STAILQ_FOREACH_SAFE(pfse, &V_pf_sendqueue, pfse_next, next) { m_freem(pfse->pfse_m); free(pfse, M_PFTEMP); } uma_zdestroy(V_pf_sources_z); uma_zdestroy(V_pf_state_z); uma_zdestroy(V_pf_state_key_z); } static int pf_mtag_uminit(void *mem, int size, int how) { struct m_tag *t; t = (struct m_tag *)mem; t->m_tag_cookie = MTAG_ABI_COMPAT; t->m_tag_id = PACKET_TAG_PF; t->m_tag_len = sizeof(struct pf_mtag); t->m_tag_free = pf_mtag_free; return (0); } static void pf_mtag_free(struct m_tag *t) { uma_zfree(pf_mtag_z, t); } struct pf_mtag * pf_get_mtag(struct mbuf *m) { struct m_tag *mtag; if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) != NULL) return ((struct pf_mtag *)(mtag + 1)); mtag = uma_zalloc(pf_mtag_z, M_NOWAIT); if (mtag == NULL) return (NULL); bzero(mtag + 1, sizeof(struct pf_mtag)); m_tag_prepend(m, mtag); return ((struct pf_mtag *)(mtag + 1)); } static int pf_state_key_attach(struct pf_state_key *skw, struct pf_state_key *sks, struct pf_state *s) { struct pf_keyhash *khs, *khw, *kh; struct pf_state_key *sk, *cur; struct pf_state *si, *olds = NULL; int idx; KASSERT(s->refs == 0, ("%s: state not pristine", __func__)); KASSERT(s->key[PF_SK_WIRE] == NULL, ("%s: state has key", __func__)); KASSERT(s->key[PF_SK_STACK] == NULL, ("%s: state has key", __func__)); /* * We need to lock hash slots of both keys. To avoid deadlock * we always lock the slot with lower address first. Unlock order * isn't important. * * We also need to lock ID hash slot before dropping key * locks. On success we return with ID hash slot locked. */ if (skw == sks) { khs = khw = &V_pf_keyhash[pf_hashkey(skw)]; PF_HASHROW_LOCK(khs); } else { khs = &V_pf_keyhash[pf_hashkey(sks)]; khw = &V_pf_keyhash[pf_hashkey(skw)]; if (khs == khw) { PF_HASHROW_LOCK(khs); } else if (khs < khw) { PF_HASHROW_LOCK(khs); PF_HASHROW_LOCK(khw); } else { PF_HASHROW_LOCK(khw); PF_HASHROW_LOCK(khs); } } #define KEYS_UNLOCK() do { \ if (khs != khw) { \ PF_HASHROW_UNLOCK(khs); \ PF_HASHROW_UNLOCK(khw); \ } else \ PF_HASHROW_UNLOCK(khs); \ } while (0) /* * First run: start with wire key. */ sk = skw; kh = khw; idx = PF_SK_WIRE; keyattach: LIST_FOREACH(cur, &kh->keys, entry) if (bcmp(cur, sk, sizeof(struct pf_state_key_cmp)) == 0) break; if (cur != NULL) { /* Key exists. Check for same kif, if none, add to key. */ TAILQ_FOREACH(si, &cur->states[idx], key_list[idx]) { struct pf_idhash *ih = &V_pf_idhash[PF_IDHASH(si)]; PF_HASHROW_LOCK(ih); if (si->kif == s->kif && si->direction == s->direction) { if (sk->proto == IPPROTO_TCP && si->src.state >= TCPS_FIN_WAIT_2 && si->dst.state >= TCPS_FIN_WAIT_2) { /* * New state matches an old >FIN_WAIT_2 * state. We can't drop key hash locks, * thus we can't unlink it properly. * * As a workaround we drop it into * TCPS_CLOSED state, schedule purge * ASAP and push it into the very end * of the slot TAILQ, so that it won't * conflict with our new state. */ si->src.state = si->dst.state = TCPS_CLOSED; si->timeout = PFTM_PURGE; olds = si; } else { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: %s key attach " "failed on %s: ", (idx == PF_SK_WIRE) ? "wire" : "stack", s->kif->pfik_name); pf_print_state_parts(s, (idx == PF_SK_WIRE) ? sk : NULL, (idx == PF_SK_STACK) ? sk : NULL); printf(", existing: "); pf_print_state_parts(si, (idx == PF_SK_WIRE) ? sk : NULL, (idx == PF_SK_STACK) ? sk : NULL); printf("\n"); } PF_HASHROW_UNLOCK(ih); KEYS_UNLOCK(); uma_zfree(V_pf_state_key_z, sk); if (idx == PF_SK_STACK) pf_detach_state(s); return (EEXIST); /* collision! */ } } PF_HASHROW_UNLOCK(ih); } uma_zfree(V_pf_state_key_z, sk); s->key[idx] = cur; } else { LIST_INSERT_HEAD(&kh->keys, sk, entry); s->key[idx] = sk; } stateattach: /* List is sorted, if-bound states before floating. */ if (s->kif == V_pfi_all) TAILQ_INSERT_TAIL(&s->key[idx]->states[idx], s, key_list[idx]); else TAILQ_INSERT_HEAD(&s->key[idx]->states[idx], s, key_list[idx]); if (olds) { TAILQ_REMOVE(&s->key[idx]->states[idx], olds, key_list[idx]); TAILQ_INSERT_TAIL(&s->key[idx]->states[idx], olds, key_list[idx]); olds = NULL; } /* * Attach done. See how should we (or should not?) * attach a second key. */ if (sks == skw) { s->key[PF_SK_STACK] = s->key[PF_SK_WIRE]; idx = PF_SK_STACK; sks = NULL; goto stateattach; } else if (sks != NULL) { /* * Continue attaching with stack key. */ sk = sks; kh = khs; idx = PF_SK_STACK; sks = NULL; goto keyattach; } PF_STATE_LOCK(s); KEYS_UNLOCK(); KASSERT(s->key[PF_SK_WIRE] != NULL && s->key[PF_SK_STACK] != NULL, ("%s failure", __func__)); return (0); #undef KEYS_UNLOCK } static void pf_detach_state(struct pf_state *s) { struct pf_state_key *sks = s->key[PF_SK_STACK]; struct pf_keyhash *kh; if (sks != NULL) { kh = &V_pf_keyhash[pf_hashkey(sks)]; PF_HASHROW_LOCK(kh); if (s->key[PF_SK_STACK] != NULL) pf_state_key_detach(s, PF_SK_STACK); /* * If both point to same key, then we are done. */ if (sks == s->key[PF_SK_WIRE]) { pf_state_key_detach(s, PF_SK_WIRE); PF_HASHROW_UNLOCK(kh); return; } PF_HASHROW_UNLOCK(kh); } if (s->key[PF_SK_WIRE] != NULL) { kh = &V_pf_keyhash[pf_hashkey(s->key[PF_SK_WIRE])]; PF_HASHROW_LOCK(kh); if (s->key[PF_SK_WIRE] != NULL) pf_state_key_detach(s, PF_SK_WIRE); PF_HASHROW_UNLOCK(kh); } } static void pf_state_key_detach(struct pf_state *s, int idx) { struct pf_state_key *sk = s->key[idx]; #ifdef INVARIANTS struct pf_keyhash *kh = &V_pf_keyhash[pf_hashkey(sk)]; PF_HASHROW_ASSERT(kh); #endif TAILQ_REMOVE(&sk->states[idx], s, key_list[idx]); s->key[idx] = NULL; if (TAILQ_EMPTY(&sk->states[0]) && TAILQ_EMPTY(&sk->states[1])) { LIST_REMOVE(sk, entry); uma_zfree(V_pf_state_key_z, sk); } } static int pf_state_key_ctor(void *mem, int size, void *arg, int flags) { struct pf_state_key *sk = mem; bzero(sk, sizeof(struct pf_state_key_cmp)); TAILQ_INIT(&sk->states[PF_SK_WIRE]); TAILQ_INIT(&sk->states[PF_SK_STACK]); return (0); } struct pf_state_key * pf_state_key_setup(struct pf_pdesc *pd, struct pf_addr *saddr, struct pf_addr *daddr, u_int16_t sport, u_int16_t dport) { struct pf_state_key *sk; sk = uma_zalloc(V_pf_state_key_z, M_NOWAIT); if (sk == NULL) return (NULL); PF_ACPY(&sk->addr[pd->sidx], saddr, pd->af); PF_ACPY(&sk->addr[pd->didx], daddr, pd->af); sk->port[pd->sidx] = sport; sk->port[pd->didx] = dport; sk->proto = pd->proto; sk->af = pd->af; return (sk); } struct pf_state_key * pf_state_key_clone(struct pf_state_key *orig) { struct pf_state_key *sk; sk = uma_zalloc(V_pf_state_key_z, M_NOWAIT); if (sk == NULL) return (NULL); bcopy(orig, sk, sizeof(struct pf_state_key_cmp)); return (sk); } int pf_state_insert(struct pfi_kif *kif, struct pf_state_key *skw, struct pf_state_key *sks, struct pf_state *s) { struct pf_idhash *ih; struct pf_state *cur; int error; KASSERT(TAILQ_EMPTY(&sks->states[0]) && TAILQ_EMPTY(&sks->states[1]), ("%s: sks not pristine", __func__)); KASSERT(TAILQ_EMPTY(&skw->states[0]) && TAILQ_EMPTY(&skw->states[1]), ("%s: skw not pristine", __func__)); KASSERT(s->refs == 0, ("%s: state not pristine", __func__)); s->kif = kif; if (s->id == 0 && s->creatorid == 0) { /* XXX: should be atomic, but probability of collision low */ if ((s->id = V_pf_stateid[curcpu]++) == PFID_MAXID) V_pf_stateid[curcpu] = 1; s->id |= (uint64_t )curcpu << PFID_CPUSHIFT; s->id = htobe64(s->id); s->creatorid = V_pf_status.hostid; } /* Returns with ID locked on success. */ if ((error = pf_state_key_attach(skw, sks, s)) != 0) return (error); ih = &V_pf_idhash[PF_IDHASH(s)]; PF_HASHROW_ASSERT(ih); LIST_FOREACH(cur, &ih->states, entry) if (cur->id == s->id && cur->creatorid == s->creatorid) break; if (cur != NULL) { PF_HASHROW_UNLOCK(ih); if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state ID collision: " "id: %016llx creatorid: %08x\n", (unsigned long long)be64toh(s->id), ntohl(s->creatorid)); } pf_detach_state(s); return (EEXIST); } LIST_INSERT_HEAD(&ih->states, s, entry); /* One for keys, one for ID hash. */ refcount_init(&s->refs, 2); counter_u64_add(V_pf_status.fcounters[FCNT_STATE_INSERT], 1); if (V_pfsync_insert_state_ptr != NULL) V_pfsync_insert_state_ptr(s); /* Returns locked. */ return (0); } /* * Find state by ID: returns with locked row on success. */ struct pf_state * pf_find_state_byid(uint64_t id, uint32_t creatorid) { struct pf_idhash *ih; struct pf_state *s; counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1); ih = &V_pf_idhash[(be64toh(id) % (pf_hashmask + 1))]; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) if (s->id == id && s->creatorid == creatorid) break; if (s == NULL) PF_HASHROW_UNLOCK(ih); return (s); } /* * Find state by key. * Returns with ID hash slot locked on success. */ static struct pf_state * pf_find_state(struct pfi_kif *kif, struct pf_state_key_cmp *key, u_int dir) { struct pf_keyhash *kh; struct pf_state_key *sk; struct pf_state *s; int idx; counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1); kh = &V_pf_keyhash[pf_hashkey((struct pf_state_key *)key)]; PF_HASHROW_LOCK(kh); LIST_FOREACH(sk, &kh->keys, entry) if (bcmp(sk, key, sizeof(struct pf_state_key_cmp)) == 0) break; if (sk == NULL) { PF_HASHROW_UNLOCK(kh); return (NULL); } idx = (dir == PF_IN ? PF_SK_WIRE : PF_SK_STACK); /* List is sorted, if-bound states before floating ones. */ TAILQ_FOREACH(s, &sk->states[idx], key_list[idx]) if (s->kif == V_pfi_all || s->kif == kif) { PF_STATE_LOCK(s); PF_HASHROW_UNLOCK(kh); if (s->timeout >= PFTM_MAX) { /* * State is either being processed by * pf_unlink_state() in an other thread, or * is scheduled for immediate expiry. */ PF_STATE_UNLOCK(s); return (NULL); } return (s); } PF_HASHROW_UNLOCK(kh); return (NULL); } struct pf_state * pf_find_state_all(struct pf_state_key_cmp *key, u_int dir, int *more) { struct pf_keyhash *kh; struct pf_state_key *sk; struct pf_state *s, *ret = NULL; int idx, inout = 0; counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1); kh = &V_pf_keyhash[pf_hashkey((struct pf_state_key *)key)]; PF_HASHROW_LOCK(kh); LIST_FOREACH(sk, &kh->keys, entry) if (bcmp(sk, key, sizeof(struct pf_state_key_cmp)) == 0) break; if (sk == NULL) { PF_HASHROW_UNLOCK(kh); return (NULL); } switch (dir) { case PF_IN: idx = PF_SK_WIRE; break; case PF_OUT: idx = PF_SK_STACK; break; case PF_INOUT: idx = PF_SK_WIRE; inout = 1; break; default: panic("%s: dir %u", __func__, dir); } second_run: TAILQ_FOREACH(s, &sk->states[idx], key_list[idx]) { if (more == NULL) { PF_HASHROW_UNLOCK(kh); return (s); } if (ret) (*more)++; else ret = s; } if (inout == 1) { inout = 0; idx = PF_SK_STACK; goto second_run; } PF_HASHROW_UNLOCK(kh); return (ret); } /* END state table stuff */ static void pf_send(struct pf_send_entry *pfse) { PF_SENDQ_LOCK(); STAILQ_INSERT_TAIL(&V_pf_sendqueue, pfse, pfse_next); PF_SENDQ_UNLOCK(); swi_sched(V_pf_swi_cookie, 0); } void pf_intr(void *v) { struct epoch_tracker et; struct pf_send_head queue; struct pf_send_entry *pfse, *next; CURVNET_SET((struct vnet *)v); PF_SENDQ_LOCK(); queue = V_pf_sendqueue; STAILQ_INIT(&V_pf_sendqueue); PF_SENDQ_UNLOCK(); NET_EPOCH_ENTER(et); STAILQ_FOREACH_SAFE(pfse, &queue, pfse_next, next) { switch (pfse->pfse_type) { #ifdef INET case PFSE_IP: ip_output(pfse->pfse_m, NULL, NULL, 0, NULL, NULL); break; case PFSE_ICMP: icmp_error(pfse->pfse_m, pfse->icmpopts.type, pfse->icmpopts.code, 0, pfse->icmpopts.mtu); break; #endif /* INET */ #ifdef INET6 case PFSE_IP6: ip6_output(pfse->pfse_m, NULL, NULL, 0, NULL, NULL, NULL); break; case PFSE_ICMP6: icmp6_error(pfse->pfse_m, pfse->icmpopts.type, pfse->icmpopts.code, pfse->icmpopts.mtu); break; #endif /* INET6 */ default: panic("%s: unknown type", __func__); } free(pfse, M_PFTEMP); } NET_EPOCH_EXIT(et); CURVNET_RESTORE(); } void pf_purge_thread(void *unused __unused) { VNET_ITERATOR_DECL(vnet_iter); sx_xlock(&pf_end_lock); while (pf_end_threads == 0) { sx_sleep(pf_purge_thread, &pf_end_lock, 0, "pftm", hz / 10); VNET_LIST_RLOCK(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); /* Wait until V_pf_default_rule is initialized. */ if (V_pf_vnet_active == 0) { CURVNET_RESTORE(); continue; } /* * Process 1/interval fraction of the state * table every run. */ V_pf_purge_idx = pf_purge_expired_states(V_pf_purge_idx, pf_hashmask / (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10)); /* * Purge other expired types every * PFTM_INTERVAL seconds. */ if (V_pf_purge_idx == 0) { /* * Order is important: * - states and src nodes reference rules * - states and rules reference kifs */ pf_purge_expired_fragments(); pf_purge_expired_src_nodes(); pf_purge_unlinked_rules(); pfi_kif_purge(); } CURVNET_RESTORE(); } VNET_LIST_RUNLOCK(); } pf_end_threads++; sx_xunlock(&pf_end_lock); kproc_exit(0); } void pf_unload_vnet_purge(void) { /* * To cleanse up all kifs and rules we need * two runs: first one clears reference flags, * then pf_purge_expired_states() doesn't * raise them, and then second run frees. */ pf_purge_unlinked_rules(); pfi_kif_purge(); /* * Now purge everything. */ pf_purge_expired_states(0, pf_hashmask); pf_purge_fragments(UINT_MAX); pf_purge_expired_src_nodes(); /* * Now all kifs & rules should be unreferenced, * thus should be successfully freed. */ pf_purge_unlinked_rules(); pfi_kif_purge(); } u_int32_t pf_state_expires(const struct pf_state *state) { u_int32_t timeout; u_int32_t start; u_int32_t end; u_int32_t states; /* handle all PFTM_* > PFTM_MAX here */ if (state->timeout == PFTM_PURGE) return (time_uptime); KASSERT(state->timeout != PFTM_UNLINKED, ("pf_state_expires: timeout == PFTM_UNLINKED")); KASSERT((state->timeout < PFTM_MAX), ("pf_state_expires: timeout > PFTM_MAX")); timeout = state->rule.ptr->timeout[state->timeout]; if (!timeout) timeout = V_pf_default_rule.timeout[state->timeout]; start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START]; if (start && state->rule.ptr != &V_pf_default_rule) { end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END]; states = counter_u64_fetch(state->rule.ptr->states_cur); } else { start = V_pf_default_rule.timeout[PFTM_ADAPTIVE_START]; end = V_pf_default_rule.timeout[PFTM_ADAPTIVE_END]; states = V_pf_status.states; } if (end && states > start && start < end) { if (states < end) { timeout = (u_int64_t)timeout * (end - states) / (end - start); return (state->expire + timeout); } else return (time_uptime); } return (state->expire + timeout); } void pf_purge_expired_src_nodes() { struct pf_src_node_list freelist; struct pf_srchash *sh; struct pf_src_node *cur, *next; int i; LIST_INIT(&freelist); for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { PF_HASHROW_LOCK(sh); LIST_FOREACH_SAFE(cur, &sh->nodes, entry, next) if (cur->states == 0 && cur->expire <= time_uptime) { pf_unlink_src_node(cur); LIST_INSERT_HEAD(&freelist, cur, entry); } else if (cur->rule.ptr != NULL) cur->rule.ptr->rule_flag |= PFRULE_REFS; PF_HASHROW_UNLOCK(sh); } pf_free_src_nodes(&freelist); V_pf_status.src_nodes = uma_zone_get_cur(V_pf_sources_z); } static void pf_src_tree_remove_state(struct pf_state *s) { struct pf_src_node *sn; struct pf_srchash *sh; uint32_t timeout; timeout = s->rule.ptr->timeout[PFTM_SRC_NODE] ? s->rule.ptr->timeout[PFTM_SRC_NODE] : V_pf_default_rule.timeout[PFTM_SRC_NODE]; if (s->src_node != NULL) { sn = s->src_node; sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)]; PF_HASHROW_LOCK(sh); if (s->src.tcp_est) --sn->conn; if (--sn->states == 0) sn->expire = time_uptime + timeout; PF_HASHROW_UNLOCK(sh); } if (s->nat_src_node != s->src_node && s->nat_src_node != NULL) { sn = s->nat_src_node; sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)]; PF_HASHROW_LOCK(sh); if (--sn->states == 0) sn->expire = time_uptime + timeout; PF_HASHROW_UNLOCK(sh); } s->src_node = s->nat_src_node = NULL; } /* * Unlink and potentilly free a state. Function may be * called with ID hash row locked, but always returns * unlocked, since it needs to go through key hash locking. */ int pf_unlink_state(struct pf_state *s, u_int flags) { struct pf_idhash *ih = &V_pf_idhash[PF_IDHASH(s)]; if ((flags & PF_ENTER_LOCKED) == 0) PF_HASHROW_LOCK(ih); else PF_HASHROW_ASSERT(ih); if (s->timeout == PFTM_UNLINKED) { /* * State is being processed * by pf_unlink_state() in * an other thread. */ PF_HASHROW_UNLOCK(ih); return (0); /* XXXGL: undefined actually */ } if (s->src.state == PF_TCPS_PROXY_DST) { /* XXX wire key the right one? */ pf_send_tcp(NULL, s->rule.ptr, s->key[PF_SK_WIRE]->af, &s->key[PF_SK_WIRE]->addr[1], &s->key[PF_SK_WIRE]->addr[0], s->key[PF_SK_WIRE]->port[1], s->key[PF_SK_WIRE]->port[0], s->src.seqhi, s->src.seqlo + 1, TH_RST|TH_ACK, 0, 0, 0, 1, s->tag, NULL); } LIST_REMOVE(s, entry); pf_src_tree_remove_state(s); if (V_pfsync_delete_state_ptr != NULL) V_pfsync_delete_state_ptr(s); STATE_DEC_COUNTERS(s); s->timeout = PFTM_UNLINKED; PF_HASHROW_UNLOCK(ih); pf_detach_state(s); /* pf_state_insert() initialises refs to 2, so we can never release the * last reference here, only in pf_release_state(). */ (void)refcount_release(&s->refs); return (pf_release_state(s)); } void pf_free_state(struct pf_state *cur) { KASSERT(cur->refs == 0, ("%s: %p has refs", __func__, cur)); KASSERT(cur->timeout == PFTM_UNLINKED, ("%s: timeout %u", __func__, cur->timeout)); pf_normalize_tcp_cleanup(cur); uma_zfree(V_pf_state_z, cur); counter_u64_add(V_pf_status.fcounters[FCNT_STATE_REMOVALS], 1); } /* * Called only from pf_purge_thread(), thus serialized. */ static u_int pf_purge_expired_states(u_int i, int maxcheck) { struct pf_idhash *ih; struct pf_state *s; V_pf_status.states = uma_zone_get_cur(V_pf_state_z); /* * Go through hash and unlink states that expire now. */ while (maxcheck > 0) { ih = &V_pf_idhash[i]; /* only take the lock if we expect to do work */ if (!LIST_EMPTY(&ih->states)) { relock: PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { if (pf_state_expires(s) <= time_uptime) { V_pf_status.states -= pf_unlink_state(s, PF_ENTER_LOCKED); goto relock; } s->rule.ptr->rule_flag |= PFRULE_REFS; if (s->nat_rule.ptr != NULL) s->nat_rule.ptr->rule_flag |= PFRULE_REFS; if (s->anchor.ptr != NULL) s->anchor.ptr->rule_flag |= PFRULE_REFS; s->kif->pfik_flags |= PFI_IFLAG_REFS; if (s->rt_kif) s->rt_kif->pfik_flags |= PFI_IFLAG_REFS; } PF_HASHROW_UNLOCK(ih); } /* Return when we hit end of hash. */ if (++i > pf_hashmask) { V_pf_status.states = uma_zone_get_cur(V_pf_state_z); return (0); } maxcheck--; } V_pf_status.states = uma_zone_get_cur(V_pf_state_z); return (i); } static void pf_purge_unlinked_rules() { struct pf_rulequeue tmpq; struct pf_rule *r, *r1; /* * If we have overloading task pending, then we'd * better skip purging this time. There is a tiny * probability that overloading task references * an already unlinked rule. */ PF_OVERLOADQ_LOCK(); if (!SLIST_EMPTY(&V_pf_overloadqueue)) { PF_OVERLOADQ_UNLOCK(); return; } PF_OVERLOADQ_UNLOCK(); /* * Do naive mark-and-sweep garbage collecting of old rules. * Reference flag is raised by pf_purge_expired_states() * and pf_purge_expired_src_nodes(). * * To avoid LOR between PF_UNLNKDRULES_LOCK/PF_RULES_WLOCK, * use a temporary queue. */ TAILQ_INIT(&tmpq); PF_UNLNKDRULES_LOCK(); TAILQ_FOREACH_SAFE(r, &V_pf_unlinked_rules, entries, r1) { if (!(r->rule_flag & PFRULE_REFS)) { TAILQ_REMOVE(&V_pf_unlinked_rules, r, entries); TAILQ_INSERT_TAIL(&tmpq, r, entries); } else r->rule_flag &= ~PFRULE_REFS; } PF_UNLNKDRULES_UNLOCK(); if (!TAILQ_EMPTY(&tmpq)) { PF_RULES_WLOCK(); TAILQ_FOREACH_SAFE(r, &tmpq, entries, r1) { TAILQ_REMOVE(&tmpq, r, entries); pf_free_rule(r); } PF_RULES_WUNLOCK(); } } void pf_print_host(struct pf_addr *addr, u_int16_t p, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: { u_int32_t a = ntohl(addr->addr32[0]); printf("%u.%u.%u.%u", (a>>24)&255, (a>>16)&255, (a>>8)&255, a&255); if (p) { p = ntohs(p); printf(":%u", p); } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { u_int16_t b; u_int8_t i, curstart, curend, maxstart, maxend; curstart = curend = maxstart = maxend = 255; for (i = 0; i < 8; i++) { if (!addr->addr16[i]) { if (curstart == 255) curstart = i; curend = i; } else { if ((curend - curstart) > (maxend - maxstart)) { maxstart = curstart; maxend = curend; } curstart = curend = 255; } } if ((curend - curstart) > (maxend - maxstart)) { maxstart = curstart; maxend = curend; } for (i = 0; i < 8; i++) { if (i >= maxstart && i <= maxend) { if (i == 0) printf(":"); if (i == maxend) printf(":"); } else { b = ntohs(addr->addr16[i]); printf("%x", b); if (i < 7) printf(":"); } } if (p) { p = ntohs(p); printf("[%u]", p); } break; } #endif /* INET6 */ } } void pf_print_state(struct pf_state *s) { pf_print_state_parts(s, NULL, NULL); } static void pf_print_state_parts(struct pf_state *s, struct pf_state_key *skwp, struct pf_state_key *sksp) { struct pf_state_key *skw, *sks; u_int8_t proto, dir; /* Do our best to fill these, but they're skipped if NULL */ skw = skwp ? skwp : (s ? s->key[PF_SK_WIRE] : NULL); sks = sksp ? sksp : (s ? s->key[PF_SK_STACK] : NULL); proto = skw ? skw->proto : (sks ? sks->proto : 0); dir = s ? s->direction : 0; switch (proto) { case IPPROTO_IPV4: printf("IPv4"); break; case IPPROTO_IPV6: printf("IPv6"); break; case IPPROTO_TCP: printf("TCP"); break; case IPPROTO_UDP: printf("UDP"); break; case IPPROTO_ICMP: printf("ICMP"); break; case IPPROTO_ICMPV6: printf("ICMPv6"); break; default: printf("%u", proto); break; } switch (dir) { case PF_IN: printf(" in"); break; case PF_OUT: printf(" out"); break; } if (skw) { printf(" wire: "); pf_print_host(&skw->addr[0], skw->port[0], skw->af); printf(" "); pf_print_host(&skw->addr[1], skw->port[1], skw->af); } if (sks) { printf(" stack: "); if (sks != skw) { pf_print_host(&sks->addr[0], sks->port[0], sks->af); printf(" "); pf_print_host(&sks->addr[1], sks->port[1], sks->af); } else printf("-"); } if (s) { if (proto == IPPROTO_TCP) { printf(" [lo=%u high=%u win=%u modulator=%u", s->src.seqlo, s->src.seqhi, s->src.max_win, s->src.seqdiff); if (s->src.wscale && s->dst.wscale) printf(" wscale=%u", s->src.wscale & PF_WSCALE_MASK); printf("]"); printf(" [lo=%u high=%u win=%u modulator=%u", s->dst.seqlo, s->dst.seqhi, s->dst.max_win, s->dst.seqdiff); if (s->src.wscale && s->dst.wscale) printf(" wscale=%u", s->dst.wscale & PF_WSCALE_MASK); printf("]"); } printf(" %u:%u", s->src.state, s->dst.state); } } void pf_print_flags(u_int8_t f) { if (f) printf(" "); if (f & TH_FIN) printf("F"); if (f & TH_SYN) printf("S"); if (f & TH_RST) printf("R"); if (f & TH_PUSH) printf("P"); if (f & TH_ACK) printf("A"); if (f & TH_URG) printf("U"); if (f & TH_ECE) printf("E"); if (f & TH_CWR) printf("W"); } #define PF_SET_SKIP_STEPS(i) \ do { \ while (head[i] != cur) { \ head[i]->skip[i].ptr = cur; \ head[i] = TAILQ_NEXT(head[i], entries); \ } \ } while (0) void pf_calc_skip_steps(struct pf_rulequeue *rules) { struct pf_rule *cur, *prev, *head[PF_SKIP_COUNT]; int i; cur = TAILQ_FIRST(rules); prev = cur; for (i = 0; i < PF_SKIP_COUNT; ++i) head[i] = cur; while (cur != NULL) { if (cur->kif != prev->kif || cur->ifnot != prev->ifnot) PF_SET_SKIP_STEPS(PF_SKIP_IFP); if (cur->direction != prev->direction) PF_SET_SKIP_STEPS(PF_SKIP_DIR); if (cur->af != prev->af) PF_SET_SKIP_STEPS(PF_SKIP_AF); if (cur->proto != prev->proto) PF_SET_SKIP_STEPS(PF_SKIP_PROTO); if (cur->src.neg != prev->src.neg || pf_addr_wrap_neq(&cur->src.addr, &prev->src.addr)) PF_SET_SKIP_STEPS(PF_SKIP_SRC_ADDR); if (cur->src.port[0] != prev->src.port[0] || cur->src.port[1] != prev->src.port[1] || cur->src.port_op != prev->src.port_op) PF_SET_SKIP_STEPS(PF_SKIP_SRC_PORT); if (cur->dst.neg != prev->dst.neg || pf_addr_wrap_neq(&cur->dst.addr, &prev->dst.addr)) PF_SET_SKIP_STEPS(PF_SKIP_DST_ADDR); if (cur->dst.port[0] != prev->dst.port[0] || cur->dst.port[1] != prev->dst.port[1] || cur->dst.port_op != prev->dst.port_op) PF_SET_SKIP_STEPS(PF_SKIP_DST_PORT); prev = cur; cur = TAILQ_NEXT(cur, entries); } for (i = 0; i < PF_SKIP_COUNT; ++i) PF_SET_SKIP_STEPS(i); } static int pf_addr_wrap_neq(struct pf_addr_wrap *aw1, struct pf_addr_wrap *aw2) { if (aw1->type != aw2->type) return (1); switch (aw1->type) { case PF_ADDR_ADDRMASK: case PF_ADDR_RANGE: if (PF_ANEQ(&aw1->v.a.addr, &aw2->v.a.addr, AF_INET6)) return (1); if (PF_ANEQ(&aw1->v.a.mask, &aw2->v.a.mask, AF_INET6)) return (1); return (0); case PF_ADDR_DYNIFTL: return (aw1->p.dyn->pfid_kt != aw2->p.dyn->pfid_kt); case PF_ADDR_NOROUTE: case PF_ADDR_URPFFAILED: return (0); case PF_ADDR_TABLE: return (aw1->p.tbl != aw2->p.tbl); default: printf("invalid address type: %d\n", aw1->type); return (1); } } /** * Checksum updates are a little complicated because the checksum in the TCP/UDP * header isn't always a full checksum. In some cases (i.e. output) it's a * pseudo-header checksum, which is a partial checksum over src/dst IP * addresses, protocol number and length. * * That means we have the following cases: * * Input or forwarding: we don't have TSO, the checksum fields are full * checksums, we need to update the checksum whenever we change anything. * * Output (i.e. the checksum is a pseudo-header checksum): * x The field being updated is src/dst address or affects the length of * the packet. We need to update the pseudo-header checksum (note that this * checksum is not ones' complement). * x Some other field is being modified (e.g. src/dst port numbers): We * don't have to update anything. **/ u_int16_t pf_cksum_fixup(u_int16_t cksum, u_int16_t old, u_int16_t new, u_int8_t udp) { u_int32_t l; if (udp && !cksum) return (0x0000); l = cksum + old - new; l = (l >> 16) + (l & 65535); l = l & 65535; if (udp && !l) return (0xFFFF); return (l); } u_int16_t pf_proto_cksum_fixup(struct mbuf *m, u_int16_t cksum, u_int16_t old, u_int16_t new, u_int8_t udp) { if (m->m_pkthdr.csum_flags & (CSUM_DELAY_DATA | CSUM_DELAY_DATA_IPV6)) return (cksum); return (pf_cksum_fixup(cksum, old, new, udp)); } static void pf_change_ap(struct mbuf *m, struct pf_addr *a, u_int16_t *p, u_int16_t *ic, u_int16_t *pc, struct pf_addr *an, u_int16_t pn, u_int8_t u, sa_family_t af) { struct pf_addr ao; u_int16_t po = *p; PF_ACPY(&ao, a, af); PF_ACPY(a, an, af); if (m->m_pkthdr.csum_flags & (CSUM_DELAY_DATA | CSUM_DELAY_DATA_IPV6)) *pc = ~*pc; *p = pn; switch (af) { #ifdef INET case AF_INET: *ic = pf_cksum_fixup(pf_cksum_fixup(*ic, ao.addr16[0], an->addr16[0], 0), ao.addr16[1], an->addr16[1], 0); *p = pn; *pc = pf_cksum_fixup(pf_cksum_fixup(*pc, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u); *pc = pf_proto_cksum_fixup(m, *pc, po, pn, u); break; #endif /* INET */ #ifdef INET6 case AF_INET6: *pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*pc, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), ao.addr16[2], an->addr16[2], u), ao.addr16[3], an->addr16[3], u), ao.addr16[4], an->addr16[4], u), ao.addr16[5], an->addr16[5], u), ao.addr16[6], an->addr16[6], u), ao.addr16[7], an->addr16[7], u); *pc = pf_proto_cksum_fixup(m, *pc, po, pn, u); break; #endif /* INET6 */ } if (m->m_pkthdr.csum_flags & (CSUM_DELAY_DATA | CSUM_DELAY_DATA_IPV6)) { *pc = ~*pc; if (! *pc) *pc = 0xffff; } } /* Changes a u_int32_t. Uses a void * so there are no align restrictions */ void pf_change_a(void *a, u_int16_t *c, u_int32_t an, u_int8_t u) { u_int32_t ao; memcpy(&ao, a, sizeof(ao)); memcpy(a, &an, sizeof(u_int32_t)); *c = pf_cksum_fixup(pf_cksum_fixup(*c, ao / 65536, an / 65536, u), ao % 65536, an % 65536, u); } void pf_change_proto_a(struct mbuf *m, void *a, u_int16_t *c, u_int32_t an, u_int8_t udp) { u_int32_t ao; memcpy(&ao, a, sizeof(ao)); memcpy(a, &an, sizeof(u_int32_t)); *c = pf_proto_cksum_fixup(m, pf_proto_cksum_fixup(m, *c, ao / 65536, an / 65536, udp), ao % 65536, an % 65536, udp); } #ifdef INET6 static void pf_change_a6(struct pf_addr *a, u_int16_t *c, struct pf_addr *an, u_int8_t u) { struct pf_addr ao; PF_ACPY(&ao, a, AF_INET6); PF_ACPY(a, an, AF_INET6); *c = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*c, ao.addr16[0], an->addr16[0], u), ao.addr16[1], an->addr16[1], u), ao.addr16[2], an->addr16[2], u), ao.addr16[3], an->addr16[3], u), ao.addr16[4], an->addr16[4], u), ao.addr16[5], an->addr16[5], u), ao.addr16[6], an->addr16[6], u), ao.addr16[7], an->addr16[7], u); } #endif /* INET6 */ static void pf_change_icmp(struct pf_addr *ia, u_int16_t *ip, struct pf_addr *oa, struct pf_addr *na, u_int16_t np, u_int16_t *pc, u_int16_t *h2c, u_int16_t *ic, u_int16_t *hc, u_int8_t u, sa_family_t af) { struct pf_addr oia, ooa; PF_ACPY(&oia, ia, af); if (oa) PF_ACPY(&ooa, oa, af); /* Change inner protocol port, fix inner protocol checksum. */ if (ip != NULL) { u_int16_t oip = *ip; u_int32_t opc; if (pc != NULL) opc = *pc; *ip = np; if (pc != NULL) *pc = pf_cksum_fixup(*pc, oip, *ip, u); *ic = pf_cksum_fixup(*ic, oip, *ip, 0); if (pc != NULL) *ic = pf_cksum_fixup(*ic, opc, *pc, 0); } /* Change inner ip address, fix inner ip and icmp checksums. */ PF_ACPY(ia, na, af); switch (af) { #ifdef INET case AF_INET: { u_int32_t oh2c = *h2c; *h2c = pf_cksum_fixup(pf_cksum_fixup(*h2c, oia.addr16[0], ia->addr16[0], 0), oia.addr16[1], ia->addr16[1], 0); *ic = pf_cksum_fixup(pf_cksum_fixup(*ic, oia.addr16[0], ia->addr16[0], 0), oia.addr16[1], ia->addr16[1], 0); *ic = pf_cksum_fixup(*ic, oh2c, *h2c, 0); break; } #endif /* INET */ #ifdef INET6 case AF_INET6: *ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*ic, oia.addr16[0], ia->addr16[0], u), oia.addr16[1], ia->addr16[1], u), oia.addr16[2], ia->addr16[2], u), oia.addr16[3], ia->addr16[3], u), oia.addr16[4], ia->addr16[4], u), oia.addr16[5], ia->addr16[5], u), oia.addr16[6], ia->addr16[6], u), oia.addr16[7], ia->addr16[7], u); break; #endif /* INET6 */ } /* Outer ip address, fix outer ip or icmpv6 checksum, if necessary. */ if (oa) { PF_ACPY(oa, na, af); switch (af) { #ifdef INET case AF_INET: *hc = pf_cksum_fixup(pf_cksum_fixup(*hc, ooa.addr16[0], oa->addr16[0], 0), ooa.addr16[1], oa->addr16[1], 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: *ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup( pf_cksum_fixup(pf_cksum_fixup(*ic, ooa.addr16[0], oa->addr16[0], u), ooa.addr16[1], oa->addr16[1], u), ooa.addr16[2], oa->addr16[2], u), ooa.addr16[3], oa->addr16[3], u), ooa.addr16[4], oa->addr16[4], u), ooa.addr16[5], oa->addr16[5], u), ooa.addr16[6], oa->addr16[6], u), ooa.addr16[7], oa->addr16[7], u); break; #endif /* INET6 */ } } } /* * Need to modulate the sequence numbers in the TCP SACK option * (credits to Krzysztof Pfaff for report and patch) */ static int pf_modulate_sack(struct mbuf *m, int off, struct pf_pdesc *pd, struct tcphdr *th, struct pf_state_peer *dst) { int hlen = (th->th_off << 2) - sizeof(*th), thoptlen = hlen; u_int8_t opts[TCP_MAXOLEN], *opt = opts; int copyback = 0, i, olen; struct sackblk sack; #define TCPOLEN_SACKLEN (TCPOLEN_SACK + 2) if (hlen < TCPOLEN_SACKLEN || !pf_pull_hdr(m, off + sizeof(*th), opts, hlen, NULL, NULL, pd->af)) return 0; while (hlen >= TCPOLEN_SACKLEN) { olen = opt[1]; switch (*opt) { case TCPOPT_EOL: /* FALLTHROUGH */ case TCPOPT_NOP: opt++; hlen--; break; case TCPOPT_SACK: if (olen > hlen) olen = hlen; if (olen >= TCPOLEN_SACKLEN) { for (i = 2; i + TCPOLEN_SACK <= olen; i += TCPOLEN_SACK) { memcpy(&sack, &opt[i], sizeof(sack)); pf_change_proto_a(m, &sack.start, &th->th_sum, htonl(ntohl(sack.start) - dst->seqdiff), 0); pf_change_proto_a(m, &sack.end, &th->th_sum, htonl(ntohl(sack.end) - dst->seqdiff), 0); memcpy(&opt[i], &sack, sizeof(sack)); } copyback = 1; } /* FALLTHROUGH */ default: if (olen < 2) olen = 2; hlen -= olen; opt += olen; } } if (copyback) m_copyback(m, off + sizeof(*th), thoptlen, (caddr_t)opts); return (copyback); } static void pf_send_tcp(struct mbuf *replyto, const struct pf_rule *r, sa_family_t af, const struct pf_addr *saddr, const struct pf_addr *daddr, u_int16_t sport, u_int16_t dport, u_int32_t seq, u_int32_t ack, u_int8_t flags, u_int16_t win, u_int16_t mss, u_int8_t ttl, int tag, u_int16_t rtag, struct ifnet *ifp) { struct pf_send_entry *pfse; struct mbuf *m; int len, tlen; #ifdef INET struct ip *h = NULL; #endif /* INET */ #ifdef INET6 struct ip6_hdr *h6 = NULL; #endif /* INET6 */ struct tcphdr *th; char *opt; struct pf_mtag *pf_mtag; len = 0; th = NULL; /* maximum segment size tcp option */ tlen = sizeof(struct tcphdr); if (mss) tlen += 4; switch (af) { #ifdef INET case AF_INET: len = sizeof(struct ip) + tlen; break; #endif /* INET */ #ifdef INET6 case AF_INET6: len = sizeof(struct ip6_hdr) + tlen; break; #endif /* INET6 */ default: panic("%s: unsupported af %d", __func__, af); } /* Allocate outgoing queue entry, mbuf and mbuf tag. */ pfse = malloc(sizeof(*pfse), M_PFTEMP, M_NOWAIT); if (pfse == NULL) return; m = m_gethdr(M_NOWAIT, MT_DATA); if (m == NULL) { free(pfse, M_PFTEMP); return; } #ifdef MAC mac_netinet_firewall_send(m); #endif if ((pf_mtag = pf_get_mtag(m)) == NULL) { free(pfse, M_PFTEMP); m_freem(m); return; } if (tag) m->m_flags |= M_SKIP_FIREWALL; pf_mtag->tag = rtag; if (r != NULL && r->rtableid >= 0) M_SETFIB(m, r->rtableid); #ifdef ALTQ if (r != NULL && r->qid) { pf_mtag->qid = r->qid; /* add hints for ecn */ pf_mtag->hdr = mtod(m, struct ip *); } #endif /* ALTQ */ m->m_data += max_linkhdr; m->m_pkthdr.len = m->m_len = len; m->m_pkthdr.rcvif = NULL; bzero(m->m_data, len); switch (af) { #ifdef INET case AF_INET: h = mtod(m, struct ip *); /* IP header fields included in the TCP checksum */ h->ip_p = IPPROTO_TCP; h->ip_len = htons(tlen); h->ip_src.s_addr = saddr->v4.s_addr; h->ip_dst.s_addr = daddr->v4.s_addr; th = (struct tcphdr *)((caddr_t)h + sizeof(struct ip)); break; #endif /* INET */ #ifdef INET6 case AF_INET6: h6 = mtod(m, struct ip6_hdr *); /* IP header fields included in the TCP checksum */ h6->ip6_nxt = IPPROTO_TCP; h6->ip6_plen = htons(tlen); memcpy(&h6->ip6_src, &saddr->v6, sizeof(struct in6_addr)); memcpy(&h6->ip6_dst, &daddr->v6, sizeof(struct in6_addr)); th = (struct tcphdr *)((caddr_t)h6 + sizeof(struct ip6_hdr)); break; #endif /* INET6 */ } /* TCP header */ th->th_sport = sport; th->th_dport = dport; th->th_seq = htonl(seq); th->th_ack = htonl(ack); th->th_off = tlen >> 2; th->th_flags = flags; th->th_win = htons(win); if (mss) { opt = (char *)(th + 1); opt[0] = TCPOPT_MAXSEG; opt[1] = 4; HTONS(mss); bcopy((caddr_t)&mss, (caddr_t)(opt + 2), 2); } switch (af) { #ifdef INET case AF_INET: /* TCP checksum */ th->th_sum = in_cksum(m, len); /* Finish the IP header */ h->ip_v = 4; h->ip_hl = sizeof(*h) >> 2; h->ip_tos = IPTOS_LOWDELAY; h->ip_off = htons(V_path_mtu_discovery ? IP_DF : 0); h->ip_len = htons(len); h->ip_ttl = ttl ? ttl : V_ip_defttl; h->ip_sum = 0; pfse->pfse_type = PFSE_IP; break; #endif /* INET */ #ifdef INET6 case AF_INET6: /* TCP checksum */ th->th_sum = in6_cksum(m, IPPROTO_TCP, sizeof(struct ip6_hdr), tlen); h6->ip6_vfc |= IPV6_VERSION; h6->ip6_hlim = IPV6_DEFHLIM; pfse->pfse_type = PFSE_IP6; break; #endif /* INET6 */ } pfse->pfse_m = m; pf_send(pfse); } static void pf_return(struct pf_rule *r, struct pf_rule *nr, struct pf_pdesc *pd, struct pf_state_key *sk, int off, struct mbuf *m, struct tcphdr *th, struct pfi_kif *kif, u_int16_t bproto_sum, u_int16_t bip_sum, int hdrlen, u_short *reason) { struct pf_addr * const saddr = pd->src; struct pf_addr * const daddr = pd->dst; sa_family_t af = pd->af; /* undo NAT changes, if they have taken place */ if (nr != NULL) { PF_ACPY(saddr, &sk->addr[pd->sidx], af); PF_ACPY(daddr, &sk->addr[pd->didx], af); if (pd->sport) *pd->sport = sk->port[pd->sidx]; if (pd->dport) *pd->dport = sk->port[pd->didx]; if (pd->proto_sum) *pd->proto_sum = bproto_sum; if (pd->ip_sum) *pd->ip_sum = bip_sum; m_copyback(m, off, hdrlen, pd->hdr.any); } if (pd->proto == IPPROTO_TCP && ((r->rule_flag & PFRULE_RETURNRST) || (r->rule_flag & PFRULE_RETURN)) && !(th->th_flags & TH_RST)) { u_int32_t ack = ntohl(th->th_seq) + pd->p_len; int len = 0; #ifdef INET struct ip *h4; #endif #ifdef INET6 struct ip6_hdr *h6; #endif switch (af) { #ifdef INET case AF_INET: h4 = mtod(m, struct ip *); len = ntohs(h4->ip_len) - off; break; #endif #ifdef INET6 case AF_INET6: h6 = mtod(m, struct ip6_hdr *); len = ntohs(h6->ip6_plen) - (off - sizeof(*h6)); break; #endif } if (pf_check_proto_cksum(m, off, len, IPPROTO_TCP, af)) REASON_SET(reason, PFRES_PROTCKSUM); else { if (th->th_flags & TH_SYN) ack++; if (th->th_flags & TH_FIN) ack++; pf_send_tcp(m, r, af, pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), ack, TH_RST|TH_ACK, 0, 0, r->return_ttl, 1, 0, kif->pfik_ifp); } } else if (pd->proto != IPPROTO_ICMP && af == AF_INET && r->return_icmp) pf_send_icmp(m, r->return_icmp >> 8, r->return_icmp & 255, af, r); else if (pd->proto != IPPROTO_ICMPV6 && af == AF_INET6 && r->return_icmp6) pf_send_icmp(m, r->return_icmp6 >> 8, r->return_icmp6 & 255, af, r); } static int pf_ieee8021q_setpcp(struct mbuf *m, u_int8_t prio) { struct m_tag *mtag; KASSERT(prio <= PF_PRIO_MAX, ("%s with invalid pcp", __func__)); mtag = m_tag_locate(m, MTAG_8021Q, MTAG_8021Q_PCP_OUT, NULL); if (mtag == NULL) { mtag = m_tag_alloc(MTAG_8021Q, MTAG_8021Q_PCP_OUT, sizeof(uint8_t), M_NOWAIT); if (mtag == NULL) return (ENOMEM); m_tag_prepend(m, mtag); } *(uint8_t *)(mtag + 1) = prio; return (0); } static int pf_match_ieee8021q_pcp(u_int8_t prio, struct mbuf *m) { struct m_tag *mtag; u_int8_t mpcp; mtag = m_tag_locate(m, MTAG_8021Q, MTAG_8021Q_PCP_IN, NULL); if (mtag == NULL) return (0); if (prio == PF_PRIO_ZERO) prio = 0; mpcp = *(uint8_t *)(mtag + 1); return (mpcp == prio); } static void pf_send_icmp(struct mbuf *m, u_int8_t type, u_int8_t code, sa_family_t af, struct pf_rule *r) { struct pf_send_entry *pfse; struct mbuf *m0; struct pf_mtag *pf_mtag; /* Allocate outgoing queue entry, mbuf and mbuf tag. */ pfse = malloc(sizeof(*pfse), M_PFTEMP, M_NOWAIT); if (pfse == NULL) return; if ((m0 = m_copypacket(m, M_NOWAIT)) == NULL) { free(pfse, M_PFTEMP); return; } if ((pf_mtag = pf_get_mtag(m0)) == NULL) { free(pfse, M_PFTEMP); return; } /* XXX: revisit */ m0->m_flags |= M_SKIP_FIREWALL; if (r->rtableid >= 0) M_SETFIB(m0, r->rtableid); #ifdef ALTQ if (r->qid) { pf_mtag->qid = r->qid; /* add hints for ecn */ pf_mtag->hdr = mtod(m0, struct ip *); } #endif /* ALTQ */ switch (af) { #ifdef INET case AF_INET: pfse->pfse_type = PFSE_ICMP; break; #endif /* INET */ #ifdef INET6 case AF_INET6: pfse->pfse_type = PFSE_ICMP6; break; #endif /* INET6 */ } pfse->pfse_m = m0; pfse->icmpopts.type = type; pfse->icmpopts.code = code; pf_send(pfse); } /* * Return 1 if the addresses a and b match (with mask m), otherwise return 0. * If n is 0, they match if they are equal. If n is != 0, they match if they * are different. */ int pf_match_addr(u_int8_t n, struct pf_addr *a, struct pf_addr *m, struct pf_addr *b, sa_family_t af) { int match = 0; switch (af) { #ifdef INET case AF_INET: if ((a->addr32[0] & m->addr32[0]) == (b->addr32[0] & m->addr32[0])) match++; break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (((a->addr32[0] & m->addr32[0]) == (b->addr32[0] & m->addr32[0])) && ((a->addr32[1] & m->addr32[1]) == (b->addr32[1] & m->addr32[1])) && ((a->addr32[2] & m->addr32[2]) == (b->addr32[2] & m->addr32[2])) && ((a->addr32[3] & m->addr32[3]) == (b->addr32[3] & m->addr32[3]))) match++; break; #endif /* INET6 */ } if (match) { if (n) return (0); else return (1); } else { if (n) return (1); else return (0); } } /* * Return 1 if b <= a <= e, otherwise return 0. */ int pf_match_addr_range(struct pf_addr *b, struct pf_addr *e, struct pf_addr *a, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: if ((ntohl(a->addr32[0]) < ntohl(b->addr32[0])) || (ntohl(a->addr32[0]) > ntohl(e->addr32[0]))) return (0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: { int i; /* check a >= b */ for (i = 0; i < 4; ++i) if (ntohl(a->addr32[i]) > ntohl(b->addr32[i])) break; else if (ntohl(a->addr32[i]) < ntohl(b->addr32[i])) return (0); /* check a <= e */ for (i = 0; i < 4; ++i) if (ntohl(a->addr32[i]) < ntohl(e->addr32[i])) break; else if (ntohl(a->addr32[i]) > ntohl(e->addr32[i])) return (0); break; } #endif /* INET6 */ } return (1); } static int pf_match(u_int8_t op, u_int32_t a1, u_int32_t a2, u_int32_t p) { switch (op) { case PF_OP_IRG: return ((p > a1) && (p < a2)); case PF_OP_XRG: return ((p < a1) || (p > a2)); case PF_OP_RRG: return ((p >= a1) && (p <= a2)); case PF_OP_EQ: return (p == a1); case PF_OP_NE: return (p != a1); case PF_OP_LT: return (p < a1); case PF_OP_LE: return (p <= a1); case PF_OP_GT: return (p > a1); case PF_OP_GE: return (p >= a1); } return (0); /* never reached */ } int pf_match_port(u_int8_t op, u_int16_t a1, u_int16_t a2, u_int16_t p) { NTOHS(a1); NTOHS(a2); NTOHS(p); return (pf_match(op, a1, a2, p)); } static int pf_match_uid(u_int8_t op, uid_t a1, uid_t a2, uid_t u) { if (u == UID_MAX && op != PF_OP_EQ && op != PF_OP_NE) return (0); return (pf_match(op, a1, a2, u)); } static int pf_match_gid(u_int8_t op, gid_t a1, gid_t a2, gid_t g) { if (g == GID_MAX && op != PF_OP_EQ && op != PF_OP_NE) return (0); return (pf_match(op, a1, a2, g)); } int pf_match_tag(struct mbuf *m, struct pf_rule *r, int *tag, int mtag) { if (*tag == -1) *tag = mtag; return ((!r->match_tag_not && r->match_tag == *tag) || (r->match_tag_not && r->match_tag != *tag)); } int pf_tag_packet(struct mbuf *m, struct pf_pdesc *pd, int tag) { KASSERT(tag > 0, ("%s: tag %d", __func__, tag)); if (pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(m)) == NULL)) return (ENOMEM); pd->pf_mtag->tag = tag; return (0); } #define PF_ANCHOR_STACKSIZE 32 struct pf_anchor_stackframe { struct pf_ruleset *rs; struct pf_rule *r; /* XXX: + match bit */ struct pf_anchor *child; }; /* * XXX: We rely on malloc(9) returning pointer aligned addresses. */ #define PF_ANCHORSTACK_MATCH 0x00000001 #define PF_ANCHORSTACK_MASK (PF_ANCHORSTACK_MATCH) #define PF_ANCHOR_MATCH(f) ((uintptr_t)(f)->r & PF_ANCHORSTACK_MATCH) #define PF_ANCHOR_RULE(f) (struct pf_rule *) \ ((uintptr_t)(f)->r & ~PF_ANCHORSTACK_MASK) #define PF_ANCHOR_SET_MATCH(f) do { (f)->r = (void *) \ ((uintptr_t)(f)->r | PF_ANCHORSTACK_MATCH); \ } while (0) void pf_step_into_anchor(struct pf_anchor_stackframe *stack, int *depth, struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a, int *match) { struct pf_anchor_stackframe *f; PF_RULES_RASSERT(); if (match) *match = 0; if (*depth >= PF_ANCHOR_STACKSIZE) { printf("%s: anchor stack overflow on %s\n", __func__, (*r)->anchor->name); *r = TAILQ_NEXT(*r, entries); return; } else if (*depth == 0 && a != NULL) *a = *r; f = stack + (*depth)++; f->rs = *rs; f->r = *r; if ((*r)->anchor_wildcard) { struct pf_anchor_node *parent = &(*r)->anchor->children; if ((f->child = RB_MIN(pf_anchor_node, parent)) == NULL) { *r = NULL; return; } *rs = &f->child->ruleset; } else { f->child = NULL; *rs = &(*r)->anchor->ruleset; } *r = TAILQ_FIRST((*rs)->rules[n].active.ptr); } int pf_step_out_of_anchor(struct pf_anchor_stackframe *stack, int *depth, struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a, int *match) { struct pf_anchor_stackframe *f; struct pf_rule *fr; int quick = 0; PF_RULES_RASSERT(); do { if (*depth <= 0) break; f = stack + *depth - 1; fr = PF_ANCHOR_RULE(f); if (f->child != NULL) { struct pf_anchor_node *parent; /* * This block traverses through * a wildcard anchor. */ parent = &fr->anchor->children; if (match != NULL && *match) { /* * If any of "*" matched, then * "foo/ *" matched, mark frame * appropriately. */ PF_ANCHOR_SET_MATCH(f); *match = 0; } f->child = RB_NEXT(pf_anchor_node, parent, f->child); if (f->child != NULL) { *rs = &f->child->ruleset; *r = TAILQ_FIRST((*rs)->rules[n].active.ptr); if (*r == NULL) continue; else break; } } (*depth)--; if (*depth == 0 && a != NULL) *a = NULL; *rs = f->rs; if (PF_ANCHOR_MATCH(f) || (match != NULL && *match)) quick = fr->quick; *r = TAILQ_NEXT(fr, entries); } while (*r == NULL); return (quick); } #ifdef INET6 void pf_poolmask(struct pf_addr *naddr, struct pf_addr *raddr, struct pf_addr *rmask, struct pf_addr *saddr, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) | ((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]); break; #endif /* INET */ case AF_INET6: naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) | ((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]); naddr->addr32[1] = (raddr->addr32[1] & rmask->addr32[1]) | ((rmask->addr32[1] ^ 0xffffffff ) & saddr->addr32[1]); naddr->addr32[2] = (raddr->addr32[2] & rmask->addr32[2]) | ((rmask->addr32[2] ^ 0xffffffff ) & saddr->addr32[2]); naddr->addr32[3] = (raddr->addr32[3] & rmask->addr32[3]) | ((rmask->addr32[3] ^ 0xffffffff ) & saddr->addr32[3]); break; } } void pf_addr_inc(struct pf_addr *addr, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1); break; #endif /* INET */ case AF_INET6: if (addr->addr32[3] == 0xffffffff) { addr->addr32[3] = 0; if (addr->addr32[2] == 0xffffffff) { addr->addr32[2] = 0; if (addr->addr32[1] == 0xffffffff) { addr->addr32[1] = 0; addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1); } else addr->addr32[1] = htonl(ntohl(addr->addr32[1]) + 1); } else addr->addr32[2] = htonl(ntohl(addr->addr32[2]) + 1); } else addr->addr32[3] = htonl(ntohl(addr->addr32[3]) + 1); break; } } #endif /* INET6 */ int pf_socket_lookup(int direction, struct pf_pdesc *pd, struct mbuf *m) { struct pf_addr *saddr, *daddr; u_int16_t sport, dport; struct inpcbinfo *pi; struct inpcb *inp; pd->lookup.uid = UID_MAX; pd->lookup.gid = GID_MAX; switch (pd->proto) { case IPPROTO_TCP: if (pd->hdr.tcp == NULL) return (-1); sport = pd->hdr.tcp->th_sport; dport = pd->hdr.tcp->th_dport; pi = &V_tcbinfo; break; case IPPROTO_UDP: if (pd->hdr.udp == NULL) return (-1); sport = pd->hdr.udp->uh_sport; dport = pd->hdr.udp->uh_dport; pi = &V_udbinfo; break; default: return (-1); } if (direction == PF_IN) { saddr = pd->src; daddr = pd->dst; } else { u_int16_t p; p = sport; sport = dport; dport = p; saddr = pd->dst; daddr = pd->src; } switch (pd->af) { #ifdef INET case AF_INET: inp = in_pcblookup_mbuf(pi, saddr->v4, sport, daddr->v4, dport, INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) { inp = in_pcblookup_mbuf(pi, saddr->v4, sport, daddr->v4, dport, INPLOOKUP_WILDCARD | INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) return (-1); } break; #endif /* INET */ #ifdef INET6 case AF_INET6: inp = in6_pcblookup_mbuf(pi, &saddr->v6, sport, &daddr->v6, dport, INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) { inp = in6_pcblookup_mbuf(pi, &saddr->v6, sport, &daddr->v6, dport, INPLOOKUP_WILDCARD | INPLOOKUP_RLOCKPCB, NULL, m); if (inp == NULL) return (-1); } break; #endif /* INET6 */ default: return (-1); } INP_RLOCK_ASSERT(inp); pd->lookup.uid = inp->inp_cred->cr_uid; pd->lookup.gid = inp->inp_cred->cr_groups[0]; INP_RUNLOCK(inp); return (1); } static u_int8_t pf_get_wscale(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af) { int hlen; u_int8_t hdr[60]; u_int8_t *opt, optlen; u_int8_t wscale = 0; hlen = th_off << 2; /* hlen <= sizeof(hdr) */ if (hlen <= sizeof(struct tcphdr)) return (0); if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af)) return (0); opt = hdr + sizeof(struct tcphdr); hlen -= sizeof(struct tcphdr); while (hlen >= 3) { switch (*opt) { case TCPOPT_EOL: case TCPOPT_NOP: ++opt; --hlen; break; case TCPOPT_WINDOW: wscale = opt[2]; if (wscale > TCP_MAX_WINSHIFT) wscale = TCP_MAX_WINSHIFT; wscale |= PF_WSCALE_FLAG; /* FALLTHROUGH */ default: optlen = opt[1]; if (optlen < 2) optlen = 2; hlen -= optlen; opt += optlen; break; } } return (wscale); } static u_int16_t pf_get_mss(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af) { int hlen; u_int8_t hdr[60]; u_int8_t *opt, optlen; u_int16_t mss = V_tcp_mssdflt; hlen = th_off << 2; /* hlen <= sizeof(hdr) */ if (hlen <= sizeof(struct tcphdr)) return (0); if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af)) return (0); opt = hdr + sizeof(struct tcphdr); hlen -= sizeof(struct tcphdr); while (hlen >= TCPOLEN_MAXSEG) { switch (*opt) { case TCPOPT_EOL: case TCPOPT_NOP: ++opt; --hlen; break; case TCPOPT_MAXSEG: bcopy((caddr_t)(opt + 2), (caddr_t)&mss, 2); NTOHS(mss); /* FALLTHROUGH */ default: optlen = opt[1]; if (optlen < 2) optlen = 2; hlen -= optlen; opt += optlen; break; } } return (mss); } static u_int16_t pf_calc_mss(struct pf_addr *addr, sa_family_t af, int rtableid, u_int16_t offer) { #ifdef INET struct nhop4_basic nh4; #endif /* INET */ #ifdef INET6 struct nhop6_basic nh6; struct in6_addr dst6; uint32_t scopeid; #endif /* INET6 */ int hlen = 0; uint16_t mss = 0; switch (af) { #ifdef INET case AF_INET: hlen = sizeof(struct ip); if (fib4_lookup_nh_basic(rtableid, addr->v4, 0, 0, &nh4) == 0) mss = nh4.nh_mtu - hlen - sizeof(struct tcphdr); break; #endif /* INET */ #ifdef INET6 case AF_INET6: hlen = sizeof(struct ip6_hdr); in6_splitscope(&addr->v6, &dst6, &scopeid); if (fib6_lookup_nh_basic(rtableid, &dst6, scopeid, 0,0,&nh6)==0) mss = nh6.nh_mtu - hlen - sizeof(struct tcphdr); break; #endif /* INET6 */ } mss = max(V_tcp_mssdflt, mss); mss = min(mss, offer); mss = max(mss, 64); /* sanity - at least max opt space */ return (mss); } static u_int32_t pf_tcp_iss(struct pf_pdesc *pd) { MD5_CTX ctx; u_int32_t digest[4]; if (V_pf_tcp_secret_init == 0) { arc4random_buf(&V_pf_tcp_secret, sizeof(V_pf_tcp_secret)); MD5Init(&V_pf_tcp_secret_ctx); MD5Update(&V_pf_tcp_secret_ctx, V_pf_tcp_secret, sizeof(V_pf_tcp_secret)); V_pf_tcp_secret_init = 1; } ctx = V_pf_tcp_secret_ctx; MD5Update(&ctx, (char *)&pd->hdr.tcp->th_sport, sizeof(u_short)); MD5Update(&ctx, (char *)&pd->hdr.tcp->th_dport, sizeof(u_short)); if (pd->af == AF_INET6) { MD5Update(&ctx, (char *)&pd->src->v6, sizeof(struct in6_addr)); MD5Update(&ctx, (char *)&pd->dst->v6, sizeof(struct in6_addr)); } else { MD5Update(&ctx, (char *)&pd->src->v4, sizeof(struct in_addr)); MD5Update(&ctx, (char *)&pd->dst->v4, sizeof(struct in_addr)); } MD5Final((u_char *)digest, &ctx); V_pf_tcp_iss_off += 4096; #define ISN_RANDOM_INCREMENT (4096 - 1) return (digest[0] + (arc4random() & ISN_RANDOM_INCREMENT) + V_pf_tcp_iss_off); #undef ISN_RANDOM_INCREMENT } static int pf_test_rule(struct pf_rule **rm, struct pf_state **sm, int direction, struct pfi_kif *kif, struct mbuf *m, int off, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm, struct inpcb *inp) { struct pf_rule *nr = NULL; struct pf_addr * const saddr = pd->src; struct pf_addr * const daddr = pd->dst; sa_family_t af = pd->af; struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; struct pf_src_node *nsn = NULL; struct tcphdr *th = pd->hdr.tcp; struct pf_state_key *sk = NULL, *nk = NULL; u_short reason; int rewrite = 0, hdrlen = 0; int tag = -1, rtableid = -1; int asd = 0; int match = 0; int state_icmp = 0; u_int16_t sport = 0, dport = 0; u_int16_t bproto_sum = 0, bip_sum = 0; u_int8_t icmptype = 0, icmpcode = 0; struct pf_anchor_stackframe anchor_stack[PF_ANCHOR_STACKSIZE]; PF_RULES_RASSERT(); if (inp != NULL) { INP_LOCK_ASSERT(inp); pd->lookup.uid = inp->inp_cred->cr_uid; pd->lookup.gid = inp->inp_cred->cr_groups[0]; pd->lookup.done = 1; } switch (pd->proto) { case IPPROTO_TCP: sport = th->th_sport; dport = th->th_dport; hdrlen = sizeof(*th); break; case IPPROTO_UDP: sport = pd->hdr.udp->uh_sport; dport = pd->hdr.udp->uh_dport; hdrlen = sizeof(*pd->hdr.udp); break; #ifdef INET case IPPROTO_ICMP: if (pd->af != AF_INET) break; sport = dport = pd->hdr.icmp->icmp_id; hdrlen = sizeof(*pd->hdr.icmp); icmptype = pd->hdr.icmp->icmp_type; icmpcode = pd->hdr.icmp->icmp_code; if (icmptype == ICMP_UNREACH || icmptype == ICMP_SOURCEQUENCH || icmptype == ICMP_REDIRECT || icmptype == ICMP_TIMXCEED || icmptype == ICMP_PARAMPROB) state_icmp++; break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: if (af != AF_INET6) break; sport = dport = pd->hdr.icmp6->icmp6_id; hdrlen = sizeof(*pd->hdr.icmp6); icmptype = pd->hdr.icmp6->icmp6_type; icmpcode = pd->hdr.icmp6->icmp6_code; if (icmptype == ICMP6_DST_UNREACH || icmptype == ICMP6_PACKET_TOO_BIG || icmptype == ICMP6_TIME_EXCEEDED || icmptype == ICMP6_PARAM_PROB) state_icmp++; break; #endif /* INET6 */ default: sport = dport = hdrlen = 0; break; } r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); /* check packet for BINAT/NAT/RDR */ if ((nr = pf_get_translation(pd, m, off, direction, kif, &nsn, &sk, &nk, saddr, daddr, sport, dport, anchor_stack)) != NULL) { KASSERT(sk != NULL, ("%s: null sk", __func__)); KASSERT(nk != NULL, ("%s: null nk", __func__)); if (pd->ip_sum) bip_sum = *pd->ip_sum; switch (pd->proto) { case IPPROTO_TCP: bproto_sum = th->th_sum; pd->proto_sum = &th->th_sum; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], af) || nk->port[pd->sidx] != sport) { pf_change_ap(m, saddr, &th->th_sport, pd->ip_sum, &th->th_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 0, af); pd->sport = &th->th_sport; sport = th->th_sport; } if (PF_ANEQ(daddr, &nk->addr[pd->didx], af) || nk->port[pd->didx] != dport) { pf_change_ap(m, daddr, &th->th_dport, pd->ip_sum, &th->th_sum, &nk->addr[pd->didx], nk->port[pd->didx], 0, af); dport = th->th_dport; pd->dport = &th->th_dport; } rewrite++; break; case IPPROTO_UDP: bproto_sum = pd->hdr.udp->uh_sum; pd->proto_sum = &pd->hdr.udp->uh_sum; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], af) || nk->port[pd->sidx] != sport) { pf_change_ap(m, saddr, &pd->hdr.udp->uh_sport, pd->ip_sum, &pd->hdr.udp->uh_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 1, af); sport = pd->hdr.udp->uh_sport; pd->sport = &pd->hdr.udp->uh_sport; } if (PF_ANEQ(daddr, &nk->addr[pd->didx], af) || nk->port[pd->didx] != dport) { pf_change_ap(m, daddr, &pd->hdr.udp->uh_dport, pd->ip_sum, &pd->hdr.udp->uh_sum, &nk->addr[pd->didx], nk->port[pd->didx], 1, af); dport = pd->hdr.udp->uh_dport; pd->dport = &pd->hdr.udp->uh_dport; } rewrite++; break; #ifdef INET case IPPROTO_ICMP: nk->port[0] = nk->port[1]; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&saddr->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET)) pf_change_a(&daddr->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); if (nk->port[1] != pd->hdr.icmp->icmp_id) { pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, sport, nk->port[1], 0); pd->hdr.icmp->icmp_id = nk->port[1]; pd->sport = &pd->hdr.icmp->icmp_id; } m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: nk->port[0] = nk->port[1]; if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET6)) pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->sidx], 0); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET6)) pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->didx], 0); rewrite++; break; #endif /* INET */ default: switch (af) { #ifdef INET case AF_INET: if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&saddr->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET)) pf_change_a(&daddr->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET6)) PF_ACPY(saddr, &nk->addr[pd->sidx], af); if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET6)) PF_ACPY(daddr, &nk->addr[pd->didx], af); break; #endif /* INET */ } break; } if (nr->natpass) r = NULL; pd->nat_rule = nr; } while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, saddr, af, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; /* tcp/udp only. port_op always 0 in other cases */ else if (r->src.port_op && !pf_match_port(r->src.port_op, r->src.port[0], r->src.port[1], sport)) r = r->skip[PF_SKIP_SRC_PORT].ptr; else if (PF_MISMATCHAW(&r->dst.addr, daddr, af, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; /* tcp/udp only. port_op always 0 in other cases */ else if (r->dst.port_op && !pf_match_port(r->dst.port_op, r->dst.port[0], r->dst.port[1], dport)) r = r->skip[PF_SKIP_DST_PORT].ptr; /* icmp only. type always 0 in other cases */ else if (r->type && r->type != icmptype + 1) r = TAILQ_NEXT(r, entries); /* icmp only. type always 0 in other cases */ else if (r->code && r->code != icmpcode + 1) r = TAILQ_NEXT(r, entries); else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->rule_flag & PFRULE_FRAGMENT) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_TCP && (r->flagset & th->th_flags) != r->flags) r = TAILQ_NEXT(r, entries); /* tcp/udp only. uid.op always 0 in other cases */ else if (r->uid.op && (pd->lookup.done || (pd->lookup.done = pf_socket_lookup(direction, pd, m), 1)) && !pf_match_uid(r->uid.op, r->uid.uid[0], r->uid.uid[1], pd->lookup.uid)) r = TAILQ_NEXT(r, entries); /* tcp/udp only. gid.op always 0 in other cases */ else if (r->gid.op && (pd->lookup.done || (pd->lookup.done = pf_socket_lookup(direction, pd, m), 1)) && !pf_match_gid(r->gid.op, r->gid.gid[0], r->gid.gid[1], pd->lookup.gid)) r = TAILQ_NEXT(r, entries); else if (r->prio && !pf_match_ieee8021q_pcp(r->prio, m)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= arc4random()) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, &tag, pd->pf_mtag ? pd->pf_mtag->tag : 0)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY && (pd->proto != IPPROTO_TCP || !pf_osfp_match( pf_osfp_fingerprint(pd, m, off, th), r->os_fingerprint))) r = TAILQ_NEXT(r, entries); else { if (r->tag) tag = r->tag; if (r->rtableid >= 0) rtableid = r->rtableid; if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log || (nr != NULL && nr->log)) { if (rewrite) m_copyback(m, off, hdrlen, pd->hdr.any); PFLOG_PACKET(kif, m, af, direction, reason, r->log ? r : nr, a, ruleset, pd, 1); } if ((r->action == PF_DROP) && ((r->rule_flag & PFRULE_RETURNRST) || (r->rule_flag & PFRULE_RETURNICMP) || (r->rule_flag & PFRULE_RETURN))) { pf_return(r, nr, pd, sk, off, m, th, kif, bproto_sum, bip_sum, hdrlen, &reason); } if (r->action == PF_DROP) goto cleanup; if (tag > 0 && pf_tag_packet(m, pd, tag)) { REASON_SET(&reason, PFRES_MEMORY); goto cleanup; } if (rtableid >= 0) M_SETFIB(m, rtableid); if (!state_icmp && (r->keep_state || nr != NULL || (pd->flags & PFDESC_TCP_NORM))) { int action; action = pf_create_state(r, nr, a, pd, nsn, nk, sk, m, off, sport, dport, &rewrite, kif, sm, tag, bproto_sum, bip_sum, hdrlen); if (action != PF_PASS) { if (action == PF_DROP && (r->rule_flag & PFRULE_RETURN)) pf_return(r, nr, pd, sk, off, m, th, kif, bproto_sum, bip_sum, hdrlen, &reason); return (action); } } else { if (sk != NULL) uma_zfree(V_pf_state_key_z, sk); if (nk != NULL) uma_zfree(V_pf_state_key_z, nk); } /* copy back packet headers if we performed NAT operations */ if (rewrite) m_copyback(m, off, hdrlen, pd->hdr.any); if (*sm != NULL && !((*sm)->state_flags & PFSTATE_NOSYNC) && direction == PF_OUT && V_pfsync_defer_ptr != NULL && V_pfsync_defer_ptr(*sm, m)) /* * We want the state created, but we dont * want to send this in case a partner * firewall has to know about it to allow * replies through it. */ return (PF_DEFER); return (PF_PASS); cleanup: if (sk != NULL) uma_zfree(V_pf_state_key_z, sk); if (nk != NULL) uma_zfree(V_pf_state_key_z, nk); return (PF_DROP); } static int pf_create_state(struct pf_rule *r, struct pf_rule *nr, struct pf_rule *a, struct pf_pdesc *pd, struct pf_src_node *nsn, struct pf_state_key *nk, struct pf_state_key *sk, struct mbuf *m, int off, u_int16_t sport, u_int16_t dport, int *rewrite, struct pfi_kif *kif, struct pf_state **sm, int tag, u_int16_t bproto_sum, u_int16_t bip_sum, int hdrlen) { struct pf_state *s = NULL; struct pf_src_node *sn = NULL; struct tcphdr *th = pd->hdr.tcp; u_int16_t mss = V_tcp_mssdflt; u_short reason; /* check maximums */ if (r->max_states && (counter_u64_fetch(r->states_cur) >= r->max_states)) { counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1); REASON_SET(&reason, PFRES_MAXSTATES); goto csfailed; } /* src node for filter rule */ if ((r->rule_flag & PFRULE_SRCTRACK || r->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&sn, r, pd->src, pd->af) != 0) { REASON_SET(&reason, PFRES_SRCLIMIT); goto csfailed; } /* src node for translation rule */ if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) && pf_insert_src_node(&nsn, nr, &sk->addr[pd->sidx], pd->af)) { REASON_SET(&reason, PFRES_SRCLIMIT); goto csfailed; } s = uma_zalloc(V_pf_state_z, M_NOWAIT | M_ZERO); if (s == NULL) { REASON_SET(&reason, PFRES_MEMORY); goto csfailed; } s->rule.ptr = r; s->nat_rule.ptr = nr; s->anchor.ptr = a; STATE_INC_COUNTERS(s); if (r->allow_opts) s->state_flags |= PFSTATE_ALLOWOPTS; if (r->rule_flag & PFRULE_STATESLOPPY) s->state_flags |= PFSTATE_SLOPPY; s->log = r->log & PF_LOG_ALL; s->sync_state = PFSYNC_S_NONE; if (nr != NULL) s->log |= nr->log & PF_LOG_ALL; switch (pd->proto) { case IPPROTO_TCP: s->src.seqlo = ntohl(th->th_seq); s->src.seqhi = s->src.seqlo + pd->p_len + 1; if ((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN && r->keep_state == PF_STATE_MODULATE) { /* Generate sequence number modulator */ if ((s->src.seqdiff = pf_tcp_iss(pd) - s->src.seqlo) == 0) s->src.seqdiff = 1; pf_change_proto_a(m, &th->th_seq, &th->th_sum, htonl(s->src.seqlo + s->src.seqdiff), 0); *rewrite = 1; } else s->src.seqdiff = 0; if (th->th_flags & TH_SYN) { s->src.seqhi++; s->src.wscale = pf_get_wscale(m, off, th->th_off, pd->af); } s->src.max_win = MAX(ntohs(th->th_win), 1); if (s->src.wscale & PF_WSCALE_MASK) { /* Remove scale factor from initial window */ int win = s->src.max_win; win += 1 << (s->src.wscale & PF_WSCALE_MASK); s->src.max_win = (win - 1) >> (s->src.wscale & PF_WSCALE_MASK); } if (th->th_flags & TH_FIN) s->src.seqhi++; s->dst.seqhi = 1; s->dst.max_win = 1; s->src.state = TCPS_SYN_SENT; s->dst.state = TCPS_CLOSED; s->timeout = PFTM_TCP_FIRST_PACKET; break; case IPPROTO_UDP: s->src.state = PFUDPS_SINGLE; s->dst.state = PFUDPS_NO_TRAFFIC; s->timeout = PFTM_UDP_FIRST_PACKET; break; case IPPROTO_ICMP: #ifdef INET6 case IPPROTO_ICMPV6: #endif s->timeout = PFTM_ICMP_FIRST_PACKET; break; default: s->src.state = PFOTHERS_SINGLE; s->dst.state = PFOTHERS_NO_TRAFFIC; s->timeout = PFTM_OTHER_FIRST_PACKET; } if (r->rt) { if (pf_map_addr(pd->af, r, pd->src, &s->rt_addr, NULL, &sn)) { REASON_SET(&reason, PFRES_MAPFAILED); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); goto csfailed; } s->rt_kif = r->rpool.cur->kif; } s->creation = time_uptime; s->expire = time_uptime; if (sn != NULL) s->src_node = sn; if (nsn != NULL) { /* XXX We only modify one side for now. */ PF_ACPY(&nsn->raddr, &nk->addr[1], pd->af); s->nat_src_node = nsn; } if (pd->proto == IPPROTO_TCP) { if ((pd->flags & PFDESC_TCP_NORM) && pf_normalize_tcp_init(m, off, pd, th, &s->src, &s->dst)) { REASON_SET(&reason, PFRES_MEMORY); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); return (PF_DROP); } if ((pd->flags & PFDESC_TCP_NORM) && s->src.scrub && pf_normalize_tcp_stateful(m, off, pd, &reason, th, s, &s->src, &s->dst, rewrite)) { /* This really shouldn't happen!!! */ DPFPRINTF(PF_DEBUG_URGENT, ("pf_normalize_tcp_stateful failed on first " "pkt\n")); pf_normalize_tcp_cleanup(s); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); return (PF_DROP); } } s->direction = pd->dir; /* * sk/nk could already been setup by pf_get_translation(). */ if (nr == NULL) { KASSERT((sk == NULL && nk == NULL), ("%s: nr %p sk %p, nk %p", __func__, nr, sk, nk)); sk = pf_state_key_setup(pd, pd->src, pd->dst, sport, dport); if (sk == NULL) goto csfailed; nk = sk; } else KASSERT((sk != NULL && nk != NULL), ("%s: nr %p sk %p, nk %p", __func__, nr, sk, nk)); /* Swap sk/nk for PF_OUT. */ if (pf_state_insert(BOUND_IFACE(r, kif), (pd->dir == PF_IN) ? sk : nk, (pd->dir == PF_IN) ? nk : sk, s)) { if (pd->proto == IPPROTO_TCP) pf_normalize_tcp_cleanup(s); REASON_SET(&reason, PFRES_STATEINS); pf_src_tree_remove_state(s); STATE_DEC_COUNTERS(s); uma_zfree(V_pf_state_z, s); return (PF_DROP); } else *sm = s; if (tag > 0) s->tag = tag; if (pd->proto == IPPROTO_TCP && (th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN && r->keep_state == PF_STATE_SYNPROXY) { s->src.state = PF_TCPS_PROXY_SRC; /* undo NAT changes, if they have taken place */ if (nr != NULL) { struct pf_state_key *skt = s->key[PF_SK_WIRE]; if (pd->dir == PF_OUT) skt = s->key[PF_SK_STACK]; PF_ACPY(pd->src, &skt->addr[pd->sidx], pd->af); PF_ACPY(pd->dst, &skt->addr[pd->didx], pd->af); if (pd->sport) *pd->sport = skt->port[pd->sidx]; if (pd->dport) *pd->dport = skt->port[pd->didx]; if (pd->proto_sum) *pd->proto_sum = bproto_sum; if (pd->ip_sum) *pd->ip_sum = bip_sum; m_copyback(m, off, hdrlen, pd->hdr.any); } s->src.seqhi = htonl(arc4random()); /* Find mss option */ int rtid = M_GETFIB(m); mss = pf_get_mss(m, off, th->th_off, pd->af); mss = pf_calc_mss(pd->src, pd->af, rtid, mss); mss = pf_calc_mss(pd->dst, pd->af, rtid, mss); s->src.mss = mss; pf_send_tcp(NULL, r, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, s->src.seqhi, ntohl(th->th_seq) + 1, TH_SYN|TH_ACK, 0, s->src.mss, 0, 1, 0, NULL); REASON_SET(&reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } return (PF_PASS); csfailed: if (sk != NULL) uma_zfree(V_pf_state_key_z, sk); if (nk != NULL) uma_zfree(V_pf_state_key_z, nk); if (sn != NULL) { struct pf_srchash *sh; sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)]; PF_HASHROW_LOCK(sh); if (--sn->states == 0 && sn->expire == 0) { pf_unlink_src_node(sn); uma_zfree(V_pf_sources_z, sn); counter_u64_add( V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], 1); } PF_HASHROW_UNLOCK(sh); } if (nsn != sn && nsn != NULL) { struct pf_srchash *sh; sh = &V_pf_srchash[pf_hashsrc(&nsn->addr, nsn->af)]; PF_HASHROW_LOCK(sh); if (--nsn->states == 0 && nsn->expire == 0) { pf_unlink_src_node(nsn); uma_zfree(V_pf_sources_z, nsn); counter_u64_add( V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], 1); } PF_HASHROW_UNLOCK(sh); } return (PF_DROP); } static int pf_test_fragment(struct pf_rule **rm, int direction, struct pfi_kif *kif, struct mbuf *m, void *h, struct pf_pdesc *pd, struct pf_rule **am, struct pf_ruleset **rsm) { struct pf_rule *r, *a = NULL; struct pf_ruleset *ruleset = NULL; sa_family_t af = pd->af; u_short reason; int tag = -1; int asd = 0; int match = 0; struct pf_anchor_stackframe anchor_stack[PF_ANCHOR_STACKSIZE]; PF_RULES_RASSERT(); r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr); while (r != NULL) { r->evaluations++; if (pfi_kif_match(r->kif, kif) == r->ifnot) r = r->skip[PF_SKIP_IFP].ptr; else if (r->direction && r->direction != direction) r = r->skip[PF_SKIP_DIR].ptr; else if (r->af && r->af != af) r = r->skip[PF_SKIP_AF].ptr; else if (r->proto && r->proto != pd->proto) r = r->skip[PF_SKIP_PROTO].ptr; else if (PF_MISMATCHAW(&r->src.addr, pd->src, af, r->src.neg, kif, M_GETFIB(m))) r = r->skip[PF_SKIP_SRC_ADDR].ptr; else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af, r->dst.neg, NULL, M_GETFIB(m))) r = r->skip[PF_SKIP_DST_ADDR].ptr; else if (r->tos && !(r->tos == pd->tos)) r = TAILQ_NEXT(r, entries); else if (r->os_fingerprint != PF_OSFP_ANY) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_UDP && (r->src.port_op || r->dst.port_op)) r = TAILQ_NEXT(r, entries); else if (pd->proto == IPPROTO_TCP && (r->src.port_op || r->dst.port_op || r->flagset)) r = TAILQ_NEXT(r, entries); else if ((pd->proto == IPPROTO_ICMP || pd->proto == IPPROTO_ICMPV6) && (r->type || r->code)) r = TAILQ_NEXT(r, entries); else if (r->prio && !pf_match_ieee8021q_pcp(r->prio, m)) r = TAILQ_NEXT(r, entries); else if (r->prob && r->prob <= (arc4random() % (UINT_MAX - 1) + 1)) r = TAILQ_NEXT(r, entries); else if (r->match_tag && !pf_match_tag(m, r, &tag, pd->pf_mtag ? pd->pf_mtag->tag : 0)) r = TAILQ_NEXT(r, entries); else { if (r->anchor == NULL) { match = 1; *rm = r; *am = a; *rsm = ruleset; if ((*rm)->quick) break; r = TAILQ_NEXT(r, entries); } else pf_step_into_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match); } if (r == NULL && pf_step_out_of_anchor(anchor_stack, &asd, &ruleset, PF_RULESET_FILTER, &r, &a, &match)) break; } r = *rm; a = *am; ruleset = *rsm; REASON_SET(&reason, PFRES_MATCH); if (r->log) PFLOG_PACKET(kif, m, af, direction, reason, r, a, ruleset, pd, 1); if (r->action != PF_PASS) return (PF_DROP); if (tag > 0 && pf_tag_packet(m, pd, tag)) { REASON_SET(&reason, PFRES_MEMORY); return (PF_DROP); } return (PF_PASS); } static int pf_tcp_track_full(struct pf_state_peer *src, struct pf_state_peer *dst, struct pf_state **state, struct pfi_kif *kif, struct mbuf *m, int off, struct pf_pdesc *pd, u_short *reason, int *copyback) { struct tcphdr *th = pd->hdr.tcp; u_int16_t win = ntohs(th->th_win); u_int32_t ack, end, seq, orig_seq; u_int8_t sws, dws; int ackskew; if (src->wscale && dst->wscale && !(th->th_flags & TH_SYN)) { sws = src->wscale & PF_WSCALE_MASK; dws = dst->wscale & PF_WSCALE_MASK; } else sws = dws = 0; /* * Sequence tracking algorithm from Guido van Rooij's paper: * http://www.madison-gurkha.com/publications/tcp_filtering/ * tcp_filtering.ps */ orig_seq = seq = ntohl(th->th_seq); if (src->seqlo == 0) { /* First packet from this end. Set its state */ if ((pd->flags & PFDESC_TCP_NORM || dst->scrub) && src->scrub == NULL) { if (pf_normalize_tcp_init(m, off, pd, th, src, dst)) { REASON_SET(reason, PFRES_MEMORY); return (PF_DROP); } } /* Deferred generation of sequence number modulator */ if (dst->seqdiff && !src->seqdiff) { /* use random iss for the TCP server */ while ((src->seqdiff = arc4random() - seq) == 0) ; ack = ntohl(th->th_ack) - dst->seqdiff; pf_change_proto_a(m, &th->th_seq, &th->th_sum, htonl(seq + src->seqdiff), 0); pf_change_proto_a(m, &th->th_ack, &th->th_sum, htonl(ack), 0); *copyback = 1; } else { ack = ntohl(th->th_ack); } end = seq + pd->p_len; if (th->th_flags & TH_SYN) { end++; if (dst->wscale & PF_WSCALE_FLAG) { src->wscale = pf_get_wscale(m, off, th->th_off, pd->af); if (src->wscale & PF_WSCALE_FLAG) { /* Remove scale factor from initial * window */ sws = src->wscale & PF_WSCALE_MASK; win = ((u_int32_t)win + (1 << sws) - 1) >> sws; dws = dst->wscale & PF_WSCALE_MASK; } else { /* fixup other window */ dst->max_win <<= dst->wscale & PF_WSCALE_MASK; /* in case of a retrans SYN|ACK */ dst->wscale = 0; } } } if (th->th_flags & TH_FIN) end++; src->seqlo = seq; if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; /* * May need to slide the window (seqhi may have been set by * the crappy stack check or if we picked up the connection * after establishment) */ if (src->seqhi == 1 || SEQ_GEQ(end + MAX(1, dst->max_win << dws), src->seqhi)) src->seqhi = end + MAX(1, dst->max_win << dws); if (win > src->max_win) src->max_win = win; } else { ack = ntohl(th->th_ack) - dst->seqdiff; if (src->seqdiff) { /* Modulate sequence numbers */ pf_change_proto_a(m, &th->th_seq, &th->th_sum, htonl(seq + src->seqdiff), 0); pf_change_proto_a(m, &th->th_ack, &th->th_sum, htonl(ack), 0); *copyback = 1; } end = seq + pd->p_len; if (th->th_flags & TH_SYN) end++; if (th->th_flags & TH_FIN) end++; } if ((th->th_flags & TH_ACK) == 0) { /* Let it pass through the ack skew check */ ack = dst->seqlo; } else if ((ack == 0 && (th->th_flags & (TH_ACK|TH_RST)) == (TH_ACK|TH_RST)) || /* broken tcp stacks do not set ack */ (dst->state < TCPS_SYN_SENT)) { /* * Many stacks (ours included) will set the ACK number in an * FIN|ACK if the SYN times out -- no sequence to ACK. */ ack = dst->seqlo; } if (seq == end) { /* Ease sequencing restrictions on no data packets */ seq = src->seqlo; end = seq; } ackskew = dst->seqlo - ack; /* * Need to demodulate the sequence numbers in any TCP SACK options * (Selective ACK). We could optionally validate the SACK values * against the current ACK window, either forwards or backwards, but * I'm not confident that SACK has been implemented properly * everywhere. It wouldn't surprise me if several stacks accidentally * SACK too far backwards of previously ACKed data. There really aren't * any security implications of bad SACKing unless the target stack * doesn't validate the option length correctly. Someone trying to * spoof into a TCP connection won't bother blindly sending SACK * options anyway. */ if (dst->seqdiff && (th->th_off << 2) > sizeof(struct tcphdr)) { if (pf_modulate_sack(m, off, pd, th, dst)) *copyback = 1; } #define MAXACKWINDOW (0xffff + 1500) /* 1500 is an arbitrary fudge factor */ if (SEQ_GEQ(src->seqhi, end) && /* Last octet inside other's window space */ SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) && /* Retrans: not more than one window back */ (ackskew >= -MAXACKWINDOW) && /* Acking not more than one reassembled fragment backwards */ (ackskew <= (MAXACKWINDOW << sws)) && /* Acking not more than one window forward */ ((th->th_flags & TH_RST) == 0 || orig_seq == src->seqlo || (orig_seq == src->seqlo + 1) || (orig_seq + 1 == src->seqlo) || (pd->flags & PFDESC_IP_REAS) == 0)) { /* Require an exact/+1 sequence match on resets when possible */ if (dst->scrub || src->scrub) { if (pf_normalize_tcp_stateful(m, off, pd, reason, th, *state, src, dst, copyback)) return (PF_DROP); } /* update max window */ if (src->max_win < win) src->max_win = win; /* synchronize sequencing */ if (SEQ_GT(end, src->seqlo)) src->seqlo = end; /* slide the window of what the other end can send */ if (SEQ_GEQ(ack + (win << sws), dst->seqhi)) dst->seqhi = ack + MAX((win << sws), 1); /* update states */ if (th->th_flags & TH_SYN) if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_ACK) { if (dst->state == TCPS_SYN_SENT) { dst->state = TCPS_ESTABLISHED; if (src->state == TCPS_ESTABLISHED && (*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (dst->state == TCPS_CLOSING) dst->state = TCPS_FIN_WAIT_2; } if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* update expire time */ (*state)->expire = time_uptime; if (src->state >= TCPS_FIN_WAIT_2 && dst->state >= TCPS_FIN_WAIT_2) (*state)->timeout = PFTM_TCP_CLOSED; else if (src->state >= TCPS_CLOSING && dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_FIN_WAIT; else if (src->state < TCPS_ESTABLISHED || dst->state < TCPS_ESTABLISHED) (*state)->timeout = PFTM_TCP_OPENING; else if (src->state >= TCPS_CLOSING || dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_CLOSING; else (*state)->timeout = PFTM_TCP_ESTABLISHED; /* Fall through to PASS packet */ } else if ((dst->state < TCPS_SYN_SENT || dst->state >= TCPS_FIN_WAIT_2 || src->state >= TCPS_FIN_WAIT_2) && SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) && /* Within a window forward of the originating packet */ SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW)) { /* Within a window backward of the originating packet */ /* * This currently handles three situations: * 1) Stupid stacks will shotgun SYNs before their peer * replies. * 2) When PF catches an already established stream (the * firewall rebooted, the state table was flushed, routes * changed...) * 3) Packets get funky immediately after the connection * closes (this should catch Solaris spurious ACK|FINs * that web servers like to spew after a close) * * This must be a little more careful than the above code * since packet floods will also be caught here. We don't * update the TTL here to mitigate the damage of a packet * flood and so the same code can handle awkward establishment * and a loosened connection close. * In the establishment case, a correct peer response will * validate the connection, go through the normal state code * and keep updating the state TTL. */ if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: loose state match: "); pf_print_state(*state); pf_print_flags(th->th_flags); printf(" seq=%u (%u) ack=%u len=%u ackskew=%d " "pkts=%llu:%llu dir=%s,%s\n", seq, orig_seq, ack, pd->p_len, ackskew, (unsigned long long)(*state)->packets[0], (unsigned long long)(*state)->packets[1], pd->dir == PF_IN ? "in" : "out", pd->dir == (*state)->direction ? "fwd" : "rev"); } if (dst->scrub || src->scrub) { if (pf_normalize_tcp_stateful(m, off, pd, reason, th, *state, src, dst, copyback)) return (PF_DROP); } /* update max window */ if (src->max_win < win) src->max_win = win; /* synchronize sequencing */ if (SEQ_GT(end, src->seqlo)) src->seqlo = end; /* slide the window of what the other end can send */ if (SEQ_GEQ(ack + (win << sws), dst->seqhi)) dst->seqhi = ack + MAX((win << sws), 1); /* * Cannot set dst->seqhi here since this could be a shotgunned * SYN and not an already established connection. */ if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* Fall through to PASS packet */ } else { if ((*state)->dst.state == TCPS_SYN_SENT && (*state)->src.state == TCPS_SYN_SENT) { /* Send RST for state mismatches during handshake */ if (!(th->th_flags & TH_RST)) pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), 0, TH_RST, 0, 0, (*state)->rule.ptr->return_ttl, 1, 0, kif->pfik_ifp); src->seqlo = 0; src->seqhi = 1; src->max_win = 1; } else if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD state: "); pf_print_state(*state); pf_print_flags(th->th_flags); printf(" seq=%u (%u) ack=%u len=%u ackskew=%d " "pkts=%llu:%llu dir=%s,%s\n", seq, orig_seq, ack, pd->p_len, ackskew, (unsigned long long)(*state)->packets[0], (unsigned long long)(*state)->packets[1], pd->dir == PF_IN ? "in" : "out", pd->dir == (*state)->direction ? "fwd" : "rev"); printf("pf: State failure on: %c %c %c %c | %c %c\n", SEQ_GEQ(src->seqhi, end) ? ' ' : '1', SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) ? ' ': '2', (ackskew >= -MAXACKWINDOW) ? ' ' : '3', (ackskew <= (MAXACKWINDOW << sws)) ? ' ' : '4', SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) ?' ' :'5', SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW) ?' ' :'6'); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } return (PF_PASS); } static int pf_tcp_track_sloppy(struct pf_state_peer *src, struct pf_state_peer *dst, struct pf_state **state, struct pf_pdesc *pd, u_short *reason) { struct tcphdr *th = pd->hdr.tcp; if (th->th_flags & TH_SYN) if (src->state < TCPS_SYN_SENT) src->state = TCPS_SYN_SENT; if (th->th_flags & TH_FIN) if (src->state < TCPS_CLOSING) src->state = TCPS_CLOSING; if (th->th_flags & TH_ACK) { if (dst->state == TCPS_SYN_SENT) { dst->state = TCPS_ESTABLISHED; if (src->state == TCPS_ESTABLISHED && (*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (dst->state == TCPS_CLOSING) { dst->state = TCPS_FIN_WAIT_2; } else if (src->state == TCPS_SYN_SENT && dst->state < TCPS_SYN_SENT) { /* * Handle a special sloppy case where we only see one * half of the connection. If there is a ACK after * the initial SYN without ever seeing a packet from * the destination, set the connection to established. */ dst->state = src->state = TCPS_ESTABLISHED; if ((*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } } else if (src->state == TCPS_CLOSING && dst->state == TCPS_ESTABLISHED && dst->seqlo == 0) { /* * Handle the closing of half connections where we * don't see the full bidirectional FIN/ACK+ACK * handshake. */ dst->state = TCPS_CLOSING; } } if (th->th_flags & TH_RST) src->state = dst->state = TCPS_TIME_WAIT; /* update expire time */ (*state)->expire = time_uptime; if (src->state >= TCPS_FIN_WAIT_2 && dst->state >= TCPS_FIN_WAIT_2) (*state)->timeout = PFTM_TCP_CLOSED; else if (src->state >= TCPS_CLOSING && dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_FIN_WAIT; else if (src->state < TCPS_ESTABLISHED || dst->state < TCPS_ESTABLISHED) (*state)->timeout = PFTM_TCP_OPENING; else if (src->state >= TCPS_CLOSING || dst->state >= TCPS_CLOSING) (*state)->timeout = PFTM_TCP_CLOSING; else (*state)->timeout = PFTM_TCP_ESTABLISHED; return (PF_PASS); } static int pf_test_state_tcp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason) { struct pf_state_key_cmp key; struct tcphdr *th = pd->hdr.tcp; int copyback = 0; struct pf_state_peer *src, *dst; struct pf_state_key *sk; bzero(&key, sizeof(key)); key.af = pd->af; key.proto = IPPROTO_TCP; if (direction == PF_IN) { /* wire side, straight */ PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); key.port[0] = th->th_sport; key.port[1] = th->th_dport; } else { /* stack side, reverse */ PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); key.port[1] = th->th_sport; key.port[0] = th->th_dport; } STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } sk = (*state)->key[pd->didx]; if ((*state)->src.state == PF_TCPS_PROXY_SRC) { if (direction != (*state)->direction) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } if (th->th_flags & TH_SYN) { if (ntohl(th->th_seq) != (*state)->src.seqlo) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, (*state)->src.seqhi, ntohl(th->th_seq) + 1, TH_SYN|TH_ACK, 0, (*state)->src.mss, 0, 1, 0, NULL); REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } else if ((th->th_flags & (TH_ACK|TH_RST|TH_FIN)) != TH_ACK || (ntohl(th->th_ack) != (*state)->src.seqhi + 1) || (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } else if ((*state)->src_node != NULL && pf_src_connlimit(state)) { REASON_SET(reason, PFRES_SRCLIMIT); return (PF_DROP); } else (*state)->src.state = PF_TCPS_PROXY_DST; } if ((*state)->src.state == PF_TCPS_PROXY_DST) { if (direction == (*state)->direction) { if (((th->th_flags & (TH_SYN|TH_ACK)) != TH_ACK) || (ntohl(th->th_ack) != (*state)->src.seqhi + 1) || (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } (*state)->src.max_win = MAX(ntohs(th->th_win), 1); if ((*state)->dst.seqhi == 1) (*state)->dst.seqhi = htonl(arc4random()); pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, &sk->addr[pd->sidx], &sk->addr[pd->didx], sk->port[pd->sidx], sk->port[pd->didx], (*state)->dst.seqhi, 0, TH_SYN, 0, (*state)->src.mss, 0, 0, (*state)->tag, NULL); REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } else if (((th->th_flags & (TH_SYN|TH_ACK)) != (TH_SYN|TH_ACK)) || (ntohl(th->th_ack) != (*state)->dst.seqhi + 1)) { REASON_SET(reason, PFRES_SYNPROXY); return (PF_DROP); } else { (*state)->dst.max_win = MAX(ntohs(th->th_win), 1); (*state)->dst.seqlo = ntohl(th->th_seq); pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst, pd->src, th->th_dport, th->th_sport, ntohl(th->th_ack), ntohl(th->th_seq) + 1, TH_ACK, (*state)->src.max_win, 0, 0, 0, (*state)->tag, NULL); pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, &sk->addr[pd->sidx], &sk->addr[pd->didx], sk->port[pd->sidx], sk->port[pd->didx], (*state)->src.seqhi + 1, (*state)->src.seqlo + 1, TH_ACK, (*state)->dst.max_win, 0, 0, 1, 0, NULL); (*state)->src.seqdiff = (*state)->dst.seqhi - (*state)->src.seqlo; (*state)->dst.seqdiff = (*state)->src.seqhi - (*state)->dst.seqlo; (*state)->src.seqhi = (*state)->src.seqlo + (*state)->dst.max_win; (*state)->dst.seqhi = (*state)->dst.seqlo + (*state)->src.max_win; (*state)->src.wscale = (*state)->dst.wscale = 0; (*state)->src.state = (*state)->dst.state = TCPS_ESTABLISHED; REASON_SET(reason, PFRES_SYNPROXY); return (PF_SYNPROXY_DROP); } } if (((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN) && dst->state >= TCPS_FIN_WAIT_2 && src->state >= TCPS_FIN_WAIT_2) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: state reuse "); pf_print_state(*state); pf_print_flags(th->th_flags); printf("\n"); } /* XXX make sure it's the same direction ?? */ (*state)->src.state = (*state)->dst.state = TCPS_CLOSED; pf_unlink_state(*state, PF_ENTER_LOCKED); *state = NULL; return (PF_DROP); } if ((*state)->state_flags & PFSTATE_SLOPPY) { if (pf_tcp_track_sloppy(src, dst, state, pd, reason) == PF_DROP) return (PF_DROP); } else { if (pf_tcp_track_full(src, dst, state, kif, m, off, pd, reason, ©back) == PF_DROP) return (PF_DROP); } /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], pd->af) || nk->port[pd->sidx] != th->th_sport) pf_change_ap(m, pd->src, &th->th_sport, pd->ip_sum, &th->th_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 0, pd->af); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], pd->af) || nk->port[pd->didx] != th->th_dport) pf_change_ap(m, pd->dst, &th->th_dport, pd->ip_sum, &th->th_sum, &nk->addr[pd->didx], nk->port[pd->didx], 0, pd->af); copyback = 1; } /* Copyback sequence modulation or stateful scrub changes if needed */ if (copyback) m_copyback(m, off, sizeof(*th), (caddr_t)th); return (PF_PASS); } static int pf_test_state_udp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd) { struct pf_state_peer *src, *dst; struct pf_state_key_cmp key; struct udphdr *uh = pd->hdr.udp; bzero(&key, sizeof(key)); key.af = pd->af; key.proto = IPPROTO_UDP; if (direction == PF_IN) { /* wire side, straight */ PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); key.port[0] = uh->uh_sport; key.port[1] = uh->uh_dport; } else { /* stack side, reverse */ PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); key.port[1] = uh->uh_sport; key.port[0] = uh->uh_dport; } STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } /* update states */ if (src->state < PFUDPS_SINGLE) src->state = PFUDPS_SINGLE; if (dst->state == PFUDPS_SINGLE) dst->state = PFUDPS_MULTIPLE; /* update expire time */ (*state)->expire = time_uptime; if (src->state == PFUDPS_MULTIPLE && dst->state == PFUDPS_MULTIPLE) (*state)->timeout = PFTM_UDP_MULTIPLE; else (*state)->timeout = PFTM_UDP_SINGLE; /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], pd->af) || nk->port[pd->sidx] != uh->uh_sport) pf_change_ap(m, pd->src, &uh->uh_sport, pd->ip_sum, &uh->uh_sum, &nk->addr[pd->sidx], nk->port[pd->sidx], 1, pd->af); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], pd->af) || nk->port[pd->didx] != uh->uh_dport) pf_change_ap(m, pd->dst, &uh->uh_dport, pd->ip_sum, &uh->uh_sum, &nk->addr[pd->didx], nk->port[pd->didx], 1, pd->af); m_copyback(m, off, sizeof(*uh), (caddr_t)uh); } return (PF_PASS); } static int pf_test_state_icmp(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason) { struct pf_addr *saddr = pd->src, *daddr = pd->dst; u_int16_t icmpid = 0, *icmpsum; u_int8_t icmptype, icmpcode; int state_icmp = 0; struct pf_state_key_cmp key; bzero(&key, sizeof(key)); switch (pd->proto) { #ifdef INET case IPPROTO_ICMP: icmptype = pd->hdr.icmp->icmp_type; icmpcode = pd->hdr.icmp->icmp_code; icmpid = pd->hdr.icmp->icmp_id; icmpsum = &pd->hdr.icmp->icmp_cksum; if (icmptype == ICMP_UNREACH || icmptype == ICMP_SOURCEQUENCH || icmptype == ICMP_REDIRECT || icmptype == ICMP_TIMXCEED || icmptype == ICMP_PARAMPROB) state_icmp++; break; #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: icmptype = pd->hdr.icmp6->icmp6_type; icmpcode = pd->hdr.icmp6->icmp6_code; icmpid = pd->hdr.icmp6->icmp6_id; icmpsum = &pd->hdr.icmp6->icmp6_cksum; if (icmptype == ICMP6_DST_UNREACH || icmptype == ICMP6_PACKET_TOO_BIG || icmptype == ICMP6_TIME_EXCEEDED || icmptype == ICMP6_PARAM_PROB) state_icmp++; break; #endif /* INET6 */ } if (!state_icmp) { /* * ICMP query/reply message not related to a TCP/UDP packet. * Search for an ICMP state. */ key.af = pd->af; key.proto = pd->proto; key.port[0] = key.port[1] = icmpid; if (direction == PF_IN) { /* wire side, straight */ PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); } else { /* stack side, reverse */ PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); } STATE_LOOKUP(kif, &key, direction, *state, pd); (*state)->expire = time_uptime; (*state)->timeout = PFTM_ICMP_ERROR_REPLY; /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; switch (pd->af) { #ifdef INET case AF_INET: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&saddr->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET)) pf_change_a(&daddr->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); if (nk->port[0] != pd->hdr.icmp->icmp_id) { pd->hdr.icmp->icmp_cksum = pf_cksum_fixup( pd->hdr.icmp->icmp_cksum, icmpid, nk->port[pd->sidx], 0); pd->hdr.icmp->icmp_id = nk->port[pd->sidx]; } m_copyback(m, off, ICMP_MINLEN, (caddr_t )pd->hdr.icmp); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET6)) pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->sidx], 0); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET6)) pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum, &nk->addr[pd->didx], 0); m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); break; #endif /* INET6 */ } } return (PF_PASS); } else { /* * ICMP error message in response to a TCP/UDP packet. * Extract the inner TCP/UDP header and search for that state. */ struct pf_pdesc pd2; bzero(&pd2, sizeof pd2); #ifdef INET struct ip h2; #endif /* INET */ #ifdef INET6 struct ip6_hdr h2_6; int terminal = 0; #endif /* INET6 */ int ipoff2 = 0; int off2 = 0; pd2.af = pd->af; /* Payload packet is from the opposite direction. */ pd2.sidx = (direction == PF_IN) ? 1 : 0; pd2.didx = (direction == PF_IN) ? 0 : 1; switch (pd->af) { #ifdef INET case AF_INET: /* offset of h2 in mbuf chain */ ipoff2 = off + ICMP_MINLEN; if (!pf_pull_hdr(m, ipoff2, &h2, sizeof(h2), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(ip)\n")); return (PF_DROP); } /* * ICMP error messages don't refer to non-first * fragments */ if (h2.ip_off & htons(IP_OFFMASK)) { REASON_SET(reason, PFRES_FRAG); return (PF_DROP); } /* offset of protocol header that follows h2 */ off2 = ipoff2 + (h2.ip_hl << 2); pd2.proto = h2.ip_p; pd2.src = (struct pf_addr *)&h2.ip_src; pd2.dst = (struct pf_addr *)&h2.ip_dst; pd2.ip_sum = &h2.ip_sum; break; #endif /* INET */ #ifdef INET6 case AF_INET6: ipoff2 = off + sizeof(struct icmp6_hdr); if (!pf_pull_hdr(m, ipoff2, &h2_6, sizeof(h2_6), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(ip6)\n")); return (PF_DROP); } pd2.proto = h2_6.ip6_nxt; pd2.src = (struct pf_addr *)&h2_6.ip6_src; pd2.dst = (struct pf_addr *)&h2_6.ip6_dst; pd2.ip_sum = NULL; off2 = ipoff2 + sizeof(h2_6); do { switch (pd2.proto) { case IPPROTO_FRAGMENT: /* * ICMPv6 error messages for * non-first fragments */ REASON_SET(reason, PFRES_FRAG); return (PF_DROP); case IPPROTO_AH: case IPPROTO_HOPOPTS: case IPPROTO_ROUTING: case IPPROTO_DSTOPTS: { /* get next header and header length */ struct ip6_ext opt6; if (!pf_pull_hdr(m, off2, &opt6, sizeof(opt6), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMPv6 short opt\n")); return (PF_DROP); } if (pd2.proto == IPPROTO_AH) off2 += (opt6.ip6e_len + 2) * 4; else off2 += (opt6.ip6e_len + 1) * 8; pd2.proto = opt6.ip6e_nxt; /* goto the next header */ break; } default: terminal++; break; } } while (!terminal); break; #endif /* INET6 */ } if (PF_ANEQ(pd->dst, pd2.src, pd->af)) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD ICMP %d:%d outer dst: ", icmptype, icmpcode); pf_print_host(pd->src, 0, pd->af); printf(" -> "); pf_print_host(pd->dst, 0, pd->af); printf(" inner src: "); pf_print_host(pd2.src, 0, pd2.af); printf(" -> "); pf_print_host(pd2.dst, 0, pd2.af); printf("\n"); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } switch (pd2.proto) { case IPPROTO_TCP: { struct tcphdr th; u_int32_t seq; struct pf_state_peer *src, *dst; u_int8_t dws; int copyback = 0; /* * Only the first 8 bytes of the TCP header can be * expected. Don't access any TCP header fields after * th_seq, an ackskew test is not possible. */ if (!pf_pull_hdr(m, off2, &th, 8, NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(tcp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_TCP; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[pd2.sidx] = th.th_sport; key.port[pd2.didx] = th.th_dport; STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->dst; dst = &(*state)->src; } else { src = &(*state)->src; dst = &(*state)->dst; } if (src->wscale && dst->wscale) dws = dst->wscale & PF_WSCALE_MASK; else dws = 0; /* Demodulate sequence number */ seq = ntohl(th.th_seq) - src->seqdiff; if (src->seqdiff) { pf_change_a(&th.th_seq, icmpsum, htonl(seq), 0); copyback = 1; } if (!((*state)->state_flags & PFSTATE_SLOPPY) && (!SEQ_GEQ(src->seqhi, seq) || !SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)))) { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: BAD ICMP %d:%d ", icmptype, icmpcode); pf_print_host(pd->src, 0, pd->af); printf(" -> "); pf_print_host(pd->dst, 0, pd->af); printf(" state: "); pf_print_state(*state); printf(" seq=%u\n", seq); } REASON_SET(reason, PFRES_BADSTATE); return (PF_DROP); } else { if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("pf: OK ICMP %d:%d ", icmptype, icmpcode); pf_print_host(pd->src, 0, pd->af); printf(" -> "); pf_print_host(pd->dst, 0, pd->af); printf(" state: "); pf_print_state(*state); printf(" seq=%u\n", seq); } } /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != th.th_sport) pf_change_icmp(pd2.src, &th.th_sport, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != th.th_dport) pf_change_icmp(pd2.dst, &th.th_dport, saddr, &nk->addr[pd2.didx], nk->port[pd2.didx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); copyback = 1; } if (copyback) { switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t )pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t )&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t )&h2_6); break; #endif /* INET6 */ } m_copyback(m, off2, 8, (caddr_t)&th); } return (PF_PASS); break; } case IPPROTO_UDP: { struct udphdr uh; if (!pf_pull_hdr(m, off2, &uh, sizeof(uh), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(udp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_UDP; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[pd2.sidx] = uh.uh_sport; key.port[pd2.didx] = uh.uh_dport; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != uh.uh_sport) pf_change_icmp(pd2.src, &uh.uh_sport, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], &uh.uh_sum, pd2.ip_sum, icmpsum, pd->ip_sum, 1, pd2.af); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != uh.uh_dport) pf_change_icmp(pd2.dst, &uh.uh_dport, saddr, &nk->addr[pd2.didx], nk->port[pd2.didx], &uh.uh_sum, pd2.ip_sum, icmpsum, pd->ip_sum, 1, pd2.af); switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t )pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t )&h2_6); break; #endif /* INET6 */ } m_copyback(m, off2, sizeof(uh), (caddr_t)&uh); } return (PF_PASS); break; } #ifdef INET case IPPROTO_ICMP: { struct icmp iih; if (!pf_pull_hdr(m, off2, &iih, ICMP_MINLEN, NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short i" "(icmp)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_ICMP; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[0] = key.port[1] = iih.icmp_id; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != iih.icmp_id) pf_change_icmp(pd2.src, &iih.icmp_id, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != iih.icmp_id) pf_change_icmp(pd2.dst, &iih.icmp_id, saddr, &nk->addr[pd2.didx], nk->port[pd2.didx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET); m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); m_copyback(m, off2, ICMP_MINLEN, (caddr_t)&iih); } return (PF_PASS); break; } #endif /* INET */ #ifdef INET6 case IPPROTO_ICMPV6: { struct icmp6_hdr iih; if (!pf_pull_hdr(m, off2, &iih, sizeof(struct icmp6_hdr), NULL, reason, pd2.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: ICMP error message too short " "(icmp6)\n")); return (PF_DROP); } key.af = pd2.af; key.proto = IPPROTO_ICMPV6; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[0] = key.port[1] = iih.icmp6_id; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af) || nk->port[pd2.sidx] != iih.icmp6_id) pf_change_icmp(pd2.src, &iih.icmp6_id, daddr, &nk->addr[pd2.sidx], nk->port[pd2.sidx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET6); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af) || nk->port[pd2.didx] != iih.icmp6_id) pf_change_icmp(pd2.dst, &iih.icmp6_id, saddr, &nk->addr[pd2.didx], nk->port[pd2.didx], NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, AF_INET6); m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t)pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6); m_copyback(m, off2, sizeof(struct icmp6_hdr), (caddr_t)&iih); } return (PF_PASS); break; } #endif /* INET6 */ default: { key.af = pd2.af; key.proto = pd2.proto; PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af); PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af); key.port[0] = key.port[1] = 0; STATE_LOOKUP(kif, &key, direction, *state, pd); /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; if (PF_ANEQ(pd2.src, &nk->addr[pd2.sidx], pd2.af)) pf_change_icmp(pd2.src, NULL, daddr, &nk->addr[pd2.sidx], 0, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); if (PF_ANEQ(pd2.dst, &nk->addr[pd2.didx], pd2.af)) pf_change_icmp(pd2.dst, NULL, saddr, &nk->addr[pd2.didx], 0, NULL, pd2.ip_sum, icmpsum, pd->ip_sum, 0, pd2.af); switch (pd2.af) { #ifdef INET case AF_INET: m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp); m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2); break; #endif /* INET */ #ifdef INET6 case AF_INET6: m_copyback(m, off, sizeof(struct icmp6_hdr), (caddr_t )pd->hdr.icmp6); m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t )&h2_6); break; #endif /* INET6 */ } } return (PF_PASS); break; } } } } static int pf_test_state_other(struct pf_state **state, int direction, struct pfi_kif *kif, struct mbuf *m, struct pf_pdesc *pd) { struct pf_state_peer *src, *dst; struct pf_state_key_cmp key; bzero(&key, sizeof(key)); key.af = pd->af; key.proto = pd->proto; if (direction == PF_IN) { PF_ACPY(&key.addr[0], pd->src, key.af); PF_ACPY(&key.addr[1], pd->dst, key.af); key.port[0] = key.port[1] = 0; } else { PF_ACPY(&key.addr[1], pd->src, key.af); PF_ACPY(&key.addr[0], pd->dst, key.af); key.port[1] = key.port[0] = 0; } STATE_LOOKUP(kif, &key, direction, *state, pd); if (direction == (*state)->direction) { src = &(*state)->src; dst = &(*state)->dst; } else { src = &(*state)->dst; dst = &(*state)->src; } /* update states */ if (src->state < PFOTHERS_SINGLE) src->state = PFOTHERS_SINGLE; if (dst->state == PFOTHERS_SINGLE) dst->state = PFOTHERS_MULTIPLE; /* update expire time */ (*state)->expire = time_uptime; if (src->state == PFOTHERS_MULTIPLE && dst->state == PFOTHERS_MULTIPLE) (*state)->timeout = PFTM_OTHER_MULTIPLE; else (*state)->timeout = PFTM_OTHER_SINGLE; /* translate source/destination address, if necessary */ if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) { struct pf_state_key *nk = (*state)->key[pd->didx]; KASSERT(nk, ("%s: nk is null", __func__)); KASSERT(pd, ("%s: pd is null", __func__)); KASSERT(pd->src, ("%s: pd->src is null", __func__)); KASSERT(pd->dst, ("%s: pd->dst is null", __func__)); switch (pd->af) { #ifdef INET case AF_INET: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET)) pf_change_a(&pd->src->v4.s_addr, pd->ip_sum, nk->addr[pd->sidx].v4.s_addr, 0); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET)) pf_change_a(&pd->dst->v4.s_addr, pd->ip_sum, nk->addr[pd->didx].v4.s_addr, 0); break; #endif /* INET */ #ifdef INET6 case AF_INET6: if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET)) PF_ACPY(pd->src, &nk->addr[pd->sidx], pd->af); if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET)) PF_ACPY(pd->dst, &nk->addr[pd->didx], pd->af); #endif /* INET6 */ } } return (PF_PASS); } /* * ipoff and off are measured from the start of the mbuf chain. * h must be at "ipoff" on the mbuf chain. */ void * pf_pull_hdr(struct mbuf *m, int off, void *p, int len, u_short *actionp, u_short *reasonp, sa_family_t af) { switch (af) { #ifdef INET case AF_INET: { struct ip *h = mtod(m, struct ip *); u_int16_t fragoff = (ntohs(h->ip_off) & IP_OFFMASK) << 3; if (fragoff) { if (fragoff >= len) ACTION_SET(actionp, PF_PASS); else { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_FRAG); } return (NULL); } if (m->m_pkthdr.len < off + len || ntohs(h->ip_len) < off + len) { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_SHORT); return (NULL); } break; } #endif /* INET */ #ifdef INET6 case AF_INET6: { struct ip6_hdr *h = mtod(m, struct ip6_hdr *); if (m->m_pkthdr.len < off + len || (ntohs(h->ip6_plen) + sizeof(struct ip6_hdr)) < (unsigned)(off + len)) { ACTION_SET(actionp, PF_DROP); REASON_SET(reasonp, PFRES_SHORT); return (NULL); } break; } #endif /* INET6 */ } m_copydata(m, off, len, p); return (p); } #ifdef RADIX_MPATH static int pf_routable_oldmpath(struct pf_addr *addr, sa_family_t af, struct pfi_kif *kif, int rtableid) { struct radix_node_head *rnh; struct sockaddr_in *dst; int ret = 1; int check_mpath; #ifdef INET6 struct sockaddr_in6 *dst6; struct route_in6 ro; #else struct route ro; #endif struct radix_node *rn; struct rtentry *rt; struct ifnet *ifp; check_mpath = 0; /* XXX: stick to table 0 for now */ rnh = rt_tables_get_rnh(0, af); if (rnh != NULL && rn_mpath_capable(rnh)) check_mpath = 1; bzero(&ro, sizeof(ro)); switch (af) { case AF_INET: dst = satosin(&ro.ro_dst); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = addr->v4; break; #ifdef INET6 case AF_INET6: /* * Skip check for addresses with embedded interface scope, * as they would always match anyway. */ if (IN6_IS_SCOPE_EMBED(&addr->v6)) goto out; dst6 = (struct sockaddr_in6 *)&ro.ro_dst; dst6->sin6_family = AF_INET6; dst6->sin6_len = sizeof(*dst6); dst6->sin6_addr = addr->v6; break; #endif /* INET6 */ default: return (0); } /* Skip checks for ipsec interfaces */ if (kif != NULL && kif->pfik_ifp->if_type == IFT_ENC) goto out; switch (af) { #ifdef INET6 case AF_INET6: in6_rtalloc_ign(&ro, 0, rtableid); break; #endif #ifdef INET case AF_INET: in_rtalloc_ign((struct route *)&ro, 0, rtableid); break; #endif } if (ro.ro_rt != NULL) { /* No interface given, this is a no-route check */ if (kif == NULL) goto out; if (kif->pfik_ifp == NULL) { ret = 0; goto out; } /* Perform uRPF check if passed input interface */ ret = 0; rn = (struct radix_node *)ro.ro_rt; do { rt = (struct rtentry *)rn; ifp = rt->rt_ifp; if (kif->pfik_ifp == ifp) ret = 1; rn = rn_mpath_next(rn); } while (check_mpath == 1 && rn != NULL && ret == 0); } else ret = 0; out: if (ro.ro_rt != NULL) RTFREE(ro.ro_rt); return (ret); } #endif int pf_routable(struct pf_addr *addr, sa_family_t af, struct pfi_kif *kif, int rtableid) { #ifdef INET struct nhop4_basic nh4; #endif #ifdef INET6 struct nhop6_basic nh6; #endif struct ifnet *ifp; #ifdef RADIX_MPATH struct radix_node_head *rnh; /* XXX: stick to table 0 for now */ rnh = rt_tables_get_rnh(0, af); if (rnh != NULL && rn_mpath_capable(rnh)) return (pf_routable_oldmpath(addr, af, kif, rtableid)); #endif /* * Skip check for addresses with embedded interface scope, * as they would always match anyway. */ if (af == AF_INET6 && IN6_IS_SCOPE_EMBED(&addr->v6)) return (1); if (af != AF_INET && af != AF_INET6) return (0); /* Skip checks for ipsec interfaces */ if (kif != NULL && kif->pfik_ifp->if_type == IFT_ENC) return (1); ifp = NULL; switch (af) { #ifdef INET6 case AF_INET6: if (fib6_lookup_nh_basic(rtableid, &addr->v6, 0, 0, 0, &nh6)!=0) return (0); ifp = nh6.nh_ifp; break; #endif #ifdef INET case AF_INET: if (fib4_lookup_nh_basic(rtableid, addr->v4, 0, 0, &nh4) != 0) return (0); ifp = nh4.nh_ifp; break; #endif } /* No interface given, this is a no-route check */ if (kif == NULL) return (1); if (kif->pfik_ifp == NULL) return (0); /* Perform uRPF check if passed input interface */ if (kif->pfik_ifp == ifp) return (1); return (0); } #ifdef INET static void pf_route(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp, struct pf_state *s, struct pf_pdesc *pd, struct inpcb *inp) { struct mbuf *m0, *m1; struct sockaddr_in dst; struct ip *ip; struct ifnet *ifp = NULL; struct pf_addr naddr; struct pf_src_node *sn = NULL; int error = 0; uint16_t ip_len, ip_off; KASSERT(m && *m && r && oifp, ("%s: invalid parameters", __func__)); KASSERT(dir == PF_IN || dir == PF_OUT, ("%s: invalid direction", __func__)); if ((pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) || pd->pf_mtag->routed++ > 3) { m0 = *m; *m = NULL; goto bad_locked; } if (r->rt == PF_DUPTO) { if ((m0 = m_dup(*m, M_NOWAIT)) == NULL) { if (s) PF_STATE_UNLOCK(s); return; } } else { if ((r->rt == PF_REPLYTO) == (r->direction == dir)) { if (s) PF_STATE_UNLOCK(s); return; } m0 = *m; } ip = mtod(m0, struct ip *); bzero(&dst, sizeof(dst)); dst.sin_family = AF_INET; dst.sin_len = sizeof(dst); dst.sin_addr = ip->ip_dst; bzero(&naddr, sizeof(naddr)); if (TAILQ_EMPTY(&r->rpool.list)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: TAILQ_EMPTY(&r->rpool.list)\n", __func__)); goto bad_locked; } if (s == NULL) { pf_map_addr(AF_INET, r, (struct pf_addr *)&ip->ip_src, &naddr, NULL, &sn); if (!PF_AZERO(&naddr, AF_INET)) dst.sin_addr.s_addr = naddr.v4.s_addr; ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL; } else { if (!PF_AZERO(&s->rt_addr, AF_INET)) dst.sin_addr.s_addr = s->rt_addr.v4.s_addr; ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; PF_STATE_UNLOCK(s); } if (ifp == NULL) goto bad; if (oifp != ifp) { if (pf_test(PF_OUT, 0, ifp, &m0, inp) != PF_PASS) goto bad; else if (m0 == NULL) goto done; if (m0->m_len < sizeof(struct ip)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: m0->m_len < sizeof(struct ip)\n", __func__)); goto bad; } ip = mtod(m0, struct ip *); } if (ifp->if_flags & IFF_LOOPBACK) m0->m_flags |= M_SKIP_FIREWALL; ip_len = ntohs(ip->ip_len); ip_off = ntohs(ip->ip_off); /* Copied from FreeBSD 10.0-CURRENT ip_output. */ m0->m_pkthdr.csum_flags |= CSUM_IP; if (m0->m_pkthdr.csum_flags & CSUM_DELAY_DATA & ~ifp->if_hwassist) { in_delayed_cksum(m0); m0->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; } #ifdef SCTP if (m0->m_pkthdr.csum_flags & CSUM_SCTP & ~ifp->if_hwassist) { sctp_delayed_cksum(m0, (uint32_t)(ip->ip_hl << 2)); m0->m_pkthdr.csum_flags &= ~CSUM_SCTP; } #endif /* * If small enough for interface, or the interface will take * care of the fragmentation for us, we can just send directly. */ if (ip_len <= ifp->if_mtu || (m0->m_pkthdr.csum_flags & ifp->if_hwassist & CSUM_TSO) != 0) { ip->ip_sum = 0; if (m0->m_pkthdr.csum_flags & CSUM_IP & ~ifp->if_hwassist) { ip->ip_sum = in_cksum(m0, ip->ip_hl << 2); m0->m_pkthdr.csum_flags &= ~CSUM_IP; } m_clrprotoflags(m0); /* Avoid confusing lower layers. */ error = (*ifp->if_output)(ifp, m0, sintosa(&dst), NULL); goto done; } /* Balk when DF bit is set or the interface didn't support TSO. */ if ((ip_off & IP_DF) || (m0->m_pkthdr.csum_flags & CSUM_TSO)) { error = EMSGSIZE; KMOD_IPSTAT_INC(ips_cantfrag); if (r->rt != PF_DUPTO) { icmp_error(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0, ifp->if_mtu); goto done; } else goto bad; } error = ip_fragment(ip, &m0, ifp->if_mtu, ifp->if_hwassist); if (error) goto bad; for (; m0; m0 = m1) { m1 = m0->m_nextpkt; m0->m_nextpkt = NULL; if (error == 0) { m_clrprotoflags(m0); error = (*ifp->if_output)(ifp, m0, sintosa(&dst), NULL); } else m_freem(m0); } if (error == 0) KMOD_IPSTAT_INC(ips_fragmented); done: if (r->rt != PF_DUPTO) *m = NULL; return; bad_locked: if (s) PF_STATE_UNLOCK(s); bad: m_freem(m0); goto done; } #endif /* INET */ #ifdef INET6 static void pf_route6(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp, struct pf_state *s, struct pf_pdesc *pd, struct inpcb *inp) { struct mbuf *m0; struct sockaddr_in6 dst; struct ip6_hdr *ip6; struct ifnet *ifp = NULL; struct pf_addr naddr; struct pf_src_node *sn = NULL; KASSERT(m && *m && r && oifp, ("%s: invalid parameters", __func__)); KASSERT(dir == PF_IN || dir == PF_OUT, ("%s: invalid direction", __func__)); if ((pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) || pd->pf_mtag->routed++ > 3) { m0 = *m; *m = NULL; goto bad_locked; } if (r->rt == PF_DUPTO) { if ((m0 = m_dup(*m, M_NOWAIT)) == NULL) { if (s) PF_STATE_UNLOCK(s); return; } } else { if ((r->rt == PF_REPLYTO) == (r->direction == dir)) { if (s) PF_STATE_UNLOCK(s); return; } m0 = *m; } ip6 = mtod(m0, struct ip6_hdr *); bzero(&dst, sizeof(dst)); dst.sin6_family = AF_INET6; dst.sin6_len = sizeof(dst); dst.sin6_addr = ip6->ip6_dst; bzero(&naddr, sizeof(naddr)); if (TAILQ_EMPTY(&r->rpool.list)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: TAILQ_EMPTY(&r->rpool.list)\n", __func__)); goto bad_locked; } if (s == NULL) { pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src, &naddr, NULL, &sn); if (!PF_AZERO(&naddr, AF_INET6)) PF_ACPY((struct pf_addr *)&dst.sin6_addr, &naddr, AF_INET6); ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL; } else { if (!PF_AZERO(&s->rt_addr, AF_INET6)) PF_ACPY((struct pf_addr *)&dst.sin6_addr, &s->rt_addr, AF_INET6); ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL; } if (s) PF_STATE_UNLOCK(s); if (ifp == NULL) goto bad; if (oifp != ifp) { if (pf_test6(PF_OUT, PFIL_FWD, ifp, &m0, inp) != PF_PASS) goto bad; else if (m0 == NULL) goto done; if (m0->m_len < sizeof(struct ip6_hdr)) { DPFPRINTF(PF_DEBUG_URGENT, ("%s: m0->m_len < sizeof(struct ip6_hdr)\n", __func__)); goto bad; } ip6 = mtod(m0, struct ip6_hdr *); } if (ifp->if_flags & IFF_LOOPBACK) m0->m_flags |= M_SKIP_FIREWALL; if (m0->m_pkthdr.csum_flags & CSUM_DELAY_DATA_IPV6 & ~ifp->if_hwassist) { uint32_t plen = m0->m_pkthdr.len - sizeof(*ip6); in6_delayed_cksum(m0, plen, sizeof(struct ip6_hdr)); m0->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA_IPV6; } /* * If the packet is too large for the outgoing interface, * send back an icmp6 error. */ if (IN6_IS_SCOPE_EMBED(&dst.sin6_addr)) dst.sin6_addr.s6_addr16[1] = htons(ifp->if_index); if ((u_long)m0->m_pkthdr.len <= ifp->if_mtu) nd6_output_ifp(ifp, ifp, m0, &dst, NULL); else { in6_ifstat_inc(ifp, ifs6_in_toobig); if (r->rt != PF_DUPTO) icmp6_error(m0, ICMP6_PACKET_TOO_BIG, 0, ifp->if_mtu); else goto bad; } done: if (r->rt != PF_DUPTO) *m = NULL; return; bad_locked: if (s) PF_STATE_UNLOCK(s); bad: m_freem(m0); goto done; } #endif /* INET6 */ /* * FreeBSD supports cksum offloads for the following drivers. * em(4), fxp(4), lge(4), ndis(4), nge(4), re(4), ti(4), txp(4), xl(4) * * CSUM_DATA_VALID | CSUM_PSEUDO_HDR : * network driver performed cksum including pseudo header, need to verify * csum_data * CSUM_DATA_VALID : * network driver performed cksum, needs to additional pseudo header * cksum computation with partial csum_data(i.e. lack of H/W support for * pseudo header, for instance hme(4), sk(4) and possibly gem(4)) * * After validating the cksum of packet, set both flag CSUM_DATA_VALID and * CSUM_PSEUDO_HDR in order to avoid recomputation of the cksum in upper * TCP/UDP layer. * Also, set csum_data to 0xffff to force cksum validation. */ static int pf_check_proto_cksum(struct mbuf *m, int off, int len, u_int8_t p, sa_family_t af) { u_int16_t sum = 0; int hw_assist = 0; struct ip *ip; if (off < sizeof(struct ip) || len < sizeof(struct udphdr)) return (1); if (m->m_pkthdr.len < off + len) return (1); switch (p) { case IPPROTO_TCP: if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) { if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) { sum = m->m_pkthdr.csum_data; } else { ip = mtod(m, struct ip *); sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htonl((u_short)len + m->m_pkthdr.csum_data + IPPROTO_TCP)); } sum ^= 0xffff; ++hw_assist; } break; case IPPROTO_UDP: if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) { if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) { sum = m->m_pkthdr.csum_data; } else { ip = mtod(m, struct ip *); sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, htonl((u_short)len + m->m_pkthdr.csum_data + IPPROTO_UDP)); } sum ^= 0xffff; ++hw_assist; } break; case IPPROTO_ICMP: #ifdef INET6 case IPPROTO_ICMPV6: #endif /* INET6 */ break; default: return (1); } if (!hw_assist) { switch (af) { case AF_INET: if (p == IPPROTO_ICMP) { if (m->m_len < off) return (1); m->m_data += off; m->m_len -= off; sum = in_cksum(m, len); m->m_data -= off; m->m_len += off; } else { if (m->m_len < sizeof(struct ip)) return (1); sum = in4_cksum(m, p, off, len); } break; #ifdef INET6 case AF_INET6: if (m->m_len < sizeof(struct ip6_hdr)) return (1); sum = in6_cksum(m, p, off, len); break; #endif /* INET6 */ default: return (1); } } if (sum) { switch (p) { case IPPROTO_TCP: { KMOD_TCPSTAT_INC(tcps_rcvbadsum); break; } case IPPROTO_UDP: { KMOD_UDPSTAT_INC(udps_badsum); break; } #ifdef INET case IPPROTO_ICMP: { KMOD_ICMPSTAT_INC(icps_checksum); break; } #endif #ifdef INET6 case IPPROTO_ICMPV6: { KMOD_ICMP6STAT_INC(icp6s_checksum); break; } #endif /* INET6 */ } return (1); } else { if (p == IPPROTO_TCP || p == IPPROTO_UDP) { m->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m->m_pkthdr.csum_data = 0xffff; } } return (0); } #ifdef INET int pf_test(int dir, int pflags, struct ifnet *ifp, struct mbuf **m0, struct inpcb *inp) { struct pfi_kif *kif; u_short action, reason = 0, log = 0; struct mbuf *m = *m0; struct ip *h = NULL; struct m_tag *ipfwtag; struct pf_rule *a = NULL, *r = &V_pf_default_rule, *tr, *nr; struct pf_state *s = NULL; struct pf_ruleset *ruleset = NULL; struct pf_pdesc pd; int off, dirndx, pqid = 0; PF_RULES_RLOCK_TRACKER; M_ASSERTPKTHDR(m); if (!V_pf_status.running) return (PF_PASS); memset(&pd, 0, sizeof(pd)); kif = (struct pfi_kif *)ifp->if_pf_kif; if (kif == NULL) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_test: kif == NULL, if_xname %s\n", ifp->if_xname)); return (PF_DROP); } if (kif->pfik_flags & PFI_IFLAG_SKIP) return (PF_PASS); if (m->m_flags & M_SKIP_FIREWALL) return (PF_PASS); pd.pf_mtag = pf_find_mtag(m); PF_RULES_RLOCK(); if (ip_divert_ptr != NULL && ((ipfwtag = m_tag_locate(m, MTAG_IPFW_RULE, 0, NULL)) != NULL)) { struct ipfw_rule_ref *rr = (struct ipfw_rule_ref *)(ipfwtag+1); if (rr->info & IPFW_IS_DIVERT && rr->rulenum == 0) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; goto done; } pd.pf_mtag->flags |= PF_PACKET_LOOPED; m_tag_delete(m, ipfwtag); } if (pd.pf_mtag && pd.pf_mtag->flags & PF_FASTFWD_OURS_PRESENT) { m->m_flags |= M_FASTFWD_OURS; pd.pf_mtag->flags &= ~PF_FASTFWD_OURS_PRESENT; } } else if (pf_normalize_ip(m0, dir, kif, &reason, &pd) != PF_PASS) { /* We do IP header normalization and packet reassembly here */ action = PF_DROP; goto done; } m = *m0; /* pf_normalize messes with m0 */ h = mtod(m, struct ip *); off = h->ip_hl << 2; if (off < (int)sizeof(struct ip)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } pd.src = (struct pf_addr *)&h->ip_src; pd.dst = (struct pf_addr *)&h->ip_dst; pd.sport = pd.dport = NULL; pd.ip_sum = &h->ip_sum; pd.proto_sum = NULL; pd.proto = h->ip_p; pd.dir = dir; pd.sidx = (dir == PF_IN) ? 0 : 1; pd.didx = (dir == PF_IN) ? 1 : 0; pd.af = AF_INET; pd.tos = h->ip_tos & ~IPTOS_ECN_MASK; pd.tot_len = ntohs(h->ip_len); /* handle fragments that didn't get reassembled by normalization */ if (h->ip_off & htons(IP_MF | IP_OFFMASK)) { action = pf_test_fragment(&r, dir, kif, m, h, &pd, &a, &ruleset); goto done; } switch (h->ip_p) { case IPPROTO_TCP: { struct tcphdr th; pd.hdr.tcp = &th; if (!pf_pull_hdr(m, off, &th, sizeof(th), &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } pd.p_len = pd.tot_len - off - (th.th_off << 2); if ((th.th_flags & TH_ACK) && pd.p_len == 0) pqid = 1; action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd); if (action == PF_DROP) goto done; action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_UDP: { struct udphdr uh; pd.hdr.udp = &uh; if (!pf_pull_hdr(m, off, &uh, sizeof(uh), &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } if (uh.uh_dport == 0 || ntohs(uh.uh_ulen) > m->m_pkthdr.len - off || ntohs(uh.uh_ulen) < sizeof(struct udphdr)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); goto done; } action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_ICMP: { struct icmp ih; pd.hdr.icmp = &ih; if (!pf_pull_hdr(m, off, &ih, ICMP_MINLEN, &action, &reason, AF_INET)) { log = action != PF_PASS; goto done; } action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } #ifdef INET6 case IPPROTO_ICMPV6: { action = PF_DROP; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping IPv4 packet with ICMPv6 payload\n")); goto done; } #endif default: action = pf_test_state_other(&s, dir, kif, m, &pd); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } done: PF_RULES_RUNLOCK(); if (action == PF_PASS && h->ip_hl > 5 && !((s && s->state_flags & PFSTATE_ALLOWOPTS) || r->allow_opts)) { action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = r->log; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping packet with ip options\n")); } if (s && s->tag > 0 && pf_tag_packet(m, &pd, s->tag)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } if (r->rtableid >= 0) M_SETFIB(m, r->rtableid); if (r->scrub_flags & PFSTATE_SETPRIO) { if (pd.tos & IPTOS_LOWDELAY) pqid = 1; if (pf_ieee8021q_setpcp(m, r->set_prio[pqid])) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: failed to allocate 802.1q mtag\n")); } } #ifdef ALTQ if (action == PF_PASS && r->qid) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } else { if (s != NULL) pd.pf_mtag->qid_hash = pf_state_hash(s); if (pqid || (pd.tos & IPTOS_LOWDELAY)) pd.pf_mtag->qid = r->pqid; else pd.pf_mtag->qid = r->qid; /* Add hints for ecn. */ pd.pf_mtag->hdr = h; } } #endif /* ALTQ */ /* * connections redirected to loopback should not match sockets * bound specifically to loopback due to security implications, * see tcp_input() and in_pcblookup_listen(). */ if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP || pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL && (s->nat_rule.ptr->action == PF_RDR || s->nat_rule.ptr->action == PF_BINAT) && IN_LOOPBACK(ntohl(pd.dst->v4.s_addr))) m->m_flags |= M_SKIP_FIREWALL; if (action == PF_PASS && r->divert.port && ip_divert_ptr != NULL && !PACKET_LOOPED(&pd)) { ipfwtag = m_tag_alloc(MTAG_IPFW_RULE, 0, sizeof(struct ipfw_rule_ref), M_NOWAIT | M_ZERO); if (ipfwtag != NULL) { ((struct ipfw_rule_ref *)(ipfwtag+1))->info = ntohs(r->divert.port); ((struct ipfw_rule_ref *)(ipfwtag+1))->rulenum = dir; if (s) PF_STATE_UNLOCK(s); m_tag_prepend(m, ipfwtag); if (m->m_flags & M_FASTFWD_OURS) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: failed to allocate tag\n")); } else { pd.pf_mtag->flags |= PF_FASTFWD_OURS_PRESENT; m->m_flags &= ~M_FASTFWD_OURS; } } ip_divert_ptr(*m0, dir == PF_IN); *m0 = NULL; return (action); } else { /* XXX: ipfw has the same behaviour! */ action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: failed to allocate divert tag\n")); } } if (log) { struct pf_rule *lr; if (s != NULL && s->nat_rule.ptr != NULL && s->nat_rule.ptr->log & PF_LOG_ALL) lr = s->nat_rule.ptr; else lr = r; PFLOG_PACKET(kif, m, AF_INET, dir, reason, lr, a, ruleset, &pd, (s == NULL)); } kif->pfik_bytes[0][dir == PF_OUT][action != PF_PASS] += pd.tot_len; kif->pfik_packets[0][dir == PF_OUT][action != PF_PASS]++; if (action == PF_PASS || r->action == PF_DROP) { dirndx = (dir == PF_OUT); r->packets[dirndx]++; r->bytes[dirndx] += pd.tot_len; if (a != NULL) { a->packets[dirndx]++; a->bytes[dirndx] += pd.tot_len; } if (s != NULL) { if (s->nat_rule.ptr != NULL) { s->nat_rule.ptr->packets[dirndx]++; s->nat_rule.ptr->bytes[dirndx] += pd.tot_len; } if (s->src_node != NULL) { s->src_node->packets[dirndx]++; s->src_node->bytes[dirndx] += pd.tot_len; } if (s->nat_src_node != NULL) { s->nat_src_node->packets[dirndx]++; s->nat_src_node->bytes[dirndx] += pd.tot_len; } dirndx = (dir == s->direction) ? 0 : 1; s->packets[dirndx]++; s->bytes[dirndx] += pd.tot_len; } tr = r; nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule; if (nr != NULL && r == &V_pf_default_rule) tr = nr; if (tr->src.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->src.addr.p.tbl, (s == NULL) ? pd.src : &s->key[(s->direction == PF_IN)]-> addr[(s->direction == PF_OUT)], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->src.neg); if (tr->dst.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->dst.addr.p.tbl, (s == NULL) ? pd.dst : &s->key[(s->direction == PF_IN)]-> addr[(s->direction == PF_IN)], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->dst.neg); } switch (action) { case PF_SYNPROXY_DROP: m_freem(*m0); case PF_DEFER: *m0 = NULL; action = PF_PASS; break; case PF_DROP: m_freem(*m0); *m0 = NULL; break; default: /* pf_route() returns unlocked. */ if (r->rt) { pf_route(m0, r, dir, kif->pfik_ifp, s, &pd, inp); return (action); } break; } if (s) PF_STATE_UNLOCK(s); return (action); } #endif /* INET */ #ifdef INET6 int pf_test6(int dir, int pflags, struct ifnet *ifp, struct mbuf **m0, struct inpcb *inp) { struct pfi_kif *kif; u_short action, reason = 0, log = 0; struct mbuf *m = *m0, *n = NULL; struct m_tag *mtag; struct ip6_hdr *h = NULL; struct pf_rule *a = NULL, *r = &V_pf_default_rule, *tr, *nr; struct pf_state *s = NULL; struct pf_ruleset *ruleset = NULL; struct pf_pdesc pd; int off, terminal = 0, dirndx, rh_cnt = 0, pqid = 0; PF_RULES_RLOCK_TRACKER; M_ASSERTPKTHDR(m); if (!V_pf_status.running) return (PF_PASS); memset(&pd, 0, sizeof(pd)); pd.pf_mtag = pf_find_mtag(m); if (pd.pf_mtag && pd.pf_mtag->flags & PF_TAG_GENERATED) return (PF_PASS); kif = (struct pfi_kif *)ifp->if_pf_kif; if (kif == NULL) { DPFPRINTF(PF_DEBUG_URGENT, ("pf_test6: kif == NULL, if_xname %s\n", ifp->if_xname)); return (PF_DROP); } if (kif->pfik_flags & PFI_IFLAG_SKIP) return (PF_PASS); if (m->m_flags & M_SKIP_FIREWALL) return (PF_PASS); PF_RULES_RLOCK(); /* We do IP header normalization and packet reassembly here */ if (pf_normalize_ip6(m0, dir, kif, &reason, &pd) != PF_PASS) { action = PF_DROP; goto done; } m = *m0; /* pf_normalize messes with m0 */ h = mtod(m, struct ip6_hdr *); /* * we do not support jumbogram. if we keep going, zero ip6_plen * will do something bad, so drop the packet for now. */ if (htons(h->ip6_plen) == 0) { action = PF_DROP; REASON_SET(&reason, PFRES_NORM); /*XXX*/ goto done; } pd.src = (struct pf_addr *)&h->ip6_src; pd.dst = (struct pf_addr *)&h->ip6_dst; pd.sport = pd.dport = NULL; pd.ip_sum = NULL; pd.proto_sum = NULL; pd.dir = dir; pd.sidx = (dir == PF_IN) ? 0 : 1; pd.didx = (dir == PF_IN) ? 1 : 0; pd.af = AF_INET6; pd.tos = 0; pd.tot_len = ntohs(h->ip6_plen) + sizeof(struct ip6_hdr); off = ((caddr_t)h - m->m_data) + sizeof(struct ip6_hdr); pd.proto = h->ip6_nxt; do { switch (pd.proto) { case IPPROTO_FRAGMENT: action = pf_test_fragment(&r, dir, kif, m, h, &pd, &a, &ruleset); if (action == PF_DROP) REASON_SET(&reason, PFRES_FRAG); goto done; case IPPROTO_ROUTING: { struct ip6_rthdr rthdr; if (rh_cnt++) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 more than one rthdr\n")); action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; goto done; } if (!pf_pull_hdr(m, off, &rthdr, sizeof(rthdr), NULL, &reason, pd.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 short rthdr\n")); action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); log = 1; goto done; } if (rthdr.ip6r_type == IPV6_RTHDR_TYPE_0) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 rthdr0\n")); action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = 1; goto done; } /* FALLTHROUGH */ } case IPPROTO_AH: case IPPROTO_HOPOPTS: case IPPROTO_DSTOPTS: { /* get next header and header length */ struct ip6_ext opt6; if (!pf_pull_hdr(m, off, &opt6, sizeof(opt6), NULL, &reason, pd.af)) { DPFPRINTF(PF_DEBUG_MISC, ("pf: IPv6 short opt\n")); action = PF_DROP; log = 1; goto done; } if (pd.proto == IPPROTO_AH) off += (opt6.ip6e_len + 2) * 4; else off += (opt6.ip6e_len + 1) * 8; pd.proto = opt6.ip6e_nxt; /* goto the next header */ break; } default: terminal++; break; } } while (!terminal); /* if there's no routing header, use unmodified mbuf for checksumming */ if (!n) n = m; switch (pd.proto) { case IPPROTO_TCP: { struct tcphdr th; pd.hdr.tcp = &th; if (!pf_pull_hdr(m, off, &th, sizeof(th), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } pd.p_len = pd.tot_len - off - (th.th_off << 2); action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd); if (action == PF_DROP) goto done; action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_UDP: { struct udphdr uh; pd.hdr.udp = &uh; if (!pf_pull_hdr(m, off, &uh, sizeof(uh), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } if (uh.uh_dport == 0 || ntohs(uh.uh_ulen) > m->m_pkthdr.len - off || ntohs(uh.uh_ulen) < sizeof(struct udphdr)) { action = PF_DROP; REASON_SET(&reason, PFRES_SHORT); goto done; } action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } case IPPROTO_ICMP: { action = PF_DROP; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping IPv6 packet with ICMPv4 payload\n")); goto done; } case IPPROTO_ICMPV6: { struct icmp6_hdr ih; pd.hdr.icmp6 = &ih; if (!pf_pull_hdr(m, off, &ih, sizeof(ih), &action, &reason, AF_INET6)) { log = action != PF_PASS; goto done; } action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd, &reason); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } default: action = pf_test_state_other(&s, dir, kif, m, &pd); if (action == PF_PASS) { if (V_pfsync_update_state_ptr != NULL) V_pfsync_update_state_ptr(s); r = s->rule.ptr; a = s->anchor.ptr; log = s->log; } else if (s == NULL) action = pf_test_rule(&r, &s, dir, kif, m, off, &pd, &a, &ruleset, inp); break; } done: PF_RULES_RUNLOCK(); if (n != m) { m_freem(n); n = NULL; } /* handle dangerous IPv6 extension headers. */ if (action == PF_PASS && rh_cnt && !((s && s->state_flags & PFSTATE_ALLOWOPTS) || r->allow_opts)) { action = PF_DROP; REASON_SET(&reason, PFRES_IPOPTIONS); log = r->log; DPFPRINTF(PF_DEBUG_MISC, ("pf: dropping packet with dangerous v6 headers\n")); } if (s && s->tag > 0 && pf_tag_packet(m, &pd, s->tag)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } if (r->rtableid >= 0) M_SETFIB(m, r->rtableid); if (r->scrub_flags & PFSTATE_SETPRIO) { if (pd.tos & IPTOS_LOWDELAY) pqid = 1; if (pf_ieee8021q_setpcp(m, r->set_prio[pqid])) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); log = 1; DPFPRINTF(PF_DEBUG_MISC, ("pf: failed to allocate 802.1q mtag\n")); } } #ifdef ALTQ if (action == PF_PASS && r->qid) { if (pd.pf_mtag == NULL && ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) { action = PF_DROP; REASON_SET(&reason, PFRES_MEMORY); } else { if (s != NULL) pd.pf_mtag->qid_hash = pf_state_hash(s); if (pd.tos & IPTOS_LOWDELAY) pd.pf_mtag->qid = r->pqid; else pd.pf_mtag->qid = r->qid; /* Add hints for ecn. */ pd.pf_mtag->hdr = h; } } #endif /* ALTQ */ if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP || pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL && (s->nat_rule.ptr->action == PF_RDR || s->nat_rule.ptr->action == PF_BINAT) && IN6_IS_ADDR_LOOPBACK(&pd.dst->v6)) m->m_flags |= M_SKIP_FIREWALL; /* XXX: Anybody working on it?! */ if (r->divert.port) printf("pf: divert(9) is not supported for IPv6\n"); if (log) { struct pf_rule *lr; if (s != NULL && s->nat_rule.ptr != NULL && s->nat_rule.ptr->log & PF_LOG_ALL) lr = s->nat_rule.ptr; else lr = r; PFLOG_PACKET(kif, m, AF_INET6, dir, reason, lr, a, ruleset, &pd, (s == NULL)); } kif->pfik_bytes[1][dir == PF_OUT][action != PF_PASS] += pd.tot_len; kif->pfik_packets[1][dir == PF_OUT][action != PF_PASS]++; if (action == PF_PASS || r->action == PF_DROP) { dirndx = (dir == PF_OUT); r->packets[dirndx]++; r->bytes[dirndx] += pd.tot_len; if (a != NULL) { a->packets[dirndx]++; a->bytes[dirndx] += pd.tot_len; } if (s != NULL) { if (s->nat_rule.ptr != NULL) { s->nat_rule.ptr->packets[dirndx]++; s->nat_rule.ptr->bytes[dirndx] += pd.tot_len; } if (s->src_node != NULL) { s->src_node->packets[dirndx]++; s->src_node->bytes[dirndx] += pd.tot_len; } if (s->nat_src_node != NULL) { s->nat_src_node->packets[dirndx]++; s->nat_src_node->bytes[dirndx] += pd.tot_len; } dirndx = (dir == s->direction) ? 0 : 1; s->packets[dirndx]++; s->bytes[dirndx] += pd.tot_len; } tr = r; nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule; if (nr != NULL && r == &V_pf_default_rule) tr = nr; if (tr->src.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->src.addr.p.tbl, (s == NULL) ? pd.src : &s->key[(s->direction == PF_IN)]->addr[0], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->src.neg); if (tr->dst.addr.type == PF_ADDR_TABLE) pfr_update_stats(tr->dst.addr.p.tbl, (s == NULL) ? pd.dst : &s->key[(s->direction == PF_IN)]->addr[1], pd.af, pd.tot_len, dir == PF_OUT, r->action == PF_PASS, tr->dst.neg); } switch (action) { case PF_SYNPROXY_DROP: m_freem(*m0); case PF_DEFER: *m0 = NULL; action = PF_PASS; break; case PF_DROP: m_freem(*m0); *m0 = NULL; break; default: /* pf_route6() returns unlocked. */ if (r->rt) { pf_route6(m0, r, dir, kif->pfik_ifp, s, &pd, inp); return (action); } break; } if (s) PF_STATE_UNLOCK(s); /* If reassembled packet passed, create new fragments. */ if (action == PF_PASS && *m0 && (pflags & PFIL_FWD) && (mtag = m_tag_find(m, PF_REASSEMBLED, NULL)) != NULL) action = pf_refragment6(ifp, m0, mtag); return (action); } #endif /* INET6 */ Index: projects/clang1000-import/sys/powerpc/booke/pmap.c =================================================================== --- projects/clang1000-import/sys/powerpc/booke/pmap.c (revision 358238) +++ projects/clang1000-import/sys/powerpc/booke/pmap.c (revision 358239) @@ -1,4426 +1,4425 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (C) 2007-2009 Semihalf, Rafal Jaworowski * Copyright (C) 2006 Semihalf, Marian Balakowicz * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Some hw specific parts of this pmap were derived or influenced * by NetBSD's ibm4xx pmap module. More generic code is shared with * a few other pmap modules from the FreeBSD tree. */ /* * VM layout notes: * * Kernel and user threads run within one common virtual address space * defined by AS=0. * * 32-bit pmap: * Virtual address space layout: * ----------------------------- * 0x0000_0000 - 0x7fff_ffff : user process * 0x8000_0000 - 0xbfff_ffff : pmap_mapdev()-ed area (PCI/PCIE etc.) * 0xc000_0000 - 0xc0ff_ffff : kernel reserved * 0xc000_0000 - data_end : kernel code+data, env, metadata etc. * 0xc100_0000 - 0xffff_ffff : KVA * 0xc100_0000 - 0xc100_3fff : reserved for page zero/copy * 0xc100_4000 - 0xc200_3fff : reserved for ptbl bufs * 0xc200_4000 - 0xc200_8fff : guard page + kstack0 * 0xc200_9000 - 0xfeef_ffff : actual free KVA space * * 64-bit pmap: * Virtual address space layout: * ----------------------------- * 0x0000_0000_0000_0000 - 0xbfff_ffff_ffff_ffff : user process * 0x0000_0000_0000_0000 - 0x8fff_ffff_ffff_ffff : text, data, heap, maps, libraries * 0x9000_0000_0000_0000 - 0xafff_ffff_ffff_ffff : mmio region * 0xb000_0000_0000_0000 - 0xbfff_ffff_ffff_ffff : stack * 0xc000_0000_0000_0000 - 0xcfff_ffff_ffff_ffff : kernel reserved * 0xc000_0000_0000_0000 - endkernel-1 : kernel code & data * endkernel - msgbufp-1 : flat device tree * msgbufp - kernel_pdir-1 : message buffer * kernel_pdir - kernel_pp2d-1 : kernel page directory * kernel_pp2d - . : kernel pointers to page directory * pmap_zero_copy_min - crashdumpmap-1 : reserved for page zero/copy * crashdumpmap - ptbl_buf_pool_vabase-1 : reserved for ptbl bufs * ptbl_buf_pool_vabase - virtual_avail-1 : user page directories and page tables * virtual_avail - 0xcfff_ffff_ffff_ffff : actual free KVA space * 0xd000_0000_0000_0000 - 0xdfff_ffff_ffff_ffff : coprocessor region * 0xe000_0000_0000_0000 - 0xefff_ffff_ffff_ffff : mmio region * 0xf000_0000_0000_0000 - 0xffff_ffff_ffff_ffff : direct map * 0xf000_0000_0000_0000 - +Maxmem : physmem map * - 0xffff_ffff_ffff_ffff : device direct map */ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_kstack_pages.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mmu_if.h" #define SPARSE_MAPDEV #ifdef DEBUG #define debugf(fmt, args...) printf(fmt, ##args) #else #define debugf(fmt, args...) #endif #ifdef __powerpc64__ #define PRI0ptrX "016lx" #else #define PRI0ptrX "08x" #endif #define TODO panic("%s: not implemented", __func__); extern unsigned char _etext[]; extern unsigned char _end[]; extern uint32_t *bootinfo; vm_paddr_t kernload; vm_offset_t kernstart; vm_size_t kernsize; /* Message buffer and tables. */ static vm_offset_t data_start; static vm_size_t data_end; /* Phys/avail memory regions. */ static struct mem_region *availmem_regions; static int availmem_regions_sz; static struct mem_region *physmem_regions; static int physmem_regions_sz; #ifndef __powerpc64__ /* Reserved KVA space and mutex for mmu_booke_zero_page. */ static vm_offset_t zero_page_va; static struct mtx zero_page_mutex; /* Reserved KVA space and mutex for mmu_booke_copy_page. */ static vm_offset_t copy_page_src_va; static vm_offset_t copy_page_dst_va; static struct mtx copy_page_mutex; #endif static struct mtx tlbivax_mutex; /**************************************************************************/ /* PMAP */ /**************************************************************************/ static int mmu_booke_enter_locked(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t, u_int flags, int8_t psind); unsigned int kptbl_min; /* Index of the first kernel ptbl. */ unsigned int kernel_ptbls; /* Number of KVA ptbls. */ #ifdef __powerpc64__ unsigned int kernel_pdirs; #endif static uma_zone_t ptbl_root_zone; /* * If user pmap is processed with mmu_booke_remove and the resident count * drops to 0, there are no more pages to remove, so we need not continue. */ #define PMAP_REMOVE_DONE(pmap) \ ((pmap) != kernel_pmap && (pmap)->pm_stats.resident_count == 0) #if defined(COMPAT_FREEBSD32) || !defined(__powerpc64__) extern int elf32_nxstack; #endif /**************************************************************************/ /* TLB and TID handling */ /**************************************************************************/ /* Translation ID busy table */ static volatile pmap_t tidbusy[MAXCPU][TID_MAX + 1]; /* * TLB0 capabilities (entry, way numbers etc.). These can vary between e500 * core revisions and should be read from h/w registers during early config. */ uint32_t tlb0_entries; uint32_t tlb0_ways; uint32_t tlb0_entries_per_way; uint32_t tlb1_entries; #define TLB0_ENTRIES (tlb0_entries) #define TLB0_WAYS (tlb0_ways) #define TLB0_ENTRIES_PER_WAY (tlb0_entries_per_way) #define TLB1_ENTRIES (tlb1_entries) static vm_offset_t tlb1_map_base = (vm_offset_t)VM_MAXUSER_ADDRESS + PAGE_SIZE; static tlbtid_t tid_alloc(struct pmap *); static void tid_flush(tlbtid_t tid); #ifdef DDB #ifdef __powerpc64__ static void tlb_print_entry(int, uint32_t, uint64_t, uint32_t, uint32_t); #else static void tlb_print_entry(int, uint32_t, uint32_t, uint32_t, uint32_t); #endif #endif static void tlb1_read_entry(tlb_entry_t *, unsigned int); static void tlb1_write_entry(tlb_entry_t *, unsigned int); static int tlb1_iomapped(int, vm_paddr_t, vm_size_t, vm_offset_t *); static vm_size_t tlb1_mapin_region(vm_offset_t, vm_paddr_t, vm_size_t, int); static vm_size_t tsize2size(unsigned int); static unsigned int size2tsize(vm_size_t); static unsigned long ilog2(unsigned long); static void set_mas4_defaults(void); static inline void tlb0_flush_entry(vm_offset_t); static inline unsigned int tlb0_tableidx(vm_offset_t, unsigned int); /**************************************************************************/ /* Page table management */ /**************************************************************************/ static struct rwlock_padalign pvh_global_lock; /* Data for the pv entry allocation mechanism */ static uma_zone_t pvzone; static int pv_entry_count = 0, pv_entry_max = 0, pv_entry_high_water = 0; #define PV_ENTRY_ZONE_MIN 2048 /* min pv entries in uma zone */ #ifndef PMAP_SHPGPERPROC #define PMAP_SHPGPERPROC 200 #endif #ifdef __powerpc64__ #define PMAP_ROOT_SIZE (sizeof(pte_t***) * PP2D_NENTRIES) static pte_t *ptbl_alloc(mmu_t, pmap_t, pte_t **, unsigned int, boolean_t); static void ptbl_free(mmu_t, pmap_t, pte_t **, unsigned int, vm_page_t); static void ptbl_hold(mmu_t, pmap_t, pte_t **, unsigned int); static int ptbl_unhold(mmu_t, pmap_t, vm_offset_t); #else #define PMAP_ROOT_SIZE (sizeof(pte_t**) * PDIR_NENTRIES) static void ptbl_init(void); static struct ptbl_buf *ptbl_buf_alloc(void); static void ptbl_buf_free(struct ptbl_buf *); static void ptbl_free_pmap_ptbl(pmap_t, pte_t *); static pte_t *ptbl_alloc(mmu_t, pmap_t, unsigned int, boolean_t); static void ptbl_free(mmu_t, pmap_t, unsigned int); static void ptbl_hold(mmu_t, pmap_t, unsigned int); static int ptbl_unhold(mmu_t, pmap_t, unsigned int); #endif static vm_paddr_t pte_vatopa(mmu_t, pmap_t, vm_offset_t); static int pte_enter(mmu_t, pmap_t, vm_page_t, vm_offset_t, uint32_t, boolean_t); static int pte_remove(mmu_t, pmap_t, vm_offset_t, uint8_t); static pte_t *pte_find(mmu_t, pmap_t, vm_offset_t); static void kernel_pte_alloc(vm_offset_t, vm_offset_t, vm_offset_t); static pv_entry_t pv_alloc(void); static void pv_free(pv_entry_t); static void pv_insert(pmap_t, vm_offset_t, vm_page_t); static void pv_remove(pmap_t, vm_offset_t, vm_page_t); static void booke_pmap_init_qpages(void); struct ptbl_buf { TAILQ_ENTRY(ptbl_buf) link; /* list link */ vm_offset_t kva; /* va of mapping */ }; #ifndef __powerpc64__ /* Number of kva ptbl buffers, each covering one ptbl (PTBL_PAGES). */ #define PTBL_BUFS (128 * 16) /* ptbl free list and a lock used for access synchronization. */ static TAILQ_HEAD(, ptbl_buf) ptbl_buf_freelist; static struct mtx ptbl_buf_freelist_lock; /* Base address of kva space allocated fot ptbl bufs. */ static vm_offset_t ptbl_buf_pool_vabase; /* Pointer to ptbl_buf structures. */ static struct ptbl_buf *ptbl_bufs; #endif #ifdef SMP extern tlb_entry_t __boot_tlb1[]; void pmap_bootstrap_ap(volatile uint32_t *); #endif /* * Kernel MMU interface */ static void mmu_booke_clear_modify(mmu_t, vm_page_t); static void mmu_booke_copy(mmu_t, pmap_t, pmap_t, vm_offset_t, vm_size_t, vm_offset_t); static void mmu_booke_copy_page(mmu_t, vm_page_t, vm_page_t); static void mmu_booke_copy_pages(mmu_t, vm_page_t *, vm_offset_t, vm_page_t *, vm_offset_t, int); static int mmu_booke_enter(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t, u_int flags, int8_t psind); static void mmu_booke_enter_object(mmu_t, pmap_t, vm_offset_t, vm_offset_t, vm_page_t, vm_prot_t); static void mmu_booke_enter_quick(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t); static vm_paddr_t mmu_booke_extract(mmu_t, pmap_t, vm_offset_t); static vm_page_t mmu_booke_extract_and_hold(mmu_t, pmap_t, vm_offset_t, vm_prot_t); static void mmu_booke_init(mmu_t); static boolean_t mmu_booke_is_modified(mmu_t, vm_page_t); static boolean_t mmu_booke_is_prefaultable(mmu_t, pmap_t, vm_offset_t); static boolean_t mmu_booke_is_referenced(mmu_t, vm_page_t); static int mmu_booke_ts_referenced(mmu_t, vm_page_t); static vm_offset_t mmu_booke_map(mmu_t, vm_offset_t *, vm_paddr_t, vm_paddr_t, int); static int mmu_booke_mincore(mmu_t, pmap_t, vm_offset_t, vm_paddr_t *); static void mmu_booke_object_init_pt(mmu_t, pmap_t, vm_offset_t, vm_object_t, vm_pindex_t, vm_size_t); static boolean_t mmu_booke_page_exists_quick(mmu_t, pmap_t, vm_page_t); static void mmu_booke_page_init(mmu_t, vm_page_t); static int mmu_booke_page_wired_mappings(mmu_t, vm_page_t); static void mmu_booke_pinit(mmu_t, pmap_t); static void mmu_booke_pinit0(mmu_t, pmap_t); static void mmu_booke_protect(mmu_t, pmap_t, vm_offset_t, vm_offset_t, vm_prot_t); static void mmu_booke_qenter(mmu_t, vm_offset_t, vm_page_t *, int); static void mmu_booke_qremove(mmu_t, vm_offset_t, int); static void mmu_booke_release(mmu_t, pmap_t); static void mmu_booke_remove(mmu_t, pmap_t, vm_offset_t, vm_offset_t); static void mmu_booke_remove_all(mmu_t, vm_page_t); static void mmu_booke_remove_write(mmu_t, vm_page_t); static void mmu_booke_unwire(mmu_t, pmap_t, vm_offset_t, vm_offset_t); static void mmu_booke_zero_page(mmu_t, vm_page_t); static void mmu_booke_zero_page_area(mmu_t, vm_page_t, int, int); static void mmu_booke_activate(mmu_t, struct thread *); static void mmu_booke_deactivate(mmu_t, struct thread *); static void mmu_booke_bootstrap(mmu_t, vm_offset_t, vm_offset_t); static void *mmu_booke_mapdev(mmu_t, vm_paddr_t, vm_size_t); static void *mmu_booke_mapdev_attr(mmu_t, vm_paddr_t, vm_size_t, vm_memattr_t); static void mmu_booke_unmapdev(mmu_t, vm_offset_t, vm_size_t); static vm_paddr_t mmu_booke_kextract(mmu_t, vm_offset_t); static void mmu_booke_kenter(mmu_t, vm_offset_t, vm_paddr_t); static void mmu_booke_kenter_attr(mmu_t, vm_offset_t, vm_paddr_t, vm_memattr_t); static void mmu_booke_kremove(mmu_t, vm_offset_t); static boolean_t mmu_booke_dev_direct_mapped(mmu_t, vm_paddr_t, vm_size_t); static void mmu_booke_sync_icache(mmu_t, pmap_t, vm_offset_t, vm_size_t); static void mmu_booke_dumpsys_map(mmu_t, vm_paddr_t pa, size_t, void **); static void mmu_booke_dumpsys_unmap(mmu_t, vm_paddr_t pa, size_t, void *); static void mmu_booke_scan_init(mmu_t); static vm_offset_t mmu_booke_quick_enter_page(mmu_t mmu, vm_page_t m); static void mmu_booke_quick_remove_page(mmu_t mmu, vm_offset_t addr); static int mmu_booke_change_attr(mmu_t mmu, vm_offset_t addr, vm_size_t sz, vm_memattr_t mode); static int mmu_booke_map_user_ptr(mmu_t mmu, pmap_t pm, volatile const void *uaddr, void **kaddr, size_t ulen, size_t *klen); static int mmu_booke_decode_kernel_ptr(mmu_t mmu, vm_offset_t addr, int *is_user, vm_offset_t *decoded_addr); static void mmu_booke_page_array_startup(mmu_t , long); static mmu_method_t mmu_booke_methods[] = { /* pmap dispatcher interface */ MMUMETHOD(mmu_clear_modify, mmu_booke_clear_modify), MMUMETHOD(mmu_copy, mmu_booke_copy), MMUMETHOD(mmu_copy_page, mmu_booke_copy_page), MMUMETHOD(mmu_copy_pages, mmu_booke_copy_pages), MMUMETHOD(mmu_enter, mmu_booke_enter), MMUMETHOD(mmu_enter_object, mmu_booke_enter_object), MMUMETHOD(mmu_enter_quick, mmu_booke_enter_quick), MMUMETHOD(mmu_extract, mmu_booke_extract), MMUMETHOD(mmu_extract_and_hold, mmu_booke_extract_and_hold), MMUMETHOD(mmu_init, mmu_booke_init), MMUMETHOD(mmu_is_modified, mmu_booke_is_modified), MMUMETHOD(mmu_is_prefaultable, mmu_booke_is_prefaultable), MMUMETHOD(mmu_is_referenced, mmu_booke_is_referenced), MMUMETHOD(mmu_ts_referenced, mmu_booke_ts_referenced), MMUMETHOD(mmu_map, mmu_booke_map), MMUMETHOD(mmu_mincore, mmu_booke_mincore), MMUMETHOD(mmu_object_init_pt, mmu_booke_object_init_pt), MMUMETHOD(mmu_page_exists_quick,mmu_booke_page_exists_quick), MMUMETHOD(mmu_page_init, mmu_booke_page_init), MMUMETHOD(mmu_page_wired_mappings, mmu_booke_page_wired_mappings), MMUMETHOD(mmu_pinit, mmu_booke_pinit), MMUMETHOD(mmu_pinit0, mmu_booke_pinit0), MMUMETHOD(mmu_protect, mmu_booke_protect), MMUMETHOD(mmu_qenter, mmu_booke_qenter), MMUMETHOD(mmu_qremove, mmu_booke_qremove), MMUMETHOD(mmu_release, mmu_booke_release), MMUMETHOD(mmu_remove, mmu_booke_remove), MMUMETHOD(mmu_remove_all, mmu_booke_remove_all), MMUMETHOD(mmu_remove_write, mmu_booke_remove_write), MMUMETHOD(mmu_sync_icache, mmu_booke_sync_icache), MMUMETHOD(mmu_unwire, mmu_booke_unwire), MMUMETHOD(mmu_zero_page, mmu_booke_zero_page), MMUMETHOD(mmu_zero_page_area, mmu_booke_zero_page_area), MMUMETHOD(mmu_activate, mmu_booke_activate), MMUMETHOD(mmu_deactivate, mmu_booke_deactivate), MMUMETHOD(mmu_quick_enter_page, mmu_booke_quick_enter_page), MMUMETHOD(mmu_quick_remove_page, mmu_booke_quick_remove_page), MMUMETHOD(mmu_page_array_startup, mmu_booke_page_array_startup), /* Internal interfaces */ MMUMETHOD(mmu_bootstrap, mmu_booke_bootstrap), MMUMETHOD(mmu_dev_direct_mapped,mmu_booke_dev_direct_mapped), MMUMETHOD(mmu_mapdev, mmu_booke_mapdev), MMUMETHOD(mmu_mapdev_attr, mmu_booke_mapdev_attr), MMUMETHOD(mmu_kenter, mmu_booke_kenter), MMUMETHOD(mmu_kenter_attr, mmu_booke_kenter_attr), MMUMETHOD(mmu_kextract, mmu_booke_kextract), MMUMETHOD(mmu_kremove, mmu_booke_kremove), MMUMETHOD(mmu_unmapdev, mmu_booke_unmapdev), MMUMETHOD(mmu_change_attr, mmu_booke_change_attr), MMUMETHOD(mmu_map_user_ptr, mmu_booke_map_user_ptr), MMUMETHOD(mmu_decode_kernel_ptr, mmu_booke_decode_kernel_ptr), /* dumpsys() support */ MMUMETHOD(mmu_dumpsys_map, mmu_booke_dumpsys_map), MMUMETHOD(mmu_dumpsys_unmap, mmu_booke_dumpsys_unmap), MMUMETHOD(mmu_scan_init, mmu_booke_scan_init), { 0, 0 } }; MMU_DEF(booke_mmu, MMU_TYPE_BOOKE, mmu_booke_methods, 0); static __inline uint32_t tlb_calc_wimg(vm_paddr_t pa, vm_memattr_t ma) { uint32_t attrib; int i; if (ma != VM_MEMATTR_DEFAULT) { switch (ma) { case VM_MEMATTR_UNCACHEABLE: return (MAS2_I | MAS2_G); case VM_MEMATTR_WRITE_COMBINING: case VM_MEMATTR_WRITE_BACK: case VM_MEMATTR_PREFETCHABLE: return (MAS2_I); case VM_MEMATTR_WRITE_THROUGH: return (MAS2_W | MAS2_M); case VM_MEMATTR_CACHEABLE: return (MAS2_M); } } /* * Assume the page is cache inhibited and access is guarded unless * it's in our available memory array. */ attrib = _TLB_ENTRY_IO; for (i = 0; i < physmem_regions_sz; i++) { if ((pa >= physmem_regions[i].mr_start) && (pa < (physmem_regions[i].mr_start + physmem_regions[i].mr_size))) { attrib = _TLB_ENTRY_MEM; break; } } return (attrib); } static inline void tlb_miss_lock(void) { #ifdef SMP struct pcpu *pc; if (!smp_started) return; STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) { if (pc != pcpup) { CTR3(KTR_PMAP, "%s: tlb miss LOCK of CPU=%d, " "tlb_lock=%p", __func__, pc->pc_cpuid, pc->pc_booke.tlb_lock); KASSERT((pc->pc_cpuid != PCPU_GET(cpuid)), ("tlb_miss_lock: tried to lock self")); tlb_lock(pc->pc_booke.tlb_lock); CTR1(KTR_PMAP, "%s: locked", __func__); } } #endif } static inline void tlb_miss_unlock(void) { #ifdef SMP struct pcpu *pc; if (!smp_started) return; STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) { if (pc != pcpup) { CTR2(KTR_PMAP, "%s: tlb miss UNLOCK of CPU=%d", __func__, pc->pc_cpuid); tlb_unlock(pc->pc_booke.tlb_lock); CTR1(KTR_PMAP, "%s: unlocked", __func__); } } #endif } /* Return number of entries in TLB0. */ static __inline void tlb0_get_tlbconf(void) { uint32_t tlb0_cfg; tlb0_cfg = mfspr(SPR_TLB0CFG); tlb0_entries = tlb0_cfg & TLBCFG_NENTRY_MASK; tlb0_ways = (tlb0_cfg & TLBCFG_ASSOC_MASK) >> TLBCFG_ASSOC_SHIFT; tlb0_entries_per_way = tlb0_entries / tlb0_ways; } /* Return number of entries in TLB1. */ static __inline void tlb1_get_tlbconf(void) { uint32_t tlb1_cfg; tlb1_cfg = mfspr(SPR_TLB1CFG); tlb1_entries = tlb1_cfg & TLBCFG_NENTRY_MASK; } /**************************************************************************/ /* Page table related */ /**************************************************************************/ #ifdef __powerpc64__ /* Initialize pool of kva ptbl buffers. */ static void ptbl_init(void) { } /* Get a pointer to a PTE in a page table. */ static __inline pte_t * pte_find(mmu_t mmu, pmap_t pmap, vm_offset_t va) { pte_t **pdir; pte_t *ptbl; KASSERT((pmap != NULL), ("pte_find: invalid pmap")); pdir = pmap->pm_pp2d[PP2D_IDX(va)]; if (!pdir) return NULL; ptbl = pdir[PDIR_IDX(va)]; return ((ptbl != NULL) ? &ptbl[PTBL_IDX(va)] : NULL); } /* * allocate a page of pointers to page directories, do not preallocate the * page tables */ static pte_t ** pdir_alloc(mmu_t mmu, pmap_t pmap, unsigned int pp2d_idx, bool nosleep) { vm_page_t m; pte_t **pdir; int req; req = VM_ALLOC_NOOBJ | VM_ALLOC_WIRED; while ((m = vm_page_alloc(NULL, pp2d_idx, req)) == NULL) { PMAP_UNLOCK(pmap); if (nosleep) { return (NULL); } vm_wait(NULL); PMAP_LOCK(pmap); } /* Zero whole ptbl. */ pdir = (pte_t **)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); mmu_booke_zero_page(mmu, m); return (pdir); } /* Free pdir pages and invalidate pdir entry. */ static void pdir_free(mmu_t mmu, pmap_t pmap, unsigned int pp2d_idx, vm_page_t m) { pte_t **pdir; pdir = pmap->pm_pp2d[pp2d_idx]; KASSERT((pdir != NULL), ("pdir_free: null pdir")); pmap->pm_pp2d[pp2d_idx] = NULL; vm_wire_sub(1); vm_page_free_zero(m); } /* * Decrement pdir pages hold count and attempt to free pdir pages. Called * when removing directory entry from pdir. * * Return 1 if pdir pages were freed. */ static int pdir_unhold(mmu_t mmu, pmap_t pmap, u_int pp2d_idx) { pte_t **pdir; vm_paddr_t pa; vm_page_t m; KASSERT((pmap != kernel_pmap), ("pdir_unhold: unholding kernel pdir!")); pdir = pmap->pm_pp2d[pp2d_idx]; /* decrement hold count */ pa = DMAP_TO_PHYS((vm_offset_t) pdir); m = PHYS_TO_VM_PAGE(pa); /* * Free pdir page if there are no dir entries in this pdir. */ m->ref_count--; if (m->ref_count == 0) { pdir_free(mmu, pmap, pp2d_idx, m); return (1); } return (0); } /* * Increment hold count for pdir pages. This routine is used when new ptlb * entry is being inserted into pdir. */ static void pdir_hold(mmu_t mmu, pmap_t pmap, pte_t ** pdir) { vm_page_t m; KASSERT((pmap != kernel_pmap), ("pdir_hold: holding kernel pdir!")); KASSERT((pdir != NULL), ("pdir_hold: null pdir")); m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pdir)); m->ref_count++; } /* Allocate page table. */ static pte_t * ptbl_alloc(mmu_t mmu, pmap_t pmap, pte_t ** pdir, unsigned int pdir_idx, boolean_t nosleep) { vm_page_t m; pte_t *ptbl; int req; KASSERT((pdir[pdir_idx] == NULL), ("%s: valid ptbl entry exists!", __func__)); req = VM_ALLOC_NOOBJ | VM_ALLOC_WIRED; while ((m = vm_page_alloc(NULL, pdir_idx, req)) == NULL) { + if (nosleep) + return (NULL); PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); - if (nosleep) { - return (NULL); - } vm_wait(NULL); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); } /* Zero whole ptbl. */ ptbl = (pte_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); mmu_booke_zero_page(mmu, m); return (ptbl); } /* Free ptbl pages and invalidate pdir entry. */ static void ptbl_free(mmu_t mmu, pmap_t pmap, pte_t ** pdir, unsigned int pdir_idx, vm_page_t m) { pte_t *ptbl; ptbl = pdir[pdir_idx]; KASSERT((ptbl != NULL), ("ptbl_free: null ptbl")); pdir[pdir_idx] = NULL; vm_wire_sub(1); vm_page_free_zero(m); } /* * Decrement ptbl pages hold count and attempt to free ptbl pages. Called * when removing pte entry from ptbl. * * Return 1 if ptbl pages were freed. */ static int ptbl_unhold(mmu_t mmu, pmap_t pmap, vm_offset_t va) { pte_t *ptbl; vm_page_t m; u_int pp2d_idx; pte_t **pdir; u_int pdir_idx; pp2d_idx = PP2D_IDX(va); pdir_idx = PDIR_IDX(va); KASSERT((pmap != kernel_pmap), ("ptbl_unhold: unholding kernel ptbl!")); pdir = pmap->pm_pp2d[pp2d_idx]; ptbl = pdir[pdir_idx]; /* decrement hold count */ m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t) ptbl)); /* * Free ptbl pages if there are no pte entries in this ptbl. * ref_count has the same value for all ptbl pages, so check the * last page. */ m->ref_count--; if (m->ref_count == 0) { ptbl_free(mmu, pmap, pdir, pdir_idx, m); pdir_unhold(mmu, pmap, pp2d_idx); return (1); } return (0); } /* * Increment hold count for ptbl pages. This routine is used when new pte * entry is being inserted into ptbl. */ static void ptbl_hold(mmu_t mmu, pmap_t pmap, pte_t ** pdir, unsigned int pdir_idx) { pte_t *ptbl; vm_page_t m; KASSERT((pmap != kernel_pmap), ("ptbl_hold: holding kernel ptbl!")); ptbl = pdir[pdir_idx]; KASSERT((ptbl != NULL), ("ptbl_hold: null ptbl")); m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t) ptbl)); m->ref_count++; } #else /* Initialize pool of kva ptbl buffers. */ static void ptbl_init(void) { int i; CTR3(KTR_PMAP, "%s: s (ptbl_bufs = 0x%08x size 0x%08x)", __func__, (uint32_t)ptbl_bufs, sizeof(struct ptbl_buf) * PTBL_BUFS); CTR3(KTR_PMAP, "%s: s (ptbl_buf_pool_vabase = 0x%08x size = 0x%08x)", __func__, ptbl_buf_pool_vabase, PTBL_BUFS * PTBL_PAGES * PAGE_SIZE); mtx_init(&ptbl_buf_freelist_lock, "ptbl bufs lock", NULL, MTX_DEF); TAILQ_INIT(&ptbl_buf_freelist); for (i = 0; i < PTBL_BUFS; i++) { ptbl_bufs[i].kva = ptbl_buf_pool_vabase + i * PTBL_PAGES * PAGE_SIZE; TAILQ_INSERT_TAIL(&ptbl_buf_freelist, &ptbl_bufs[i], link); } } /* Get a ptbl_buf from the freelist. */ static struct ptbl_buf * ptbl_buf_alloc(void) { struct ptbl_buf *buf; mtx_lock(&ptbl_buf_freelist_lock); buf = TAILQ_FIRST(&ptbl_buf_freelist); if (buf != NULL) TAILQ_REMOVE(&ptbl_buf_freelist, buf, link); mtx_unlock(&ptbl_buf_freelist_lock); CTR2(KTR_PMAP, "%s: buf = %p", __func__, buf); return (buf); } /* Return ptbl buff to free pool. */ static void ptbl_buf_free(struct ptbl_buf *buf) { CTR2(KTR_PMAP, "%s: buf = %p", __func__, buf); mtx_lock(&ptbl_buf_freelist_lock); TAILQ_INSERT_TAIL(&ptbl_buf_freelist, buf, link); mtx_unlock(&ptbl_buf_freelist_lock); } /* * Search the list of allocated ptbl bufs and find on list of allocated ptbls */ static void ptbl_free_pmap_ptbl(pmap_t pmap, pte_t *ptbl) { struct ptbl_buf *pbuf; CTR2(KTR_PMAP, "%s: ptbl = %p", __func__, ptbl); PMAP_LOCK_ASSERT(pmap, MA_OWNED); TAILQ_FOREACH(pbuf, &pmap->pm_ptbl_list, link) if (pbuf->kva == (vm_offset_t)ptbl) { /* Remove from pmap ptbl buf list. */ TAILQ_REMOVE(&pmap->pm_ptbl_list, pbuf, link); /* Free corresponding ptbl buf. */ ptbl_buf_free(pbuf); break; } } /* Allocate page table. */ static pte_t * ptbl_alloc(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx, boolean_t nosleep) { vm_page_t mtbl[PTBL_PAGES]; vm_page_t m; struct ptbl_buf *pbuf; unsigned int pidx; pte_t *ptbl; int i, j; CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap, (pmap == kernel_pmap), pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_alloc: invalid pdir_idx")); KASSERT((pmap->pm_pdir[pdir_idx] == NULL), ("pte_alloc: valid ptbl entry exists!")); pbuf = ptbl_buf_alloc(); if (pbuf == NULL) panic("pte_alloc: couldn't alloc kernel virtual memory"); ptbl = (pte_t *)pbuf->kva; CTR2(KTR_PMAP, "%s: ptbl kva = %p", __func__, ptbl); for (i = 0; i < PTBL_PAGES; i++) { pidx = (PTBL_PAGES * pdir_idx) + i; while ((m = vm_page_alloc(NULL, pidx, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { - PMAP_UNLOCK(pmap); - rw_wunlock(&pvh_global_lock); if (nosleep) { ptbl_free_pmap_ptbl(pmap, ptbl); for (j = 0; j < i; j++) vm_page_free(mtbl[j]); vm_wire_sub(i); return (NULL); } + PMAP_UNLOCK(pmap); + rw_wunlock(&pvh_global_lock); vm_wait(NULL); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); } mtbl[i] = m; } /* Map allocated pages into kernel_pmap. */ mmu_booke_qenter(mmu, (vm_offset_t)ptbl, mtbl, PTBL_PAGES); /* Zero whole ptbl. */ bzero((caddr_t)ptbl, PTBL_PAGES * PAGE_SIZE); /* Add pbuf to the pmap ptbl bufs list. */ TAILQ_INSERT_TAIL(&pmap->pm_ptbl_list, pbuf, link); return (ptbl); } /* Free ptbl pages and invalidate pdir entry. */ static void ptbl_free(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { pte_t *ptbl; vm_paddr_t pa; vm_offset_t va; vm_page_t m; int i; CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap, (pmap == kernel_pmap), pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_free: invalid pdir_idx")); ptbl = pmap->pm_pdir[pdir_idx]; CTR2(KTR_PMAP, "%s: ptbl = %p", __func__, ptbl); KASSERT((ptbl != NULL), ("ptbl_free: null ptbl")); /* * Invalidate the pdir entry as soon as possible, so that other CPUs * don't attempt to look up the page tables we are releasing. */ mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); pmap->pm_pdir[pdir_idx] = NULL; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); for (i = 0; i < PTBL_PAGES; i++) { va = ((vm_offset_t)ptbl + (i * PAGE_SIZE)); pa = pte_vatopa(mmu, kernel_pmap, va); m = PHYS_TO_VM_PAGE(pa); vm_page_free_zero(m); vm_wire_sub(1); mmu_booke_kremove(mmu, va); } ptbl_free_pmap_ptbl(pmap, ptbl); } /* * Decrement ptbl pages hold count and attempt to free ptbl pages. * Called when removing pte entry from ptbl. * * Return 1 if ptbl pages were freed. */ static int ptbl_unhold(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { pte_t *ptbl; vm_paddr_t pa; vm_page_t m; int i; CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap, (pmap == kernel_pmap), pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_unhold: invalid pdir_idx")); KASSERT((pmap != kernel_pmap), ("ptbl_unhold: unholding kernel ptbl!")); ptbl = pmap->pm_pdir[pdir_idx]; //debugf("ptbl_unhold: ptbl = 0x%08x\n", (u_int32_t)ptbl); KASSERT(((vm_offset_t)ptbl >= VM_MIN_KERNEL_ADDRESS), ("ptbl_unhold: non kva ptbl")); /* decrement hold count */ for (i = 0; i < PTBL_PAGES; i++) { pa = pte_vatopa(mmu, kernel_pmap, (vm_offset_t)ptbl + (i * PAGE_SIZE)); m = PHYS_TO_VM_PAGE(pa); m->ref_count--; } /* * Free ptbl pages if there are no pte etries in this ptbl. * ref_count has the same value for all ptbl pages, so check the last * page. */ if (m->ref_count == 0) { ptbl_free(mmu, pmap, pdir_idx); //debugf("ptbl_unhold: e (freed ptbl)\n"); return (1); } return (0); } /* * Increment hold count for ptbl pages. This routine is used when a new pte * entry is being inserted into the ptbl. */ static void ptbl_hold(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { vm_paddr_t pa; pte_t *ptbl; vm_page_t m; int i; CTR3(KTR_PMAP, "%s: pmap = %p pdir_idx = %d", __func__, pmap, pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_hold: invalid pdir_idx")); KASSERT((pmap != kernel_pmap), ("ptbl_hold: holding kernel ptbl!")); ptbl = pmap->pm_pdir[pdir_idx]; KASSERT((ptbl != NULL), ("ptbl_hold: null ptbl")); for (i = 0; i < PTBL_PAGES; i++) { pa = pte_vatopa(mmu, kernel_pmap, (vm_offset_t)ptbl + (i * PAGE_SIZE)); m = PHYS_TO_VM_PAGE(pa); m->ref_count++; } } #endif /* Allocate pv_entry structure. */ pv_entry_t pv_alloc(void) { pv_entry_t pv; pv_entry_count++; if (pv_entry_count > pv_entry_high_water) pagedaemon_wakeup(0); /* XXX powerpc NUMA */ pv = uma_zalloc(pvzone, M_NOWAIT); return (pv); } /* Free pv_entry structure. */ static __inline void pv_free(pv_entry_t pve) { pv_entry_count--; uma_zfree(pvzone, pve); } /* Allocate and initialize pv_entry structure. */ static void pv_insert(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pve; //int su = (pmap == kernel_pmap); //debugf("pv_insert: s (su = %d pmap = 0x%08x va = 0x%08x m = 0x%08x)\n", su, // (u_int32_t)pmap, va, (u_int32_t)m); pve = pv_alloc(); if (pve == NULL) panic("pv_insert: no pv entries!"); pve->pv_pmap = pmap; pve->pv_va = va; /* add to pv_list */ PMAP_LOCK_ASSERT(pmap, MA_OWNED); rw_assert(&pvh_global_lock, RA_WLOCKED); TAILQ_INSERT_TAIL(&m->md.pv_list, pve, pv_link); //debugf("pv_insert: e\n"); } /* Destroy pv entry. */ static void pv_remove(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pve; //int su = (pmap == kernel_pmap); //debugf("pv_remove: s (su = %d pmap = 0x%08x va = 0x%08x)\n", su, (u_int32_t)pmap, va); PMAP_LOCK_ASSERT(pmap, MA_OWNED); rw_assert(&pvh_global_lock, RA_WLOCKED); /* find pv entry */ TAILQ_FOREACH(pve, &m->md.pv_list, pv_link) { if ((pmap == pve->pv_pmap) && (va == pve->pv_va)) { /* remove from pv_list */ TAILQ_REMOVE(&m->md.pv_list, pve, pv_link); if (TAILQ_EMPTY(&m->md.pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); /* free pv entry struct */ pv_free(pve); break; } } //debugf("pv_remove: e\n"); } #ifdef __powerpc64__ /* * Clean pte entry, try to free page table page if requested. * * Return 1 if ptbl pages were freed, otherwise return 0. */ static int pte_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, u_int8_t flags) { vm_page_t m; pte_t *pte; pte = pte_find(mmu, pmap, va); KASSERT(pte != NULL, ("%s: NULL pte", __func__)); if (!PTE_ISVALID(pte)) return (0); /* Get vm_page_t for mapped pte. */ m = PHYS_TO_VM_PAGE(PTE_PA(pte)); if (PTE_ISWIRED(pte)) pmap->pm_stats.wired_count--; /* Handle managed entry. */ if (PTE_ISMANAGED(pte)) { /* Handle modified pages. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); /* Referenced pages. */ if (PTE_ISREFERENCED(pte)) vm_page_aflag_set(m, PGA_REFERENCED); /* Remove pv_entry from pv_list. */ pv_remove(pmap, va, m); } else if (pmap == kernel_pmap && m && m->md.pv_tracked) { pv_remove(pmap, va, m); if (TAILQ_EMPTY(&m->md.pv_list)) m->md.pv_tracked = false; } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); pmap->pm_stats.resident_count--; if (flags & PTBL_UNHOLD) { return (ptbl_unhold(mmu, pmap, va)); } return (0); } /* * Insert PTE for a given page and virtual address. */ static int pte_enter(mmu_t mmu, pmap_t pmap, vm_page_t m, vm_offset_t va, uint32_t flags, boolean_t nosleep) { unsigned int pp2d_idx = PP2D_IDX(va); unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); pte_t *ptbl, *pte, pte_tmp; pte_t **pdir; /* Get the page directory pointer. */ pdir = pmap->pm_pp2d[pp2d_idx]; if (pdir == NULL) pdir = pdir_alloc(mmu, pmap, pp2d_idx, nosleep); /* Get the page table pointer. */ ptbl = pdir[pdir_idx]; if (ptbl == NULL) { /* Allocate page table pages. */ ptbl = ptbl_alloc(mmu, pmap, pdir, pdir_idx, nosleep); if (ptbl == NULL) { KASSERT(nosleep, ("nosleep and NULL ptbl")); return (ENOMEM); } pte = &ptbl[ptbl_idx]; } else { /* * Check if there is valid mapping for requested va, if there * is, remove it. */ pte = &ptbl[ptbl_idx]; if (PTE_ISVALID(pte)) { pte_remove(mmu, pmap, va, PTBL_HOLD); } else { /* * pte is not used, increment hold count for ptbl * pages. */ if (pmap != kernel_pmap) ptbl_hold(mmu, pmap, pdir, pdir_idx); } } if (pdir[pdir_idx] == NULL) { if (pmap != kernel_pmap && pmap->pm_pp2d[pp2d_idx] != NULL) pdir_hold(mmu, pmap, pdir); pdir[pdir_idx] = ptbl; } if (pmap->pm_pp2d[pp2d_idx] == NULL) pmap->pm_pp2d[pp2d_idx] = pdir; /* * Insert pv_entry into pv_list for mapped page if part of managed * memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { flags |= PTE_MANAGED; /* Create and insert pv entry. */ pv_insert(pmap, va, m); } pmap->pm_stats.resident_count++; pte_tmp = PTE_RPN_FROM_PA(VM_PAGE_TO_PHYS(m)); pte_tmp |= (PTE_VALID | flags); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte = pte_tmp; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); return (0); } /* Return the pa for the given pmap/va. */ static vm_paddr_t pte_vatopa(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa = 0; pte_t *pte; pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pa = (PTE_PA(pte) | (va & PTE_PA_MASK)); return (pa); } /* allocate pte entries to manage (addr & mask) to (addr & mask) + size */ static void kernel_pte_alloc(vm_offset_t data_end, vm_offset_t addr, vm_offset_t pdir) { int i, j; vm_offset_t va; pte_t *pte; va = addr; /* Initialize kernel pdir */ for (i = 0; i < kernel_pdirs; i++) { kernel_pmap->pm_pp2d[i + PP2D_IDX(va)] = (pte_t **)(pdir + (i * PAGE_SIZE * PDIR_PAGES)); for (j = PDIR_IDX(va + (i * PAGE_SIZE * PDIR_NENTRIES * PTBL_NENTRIES)); j < PDIR_NENTRIES; j++) { kernel_pmap->pm_pp2d[i + PP2D_IDX(va)][j] = (pte_t *)(pdir + (kernel_pdirs * PAGE_SIZE) + (((i * PDIR_NENTRIES) + j) * PAGE_SIZE)); } } /* * Fill in PTEs covering kernel code and data. They are not required * for address translation, as this area is covered by static TLB1 * entries, but for pte_vatopa() to work correctly with kernel area * addresses. */ for (va = addr; va < data_end; va += PAGE_SIZE) { pte = &(kernel_pmap->pm_pp2d[PP2D_IDX(va)][PDIR_IDX(va)][PTBL_IDX(va)]); *pte = PTE_RPN_FROM_PA(kernload + (va - kernstart)); *pte |= PTE_M | PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID | PTE_PS_4KB; } } #else /* * Clean pte entry, try to free page table page if requested. * * Return 1 if ptbl pages were freed, otherwise return 0. */ static int pte_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, uint8_t flags) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); vm_page_t m; pte_t *ptbl; pte_t *pte; //int su = (pmap == kernel_pmap); //debugf("pte_remove: s (su = %d pmap = 0x%08x va = 0x%08x flags = %d)\n", // su, (u_int32_t)pmap, va, flags); ptbl = pmap->pm_pdir[pdir_idx]; KASSERT(ptbl, ("pte_remove: null ptbl")); pte = &ptbl[ptbl_idx]; if (pte == NULL || !PTE_ISVALID(pte)) return (0); if (PTE_ISWIRED(pte)) pmap->pm_stats.wired_count--; /* Get vm_page_t for mapped pte. */ m = PHYS_TO_VM_PAGE(PTE_PA(pte)); /* Handle managed entry. */ if (PTE_ISMANAGED(pte)) { if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); if (PTE_ISREFERENCED(pte)) vm_page_aflag_set(m, PGA_REFERENCED); pv_remove(pmap, va, m); } else if (pmap == kernel_pmap && m && m->md.pv_tracked) { /* * Always pv_insert()/pv_remove() on MPC85XX, in case DPAA is * used. This is needed by the NCSW support code for fast * VA<->PA translation. */ pv_remove(pmap, va, m); if (TAILQ_EMPTY(&m->md.pv_list)) m->md.pv_tracked = false; } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); pmap->pm_stats.resident_count--; if (flags & PTBL_UNHOLD) { //debugf("pte_remove: e (unhold)\n"); return (ptbl_unhold(mmu, pmap, pdir_idx)); } //debugf("pte_remove: e\n"); return (0); } /* * Insert PTE for a given page and virtual address. */ static int pte_enter(mmu_t mmu, pmap_t pmap, vm_page_t m, vm_offset_t va, uint32_t flags, boolean_t nosleep) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); pte_t *ptbl, *pte, pte_tmp; CTR4(KTR_PMAP, "%s: su = %d pmap = %p va = %p", __func__, pmap == kernel_pmap, pmap, va); /* Get the page table pointer. */ ptbl = pmap->pm_pdir[pdir_idx]; if (ptbl == NULL) { /* Allocate page table pages. */ ptbl = ptbl_alloc(mmu, pmap, pdir_idx, nosleep); if (ptbl == NULL) { KASSERT(nosleep, ("nosleep and NULL ptbl")); return (ENOMEM); } pmap->pm_pdir[pdir_idx] = ptbl; pte = &ptbl[ptbl_idx]; } else { /* * Check if there is valid mapping for requested * va, if there is, remove it. */ pte = &pmap->pm_pdir[pdir_idx][ptbl_idx]; if (PTE_ISVALID(pte)) { pte_remove(mmu, pmap, va, PTBL_HOLD); } else { /* * pte is not used, increment hold count * for ptbl pages. */ if (pmap != kernel_pmap) ptbl_hold(mmu, pmap, pdir_idx); } } /* * Insert pv_entry into pv_list for mapped page if part of managed * memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { flags |= PTE_MANAGED; /* Create and insert pv entry. */ pv_insert(pmap, va, m); } pmap->pm_stats.resident_count++; pte_tmp = PTE_RPN_FROM_PA(VM_PAGE_TO_PHYS(m)); pte_tmp |= (PTE_VALID | flags | PTE_PS_4KB); /* 4KB pages only */ mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte = pte_tmp; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); return (0); } /* Return the pa for the given pmap/va. */ static vm_paddr_t pte_vatopa(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa = 0; pte_t *pte; pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pa = (PTE_PA(pte) | (va & PTE_PA_MASK)); return (pa); } /* Get a pointer to a PTE in a page table. */ static pte_t * pte_find(mmu_t mmu, pmap_t pmap, vm_offset_t va) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); KASSERT((pmap != NULL), ("pte_find: invalid pmap")); if (pmap->pm_pdir[pdir_idx]) return (&(pmap->pm_pdir[pdir_idx][ptbl_idx])); return (NULL); } /* Set up kernel page tables. */ static void kernel_pte_alloc(vm_offset_t data_end, vm_offset_t addr, vm_offset_t pdir) { int i; vm_offset_t va; pte_t *pte; /* Initialize kernel pdir */ for (i = 0; i < kernel_ptbls; i++) kernel_pmap->pm_pdir[kptbl_min + i] = (pte_t *)(pdir + (i * PAGE_SIZE * PTBL_PAGES)); /* * Fill in PTEs covering kernel code and data. They are not required * for address translation, as this area is covered by static TLB1 * entries, but for pte_vatopa() to work correctly with kernel area * addresses. */ for (va = addr; va < data_end; va += PAGE_SIZE) { pte = &(kernel_pmap->pm_pdir[PDIR_IDX(va)][PTBL_IDX(va)]); *pte = PTE_RPN_FROM_PA(kernload + (va - kernstart)); *pte |= PTE_M | PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID | PTE_PS_4KB; } } #endif /**************************************************************************/ /* PMAP related */ /**************************************************************************/ /* * This is called during booke_init, before the system is really initialized. */ static void mmu_booke_bootstrap(mmu_t mmu, vm_offset_t start, vm_offset_t kernelend) { vm_paddr_t phys_kernelend; struct mem_region *mp, *mp1; int cnt, i, j; vm_paddr_t s, e, sz; vm_paddr_t physsz, hwphyssz; u_int phys_avail_count; vm_size_t kstack0_sz; vm_offset_t kernel_pdir, kstack0; vm_paddr_t kstack0_phys; void *dpcpu; vm_offset_t kernel_ptbl_root; debugf("mmu_booke_bootstrap: entered\n"); /* Set interesting system properties */ #ifdef __powerpc64__ hw_direct_map = 1; #else hw_direct_map = 0; #endif #if defined(COMPAT_FREEBSD32) || !defined(__powerpc64__) elf32_nxstack = 1; #endif /* Initialize invalidation mutex */ mtx_init(&tlbivax_mutex, "tlbivax", NULL, MTX_SPIN); /* Read TLB0 size and associativity. */ tlb0_get_tlbconf(); /* * Align kernel start and end address (kernel image). * Note that kernel end does not necessarily relate to kernsize. * kernsize is the size of the kernel that is actually mapped. */ data_start = round_page(kernelend); data_end = data_start; /* Allocate the dynamic per-cpu area. */ dpcpu = (void *)data_end; data_end += DPCPU_SIZE; /* Allocate space for the message buffer. */ msgbufp = (struct msgbuf *)data_end; data_end += msgbufsize; debugf(" msgbufp at 0x%"PRI0ptrX" end = 0x%"PRI0ptrX"\n", (uintptr_t)msgbufp, data_end); data_end = round_page(data_end); #ifdef __powerpc64__ kernel_ptbl_root = data_end; data_end += PP2D_NENTRIES * sizeof(pte_t**); #else /* Allocate space for ptbl_bufs. */ ptbl_bufs = (struct ptbl_buf *)data_end; data_end += sizeof(struct ptbl_buf) * PTBL_BUFS; debugf(" ptbl_bufs at 0x%"PRI0ptrX" end = 0x%"PRI0ptrX"\n", (uintptr_t)ptbl_bufs, data_end); data_end = round_page(data_end); kernel_ptbl_root = data_end; data_end += PDIR_NENTRIES * sizeof(pte_t*); #endif /* Allocate PTE tables for kernel KVA. */ kernel_pdir = data_end; kernel_ptbls = howmany(VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS, PDIR_SIZE); #ifdef __powerpc64__ kernel_pdirs = howmany(kernel_ptbls, PDIR_NENTRIES); data_end += kernel_pdirs * PDIR_PAGES * PAGE_SIZE; #endif data_end += kernel_ptbls * PTBL_PAGES * PAGE_SIZE; debugf(" kernel ptbls: %d\n", kernel_ptbls); debugf(" kernel pdir at 0x%"PRI0ptrX" end = 0x%"PRI0ptrX"\n", kernel_pdir, data_end); /* Retrieve phys/avail mem regions */ mem_regions(&physmem_regions, &physmem_regions_sz, &availmem_regions, &availmem_regions_sz); if (PHYS_AVAIL_ENTRIES < availmem_regions_sz) panic("mmu_booke_bootstrap: phys_avail too small"); data_end = round_page(data_end); vm_page_array = (vm_page_t)data_end; /* * Get a rough idea (upper bound) on the size of the page array. The * vm_page_array will not handle any more pages than we have in the * avail_regions array, and most likely much less. */ sz = 0; for (mp = availmem_regions; mp->mr_size; mp++) { sz += mp->mr_size; } sz = (round_page(sz) / (PAGE_SIZE + sizeof(struct vm_page))); data_end += round_page(sz * sizeof(struct vm_page)); /* Pre-round up to 1MB. This wastes some space, but saves TLB entries */ data_end = roundup2(data_end, 1 << 20); debugf(" data_end: 0x%"PRI0ptrX"\n", data_end); debugf(" kernstart: %#zx\n", kernstart); debugf(" kernsize: %#zx\n", kernsize); if (data_end - kernstart > kernsize) { kernsize += tlb1_mapin_region(kernstart + kernsize, kernload + kernsize, (data_end - kernstart) - kernsize, _TLB_ENTRY_MEM); } data_end = kernstart + kernsize; debugf(" updated data_end: 0x%"PRI0ptrX"\n", data_end); /* * Clear the structures - note we can only do it safely after the * possible additional TLB1 translations are in place (above) so that * all range up to the currently calculated 'data_end' is covered. */ dpcpu_init(dpcpu, 0); #ifdef __powerpc64__ memset((void *)kernel_pdir, 0, kernel_pdirs * PDIR_PAGES * PAGE_SIZE + kernel_ptbls * PTBL_PAGES * PAGE_SIZE); #else memset((void *)ptbl_bufs, 0, sizeof(struct ptbl_buf) * PTBL_SIZE); memset((void *)kernel_pdir, 0, kernel_ptbls * PTBL_PAGES * PAGE_SIZE); #endif /*******************************************************/ /* Set the start and end of kva. */ /*******************************************************/ virtual_avail = round_page(data_end); virtual_end = VM_MAX_KERNEL_ADDRESS; #ifndef __powerpc64__ /* Allocate KVA space for page zero/copy operations. */ zero_page_va = virtual_avail; virtual_avail += PAGE_SIZE; copy_page_src_va = virtual_avail; virtual_avail += PAGE_SIZE; copy_page_dst_va = virtual_avail; virtual_avail += PAGE_SIZE; debugf("zero_page_va = 0x%"PRI0ptrX"\n", zero_page_va); debugf("copy_page_src_va = 0x%"PRI0ptrX"\n", copy_page_src_va); debugf("copy_page_dst_va = 0x%"PRI0ptrX"\n", copy_page_dst_va); /* Initialize page zero/copy mutexes. */ mtx_init(&zero_page_mutex, "mmu_booke_zero_page", NULL, MTX_DEF); mtx_init(©_page_mutex, "mmu_booke_copy_page", NULL, MTX_DEF); /* Allocate KVA space for ptbl bufs. */ ptbl_buf_pool_vabase = virtual_avail; virtual_avail += PTBL_BUFS * PTBL_PAGES * PAGE_SIZE; debugf("ptbl_buf_pool_vabase = 0x%"PRI0ptrX" end = 0x%"PRI0ptrX"\n", ptbl_buf_pool_vabase, virtual_avail); #endif /* Calculate corresponding physical addresses for the kernel region. */ phys_kernelend = kernload + kernsize; debugf("kernel image and allocated data:\n"); debugf(" kernload = 0x%09jx\n", (uintmax_t)kernload); debugf(" kernstart = 0x%"PRI0ptrX"\n", kernstart); debugf(" kernsize = 0x%"PRI0ptrX"\n", kernsize); /* * Remove kernel physical address range from avail regions list. Page * align all regions. Non-page aligned memory isn't very interesting * to us. Also, sort the entries for ascending addresses. */ sz = 0; cnt = availmem_regions_sz; debugf("processing avail regions:\n"); for (mp = availmem_regions; mp->mr_size; mp++) { s = mp->mr_start; e = mp->mr_start + mp->mr_size; debugf(" %09jx-%09jx -> ", (uintmax_t)s, (uintmax_t)e); /* Check whether this region holds all of the kernel. */ if (s < kernload && e > phys_kernelend) { availmem_regions[cnt].mr_start = phys_kernelend; availmem_regions[cnt++].mr_size = e - phys_kernelend; e = kernload; } /* Look whether this regions starts within the kernel. */ if (s >= kernload && s < phys_kernelend) { if (e <= phys_kernelend) goto empty; s = phys_kernelend; } /* Now look whether this region ends within the kernel. */ if (e > kernload && e <= phys_kernelend) { if (s >= kernload) goto empty; e = kernload; } /* Now page align the start and size of the region. */ s = round_page(s); e = trunc_page(e); if (e < s) e = s; sz = e - s; debugf("%09jx-%09jx = %jx\n", (uintmax_t)s, (uintmax_t)e, (uintmax_t)sz); /* Check whether some memory is left here. */ if (sz == 0) { empty: memmove(mp, mp + 1, (cnt - (mp - availmem_regions)) * sizeof(*mp)); cnt--; mp--; continue; } /* Do an insertion sort. */ for (mp1 = availmem_regions; mp1 < mp; mp1++) if (s < mp1->mr_start) break; if (mp1 < mp) { memmove(mp1 + 1, mp1, (char *)mp - (char *)mp1); mp1->mr_start = s; mp1->mr_size = sz; } else { mp->mr_start = s; mp->mr_size = sz; } } availmem_regions_sz = cnt; /*******************************************************/ /* Steal physical memory for kernel stack from the end */ /* of the first avail region */ /*******************************************************/ kstack0_sz = kstack_pages * PAGE_SIZE; kstack0_phys = availmem_regions[0].mr_start + availmem_regions[0].mr_size; kstack0_phys -= kstack0_sz; availmem_regions[0].mr_size -= kstack0_sz; /*******************************************************/ /* Fill in phys_avail table, based on availmem_regions */ /*******************************************************/ phys_avail_count = 0; physsz = 0; hwphyssz = 0; TUNABLE_ULONG_FETCH("hw.physmem", (u_long *) &hwphyssz); debugf("fill in phys_avail:\n"); for (i = 0, j = 0; i < availmem_regions_sz; i++, j += 2) { debugf(" region: 0x%jx - 0x%jx (0x%jx)\n", (uintmax_t)availmem_regions[i].mr_start, (uintmax_t)availmem_regions[i].mr_start + availmem_regions[i].mr_size, (uintmax_t)availmem_regions[i].mr_size); if (hwphyssz != 0 && (physsz + availmem_regions[i].mr_size) >= hwphyssz) { debugf(" hw.physmem adjust\n"); if (physsz < hwphyssz) { phys_avail[j] = availmem_regions[i].mr_start; phys_avail[j + 1] = availmem_regions[i].mr_start + hwphyssz - physsz; physsz = hwphyssz; phys_avail_count++; dump_avail[j] = phys_avail[j]; dump_avail[j + 1] = phys_avail[j + 1]; } break; } phys_avail[j] = availmem_regions[i].mr_start; phys_avail[j + 1] = availmem_regions[i].mr_start + availmem_regions[i].mr_size; phys_avail_count++; physsz += availmem_regions[i].mr_size; dump_avail[j] = phys_avail[j]; dump_avail[j + 1] = phys_avail[j + 1]; } physmem = btoc(physsz); /* Calculate the last available physical address. */ for (i = 0; phys_avail[i + 2] != 0; i += 2) ; Maxmem = powerpc_btop(phys_avail[i + 1]); debugf("Maxmem = 0x%08lx\n", Maxmem); debugf("phys_avail_count = %d\n", phys_avail_count); debugf("physsz = 0x%09jx physmem = %jd (0x%09jx)\n", (uintmax_t)physsz, (uintmax_t)physmem, (uintmax_t)physmem); #ifdef __powerpc64__ /* * Map the physical memory contiguously in TLB1. * Round so it fits into a single mapping. */ tlb1_mapin_region(DMAP_BASE_ADDRESS, 0, phys_avail[i + 1], _TLB_ENTRY_MEM); #endif /*******************************************************/ /* Initialize (statically allocated) kernel pmap. */ /*******************************************************/ PMAP_LOCK_INIT(kernel_pmap); #ifdef __powerpc64__ kernel_pmap->pm_pp2d = (pte_t ***)kernel_ptbl_root; #else kptbl_min = VM_MIN_KERNEL_ADDRESS / PDIR_SIZE; kernel_pmap->pm_pdir = (pte_t **)kernel_ptbl_root; #endif debugf("kernel_pmap = 0x%"PRI0ptrX"\n", (uintptr_t)kernel_pmap); kernel_pte_alloc(virtual_avail, kernstart, kernel_pdir); for (i = 0; i < MAXCPU; i++) { kernel_pmap->pm_tid[i] = TID_KERNEL; /* Initialize each CPU's tidbusy entry 0 with kernel_pmap */ tidbusy[i][TID_KERNEL] = kernel_pmap; } /* Mark kernel_pmap active on all CPUs */ CPU_FILL(&kernel_pmap->pm_active); /* * Initialize the global pv list lock. */ rw_init(&pvh_global_lock, "pmap pv global"); /*******************************************************/ /* Final setup */ /*******************************************************/ /* Enter kstack0 into kernel map, provide guard page */ kstack0 = virtual_avail + KSTACK_GUARD_PAGES * PAGE_SIZE; thread0.td_kstack = kstack0; thread0.td_kstack_pages = kstack_pages; debugf("kstack_sz = 0x%08jx\n", (uintmax_t)kstack0_sz); debugf("kstack0_phys at 0x%09jx - 0x%09jx\n", (uintmax_t)kstack0_phys, (uintmax_t)kstack0_phys + kstack0_sz); debugf("kstack0 at 0x%"PRI0ptrX" - 0x%"PRI0ptrX"\n", kstack0, kstack0 + kstack0_sz); virtual_avail += KSTACK_GUARD_PAGES * PAGE_SIZE + kstack0_sz; for (i = 0; i < kstack_pages; i++) { mmu_booke_kenter(mmu, kstack0, kstack0_phys); kstack0 += PAGE_SIZE; kstack0_phys += PAGE_SIZE; } pmap_bootstrapped = 1; debugf("virtual_avail = %"PRI0ptrX"\n", virtual_avail); debugf("virtual_end = %"PRI0ptrX"\n", virtual_end); debugf("mmu_booke_bootstrap: exit\n"); } #ifdef SMP void tlb1_ap_prep(void) { tlb_entry_t *e, tmp; unsigned int i; /* Prepare TLB1 image for AP processors */ e = __boot_tlb1; for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&tmp, i); if ((tmp.mas1 & MAS1_VALID) && (tmp.mas2 & _TLB_ENTRY_SHARED)) memcpy(e++, &tmp, sizeof(tmp)); } } void pmap_bootstrap_ap(volatile uint32_t *trcp __unused) { int i; /* * Finish TLB1 configuration: the BSP already set up its TLB1 and we * have the snapshot of its contents in the s/w __boot_tlb1[] table * created by tlb1_ap_prep(), so use these values directly to * (re)program AP's TLB1 hardware. * * Start at index 1 because index 0 has the kernel map. */ for (i = 1; i < TLB1_ENTRIES; i++) { if (__boot_tlb1[i].mas1 & MAS1_VALID) tlb1_write_entry(&__boot_tlb1[i], i); } set_mas4_defaults(); } #endif static void booke_pmap_init_qpages(void) { struct pcpu *pc; int i; CPU_FOREACH(i) { pc = pcpu_find(i); pc->pc_qmap_addr = kva_alloc(PAGE_SIZE); if (pc->pc_qmap_addr == 0) panic("pmap_init_qpages: unable to allocate KVA"); } } SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_ANY, booke_pmap_init_qpages, NULL); /* * Get the physical page address for the given pmap/virtual address. */ static vm_paddr_t mmu_booke_extract(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa; PMAP_LOCK(pmap); pa = pte_vatopa(mmu, pmap, va); PMAP_UNLOCK(pmap); return (pa); } /* * Extract the physical page address associated with the given * kernel virtual address. */ static vm_paddr_t mmu_booke_kextract(mmu_t mmu, vm_offset_t va) { tlb_entry_t e; vm_paddr_t p = 0; int i; #ifdef __powerpc64__ if (va >= DMAP_BASE_ADDRESS && va <= DMAP_MAX_ADDRESS) return (DMAP_TO_PHYS(va)); #endif if (va >= VM_MIN_KERNEL_ADDRESS && va <= VM_MAX_KERNEL_ADDRESS) p = pte_vatopa(mmu, kernel_pmap, va); if (p == 0) { /* Check TLB1 mappings */ for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (va >= e.virt && va < e.virt + e.size) return (e.phys + (va - e.virt)); } } return (p); } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ static void mmu_booke_init(mmu_t mmu) { int shpgperproc = PMAP_SHPGPERPROC; /* * Initialize the address space (zone) for the pv entries. Set a * high water mark so that the system can recover from excessive * numbers of pv entries. */ pvzone = uma_zcreate("PV ENTRY", sizeof(struct pv_entry), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_VM | UMA_ZONE_NOFREE); TUNABLE_INT_FETCH("vm.pmap.shpgperproc", &shpgperproc); pv_entry_max = shpgperproc * maxproc + vm_cnt.v_page_count; TUNABLE_INT_FETCH("vm.pmap.pv_entries", &pv_entry_max); pv_entry_high_water = 9 * (pv_entry_max / 10); uma_zone_reserve_kva(pvzone, pv_entry_max); /* Pre-fill pvzone with initial number of pv entries. */ uma_prealloc(pvzone, PV_ENTRY_ZONE_MIN); /* Create a UMA zone for page table roots. */ ptbl_root_zone = uma_zcreate("pmap root", PMAP_ROOT_SIZE, NULL, NULL, NULL, NULL, UMA_ALIGN_CACHE, UMA_ZONE_VM); /* Initialize ptbl allocation. */ ptbl_init(); } /* * Map a list of wired pages into kernel virtual address space. This is * intended for temporary mappings which do not need page modification or * references recorded. Existing mappings in the region are overwritten. */ static void mmu_booke_qenter(mmu_t mmu, vm_offset_t sva, vm_page_t *m, int count) { vm_offset_t va; va = sva; while (count-- > 0) { mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(*m)); va += PAGE_SIZE; m++; } } /* * Remove page mappings from kernel virtual address space. Intended for * temporary mappings entered by mmu_booke_qenter. */ static void mmu_booke_qremove(mmu_t mmu, vm_offset_t sva, int count) { vm_offset_t va; va = sva; while (count-- > 0) { mmu_booke_kremove(mmu, va); va += PAGE_SIZE; } } /* * Map a wired page into kernel virtual address space. */ static void mmu_booke_kenter(mmu_t mmu, vm_offset_t va, vm_paddr_t pa) { mmu_booke_kenter_attr(mmu, va, pa, VM_MEMATTR_DEFAULT); } static void mmu_booke_kenter_attr(mmu_t mmu, vm_offset_t va, vm_paddr_t pa, vm_memattr_t ma) { uint32_t flags; pte_t *pte; KASSERT(((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_kenter: invalid va")); flags = PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID; flags |= tlb_calc_wimg(pa, ma) << PTE_MAS2_SHIFT; flags |= PTE_PS_4KB; pte = pte_find(mmu, kernel_pmap, va); KASSERT((pte != NULL), ("mmu_booke_kenter: invalid va. NULL PTE")); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); if (PTE_ISVALID(pte)) { CTR1(KTR_PMAP, "%s: replacing entry!", __func__); /* Flush entry from TLB0 */ tlb0_flush_entry(va); } *pte = PTE_RPN_FROM_PA(pa) | flags; //debugf("mmu_booke_kenter: pdir_idx = %d ptbl_idx = %d va=0x%08x " // "pa=0x%08x rpn=0x%08x flags=0x%08x\n", // pdir_idx, ptbl_idx, va, pa, pte->rpn, pte->flags); /* Flush the real memory from the instruction cache. */ if ((flags & (PTE_I | PTE_G)) == 0) __syncicache((void *)va, PAGE_SIZE); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } /* * Remove a page from kernel page table. */ static void mmu_booke_kremove(mmu_t mmu, vm_offset_t va) { pte_t *pte; CTR2(KTR_PMAP,"%s: s (va = 0x%"PRI0ptrX")\n", __func__, va); KASSERT(((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_kremove: invalid va")); pte = pte_find(mmu, kernel_pmap, va); if (!PTE_ISVALID(pte)) { CTR1(KTR_PMAP, "%s: invalid pte", __func__); return; } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Invalidate entry in TLB0, update PTE. */ tlb0_flush_entry(va); *pte = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } /* * Provide a kernel pointer corresponding to a given userland pointer. * The returned pointer is valid until the next time this function is * called in this thread. This is used internally in copyin/copyout. */ int mmu_booke_map_user_ptr(mmu_t mmu, pmap_t pm, volatile const void *uaddr, void **kaddr, size_t ulen, size_t *klen) { if (trunc_page((uintptr_t)uaddr + ulen) > VM_MAXUSER_ADDRESS) return (EFAULT); *kaddr = (void *)(uintptr_t)uaddr; if (klen) *klen = ulen; return (0); } /* * Figure out where a given kernel pointer (usually in a fault) points * to from the VM's perspective, potentially remapping into userland's * address space. */ static int mmu_booke_decode_kernel_ptr(mmu_t mmu, vm_offset_t addr, int *is_user, vm_offset_t *decoded_addr) { if (trunc_page(addr) <= VM_MAXUSER_ADDRESS) *is_user = 1; else *is_user = 0; *decoded_addr = addr; return (0); } /* * Initialize pmap associated with process 0. */ static void mmu_booke_pinit0(mmu_t mmu, pmap_t pmap) { PMAP_LOCK_INIT(pmap); mmu_booke_pinit(mmu, pmap); PCPU_SET(curpmap, pmap); } /* * Initialize a preallocated and zeroed pmap structure, * such as one in a vmspace structure. */ static void mmu_booke_pinit(mmu_t mmu, pmap_t pmap) { int i; CTR4(KTR_PMAP, "%s: pmap = %p, proc %d '%s'", __func__, pmap, curthread->td_proc->p_pid, curthread->td_proc->p_comm); KASSERT((pmap != kernel_pmap), ("pmap_pinit: initializing kernel_pmap")); for (i = 0; i < MAXCPU; i++) pmap->pm_tid[i] = TID_NONE; CPU_ZERO(&kernel_pmap->pm_active); bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); #ifdef __powerpc64__ pmap->pm_pp2d = uma_zalloc(ptbl_root_zone, M_WAITOK); bzero(pmap->pm_pp2d, sizeof(pte_t **) * PP2D_NENTRIES); #else pmap->pm_pdir = uma_zalloc(ptbl_root_zone, M_WAITOK); bzero(pmap->pm_pdir, sizeof(pte_t *) * PDIR_NENTRIES); TAILQ_INIT(&pmap->pm_ptbl_list); #endif } /* * Release any resources held by the given physical map. * Called when a pmap initialized by mmu_booke_pinit is being released. * Should only be called if the map contains no valid mappings. */ static void mmu_booke_release(mmu_t mmu, pmap_t pmap) { KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); #ifdef __powerpc64__ uma_zfree(ptbl_root_zone, pmap->pm_pp2d); #else uma_zfree(ptbl_root_zone, pmap->pm_pdir); #endif } /* * Insert the given physical page at the specified virtual address in the * target physical map with the protection requested. If specified the page * will be wired down. */ static int mmu_booke_enter(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind) { int error; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); error = mmu_booke_enter_locked(mmu, pmap, va, m, prot, flags, psind); PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); return (error); } static int mmu_booke_enter_locked(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int pmap_flags, int8_t psind __unused) { pte_t *pte; vm_paddr_t pa; uint32_t flags; int error, su, sync; pa = VM_PAGE_TO_PHYS(m); su = (pmap == kernel_pmap); sync = 0; //debugf("mmu_booke_enter_locked: s (pmap=0x%08x su=%d tid=%d m=0x%08x va=0x%08x " // "pa=0x%08x prot=0x%08x flags=%#x)\n", // (u_int32_t)pmap, su, pmap->pm_tid, // (u_int32_t)m, va, pa, prot, flags); if (su) { KASSERT(((va >= virtual_avail) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_enter_locked: kernel pmap, non kernel va")); } else { KASSERT((va <= VM_MAXUSER_ADDRESS), ("mmu_booke_enter_locked: user pmap, non user va")); } if ((m->oflags & VPO_UNMANAGED) == 0) { if ((pmap_flags & PMAP_ENTER_QUICK_LOCKED) == 0) VM_PAGE_OBJECT_BUSY_ASSERT(m); else VM_OBJECT_ASSERT_LOCKED(m->object); } PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * If there is an existing mapping, and the physical address has not * changed, must be protection or wiring change. */ if (((pte = pte_find(mmu, pmap, va)) != NULL) && (PTE_ISVALID(pte)) && (PTE_PA(pte) == pa)) { /* * Before actually updating pte->flags we calculate and * prepare its new value in a helper var. */ flags = *pte; flags &= ~(PTE_UW | PTE_UX | PTE_SW | PTE_SX | PTE_MODIFIED); /* Wiring change, just update stats. */ if ((pmap_flags & PMAP_ENTER_WIRED) != 0) { if (!PTE_ISWIRED(pte)) { flags |= PTE_WIRED; pmap->pm_stats.wired_count++; } } else { if (PTE_ISWIRED(pte)) { flags &= ~PTE_WIRED; pmap->pm_stats.wired_count--; } } if (prot & VM_PROT_WRITE) { /* Add write permissions. */ flags |= PTE_SW; if (!su) flags |= PTE_UW; if ((flags & PTE_MANAGED) != 0) vm_page_aflag_set(m, PGA_WRITEABLE); } else { /* Handle modified pages, sense modify status. */ /* * The PTE_MODIFIED flag could be set by underlying * TLB misses since we last read it (above), possibly * other CPUs could update it so we check in the PTE * directly rather than rely on that saved local flags * copy. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); } if (prot & VM_PROT_EXECUTE) { flags |= PTE_SX; if (!su) flags |= PTE_UX; /* * Check existing flags for execute permissions: if we * are turning execute permissions on, icache should * be flushed. */ if ((*pte & (PTE_UX | PTE_SX)) == 0) sync++; } flags &= ~PTE_REFERENCED; /* * The new flags value is all calculated -- only now actually * update the PTE. */ mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte &= ~PTE_FLAGS_MASK; *pte |= flags; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } else { /* * If there is an existing mapping, but it's for a different * physical address, pte_enter() will delete the old mapping. */ //if ((pte != NULL) && PTE_ISVALID(pte)) // debugf("mmu_booke_enter_locked: replace\n"); //else // debugf("mmu_booke_enter_locked: new\n"); /* Now set up the flags and install the new mapping. */ flags = (PTE_SR | PTE_VALID); flags |= PTE_M; if (!su) flags |= PTE_UR; if (prot & VM_PROT_WRITE) { flags |= PTE_SW; if (!su) flags |= PTE_UW; if ((m->oflags & VPO_UNMANAGED) == 0) vm_page_aflag_set(m, PGA_WRITEABLE); } if (prot & VM_PROT_EXECUTE) { flags |= PTE_SX; if (!su) flags |= PTE_UX; } /* If its wired update stats. */ if ((pmap_flags & PMAP_ENTER_WIRED) != 0) flags |= PTE_WIRED; error = pte_enter(mmu, pmap, m, va, flags, (pmap_flags & PMAP_ENTER_NOSLEEP) != 0); if (error != 0) return (KERN_RESOURCE_SHORTAGE); if ((flags & PMAP_ENTER_WIRED) != 0) pmap->pm_stats.wired_count++; /* Flush the real memory from the instruction cache. */ if (prot & VM_PROT_EXECUTE) sync++; } if (sync && (su || pmap == PCPU_GET(curpmap))) { __syncicache((void *)va, PAGE_SIZE); sync = 0; } return (KERN_SUCCESS); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ static void mmu_booke_enter_object(mmu_t mmu, pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { vm_page_t m; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); m = m_start; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { mmu_booke_enter_locked(mmu, pmap, start + ptoa(diff), m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), PMAP_ENTER_NOSLEEP | PMAP_ENTER_QUICK_LOCKED, 0); m = TAILQ_NEXT(m, listq); } - rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); + rw_wunlock(&pvh_global_lock); } static void mmu_booke_enter_quick(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); mmu_booke_enter_locked(mmu, pmap, va, m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), PMAP_ENTER_NOSLEEP | PMAP_ENTER_QUICK_LOCKED, 0); - rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); + rw_wunlock(&pvh_global_lock); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly rounded to the page size. */ static void mmu_booke_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_offset_t endva) { pte_t *pte; uint8_t hold_flag; int su = (pmap == kernel_pmap); //debugf("mmu_booke_remove: s (su = %d pmap=0x%08x tid=%d va=0x%08x endva=0x%08x)\n", // su, (u_int32_t)pmap, pmap->pm_tid, va, endva); if (su) { KASSERT(((va >= virtual_avail) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_remove: kernel pmap, non kernel va")); } else { KASSERT((va <= VM_MAXUSER_ADDRESS), ("mmu_booke_remove: user pmap, non user va")); } if (PMAP_REMOVE_DONE(pmap)) { //debugf("mmu_booke_remove: e (empty)\n"); return; } hold_flag = PTBL_HOLD_FLAG(pmap); //debugf("mmu_booke_remove: hold_flag = %d\n", hold_flag); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); for (; va < endva; va += PAGE_SIZE) { pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pte_remove(mmu, pmap, va, hold_flag); } PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); //debugf("mmu_booke_remove: e\n"); } /* * Remove physical page from all pmaps in which it resides. */ static void mmu_booke_remove_all(mmu_t mmu, vm_page_t m) { pv_entry_t pv, pvn; uint8_t hold_flag; rw_wlock(&pvh_global_lock); for (pv = TAILQ_FIRST(&m->md.pv_list); pv != NULL; pv = pvn) { pvn = TAILQ_NEXT(pv, pv_link); PMAP_LOCK(pv->pv_pmap); hold_flag = PTBL_HOLD_FLAG(pv->pv_pmap); pte_remove(mmu, pv->pv_pmap, pv->pv_va, hold_flag); PMAP_UNLOCK(pv->pv_pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&pvh_global_lock); } /* * Map a range of physical addresses into kernel virtual address space. */ static vm_offset_t mmu_booke_map(mmu_t mmu, vm_offset_t *virt, vm_paddr_t pa_start, vm_paddr_t pa_end, int prot) { vm_offset_t sva = *virt; vm_offset_t va = sva; #ifdef __powerpc64__ /* XXX: Handle memory not starting at 0x0. */ if (pa_end < ctob(Maxmem)) return (PHYS_TO_DMAP(pa_start)); #endif while (pa_start < pa_end) { mmu_booke_kenter(mmu, va, pa_start); va += PAGE_SIZE; pa_start += PAGE_SIZE; } *virt = va; return (sva); } /* * The pmap must be activated before it's address space can be accessed in any * way. */ static void mmu_booke_activate(mmu_t mmu, struct thread *td) { pmap_t pmap; u_int cpuid; pmap = &td->td_proc->p_vmspace->vm_pmap; CTR5(KTR_PMAP, "%s: s (td = %p, proc = '%s', id = %d, pmap = 0x%"PRI0ptrX")", __func__, td, td->td_proc->p_comm, td->td_proc->p_pid, pmap); KASSERT((pmap != kernel_pmap), ("mmu_booke_activate: kernel_pmap!")); sched_pin(); cpuid = PCPU_GET(cpuid); CPU_SET_ATOMIC(cpuid, &pmap->pm_active); PCPU_SET(curpmap, pmap); if (pmap->pm_tid[cpuid] == TID_NONE) tid_alloc(pmap); /* Load PID0 register with pmap tid value. */ mtspr(SPR_PID0, pmap->pm_tid[cpuid]); __asm __volatile("isync"); mtspr(SPR_DBCR0, td->td_pcb->pcb_cpu.booke.dbcr0); sched_unpin(); CTR3(KTR_PMAP, "%s: e (tid = %d for '%s')", __func__, pmap->pm_tid[PCPU_GET(cpuid)], td->td_proc->p_comm); } /* * Deactivate the specified process's address space. */ static void mmu_booke_deactivate(mmu_t mmu, struct thread *td) { pmap_t pmap; pmap = &td->td_proc->p_vmspace->vm_pmap; CTR5(KTR_PMAP, "%s: td=%p, proc = '%s', id = %d, pmap = 0x%"PRI0ptrX, __func__, td, td->td_proc->p_comm, td->td_proc->p_pid, pmap); td->td_pcb->pcb_cpu.booke.dbcr0 = mfspr(SPR_DBCR0); CPU_CLR_ATOMIC(PCPU_GET(cpuid), &pmap->pm_active); PCPU_SET(curpmap, NULL); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. */ static void mmu_booke_copy(mmu_t mmu, pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { } /* * Set the physical protection on the specified range of this map as requested. */ static void mmu_booke_protect(mmu_t mmu, pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t va; vm_page_t m; pte_t *pte; if ((prot & VM_PROT_READ) == VM_PROT_NONE) { mmu_booke_remove(mmu, pmap, sva, eva); return; } if (prot & VM_PROT_WRITE) return; PMAP_LOCK(pmap); for (va = sva; va < eva; va += PAGE_SIZE) { if ((pte = pte_find(mmu, pmap, va)) != NULL) { if (PTE_ISVALID(pte)) { m = PHYS_TO_VM_PAGE(PTE_PA(pte)); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Handle modified pages. */ if (PTE_ISMODIFIED(pte) && PTE_ISMANAGED(pte)) vm_page_dirty(m); tlb0_flush_entry(va); *pte &= ~(PTE_UW | PTE_SW | PTE_MODIFIED); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } } } PMAP_UNLOCK(pmap); } /* * Clear the write and modified bits in each of the given page's mappings. */ static void mmu_booke_remove_write(mmu_t mmu, vm_page_t m) { pv_entry_t pv; pte_t *pte; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_remove_write: page %p is not managed", m)); vm_page_assert_busied(m); if (!pmap_page_is_write_mapped(m)) return; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL) { if (PTE_ISVALID(pte)) { m = PHYS_TO_VM_PAGE(PTE_PA(pte)); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Handle modified pages. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); /* Flush mapping from TLB0. */ *pte &= ~(PTE_UW | PTE_SW | PTE_MODIFIED); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } } PMAP_UNLOCK(pv->pv_pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&pvh_global_lock); } static void mmu_booke_sync_icache(mmu_t mmu, pmap_t pm, vm_offset_t va, vm_size_t sz) { pte_t *pte; vm_paddr_t pa = 0; int sync_sz, valid; #ifndef __powerpc64__ pmap_t pmap; vm_page_t m; vm_offset_t addr; int active; #endif #ifndef __powerpc64__ rw_wlock(&pvh_global_lock); pmap = PCPU_GET(curpmap); active = (pm == kernel_pmap || pm == pmap) ? 1 : 0; #endif while (sz > 0) { PMAP_LOCK(pm); pte = pte_find(mmu, pm, va); valid = (pte != NULL && PTE_ISVALID(pte)) ? 1 : 0; if (valid) pa = PTE_PA(pte); PMAP_UNLOCK(pm); sync_sz = PAGE_SIZE - (va & PAGE_MASK); sync_sz = min(sync_sz, sz); if (valid) { #ifdef __powerpc64__ pa += (va & PAGE_MASK); __syncicache((void *)PHYS_TO_DMAP(pa), sync_sz); #else if (!active) { /* Create a mapping in the active pmap. */ addr = 0; m = PHYS_TO_VM_PAGE(pa); PMAP_LOCK(pmap); pte_enter(mmu, pmap, m, addr, PTE_SR | PTE_VALID, FALSE); addr += (va & PAGE_MASK); __syncicache((void *)addr, sync_sz); pte_remove(mmu, pmap, addr, PTBL_UNHOLD); PMAP_UNLOCK(pmap); } else __syncicache((void *)va, sync_sz); #endif } va += sync_sz; sz -= sync_sz; } #ifndef __powerpc64__ rw_wunlock(&pvh_global_lock); #endif } /* * Atomically extract and hold the physical page with the given * pmap and virtual address pair if that mapping permits the given * protection. */ static vm_page_t mmu_booke_extract_and_hold(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pte_t *pte; vm_page_t m; uint32_t pte_wbit; m = NULL; PMAP_LOCK(pmap); pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) { if (pmap == kernel_pmap) pte_wbit = PTE_SW; else pte_wbit = PTE_UW; if ((*pte & pte_wbit) != 0 || (prot & VM_PROT_WRITE) == 0) { m = PHYS_TO_VM_PAGE(PTE_PA(pte)); if (!vm_page_wire_mapped(m)) m = NULL; } } PMAP_UNLOCK(pmap); return (m); } /* * Initialize a vm_page's machine-dependent fields. */ static void mmu_booke_page_init(mmu_t mmu, vm_page_t m) { m->md.pv_tracked = 0; TAILQ_INIT(&m->md.pv_list); } /* * mmu_booke_zero_page_area zeros the specified hardware page by * mapping it into virtual memory and using bzero to clear * its contents. * * off and size must reside within a single page. */ static void mmu_booke_zero_page_area(mmu_t mmu, vm_page_t m, int off, int size) { vm_offset_t va; /* XXX KASSERT off and size are within a single page? */ #ifdef __powerpc64__ va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); bzero((caddr_t)va + off, size); #else mtx_lock(&zero_page_mutex); va = zero_page_va; mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(m)); bzero((caddr_t)va + off, size); mmu_booke_kremove(mmu, va); mtx_unlock(&zero_page_mutex); #endif } /* * mmu_booke_zero_page zeros the specified hardware page. */ static void mmu_booke_zero_page(mmu_t mmu, vm_page_t m) { vm_offset_t off, va; #ifdef __powerpc64__ va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); for (off = 0; off < PAGE_SIZE; off += cacheline_size) __asm __volatile("dcbz 0,%0" :: "r"(va + off)); #else va = zero_page_va; mtx_lock(&zero_page_mutex); mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(m)); for (off = 0; off < PAGE_SIZE; off += cacheline_size) __asm __volatile("dcbz 0,%0" :: "r"(va + off)); mmu_booke_kremove(mmu, va); mtx_unlock(&zero_page_mutex); #endif } /* * mmu_booke_copy_page copies the specified (machine independent) page by * mapping the page into virtual memory and using memcopy to copy the page, * one machine dependent page at a time. */ static void mmu_booke_copy_page(mmu_t mmu, vm_page_t sm, vm_page_t dm) { vm_offset_t sva, dva; #ifdef __powerpc64__ sva = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(sm)); dva = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(dm)); memcpy((caddr_t)dva, (caddr_t)sva, PAGE_SIZE); #else sva = copy_page_src_va; dva = copy_page_dst_va; mtx_lock(©_page_mutex); mmu_booke_kenter(mmu, sva, VM_PAGE_TO_PHYS(sm)); mmu_booke_kenter(mmu, dva, VM_PAGE_TO_PHYS(dm)); memcpy((caddr_t)dva, (caddr_t)sva, PAGE_SIZE); mmu_booke_kremove(mmu, dva); mmu_booke_kremove(mmu, sva); mtx_unlock(©_page_mutex); #endif } static inline void mmu_booke_copy_pages(mmu_t mmu, vm_page_t *ma, vm_offset_t a_offset, vm_page_t *mb, vm_offset_t b_offset, int xfersize) { void *a_cp, *b_cp; vm_offset_t a_pg_offset, b_pg_offset; int cnt; #ifdef __powerpc64__ vm_page_t pa, pb; while (xfersize > 0) { a_pg_offset = a_offset & PAGE_MASK; pa = ma[a_offset >> PAGE_SHIFT]; b_pg_offset = b_offset & PAGE_MASK; pb = mb[b_offset >> PAGE_SHIFT]; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); cnt = min(cnt, PAGE_SIZE - b_pg_offset); a_cp = (caddr_t)((uintptr_t)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(pa)) + a_pg_offset); b_cp = (caddr_t)((uintptr_t)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(pb)) + b_pg_offset); bcopy(a_cp, b_cp, cnt); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } #else mtx_lock(©_page_mutex); while (xfersize > 0) { a_pg_offset = a_offset & PAGE_MASK; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); mmu_booke_kenter(mmu, copy_page_src_va, VM_PAGE_TO_PHYS(ma[a_offset >> PAGE_SHIFT])); a_cp = (char *)copy_page_src_va + a_pg_offset; b_pg_offset = b_offset & PAGE_MASK; cnt = min(cnt, PAGE_SIZE - b_pg_offset); mmu_booke_kenter(mmu, copy_page_dst_va, VM_PAGE_TO_PHYS(mb[b_offset >> PAGE_SHIFT])); b_cp = (char *)copy_page_dst_va + b_pg_offset; bcopy(a_cp, b_cp, cnt); mmu_booke_kremove(mmu, copy_page_dst_va); mmu_booke_kremove(mmu, copy_page_src_va); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } mtx_unlock(©_page_mutex); #endif } static vm_offset_t mmu_booke_quick_enter_page(mmu_t mmu, vm_page_t m) { #ifdef __powerpc64__ return (PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m))); #else vm_paddr_t paddr; vm_offset_t qaddr; uint32_t flags; pte_t *pte; paddr = VM_PAGE_TO_PHYS(m); flags = PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID; flags |= tlb_calc_wimg(paddr, pmap_page_get_memattr(m)) << PTE_MAS2_SHIFT; flags |= PTE_PS_4KB; critical_enter(); qaddr = PCPU_GET(qmap_addr); pte = pte_find(mmu, kernel_pmap, qaddr); KASSERT(*pte == 0, ("mmu_booke_quick_enter_page: PTE busy")); /* * XXX: tlbivax is broadcast to other cores, but qaddr should * not be present in other TLBs. Is there a better instruction * sequence to use? Or just forget it & use mmu_booke_kenter()... */ __asm __volatile("tlbivax 0, %0" :: "r"(qaddr & MAS2_EPN_MASK)); __asm __volatile("isync; msync"); *pte = PTE_RPN_FROM_PA(paddr) | flags; /* Flush the real memory from the instruction cache. */ if ((flags & (PTE_I | PTE_G)) == 0) __syncicache((void *)qaddr, PAGE_SIZE); return (qaddr); #endif } static void mmu_booke_quick_remove_page(mmu_t mmu, vm_offset_t addr) { #ifndef __powerpc64__ pte_t *pte; pte = pte_find(mmu, kernel_pmap, addr); KASSERT(PCPU_GET(qmap_addr) == addr, ("mmu_booke_quick_remove_page: invalid address")); KASSERT(*pte != 0, ("mmu_booke_quick_remove_page: PTE not in use")); *pte = 0; critical_exit(); #endif } /* * Return whether or not the specified physical page was modified * in any of physical maps. */ static boolean_t mmu_booke_is_modified(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_is_modified: page %p is not managed", m)); rv = FALSE; /* * If the page is not busied then this check is racy. */ if (!pmap_page_is_write_mapped(m)) return (FALSE); rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISMODIFIED(pte)) rv = TRUE; } PMAP_UNLOCK(pv->pv_pmap); if (rv) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Return whether or not the specified virtual address is eligible * for prefault. */ static boolean_t mmu_booke_is_prefaultable(mmu_t mmu, pmap_t pmap, vm_offset_t addr) { return (FALSE); } /* * Return whether or not the specified physical page was referenced * in any physical maps. */ static boolean_t mmu_booke_is_referenced(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_is_referenced: page %p is not managed", m)); rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISREFERENCED(pte)) rv = TRUE; } PMAP_UNLOCK(pv->pv_pmap); if (rv) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Clear the modify bits on the specified physical page. */ static void mmu_booke_clear_modify(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_clear_modify: page %p is not managed", m)); vm_page_assert_busied(m); if (!pmap_page_is_write_mapped(m)) return; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); if (*pte & (PTE_SW | PTE_UW | PTE_MODIFIED)) { tlb0_flush_entry(pv->pv_va); *pte &= ~(PTE_SW | PTE_UW | PTE_MODIFIED | PTE_REFERENCED); } tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); } /* * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. * * As an optimization, update the page's dirty field if a modified bit is * found while counting reference bits. This opportunistic update can be * performed at low cost and can eliminate the need for some future calls * to pmap_is_modified(). However, since this function stops after * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some * dirty pages. Those dirty pages will only be detected by a future call * to pmap_is_modified(). */ static int mmu_booke_ts_referenced(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; int count; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_ts_referenced: page %p is not managed", m)); count = 0; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); if (PTE_ISREFERENCED(pte)) { mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(pv->pv_va); *pte &= ~PTE_REFERENCED; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); if (++count >= PMAP_TS_REFERENCED_MAX) { PMAP_UNLOCK(pv->pv_pmap); break; } } } PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); return (count); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range must * have the wired attribute set. In contrast, invalid mappings cannot have * the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware feature, so * there is no need to invalidate any TLB entries. */ static void mmu_booke_unwire(mmu_t mmu, pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va; pte_t *pte; PMAP_LOCK(pmap); for (va = sva; va < eva; va += PAGE_SIZE) { if ((pte = pte_find(mmu, pmap, va)) != NULL && PTE_ISVALID(pte)) { if (!PTE_ISWIRED(pte)) panic("mmu_booke_unwire: pte %p isn't wired", pte); *pte &= ~PTE_WIRED; pmap->pm_stats.wired_count--; } } PMAP_UNLOCK(pmap); } /* * Return true if the pmap's pv is one of the first 16 pvs linked to from this * page. This count may be changed upwards or downwards in the future; it is * only necessary that true be returned for a small subset of pmaps for proper * page aging. */ static boolean_t mmu_booke_page_exists_quick(mmu_t mmu, pmap_t pmap, vm_page_t m) { pv_entry_t pv; int loops; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_page_exists_quick: page %p is not managed", m)); loops = 0; rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { if (pv->pv_pmap == pmap) { rv = TRUE; break; } if (++loops >= 16) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Return the number of managed mappings to the given physical page that are * wired. */ static int mmu_booke_page_wired_mappings(mmu_t mmu, vm_page_t m) { pv_entry_t pv; pte_t *pte; int count = 0; if ((m->oflags & VPO_UNMANAGED) != 0) return (count); rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL) if (PTE_ISVALID(pte) && PTE_ISWIRED(pte)) count++; PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); return (count); } static int mmu_booke_dev_direct_mapped(mmu_t mmu, vm_paddr_t pa, vm_size_t size) { int i; vm_offset_t va; /* * This currently does not work for entries that * overlap TLB1 entries. */ for (i = 0; i < TLB1_ENTRIES; i ++) { if (tlb1_iomapped(i, pa, size, &va) == 0) return (0); } return (EFAULT); } void mmu_booke_dumpsys_map(mmu_t mmu, vm_paddr_t pa, size_t sz, void **va) { vm_paddr_t ppa; vm_offset_t ofs; vm_size_t gran; /* Minidumps are based on virtual memory addresses. */ if (do_minidump) { *va = (void *)(vm_offset_t)pa; return; } /* Raw physical memory dumps don't have a virtual address. */ /* We always map a 256MB page at 256M. */ gran = 256 * 1024 * 1024; ppa = rounddown2(pa, gran); ofs = pa - ppa; *va = (void *)gran; tlb1_set_entry((vm_offset_t)va, ppa, gran, _TLB_ENTRY_IO); if (sz > (gran - ofs)) tlb1_set_entry((vm_offset_t)(va + gran), ppa + gran, gran, _TLB_ENTRY_IO); } void mmu_booke_dumpsys_unmap(mmu_t mmu, vm_paddr_t pa, size_t sz, void *va) { vm_paddr_t ppa; vm_offset_t ofs; vm_size_t gran; tlb_entry_t e; int i; /* Minidumps are based on virtual memory addresses. */ /* Nothing to do... */ if (do_minidump) return; for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) break; } /* Raw physical memory dumps don't have a virtual address. */ i--; e.mas1 = 0; e.mas2 = 0; e.mas3 = 0; tlb1_write_entry(&e, i); gran = 256 * 1024 * 1024; ppa = rounddown2(pa, gran); ofs = pa - ppa; if (sz > (gran - ofs)) { i--; e.mas1 = 0; e.mas2 = 0; e.mas3 = 0; tlb1_write_entry(&e, i); } } extern struct dump_pa dump_map[PHYS_AVAIL_SZ + 1]; void mmu_booke_scan_init(mmu_t mmu) { vm_offset_t va; pte_t *pte; int i; if (!do_minidump) { /* Initialize phys. segments for dumpsys(). */ memset(&dump_map, 0, sizeof(dump_map)); mem_regions(&physmem_regions, &physmem_regions_sz, &availmem_regions, &availmem_regions_sz); for (i = 0; i < physmem_regions_sz; i++) { dump_map[i].pa_start = physmem_regions[i].mr_start; dump_map[i].pa_size = physmem_regions[i].mr_size; } return; } /* Virtual segments for minidumps: */ memset(&dump_map, 0, sizeof(dump_map)); /* 1st: kernel .data and .bss. */ dump_map[0].pa_start = trunc_page((uintptr_t)_etext); dump_map[0].pa_size = round_page((uintptr_t)_end) - dump_map[0].pa_start; /* 2nd: msgbuf and tables (see pmap_bootstrap()). */ dump_map[1].pa_start = data_start; dump_map[1].pa_size = data_end - data_start; /* 3rd: kernel VM. */ va = dump_map[1].pa_start + dump_map[1].pa_size; /* Find start of next chunk (from va). */ while (va < virtual_end) { /* Don't dump the buffer cache. */ if (va >= kmi.buffer_sva && va < kmi.buffer_eva) { va = kmi.buffer_eva; continue; } pte = pte_find(mmu, kernel_pmap, va); if (pte != NULL && PTE_ISVALID(pte)) break; va += PAGE_SIZE; } if (va < virtual_end) { dump_map[2].pa_start = va; va += PAGE_SIZE; /* Find last page in chunk. */ while (va < virtual_end) { /* Don't run into the buffer cache. */ if (va == kmi.buffer_sva) break; pte = pte_find(mmu, kernel_pmap, va); if (pte == NULL || !PTE_ISVALID(pte)) break; va += PAGE_SIZE; } dump_map[2].pa_size = va - dump_map[2].pa_start; } } /* * Map a set of physical memory pages into the kernel virtual address space. * Return a pointer to where it is mapped. This routine is intended to be used * for mapping device memory, NOT real memory. */ static void * mmu_booke_mapdev(mmu_t mmu, vm_paddr_t pa, vm_size_t size) { return (mmu_booke_mapdev_attr(mmu, pa, size, VM_MEMATTR_DEFAULT)); } static int tlb1_find_pa(vm_paddr_t pa, tlb_entry_t *e) { int i; for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(e, i); if ((e->mas1 & MAS1_VALID) == 0) return (i); } return (-1); } static void * mmu_booke_mapdev_attr(mmu_t mmu, vm_paddr_t pa, vm_size_t size, vm_memattr_t ma) { tlb_entry_t e; vm_paddr_t tmppa; void *res; uintptr_t va, tmpva; vm_size_t sz; int i; int wimge; /* * Check if this is premapped in TLB1. */ sz = size; tmppa = pa; va = ~0; wimge = tlb_calc_wimg(pa, ma); for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (wimge != (e.mas2 & (MAS2_WIMGE_MASK & ~_TLB_ENTRY_SHARED))) continue; if (tmppa >= e.phys && tmppa < e.phys + e.size) { va = e.virt + (pa - e.phys); tmppa = e.phys + e.size; sz -= MIN(sz, e.size); while (sz > 0 && (i = tlb1_find_pa(tmppa, &e)) != -1) { if (wimge != (e.mas2 & (MAS2_WIMGE_MASK & ~_TLB_ENTRY_SHARED))) break; sz -= MIN(sz, e.size); tmppa = e.phys + e.size; } if (sz != 0) break; return ((void *)va); } } size = roundup(size, PAGE_SIZE); /* * The device mapping area is between VM_MAXUSER_ADDRESS and * VM_MIN_KERNEL_ADDRESS. This gives 1GB of device addressing. */ #ifdef SPARSE_MAPDEV /* * With a sparse mapdev, align to the largest starting region. This * could feasibly be optimized for a 'best-fit' alignment, but that * calculation could be very costly. * Align to the smaller of: * - first set bit in overlap of (pa & size mask) * - largest size envelope * * It's possible the device mapping may start at a PA that's not larger * than the size mask, so we need to offset in to maximize the TLB entry * range and minimize the number of used TLB entries. */ do { tmpva = tlb1_map_base; sz = ffsl((~((1 << flsl(size-1)) - 1)) & pa); sz = sz ? min(roundup(sz + 3, 4), flsl(size) - 1) : flsl(size) - 1; va = roundup(tlb1_map_base, 1 << sz) | (((1 << sz) - 1) & pa); #ifdef __powerpc64__ } while (!atomic_cmpset_long(&tlb1_map_base, tmpva, va + size)); #else } while (!atomic_cmpset_int(&tlb1_map_base, tmpva, va + size)); #endif #else #ifdef __powerpc64__ va = atomic_fetchadd_long(&tlb1_map_base, size); #else va = atomic_fetchadd_int(&tlb1_map_base, size); #endif #endif res = (void *)va; if (tlb1_mapin_region(va, pa, size, tlb_calc_wimg(pa, ma)) != size) return (NULL); return (res); } /* * 'Unmap' a range mapped by mmu_booke_mapdev(). */ static void mmu_booke_unmapdev(mmu_t mmu, vm_offset_t va, vm_size_t size) { #ifdef SUPPORTS_SHRINKING_TLB1 vm_offset_t base, offset; /* * Unmap only if this is inside kernel virtual space. */ if ((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)) { base = trunc_page(va); offset = va & PAGE_MASK; size = roundup(offset + size, PAGE_SIZE); kva_free(base, size); } #endif } /* * mmu_booke_object_init_pt preloads the ptes for a given object into the * specified pmap. This eliminates the blast of soft faults on process startup * and immediately after an mmap. */ static void mmu_booke_object_init_pt(mmu_t mmu, pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("mmu_booke_object_init_pt: non-device object")); } /* * Perform the pmap work for mincore. */ static int mmu_booke_mincore(mmu_t mmu, pmap_t pmap, vm_offset_t addr, vm_paddr_t *pap) { /* XXX: this should be implemented at some point */ return (0); } static int mmu_booke_change_attr(mmu_t mmu, vm_offset_t addr, vm_size_t sz, vm_memattr_t mode) { vm_offset_t va; pte_t *pte; int i, j; tlb_entry_t e; addr = trunc_page(addr); /* Only allow changes to mapped kernel addresses. This includes: * - KVA * - DMAP (powerpc64) * - Device mappings */ if (addr <= VM_MAXUSER_ADDRESS || #ifdef __powerpc64__ (addr >= tlb1_map_base && addr < DMAP_BASE_ADDRESS) || (addr > DMAP_MAX_ADDRESS && addr < VM_MIN_KERNEL_ADDRESS) || #else (addr >= tlb1_map_base && addr < VM_MIN_KERNEL_ADDRESS) || #endif (addr > VM_MAX_KERNEL_ADDRESS)) return (EINVAL); /* Check TLB1 mappings */ for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (addr >= e.virt && addr < e.virt + e.size) break; } if (i < TLB1_ENTRIES) { /* Only allow full mappings to be modified for now. */ /* Validate the range. */ for (j = i, va = addr; va < addr + sz; va += e.size, j++) { tlb1_read_entry(&e, j); if (va != e.virt || (sz - (va - addr) < e.size)) return (EINVAL); } for (va = addr; va < addr + sz; va += e.size, i++) { tlb1_read_entry(&e, i); e.mas2 &= ~MAS2_WIMGE_MASK; e.mas2 |= tlb_calc_wimg(e.phys, mode); /* * Write it out to the TLB. Should really re-sync with other * cores. */ tlb1_write_entry(&e, i); } return (0); } /* Not in TLB1, try through pmap */ /* First validate the range. */ for (va = addr; va < addr + sz; va += PAGE_SIZE) { pte = pte_find(mmu, kernel_pmap, va); if (pte == NULL || !PTE_ISVALID(pte)) return (EINVAL); } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); for (va = addr; va < addr + sz; va += PAGE_SIZE) { pte = pte_find(mmu, kernel_pmap, va); *pte &= ~(PTE_MAS2_MASK << PTE_MAS2_SHIFT); *pte |= tlb_calc_wimg(PTE_PA(pte), mode) << PTE_MAS2_SHIFT; tlb0_flush_entry(va); } tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); return (0); } static void mmu_booke_page_array_startup(mmu_t mmu, long pages) { vm_page_array_size = pages; } /**************************************************************************/ /* TID handling */ /**************************************************************************/ /* * Allocate a TID. If necessary, steal one from someone else. * The new TID is flushed from the TLB before returning. */ static tlbtid_t tid_alloc(pmap_t pmap) { tlbtid_t tid; int thiscpu; KASSERT((pmap != kernel_pmap), ("tid_alloc: kernel pmap")); CTR2(KTR_PMAP, "%s: s (pmap = %p)", __func__, pmap); thiscpu = PCPU_GET(cpuid); tid = PCPU_GET(booke.tid_next); if (tid > TID_MAX) tid = TID_MIN; PCPU_SET(booke.tid_next, tid + 1); /* If we are stealing TID then clear the relevant pmap's field */ if (tidbusy[thiscpu][tid] != NULL) { CTR2(KTR_PMAP, "%s: warning: stealing tid %d", __func__, tid); tidbusy[thiscpu][tid]->pm_tid[thiscpu] = TID_NONE; /* Flush all entries from TLB0 matching this TID. */ tid_flush(tid); } tidbusy[thiscpu][tid] = pmap; pmap->pm_tid[thiscpu] = tid; __asm __volatile("msync; isync"); CTR3(KTR_PMAP, "%s: e (%02d next = %02d)", __func__, tid, PCPU_GET(booke.tid_next)); return (tid); } /**************************************************************************/ /* TLB0 handling */ /**************************************************************************/ /* Convert TLB0 va and way number to tlb0[] table index. */ static inline unsigned int tlb0_tableidx(vm_offset_t va, unsigned int way) { unsigned int idx; idx = (way * TLB0_ENTRIES_PER_WAY); idx += (va & MAS2_TLB0_ENTRY_IDX_MASK) >> MAS2_TLB0_ENTRY_IDX_SHIFT; return (idx); } /* * Invalidate TLB0 entry. */ static inline void tlb0_flush_entry(vm_offset_t va) { CTR2(KTR_PMAP, "%s: s va=0x%08x", __func__, va); mtx_assert(&tlbivax_mutex, MA_OWNED); __asm __volatile("tlbivax 0, %0" :: "r"(va & MAS2_EPN_MASK)); __asm __volatile("isync; msync"); __asm __volatile("tlbsync; msync"); CTR1(KTR_PMAP, "%s: e", __func__); } /**************************************************************************/ /* TLB1 handling */ /**************************************************************************/ /* * TLB1 mapping notes: * * TLB1[0] Kernel text and data. * TLB1[1-15] Additional kernel text and data mappings (if required), PCI * windows, other devices mappings. */ /* * Read an entry from given TLB1 slot. */ void tlb1_read_entry(tlb_entry_t *entry, unsigned int slot) { register_t msr; uint32_t mas0; KASSERT((entry != NULL), ("%s(): Entry is NULL!", __func__)); msr = mfmsr(); __asm __volatile("wrteei 0"); mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(slot); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); entry->mas1 = mfspr(SPR_MAS1); entry->mas2 = mfspr(SPR_MAS2); entry->mas3 = mfspr(SPR_MAS3); switch ((mfpvr() >> 16) & 0xFFFF) { case FSL_E500v2: case FSL_E500mc: case FSL_E5500: case FSL_E6500: entry->mas7 = mfspr(SPR_MAS7); break; default: entry->mas7 = 0; break; } __asm __volatile("wrtee %0" :: "r"(msr)); entry->virt = entry->mas2 & MAS2_EPN_MASK; entry->phys = ((vm_paddr_t)(entry->mas7 & MAS7_RPN) << 32) | (entry->mas3 & MAS3_RPN); entry->size = tsize2size((entry->mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT); } struct tlbwrite_args { tlb_entry_t *e; unsigned int idx; }; static uint32_t tlb1_find_free(void) { tlb_entry_t e; int i; for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if ((e.mas1 & MAS1_VALID) == 0) return (i); } return (-1); } static void tlb1_write_entry_int(void *arg) { struct tlbwrite_args *args = arg; uint32_t idx, mas0; idx = args->idx; if (idx == -1) { idx = tlb1_find_free(); if (idx == -1) panic("No free TLB1 entries!\n"); } /* Select entry */ mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(idx); mtspr(SPR_MAS0, mas0); mtspr(SPR_MAS1, args->e->mas1); mtspr(SPR_MAS2, args->e->mas2); mtspr(SPR_MAS3, args->e->mas3); switch ((mfpvr() >> 16) & 0xFFFF) { case FSL_E500mc: case FSL_E5500: case FSL_E6500: mtspr(SPR_MAS8, 0); /* FALLTHROUGH */ case FSL_E500v2: mtspr(SPR_MAS7, args->e->mas7); break; default: break; } __asm __volatile("isync; tlbwe; isync; msync"); } static void tlb1_write_entry_sync(void *arg) { /* Empty synchronization point for smp_rendezvous(). */ } /* * Write given entry to TLB1 hardware. */ static void tlb1_write_entry(tlb_entry_t *e, unsigned int idx) { struct tlbwrite_args args; args.e = e; args.idx = idx; #ifdef SMP if ((e->mas2 & _TLB_ENTRY_SHARED) && smp_started) { mb(); smp_rendezvous(tlb1_write_entry_sync, tlb1_write_entry_int, tlb1_write_entry_sync, &args); } else #endif { register_t msr; msr = mfmsr(); __asm __volatile("wrteei 0"); tlb1_write_entry_int(&args); __asm __volatile("wrtee %0" :: "r"(msr)); } } /* * Return the largest uint value log such that 2^log <= num. */ static unsigned long ilog2(unsigned long num) { long lz; #ifdef __powerpc64__ __asm ("cntlzd %0, %1" : "=r" (lz) : "r" (num)); return (63 - lz); #else __asm ("cntlzw %0, %1" : "=r" (lz) : "r" (num)); return (31 - lz); #endif } /* * Convert TLB TSIZE value to mapped region size. */ static vm_size_t tsize2size(unsigned int tsize) { /* * size = 4^tsize KB * size = 4^tsize * 2^10 = 2^(2 * tsize - 10) */ return ((1 << (2 * tsize)) * 1024); } /* * Convert region size (must be power of 4) to TLB TSIZE value. */ static unsigned int size2tsize(vm_size_t size) { return (ilog2(size) / 2 - 5); } /* * Register permanent kernel mapping in TLB1. * * Entries are created starting from index 0 (current free entry is * kept in tlb1_idx) and are not supposed to be invalidated. */ int tlb1_set_entry(vm_offset_t va, vm_paddr_t pa, vm_size_t size, uint32_t flags) { tlb_entry_t e; uint32_t ts, tid; int tsize, index; /* First try to update an existing entry. */ for (index = 0; index < TLB1_ENTRIES; index++) { tlb1_read_entry(&e, index); /* Check if we're just updating the flags, and update them. */ if (e.phys == pa && e.virt == va && e.size == size) { e.mas2 = (va & MAS2_EPN_MASK) | flags; tlb1_write_entry(&e, index); return (0); } } /* Convert size to TSIZE */ tsize = size2tsize(size); tid = (TID_KERNEL << MAS1_TID_SHIFT) & MAS1_TID_MASK; /* XXX TS is hard coded to 0 for now as we only use single address space */ ts = (0 << MAS1_TS_SHIFT) & MAS1_TS_MASK; e.phys = pa; e.virt = va; e.size = size; e.mas1 = MAS1_VALID | MAS1_IPROT | ts | tid; e.mas1 |= ((tsize << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK); e.mas2 = (va & MAS2_EPN_MASK) | flags; /* Set supervisor RWX permission bits */ e.mas3 = (pa & MAS3_RPN) | MAS3_SR | MAS3_SW | MAS3_SX; e.mas7 = (pa >> 32) & MAS7_RPN; tlb1_write_entry(&e, -1); return (0); } /* * Map in contiguous RAM region into the TLB1. */ static vm_size_t tlb1_mapin_region(vm_offset_t va, vm_paddr_t pa, vm_size_t size, int wimge) { vm_offset_t base; vm_size_t mapped, sz, ssize; mapped = 0; base = va; ssize = size; while (size > 0) { sz = 1UL << (ilog2(size) & ~1); /* Align size to PA */ if (pa % sz != 0) { do { sz >>= 2; } while (pa % sz != 0); } /* Now align from there to VA */ if (va % sz != 0) { do { sz >>= 2; } while (va % sz != 0); } #ifdef __powerpc64__ /* * Clamp TLB1 entries to 4G. * * While the e6500 supports up to 1TB mappings, the e5500 * only supports up to 4G mappings. (0b1011) * * If any e6500 machines capable of supporting a very * large amount of memory appear in the future, we can * revisit this. * * For now, though, since we have plenty of space in TLB1, * always avoid creating entries larger than 4GB. */ sz = MIN(sz, 1UL << 32); #endif if (bootverbose) printf("Wiring VA=%p to PA=%jx (size=%lx)\n", (void *)va, (uintmax_t)pa, (long)sz); if (tlb1_set_entry(va, pa, sz, _TLB_ENTRY_SHARED | wimge) < 0) return (mapped); size -= sz; pa += sz; va += sz; } mapped = (va - base); if (bootverbose) printf("mapped size 0x%"PRIxPTR" (wasted space 0x%"PRIxPTR")\n", mapped, mapped - ssize); return (mapped); } /* * TLB1 initialization routine, to be called after the very first * assembler level setup done in locore.S. */ void tlb1_init() { vm_offset_t mas2; uint32_t mas0, mas1, mas3, mas7; uint32_t tsz; tlb1_get_tlbconf(); mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(0); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); kernload = ((vm_paddr_t)(mas7 & MAS7_RPN) << 32) | (mas3 & MAS3_RPN); tsz = (mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; kernsize += (tsz > 0) ? tsize2size(tsz) : 0; kernstart = trunc_page(mas2); /* Setup TLB miss defaults */ set_mas4_defaults(); } /* * pmap_early_io_unmap() should be used in short conjunction with * pmap_early_io_map(), as in the following snippet: * * x = pmap_early_io_map(...); * * pmap_early_io_unmap(x, size); * * And avoiding more allocations between. */ void pmap_early_io_unmap(vm_offset_t va, vm_size_t size) { int i; tlb_entry_t e; vm_size_t isize; size = roundup(size, PAGE_SIZE); isize = size; for (i = 0; i < TLB1_ENTRIES && size > 0; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (va <= e.virt && (va + isize) >= (e.virt + e.size)) { size -= e.size; e.mas1 &= ~MAS1_VALID; tlb1_write_entry(&e, i); } } if (tlb1_map_base == va + isize) tlb1_map_base -= isize; } vm_offset_t pmap_early_io_map(vm_paddr_t pa, vm_size_t size) { vm_paddr_t pa_base; vm_offset_t va, sz; int i; tlb_entry_t e; KASSERT(!pmap_bootstrapped, ("Do not use after PMAP is up!")); for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (pa >= e.phys && (pa + size) <= (e.phys + e.size)) return (e.virt + (pa - e.phys)); } pa_base = rounddown(pa, PAGE_SIZE); size = roundup(size + (pa - pa_base), PAGE_SIZE); tlb1_map_base = roundup2(tlb1_map_base, 1 << (ilog2(size) & ~1)); va = tlb1_map_base + (pa - pa_base); do { sz = 1 << (ilog2(size) & ~1); tlb1_set_entry(tlb1_map_base, pa_base, sz, _TLB_ENTRY_SHARED | _TLB_ENTRY_IO); size -= sz; pa_base += sz; tlb1_map_base += sz; } while (size > 0); return (va); } void pmap_track_page(pmap_t pmap, vm_offset_t va) { vm_paddr_t pa; vm_page_t page; struct pv_entry *pve; va = trunc_page(va); pa = pmap_kextract(va); page = PHYS_TO_VM_PAGE(pa); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); TAILQ_FOREACH(pve, &page->md.pv_list, pv_link) { if ((pmap == pve->pv_pmap) && (va == pve->pv_va)) { goto out; } } page->md.pv_tracked = true; pv_insert(pmap, va, page); out: PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); } /* * Setup MAS4 defaults. * These values are loaded to MAS0-2 on a TLB miss. */ static void set_mas4_defaults(void) { uint32_t mas4; /* Defaults: TLB0, PID0, TSIZED=4K */ mas4 = MAS4_TLBSELD0; mas4 |= (TLB_SIZE_4K << MAS4_TSIZED_SHIFT) & MAS4_TSIZED_MASK; #ifdef SMP mas4 |= MAS4_MD; #endif mtspr(SPR_MAS4, mas4); __asm __volatile("isync"); } /* * Return 0 if the physical IO range is encompassed by one of the * the TLB1 entries, otherwise return related error code. */ static int tlb1_iomapped(int i, vm_paddr_t pa, vm_size_t size, vm_offset_t *va) { uint32_t prot; vm_paddr_t pa_start; vm_paddr_t pa_end; unsigned int entry_tsize; vm_size_t entry_size; tlb_entry_t e; *va = (vm_offset_t)NULL; tlb1_read_entry(&e, i); /* Skip invalid entries */ if (!(e.mas1 & MAS1_VALID)) return (EINVAL); /* * The entry must be cache-inhibited, guarded, and r/w * so it can function as an i/o page */ prot = e.mas2 & (MAS2_I | MAS2_G); if (prot != (MAS2_I | MAS2_G)) return (EPERM); prot = e.mas3 & (MAS3_SR | MAS3_SW); if (prot != (MAS3_SR | MAS3_SW)) return (EPERM); /* The address should be within the entry range. */ entry_tsize = (e.mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; KASSERT((entry_tsize), ("tlb1_iomapped: invalid entry tsize")); entry_size = tsize2size(entry_tsize); pa_start = (((vm_paddr_t)e.mas7 & MAS7_RPN) << 32) | (e.mas3 & MAS3_RPN); pa_end = pa_start + entry_size; if ((pa < pa_start) || ((pa + size) > pa_end)) return (ERANGE); /* Return virtual address of this mapping. */ *va = (e.mas2 & MAS2_EPN_MASK) + (pa - pa_start); return (0); } /* * Invalidate all TLB0 entries which match the given TID. Note this is * dedicated for cases when invalidations should NOT be propagated to other * CPUs. */ static void tid_flush(tlbtid_t tid) { register_t msr; uint32_t mas0, mas1, mas2; int entry, way; /* Don't evict kernel translations */ if (tid == TID_KERNEL) return; msr = mfmsr(); __asm __volatile("wrteei 0"); /* * Newer (e500mc and later) have tlbilx, which doesn't broadcast, so use * it for PID invalidation. */ switch ((mfpvr() >> 16) & 0xffff) { case FSL_E500mc: case FSL_E5500: case FSL_E6500: mtspr(SPR_MAS6, tid << MAS6_SPID0_SHIFT); /* tlbilxpid */ __asm __volatile("isync; .long 0x7c200024; isync; msync"); __asm __volatile("wrtee %0" :: "r"(msr)); return; } for (way = 0; way < TLB0_WAYS; way++) for (entry = 0; entry < TLB0_ENTRIES_PER_WAY; entry++) { mas0 = MAS0_TLBSEL(0) | MAS0_ESEL(way); mtspr(SPR_MAS0, mas0); mas2 = entry << MAS2_TLB0_ENTRY_IDX_SHIFT; mtspr(SPR_MAS2, mas2); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); if (!(mas1 & MAS1_VALID)) continue; if (((mas1 & MAS1_TID_MASK) >> MAS1_TID_SHIFT) != tid) continue; mas1 &= ~MAS1_VALID; mtspr(SPR_MAS1, mas1); __asm __volatile("isync; tlbwe; isync; msync"); } __asm __volatile("wrtee %0" :: "r"(msr)); } #ifdef DDB /* Print out contents of the MAS registers for each TLB0 entry */ static void #ifdef __powerpc64__ tlb_print_entry(int i, uint32_t mas1, uint64_t mas2, uint32_t mas3, #else tlb_print_entry(int i, uint32_t mas1, uint32_t mas2, uint32_t mas3, #endif uint32_t mas7) { int as; char desc[3]; tlbtid_t tid; vm_size_t size; unsigned int tsize; desc[2] = '\0'; if (mas1 & MAS1_VALID) desc[0] = 'V'; else desc[0] = ' '; if (mas1 & MAS1_IPROT) desc[1] = 'P'; else desc[1] = ' '; as = (mas1 & MAS1_TS_MASK) ? 1 : 0; tid = MAS1_GETTID(mas1); tsize = (mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; size = 0; if (tsize) size = tsize2size(tsize); printf("%3d: (%s) [AS=%d] " "sz = 0x%jx tsz = %d tid = %d mas1 = 0x%08x " "mas2(va) = 0x%"PRI0ptrX" mas3(pa) = 0x%08x mas7 = 0x%08x\n", i, desc, as, (uintmax_t)size, tsize, tid, mas1, mas2, mas3, mas7); } DB_SHOW_COMMAND(tlb0, tlb0_print_tlbentries) { uint32_t mas0, mas1, mas3, mas7; #ifdef __powerpc64__ uint64_t mas2; #else uint32_t mas2; #endif int entryidx, way, idx; printf("TLB0 entries:\n"); for (way = 0; way < TLB0_WAYS; way ++) for (entryidx = 0; entryidx < TLB0_ENTRIES_PER_WAY; entryidx++) { mas0 = MAS0_TLBSEL(0) | MAS0_ESEL(way); mtspr(SPR_MAS0, mas0); mas2 = entryidx << MAS2_TLB0_ENTRY_IDX_SHIFT; mtspr(SPR_MAS2, mas2); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); idx = tlb0_tableidx(mas2, way); tlb_print_entry(idx, mas1, mas2, mas3, mas7); } } /* * Print out contents of the MAS registers for each TLB1 entry */ DB_SHOW_COMMAND(tlb1, tlb1_print_tlbentries) { uint32_t mas0, mas1, mas3, mas7; #ifdef __powerpc64__ uint64_t mas2; #else uint32_t mas2; #endif int i; printf("TLB1 entries:\n"); for (i = 0; i < TLB1_ENTRIES; i++) { mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(i); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); tlb_print_entry(i, mas1, mas2, mas3, mas7); } } #endif Index: projects/clang1000-import/sys/security/audit/audit.h =================================================================== --- projects/clang1000-import/sys/security/audit/audit.h (revision 358238) +++ projects/clang1000-import/sys/security/audit/audit.h (revision 358239) @@ -1,462 +1,478 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1999-2005 Apple Inc. * Copyright (c) 2016-2018 Robert N. M. Watson * All rights reserved. * * This software was developed by BAE Systems, the University of Cambridge * Computer Laboratory, and Memorial University under DARPA/AFRL contract * FA8650-15-C-7558 ("CADETS"), as part of the DARPA Transparent Computing * (TC) research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of Apple Inc. ("Apple") nor the names of * its contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * This header includes function prototypes and type definitions that are * necessary for the kernel as a whole to interact with the audit subsystem. */ #ifndef _SECURITY_AUDIT_KERNEL_H_ #define _SECURITY_AUDIT_KERNEL_H_ #ifndef _KERNEL #error "no user-serviceable parts inside" #endif #include #include #include /* * Audit subsystem condition flags. The audit_trail_enabled flag is set and * removed automatically as a result of configuring log files, and can be * observed but should not be directly manipulated. The audit suspension * flag permits audit to be temporarily disabled without reconfiguring the * audit target. * * As DTrace can also request system-call auditing, a further * audit_syscalls_enabled flag tracks whether newly entering system calls * should be considered for auditing or not. * * XXXRW: Move trail flags to audit_private.h, as they no longer need to be * visible outside the audit code...? */ extern u_int audit_dtrace_enabled; extern int audit_trail_enabled; extern int audit_trail_suspended; extern bool audit_syscalls_enabled; void audit_syscall_enter(unsigned short code, struct thread *td); void audit_syscall_exit(int error, struct thread *td); /* * The remaining kernel functions are conditionally compiled in as they are * wrapped by a macro, and the macro should be the only place in the source * tree where these functions are referenced. */ #ifdef AUDIT struct ipc_perm; struct sockaddr; union auditon_udata; void audit_arg_addr(void * addr); void audit_arg_exit(int status, int retval); void audit_arg_len(int len); void audit_arg_atfd1(int atfd); void audit_arg_atfd2(int atfd); void audit_arg_fd(int fd); void audit_arg_fflags(int fflags); void audit_arg_gid(gid_t gid); void audit_arg_uid(uid_t uid); void audit_arg_egid(gid_t egid); void audit_arg_euid(uid_t euid); void audit_arg_rgid(gid_t rgid); void audit_arg_ruid(uid_t ruid); void audit_arg_sgid(gid_t sgid); void audit_arg_suid(uid_t suid); void audit_arg_groupset(gid_t *gidset, u_int gidset_size); void audit_arg_login(char *login); void audit_arg_ctlname(int *name, int namelen); void audit_arg_mask(int mask); void audit_arg_mode(mode_t mode); void audit_arg_dev(int dev); void audit_arg_value(long value); void audit_arg_owner(uid_t uid, gid_t gid); void audit_arg_pid(pid_t pid); void audit_arg_process(struct proc *p); void audit_arg_signum(u_int signum); void audit_arg_socket(int sodomain, int sotype, int soprotocol); void audit_arg_sockaddr(struct thread *td, int dirfd, struct sockaddr *sa); void audit_arg_auid(uid_t auid); void audit_arg_auditinfo(struct auditinfo *au_info); void audit_arg_auditinfo_addr(struct auditinfo_addr *au_info); void audit_arg_upath1(struct thread *td, int dirfd, char *upath); void audit_arg_upath1_canon(char *upath); void audit_arg_upath2(struct thread *td, int dirfd, char *upath); void audit_arg_upath2_canon(char *upath); +void audit_arg_upath1_vp(struct thread *td, struct vnode *rdir, + struct vnode *cdir, char *upath); +void audit_arg_upath2_vp(struct thread *td, struct vnode *rdir, + struct vnode *cdir, char *upath); void audit_arg_vnode1(struct vnode *vp); void audit_arg_vnode2(struct vnode *vp); void audit_arg_text(const char *text); void audit_arg_cmd(int cmd); void audit_arg_svipc_cmd(int cmd); void audit_arg_svipc_perm(struct ipc_perm *perm); void audit_arg_svipc_id(int id); void audit_arg_svipc_addr(void *addr); void audit_arg_svipc_which(int which); void audit_arg_posix_ipc_perm(uid_t uid, gid_t gid, mode_t mode); void audit_arg_auditon(union auditon_udata *udata); void audit_arg_file(struct proc *p, struct file *fp); void audit_arg_argv(char *argv, int argc, int length); void audit_arg_envv(char *envv, int envc, int length); void audit_arg_rights(cap_rights_t *rightsp); void audit_arg_fcntl_rights(uint32_t fcntlrights); void audit_sysclose(struct thread *td, int fd); void audit_cred_copy(struct ucred *src, struct ucred *dest); void audit_cred_destroy(struct ucred *cred); void audit_cred_init(struct ucred *cred); void audit_cred_kproc0(struct ucred *cred); void audit_cred_proc1(struct ucred *cred); void audit_proc_coredump(struct thread *td, char *path, int errcode); void audit_thread_alloc(struct thread *td); void audit_thread_free(struct thread *td); /* * Define macros to wrap the audit_arg_* calls by checking the global * audit_syscalls_enabled flag before performing the actual call. */ #define AUDITING_TD(td) (__predict_false((td)->td_pflags & TDP_AUDITREC)) #define AUDIT_ARG_ADDR(addr) do { \ if (AUDITING_TD(curthread)) \ audit_arg_addr((addr)); \ } while (0) #define AUDIT_ARG_ARGV(argv, argc, length) do { \ if (AUDITING_TD(curthread)) \ audit_arg_argv((argv), (argc), (length)); \ } while (0) #define AUDIT_ARG_ATFD1(atfd) do { \ if (AUDITING_TD(curthread)) \ audit_arg_atfd1((atfd)); \ } while (0) #define AUDIT_ARG_ATFD2(atfd) do { \ if (AUDITING_TD(curthread)) \ audit_arg_atfd2((atfd)); \ } while (0) #define AUDIT_ARG_AUDITON(udata) do { \ if (AUDITING_TD(curthread)) \ audit_arg_auditon((udata)); \ } while (0) #define AUDIT_ARG_CMD(cmd) do { \ if (AUDITING_TD(curthread)) \ audit_arg_cmd((cmd)); \ } while (0) #define AUDIT_ARG_DEV(dev) do { \ if (AUDITING_TD(curthread)) \ audit_arg_dev((dev)); \ } while (0) #define AUDIT_ARG_EGID(egid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_egid((egid)); \ } while (0) #define AUDIT_ARG_ENVV(envv, envc, length) do { \ if (AUDITING_TD(curthread)) \ audit_arg_envv((envv), (envc), (length)); \ } while (0) #define AUDIT_ARG_EXIT(status, retval) do { \ if (AUDITING_TD(curthread)) \ audit_arg_exit((status), (retval)); \ } while (0) #define AUDIT_ARG_EUID(euid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_euid((euid)); \ } while (0) #define AUDIT_ARG_FD(fd) do { \ if (AUDITING_TD(curthread)) \ audit_arg_fd((fd)); \ } while (0) #define AUDIT_ARG_FILE(p, fp) do { \ if (AUDITING_TD(curthread)) \ audit_arg_file((p), (fp)); \ } while (0) #define AUDIT_ARG_FFLAGS(fflags) do { \ if (AUDITING_TD(curthread)) \ audit_arg_fflags((fflags)); \ } while (0) #define AUDIT_ARG_GID(gid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_gid((gid)); \ } while (0) #define AUDIT_ARG_GROUPSET(gidset, gidset_size) do { \ if (AUDITING_TD(curthread)) \ audit_arg_groupset((gidset), (gidset_size)); \ } while (0) #define AUDIT_ARG_LOGIN(login) do { \ if (AUDITING_TD(curthread)) \ audit_arg_login((login)); \ } while (0) #define AUDIT_ARG_MODE(mode) do { \ if (AUDITING_TD(curthread)) \ audit_arg_mode((mode)); \ } while (0) #define AUDIT_ARG_OWNER(uid, gid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_owner((uid), (gid)); \ } while (0) #define AUDIT_ARG_PID(pid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_pid((pid)); \ } while (0) #define AUDIT_ARG_POSIX_IPC_PERM(uid, gid, mode) do { \ if (AUDITING_TD(curthread)) \ audit_arg_posix_ipc_perm((uid), (gid), (mod)); \ } while (0) #define AUDIT_ARG_PROCESS(p) do { \ if (AUDITING_TD(curthread)) \ audit_arg_process((p)); \ } while (0) #define AUDIT_ARG_RGID(rgid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_rgid((rgid)); \ } while (0) #define AUDIT_ARG_RIGHTS(rights) do { \ if (AUDITING_TD(curthread)) \ audit_arg_rights((rights)); \ } while (0) #define AUDIT_ARG_FCNTL_RIGHTS(fcntlrights) do { \ if (AUDITING_TD(curthread)) \ audit_arg_fcntl_rights((fcntlrights)); \ } while (0) #define AUDIT_ARG_RUID(ruid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_ruid((ruid)); \ } while (0) #define AUDIT_ARG_SIGNUM(signum) do { \ if (AUDITING_TD(curthread)) \ audit_arg_signum((signum)); \ } while (0) #define AUDIT_ARG_SGID(sgid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_sgid((sgid)); \ } while (0) #define AUDIT_ARG_SOCKET(sodomain, sotype, soprotocol) do { \ if (AUDITING_TD(curthread)) \ audit_arg_socket((sodomain), (sotype), (soprotocol)); \ } while (0) #define AUDIT_ARG_SOCKADDR(td, dirfd, sa) do { \ if (AUDITING_TD(curthread)) \ audit_arg_sockaddr((td), (dirfd), (sa)); \ } while (0) #define AUDIT_ARG_SUID(suid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_suid((suid)); \ } while (0) #define AUDIT_ARG_SVIPC_CMD(cmd) do { \ if (AUDITING_TD(curthread)) \ audit_arg_svipc_cmd((cmd)); \ } while (0) #define AUDIT_ARG_SVIPC_PERM(perm) do { \ if (AUDITING_TD(curthread)) \ audit_arg_svipc_perm((perm)); \ } while (0) #define AUDIT_ARG_SVIPC_ID(id) do { \ if (AUDITING_TD(curthread)) \ audit_arg_svipc_id((id)); \ } while (0) #define AUDIT_ARG_SVIPC_ADDR(addr) do { \ if (AUDITING_TD(curthread)) \ audit_arg_svipc_addr((addr)); \ } while (0) #define AUDIT_ARG_SVIPC_WHICH(which) do { \ if (AUDITING_TD(curthread)) \ audit_arg_svipc_which((which)); \ } while (0) #define AUDIT_ARG_TEXT(text) do { \ if (AUDITING_TD(curthread)) \ audit_arg_text((text)); \ } while (0) #define AUDIT_ARG_UID(uid) do { \ if (AUDITING_TD(curthread)) \ audit_arg_uid((uid)); \ } while (0) #define AUDIT_ARG_UPATH1(td, dirfd, upath) do { \ if (AUDITING_TD(curthread)) \ audit_arg_upath1((td), (dirfd), (upath)); \ } while (0) #define AUDIT_ARG_UPATH1_CANON(upath) do { \ if (AUDITING_TD(curthread)) \ audit_arg_upath1_canon((upath)); \ } while (0) #define AUDIT_ARG_UPATH2(td, dirfd, upath) do { \ if (AUDITING_TD(curthread)) \ audit_arg_upath2((td), (dirfd), (upath)); \ } while (0) #define AUDIT_ARG_UPATH2_CANON(upath) do { \ if (AUDITING_TD(curthread)) \ audit_arg_upath2_canon((upath)); \ } while (0) +#define AUDIT_ARG_UPATH1_VP(td, rdir, cdir, upath) do { \ + if (AUDITING_TD(curthread)) \ + audit_arg_upath1_vp((td), (rdir), (cdir), (upath)); \ +} while (0) + +#define AUDIT_ARG_UPATH2_VP(td, rdir, cdir, upath) do { \ + if (AUDITING_TD(curthread)) \ + audit_arg_upath2_vp((td), (rdir), (cdir), (upath)); \ +} while (0) + #define AUDIT_ARG_VALUE(value) do { \ if (AUDITING_TD(curthread)) \ audit_arg_value((value)); \ } while (0) #define AUDIT_ARG_VNODE1(vp) do { \ if (AUDITING_TD(curthread)) \ audit_arg_vnode1((vp)); \ } while (0) #define AUDIT_ARG_VNODE2(vp) do { \ if (AUDITING_TD(curthread)) \ audit_arg_vnode2((vp)); \ } while (0) #define AUDIT_SYSCALL_ENTER(code, td) ({ \ bool _audit_entered = false; \ if (__predict_false(audit_syscalls_enabled)) { \ audit_syscall_enter(code, td); \ _audit_entered = true; \ } \ _audit_entered; \ }) /* * Wrap the audit_syscall_exit() function so that it is called only when * we have a audit record on the thread. Audit records can persist after * auditing is disabled, so we don't just check audit_syscalls_enabled here. */ #define AUDIT_SYSCALL_EXIT(error, td) do { \ if (AUDITING_TD(td)) \ audit_syscall_exit(error, td); \ } while (0) /* * A Macro to wrap the audit_sysclose() function. */ #define AUDIT_SYSCLOSE(td, fd) do { \ if (AUDITING_TD(td)) \ audit_sysclose(td, fd); \ } while (0) #else /* !AUDIT */ #define AUDIT_ARG_ADDR(addr) #define AUDIT_ARG_ARGV(argv, argc, length) #define AUDIT_ARG_ATFD1(atfd) #define AUDIT_ARG_ATFD2(atfd) #define AUDIT_ARG_AUDITON(udata) #define AUDIT_ARG_CMD(cmd) #define AUDIT_ARG_DEV(dev) #define AUDIT_ARG_EGID(egid) #define AUDIT_ARG_ENVV(envv, envc, length) #define AUDIT_ARG_EXIT(status, retval) #define AUDIT_ARG_EUID(euid) #define AUDIT_ARG_FD(fd) #define AUDIT_ARG_FILE(p, fp) #define AUDIT_ARG_FFLAGS(fflags) #define AUDIT_ARG_GID(gid) #define AUDIT_ARG_GROUPSET(gidset, gidset_size) #define AUDIT_ARG_LOGIN(login) #define AUDIT_ARG_MODE(mode) #define AUDIT_ARG_OWNER(uid, gid) #define AUDIT_ARG_PID(pid) #define AUDIT_ARG_POSIX_IPC_PERM(uid, gid, mode) #define AUDIT_ARG_PROCESS(p) #define AUDIT_ARG_RGID(rgid) #define AUDIT_ARG_RIGHTS(rights) #define AUDIT_ARG_FCNTL_RIGHTS(fcntlrights) #define AUDIT_ARG_RUID(ruid) #define AUDIT_ARG_SIGNUM(signum) #define AUDIT_ARG_SGID(sgid) #define AUDIT_ARG_SOCKET(sodomain, sotype, soprotocol) #define AUDIT_ARG_SOCKADDR(td, dirfd, sa) #define AUDIT_ARG_SUID(suid) #define AUDIT_ARG_SVIPC_CMD(cmd) #define AUDIT_ARG_SVIPC_PERM(perm) #define AUDIT_ARG_SVIPC_ID(id) #define AUDIT_ARG_SVIPC_ADDR(addr) #define AUDIT_ARG_SVIPC_WHICH(which) #define AUDIT_ARG_TEXT(text) #define AUDIT_ARG_UID(uid) #define AUDIT_ARG_UPATH1(td, dirfd, upath) #define AUDIT_ARG_UPATH1_CANON(upath) #define AUDIT_ARG_UPATH2(td, dirfd, upath) #define AUDIT_ARG_UPATH2_CANON(upath) +#define AUDIT_ARG_UPATH1_VP(td, rdir, cdir, upath) +#define AUDIT_ARG_UPATH2_VP(td, rdir, cdir, upath) #define AUDIT_ARG_VALUE(value) #define AUDIT_ARG_VNODE1(vp) #define AUDIT_ARG_VNODE2(vp) #define AUDIT_SYSCALL_ENTER(code, td) 0 #define AUDIT_SYSCALL_EXIT(error, td) #define AUDIT_SYSCLOSE(p, fd) #endif /* AUDIT */ #endif /* !_SECURITY_AUDIT_KERNEL_H_ */ Index: projects/clang1000-import/sys/security/audit/audit_arg.c =================================================================== --- projects/clang1000-import/sys/security/audit/audit_arg.c (revision 358238) +++ projects/clang1000-import/sys/security/audit/audit_arg.c (revision 358239) @@ -1,983 +1,1021 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1999-2005 Apple Inc. * Copyright (c) 2016-2017 Robert N. M. Watson * All rights reserved. * * Portions of this software were developed by BAE Systems, the University of * Cambridge Computer Laboratory, and Memorial University under DARPA/AFRL * contract FA8650-15-C-7558 ("CADETS"), as part of the DARPA Transparent * Computing (TC) research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of Apple Inc. ("Apple") nor the names of * its contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Calls to manipulate elements of the audit record structure from system * call code. Macro wrappers will prevent this functions from being entered * if auditing is disabled, avoiding the function call cost. We check the * thread audit record pointer anyway, as the audit condition could change, * and pre-selection may not have allocated an audit record for this event. * * XXXAUDIT: Should we assert, in each case, that this field of the record * hasn't already been filled in? */ void audit_arg_addr(void *addr) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_addr = addr; ARG_SET_VALID(ar, ARG_ADDR); } void audit_arg_exit(int status, int retval) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_exitstatus = status; ar->k_ar.ar_arg_exitretval = retval; ARG_SET_VALID(ar, ARG_EXIT); } void audit_arg_len(int len) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_len = len; ARG_SET_VALID(ar, ARG_LEN); } void audit_arg_atfd1(int atfd) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_atfd1 = atfd; ARG_SET_VALID(ar, ARG_ATFD1); } void audit_arg_atfd2(int atfd) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_atfd2 = atfd; ARG_SET_VALID(ar, ARG_ATFD2); } void audit_arg_fd(int fd) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_fd = fd; ARG_SET_VALID(ar, ARG_FD); } void audit_arg_fflags(int fflags) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_fflags = fflags; ARG_SET_VALID(ar, ARG_FFLAGS); } void audit_arg_gid(gid_t gid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_gid = gid; ARG_SET_VALID(ar, ARG_GID); } void audit_arg_uid(uid_t uid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_uid = uid; ARG_SET_VALID(ar, ARG_UID); } void audit_arg_egid(gid_t egid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_egid = egid; ARG_SET_VALID(ar, ARG_EGID); } void audit_arg_euid(uid_t euid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_euid = euid; ARG_SET_VALID(ar, ARG_EUID); } void audit_arg_rgid(gid_t rgid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_rgid = rgid; ARG_SET_VALID(ar, ARG_RGID); } void audit_arg_ruid(uid_t ruid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_ruid = ruid; ARG_SET_VALID(ar, ARG_RUID); } void audit_arg_sgid(gid_t sgid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_sgid = sgid; ARG_SET_VALID(ar, ARG_SGID); } void audit_arg_suid(uid_t suid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_suid = suid; ARG_SET_VALID(ar, ARG_SUID); } void audit_arg_groupset(gid_t *gidset, u_int gidset_size) { u_int i; struct kaudit_record *ar; KASSERT(gidset_size <= ngroups_max + 1, ("audit_arg_groupset: gidset_size > (kern.ngroups + 1)")); ar = currecord(); if (ar == NULL) return; if (ar->k_ar.ar_arg_groups.gidset == NULL) ar->k_ar.ar_arg_groups.gidset = malloc( sizeof(gid_t) * gidset_size, M_AUDITGIDSET, M_WAITOK); for (i = 0; i < gidset_size; i++) ar->k_ar.ar_arg_groups.gidset[i] = gidset[i]; ar->k_ar.ar_arg_groups.gidset_size = gidset_size; ARG_SET_VALID(ar, ARG_GROUPSET); } void audit_arg_login(char *login) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; strlcpy(ar->k_ar.ar_arg_login, login, MAXLOGNAME); ARG_SET_VALID(ar, ARG_LOGIN); } void audit_arg_ctlname(int *name, int namelen) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; bcopy(name, &ar->k_ar.ar_arg_ctlname, namelen * sizeof(int)); ar->k_ar.ar_arg_len = namelen; ARG_SET_VALID(ar, ARG_CTLNAME | ARG_LEN); } void audit_arg_mask(int mask) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_mask = mask; ARG_SET_VALID(ar, ARG_MASK); } void audit_arg_mode(mode_t mode) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_mode = mode; ARG_SET_VALID(ar, ARG_MODE); } void audit_arg_dev(int dev) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_dev = dev; ARG_SET_VALID(ar, ARG_DEV); } void audit_arg_value(long value) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_value = value; ARG_SET_VALID(ar, ARG_VALUE); } void audit_arg_owner(uid_t uid, gid_t gid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_uid = uid; ar->k_ar.ar_arg_gid = gid; ARG_SET_VALID(ar, ARG_UID | ARG_GID); } void audit_arg_pid(pid_t pid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_pid = pid; ARG_SET_VALID(ar, ARG_PID); } void audit_arg_process(struct proc *p) { struct kaudit_record *ar; struct ucred *cred; KASSERT(p != NULL, ("audit_arg_process: p == NULL")); PROC_LOCK_ASSERT(p, MA_OWNED); ar = currecord(); if (ar == NULL) return; cred = p->p_ucred; ar->k_ar.ar_arg_auid = cred->cr_audit.ai_auid; ar->k_ar.ar_arg_euid = cred->cr_uid; ar->k_ar.ar_arg_egid = cred->cr_groups[0]; ar->k_ar.ar_arg_ruid = cred->cr_ruid; ar->k_ar.ar_arg_rgid = cred->cr_rgid; ar->k_ar.ar_arg_asid = cred->cr_audit.ai_asid; ar->k_ar.ar_arg_termid_addr = cred->cr_audit.ai_termid; ar->k_ar.ar_arg_pid = p->p_pid; ARG_SET_VALID(ar, ARG_AUID | ARG_EUID | ARG_EGID | ARG_RUID | ARG_RGID | ARG_ASID | ARG_TERMID_ADDR | ARG_PID | ARG_PROCESS); } void audit_arg_signum(u_int signum) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_signum = signum; ARG_SET_VALID(ar, ARG_SIGNUM); } void audit_arg_socket(int sodomain, int sotype, int soprotocol) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_sockinfo.so_domain = sodomain; ar->k_ar.ar_arg_sockinfo.so_type = sotype; ar->k_ar.ar_arg_sockinfo.so_protocol = soprotocol; ARG_SET_VALID(ar, ARG_SOCKINFO); } void audit_arg_sockaddr(struct thread *td, int dirfd, struct sockaddr *sa) { struct kaudit_record *ar; KASSERT(td != NULL, ("audit_arg_sockaddr: td == NULL")); KASSERT(sa != NULL, ("audit_arg_sockaddr: sa == NULL")); ar = currecord(); if (ar == NULL) return; bcopy(sa, &ar->k_ar.ar_arg_sockaddr, sa->sa_len); switch (sa->sa_family) { case AF_INET: ARG_SET_VALID(ar, ARG_SADDRINET); break; case AF_INET6: ARG_SET_VALID(ar, ARG_SADDRINET6); break; case AF_UNIX: if (dirfd != AT_FDCWD) audit_arg_atfd1(dirfd); audit_arg_upath1(td, dirfd, ((struct sockaddr_un *)sa)->sun_path); ARG_SET_VALID(ar, ARG_SADDRUNIX); break; /* XXXAUDIT: default:? */ } } void audit_arg_auid(uid_t auid) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_auid = auid; ARG_SET_VALID(ar, ARG_AUID); } void audit_arg_auditinfo(struct auditinfo *au_info) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_auid = au_info->ai_auid; ar->k_ar.ar_arg_asid = au_info->ai_asid; ar->k_ar.ar_arg_amask.am_success = au_info->ai_mask.am_success; ar->k_ar.ar_arg_amask.am_failure = au_info->ai_mask.am_failure; ar->k_ar.ar_arg_termid.port = au_info->ai_termid.port; ar->k_ar.ar_arg_termid.machine = au_info->ai_termid.machine; ARG_SET_VALID(ar, ARG_AUID | ARG_ASID | ARG_AMASK | ARG_TERMID); } void audit_arg_auditinfo_addr(struct auditinfo_addr *au_info) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_auid = au_info->ai_auid; ar->k_ar.ar_arg_asid = au_info->ai_asid; ar->k_ar.ar_arg_amask.am_success = au_info->ai_mask.am_success; ar->k_ar.ar_arg_amask.am_failure = au_info->ai_mask.am_failure; ar->k_ar.ar_arg_termid_addr.at_type = au_info->ai_termid.at_type; ar->k_ar.ar_arg_termid_addr.at_port = au_info->ai_termid.at_port; ar->k_ar.ar_arg_termid_addr.at_addr[0] = au_info->ai_termid.at_addr[0]; ar->k_ar.ar_arg_termid_addr.at_addr[1] = au_info->ai_termid.at_addr[1]; ar->k_ar.ar_arg_termid_addr.at_addr[2] = au_info->ai_termid.at_addr[2]; ar->k_ar.ar_arg_termid_addr.at_addr[3] = au_info->ai_termid.at_addr[3]; ARG_SET_VALID(ar, ARG_AUID | ARG_ASID | ARG_AMASK | ARG_TERMID_ADDR); } void audit_arg_text(const char *text) { struct kaudit_record *ar; KASSERT(text != NULL, ("audit_arg_text: text == NULL")); ar = currecord(); if (ar == NULL) return; /* Invalidate the text string */ ar->k_ar.ar_valid_arg &= (ARG_ALL ^ ARG_TEXT); if (ar->k_ar.ar_arg_text == NULL) ar->k_ar.ar_arg_text = malloc(MAXPATHLEN, M_AUDITTEXT, M_WAITOK); strncpy(ar->k_ar.ar_arg_text, text, MAXPATHLEN); ARG_SET_VALID(ar, ARG_TEXT); } void audit_arg_cmd(int cmd) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_cmd = cmd; ARG_SET_VALID(ar, ARG_CMD); } void audit_arg_svipc_cmd(int cmd) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_svipc_cmd = cmd; ARG_SET_VALID(ar, ARG_SVIPC_CMD); } void audit_arg_svipc_perm(struct ipc_perm *perm) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; bcopy(perm, &ar->k_ar.ar_arg_svipc_perm, sizeof(ar->k_ar.ar_arg_svipc_perm)); ARG_SET_VALID(ar, ARG_SVIPC_PERM); } void audit_arg_svipc_id(int id) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_svipc_id = id; ARG_SET_VALID(ar, ARG_SVIPC_ID); } void audit_arg_svipc_addr(void * addr) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_svipc_addr = addr; ARG_SET_VALID(ar, ARG_SVIPC_ADDR); } void audit_arg_svipc_which(int which) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_svipc_which = which; ARG_SET_VALID(ar, ARG_SVIPC_WHICH); } void audit_arg_posix_ipc_perm(uid_t uid, gid_t gid, mode_t mode) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_pipc_perm.pipc_uid = uid; ar->k_ar.ar_arg_pipc_perm.pipc_gid = gid; ar->k_ar.ar_arg_pipc_perm.pipc_mode = mode; ARG_SET_VALID(ar, ARG_POSIX_IPC_PERM); } void audit_arg_auditon(union auditon_udata *udata) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; bcopy((void *)udata, &ar->k_ar.ar_arg_auditon, sizeof(ar->k_ar.ar_arg_auditon)); ARG_SET_VALID(ar, ARG_AUDITON); } /* * Audit information about a file, either the file's vnode info, or its * socket address info. */ void audit_arg_file(struct proc *p, struct file *fp) { struct kaudit_record *ar; struct socket *so; struct inpcb *pcb; struct vnode *vp; ar = currecord(); if (ar == NULL) return; switch (fp->f_type) { case DTYPE_VNODE: case DTYPE_FIFO: /* * XXXAUDIT: Only possibly to record as first vnode? */ vp = fp->f_vnode; vn_lock(vp, LK_SHARED | LK_RETRY); audit_arg_vnode1(vp); VOP_UNLOCK(vp); break; case DTYPE_SOCKET: so = (struct socket *)fp->f_data; if (INP_CHECK_SOCKAF(so, PF_INET)) { SOCK_LOCK(so); ar->k_ar.ar_arg_sockinfo.so_type = so->so_type; ar->k_ar.ar_arg_sockinfo.so_domain = INP_SOCKAF(so); ar->k_ar.ar_arg_sockinfo.so_protocol = so->so_proto->pr_protocol; SOCK_UNLOCK(so); pcb = (struct inpcb *)so->so_pcb; INP_RLOCK(pcb); ar->k_ar.ar_arg_sockinfo.so_raddr = pcb->inp_faddr.s_addr; ar->k_ar.ar_arg_sockinfo.so_laddr = pcb->inp_laddr.s_addr; ar->k_ar.ar_arg_sockinfo.so_rport = pcb->inp_fport; ar->k_ar.ar_arg_sockinfo.so_lport = pcb->inp_lport; INP_RUNLOCK(pcb); ARG_SET_VALID(ar, ARG_SOCKINFO); } break; default: /* XXXAUDIT: else? */ break; } } /* * Store a path as given by the user process for auditing into the audit * record stored on the user thread. This function will allocate the memory * to store the path info if not already available. This memory will be * freed when the audit record is freed. The path is canonlicalised with * respect to the thread and directory descriptor passed. */ static void audit_arg_upath(struct thread *td, int dirfd, char *upath, char **pathp) { if (*pathp == NULL) *pathp = malloc(MAXPATHLEN, M_AUDITPATH, M_WAITOK); audit_canon_path(td, dirfd, upath, *pathp); } void audit_arg_upath1(struct thread *td, int dirfd, char *upath) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; audit_arg_upath(td, dirfd, upath, &ar->k_ar.ar_arg_upath1); ARG_SET_VALID(ar, ARG_UPATH1); } void audit_arg_upath2(struct thread *td, int dirfd, char *upath) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; audit_arg_upath(td, dirfd, upath, &ar->k_ar.ar_arg_upath2); ARG_SET_VALID(ar, ARG_UPATH2); } +static void +audit_arg_upath_vp(struct thread *td, struct vnode *rdir, struct vnode *cdir, + char *upath, char **pathp) +{ + + if (*pathp == NULL) + *pathp = malloc(MAXPATHLEN, M_AUDITPATH, M_WAITOK); + audit_canon_path_vp(td, rdir, cdir, upath, *pathp); +} + +void +audit_arg_upath1_vp(struct thread *td, struct vnode *rdir, struct vnode *cdir, + char *upath) +{ + struct kaudit_record *ar; + + ar = currecord(); + if (ar == NULL) + return; + + audit_arg_upath_vp(td, rdir, cdir, upath, &ar->k_ar.ar_arg_upath1); + ARG_SET_VALID(ar, ARG_UPATH1); +} + +void +audit_arg_upath2_vp(struct thread *td, struct vnode *rdir, struct vnode *cdir, + char *upath) +{ + struct kaudit_record *ar; + + ar = currecord(); + if (ar == NULL) + return; + + audit_arg_upath_vp(td, rdir, cdir, upath, &ar->k_ar.ar_arg_upath2); + ARG_SET_VALID(ar, ARG_UPATH2); +} + /* * Variants on path auditing that do not canonicalise the path passed in; * these are for use with filesystem-like subsystems that employ string names, * but do not support a hierarchical namespace -- for example, POSIX IPC * objects. The subsystem should have performed any necessary * canonicalisation required to make the paths useful to audit analysis. */ static void audit_arg_upath_canon(char *upath, char **pathp) { if (*pathp == NULL) *pathp = malloc(MAXPATHLEN, M_AUDITPATH, M_WAITOK); (void)snprintf(*pathp, MAXPATHLEN, "%s", upath); } void audit_arg_upath1_canon(char *upath) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; audit_arg_upath_canon(upath, &ar->k_ar.ar_arg_upath1); ARG_SET_VALID(ar, ARG_UPATH1); } void audit_arg_upath2_canon(char *upath) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; audit_arg_upath_canon(upath, &ar->k_ar.ar_arg_upath2); ARG_SET_VALID(ar, ARG_UPATH2); } /* * Function to save the path and vnode attr information into the audit * record. * * It is assumed that the caller will hold any vnode locks necessary to * perform a VOP_GETATTR() on the passed vnode. * * XXX: The attr code is very similar to vfs_vnops.c:vn_stat(), but always * provides access to the generation number as we need that to construct the * BSM file ID. * * XXX: We should accept the process argument from the caller, since it's * very likely they already have a reference. * * XXX: Error handling in this function is poor. * * XXXAUDIT: Possibly KASSERT the path pointer is NULL? */ static int audit_arg_vnode(struct vnode *vp, struct vnode_au_info *vnp) { struct vattr vattr; int error; ASSERT_VOP_LOCKED(vp, "audit_arg_vnode"); error = VOP_GETATTR(vp, &vattr, curthread->td_ucred); if (error) { /* XXX: How to handle this case? */ return (error); } vnp->vn_mode = vattr.va_mode; vnp->vn_uid = vattr.va_uid; vnp->vn_gid = vattr.va_gid; vnp->vn_dev = vattr.va_rdev; vnp->vn_fsid = vattr.va_fsid; vnp->vn_fileid = vattr.va_fileid; vnp->vn_gen = vattr.va_gen; return (0); } void audit_arg_vnode1(struct vnode *vp) { struct kaudit_record *ar; int error; ar = currecord(); if (ar == NULL) return; ARG_CLEAR_VALID(ar, ARG_VNODE1); error = audit_arg_vnode(vp, &ar->k_ar.ar_arg_vnode1); if (error == 0) ARG_SET_VALID(ar, ARG_VNODE1); } void audit_arg_vnode2(struct vnode *vp) { struct kaudit_record *ar; int error; ar = currecord(); if (ar == NULL) return; ARG_CLEAR_VALID(ar, ARG_VNODE2); error = audit_arg_vnode(vp, &ar->k_ar.ar_arg_vnode2); if (error == 0) ARG_SET_VALID(ar, ARG_VNODE2); } /* * Audit the argument strings passed to exec. */ void audit_arg_argv(char *argv, int argc, int length) { struct kaudit_record *ar; if (audit_argv == 0) return; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_argv = malloc(length, M_AUDITTEXT, M_WAITOK); bcopy(argv, ar->k_ar.ar_arg_argv, length); ar->k_ar.ar_arg_argc = argc; ARG_SET_VALID(ar, ARG_ARGV); } /* * Audit the environment strings passed to exec. */ void audit_arg_envv(char *envv, int envc, int length) { struct kaudit_record *ar; if (audit_arge == 0) return; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_envv = malloc(length, M_AUDITTEXT, M_WAITOK); bcopy(envv, ar->k_ar.ar_arg_envv, length); ar->k_ar.ar_arg_envc = envc; ARG_SET_VALID(ar, ARG_ENVV); } void audit_arg_rights(cap_rights_t *rightsp) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_rights = *rightsp; ARG_SET_VALID(ar, ARG_RIGHTS); } void audit_arg_fcntl_rights(uint32_t fcntlrights) { struct kaudit_record *ar; ar = currecord(); if (ar == NULL) return; ar->k_ar.ar_arg_fcntl_rights = fcntlrights; ARG_SET_VALID(ar, ARG_FCNTL_RIGHTS); } /* * The close() system call uses it's own audit call to capture the path/vnode * information because those pieces are not easily obtained within the system * call itself. */ void audit_sysclose(struct thread *td, int fd) { cap_rights_t rights; struct kaudit_record *ar; struct vnode *vp; struct file *fp; KASSERT(td != NULL, ("audit_sysclose: td == NULL")); ar = currecord(); if (ar == NULL) return; audit_arg_fd(fd); if (getvnode(td, fd, cap_rights_init(&rights), &fp) != 0) return; vp = fp->f_vnode; vn_lock(vp, LK_SHARED | LK_RETRY); audit_arg_vnode1(vp); VOP_UNLOCK(vp); fdrop(fp, td); } Index: projects/clang1000-import/sys/security/audit/audit_bsm_klib.c =================================================================== --- projects/clang1000-import/sys/security/audit/audit_bsm_klib.c (revision 358238) +++ projects/clang1000-import/sys/security/audit/audit_bsm_klib.c (revision 358239) @@ -1,532 +1,530 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1999-2009 Apple Inc. * Copyright (c) 2005, 2016-2017 Robert N. M. Watson * All rights reserved. * * Portions of this software were developed by BAE Systems, the University of * Cambridge Computer Laboratory, and Memorial University under DARPA/AFRL * contract FA8650-15-C-7558 ("CADETS"), as part of the DARPA Transparent * Computing (TC) research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of Apple Inc. ("Apple") nor the names of * its contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include struct aue_open_event { int aoe_flags; au_event_t aoe_event; }; static const struct aue_open_event aue_open[] = { { O_RDONLY, AUE_OPEN_R }, { (O_RDONLY | O_CREAT), AUE_OPEN_RC }, { (O_RDONLY | O_CREAT | O_TRUNC), AUE_OPEN_RTC }, { (O_RDONLY | O_TRUNC), AUE_OPEN_RT }, { O_RDWR, AUE_OPEN_RW }, { (O_RDWR | O_CREAT), AUE_OPEN_RWC }, { (O_RDWR | O_CREAT | O_TRUNC), AUE_OPEN_RWTC }, { (O_RDWR | O_TRUNC), AUE_OPEN_RWT }, { O_WRONLY, AUE_OPEN_W }, { (O_WRONLY | O_CREAT), AUE_OPEN_WC }, { (O_WRONLY | O_CREAT | O_TRUNC), AUE_OPEN_WTC }, { (O_WRONLY | O_TRUNC), AUE_OPEN_WT }, }; static const struct aue_open_event aue_openat[] = { { O_RDONLY, AUE_OPENAT_R }, { (O_RDONLY | O_CREAT), AUE_OPENAT_RC }, { (O_RDONLY | O_CREAT | O_TRUNC), AUE_OPENAT_RTC }, { (O_RDONLY | O_TRUNC), AUE_OPENAT_RT }, { O_RDWR, AUE_OPENAT_RW }, { (O_RDWR | O_CREAT), AUE_OPENAT_RWC }, { (O_RDWR | O_CREAT | O_TRUNC), AUE_OPENAT_RWTC }, { (O_RDWR | O_TRUNC), AUE_OPENAT_RWT }, { O_WRONLY, AUE_OPENAT_W }, { (O_WRONLY | O_CREAT), AUE_OPENAT_WC }, { (O_WRONLY | O_CREAT | O_TRUNC), AUE_OPENAT_WTC }, { (O_WRONLY | O_TRUNC), AUE_OPENAT_WT }, }; static const int aue_msgsys[] = { /* 0 */ AUE_MSGCTL, /* 1 */ AUE_MSGGET, /* 2 */ AUE_MSGSND, /* 3 */ AUE_MSGRCV, }; static const int aue_msgsys_count = sizeof(aue_msgsys) / sizeof(int); static const int aue_semsys[] = { /* 0 */ AUE_SEMCTL, /* 1 */ AUE_SEMGET, /* 2 */ AUE_SEMOP, }; static const int aue_semsys_count = sizeof(aue_semsys) / sizeof(int); static const int aue_shmsys[] = { /* 0 */ AUE_SHMAT, /* 1 */ AUE_SHMDT, /* 2 */ AUE_SHMGET, /* 3 */ AUE_SHMCTL, }; static const int aue_shmsys_count = sizeof(aue_shmsys) / sizeof(int); /* * Check whether an event is auditable by comparing the mask of classes this * event is part of against the given mask. */ int au_preselect(au_event_t event, au_class_t class, au_mask_t *mask_p, int sorf) { au_class_t effmask = 0; if (mask_p == NULL) return (-1); /* * Perform the actual check of the masks against the event. */ if (sorf & AU_PRS_SUCCESS) effmask |= (mask_p->am_success & class); if (sorf & AU_PRS_FAILURE) effmask |= (mask_p->am_failure & class); if (effmask) return (1); else return (0); } /* * Convert sysctl names and present arguments to events. */ au_event_t audit_ctlname_to_sysctlevent(int name[], uint64_t valid_arg) { /* can't parse it - so return the worst case */ if ((valid_arg & (ARG_CTLNAME | ARG_LEN)) != (ARG_CTLNAME | ARG_LEN)) return (AUE_SYSCTL); switch (name[0]) { /* non-admin "lookups" treat them special */ case KERN_OSTYPE: case KERN_OSRELEASE: case KERN_OSREV: case KERN_VERSION: case KERN_ARGMAX: case KERN_CLOCKRATE: case KERN_BOOTTIME: case KERN_POSIX1: case KERN_NGROUPS: case KERN_JOB_CONTROL: case KERN_SAVED_IDS: case KERN_OSRELDATE: case KERN_DUMMY: return (AUE_SYSCTL_NONADMIN); /* only treat the changeable controls as admin */ case KERN_MAXVNODES: case KERN_MAXPROC: case KERN_MAXFILES: case KERN_MAXPROCPERUID: case KERN_MAXFILESPERPROC: case KERN_HOSTID: case KERN_SECURELVL: case KERN_HOSTNAME: case KERN_VNODE: case KERN_PROC: case KERN_FILE: case KERN_PROF: case KERN_NISDOMAINNAME: case KERN_UPDATEINTERVAL: case KERN_NTP_PLL: case KERN_BOOTFILE: case KERN_DUMPDEV: case KERN_IPC: case KERN_PS_STRINGS: case KERN_USRSTACK: case KERN_LOGSIGEXIT: case KERN_IOV_MAX: return ((valid_arg & ARG_VALUE) ? AUE_SYSCTL : AUE_SYSCTL_NONADMIN); default: return (AUE_SYSCTL); } /* NOTREACHED */ } /* * Convert an open flags specifier into a specific type of open event for * auditing purposes. */ au_event_t audit_flags_and_error_to_openevent(int oflags, int error) { int i; /* * Need to check only those flags we care about. */ oflags = oflags & (O_RDONLY | O_CREAT | O_TRUNC | O_RDWR | O_WRONLY); for (i = 0; i < nitems(aue_open); i++) { if (aue_open[i].aoe_flags == oflags) return (aue_open[i].aoe_event); } return (AUE_OPEN); } au_event_t audit_flags_and_error_to_openatevent(int oflags, int error) { int i; /* * Need to check only those flags we care about. */ oflags = oflags & (O_RDONLY | O_CREAT | O_TRUNC | O_RDWR | O_WRONLY); for (i = 0; i < nitems(aue_openat); i++) { if (aue_openat[i].aoe_flags == oflags) return (aue_openat[i].aoe_event); } return (AUE_OPENAT); } /* * Convert a MSGCTL command to a specific event. */ au_event_t audit_msgctl_to_event(int cmd) { switch (cmd) { case IPC_RMID: return (AUE_MSGCTL_RMID); case IPC_SET: return (AUE_MSGCTL_SET); case IPC_STAT: return (AUE_MSGCTL_STAT); default: /* We will audit a bad command. */ return (AUE_MSGCTL); } } /* * Convert a SEMCTL command to a specific event. */ au_event_t audit_semctl_to_event(int cmd) { switch (cmd) { case GETALL: return (AUE_SEMCTL_GETALL); case GETNCNT: return (AUE_SEMCTL_GETNCNT); case GETPID: return (AUE_SEMCTL_GETPID); case GETVAL: return (AUE_SEMCTL_GETVAL); case GETZCNT: return (AUE_SEMCTL_GETZCNT); case IPC_RMID: return (AUE_SEMCTL_RMID); case IPC_SET: return (AUE_SEMCTL_SET); case SETALL: return (AUE_SEMCTL_SETALL); case SETVAL: return (AUE_SEMCTL_SETVAL); case IPC_STAT: return (AUE_SEMCTL_STAT); default: /* We will audit a bad command. */ return (AUE_SEMCTL); } } /* * Convert msgsys(2), semsys(2), and shmsys(2) system-call variations into * audit events, if possible. */ au_event_t audit_msgsys_to_event(int which) { if ((which >= 0) && (which < aue_msgsys_count)) return (aue_msgsys[which]); /* Audit a bad command. */ return (AUE_MSGSYS); } au_event_t audit_semsys_to_event(int which) { if ((which >= 0) && (which < aue_semsys_count)) return (aue_semsys[which]); /* Audit a bad command. */ return (AUE_SEMSYS); } au_event_t audit_shmsys_to_event(int which) { if ((which >= 0) && (which < aue_shmsys_count)) return (aue_shmsys[which]); /* Audit a bad command. */ return (AUE_SHMSYS); } /* * Convert a command for the auditon() system call to a audit event. */ au_event_t auditon_command_event(int cmd) { switch(cmd) { case A_GETPOLICY: return (AUE_AUDITON_GPOLICY); case A_SETPOLICY: return (AUE_AUDITON_SPOLICY); case A_GETKMASK: return (AUE_AUDITON_GETKMASK); case A_SETKMASK: return (AUE_AUDITON_SETKMASK); case A_GETQCTRL: return (AUE_AUDITON_GQCTRL); case A_SETQCTRL: return (AUE_AUDITON_SQCTRL); case A_GETCWD: return (AUE_AUDITON_GETCWD); case A_GETCAR: return (AUE_AUDITON_GETCAR); case A_GETSTAT: return (AUE_AUDITON_GETSTAT); case A_SETSTAT: return (AUE_AUDITON_SETSTAT); case A_SETUMASK: return (AUE_AUDITON_SETUMASK); case A_SETSMASK: return (AUE_AUDITON_SETSMASK); case A_GETCOND: return (AUE_AUDITON_GETCOND); case A_SETCOND: return (AUE_AUDITON_SETCOND); case A_GETCLASS: return (AUE_AUDITON_GETCLASS); case A_SETCLASS: return (AUE_AUDITON_SETCLASS); case A_GETPINFO: case A_SETPMASK: case A_SETFSIZE: case A_GETFSIZE: case A_GETPINFO_ADDR: case A_GETKAUDIT: case A_SETKAUDIT: default: return (AUE_AUDITON); /* No special record */ } } /* * Create a canonical path from given path by prefixing either the root * directory, or the current working directory. If the process working * directory is NULL, we could use 'rootvnode' to obtain the root directory, * but this results in a volfs name written to the audit log. So we will * leave the filename starting with '/' in the audit log in this case. */ void -audit_canon_path(struct thread *td, int dirfd, char *path, char *cpath) +audit_canon_path_vp(struct thread *td, struct vnode *rdir, struct vnode *cdir, + char *path, char *cpath) { - struct vnode *cvnp, *rvnp; + struct vnode *vp; char *rbuf, *fbuf, *copy; - struct filedesc *fdp; struct sbuf sbf; - cap_rights_t rights; - int error, needslash; + int error; WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "%s: at %s:%d", __func__, __FILE__, __LINE__); copy = path; - rvnp = cvnp = NULL; - fdp = td->td_proc->p_fd; - FILEDESC_SLOCK(fdp); + if (*path == '/') + vp = rdir; + else + vp = cdir; + MPASS(vp != NULL); /* - * Make sure that we handle the chroot(2) case. If there is an - * alternate root directory, prepend it to the audited pathname. - */ - if (fdp->fd_rdir != NULL && fdp->fd_rdir != rootvnode) { - rvnp = fdp->fd_rdir; - vhold(rvnp); - } - /* - * If the supplied path is relative, make sure we capture the current - * working directory so we can prepend it to the supplied relative - * path. - */ - if (*path != '/') { - if (dirfd == AT_FDCWD) { - cvnp = fdp->fd_cdir; - vhold(cvnp); - } else { - /* XXX: fgetvp() that vhold()s vnode instead of vref()ing it would be better */ - error = fgetvp(td, dirfd, cap_rights_init(&rights), &cvnp); - if (error) { - FILEDESC_SUNLOCK(fdp); - cpath[0] = '\0'; - if (rvnp != NULL) - vdrop(rvnp); - return; - } - vhold(cvnp); - vrele(cvnp); - } - needslash = (fdp->fd_rdir != cvnp); - } else { - needslash = 1; - } - FILEDESC_SUNLOCK(fdp); - /* * NB: We require that the supplied array be at least MAXPATHLEN bytes * long. If this is not the case, then we can run into serious trouble. */ (void) sbuf_new(&sbf, cpath, MAXPATHLEN, SBUF_FIXEDLEN); /* * Strip leading forward slashes. + * + * Note this does nothing to fully canonicalize the path. */ while (*copy == '/') copy++; /* * Make sure we handle chroot(2) and prepend the global path to these * environments. * * NB: vn_fullpath(9) on FreeBSD is less reliable than vn_getpath(9) * on Darwin. As a result, this may need some additional attention * in the future. */ - if (rvnp != NULL) { - error = vn_fullpath_global(td, rvnp, &rbuf, &fbuf); - vdrop(rvnp); - if (error) { - cpath[0] = '\0'; - if (cvnp != NULL) - vdrop(cvnp); - return; - } - (void) sbuf_cat(&sbf, rbuf); - free(fbuf, M_TEMP); + error = vn_fullpath_global(td, vp, &rbuf, &fbuf); + if (error) { + cpath[0] = '\0'; + return; } - if (cvnp != NULL) { - error = vn_fullpath(td, cvnp, &rbuf, &fbuf); - vdrop(cvnp); - if (error) { - cpath[0] = '\0'; - return; - } - (void) sbuf_cat(&sbf, rbuf); - free(fbuf, M_TEMP); - } - if (needslash) + (void) sbuf_cat(&sbf, rbuf); + /* + * We are going to concatenate the resolved path with the passed path + * with all slashes removed and we want them glued with a single slash. + * However, if the directory is /, the slash is already there. + */ + if (rbuf[1] != '\0') (void) sbuf_putc(&sbf, '/'); + free(fbuf, M_TEMP); /* * Now that we have processed any alternate root and relative path * names, add the supplied pathname. */ - (void) sbuf_cat(&sbf, copy); + (void) sbuf_cat(&sbf, copy); /* * One or more of the previous sbuf operations could have resulted in * the supplied buffer being overflowed. Check to see if this is the * case. */ if (sbuf_error(&sbf) != 0) { cpath[0] = '\0'; return; } sbuf_finish(&sbf); +} + +void +audit_canon_path(struct thread *td, int dirfd, char *path, char *cpath) +{ + struct vnode *cdir, *rdir; + struct filedesc *fdp; + cap_rights_t rights; + int error; + + WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "%s: at %s:%d", + __func__, __FILE__, __LINE__); + + rdir = cdir = NULL; + fdp = td->td_proc->p_fd; + FILEDESC_SLOCK(fdp); + if (*path == '/') { + rdir = fdp->fd_rdir; + vrefact(rdir); + } else { + if (dirfd == AT_FDCWD) { + cdir = fdp->fd_cdir; + vrefact(cdir); + } else { + error = fgetvp(td, dirfd, cap_rights_init(&rights), &cdir); + if (error != 0) { + FILEDESC_SUNLOCK(fdp); + cpath[0] = '\0'; + return; + } + } + } + FILEDESC_SUNLOCK(fdp); + + audit_canon_path_vp(td, rdir, cdir, path, cpath); + + if (rdir != NULL) + vrele(rdir); + if (cdir != NULL) + vrele(cdir); } Index: projects/clang1000-import/sys/security/audit/audit_private.h =================================================================== --- projects/clang1000-import/sys/security/audit/audit_private.h (revision 358238) +++ projects/clang1000-import/sys/security/audit/audit_private.h (revision 358239) @@ -1,509 +1,511 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1999-2009 Apple Inc. * Copyright (c) 2016, 2018 Robert N. M. Watson * All rights reserved. * * Portions of this software were developed by BAE Systems, the University of * Cambridge Computer Laboratory, and Memorial University under DARPA/AFRL * contract FA8650-15-C-7558 ("CADETS"), as part of the DARPA Transparent * Computing (TC) research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of Apple Inc. ("Apple") nor the names of * its contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * This include file contains function prototypes and type definitions used * within the audit implementation. */ #ifndef _SECURITY_AUDIT_PRIVATE_H_ #define _SECURITY_AUDIT_PRIVATE_H_ #ifndef _KERNEL #error "no user-serviceable parts inside" #endif #include #include #include #include #ifdef MALLOC_DECLARE MALLOC_DECLARE(M_AUDITBSM); MALLOC_DECLARE(M_AUDITDATA); MALLOC_DECLARE(M_AUDITPATH); MALLOC_DECLARE(M_AUDITTEXT); MALLOC_DECLARE(M_AUDITGIDSET); #endif /* * Audit control variables that are usually set/read via system calls and * used to control various aspects of auditing. */ extern struct au_qctrl audit_qctrl; extern struct audit_fstat audit_fstat; extern struct au_mask audit_nae_mask; extern int audit_panic_on_write_fail; extern int audit_fail_stop; extern int audit_argv; extern int audit_arge; /* * Success/failure conditions for the conversion of a kernel audit record to * BSM format. */ #define BSM_SUCCESS 0 #define BSM_FAILURE 1 #define BSM_NOAUDIT 2 /* * Defines for the kernel audit record k_ar_commit field. Flags are set to * indicate what sort of record it is, and which preselection mechanism * selected it. */ #define AR_COMMIT_KERNEL 0x00000001U #define AR_COMMIT_USER 0x00000010U #define AR_PRESELECT_TRAIL 0x00001000U #define AR_PRESELECT_PIPE 0x00002000U #define AR_PRESELECT_USER_TRAIL 0x00004000U #define AR_PRESELECT_USER_PIPE 0x00008000U #define AR_PRESELECT_DTRACE 0x00010000U /* * Audit data is generated as a stream of struct audit_record structures, * linked by struct kaudit_record, and contain storage for possible audit so * that it will not need to be allocated during the processing of a system * call, both improving efficiency and avoiding sleeping at untimely moments. * This structure is converted to BSM format before being written to disk. */ struct vnode_au_info { mode_t vn_mode; uid_t vn_uid; gid_t vn_gid; u_int32_t vn_dev; /* XXX dev_t compatibility */ long vn_fsid; /* XXX uint64_t compatibility */ long vn_fileid; /* XXX ino_t compatibility */ long vn_gen; }; struct groupset { gid_t *gidset; u_int gidset_size; }; struct socket_au_info { int so_domain; int so_type; int so_protocol; in_addr_t so_raddr; /* Remote address if INET socket. */ in_addr_t so_laddr; /* Local address if INET socket. */ u_short so_rport; /* Remote port. */ u_short so_lport; /* Local port. */ }; /* * The following is used for A_OLDSETQCTRL and AU_OLDGETQCTRL and a 64-bit * userland. */ struct au_qctrl64 { u_int64_t aq64_hiwater; u_int64_t aq64_lowater; u_int64_t aq64_bufsz; u_int64_t aq64_delay; u_int64_t aq64_minfree; }; typedef struct au_qctrl64 au_qctrl64_t; union auditon_udata { char *au_path; int au_cond; int au_flags; int au_policy; int au_trigger; int64_t au_cond64; int64_t au_policy64; au_evclass_map_t au_evclass; au_mask_t au_mask; auditinfo_t au_auinfo; auditpinfo_t au_aupinfo; auditpinfo_addr_t au_aupinfo_addr; au_qctrl_t au_qctrl; au_qctrl64_t au_qctrl64; au_stat_t au_stat; au_fstat_t au_fstat; auditinfo_addr_t au_kau_info; au_evname_map_t au_evname; }; struct posix_ipc_perm { uid_t pipc_uid; gid_t pipc_gid; mode_t pipc_mode; }; struct audit_record { /* Audit record header. */ u_int32_t ar_magic; int ar_event; int ar_retval; /* value returned to the process */ int ar_errno; /* return status of system call */ struct timespec ar_starttime; struct timespec ar_endtime; u_int64_t ar_valid_arg; /* Bitmask of valid arguments */ /* Audit subject information. */ struct xucred ar_subj_cred; uid_t ar_subj_ruid; gid_t ar_subj_rgid; gid_t ar_subj_egid; uid_t ar_subj_auid; /* Audit user ID */ pid_t ar_subj_asid; /* Audit session ID */ pid_t ar_subj_pid; struct au_tid ar_subj_term; struct au_tid_addr ar_subj_term_addr; struct au_mask ar_subj_amask; /* Operation arguments. */ uid_t ar_arg_euid; uid_t ar_arg_ruid; uid_t ar_arg_suid; gid_t ar_arg_egid; gid_t ar_arg_rgid; gid_t ar_arg_sgid; pid_t ar_arg_pid; pid_t ar_arg_asid; struct au_tid ar_arg_termid; struct au_tid_addr ar_arg_termid_addr; uid_t ar_arg_uid; uid_t ar_arg_auid; gid_t ar_arg_gid; struct groupset ar_arg_groups; int ar_arg_fd; int ar_arg_atfd1; int ar_arg_atfd2; int ar_arg_fflags; mode_t ar_arg_mode; int ar_arg_dev; /* XXX dev_t compatibility */ long ar_arg_value; void *ar_arg_addr; int ar_arg_len; int ar_arg_mask; u_int ar_arg_signum; char ar_arg_login[MAXLOGNAME]; int ar_arg_ctlname[CTL_MAXNAME]; struct socket_au_info ar_arg_sockinfo; char *ar_arg_upath1; char *ar_arg_upath2; char *ar_arg_text; struct au_mask ar_arg_amask; struct vnode_au_info ar_arg_vnode1; struct vnode_au_info ar_arg_vnode2; int ar_arg_cmd; int ar_arg_svipc_which; int ar_arg_svipc_cmd; struct ipc_perm ar_arg_svipc_perm; int ar_arg_svipc_id; void *ar_arg_svipc_addr; struct posix_ipc_perm ar_arg_pipc_perm; union auditon_udata ar_arg_auditon; char *ar_arg_argv; int ar_arg_argc; char *ar_arg_envv; int ar_arg_envc; int ar_arg_exitstatus; int ar_arg_exitretval; struct sockaddr_storage ar_arg_sockaddr; cap_rights_t ar_arg_rights; uint32_t ar_arg_fcntl_rights; char ar_jailname[MAXHOSTNAMELEN]; }; /* * Arguments in the audit record are initially not defined; flags are set to * indicate if they are present so they can be included in the audit log * stream only if defined. */ #define ARG_EUID 0x0000000000000001ULL #define ARG_RUID 0x0000000000000002ULL #define ARG_SUID 0x0000000000000004ULL #define ARG_EGID 0x0000000000000008ULL #define ARG_RGID 0x0000000000000010ULL #define ARG_SGID 0x0000000000000020ULL #define ARG_PID 0x0000000000000040ULL #define ARG_UID 0x0000000000000080ULL #define ARG_AUID 0x0000000000000100ULL #define ARG_GID 0x0000000000000200ULL #define ARG_FD 0x0000000000000400ULL #define ARG_POSIX_IPC_PERM 0x0000000000000800ULL #define ARG_FFLAGS 0x0000000000001000ULL #define ARG_MODE 0x0000000000002000ULL #define ARG_DEV 0x0000000000004000ULL #define ARG_ADDR 0x0000000000008000ULL #define ARG_LEN 0x0000000000010000ULL #define ARG_MASK 0x0000000000020000ULL #define ARG_SIGNUM 0x0000000000040000ULL #define ARG_LOGIN 0x0000000000080000ULL #define ARG_SADDRINET 0x0000000000100000ULL #define ARG_SADDRINET6 0x0000000000200000ULL #define ARG_SADDRUNIX 0x0000000000400000ULL #define ARG_TERMID_ADDR 0x0000000000800000ULL #define ARG_UNUSED2 0x0000000001000000ULL #define ARG_UPATH1 0x0000000002000000ULL #define ARG_UPATH2 0x0000000004000000ULL #define ARG_TEXT 0x0000000008000000ULL #define ARG_VNODE1 0x0000000010000000ULL #define ARG_VNODE2 0x0000000020000000ULL #define ARG_SVIPC_CMD 0x0000000040000000ULL #define ARG_SVIPC_PERM 0x0000000080000000ULL #define ARG_SVIPC_ID 0x0000000100000000ULL #define ARG_SVIPC_ADDR 0x0000000200000000ULL #define ARG_GROUPSET 0x0000000400000000ULL #define ARG_CMD 0x0000000800000000ULL #define ARG_SOCKINFO 0x0000001000000000ULL #define ARG_ASID 0x0000002000000000ULL #define ARG_TERMID 0x0000004000000000ULL #define ARG_AUDITON 0x0000008000000000ULL #define ARG_VALUE 0x0000010000000000ULL #define ARG_AMASK 0x0000020000000000ULL #define ARG_CTLNAME 0x0000040000000000ULL #define ARG_PROCESS 0x0000080000000000ULL #define ARG_MACHPORT1 0x0000100000000000ULL #define ARG_MACHPORT2 0x0000200000000000ULL #define ARG_EXIT 0x0000400000000000ULL #define ARG_IOVECSTR 0x0000800000000000ULL #define ARG_ARGV 0x0001000000000000ULL #define ARG_ENVV 0x0002000000000000ULL #define ARG_ATFD1 0x0004000000000000ULL #define ARG_ATFD2 0x0008000000000000ULL #define ARG_RIGHTS 0x0010000000000000ULL #define ARG_FCNTL_RIGHTS 0x0020000000000000ULL #define ARG_SVIPC_WHICH 0x0200000000000000ULL #define ARG_NONE 0x0000000000000000ULL #define ARG_ALL 0xFFFFFFFFFFFFFFFFULL #define ARG_IS_VALID(kar, arg) ((kar)->k_ar.ar_valid_arg & (arg)) #define ARG_SET_VALID(kar, arg) do { \ (kar)->k_ar.ar_valid_arg |= (arg); \ } while (0) #define ARG_CLEAR_VALID(kar, arg) do { \ (kar)->k_ar.ar_valid_arg &= ~(arg); \ } while (0) /* * In-kernel version of audit record; the basic record plus queue meta-data. * This record can also have a pointer set to some opaque data that will be * passed through to the audit writing mechanism. */ struct kaudit_record { struct audit_record k_ar; u_int32_t k_ar_commit; void *k_udata; /* User data. */ u_int k_ulen; /* User data length. */ struct uthread *k_uthread; /* Audited thread. */ void *k_dtaudit_state; TAILQ_ENTRY(kaudit_record) k_q; }; TAILQ_HEAD(kaudit_queue, kaudit_record); /* * Functions to manage the allocation, release, and commit of kernel audit * records. */ void audit_abort(struct kaudit_record *ar); void audit_commit(struct kaudit_record *ar, int error, int retval); struct kaudit_record *audit_new(int event, struct thread *td); /* * Function to update the audit_syscalls_enabled flag, whose value is affected * by configuration of the audit trail/pipe mechanism and DTrace. Call this * function when any of the inputs to that policy change. */ void audit_syscalls_enabled_update(void); /* * Functions relating to the conversion of internal kernel audit records to * the BSM file format. */ struct au_record; int kaudit_to_bsm(struct kaudit_record *kar, struct au_record **pau); int bsm_rec_verify(void *rec); /* * Kernel versions of the libbsm audit record functions. */ void kau_free(struct au_record *rec); void kau_init(void); /* * Return values for pre-selection and post-selection decisions. */ #define AU_PRS_SUCCESS 1 #define AU_PRS_FAILURE 2 #define AU_PRS_BOTH (AU_PRS_SUCCESS|AU_PRS_FAILURE) /* * Data structures relating to the kernel audit queue. Ideally, these might * be abstracted so that only accessor methods are exposed. */ extern struct mtx audit_mtx; extern struct cv audit_watermark_cv; extern struct cv audit_worker_cv; extern struct kaudit_queue audit_q; extern int audit_q_len; extern int audit_pre_q_len; extern int audit_in_failure; /* * Flags to use on audit files when opening and closing. */ #define AUDIT_OPEN_FLAGS (FWRITE | O_APPEND) #define AUDIT_CLOSE_FLAGS (FWRITE | O_APPEND) /* * Audit event-to-name mapping structure, maintained in audit_bsm_klib.c. It * appears in this header so that the DTrace audit provider can dereference * instances passed back in the au_evname_foreach() callbacks. Safe access to * its fields requires holding ene_lock (after it is visible in the global * table). * * Locking: * (c) - Constant after inserted in the global table * (l) - Protected by ene_lock * (m) - Protected by evnamemap_lock (audit_bsm_klib.c) * (M) - Writes protected by evnamemap_lock; reads unprotected. */ struct evname_elem { au_event_t ene_event; /* (c) */ char ene_name[EVNAMEMAP_NAME_SIZE]; /* (l) */ LIST_ENTRY(evname_elem) ene_entry; /* (m) */ struct mtx ene_lock; /* DTrace probe IDs; 0 if not yet registered. */ uint32_t ene_commit_probe_id; /* (M) */ uint32_t ene_bsm_probe_id; /* (M) */ /* Flags indicating if the probes enabled or not. */ int ene_commit_probe_enabled; /* (M) */ int ene_bsm_probe_enabled; /* (M) */ }; #define EVNAME_LOCK(ene) mtx_lock(&(ene)->ene_lock) #define EVNAME_UNLOCK(ene) mtx_unlock(&(ene)->ene_lock) /* * Callback function typedef for the same. */ typedef void (*au_evnamemap_callback_t)(struct evname_elem *ene); /* * DTrace audit provider (dtaudit) hooks -- to be set non-NULL when the audit * provider is loaded and ready to be called into. */ extern void *(*dtaudit_hook_preselect)(au_id_t auid, au_event_t event, au_class_t class); extern int (*dtaudit_hook_commit)(struct kaudit_record *kar, au_id_t auid, au_event_t event, au_class_t class, int sorf); extern void (*dtaudit_hook_bsm)(struct kaudit_record *kar, au_id_t auid, au_event_t event, au_class_t class, int sorf, void *bsm_data, size_t bsm_len); #include #include #include /* * Some of the BSM tokenizer functions take different parameters in the * kernel implementations in order to save the copying of large kernel data * structures. The prototypes of these functions are declared here. */ token_t *kau_to_socket(struct socket_au_info *soi); /* * audit_klib prototypes */ int au_preselect(au_event_t event, au_class_t class, au_mask_t *mask_p, int sorf); void au_evclassmap_init(void); void au_evclassmap_insert(au_event_t event, au_class_t class); au_class_t au_event_class(au_event_t event); void au_evnamemap_init(void); void au_evnamemap_insert(au_event_t event, const char *name); void au_evnamemap_foreach(au_evnamemap_callback_t callback); struct evname_elem *au_evnamemap_lookup(au_event_t event); int au_event_name(au_event_t event, char *name); au_event_t audit_ctlname_to_sysctlevent(int name[], uint64_t valid_arg); au_event_t audit_flags_and_error_to_openevent(int oflags, int error); au_event_t audit_flags_and_error_to_openatevent(int oflags, int error); au_event_t audit_msgctl_to_event(int cmd); au_event_t audit_msgsys_to_event(int which); au_event_t audit_semctl_to_event(int cmd); au_event_t audit_semsys_to_event(int which); au_event_t audit_shmsys_to_event(int which); void audit_canon_path(struct thread *td, int dirfd, char *path, char *cpath); +void audit_canon_path_vp(struct thread *td, struct vnode *rdir, + struct vnode *cdir, char *path, char *cpath); au_event_t auditon_command_event(int cmd); /* * Audit trigger events notify user space of kernel audit conditions * asynchronously. */ void audit_trigger_init(void); int audit_send_trigger(unsigned int trigger); /* * Accessor functions to manage global audit state. */ void audit_set_kinfo(struct auditinfo_addr *); void audit_get_kinfo(struct auditinfo_addr *); /* * General audit related functions. */ struct kaudit_record *currecord(void); void audit_free(struct kaudit_record *ar); void audit_shutdown(void *arg, int howto); void audit_rotate_vnode(struct ucred *cred, struct vnode *vp); void audit_worker_init(void); /* * Audit pipe functions. */ int audit_pipe_preselect(au_id_t auid, au_event_t event, au_class_t class, int sorf, int trail_select); void audit_pipe_submit(au_id_t auid, au_event_t event, au_class_t class, int sorf, int trail_select, void *record, u_int record_len); void audit_pipe_submit_user(void *record, u_int record_len); #endif /* ! _SECURITY_AUDIT_PRIVATE_H_ */ Index: projects/clang1000-import/sys/sys/_smr.h =================================================================== --- projects/clang1000-import/sys/sys/_smr.h (revision 358238) +++ projects/clang1000-import/sys/sys/_smr.h (revision 358239) @@ -1,37 +1,38 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019, 2020 Jeffrey Roberson * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ * */ #ifndef _SYS__SMR_H_ #define _SYS__SMR_H_ typedef uint32_t smr_seq_t; +typedef int32_t smr_delta_t; typedef struct smr *smr_t; #endif /* __SYS_SMR_H_ */ Index: projects/clang1000-import/sys/sys/param.h =================================================================== --- projects/clang1000-import/sys/sys/param.h (revision 358238) +++ projects/clang1000-import/sys/sys/param.h (revision 358239) @@ -1,368 +1,368 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1982, 1986, 1989, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)param.h 8.3 (Berkeley) 4/4/95 * $FreeBSD$ */ #ifndef _SYS_PARAM_H_ #define _SYS_PARAM_H_ #include #define BSD 199506 /* System version (year & month). */ #define BSD4_3 1 #define BSD4_4 1 /* * __FreeBSD_version numbers are documented in the Porter's Handbook. * If you bump the version for any reason, you should update the documentation * there. * Currently this lives here in the doc/ repository: * * head/en_US.ISO8859-1/books/porters-handbook/versions/chapter.xml * * scheme is: Rxx * 'R' is in the range 0 to 4 if this is a release branch or * X.0-CURRENT before releng/X.0 is created, otherwise 'R' is * in the range 5 to 9. */ #undef __FreeBSD_version -#define __FreeBSD_version 1300080 /* Master, propagated to newvers */ +#define __FreeBSD_version 1300081 /* Master, propagated to newvers */ /* * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD, * which by definition is always true on FreeBSD. This macro is also defined * on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD. * * It is tempting to use this macro in userland code when we want to enable * kernel-specific routines, and in fact it's fine to do this in code that * is part of FreeBSD itself. However, be aware that as presence of this * macro is still not widespread (e.g. older FreeBSD versions, 3rd party * compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in * external applications without also checking for __FreeBSD__ as an * alternative. */ #undef __FreeBSD_kernel__ #define __FreeBSD_kernel__ #if defined(_KERNEL) || defined(IN_RTLD) #define P_OSREL_SIGWAIT 700000 #define P_OSREL_SIGSEGV 700004 #define P_OSREL_MAP_ANON 800104 #define P_OSREL_MAP_FSTRICT 1100036 #define P_OSREL_SHUTDOWN_ENOTCONN 1100077 #define P_OSREL_MAP_GUARD 1200035 #define P_OSREL_WRFSBASE 1200041 #define P_OSREL_CK_CYLGRP 1200046 #define P_OSREL_VMTOTAL64 1200054 #define P_OSREL_CK_SUPERBLOCK 1300000 #define P_OSREL_CK_INODE 1300005 #define P_OSREL_POWERPC_NEW_AUX_ARGS 1300070 #define P_OSREL_MAJOR(x) ((x) / 100000) #endif #ifndef LOCORE #include #endif /* * Machine-independent constants (some used in following include files). * Redefined constants are from POSIX 1003.1 limits file. * * MAXCOMLEN should be >= sizeof(ac_comm) (see ) */ #include #define MAXCOMLEN 19 /* max command name remembered */ #define MAXINTERP PATH_MAX /* max interpreter file name length */ #define MAXLOGNAME 33 /* max login name length (incl. NUL) */ #define MAXUPRC CHILD_MAX /* max simultaneous processes */ #define NCARGS ARG_MAX /* max bytes for an exec function */ #define NGROUPS (NGROUPS_MAX+1) /* max number groups */ #define NOFILE OPEN_MAX /* max open files per process */ #define NOGROUP 65535 /* marker for empty group set member */ #define MAXHOSTNAMELEN 256 /* max hostname size */ #define SPECNAMELEN 255 /* max length of devicename */ /* More types and definitions used throughout the kernel. */ #ifdef _KERNEL #include #include #ifndef LOCORE #include #include #endif #ifndef FALSE #define FALSE 0 #endif #ifndef TRUE #define TRUE 1 #endif #endif #ifndef _KERNEL /* Signals. */ #include #endif /* Machine type dependent parameters. */ #include #ifndef _KERNEL #include #endif #ifndef DEV_BSHIFT #define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */ #endif #define DEV_BSIZE (1<>PAGE_SHIFT) #endif /* * btodb() is messy and perhaps slow because `bytes' may be an off_t. We * want to shift an unsigned type to avoid sign extension and we don't * want to widen `bytes' unnecessarily. Assume that the result fits in * a daddr_t. */ #ifndef btodb #define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \ (sizeof (bytes) > sizeof(long) \ ? (daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \ : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT)) #endif #ifndef dbtob #define dbtob(db) /* calculates (db * DEV_BSIZE) */ \ ((off_t)(db) << DEV_BSHIFT) #endif #define PRIMASK 0x0ff #define PCATCH 0x100 /* OR'd with pri for tsleep to check signals */ #define PDROP 0x200 /* OR'd with pri to stop re-entry of interlock mutex */ #define NZERO 0 /* default "nice" */ #define NBBY 8 /* number of bits in a byte */ #define NBPW sizeof(int) /* number of bytes per word (integer) */ #define CMASK 022 /* default file mask: S_IWGRP|S_IWOTH */ #define NODEV (dev_t)(-1) /* non-existent device */ /* * File system parameters and macros. * * MAXBSIZE - Filesystems are made out of blocks of at most MAXBSIZE bytes * per block. MAXBSIZE may be made larger without effecting * any existing filesystems as long as it does not exceed MAXPHYS, * and may be made smaller at the risk of not being able to use * filesystems which require a block size exceeding MAXBSIZE. * * MAXBCACHEBUF - Maximum size of a buffer in the buffer cache. This must * be >= MAXBSIZE and can be set differently for different * architectures by defining it in . * Making this larger allows NFS to do larger reads/writes. * * BKVASIZE - Nominal buffer space per buffer, in bytes. BKVASIZE is the * minimum KVM memory reservation the kernel is willing to make. * Filesystems can of course request smaller chunks. Actual * backing memory uses a chunk size of a page (PAGE_SIZE). * The default value here can be overridden on a per-architecture * basis by defining it in . * * If you make BKVASIZE too small you risk seriously fragmenting * the buffer KVM map which may slow things down a bit. If you * make it too big the kernel will not be able to optimally use * the KVM memory reserved for the buffer cache and will wind * up with too-few buffers. * * The default is 16384, roughly 2x the block size used by a * normal UFS filesystem. */ #define MAXBSIZE 65536 /* must be power of 2 */ #ifndef MAXBCACHEBUF #define MAXBCACHEBUF MAXBSIZE /* must be a power of 2 >= MAXBSIZE */ #endif #ifndef BKVASIZE #define BKVASIZE 16384 /* must be power of 2 */ #endif #define BKVAMASK (BKVASIZE-1) /* * MAXPATHLEN defines the longest permissible path length after expanding * symbolic links. It is used to allocate a temporary buffer from the buffer * pool in which to do the name expansion, hence should be a power of two, * and must be less than or equal to MAXBSIZE. MAXSYMLINKS defines the * maximum number of symbolic links that may be expanded in a path name. * It should be set high enough to allow all legitimate uses, but halt * infinite loops reasonably quickly. */ #define MAXPATHLEN PATH_MAX #define MAXSYMLINKS 32 /* Bit map related macros. */ #define setbit(a,i) (((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY)) #define clrbit(a,i) (((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY))) #define isset(a,i) \ (((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) #define isclr(a,i) \ ((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0) /* Macros for counting and rounding. */ #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #define nitems(x) (sizeof((x)) / sizeof((x)[0])) #define rounddown(x, y) (((x)/(y))*(y)) #define rounddown2(x, y) ((x)&(~((y)-1))) /* if y is power of two */ #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #define roundup2(x, y) (((x)+((y)-1))&(~((y)-1))) /* if y is powers of two */ #define powerof2(x) ((((x)-1)&(x))==0) /* Macros for min/max. */ #define MIN(a,b) (((a)<(b))?(a):(b)) #define MAX(a,b) (((a)>(b))?(a):(b)) #ifdef _KERNEL /* * Basic byte order function prototypes for non-inline functions. */ #ifndef LOCORE #ifndef _BYTEORDER_PROTOTYPED #define _BYTEORDER_PROTOTYPED __BEGIN_DECLS __uint32_t htonl(__uint32_t); __uint16_t htons(__uint16_t); __uint32_t ntohl(__uint32_t); __uint16_t ntohs(__uint16_t); __END_DECLS #endif #endif #ifndef _BYTEORDER_FUNC_DEFINED #define _BYTEORDER_FUNC_DEFINED #define htonl(x) __htonl(x) #define htons(x) __htons(x) #define ntohl(x) __ntohl(x) #define ntohs(x) __ntohs(x) #endif /* !_BYTEORDER_FUNC_DEFINED */ #endif /* _KERNEL */ /* * Scale factor for scaled integers used to count %cpu time and load avgs. * * The number of CPU `tick's that map to a unique `%age' can be expressed * by the formula (1 / (2 ^ (FSHIFT - 11))). The maximum load average that * can be calculated (assuming 32 bits) can be closely approximated using * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15). * * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age', * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024. */ #define FSHIFT 11 /* bits to right of fixed binary point */ #define FSCALE (1<> (PAGE_SHIFT - DEV_BSHIFT)) #define ctodb(db) /* calculates pages to devblks */ \ ((db) << (PAGE_SHIFT - DEV_BSHIFT)) /* * Old spelling of __containerof(). */ #define member2struct(s, m, x) \ ((struct s *)(void *)((char *)(x) - offsetof(struct s, m))) /* * Access a variable length array that has been declared as a fixed * length array. */ #define __PAST_END(array, offset) (((__typeof__(*(array)) *)(array))[offset]) #endif /* _SYS_PARAM_H_ */ Index: projects/clang1000-import/sys/sys/smr.h =================================================================== --- projects/clang1000-import/sys/sys/smr.h (revision 358238) +++ projects/clang1000-import/sys/sys/smr.h (revision 358239) @@ -1,296 +1,359 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019, 2020 Jeffrey Roberson * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ * */ #ifndef _SYS_SMR_H_ #define _SYS_SMR_H_ #include /* * Safe memory reclamation. See subr_smr.c for a description of the * algorithm. * * Readers synchronize with smr_enter()/exit() and writers may either * free directly to a SMR UMA zone or use smr_synchronize or wait. */ /* * Modular arithmetic for comparing sequence numbers that have * potentially wrapped. Copied from tcp_seq.h. */ -#define SMR_SEQ_LT(a, b) ((int32_t)((a)-(b)) < 0) -#define SMR_SEQ_LEQ(a, b) ((int32_t)((a)-(b)) <= 0) -#define SMR_SEQ_GT(a, b) ((int32_t)((a)-(b)) > 0) -#define SMR_SEQ_GEQ(a, b) ((int32_t)((a)-(b)) >= 0) -#define SMR_SEQ_DELTA(a, b) ((int32_t)((a)-(b))) +#define SMR_SEQ_LT(a, b) ((smr_delta_t)((a)-(b)) < 0) +#define SMR_SEQ_LEQ(a, b) ((smr_delta_t)((a)-(b)) <= 0) +#define SMR_SEQ_GT(a, b) ((smr_delta_t)((a)-(b)) > 0) +#define SMR_SEQ_GEQ(a, b) ((smr_delta_t)((a)-(b)) >= 0) +#define SMR_SEQ_DELTA(a, b) ((smr_delta_t)((a)-(b))) +#define SMR_SEQ_MIN(a, b) (SMR_SEQ_LT((a), (b)) ? (a) : (b)) +#define SMR_SEQ_MAX(a, b) (SMR_SEQ_GT((a), (b)) ? (a) : (b)) #define SMR_SEQ_INVALID 0 /* Shared SMR state. */ struct smr_shared { const char *s_name; /* Name for debugging/reporting. */ smr_seq_t s_wr_seq; /* Current write sequence #. */ smr_seq_t s_rd_seq; /* Minimum observed read sequence. */ }; typedef struct smr_shared *smr_shared_t; /* Per-cpu SMR state. */ struct smr { smr_seq_t c_seq; /* Current observed sequence. */ smr_shared_t c_shared; /* Shared SMR state. */ int c_deferred; /* Deferred advance counter. */ + int c_limit; /* Deferred advance limit. */ + int c_flags; /* SMR Configuration */ }; +#define SMR_LAZY 0x0001 /* Higher latency write, fast read. */ +#define SMR_DEFERRED 0x0002 /* Aggregate updates to wr_seq. */ + #define SMR_ENTERED(smr) \ (curthread->td_critnest != 0 && zpcpu_get((smr))->c_seq != SMR_SEQ_INVALID) #define SMR_ASSERT_ENTERED(smr) \ KASSERT(SMR_ENTERED(smr), ("Not in smr section")) #define SMR_ASSERT_NOT_ENTERED(smr) \ KASSERT(!SMR_ENTERED(smr), ("In smr section.")); #define SMR_ASSERT(ex, fn) \ KASSERT((ex), (fn ": Assertion " #ex " failed at %s:%d", __FILE__, __LINE__)) /* * SMR Accessors are meant to provide safe access to SMR protected * pointers and prevent misuse and accidental access. * * Accessors are grouped by type: * entered - Use while in a read section (between smr_enter/smr_exit()) * serialized - Use while holding a lock that serializes writers. Updates * are synchronized with readers via included barriers. * unserialized - Use after the memory is out of scope and not visible to * readers. * * All acceses include a parameter for an assert to verify the required * synchronization. For example, a writer might use: * - * smr_serilized_store(pointer, value, mtx_owned(&writelock)); + * smr_serialized_store(pointer, value, mtx_owned(&writelock)); * * These are only enabled in INVARIANTS kernels. */ /* Type restricting pointer access to force smr accessors. */ #define SMR_TYPE_DECLARE(smrtype, type) \ typedef struct { \ type __ptr; /* Do not access directly */ \ } smrtype /* * Read from an SMR protected pointer while in a read section. */ #define smr_entered_load(p, smr) ({ \ SMR_ASSERT(SMR_ENTERED((smr)), "smr_entered_load"); \ (__typeof((p)->__ptr))atomic_load_acq_ptr((uintptr_t *)&(p)->__ptr); \ }) /* * Read from an SMR protected pointer while serialized by an * external mechanism. 'ex' should contain an assert that the * external mechanism is held. i.e. mtx_owned() */ #define smr_serialized_load(p, ex) ({ \ SMR_ASSERT(ex, "smr_serialized_load"); \ (__typeof((p)->__ptr))atomic_load_ptr(&(p)->__ptr); \ }) /* * Store 'v' to an SMR protected pointer while serialized by an * external mechanism. 'ex' should contain an assert that the * external mechanism is held. i.e. mtx_owned() + * + * Writers that are serialized with mutual exclusion or on a single + * thread should use smr_serialized_store() rather than swap. */ #define smr_serialized_store(p, v, ex) do { \ SMR_ASSERT(ex, "smr_serialized_store"); \ __typeof((p)->__ptr) _v = (v); \ atomic_store_rel_ptr((uintptr_t *)&(p)->__ptr, (uintptr_t)_v); \ } while (0) /* * swap 'v' with an SMR protected pointer and return the old value * while serialized by an external mechanism. 'ex' should contain * an assert that the external mechanism is provided. i.e. mtx_owned() + * + * Swap permits multiple writers to update a pointer concurrently. */ #define smr_serialized_swap(p, v, ex) ({ \ SMR_ASSERT(ex, "smr_serialized_swap"); \ __typeof((p)->__ptr) _v = (v); \ /* Release barrier guarantees contents are visible to reader */ \ atomic_thread_fence_rel(); \ (__typeof((p)->__ptr))atomic_swap_ptr( \ (uintptr_t *)&(p)->__ptr, (uintptr_t)_v); \ }) /* * Read from an SMR protected pointer when no serialization is required * such as in the destructor callback or when the caller guarantees other * synchronization. */ #define smr_unserialized_load(p, ex) ({ \ SMR_ASSERT(ex, "smr_unserialized_load"); \ (__typeof((p)->__ptr))atomic_load_ptr(&(p)->__ptr); \ }) /* * Store to an SMR protected pointer when no serialiation is required * such as in the destructor callback or when the caller guarantees other * synchronization. */ #define smr_unserialized_store(p, v, ex) do { \ SMR_ASSERT(ex, "smr_unserialized_store"); \ __typeof((p)->__ptr) _v = (v); \ atomic_store_ptr((uintptr_t *)&(p)->__ptr, (uintptr_t)_v); \ } while (0) /* - * Return the current write sequence number. + * Return the current write sequence number. This is not the same as the + * current goal which may be in the future. */ static inline smr_seq_t smr_shared_current(smr_shared_t s) { return (atomic_load_int(&s->s_wr_seq)); } static inline smr_seq_t smr_current(smr_t smr) { return (smr_shared_current(zpcpu_get(smr)->c_shared)); } /* * Enter a read section. */ static inline void smr_enter(smr_t smr) { critical_enter(); smr = zpcpu_get(smr); + KASSERT((smr->c_flags & SMR_LAZY) == 0, + ("smr_enter(%s) lazy smr.", smr->c_shared->s_name)); KASSERT(smr->c_seq == 0, ("smr_enter(%s) does not support recursion.", smr->c_shared->s_name)); /* * Store the current observed write sequence number in our * per-cpu state so that it can be queried via smr_poll(). * Frees that are newer than this stored value will be * deferred until we call smr_exit(). * * An acquire barrier is used to synchronize with smr_exit() * and smr_poll(). * * It is possible that a long delay between loading the wr_seq * and storing the c_seq could create a situation where the * rd_seq advances beyond our stored c_seq. In this situation * only the observed wr_seq is stale, the fence still orders * the load. See smr_poll() for details on how this condition * is detected and handled there. */ /* This is an add because we do not have atomic_store_acq_int */ atomic_add_acq_int(&smr->c_seq, smr_shared_current(smr->c_shared)); } /* * Exit a read section. */ static inline void smr_exit(smr_t smr) { smr = zpcpu_get(smr); CRITICAL_ASSERT(curthread); + KASSERT((smr->c_flags & SMR_LAZY) == 0, + ("smr_exit(%s) lazy smr.", smr->c_shared->s_name)); KASSERT(smr->c_seq != SMR_SEQ_INVALID, ("smr_exit(%s) not in a smr section.", smr->c_shared->s_name)); /* * Clear the recorded sequence number. This allows poll() to * detect CPUs not in read sections. * * Use release semantics to retire any stores before the sequence * number is cleared. */ atomic_store_rel_int(&smr->c_seq, SMR_SEQ_INVALID); critical_exit(); } /* - * Advances the write sequence number. Returns the sequence number - * required to ensure that all modifications are visible to readers. + * Enter a lazy smr section. This is used for read-mostly state that + * can tolerate a high free latency. */ -smr_seq_t smr_advance(smr_t smr); +static inline void +smr_lazy_enter(smr_t smr) +{ + critical_enter(); + smr = zpcpu_get(smr); + KASSERT((smr->c_flags & SMR_LAZY) != 0, + ("smr_lazy_enter(%s) non-lazy smr.", smr->c_shared->s_name)); + KASSERT(smr->c_seq == 0, + ("smr_lazy_enter(%s) does not support recursion.", + smr->c_shared->s_name)); + + /* + * This needs no serialization. If an interrupt occurs before we + * assign sr_seq to c_seq any speculative loads will be discarded. + * If we assign a stale wr_seq value due to interrupt we use the + * same algorithm that renders smr_enter() safe. + */ + smr->c_seq = smr_shared_current(smr->c_shared); +} + /* - * Advances the write sequence number only after N calls. Returns - * the correct goal for a wr_seq that has not yet occurred. Used to - * minimize shared cacheline invalidations for frequent writers. + * Exit a lazy smr section. This is used for read-mostly state that + * can tolerate a high free latency. */ -smr_seq_t smr_advance_deferred(smr_t smr, int limit); +static inline void +smr_lazy_exit(smr_t smr) +{ + smr = zpcpu_get(smr); + CRITICAL_ASSERT(curthread); + KASSERT((smr->c_flags & SMR_LAZY) != 0, + ("smr_lazy_enter(%s) non-lazy smr.", smr->c_shared->s_name)); + KASSERT(smr->c_seq != SMR_SEQ_INVALID, + ("smr_lazy_exit(%s) not in a smr section.", smr->c_shared->s_name)); + + /* + * All loads/stores must be retired before the sequence becomes + * visible. The fence compiles away on amd64. Another + * alternative would be to omit the fence but store the exit + * time and wait 1 tick longer. + */ + atomic_thread_fence_rel(); + smr->c_seq = SMR_SEQ_INVALID; + critical_exit(); +} + /* + * Advances the write sequence number. Returns the sequence number + * required to ensure that all modifications are visible to readers. + */ +smr_seq_t smr_advance(smr_t smr); + +/* * Returns true if a goal sequence has been reached. If * wait is true this will busy loop until success. */ bool smr_poll(smr_t smr, smr_seq_t goal, bool wait); /* Create a new SMR context. */ -smr_t smr_create(const char *name); +smr_t smr_create(const char *name, int limit, int flags); + +/* Destroy the context. */ void smr_destroy(smr_t smr); /* * Blocking wait for all readers to observe 'goal'. */ static inline bool smr_wait(smr_t smr, smr_seq_t goal) { return (smr_poll(smr, goal, true)); } /* * Synchronize advances the write sequence and returns when all * readers have observed it. * * If your application can cache a sequence number returned from * smr_advance() and poll or wait at a later time there will * be less chance of busy looping while waiting for readers. */ static inline void smr_synchronize(smr_t smr) { smr_wait(smr, smr_advance(smr)); } /* Only at startup. */ void smr_init(void); #endif /* _SYS_SMR_H_ */ Index: projects/clang1000-import/sys/vm/uma_core.c =================================================================== --- projects/clang1000-import/sys/vm/uma_core.c (revision 358238) +++ projects/clang1000-import/sys/vm/uma_core.c (revision 358239) @@ -1,5439 +1,5437 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2002-2019 Jeffrey Roberson * Copyright (c) 2004, 2005 Bosko Milekic * Copyright (c) 2004-2006 Robert N. M. Watson * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice unmodified, this list of conditions, and the following * disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ /* * uma_core.c Implementation of the Universal Memory allocator * * This allocator is intended to replace the multitude of similar object caches * in the standard FreeBSD kernel. The intent is to be flexible as well as * efficient. A primary design goal is to return unused memory to the rest of * the system. This will make the system as a whole more flexible due to the * ability to move memory to subsystems which most need it instead of leaving * pools of reserved memory unused. * * The basic ideas stem from similar slab/zone based allocators whose algorithms * are well known. * */ /* * TODO: * - Improve memory usage for large allocations * - Investigate cache size adjustments */ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_param.h" #include "opt_vm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEBUG_MEMGUARD #include #endif #include #ifdef INVARIANTS #define UMA_ALWAYS_CTORDTOR 1 #else #define UMA_ALWAYS_CTORDTOR 0 #endif /* * This is the zone and keg from which all zones are spawned. */ static uma_zone_t kegs; static uma_zone_t zones; /* * These are the two zones from which all offpage uma_slab_ts are allocated. * * One zone is for slab headers that can represent a larger number of items, * making the slabs themselves more efficient, and the other zone is for * headers that are smaller and represent fewer items, making the headers more * efficient. */ #define SLABZONE_SIZE(setsize) \ (sizeof(struct uma_hash_slab) + BITSET_SIZE(setsize) * SLAB_BITSETS) #define SLABZONE0_SETSIZE (PAGE_SIZE / 16) #define SLABZONE1_SETSIZE SLAB_MAX_SETSIZE #define SLABZONE0_SIZE SLABZONE_SIZE(SLABZONE0_SETSIZE) #define SLABZONE1_SIZE SLABZONE_SIZE(SLABZONE1_SETSIZE) static uma_zone_t slabzones[2]; /* * The initial hash tables come out of this zone so they can be allocated * prior to malloc coming up. */ static uma_zone_t hashzone; /* The boot-time adjusted value for cache line alignment. */ int uma_align_cache = 64 - 1; static MALLOC_DEFINE(M_UMAHASH, "UMAHash", "UMA Hash Buckets"); static MALLOC_DEFINE(M_UMA, "UMA", "UMA Misc"); /* * Are we allowed to allocate buckets? */ static int bucketdisable = 1; /* Linked list of all kegs in the system */ static LIST_HEAD(,uma_keg) uma_kegs = LIST_HEAD_INITIALIZER(uma_kegs); /* Linked list of all cache-only zones in the system */ static LIST_HEAD(,uma_zone) uma_cachezones = LIST_HEAD_INITIALIZER(uma_cachezones); /* This RW lock protects the keg list */ static struct rwlock_padalign __exclusive_cache_line uma_rwlock; /* * First available virual address for boot time allocations. */ static vm_offset_t bootstart; static vm_offset_t bootmem; static struct sx uma_reclaim_lock; /* * kmem soft limit, initialized by uma_set_limit(). Ensure that early * allocations don't trigger a wakeup of the reclaim thread. */ unsigned long uma_kmem_limit = LONG_MAX; SYSCTL_ULONG(_vm, OID_AUTO, uma_kmem_limit, CTLFLAG_RD, &uma_kmem_limit, 0, "UMA kernel memory soft limit"); unsigned long uma_kmem_total; SYSCTL_ULONG(_vm, OID_AUTO, uma_kmem_total, CTLFLAG_RD, &uma_kmem_total, 0, "UMA kernel memory usage"); /* Is the VM done starting up? */ static enum { BOOT_COLD, BOOT_KVA, BOOT_RUNNING, BOOT_SHUTDOWN, } booted = BOOT_COLD; /* * This is the handle used to schedule events that need to happen * outside of the allocation fast path. */ static struct callout uma_callout; #define UMA_TIMEOUT 20 /* Seconds for callout interval. */ /* * This structure is passed as the zone ctor arg so that I don't have to create * a special allocation function just for zones. */ struct uma_zctor_args { const char *name; size_t size; uma_ctor ctor; uma_dtor dtor; uma_init uminit; uma_fini fini; uma_import import; uma_release release; void *arg; uma_keg_t keg; int align; uint32_t flags; }; struct uma_kctor_args { uma_zone_t zone; size_t size; uma_init uminit; uma_fini fini; int align; uint32_t flags; }; struct uma_bucket_zone { uma_zone_t ubz_zone; char *ubz_name; int ubz_entries; /* Number of items it can hold. */ int ubz_maxsize; /* Maximum allocation size per-item. */ }; /* * Compute the actual number of bucket entries to pack them in power * of two sizes for more efficient space utilization. */ #define BUCKET_SIZE(n) \ (((sizeof(void *) * (n)) - sizeof(struct uma_bucket)) / sizeof(void *)) #define BUCKET_MAX BUCKET_SIZE(256) #define BUCKET_MIN 2 struct uma_bucket_zone bucket_zones[] = { /* Literal bucket sizes. */ { NULL, "2 Bucket", 2, 4096 }, { NULL, "4 Bucket", 4, 3072 }, { NULL, "8 Bucket", 8, 2048 }, { NULL, "16 Bucket", 16, 1024 }, /* Rounded down power of 2 sizes for efficiency. */ { NULL, "32 Bucket", BUCKET_SIZE(32), 512 }, { NULL, "64 Bucket", BUCKET_SIZE(64), 256 }, { NULL, "128 Bucket", BUCKET_SIZE(128), 128 }, { NULL, "256 Bucket", BUCKET_SIZE(256), 64 }, { NULL, NULL, 0} }; /* * Flags and enumerations to be passed to internal functions. */ enum zfreeskip { SKIP_NONE = 0, SKIP_CNT = 0x00000001, SKIP_DTOR = 0x00010000, SKIP_FINI = 0x00020000, }; /* Prototypes.. */ void uma_startup1(vm_offset_t); void uma_startup2(void); static void *noobj_alloc(uma_zone_t, vm_size_t, int, uint8_t *, int); static void *page_alloc(uma_zone_t, vm_size_t, int, uint8_t *, int); static void *pcpu_page_alloc(uma_zone_t, vm_size_t, int, uint8_t *, int); static void *startup_alloc(uma_zone_t, vm_size_t, int, uint8_t *, int); static void *contig_alloc(uma_zone_t, vm_size_t, int, uint8_t *, int); static void page_free(void *, vm_size_t, uint8_t); static void pcpu_page_free(void *, vm_size_t, uint8_t); static uma_slab_t keg_alloc_slab(uma_keg_t, uma_zone_t, int, int, int); static void cache_drain(uma_zone_t); static void bucket_drain(uma_zone_t, uma_bucket_t); static void bucket_cache_reclaim(uma_zone_t zone, bool); static int keg_ctor(void *, int, void *, int); static void keg_dtor(void *, int, void *); static int zone_ctor(void *, int, void *, int); static void zone_dtor(void *, int, void *); static inline void item_dtor(uma_zone_t zone, void *item, int size, void *udata, enum zfreeskip skip); static int zero_init(void *, int, int); static void zone_free_bucket(uma_zone_t zone, uma_bucket_t bucket, void *udata, int itemdomain, bool ws); static void zone_foreach(void (*zfunc)(uma_zone_t, void *), void *); static void zone_foreach_unlocked(void (*zfunc)(uma_zone_t, void *), void *); static void zone_timeout(uma_zone_t zone, void *); static int hash_alloc(struct uma_hash *, u_int); static int hash_expand(struct uma_hash *, struct uma_hash *); static void hash_free(struct uma_hash *hash); static void uma_timeout(void *); static void uma_startup3(void); static void uma_shutdown(void); static void *zone_alloc_item(uma_zone_t, void *, int, int); static void zone_free_item(uma_zone_t, void *, void *, enum zfreeskip); static int zone_alloc_limit(uma_zone_t zone, int count, int flags); static void zone_free_limit(uma_zone_t zone, int count); static void bucket_enable(void); static void bucket_init(void); static uma_bucket_t bucket_alloc(uma_zone_t zone, void *, int); static void bucket_free(uma_zone_t zone, uma_bucket_t, void *); static void bucket_zone_drain(void); static uma_bucket_t zone_alloc_bucket(uma_zone_t, void *, int, int); static void *slab_alloc_item(uma_keg_t keg, uma_slab_t slab); static void slab_free_item(uma_zone_t zone, uma_slab_t slab, void *item); static uma_keg_t uma_kcreate(uma_zone_t zone, size_t size, uma_init uminit, uma_fini fini, int align, uint32_t flags); static int zone_import(void *, void **, int, int, int); static void zone_release(void *, void **, int); static bool cache_alloc(uma_zone_t, uma_cache_t, void *, int); static bool cache_free(uma_zone_t, uma_cache_t, void *, void *, int); static int sysctl_vm_zone_count(SYSCTL_HANDLER_ARGS); static int sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS); static int sysctl_handle_uma_zone_allocs(SYSCTL_HANDLER_ARGS); static int sysctl_handle_uma_zone_frees(SYSCTL_HANDLER_ARGS); static int sysctl_handle_uma_zone_flags(SYSCTL_HANDLER_ARGS); static int sysctl_handle_uma_slab_efficiency(SYSCTL_HANDLER_ARGS); static int sysctl_handle_uma_zone_items(SYSCTL_HANDLER_ARGS); static uint64_t uma_zone_get_allocs(uma_zone_t zone); static SYSCTL_NODE(_vm, OID_AUTO, debug, CTLFLAG_RD, 0, "Memory allocation debugging"); #ifdef INVARIANTS static uint64_t uma_keg_get_allocs(uma_keg_t zone); static inline struct noslabbits *slab_dbg_bits(uma_slab_t slab, uma_keg_t keg); static bool uma_dbg_kskip(uma_keg_t keg, void *mem); static bool uma_dbg_zskip(uma_zone_t zone, void *mem); static void uma_dbg_free(uma_zone_t zone, uma_slab_t slab, void *item); static void uma_dbg_alloc(uma_zone_t zone, uma_slab_t slab, void *item); static u_int dbg_divisor = 1; SYSCTL_UINT(_vm_debug, OID_AUTO, divisor, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &dbg_divisor, 0, "Debug & thrash every this item in memory allocator"); static counter_u64_t uma_dbg_cnt = EARLY_COUNTER; static counter_u64_t uma_skip_cnt = EARLY_COUNTER; SYSCTL_COUNTER_U64(_vm_debug, OID_AUTO, trashed, CTLFLAG_RD, &uma_dbg_cnt, "memory items debugged"); SYSCTL_COUNTER_U64(_vm_debug, OID_AUTO, skipped, CTLFLAG_RD, &uma_skip_cnt, "memory items skipped, not debugged"); #endif SYSINIT(uma_startup3, SI_SUB_VM_CONF, SI_ORDER_SECOND, uma_startup3, NULL); SYSCTL_NODE(_vm, OID_AUTO, uma, CTLFLAG_RW, 0, "Universal Memory Allocator"); SYSCTL_PROC(_vm, OID_AUTO, zone_count, CTLFLAG_RD|CTLFLAG_MPSAFE|CTLTYPE_INT, 0, 0, sysctl_vm_zone_count, "I", "Number of UMA zones"); SYSCTL_PROC(_vm, OID_AUTO, zone_stats, CTLFLAG_RD|CTLFLAG_MPSAFE|CTLTYPE_STRUCT, 0, 0, sysctl_vm_zone_stats, "s,struct uma_type_header", "Zone Stats"); static int zone_warnings = 1; SYSCTL_INT(_vm, OID_AUTO, zone_warnings, CTLFLAG_RWTUN, &zone_warnings, 0, "Warn when UMA zones becomes full"); static int multipage_slabs = 1; TUNABLE_INT("vm.debug.uma_multipage_slabs", &multipage_slabs); SYSCTL_INT(_vm_debug, OID_AUTO, uma_multipage_slabs, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &multipage_slabs, 0, "UMA may choose larger slab sizes for better efficiency"); /* * Select the slab zone for an offpage slab with the given maximum item count. */ static inline uma_zone_t slabzone(int ipers) { return (slabzones[ipers > SLABZONE0_SETSIZE]); } /* * This routine checks to see whether or not it's safe to enable buckets. */ static void bucket_enable(void) { KASSERT(booted >= BOOT_KVA, ("Bucket enable before init")); bucketdisable = vm_page_count_min(); } /* * Initialize bucket_zones, the array of zones of buckets of various sizes. * * For each zone, calculate the memory required for each bucket, consisting * of the header and an array of pointers. */ static void bucket_init(void) { struct uma_bucket_zone *ubz; int size; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) { size = roundup(sizeof(struct uma_bucket), sizeof(void *)); size += sizeof(void *) * ubz->ubz_entries; ubz->ubz_zone = uma_zcreate(ubz->ubz_name, size, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_MTXCLASS | UMA_ZFLAG_BUCKET | UMA_ZONE_FIRSTTOUCH); } } /* * Given a desired number of entries for a bucket, return the zone from which * to allocate the bucket. */ static struct uma_bucket_zone * bucket_zone_lookup(int entries) { struct uma_bucket_zone *ubz; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) if (ubz->ubz_entries >= entries) return (ubz); ubz--; return (ubz); } static struct uma_bucket_zone * bucket_zone_max(uma_zone_t zone, int nitems) { struct uma_bucket_zone *ubz; int bpcpu; bpcpu = 2; if ((zone->uz_flags & UMA_ZONE_FIRSTTOUCH) != 0) /* Count the cross-domain bucket. */ bpcpu++; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) if (ubz->ubz_entries * bpcpu * mp_ncpus > nitems) break; if (ubz == &bucket_zones[0]) ubz = NULL; else ubz--; return (ubz); } static int bucket_select(int size) { struct uma_bucket_zone *ubz; ubz = &bucket_zones[0]; if (size > ubz->ubz_maxsize) return MAX((ubz->ubz_maxsize * ubz->ubz_entries) / size, 1); for (; ubz->ubz_entries != 0; ubz++) if (ubz->ubz_maxsize < size) break; ubz--; return (ubz->ubz_entries); } static uma_bucket_t bucket_alloc(uma_zone_t zone, void *udata, int flags) { struct uma_bucket_zone *ubz; uma_bucket_t bucket; /* * Don't allocate buckets early in boot. */ if (__predict_false(booted < BOOT_KVA)) return (NULL); /* * To limit bucket recursion we store the original zone flags * in a cookie passed via zalloc_arg/zfree_arg. This allows the * NOVM flag to persist even through deep recursions. We also * store ZFLAG_BUCKET once we have recursed attempting to allocate * a bucket for a bucket zone so we do not allow infinite bucket * recursion. This cookie will even persist to frees of unused * buckets via the allocation path or bucket allocations in the * free path. */ if ((zone->uz_flags & UMA_ZFLAG_BUCKET) == 0) udata = (void *)(uintptr_t)zone->uz_flags; else { if ((uintptr_t)udata & UMA_ZFLAG_BUCKET) return (NULL); udata = (void *)((uintptr_t)udata | UMA_ZFLAG_BUCKET); } if (((uintptr_t)udata & UMA_ZONE_VM) != 0) flags |= M_NOVM; ubz = bucket_zone_lookup(zone->uz_bucket_size); if (ubz->ubz_zone == zone && (ubz + 1)->ubz_entries != 0) ubz++; bucket = uma_zalloc_arg(ubz->ubz_zone, udata, flags); if (bucket) { #ifdef INVARIANTS bzero(bucket->ub_bucket, sizeof(void *) * ubz->ubz_entries); #endif bucket->ub_cnt = 0; bucket->ub_entries = ubz->ubz_entries; bucket->ub_seq = SMR_SEQ_INVALID; CTR3(KTR_UMA, "bucket_alloc: zone %s(%p) allocated bucket %p", zone->uz_name, zone, bucket); } return (bucket); } static void bucket_free(uma_zone_t zone, uma_bucket_t bucket, void *udata) { struct uma_bucket_zone *ubz; if (bucket->ub_cnt != 0) bucket_drain(zone, bucket); KASSERT(bucket->ub_cnt == 0, ("bucket_free: Freeing a non free bucket.")); KASSERT(bucket->ub_seq == SMR_SEQ_INVALID, ("bucket_free: Freeing an SMR bucket.")); if ((zone->uz_flags & UMA_ZFLAG_BUCKET) == 0) udata = (void *)(uintptr_t)zone->uz_flags; ubz = bucket_zone_lookup(bucket->ub_entries); uma_zfree_arg(ubz->ubz_zone, bucket, udata); } static void bucket_zone_drain(void) { struct uma_bucket_zone *ubz; for (ubz = &bucket_zones[0]; ubz->ubz_entries != 0; ubz++) uma_zone_reclaim(ubz->ubz_zone, UMA_RECLAIM_DRAIN); } /* * Acquire the domain lock and record contention. */ static uma_zone_domain_t zone_domain_lock(uma_zone_t zone, int domain) { uma_zone_domain_t zdom; bool lockfail; zdom = ZDOM_GET(zone, domain); lockfail = false; if (ZDOM_OWNED(zdom)) lockfail = true; ZDOM_LOCK(zdom); /* This is unsynchronized. The counter does not need to be precise. */ if (lockfail && zone->uz_bucket_size < zone->uz_bucket_size_max) zone->uz_bucket_size++; return (zdom); } /* * Search for the domain with the least cached items and return it, breaking * ties with a preferred domain by returning it. */ static __noinline int zone_domain_lowest(uma_zone_t zone, int pref) { long least, nitems; int domain; int i; least = LONG_MAX; domain = 0; for (i = 0; i < vm_ndomains; i++) { nitems = ZDOM_GET(zone, i)->uzd_nitems; if (nitems < least) { domain = i; least = nitems; } else if (nitems == least && (i == pref || domain == pref)) domain = pref; } return (domain); } /* * Search for the domain with the most cached items and return it or the * preferred domain if it has enough to proceed. */ static __noinline int zone_domain_highest(uma_zone_t zone, int pref) { long most, nitems; int domain; int i; if (ZDOM_GET(zone, pref)->uzd_nitems > BUCKET_MAX) return (pref); most = 0; domain = 0; for (i = 0; i < vm_ndomains; i++) { nitems = ZDOM_GET(zone, i)->uzd_nitems; if (nitems > most) { domain = i; most = nitems; } } return (domain); } /* * Safely subtract cnt from imax. */ static void zone_domain_imax_sub(uma_zone_domain_t zdom, int cnt) { long new; long old; old = zdom->uzd_imax; do { if (old <= cnt) new = 0; else new = old - cnt; } while (atomic_fcmpset_long(&zdom->uzd_imax, &old, new) == 0); } /* * Set the maximum imax value. */ static void zone_domain_imax_set(uma_zone_domain_t zdom, int nitems) { long old; old = zdom->uzd_imax; do { if (old >= nitems) break; } while (atomic_fcmpset_long(&zdom->uzd_imax, &old, nitems) == 0); } /* * Attempt to satisfy an allocation by retrieving a full bucket from one of the * zone's caches. If a bucket is found the zone is not locked on return. */ static uma_bucket_t zone_fetch_bucket(uma_zone_t zone, uma_zone_domain_t zdom, bool reclaim) { uma_bucket_t bucket; int i; bool dtor = false; ZDOM_LOCK_ASSERT(zdom); if ((bucket = STAILQ_FIRST(&zdom->uzd_buckets)) == NULL) return (NULL); /* SMR Buckets can not be re-used until readers expire. */ if ((zone->uz_flags & UMA_ZONE_SMR) != 0 && bucket->ub_seq != SMR_SEQ_INVALID) { if (!smr_poll(zone->uz_smr, bucket->ub_seq, false)) return (NULL); bucket->ub_seq = SMR_SEQ_INVALID; dtor = (zone->uz_dtor != NULL) || UMA_ALWAYS_CTORDTOR; if (STAILQ_NEXT(bucket, ub_link) != NULL) zdom->uzd_seq = STAILQ_NEXT(bucket, ub_link)->ub_seq; } MPASS(zdom->uzd_nitems >= bucket->ub_cnt); STAILQ_REMOVE_HEAD(&zdom->uzd_buckets, ub_link); zdom->uzd_nitems -= bucket->ub_cnt; /* * Shift the bounds of the current WSS interval to avoid * perturbing the estimate. */ if (reclaim) { zdom->uzd_imin -= lmin(zdom->uzd_imin, bucket->ub_cnt); zone_domain_imax_sub(zdom, bucket->ub_cnt); } else if (zdom->uzd_imin > zdom->uzd_nitems) zdom->uzd_imin = zdom->uzd_nitems; ZDOM_UNLOCK(zdom); if (dtor) for (i = 0; i < bucket->ub_cnt; i++) item_dtor(zone, bucket->ub_bucket[i], zone->uz_size, NULL, SKIP_NONE); return (bucket); } /* * Insert a full bucket into the specified cache. The "ws" parameter indicates * whether the bucket's contents should be counted as part of the zone's working * set. The bucket may be freed if it exceeds the bucket limit. */ static void zone_put_bucket(uma_zone_t zone, int domain, uma_bucket_t bucket, void *udata, const bool ws) { uma_zone_domain_t zdom; /* We don't cache empty buckets. This can happen after a reclaim. */ if (bucket->ub_cnt == 0) goto out; zdom = zone_domain_lock(zone, domain); KASSERT(!ws || zdom->uzd_nitems < zone->uz_bucket_max, ("%s: zone %p overflow", __func__, zone)); /* * Conditionally set the maximum number of items. */ zdom->uzd_nitems += bucket->ub_cnt; if (__predict_true(zdom->uzd_nitems < zone->uz_bucket_max)) { if (ws) zone_domain_imax_set(zdom, zdom->uzd_nitems); if (STAILQ_EMPTY(&zdom->uzd_buckets)) zdom->uzd_seq = bucket->ub_seq; STAILQ_INSERT_TAIL(&zdom->uzd_buckets, bucket, ub_link); ZDOM_UNLOCK(zdom); return; } zdom->uzd_nitems -= bucket->ub_cnt; ZDOM_UNLOCK(zdom); out: bucket_free(zone, bucket, udata); } /* Pops an item out of a per-cpu cache bucket. */ static inline void * cache_bucket_pop(uma_cache_t cache, uma_cache_bucket_t bucket) { void *item; CRITICAL_ASSERT(curthread); bucket->ucb_cnt--; item = bucket->ucb_bucket->ub_bucket[bucket->ucb_cnt]; #ifdef INVARIANTS bucket->ucb_bucket->ub_bucket[bucket->ucb_cnt] = NULL; KASSERT(item != NULL, ("uma_zalloc: Bucket pointer mangled.")); #endif cache->uc_allocs++; return (item); } /* Pushes an item into a per-cpu cache bucket. */ static inline void cache_bucket_push(uma_cache_t cache, uma_cache_bucket_t bucket, void *item) { CRITICAL_ASSERT(curthread); KASSERT(bucket->ucb_bucket->ub_bucket[bucket->ucb_cnt] == NULL, ("uma_zfree: Freeing to non free bucket index.")); bucket->ucb_bucket->ub_bucket[bucket->ucb_cnt] = item; bucket->ucb_cnt++; cache->uc_frees++; } /* * Unload a UMA bucket from a per-cpu cache. */ static inline uma_bucket_t cache_bucket_unload(uma_cache_bucket_t bucket) { uma_bucket_t b; b = bucket->ucb_bucket; if (b != NULL) { MPASS(b->ub_entries == bucket->ucb_entries); b->ub_cnt = bucket->ucb_cnt; bucket->ucb_bucket = NULL; bucket->ucb_entries = bucket->ucb_cnt = 0; } return (b); } static inline uma_bucket_t cache_bucket_unload_alloc(uma_cache_t cache) { return (cache_bucket_unload(&cache->uc_allocbucket)); } static inline uma_bucket_t cache_bucket_unload_free(uma_cache_t cache) { return (cache_bucket_unload(&cache->uc_freebucket)); } static inline uma_bucket_t cache_bucket_unload_cross(uma_cache_t cache) { return (cache_bucket_unload(&cache->uc_crossbucket)); } /* * Load a bucket into a per-cpu cache bucket. */ static inline void cache_bucket_load(uma_cache_bucket_t bucket, uma_bucket_t b) { CRITICAL_ASSERT(curthread); MPASS(bucket->ucb_bucket == NULL); MPASS(b->ub_seq == SMR_SEQ_INVALID); bucket->ucb_bucket = b; bucket->ucb_cnt = b->ub_cnt; bucket->ucb_entries = b->ub_entries; } static inline void cache_bucket_load_alloc(uma_cache_t cache, uma_bucket_t b) { cache_bucket_load(&cache->uc_allocbucket, b); } static inline void cache_bucket_load_free(uma_cache_t cache, uma_bucket_t b) { cache_bucket_load(&cache->uc_freebucket, b); } #ifdef NUMA static inline void cache_bucket_load_cross(uma_cache_t cache, uma_bucket_t b) { cache_bucket_load(&cache->uc_crossbucket, b); } #endif /* * Copy and preserve ucb_spare. */ static inline void cache_bucket_copy(uma_cache_bucket_t b1, uma_cache_bucket_t b2) { b1->ucb_bucket = b2->ucb_bucket; b1->ucb_entries = b2->ucb_entries; b1->ucb_cnt = b2->ucb_cnt; } /* * Swap two cache buckets. */ static inline void cache_bucket_swap(uma_cache_bucket_t b1, uma_cache_bucket_t b2) { struct uma_cache_bucket b3; CRITICAL_ASSERT(curthread); cache_bucket_copy(&b3, b1); cache_bucket_copy(b1, b2); cache_bucket_copy(b2, &b3); } /* * Attempt to fetch a bucket from a zone on behalf of the current cpu cache. */ static uma_bucket_t cache_fetch_bucket(uma_zone_t zone, uma_cache_t cache, int domain) { uma_zone_domain_t zdom; uma_bucket_t bucket; /* * Avoid the lock if possible. */ zdom = ZDOM_GET(zone, domain); if (zdom->uzd_nitems == 0) return (NULL); if ((cache_uz_flags(cache) & UMA_ZONE_SMR) != 0 && !smr_poll(zone->uz_smr, zdom->uzd_seq, false)) return (NULL); /* * Check the zone's cache of buckets. */ zdom = zone_domain_lock(zone, domain); if ((bucket = zone_fetch_bucket(zone, zdom, false)) != NULL) { KASSERT(bucket->ub_cnt != 0, ("cache_fetch_bucket: Returning an empty bucket.")); return (bucket); } ZDOM_UNLOCK(zdom); return (NULL); } static void zone_log_warning(uma_zone_t zone) { static const struct timeval warninterval = { 300, 0 }; if (!zone_warnings || zone->uz_warning == NULL) return; if (ratecheck(&zone->uz_ratecheck, &warninterval)) printf("[zone: %s] %s\n", zone->uz_name, zone->uz_warning); } static inline void zone_maxaction(uma_zone_t zone) { if (zone->uz_maxaction.ta_func != NULL) taskqueue_enqueue(taskqueue_thread, &zone->uz_maxaction); } /* * Routine called by timeout which is used to fire off some time interval * based calculations. (stats, hash size, etc.) * * Arguments: * arg Unused * * Returns: * Nothing */ static void uma_timeout(void *unused) { bucket_enable(); zone_foreach(zone_timeout, NULL); /* Reschedule this event */ callout_reset(&uma_callout, UMA_TIMEOUT * hz, uma_timeout, NULL); } /* * Update the working set size estimate for the zone's bucket cache. * The constants chosen here are somewhat arbitrary. With an update period of * 20s (UMA_TIMEOUT), this estimate is dominated by zone activity over the * last 100s. */ static void zone_domain_update_wss(uma_zone_domain_t zdom) { long wss; ZDOM_LOCK(zdom); MPASS(zdom->uzd_imax >= zdom->uzd_imin); wss = zdom->uzd_imax - zdom->uzd_imin; zdom->uzd_imax = zdom->uzd_imin = zdom->uzd_nitems; zdom->uzd_wss = (4 * wss + zdom->uzd_wss) / 5; ZDOM_UNLOCK(zdom); } /* * Routine to perform timeout driven calculations. This expands the * hashes and does per cpu statistics aggregation. * * Returns nothing. */ static void zone_timeout(uma_zone_t zone, void *unused) { uma_keg_t keg; u_int slabs, pages; if ((zone->uz_flags & UMA_ZFLAG_HASH) == 0) goto update_wss; keg = zone->uz_keg; /* * Hash zones are non-numa by definition so the first domain * is the only one present. */ KEG_LOCK(keg, 0); pages = keg->uk_domain[0].ud_pages; /* * Expand the keg hash table. * * This is done if the number of slabs is larger than the hash size. * What I'm trying to do here is completely reduce collisions. This * may be a little aggressive. Should I allow for two collisions max? */ if ((slabs = pages / keg->uk_ppera) > keg->uk_hash.uh_hashsize) { struct uma_hash newhash; struct uma_hash oldhash; int ret; /* * This is so involved because allocating and freeing * while the keg lock is held will lead to deadlock. * I have to do everything in stages and check for * races. */ KEG_UNLOCK(keg, 0); ret = hash_alloc(&newhash, 1 << fls(slabs)); KEG_LOCK(keg, 0); if (ret) { if (hash_expand(&keg->uk_hash, &newhash)) { oldhash = keg->uk_hash; keg->uk_hash = newhash; } else oldhash = newhash; KEG_UNLOCK(keg, 0); hash_free(&oldhash); goto update_wss; } } KEG_UNLOCK(keg, 0); update_wss: for (int i = 0; i < vm_ndomains; i++) zone_domain_update_wss(ZDOM_GET(zone, i)); } /* * Allocate and zero fill the next sized hash table from the appropriate * backing store. * * Arguments: * hash A new hash structure with the old hash size in uh_hashsize * * Returns: * 1 on success and 0 on failure. */ static int hash_alloc(struct uma_hash *hash, u_int size) { size_t alloc; KASSERT(powerof2(size), ("hash size must be power of 2")); if (size > UMA_HASH_SIZE_INIT) { hash->uh_hashsize = size; alloc = sizeof(hash->uh_slab_hash[0]) * hash->uh_hashsize; hash->uh_slab_hash = malloc(alloc, M_UMAHASH, M_NOWAIT); } else { alloc = sizeof(hash->uh_slab_hash[0]) * UMA_HASH_SIZE_INIT; hash->uh_slab_hash = zone_alloc_item(hashzone, NULL, UMA_ANYDOMAIN, M_WAITOK); hash->uh_hashsize = UMA_HASH_SIZE_INIT; } if (hash->uh_slab_hash) { bzero(hash->uh_slab_hash, alloc); hash->uh_hashmask = hash->uh_hashsize - 1; return (1); } return (0); } /* * Expands the hash table for HASH zones. This is done from zone_timeout * to reduce collisions. This must not be done in the regular allocation * path, otherwise, we can recurse on the vm while allocating pages. * * Arguments: * oldhash The hash you want to expand * newhash The hash structure for the new table * * Returns: * Nothing * * Discussion: */ static int hash_expand(struct uma_hash *oldhash, struct uma_hash *newhash) { uma_hash_slab_t slab; u_int hval; u_int idx; if (!newhash->uh_slab_hash) return (0); if (oldhash->uh_hashsize >= newhash->uh_hashsize) return (0); /* * I need to investigate hash algorithms for resizing without a * full rehash. */ for (idx = 0; idx < oldhash->uh_hashsize; idx++) while (!LIST_EMPTY(&oldhash->uh_slab_hash[idx])) { slab = LIST_FIRST(&oldhash->uh_slab_hash[idx]); LIST_REMOVE(slab, uhs_hlink); hval = UMA_HASH(newhash, slab->uhs_data); LIST_INSERT_HEAD(&newhash->uh_slab_hash[hval], slab, uhs_hlink); } return (1); } /* * Free the hash bucket to the appropriate backing store. * * Arguments: * slab_hash The hash bucket we're freeing * hashsize The number of entries in that hash bucket * * Returns: * Nothing */ static void hash_free(struct uma_hash *hash) { if (hash->uh_slab_hash == NULL) return; if (hash->uh_hashsize == UMA_HASH_SIZE_INIT) zone_free_item(hashzone, hash->uh_slab_hash, NULL, SKIP_NONE); else free(hash->uh_slab_hash, M_UMAHASH); } /* * Frees all outstanding items in a bucket * * Arguments: * zone The zone to free to, must be unlocked. * bucket The free/alloc bucket with items. * * Returns: * Nothing */ - static void bucket_drain(uma_zone_t zone, uma_bucket_t bucket) { int i; if (bucket->ub_cnt == 0) return; if ((zone->uz_flags & UMA_ZONE_SMR) != 0 && bucket->ub_seq != SMR_SEQ_INVALID) { smr_wait(zone->uz_smr, bucket->ub_seq); bucket->ub_seq = SMR_SEQ_INVALID; for (i = 0; i < bucket->ub_cnt; i++) item_dtor(zone, bucket->ub_bucket[i], zone->uz_size, NULL, SKIP_NONE); } if (zone->uz_fini) for (i = 0; i < bucket->ub_cnt; i++) zone->uz_fini(bucket->ub_bucket[i], zone->uz_size); zone->uz_release(zone->uz_arg, bucket->ub_bucket, bucket->ub_cnt); if (zone->uz_max_items > 0) zone_free_limit(zone, bucket->ub_cnt); #ifdef INVARIANTS bzero(bucket->ub_bucket, sizeof(void *) * bucket->ub_cnt); #endif bucket->ub_cnt = 0; } /* * Drains the per cpu caches for a zone. * * NOTE: This may only be called while the zone is being torn down, and not * during normal operation. This is necessary in order that we do not have * to migrate CPUs to drain the per-CPU caches. * * Arguments: * zone The zone to drain, must be unlocked. * * Returns: * Nothing */ static void cache_drain(uma_zone_t zone) { uma_cache_t cache; uma_bucket_t bucket; smr_seq_t seq; int cpu; /* * XXX: It is safe to not lock the per-CPU caches, because we're * tearing down the zone anyway. I.e., there will be no further use * of the caches at this point. * * XXX: It would good to be able to assert that the zone is being * torn down to prevent improper use of cache_drain(). */ seq = SMR_SEQ_INVALID; if ((zone->uz_flags & UMA_ZONE_SMR) != 0) - seq = smr_current(zone->uz_smr); + seq = smr_advance(zone->uz_smr); CPU_FOREACH(cpu) { cache = &zone->uz_cpu[cpu]; bucket = cache_bucket_unload_alloc(cache); if (bucket != NULL) bucket_free(zone, bucket, NULL); bucket = cache_bucket_unload_free(cache); if (bucket != NULL) { bucket->ub_seq = seq; bucket_free(zone, bucket, NULL); } bucket = cache_bucket_unload_cross(cache); if (bucket != NULL) { bucket->ub_seq = seq; bucket_free(zone, bucket, NULL); } } bucket_cache_reclaim(zone, true); } static void cache_shrink(uma_zone_t zone, void *unused) { if (zone->uz_flags & UMA_ZFLAG_INTERNAL) return; zone->uz_bucket_size = (zone->uz_bucket_size_min + zone->uz_bucket_size) / 2; } static void cache_drain_safe_cpu(uma_zone_t zone, void *unused) { uma_cache_t cache; uma_bucket_t b1, b2, b3; int domain; if (zone->uz_flags & UMA_ZFLAG_INTERNAL) return; b1 = b2 = b3 = NULL; critical_enter(); cache = &zone->uz_cpu[curcpu]; domain = PCPU_GET(domain); b1 = cache_bucket_unload_alloc(cache); /* * Don't flush SMR zone buckets. This leaves the zone without a * bucket and forces every free to synchronize(). */ if ((zone->uz_flags & UMA_ZONE_SMR) == 0) { b2 = cache_bucket_unload_free(cache); b3 = cache_bucket_unload_cross(cache); } critical_exit(); if (b1 != NULL) zone_free_bucket(zone, b1, NULL, domain, false); if (b2 != NULL) zone_free_bucket(zone, b2, NULL, domain, false); if (b3 != NULL) { /* Adjust the domain so it goes to zone_free_cross. */ domain = (domain + 1) % vm_ndomains; zone_free_bucket(zone, b3, NULL, domain, false); } } /* * Safely drain per-CPU caches of a zone(s) to alloc bucket. * This is an expensive call because it needs to bind to all CPUs * one by one and enter a critical section on each of them in order * to safely access their cache buckets. * Zone lock must not be held on call this function. */ static void pcpu_cache_drain_safe(uma_zone_t zone) { int cpu; /* * Polite bucket sizes shrinking was not enough, shrink aggressively. */ if (zone) cache_shrink(zone, NULL); else zone_foreach(cache_shrink, NULL); CPU_FOREACH(cpu) { thread_lock(curthread); sched_bind(curthread, cpu); thread_unlock(curthread); if (zone) cache_drain_safe_cpu(zone, NULL); else zone_foreach(cache_drain_safe_cpu, NULL); } thread_lock(curthread); sched_unbind(curthread); thread_unlock(curthread); } /* * Reclaim cached buckets from a zone. All buckets are reclaimed if the caller * requested a drain, otherwise the per-domain caches are trimmed to either * estimated working set size. */ static void bucket_cache_reclaim(uma_zone_t zone, bool drain) { uma_zone_domain_t zdom; uma_bucket_t bucket; long target; int i; /* * Shrink the zone bucket size to ensure that the per-CPU caches * don't grow too large. */ if (zone->uz_bucket_size > zone->uz_bucket_size_min) zone->uz_bucket_size--; for (i = 0; i < vm_ndomains; i++) { /* * The cross bucket is partially filled and not part of * the item count. Reclaim it individually here. */ zdom = ZDOM_GET(zone, i); - if ((zone->uz_flags & UMA_ZONE_SMR) == 0) { + if ((zone->uz_flags & UMA_ZONE_SMR) == 0 || drain) { ZONE_CROSS_LOCK(zone); bucket = zdom->uzd_cross; zdom->uzd_cross = NULL; ZONE_CROSS_UNLOCK(zone); if (bucket != NULL) bucket_free(zone, bucket, NULL); } /* * If we were asked to drain the zone, we are done only once * this bucket cache is empty. Otherwise, we reclaim items in * excess of the zone's estimated working set size. If the * difference nitems - imin is larger than the WSS estimate, * then the estimate will grow at the end of this interval and * we ignore the historical average. */ ZDOM_LOCK(zdom); target = drain ? 0 : lmax(zdom->uzd_wss, zdom->uzd_nitems - zdom->uzd_imin); while (zdom->uzd_nitems > target) { bucket = zone_fetch_bucket(zone, zdom, true); if (bucket == NULL) break; bucket_free(zone, bucket, NULL); ZDOM_LOCK(zdom); } ZDOM_UNLOCK(zdom); } } static void keg_free_slab(uma_keg_t keg, uma_slab_t slab, int start) { uint8_t *mem; int i; uint8_t flags; CTR4(KTR_UMA, "keg_free_slab keg %s(%p) slab %p, returning %d bytes", keg->uk_name, keg, slab, PAGE_SIZE * keg->uk_ppera); mem = slab_data(slab, keg); flags = slab->us_flags; i = start; if (keg->uk_fini != NULL) { for (i--; i > -1; i--) #ifdef INVARIANTS /* * trash_fini implies that dtor was trash_dtor. trash_fini * would check that memory hasn't been modified since free, * which executed trash_dtor. * That's why we need to run uma_dbg_kskip() check here, * albeit we don't make skip check for other init/fini * invocations. */ if (!uma_dbg_kskip(keg, slab_item(slab, keg, i)) || keg->uk_fini != trash_fini) #endif keg->uk_fini(slab_item(slab, keg, i), keg->uk_size); } if (keg->uk_flags & UMA_ZFLAG_OFFPAGE) zone_free_item(slabzone(keg->uk_ipers), slab_tohashslab(slab), NULL, SKIP_NONE); keg->uk_freef(mem, PAGE_SIZE * keg->uk_ppera, flags); uma_total_dec(PAGE_SIZE * keg->uk_ppera); } /* * Frees pages from a keg back to the system. This is done on demand from * the pageout daemon. * * Returns nothing. */ static void keg_drain(uma_keg_t keg) { struct slabhead freeslabs; uma_domain_t dom; uma_slab_t slab, tmp; int i, n; if (keg->uk_flags & UMA_ZONE_NOFREE || keg->uk_freef == NULL) return; for (i = 0; i < vm_ndomains; i++) { CTR4(KTR_UMA, "keg_drain %s(%p) domain %d free items: %u", keg->uk_name, keg, i, dom->ud_free_items); dom = &keg->uk_domain[i]; LIST_INIT(&freeslabs); KEG_LOCK(keg, i); if ((keg->uk_flags & UMA_ZFLAG_HASH) != 0) { LIST_FOREACH(slab, &dom->ud_free_slab, us_link) UMA_HASH_REMOVE(&keg->uk_hash, slab); } n = dom->ud_free_slabs; LIST_SWAP(&freeslabs, &dom->ud_free_slab, uma_slab, us_link); dom->ud_free_slabs = 0; dom->ud_free_items -= n * keg->uk_ipers; dom->ud_pages -= n * keg->uk_ppera; KEG_UNLOCK(keg, i); LIST_FOREACH_SAFE(slab, &freeslabs, us_link, tmp) keg_free_slab(keg, slab, keg->uk_ipers); } } static void zone_reclaim(uma_zone_t zone, int waitok, bool drain) { /* * Set draining to interlock with zone_dtor() so we can release our * locks as we go. Only dtor() should do a WAITOK call since it * is the only call that knows the structure will still be available * when it wakes up. */ ZONE_LOCK(zone); while (zone->uz_flags & UMA_ZFLAG_RECLAIMING) { if (waitok == M_NOWAIT) goto out; msleep(zone, &ZDOM_GET(zone, 0)->uzd_lock, PVM, "zonedrain", 1); } zone->uz_flags |= UMA_ZFLAG_RECLAIMING; ZONE_UNLOCK(zone); bucket_cache_reclaim(zone, drain); /* * The DRAINING flag protects us from being freed while * we're running. Normally the uma_rwlock would protect us but we * must be able to release and acquire the right lock for each keg. */ if ((zone->uz_flags & UMA_ZFLAG_CACHE) == 0) keg_drain(zone->uz_keg); ZONE_LOCK(zone); zone->uz_flags &= ~UMA_ZFLAG_RECLAIMING; wakeup(zone); out: ZONE_UNLOCK(zone); } static void zone_drain(uma_zone_t zone, void *unused) { zone_reclaim(zone, M_NOWAIT, true); } static void zone_trim(uma_zone_t zone, void *unused) { zone_reclaim(zone, M_NOWAIT, false); } /* * Allocate a new slab for a keg and inserts it into the partial slab list. * The keg should be unlocked on entry. If the allocation succeeds it will * be locked on return. * * Arguments: * flags Wait flags for the item initialization routine * aflags Wait flags for the slab allocation * * Returns: * The slab that was allocated or NULL if there is no memory and the * caller specified M_NOWAIT. */ static uma_slab_t keg_alloc_slab(uma_keg_t keg, uma_zone_t zone, int domain, int flags, int aflags) { uma_domain_t dom; uma_alloc allocf; uma_slab_t slab; unsigned long size; uint8_t *mem; uint8_t sflags; int i; KASSERT(domain >= 0 && domain < vm_ndomains, ("keg_alloc_slab: domain %d out of range", domain)); allocf = keg->uk_allocf; slab = NULL; mem = NULL; if (keg->uk_flags & UMA_ZFLAG_OFFPAGE) { uma_hash_slab_t hslab; hslab = zone_alloc_item(slabzone(keg->uk_ipers), NULL, domain, aflags); if (hslab == NULL) goto fail; slab = &hslab->uhs_slab; } /* * This reproduces the old vm_zone behavior of zero filling pages the * first time they are added to a zone. * * Malloced items are zeroed in uma_zalloc. */ if ((keg->uk_flags & UMA_ZONE_MALLOC) == 0) aflags |= M_ZERO; else aflags &= ~M_ZERO; if (keg->uk_flags & UMA_ZONE_NODUMP) aflags |= M_NODUMP; /* zone is passed for legacy reasons. */ size = keg->uk_ppera * PAGE_SIZE; mem = allocf(zone, size, domain, &sflags, aflags); if (mem == NULL) { if (keg->uk_flags & UMA_ZFLAG_OFFPAGE) zone_free_item(slabzone(keg->uk_ipers), slab_tohashslab(slab), NULL, SKIP_NONE); goto fail; } uma_total_inc(size); /* For HASH zones all pages go to the same uma_domain. */ if ((keg->uk_flags & UMA_ZFLAG_HASH) != 0) domain = 0; /* Point the slab into the allocated memory */ if (!(keg->uk_flags & UMA_ZFLAG_OFFPAGE)) slab = (uma_slab_t )(mem + keg->uk_pgoff); else slab_tohashslab(slab)->uhs_data = mem; if (keg->uk_flags & UMA_ZFLAG_VTOSLAB) for (i = 0; i < keg->uk_ppera; i++) vsetzoneslab((vm_offset_t)mem + (i * PAGE_SIZE), zone, slab); slab->us_freecount = keg->uk_ipers; slab->us_flags = sflags; slab->us_domain = domain; BIT_FILL(keg->uk_ipers, &slab->us_free); #ifdef INVARIANTS BIT_ZERO(keg->uk_ipers, slab_dbg_bits(slab, keg)); #endif if (keg->uk_init != NULL) { for (i = 0; i < keg->uk_ipers; i++) if (keg->uk_init(slab_item(slab, keg, i), keg->uk_size, flags) != 0) break; if (i != keg->uk_ipers) { keg_free_slab(keg, slab, i); goto fail; } } KEG_LOCK(keg, domain); CTR3(KTR_UMA, "keg_alloc_slab: allocated slab %p for %s(%p)", slab, keg->uk_name, keg); if (keg->uk_flags & UMA_ZFLAG_HASH) UMA_HASH_INSERT(&keg->uk_hash, slab, mem); /* * If we got a slab here it's safe to mark it partially used * and return. We assume that the caller is going to remove * at least one item. */ dom = &keg->uk_domain[domain]; LIST_INSERT_HEAD(&dom->ud_part_slab, slab, us_link); dom->ud_pages += keg->uk_ppera; dom->ud_free_items += keg->uk_ipers; return (slab); fail: return (NULL); } /* * This function is intended to be used early on in place of page_alloc() so * that we may use the boot time page cache to satisfy allocations before * the VM is ready. */ static void * startup_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *pflag, int wait) { vm_paddr_t pa; vm_page_t m; void *mem; int pages; int i; pages = howmany(bytes, PAGE_SIZE); KASSERT(pages > 0, ("%s can't reserve 0 pages", __func__)); *pflag = UMA_SLAB_BOOT; m = vm_page_alloc_contig_domain(NULL, 0, domain, malloc2vm_flags(wait) | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED, pages, (vm_paddr_t)0, ~(vm_paddr_t)0, 1, 0, VM_MEMATTR_DEFAULT); if (m == NULL) return (NULL); pa = VM_PAGE_TO_PHYS(m); for (i = 0; i < pages; i++, pa += PAGE_SIZE) { #if defined(__aarch64__) || defined(__amd64__) || defined(__mips__) || \ defined(__riscv) || defined(__powerpc64__) if ((wait & M_NODUMP) == 0) dump_add_page(pa); #endif } /* Allocate KVA and indirectly advance bootmem. */ mem = (void *)pmap_map(&bootmem, m->phys_addr, m->phys_addr + (pages * PAGE_SIZE), VM_PROT_READ | VM_PROT_WRITE); if ((wait & M_ZERO) != 0) bzero(mem, pages * PAGE_SIZE); return (mem); } static void startup_free(void *mem, vm_size_t bytes) { vm_offset_t va; vm_page_t m; va = (vm_offset_t)mem; m = PHYS_TO_VM_PAGE(pmap_kextract(va)); pmap_remove(kernel_pmap, va, va + bytes); for (; bytes != 0; bytes -= PAGE_SIZE, m++) { #if defined(__aarch64__) || defined(__amd64__) || defined(__mips__) || \ defined(__riscv) || defined(__powerpc64__) dump_drop_page(VM_PAGE_TO_PHYS(m)); #endif vm_page_unwire_noq(m); vm_page_free(m); } } /* * Allocates a number of pages from the system * * Arguments: * bytes The number of bytes requested * wait Shall we wait? * * Returns: * A pointer to the alloced memory or possibly * NULL if M_NOWAIT is set. */ static void * page_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *pflag, int wait) { void *p; /* Returned page */ *pflag = UMA_SLAB_KERNEL; p = (void *)kmem_malloc_domainset(DOMAINSET_FIXED(domain), bytes, wait); return (p); } static void * pcpu_page_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *pflag, int wait) { struct pglist alloctail; vm_offset_t addr, zkva; int cpu, flags; vm_page_t p, p_next; #ifdef NUMA struct pcpu *pc; #endif MPASS(bytes == (mp_maxid + 1) * PAGE_SIZE); TAILQ_INIT(&alloctail); flags = VM_ALLOC_SYSTEM | VM_ALLOC_WIRED | VM_ALLOC_NOOBJ | malloc2vm_flags(wait); *pflag = UMA_SLAB_KERNEL; for (cpu = 0; cpu <= mp_maxid; cpu++) { if (CPU_ABSENT(cpu)) { p = vm_page_alloc(NULL, 0, flags); } else { #ifndef NUMA p = vm_page_alloc(NULL, 0, flags); #else pc = pcpu_find(cpu); if (__predict_false(VM_DOMAIN_EMPTY(pc->pc_domain))) p = NULL; else p = vm_page_alloc_domain(NULL, 0, pc->pc_domain, flags); if (__predict_false(p == NULL)) p = vm_page_alloc(NULL, 0, flags); #endif } if (__predict_false(p == NULL)) goto fail; TAILQ_INSERT_TAIL(&alloctail, p, listq); } if ((addr = kva_alloc(bytes)) == 0) goto fail; zkva = addr; TAILQ_FOREACH(p, &alloctail, listq) { pmap_qenter(zkva, &p, 1); zkva += PAGE_SIZE; } return ((void*)addr); fail: TAILQ_FOREACH_SAFE(p, &alloctail, listq, p_next) { vm_page_unwire_noq(p); vm_page_free(p); } return (NULL); } /* * Allocates a number of pages from within an object * * Arguments: * bytes The number of bytes requested * wait Shall we wait? * * Returns: * A pointer to the alloced memory or possibly * NULL if M_NOWAIT is set. */ static void * noobj_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *flags, int wait) { TAILQ_HEAD(, vm_page) alloctail; u_long npages; vm_offset_t retkva, zkva; vm_page_t p, p_next; uma_keg_t keg; TAILQ_INIT(&alloctail); keg = zone->uz_keg; npages = howmany(bytes, PAGE_SIZE); while (npages > 0) { p = vm_page_alloc_domain(NULL, 0, domain, VM_ALLOC_INTERRUPT | VM_ALLOC_WIRED | VM_ALLOC_NOOBJ | ((wait & M_WAITOK) != 0 ? VM_ALLOC_WAITOK : VM_ALLOC_NOWAIT)); if (p != NULL) { /* * Since the page does not belong to an object, its * listq is unused. */ TAILQ_INSERT_TAIL(&alloctail, p, listq); npages--; continue; } /* * Page allocation failed, free intermediate pages and * exit. */ TAILQ_FOREACH_SAFE(p, &alloctail, listq, p_next) { vm_page_unwire_noq(p); vm_page_free(p); } return (NULL); } *flags = UMA_SLAB_PRIV; zkva = keg->uk_kva + atomic_fetchadd_long(&keg->uk_offset, round_page(bytes)); retkva = zkva; TAILQ_FOREACH(p, &alloctail, listq) { pmap_qenter(zkva, &p, 1); zkva += PAGE_SIZE; } return ((void *)retkva); } /* * Allocate physically contiguous pages. */ static void * contig_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *pflag, int wait) { *pflag = UMA_SLAB_KERNEL; return ((void *)kmem_alloc_contig_domainset(DOMAINSET_FIXED(domain), bytes, wait, 0, ~(vm_paddr_t)0, 1, 0, VM_MEMATTR_DEFAULT)); } /* * Frees a number of pages to the system * * Arguments: * mem A pointer to the memory to be freed * size The size of the memory being freed * flags The original p->us_flags field * * Returns: * Nothing */ static void page_free(void *mem, vm_size_t size, uint8_t flags) { if ((flags & UMA_SLAB_BOOT) != 0) { startup_free(mem, size); return; } KASSERT((flags & UMA_SLAB_KERNEL) != 0, ("UMA: page_free used with invalid flags %x", flags)); kmem_free((vm_offset_t)mem, size); } /* * Frees pcpu zone allocations * * Arguments: * mem A pointer to the memory to be freed * size The size of the memory being freed * flags The original p->us_flags field * * Returns: * Nothing */ static void pcpu_page_free(void *mem, vm_size_t size, uint8_t flags) { vm_offset_t sva, curva; vm_paddr_t paddr; vm_page_t m; MPASS(size == (mp_maxid+1)*PAGE_SIZE); if ((flags & UMA_SLAB_BOOT) != 0) { startup_free(mem, size); return; } sva = (vm_offset_t)mem; for (curva = sva; curva < sva + size; curva += PAGE_SIZE) { paddr = pmap_kextract(curva); m = PHYS_TO_VM_PAGE(paddr); vm_page_unwire_noq(m); vm_page_free(m); } pmap_qremove(sva, size >> PAGE_SHIFT); kva_free(sva, size); } /* * Zero fill initializer * * Arguments/Returns follow uma_init specifications */ static int zero_init(void *mem, int size, int flags) { bzero(mem, size); return (0); } #ifdef INVARIANTS struct noslabbits * slab_dbg_bits(uma_slab_t slab, uma_keg_t keg) { return ((void *)((char *)&slab->us_free + BITSET_SIZE(keg->uk_ipers))); } #endif /* * Actual size of embedded struct slab (!OFFPAGE). */ size_t slab_sizeof(int nitems) { size_t s; s = sizeof(struct uma_slab) + BITSET_SIZE(nitems) * SLAB_BITSETS; return (roundup(s, UMA_ALIGN_PTR + 1)); } /* * Size of memory for embedded slabs (!OFFPAGE). */ size_t slab_space(int nitems) { return (UMA_SLAB_SIZE - slab_sizeof(nitems)); } #define UMA_FIXPT_SHIFT 31 #define UMA_FRAC_FIXPT(n, d) \ ((uint32_t)(((uint64_t)(n) << UMA_FIXPT_SHIFT) / (d))) #define UMA_FIXPT_PCT(f) \ ((u_int)(((uint64_t)100 * (f)) >> UMA_FIXPT_SHIFT)) #define UMA_PCT_FIXPT(pct) UMA_FRAC_FIXPT((pct), 100) #define UMA_MIN_EFF UMA_PCT_FIXPT(100 - UMA_MAX_WASTE) /* * Compute the number of items that will fit in a slab. If hdr is true, the * item count may be limited to provide space in the slab for an inline slab * header. Otherwise, all slab space will be provided for item storage. */ static u_int slab_ipers_hdr(u_int size, u_int rsize, u_int slabsize, bool hdr) { u_int ipers; u_int padpi; /* The padding between items is not needed after the last item. */ padpi = rsize - size; if (hdr) { /* * Start with the maximum item count and remove items until * the slab header first alongside the allocatable memory. */ for (ipers = MIN(SLAB_MAX_SETSIZE, (slabsize + padpi - slab_sizeof(1)) / rsize); ipers > 0 && ipers * rsize - padpi + slab_sizeof(ipers) > slabsize; ipers--) continue; } else { ipers = MIN((slabsize + padpi) / rsize, SLAB_MAX_SETSIZE); } return (ipers); } /* * Compute the number of items that will fit in a slab for a startup zone. */ int slab_ipers(size_t size, int align) { int rsize; rsize = roundup(size, align + 1); /* Assume no CACHESPREAD */ return (slab_ipers_hdr(size, rsize, UMA_SLAB_SIZE, true)); } struct keg_layout_result { u_int format; u_int slabsize; u_int ipers; u_int eff; }; static void keg_layout_one(uma_keg_t keg, u_int rsize, u_int slabsize, u_int fmt, struct keg_layout_result *kl) { u_int total; kl->format = fmt; kl->slabsize = slabsize; /* Handle INTERNAL as inline with an extra page. */ if ((fmt & UMA_ZFLAG_INTERNAL) != 0) { kl->format &= ~UMA_ZFLAG_INTERNAL; kl->slabsize += PAGE_SIZE; } kl->ipers = slab_ipers_hdr(keg->uk_size, rsize, kl->slabsize, (fmt & UMA_ZFLAG_OFFPAGE) == 0); /* Account for memory used by an offpage slab header. */ total = kl->slabsize; if ((fmt & UMA_ZFLAG_OFFPAGE) != 0) total += slabzone(kl->ipers)->uz_keg->uk_rsize; kl->eff = UMA_FRAC_FIXPT(kl->ipers * rsize, total); } /* * Determine the format of a uma keg. This determines where the slab header * will be placed (inline or offpage) and calculates ipers, rsize, and ppera. * * Arguments * keg The zone we should initialize * * Returns * Nothing */ static void keg_layout(uma_keg_t keg) { struct keg_layout_result kl = {}, kl_tmp; u_int fmts[2]; u_int alignsize; u_int nfmt; u_int pages; u_int rsize; u_int slabsize; u_int i, j; KASSERT((keg->uk_flags & UMA_ZONE_PCPU) == 0 || (keg->uk_size <= UMA_PCPU_ALLOC_SIZE && (keg->uk_flags & UMA_ZONE_CACHESPREAD) == 0), ("%s: cannot configure for PCPU: keg=%s, size=%u, flags=0x%b", __func__, keg->uk_name, keg->uk_size, keg->uk_flags, PRINT_UMA_ZFLAGS)); KASSERT((keg->uk_flags & (UMA_ZFLAG_INTERNAL | UMA_ZONE_VM)) == 0 || (keg->uk_flags & (UMA_ZONE_NOTOUCH | UMA_ZONE_PCPU)) == 0, ("%s: incompatible flags 0x%b", __func__, keg->uk_flags, PRINT_UMA_ZFLAGS)); alignsize = keg->uk_align + 1; /* * Calculate the size of each allocation (rsize) according to * alignment. If the requested size is smaller than we have * allocation bits for we round it up. */ rsize = MAX(keg->uk_size, UMA_SMALLEST_UNIT); rsize = roundup2(rsize, alignsize); if ((keg->uk_flags & UMA_ZONE_CACHESPREAD) != 0) { /* * We want one item to start on every align boundary in a page. * To do this we will span pages. We will also extend the item * by the size of align if it is an even multiple of align. * Otherwise, it would fall on the same boundary every time. */ if ((rsize & alignsize) == 0) rsize += alignsize; slabsize = rsize * (PAGE_SIZE / alignsize); slabsize = MIN(slabsize, rsize * SLAB_MAX_SETSIZE); slabsize = MIN(slabsize, UMA_CACHESPREAD_MAX_SIZE); slabsize = round_page(slabsize); } else { /* * Start with a slab size of as many pages as it takes to * represent a single item. We will try to fit as many * additional items into the slab as possible. */ slabsize = round_page(keg->uk_size); } /* Build a list of all of the available formats for this keg. */ nfmt = 0; /* Evaluate an inline slab layout. */ if ((keg->uk_flags & (UMA_ZONE_NOTOUCH | UMA_ZONE_PCPU)) == 0) fmts[nfmt++] = 0; /* TODO: vm_page-embedded slab. */ /* * We can't do OFFPAGE if we're internal or if we've been * asked to not go to the VM for buckets. If we do this we * may end up going to the VM for slabs which we do not want * to do if we're UMA_ZONE_VM, which clearly forbids it. * In those cases, evaluate a pseudo-format called INTERNAL * which has an inline slab header and one extra page to * guarantee that it fits. * * Otherwise, see if using an OFFPAGE slab will improve our * efficiency. */ if ((keg->uk_flags & (UMA_ZFLAG_INTERNAL | UMA_ZONE_VM)) != 0) fmts[nfmt++] = UMA_ZFLAG_INTERNAL; else fmts[nfmt++] = UMA_ZFLAG_OFFPAGE; /* * Choose a slab size and format which satisfy the minimum efficiency. * Prefer the smallest slab size that meets the constraints. * * Start with a minimum slab size, to accommodate CACHESPREAD. Then, * for small items (up to PAGE_SIZE), the iteration increment is one * page; and for large items, the increment is one item. */ i = (slabsize + rsize - keg->uk_size) / MAX(PAGE_SIZE, rsize); KASSERT(i >= 1, ("keg %s(%p) flags=0x%b slabsize=%u, rsize=%u, i=%u", keg->uk_name, keg, keg->uk_flags, PRINT_UMA_ZFLAGS, slabsize, rsize, i)); for ( ; ; i++) { slabsize = (rsize <= PAGE_SIZE) ? ptoa(i) : round_page(rsize * (i - 1) + keg->uk_size); for (j = 0; j < nfmt; j++) { /* Only if we have no viable format yet. */ if ((fmts[j] & UMA_ZFLAG_INTERNAL) != 0 && kl.ipers > 0) continue; keg_layout_one(keg, rsize, slabsize, fmts[j], &kl_tmp); if (kl_tmp.eff <= kl.eff) continue; kl = kl_tmp; CTR6(KTR_UMA, "keg %s layout: format %#x " "(ipers %u * rsize %u) / slabsize %#x = %u%% eff", keg->uk_name, kl.format, kl.ipers, rsize, kl.slabsize, UMA_FIXPT_PCT(kl.eff)); /* Stop when we reach the minimum efficiency. */ if (kl.eff >= UMA_MIN_EFF) break; } if (kl.eff >= UMA_MIN_EFF || !multipage_slabs || slabsize >= SLAB_MAX_SETSIZE * rsize || (keg->uk_flags & (UMA_ZONE_PCPU | UMA_ZONE_CONTIG)) != 0) break; } pages = atop(kl.slabsize); if ((keg->uk_flags & UMA_ZONE_PCPU) != 0) pages *= mp_maxid + 1; keg->uk_rsize = rsize; keg->uk_ipers = kl.ipers; keg->uk_ppera = pages; keg->uk_flags |= kl.format; /* * How do we find the slab header if it is offpage or if not all item * start addresses are in the same page? We could solve the latter * case with vaddr alignment, but we don't. */ if ((keg->uk_flags & UMA_ZFLAG_OFFPAGE) != 0 || (keg->uk_ipers - 1) * rsize >= PAGE_SIZE) { if ((keg->uk_flags & UMA_ZONE_NOTPAGE) != 0) keg->uk_flags |= UMA_ZFLAG_HASH; else keg->uk_flags |= UMA_ZFLAG_VTOSLAB; } CTR6(KTR_UMA, "%s: keg=%s, flags=%#x, rsize=%u, ipers=%u, ppera=%u", __func__, keg->uk_name, keg->uk_flags, rsize, keg->uk_ipers, pages); KASSERT(keg->uk_ipers > 0 && keg->uk_ipers <= SLAB_MAX_SETSIZE, ("%s: keg=%s, flags=0x%b, rsize=%u, ipers=%u, ppera=%u", __func__, keg->uk_name, keg->uk_flags, PRINT_UMA_ZFLAGS, rsize, keg->uk_ipers, pages)); } /* * Keg header ctor. This initializes all fields, locks, etc. And inserts * the keg onto the global keg list. * * Arguments/Returns follow uma_ctor specifications * udata Actually uma_kctor_args */ static int keg_ctor(void *mem, int size, void *udata, int flags) { struct uma_kctor_args *arg = udata; uma_keg_t keg = mem; uma_zone_t zone; int i; bzero(keg, size); keg->uk_size = arg->size; keg->uk_init = arg->uminit; keg->uk_fini = arg->fini; keg->uk_align = arg->align; keg->uk_reserve = 0; keg->uk_flags = arg->flags; /* * We use a global round-robin policy by default. Zones with * UMA_ZONE_FIRSTTOUCH set will use first-touch instead, in which * case the iterator is never run. */ keg->uk_dr.dr_policy = DOMAINSET_RR(); keg->uk_dr.dr_iter = 0; /* * The master zone is passed to us at keg-creation time. */ zone = arg->zone; keg->uk_name = zone->uz_name; if (arg->flags & UMA_ZONE_ZINIT) keg->uk_init = zero_init; if (arg->flags & UMA_ZONE_MALLOC) keg->uk_flags |= UMA_ZFLAG_VTOSLAB; #ifndef SMP keg->uk_flags &= ~UMA_ZONE_PCPU; #endif keg_layout(keg); /* * Use a first-touch NUMA policy for kegs that pmap_extract() will * work on. Use round-robin for everything else. * * Zones may override the default by specifying either. */ #ifdef NUMA if ((keg->uk_flags & (UMA_ZONE_ROUNDROBIN | UMA_ZFLAG_CACHE | UMA_ZONE_NOTPAGE)) == 0) keg->uk_flags |= UMA_ZONE_FIRSTTOUCH; else if ((keg->uk_flags & UMA_ZONE_FIRSTTOUCH) == 0) keg->uk_flags |= UMA_ZONE_ROUNDROBIN; #endif /* * If we haven't booted yet we need allocations to go through the * startup cache until the vm is ready. */ #ifdef UMA_MD_SMALL_ALLOC if (keg->uk_ppera == 1) keg->uk_allocf = uma_small_alloc; else #endif if (booted < BOOT_KVA) keg->uk_allocf = startup_alloc; else if (keg->uk_flags & UMA_ZONE_PCPU) keg->uk_allocf = pcpu_page_alloc; else if ((keg->uk_flags & UMA_ZONE_CONTIG) != 0 && keg->uk_ppera > 1) keg->uk_allocf = contig_alloc; else keg->uk_allocf = page_alloc; #ifdef UMA_MD_SMALL_ALLOC if (keg->uk_ppera == 1) keg->uk_freef = uma_small_free; else #endif if (keg->uk_flags & UMA_ZONE_PCPU) keg->uk_freef = pcpu_page_free; else keg->uk_freef = page_free; /* * Initialize keg's locks. */ for (i = 0; i < vm_ndomains; i++) KEG_LOCK_INIT(keg, i, (arg->flags & UMA_ZONE_MTXCLASS)); /* * If we're putting the slab header in the actual page we need to * figure out where in each page it goes. See slab_sizeof * definition. */ if (!(keg->uk_flags & UMA_ZFLAG_OFFPAGE)) { size_t shsize; shsize = slab_sizeof(keg->uk_ipers); keg->uk_pgoff = (PAGE_SIZE * keg->uk_ppera) - shsize; /* * The only way the following is possible is if with our * UMA_ALIGN_PTR adjustments we are now bigger than * UMA_SLAB_SIZE. I haven't checked whether this is * mathematically possible for all cases, so we make * sure here anyway. */ KASSERT(keg->uk_pgoff + shsize <= PAGE_SIZE * keg->uk_ppera, ("zone %s ipers %d rsize %d size %d slab won't fit", zone->uz_name, keg->uk_ipers, keg->uk_rsize, keg->uk_size)); } if (keg->uk_flags & UMA_ZFLAG_HASH) hash_alloc(&keg->uk_hash, 0); CTR3(KTR_UMA, "keg_ctor %p zone %s(%p)", keg, zone->uz_name, zone); LIST_INSERT_HEAD(&keg->uk_zones, zone, uz_link); rw_wlock(&uma_rwlock); LIST_INSERT_HEAD(&uma_kegs, keg, uk_link); rw_wunlock(&uma_rwlock); return (0); } static void zone_kva_available(uma_zone_t zone, void *unused) { uma_keg_t keg; if ((zone->uz_flags & UMA_ZFLAG_CACHE) != 0) return; KEG_GET(zone, keg); if (keg->uk_allocf == startup_alloc) { /* Switch to the real allocator. */ if (keg->uk_flags & UMA_ZONE_PCPU) keg->uk_allocf = pcpu_page_alloc; else if ((keg->uk_flags & UMA_ZONE_CONTIG) != 0 && keg->uk_ppera > 1) keg->uk_allocf = contig_alloc; else keg->uk_allocf = page_alloc; } } static void zone_alloc_counters(uma_zone_t zone, void *unused) { zone->uz_allocs = counter_u64_alloc(M_WAITOK); zone->uz_frees = counter_u64_alloc(M_WAITOK); zone->uz_fails = counter_u64_alloc(M_WAITOK); zone->uz_xdomain = counter_u64_alloc(M_WAITOK); } static void zone_alloc_sysctl(uma_zone_t zone, void *unused) { uma_zone_domain_t zdom; uma_domain_t dom; uma_keg_t keg; struct sysctl_oid *oid, *domainoid; int domains, i, cnt; static const char *nokeg = "cache zone"; char *c; /* * Make a sysctl safe copy of the zone name by removing * any special characters and handling dups by appending * an index. */ if (zone->uz_namecnt != 0) { /* Count the number of decimal digits and '_' separator. */ for (i = 1, cnt = zone->uz_namecnt; cnt != 0; i++) cnt /= 10; zone->uz_ctlname = malloc(strlen(zone->uz_name) + i + 1, M_UMA, M_WAITOK); sprintf(zone->uz_ctlname, "%s_%d", zone->uz_name, zone->uz_namecnt); } else zone->uz_ctlname = strdup(zone->uz_name, M_UMA); for (c = zone->uz_ctlname; *c != '\0'; c++) if (strchr("./\\ -", *c) != NULL) *c = '_'; /* * Basic parameters at the root. */ zone->uz_oid = SYSCTL_ADD_NODE(NULL, SYSCTL_STATIC_CHILDREN(_vm_uma), OID_AUTO, zone->uz_ctlname, CTLFLAG_RD, NULL, ""); oid = zone->uz_oid; SYSCTL_ADD_U32(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "size", CTLFLAG_RD, &zone->uz_size, 0, "Allocation size"); SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "flags", CTLFLAG_RD | CTLTYPE_STRING | CTLFLAG_MPSAFE, zone, 0, sysctl_handle_uma_zone_flags, "A", "Allocator configuration flags"); SYSCTL_ADD_U16(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "bucket_size", CTLFLAG_RD, &zone->uz_bucket_size, 0, "Desired per-cpu cache size"); SYSCTL_ADD_U16(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "bucket_size_max", CTLFLAG_RD, &zone->uz_bucket_size_max, 0, "Maximum allowed per-cpu cache size"); /* * keg if present. */ if ((zone->uz_flags & UMA_ZFLAG_HASH) == 0) domains = vm_ndomains; else domains = 1; oid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(zone->uz_oid), OID_AUTO, "keg", CTLFLAG_RD, NULL, ""); keg = zone->uz_keg; if ((zone->uz_flags & UMA_ZFLAG_CACHE) == 0) { SYSCTL_ADD_CONST_STRING(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "name", CTLFLAG_RD, keg->uk_name, "Keg name"); SYSCTL_ADD_U32(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "rsize", CTLFLAG_RD, &keg->uk_rsize, 0, "Real object size with alignment"); SYSCTL_ADD_U16(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "ppera", CTLFLAG_RD, &keg->uk_ppera, 0, "pages per-slab allocation"); SYSCTL_ADD_U16(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "ipers", CTLFLAG_RD, &keg->uk_ipers, 0, "items available per-slab"); SYSCTL_ADD_U32(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "align", CTLFLAG_RD, &keg->uk_align, 0, "item alignment mask"); SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "efficiency", CTLFLAG_RD | CTLTYPE_INT | CTLFLAG_MPSAFE, keg, 0, sysctl_handle_uma_slab_efficiency, "I", "Slab utilization (100 - internal fragmentation %)"); domainoid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "domain", CTLFLAG_RD, NULL, ""); for (i = 0; i < domains; i++) { dom = &keg->uk_domain[i]; oid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(domainoid), OID_AUTO, VM_DOMAIN(i)->vmd_name, CTLFLAG_RD, NULL, ""); SYSCTL_ADD_U32(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "pages", CTLFLAG_RD, &dom->ud_pages, 0, "Total pages currently allocated from VM"); SYSCTL_ADD_U32(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "free_items", CTLFLAG_RD, &dom->ud_free_items, 0, "items free in the slab layer"); } } else SYSCTL_ADD_CONST_STRING(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "name", CTLFLAG_RD, nokeg, "Keg name"); /* * Information about zone limits. */ oid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(zone->uz_oid), OID_AUTO, "limit", CTLFLAG_RD, NULL, ""); SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "items", CTLFLAG_RD | CTLTYPE_U64 | CTLFLAG_MPSAFE, zone, 0, sysctl_handle_uma_zone_items, "QU", "current number of allocated items if limit is set"); SYSCTL_ADD_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "max_items", CTLFLAG_RD, &zone->uz_max_items, 0, "Maximum number of cached items"); SYSCTL_ADD_U32(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "sleepers", CTLFLAG_RD, &zone->uz_sleepers, 0, "Number of threads sleeping at limit"); SYSCTL_ADD_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "sleeps", CTLFLAG_RD, &zone->uz_sleeps, 0, "Total zone limit sleeps"); SYSCTL_ADD_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "bucket_max", CTLFLAG_RD, &zone->uz_bucket_max, 0, "Maximum number of items in each domain's bucket cache"); /* * Per-domain zone information. */ domainoid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(zone->uz_oid), OID_AUTO, "domain", CTLFLAG_RD, NULL, ""); for (i = 0; i < domains; i++) { zdom = ZDOM_GET(zone, i); oid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(domainoid), OID_AUTO, VM_DOMAIN(i)->vmd_name, CTLFLAG_RD, NULL, ""); SYSCTL_ADD_LONG(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "nitems", CTLFLAG_RD, &zdom->uzd_nitems, "number of items in this domain"); SYSCTL_ADD_LONG(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "imax", CTLFLAG_RD, &zdom->uzd_imax, "maximum item count in this period"); SYSCTL_ADD_LONG(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "imin", CTLFLAG_RD, &zdom->uzd_imin, "minimum item count in this period"); SYSCTL_ADD_LONG(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "wss", CTLFLAG_RD, &zdom->uzd_wss, "Working set size"); } /* * General statistics. */ oid = SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(zone->uz_oid), OID_AUTO, "stats", CTLFLAG_RD, NULL, ""); SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "current", CTLFLAG_RD | CTLTYPE_INT | CTLFLAG_MPSAFE, zone, 1, sysctl_handle_uma_zone_cur, "I", "Current number of allocated items"); SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "allocs", CTLFLAG_RD | CTLTYPE_U64 | CTLFLAG_MPSAFE, zone, 0, sysctl_handle_uma_zone_allocs, "QU", "Total allocation calls"); SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "frees", CTLFLAG_RD | CTLTYPE_U64 | CTLFLAG_MPSAFE, zone, 0, sysctl_handle_uma_zone_frees, "QU", "Total free calls"); SYSCTL_ADD_COUNTER_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "fails", CTLFLAG_RD, &zone->uz_fails, "Number of allocation failures"); SYSCTL_ADD_COUNTER_U64(NULL, SYSCTL_CHILDREN(oid), OID_AUTO, "xdomain", CTLFLAG_RD, &zone->uz_xdomain, "Free calls from the wrong domain"); } struct uma_zone_count { const char *name; int count; }; static void zone_count(uma_zone_t zone, void *arg) { struct uma_zone_count *cnt; cnt = arg; /* * Some zones are rapidly created with identical names and * destroyed out of order. This can lead to gaps in the count. * Use one greater than the maximum observed for this name. */ if (strcmp(zone->uz_name, cnt->name) == 0) cnt->count = MAX(cnt->count, zone->uz_namecnt + 1); } static void zone_update_caches(uma_zone_t zone) { int i; for (i = 0; i <= mp_maxid; i++) { cache_set_uz_size(&zone->uz_cpu[i], zone->uz_size); cache_set_uz_flags(&zone->uz_cpu[i], zone->uz_flags); } } /* * Zone header ctor. This initializes all fields, locks, etc. * * Arguments/Returns follow uma_ctor specifications * udata Actually uma_zctor_args */ static int zone_ctor(void *mem, int size, void *udata, int flags) { struct uma_zone_count cnt; struct uma_zctor_args *arg = udata; uma_zone_domain_t zdom; uma_zone_t zone = mem; uma_zone_t z; uma_keg_t keg; int i; bzero(zone, size); zone->uz_name = arg->name; zone->uz_ctor = arg->ctor; zone->uz_dtor = arg->dtor; zone->uz_init = NULL; zone->uz_fini = NULL; zone->uz_sleeps = 0; zone->uz_bucket_size = 0; zone->uz_bucket_size_min = 0; zone->uz_bucket_size_max = BUCKET_MAX; zone->uz_flags = (arg->flags & UMA_ZONE_SMR); zone->uz_warning = NULL; /* The domain structures follow the cpu structures. */ zone->uz_bucket_max = ULONG_MAX; timevalclear(&zone->uz_ratecheck); /* Count the number of duplicate names. */ cnt.name = arg->name; cnt.count = 0; zone_foreach(zone_count, &cnt); zone->uz_namecnt = cnt.count; ZONE_CROSS_LOCK_INIT(zone); for (i = 0; i < vm_ndomains; i++) { zdom = ZDOM_GET(zone, i); ZDOM_LOCK_INIT(zone, zdom, (arg->flags & UMA_ZONE_MTXCLASS)); STAILQ_INIT(&zdom->uzd_buckets); } #ifdef INVARIANTS if (arg->uminit == trash_init && arg->fini == trash_fini) zone->uz_flags |= UMA_ZFLAG_TRASH | UMA_ZFLAG_CTORDTOR; #endif /* * This is a pure cache zone, no kegs. */ if (arg->import) { KASSERT((arg->flags & UMA_ZFLAG_CACHE) != 0, ("zone_ctor: Import specified for non-cache zone.")); zone->uz_flags = arg->flags; zone->uz_size = arg->size; zone->uz_import = arg->import; zone->uz_release = arg->release; zone->uz_arg = arg->arg; #ifdef NUMA /* * Cache zones are round-robin unless a policy is * specified because they may have incompatible * constraints. */ if ((zone->uz_flags & UMA_ZONE_FIRSTTOUCH) == 0) zone->uz_flags |= UMA_ZONE_ROUNDROBIN; #endif rw_wlock(&uma_rwlock); LIST_INSERT_HEAD(&uma_cachezones, zone, uz_link); rw_wunlock(&uma_rwlock); goto out; } /* * Use the regular zone/keg/slab allocator. */ zone->uz_import = zone_import; zone->uz_release = zone_release; zone->uz_arg = zone; keg = arg->keg; if (arg->flags & UMA_ZONE_SECONDARY) { KASSERT((zone->uz_flags & UMA_ZONE_SECONDARY) == 0, ("Secondary zone requested UMA_ZFLAG_INTERNAL")); KASSERT(arg->keg != NULL, ("Secondary zone on zero'd keg")); zone->uz_init = arg->uminit; zone->uz_fini = arg->fini; zone->uz_flags |= UMA_ZONE_SECONDARY; rw_wlock(&uma_rwlock); ZONE_LOCK(zone); LIST_FOREACH(z, &keg->uk_zones, uz_link) { if (LIST_NEXT(z, uz_link) == NULL) { LIST_INSERT_AFTER(z, zone, uz_link); break; } } ZONE_UNLOCK(zone); rw_wunlock(&uma_rwlock); } else if (keg == NULL) { if ((keg = uma_kcreate(zone, arg->size, arg->uminit, arg->fini, arg->align, arg->flags)) == NULL) return (ENOMEM); } else { struct uma_kctor_args karg; int error; /* We should only be here from uma_startup() */ karg.size = arg->size; karg.uminit = arg->uminit; karg.fini = arg->fini; karg.align = arg->align; karg.flags = (arg->flags & ~UMA_ZONE_SMR); karg.zone = zone; error = keg_ctor(arg->keg, sizeof(struct uma_keg), &karg, flags); if (error) return (error); } /* Inherit properties from the keg. */ zone->uz_keg = keg; zone->uz_size = keg->uk_size; zone->uz_flags |= (keg->uk_flags & (UMA_ZONE_INHERIT | UMA_ZFLAG_INHERIT)); out: if (__predict_true(booted >= BOOT_RUNNING)) { zone_alloc_counters(zone, NULL); zone_alloc_sysctl(zone, NULL); } else { zone->uz_allocs = EARLY_COUNTER; zone->uz_frees = EARLY_COUNTER; zone->uz_fails = EARLY_COUNTER; } /* Caller requests a private SMR context. */ if ((zone->uz_flags & UMA_ZONE_SMR) != 0) - zone->uz_smr = smr_create(zone->uz_name); + zone->uz_smr = smr_create(zone->uz_name, 0, 0); KASSERT((arg->flags & (UMA_ZONE_MAXBUCKET | UMA_ZONE_NOBUCKET)) != (UMA_ZONE_MAXBUCKET | UMA_ZONE_NOBUCKET), ("Invalid zone flag combination")); if (arg->flags & UMA_ZFLAG_INTERNAL) zone->uz_bucket_size_max = zone->uz_bucket_size = 0; if ((arg->flags & UMA_ZONE_MAXBUCKET) != 0) zone->uz_bucket_size = BUCKET_MAX; else if ((arg->flags & UMA_ZONE_MINBUCKET) != 0) zone->uz_bucket_size_max = zone->uz_bucket_size = BUCKET_MIN; else if ((arg->flags & UMA_ZONE_NOBUCKET) != 0) zone->uz_bucket_size = 0; else zone->uz_bucket_size = bucket_select(zone->uz_size); zone->uz_bucket_size_min = zone->uz_bucket_size; if (zone->uz_dtor != NULL || zone->uz_ctor != NULL) zone->uz_flags |= UMA_ZFLAG_CTORDTOR; zone_update_caches(zone); return (0); } /* * Keg header dtor. This frees all data, destroys locks, frees the hash * table and removes the keg from the global list. * * Arguments/Returns follow uma_dtor specifications * udata unused */ static void keg_dtor(void *arg, int size, void *udata) { uma_keg_t keg; uint32_t free, pages; int i; keg = (uma_keg_t)arg; free = pages = 0; for (i = 0; i < vm_ndomains; i++) { free += keg->uk_domain[i].ud_free_items; pages += keg->uk_domain[i].ud_pages; KEG_LOCK_FINI(keg, i); } if (pages != 0) printf("Freed UMA keg (%s) was not empty (%u items). " " Lost %u pages of memory.\n", keg->uk_name ? keg->uk_name : "", pages / keg->uk_ppera * keg->uk_ipers - free, pages); hash_free(&keg->uk_hash); } /* * Zone header dtor. * * Arguments/Returns follow uma_dtor specifications * udata unused */ static void zone_dtor(void *arg, int size, void *udata) { uma_zone_t zone; uma_keg_t keg; int i; zone = (uma_zone_t)arg; sysctl_remove_oid(zone->uz_oid, 1, 1); if (!(zone->uz_flags & UMA_ZFLAG_INTERNAL)) cache_drain(zone); rw_wlock(&uma_rwlock); LIST_REMOVE(zone, uz_link); rw_wunlock(&uma_rwlock); zone_reclaim(zone, M_WAITOK, true); /* * We only destroy kegs from non secondary/non cache zones. */ if ((zone->uz_flags & (UMA_ZONE_SECONDARY | UMA_ZFLAG_CACHE)) == 0) { keg = zone->uz_keg; rw_wlock(&uma_rwlock); LIST_REMOVE(keg, uk_link); rw_wunlock(&uma_rwlock); zone_free_item(kegs, keg, NULL, SKIP_NONE); } counter_u64_free(zone->uz_allocs); counter_u64_free(zone->uz_frees); counter_u64_free(zone->uz_fails); counter_u64_free(zone->uz_xdomain); free(zone->uz_ctlname, M_UMA); for (i = 0; i < vm_ndomains; i++) ZDOM_LOCK_FINI(ZDOM_GET(zone, i)); ZONE_CROSS_LOCK_FINI(zone); } static void zone_foreach_unlocked(void (*zfunc)(uma_zone_t, void *arg), void *arg) { uma_keg_t keg; uma_zone_t zone; LIST_FOREACH(keg, &uma_kegs, uk_link) { LIST_FOREACH(zone, &keg->uk_zones, uz_link) zfunc(zone, arg); } LIST_FOREACH(zone, &uma_cachezones, uz_link) zfunc(zone, arg); } /* * Traverses every zone in the system and calls a callback * * Arguments: * zfunc A pointer to a function which accepts a zone * as an argument. * * Returns: * Nothing */ static void zone_foreach(void (*zfunc)(uma_zone_t, void *arg), void *arg) { rw_rlock(&uma_rwlock); zone_foreach_unlocked(zfunc, arg); rw_runlock(&uma_rwlock); } /* * Initialize the kernel memory allocator. This is done after pages can be * allocated but before general KVA is available. */ void uma_startup1(vm_offset_t virtual_avail) { struct uma_zctor_args args; size_t ksize, zsize, size; uma_keg_t masterkeg; uintptr_t m; uint8_t pflag; bootstart = bootmem = virtual_avail; rw_init(&uma_rwlock, "UMA lock"); sx_init(&uma_reclaim_lock, "umareclaim"); ksize = sizeof(struct uma_keg) + (sizeof(struct uma_domain) * vm_ndomains); ksize = roundup(ksize, UMA_SUPER_ALIGN); zsize = sizeof(struct uma_zone) + (sizeof(struct uma_cache) * (mp_maxid + 1)) + (sizeof(struct uma_zone_domain) * vm_ndomains); zsize = roundup(zsize, UMA_SUPER_ALIGN); /* Allocate the zone of zones, zone of kegs, and zone of zones keg. */ size = (zsize * 2) + ksize; m = (uintptr_t)startup_alloc(NULL, size, 0, &pflag, M_NOWAIT | M_ZERO); zones = (uma_zone_t)m; m += zsize; kegs = (uma_zone_t)m; m += zsize; masterkeg = (uma_keg_t)m; /* "manually" create the initial zone */ memset(&args, 0, sizeof(args)); args.name = "UMA Kegs"; args.size = ksize; args.ctor = keg_ctor; args.dtor = keg_dtor; args.uminit = zero_init; args.fini = NULL; args.keg = masterkeg; args.align = UMA_SUPER_ALIGN - 1; args.flags = UMA_ZFLAG_INTERNAL; zone_ctor(kegs, zsize, &args, M_WAITOK); args.name = "UMA Zones"; args.size = zsize; args.ctor = zone_ctor; args.dtor = zone_dtor; args.uminit = zero_init; args.fini = NULL; args.keg = NULL; args.align = UMA_SUPER_ALIGN - 1; args.flags = UMA_ZFLAG_INTERNAL; zone_ctor(zones, zsize, &args, M_WAITOK); /* Now make zones for slab headers */ slabzones[0] = uma_zcreate("UMA Slabs 0", SLABZONE0_SIZE, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZFLAG_INTERNAL); slabzones[1] = uma_zcreate("UMA Slabs 1", SLABZONE1_SIZE, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZFLAG_INTERNAL); hashzone = uma_zcreate("UMA Hash", sizeof(struct slabhead *) * UMA_HASH_SIZE_INIT, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZFLAG_INTERNAL); bucket_init(); smr_init(); } #ifndef UMA_MD_SMALL_ALLOC extern void vm_radix_reserve_kva(void); #endif /* * Advertise the availability of normal kva allocations and switch to * the default back-end allocator. Marks the KVA we consumed on startup * as used in the map. */ void uma_startup2(void) { if (bootstart != bootmem) { vm_map_lock(kernel_map); (void)vm_map_insert(kernel_map, NULL, 0, bootstart, bootmem, VM_PROT_RW, VM_PROT_RW, MAP_NOFAULT); vm_map_unlock(kernel_map); } #ifndef UMA_MD_SMALL_ALLOC /* Set up radix zone to use noobj_alloc. */ vm_radix_reserve_kva(); #endif booted = BOOT_KVA; zone_foreach_unlocked(zone_kva_available, NULL); bucket_enable(); } /* * Finish our initialization steps. */ static void uma_startup3(void) { #ifdef INVARIANTS TUNABLE_INT_FETCH("vm.debug.divisor", &dbg_divisor); uma_dbg_cnt = counter_u64_alloc(M_WAITOK); uma_skip_cnt = counter_u64_alloc(M_WAITOK); #endif zone_foreach_unlocked(zone_alloc_counters, NULL); zone_foreach_unlocked(zone_alloc_sysctl, NULL); callout_init(&uma_callout, 1); callout_reset(&uma_callout, UMA_TIMEOUT * hz, uma_timeout, NULL); booted = BOOT_RUNNING; EVENTHANDLER_REGISTER(shutdown_post_sync, uma_shutdown, NULL, EVENTHANDLER_PRI_FIRST); } static void uma_shutdown(void) { booted = BOOT_SHUTDOWN; } static uma_keg_t uma_kcreate(uma_zone_t zone, size_t size, uma_init uminit, uma_fini fini, int align, uint32_t flags) { struct uma_kctor_args args; args.size = size; args.uminit = uminit; args.fini = fini; args.align = (align == UMA_ALIGN_CACHE) ? uma_align_cache : align; args.flags = flags; args.zone = zone; return (zone_alloc_item(kegs, &args, UMA_ANYDOMAIN, M_WAITOK)); } /* Public functions */ /* See uma.h */ void uma_set_align(int align) { if (align != UMA_ALIGN_CACHE) uma_align_cache = align; } /* See uma.h */ uma_zone_t uma_zcreate(const char *name, size_t size, uma_ctor ctor, uma_dtor dtor, uma_init uminit, uma_fini fini, int align, uint32_t flags) { struct uma_zctor_args args; uma_zone_t res; KASSERT(powerof2(align + 1), ("invalid zone alignment %d for \"%s\"", align, name)); /* This stuff is essential for the zone ctor */ memset(&args, 0, sizeof(args)); args.name = name; args.size = size; args.ctor = ctor; args.dtor = dtor; args.uminit = uminit; args.fini = fini; #ifdef INVARIANTS /* * Inject procedures which check for memory use after free if we are * allowed to scramble the memory while it is not allocated. This * requires that: UMA is actually able to access the memory, no init * or fini procedures, no dependency on the initial value of the * memory, and no (legitimate) use of the memory after free. Note, * the ctor and dtor do not need to be empty. */ if ((!(flags & (UMA_ZONE_ZINIT | UMA_ZONE_NOTOUCH | UMA_ZONE_NOFREE))) && uminit == NULL && fini == NULL) { args.uminit = trash_init; args.fini = trash_fini; } #endif args.align = align; args.flags = flags; args.keg = NULL; sx_slock(&uma_reclaim_lock); res = zone_alloc_item(zones, &args, UMA_ANYDOMAIN, M_WAITOK); sx_sunlock(&uma_reclaim_lock); return (res); } /* See uma.h */ uma_zone_t uma_zsecond_create(char *name, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, uma_zone_t master) { struct uma_zctor_args args; uma_keg_t keg; uma_zone_t res; keg = master->uz_keg; memset(&args, 0, sizeof(args)); args.name = name; args.size = keg->uk_size; args.ctor = ctor; args.dtor = dtor; args.uminit = zinit; args.fini = zfini; args.align = keg->uk_align; args.flags = keg->uk_flags | UMA_ZONE_SECONDARY; args.keg = keg; sx_slock(&uma_reclaim_lock); res = zone_alloc_item(zones, &args, UMA_ANYDOMAIN, M_WAITOK); sx_sunlock(&uma_reclaim_lock); return (res); } /* See uma.h */ uma_zone_t uma_zcache_create(char *name, int size, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, uma_import zimport, uma_release zrelease, void *arg, int flags) { struct uma_zctor_args args; memset(&args, 0, sizeof(args)); args.name = name; args.size = size; args.ctor = ctor; args.dtor = dtor; args.uminit = zinit; args.fini = zfini; args.import = zimport; args.release = zrelease; args.arg = arg; args.align = 0; args.flags = flags | UMA_ZFLAG_CACHE; return (zone_alloc_item(zones, &args, UMA_ANYDOMAIN, M_WAITOK)); } /* See uma.h */ void uma_zdestroy(uma_zone_t zone) { /* * Large slabs are expensive to reclaim, so don't bother doing * unnecessary work if we're shutting down. */ if (booted == BOOT_SHUTDOWN && zone->uz_fini == NULL && zone->uz_release == zone_release) return; sx_slock(&uma_reclaim_lock); zone_free_item(zones, zone, NULL, SKIP_NONE); sx_sunlock(&uma_reclaim_lock); } void uma_zwait(uma_zone_t zone) { if ((zone->uz_flags & UMA_ZONE_SMR) != 0) uma_zfree_smr(zone, uma_zalloc_smr(zone, M_WAITOK)); else if ((zone->uz_flags & UMA_ZONE_PCPU) != 0) uma_zfree_pcpu(zone, uma_zalloc_pcpu(zone, M_WAITOK)); else uma_zfree(zone, uma_zalloc(zone, M_WAITOK)); } void * uma_zalloc_pcpu_arg(uma_zone_t zone, void *udata, int flags) { void *item, *pcpu_item; #ifdef SMP int i; MPASS(zone->uz_flags & UMA_ZONE_PCPU); #endif item = uma_zalloc_arg(zone, udata, flags & ~M_ZERO); if (item == NULL) return (NULL); pcpu_item = zpcpu_base_to_offset(item); if (flags & M_ZERO) { #ifdef SMP for (i = 0; i <= mp_maxid; i++) bzero(zpcpu_get_cpu(pcpu_item, i), zone->uz_size); #else bzero(item, zone->uz_size); #endif } return (pcpu_item); } /* * A stub while both regular and pcpu cases are identical. */ void uma_zfree_pcpu_arg(uma_zone_t zone, void *pcpu_item, void *udata) { void *item; #ifdef SMP MPASS(zone->uz_flags & UMA_ZONE_PCPU); #endif item = zpcpu_offset_to_base(pcpu_item); uma_zfree_arg(zone, item, udata); } static inline void * item_ctor(uma_zone_t zone, int uz_flags, int size, void *udata, int flags, void *item) { #ifdef INVARIANTS bool skipdbg; skipdbg = uma_dbg_zskip(zone, item); if (!skipdbg && (zone->uz_flags & UMA_ZFLAG_TRASH) != 0 && zone->uz_ctor != trash_ctor) trash_ctor(item, size, udata, flags); #endif /* Check flags before loading ctor pointer. */ if (__predict_false((uz_flags & UMA_ZFLAG_CTORDTOR) != 0) && __predict_false(zone->uz_ctor != NULL) && zone->uz_ctor(item, size, udata, flags) != 0) { counter_u64_add(zone->uz_fails, 1); zone_free_item(zone, item, udata, SKIP_DTOR | SKIP_CNT); return (NULL); } #ifdef INVARIANTS if (!skipdbg) uma_dbg_alloc(zone, NULL, item); #endif if (__predict_false(flags & M_ZERO)) return (memset(item, 0, size)); return (item); } static inline void item_dtor(uma_zone_t zone, void *item, int size, void *udata, enum zfreeskip skip) { #ifdef INVARIANTS bool skipdbg; skipdbg = uma_dbg_zskip(zone, item); if (skip == SKIP_NONE && !skipdbg) { if ((zone->uz_flags & UMA_ZONE_MALLOC) != 0) uma_dbg_free(zone, udata, item); else uma_dbg_free(zone, NULL, item); } #endif if (__predict_true(skip < SKIP_DTOR)) { if (zone->uz_dtor != NULL) zone->uz_dtor(item, size, udata); #ifdef INVARIANTS if (!skipdbg && (zone->uz_flags & UMA_ZFLAG_TRASH) != 0 && zone->uz_dtor != trash_dtor) trash_dtor(item, size, udata); #endif } } #if defined(INVARIANTS) || defined(DEBUG_MEMGUARD) || defined(WITNESS) #define UMA_ZALLOC_DEBUG static int uma_zalloc_debug(uma_zone_t zone, void **itemp, void *udata, int flags) { int error; error = 0; #ifdef WITNESS if (flags & M_WAITOK) { WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "uma_zalloc_debug: zone \"%s\"", zone->uz_name); } #endif #ifdef INVARIANTS KASSERT((flags & M_EXEC) == 0, ("uma_zalloc_debug: called with M_EXEC")); KASSERT(curthread->td_critnest == 0 || SCHEDULER_STOPPED(), ("uma_zalloc_debug: called within spinlock or critical section")); KASSERT((zone->uz_flags & UMA_ZONE_PCPU) == 0 || (flags & M_ZERO) == 0, ("uma_zalloc_debug: allocating from a pcpu zone with M_ZERO")); #endif #ifdef DEBUG_MEMGUARD if ((zone->uz_flags & UMA_ZONE_SMR) == 0 && memguard_cmp_zone(zone)) { void *item; item = memguard_alloc(zone->uz_size, flags); if (item != NULL) { error = EJUSTRETURN; if (zone->uz_init != NULL && zone->uz_init(item, zone->uz_size, flags) != 0) { *itemp = NULL; return (error); } if (zone->uz_ctor != NULL && zone->uz_ctor(item, zone->uz_size, udata, flags) != 0) { counter_u64_add(zone->uz_fails, 1); zone->uz_fini(item, zone->uz_size); *itemp = NULL; return (error); } *itemp = item; return (error); } /* This is unfortunate but should not be fatal. */ } #endif return (error); } static int uma_zfree_debug(uma_zone_t zone, void *item, void *udata) { KASSERT(curthread->td_critnest == 0 || SCHEDULER_STOPPED(), ("uma_zfree_debug: called with spinlock or critical section held")); #ifdef DEBUG_MEMGUARD if ((zone->uz_flags & UMA_ZONE_SMR) == 0 && is_memguard_addr(item)) { if (zone->uz_dtor != NULL) zone->uz_dtor(item, zone->uz_size, udata); if (zone->uz_fini != NULL) zone->uz_fini(item, zone->uz_size); memguard_free(item); return (EJUSTRETURN); } #endif return (0); } #endif static inline void * cache_alloc_item(uma_zone_t zone, uma_cache_t cache, uma_cache_bucket_t bucket, void *udata, int flags) { void *item; int size, uz_flags; item = cache_bucket_pop(cache, bucket); size = cache_uz_size(cache); uz_flags = cache_uz_flags(cache); critical_exit(); return (item_ctor(zone, uz_flags, size, udata, flags, item)); } static __noinline void * cache_alloc_retry(uma_zone_t zone, uma_cache_t cache, void *udata, int flags) { uma_cache_bucket_t bucket; int domain; while (cache_alloc(zone, cache, udata, flags)) { cache = &zone->uz_cpu[curcpu]; bucket = &cache->uc_allocbucket; if (__predict_false(bucket->ucb_cnt == 0)) continue; return (cache_alloc_item(zone, cache, bucket, udata, flags)); } critical_exit(); /* * We can not get a bucket so try to return a single item. */ if (zone->uz_flags & UMA_ZONE_FIRSTTOUCH) domain = PCPU_GET(domain); else domain = UMA_ANYDOMAIN; return (zone_alloc_item(zone, udata, domain, flags)); } /* See uma.h */ void * uma_zalloc_smr(uma_zone_t zone, int flags) { uma_cache_bucket_t bucket; uma_cache_t cache; #ifdef UMA_ZALLOC_DEBUG void *item; KASSERT((zone->uz_flags & UMA_ZONE_SMR) != 0, ("uma_zalloc_arg: called with non-SMR zone.\n")); if (uma_zalloc_debug(zone, &item, NULL, flags) == EJUSTRETURN) return (item); #endif critical_enter(); cache = &zone->uz_cpu[curcpu]; bucket = &cache->uc_allocbucket; if (__predict_false(bucket->ucb_cnt == 0)) return (cache_alloc_retry(zone, cache, NULL, flags)); return (cache_alloc_item(zone, cache, bucket, NULL, flags)); } /* See uma.h */ void * uma_zalloc_arg(uma_zone_t zone, void *udata, int flags) { uma_cache_bucket_t bucket; uma_cache_t cache; /* Enable entropy collection for RANDOM_ENABLE_UMA kernel option */ random_harvest_fast_uma(&zone, sizeof(zone), RANDOM_UMA); /* This is the fast path allocation */ CTR3(KTR_UMA, "uma_zalloc_arg zone %s(%p) flags %d", zone->uz_name, zone, flags); #ifdef UMA_ZALLOC_DEBUG void *item; KASSERT((zone->uz_flags & UMA_ZONE_SMR) == 0, ("uma_zalloc_arg: called with SMR zone.\n")); if (uma_zalloc_debug(zone, &item, udata, flags) == EJUSTRETURN) return (item); #endif /* * If possible, allocate from the per-CPU cache. There are two * requirements for safe access to the per-CPU cache: (1) the thread * accessing the cache must not be preempted or yield during access, * and (2) the thread must not migrate CPUs without switching which * cache it accesses. We rely on a critical section to prevent * preemption and migration. We release the critical section in * order to acquire the zone mutex if we are unable to allocate from * the current cache; when we re-acquire the critical section, we * must detect and handle migration if it has occurred. */ critical_enter(); cache = &zone->uz_cpu[curcpu]; bucket = &cache->uc_allocbucket; if (__predict_false(bucket->ucb_cnt == 0)) return (cache_alloc_retry(zone, cache, udata, flags)); return (cache_alloc_item(zone, cache, bucket, udata, flags)); } /* * Replenish an alloc bucket and possibly restore an old one. Called in * a critical section. Returns in a critical section. * * A false return value indicates an allocation failure. * A true return value indicates success and the caller should retry. */ static __noinline bool cache_alloc(uma_zone_t zone, uma_cache_t cache, void *udata, int flags) { uma_bucket_t bucket; int domain; bool new; CRITICAL_ASSERT(curthread); /* * If we have run out of items in our alloc bucket see * if we can switch with the free bucket. * * SMR Zones can't re-use the free bucket until the sequence has * expired. */ if ((cache_uz_flags(cache) & UMA_ZONE_SMR) == 0 && cache->uc_freebucket.ucb_cnt != 0) { cache_bucket_swap(&cache->uc_freebucket, &cache->uc_allocbucket); return (true); } /* * Discard any empty allocation bucket while we hold no locks. */ bucket = cache_bucket_unload_alloc(cache); critical_exit(); if (bucket != NULL) { KASSERT(bucket->ub_cnt == 0, ("cache_alloc: Entered with non-empty alloc bucket.")); bucket_free(zone, bucket, udata); } /* Short-circuit for zones without buckets and low memory. */ if (zone->uz_bucket_size == 0 || bucketdisable) { critical_enter(); return (false); } /* * Attempt to retrieve the item from the per-CPU cache has failed, so * we must go back to the zone. This requires the zdom lock, so we * must drop the critical section, then re-acquire it when we go back * to the cache. Since the critical section is released, we may be * preempted or migrate. As such, make sure not to maintain any * thread-local state specific to the cache from prior to releasing * the critical section. */ domain = PCPU_GET(domain); if ((cache_uz_flags(cache) & UMA_ZONE_ROUNDROBIN) != 0) domain = zone_domain_highest(zone, domain); bucket = cache_fetch_bucket(zone, cache, domain); if (bucket == NULL) { bucket = zone_alloc_bucket(zone, udata, domain, flags); new = true; } else new = false; CTR3(KTR_UMA, "uma_zalloc: zone %s(%p) bucket zone returned %p", zone->uz_name, zone, bucket); if (bucket == NULL) { critical_enter(); return (false); } /* * See if we lost the race or were migrated. Cache the * initialized bucket to make this less likely or claim * the memory directly. */ critical_enter(); cache = &zone->uz_cpu[curcpu]; if (cache->uc_allocbucket.ucb_bucket == NULL && ((cache_uz_flags(cache) & UMA_ZONE_FIRSTTOUCH) == 0 || domain == PCPU_GET(domain))) { if (new) atomic_add_long(&ZDOM_GET(zone, domain)->uzd_imax, bucket->ub_cnt); cache_bucket_load_alloc(cache, bucket); return (true); } /* * We lost the race, release this bucket and start over. */ critical_exit(); zone_put_bucket(zone, domain, bucket, udata, false); critical_enter(); return (true); } void * uma_zalloc_domain(uma_zone_t zone, void *udata, int domain, int flags) { /* Enable entropy collection for RANDOM_ENABLE_UMA kernel option */ random_harvest_fast_uma(&zone, sizeof(zone), RANDOM_UMA); /* This is the fast path allocation */ CTR4(KTR_UMA, "uma_zalloc_domain zone %s(%p) domain %d flags %d", zone->uz_name, zone, domain, flags); if (flags & M_WAITOK) { WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "uma_zalloc_domain: zone \"%s\"", zone->uz_name); } KASSERT(curthread->td_critnest == 0 || SCHEDULER_STOPPED(), ("uma_zalloc_domain: called with spinlock or critical section held")); return (zone_alloc_item(zone, udata, domain, flags)); } /* * Find a slab with some space. Prefer slabs that are partially used over those * that are totally full. This helps to reduce fragmentation. * * If 'rr' is 1, search all domains starting from 'domain'. Otherwise check * only 'domain'. */ static uma_slab_t keg_first_slab(uma_keg_t keg, int domain, bool rr) { uma_domain_t dom; uma_slab_t slab; int start; KASSERT(domain >= 0 && domain < vm_ndomains, ("keg_first_slab: domain %d out of range", domain)); KEG_LOCK_ASSERT(keg, domain); slab = NULL; start = domain; do { dom = &keg->uk_domain[domain]; if ((slab = LIST_FIRST(&dom->ud_part_slab)) != NULL) return (slab); if ((slab = LIST_FIRST(&dom->ud_free_slab)) != NULL) { LIST_REMOVE(slab, us_link); dom->ud_free_slabs--; LIST_INSERT_HEAD(&dom->ud_part_slab, slab, us_link); return (slab); } if (rr) domain = (domain + 1) % vm_ndomains; } while (domain != start); return (NULL); } /* * Fetch an existing slab from a free or partial list. Returns with the * keg domain lock held if a slab was found or unlocked if not. */ static uma_slab_t keg_fetch_free_slab(uma_keg_t keg, int domain, bool rr, int flags) { uma_slab_t slab; uint32_t reserve; /* HASH has a single free list. */ if ((keg->uk_flags & UMA_ZFLAG_HASH) != 0) domain = 0; KEG_LOCK(keg, domain); reserve = (flags & M_USE_RESERVE) != 0 ? 0 : keg->uk_reserve; if (keg->uk_domain[domain].ud_free_items <= reserve || (slab = keg_first_slab(keg, domain, rr)) == NULL) { KEG_UNLOCK(keg, domain); return (NULL); } return (slab); } static uma_slab_t keg_fetch_slab(uma_keg_t keg, uma_zone_t zone, int rdomain, const int flags) { struct vm_domainset_iter di; uma_slab_t slab; int aflags, domain; bool rr; restart: /* * Use the keg's policy if upper layers haven't already specified a * domain (as happens with first-touch zones). * * To avoid races we run the iterator with the keg lock held, but that * means that we cannot allow the vm_domainset layer to sleep. Thus, * clear M_WAITOK and handle low memory conditions locally. */ rr = rdomain == UMA_ANYDOMAIN; if (rr) { aflags = (flags & ~M_WAITOK) | M_NOWAIT; vm_domainset_iter_policy_ref_init(&di, &keg->uk_dr, &domain, &aflags); } else { aflags = flags; domain = rdomain; } for (;;) { slab = keg_fetch_free_slab(keg, domain, rr, flags); if (slab != NULL) return (slab); /* * M_NOVM means don't ask at all! */ if (flags & M_NOVM) break; slab = keg_alloc_slab(keg, zone, domain, flags, aflags); if (slab != NULL) return (slab); if (!rr && (flags & M_WAITOK) == 0) break; if (rr && vm_domainset_iter_policy(&di, &domain) != 0) { if ((flags & M_WAITOK) != 0) { vm_wait_doms(&keg->uk_dr.dr_policy->ds_mask); goto restart; } break; } } /* * We might not have been able to get a slab but another cpu * could have while we were unlocked. Check again before we * fail. */ if ((slab = keg_fetch_free_slab(keg, domain, rr, flags)) != NULL) return (slab); return (NULL); } static void * slab_alloc_item(uma_keg_t keg, uma_slab_t slab) { uma_domain_t dom; void *item; int freei; KEG_LOCK_ASSERT(keg, slab->us_domain); dom = &keg->uk_domain[slab->us_domain]; freei = BIT_FFS(keg->uk_ipers, &slab->us_free) - 1; BIT_CLR(keg->uk_ipers, freei, &slab->us_free); item = slab_item(slab, keg, freei); slab->us_freecount--; dom->ud_free_items--; /* * Move this slab to the full list. It must be on the partial list, so * we do not need to update the free slab count. In particular, * keg_fetch_slab() always returns slabs on the partial list. */ if (slab->us_freecount == 0) { LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&dom->ud_full_slab, slab, us_link); } return (item); } static int zone_import(void *arg, void **bucket, int max, int domain, int flags) { uma_domain_t dom; uma_zone_t zone; uma_slab_t slab; uma_keg_t keg; #ifdef NUMA int stripe; #endif int i; zone = arg; slab = NULL; keg = zone->uz_keg; /* Try to keep the buckets totally full */ for (i = 0; i < max; ) { if ((slab = keg_fetch_slab(keg, zone, domain, flags)) == NULL) break; #ifdef NUMA stripe = howmany(max, vm_ndomains); #endif dom = &keg->uk_domain[slab->us_domain]; while (slab->us_freecount && i < max) { bucket[i++] = slab_alloc_item(keg, slab); if (dom->ud_free_items <= keg->uk_reserve) break; #ifdef NUMA /* * If the zone is striped we pick a new slab for every * N allocations. Eliminating this conditional will * instead pick a new domain for each bucket rather * than stripe within each bucket. The current option * produces more fragmentation and requires more cpu * time but yields better distribution. */ if ((zone->uz_flags & UMA_ZONE_ROUNDROBIN) != 0 && vm_ndomains > 1 && --stripe == 0) break; #endif } KEG_UNLOCK(keg, slab->us_domain); /* Don't block if we allocated any successfully. */ flags &= ~M_WAITOK; flags |= M_NOWAIT; } return i; } static int zone_alloc_limit_hard(uma_zone_t zone, int count, int flags) { uint64_t old, new, total, max; /* * The hard case. We're going to sleep because there were existing * sleepers or because we ran out of items. This routine enforces * fairness by keeping fifo order. * * First release our ill gotten gains and make some noise. */ for (;;) { zone_free_limit(zone, count); zone_log_warning(zone); zone_maxaction(zone); if (flags & M_NOWAIT) return (0); /* * We need to allocate an item or set ourself as a sleeper * while the sleepq lock is held to avoid wakeup races. This * is essentially a home rolled semaphore. */ sleepq_lock(&zone->uz_max_items); old = zone->uz_items; do { MPASS(UZ_ITEMS_SLEEPERS(old) < UZ_ITEMS_SLEEPERS_MAX); /* Cache the max since we will evaluate twice. */ max = zone->uz_max_items; if (UZ_ITEMS_SLEEPERS(old) != 0 || UZ_ITEMS_COUNT(old) >= max) new = old + UZ_ITEMS_SLEEPER; else new = old + MIN(count, max - old); } while (atomic_fcmpset_64(&zone->uz_items, &old, new) == 0); /* We may have successfully allocated under the sleepq lock. */ if (UZ_ITEMS_SLEEPERS(new) == 0) { sleepq_release(&zone->uz_max_items); return (new - old); } /* * This is in a different cacheline from uz_items so that we * don't constantly invalidate the fastpath cacheline when we * adjust item counts. This could be limited to toggling on * transitions. */ atomic_add_32(&zone->uz_sleepers, 1); atomic_add_64(&zone->uz_sleeps, 1); /* * We have added ourselves as a sleeper. The sleepq lock * protects us from wakeup races. Sleep now and then retry. */ sleepq_add(&zone->uz_max_items, NULL, "zonelimit", 0, 0); sleepq_wait(&zone->uz_max_items, PVM); /* * After wakeup, remove ourselves as a sleeper and try * again. We no longer have the sleepq lock for protection. * * Subract ourselves as a sleeper while attempting to add * our count. */ atomic_subtract_32(&zone->uz_sleepers, 1); old = atomic_fetchadd_64(&zone->uz_items, -(UZ_ITEMS_SLEEPER - count)); /* We're no longer a sleeper. */ old -= UZ_ITEMS_SLEEPER; /* * If we're still at the limit, restart. Notably do not * block on other sleepers. Cache the max value to protect * against changes via sysctl. */ total = UZ_ITEMS_COUNT(old); max = zone->uz_max_items; if (total >= max) continue; /* Truncate if necessary, otherwise wake other sleepers. */ if (total + count > max) { zone_free_limit(zone, total + count - max); count = max - total; } else if (total + count < max && UZ_ITEMS_SLEEPERS(old) != 0) wakeup_one(&zone->uz_max_items); return (count); } } /* * Allocate 'count' items from our max_items limit. Returns the number * available. If M_NOWAIT is not specified it will sleep until at least * one item can be allocated. */ static int zone_alloc_limit(uma_zone_t zone, int count, int flags) { uint64_t old; uint64_t max; max = zone->uz_max_items; MPASS(max > 0); /* * We expect normal allocations to succeed with a simple * fetchadd. */ old = atomic_fetchadd_64(&zone->uz_items, count); if (__predict_true(old + count <= max)) return (count); /* * If we had some items and no sleepers just return the * truncated value. We have to release the excess space * though because that may wake sleepers who weren't woken * because we were temporarily over the limit. */ if (old < max) { zone_free_limit(zone, (old + count) - max); return (max - old); } return (zone_alloc_limit_hard(zone, count, flags)); } /* * Free a number of items back to the limit. */ static void zone_free_limit(uma_zone_t zone, int count) { uint64_t old; MPASS(count > 0); /* * In the common case we either have no sleepers or * are still over the limit and can just return. */ old = atomic_fetchadd_64(&zone->uz_items, -count); if (__predict_true(UZ_ITEMS_SLEEPERS(old) == 0 || UZ_ITEMS_COUNT(old) - count >= zone->uz_max_items)) return; /* * Moderate the rate of wakeups. Sleepers will continue * to generate wakeups if necessary. */ wakeup_one(&zone->uz_max_items); } static uma_bucket_t zone_alloc_bucket(uma_zone_t zone, void *udata, int domain, int flags) { uma_bucket_t bucket; int maxbucket, cnt; CTR3(KTR_UMA, "zone_alloc_bucket zone %s(%p) domain %d", zone->uz_name, zone, domain); /* Avoid allocs targeting empty domains. */ if (domain != UMA_ANYDOMAIN && VM_DOMAIN_EMPTY(domain)) domain = UMA_ANYDOMAIN; if ((zone->uz_flags & UMA_ZONE_ROUNDROBIN) != 0) domain = UMA_ANYDOMAIN; if (zone->uz_max_items > 0) maxbucket = zone_alloc_limit(zone, zone->uz_bucket_size, M_NOWAIT); else maxbucket = zone->uz_bucket_size; if (maxbucket == 0) return (false); /* Don't wait for buckets, preserve caller's NOVM setting. */ bucket = bucket_alloc(zone, udata, M_NOWAIT | (flags & M_NOVM)); if (bucket == NULL) { cnt = 0; goto out; } bucket->ub_cnt = zone->uz_import(zone->uz_arg, bucket->ub_bucket, MIN(maxbucket, bucket->ub_entries), domain, flags); /* * Initialize the memory if necessary. */ if (bucket->ub_cnt != 0 && zone->uz_init != NULL) { int i; for (i = 0; i < bucket->ub_cnt; i++) if (zone->uz_init(bucket->ub_bucket[i], zone->uz_size, flags) != 0) break; /* * If we couldn't initialize the whole bucket, put the * rest back onto the freelist. */ if (i != bucket->ub_cnt) { zone->uz_release(zone->uz_arg, &bucket->ub_bucket[i], bucket->ub_cnt - i); #ifdef INVARIANTS bzero(&bucket->ub_bucket[i], sizeof(void *) * (bucket->ub_cnt - i)); #endif bucket->ub_cnt = i; } } cnt = bucket->ub_cnt; if (bucket->ub_cnt == 0) { bucket_free(zone, bucket, udata); counter_u64_add(zone->uz_fails, 1); bucket = NULL; } out: if (zone->uz_max_items > 0 && cnt < maxbucket) zone_free_limit(zone, maxbucket - cnt); return (bucket); } /* * Allocates a single item from a zone. * * Arguments * zone The zone to alloc for. * udata The data to be passed to the constructor. * domain The domain to allocate from or UMA_ANYDOMAIN. * flags M_WAITOK, M_NOWAIT, M_ZERO. * * Returns * NULL if there is no memory and M_NOWAIT is set * An item if successful */ static void * zone_alloc_item(uma_zone_t zone, void *udata, int domain, int flags) { void *item; if (zone->uz_max_items > 0 && zone_alloc_limit(zone, 1, flags) == 0) return (NULL); /* Avoid allocs targeting empty domains. */ if (domain != UMA_ANYDOMAIN && VM_DOMAIN_EMPTY(domain)) domain = UMA_ANYDOMAIN; if (zone->uz_import(zone->uz_arg, &item, 1, domain, flags) != 1) goto fail_cnt; /* * We have to call both the zone's init (not the keg's init) * and the zone's ctor. This is because the item is going from * a keg slab directly to the user, and the user is expecting it * to be both zone-init'd as well as zone-ctor'd. */ if (zone->uz_init != NULL) { if (zone->uz_init(item, zone->uz_size, flags) != 0) { zone_free_item(zone, item, udata, SKIP_FINI | SKIP_CNT); goto fail_cnt; } } item = item_ctor(zone, zone->uz_flags, zone->uz_size, udata, flags, item); if (item == NULL) goto fail; counter_u64_add(zone->uz_allocs, 1); CTR3(KTR_UMA, "zone_alloc_item item %p from %s(%p)", item, zone->uz_name, zone); return (item); fail_cnt: counter_u64_add(zone->uz_fails, 1); fail: if (zone->uz_max_items > 0) zone_free_limit(zone, 1); CTR2(KTR_UMA, "zone_alloc_item failed from %s(%p)", zone->uz_name, zone); return (NULL); } /* See uma.h */ void uma_zfree_smr(uma_zone_t zone, void *item) { uma_cache_t cache; uma_cache_bucket_t bucket; int itemdomain, uz_flags; #ifdef UMA_ZALLOC_DEBUG KASSERT((zone->uz_flags & UMA_ZONE_SMR) != 0, ("uma_zfree_smr: called with non-SMR zone.\n")); KASSERT(item != NULL, ("uma_zfree_smr: Called with NULL pointer.")); SMR_ASSERT_NOT_ENTERED(zone->uz_smr); if (uma_zfree_debug(zone, item, NULL) == EJUSTRETURN) return; #endif cache = &zone->uz_cpu[curcpu]; uz_flags = cache_uz_flags(cache); itemdomain = 0; #ifdef NUMA if ((uz_flags & UMA_ZONE_FIRSTTOUCH) != 0) itemdomain = _vm_phys_domain(pmap_kextract((vm_offset_t)item)); #endif critical_enter(); do { cache = &zone->uz_cpu[curcpu]; /* SMR Zones must free to the free bucket. */ bucket = &cache->uc_freebucket; #ifdef NUMA if ((uz_flags & UMA_ZONE_FIRSTTOUCH) != 0 && PCPU_GET(domain) != itemdomain) { bucket = &cache->uc_crossbucket; } #endif if (__predict_true(bucket->ucb_cnt < bucket->ucb_entries)) { cache_bucket_push(cache, bucket, item); critical_exit(); return; } } while (cache_free(zone, cache, NULL, item, itemdomain)); critical_exit(); /* * If nothing else caught this, we'll just do an internal free. */ zone_free_item(zone, item, NULL, SKIP_NONE); } /* See uma.h */ void uma_zfree_arg(uma_zone_t zone, void *item, void *udata) { uma_cache_t cache; uma_cache_bucket_t bucket; int itemdomain, uz_flags; /* Enable entropy collection for RANDOM_ENABLE_UMA kernel option */ random_harvest_fast_uma(&zone, sizeof(zone), RANDOM_UMA); CTR2(KTR_UMA, "uma_zfree_arg zone %s(%p)", zone->uz_name, zone); #ifdef UMA_ZALLOC_DEBUG KASSERT((zone->uz_flags & UMA_ZONE_SMR) == 0, ("uma_zfree_arg: called with SMR zone.\n")); if (uma_zfree_debug(zone, item, udata) == EJUSTRETURN) return; #endif /* uma_zfree(..., NULL) does nothing, to match free(9). */ if (item == NULL) return; /* * We are accessing the per-cpu cache without a critical section to * fetch size and flags. This is acceptable, if we are preempted we * will simply read another cpu's line. */ cache = &zone->uz_cpu[curcpu]; uz_flags = cache_uz_flags(cache); if (UMA_ALWAYS_CTORDTOR || __predict_false((uz_flags & UMA_ZFLAG_CTORDTOR) != 0)) item_dtor(zone, item, cache_uz_size(cache), udata, SKIP_NONE); /* * The race here is acceptable. If we miss it we'll just have to wait * a little longer for the limits to be reset. */ if (__predict_false(uz_flags & UMA_ZFLAG_LIMIT)) { if (zone->uz_sleepers > 0) goto zfree_item; } /* * If possible, free to the per-CPU cache. There are two * requirements for safe access to the per-CPU cache: (1) the thread * accessing the cache must not be preempted or yield during access, * and (2) the thread must not migrate CPUs without switching which * cache it accesses. We rely on a critical section to prevent * preemption and migration. We release the critical section in * order to acquire the zone mutex if we are unable to free to the * current cache; when we re-acquire the critical section, we must * detect and handle migration if it has occurred. */ itemdomain = 0; #ifdef NUMA if ((uz_flags & UMA_ZONE_FIRSTTOUCH) != 0) itemdomain = _vm_phys_domain(pmap_kextract((vm_offset_t)item)); #endif critical_enter(); do { cache = &zone->uz_cpu[curcpu]; /* * Try to free into the allocbucket first to give LIFO * ordering for cache-hot datastructures. Spill over * into the freebucket if necessary. Alloc will swap * them if one runs dry. */ bucket = &cache->uc_allocbucket; #ifdef NUMA if ((uz_flags & UMA_ZONE_FIRSTTOUCH) != 0 && PCPU_GET(domain) != itemdomain) { bucket = &cache->uc_crossbucket; } else #endif if (bucket->ucb_cnt >= bucket->ucb_entries) bucket = &cache->uc_freebucket; if (__predict_true(bucket->ucb_cnt < bucket->ucb_entries)) { cache_bucket_push(cache, bucket, item); critical_exit(); return; } } while (cache_free(zone, cache, udata, item, itemdomain)); critical_exit(); /* * If nothing else caught this, we'll just do an internal free. */ zfree_item: zone_free_item(zone, item, udata, SKIP_DTOR); } #ifdef NUMA /* * sort crossdomain free buckets to domain correct buckets and cache * them. */ static void zone_free_cross(uma_zone_t zone, uma_bucket_t bucket, void *udata) { struct uma_bucketlist fullbuckets; uma_zone_domain_t zdom; uma_bucket_t b; smr_seq_t seq; void *item; int domain; CTR3(KTR_UMA, "uma_zfree: zone %s(%p) draining cross bucket %p", zone->uz_name, zone, bucket); - STAILQ_INIT(&fullbuckets); + /* + * It is possible for buckets to arrive here out of order so we fetch + * the current smr seq rather than accepting the bucket's. + */ + seq = SMR_SEQ_INVALID; + if ((zone->uz_flags & UMA_ZONE_SMR) != 0) + seq = smr_advance(zone->uz_smr); /* * To avoid having ndomain * ndomain buckets for sorting we have a * lock on the current crossfree bucket. A full matrix with * per-domain locking could be used if necessary. */ + STAILQ_INIT(&fullbuckets); ZONE_CROSS_LOCK(zone); - - /* - * It is possible for buckets to arrive here out of order so we fetch - * the current smr seq rather than accepting the bucket's. - */ - seq = SMR_SEQ_INVALID; - if ((zone->uz_flags & UMA_ZONE_SMR) != 0) - seq = smr_current(zone->uz_smr); while (bucket->ub_cnt > 0) { item = bucket->ub_bucket[bucket->ub_cnt - 1]; domain = _vm_phys_domain(pmap_kextract((vm_offset_t)item)); zdom = ZDOM_GET(zone, domain); if (zdom->uzd_cross == NULL) { zdom->uzd_cross = bucket_alloc(zone, udata, M_NOWAIT); if (zdom->uzd_cross == NULL) break; } b = zdom->uzd_cross; b->ub_bucket[b->ub_cnt++] = item; b->ub_seq = seq; if (b->ub_cnt == b->ub_entries) { STAILQ_INSERT_HEAD(&fullbuckets, b, ub_link); zdom->uzd_cross = NULL; } bucket->ub_cnt--; } ZONE_CROSS_UNLOCK(zone); if (bucket->ub_cnt == 0) bucket->ub_seq = SMR_SEQ_INVALID; bucket_free(zone, bucket, udata); while ((b = STAILQ_FIRST(&fullbuckets)) != NULL) { STAILQ_REMOVE_HEAD(&fullbuckets, ub_link); domain = _vm_phys_domain(pmap_kextract( (vm_offset_t)b->ub_bucket[0])); zone_put_bucket(zone, domain, b, udata, true); } } #endif static void zone_free_bucket(uma_zone_t zone, uma_bucket_t bucket, void *udata, int itemdomain, bool ws) { #ifdef NUMA /* * Buckets coming from the wrong domain will be entirely for the * only other domain on two domain systems. In this case we can * simply cache them. Otherwise we need to sort them back to * correct domains. */ if ((zone->uz_flags & UMA_ZONE_FIRSTTOUCH) != 0 && vm_ndomains > 2 && PCPU_GET(domain) != itemdomain) { zone_free_cross(zone, bucket, udata); return; } #endif /* * Attempt to save the bucket in the zone's domain bucket cache. */ CTR3(KTR_UMA, "uma_zfree: zone %s(%p) putting bucket %p on free list", zone->uz_name, zone, bucket); /* ub_cnt is pointing to the last free item */ if ((zone->uz_flags & UMA_ZONE_ROUNDROBIN) != 0) itemdomain = zone_domain_lowest(zone, itemdomain); zone_put_bucket(zone, itemdomain, bucket, udata, ws); } /* * Populate a free or cross bucket for the current cpu cache. Free any * existing full bucket either to the zone cache or back to the slab layer. * * Enters and returns in a critical section. false return indicates that * we can not satisfy this free in the cache layer. true indicates that * the caller should retry. */ static __noinline bool cache_free(uma_zone_t zone, uma_cache_t cache, void *udata, void *item, int itemdomain) { uma_cache_bucket_t cbucket; uma_bucket_t newbucket, bucket; CRITICAL_ASSERT(curthread); if (zone->uz_bucket_size == 0) return false; cache = &zone->uz_cpu[curcpu]; newbucket = NULL; /* * FIRSTTOUCH domains need to free to the correct zdom. When * enabled this is the zdom of the item. The bucket is the * cross bucket if the current domain and itemdomain do not match. */ cbucket = &cache->uc_freebucket; #ifdef NUMA if ((cache_uz_flags(cache) & UMA_ZONE_FIRSTTOUCH) != 0) { if (PCPU_GET(domain) != itemdomain) { cbucket = &cache->uc_crossbucket; if (cbucket->ucb_cnt != 0) counter_u64_add(zone->uz_xdomain, cbucket->ucb_cnt); } } #endif bucket = cache_bucket_unload(cbucket); KASSERT(bucket == NULL || bucket->ub_cnt == bucket->ub_entries, ("cache_free: Entered with non-full free bucket.")); /* We are no longer associated with this CPU. */ critical_exit(); /* * Don't let SMR zones operate without a free bucket. Force * a synchronize and re-use this one. We will only degrade * to a synchronize every bucket_size items rather than every * item if we fail to allocate a bucket. */ if ((zone->uz_flags & UMA_ZONE_SMR) != 0) { if (bucket != NULL) bucket->ub_seq = smr_advance(zone->uz_smr); newbucket = bucket_alloc(zone, udata, M_NOWAIT); if (newbucket == NULL && bucket != NULL) { bucket_drain(zone, bucket); newbucket = bucket; bucket = NULL; } } else if (!bucketdisable) newbucket = bucket_alloc(zone, udata, M_NOWAIT); if (bucket != NULL) zone_free_bucket(zone, bucket, udata, itemdomain, true); critical_enter(); if ((bucket = newbucket) == NULL) return (false); cache = &zone->uz_cpu[curcpu]; #ifdef NUMA /* * Check to see if we should be populating the cross bucket. If it * is already populated we will fall through and attempt to populate * the free bucket. */ if ((cache_uz_flags(cache) & UMA_ZONE_FIRSTTOUCH) != 0) { if (PCPU_GET(domain) != itemdomain && cache->uc_crossbucket.ucb_bucket == NULL) { cache_bucket_load_cross(cache, bucket); return (true); } } #endif /* * We may have lost the race to fill the bucket or switched CPUs. */ if (cache->uc_freebucket.ucb_bucket != NULL) { critical_exit(); bucket_free(zone, bucket, udata); critical_enter(); } else cache_bucket_load_free(cache, bucket); return (true); } void uma_zfree_domain(uma_zone_t zone, void *item, void *udata) { /* Enable entropy collection for RANDOM_ENABLE_UMA kernel option */ random_harvest_fast_uma(&zone, sizeof(zone), RANDOM_UMA); CTR2(KTR_UMA, "uma_zfree_domain zone %s(%p)", zone->uz_name, zone); KASSERT(curthread->td_critnest == 0 || SCHEDULER_STOPPED(), ("uma_zfree_domain: called with spinlock or critical section held")); /* uma_zfree(..., NULL) does nothing, to match free(9). */ if (item == NULL) return; zone_free_item(zone, item, udata, SKIP_NONE); } static void slab_free_item(uma_zone_t zone, uma_slab_t slab, void *item) { uma_keg_t keg; uma_domain_t dom; int freei; keg = zone->uz_keg; KEG_LOCK_ASSERT(keg, slab->us_domain); /* Do we need to remove from any lists? */ dom = &keg->uk_domain[slab->us_domain]; if (slab->us_freecount + 1 == keg->uk_ipers) { LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&dom->ud_free_slab, slab, us_link); dom->ud_free_slabs++; } else if (slab->us_freecount == 0) { LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&dom->ud_part_slab, slab, us_link); } /* Slab management. */ freei = slab_item_index(slab, keg, item); BIT_SET(keg->uk_ipers, freei, &slab->us_free); slab->us_freecount++; /* Keg statistics. */ dom->ud_free_items++; } static void zone_release(void *arg, void **bucket, int cnt) { struct mtx *lock; uma_zone_t zone; uma_slab_t slab; uma_keg_t keg; uint8_t *mem; void *item; int i; zone = arg; keg = zone->uz_keg; lock = NULL; if (__predict_false((zone->uz_flags & UMA_ZFLAG_HASH) != 0)) lock = KEG_LOCK(keg, 0); for (i = 0; i < cnt; i++) { item = bucket[i]; if (__predict_true((zone->uz_flags & UMA_ZFLAG_VTOSLAB) != 0)) { slab = vtoslab((vm_offset_t)item); } else { mem = (uint8_t *)((uintptr_t)item & (~UMA_SLAB_MASK)); if ((zone->uz_flags & UMA_ZFLAG_HASH) != 0) slab = hash_sfind(&keg->uk_hash, mem); else slab = (uma_slab_t)(mem + keg->uk_pgoff); } if (lock != KEG_LOCKPTR(keg, slab->us_domain)) { if (lock != NULL) mtx_unlock(lock); lock = KEG_LOCK(keg, slab->us_domain); } slab_free_item(zone, slab, item); } if (lock != NULL) mtx_unlock(lock); } /* * Frees a single item to any zone. * * Arguments: * zone The zone to free to * item The item we're freeing * udata User supplied data for the dtor * skip Skip dtors and finis */ static __noinline void zone_free_item(uma_zone_t zone, void *item, void *udata, enum zfreeskip skip) { /* * If a free is sent directly to an SMR zone we have to * synchronize immediately because the item can instantly * be reallocated. This should only happen in degenerate * cases when no memory is available for per-cpu caches. */ if ((zone->uz_flags & UMA_ZONE_SMR) != 0 && skip == SKIP_NONE) smr_synchronize(zone->uz_smr); item_dtor(zone, item, zone->uz_size, udata, skip); if (skip < SKIP_FINI && zone->uz_fini) zone->uz_fini(item, zone->uz_size); zone->uz_release(zone->uz_arg, &item, 1); if (skip & SKIP_CNT) return; counter_u64_add(zone->uz_frees, 1); if (zone->uz_max_items > 0) zone_free_limit(zone, 1); } /* See uma.h */ int uma_zone_set_max(uma_zone_t zone, int nitems) { struct uma_bucket_zone *ubz; int count; /* * XXX This can misbehave if the zone has any allocations with * no limit and a limit is imposed. There is currently no * way to clear a limit. */ ZONE_LOCK(zone); ubz = bucket_zone_max(zone, nitems); count = ubz != NULL ? ubz->ubz_entries : 0; zone->uz_bucket_size_max = zone->uz_bucket_size = count; if (zone->uz_bucket_size_min > zone->uz_bucket_size_max) zone->uz_bucket_size_min = zone->uz_bucket_size_max; zone->uz_max_items = nitems; zone->uz_flags |= UMA_ZFLAG_LIMIT; zone_update_caches(zone); /* We may need to wake waiters. */ wakeup(&zone->uz_max_items); ZONE_UNLOCK(zone); return (nitems); } /* See uma.h */ void uma_zone_set_maxcache(uma_zone_t zone, int nitems) { struct uma_bucket_zone *ubz; int bpcpu; ZONE_LOCK(zone); ubz = bucket_zone_max(zone, nitems); if (ubz != NULL) { bpcpu = 2; if ((zone->uz_flags & UMA_ZONE_FIRSTTOUCH) != 0) /* Count the cross-domain bucket. */ bpcpu++; nitems -= ubz->ubz_entries * bpcpu * mp_ncpus; zone->uz_bucket_size_max = ubz->ubz_entries; } else { zone->uz_bucket_size_max = zone->uz_bucket_size = 0; } if (zone->uz_bucket_size_min > zone->uz_bucket_size_max) zone->uz_bucket_size_min = zone->uz_bucket_size_max; zone->uz_bucket_max = nitems / vm_ndomains; ZONE_UNLOCK(zone); } /* See uma.h */ int uma_zone_get_max(uma_zone_t zone) { int nitems; nitems = atomic_load_64(&zone->uz_max_items); return (nitems); } /* See uma.h */ void uma_zone_set_warning(uma_zone_t zone, const char *warning) { ZONE_ASSERT_COLD(zone); zone->uz_warning = warning; } /* See uma.h */ void uma_zone_set_maxaction(uma_zone_t zone, uma_maxaction_t maxaction) { ZONE_ASSERT_COLD(zone); TASK_INIT(&zone->uz_maxaction, 0, (task_fn_t *)maxaction, zone); } /* See uma.h */ int uma_zone_get_cur(uma_zone_t zone) { int64_t nitems; u_int i; nitems = 0; if (zone->uz_allocs != EARLY_COUNTER && zone->uz_frees != EARLY_COUNTER) nitems = counter_u64_fetch(zone->uz_allocs) - counter_u64_fetch(zone->uz_frees); CPU_FOREACH(i) nitems += atomic_load_64(&zone->uz_cpu[i].uc_allocs) - atomic_load_64(&zone->uz_cpu[i].uc_frees); return (nitems < 0 ? 0 : nitems); } static uint64_t uma_zone_get_allocs(uma_zone_t zone) { uint64_t nitems; u_int i; nitems = 0; if (zone->uz_allocs != EARLY_COUNTER) nitems = counter_u64_fetch(zone->uz_allocs); CPU_FOREACH(i) nitems += atomic_load_64(&zone->uz_cpu[i].uc_allocs); return (nitems); } static uint64_t uma_zone_get_frees(uma_zone_t zone) { uint64_t nitems; u_int i; nitems = 0; if (zone->uz_frees != EARLY_COUNTER) nitems = counter_u64_fetch(zone->uz_frees); CPU_FOREACH(i) nitems += atomic_load_64(&zone->uz_cpu[i].uc_frees); return (nitems); } #ifdef INVARIANTS /* Used only for KEG_ASSERT_COLD(). */ static uint64_t uma_keg_get_allocs(uma_keg_t keg) { uma_zone_t z; uint64_t nitems; nitems = 0; LIST_FOREACH(z, &keg->uk_zones, uz_link) nitems += uma_zone_get_allocs(z); return (nitems); } #endif /* See uma.h */ void uma_zone_set_init(uma_zone_t zone, uma_init uminit) { uma_keg_t keg; KEG_GET(zone, keg); KEG_ASSERT_COLD(keg); keg->uk_init = uminit; } /* See uma.h */ void uma_zone_set_fini(uma_zone_t zone, uma_fini fini) { uma_keg_t keg; KEG_GET(zone, keg); KEG_ASSERT_COLD(keg); keg->uk_fini = fini; } /* See uma.h */ void uma_zone_set_zinit(uma_zone_t zone, uma_init zinit) { ZONE_ASSERT_COLD(zone); zone->uz_init = zinit; } /* See uma.h */ void uma_zone_set_zfini(uma_zone_t zone, uma_fini zfini) { ZONE_ASSERT_COLD(zone); zone->uz_fini = zfini; } /* See uma.h */ void uma_zone_set_freef(uma_zone_t zone, uma_free freef) { uma_keg_t keg; KEG_GET(zone, keg); KEG_ASSERT_COLD(keg); keg->uk_freef = freef; } /* See uma.h */ void uma_zone_set_allocf(uma_zone_t zone, uma_alloc allocf) { uma_keg_t keg; KEG_GET(zone, keg); KEG_ASSERT_COLD(keg); keg->uk_allocf = allocf; } /* See uma.h */ void uma_zone_set_smr(uma_zone_t zone, smr_t smr) { ZONE_ASSERT_COLD(zone); zone->uz_flags |= UMA_ZONE_SMR; zone->uz_smr = smr; zone_update_caches(zone); } smr_t uma_zone_get_smr(uma_zone_t zone) { return (zone->uz_smr); } /* See uma.h */ void uma_zone_reserve(uma_zone_t zone, int items) { uma_keg_t keg; KEG_GET(zone, keg); KEG_ASSERT_COLD(keg); keg->uk_reserve = items; } /* See uma.h */ int uma_zone_reserve_kva(uma_zone_t zone, int count) { uma_keg_t keg; vm_offset_t kva; u_int pages; KEG_GET(zone, keg); KEG_ASSERT_COLD(keg); ZONE_ASSERT_COLD(zone); pages = howmany(count, keg->uk_ipers) * keg->uk_ppera; #ifdef UMA_MD_SMALL_ALLOC if (keg->uk_ppera > 1) { #else if (1) { #endif kva = kva_alloc((vm_size_t)pages * PAGE_SIZE); if (kva == 0) return (0); } else kva = 0; MPASS(keg->uk_kva == 0); keg->uk_kva = kva; keg->uk_offset = 0; zone->uz_max_items = pages * keg->uk_ipers; #ifdef UMA_MD_SMALL_ALLOC keg->uk_allocf = (keg->uk_ppera > 1) ? noobj_alloc : uma_small_alloc; #else keg->uk_allocf = noobj_alloc; #endif keg->uk_flags |= UMA_ZFLAG_LIMIT | UMA_ZONE_NOFREE; zone->uz_flags |= UMA_ZFLAG_LIMIT | UMA_ZONE_NOFREE; zone_update_caches(zone); return (1); } /* See uma.h */ void uma_prealloc(uma_zone_t zone, int items) { struct vm_domainset_iter di; uma_domain_t dom; uma_slab_t slab; uma_keg_t keg; int aflags, domain, slabs; KEG_GET(zone, keg); slabs = howmany(items, keg->uk_ipers); while (slabs-- > 0) { aflags = M_NOWAIT; vm_domainset_iter_policy_ref_init(&di, &keg->uk_dr, &domain, &aflags); for (;;) { slab = keg_alloc_slab(keg, zone, domain, M_WAITOK, aflags); if (slab != NULL) { dom = &keg->uk_domain[slab->us_domain]; /* * keg_alloc_slab() always returns a slab on the * partial list. */ LIST_REMOVE(slab, us_link); LIST_INSERT_HEAD(&dom->ud_free_slab, slab, us_link); dom->ud_free_slabs++; KEG_UNLOCK(keg, slab->us_domain); break; } if (vm_domainset_iter_policy(&di, &domain) != 0) vm_wait_doms(&keg->uk_dr.dr_policy->ds_mask); } } } /* * Returns a snapshot of memory consumption in bytes. */ size_t uma_zone_memory(uma_zone_t zone) { size_t sz; int i; sz = 0; if (zone->uz_flags & UMA_ZFLAG_CACHE) { for (i = 0; i < vm_ndomains; i++) sz += ZDOM_GET(zone, i)->uzd_nitems; return (sz * zone->uz_size); } for (i = 0; i < vm_ndomains; i++) sz += zone->uz_keg->uk_domain[i].ud_pages; return (sz * PAGE_SIZE); } /* See uma.h */ void uma_reclaim(int req) { CTR0(KTR_UMA, "UMA: vm asked us to release pages!"); sx_xlock(&uma_reclaim_lock); bucket_enable(); switch (req) { case UMA_RECLAIM_TRIM: zone_foreach(zone_trim, NULL); break; case UMA_RECLAIM_DRAIN: case UMA_RECLAIM_DRAIN_CPU: zone_foreach(zone_drain, NULL); if (req == UMA_RECLAIM_DRAIN_CPU) { pcpu_cache_drain_safe(NULL); zone_foreach(zone_drain, NULL); } break; default: panic("unhandled reclamation request %d", req); } /* * Some slabs may have been freed but this zone will be visited early * we visit again so that we can free pages that are empty once other * zones are drained. We have to do the same for buckets. */ zone_drain(slabzones[0], NULL); zone_drain(slabzones[1], NULL); bucket_zone_drain(); sx_xunlock(&uma_reclaim_lock); } static volatile int uma_reclaim_needed; void uma_reclaim_wakeup(void) { if (atomic_fetchadd_int(&uma_reclaim_needed, 1) == 0) wakeup(uma_reclaim); } void uma_reclaim_worker(void *arg __unused) { for (;;) { sx_xlock(&uma_reclaim_lock); while (atomic_load_int(&uma_reclaim_needed) == 0) sx_sleep(uma_reclaim, &uma_reclaim_lock, PVM, "umarcl", hz); sx_xunlock(&uma_reclaim_lock); EVENTHANDLER_INVOKE(vm_lowmem, VM_LOW_KMEM); uma_reclaim(UMA_RECLAIM_DRAIN_CPU); atomic_store_int(&uma_reclaim_needed, 0); /* Don't fire more than once per-second. */ pause("umarclslp", hz); } } /* See uma.h */ void uma_zone_reclaim(uma_zone_t zone, int req) { switch (req) { case UMA_RECLAIM_TRIM: zone_trim(zone, NULL); break; case UMA_RECLAIM_DRAIN: zone_drain(zone, NULL); break; case UMA_RECLAIM_DRAIN_CPU: pcpu_cache_drain_safe(zone); zone_drain(zone, NULL); break; default: panic("unhandled reclamation request %d", req); } } /* See uma.h */ int uma_zone_exhausted(uma_zone_t zone) { return (atomic_load_32(&zone->uz_sleepers) > 0); } unsigned long uma_limit(void) { return (uma_kmem_limit); } void uma_set_limit(unsigned long limit) { uma_kmem_limit = limit; } unsigned long uma_size(void) { return (atomic_load_long(&uma_kmem_total)); } long uma_avail(void) { return (uma_kmem_limit - uma_size()); } #ifdef DDB /* * Generate statistics across both the zone and its per-cpu cache's. Return * desired statistics if the pointer is non-NULL for that statistic. * * Note: does not update the zone statistics, as it can't safely clear the * per-CPU cache statistic. * */ static void uma_zone_sumstat(uma_zone_t z, long *cachefreep, uint64_t *allocsp, uint64_t *freesp, uint64_t *sleepsp, uint64_t *xdomainp) { uma_cache_t cache; uint64_t allocs, frees, sleeps, xdomain; int cachefree, cpu; allocs = frees = sleeps = xdomain = 0; cachefree = 0; CPU_FOREACH(cpu) { cache = &z->uz_cpu[cpu]; cachefree += cache->uc_allocbucket.ucb_cnt; cachefree += cache->uc_freebucket.ucb_cnt; xdomain += cache->uc_crossbucket.ucb_cnt; cachefree += cache->uc_crossbucket.ucb_cnt; allocs += cache->uc_allocs; frees += cache->uc_frees; } allocs += counter_u64_fetch(z->uz_allocs); frees += counter_u64_fetch(z->uz_frees); xdomain += counter_u64_fetch(z->uz_xdomain); sleeps += z->uz_sleeps; if (cachefreep != NULL) *cachefreep = cachefree; if (allocsp != NULL) *allocsp = allocs; if (freesp != NULL) *freesp = frees; if (sleepsp != NULL) *sleepsp = sleeps; if (xdomainp != NULL) *xdomainp = xdomain; } #endif /* DDB */ static int sysctl_vm_zone_count(SYSCTL_HANDLER_ARGS) { uma_keg_t kz; uma_zone_t z; int count; count = 0; rw_rlock(&uma_rwlock); LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) count++; } LIST_FOREACH(z, &uma_cachezones, uz_link) count++; rw_runlock(&uma_rwlock); return (sysctl_handle_int(oidp, &count, 0, req)); } static void uma_vm_zone_stats(struct uma_type_header *uth, uma_zone_t z, struct sbuf *sbuf, struct uma_percpu_stat *ups, bool internal) { uma_zone_domain_t zdom; uma_cache_t cache; int i; for (i = 0; i < vm_ndomains; i++) { zdom = ZDOM_GET(z, i); uth->uth_zone_free += zdom->uzd_nitems; } uth->uth_allocs = counter_u64_fetch(z->uz_allocs); uth->uth_frees = counter_u64_fetch(z->uz_frees); uth->uth_fails = counter_u64_fetch(z->uz_fails); uth->uth_xdomain = counter_u64_fetch(z->uz_xdomain); uth->uth_sleeps = z->uz_sleeps; for (i = 0; i < mp_maxid + 1; i++) { bzero(&ups[i], sizeof(*ups)); if (internal || CPU_ABSENT(i)) continue; cache = &z->uz_cpu[i]; ups[i].ups_cache_free += cache->uc_allocbucket.ucb_cnt; ups[i].ups_cache_free += cache->uc_freebucket.ucb_cnt; ups[i].ups_cache_free += cache->uc_crossbucket.ucb_cnt; ups[i].ups_allocs = cache->uc_allocs; ups[i].ups_frees = cache->uc_frees; } } static int sysctl_vm_zone_stats(SYSCTL_HANDLER_ARGS) { struct uma_stream_header ush; struct uma_type_header uth; struct uma_percpu_stat *ups; struct sbuf sbuf; uma_keg_t kz; uma_zone_t z; uint64_t items; uint32_t kfree, pages; int count, error, i; error = sysctl_wire_old_buffer(req, 0); if (error != 0) return (error); sbuf_new_for_sysctl(&sbuf, NULL, 128, req); sbuf_clear_flags(&sbuf, SBUF_INCLUDENUL); ups = malloc((mp_maxid + 1) * sizeof(*ups), M_TEMP, M_WAITOK); count = 0; rw_rlock(&uma_rwlock); LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) count++; } LIST_FOREACH(z, &uma_cachezones, uz_link) count++; /* * Insert stream header. */ bzero(&ush, sizeof(ush)); ush.ush_version = UMA_STREAM_VERSION; ush.ush_maxcpus = (mp_maxid + 1); ush.ush_count = count; (void)sbuf_bcat(&sbuf, &ush, sizeof(ush)); LIST_FOREACH(kz, &uma_kegs, uk_link) { kfree = pages = 0; for (i = 0; i < vm_ndomains; i++) { kfree += kz->uk_domain[i].ud_free_items; pages += kz->uk_domain[i].ud_pages; } LIST_FOREACH(z, &kz->uk_zones, uz_link) { bzero(&uth, sizeof(uth)); strlcpy(uth.uth_name, z->uz_name, UTH_MAX_NAME); uth.uth_align = kz->uk_align; uth.uth_size = kz->uk_size; uth.uth_rsize = kz->uk_rsize; if (z->uz_max_items > 0) { items = UZ_ITEMS_COUNT(z->uz_items); uth.uth_pages = (items / kz->uk_ipers) * kz->uk_ppera; } else uth.uth_pages = pages; uth.uth_maxpages = (z->uz_max_items / kz->uk_ipers) * kz->uk_ppera; uth.uth_limit = z->uz_max_items; uth.uth_keg_free = kfree; /* * A zone is secondary is it is not the first entry * on the keg's zone list. */ if ((z->uz_flags & UMA_ZONE_SECONDARY) && (LIST_FIRST(&kz->uk_zones) != z)) uth.uth_zone_flags = UTH_ZONE_SECONDARY; uma_vm_zone_stats(&uth, z, &sbuf, ups, kz->uk_flags & UMA_ZFLAG_INTERNAL); (void)sbuf_bcat(&sbuf, &uth, sizeof(uth)); for (i = 0; i < mp_maxid + 1; i++) (void)sbuf_bcat(&sbuf, &ups[i], sizeof(ups[i])); } } LIST_FOREACH(z, &uma_cachezones, uz_link) { bzero(&uth, sizeof(uth)); strlcpy(uth.uth_name, z->uz_name, UTH_MAX_NAME); uth.uth_size = z->uz_size; uma_vm_zone_stats(&uth, z, &sbuf, ups, false); (void)sbuf_bcat(&sbuf, &uth, sizeof(uth)); for (i = 0; i < mp_maxid + 1; i++) (void)sbuf_bcat(&sbuf, &ups[i], sizeof(ups[i])); } rw_runlock(&uma_rwlock); error = sbuf_finish(&sbuf); sbuf_delete(&sbuf); free(ups, M_TEMP); return (error); } int sysctl_handle_uma_zone_max(SYSCTL_HANDLER_ARGS) { uma_zone_t zone = *(uma_zone_t *)arg1; int error, max; max = uma_zone_get_max(zone); error = sysctl_handle_int(oidp, &max, 0, req); if (error || !req->newptr) return (error); uma_zone_set_max(zone, max); return (0); } int sysctl_handle_uma_zone_cur(SYSCTL_HANDLER_ARGS) { uma_zone_t zone; int cur; /* * Some callers want to add sysctls for global zones that * may not yet exist so they pass a pointer to a pointer. */ if (arg2 == 0) zone = *(uma_zone_t *)arg1; else zone = arg1; cur = uma_zone_get_cur(zone); return (sysctl_handle_int(oidp, &cur, 0, req)); } static int sysctl_handle_uma_zone_allocs(SYSCTL_HANDLER_ARGS) { uma_zone_t zone = arg1; uint64_t cur; cur = uma_zone_get_allocs(zone); return (sysctl_handle_64(oidp, &cur, 0, req)); } static int sysctl_handle_uma_zone_frees(SYSCTL_HANDLER_ARGS) { uma_zone_t zone = arg1; uint64_t cur; cur = uma_zone_get_frees(zone); return (sysctl_handle_64(oidp, &cur, 0, req)); } static int sysctl_handle_uma_zone_flags(SYSCTL_HANDLER_ARGS) { struct sbuf sbuf; uma_zone_t zone = arg1; int error; sbuf_new_for_sysctl(&sbuf, NULL, 0, req); if (zone->uz_flags != 0) sbuf_printf(&sbuf, "0x%b", zone->uz_flags, PRINT_UMA_ZFLAGS); else sbuf_printf(&sbuf, "0"); error = sbuf_finish(&sbuf); sbuf_delete(&sbuf); return (error); } static int sysctl_handle_uma_slab_efficiency(SYSCTL_HANDLER_ARGS) { uma_keg_t keg = arg1; int avail, effpct, total; total = keg->uk_ppera * PAGE_SIZE; if ((keg->uk_flags & UMA_ZFLAG_OFFPAGE) != 0) total += slabzone(keg->uk_ipers)->uz_keg->uk_rsize; /* * We consider the client's requested size and alignment here, not the * real size determination uk_rsize, because we also adjust the real * size for internal implementation reasons (max bitset size). */ avail = keg->uk_ipers * roundup2(keg->uk_size, keg->uk_align + 1); if ((keg->uk_flags & UMA_ZONE_PCPU) != 0) avail *= mp_maxid + 1; effpct = 100 * avail / total; return (sysctl_handle_int(oidp, &effpct, 0, req)); } static int sysctl_handle_uma_zone_items(SYSCTL_HANDLER_ARGS) { uma_zone_t zone = arg1; uint64_t cur; cur = UZ_ITEMS_COUNT(atomic_load_64(&zone->uz_items)); return (sysctl_handle_64(oidp, &cur, 0, req)); } #ifdef INVARIANTS static uma_slab_t uma_dbg_getslab(uma_zone_t zone, void *item) { uma_slab_t slab; uma_keg_t keg; uint8_t *mem; /* * It is safe to return the slab here even though the * zone is unlocked because the item's allocation state * essentially holds a reference. */ mem = (uint8_t *)((uintptr_t)item & (~UMA_SLAB_MASK)); if ((zone->uz_flags & UMA_ZFLAG_CACHE) != 0) return (NULL); if (zone->uz_flags & UMA_ZFLAG_VTOSLAB) return (vtoslab((vm_offset_t)mem)); keg = zone->uz_keg; if ((keg->uk_flags & UMA_ZFLAG_HASH) == 0) return ((uma_slab_t)(mem + keg->uk_pgoff)); KEG_LOCK(keg, 0); slab = hash_sfind(&keg->uk_hash, mem); KEG_UNLOCK(keg, 0); return (slab); } static bool uma_dbg_zskip(uma_zone_t zone, void *mem) { if ((zone->uz_flags & UMA_ZFLAG_CACHE) != 0) return (true); return (uma_dbg_kskip(zone->uz_keg, mem)); } static bool uma_dbg_kskip(uma_keg_t keg, void *mem) { uintptr_t idx; if (dbg_divisor == 0) return (true); if (dbg_divisor == 1) return (false); idx = (uintptr_t)mem >> PAGE_SHIFT; if (keg->uk_ipers > 1) { idx *= keg->uk_ipers; idx += ((uintptr_t)mem & PAGE_MASK) / keg->uk_rsize; } if ((idx / dbg_divisor) * dbg_divisor != idx) { counter_u64_add(uma_skip_cnt, 1); return (true); } counter_u64_add(uma_dbg_cnt, 1); return (false); } /* * Set up the slab's freei data such that uma_dbg_free can function. * */ static void uma_dbg_alloc(uma_zone_t zone, uma_slab_t slab, void *item) { uma_keg_t keg; int freei; if (slab == NULL) { slab = uma_dbg_getslab(zone, item); if (slab == NULL) panic("uma: item %p did not belong to zone %s\n", item, zone->uz_name); } keg = zone->uz_keg; freei = slab_item_index(slab, keg, item); if (BIT_ISSET(keg->uk_ipers, freei, slab_dbg_bits(slab, keg))) panic("Duplicate alloc of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); BIT_SET_ATOMIC(keg->uk_ipers, freei, slab_dbg_bits(slab, keg)); } /* * Verifies freed addresses. Checks for alignment, valid slab membership * and duplicate frees. * */ static void uma_dbg_free(uma_zone_t zone, uma_slab_t slab, void *item) { uma_keg_t keg; int freei; if (slab == NULL) { slab = uma_dbg_getslab(zone, item); if (slab == NULL) panic("uma: Freed item %p did not belong to zone %s\n", item, zone->uz_name); } keg = zone->uz_keg; freei = slab_item_index(slab, keg, item); if (freei >= keg->uk_ipers) panic("Invalid free of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); if (slab_item(slab, keg, freei) != item) panic("Unaligned free of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); if (!BIT_ISSET(keg->uk_ipers, freei, slab_dbg_bits(slab, keg))) panic("Duplicate free of %p from zone %p(%s) slab %p(%d)\n", item, zone, zone->uz_name, slab, freei); BIT_CLR_ATOMIC(keg->uk_ipers, freei, slab_dbg_bits(slab, keg)); } #endif /* INVARIANTS */ #ifdef DDB static int64_t get_uma_stats(uma_keg_t kz, uma_zone_t z, uint64_t *allocs, uint64_t *used, uint64_t *sleeps, long *cachefree, uint64_t *xdomain) { uint64_t frees; int i; if (kz->uk_flags & UMA_ZFLAG_INTERNAL) { *allocs = counter_u64_fetch(z->uz_allocs); frees = counter_u64_fetch(z->uz_frees); *sleeps = z->uz_sleeps; *cachefree = 0; *xdomain = 0; } else uma_zone_sumstat(z, cachefree, allocs, &frees, sleeps, xdomain); for (i = 0; i < vm_ndomains; i++) { *cachefree += ZDOM_GET(z, i)->uzd_nitems; if (!((z->uz_flags & UMA_ZONE_SECONDARY) && (LIST_FIRST(&kz->uk_zones) != z))) *cachefree += kz->uk_domain[i].ud_free_items; } *used = *allocs - frees; return (((int64_t)*used + *cachefree) * kz->uk_size); } DB_SHOW_COMMAND(uma, db_show_uma) { const char *fmt_hdr, *fmt_entry; uma_keg_t kz; uma_zone_t z; uint64_t allocs, used, sleeps, xdomain; long cachefree; /* variables for sorting */ uma_keg_t cur_keg; uma_zone_t cur_zone, last_zone; int64_t cur_size, last_size, size; int ties; /* /i option produces machine-parseable CSV output */ if (modif[0] == 'i') { fmt_hdr = "%s,%s,%s,%s,%s,%s,%s,%s,%s\n"; fmt_entry = "\"%s\",%ju,%jd,%ld,%ju,%ju,%u,%jd,%ju\n"; } else { fmt_hdr = "%18s %6s %7s %7s %11s %7s %7s %10s %8s\n"; fmt_entry = "%18s %6ju %7jd %7ld %11ju %7ju %7u %10jd %8ju\n"; } db_printf(fmt_hdr, "Zone", "Size", "Used", "Free", "Requests", "Sleeps", "Bucket", "Total Mem", "XFree"); /* Sort the zones with largest size first. */ last_zone = NULL; last_size = INT64_MAX; for (;;) { cur_zone = NULL; cur_size = -1; ties = 0; LIST_FOREACH(kz, &uma_kegs, uk_link) { LIST_FOREACH(z, &kz->uk_zones, uz_link) { /* * In the case of size ties, print out zones * in the order they are encountered. That is, * when we encounter the most recently output * zone, we have already printed all preceding * ties, and we must print all following ties. */ if (z == last_zone) { ties = 1; continue; } size = get_uma_stats(kz, z, &allocs, &used, &sleeps, &cachefree, &xdomain); if (size > cur_size && size < last_size + ties) { cur_size = size; cur_zone = z; cur_keg = kz; } } } if (cur_zone == NULL) break; size = get_uma_stats(cur_keg, cur_zone, &allocs, &used, &sleeps, &cachefree, &xdomain); db_printf(fmt_entry, cur_zone->uz_name, (uintmax_t)cur_keg->uk_size, (intmax_t)used, cachefree, (uintmax_t)allocs, (uintmax_t)sleeps, (unsigned)cur_zone->uz_bucket_size, (intmax_t)size, xdomain); if (db_pager_quit) return; last_zone = cur_zone; last_size = cur_size; } } DB_SHOW_COMMAND(umacache, db_show_umacache) { uma_zone_t z; uint64_t allocs, frees; long cachefree; int i; db_printf("%18s %8s %8s %8s %12s %8s\n", "Zone", "Size", "Used", "Free", "Requests", "Bucket"); LIST_FOREACH(z, &uma_cachezones, uz_link) { uma_zone_sumstat(z, &cachefree, &allocs, &frees, NULL, NULL); for (i = 0; i < vm_ndomains; i++) cachefree += ZDOM_GET(z, i)->uzd_nitems; db_printf("%18s %8ju %8jd %8ld %12ju %8u\n", z->uz_name, (uintmax_t)z->uz_size, (intmax_t)(allocs - frees), cachefree, (uintmax_t)allocs, z->uz_bucket_size); if (db_pager_quit) return; } } #endif /* DDB */ Index: projects/clang1000-import/sys/x86/x86/identcpu.c =================================================================== --- projects/clang1000-import/sys/x86/x86/identcpu.c (revision 358238) +++ projects/clang1000-import/sys/x86/x86/identcpu.c (revision 358239) @@ -1,2672 +1,2672 @@ /*- * Copyright (c) 1992 Terrence R. Lambert. * Copyright (c) 1982, 1987, 1990 The Regents of the University of California. * Copyright (c) 1997 KATO Takenori. * All rights reserved. * * This code is derived from software contributed to Berkeley by * William Jolitz. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: Id: machdep.c,v 1.193 1996/06/18 01:22:04 bde Exp */ #include __FBSDID("$FreeBSD$"); #include "opt_cpu.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef __i386__ #define IDENTBLUE_CYRIX486 0 #define IDENTBLUE_IBMCPU 1 #define IDENTBLUE_CYRIXM2 2 static void identifycyrix(void); static void print_transmeta_info(void); #endif static u_int find_cpu_vendor_id(void); static void print_AMD_info(void); static void print_INTEL_info(void); static void print_INTEL_TLB(u_int data); static void print_hypervisor_info(void); static void print_svm_info(void); static void print_via_padlock_info(void); static void print_vmx_info(void); #ifdef __i386__ int cpu; /* Are we 386, 386sx, 486, etc? */ int cpu_class; #endif u_int cpu_feature; /* Feature flags */ u_int cpu_feature2; /* Feature flags */ u_int amd_feature; /* AMD feature flags */ u_int amd_feature2; /* AMD feature flags */ u_int amd_rascap; /* AMD RAS capabilities */ u_int amd_pminfo; /* AMD advanced power management info */ u_int amd_extended_feature_extensions; u_int via_feature_rng; /* VIA RNG features */ u_int via_feature_xcrypt; /* VIA ACE features */ u_int cpu_high; /* Highest arg to CPUID */ u_int cpu_exthigh; /* Highest arg to extended CPUID */ u_int cpu_id; /* Stepping ID */ u_int cpu_procinfo; /* HyperThreading Info / Brand Index / CLFUSH */ u_int cpu_procinfo2; /* Multicore info */ char cpu_vendor[20]; /* CPU Origin code */ u_int cpu_vendor_id; /* CPU vendor ID */ u_int cpu_fxsr; /* SSE enabled */ u_int cpu_mxcsr_mask; /* Valid bits in mxcsr */ u_int cpu_clflush_line_size = 32; u_int cpu_stdext_feature; /* %ebx */ u_int cpu_stdext_feature2; /* %ecx */ u_int cpu_stdext_feature3; /* %edx */ uint64_t cpu_ia32_arch_caps; u_int cpu_max_ext_state_size; u_int cpu_mon_mwait_flags; /* MONITOR/MWAIT flags (CPUID.05H.ECX) */ u_int cpu_mon_min_size; /* MONITOR minimum range size, bytes */ u_int cpu_mon_max_size; /* MONITOR minimum range size, bytes */ u_int cpu_maxphyaddr; /* Max phys addr width in bits */ u_int cpu_power_eax; /* 06H: Power management leaf, %eax */ u_int cpu_power_ebx; /* 06H: Power management leaf, %ebx */ u_int cpu_power_ecx; /* 06H: Power management leaf, %ecx */ u_int cpu_power_edx; /* 06H: Power management leaf, %edx */ char machine[] = MACHINE; SYSCTL_UINT(_hw, OID_AUTO, via_feature_rng, CTLFLAG_RD, &via_feature_rng, 0, "VIA RNG feature available in CPU"); SYSCTL_UINT(_hw, OID_AUTO, via_feature_xcrypt, CTLFLAG_RD, &via_feature_xcrypt, 0, "VIA xcrypt feature available in CPU"); #ifdef __amd64__ #ifdef SCTL_MASK32 extern int adaptive_machine_arch; #endif static int sysctl_hw_machine(SYSCTL_HANDLER_ARGS) { #ifdef SCTL_MASK32 static const char machine32[] = "i386"; #endif int error; #ifdef SCTL_MASK32 if ((req->flags & SCTL_MASK32) != 0 && adaptive_machine_arch) error = SYSCTL_OUT(req, machine32, sizeof(machine32)); else #endif error = SYSCTL_OUT(req, machine, sizeof(machine)); return (error); } SYSCTL_PROC(_hw, HW_MACHINE, machine, CTLTYPE_STRING | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_hw_machine, "A", "Machine class"); #else SYSCTL_STRING(_hw, HW_MACHINE, machine, CTLFLAG_RD, machine, 0, "Machine class"); #endif static char cpu_model[128]; SYSCTL_STRING(_hw, HW_MODEL, model, CTLFLAG_RD | CTLFLAG_MPSAFE, cpu_model, 0, "Machine model"); static int hw_clockrate; SYSCTL_INT(_hw, OID_AUTO, clockrate, CTLFLAG_RD, &hw_clockrate, 0, "CPU instruction clock rate"); u_int hv_base; u_int hv_high; char hv_vendor[16]; SYSCTL_STRING(_hw, OID_AUTO, hv_vendor, CTLFLAG_RD | CTLFLAG_MPSAFE, hv_vendor, 0, "Hypervisor vendor"); static eventhandler_tag tsc_post_tag; static char cpu_brand[48]; #ifdef __i386__ #define MAX_BRAND_INDEX 8 static const char *cpu_brandtable[MAX_BRAND_INDEX + 1] = { NULL, /* No brand */ "Intel Celeron", "Intel Pentium III", "Intel Pentium III Xeon", NULL, NULL, NULL, NULL, "Intel Pentium 4" }; static struct { char *cpu_name; int cpu_class; } cpus[] = { { "Intel 80286", CPUCLASS_286 }, /* CPU_286 */ { "i386SX", CPUCLASS_386 }, /* CPU_386SX */ { "i386DX", CPUCLASS_386 }, /* CPU_386 */ { "i486SX", CPUCLASS_486 }, /* CPU_486SX */ { "i486DX", CPUCLASS_486 }, /* CPU_486 */ { "Pentium", CPUCLASS_586 }, /* CPU_586 */ { "Cyrix 486", CPUCLASS_486 }, /* CPU_486DLC */ { "Pentium Pro", CPUCLASS_686 }, /* CPU_686 */ { "Cyrix 5x86", CPUCLASS_486 }, /* CPU_M1SC */ { "Cyrix 6x86", CPUCLASS_486 }, /* CPU_M1 */ { "Blue Lightning", CPUCLASS_486 }, /* CPU_BLUE */ { "Cyrix 6x86MX", CPUCLASS_686 }, /* CPU_M2 */ { "NexGen 586", CPUCLASS_386 }, /* CPU_NX586 (XXX) */ { "Cyrix 486S/DX", CPUCLASS_486 }, /* CPU_CY486DX */ { "Pentium II", CPUCLASS_686 }, /* CPU_PII */ { "Pentium III", CPUCLASS_686 }, /* CPU_PIII */ { "Pentium 4", CPUCLASS_686 }, /* CPU_P4 */ }; #endif static struct { char *vendor; u_int vendor_id; } cpu_vendors[] = { { INTEL_VENDOR_ID, CPU_VENDOR_INTEL }, /* GenuineIntel */ { AMD_VENDOR_ID, CPU_VENDOR_AMD }, /* AuthenticAMD */ { HYGON_VENDOR_ID, CPU_VENDOR_HYGON }, /* HygonGenuine*/ { CENTAUR_VENDOR_ID, CPU_VENDOR_CENTAUR }, /* CentaurHauls */ #ifdef __i386__ { NSC_VENDOR_ID, CPU_VENDOR_NSC }, /* Geode by NSC */ { CYRIX_VENDOR_ID, CPU_VENDOR_CYRIX }, /* CyrixInstead */ { TRANSMETA_VENDOR_ID, CPU_VENDOR_TRANSMETA }, /* GenuineTMx86 */ { SIS_VENDOR_ID, CPU_VENDOR_SIS }, /* SiS SiS SiS */ { UMC_VENDOR_ID, CPU_VENDOR_UMC }, /* UMC UMC UMC */ { NEXGEN_VENDOR_ID, CPU_VENDOR_NEXGEN }, /* NexGenDriven */ { RISE_VENDOR_ID, CPU_VENDOR_RISE }, /* RiseRiseRise */ #if 0 /* XXX CPUID 8000_0000h and 8086_0000h, not 0000_0000h */ { "TransmetaCPU", CPU_VENDOR_TRANSMETA }, #endif #endif }; void printcpuinfo(void) { u_int regs[4], i; char *brand; printf("CPU: "); #ifdef __i386__ cpu_class = cpus[cpu].cpu_class; strncpy(cpu_model, cpus[cpu].cpu_name, sizeof (cpu_model)); #else strncpy(cpu_model, "Hammer", sizeof (cpu_model)); #endif /* Check for extended CPUID information and a processor name. */ if (cpu_exthigh >= 0x80000004) { brand = cpu_brand; for (i = 0x80000002; i < 0x80000005; i++) { do_cpuid(i, regs); memcpy(brand, regs, sizeof(regs)); brand += sizeof(regs); } } switch (cpu_vendor_id) { case CPU_VENDOR_INTEL: #ifdef __i386__ if ((cpu_id & 0xf00) > 0x300) { u_int brand_index; cpu_model[0] = '\0'; switch (cpu_id & 0x3000) { case 0x1000: strcpy(cpu_model, "Overdrive "); break; case 0x2000: strcpy(cpu_model, "Dual "); break; } switch (cpu_id & 0xf00) { case 0x400: strcat(cpu_model, "i486 "); - /* Check the particular flavor of 486 */ + /* Check the particular flavor of 486 */ switch (cpu_id & 0xf0) { case 0x00: case 0x10: strcat(cpu_model, "DX"); break; case 0x20: strcat(cpu_model, "SX"); break; case 0x30: strcat(cpu_model, "DX2"); break; case 0x40: strcat(cpu_model, "SL"); break; case 0x50: strcat(cpu_model, "SX2"); break; case 0x70: strcat(cpu_model, "DX2 Write-Back Enhanced"); break; case 0x80: strcat(cpu_model, "DX4"); break; } break; case 0x500: - /* Check the particular flavor of 586 */ - strcat(cpu_model, "Pentium"); - switch (cpu_id & 0xf0) { + /* Check the particular flavor of 586 */ + strcat(cpu_model, "Pentium"); + switch (cpu_id & 0xf0) { case 0x00: - strcat(cpu_model, " A-step"); + strcat(cpu_model, " A-step"); break; case 0x10: - strcat(cpu_model, "/P5"); + strcat(cpu_model, "/P5"); break; case 0x20: - strcat(cpu_model, "/P54C"); + strcat(cpu_model, "/P54C"); break; case 0x30: - strcat(cpu_model, "/P24T"); + strcat(cpu_model, "/P24T"); break; case 0x40: - strcat(cpu_model, "/P55C"); + strcat(cpu_model, "/P55C"); break; case 0x70: - strcat(cpu_model, "/P54C"); + strcat(cpu_model, "/P54C"); break; case 0x80: - strcat(cpu_model, "/P55C (quarter-micron)"); + strcat(cpu_model, "/P55C (quarter-micron)"); break; default: - /* nothing */ + /* nothing */ break; } #if defined(I586_CPU) && !defined(NO_F00F_HACK) /* * XXX - If/when Intel fixes the bug, this * should also check the version of the * CPU, not just that it's a Pentium. */ has_f00f_bug = 1; #endif break; case 0x600: - /* Check the particular flavor of 686 */ - switch (cpu_id & 0xf0) { + /* Check the particular flavor of 686 */ + switch (cpu_id & 0xf0) { case 0x00: - strcat(cpu_model, "Pentium Pro A-step"); + strcat(cpu_model, "Pentium Pro A-step"); break; case 0x10: - strcat(cpu_model, "Pentium Pro"); + strcat(cpu_model, "Pentium Pro"); break; case 0x30: case 0x50: case 0x60: - strcat(cpu_model, + strcat(cpu_model, "Pentium II/Pentium II Xeon/Celeron"); cpu = CPU_PII; break; case 0x70: case 0x80: case 0xa0: case 0xb0: - strcat(cpu_model, + strcat(cpu_model, "Pentium III/Pentium III Xeon/Celeron"); cpu = CPU_PIII; break; default: - strcat(cpu_model, "Unknown 80686"); + strcat(cpu_model, "Unknown 80686"); break; } break; case 0xf00: strcat(cpu_model, "Pentium 4"); cpu = CPU_P4; break; default: strcat(cpu_model, "unknown"); break; } /* * If we didn't get a brand name from the extended * CPUID, try to look it up in the brand table. */ if (cpu_high > 0 && *cpu_brand == '\0') { brand_index = cpu_procinfo & CPUID_BRAND_INDEX; if (brand_index <= MAX_BRAND_INDEX && cpu_brandtable[brand_index] != NULL) strcpy(cpu_brand, cpu_brandtable[brand_index]); } } #else /* Please make up your mind folks! */ strcat(cpu_model, "EM64T"); #endif break; case CPU_VENDOR_AMD: /* * Values taken from AMD Processor Recognition * http://www.amd.com/K6/k6docs/pdf/20734g.pdf * (also describes ``Features'' encodings. */ strcpy(cpu_model, "AMD "); #ifdef __i386__ switch (cpu_id & 0xFF0) { case 0x410: strcat(cpu_model, "Standard Am486DX"); break; case 0x430: strcat(cpu_model, "Enhanced Am486DX2 Write-Through"); break; case 0x470: strcat(cpu_model, "Enhanced Am486DX2 Write-Back"); break; case 0x480: strcat(cpu_model, "Enhanced Am486DX4/Am5x86 Write-Through"); break; case 0x490: strcat(cpu_model, "Enhanced Am486DX4/Am5x86 Write-Back"); break; case 0x4E0: strcat(cpu_model, "Am5x86 Write-Through"); break; case 0x4F0: strcat(cpu_model, "Am5x86 Write-Back"); break; case 0x500: strcat(cpu_model, "K5 model 0"); break; case 0x510: strcat(cpu_model, "K5 model 1"); break; case 0x520: strcat(cpu_model, "K5 PR166 (model 2)"); break; case 0x530: strcat(cpu_model, "K5 PR200 (model 3)"); break; case 0x560: strcat(cpu_model, "K6"); break; case 0x570: strcat(cpu_model, "K6 266 (model 1)"); break; case 0x580: strcat(cpu_model, "K6-2"); break; case 0x590: strcat(cpu_model, "K6-III"); break; case 0x5a0: strcat(cpu_model, "Geode LX"); break; default: strcat(cpu_model, "Unknown"); break; } #else if ((cpu_id & 0xf00) == 0xf00) strcat(cpu_model, "AMD64 Processor"); else strcat(cpu_model, "Unknown"); #endif break; #ifdef __i386__ case CPU_VENDOR_CYRIX: strcpy(cpu_model, "Cyrix "); switch (cpu_id & 0xff0) { case 0x440: strcat(cpu_model, "MediaGX"); break; case 0x520: strcat(cpu_model, "6x86"); break; case 0x540: cpu_class = CPUCLASS_586; strcat(cpu_model, "GXm"); break; case 0x600: strcat(cpu_model, "6x86MX"); break; default: /* * Even though CPU supports the cpuid * instruction, it can be disabled. * Therefore, this routine supports all Cyrix * CPUs. */ switch (cyrix_did & 0xf0) { case 0x00: switch (cyrix_did & 0x0f) { case 0x00: strcat(cpu_model, "486SLC"); break; case 0x01: strcat(cpu_model, "486DLC"); break; case 0x02: strcat(cpu_model, "486SLC2"); break; case 0x03: strcat(cpu_model, "486DLC2"); break; case 0x04: strcat(cpu_model, "486SRx"); break; case 0x05: strcat(cpu_model, "486DRx"); break; case 0x06: strcat(cpu_model, "486SRx2"); break; case 0x07: strcat(cpu_model, "486DRx2"); break; case 0x08: strcat(cpu_model, "486SRu"); break; case 0x09: strcat(cpu_model, "486DRu"); break; case 0x0a: strcat(cpu_model, "486SRu2"); break; case 0x0b: strcat(cpu_model, "486DRu2"); break; default: strcat(cpu_model, "Unknown"); break; } break; case 0x10: switch (cyrix_did & 0x0f) { case 0x00: strcat(cpu_model, "486S"); break; case 0x01: strcat(cpu_model, "486S2"); break; case 0x02: strcat(cpu_model, "486Se"); break; case 0x03: strcat(cpu_model, "486S2e"); break; case 0x0a: strcat(cpu_model, "486DX"); break; case 0x0b: strcat(cpu_model, "486DX2"); break; case 0x0f: strcat(cpu_model, "486DX4"); break; default: strcat(cpu_model, "Unknown"); break; } break; case 0x20: if ((cyrix_did & 0x0f) < 8) strcat(cpu_model, "6x86"); /* Where did you get it? */ else strcat(cpu_model, "5x86"); break; case 0x30: strcat(cpu_model, "6x86"); break; case 0x40: if ((cyrix_did & 0xf000) == 0x3000) { cpu_class = CPUCLASS_586; strcat(cpu_model, "GXm"); } else strcat(cpu_model, "MediaGX"); break; case 0x50: strcat(cpu_model, "6x86MX"); break; case 0xf0: switch (cyrix_did & 0x0f) { case 0x0d: strcat(cpu_model, "Overdrive CPU"); break; case 0x0e: strcpy(cpu_model, "Texas Instruments 486SXL"); break; case 0x0f: strcat(cpu_model, "486SLC/DLC"); break; default: strcat(cpu_model, "Unknown"); break; } break; default: strcat(cpu_model, "Unknown"); break; } break; } break; case CPU_VENDOR_RISE: strcpy(cpu_model, "Rise "); switch (cpu_id & 0xff0) { case 0x500: /* 6401 and 6441 (Kirin) */ case 0x520: /* 6510 (Lynx) */ strcat(cpu_model, "mP6"); break; default: strcat(cpu_model, "Unknown"); } break; #endif case CPU_VENDOR_CENTAUR: #ifdef __i386__ switch (cpu_id & 0xff0) { case 0x540: strcpy(cpu_model, "IDT WinChip C6"); break; case 0x580: strcpy(cpu_model, "IDT WinChip 2"); break; case 0x590: strcpy(cpu_model, "IDT WinChip 3"); break; case 0x660: strcpy(cpu_model, "VIA C3 Samuel"); break; case 0x670: if (cpu_id & 0x8) strcpy(cpu_model, "VIA C3 Ezra"); else strcpy(cpu_model, "VIA C3 Samuel 2"); break; case 0x680: strcpy(cpu_model, "VIA C3 Ezra-T"); break; case 0x690: strcpy(cpu_model, "VIA C3 Nehemiah"); break; case 0x6a0: case 0x6d0: strcpy(cpu_model, "VIA C7 Esther"); break; case 0x6f0: strcpy(cpu_model, "VIA Nano"); break; default: strcpy(cpu_model, "VIA/IDT Unknown"); } #else strcpy(cpu_model, "VIA "); if ((cpu_id & 0xff0) == 0x6f0) strcat(cpu_model, "Nano Processor"); else strcat(cpu_model, "Unknown"); #endif break; #ifdef __i386__ case CPU_VENDOR_IBM: strcpy(cpu_model, "Blue Lightning CPU"); break; case CPU_VENDOR_NSC: switch (cpu_id & 0xff0) { case 0x540: strcpy(cpu_model, "Geode SC1100"); cpu = CPU_GEODE1100; break; default: strcpy(cpu_model, "Geode/NSC unknown"); break; } break; #endif case CPU_VENDOR_HYGON: strcpy(cpu_model, "Hygon "); #ifdef __i386__ strcat(cpu_model, "Unknown"); #else if ((cpu_id & 0xf00) == 0xf00) strcat(cpu_model, "AMD64 Processor"); else strcat(cpu_model, "Unknown"); #endif break; default: strcat(cpu_model, "Unknown"); break; } /* * Replace cpu_model with cpu_brand minus leading spaces if * we have one. */ brand = cpu_brand; while (*brand == ' ') ++brand; if (*brand != '\0') strcpy(cpu_model, brand); printf("%s (", cpu_model); if (tsc_freq != 0) { hw_clockrate = (tsc_freq + 5000) / 1000000; printf("%jd.%02d-MHz ", (intmax_t)(tsc_freq + 4999) / 1000000, (u_int)((tsc_freq + 4999) / 10000) % 100); } #ifdef __i386__ switch(cpu_class) { case CPUCLASS_286: printf("286"); break; case CPUCLASS_386: printf("386"); break; #if defined(I486_CPU) case CPUCLASS_486: printf("486"); break; #endif #if defined(I586_CPU) case CPUCLASS_586: printf("586"); break; #endif #if defined(I686_CPU) case CPUCLASS_686: printf("686"); break; #endif default: printf("Unknown"); /* will panic below... */ } #else printf("K8"); #endif printf("-class CPU)\n"); if (*cpu_vendor) printf(" Origin=\"%s\"", cpu_vendor); if (cpu_id) printf(" Id=0x%x", cpu_id); if (cpu_vendor_id == CPU_VENDOR_INTEL || cpu_vendor_id == CPU_VENDOR_AMD || cpu_vendor_id == CPU_VENDOR_HYGON || cpu_vendor_id == CPU_VENDOR_CENTAUR || #ifdef __i386__ cpu_vendor_id == CPU_VENDOR_TRANSMETA || cpu_vendor_id == CPU_VENDOR_RISE || cpu_vendor_id == CPU_VENDOR_NSC || (cpu_vendor_id == CPU_VENDOR_CYRIX && ((cpu_id & 0xf00) > 0x500)) || #endif 0) { printf(" Family=0x%x", CPUID_TO_FAMILY(cpu_id)); printf(" Model=0x%x", CPUID_TO_MODEL(cpu_id)); printf(" Stepping=%u", cpu_id & CPUID_STEPPING); #ifdef __i386__ if (cpu_vendor_id == CPU_VENDOR_CYRIX) printf("\n DIR=0x%04x", cyrix_did); #endif /* * AMD CPUID Specification * http://support.amd.com/us/Embedded_TechDocs/25481.pdf * * Intel Processor Identification and CPUID Instruction * http://www.intel.com/assets/pdf/appnote/241618.pdf */ if (cpu_high > 0) { /* * Here we should probably set up flags indicating * whether or not various features are available. * The interesting ones are probably VME, PSE, PAE, * and PGE. The code already assumes without bothering * to check that all CPUs >= Pentium have a TSC and * MSRs. */ printf("\n Features=0x%b", cpu_feature, "\020" "\001FPU" /* Integral FPU */ "\002VME" /* Extended VM86 mode support */ "\003DE" /* Debugging Extensions (CR4.DE) */ "\004PSE" /* 4MByte page tables */ "\005TSC" /* Timestamp counter */ "\006MSR" /* Machine specific registers */ "\007PAE" /* Physical address extension */ "\010MCE" /* Machine Check support */ "\011CX8" /* CMPEXCH8 instruction */ "\012APIC" /* SMP local APIC */ "\013oldMTRR" /* Previous implementation of MTRR */ "\014SEP" /* Fast System Call */ "\015MTRR" /* Memory Type Range Registers */ "\016PGE" /* PG_G (global bit) support */ "\017MCA" /* Machine Check Architecture */ "\020CMOV" /* CMOV instruction */ "\021PAT" /* Page attributes table */ "\022PSE36" /* 36 bit address space support */ "\023PN" /* Processor Serial number */ "\024CLFLUSH" /* Has the CLFLUSH instruction */ "\025" "\026DTS" /* Debug Trace Store */ "\027ACPI" /* ACPI support */ "\030MMX" /* MMX instructions */ "\031FXSR" /* FXSAVE/FXRSTOR */ "\032SSE" /* Streaming SIMD Extensions */ "\033SSE2" /* Streaming SIMD Extensions #2 */ "\034SS" /* Self snoop */ "\035HTT" /* Hyperthreading (see EBX bit 16-23) */ "\036TM" /* Thermal Monitor clock slowdown */ "\037IA64" /* CPU can execute IA64 instructions */ "\040PBE" /* Pending Break Enable */ ); if (cpu_feature2 != 0) { printf("\n Features2=0x%b", cpu_feature2, "\020" "\001SSE3" /* SSE3 */ "\002PCLMULQDQ" /* Carry-Less Mul Quadword */ "\003DTES64" /* 64-bit Debug Trace */ "\004MON" /* MONITOR/MWAIT Instructions */ "\005DS_CPL" /* CPL Qualified Debug Store */ "\006VMX" /* Virtual Machine Extensions */ "\007SMX" /* Safer Mode Extensions */ "\010EST" /* Enhanced SpeedStep */ "\011TM2" /* Thermal Monitor 2 */ "\012SSSE3" /* SSSE3 */ "\013CNXT-ID" /* L1 context ID available */ "\014SDBG" /* IA32 silicon debug */ "\015FMA" /* Fused Multiply Add */ "\016CX16" /* CMPXCHG16B Instruction */ "\017xTPR" /* Send Task Priority Messages*/ "\020PDCM" /* Perf/Debug Capability MSR */ "\021" "\022PCID" /* Process-context Identifiers*/ "\023DCA" /* Direct Cache Access */ "\024SSE4.1" /* SSE 4.1 */ "\025SSE4.2" /* SSE 4.2 */ "\026x2APIC" /* xAPIC Extensions */ "\027MOVBE" /* MOVBE Instruction */ "\030POPCNT" /* POPCNT Instruction */ "\031TSCDLT" /* TSC-Deadline Timer */ "\032AESNI" /* AES Crypto */ "\033XSAVE" /* XSAVE/XRSTOR States */ "\034OSXSAVE" /* OS-Enabled State Management*/ "\035AVX" /* Advanced Vector Extensions */ "\036F16C" /* Half-precision conversions */ "\037RDRAND" /* RDRAND Instruction */ "\040HV" /* Hypervisor */ ); } if (amd_feature != 0) { printf("\n AMD Features=0x%b", amd_feature, "\020" /* in hex */ "\001" /* Same */ "\002" /* Same */ "\003" /* Same */ "\004" /* Same */ "\005" /* Same */ "\006" /* Same */ "\007" /* Same */ "\010" /* Same */ "\011" /* Same */ "\012" /* Same */ "\013" /* Undefined */ "\014SYSCALL" /* Have SYSCALL/SYSRET */ "\015" /* Same */ "\016" /* Same */ "\017" /* Same */ "\020" /* Same */ "\021" /* Same */ "\022" /* Same */ "\023" /* Reserved, unknown */ "\024MP" /* Multiprocessor Capable */ "\025NX" /* Has EFER.NXE, NX */ "\026" /* Undefined */ "\027MMX+" /* AMD MMX Extensions */ "\030" /* Same */ "\031" /* Same */ "\032FFXSR" /* Fast FXSAVE/FXRSTOR */ "\033Page1GB" /* 1-GB large page support */ "\034RDTSCP" /* RDTSCP */ "\035" /* Undefined */ "\036LM" /* 64 bit long mode */ "\0373DNow!+" /* AMD 3DNow! Extensions */ "\0403DNow!" /* AMD 3DNow! */ ); } if (amd_feature2 != 0) { printf("\n AMD Features2=0x%b", amd_feature2, "\020" "\001LAHF" /* LAHF/SAHF in long mode */ "\002CMP" /* CMP legacy */ "\003SVM" /* Secure Virtual Mode */ "\004ExtAPIC" /* Extended APIC register */ "\005CR8" /* CR8 in legacy mode */ "\006ABM" /* LZCNT instruction */ "\007SSE4A" /* SSE4A */ "\010MAS" /* Misaligned SSE mode */ "\011Prefetch" /* 3DNow! Prefetch/PrefetchW */ "\012OSVW" /* OS visible workaround */ "\013IBS" /* Instruction based sampling */ "\014XOP" /* XOP extended instructions */ "\015SKINIT" /* SKINIT/STGI */ "\016WDT" /* Watchdog timer */ "\017" "\020LWP" /* Lightweight Profiling */ "\021FMA4" /* 4-operand FMA instructions */ "\022TCE" /* Translation Cache Extension */ "\023" "\024NodeId" /* NodeId MSR support */ "\025" "\026TBM" /* Trailing Bit Manipulation */ "\027Topology" /* Topology Extensions */ "\030PCXC" /* Core perf count */ "\031PNXC" /* NB perf count */ "\032" "\033DBE" /* Data Breakpoint extension */ "\034PTSC" /* Performance TSC */ "\035PL2I" /* L2I perf count */ "\036MWAITX" /* MONITORX/MWAITX instructions */ "\037ADMSKX" /* Address mask extension */ "\040" ); } if (cpu_stdext_feature != 0) { printf("\n Structured Extended Features=0x%b", cpu_stdext_feature, "\020" /* RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */ "\001FSGSBASE" "\002TSCADJ" "\003SGX" /* Bit Manipulation Instructions */ "\004BMI1" /* Hardware Lock Elision */ "\005HLE" /* Advanced Vector Instructions 2 */ "\006AVX2" /* FDP_EXCPTN_ONLY */ "\007FDPEXC" /* Supervisor Mode Execution Prot. */ "\010SMEP" /* Bit Manipulation Instructions */ "\011BMI2" "\012ERMS" /* Invalidate Processor Context ID */ "\013INVPCID" /* Restricted Transactional Memory */ "\014RTM" "\015PQM" "\016NFPUSG" /* Intel Memory Protection Extensions */ "\017MPX" "\020PQE" /* AVX512 Foundation */ "\021AVX512F" "\022AVX512DQ" /* Enhanced NRBG */ "\023RDSEED" /* ADCX + ADOX */ "\024ADX" /* Supervisor Mode Access Prevention */ "\025SMAP" "\026AVX512IFMA" /* Formerly PCOMMIT */ "\027" "\030CLFLUSHOPT" "\031CLWB" "\032PROCTRACE" "\033AVX512PF" "\034AVX512ER" "\035AVX512CD" "\036SHA" "\037AVX512BW" "\040AVX512VL" ); } if (cpu_stdext_feature2 != 0) { printf("\n Structured Extended Features2=0x%b", cpu_stdext_feature2, "\020" "\001PREFETCHWT1" "\002AVX512VBMI" "\003UMIP" "\004PKU" "\005OSPKE" "\006WAITPKG" "\007AVX512VBMI2" "\011GFNI" "\012VAES" "\013VPCLMULQDQ" "\014AVX512VNNI" "\015AVX512BITALG" "\016AVX512VPOPCNTDQ" "\027RDPID" "\032CLDEMOTE" "\034MOVDIRI" "\035MOVDIR64B" "\036ENQCMD" "\037SGXLC" ); } if (cpu_stdext_feature3 != 0) { printf("\n Structured Extended Features3=0x%b", cpu_stdext_feature3, "\020" "\003AVX512_4VNNIW" "\004AVX512_4FMAPS" "\005FSRM" "\011AVX512VP2INTERSECT" "\013MD_CLEAR" "\016TSXFA" "\023PCONFIG" "\025IBT" "\033IBPB" "\034STIBP" "\035L1DFL" "\036ARCH_CAP" "\037CORE_CAP" "\040SSBD" ); } if ((cpu_feature2 & CPUID2_XSAVE) != 0) { cpuid_count(0xd, 0x1, regs); if (regs[0] != 0) { printf("\n XSAVE Features=0x%b", regs[0], "\020" "\001XSAVEOPT" "\002XSAVEC" "\003XINUSE" "\004XSAVES"); } } if (cpu_ia32_arch_caps != 0) { printf("\n IA32_ARCH_CAPS=0x%b", (u_int)cpu_ia32_arch_caps, "\020" "\001RDCL_NO" "\002IBRS_ALL" "\003RSBA" "\004SKIP_L1DFL_VME" "\005SSB_NO" "\006MDS_NO" "\010TSX_CTRL" "\011TAA_NO" ); } if (amd_extended_feature_extensions != 0) { u_int amd_fe_masked; amd_fe_masked = amd_extended_feature_extensions; if ((amd_fe_masked & AMDFEID_IBRS) == 0) amd_fe_masked &= ~(AMDFEID_IBRS_ALWAYSON | AMDFEID_PREFER_IBRS); if ((amd_fe_masked & AMDFEID_STIBP) == 0) amd_fe_masked &= ~AMDFEID_STIBP_ALWAYSON; printf("\n " "AMD Extended Feature Extensions ID EBX=" "0x%b", amd_fe_masked, "\020" "\001CLZERO" "\002IRPerf" "\003XSaveErPtr" "\005RDPRU" "\011MCOMMIT" "\012WBNOINVD" "\015IBPB" "\017IBRS" "\020STIBP" "\021IBRS_ALWAYSON" "\022STIBP_ALWAYSON" "\023PREFER_IBRS" "\031SSBD" "\032VIRT_SSBD" "\033SSB_NO" ); } if (via_feature_rng != 0 || via_feature_xcrypt != 0) print_via_padlock_info(); if (cpu_feature2 & CPUID2_VMX) print_vmx_info(); if (amd_feature2 & AMDID2_SVM) print_svm_info(); if ((cpu_feature & CPUID_HTT) && (cpu_vendor_id == CPU_VENDOR_AMD || cpu_vendor_id == CPU_VENDOR_HYGON)) cpu_feature &= ~CPUID_HTT; /* * If this CPU supports P-state invariant TSC then * mention the capability. */ if (tsc_is_invariant) { printf("\n TSC: P-state invariant"); if (tsc_perf_stat) printf(", performance statistics"); } } #ifdef __i386__ } else if (cpu_vendor_id == CPU_VENDOR_CYRIX) { printf(" DIR=0x%04x", cyrix_did); printf(" Stepping=%u", (cyrix_did & 0xf000) >> 12); printf(" Revision=%u", (cyrix_did & 0x0f00) >> 8); #ifndef CYRIX_CACHE_REALLY_WORKS if (cpu == CPU_M1 && (cyrix_did & 0xff00) < 0x1700) printf("\n CPU cache: write-through mode"); #endif #endif } /* Avoid ugly blank lines: only print newline when we have to. */ if (*cpu_vendor || cpu_id) printf("\n"); if (bootverbose) { if (cpu_vendor_id == CPU_VENDOR_AMD || cpu_vendor_id == CPU_VENDOR_HYGON) print_AMD_info(); else if (cpu_vendor_id == CPU_VENDOR_INTEL) print_INTEL_info(); #ifdef __i386__ else if (cpu_vendor_id == CPU_VENDOR_TRANSMETA) print_transmeta_info(); #endif } print_hypervisor_info(); } #ifdef __i386__ void panicifcpuunsupported(void) { #if !defined(lint) #if !defined(I486_CPU) && !defined(I586_CPU) && !defined(I686_CPU) #error This kernel is not configured for one of the supported CPUs #endif #else /* lint */ #endif /* lint */ /* * Now that we have told the user what they have, * let them know if that machine type isn't configured. */ switch (cpu_class) { case CPUCLASS_286: /* a 286 should not make it this far, anyway */ case CPUCLASS_386: #if !defined(I486_CPU) case CPUCLASS_486: #endif #if !defined(I586_CPU) case CPUCLASS_586: #endif #if !defined(I686_CPU) case CPUCLASS_686: #endif panic("CPU class not configured"); default: break; } } static volatile u_int trap_by_rdmsr; /* * Special exception 6 handler. * The rdmsr instruction generates invalid opcodes fault on 486-class * Cyrix CPU. Stacked eip register points the rdmsr instruction in the * function identblue() when this handler is called. Stacked eip should * be advanced. */ inthand_t bluetrap6; #ifdef __GNUCLIKE_ASM __asm (" \n\ .text \n\ .p2align 2,0x90 \n\ .type " __XSTRING(CNAME(bluetrap6)) ",@function \n\ " __XSTRING(CNAME(bluetrap6)) ": \n\ ss \n\ movl $0xa8c1d," __XSTRING(CNAME(trap_by_rdmsr)) " \n\ addl $2, (%esp) /* rdmsr is a 2-byte instruction */ \n\ iret \n\ "); #endif /* * Special exception 13 handler. * Accessing non-existent MSR generates general protection fault. */ inthand_t bluetrap13; #ifdef __GNUCLIKE_ASM __asm (" \n\ .text \n\ .p2align 2,0x90 \n\ .type " __XSTRING(CNAME(bluetrap13)) ",@function \n\ " __XSTRING(CNAME(bluetrap13)) ": \n\ ss \n\ movl $0xa89c4," __XSTRING(CNAME(trap_by_rdmsr)) " \n\ popl %eax /* discard error code */ \n\ addl $2, (%esp) /* rdmsr is a 2-byte instruction */ \n\ iret \n\ "); #endif /* * Distinguish IBM Blue Lightning CPU from Cyrix CPUs that does not * support cpuid instruction. This function should be called after * loading interrupt descriptor table register. * * I don't like this method that handles fault, but I couldn't get * information for any other methods. Does blue giant know? */ static int identblue(void) { trap_by_rdmsr = 0; /* * Cyrix 486-class CPU does not support rdmsr instruction. * The rdmsr instruction generates invalid opcode fault, and exception * will be trapped by bluetrap6() on Cyrix 486-class CPU. The * bluetrap6() set the magic number to trap_by_rdmsr. */ setidt(IDT_UD, bluetrap6, SDT_SYS386TGT, SEL_KPL, GSEL(GCODE_SEL, SEL_KPL)); /* * Certain BIOS disables cpuid instruction of Cyrix 6x86MX CPU. * In this case, rdmsr generates general protection fault, and * exception will be trapped by bluetrap13(). */ setidt(IDT_GP, bluetrap13, SDT_SYS386TGT, SEL_KPL, GSEL(GCODE_SEL, SEL_KPL)); rdmsr(0x1002); /* Cyrix CPU generates fault. */ if (trap_by_rdmsr == 0xa8c1d) return IDENTBLUE_CYRIX486; else if (trap_by_rdmsr == 0xa89c4) return IDENTBLUE_CYRIXM2; return IDENTBLUE_IBMCPU; } /* * identifycyrix() set lower 16 bits of cyrix_did as follows: * * F E D C B A 9 8 7 6 5 4 3 2 1 0 * +-------+-------+---------------+ * | SID | RID | Device ID | * | (DIR 1) | (DIR 0) | * +-------+-------+---------------+ */ static void identifycyrix(void) { register_t saveintr; int ccr2_test = 0, dir_test = 0; u_char ccr2, ccr3; saveintr = intr_disable(); ccr2 = read_cyrix_reg(CCR2); write_cyrix_reg(CCR2, ccr2 ^ CCR2_LOCK_NW); read_cyrix_reg(CCR2); if (read_cyrix_reg(CCR2) != ccr2) ccr2_test = 1; write_cyrix_reg(CCR2, ccr2); ccr3 = read_cyrix_reg(CCR3); write_cyrix_reg(CCR3, ccr3 ^ CCR3_MAPEN3); read_cyrix_reg(CCR3); if (read_cyrix_reg(CCR3) != ccr3) dir_test = 1; /* CPU supports DIRs. */ write_cyrix_reg(CCR3, ccr3); if (dir_test) { /* Device ID registers are available. */ cyrix_did = read_cyrix_reg(DIR1) << 8; cyrix_did += read_cyrix_reg(DIR0); } else if (ccr2_test) cyrix_did = 0x0010; /* 486S A-step */ else cyrix_did = 0x00ff; /* Old 486SLC/DLC and TI486SXLC/SXL */ intr_restore(saveintr); } #endif /* Update TSC freq with the value indicated by the caller. */ static void tsc_freq_changed(void *arg __unused, const struct cf_level *level, int status) { /* If there was an error during the transition, don't do anything. */ if (status != 0) return; /* Total setting for this level gives the new frequency in MHz. */ hw_clockrate = level->total_set.freq; } static void hook_tsc_freq(void *arg __unused) { if (tsc_is_invariant) return; tsc_post_tag = EVENTHANDLER_REGISTER(cpufreq_post_change, tsc_freq_changed, NULL, EVENTHANDLER_PRI_ANY); } SYSINIT(hook_tsc_freq, SI_SUB_CONFIGURE, SI_ORDER_ANY, hook_tsc_freq, NULL); static const struct { const char * vm_bname; int vm_guest; } vm_bnames[] = { { "QEMU", VM_GUEST_VM }, /* QEMU */ { "Plex86", VM_GUEST_VM }, /* Plex86 */ { "Bochs", VM_GUEST_VM }, /* Bochs */ { "Xen", VM_GUEST_XEN }, /* Xen */ { "BHYVE", VM_GUEST_BHYVE }, /* bhyve */ { "Seabios", VM_GUEST_KVM }, /* KVM */ }; static const struct { const char * vm_pname; int vm_guest; } vm_pnames[] = { { "VMware Virtual Platform", VM_GUEST_VMWARE }, { "Virtual Machine", VM_GUEST_VM }, /* Microsoft VirtualPC */ { "VirtualBox", VM_GUEST_VBOX }, { "Parallels Virtual Platform", VM_GUEST_PARALLELS }, { "KVM", VM_GUEST_KVM }, }; static struct { const char *vm_cpuid; int vm_guest; } vm_cpuids[] = { { "XENXENXEN", VM_GUEST_XEN }, /* XEN */ { "Microsoft Hv", VM_GUEST_HV }, /* Microsoft Hyper-V */ { "VMwareVMware", VM_GUEST_VMWARE }, /* VMware VM */ { "KVMKVMKVM", VM_GUEST_KVM }, /* KVM */ { "bhyve bhyve ", VM_GUEST_BHYVE }, /* bhyve */ { "VBoxVBoxVBox", VM_GUEST_VBOX }, /* VirtualBox */ }; static void identify_hypervisor_cpuid_base(void) { u_int leaf, regs[4]; int i; /* * [RFC] CPUID usage for interaction between Hypervisors and Linux. * http://lkml.org/lkml/2008/10/1/246 * * KB1009458: Mechanisms to determine if software is running in * a VMware virtual machine * http://kb.vmware.com/kb/1009458 * * Search for a hypervisor that we recognize. If we cannot find * a specific hypervisor, return the first information about the * hypervisor that we found, as others may be able to use. */ for (leaf = 0x40000000; leaf < 0x40010000; leaf += 0x100) { do_cpuid(leaf, regs); /* * KVM from Linux kernels prior to commit * 57c22e5f35aa4b9b2fe11f73f3e62bbf9ef36190 set %eax * to 0 rather than a valid hv_high value. Check for * the KVM signature bytes and fixup %eax to the * highest supported leaf in that case. */ if (regs[0] == 0 && regs[1] == 0x4b4d564b && regs[2] == 0x564b4d56 && regs[3] == 0x0000004d) regs[0] = leaf + 1; - + if (regs[0] >= leaf) { for (i = 0; i < nitems(vm_cpuids); i++) if (strncmp((const char *)®s[1], vm_cpuids[i].vm_cpuid, 12) == 0) { vm_guest = vm_cpuids[i].vm_guest; break; } /* * If this is the first entry or we found a * specific hypervisor, record the base, high value, * and vendor identifier. */ if (vm_guest != VM_GUEST_VM || leaf == 0x40000000) { hv_base = leaf; hv_high = regs[0]; ((u_int *)&hv_vendor)[0] = regs[1]; ((u_int *)&hv_vendor)[1] = regs[2]; ((u_int *)&hv_vendor)[2] = regs[3]; hv_vendor[12] = '\0'; /* * If we found a specific hypervisor, then * we are finished. */ if (vm_guest != VM_GUEST_VM) return; } } } } void identify_hypervisor(void) { u_int regs[4]; char *p; int i; /* * If CPUID2_HV is set, we are running in a hypervisor environment. */ if (cpu_feature2 & CPUID2_HV) { vm_guest = VM_GUEST_VM; identify_hypervisor_cpuid_base(); /* If we have a definitive vendor, we can return now. */ if (*hv_vendor != '\0') return; } /* * Examine SMBIOS strings for older hypervisors. */ p = kern_getenv("smbios.system.serial"); if (p != NULL) { if (strncmp(p, "VMware-", 7) == 0 || strncmp(p, "VMW", 3) == 0) { vmware_hvcall(VMW_HVCMD_GETVERSION, regs); if (regs[1] == VMW_HVMAGIC) { - vm_guest = VM_GUEST_VMWARE; + vm_guest = VM_GUEST_VMWARE; freeenv(p); return; } } freeenv(p); } /* * XXX: Some of these entries may not be needed since they were * added to FreeBSD before the checks above. */ p = kern_getenv("smbios.bios.vendor"); if (p != NULL) { for (i = 0; i < nitems(vm_bnames); i++) if (strcmp(p, vm_bnames[i].vm_bname) == 0) { vm_guest = vm_bnames[i].vm_guest; /* If we have a specific match, return */ if (vm_guest != VM_GUEST_VM) { freeenv(p); return; } /* * We are done with bnames, but there might be * a more specific match in the pnames */ break; } freeenv(p); } p = kern_getenv("smbios.system.product"); if (p != NULL) { for (i = 0; i < nitems(vm_pnames); i++) if (strcmp(p, vm_pnames[i].vm_pname) == 0) { vm_guest = vm_pnames[i].vm_guest; freeenv(p); return; } freeenv(p); } } bool fix_cpuid(void) { uint64_t msr; /* * Clear "Limit CPUID Maxval" bit and return true if the caller should * get the largest standard CPUID function number again if it is set * from BIOS. It is necessary for probing correct CPU topology later * and for the correct operation of the AVX-aware userspace. */ if (cpu_vendor_id == CPU_VENDOR_INTEL && ((CPUID_TO_FAMILY(cpu_id) == 0xf && CPUID_TO_MODEL(cpu_id) >= 0x3) || (CPUID_TO_FAMILY(cpu_id) == 0x6 && CPUID_TO_MODEL(cpu_id) >= 0xe))) { msr = rdmsr(MSR_IA32_MISC_ENABLE); if ((msr & IA32_MISC_EN_LIMCPUID) != 0) { msr &= ~IA32_MISC_EN_LIMCPUID; wrmsr(MSR_IA32_MISC_ENABLE, msr); return (true); } } /* * Re-enable AMD Topology Extension that could be disabled by BIOS * on some notebook processors. Without the extension it's really * hard to determine the correct CPU cache topology. * See BIOS and Kernel Developer’s Guide (BKDG) for AMD Family 15h * Models 60h-6Fh Processors, Publication # 50742. */ if (vm_guest == VM_GUEST_NO && cpu_vendor_id == CPU_VENDOR_AMD && CPUID_TO_FAMILY(cpu_id) == 0x15) { msr = rdmsr(MSR_EXTFEATURES); if ((msr & ((uint64_t)1 << 54)) == 0) { msr |= (uint64_t)1 << 54; wrmsr(MSR_EXTFEATURES, msr); return (true); } } return (false); } void identify_cpu1(void) { u_int regs[4]; do_cpuid(0, regs); cpu_high = regs[0]; ((u_int *)&cpu_vendor)[0] = regs[1]; ((u_int *)&cpu_vendor)[1] = regs[3]; ((u_int *)&cpu_vendor)[2] = regs[2]; cpu_vendor[12] = '\0'; do_cpuid(1, regs); cpu_id = regs[0]; cpu_procinfo = regs[1]; cpu_feature = regs[3]; cpu_feature2 = regs[2]; } void identify_cpu2(void) { u_int regs[4], cpu_stdext_disable; if (cpu_high >= 6) { cpuid_count(6, 0, regs); cpu_power_eax = regs[0]; cpu_power_ebx = regs[1]; cpu_power_ecx = regs[2]; cpu_power_edx = regs[3]; } if (cpu_high >= 7) { cpuid_count(7, 0, regs); cpu_stdext_feature = regs[1]; /* * Some hypervisors failed to filter out unsupported * extended features. Allow to disable the * extensions, activation of which requires setting a * bit in CR4, and which VM monitors do not support. */ cpu_stdext_disable = 0; TUNABLE_INT_FETCH("hw.cpu_stdext_disable", &cpu_stdext_disable); cpu_stdext_feature &= ~cpu_stdext_disable; cpu_stdext_feature2 = regs[2]; cpu_stdext_feature3 = regs[3]; if ((cpu_stdext_feature3 & CPUID_STDEXT3_ARCH_CAP) != 0) cpu_ia32_arch_caps = rdmsr(MSR_IA32_ARCH_CAP); } } void identify_cpu_fixup_bsp(void) { u_int regs[4]; cpu_vendor_id = find_cpu_vendor_id(); if (fix_cpuid()) { do_cpuid(0, regs); cpu_high = regs[0]; } } /* * Final stage of CPU identification. */ void finishidentcpu(void) { u_int regs[4]; #ifdef __i386__ u_char ccr3; #endif identify_cpu_fixup_bsp(); if (cpu_high >= 5 && (cpu_feature2 & CPUID2_MON) != 0) { do_cpuid(5, regs); cpu_mon_mwait_flags = regs[2]; cpu_mon_min_size = regs[0] & CPUID5_MON_MIN_SIZE; cpu_mon_max_size = regs[1] & CPUID5_MON_MAX_SIZE; } identify_cpu2(); #ifdef __i386__ if (cpu_high > 0 && (cpu_vendor_id == CPU_VENDOR_INTEL || cpu_vendor_id == CPU_VENDOR_AMD || cpu_vendor_id == CPU_VENDOR_HYGON || cpu_vendor_id == CPU_VENDOR_TRANSMETA || cpu_vendor_id == CPU_VENDOR_CENTAUR || cpu_vendor_id == CPU_VENDOR_NSC)) { do_cpuid(0x80000000, regs); if (regs[0] >= 0x80000000) cpu_exthigh = regs[0]; } #else if (cpu_vendor_id == CPU_VENDOR_INTEL || cpu_vendor_id == CPU_VENDOR_AMD || cpu_vendor_id == CPU_VENDOR_HYGON || cpu_vendor_id == CPU_VENDOR_CENTAUR) { do_cpuid(0x80000000, regs); cpu_exthigh = regs[0]; } #endif if (cpu_exthigh >= 0x80000001) { do_cpuid(0x80000001, regs); amd_feature = regs[3] & ~(cpu_feature & 0x0183f3ff); amd_feature2 = regs[2]; } if (cpu_exthigh >= 0x80000007) { do_cpuid(0x80000007, regs); amd_rascap = regs[1]; amd_pminfo = regs[3]; } if (cpu_exthigh >= 0x80000008) { do_cpuid(0x80000008, regs); cpu_maxphyaddr = regs[0] & 0xff; amd_extended_feature_extensions = regs[1]; cpu_procinfo2 = regs[2]; } else { cpu_maxphyaddr = (cpu_feature & CPUID_PAE) != 0 ? 36 : 32; } #ifdef __i386__ if (cpu_vendor_id == CPU_VENDOR_CYRIX) { if (cpu == CPU_486) { /* * These conditions are equivalent to: * - CPU does not support cpuid instruction. * - Cyrix/IBM CPU is detected. */ if (identblue() == IDENTBLUE_IBMCPU) { strcpy(cpu_vendor, "IBM"); cpu_vendor_id = CPU_VENDOR_IBM; cpu = CPU_BLUE; return; } } switch (cpu_id & 0xf00) { case 0x600: /* * Cyrix's datasheet does not describe DIRs. * Therefor, I assume it does not have them * and use the result of the cpuid instruction. * XXX they seem to have it for now at least. -Peter */ identifycyrix(); cpu = CPU_M2; break; default: identifycyrix(); /* * This routine contains a trick. * Don't check (cpu_id & 0x00f0) == 0x50 to detect M2, now. */ switch (cyrix_did & 0x00f0) { case 0x00: case 0xf0: cpu = CPU_486DLC; break; case 0x10: cpu = CPU_CY486DX; break; case 0x20: if ((cyrix_did & 0x000f) < 8) cpu = CPU_M1; else cpu = CPU_M1SC; break; case 0x30: cpu = CPU_M1; break; case 0x40: /* MediaGX CPU */ cpu = CPU_M1SC; break; default: /* M2 and later CPUs are treated as M2. */ cpu = CPU_M2; /* * enable cpuid instruction. */ ccr3 = read_cyrix_reg(CCR3); write_cyrix_reg(CCR3, CCR3_MAPEN0); write_cyrix_reg(CCR4, read_cyrix_reg(CCR4) | CCR4_CPUID); write_cyrix_reg(CCR3, ccr3); do_cpuid(0, regs); cpu_high = regs[0]; /* eax */ do_cpuid(1, regs); cpu_id = regs[0]; /* eax */ cpu_feature = regs[3]; /* edx */ break; } } } else if (cpu == CPU_486 && *cpu_vendor == '\0') { /* * There are BlueLightning CPUs that do not change * undefined flags by dividing 5 by 2. In this case, * the CPU identification routine in locore.s leaves * cpu_vendor null string and puts CPU_486 into the * cpu. */ if (identblue() == IDENTBLUE_IBMCPU) { strcpy(cpu_vendor, "IBM"); cpu_vendor_id = CPU_VENDOR_IBM; cpu = CPU_BLUE; return; } } #endif } int pti_get_default(void) { if (strcmp(cpu_vendor, AMD_VENDOR_ID) == 0 || strcmp(cpu_vendor, HYGON_VENDOR_ID) == 0) return (0); if ((cpu_ia32_arch_caps & IA32_ARCH_CAP_RDCL_NO) != 0) return (0); return (1); } static u_int find_cpu_vendor_id(void) { int i; for (i = 0; i < nitems(cpu_vendors); i++) if (strcmp(cpu_vendor, cpu_vendors[i].vendor) == 0) return (cpu_vendors[i].vendor_id); return (0); } static void print_AMD_assoc(int i) { if (i == 255) printf(", fully associative\n"); else printf(", %d-way associative\n", i); } static void print_AMD_l2_assoc(int i) { switch (i & 0x0f) { case 0: printf(", disabled/not present\n"); break; case 1: printf(", direct mapped\n"); break; case 2: printf(", 2-way associative\n"); break; case 4: printf(", 4-way associative\n"); break; case 6: printf(", 8-way associative\n"); break; case 8: printf(", 16-way associative\n"); break; case 15: printf(", fully associative\n"); break; default: printf(", reserved configuration\n"); break; } } static void print_AMD_info(void) { #ifdef __i386__ uint64_t amd_whcr; #endif u_int regs[4]; if (cpu_exthigh >= 0x80000005) { do_cpuid(0x80000005, regs); printf("L1 2MB data TLB: %d entries", (regs[0] >> 16) & 0xff); print_AMD_assoc(regs[0] >> 24); printf("L1 2MB instruction TLB: %d entries", regs[0] & 0xff); print_AMD_assoc((regs[0] >> 8) & 0xff); printf("L1 4KB data TLB: %d entries", (regs[1] >> 16) & 0xff); print_AMD_assoc(regs[1] >> 24); printf("L1 4KB instruction TLB: %d entries", regs[1] & 0xff); print_AMD_assoc((regs[1] >> 8) & 0xff); printf("L1 data cache: %d kbytes", regs[2] >> 24); printf(", %d bytes/line", regs[2] & 0xff); printf(", %d lines/tag", (regs[2] >> 8) & 0xff); print_AMD_assoc((regs[2] >> 16) & 0xff); printf("L1 instruction cache: %d kbytes", regs[3] >> 24); printf(", %d bytes/line", regs[3] & 0xff); printf(", %d lines/tag", (regs[3] >> 8) & 0xff); print_AMD_assoc((regs[3] >> 16) & 0xff); } if (cpu_exthigh >= 0x80000006) { do_cpuid(0x80000006, regs); if ((regs[0] >> 16) != 0) { printf("L2 2MB data TLB: %d entries", (regs[0] >> 16) & 0xfff); print_AMD_l2_assoc(regs[0] >> 28); printf("L2 2MB instruction TLB: %d entries", regs[0] & 0xfff); print_AMD_l2_assoc((regs[0] >> 28) & 0xf); } else { printf("L2 2MB unified TLB: %d entries", regs[0] & 0xfff); print_AMD_l2_assoc((regs[0] >> 28) & 0xf); } if ((regs[1] >> 16) != 0) { printf("L2 4KB data TLB: %d entries", (regs[1] >> 16) & 0xfff); print_AMD_l2_assoc(regs[1] >> 28); printf("L2 4KB instruction TLB: %d entries", (regs[1] >> 16) & 0xfff); print_AMD_l2_assoc((regs[1] >> 28) & 0xf); } else { printf("L2 4KB unified TLB: %d entries", (regs[1] >> 16) & 0xfff); print_AMD_l2_assoc((regs[1] >> 28) & 0xf); } printf("L2 unified cache: %d kbytes", regs[2] >> 16); printf(", %d bytes/line", regs[2] & 0xff); printf(", %d lines/tag", (regs[2] >> 8) & 0x0f); print_AMD_l2_assoc((regs[2] >> 12) & 0x0f); } #ifdef __i386__ if (((cpu_id & 0xf00) == 0x500) && (((cpu_id & 0x0f0) > 0x80) || (((cpu_id & 0x0f0) == 0x80) && (cpu_id & 0x00f) > 0x07))) { /* K6-2(new core [Stepping 8-F]), K6-III or later */ amd_whcr = rdmsr(0xc0000082); if (!(amd_whcr & (0x3ff << 22))) { printf("Write Allocate Disable\n"); } else { printf("Write Allocate Enable Limit: %dM bytes\n", (u_int32_t)((amd_whcr & (0x3ff << 22)) >> 22) * 4); printf("Write Allocate 15-16M bytes: %s\n", (amd_whcr & (1 << 16)) ? "Enable" : "Disable"); } } else if (((cpu_id & 0xf00) == 0x500) && ((cpu_id & 0x0f0) > 0x50)) { /* K6, K6-2(old core) */ amd_whcr = rdmsr(0xc0000082); if (!(amd_whcr & (0x7f << 1))) { printf("Write Allocate Disable\n"); } else { printf("Write Allocate Enable Limit: %dM bytes\n", (u_int32_t)((amd_whcr & (0x7f << 1)) >> 1) * 4); printf("Write Allocate 15-16M bytes: %s\n", (amd_whcr & 0x0001) ? "Enable" : "Disable"); printf("Hardware Write Allocate Control: %s\n", (amd_whcr & 0x0100) ? "Enable" : "Disable"); } } #endif /* * Opteron Rev E shows a bug as in very rare occasions a read memory * barrier is not performed as expected if it is followed by a * non-atomic read-modify-write instruction. * As long as that bug pops up very rarely (intensive machine usage * on other operating systems generally generates one unexplainable * crash any 2 months) and as long as a model specific fix would be * impractical at this stage, print out a warning string if the broken * model and family are identified. */ if (CPUID_TO_FAMILY(cpu_id) == 0xf && CPUID_TO_MODEL(cpu_id) >= 0x20 && CPUID_TO_MODEL(cpu_id) <= 0x3f) printf("WARNING: This architecture revision has known SMP " "hardware bugs which may cause random instability\n"); } static void print_INTEL_info(void) { u_int regs[4]; u_int rounds, regnum; u_int nwaycode, nway; if (cpu_high >= 2) { rounds = 0; do { do_cpuid(0x2, regs); if (rounds == 0 && (rounds = (regs[0] & 0xff)) == 0) break; /* we have a buggy CPU */ for (regnum = 0; regnum <= 3; ++regnum) { if (regs[regnum] & (1<<31)) continue; if (regnum != 0) print_INTEL_TLB(regs[regnum] & 0xff); print_INTEL_TLB((regs[regnum] >> 8) & 0xff); print_INTEL_TLB((regs[regnum] >> 16) & 0xff); print_INTEL_TLB((regs[regnum] >> 24) & 0xff); } } while (--rounds > 0); } if (cpu_exthigh >= 0x80000006) { do_cpuid(0x80000006, regs); nwaycode = (regs[2] >> 12) & 0x0f; if (nwaycode >= 0x02 && nwaycode <= 0x08) nway = 1 << (nwaycode / 2); else nway = 0; printf("L2 cache: %u kbytes, %u-way associative, %u bytes/line\n", (regs[2] >> 16) & 0xffff, nway, regs[2] & 0xff); } } static void print_INTEL_TLB(u_int data) { switch (data) { case 0x0: case 0x40: default: break; case 0x1: printf("Instruction TLB: 4 KB pages, 4-way set associative, 32 entries\n"); break; case 0x2: printf("Instruction TLB: 4 MB pages, fully associative, 2 entries\n"); break; case 0x3: printf("Data TLB: 4 KB pages, 4-way set associative, 64 entries\n"); break; case 0x4: printf("Data TLB: 4 MB Pages, 4-way set associative, 8 entries\n"); break; case 0x6: printf("1st-level instruction cache: 8 KB, 4-way set associative, 32 byte line size\n"); break; case 0x8: printf("1st-level instruction cache: 16 KB, 4-way set associative, 32 byte line size\n"); break; case 0x9: printf("1st-level instruction cache: 32 KB, 4-way set associative, 64 byte line size\n"); break; case 0xa: printf("1st-level data cache: 8 KB, 2-way set associative, 32 byte line size\n"); break; case 0xb: printf("Instruction TLB: 4 MByte pages, 4-way set associative, 4 entries\n"); break; case 0xc: printf("1st-level data cache: 16 KB, 4-way set associative, 32 byte line size\n"); break; case 0xd: printf("1st-level data cache: 16 KBytes, 4-way set associative, 64 byte line size"); break; case 0xe: printf("1st-level data cache: 24 KBytes, 6-way set associative, 64 byte line size\n"); break; case 0x1d: printf("2nd-level cache: 128 KBytes, 2-way set associative, 64 byte line size\n"); break; case 0x21: printf("2nd-level cache: 256 KBytes, 8-way set associative, 64 byte line size\n"); break; case 0x22: printf("3rd-level cache: 512 KB, 4-way set associative, sectored cache, 64 byte line size\n"); break; case 0x23: printf("3rd-level cache: 1 MB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x24: printf("2nd-level cache: 1 MBytes, 16-way set associative, 64 byte line size\n"); break; case 0x25: printf("3rd-level cache: 2 MB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x29: printf("3rd-level cache: 4 MB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x2c: printf("1st-level data cache: 32 KB, 8-way set associative, 64 byte line size\n"); break; case 0x30: printf("1st-level instruction cache: 32 KB, 8-way set associative, 64 byte line size\n"); break; case 0x39: /* De-listed in SDM rev. 54 */ printf("2nd-level cache: 128 KB, 4-way set associative, sectored cache, 64 byte line size\n"); break; case 0x3b: /* De-listed in SDM rev. 54 */ printf("2nd-level cache: 128 KB, 2-way set associative, sectored cache, 64 byte line size\n"); break; case 0x3c: /* De-listed in SDM rev. 54 */ printf("2nd-level cache: 256 KB, 4-way set associative, sectored cache, 64 byte line size\n"); break; case 0x41: printf("2nd-level cache: 128 KB, 4-way set associative, 32 byte line size\n"); break; case 0x42: printf("2nd-level cache: 256 KB, 4-way set associative, 32 byte line size\n"); break; case 0x43: printf("2nd-level cache: 512 KB, 4-way set associative, 32 byte line size\n"); break; case 0x44: printf("2nd-level cache: 1 MB, 4-way set associative, 32 byte line size\n"); break; case 0x45: printf("2nd-level cache: 2 MB, 4-way set associative, 32 byte line size\n"); break; case 0x46: printf("3rd-level cache: 4 MB, 4-way set associative, 64 byte line size\n"); break; case 0x47: printf("3rd-level cache: 8 MB, 8-way set associative, 64 byte line size\n"); break; case 0x48: printf("2nd-level cache: 3MByte, 12-way set associative, 64 byte line size\n"); break; case 0x49: if (CPUID_TO_FAMILY(cpu_id) == 0xf && CPUID_TO_MODEL(cpu_id) == 0x6) printf("3rd-level cache: 4MB, 16-way set associative, 64-byte line size\n"); else printf("2nd-level cache: 4 MByte, 16-way set associative, 64 byte line size"); break; case 0x4a: printf("3rd-level cache: 6MByte, 12-way set associative, 64 byte line size\n"); break; case 0x4b: printf("3rd-level cache: 8MByte, 16-way set associative, 64 byte line size\n"); break; case 0x4c: printf("3rd-level cache: 12MByte, 12-way set associative, 64 byte line size\n"); break; case 0x4d: printf("3rd-level cache: 16MByte, 16-way set associative, 64 byte line size\n"); break; case 0x4e: printf("2nd-level cache: 6MByte, 24-way set associative, 64 byte line size\n"); break; case 0x4f: printf("Instruction TLB: 4 KByte pages, 32 entries\n"); break; case 0x50: printf("Instruction TLB: 4 KB, 2 MB or 4 MB pages, fully associative, 64 entries\n"); break; case 0x51: printf("Instruction TLB: 4 KB, 2 MB or 4 MB pages, fully associative, 128 entries\n"); break; case 0x52: printf("Instruction TLB: 4 KB, 2 MB or 4 MB pages, fully associative, 256 entries\n"); break; case 0x55: printf("Instruction TLB: 2-MByte or 4-MByte pages, fully associative, 7 entries\n"); break; case 0x56: printf("Data TLB0: 4 MByte pages, 4-way set associative, 16 entries\n"); break; case 0x57: printf("Data TLB0: 4 KByte pages, 4-way associative, 16 entries\n"); break; case 0x59: printf("Data TLB0: 4 KByte pages, fully associative, 16 entries\n"); break; case 0x5a: printf("Data TLB0: 2-MByte or 4 MByte pages, 4-way set associative, 32 entries\n"); break; case 0x5b: printf("Data TLB: 4 KB or 4 MB pages, fully associative, 64 entries\n"); break; case 0x5c: printf("Data TLB: 4 KB or 4 MB pages, fully associative, 128 entries\n"); break; case 0x5d: printf("Data TLB: 4 KB or 4 MB pages, fully associative, 256 entries\n"); break; case 0x60: printf("1st-level data cache: 16 KB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x61: printf("Instruction TLB: 4 KByte pages, fully associative, 48 entries\n"); break; case 0x63: printf("Data TLB: 2 MByte or 4 MByte pages, 4-way set associative, 32 entries and a separate array with 1 GByte pages, 4-way set associative, 4 entries\n"); break; case 0x64: printf("Data TLB: 4 KBytes pages, 4-way set associative, 512 entries\n"); break; case 0x66: printf("1st-level data cache: 8 KB, 4-way set associative, sectored cache, 64 byte line size\n"); break; case 0x67: printf("1st-level data cache: 16 KB, 4-way set associative, sectored cache, 64 byte line size\n"); break; case 0x68: printf("1st-level data cache: 32 KB, 4 way set associative, sectored cache, 64 byte line size\n"); break; case 0x6a: printf("uTLB: 4KByte pages, 8-way set associative, 64 entries\n"); break; case 0x6b: printf("DTLB: 4KByte pages, 8-way set associative, 256 entries\n"); break; case 0x6c: printf("DTLB: 2M/4M pages, 8-way set associative, 128 entries\n"); break; case 0x6d: printf("DTLB: 1 GByte pages, fully associative, 16 entries\n"); break; case 0x70: printf("Trace cache: 12K-uops, 8-way set associative\n"); break; case 0x71: printf("Trace cache: 16K-uops, 8-way set associative\n"); break; case 0x72: printf("Trace cache: 32K-uops, 8-way set associative\n"); break; case 0x76: printf("Instruction TLB: 2M/4M pages, fully associative, 8 entries\n"); break; case 0x78: printf("2nd-level cache: 1 MB, 4-way set associative, 64-byte line size\n"); break; case 0x79: printf("2nd-level cache: 128 KB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x7a: printf("2nd-level cache: 256 KB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x7b: printf("2nd-level cache: 512 KB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x7c: printf("2nd-level cache: 1 MB, 8-way set associative, sectored cache, 64 byte line size\n"); break; case 0x7d: printf("2nd-level cache: 2-MB, 8-way set associative, 64-byte line size\n"); break; case 0x7f: printf("2nd-level cache: 512-KB, 2-way set associative, 64-byte line size\n"); break; case 0x80: printf("2nd-level cache: 512 KByte, 8-way set associative, 64-byte line size\n"); break; case 0x82: printf("2nd-level cache: 256 KB, 8-way set associative, 32 byte line size\n"); break; case 0x83: printf("2nd-level cache: 512 KB, 8-way set associative, 32 byte line size\n"); break; case 0x84: printf("2nd-level cache: 1 MB, 8-way set associative, 32 byte line size\n"); break; case 0x85: printf("2nd-level cache: 2 MB, 8-way set associative, 32 byte line size\n"); break; case 0x86: printf("2nd-level cache: 512 KB, 4-way set associative, 64 byte line size\n"); break; case 0x87: printf("2nd-level cache: 1 MB, 8-way set associative, 64 byte line size\n"); break; case 0xa0: printf("DTLB: 4k pages, fully associative, 32 entries\n"); break; case 0xb0: printf("Instruction TLB: 4 KB Pages, 4-way set associative, 128 entries\n"); break; case 0xb1: printf("Instruction TLB: 2M pages, 4-way, 8 entries or 4M pages, 4-way, 4 entries\n"); break; case 0xb2: printf("Instruction TLB: 4KByte pages, 4-way set associative, 64 entries\n"); break; case 0xb3: printf("Data TLB: 4 KB Pages, 4-way set associative, 128 entries\n"); break; case 0xb4: printf("Data TLB1: 4 KByte pages, 4-way associative, 256 entries\n"); break; case 0xb5: printf("Instruction TLB: 4KByte pages, 8-way set associative, 64 entries\n"); break; case 0xb6: printf("Instruction TLB: 4KByte pages, 8-way set associative, 128 entries\n"); break; case 0xba: printf("Data TLB1: 4 KByte pages, 4-way associative, 64 entries\n"); break; case 0xc0: printf("Data TLB: 4 KByte and 4 MByte pages, 4-way associative, 8 entries\n"); break; case 0xc1: printf("Shared 2nd-Level TLB: 4 KByte/2MByte pages, 8-way associative, 1024 entries\n"); break; case 0xc2: printf("DTLB: 4 KByte/2 MByte pages, 4-way associative, 16 entries\n"); break; case 0xc3: printf("Shared 2nd-Level TLB: 4 KByte /2 MByte pages, 6-way associative, 1536 entries. Also 1GBbyte pages, 4-way, 16 entries\n"); break; case 0xc4: printf("DTLB: 2M/4M Byte pages, 4-way associative, 32 entries\n"); break; case 0xca: printf("Shared 2nd-Level TLB: 4 KByte pages, 4-way associative, 512 entries\n"); break; case 0xd0: printf("3rd-level cache: 512 KByte, 4-way set associative, 64 byte line size\n"); break; case 0xd1: printf("3rd-level cache: 1 MByte, 4-way set associative, 64 byte line size\n"); break; case 0xd2: printf("3rd-level cache: 2 MByte, 4-way set associative, 64 byte line size\n"); break; case 0xd6: printf("3rd-level cache: 1 MByte, 8-way set associative, 64 byte line size\n"); break; case 0xd7: printf("3rd-level cache: 2 MByte, 8-way set associative, 64 byte line size\n"); break; case 0xd8: printf("3rd-level cache: 4 MByte, 8-way set associative, 64 byte line size\n"); break; case 0xdc: printf("3rd-level cache: 1.5 MByte, 12-way set associative, 64 byte line size\n"); break; case 0xdd: printf("3rd-level cache: 3 MByte, 12-way set associative, 64 byte line size\n"); break; case 0xde: printf("3rd-level cache: 6 MByte, 12-way set associative, 64 byte line size\n"); break; case 0xe2: printf("3rd-level cache: 2 MByte, 16-way set associative, 64 byte line size\n"); break; case 0xe3: printf("3rd-level cache: 4 MByte, 16-way set associative, 64 byte line size\n"); break; case 0xe4: printf("3rd-level cache: 8 MByte, 16-way set associative, 64 byte line size\n"); break; case 0xea: printf("3rd-level cache: 12MByte, 24-way set associative, 64 byte line size\n"); break; case 0xeb: printf("3rd-level cache: 18MByte, 24-way set associative, 64 byte line size\n"); break; case 0xec: printf("3rd-level cache: 24MByte, 24-way set associative, 64 byte line size\n"); break; case 0xf0: printf("64-Byte prefetching\n"); break; case 0xf1: printf("128-Byte prefetching\n"); break; } } static void print_svm_info(void) { u_int features, regs[4]; uint64_t msr; int comma; printf("\n SVM: "); do_cpuid(0x8000000A, regs); features = regs[3]; msr = rdmsr(MSR_VM_CR); if ((msr & VM_CR_SVMDIS) == VM_CR_SVMDIS) printf("(disabled in BIOS) "); if (!bootverbose) { comma = 0; if (features & (1 << 0)) { printf("%sNP", comma ? "," : ""); - comma = 1; + comma = 1; } if (features & (1 << 3)) { printf("%sNRIP", comma ? "," : ""); - comma = 1; + comma = 1; } if (features & (1 << 5)) { printf("%sVClean", comma ? "," : ""); - comma = 1; + comma = 1; } if (features & (1 << 6)) { printf("%sAFlush", comma ? "," : ""); - comma = 1; + comma = 1; } if (features & (1 << 7)) { printf("%sDAssist", comma ? "," : ""); - comma = 1; + comma = 1; } printf("%sNAsids=%d", comma ? "," : "", regs[1]); return; } printf("Features=0x%b", features, "\020" "\001NP" /* Nested paging */ "\002LbrVirt" /* LBR virtualization */ "\003SVML" /* SVM lock */ "\004NRIPS" /* NRIP save */ "\005TscRateMsr" /* MSR based TSC rate control */ "\006VmcbClean" /* VMCB clean bits */ "\007FlushByAsid" /* Flush by ASID */ "\010DecodeAssist" /* Decode assist */ "\011" "\012" - "\013PauseFilter" /* PAUSE intercept filter */ + "\013PauseFilter" /* PAUSE intercept filter */ "\014EncryptedMcodePatch" "\015PauseFilterThreshold" /* PAUSE filter threshold */ "\016AVIC" /* virtual interrupt controller */ "\017" "\020V_VMSAVE_VMLOAD" "\021vGIF" "\022GMET" /* Guest Mode Execute Trap */ "\023" "\024" - "\025" + "\025GuesSpecCtl" /* Guest Spec_ctl */ "\026" "\027" "\030" "\031" "\032" "\033" "\034" "\035" "\036" "\037" "\040" - ); + ); printf("\nRevision=%d, ASIDs=%d", regs[0] & 0xff, regs[1]); } #ifdef __i386__ static void print_transmeta_info(void) { u_int regs[4], nreg = 0; do_cpuid(0x80860000, regs); nreg = regs[0]; if (nreg >= 0x80860001) { do_cpuid(0x80860001, regs); printf(" Processor revision %u.%u.%u.%u\n", (regs[1] >> 24) & 0xff, (regs[1] >> 16) & 0xff, (regs[1] >> 8) & 0xff, regs[1] & 0xff); } if (nreg >= 0x80860002) { do_cpuid(0x80860002, regs); printf(" Code Morphing Software revision %u.%u.%u-%u-%u\n", (regs[1] >> 24) & 0xff, (regs[1] >> 16) & 0xff, (regs[1] >> 8) & 0xff, regs[1] & 0xff, regs[2]); } if (nreg >= 0x80860006) { char info[65]; do_cpuid(0x80860003, (u_int*) &info[0]); do_cpuid(0x80860004, (u_int*) &info[16]); do_cpuid(0x80860005, (u_int*) &info[32]); do_cpuid(0x80860006, (u_int*) &info[48]); info[64] = 0; printf(" %s\n", info); } } #endif static void print_via_padlock_info(void) { u_int regs[4]; do_cpuid(0xc0000001, regs); printf("\n VIA Padlock Features=0x%b", regs[3], "\020" "\003RNG" /* RNG */ "\007AES" /* ACE */ "\011AES-CTR" /* ACE2 */ "\013SHA1,SHA256" /* PHE */ "\015RSA" /* PMM */ ); } static uint32_t vmx_settable(uint64_t basic, int msr, int true_msr) { uint64_t val; if (basic & (1ULL << 55)) val = rdmsr(true_msr); else val = rdmsr(msr); /* Just report the controls that can be set to 1. */ return (val >> 32); } static void print_vmx_info(void) { uint64_t basic, msr; uint32_t entry, exit, mask, pin, proc, proc2; int comma; printf("\n VT-x: "); msr = rdmsr(MSR_IA32_FEATURE_CONTROL); if (!(msr & IA32_FEATURE_CONTROL_VMX_EN)) printf("(disabled in BIOS) "); basic = rdmsr(MSR_VMX_BASIC); pin = vmx_settable(basic, MSR_VMX_PINBASED_CTLS, MSR_VMX_TRUE_PINBASED_CTLS); proc = vmx_settable(basic, MSR_VMX_PROCBASED_CTLS, MSR_VMX_TRUE_PROCBASED_CTLS); if (proc & PROCBASED_SECONDARY_CONTROLS) proc2 = vmx_settable(basic, MSR_VMX_PROCBASED_CTLS2, MSR_VMX_PROCBASED_CTLS2); else proc2 = 0; exit = vmx_settable(basic, MSR_VMX_EXIT_CTLS, MSR_VMX_TRUE_EXIT_CTLS); entry = vmx_settable(basic, MSR_VMX_ENTRY_CTLS, MSR_VMX_TRUE_ENTRY_CTLS); if (!bootverbose) { comma = 0; if (exit & VM_EXIT_SAVE_PAT && exit & VM_EXIT_LOAD_PAT && entry & VM_ENTRY_LOAD_PAT) { printf("%sPAT", comma ? "," : ""); comma = 1; } if (proc & PROCBASED_HLT_EXITING) { printf("%sHLT", comma ? "," : ""); comma = 1; } if (proc & PROCBASED_MTF) { printf("%sMTF", comma ? "," : ""); comma = 1; } if (proc & PROCBASED_PAUSE_EXITING) { printf("%sPAUSE", comma ? "," : ""); comma = 1; } if (proc2 & PROCBASED2_ENABLE_EPT) { printf("%sEPT", comma ? "," : ""); comma = 1; } if (proc2 & PROCBASED2_UNRESTRICTED_GUEST) { printf("%sUG", comma ? "," : ""); comma = 1; } if (proc2 & PROCBASED2_ENABLE_VPID) { printf("%sVPID", comma ? "," : ""); comma = 1; } if (proc & PROCBASED_USE_TPR_SHADOW && proc2 & PROCBASED2_VIRTUALIZE_APIC_ACCESSES && proc2 & PROCBASED2_VIRTUALIZE_X2APIC_MODE && proc2 & PROCBASED2_APIC_REGISTER_VIRTUALIZATION && proc2 & PROCBASED2_VIRTUAL_INTERRUPT_DELIVERY) { printf("%sVID", comma ? "," : ""); comma = 1; if (pin & PINBASED_POSTED_INTERRUPT) printf(",PostIntr"); } return; } mask = basic >> 32; printf("Basic Features=0x%b", mask, "\020" "\02132PA" /* 32-bit physical addresses */ "\022SMM" /* SMM dual-monitor */ "\027INS/OUTS" /* VM-exit info for INS and OUTS */ "\030TRUE" /* TRUE_CTLS MSRs */ ); printf("\n Pin-Based Controls=0x%b", pin, "\020" "\001ExtINT" /* External-interrupt exiting */ "\004NMI" /* NMI exiting */ "\006VNMI" /* Virtual NMIs */ "\007PreTmr" /* Activate VMX-preemption timer */ "\010PostIntr" /* Process posted interrupts */ ); printf("\n Primary Processor Controls=0x%b", proc, "\020" "\003INTWIN" /* Interrupt-window exiting */ "\004TSCOff" /* Use TSC offsetting */ "\010HLT" /* HLT exiting */ "\012INVLPG" /* INVLPG exiting */ "\013MWAIT" /* MWAIT exiting */ "\014RDPMC" /* RDPMC exiting */ "\015RDTSC" /* RDTSC exiting */ "\020CR3-LD" /* CR3-load exiting */ "\021CR3-ST" /* CR3-store exiting */ "\024CR8-LD" /* CR8-load exiting */ "\025CR8-ST" /* CR8-store exiting */ "\026TPR" /* Use TPR shadow */ "\027NMIWIN" /* NMI-window exiting */ "\030MOV-DR" /* MOV-DR exiting */ "\031IO" /* Unconditional I/O exiting */ "\032IOmap" /* Use I/O bitmaps */ "\034MTF" /* Monitor trap flag */ "\035MSRmap" /* Use MSR bitmaps */ "\036MONITOR" /* MONITOR exiting */ "\037PAUSE" /* PAUSE exiting */ ); if (proc & PROCBASED_SECONDARY_CONTROLS) printf("\n Secondary Processor Controls=0x%b", proc2, "\020" "\001APIC" /* Virtualize APIC accesses */ "\002EPT" /* Enable EPT */ "\003DT" /* Descriptor-table exiting */ "\004RDTSCP" /* Enable RDTSCP */ "\005x2APIC" /* Virtualize x2APIC mode */ "\006VPID" /* Enable VPID */ "\007WBINVD" /* WBINVD exiting */ "\010UG" /* Unrestricted guest */ "\011APIC-reg" /* APIC-register virtualization */ "\012VID" /* Virtual-interrupt delivery */ "\013PAUSE-loop" /* PAUSE-loop exiting */ "\014RDRAND" /* RDRAND exiting */ "\015INVPCID" /* Enable INVPCID */ "\016VMFUNC" /* Enable VM functions */ "\017VMCS" /* VMCS shadowing */ "\020EPT#VE" /* EPT-violation #VE */ "\021XSAVES" /* Enable XSAVES/XRSTORS */ ); printf("\n Exit Controls=0x%b", mask, "\020" "\003DR" /* Save debug controls */ /* Ignore Host address-space size */ "\015PERF" /* Load MSR_PERF_GLOBAL_CTRL */ "\020AckInt" /* Acknowledge interrupt on exit */ "\023PAT-SV" /* Save MSR_PAT */ "\024PAT-LD" /* Load MSR_PAT */ "\025EFER-SV" /* Save MSR_EFER */ "\026EFER-LD" /* Load MSR_EFER */ "\027PTMR-SV" /* Save VMX-preemption timer value */ ); printf("\n Entry Controls=0x%b", mask, "\020" "\003DR" /* Save debug controls */ /* Ignore IA-32e mode guest */ /* Ignore Entry to SMM */ /* Ignore Deactivate dual-monitor treatment */ "\016PERF" /* Load MSR_PERF_GLOBAL_CTRL */ "\017PAT" /* Load MSR_PAT */ "\020EFER" /* Load MSR_EFER */ ); if (proc & PROCBASED_SECONDARY_CONTROLS && (proc2 & (PROCBASED2_ENABLE_EPT | PROCBASED2_ENABLE_VPID)) != 0) { msr = rdmsr(MSR_VMX_EPT_VPID_CAP); mask = msr; printf("\n EPT Features=0x%b", mask, "\020" "\001XO" /* Execute-only translations */ "\007PW4" /* Page-walk length of 4 */ "\011UC" /* EPT paging-structure mem can be UC */ "\017WB" /* EPT paging-structure mem can be WB */ "\0212M" /* EPT PDE can map a 2-Mbyte page */ "\0221G" /* EPT PDPTE can map a 1-Gbyte page */ "\025INVEPT" /* INVEPT is supported */ "\026AD" /* Accessed and dirty flags for EPT */ "\032single" /* INVEPT single-context type */ "\033all" /* INVEPT all-context type */ ); mask = msr >> 32; printf("\n VPID Features=0x%b", mask, "\020" "\001INVVPID" /* INVVPID is supported */ "\011individual" /* INVVPID individual-address type */ "\012single" /* INVVPID single-context type */ "\013all" /* INVVPID all-context type */ /* INVVPID single-context-retaining-globals type */ "\014single-globals" ); } } static void print_hypervisor_info(void) { if (*hv_vendor != '\0') printf("Hypervisor: Origin = \"%s\"\n", hv_vendor); } /* * Returns the maximum physical address that can be used with the * current system. */ vm_paddr_t cpu_getmaxphyaddr(void) { #if defined(__i386__) if (!pae_mode) return (0xffffffff); #endif return ((1ULL << cpu_maxphyaddr) - 1); } Index: projects/clang1000-import/tools/bsdbox/Makefile.base =================================================================== --- projects/clang1000-import/tools/bsdbox/Makefile.base (revision 358238) +++ projects/clang1000-import/tools/bsdbox/Makefile.base (revision 358239) @@ -1,57 +1,58 @@ # # This builds a variety of "base" tools, useful for an embedded # system. # # $FreeBSD$ # CRUNCH_PROGS_sbin+= dmesg sysctl init reboot CRUNCH_PROGS_bin+= ls cat dd df cp hostname kill mkdir sleep ps CRUNCH_PROGS_bin+= ln rm kenv mv expr CRUNCH_PROGS_usr.bin+= true false hexdump tail nc w head uname tset CRUNCH_PROGS_usr.sbin+= gpioctl CRUNCH_ALIAS_w= uptime CRUNCH_ALIAS_tset= reset CRUNCH_PROGS_usr.bin+= vmstat #CRUNCH_PROGS_user.bin+= systat CRUNCH_LIBS+= -ldevstat -lncursesw -lncurses -lmemstat -lkvm -lelf # CRUNCH_PROGS_usr.bin+= tar CRUNCH_PROGS_usr.bin+= cpio # XXX SSL ? CRUNCH_LIBS+= -larchive -lbz2 -lz -llzma -lbsdxml -lssl -lcrypto +CRUNCH_LIBS+= -lprivatezstd -lthr # Clear requires tput, and it's a shell script so it won't be crunched CRUNCH_PROGS_usr.bin+= tput # sh CRUNCH_PROGS_bin+= sh CRUNCH_ALIAS_sh= -sh CRUNCH_SUPPRESS_LINK_-sh= 1 CRUNCH_BUILDTOOLS+= bin/sh # chown CRUNCH_PROGS_usr.sbin+= chown CRUNCH_ALIAS_chown= chgrp # Basic filesystem stuff CRUNCH_PROGS_sbin+= mount umount # grep # grep doesn't yet work -adrian #CRUNCH_PROGS_usr.bin+= grep # less/more #CRUNCH_PROGS_usr.bin+= less #CRUNCH_ALIAS_less= more # passwd CRUNCH_PROGS_usr.bin+= passwd # These need to be shared, or PAM wants to include _all_ of the libraries # at runtime. CRUNCH_SHLIBS+= -lpam -lbsm # gzip/gunzip CRUNCH_PROGS_usr.bin+= gzip CRUNCH_ALIAS_gzip= gunzip gzcat zcat CRUNCH_LIBS+= -lz -llzma -lbz2 Index: projects/clang1000-import/usr.bin/dtc/dtc.cc =================================================================== --- projects/clang1000-import/usr.bin/dtc/dtc.cc (revision 358238) +++ projects/clang1000-import/usr.bin/dtc/dtc.cc (revision 358239) @@ -1,381 +1,384 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2013 David Chisnall * All rights reserved. * * This software was developed by SRI International and the University of * Cambridge Computer Laboratory under DARPA/AFRL contract (FA8750-10-C-0237) * ("CTSRD"), as part of the DARPA CRASH research programme. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include #include #include #include #include #include #include #include #include #include "fdt.hh" #include "checking.hh" #include "util.hh" using namespace dtc; using std::string; namespace { /** * The current major version of the tool. */ int version_major = 0; int version_major_compatible = 1; /** * The current minor version of the tool. */ int version_minor = 5; int version_minor_compatible = 4; /** * The current patch level of the tool. */ int version_patch = 0; int version_patch_compatible = 7; void usage(const string &argv0) { fprintf(stderr, "Usage:\n" "\t%s\t[-fhsv@] [-b boot_cpu_id] [-d dependency_file]" "[-E [no-]checker_name]\n" "\t\t[-H phandle_format] [-I input_format]" "[-O output_format]\n" "\t\t[-o output_file] [-R entries] [-S bytes] [-p bytes]" "[-V blob_version]\n" "\t\t-W [no-]checker_name] input_file\n", basename(argv0).c_str()); } /** * Prints the current version of this program.. */ void version(const char* progname) { fprintf(stdout, "Version: %s %d.%d.%d compatible with gpl dtc %d.%d.%d\n", progname, version_major, version_minor, version_patch, version_major_compatible, version_minor_compatible, version_patch_compatible); } } // Anonymous namespace using fdt::device_tree; using fdt::tree_write_fn_ptr; using fdt::tree_read_fn_ptr; int main(int argc, char **argv) { int ch; int outfile = fileno(stdout); const char *outfile_name = "-"; const char *in_file = "-"; FILE *depfile = 0; bool debug_mode = false; tree_write_fn_ptr write_fn = nullptr; tree_read_fn_ptr read_fn = nullptr; uint32_t boot_cpu = 0; bool boot_cpu_specified = false; bool keep_going = false; bool sort = false; clock_t c0 = clock(); class device_tree tree; fdt::checking::check_manager checks; const char *options = "@hqI:O:o:V:d:R:S:p:b:fi:svH:W:E:DP:"; // Don't forget to update the man page if any more options are added. while ((ch = getopt(argc, argv, options)) != -1) { switch (ch) { case 'h': usage(argv[0]); return EXIT_SUCCESS; case 'v': version(argv[0]); return EXIT_SUCCESS; case '@': tree.write_symbols = true; break; case 'I': { string arg(optarg); if (arg == "dtb") { read_fn = &device_tree::parse_dtb; if (write_fn == nullptr) { write_fn = &device_tree::write_dts; } } else if (arg == "dts") { read_fn = &device_tree::parse_dts; } else { fprintf(stderr, "Unknown input format: %s\n", optarg); return EXIT_FAILURE; } break; } case 'O': { string arg(optarg); if (arg == "dtb") { write_fn = &device_tree::write_binary; } else if (arg == "asm") { write_fn = &device_tree::write_asm; } else if (arg == "dts") { write_fn = &device_tree::write_dts; if (read_fn == nullptr) { read_fn = &device_tree::parse_dtb; } } else { fprintf(stderr, "Unknown output format: %s\n", optarg); return EXIT_FAILURE; } break; } case 'o': { outfile_name = optarg; if (strcmp(outfile_name, "-") != 0) { outfile = open(optarg, O_CREAT | O_TRUNC | O_WRONLY, 0666); if (outfile == -1) { perror("Unable to open output file"); return EXIT_FAILURE; } } break; } case 'D': debug_mode = true; break; case 'V': if (string(optarg) != "17") { fprintf(stderr, "Unknown output format version: %s\n", optarg); return EXIT_FAILURE; } break; case 'd': { if (depfile != 0) { fclose(depfile); } if (string(optarg) == "-") { depfile = stdout; } else { depfile = fdopen(open(optarg, O_CREAT | O_TRUNC | O_WRONLY, 0666), "w"); if (depfile == 0) { perror("Unable to open dependency file"); return EXIT_FAILURE; } } break; } case 'H': { string arg(optarg); if (arg == "both") { tree.set_phandle_format(device_tree::BOTH); } else if (arg == "epapr") { tree.set_phandle_format(device_tree::EPAPR); } else if (arg == "linux") { tree.set_phandle_format(device_tree::LINUX); } else { fprintf(stderr, "Unknown phandle format: %s\n", optarg); return EXIT_FAILURE; } break; } case 'b': // Don't bother to check if strtoll fails, just // use the 0 it returns. boot_cpu = (uint32_t)strtoll(optarg, 0, 10); boot_cpu_specified = true; break; case 'f': keep_going = true; break; case 'W': case 'E': { string arg(optarg); if ((arg.size() > 3) && (strncmp(optarg, "no-", 3) == 0)) { arg = string(optarg+3); if (!checks.disable_checker(arg)) { fprintf(stderr, "Checker %s either does not exist or is already disabled\n", optarg+3); } break; } if (!checks.enable_checker(arg)) { fprintf(stderr, "Checker %s either does not exist or is already enabled\n", optarg); } break; } case 's': { sort = true; break; } case 'i': { tree.add_include_path(optarg); break; } // Should quiet warnings, but for now is silently ignored. case 'q': break; case 'R': tree.set_empty_reserve_map_entries(strtoll(optarg, 0, 10)); break; case 'S': tree.set_blob_minimum_size(strtoll(optarg, 0, 10)); break; case 'p': tree.set_blob_padding(strtoll(optarg, 0, 10)); break; case 'P': if (!tree.parse_define(optarg)) { fprintf(stderr, "Invalid predefine value %s\n", optarg); } break; default: - fprintf(stderr, "Unknown option %c\n", ch); + /* + * Since opterr is non-zero, getopt will have + * already printed an error message. + */ return EXIT_FAILURE; } } if (read_fn == nullptr) { read_fn = &device_tree::parse_dts; } if (write_fn == nullptr) { write_fn = &device_tree::write_binary; } if (optind < argc) { in_file = argv[optind]; } if (depfile != 0) { fputs(outfile_name, depfile); fputs(": ", depfile); fputs(in_file, depfile); } clock_t c1 = clock(); (tree.*read_fn)(in_file, depfile); // Override the boot CPU found in the header, if we're loading from dtb if (boot_cpu_specified) { tree.set_boot_cpu(boot_cpu); } if (sort) { tree.sort(); } if (depfile != 0) { putc('\n', depfile); fclose(depfile); } if (!(tree.is_valid() || keep_going)) { fprintf(stderr, "Failed to parse tree.\n"); return EXIT_FAILURE; } clock_t c2 = clock(); if (!(checks.run_checks(&tree, true) || keep_going)) { return EXIT_FAILURE; } clock_t c3 = clock(); (tree.*write_fn)(outfile); close(outfile); clock_t c4 = clock(); if (debug_mode) { struct rusage r; getrusage(RUSAGE_SELF, &r); fprintf(stderr, "Peak memory usage: %ld bytes\n", r.ru_maxrss); fprintf(stderr, "Setup and option parsing took %f seconds\n", ((double)(c1-c0))/CLOCKS_PER_SEC); fprintf(stderr, "Parsing took %f seconds\n", ((double)(c2-c1))/CLOCKS_PER_SEC); fprintf(stderr, "Checking took %f seconds\n", ((double)(c3-c2))/CLOCKS_PER_SEC); fprintf(stderr, "Generating output took %f seconds\n", ((double)(c4-c3))/CLOCKS_PER_SEC); fprintf(stderr, "Total time: %f seconds\n", ((double)(c4-c0))/CLOCKS_PER_SEC); // This is not needed, but keeps valgrind quiet. fclose(stdin); } return EXIT_SUCCESS; } Index: projects/clang1000-import/usr.sbin/bhyve/iov.c =================================================================== --- projects/clang1000-import/usr.sbin/bhyve/iov.c (revision 358238) +++ projects/clang1000-import/usr.sbin/bhyve/iov.c (revision 358239) @@ -1,148 +1,149 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2016 Jakub Klama . * Copyright (c) 2018 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include "iov.h" void seek_iov(const struct iovec *iov1, int niov1, struct iovec *iov2, int *niov2, size_t seek) { size_t remainder = 0; size_t left = seek; int i, j; for (i = 0; i < niov1; i++) { size_t toseek = MIN(left, iov1[i].iov_len); left -= toseek; if (toseek == iov1[i].iov_len) continue; if (left == 0) { remainder = toseek; break; } } for (j = i; j < niov1; j++) { iov2[j - i].iov_base = (char *)iov1[j].iov_base + remainder; iov2[j - i].iov_len = iov1[j].iov_len - remainder; remainder = 0; } *niov2 = j - i; } size_t count_iov(const struct iovec *iov, int niov) { size_t total = 0; int i; for (i = 0; i < niov; i++) total += iov[i].iov_len; return (total); } void truncate_iov(struct iovec *iov, int *niov, size_t length) { size_t done = 0; int i; for (i = 0; i < *niov; i++) { size_t toseek = MIN(length - done, iov[i].iov_len); done += toseek; if (toseek <= iov[i].iov_len) { iov[i].iov_len = toseek; *niov = i + 1; return; } } } ssize_t iov_to_buf(const struct iovec *iov, int niov, void **buf) { size_t ptr, total; int i; total = count_iov(iov, niov); *buf = realloc(*buf, total); if (*buf == NULL) return (-1); for (i = 0, ptr = 0; i < niov; i++) { memcpy(*buf + ptr, iov[i].iov_base, iov[i].iov_len); ptr += iov[i].iov_len; } return (total); } ssize_t -buf_to_iov(const void *buf, size_t buflen, struct iovec *iov, int niov, +buf_to_iov(const void *buf, size_t buflen, const struct iovec *iov, int niov, size_t seek) { struct iovec *diov; - int ndiov, i; size_t off = 0, len; + int i; if (seek > 0) { + int ndiov; + diov = malloc(sizeof(struct iovec) * niov); seek_iov(iov, niov, diov, &ndiov, seek); - } else { - diov = iov; - ndiov = niov; + iov = diov; + niov = ndiov; } - for (i = 0; i < ndiov && off < buflen; i++) { - len = MIN(diov[i].iov_len, buflen - off); - memcpy(diov[i].iov_base, buf + off, len); + for (i = 0; i < niov && off < buflen; i++) { + len = MIN(iov[i].iov_len, buflen - off); + memcpy(iov[i].iov_base, buf + off, len); off += len; } if (seek > 0) free(diov); return ((ssize_t)off); } Index: projects/clang1000-import/usr.sbin/bhyve/iov.h =================================================================== --- projects/clang1000-import/usr.sbin/bhyve/iov.h (revision 358238) +++ projects/clang1000-import/usr.sbin/bhyve/iov.h (revision 358239) @@ -1,44 +1,44 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2016 Jakub Klama . * Copyright (c) 2018 Alexander Motin * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _IOV_H_ #define _IOV_H_ void seek_iov(const struct iovec *iov1, int niov1, struct iovec *iov2, int *niov2, size_t seek); void truncate_iov(struct iovec *iov, int *niov, size_t length); size_t count_iov(const struct iovec *iov, int niov); ssize_t iov_to_buf(const struct iovec *iov, int niov, void **buf); -ssize_t buf_to_iov(const void *buf, size_t buflen, struct iovec *iov, int niov, - size_t seek); +ssize_t buf_to_iov(const void *buf, size_t buflen, const struct iovec *iov, + int niov, size_t seek); #endif /* _IOV_H_ */ Index: projects/clang1000-import/usr.sbin/bhyve/net_backends.c =================================================================== --- projects/clang1000-import/usr.sbin/bhyve/net_backends.c (revision 358238) +++ projects/clang1000-import/usr.sbin/bhyve/net_backends.c (revision 358239) @@ -1,814 +1,902 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 Vincenzo Maffione * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ /* * This file implements multiple network backends (tap, netmap, ...), * to be used by network frontends such as virtio-net and e1000. * The API to access the backend (e.g. send/receive packets, negotiate * features) is exported by net_backends.h. */ #include __FBSDID("$FreeBSD$"); #include /* u_short etc */ #ifndef WITHOUT_CAPSICUM #include #endif #include #include #include #include #include #include #define NETMAP_WITH_LIBS #include #ifndef WITHOUT_CAPSICUM #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "debug.h" #include "iov.h" #include "mevent.h" #include "net_backends.h" #include /* * Each network backend registers a set of function pointers that are * used to implement the net backends API. * This might need to be exposed if we implement backends in separate files. */ struct net_backend { const char *prefix; /* prefix matching this backend */ /* * Routines used to initialize and cleanup the resources needed * by a backend. The cleanup function is used internally, * and should not be called by the frontend. */ int (*init)(struct net_backend *be, const char *devname, net_be_rxeof_t cb, void *param); void (*cleanup)(struct net_backend *be); /* * Called to serve a guest transmit request. The scatter-gather * vector provided by the caller has 'iovcnt' elements and contains * the packet to send. */ ssize_t (*send)(struct net_backend *be, const struct iovec *iov, int iovcnt); /* + * Get the length of the next packet that can be received from + * the backend. If no packets are currently available, this + * function returns 0. + */ + ssize_t (*peek_recvlen)(struct net_backend *be); + + /* * Called to receive a packet from the backend. When the function * returns a positive value 'len', the scatter-gather vector * provided by the caller contains a packet with such length. * The function returns 0 if the backend doesn't have a new packet to * receive. */ ssize_t (*recv)(struct net_backend *be, const struct iovec *iov, int iovcnt); /* * Ask the backend to enable or disable receive operation in the * backend. On return from a disable operation, it is guaranteed * that the receive callback won't be called until receive is * enabled again. Note however that it is up to the caller to make * sure that netbe_recv() is not currently being executed by another * thread. */ void (*recv_enable)(struct net_backend *be); void (*recv_disable)(struct net_backend *be); /* * Ask the backend for the virtio-net features it is able to * support. Possible features are TSO, UFO and checksum offloading * in both rx and tx direction and for both IPv4 and IPv6. */ uint64_t (*get_cap)(struct net_backend *be); /* * Tell the backend to enable/disable the specified virtio-net * features (capabilities). */ int (*set_cap)(struct net_backend *be, uint64_t features, unsigned int vnet_hdr_len); struct pci_vtnet_softc *sc; int fd; /* * Length of the virtio-net header used by the backend and the * frontend, respectively. A zero value means that the header * is not used. */ unsigned int be_vnet_hdr_len; unsigned int fe_vnet_hdr_len; /* Size of backend-specific private data. */ size_t priv_size; /* Room for backend-specific data. */ char opaque[0]; }; SET_DECLARE(net_backend_set, struct net_backend); #define VNET_HDR_LEN sizeof(struct virtio_net_rxhdr) #define WPRINTF(params) PRINTLN params /* * The tap backend */ struct tap_priv { struct mevent *mevp; + /* + * A bounce buffer that allows us to implement the peek_recvlen + * callback. In the future we may get the same information from + * the kevent data. + */ + char bbuf[1 << 16]; + ssize_t bbuflen; }; static void tap_cleanup(struct net_backend *be) { struct tap_priv *priv = (struct tap_priv *)be->opaque; if (priv->mevp) { mevent_delete(priv->mevp); } if (be->fd != -1) { close(be->fd); be->fd = -1; } } static int tap_init(struct net_backend *be, const char *devname, net_be_rxeof_t cb, void *param) { struct tap_priv *priv = (struct tap_priv *)be->opaque; char tbuf[80]; int opt = 1; #ifndef WITHOUT_CAPSICUM cap_rights_t rights; #endif if (cb == NULL) { WPRINTF(("TAP backend requires non-NULL callback")); return (-1); } strcpy(tbuf, "/dev/"); strlcat(tbuf, devname, sizeof(tbuf)); be->fd = open(tbuf, O_RDWR); if (be->fd == -1) { WPRINTF(("open of tap device %s failed", tbuf)); goto error; } /* * Set non-blocking and register for read * notifications with the event loop */ if (ioctl(be->fd, FIONBIO, &opt) < 0) { WPRINTF(("tap device O_NONBLOCK failed")); goto error; } #ifndef WITHOUT_CAPSICUM cap_rights_init(&rights, CAP_EVENT, CAP_READ, CAP_WRITE); if (caph_rights_limit(be->fd, &rights) == -1) errx(EX_OSERR, "Unable to apply rights for sandbox"); #endif + memset(priv->bbuf, 0, sizeof(priv->bbuf)); + priv->bbuflen = 0; + priv->mevp = mevent_add_disabled(be->fd, EVF_READ, cb, param); if (priv->mevp == NULL) { WPRINTF(("Could not register event")); goto error; } return (0); error: tap_cleanup(be); return (-1); } /* * Called to send a buffer chain out to the tap device */ static ssize_t tap_send(struct net_backend *be, const struct iovec *iov, int iovcnt) { return (writev(be->fd, iov, iovcnt)); } static ssize_t +tap_peek_recvlen(struct net_backend *be) +{ + struct tap_priv *priv = (struct tap_priv *)be->opaque; + ssize_t ret; + + if (priv->bbuflen > 0) { + /* + * We already have a packet in the bounce buffer. + * Just return its length. + */ + return priv->bbuflen; + } + + /* + * Read the next packet (if any) into the bounce buffer, so + * that we get to know its length and we can return that + * to the caller. + */ + ret = read(be->fd, priv->bbuf, sizeof(priv->bbuf)); + if (ret < 0 && errno == EWOULDBLOCK) { + return (0); + } + + if (ret > 0) + priv->bbuflen = ret; + + return (ret); +} + +static ssize_t tap_recv(struct net_backend *be, const struct iovec *iov, int iovcnt) { + struct tap_priv *priv = (struct tap_priv *)be->opaque; ssize_t ret; - /* Should never be called without a valid tap fd */ - assert(be->fd != -1); + if (priv->bbuflen > 0) { + /* + * A packet is available in the bounce buffer, so + * we read it from there. + */ + ret = buf_to_iov(priv->bbuf, priv->bbuflen, + iov, iovcnt, 0); - ret = readv(be->fd, iov, iovcnt); + /* Mark the bounce buffer as empty. */ + priv->bbuflen = 0; + return (ret); + } + + ret = readv(be->fd, iov, iovcnt); if (ret < 0 && errno == EWOULDBLOCK) { return (0); } return (ret); } static void tap_recv_enable(struct net_backend *be) { struct tap_priv *priv = (struct tap_priv *)be->opaque; mevent_enable(priv->mevp); } static void tap_recv_disable(struct net_backend *be) { struct tap_priv *priv = (struct tap_priv *)be->opaque; mevent_disable(priv->mevp); } static uint64_t tap_get_cap(struct net_backend *be) { return (0); /* no capabilities for now */ } static int tap_set_cap(struct net_backend *be, uint64_t features, unsigned vnet_hdr_len) { return ((features || vnet_hdr_len) ? -1 : 0); } static struct net_backend tap_backend = { .prefix = "tap", .priv_size = sizeof(struct tap_priv), .init = tap_init, .cleanup = tap_cleanup, .send = tap_send, + .peek_recvlen = tap_peek_recvlen, .recv = tap_recv, .recv_enable = tap_recv_enable, .recv_disable = tap_recv_disable, .get_cap = tap_get_cap, .set_cap = tap_set_cap, }; /* A clone of the tap backend, with a different prefix. */ static struct net_backend vmnet_backend = { .prefix = "vmnet", .priv_size = sizeof(struct tap_priv), .init = tap_init, .cleanup = tap_cleanup, .send = tap_send, + .peek_recvlen = tap_peek_recvlen, .recv = tap_recv, .recv_enable = tap_recv_enable, .recv_disable = tap_recv_disable, .get_cap = tap_get_cap, .set_cap = tap_set_cap, }; DATA_SET(net_backend_set, tap_backend); DATA_SET(net_backend_set, vmnet_backend); /* * The netmap backend */ /* The virtio-net features supported by netmap. */ #define NETMAP_FEATURES (VIRTIO_NET_F_CSUM | VIRTIO_NET_F_HOST_TSO4 | \ VIRTIO_NET_F_HOST_TSO6 | VIRTIO_NET_F_HOST_UFO | \ VIRTIO_NET_F_GUEST_CSUM | VIRTIO_NET_F_GUEST_TSO4 | \ - VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_UFO | \ - VIRTIO_NET_F_MRG_RXBUF) + VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_UFO) struct netmap_priv { char ifname[IFNAMSIZ]; struct nm_desc *nmd; uint16_t memid; struct netmap_ring *rx; struct netmap_ring *tx; struct mevent *mevp; net_be_rxeof_t cb; void *cb_param; }; static void nmreq_init(struct nmreq *req, char *ifname) { memset(req, 0, sizeof(*req)); strlcpy(req->nr_name, ifname, sizeof(req->nr_name)); req->nr_version = NETMAP_API; } static int netmap_set_vnet_hdr_len(struct net_backend *be, int vnet_hdr_len) { int err; struct nmreq req; struct netmap_priv *priv = (struct netmap_priv *)be->opaque; nmreq_init(&req, priv->ifname); req.nr_cmd = NETMAP_BDG_VNET_HDR; req.nr_arg1 = vnet_hdr_len; err = ioctl(be->fd, NIOCREGIF, &req); if (err) { WPRINTF(("Unable to set vnet header length %d", vnet_hdr_len)); return (err); } be->be_vnet_hdr_len = vnet_hdr_len; return (0); } static int netmap_has_vnet_hdr_len(struct net_backend *be, unsigned vnet_hdr_len) { int prev_hdr_len = be->be_vnet_hdr_len; int ret; if (vnet_hdr_len == prev_hdr_len) { return (1); } ret = netmap_set_vnet_hdr_len(be, vnet_hdr_len); if (ret) { return (0); } netmap_set_vnet_hdr_len(be, prev_hdr_len); return (1); } static uint64_t netmap_get_cap(struct net_backend *be) { return (netmap_has_vnet_hdr_len(be, VNET_HDR_LEN) ? NETMAP_FEATURES : 0); } static int netmap_set_cap(struct net_backend *be, uint64_t features, unsigned vnet_hdr_len) { return (netmap_set_vnet_hdr_len(be, vnet_hdr_len)); } static int netmap_init(struct net_backend *be, const char *devname, net_be_rxeof_t cb, void *param) { struct netmap_priv *priv = (struct netmap_priv *)be->opaque; strlcpy(priv->ifname, devname, sizeof(priv->ifname)); priv->ifname[sizeof(priv->ifname) - 1] = '\0'; priv->nmd = nm_open(priv->ifname, NULL, NETMAP_NO_TX_POLL, NULL); if (priv->nmd == NULL) { WPRINTF(("Unable to nm_open(): interface '%s', errno (%s)", devname, strerror(errno))); free(priv); return (-1); } priv->memid = priv->nmd->req.nr_arg2; priv->tx = NETMAP_TXRING(priv->nmd->nifp, 0); priv->rx = NETMAP_RXRING(priv->nmd->nifp, 0); priv->cb = cb; priv->cb_param = param; be->fd = priv->nmd->fd; priv->mevp = mevent_add_disabled(be->fd, EVF_READ, cb, param); if (priv->mevp == NULL) { WPRINTF(("Could not register event")); return (-1); } return (0); } static void netmap_cleanup(struct net_backend *be) { struct netmap_priv *priv = (struct netmap_priv *)be->opaque; if (priv->mevp) { mevent_delete(priv->mevp); } if (priv->nmd) { nm_close(priv->nmd); } be->fd = -1; } static ssize_t netmap_send(struct net_backend *be, const struct iovec *iov, int iovcnt) { struct netmap_priv *priv = (struct netmap_priv *)be->opaque; struct netmap_ring *ring; ssize_t totlen = 0; int nm_buf_size; int nm_buf_len; uint32_t head; void *nm_buf; int j; ring = priv->tx; head = ring->head; if (head == ring->tail) { WPRINTF(("No space, drop %zu bytes", count_iov(iov, iovcnt))); goto txsync; } nm_buf = NETMAP_BUF(ring, ring->slot[head].buf_idx); nm_buf_size = ring->nr_buf_size; nm_buf_len = 0; for (j = 0; j < iovcnt; j++) { int iov_frag_size = iov[j].iov_len; void *iov_frag_buf = iov[j].iov_base; totlen += iov_frag_size; /* * Split each iovec fragment over more netmap slots, if * necessary. */ for (;;) { int copylen; copylen = iov_frag_size < nm_buf_size ? iov_frag_size : nm_buf_size; memcpy(nm_buf, iov_frag_buf, copylen); iov_frag_buf += copylen; iov_frag_size -= copylen; nm_buf += copylen; nm_buf_size -= copylen; nm_buf_len += copylen; if (iov_frag_size == 0) { break; } ring->slot[head].len = nm_buf_len; ring->slot[head].flags = NS_MOREFRAG; head = nm_ring_next(ring, head); if (head == ring->tail) { /* * We ran out of netmap slots while * splitting the iovec fragments. */ WPRINTF(("No space, drop %zu bytes", count_iov(iov, iovcnt))); goto txsync; } nm_buf = NETMAP_BUF(ring, ring->slot[head].buf_idx); nm_buf_size = ring->nr_buf_size; nm_buf_len = 0; } } /* Complete the last slot, which must not have NS_MOREFRAG set. */ ring->slot[head].len = nm_buf_len; ring->slot[head].flags = 0; head = nm_ring_next(ring, head); /* Now update ring->head and ring->cur. */ ring->head = ring->cur = head; txsync: ioctl(be->fd, NIOCTXSYNC, NULL); return (totlen); } static ssize_t +netmap_peek_recvlen(struct net_backend *be) +{ + struct netmap_priv *priv = (struct netmap_priv *)be->opaque; + struct netmap_ring *ring = priv->rx; + uint32_t head = ring->head; + ssize_t totlen = 0; + + while (head != ring->tail) { + struct netmap_slot *slot = ring->slot + head; + + totlen += slot->len; + if ((slot->flags & NS_MOREFRAG) == 0) + break; + head = nm_ring_next(ring, head); + } + + return (totlen); +} + +static ssize_t netmap_recv(struct net_backend *be, const struct iovec *iov, int iovcnt) { struct netmap_priv *priv = (struct netmap_priv *)be->opaque; struct netmap_slot *slot = NULL; struct netmap_ring *ring; void *iov_frag_buf; int iov_frag_size; ssize_t totlen = 0; uint32_t head; assert(iovcnt); ring = priv->rx; head = ring->head; iov_frag_buf = iov->iov_base; iov_frag_size = iov->iov_len; do { int nm_buf_len; void *nm_buf; if (head == ring->tail) { return (0); } slot = ring->slot + head; nm_buf = NETMAP_BUF(ring, slot->buf_idx); nm_buf_len = slot->len; for (;;) { int copylen = nm_buf_len < iov_frag_size ? nm_buf_len : iov_frag_size; memcpy(iov_frag_buf, nm_buf, copylen); nm_buf += copylen; nm_buf_len -= copylen; iov_frag_buf += copylen; iov_frag_size -= copylen; totlen += copylen; if (nm_buf_len == 0) { break; } iov++; iovcnt--; if (iovcnt == 0) { /* No space to receive. */ WPRINTF(("Short iov, drop %zd bytes", totlen)); return (-ENOSPC); } iov_frag_buf = iov->iov_base; iov_frag_size = iov->iov_len; } head = nm_ring_next(ring, head); } while (slot->flags & NS_MOREFRAG); /* Release slots to netmap. */ ring->head = ring->cur = head; return (totlen); } static void netmap_recv_enable(struct net_backend *be) { struct netmap_priv *priv = (struct netmap_priv *)be->opaque; mevent_enable(priv->mevp); } static void netmap_recv_disable(struct net_backend *be) { struct netmap_priv *priv = (struct netmap_priv *)be->opaque; mevent_disable(priv->mevp); } static struct net_backend netmap_backend = { .prefix = "netmap", .priv_size = sizeof(struct netmap_priv), .init = netmap_init, .cleanup = netmap_cleanup, .send = netmap_send, + .peek_recvlen = netmap_peek_recvlen, .recv = netmap_recv, .recv_enable = netmap_recv_enable, .recv_disable = netmap_recv_disable, .get_cap = netmap_get_cap, .set_cap = netmap_set_cap, }; /* A clone of the netmap backend, with a different prefix. */ static struct net_backend vale_backend = { .prefix = "vale", .priv_size = sizeof(struct netmap_priv), .init = netmap_init, .cleanup = netmap_cleanup, .send = netmap_send, + .peek_recvlen = netmap_peek_recvlen, .recv = netmap_recv, .recv_enable = netmap_recv_enable, .recv_disable = netmap_recv_disable, .get_cap = netmap_get_cap, .set_cap = netmap_set_cap, }; DATA_SET(net_backend_set, netmap_backend); DATA_SET(net_backend_set, vale_backend); /* * Initialize a backend and attach to the frontend. * This is called during frontend initialization. * @pbe is a pointer to the backend to be initialized * @devname is the backend-name as supplied on the command line, * e.g. -s 2:0,frontend-name,backend-name[,other-args] * @cb is the receive callback supplied by the frontend, * and it is invoked in the event loop when a receive * event is generated in the hypervisor, * @param is a pointer to the frontend, and normally used as * the argument for the callback. */ int netbe_init(struct net_backend **ret, const char *devname, net_be_rxeof_t cb, void *param) { struct net_backend **pbe, *nbe, *tbe = NULL; int err; /* * Find the network backend that matches the user-provided * device name. net_backend_set is built using a linker set. */ SET_FOREACH(pbe, net_backend_set) { if (strncmp(devname, (*pbe)->prefix, strlen((*pbe)->prefix)) == 0) { tbe = *pbe; assert(tbe->init != NULL); assert(tbe->cleanup != NULL); assert(tbe->send != NULL); assert(tbe->recv != NULL); assert(tbe->get_cap != NULL); assert(tbe->set_cap != NULL); break; } } *ret = NULL; if (tbe == NULL) return (EINVAL); nbe = calloc(1, sizeof(*nbe) + tbe->priv_size); *nbe = *tbe; /* copy the template */ nbe->fd = -1; nbe->sc = param; nbe->be_vnet_hdr_len = 0; nbe->fe_vnet_hdr_len = 0; /* Initialize the backend. */ err = nbe->init(nbe, devname, cb, param); if (err) { free(nbe); return (err); } *ret = nbe; return (0); } void netbe_cleanup(struct net_backend *be) { if (be != NULL) { be->cleanup(be); free(be); } } uint64_t netbe_get_cap(struct net_backend *be) { assert(be != NULL); return (be->get_cap(be)); } int netbe_set_cap(struct net_backend *be, uint64_t features, unsigned vnet_hdr_len) { int ret; assert(be != NULL); /* There are only three valid lengths, i.e., 0, 10 and 12. */ if (vnet_hdr_len && vnet_hdr_len != VNET_HDR_LEN && vnet_hdr_len != (VNET_HDR_LEN - sizeof(uint16_t))) return (-1); be->fe_vnet_hdr_len = vnet_hdr_len; ret = be->set_cap(be, features, vnet_hdr_len); assert(be->be_vnet_hdr_len == 0 || be->be_vnet_hdr_len == be->fe_vnet_hdr_len); return (ret); } ssize_t netbe_send(struct net_backend *be, const struct iovec *iov, int iovcnt) { return (be->send(be, iov, iovcnt)); +} + +ssize_t +netbe_peek_recvlen(struct net_backend *be) +{ + + return (be->peek_recvlen(be)); } /* * Try to read a packet from the backend, without blocking. * If no packets are available, return 0. In case of success, return * the length of the packet just read. Return -1 in case of errors. */ ssize_t netbe_recv(struct net_backend *be, const struct iovec *iov, int iovcnt) { return (be->recv(be, iov, iovcnt)); } /* * Read a packet from the backend and discard it. * Returns the size of the discarded packet or zero if no packet was available. * A negative error code is returned in case of read error. */ ssize_t netbe_rx_discard(struct net_backend *be) { /* * MP note: the dummybuf is only used to discard frames, * so there is no need for it to be per-vtnet or locked. * We only make it large enough for TSO-sized segment. */ static uint8_t dummybuf[65536 + 64]; struct iovec iov; iov.iov_base = dummybuf; iov.iov_len = sizeof(dummybuf); return netbe_recv(be, &iov, 1); } void netbe_rx_disable(struct net_backend *be) { return be->recv_disable(be); } void netbe_rx_enable(struct net_backend *be) { return be->recv_enable(be); } size_t netbe_get_vnet_hdr_len(struct net_backend *be) { return (be->be_vnet_hdr_len); } Index: projects/clang1000-import/usr.sbin/bhyve/net_backends.h =================================================================== --- projects/clang1000-import/usr.sbin/bhyve/net_backends.h (revision 358238) +++ projects/clang1000-import/usr.sbin/bhyve/net_backends.h (revision 358239) @@ -1,92 +1,93 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2019 Vincenzo Maffione * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef __NET_BACKENDS_H__ #define __NET_BACKENDS_H__ #include /* Opaque type representing a network backend. */ typedef struct net_backend net_backend_t; /* Interface between network frontends and the network backends. */ typedef void (*net_be_rxeof_t)(int, enum ev_type, void *param); int netbe_init(net_backend_t **be, const char *devname, net_be_rxeof_t cb, void *param); void netbe_cleanup(net_backend_t *be); uint64_t netbe_get_cap(net_backend_t *be); int netbe_set_cap(net_backend_t *be, uint64_t cap, unsigned vnet_hdr_len); size_t netbe_get_vnet_hdr_len(net_backend_t *be); ssize_t netbe_send(net_backend_t *be, const struct iovec *iov, int iovcnt); +ssize_t netbe_peek_recvlen(net_backend_t *be); ssize_t netbe_recv(net_backend_t *be, const struct iovec *iov, int iovcnt); ssize_t netbe_rx_discard(net_backend_t *be); void netbe_rx_disable(net_backend_t *be); void netbe_rx_enable(net_backend_t *be); /* * Network device capabilities taken from the VirtIO standard. * Despite the name, these capabilities can be used by different frontents * (virtio-net, ptnet) and supported by different backends (netmap, tap, ...). */ #define VIRTIO_NET_F_CSUM (1 << 0) /* host handles partial cksum */ #define VIRTIO_NET_F_GUEST_CSUM (1 << 1) /* guest handles partial cksum */ #define VIRTIO_NET_F_MAC (1 << 5) /* host supplies MAC */ #define VIRTIO_NET_F_GSO_DEPREC (1 << 6) /* deprecated: host handles GSO */ #define VIRTIO_NET_F_GUEST_TSO4 (1 << 7) /* guest can rcv TSOv4 */ #define VIRTIO_NET_F_GUEST_TSO6 (1 << 8) /* guest can rcv TSOv6 */ #define VIRTIO_NET_F_GUEST_ECN (1 << 9) /* guest can rcv TSO with ECN */ #define VIRTIO_NET_F_GUEST_UFO (1 << 10) /* guest can rcv UFO */ #define VIRTIO_NET_F_HOST_TSO4 (1 << 11) /* host can rcv TSOv4 */ #define VIRTIO_NET_F_HOST_TSO6 (1 << 12) /* host can rcv TSOv6 */ #define VIRTIO_NET_F_HOST_ECN (1 << 13) /* host can rcv TSO with ECN */ #define VIRTIO_NET_F_HOST_UFO (1 << 14) /* host can rcv UFO */ #define VIRTIO_NET_F_MRG_RXBUF (1 << 15) /* host can merge RX buffers */ #define VIRTIO_NET_F_STATUS (1 << 16) /* config status field available */ #define VIRTIO_NET_F_CTRL_VQ (1 << 17) /* control channel available */ #define VIRTIO_NET_F_CTRL_RX (1 << 18) /* control channel RX mode support */ #define VIRTIO_NET_F_CTRL_VLAN (1 << 19) /* control channel VLAN filtering */ #define VIRTIO_NET_F_GUEST_ANNOUNCE \ (1 << 21) /* guest can send gratuitous pkts */ /* * Fixed network header size */ struct virtio_net_rxhdr { uint8_t vrh_flags; uint8_t vrh_gso_type; uint16_t vrh_hdr_len; uint16_t vrh_gso_size; uint16_t vrh_csum_start; uint16_t vrh_csum_offset; uint16_t vrh_bufs; } __packed; #endif /* __NET_BACKENDS_H__ */ Index: projects/clang1000-import/usr.sbin/bhyve/pci_virtio_net.c =================================================================== --- projects/clang1000-import/usr.sbin/bhyve/pci_virtio_net.c (revision 358238) +++ projects/clang1000-import/usr.sbin/bhyve/pci_virtio_net.c (revision 358239) @@ -1,709 +1,719 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 2011 NetApp, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY NETAPP, INC ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL NETAPP, INC OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include /* IFNAMSIZ */ #include #include #include #include #include #include #include #include #include #include #include #include #include "bhyverun.h" #include "debug.h" #include "pci_emul.h" #include "mevent.h" #include "virtio.h" #include "net_utils.h" #include "net_backends.h" #include "iov.h" #define VTNET_RINGSZ 1024 #define VTNET_MAXSEGS 256 #define VTNET_MAX_PKT_LEN (65536 + 64) #define VTNET_S_HOSTCAPS \ ( VIRTIO_NET_F_MAC | VIRTIO_NET_F_STATUS | \ VIRTIO_F_NOTIFY_ON_EMPTY | VIRTIO_RING_F_INDIRECT_DESC) /* * PCI config-space "registers" */ struct virtio_net_config { uint8_t mac[6]; uint16_t status; } __packed; /* * Queue definitions. */ #define VTNET_RXQ 0 #define VTNET_TXQ 1 #define VTNET_CTLQ 2 /* NB: not yet supported */ #define VTNET_MAXQ 3 /* * Debug printf */ static int pci_vtnet_debug; #define DPRINTF(params) if (pci_vtnet_debug) PRINTLN params #define WPRINTF(params) PRINTLN params /* * Per-device softc */ struct pci_vtnet_softc { struct virtio_softc vsc_vs; struct vqueue_info vsc_queues[VTNET_MAXQ - 1]; pthread_mutex_t vsc_mtx; net_backend_t *vsc_be; int resetting; /* protected by tx_mtx */ uint64_t vsc_features; /* negotiated features */ pthread_mutex_t rx_mtx; int rx_merge; /* merged rx bufs in use */ pthread_t tx_tid; pthread_mutex_t tx_mtx; pthread_cond_t tx_cond; int tx_in_progress; size_t vhdrlen; size_t be_vhdrlen; struct virtio_net_config vsc_config; struct virtio_consts vsc_consts; }; static void pci_vtnet_reset(void *); /* static void pci_vtnet_notify(void *, struct vqueue_info *); */ static int pci_vtnet_cfgread(void *, int, int, uint32_t *); static int pci_vtnet_cfgwrite(void *, int, int, uint32_t); static void pci_vtnet_neg_features(void *, uint64_t); static struct virtio_consts vtnet_vi_consts = { "vtnet", /* our name */ VTNET_MAXQ - 1, /* we currently support 2 virtqueues */ sizeof(struct virtio_net_config), /* config reg size */ pci_vtnet_reset, /* reset */ NULL, /* device-wide qnotify -- not used */ pci_vtnet_cfgread, /* read PCI config */ pci_vtnet_cfgwrite, /* write PCI config */ pci_vtnet_neg_features, /* apply negotiated features */ VTNET_S_HOSTCAPS, /* our capabilities */ }; static void pci_vtnet_reset(void *vsc) { struct pci_vtnet_softc *sc = vsc; DPRINTF(("vtnet: device reset requested !")); /* Acquire the RX lock to block RX processing. */ pthread_mutex_lock(&sc->rx_mtx); /* * Make sure receive operation is disabled at least until we * re-negotiate the features, since receive operation depends * on the value of sc->rx_merge and the header length, which * are both set in pci_vtnet_neg_features(). * Receive operation will be enabled again once the guest adds * the first receive buffers and kicks us. */ netbe_rx_disable(sc->vsc_be); /* Set sc->resetting and give a chance to the TX thread to stop. */ pthread_mutex_lock(&sc->tx_mtx); sc->resetting = 1; while (sc->tx_in_progress) { pthread_mutex_unlock(&sc->tx_mtx); usleep(10000); pthread_mutex_lock(&sc->tx_mtx); } /* * Now reset rings, MSI-X vectors, and negotiated capabilities. * Do that with the TX lock held, since we need to reset * sc->resetting. */ vi_reset_dev(&sc->vsc_vs); sc->resetting = 0; pthread_mutex_unlock(&sc->tx_mtx); pthread_mutex_unlock(&sc->rx_mtx); } static __inline struct iovec * iov_trim_hdr(struct iovec *iov, int *iovcnt, unsigned int hlen) { struct iovec *riov; if (iov[0].iov_len < hlen) { /* * Not enough header space in the first fragment. * That's not ok for us. */ return NULL; } iov[0].iov_len -= hlen; if (iov[0].iov_len == 0) { *iovcnt -= 1; if (*iovcnt == 0) { /* * Only space for the header. That's not * enough for us. */ return NULL; } riov = &iov[1]; } else { iov[0].iov_base = (void *)((uintptr_t)iov[0].iov_base + hlen); riov = &iov[0]; } return (riov); } struct virtio_mrg_rxbuf_info { uint16_t idx; uint16_t pad; uint32_t len; }; static void pci_vtnet_rx(struct pci_vtnet_softc *sc) { int prepend_hdr_len = sc->vhdrlen - sc->be_vhdrlen; struct virtio_mrg_rxbuf_info info[VTNET_MAXSEGS]; struct iovec iov[VTNET_MAXSEGS + 1]; struct vqueue_info *vq; - uint32_t riov_bytes; - struct iovec *riov; - int riov_len; - uint32_t ulen; - int n_chains; - int len; vq = &sc->vsc_queues[VTNET_RXQ]; for (;;) { struct virtio_net_rxhdr *hdr; + uint32_t riov_bytes; + struct iovec *riov; + uint32_t ulen; + int riov_len; + int n_chains; + ssize_t rlen; + ssize_t plen; + plen = netbe_peek_recvlen(sc->vsc_be); + if (plen <= 0) { + /* + * No more packets (plen == 0), or backend errored + * (plen < 0). Interrupt if needed and stop. + */ + vq_endchains(vq, /*used_all_avail=*/0); + return; + } + plen += prepend_hdr_len; + /* * Get a descriptor chain to store the next ingress * packet. In case of mergeable rx buffers, get as * many chains as necessary in order to make room - * for a maximum sized LRO packet. + * for plen bytes. */ riov_bytes = 0; riov_len = 0; riov = iov; n_chains = 0; do { int n = vq_getchain(vq, &info[n_chains].idx, riov, VTNET_MAXSEGS - riov_len, NULL); if (n == 0) { /* * No rx buffers. Enable RX kicks and double * check. */ vq_kick_enable(vq); if (!vq_has_descs(vq)) { /* * Still no buffers. Return the unused * chains (if any), interrupt if needed * (including for NOTIFY_ON_EMPTY), and * disable the backend until the next * kick. */ vq_retchains(vq, n_chains); vq_endchains(vq, /*used_all_avail=*/1); netbe_rx_disable(sc->vsc_be); return; } /* More rx buffers found, so keep going. */ vq_kick_disable(vq); continue; } assert(n >= 1 && riov_len + n <= VTNET_MAXSEGS); riov_len += n; if (!sc->rx_merge) { n_chains = 1; break; } info[n_chains].len = (uint32_t)count_iov(riov, n); riov_bytes += info[n_chains].len; riov += n; n_chains++; - } while (riov_bytes < VTNET_MAX_PKT_LEN && - riov_len < VTNET_MAXSEGS); + } while (riov_bytes < plen && riov_len < VTNET_MAXSEGS); riov = iov; hdr = riov[0].iov_base; if (prepend_hdr_len > 0) { /* * The frontend uses a virtio-net header, but the * backend does not. We need to prepend a zeroed * header. */ riov = iov_trim_hdr(riov, &riov_len, prepend_hdr_len); if (riov == NULL) { /* * The first collected chain is nonsensical, * as it is not even enough to store the * virtio-net header. Just drop it. */ vq_relchain(vq, info[0].idx, 0); vq_retchains(vq, n_chains - 1); continue; } memset(hdr, 0, prepend_hdr_len); } - len = netbe_recv(sc->vsc_be, riov, riov_len); - - if (len <= 0) { + rlen = netbe_recv(sc->vsc_be, riov, riov_len); + if (rlen != plen - prepend_hdr_len) { /* - * No more packets (len == 0), or backend errored - * (err < 0). Return unused available buffers - * and stop. + * If this happens it means there is something + * wrong with the backend (e.g., some other + * process is stealing our packets). */ + WPRINTF(("netbe_recv: expected %zd bytes, " + "got %zd", plen - prepend_hdr_len, rlen)); vq_retchains(vq, n_chains); - /* Interrupt if needed/appropriate and stop. */ - vq_endchains(vq, /*used_all_avail=*/0); - return; + continue; } - ulen = (uint32_t)(len + prepend_hdr_len); + ulen = (uint32_t)plen; /* * Publish the used buffers to the guest, reporting the * number of bytes that we wrote. */ if (!sc->rx_merge) { vq_relchain(vq, info[0].idx, ulen); } else { uint32_t iolen; int i = 0; do { iolen = info[i].len; if (iolen > ulen) { iolen = ulen; } vq_relchain_prepare(vq, info[i].idx, iolen); ulen -= iolen; i++; - assert(i <= n_chains); } while (ulen > 0); hdr->vrh_bufs = i; vq_relchain_publish(vq); - vq_retchains(vq, n_chains - i); + assert(i == n_chains); } } } /* * Called when there is read activity on the backend file descriptor. * Each buffer posted by the guest is assumed to be able to contain * an entire ethernet frame + rx header. */ static void pci_vtnet_rx_callback(int fd, enum ev_type type, void *param) { struct pci_vtnet_softc *sc = param; pthread_mutex_lock(&sc->rx_mtx); pci_vtnet_rx(sc); pthread_mutex_unlock(&sc->rx_mtx); } /* Called on RX kick. */ static void pci_vtnet_ping_rxq(void *vsc, struct vqueue_info *vq) { struct pci_vtnet_softc *sc = vsc; /* * A qnotify means that the rx process can now begin. */ pthread_mutex_lock(&sc->rx_mtx); vq_kick_disable(vq); netbe_rx_enable(sc->vsc_be); pthread_mutex_unlock(&sc->rx_mtx); } /* TX virtqueue processing, called by the TX thread. */ static void pci_vtnet_proctx(struct pci_vtnet_softc *sc, struct vqueue_info *vq) { struct iovec iov[VTNET_MAXSEGS + 1]; struct iovec *siov = iov; uint16_t idx; ssize_t len; int n; /* * Obtain chain of descriptors. The first descriptor also * contains the virtio-net header. */ n = vq_getchain(vq, &idx, iov, VTNET_MAXSEGS, NULL); assert(n >= 1 && n <= VTNET_MAXSEGS); if (sc->vhdrlen != sc->be_vhdrlen) { /* * The frontend uses a virtio-net header, but the backend * does not. We simply strip the header and ignore it, as * it should be zero-filled. */ siov = iov_trim_hdr(siov, &n, sc->vhdrlen); } if (siov == NULL) { /* The chain is nonsensical. Just drop it. */ len = 0; } else { len = netbe_send(sc->vsc_be, siov, n); if (len < 0) { /* * If send failed, report that 0 bytes * were read. */ len = 0; } } /* * Return the processed chain to the guest, reporting * the number of bytes that we read. */ vq_relchain(vq, idx, len); } /* Called on TX kick. */ static void pci_vtnet_ping_txq(void *vsc, struct vqueue_info *vq) { struct pci_vtnet_softc *sc = vsc; /* * Any ring entries to process? */ if (!vq_has_descs(vq)) return; /* Signal the tx thread for processing */ pthread_mutex_lock(&sc->tx_mtx); vq_kick_disable(vq); if (sc->tx_in_progress == 0) pthread_cond_signal(&sc->tx_cond); pthread_mutex_unlock(&sc->tx_mtx); } /* * Thread which will handle processing of TX desc */ static void * pci_vtnet_tx_thread(void *param) { struct pci_vtnet_softc *sc = param; struct vqueue_info *vq; int error; vq = &sc->vsc_queues[VTNET_TXQ]; /* * Let us wait till the tx queue pointers get initialised & * first tx signaled */ pthread_mutex_lock(&sc->tx_mtx); error = pthread_cond_wait(&sc->tx_cond, &sc->tx_mtx); assert(error == 0); for (;;) { /* note - tx mutex is locked here */ while (sc->resetting || !vq_has_descs(vq)) { vq_kick_enable(vq); if (!sc->resetting && vq_has_descs(vq)) break; sc->tx_in_progress = 0; error = pthread_cond_wait(&sc->tx_cond, &sc->tx_mtx); assert(error == 0); } vq_kick_disable(vq); sc->tx_in_progress = 1; pthread_mutex_unlock(&sc->tx_mtx); do { /* * Run through entries, placing them into * iovecs and sending when an end-of-packet * is found */ pci_vtnet_proctx(sc, vq); } while (vq_has_descs(vq)); /* * Generate an interrupt if needed. */ vq_endchains(vq, /*used_all_avail=*/1); pthread_mutex_lock(&sc->tx_mtx); } } #ifdef notyet static void pci_vtnet_ping_ctlq(void *vsc, struct vqueue_info *vq) { DPRINTF(("vtnet: control qnotify!")); } #endif static int pci_vtnet_init(struct vmctx *ctx, struct pci_devinst *pi, char *opts) { struct pci_vtnet_softc *sc; char tname[MAXCOMLEN + 1]; int mac_provided; /* * Allocate data structures for further virtio initializations. * sc also contains a copy of vtnet_vi_consts, since capabilities * change depending on the backend. */ sc = calloc(1, sizeof(struct pci_vtnet_softc)); sc->vsc_consts = vtnet_vi_consts; pthread_mutex_init(&sc->vsc_mtx, NULL); sc->vsc_queues[VTNET_RXQ].vq_qsize = VTNET_RINGSZ; sc->vsc_queues[VTNET_RXQ].vq_notify = pci_vtnet_ping_rxq; sc->vsc_queues[VTNET_TXQ].vq_qsize = VTNET_RINGSZ; sc->vsc_queues[VTNET_TXQ].vq_notify = pci_vtnet_ping_txq; #ifdef notyet sc->vsc_queues[VTNET_CTLQ].vq_qsize = VTNET_RINGSZ; sc->vsc_queues[VTNET_CTLQ].vq_notify = pci_vtnet_ping_ctlq; #endif /* * Attempt to open the backend device and read the MAC address * if specified. */ mac_provided = 0; if (opts != NULL) { char *devname; char *vtopts; int err = 0; /* Get the device name. */ devname = vtopts = strdup(opts); (void) strsep(&vtopts, ","); /* * Parse the list of options in the form * key1=value1,...,keyN=valueN. */ while (vtopts != NULL) { char *value = vtopts; char *key; key = strsep(&value, "="); if (value == NULL) break; vtopts = value; (void) strsep(&vtopts, ","); if (strcmp(key, "mac") == 0) { err = net_parsemac(value, sc->vsc_config.mac); if (err) break; mac_provided = 1; } } if (err) { free(devname); free(sc); return (err); } err = netbe_init(&sc->vsc_be, devname, pci_vtnet_rx_callback, sc); free(devname); if (err) { free(sc); return (err); } - sc->vsc_consts.vc_hv_caps |= netbe_get_cap(sc->vsc_be); + sc->vsc_consts.vc_hv_caps |= VIRTIO_NET_F_MRG_RXBUF | + netbe_get_cap(sc->vsc_be); } if (!mac_provided) { net_genmac(pi, sc->vsc_config.mac); } /* initialize config space */ pci_set_cfgdata16(pi, PCIR_DEVICE, VIRTIO_DEV_NET); pci_set_cfgdata16(pi, PCIR_VENDOR, VIRTIO_VENDOR); pci_set_cfgdata8(pi, PCIR_CLASS, PCIC_NETWORK); pci_set_cfgdata16(pi, PCIR_SUBDEV_0, VIRTIO_TYPE_NET); pci_set_cfgdata16(pi, PCIR_SUBVEND_0, VIRTIO_VENDOR); /* Link is up if we managed to open backend device. */ sc->vsc_config.status = (opts == NULL || sc->vsc_be); vi_softc_linkup(&sc->vsc_vs, &sc->vsc_consts, sc, pi, sc->vsc_queues); sc->vsc_vs.vs_mtx = &sc->vsc_mtx; /* use BAR 1 to map MSI-X table and PBA, if we're using MSI-X */ if (vi_intr_init(&sc->vsc_vs, 1, fbsdrun_virtio_msix())) { free(sc); return (1); } /* use BAR 0 to map config regs in IO space */ vi_set_io_bar(&sc->vsc_vs, 0); sc->resetting = 0; sc->rx_merge = 0; sc->vhdrlen = sizeof(struct virtio_net_rxhdr) - 2; pthread_mutex_init(&sc->rx_mtx, NULL); /* * Initialize tx semaphore & spawn TX processing thread. * As of now, only one thread for TX desc processing is * spawned. */ sc->tx_in_progress = 0; pthread_mutex_init(&sc->tx_mtx, NULL); pthread_cond_init(&sc->tx_cond, NULL); pthread_create(&sc->tx_tid, NULL, pci_vtnet_tx_thread, (void *)sc); snprintf(tname, sizeof(tname), "vtnet-%d:%d tx", pi->pi_slot, pi->pi_func); pthread_set_name_np(sc->tx_tid, tname); return (0); } static int pci_vtnet_cfgwrite(void *vsc, int offset, int size, uint32_t value) { struct pci_vtnet_softc *sc = vsc; void *ptr; if (offset < (int)sizeof(sc->vsc_config.mac)) { assert(offset + size <= (int)sizeof(sc->vsc_config.mac)); /* * The driver is allowed to change the MAC address */ ptr = &sc->vsc_config.mac[offset]; memcpy(ptr, &value, size); } else { /* silently ignore other writes */ DPRINTF(("vtnet: write to readonly reg %d", offset)); } return (0); } static int pci_vtnet_cfgread(void *vsc, int offset, int size, uint32_t *retval) { struct pci_vtnet_softc *sc = vsc; void *ptr; ptr = (uint8_t *)&sc->vsc_config + offset; memcpy(retval, ptr, size); return (0); } static void pci_vtnet_neg_features(void *vsc, uint64_t negotiated_features) { struct pci_vtnet_softc *sc = vsc; sc->vsc_features = negotiated_features; if (negotiated_features & VIRTIO_NET_F_MRG_RXBUF) { sc->vhdrlen = sizeof(struct virtio_net_rxhdr); sc->rx_merge = 1; } else { /* * Without mergeable rx buffers, virtio-net header is 2 * bytes shorter than sizeof(struct virtio_net_rxhdr). */ sc->vhdrlen = sizeof(struct virtio_net_rxhdr) - 2; sc->rx_merge = 0; } /* Tell the backend to enable some capabilities it has advertised. */ netbe_set_cap(sc->vsc_be, negotiated_features, sc->vhdrlen); sc->be_vhdrlen = netbe_get_vnet_hdr_len(sc->vsc_be); assert(sc->be_vhdrlen == 0 || sc->be_vhdrlen == sc->vhdrlen); } static struct pci_devemu pci_de_vnet = { .pe_emu = "virtio-net", .pe_init = pci_vtnet_init, .pe_barwrite = vi_pci_write, .pe_barread = vi_pci_read }; PCI_EMUL_SET(pci_de_vnet); Index: projects/clang1000-import/usr.sbin/iostat/iostat.c =================================================================== --- projects/clang1000-import/usr.sbin/iostat/iostat.c (revision 358238) +++ projects/clang1000-import/usr.sbin/iostat/iostat.c (revision 358239) @@ -1,1028 +1,1028 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1997, 1998, 2000, 2001 Kenneth D. Merry * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Parts of this program are derived from the original FreeBSD iostat * program: */ /*- * Copyright (c) 1986, 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Ideas for the new iostat statistics output modes taken from the NetBSD * version of iostat: */ /* * Copyright (c) 1996 John M. Vinopal * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed for the NetBSD Project * by John M. Vinopal. * 4. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static struct nlist namelist[] = { #define X_TTY_NIN 0 { .n_name = "_tty_nin", .n_type = 0, .n_other = 0, .n_desc = 0, .n_value = 0 }, #define X_TTY_NOUT 1 { .n_name = "_tty_nout", .n_type = 0, .n_other = 0, .n_desc = 0, .n_value = 0 }, #define X_BOOTTIME 2 { .n_name = "_boottime", .n_type = 0, .n_other = 0, .n_desc = 0, .n_value = 0 }, #define X_END 2 { .n_name = NULL, .n_type = 0, .n_other = 0, .n_desc = 0, .n_value = 0 }, }; #define IOSTAT_DEFAULT_ROWS 20 /* Traditional default `wrows' */ static struct statinfo cur, last; static int num_devices; static struct device_selection *dev_select; static int maxshowdevs; static volatile sig_atomic_t headercount; static volatile sig_atomic_t wresized; /* Tty resized, when non-zero. */ static volatile sig_atomic_t alarm_rang; static volatile sig_atomic_t return_requested; static unsigned short wrows; /* Current number of tty rows. */ static int dflag = 0, Iflag = 0, Cflag = 0, Tflag = 0, oflag = 0, Kflag = 0; static int xflag = 0, zflag = 0; /* local function declarations */ static void usage(void); static void needhdr(int signo); static void needresize(int signo); static void needreturn(int signo); static void alarm_clock(int signo); static void doresize(void); static void phdr(void); static void devstats(int perf_select, long double etime, int havelast); static void cpustats(void); static int readvar(kvm_t *kd, const char *name, int nlid, void *ptr, size_t len); static void usage(void) { /* * We also support the following 'traditional' syntax: * iostat [drives] [wait [count]] * This isn't mentioned in the man page, or the usage statement, * but it is supported. */ fprintf(stderr, "usage: iostat [-CdhIKoTxz?] [-c count] [-M core]" " [-n devs] [-N system]\n" "\t [-t type,if,pass] [-w wait] [drives]\n"); } int main(int argc, char **argv) { int c, i; int tflag = 0, hflag = 0, cflag = 0, wflag = 0, nflag = 0; int count = 0, waittime = 0; char *memf = NULL, *nlistf = NULL; struct devstat_match *matches; struct itimerval alarmspec; int num_matches = 0; char errbuf[_POSIX2_LINE_MAX]; kvm_t *kd = NULL; long generation; int num_devices_specified; int num_selected, num_selections; long select_generation; char **specified_devices; devstat_select_mode select_mode; float f; int havelast = 0; matches = NULL; maxshowdevs = 3; while ((c = getopt(argc, argv, "c:CdhIKM:n:N:ot:Tw:xz?")) != -1) { switch(c) { case 'c': cflag++; count = atoi(optarg); if (count < 1) errx(1, "count %d is < 1", count); break; case 'C': Cflag++; break; case 'd': dflag++; break; case 'h': hflag++; break; case 'I': Iflag++; break; case 'K': Kflag++; break; case 'M': memf = optarg; break; case 'n': nflag++; maxshowdevs = atoi(optarg); if (maxshowdevs < 0) errx(1, "number of devices %d is < 0", maxshowdevs); break; case 'N': nlistf = optarg; break; case 'o': oflag++; break; case 't': tflag++; if (devstat_buildmatch(optarg, &matches, &num_matches) != 0) errx(1, "%s", devstat_errbuf); break; case 'T': Tflag++; break; case 'w': wflag++; f = atof(optarg); waittime = f * 1000; if (waittime < 1) errx(1, "wait time is < 1ms"); break; case 'x': xflag++; break; case 'z': zflag++; break; default: usage(); exit(1); break; } } argc -= optind; argv += optind; if (nlistf != NULL || memf != NULL) { kd = kvm_openfiles(nlistf, memf, NULL, O_RDONLY, errbuf); if (kd == NULL) errx(1, "kvm_openfiles: %s", errbuf); if (kvm_nlist(kd, namelist) == -1) errx(1, "kvm_nlist: %s", kvm_geterr(kd)); } /* * Make sure that the userland devstat version matches the kernel * devstat version. If not, exit and print a message informing * the user of his mistake. */ if (devstat_checkversion(kd) < 0) errx(1, "%s", devstat_errbuf); /* * Make sure Tflag and/or Cflag are set if dflag == 0. If dflag is * greater than 0, they may be 0 or non-zero. */ if (dflag == 0 && xflag == 0) { Cflag = 1; Tflag = 1; } /* find out how many devices we have */ if ((num_devices = devstat_getnumdevs(kd)) < 0) err(1, "can't get number of devices"); /* * Figure out how many devices we should display. */ if (nflag == 0) { if (xflag > 0) maxshowdevs = num_devices; else if (oflag > 0) { if ((dflag > 0) && (Cflag == 0) && (Tflag == 0)) maxshowdevs = 5; else if ((dflag > 0) && (Tflag > 0) && (Cflag == 0)) maxshowdevs = 5; else maxshowdevs = 4; } else { if ((dflag > 0) && (Cflag == 0)) maxshowdevs = 4; else maxshowdevs = 3; } } cur.dinfo = (struct devinfo *)calloc(1, sizeof(struct devinfo)); if (cur.dinfo == NULL) err(1, "calloc failed"); last.dinfo = (struct devinfo *)calloc(1, sizeof(struct devinfo)); if (last.dinfo == NULL) err(1, "calloc failed"); /* * Grab all the devices. We don't look to see if the list has * changed here, since it almost certainly has. We only look for * errors. */ if (devstat_getdevs(kd, &cur) == -1) errx(1, "%s", devstat_errbuf); num_devices = cur.dinfo->numdevs; generation = cur.dinfo->generation; /* * If the user specified any devices on the command line, see if * they are in the list of devices we have now. */ specified_devices = (char **)malloc(sizeof(char *)); if (specified_devices == NULL) err(1, "malloc failed"); for (num_devices_specified = 0; *argv; ++argv) { if (isdigit(**argv)) break; num_devices_specified++; specified_devices = (char **)realloc(specified_devices, sizeof(char *) * num_devices_specified); if (specified_devices == NULL) err(1, "realloc failed"); specified_devices[num_devices_specified - 1] = *argv; } if (nflag == 0 && maxshowdevs < num_devices_specified) maxshowdevs = num_devices_specified; dev_select = NULL; if ((num_devices_specified == 0) && (num_matches == 0)) select_mode = DS_SELECT_ADD; else select_mode = DS_SELECT_ONLY; /* * At this point, selectdevs will almost surely indicate that the * device list has changed, so we don't look for return values of 0 * or 1. If we get back -1, though, there is an error. */ if (devstat_selectdevs(&dev_select, &num_selected, &num_selections, &select_generation, generation, cur.dinfo->devices, num_devices, matches, num_matches, specified_devices, num_devices_specified, select_mode, maxshowdevs, hflag) == -1) errx(1, "%s", devstat_errbuf); /* * Look for the traditional wait time and count arguments. */ if (*argv) { f = atof(*argv); waittime = f * 1000; /* Let the user know he goofed, but keep going anyway */ if (wflag != 0) warnx("discarding previous wait interval, using" " %g instead", waittime / 1000.0); wflag++; if (*++argv) { count = atoi(*argv); if (cflag != 0) warnx("discarding previous count, using %d" " instead", count); cflag++; } else count = -1; } /* * If the user specified a count, but not an interval, we default * to an interval of 1 second. */ if ((wflag == 0) && (cflag > 0)) waittime = 1 * 1000; /* * If the user specified a wait time, but not a count, we want to * go on ad infinitum. This can be redundant if the user uses the * traditional method of specifying the wait, since in that case we * already set count = -1 above. Oh well. */ if ((wflag > 0) && (cflag == 0)) count = -1; bzero(cur.cp_time, sizeof(cur.cp_time)); cur.tk_nout = 0; cur.tk_nin = 0; /* * Set the snap time to the system boot time (ie: zero), so the * stats are calculated since system boot. */ cur.snap_time = 0; /* * If the user stops the program (control-Z) and then resumes it, * print out the header again. */ (void)signal(SIGCONT, needhdr); /* * If our standard output is a tty, then install a SIGWINCH handler * and set wresized so that our first iteration through the main * iostat loop will peek at the terminal's current rows to find out * how many lines can fit in a screenful of output. */ if (isatty(fileno(stdout)) != 0) { wresized = 1; (void)signal(SIGWINCH, needresize); } else { wresized = 0; wrows = IOSTAT_DEFAULT_ROWS; } /* * Register a SIGINT handler so that we can print out final statistics * when we get that signal */ (void)signal(SIGINT, needreturn); /* * Register a SIGALRM handler to implement sleeps if the user uses the * -c or -w options */ (void)signal(SIGALRM, alarm_clock); alarmspec.it_interval.tv_sec = waittime / 1000; alarmspec.it_interval.tv_usec = 1000 * (waittime % 1000); alarmspec.it_value.tv_sec = waittime / 1000; alarmspec.it_value.tv_usec = 1000 * (waittime % 1000); setitimer(ITIMER_REAL, &alarmspec, NULL); for (headercount = 1;;) { struct devinfo *tmp_dinfo; long tmp; long double etime; sigset_t sigmask, oldsigmask; if (Tflag > 0) { if ((readvar(kd, "kern.tty_nin", X_TTY_NIN, &cur.tk_nin, sizeof(cur.tk_nin)) != 0) || (readvar(kd, "kern.tty_nout", X_TTY_NOUT, &cur.tk_nout, sizeof(cur.tk_nout))!= 0)) { Tflag = 0; warnx("disabling TTY statistics"); } } if (Cflag > 0) { if (kd == NULL) { if (readvar(kd, "kern.cp_time", 0, &cur.cp_time, sizeof(cur.cp_time)) != 0) Cflag = 0; } else { if (kvm_getcptime(kd, cur.cp_time) < 0) { warnx("kvm_getcptime: %s", kvm_geterr(kd)); Cflag = 0; } } if (Cflag == 0) warnx("disabling CPU time statistics"); } if (!--headercount) { phdr(); if (wresized != 0) doresize(); headercount = wrows; } tmp_dinfo = last.dinfo; last.dinfo = cur.dinfo; cur.dinfo = tmp_dinfo; last.snap_time = cur.snap_time; /* * Here what we want to do is refresh our device stats. * devstat_getdevs() returns 1 when the device list has changed. * If the device list has changed, we want to go through * the selection process again, in case a device that we * were previously displaying has gone away. */ switch (devstat_getdevs(kd, &cur)) { case -1: errx(1, "%s", devstat_errbuf); break; case 1: { int retval; num_devices = cur.dinfo->numdevs; generation = cur.dinfo->generation; retval = devstat_selectdevs(&dev_select, &num_selected, &num_selections, &select_generation, generation, cur.dinfo->devices, num_devices, matches, num_matches, specified_devices, num_devices_specified, select_mode, maxshowdevs, hflag); switch(retval) { case -1: errx(1, "%s", devstat_errbuf); break; case 1: phdr(); if (wresized != 0) doresize(); headercount = wrows; break; default: break; } break; } default: break; } /* * We only want to re-select devices if we're in 'top' * mode. This is the only mode where the devices selected * could actually change. */ if (hflag > 0) { int retval; retval = devstat_selectdevs(&dev_select, &num_selected, &num_selections, &select_generation, generation, cur.dinfo->devices, num_devices, matches, num_matches, specified_devices, num_devices_specified, select_mode, maxshowdevs, hflag); switch(retval) { case -1: errx(1,"%s", devstat_errbuf); break; case 1: phdr(); if (wresized != 0) doresize(); headercount = wrows; break; default: break; } } if (Tflag > 0) { tmp = cur.tk_nin; cur.tk_nin -= last.tk_nin; last.tk_nin = tmp; tmp = cur.tk_nout; cur.tk_nout -= last.tk_nout; last.tk_nout = tmp; } etime = cur.snap_time - last.snap_time; if (etime == 0.0) etime = 1.0; for (i = 0; i < CPUSTATES; i++) { tmp = cur.cp_time[i]; cur.cp_time[i] -= last.cp_time[i]; last.cp_time[i] = tmp; } if (xflag == 0 && Tflag > 0) printf("%4.0Lf %5.0Lf", cur.tk_nin / etime, cur.tk_nout / etime); devstats(hflag, etime, havelast); if (xflag == 0) { if (Cflag > 0) cpustats(); printf("\n"); } fflush(stdout); if ((count >= 0 && --count <= 0) || return_requested) break; /* * Use sigsuspend to safely sleep until either signal is * received */ alarm_rang = 0; sigemptyset(&sigmask); sigaddset(&sigmask, SIGINT); sigaddset(&sigmask, SIGALRM); sigprocmask(SIG_BLOCK, &sigmask, &oldsigmask); while (! (alarm_rang || return_requested) ) { sigsuspend(&oldsigmask); } sigprocmask(SIG_UNBLOCK, &sigmask, NULL); havelast = 1; } exit(0); } /* * Force a header to be prepended to the next output. */ void needhdr(int signo __unused) { headercount = 1; } /* * When the terminal is resized, force an update of the maximum number of rows * printed between each header repetition. Then force a new header to be * prepended to the next output. */ void needresize(int signo __unused) { wresized = 1; headercount = 1; } /* * Record the alarm so the main loop can break its sleep */ void alarm_clock(int signo __unused) { alarm_rang = 1; } /* * Request that the main loop exit soon */ void needreturn(int signo __unused) { return_requested = 1; } /* * Update the global `wrows' count of terminal rows. */ void doresize(void) { int status; struct winsize w; for (;;) { status = ioctl(fileno(stdout), TIOCGWINSZ, &w); if (status == -1 && errno == EINTR) continue; else if (status == -1) err(1, "ioctl"); if (w.ws_row > 3) wrows = w.ws_row - 3; else wrows = IOSTAT_DEFAULT_ROWS; break; } /* * Inhibit doresize() calls until we are rescheduled by SIGWINCH. */ wresized = 0; } static void phdr(void) { int i, printed; char devbuf[256]; /* * If xflag is set, we need a per-loop header, not a page header, so * just return. We'll print the header in devstats(). */ if (xflag > 0) return; if (Tflag > 0) (void)printf(" tty"); for (i = 0, printed=0;(i < num_devices) && (printed < maxshowdevs);i++){ int di; if ((dev_select[i].selected != 0) && (dev_select[i].selected <= maxshowdevs)) { di = dev_select[i].position; snprintf(devbuf, sizeof(devbuf), "%s%d", cur.dinfo->devices[di].device_name, cur.dinfo->devices[di].unit_number); if (oflag > 0) (void)printf("%13.6s ", devbuf); else printf("%16.6s ", devbuf); printed++; } } if (Cflag > 0) (void)printf(" cpu\n"); else (void)printf("\n"); if (Tflag > 0) (void)printf(" tin tout"); for (i=0, printed = 0;(i < num_devices) && (printed < maxshowdevs);i++){ if ((dev_select[i].selected != 0) && (dev_select[i].selected <= maxshowdevs)) { if (oflag > 0) { if (Iflag == 0) (void)printf(" sps tps msps "); else (void)printf(" blk xfr msps "); } else { if (Iflag == 0) printf(" KB/t tps MB/s "); else printf(" KB/t xfrs MB "); } printed++; } } if (Cflag > 0) (void)printf(" us ni sy in id\n"); else printf("\n"); } static void devstats(int perf_select, long double etime, int havelast) { int dn; long double transfers_per_second, transfers_per_second_read; long double transfers_per_second_write; long double kb_per_transfer, mb_per_second, mb_per_second_read; long double mb_per_second_write; u_int64_t total_bytes, total_transfers, total_blocks; u_int64_t total_bytes_read, total_transfers_read; u_int64_t total_bytes_write, total_transfers_write; long double busy_pct, busy_time; u_int64_t queue_len; long double total_mb, blocks_per_second, total_duration; long double ms_per_other, ms_per_read, ms_per_write, ms_per_transaction; int firstline = 1; char *devicename; if (xflag > 0) { if (Cflag > 0) { printf(" cpu\n"); printf(" us ni sy in id\n"); cpustats(); printf("\n"); } printf(" extended device statistics "); if (Tflag > 0) printf(" tty "); printf("\n"); if (Iflag == 0) { printf("device r/s w/s kr/s kw/s " " ms/r ms/w ms/o ms/t qlen %%b "); } else { printf("device r/i w/i kr/i" " kw/i qlen tsvc_t/i sb/i "); } if (Tflag > 0) printf("tin tout "); printf("\n"); } for (dn = 0; dn < num_devices; dn++) { int di; if (((perf_select == 0) && (dev_select[dn].selected == 0)) || (dev_select[dn].selected > maxshowdevs)) continue; di = dev_select[dn].position; if (devstat_compute_statistics(&cur.dinfo->devices[di], havelast ? &last.dinfo->devices[di] : NULL, etime, DSM_TOTAL_BYTES, &total_bytes, DSM_TOTAL_BYTES_READ, &total_bytes_read, DSM_TOTAL_BYTES_WRITE, &total_bytes_write, DSM_TOTAL_TRANSFERS, &total_transfers, DSM_TOTAL_TRANSFERS_READ, &total_transfers_read, DSM_TOTAL_TRANSFERS_WRITE, &total_transfers_write, DSM_TOTAL_BLOCKS, &total_blocks, DSM_KB_PER_TRANSFER, &kb_per_transfer, DSM_TRANSFERS_PER_SECOND, &transfers_per_second, DSM_TRANSFERS_PER_SECOND_READ, &transfers_per_second_read, DSM_TRANSFERS_PER_SECOND_WRITE, &transfers_per_second_write, DSM_MB_PER_SECOND, &mb_per_second, DSM_MB_PER_SECOND_READ, &mb_per_second_read, DSM_MB_PER_SECOND_WRITE, &mb_per_second_write, DSM_BLOCKS_PER_SECOND, &blocks_per_second, DSM_MS_PER_TRANSACTION, &ms_per_transaction, DSM_MS_PER_TRANSACTION_READ, &ms_per_read, DSM_MS_PER_TRANSACTION_WRITE, &ms_per_write, DSM_MS_PER_TRANSACTION_OTHER, &ms_per_other, DSM_BUSY_PCT, &busy_pct, DSM_QUEUE_LENGTH, &queue_len, DSM_TOTAL_DURATION, &total_duration, DSM_TOTAL_BUSY_TIME, &busy_time, DSM_NONE) != 0) errx(1, "%s", devstat_errbuf); if (perf_select != 0) { dev_select[dn].bytes = total_bytes; if ((dev_select[dn].selected == 0) || (dev_select[dn].selected > maxshowdevs)) continue; } if (Kflag > 0 || xflag > 0) { int block_size = cur.dinfo->devices[di].block_size; total_blocks = total_blocks * (block_size ? block_size : 512) / 1024; } if (xflag > 0) { if (asprintf(&devicename, "%s%d", cur.dinfo->devices[di].device_name, cur.dinfo->devices[di].unit_number) == -1) err(1, "asprintf"); /* * If zflag is set, skip any devices with zero I/O. */ if (zflag == 0 || transfers_per_second_read > 0.05 || transfers_per_second_write > 0.05 || mb_per_second_read > ((long double).0005)/1024 || mb_per_second_write > ((long double).0005)/1024 || busy_pct > 0.5) { if (Iflag == 0) printf("%-8.8s %7d %7d %8.1Lf " "%8.1Lf %5d %5d %5d %5d " "%4" PRIu64 " %3.0Lf ", devicename, (int)transfers_per_second_read, (int)transfers_per_second_write, mb_per_second_read * 1024, mb_per_second_write * 1024, (int)ms_per_read, (int)ms_per_write, (int)ms_per_other, (int)ms_per_transaction, queue_len, busy_pct); else printf("%-8.8s %11.1Lf %11.1Lf " "%12.1Lf %12.1Lf %4" PRIu64 " %10.1Lf %9.1Lf ", devicename, (long double)total_transfers_read, (long double)total_transfers_write, (long double) total_bytes_read / 1024, (long double) total_bytes_write / 1024, queue_len, total_duration, busy_time); if (firstline) { /* * If this is the first device * we're printing, also print * CPU or TTY stats if requested. */ firstline = 0; if (Tflag > 0) printf("%4.0Lf%5.0Lf", cur.tk_nin / etime, cur.tk_nout / etime); } printf("\n"); } free(devicename); } else if (oflag > 0) { - int msdig = (ms_per_transaction < 100.0) ? 1 : 0; + int msdig = (ms_per_transaction < 99.94) ? 1 : 0; if (Iflag == 0) printf("%4.0Lf%4.0Lf%5.*Lf ", blocks_per_second, transfers_per_second, msdig, ms_per_transaction); else printf("%4.1" PRIu64 "%4.1" PRIu64 "%5.*Lf ", total_blocks, total_transfers, msdig, ms_per_transaction); } else { if (Iflag == 0) printf(" %4.*Lf %4.0Lf %5.*Lf ", kb_per_transfer >= 100 ? 0 : 1, kb_per_transfer, transfers_per_second, mb_per_second >= 1000 ? 0 : 1, mb_per_second); else { total_mb = total_bytes; total_mb /= 1024 * 1024; printf(" %4.1Lf %4.1" PRIu64 " %5.2Lf ", kb_per_transfer, total_transfers, total_mb); } } } if (xflag > 0 && zflag > 0 && firstline == 1 && (Tflag > 0 || Cflag > 0)) { /* * If zflag is set and we did not print any device * lines I/O because they were all zero, * print TTY/CPU stats. */ printf("%52s",""); if (Tflag > 0) printf("%4.0Lf %5.0Lf", cur.tk_nin / etime, cur.tk_nout / etime); if (Cflag > 0) cpustats(); printf("\n"); } } static void cpustats(void) { int state; double cptime; cptime = 0.0; for (state = 0; state < CPUSTATES; ++state) cptime += cur.cp_time[state]; for (state = 0; state < CPUSTATES; ++state) printf(" %2.0f", rint(100. * cur.cp_time[state] / (cptime ? cptime : 1))); } static int readvar(kvm_t *kd, const char *name, int nlid, void *ptr, size_t len) { if (kd != NULL) { ssize_t nbytes; nbytes = kvm_read(kd, namelist[nlid].n_value, ptr, len); if (nbytes < 0) { warnx("kvm_read(%s): %s", namelist[nlid].n_name, kvm_geterr(kd)); return (1); } else if ((size_t)nbytes != len) { warnx("kvm_read(%s): expected %zu bytes, got %zd bytes", namelist[nlid].n_name, len, nbytes); return (1); } } else { size_t nlen = len; if (sysctlbyname(name, ptr, &nlen, NULL, 0) == -1) { warn("sysctl(%s...) failed", name); return (1); } if (nlen != len) { warnx("sysctl(%s...): expected %lu, got %lu", name, (unsigned long)len, (unsigned long)nlen); return (1); } } return (0); } Index: projects/clang1000-import/usr.sbin/pstat/pstat.c =================================================================== --- projects/clang1000-import/usr.sbin/pstat/pstat.c (revision 358238) +++ projects/clang1000-import/usr.sbin/pstat/pstat.c (revision 358239) @@ -1,597 +1,611 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1980, 1991, 1993, 1994 * The Regents of the University of California. All rights reserved. * Copyright (c) 2002 Networks Associates Technologies, Inc. * All rights reserved. * * Portions of this software were developed for the FreeBSD Project by * ThinkSec AS and NAI Labs, the Security Research Division of Network * Associates, Inc. under DARPA/SPAWAR contract N66001-01-C-8035 * ("CBOSS"), as part of the DARPA CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #if 0 #ifndef lint static const char copyright[] = "@(#) Copyright (c) 1980, 1991, 1993, 1994\n\ The Regents of the University of California. All rights reserved.\n"; #endif /* not lint */ #ifndef lint static char sccsid[] = "@(#)pstat.c 8.16 (Berkeley) 5/9/95"; #endif /* not lint */ #endif #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include enum { NL_CONSTTY, NL_MAXFILES, NL_NFILES, NL_TTY_LIST, NL_MARKER }; static struct { int order; const char *name; } namelist[] = { { NL_CONSTTY, "_constty" }, { NL_MAXFILES, "_maxfiles" }, { NL_NFILES, "_openfiles" }, { NL_TTY_LIST, "_tty_list" }, { NL_MARKER, "" }, }; #define NNAMES (sizeof(namelist) / sizeof(*namelist)) static struct nlist nl[NNAMES]; +#define SIZEHDR "Size" + static int humanflag; static int usenumflag; static int totalflag; static int swapflag; static char *nlistf; static char *memf; static kvm_t *kd; static const char *usagestr; static void filemode(void); static int getfiles(struct xfile **, size_t *); static void swapmode(void); static void ttymode(void); static void ttyprt(struct xtty *); static void usage(void); int main(int argc, char *argv[]) { int ch, quit, ret; int fileflag, ttyflag; unsigned int i; char buf[_POSIX2_LINE_MAX]; const char *opts; fileflag = swapflag = ttyflag = 0; /* We will behave like good old swapinfo if thus invoked */ opts = strrchr(argv[0], '/'); if (opts) opts++; else opts = argv[0]; if (!strcmp(opts, "swapinfo")) { swapflag = 1; opts = "ghkmM:N:"; usagestr = "swapinfo [-ghkm] [-M core [-N system]]"; } else { opts = "TM:N:fghkmnst"; usagestr = "pstat [-Tfghkmnst] [-M core [-N system]]"; } while ((ch = getopt(argc, argv, opts)) != -1) switch (ch) { case 'f': fileflag = 1; break; case 'g': setenv("BLOCKSIZE", "1G", 1); break; case 'h': humanflag = 1; break; case 'k': setenv("BLOCKSIZE", "1K", 1); break; case 'm': setenv("BLOCKSIZE", "1M", 1); break; case 'M': memf = optarg; break; case 'N': nlistf = optarg; break; case 'n': usenumflag = 1; break; case 's': ++swapflag; break; case 'T': totalflag = 1; break; case 't': ttyflag = 1; break; default: usage(); } /* * Initialize symbol names list. */ for (i = 0; i < NNAMES; i++) nl[namelist[i].order].n_name = strdup(namelist[i].name); if (memf != NULL) { kd = kvm_openfiles(nlistf, memf, NULL, O_RDONLY, buf); if (kd == NULL) errx(1, "kvm_openfiles: %s", buf); if ((ret = kvm_nlist(kd, nl)) != 0) { if (ret == -1) errx(1, "kvm_nlist: %s", kvm_geterr(kd)); quit = 0; for (i = 0; nl[i].n_name[0] != '\0'; ++i) if (nl[i].n_value == 0) { quit = 1; warnx("undefined symbol: %s", nl[i].n_name); } if (quit) exit(1); } } if (!(fileflag | ttyflag | swapflag | totalflag)) usage(); if (fileflag || totalflag) filemode(); if (ttyflag) ttymode(); if (swapflag || totalflag) swapmode(); exit (0); } static void usage(void) { fprintf(stderr, "usage: %s\n", usagestr); exit (1); } static const char fhdr32[] = " LOC TYPE FLG CNT MSG DATA OFFSET\n"; /* c0000000 ------ RWAI 123 123 c0000000 1000000000000000 */ static const char fhdr64[] = " LOC TYPE FLG CNT MSG DATA OFFSET\n"; /* c000000000000000 ------ RWAI 123 123 c000000000000000 1000000000000000 */ static const char hdr[] = " LINE INQ CAN LIN LOW OUTQ USE LOW COL SESS PGID STATE\n"; static void ttymode_kvm(void) { TAILQ_HEAD(, tty) tl; struct tty *tp, tty; struct xtty xt; (void)printf("%s", hdr); bzero(&xt, sizeof xt); xt.xt_size = sizeof xt; if (kvm_read(kd, nl[NL_TTY_LIST].n_value, &tl, sizeof tl) != sizeof tl) errx(1, "kvm_read(): %s", kvm_geterr(kd)); tp = TAILQ_FIRST(&tl); while (tp != NULL) { if (kvm_read(kd, (u_long)tp, &tty, sizeof tty) != sizeof tty) errx(1, "kvm_read(): %s", kvm_geterr(kd)); xt.xt_insize = tty.t_inq.ti_nblocks * TTYINQ_DATASIZE; xt.xt_incc = tty.t_inq.ti_linestart - tty.t_inq.ti_begin; xt.xt_inlc = tty.t_inq.ti_end - tty.t_inq.ti_linestart; xt.xt_inlow = tty.t_inlow; xt.xt_outsize = tty.t_outq.to_nblocks * TTYOUTQ_DATASIZE; xt.xt_outcc = tty.t_outq.to_end - tty.t_outq.to_begin; xt.xt_outlow = tty.t_outlow; xt.xt_column = tty.t_column; /* xt.xt_pgid = ... */ /* xt.xt_sid = ... */ xt.xt_flags = tty.t_flags; xt.xt_dev = (uint32_t)NODEV; ttyprt(&xt); tp = TAILQ_NEXT(&tty, t_list); } } static void ttymode_sysctl(void) { struct xtty *xttys; size_t len; unsigned int i, n; (void)printf("%s", hdr); if ((xttys = malloc(len = sizeof(*xttys))) == NULL) err(1, "malloc()"); while (sysctlbyname("kern.ttys", xttys, &len, 0, 0) == -1) { if (errno != ENOMEM) err(1, "sysctlbyname()"); len *= 2; if ((xttys = realloc(xttys, len)) == NULL) err(1, "realloc()"); } n = len / sizeof(*xttys); for (i = 0; i < n; i++) ttyprt(&xttys[i]); } static void ttymode(void) { if (kd != NULL) ttymode_kvm(); else ttymode_sysctl(); } static struct { int flag; char val; } ttystates[] = { #if 0 { TF_NOPREFIX, 'N' }, #endif { TF_INITLOCK, 'I' }, { TF_CALLOUT, 'C' }, /* Keep these together -> 'Oi' and 'Oo'. */ { TF_OPENED, 'O' }, { TF_OPENED_IN, 'i' }, { TF_OPENED_OUT, 'o' }, { TF_OPENED_CONS, 'c' }, { TF_GONE, 'G' }, { TF_OPENCLOSE, 'B' }, { TF_ASYNC, 'Y' }, { TF_LITERAL, 'L' }, /* Keep these together -> 'Hi' and 'Ho'. */ { TF_HIWAT, 'H' }, { TF_HIWAT_IN, 'i' }, { TF_HIWAT_OUT, 'o' }, { TF_STOPPED, 'S' }, { TF_EXCLUDE, 'X' }, { TF_BYPASS, 'l' }, { TF_ZOMBIE, 'Z' }, { TF_HOOK, 's' }, /* Keep these together -> 'bi' and 'bo'. */ { TF_BUSY, 'b' }, { TF_BUSY_IN, 'i' }, { TF_BUSY_OUT, 'o' }, { 0, '\0'}, }; static void ttyprt(struct xtty *xt) { int i, j; const char *name; if (xt->xt_size != sizeof *xt) errx(1, "struct xtty size mismatch"); if (usenumflag || xt->xt_dev == 0 || (name = devname(xt->xt_dev, S_IFCHR)) == NULL) printf("%#10jx ", (uintmax_t)xt->xt_dev); else printf("%10s ", name); printf("%5zu %4zu %4zu %4zu %5zu %4zu %4zu %5u %5d %5d ", xt->xt_insize, xt->xt_incc, xt->xt_inlc, (xt->xt_insize - xt->xt_inlow), xt->xt_outsize, xt->xt_outcc, (xt->xt_outsize - xt->xt_outlow), MIN(xt->xt_column, 99999), xt->xt_sid, xt->xt_pgid); for (i = j = 0; ttystates[i].flag; i++) if (xt->xt_flags & ttystates[i].flag) { putchar(ttystates[i].val); j++; } if (j == 0) putchar('-'); putchar('\n'); } static void filemode(void) { struct xfile *fp, *buf; char flagbuf[16], *fbp; int maxf, openf; size_t len; static char const * const dtypes[] = { "???", "inode", "socket", "pipe", "fifo", "kqueue", "crypto" }; int i; int wid; if (kd != NULL) { if (kvm_read(kd, nl[NL_MAXFILES].n_value, &maxf, sizeof maxf) != sizeof maxf || kvm_read(kd, nl[NL_NFILES].n_value, &openf, sizeof openf) != sizeof openf) errx(1, "kvm_read(): %s", kvm_geterr(kd)); } else { len = sizeof(int); if (sysctlbyname("kern.maxfiles", &maxf, &len, 0, 0) == -1 || sysctlbyname("kern.openfiles", &openf, &len, 0, 0) == -1) err(1, "sysctlbyname()"); } if (totalflag) { (void)printf("%3d/%3d files\n", openf, maxf); return; } if (getfiles(&buf, &len) == -1) return; openf = len / sizeof *fp; (void)printf("%d/%d open files\n", openf, maxf); printf(sizeof(uintptr_t) == 4 ? fhdr32 : fhdr64); wid = (int)sizeof(uintptr_t) * 2; for (fp = (struct xfile *)buf, i = 0; i < openf; ++fp, ++i) { if ((size_t)fp->xf_type >= sizeof(dtypes) / sizeof(dtypes[0])) continue; (void)printf("%*jx", wid, (uintmax_t)(uintptr_t)fp->xf_file); (void)printf(" %-6.6s", dtypes[fp->xf_type]); fbp = flagbuf; if (fp->xf_flag & FREAD) *fbp++ = 'R'; if (fp->xf_flag & FWRITE) *fbp++ = 'W'; if (fp->xf_flag & FAPPEND) *fbp++ = 'A'; if (fp->xf_flag & FASYNC) *fbp++ = 'I'; *fbp = '\0'; (void)printf(" %4s %3d", flagbuf, fp->xf_count); (void)printf(" %3d", fp->xf_msgcount); (void)printf(" %*jx", wid, (uintmax_t)(uintptr_t)fp->xf_data); (void)printf(" %*jx\n", (int)sizeof(fp->xf_offset) * 2, (uintmax_t)fp->xf_offset); } free(buf); } static int getfiles(struct xfile **abuf, size_t *alen) { struct xfile *buf; size_t len; int mib[2]; /* * XXX * Add emulation of KINFO_FILE here. */ if (kd != NULL) errx(1, "files on dead kernel, not implemented"); mib[0] = CTL_KERN; mib[1] = KERN_FILE; if (sysctl(mib, 2, NULL, &len, NULL, 0) == -1) { warn("sysctl: KERN_FILE"); return (-1); } if ((buf = malloc(len)) == NULL) errx(1, "malloc"); if (sysctl(mib, 2, buf, &len, NULL, 0) == -1) { warn("sysctl: KERN_FILE"); return (-1); } *abuf = buf; *alen = len; return (0); } /* * swapmode is based on a program called swapinfo written * by Kevin Lahey . */ #define CONVERT(v) ((int64_t)(v) * pagesize / blocksize) #define CONVERT_BLOCKS(v) ((int64_t)(v) * pagesize) static struct kvm_swap swtot; static int nswdev; static void print_swap_header(void) { int hlen; long blocksize; const char *header; - header = getbsize(&hlen, &blocksize); + if (humanflag) { + header = SIZEHDR; + hlen = sizeof(SIZEHDR); + } else { + header = getbsize(&hlen, &blocksize); + } if (totalflag == 0) (void)printf("%-15s %*s %8s %8s %8s\n", "Device", hlen, header, "Used", "Avail", "Capacity"); } static void print_swap_line(const char *swdevname, intmax_t nblks, intmax_t bused, intmax_t bavail, float bpercent) { char usedbuf[5]; char availbuf[5]; + char sizebuf[5]; int hlen, pagesize; long blocksize; pagesize = getpagesize(); getbsize(&hlen, &blocksize); - printf("%-15s %*jd ", swdevname, hlen, CONVERT(nblks)); + printf("%-15s ", swdevname); if (humanflag) { + humanize_number(sizebuf, sizeof(sizebuf), + CONVERT_BLOCKS(nblks), "", + HN_AUTOSCALE, HN_B | HN_NOSPACE | HN_DECIMAL); humanize_number(usedbuf, sizeof(usedbuf), CONVERT_BLOCKS(bused), "", HN_AUTOSCALE, HN_B | HN_NOSPACE | HN_DECIMAL); humanize_number(availbuf, sizeof(availbuf), CONVERT_BLOCKS(bavail), "", HN_AUTOSCALE, HN_B | HN_NOSPACE | HN_DECIMAL); - printf("%8s %8s %5.0f%%\n", usedbuf, availbuf, bpercent); + printf("%8s %8s %8s %5.0f%%\n", sizebuf, + usedbuf, availbuf, bpercent); } else { - printf("%8jd %8jd %5.0f%%\n", (intmax_t)CONVERT(bused), + printf("%*jd %8jd %8jd %5.0f%%\n", hlen, + (intmax_t)CONVERT(nblks), + (intmax_t)CONVERT(bused), (intmax_t)CONVERT(bavail), bpercent); } } static void print_swap(struct kvm_swap *ksw) { swtot.ksw_total += ksw->ksw_total; swtot.ksw_used += ksw->ksw_used; ++nswdev; if (totalflag == 0) print_swap_line(ksw->ksw_devname, ksw->ksw_total, ksw->ksw_used, ksw->ksw_total - ksw->ksw_used, (ksw->ksw_used * 100.0) / ksw->ksw_total); } static void print_swap_total(void) { int hlen, pagesize; long blocksize; pagesize = getpagesize(); getbsize(&hlen, &blocksize); if (totalflag) { blocksize = 1024 * 1024; (void)printf("%jdM/%jdM swap space\n", CONVERT(swtot.ksw_used), CONVERT(swtot.ksw_total)); } else if (nswdev > 1) { print_swap_line("Total", swtot.ksw_total, swtot.ksw_used, swtot.ksw_total - swtot.ksw_used, (swtot.ksw_used * 100.0) / swtot.ksw_total); } } static void swapmode_kvm(void) { struct kvm_swap kswap[16]; int i, n; n = kvm_getswapinfo(kd, kswap, sizeof kswap / sizeof kswap[0], SWIF_DEV_PREFIX); print_swap_header(); for (i = 0; i < n; ++i) print_swap(&kswap[i]); print_swap_total(); } static void swapmode_sysctl(void) { struct kvm_swap ksw; struct xswdev xsw; size_t mibsize, size; int mib[16], n; print_swap_header(); mibsize = sizeof mib / sizeof mib[0]; if (sysctlnametomib("vm.swap_info", mib, &mibsize) == -1) err(1, "sysctlnametomib()"); for (n = 0; ; ++n) { mib[mibsize] = n; size = sizeof xsw; if (sysctl(mib, mibsize + 1, &xsw, &size, NULL, 0) == -1) break; if (xsw.xsw_version != XSWDEV_VERSION) errx(1, "xswdev version mismatch"); if (xsw.xsw_dev == NODEV) snprintf(ksw.ksw_devname, sizeof ksw.ksw_devname, ""); else snprintf(ksw.ksw_devname, sizeof ksw.ksw_devname, "/dev/%s", devname(xsw.xsw_dev, S_IFCHR)); ksw.ksw_used = xsw.xsw_used; ksw.ksw_total = xsw.xsw_nblks; ksw.ksw_flags = xsw.xsw_flags; print_swap(&ksw); } if (errno != ENOENT) err(1, "sysctl()"); print_swap_total(); } static void swapmode(void) { if (kd != NULL) swapmode_kvm(); else swapmode_sysctl(); } Index: projects/clang1000-import =================================================================== --- projects/clang1000-import (revision 358238) +++ projects/clang1000-import (revision 358239) Property changes on: projects/clang1000-import ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r358179-358238