diff --git a/UPDATING b/UPDATING index 1980411c1853..f4e13d97006d 100644 --- a/UPDATING +++ b/UPDATING @@ -1,1964 +1,1976 @@ Updating Information for users of FreeBSD-CURRENT. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: https://docs.freebsd.org/en/books/handbook/cutting-edge/#makeworld Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before updating system packages and/or ports. NOTE TO PEOPLE WHO THINK THAT FreeBSD 14.x IS SLOW: FreeBSD 14.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS-related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define WITH_MALLOC_PRODUCTION in /etc/src.conf and rebuild world, or to merely disable the most expensive debugging functionality at runtime, run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) +20230619: + To enable pf rdr rules for connections initiated from the host, pf + filter rules can be optionally enabled for packets delivered + locally. This can change the behavior of rules which match packets + delivered to lo0. 
To enable this feature: + + sysctl net.pf.filter_local=1 + service pf restart + + When enabled, it's best to ensure that packets delivered locally are not + filtered, e.g. by adding a 'skip on lo' rule. + 20230613: Improvements to libtacplus(8) mean that tacplus.conf(5) now follows POSIX shell syntax rules. This may cause TACACS+ authentication to fail if the shared secret contains a single quote, double quote, or backslash character which isn't already properly quoted or escaped. 20230612: Belatedly switch the default nvme block device on x86 from nvd to nda. nda created nvd compatibility links by default, so this should be a nop. If this causes problems for your application, set hw.nvme.use_nvd=1 in your loader.conf or add `options NVME_USE_NVD=1` to your kernel config. To disable the nvd compatibility aliases, add kern.cam.nda.nvd_compat=0 to loader.conf. The default has been nda on all non-x86 platforms for some time now. If you need to fall back, please email imp@freebsd.org about why. 20230422: Remove portsnap(8). Users are encouraged to obtain the ports tree using git instead. 20230420: Add jobs.mk to save typing. Enables -j${JOB_MAX} and logging, e.g. make buildworld-jobs runs make -j${JOB_MAX} buildworld > ../buildworld.log 2>&1 where JOB_MAX is derived from ncpus in local.sys.mk if not set in env. 20230316: Video related devices for some arm devices have been renamed. If you have a custom kernel config and want to use hdmi output on IMX6 board you need to add "device dwc_hdmi" "device imx6_hdmi" and "device imx6_ipu" to it. If you have a custom kernel config and want to use hdmi output on TI AM335X board you need to add "device tda19988" to it. If you add "device hdmi" in it you need to remove it as it doesn't exist anymore. 20230221: Introduce new kernel options KBD_DELAY1 and KBD_DELAY2. See atkbdc(4) for details. 20230206: sshd now defaults to having X11Forwarding disabled, following upstream. 
Administrators who wish to enable X11Forwarding should add `X11Forwarding yes` to /etc/ssh/sshd_config. 20230204: Since commit 75d41cb6967 Huawei 3G/4G LTE Mobile Devices do not default to ECM, but NCM mode and need u3g and ucom modules loaded. See cdce(4). 20230130: As of commit 7c40e2d5f685, the dependency on netlink(4) has been added to the linux_common(4) module. Users relying on linux_common may need to compile the netlink(4) module if it is not present in their kernel. 20230126: The WITHOUT_CXX option has been removed. C++ components in the base system are now built unconditionally. 20230113: LinuxKPI pci.h changes may require out-of-tree drivers to be recompiled. Bump _FreeBSD_version to 1400078 to be able to detect this change. 20221212: llvm-objdump is now always installed as objdump. Previously there was no /usr/bin/objdump unless the WITH_LLVM_BINUTILS knob was used. Some LLVM objdump options have a different output format compared to GNU objdump; readelf is available for inspecting ELF files, and GNU objdump is available from the devel/binutils port or package. 20221205: dma(8) has replaced sendmail(8) as the default MTA. For people willing to reenable sendmail(8): $ cp /usr/share/examples/sendmail/mailer.conf /etc/mail/mailer.conf and add sendmail_enable="YES" to rc.conf. 20221204: hw.bus.disable_failed_devices has changed from 'false' to 'true' by default. Now if newbus succeeds in probing a device, but fails to attach the device, we'll disable the device. In the past, we'd keep retrying the device on each new driver loaded. To get that behavior now, one needs to use devctl to re-enable the device, and reprobe it (or set the sysctl/tunable hw.bus.disable_failed_devices=false). NOTE: This was reverted 20221205 due to unexpected compatibility issues. 20221122: pf no longer accepts 'scrub fragment crop' or 'scrub fragment drop-ovl'. These configurations are no longer automatically reinterpreted as 'scrub fragment reassemble'. 
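A ruleset can be checked for the removed scrub options before upgrading. The following is a minimal sketch, assuming a POSIX shell and that the options appear on a single line in the usual 'scrub ... fragment crop' form; the function name find_removed_scrub_opts is hypothetical and is not part of pf:

```shell
#!/bin/sh
# Hypothetical helper (not part of pf): print the lines of the pf.conf file
# named in $1 that still use 'fragment crop' or 'fragment drop-ovl'.
# Comment lines are skipped. The exit status is that of the final grep:
# 0 if at least one offending line was found, non-zero otherwise.
find_removed_scrub_opts() {
	grep -v '^[[:space:]]*#' "$1" |
	    grep -E 'fragment[[:space:]]+(crop|drop-ovl)'
}
```

For example, `find_removed_scrub_opts /etc/pf.conf` prints any rules that need to be rewritten, typically to 'fragment reassemble'.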
20221121: The WITHOUT_CLANG_IS_CC option has been removed. When Clang is enabled it is always installed as /usr/bin/cc (and c++, cpp). 20221026: Some programs have been moved into separate packages. It is recommended for pkgbase users to do: pkg install FreeBSD-dhclient FreeBSD-geom FreeBSD-resolvconf \ FreeBSD-devd FreeBSD-devmatch after upgrading to restore all the components that were previously installed. 20220610: LinuxKPI pm.h changes require an update to the latest drm-kmod version before re-compiling to avoid errors. 20211230: The macros provided for the manipulation of CPU sets (e.g. CPU_AND) have been modified to take 2 source arguments instead of only 1. Externally maintained sources that use these macros will have to be adapted. The FreeBSD version has been bumped to 1400046 to reflect this change. 20211214: A number of the kernel include files are able to be included by themselves. A test has been added to buildworld to enforce this. 20211209: Remove mips as a recognized target. This starts the decommissioning of mips support in FreeBSD. mips related items will be removed wholesale in the coming days and weeks. This broke the NO_CLEAN build for some people. Either do a clean build or touch lib/clang/include/llvm/Config/Targets.def lib/clang/include/llvm/Config/AsmParsers.def lib/clang/include/llvm/Config/Disassemblers.def lib/clang/include/llvm/Config/AsmPrinters.def before the build to force everything to rebuild that needs to. 20211202: Unbound support for RFC8375: The special-use domain 'home.arpa' is by default blocked. To unblock it use a local-zone nodefault statement in unbound.conf: local-zone: "home.arpa." nodefault Or use another type of local-zone to override with your choice. The reason for this is discussed in Section 6.1 of RFC8375: Because 'home.arpa.' 
is not globally scoped and cannot be secured using DNSSEC based on the root domain's trust anchor, there is no way to tell, using a standard DNS query, in which homenet scope an answer belongs. Consequently, users may experience surprising results with such names when roaming to different homenets. 20211110: Commit b8d60729deef changed the TCP congestion control framework so that any of the included congestion control modules could be the single module built into the kernel. Previously newreno was automatically built in through direct reference. As of this commit you are required to declare at least one congestion control module (e.g. 'options CC_NEWRENO') and to also declare a default using the CC_DEFAULT option (e.g. options CC_DEFAULT=\"newreno\"). The GENERIC configuration includes CC_NEWRENO and defines newreno as the default. If no congestion control option is built into the kernel and you are including networking, the kernel compile will fail. Also if no default is declared the kernel compile will fail. 20211118: Mips has been removed from universe builds. It will be removed from the tree shortly. 20211106: Commit f0c9847a6c47 changed the arguments for VOP_ALLOCATE. The NFS modules must be rebuilt from sources and any out of tree file systems that implement their own VOP_ALLOCATE may need to be modified. 20211022: The synchronous PPP kernel driver sppp(4) has been removed. The cp(4) and ce(4) drivers are now always compiled with netgraph(4) support, formerly enabled by NETGRAPH_CRONYX option. 20211020: sh(1) is now the default shell for the root user. To force root to use the csh shell, please run the following command as root: # chsh -s csh 20211004: Ncurses distribution has been split between libtinfow and libncurses with libncurses.so becoming a linker (ld) script to seamlessly link to libtinfow as needed. Bump _FreeBSD_version to 1400035 to reflect this change. 20210923: As of commit 8160a0f62be6, the dummynet module no longer depends on the ipfw module. 
Dummynet can now be used by pf as well as ipfw. As such users who relied on this dependency may need to include ipfw in the list of modules to load on their systems. 20210922: As of commit 903873ce1560, the mixer(8) utility has got a slightly new syntax. Please refer to the mixer(8) manual page for more information. The old mixer utility can be installed from ports: audio/freebsd-13-mixer 20210911: As of commit 55089ef4f8bb, the global variable nfs_maxcopyrange has been deleted from the nfscommon.ko. As such, nfsd.ko must be built from up to date sources to avoid an undefined reference when being loaded. 20210817: As of commit 62ca9fc1ad56 OpenSSL no longer enables kernel TLS by default. Users can enable kernel TLS via the "KTLS" SSL option. This can be enabled globally by using a custom OpenSSL config file via OPENSSL_CONF or via an application-specific configuration option for applications which permit setting SSL options via SSL_CONF_cmd(3). 20210811: Commit 3ad1e1c1ce20 changed the internal KAPI between the NFS modules. Therefore, all need to be rebuilt from sources. 20210730: Commit b69019c14cd8 removes pf's DIOCGETSTATESNV ioctl. As of be70c7a50d32 it is no longer used by userspace, but it does mean users may not be able to enumerate pf states if they update the kernel past b69019c14cd8 without first updating userspace past be70c7a50d32. 20210729: As of commit 01ad0c007964 if_bridge member interfaces can no longer change their MTU. Changing the MTU of the bridge itself will change the MTU on all member interfaces instead. 20210716: Commit ee29e6f31111 changed the internal KAPI between the nfscommon and nfsd modules. Therefore, both need to be rebuilt from sources. Bump __FreeBSD_version to 1400026 for this KAPI change. 20210715: The 20210707 awk update brought in a change in behavior. This has been corrected as of d4d252c49976. 
Between these dates, if you installed a new awk binary, you may not be able to build a new kernel because the change in behavior affected the genoffset script used to build the kernel. If you did update, the fix is to update your sources past the above hash and do % cd usr.bin/awk % make clean all % sudo -E make install to enable building kernels again. 20210708: Commit 1e0a518d6548 changed the internal KAPI between the NFS modules. They all need to be rebuilt from sources. I did not bump __FreeBSD_version, since it was bumped recently. 20210707: awk has been updated to the latest one-true-awk version 20210215. This contains a number of minor bug fixes. 20210624: The NFSv4 client now uses the highest minor version of NFSv4 supported by the NFSv4 server by default instead of minor version 0, for NFSv4 mounts. The "minorversion" mount option may be used to override this default. 20210618: Bump __FreeBSD_version to 1400024 for LinuxKPI changes. Most notably netdev.h can change now as the (last) dependencies (mlx4/ofed) are now using struct ifnet directly, but also for PCI additions and others. 20210618: The directory "blacklisted" under /usr/share/certs/ has been renamed to "untrusted". 20210611: svnlite has been removed from base. Should you need svn for any reason please install the svn package or port. 20210611: Commit e1a907a25cfa changed the internal KAPI between the krpc and nfsserver. As such, both modules must be rebuilt from sources. Bump __FreeBSD_version to 1400022. 20210610: The an(4) driver has been removed from FreeBSD. 20210608: The vendor/openzfs branch was renamed to vendor/openzfs/legacy to start tracking OpenZFS upstream more closely. Please see https://lists.freebsd.org/archives/freebsd-current/2021-June/000153.html for details on how to correct any errors that might result. The short version is that you need to remove the old branch locally: git update-ref -d refs/remotes/freebsd/vendor/openzfs (assuming your upstream origin is named 'freebsd'). 
20210525: Commits 17accc08ae15 and de102f870501 add new files to LinuxKPI which break drm-kmod. In addition various other additions were committed. Bump __FreeBSD_version to 1400015 to be able to detect this. 20210513: Commit ca179c4d74f2 changed the package in which the OpenSSL libraries and utilities are packaged. It is recommended for pkgbase users to do: pkg install -f FreeBSD-openssl before pkg upgrade otherwise some dependencies might not be met and pkg will stop working as libssl will not be present anymore on the system. 20210426: Commit 875977314881 changed the internal KAPI between the nfsd and nfscommon modules. As such these modules need to be rebuilt from sources. Without this patch in your NFSv4.1/4.2 server, enabling delegations by setting vfs.nfsd.issue_delegations non-zero is not recommended. 20210411: Commit 7763814fc9c2 changed the internal KAPI between the krpc and NFS. As such, the krpc, nfscommon and nfscl modules must all be rebuilt from sources. Without this patch, NFSv4.1/4.2 mounts should not be done with the nfscbd(8) daemon running, to avoid needing a working back channel for server->client RPCs. 20210330: Commit 01ae8969a9ee fixed the NFSv4.1/4.2 server so that it handles binding of the back channel as required by RFC5661. Until this patch is in your server, avoid use of the "nconnects" mount option for Linux NFSv4.1/4.2 mounts. 20210225: For 64-bit architectures the base system is now built with Position Independent Executable (PIE) support enabled by default. It may be disabled using the WITHOUT_PIE knob. A clean build is required. 20210128: Various LinuxKPI functionality was added which conflicts with DRM. Please update your drm-kmod port to after the __FreeBSD_version 1400003 update. 20210108: PC Card attachments for all devices have been removed. In the case of wi and cmx, the entire drivers were removed because they were only PC Card devices. FreeBSD_version 1300134 should be used for this since it was bumped so recently. 
20210107: Transport-independent parts of HID support have been split off the USB code into a separate subsystem. Kernel configs which include any of the ums, ukbd, uhid, atp, wsp, wmt, uaudio, ugold or ucycom drivers should be updated by adding a "device hid" line. 20210105: ncurses installation has been modified to only keep the widechar enabled version. Incremental build is broken for that change, so it requires a clean build. 20201223: The FreeBSD project has migrated from Subversion to Git. Temporary instructions can be found at https://github.com/bsdimp/freebsd-git-docs/blob/main/src-cvt.md and other documents in that repo. 20201216: The services database has been updated to cover more of the basic services expected in a modern system. The database is big enough that it will cause issues in mergemaster in Releases previous to 12.2 and 11.3, or in very old current systems from before r358154. 20201215: Obsolete in-tree GDB 6.1.1 has been removed. GDB (including kgdb) may be installed from ports or packages. 20201124: ping6 has been merged into ping. It can now be called as "ping -6". See ping(8) for details. 20201108: Default value of net.add_addr_allfibs has been changed to 0. If you have multi-fib configuration and rely on existence of all interface routes in every fib, you need to set the above sysctl to 1. 20201030: The internal pre-processor in the calendar(1) program has been extended to support more C pre-processor commands (e.g. #ifdef, #else, and #undef) and to detect unbalanced conditional statements. Error messages have been extended to include the filename and line number if processing stops to help fix malformed data files. 20201026: All the data files for the calendar(1) program, except calendar.freebsd, have been moved to the deskutils/calendar-data port, much like the Jewish calendar entries were moved to deskutils/hebcal years ago. After make delete-old-files, you need to install it to retain full functionality. 
calendar(1) will issue a reminder for files it can't find. 20200923: LINT files are no longer generated. We now include the relevant NOTES files. Note: This may cause conflicts with updating in some cases. find sys -name LINT\* -delete is suggested across this commit to remove the generated LINT files. If you have tried to update with generated files there, the svn command you want to un-auger the tree is cd sys/amd64/conf svn revert -R . and then do the above find from the top level. Substitute 'amd64' above with where the error message indicates a conflict. 20200824: OpenZFS support has been integrated. Do not upgrade root pools until the loader is updated to support zstd. Furthermore, we caution against 'zpool upgrade' for the next few weeks. The change should be transparent unless you want to use new features. Not all "NO_CLEAN" build scenarios work across these changes. Many scenarios have been tested and fixed, but rebuilding kernels without rebuilding world may fail. The ZFS cache file has moved from /boot to /etc to match the OpenZFS upstream default. A fallback to /boot has been added for mountroot. Pool auto import behavior at boot has been moved from the kernel module to an explicit "zpool import -a" in one of the rc scripts enabled by zfs_enable=YES. This means your non-root zpools won't auto import until you upgrade your /etc/rc.d files. 20200824: The resume code now notifies devd with the 'kernel' system rather than the old 'kern' subsystem to be consistent with other use. The old notification will be created as well, but will be removed prior to FreeBSD 14.0. 20200821: r362275 changed the internal API between the kernel RPC and the NFS modules. As such, all the modules must be recompiled from sources. 20200817: r364330 modified the internal API used between the NFS modules. As such, all the NFS modules must be re-compiled from sources. 20200816: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 11.0.0. 
Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20200810: r364092 modified the internal ABI used between the kernel NFS modules. As such, all of these modules need to be rebuilt from sources, so a version bump was done. 20200807: Makefile.inc has been updated to work around the issue documented in 20200729. It was a case where the optimization of using symbolic links to point to binaries created a situation where we'd run new binaries with old libraries starting midway through the installworld process. 20200729: r363679 has redefined some undefined behavior in regcomp(3); notably, extraneous escapes of most ordinary characters will no longer be accepted. An exp-run has identified all of the problems with this in ports, but other non-ports software may need extra escapes removed to continue to function. Because of this change, installworld may encounter the following error from rtld: Undefined symbol "regcomp@FBSD_1.6" -- It is imperative that you do not halt installworld. Instead, let it run to completion (whether successful or not) and run installworld once more. 20200627: A new implementation of bc and dc has been imported in r362681. This implementation corrects non-conformant behavior of the previous bc and adds GNU bc compatible options. It offers a number of extensions, is much faster on large values, and has support for message catalogs (a number of languages are already supported, contributions of further languages welcome). The option WITHOUT_GH_BC can be used to build the world with the previous versions of bc and dc. 20200625: r362639 changed the internal API used between the NFS kernel modules. As such, they all need to be rebuilt from sources. 20200613: r362158 changed the arguments for VFS_CHECKEXP(). As such, any out of tree file systems need to be modified and rebuilt. Also, any file systems that are modules must be rebuilt. 
20200604: read(2) of a directory fd is now rejected by default. root may re-enable it for system root only on non-ZFS filesystems with the security.bsd.allow_read_dir sysctl(8) MIB if security.bsd.suser_enabled=1. It may be advised to setup aliases for grep to default to `-d skip` if commonly non-recursively grepping a list that includes directories and the potential for the resulting stderr output is not tolerable. Example aliases are now installed, commented out, in /root/.cshrc and /root/.shrc. 20200523: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 10.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20200512: Support for obsolete compilers has been removed from the build system. Clang 6 and GCC 6.4 are the minimum supported versions. 20200424: closefrom(2) has been moved under COMPAT12, and replaced in libc with a stub that calls close_range(2). If using a custom kernel configuration, you may want to ensure that the COMPAT_FREEBSD12 option is included, as a slightly older -CURRENT userland and older FreeBSD userlands may not be functional without closefrom(2). 20200414: Upstream DTS from Linux 5.6 was merged and they now have the SID and THS (Secure ID controller and THermal Sensor) node present. The DTB overlays have now been removed from the tree for the H3/H5 and A64 SoCs and the aw_sid and aw_thermal driver have been updated to deal with upstream DTS. If you are using those overlays you need to remove them from loader.conf and update the DTBs on the FAT partition. 20200310: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 10.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20200309: The amd(8) automount daemon has been removed from the source tree. 
As of FreeBSD 10.1 autofs(5) is the preferred tool for automounting. amd is still available in the sysutils/am-utils port. 20200301: Removed brooktree driver (bktr.4) from the tree. 20200229: The WITH_GPL_DTC option has been removed. The BSD-licenced device tree compiler in usr.bin/dtc is used on all architectures which use dtc, and the GPL dtc is available (if needed) from the sysutils/dtc port. 20200229: The WITHOUT_LLVM_LIBUNWIND option has been removed. LLVM's libunwind is used by all supported CPU architectures. 20200229: GCC 4.2.1 has been removed from the tree. The WITH_GCC, WITH_GCC_BOOTSTRAP, and WITH_GNUCXX options are no longer available. Users who wish to build FreeBSD with GCC must use the external toolchain ports or packages. 20200220: ncurses has been updated to a newer version (6.2-20200215). Given the ABI has changed, users will have to rebuild all the ports that are linked to ncurses. 20200217: The size of struct vnet and the magic cookie have changed. Users need to recompile libkvm and all modules using VIMAGE together with their new kernel. 20200212: Defining the long deprecated NO_CTF, NO_DEBUG_FILES, NO_INSTALLLIB, NO_MAN, NO_PROFILE, and NO_WARNS variables is now an error. Update your Makefiles and scripts to define MK_=no instead as required. One exception to this is that program or library Makefiles should define MAN to empty rather than setting MK_MAN=no. 20200108: Clang/LLVM is now the default compiler and LLD the default linker for riscv64. 20200107: make universe no longer uses GCC 4.2.1 on any architectures. Architectures not supported by in-tree Clang/LLVM require an external toolchain package. 20200104: GCC 4.2.1 is now not built by default, as part of the GCC 4.2.1 retirement plan. Specifically, the GCC, GCC_BOOTSTRAP, and GNUCXX options default to off for all supported CPU architectures. As a short-term transition aid they may be enabled via WITH_* options. GCC 4.2.1 is expected to be removed from the tree on 2020-03-31. 
20200102: Support for armv5 has been disconnected and is being removed. The machine combination MACHINE=arm MACHINE_ARCH=arm is no longer valid. You must now use a MACHINE_ARCH of armv6 or armv7. The default MACHINE_ARCH for MACHINE=arm is now armv7. 20191226: Clang/LLVM is now the default compiler for all powerpc architectures. LLD is now the default linker for powerpc64. The change for powerpc64 also includes a change to the ELFv2 ABI, incompatible with the existing ABI. 20191226: Kernel-loadable random(4) modules are no longer unloadable. 20191222: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 9.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20191212: r355677 has modified the internal interface used between the NFS modules in the kernel. As such, they must all be upgraded simultaneously. I will do a version bump for this. 20191205: The root certificates of the Mozilla CA Certificate Store have been imported into the base system and can be managed with the certctl(8) utility. If you have installed the security/ca_root_nss port or package with the ETCSYMLINK option (the default), be advised that there may be differences between those included in the port and those included in base due to differences in nss branch used as well as general update frequency. Note also that certctl(8) cannot manage certs in the format used by the security/ca_root_nss port. 20191120: The amd(8) automount daemon has been disabled by default, and will be removed in the future. As of FreeBSD 10.1 the autofs(5) is available for automounting. 20191107: The nctgpio and wbwd drivers have been moved to the superio bus. If you have one of these drivers in a kernel configuration, then you should add device superio to it. If you use one of these drivers as a module and you compile a custom set of modules, then you should add superio to the set. 
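A custom kernel configuration can be checked for the superio requirement described in the 20191107 entry. This is a minimal sketch, assuming a POSIX shell and a config file that lists one 'device' line per driver; the function name needs_superio is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the kernel config file named
# in $1 contains 'device nctgpio' or 'device wbwd' but has no
# 'device superio' line, i.e. when the config still needs updating.
needs_superio() {
	grep -Eq '^[[:space:]]*device[[:space:]]+(nctgpio|wbwd)([[:space:]]|$)' "$1" &&
	    ! grep -Eq '^[[:space:]]*device[[:space:]]+superio([[:space:]]|$)' "$1"
}
```

It could be used as `needs_superio MYKERNEL && echo "add: device superio"`, where MYKERNEL stands in for the path to your own config file.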
20191021: KPIs for network drivers to access interface addresses have changed. Users need to recompile NIC driver modules together with kernel. 20191021: The net.link.tap.user_open sysctl no longer prevents user opening of already created /dev/tapNN devices. Access is still controlled by node permissions, just like tun devices. The net.link.tap.user_open sysctl is now used only to allow users to perform devfs cloning of tap devices, and the subsequent open may not succeed if the user is not in the appropriate group. This sysctl may be deprecated/removed completely in the future. 20191009: mips, powerpc, and sparc64 are no longer built as part of universe / tinderbox unless MAKE_OBSOLETE_GCC is defined. If not defined, mips, powerpc, and sparc64 builds will look for the xtoolchain binaries and if installed use them for universe builds. As llvm 9.0 becomes vetted for these architectures, they will be removed from the list. 20191009: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 9.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20191003: The hpt27xx, hptmv, hptnr, and hptrr drivers have been removed from GENERIC. They are available as modules and can be loaded by adding to /boot/loader.conf hpt27xx_load="YES", hptmv_load="YES", hptnr_load="YES", or hptrr_load="YES", respectively. 20190913: ntpd no longer by default locks its pages in memory, allowing them to be paged out by the kernel. Use rlimit memlock to restore historic BSD behaviour. For example, add "rlimit memlock 32" to ntp.conf to lock up to 32 MB of ntpd address space in memory. 20190823: Several of ping6's options have been renamed for better consistency with ping. If you use any of -ARWXaghmrtwx, you must update your scripts. See ping6(8) for details. 
20190727: The vfs.fusefs.sync_unmount and vfs.fusefs.init_backgrounded sysctls and the "-o sync_unmount" and "-o init_backgrounded" mount options have been removed from mount_fusefs(8). You can safely remove them from your scripts, because they had no effect. The vfs.fusefs.fix_broken_io, vfs.fusefs.sync_resize, vfs.fusefs.refresh_size, vfs.fusefs.mmap_enable, vfs.fusefs.reclaim_revoked, and vfs.fusefs.data_cache_invalidate sysctls have been removed. If you felt the need to set any of them to a non-default value, please tell asomers@FreeBSD.org why. 20190713: Default permissions on the /var/account/acct file (and copies of it rotated by periodic daily scripts) are changed from 0644 to 0640 because the file contains sensitive information that should not be world-readable. If the /var/account directory must be created by rc.d/accounting, the mode used is now 0750. Admins who use the accounting feature are encouraged to change the mode of an existing /var/account directory to 0750 or 0700. 20190620: Entropy collection and the /dev/random device are no longer optional components. The "device random" option has been removed. Implementations of distilling algorithms can still be made loadable with "options RANDOM_LOADABLE" (e.g., random_fortuna.ko). 20190612: Clang, llvm, lld, lldb, compiler-rt, libc++, libunwind and openmp have been upgraded to 8.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20190608: A fix was applied to i386 kernel modules to avoid panics with dpcpu or vnet. Users need to recompile i386 kernel modules having pcpu or vnet sections or they will refuse to load. 20190513: User-wired pages now have their own counter, vm.stats.vm.v_user_wire_count. The vm.max_wired sysctl was renamed to vm.max_user_wired and changed from an unsigned int to an unsigned long. 
bhyve VMs wired with the -S are now subject to the user wiring limit; the vm.max_user_wired sysctl may need to be tuned to avoid running into the limit. 20190507: The IPSEC option has been removed from GENERIC. Users requiring ipsec(4) must now load the ipsec(4) kernel module. 20190507: The tap(4) driver has been folded into tun(4), and the module has been renamed to tuntap. You should update any kld_list="if_tap" or kld_list="if_tun" entries in /etc/rc.conf, if_tap_load="YES" or if_tun_load="YES" entries in /boot/loader.conf to load the if_tuntap module instead, and "device tap" or "device tun" entries in kernel config files to select the tuntap device instead. 20190418: The following knobs have been added related to tradeoffs between safe use of the random device and availability in the absence of entropy: kern.random.initial_seeding.bypass_before_seeding: tunable; set non-zero to bypass the random device prior to seeding, or zero to block random requests until the random device is initially seeded. For now, set to 1 (unsafe) by default to restore pre-r346250 boot availability properties. kern.random.initial_seeding.read_random_bypassed_before_seeding: read-only diagnostic sysctl that is set when bypass is enabled and read_random(9) is bypassed, to enable programmatic handling of this initial condition, if desired. kern.random.initial_seeding.arc4random_bypassed_before_seeding: Similar to the above, but for arc4random(9) initial seeding. kern.random.initial_seeding.disable_bypass_warnings: tunable; set non-zero to disable warnings in dmesg when the same conditions are met as for the diagnostic sysctls above. Defaults to zero, i.e., produce warnings in dmesg when the conditions are met. 20190416: The loadable random module KPI has changed; the random_infra_init() routine now requires a 3rd function pointer for a bool (*)(void) method that returns true if the random device is seeded (and therefore unblocked). 20190404: r345895 reverts r320698. 
This implies that an nfsuserd(8) daemon built from head sources between r320757 (July 6, 2017) and r338192 (Aug. 22, 2018) will not work unless the "-use-udpsock" option is added to the command line. nfsuserd daemons built from head sources that are post-r338192 are not affected and should continue to work. 20190320: The fuse(4) module has been renamed to fusefs(4) for consistency with other filesystems. You should update any kld_list="fuse" entries in /etc/rc.conf, fuse_load="YES" entries in /boot/loader.conf, and "options FUSE" entries in kernel config files. 20190304: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 8.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20190226: geom_uzip(4) depends on the new module xz. If geom_uzip is statically compiled into your custom kernel, add a 'device xz' statement to the kernel config. 20190219: drm and drm2 have been removed from the tree. Please see https://wiki.freebsd.org/Graphics for the latest information on migrating to the drm ports. 20190131: Iflib is no longer unconditionally compiled into the kernel. Drivers that use iflib and are statically compiled into the kernel now require the 'device iflib' config option. For the same drivers loaded as modules on kernels not having 'device iflib', the iflib.ko module is loaded automatically. 20190125: The IEEE80211_AMPDU_AGE and AH_SUPPORT_AR5416 kernel configuration options no longer exist since r343219 and r343427 respectively; nothing uses them, so they should simply be removed from custom kernel config files. 20181230: r342635 changes the way efibootmgr(8) works by requiring users to add the -b (bootnum) parameter for commands where the bootnum was previously specified with each option. For example 'efibootmgr -B 0001' is now 'efibootmgr -B -b 0001'.
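The 20190320 fuse(4) rename above touches three kinds of files, so a quick scan can help find leftovers. The following is a minimal sketch, not part of the base system: find_stale_fuse is a hypothetical helper name, and it assumes the renamed forms (for example fusefs_load and "options FUSEFS") should not be flagged.

```sh
#!/bin/sh
# find_stale_fuse FILE...
# Hypothetical helper: print "file:line:text" for every non-comment line
# that still uses a pre-rename fuse(4) name from the 20190320 entry.
find_stale_fuse() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # /dev/null is passed as a second file so grep always prints
        # the file name in front of each match.
        # kld_load is matched as well as kld_list, in case a file uses
        # that spelling.
        grep -nE \
            -e '^[[:space:]]*fuse_load=' \
            -e '^[[:space:]]*kld_(load|list)=.*[" ]fuse([" ]|$)' \
            -e '^[[:space:]]*options[[:space:]]+FUSE([[:space:]]|$)' \
            "$f" /dev/null
    done
    return 0
}
```

A typical call would be: find_stale_fuse /etc/rc.conf /boot/loader.conf /path/to/MYKERNEL (the kernel config path is a placeholder). No output means nothing matched.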
20181220: r342286 modifies the NFSv4 server so that it obeys vfs.nfsd.nfs_privport in the same way as it is applied to NFSv2 and 3. This implies that NFSv4 servers that have vfs.nfsd.nfs_privport set will only allow mounts from clients using a reserved port. Since both the FreeBSD and Linux NFSv4 clients use reserved ports by default, this should not affect most NFSv4 mounts. 20181219: The XLP config has been removed. We can't support 64-bit atomics in this kernel because it is running in 32-bit mode. XLP users must transition to running a 64-bit kernel (XLP64 or XLPN32). The mips GXEMUL support has been removed from FreeBSD. MALTA* + qemu is the preferred emulator today and we don't need two different ones. The old sibyte / swarm / Broadcom BCM1250 support has been removed from the mips port. 20181211: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 7.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20181211: Remove the timed and netdate programs from the base tree. Setting the time with these daemons has been obsolete for over a decade. 20181126: On amd64, arm64 and armv7 (architectures that install LLVM's ld.lld linker as /usr/bin/ld) GNU ld is no longer installed as ld.bfd, as it produces broken binaries when ifuncs are in use. Users needing GNU ld should install the binutils port or package. 20181123: The BSD crtbegin and crtend code has been enabled by default. It has had extensive testing on amd64, arm64, and i386. It can be disabled by building a world with -DWITHOUT_BSD_CRTBEGIN. 20181115: The set of CTM commands (ctm, ctm_smail, ctm_rmail, ctm_dequeue) has been converted to a port (misc/ctm) and will be removed from FreeBSD-13. It is available as a package (ctm) for all supported FreeBSD versions.
20181110: The default newsyslog.conf(5) file has been changed to only include files in /etc/newsyslog.conf.d/ and /usr/local/etc/newsyslog.conf.d/ if the filenames end in '.conf' and do not begin with a '.'. You should check that the configuration files in these two directories match this naming convention. You can verify which configuration files are being included using the command: $ newsyslog -Nrv 20181015: Ports for the DRM modules have been simplified. Now, amd64 users should just install the drm-kmod port. All others should install drm-legacy-kmod. Graphics hardware that's newer than about 2010 usually works with drm-kmod. For hardware older than 2013, however, some users will need to use drm-legacy-kmod if drm-kmod doesn't work for them. Hardware older than 2008 usually only works in drm-legacy-kmod. The graphics team can only commit to hardware made since 2013 due to the complexity of the market and the difficulty of testing all the older cards effectively. If you have hardware supported by drm-kmod, you are strongly encouraged to use that as you will get better support. Other than KPI chasing, drm-legacy-kmod will not be updated. As outlined elsewhere, the drm and drm2 modules will be eliminated from the src base soon (with a limited exception for arm). Please update to the package asap and report any issues to x11@freebsd.org. Generally, anybody using the drm*-kmod packages should add WITHOUT_DRM_MODULE=t and WITHOUT_DRM2_MODULE=t to avoid nasty cross-threading surprises, especially with automatic driver loading from X11 startup. These will become the defaults in 13-current shortly. 20181012: The ixlv(4) driver has been renamed to iavf(4). As a consequence, custom kernel and module loading configuration files must be updated accordingly. Moreover, interfaces previously presented as ixlvN to the system are now exposed as iavfN and network configuration files must be adjusted as necessary. 20181009: OpenSSL has been updated to version 1.1.1.
This update includes various additional API changes throughout the base system. It is important to rebuild third-party software after upgrading. The value of __FreeBSD_version has been bumped accordingly. 20181006: The legacy DRM modules and drivers have now been added to the loader's module blacklist, in favor of loading them with kld_list in rc.conf(5). The module blacklist may be overridden with the loader.conf(5) 'module_blacklist' variable, but loading them via rc.conf(5) is strongly encouraged. 20181002: The cam(4) based nda(4) driver will be used over nvd(4) by default on powerpc64. You may set 'options NVME_USE_NVD=1' in your kernel conf or loader tunable 'hw.nvme.use_nvd=1' if you wish to use the existing driver. Make sure to edit /boot/etc/kboot.conf and fstab to use the nda device name. 20180913: Reproducible build mode is now on by default, in preparation for FreeBSD 12.0. This eliminates build metadata such as the user, host, and time from the kernel (and uname), unless the working tree corresponds to a modified checkout from a version control system. The previous behavior can be obtained by setting the /etc/src.conf knob WITHOUT_REPRODUCIBLE_BUILD. 20180826: The Yarrow CSPRNG has been removed from the kernel as it has not been supported by its designers since at least 2003. Fortuna has been the default since FreeBSD-11. 20180822: devctl freeze/thaw have gone into the tree; the rc scripts have been updated to use them, and devmatch has been changed. You should update kernel, userland and rc scripts all at the same time. 20180818: The default interpreter has been switched from 4th to Lua. LOADER_DEFAULT_INTERP, documented in build(7), will override the default interpreter. If you have custom FORTH code you will need to set LOADER_DEFAULT_INTERP=4th (valid values are 4th, lua or simp) in src.conf for the build. This will create default hard links between loader and loader_4th instead of loader and loader_lua, the new default.
If you are using UEFI it will create the proper hard link to loader.efi. bhyve uses userboot.so. It remains 4th-only until some issues regarding coexistence with multiple versions of FreeBSD are resolved. 20180815: ls(1) now respects the COLORTERM environment variable used in other systems and software to indicate that a colored terminal is both supported and desired. If ls(1) is suddenly emitting colors, they may be disabled again by either removing the unwanted COLORTERM from your environment, or using `ls --color=never`. The ls(1) specific CLICOLOR may not be observed in a future release. 20180808: The default pager for most commands has been changed to "less". To restore the old behavior, set PAGER="more" and MANPAGER="more -s" in your environment. 20180731: The jedec_ts(4) driver has been removed. A superset of its functionality is available in the jedec_dimm(4) driver, and the manpage for that driver includes migration instructions. If you have "device jedec_ts" in your kernel configuration file, it must be removed. 20180730: amd64/GENERIC now has EFI runtime services, EFIRT, enabled by default. This should have no effect if the kernel is booted via BIOS/legacy boot. EFIRT may be disabled via a loader tunable, efi.rt.disabled, if a system has a buggy firmware that prevents a successful boot due to use of runtime services. 20180727: Atmel AT91RM9200 and AT91SAM9, Cavium CNS 11xx and XScale support has been removed from the tree. These ports were obsolete and/or known to be broken for many years. 20180723: loader.efi has been augmented to participate more fully in the UEFI boot manager protocol. loader.efi will now look at the BootXXXX environment variable to determine if a specific kernel or root partition was specified. XXXX is derived from BootCurrent. efibootmgr(8) manages these standard UEFI variables. 20180720: zfsloader's functionality has now been folded into loader. zfsloader is no longer necessary once you've updated your boot blocks.
For a transition period, we will install a hardlink for zfsloader to loader to allow a smooth transition until the boot blocks can be updated (hard link because old zfs boot blocks don't understand symlinks). 20180719: ARM64 now has efifb support. If you want to have a serial console on your arm64 board when a screen is connected and the bootloader sets up a frame buffer for us to use, just add: boot_serial=YES boot_multicons=YES in /boot/loader.conf For Raspberry Pi 3 (RPI) users, this is needed even if you don't have a screen connected, as the firmware will set up a frame buffer that u-boot will expose as an EFI frame buffer. 20180719: New uid:gid added, ntpd:ntpd (123:123). Be sure to run mergemaster or take steps to update /etc/passwd before doing installworld on existing systems. Do not skip the "mergemaster -Fp" step before installworld, as described in the update procedures near the bottom of this document. Also, rc.d/ntpd now starts ntpd(8) as user ntpd if the new mac_ntpd(4) policy is available, unless ntpd_flags or the ntp config file contain options that change file/dir locations. When such options (e.g., "statsdir" or "crypto") are used, ntpd can still be run as non-root by setting ntpd_user=ntpd in rc.conf, after taking steps to ensure that all required files/dirs are accessible by the ntpd user. 20180717: Big endian arm support has been removed. 20180711: The static environment setup in kernel configs is no longer mutually exclusive with the loader(8) environment by default. In order to restore the previous default behavior of disabling the loader(8) environment if a static environment is present, you must specify loader_env.disabled=1 in the static environment. 20180705: The ABI of syscalls used by management tools like sockstat and netstat has been broken to allow 32-bit binaries to work on 64-bit kernels without modification. These programs will need to match the kernel in order to function.
External programs may require minor modifications to accommodate a change of type in structures from pointers to 64-bit virtual addresses. 20180702: On i386 and amd64 atomics are now inlined. Out of tree modules using atomics will need to be rebuilt. 20180701: The '%I' format in the kern.corefile sysctl limits the number of core files that a process can generate to the number stored in the debug.ncores sysctl. The '%I' format is replaced by the single digit index. Previously, if all indexes were taken the kernel would overwrite only a core file with the highest index in a filename. Currently the system will create a new core file if there is a free index or if all slots are taken it will overwrite the oldest one. 20180630: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180628: r335753 introduced a new quoting method. However, etc/devd/devmatch.conf needed to be changed to work with it. This change was made with r335763 and requires a mergemaster / etcupdate / etc to update the installed file. 20180612: r334930 changed the interface between the NFS modules, so they all need to be rebuilt. r335018 did a __FreeBSD_version bump for this. 20180530: As of r334391 lld is the default amd64 system linker; it is installed as /usr/bin/ld. Kernel build workarounds (see 20180510 entry) are no longer necessary. 20180530: The kernel / userland interface for devinfo changed, so you'll need a new kernel and userland as a pair for it to work (rebuilding lib/libdevinfo is all that's required). devinfo and devmatch will not work, but everything else will when there's a mismatch. 20180523: The on-disk format for hwpmc callchain records has changed to include threadid corresponding to a given record. This changes the field offsets and thus requires that libpmcstat be rebuilt before using a kernel later than r334108. 
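The core file selection policy described in the 20180701 entry can be sketched in a few lines of sh. This is only an illustration of the stated rule, not the kernel's code: next_core_index is a hypothetical helper, "a free index" is assumed to mean the lowest free one, and file age is assumed to be judged by modification time.

```sh
#!/bin/sh
# next_core_index BASE NCORES
# Hypothetical helper: print the index N that would be used for BASE.N,
# where NCORES plays the role of the debug.ncores limit.
# Rule from the entry: use a free index if there is one, otherwise
# overwrite the oldest existing core file.
next_core_index() {
    base=$1
    ncores=$2
    i=0
    oldest=0
    while [ "$i" -lt "$ncores" ]; do
        if [ ! -e "$base.$i" ]; then
            echo "$i"            # free slot found
            return 0
        fi
        # -ot ("older than") is a widely supported test(1) extension.
        if [ "$base.$i" -ot "$base.$oldest" ]; then
            oldest=$i
        fi
        i=$((i + 1))
    done
    echo "$oldest"               # all slots taken: reuse the oldest
}
```

For example, next_core_index /var/tmp/prog.core 5 prints the index to use when five slots are allowed; the path is a placeholder.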
20180517: The vxge(4) driver has been removed. This driver was introduced into HEAD one week before Exar left the Ethernet market and is not known to be used. If you have device vxge in your kernel config file it must be removed. 20180510: The amd64 kernel now requires a ld that supports ifunc to produce a working kernel, either lld or a newer binutils. lld is built by default on amd64, and the 'buildkernel' target uses it automatically. However, it is not the default linker, so building the kernel the traditional way requires LD=ld.lld on the command line (or LD=/usr/local/bin/ld for binutils port/package). lld will soon be default, and this requirement will go away. NOTE: As of r334391 lld is the default system linker on amd64, and no workaround is necessary. 20180508: The nxge(4) driver has been removed. This driver was for PCI-X 10g cards made by s2io/Neterion. The company was acquired by Exar and no longer sells or supports Ethernet products. If you have device nxge in your kernel config file it must be removed. 20180504: The tz database (tzdb) has been updated to 2018e. This version more correctly models time stamps in time zones with negative DST such as Europe/Dublin (from 1971 on), Europe/Prague (1946/7), and Africa/Windhoek (1994/2017). This does not affect the UT offsets, only time zone abbreviations and the tm_isdst flag. 20180502: The ixgb(4) driver has been removed. This driver was for an early and uncommon legacy PCI 10GbE for a single ASIC, Intel 82597EX. Intel quickly shifted to the long lived ixgbe family. If you have device ixgb in your kernel config file it must be removed. 20180501: The lmc(4) driver has been removed. This was a WAN interface card that was already reportedly rare in 2003, and had an ambiguous license. If you have device lmc in your kernel config file it must be removed. 20180413: Support for Arcnet networks has been removed. If you have device arcnet or device cm in your kernel config file they must be removed.
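Several of the entries above (20180517 through 20180413) ask for "device" lines to be dropped from custom kernel configs. A minimal sketch of a check for those lines follows; removed_devices is a hypothetical helper, and the device list is taken only from those entries.

```sh
#!/bin/sh
# removed_devices KERNCONF
# Hypothetical helper: print "line:text" for each active "device" line
# that names a driver removed by the entries dated 20180413-20180517
# (vxge, nxge, ixgb, lmc, arcnet, cm). Commented-out lines are ignored,
# and longer names such as ixgbe do not match.
removed_devices() {
    grep -nE \
        '^[[:space:]]*device[[:space:]]+(vxge|nxge|ixgb|lmc|arcnet|cm)([[:space:]]|#|$)' \
        "$1"
}
```

Running removed_devices against a config file prints nothing (and returns non-zero, as grep does) when no such lines remain.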
20180411: Support for FDDI networks has been removed. If you have device fddi or device fpa in your kernel config file they must be removed. 20180406: In addition to supporting RFC 3164 formatted messages, the syslogd(8) service is now capable of parsing RFC 5424 formatted log messages. The main benefit of using RFC 5424 is that clients may now send log messages with timestamps containing year numbers, microseconds and time zone offsets. Similarly, the syslog(3) C library function has been altered to send RFC 5424 formatted messages to the local system logging daemon. On systems using syslogd(8), this change should have no negative impact, as long as syslogd(8) and the C library are updated at the same time. On systems using a different system logging daemon, it may be necessary to make configuration adjustments, depending on the software used. When using syslog-ng, add the 'syslog-protocol' flag to local input sources to enable parsing of RFC 5424 formatted messages: source src { unix-dgram("/var/run/log" flags(syslog-protocol)); } When using rsyslog, disable the 'SysSock.UseSpecialParser' option of the 'imuxsock' module to let messages be processed by the regular RFC 3164/5424 parsing pipeline: module(load="imuxsock" SysSock.UseSpecialParser="off") Do note that these changes only affect communication between local applications and syslogd(8). The format that syslogd(8) uses to store messages on disk or forward messages to other systems remains unchanged. syslogd(8) still uses RFC 3164 for these purposes. Options to customize this behaviour will be added in the future. Utilities that process log files stored in /var/log are thus expected to continue to function as before. __FreeBSD_version has been incremented to 1200061 to denote this change. 20180328: Support for token ring networks has been removed. If you have "device token" in your kernel config you should remove it. No device drivers supported token ring. 
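To make the RFC 5424 layout mentioned in the 20180406 entry concrete, here is a small sketch that assembles one message by hand. rfc5424_line is a hypothetical helper, not something syslog(3) or syslogd(8) provides; it assumes no MSGID and no structured data, both written as "-".

```sh
#!/bin/sh
# rfc5424_line FACILITY SEVERITY TIMESTAMP HOST APP PROCID MSG
# Hypothetical helper: print a single RFC 5424 formatted message.
# Layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG
rfc5424_line() {
    pri=$(( $1 * 8 + $2 ))   # PRI = facility * 8 + severity
    printf '<%d>1 %s %s %s %s - - %s\n' "$pri" "$3" "$4" "$5" "$6" "$7"
}
```

With facility 1 (user) and severity 6 (info), rfc5424_line 1 6 2018-04-06T12:34:56.000123+02:00 host1 myapp 4242 'service started' prints "<14>1 2018-04-06T12:34:56.000123+02:00 host1 myapp 4242 - - service started". The timestamp carries the year, microseconds and time zone offset that RFC 3164 cannot express.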
20180323: makefs was modified to be able to tag ISO9660 El Torito boot catalog entries as EFI instead of overloading the i386 tag as done previously. The amd64 mkisoimages.sh script used to build amd64 ISO images for release was updated to use this. This may mean that makefs must be updated before "make cdrom" can be run in the release directory. This should be as simple as: $ cd $SRCDIR/usr.sbin/makefs $ make depend all install 20180212: FreeBSD boot loader enhanced with Lua scripting. It's purely opt-in for now by building WITH_LOADER_LUA and WITHOUT_FORTH in /etc/src.conf. Co-existence for the transition period will come shortly. Booting is a complex environment and test coverage for Lua-enabled loaders has been thin, so it would be prudent to assume it might not work and make provisions for backup boot methods. 20180211: devmatch functionality has been turned on in devd. It will automatically load drivers for unattached devices. This may cause unexpected drivers to be loaded. Please report any problems to current@ and imp@freebsd.org. 20180114: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180110: LLVM's lld linker is now used as the FreeBSD/amd64 bootstrap linker. This means it is used to link the kernel and userland libraries and executables, but is not yet installed as /usr/bin/ld by default. To revert to ld.bfd as the bootstrap linker, in /etc/src.conf set WITHOUT_LLD_BOOTSTRAP=yes 20180110: On i386, pmtimer has been removed. Its functionality has been folded into apm. It was a no-op on ACPI in current for a while now (but was still needed on i386 in FreeBSD 11 and earlier). Users may need to remove it from kernel config files. 
20180104: The use of RSS hash from the network card aka flowid has been disabled by default for lagg(4) as it's currently incompatible with the lacp and loadbalance protocols. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" 20180102: The SW_WATCHDOG option is no longer necessary to enable the hardclock-based software watchdog if no hardware watchdog is configured. As before, SW_WATCHDOG will cause the software watchdog to be enabled even if a hardware watchdog is configured. 20171215: r326887 fixes the issue described in the 20171214 UPDATING entry. r326888 flips the switch back to building GELI support always. 20171214: r362593 broke ZFS + GELI support for reasons unknown. However, it also broke ZFS support generally, so GELI has been turned off by default as the lesser evil in r326857. If you boot off ZFS and/or GELI, it might not be a good time to update. 20171125: PowerPC users must update loader(8) by rebuilding world before installing a new kernel, as the protocol connecting them has changed. Without the update, loader metadata will not be passed successfully to the kernel and users will have to enter their root partition at the kernel mountroot prompt to continue booting. Newer versions of loader can boot old kernels without issue. 20171110: The LOADER_FIREWIRE_SUPPORT build variable has been renamed to WITH/OUT_LOADER_FIREWIRE. LOADER_{NO_,}GELI_SUPPORT has been renamed to WITH/OUT_LOADER_GELI. 20171106: The naive and non-compliant support of posix_fallocate(2) in ZFS has been removed as of r325320. The system call now returns EINVAL when used on a ZFS file. Although the new behavior complies with the standard, some consumers are not prepared to cope with it. One known victim is lld prior to r325420. 20171102: Building in a FreeBSD src checkout will automatically create object directories now rather than store files in the current directory if 'make obj' was not run.
Calling 'make obj' is no longer necessary. This feature can be disabled by setting WITHOUT_AUTO_OBJ=yes in /etc/src-env.conf (not /etc/src.conf), or passing the option in the environment. 20171101: The default MAKEOBJDIR has changed from /usr/obj/<srcdir> for native builds, and /usr/obj/<arch>/<srcdir> for cross-builds, to a unified /usr/obj/<srcdir>/<arch>. This behavior can be changed to the old format by setting WITHOUT_UNIFIED_OBJDIR=yes in /etc/src-env.conf, the environment, or with -DWITHOUT_UNIFIED_OBJDIR when building. The UNIFIED_OBJDIR option is a transitional feature that will be removed for 12.0 release; please migrate any tools to the new format by looking up the OBJDIR with 'make -V .OBJDIR' rather than hardcoding paths. 20171028: The native-xtools target no longer installs the files by default to the OBJDIR. Use the native-xtools-install target with a DESTDIR to install to ${DESTDIR}/${NXTP} where NXTP defaults to /nxb-bin. 20171021: As part of the boot loader infrastructure cleanup, LOADER_*_SUPPORT options are changing from controlling the build if defined / undefined to controlling the build with explicit 'yes' or 'no' values. They will shift to WITH/WITHOUT options to match other options in the system. 20171010: libstand has turned into a private library for sys/boot use only. It is no longer supported as a public interface outside of sys/boot. 20171005: The arm port has split armv6 into armv6 and armv7. armv7 is now a valid TARGET_ARCH/MACHINE_ARCH setting. If you have an armv7 system and are running a kernel from before r324363, you will need to add MACHINE_ARCH=armv7 to 'make buildworld' to do a native build. 20171003: When building multiple kernels using KERNCONF, non-existent KERNCONF files will produce an error and buildkernel will fail. Previously missing KERNCONF files silently failed giving no indication as to why, only to subsequently discover during installkernel that the desired kernel was never built in the first place.
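Since the 20171003 change makes buildkernel fail on a missing KERNCONF file, it can save time to check the names before starting a long build. A minimal sketch, assuming all the configs live in one directory given as the first argument; missing_kernconfs is a hypothetical helper and not part of the build system.

```sh
#!/bin/sh
# missing_kernconfs CONFDIR NAME...
# Hypothetical helper: print each NAME that has no regular file in
# CONFDIR. The exit status is 1 if any name is missing, 0 otherwise.
missing_kernconfs() {
    confdir=$1
    shift
    rc=0
    for name in "$@"; do
        if [ ! -f "$confdir/$name" ]; then
            echo "$name"
            rc=1
        fi
    done
    return "$rc"
}
```

One possible use, with the directory and config names as placeholders: missing_kernconfs /usr/src/sys/amd64/conf GENERIC MYKERNEL || echo "fix KERNCONF before building".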
20170912: The default serial number format for CTL LUNs has changed. This will affect users who use /dev/diskid/* device nodes, or whose FibreChannel or iSCSI clients care about their LUNs' serial numbers. Users who require serial number stability should hardcode serial numbers in /etc/ctl.conf . 20170912: For 32-bit arm compiled for hard-float support, soft-floating point binaries now always get their shared libraries from LD_SOFT_LIBRARY_PATH (in the past, this was only used if /usr/libsoft also existed). Only users with a hard-float ld.so, but soft-float everything else should be affected. 20170826: The geli password typed at boot is now hidden. To restore the previous behavior, see geli(8) for configuration options. 20170825: Move PMTUD blackhole counters to TCPSTATS and remove them from bare sysctl values. Minor nit, but requires a rebuild of both world/kernel to complete. 20170814: "make check" behavior (made in ^/head@r295380) has been changed to execute from a limited sandbox, as opposed to executing from ${TESTSDIR}. Behavioral changes: - The "beforecheck" and "aftercheck" targets are now specified. - ${CHECKDIR} (added in commit noted above) has been removed. - Legacy behavior can be enabled by setting WITHOUT_MAKE_CHECK_USE_SANDBOX in src.conf(5) or the environment. If the limited sandbox mode is enabled, "make check" will execute "make distribution", then install, execute the tests, and clean up the sandbox if successful. The "make distribution" and "make install" targets are typically run as root to set appropriate permissions and ownership at installation time. The end-user should set "WITH_INSTALL_AS_USER" in src.conf(5) or the environment if executing "make check" with limited sandbox mode using an unprivileged user. 20170808: Since the switch to GPT disk labels, fsck for UFS/FFS has been unable to automatically find alternate superblocks. 
As of r322297, the information needed to find alternate superblocks has been moved to the end of the area reserved for the boot block. Filesystems created with a newfs of this vintage or later will create the recovery information. If you have a filesystem created prior to this change and wish to have a recovery block created for your filesystem, you can do so by running fsck in foreground mode (i.e., do not use the -p or -y options). As it starts, fsck will ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should answer yes. 20170728: As of r321665, an NFSv4 server configuration that services Kerberos mounts or clients that do not support the uid/gid in owner/owner_group string capability, must explicitly enable the nfsuserd daemon by adding nfsuserd_enable="YES" to the machine's /etc/rc.conf file. 20170722: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 5.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170701: WITHOUT_RCMDS is now the default. Set WITH_RCMDS if you need the r-commands (rlogin, rsh, etc.) to be built with the base system. 20170625: The FreeBSD/powerpc platform now uses a 64-bit type for time_t. This is a very major ABI incompatible change, so users of FreeBSD/powerpc must be careful when performing source upgrades. It is best to run 'make installworld' from an alternate root system, either a live CD/memory stick, or a temporary root partition. Additionally, all ports must be recompiled. powerpc64 is largely unaffected, except in the case of 32-bit compatibility. All 32-bit binaries will be affected. 20170623: Forward compatibility for the "ino64" project has been committed. This will allow most new binaries to run on older kernels in a limited fashion. This prevents many of the common foot-shooting actions in the upgrade and provides a limited ability to roll back the kernel across the ino64 upgrade.
Complicated use cases may not work properly, though enough simpler ones work to allow recovery in most situations. 20170620: Switch back to the BSDL dtc (Device Tree Compiler). Set WITH_GPL_DTC if you require the GPL compiler. 20170618: The internal ABI used for communication between the NFS kernel modules was changed by r320085, so __FreeBSD_version was bumped to ensure all the NFS related modules are updated together. 20170617: The ABI of struct event was changed by extending the data member to 64bit and adding ext fields. For upgrade, the same precautions as for the 20170523 "ino64" entry must be followed. 20170531: The GNU roff toolchain has been removed from base. To render manpages which are not supported by mandoc(1), man(1) can fall back on GNU roff from ports (and recommends installing it). To render roff(7) documents, consider using GNU roff from ports or the heirloom doctools roff toolchain from ports via pkg install groff or via pkg install heirloom-doctools. 20170524: The ath(4) and ath_hal(4) modules now build piecemeal to allow for smaller runtime footprint builds. This is useful for embedded systems which only require one chipset support. If you load it as a module, make sure this is in /boot/loader.conf: if_ath_load="YES" This will load the HAL, all chip/RF backends and if_ath_pci. If you have if_ath_pci in /boot/loader.conf, ensure it is after if_ath or it will not load any HAL chipset support. If you want to selectively load things (e.g., on cheaper ARM/MIPS platforms where RAM is at a premium) you should: * load ath_hal * load the chip modules in question * load ath_rate, ath_dfs * load ath_main * load if_ath_pci and/or if_ath_ahb depending upon your particular bus bind type - this is where probe/attach is done. For further comments/feedback, poke adrian@ . 20170523: The "ino64" 64-bit inode project has been committed, which extends a number of types to 64 bits. Upgrading in place requires care and adherence to the documented upgrade procedure.
If using a custom kernel configuration ensure that the COMPAT_FREEBSD11 option is included (as during the upgrade the system will be running the ino64 kernel with the existing world). For the safest in-place upgrade begin by removing previous build artifacts via "rm -rf /usr/obj/*". Then, carefully follow the full procedure documented below under the heading "To rebuild everything and install it on the current system." Specifically, a reboot is required after installing the new kernel before installing world. While an installworld normally works by accident from multiuser after rebooting the proper kernel, there are many cases where this will fail across this upgrade and installworld from single user is required. 20170424: The NATM framework including the en(4), fatm(4), hatm(4), and patm(4) devices has been removed. Consumers should plan a migration before the end-of-life date for FreeBSD 11. 20170420: GNU diff has been replaced by a BSD licensed diff. Some features of GNU diff have not been implemented; if those are needed, a newer version of GNU diff is available via the diffutils package under the gdiff name. 20170413: As of r316810 for ipfilter, keep frags is no longer assumed when keep state is specified in a rule. r316810 aligns ipfilter with documentation in man pages separating keep frags from keep state. This allows keep state to be specified without forcing keep frags and allows keep frags to be specified independently of keep state. To maintain previous behaviour, also specify keep frags with keep state (as documented in ipf.conf.5). 20170407: arm64 builds now use the base system LLD 4.0.0 linker by default, instead of requiring that the aarch64-binutils port or package be installed. To continue using aarch64-binutils, set CROSS_BINUTILS_PREFIX=/usr/local/aarch64-freebsd/bin . 20170405: The UDP optimization in entry 20160818 that added the sysctl net.inet.udp.require_l2_bcast has been reverted.
L2 broadcast packets will no longer be treated as L3 broadcast packets. 20170331: Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail instead of using the first assigned address of the jail. 20170329: The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. If building cfiscsi.ko as a kernel module, the module can be loaded via one of the following methods: - `cfiscsi_load="YES"` in loader.conf(5). - Add `cfiscsi` to `$kld_list` in rc.conf(5). - ctladm(8)/ctld(8), when compiled with iSCSI support (`WITH_ISCSI=yes` in src.conf(5)) Please see cfiscsi(4) for more details. 20170316: The mmcsd.ko module now additionally depends on geom_flashmap.ko. Also, mmc.ko and mmcsd.ko need to be a matching pair built from the same source (previously, the dependency of mmcsd.ko on mmc.ko was missing, but mmcsd.ko now will refuse to load if it is incompatible with mmc.ko). 20170315: The syntax of ipfw(8) named states was changed to avoid ambiguity. If you have used named states in the firewall rules, you need to modify them after installworld and before rebooting. Now named states must be prefixed with colon. 20170311: The old drm (sys/dev/drm/) drivers for i915 and radeon have been removed as the userland we provide cannot use them. The KMS version (sys/dev/drm2) supports the same hardware. 20170302: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170221: The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It's not possible now to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change. 20170216: EISA bus support has been removed. The WITH_EISA option is no longer valid. 20170215: MCA bus support has been removed. 

20170127:
	The WITH_LLD_AS_LD / WITHOUT_LLD_AS_LD build knobs have been renamed
	WITH_LLD_IS_LD / WITHOUT_LLD_IS_LD, for consistency with CLANG_IS_CC.

20170112:
	The EM_MULTIQUEUE kernel configuration option is deprecated now that
	the em(4) driver conforms to iflib specifications.

20170109:
	The igb(4), em(4) and lem(4) ethernet drivers are now implemented via
	IFLIB. If you have a custom kernel configuration that excludes em(4)
	but you use igb(4), you need to re-add em(4) to your custom
	configuration.

20161217:
	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to
	3.9.1. Please see the 20141231 entry below for information about
	prerequisites and upgrading, if you are not already using clang
	3.5.0 or higher.

20161124:
	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to
	3.9.0. Please see the 20141231 entry below for information about
	prerequisites and upgrading, if you are not already using clang
	3.5.0 or higher.

20161119:
	The layout of the pmap structure has changed for powerpc to put the
	pmap statistics at the front for all CPU variations. libkvm(3) and
	all tools that link against it need to be recompiled.

20161030:
	isl(4) and cyapa(4) drivers now require a new driver,
	chromebook_platform(4), to work properly on Chromebook-class
	hardware. On other types of hardware the drivers may need to be
	configured using device hints. Please see the corresponding manual
	pages for details.

20161017:
	The urtwn(4) driver was merged into rtwn(4) and now consists of
	rtwn(4) main module + rtwn_usb(4) and rtwn_pci(4) bus-specific
	parts.
	Also, firmware for RTL8188CE was renamed due to possible name
	conflict (rtwnrtl8192cU(B) -> rtwnrtl8192cE(B))

20161015:
	GNU rcs has been removed from base. It is available as packages:
	- rcs: Latest GPLv3 GNU rcs version.
	- rcs57: Copy of the latest version of GNU rcs (GPLv2) before it was
	  removed from base.

20161008:
	Use of the cc_cdg, cc_chd, cc_hd, or cc_vegas congestion control
	modules now requires that the kernel configuration contain the
	TCP_HHOOK option. (This option is included in the GENERIC kernel.)

20161003:
	The WITHOUT_ELFCOPY_AS_OBJCOPY src.conf(5) knob has been retired.
	ELF Tool Chain's elfcopy is always installed as /usr/bin/objcopy.

20160924:
	Relocatable object files with the extension of .So have been renamed
	to use an extension of .pico instead. The purpose of this change is
	to avoid a name clash with shared libraries on case-insensitive file
	systems. On those file systems, foo.So is the same file as foo.so.

20160918:
	GNU rcs has been turned off by default. It can (temporarily) be built
	again by adding WITH_RCS knob in src.conf.
	Otherwise, GNU rcs is available from packages:
	- rcs: Latest GPLv3 GNU rcs version.
	- rcs57: Copy of the latest version of GNU rcs (GPLv2) from base.

20160918:
	The backup_uses_rcs functionality has been removed from rc.subr.

20160908:
	The queue(3) debugging macro, QUEUE_MACRO_DEBUG, has been split into
	two separate components, QUEUE_MACRO_DEBUG_TRACE and
	QUEUE_MACRO_DEBUG_TRASH. Define both for the original
	QUEUE_MACRO_DEBUG behavior.

20160824:
	r304787 changed some ioctl interfaces between the iSCSI userspace
	programs and the kernel. ctladm, ctld, iscsictl, and iscsid must be
	rebuilt to work with new kernels. __FreeBSD_version has been bumped
	to 1200005.

20160818:
	The UDP receive code has been updated to only treat incoming UDP
	packets that were addressed to an L2 broadcast address as L3
	broadcast packets. It is not expected that this will affect any
	standards-conforming UDP application. The new behaviour can be
	disabled by setting the sysctl net.inet.udp.require_l2_bcast to 0.

20160818:
	Remove the openbsd_poll system call.
	__FreeBSD_version has been bumped because of this.

20160708:
	The stable/11 branch has been created from head@r302406.
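
Entries such as 20160824 above announce an interface change through a __FreeBSD_version bump. As a minimal sketch of how a program might test for that bump, the C fragment below compares a version number against the value quoted in that entry (1200005). The helper names iscsi_ioctls_changed() and build_osversion() are hypothetical and exist only for this illustration; the fallback of 0 on non-FreeBSD hosts is likewise an assumption made so the fragment compiles anywhere.

```c
#include <stdbool.h>
#include <sys/param.h>	/* defines __FreeBSD_version when built on FreeBSD */

/* Value quoted in the 20160824 entry. */
#define ISCSI_IOCTL_BUMP_VERSION	1200005

/*
 * Hypothetical helper: report whether a given __FreeBSD_version value is
 * at or past the bump that accompanied the iSCSI ioctl change.
 */
static bool
iscsi_ioctls_changed(int osversion)
{
	return (osversion >= ISCSI_IOCTL_BUMP_VERSION);
}

/*
 * Hypothetical helper: the version this code was compiled against,
 * or 0 when it is not being built on FreeBSD (an assumption of this
 * sketch, not a FreeBSD convention).
 */
static int
build_osversion(void)
{
#ifdef __FreeBSD_version
	return (__FreeBSD_version);
#else
	return (0);
#endif
}
```

A tool could call iscsi_ioctls_changed(build_osversion()) to learn which side of the bump it was built on. The running kernel's own value is reported by the kern.osreldate sysctl, so comparing the two is one way to notice a mismatched userland and kernel.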

	After branch N is created, entries older than the N-2 branch point
	are removed from this file. After stable/14 is branched and current
	becomes FreeBSD 15, entries older than stable/12 branch point will be
	removed from current's UPDATING file.

COMMON ITEMS:

	General Notes
	-------------
	Sometimes, obscure build problems are the result of environment
	poisoning. This can happen because the make utility reads its
	environment when searching for values for global variables. To run
	your build attempts in an "environmental clean room", prefix all make
	commands with 'env -i '. See the env(1) manual page for more details.
	Occasionally a build failure will occur with "make -j" due to a race
	condition. If this happens try building again without -j, and please
	report a bug if it happens consistently.

	When upgrading from one major version to another it is generally best
	to upgrade to the latest code in the currently installed branch
	first, then do an upgrade to the new branch. This is the best-tested
	upgrade path, and has the highest probability of being successful.
	Please try this approach if you encounter problems with a major
	version upgrade. Since the stable 4.x branch point, one has generally
	been able to upgrade from anywhere in the most recent stable branch
	to head / current (or even the last couple of stable branches). See
	the top of this file when there's an exception.

	The update process will emit an error on an attempt to perform a
	build or install from a FreeBSD version below the earliest supported
	version. When updating from an older version the update should be
	performed one major release at a time, including running
	`make delete-old` at each step.

	When upgrading a live system, having a root shell around before
	installing anything can help undo problems. Not having a root shell
	around can lead to problems if pam has changed too much from your
	starting point to allow continued authentication after the upgrade.

	This file should be read as a log of events.
	When a later event changes information of a prior event, the prior
	event should not be deleted. Instead, a pointer to the entry with the
	new information should be placed in the old entry. Readers of this
	file should also sanity check older entries before relying on them
	blindly. Authors of new entries should write them with this in mind.

	ZFS notes
	---------
	When upgrading the boot ZFS pool to a new version (via zpool
	upgrade), always follow these three steps:

	1) recompile and reinstall the ZFS boot loader and boot block
	(this is part of "make buildworld" and "make installworld")

	2) update the ZFS boot block on your boot drive (only required when
	doing a zpool upgrade):

	When booting on x86 via BIOS, use the following to update the ZFS
	boot block on the freebsd-boot partition of a GPT partitioned drive
	ada0:
		gpart bootcode -p /boot/gptzfsboot -i $N ada0
	The value $N will typically be 1. For EFI booting, see EFI notes.

	3) zpool upgrade the root pool. New bootblocks will work with old
	pools, but not vice versa, so they need to be updated before any
	zpool upgrade.

	Non-boot pools do not need these updates.

	EFI notes
	---------

	There are two locations the boot loader can be installed into. The
	current location (and the default) is \efi\freebsd\loader.efi and
	using efibootmgr(8) to configure it. The old location, that must be
	used on deficient systems that don't honor efibootmgr(8) protocols,
	is the fallback location of \EFI\BOOT\BOOTxxx.EFI. Generally, you
	will copy /boot/loader.efi to this location, but on systems installed
	a long time ago the ESP may be too small and /boot/boot1.efi may be
	needed unless the ESP has been expanded in the meantime.

	Recent systems will have the ESP mounted on /boot/efi, but older ones
	may not have it mounted at all, or mounted in a different location.
	Older arm SD images with MBR used /boot/msdos as the mountpoint. The
	ESP is a MSDOS filesystem.

	The EFI boot loader rarely needs to be updated.
	For ZFS booting, however, you must update loader.efi before you do
	'zpool upgrade' the root zpool, otherwise the old loader.efi may
	reject the upgraded zpool since it does not automatically understand
	some new features.

	See loader.efi(8) and uefi(8) for more details.

	To build a kernel
	-----------------
	If you are updating from a prior version of FreeBSD (even one just
	a few days old), you should follow this procedure. It is the most
	failsafe as it uses a /usr/obj tree with a fresh mini-buildworld,

	make kernel-toolchain
	make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE
	make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE

	To test a kernel once
	---------------------
	If you just want to boot a kernel once (because you are not sure
	if it works, or if you want to boot a known bad kernel to provide
	debugging information) run
	make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel
	nextboot -k testkernel

	To rebuild everything and install it on the current system.
	-----------------------------------------------------------
	# Note: sometimes if you are running current you gotta do more than
	# is listed here if you are upgrading from a really old current.

	make buildworld
	make buildkernel KERNCONF=YOUR_KERNEL_HERE
	make installkernel KERNCONF=YOUR_KERNEL_HERE	[1]
	<reboot in single user>				[3]
	etcupdate -p					[5]
	make installworld
	etcupdate -B					[4]
	make delete-old					[6]

	To cross-install current onto a separate partition
	--------------------------------------------------
	# In this approach we use a separate partition to hold
	# current's root, 'usr', and 'var' directories. A partition
	# holding "/", "/usr" and "/var" should be about 2GB in
	# size.

	make buildworld
	make buildkernel KERNCONF=YOUR_KERNEL_HERE
	make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC
	make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd
	make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT}
	cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd

	To upgrade in-place from stable to current
	----------------------------------------------
	make buildworld					[9]
	make buildkernel KERNCONF=YOUR_KERNEL_HERE	[8]
	make installkernel KERNCONF=YOUR_KERNEL_HERE	[1]
	<reboot in single user>				[3]
	etcupdate -p					[5]
	make installworld
	etcupdate -B					[4]
	make delete-old					[6]

	Make sure that you've read the UPDATING file to understand the
	tweaks to various things you need. At this point in the life
	cycle of current, things change often and you are on your own
	to cope. The defaults can also change, so please read ALL of
	the UPDATING entries.

	Also, if you are tracking -current, you must be subscribed to
	freebsd-current@freebsd.org. Make sure that before you update
	your sources that you have read and understood all the recent
	messages there. If in doubt, please track -stable which has
	much fewer pitfalls.

	[1] If you have third party modules, such as vmware, you should
	disable them at this point so they don't crash your system on
	reboot. Alternatively, you should rebuild all the modules you have
	in your system and install them as well. If you are running
	-current, you should seriously consider placing all sources to all
	the modules for your system (or symlinks to them) in
	/usr/local/sys/modules so this happens automatically.
	If all your modules come from ports, then adding the port origin
	directories to PORTS_MODULES instead is also automatic and
	effective, eg:
		PORTS_MODULES+=x11/nvidia-driver

	[3] From the bootblocks, boot -s, and then do
		fsck -p
		mount -u /
		mount -a
		sh /etc/rc.d/zfs start	# mount zfs filesystem, if needed
		cd src			# full path to source
		adjkerntz -i		# if CMOS is wall time
	Also, when doing a major release upgrade, it is required that you
	boot into single user mode to do the installworld.

	[4] Note: This step is non-optional. Failure to do this step
	can result in a significant reduction in the functionality of the
	system. Attempting to do it by hand is not recommended and those
	that pursue this avenue should read this file carefully, as well
	as the archives of freebsd-current and freebsd-hackers mailing lists
	for potential gotchas. See etcupdate(8) for more information.

	[5] Usually this step is a no-op. However, from time to time
	you may need to do this if you get unknown user in the following
	step.

	[6] This only deletes old files and directories. Old libraries
	can be deleted by "make delete-old-libs", but you have to make
	sure that no program is using those libraries anymore.

	[8] The new kernel must be able to run existing binaries used by an
	installworld. When upgrading across major versions, the new kernel's
	configuration must include the correct COMPAT_FREEBSD option for
	existing binaries (e.g. COMPAT_FREEBSD11 to run 11.x binaries).
	Failure to do so may leave you with a system that is hard to boot to
	recover. A GENERIC kernel will include suitable compatibility
	options to run binaries from older branches. Note that the ability
	to run binaries from unsupported branches is not guaranteed.

	Make sure that you merge any new devices from GENERIC since the
	last time you updated your kernel config file. Options also
	change over time, so you may need to adjust your custom kernels
	for these as well.

	[9] If CPUTYPE is defined in your /etc/make.conf, make sure to use
	the "?=" instead of the "=" assignment operator, so that buildworld
	can override the CPUTYPE if it needs to.

	MAKEOBJDIRPREFIX must be defined in an environment variable, and
	not on the command line, or in /etc/make.conf. buildworld will
	warn if it is improperly defined.

FORMAT:

This file contains a list, in reverse chronological order, of major
breakages in tracking -current. It is not guaranteed to be a complete
list of such breakages, and only contains entries since September 23,
2011. If you need to see UPDATING entries from before that date, you
will need to fetch an UPDATING file from an older FreeBSD release.

Copyright information:

Copyright 1998-2009 M. Warner Losh

Redistribution, publication, translation and use, with or without
modification, in full or in part, in any form or format of this
document are permitted without further permission from the author.

THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

Contact Warner Losh if you have any questions about your use of
this document.

$FreeBSD$
diff --git a/sys/netpfil/pf/pf_ioctl.c b/sys/netpfil/pf/pf_ioctl.c
index e76a92fb7e7f..b78c30aa4b8c 100644
--- a/sys/netpfil/pf/pf_ioctl.c
+++ b/sys/netpfil/pf/pf_ioctl.c
@@ -1,6945 +1,6965 @@
/*-
 * SPDX-License-Identifier: BSD-2-Clause
 *
 * Copyright (c) 2001 Daniel Hartmeier
 * Copyright (c) 2002,2003 Henning Brauer
 * Copyright (c) 2012 Gleb Smirnoff
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 *    - Redistributions of source code must retain the above copyright
 *      notice, this list of conditions and the following disclaimer.
 *    - Redistributions in binary form must reproduce the above
 *      copyright notice, this list of conditions and the following
 *      disclaimer in the documentation and/or other materials provided
 *      with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
 * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 *
 * Effort sponsored in part by the Defense Advanced Research Projects
 * Agency (DARPA) and Air Force Research Laboratory, Air Force
 * Materiel Command, USAF, under agreement number F30602-01-2-0537.
* * $OpenBSD: pf_ioctl.c,v 1.213 2009/02/15 21:46:12 mbalmer Exp $ */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include "opt_bpf.h" #include "opt_pf.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef INET6 #include #endif /* INET6 */ #ifdef ALTQ #include #endif SDT_PROBE_DEFINE3(pf, ioctl, ioctl, error, "int", "int", "int"); SDT_PROBE_DEFINE3(pf, ioctl, function, error, "char *", "int", "int"); SDT_PROBE_DEFINE2(pf, ioctl, addrule, error, "int", "int"); SDT_PROBE_DEFINE2(pf, ioctl, nvchk, error, "int", "int"); static struct pf_kpool *pf_get_kpool(const char *, u_int32_t, u_int8_t, u_int32_t, u_int8_t, u_int8_t, u_int8_t); static void pf_mv_kpool(struct pf_kpalist *, struct pf_kpalist *); static void pf_empty_kpool(struct pf_kpalist *); static int pfioctl(struct cdev *, u_long, caddr_t, int, struct thread *); static int pf_begin_eth(uint32_t *, const char *); static void pf_rollback_eth_cb(struct epoch_context *); static int pf_rollback_eth(uint32_t, const char *); static int pf_commit_eth(uint32_t, const char *); static void pf_free_eth_rule(struct pf_keth_rule *); #ifdef ALTQ static int pf_begin_altq(u_int32_t *); static int pf_rollback_altq(u_int32_t); static int pf_commit_altq(u_int32_t); static int pf_enable_altq(struct pf_altq *); static int pf_disable_altq(struct pf_altq *); static uint16_t pf_qname2qid(const char *); static void pf_qid_unref(uint16_t); #endif /* ALTQ */ static int pf_begin_rules(u_int32_t *, int, const char *); static int pf_rollback_rules(u_int32_t, int, char *); static int pf_setup_pfsync_matching(struct pf_kruleset *); static void pf_hash_rule_rolling(MD5_CTX *, struct pf_krule *); static void 
pf_hash_rule(struct pf_krule *); static void pf_hash_rule_addr(MD5_CTX *, struct pf_rule_addr *); static int pf_commit_rules(u_int32_t, int, char *); static int pf_addr_setup(struct pf_kruleset *, struct pf_addr_wrap *, sa_family_t); static void pf_addr_copyout(struct pf_addr_wrap *); static void pf_src_node_copy(const struct pf_ksrc_node *, struct pf_src_node *); #ifdef ALTQ static int pf_export_kaltq(struct pf_altq *, struct pfioc_altq_v1 *, size_t); static int pf_import_kaltq(struct pfioc_altq_v1 *, struct pf_altq *, size_t); #endif /* ALTQ */ VNET_DEFINE(struct pf_krule, pf_default_rule); static __inline int pf_krule_compare(struct pf_krule *, struct pf_krule *); RB_GENERATE(pf_krule_global, pf_krule, entry_global, pf_krule_compare); #ifdef ALTQ VNET_DEFINE_STATIC(int, pf_altq_running); #define V_pf_altq_running VNET(pf_altq_running) #endif #define TAGID_MAX 50000 struct pf_tagname { TAILQ_ENTRY(pf_tagname) namehash_entries; TAILQ_ENTRY(pf_tagname) taghash_entries; char name[PF_TAG_NAME_SIZE]; uint16_t tag; int ref; }; struct pf_tagset { TAILQ_HEAD(, pf_tagname) *namehash; TAILQ_HEAD(, pf_tagname) *taghash; unsigned int mask; uint32_t seed; BITSET_DEFINE(, TAGID_MAX) avail; }; VNET_DEFINE(struct pf_tagset, pf_tags); #define V_pf_tags VNET(pf_tags) static unsigned int pf_rule_tag_hashsize; #define PF_RULE_TAG_HASH_SIZE_DEFAULT 128 SYSCTL_UINT(_net_pf, OID_AUTO, rule_tag_hashsize, CTLFLAG_RDTUN, &pf_rule_tag_hashsize, PF_RULE_TAG_HASH_SIZE_DEFAULT, "Size of pf(4) rule tag hashtable"); #ifdef ALTQ VNET_DEFINE(struct pf_tagset, pf_qids); #define V_pf_qids VNET(pf_qids) static unsigned int pf_queue_tag_hashsize; #define PF_QUEUE_TAG_HASH_SIZE_DEFAULT 128 SYSCTL_UINT(_net_pf, OID_AUTO, queue_tag_hashsize, CTLFLAG_RDTUN, &pf_queue_tag_hashsize, PF_QUEUE_TAG_HASH_SIZE_DEFAULT, "Size of pf(4) queue tag hashtable"); #endif VNET_DEFINE(uma_zone_t, pf_tag_z); #define V_pf_tag_z VNET(pf_tag_z) static MALLOC_DEFINE(M_PFALTQ, "pf_altq", "pf(4) altq configuration db"); static 
MALLOC_DEFINE(M_PFRULE, "pf_rule", "pf(4) rules");

#if (PF_QNAME_SIZE != PF_TAG_NAME_SIZE)
#error PF_QNAME_SIZE must be equal to PF_TAG_NAME_SIZE
#endif

+VNET_DEFINE_STATIC(bool, pf_filter_local) = false;
+#define V_pf_filter_local VNET(pf_filter_local)
+SYSCTL_BOOL(_net_pf, OID_AUTO, filter_local, CTLFLAG_VNET | CTLFLAG_RW,
+    &VNET_NAME(pf_filter_local), false,
+    "Enable filtering for packets delivered to local network stack");
+
static void pf_init_tagset(struct pf_tagset *, unsigned int *,
    unsigned int);
static void pf_cleanup_tagset(struct pf_tagset *);
static uint16_t tagname2hashindex(const struct pf_tagset *, const char *);
static uint16_t tag2hashindex(const struct pf_tagset *, uint16_t);
static u_int16_t tagname2tag(struct pf_tagset *, const char *);
static u_int16_t pf_tagname2tag(const char *);
static void tag_unref(struct pf_tagset *, u_int16_t);

#define DPFPRINTF(n, x) if (V_pf_status.debug >= (n)) printf x

struct cdev *pf_dev;

/*
 * XXX - These are new and need to be checked when moveing to a new version
 */
static void pf_clear_all_states(void);
static unsigned int pf_clear_states(const struct pf_kstate_kill *);
static void pf_killstates(struct pf_kstate_kill *, unsigned int *);
static int pf_killstates_row(struct pf_kstate_kill *, struct pf_idhash *);
static int pf_killstates_nv(struct pfioc_nv *);
static int pf_clearstates_nv(struct pfioc_nv *);
static int pf_getstate(struct pfioc_nv *);
static int pf_getstatus(struct pfioc_nv *);
static int pf_clear_tables(void);
static void pf_clear_srcnodes(struct pf_ksrc_node *);
static void pf_kill_srcnodes(struct pfioc_src_node_kill *);
static int pf_keepcounters(struct pfioc_nv *);
static void pf_tbladdr_copyout(struct pf_addr_wrap *);

/*
 * Wrapper functions for pfil(9) hooks
 */
static pfil_return_t pf_eth_check_in(struct mbuf **m, struct ifnet *ifp,
    int flags, void *ruleset __unused, struct inpcb *inp);
static pfil_return_t pf_eth_check_out(struct mbuf **m, struct ifnet *ifp,
    int flags, void *ruleset
__unused, struct inpcb *inp); #ifdef INET static pfil_return_t pf_check_in(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp); static pfil_return_t pf_check_out(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp); #endif #ifdef INET6 static pfil_return_t pf_check6_in(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp); static pfil_return_t pf_check6_out(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp); #endif static void hook_pf_eth(void); static void hook_pf(void); static void dehook_pf_eth(void); static void dehook_pf(void); static int shutdown_pf(void); static int pf_load(void); static void pf_unload(void); static struct cdevsw pf_cdevsw = { .d_ioctl = pfioctl, .d_name = PF_NAME, .d_version = D_VERSION, }; VNET_DEFINE_STATIC(bool, pf_pfil_hooked); #define V_pf_pfil_hooked VNET(pf_pfil_hooked) VNET_DEFINE_STATIC(bool, pf_pfil_eth_hooked); #define V_pf_pfil_eth_hooked VNET(pf_pfil_eth_hooked) /* * We need a flag that is neither hooked nor running to know when * the VNET is "valid". We primarily need this to control (global) * external event, e.g., eventhandlers. 
*/ VNET_DEFINE(int, pf_vnet_active); #define V_pf_vnet_active VNET(pf_vnet_active) int pf_end_threads; struct proc *pf_purge_proc; VNET_DEFINE(struct rmlock, pf_rules_lock); VNET_DEFINE_STATIC(struct sx, pf_ioctl_lock); #define V_pf_ioctl_lock VNET(pf_ioctl_lock) struct sx pf_end_lock; /* pfsync */ VNET_DEFINE(pfsync_state_import_t *, pfsync_state_import_ptr); VNET_DEFINE(pfsync_insert_state_t *, pfsync_insert_state_ptr); VNET_DEFINE(pfsync_update_state_t *, pfsync_update_state_ptr); VNET_DEFINE(pfsync_delete_state_t *, pfsync_delete_state_ptr); VNET_DEFINE(pfsync_clear_states_t *, pfsync_clear_states_ptr); VNET_DEFINE(pfsync_defer_t *, pfsync_defer_ptr); pfsync_detach_ifnet_t *pfsync_detach_ifnet_ptr; /* pflog */ pflog_packet_t *pflog_packet_ptr = NULL; /* * Copy a user-provided string, returning an error if truncation would occur. * Avoid scanning past "sz" bytes in the source string since there's no * guarantee that it's nul-terminated. */ static int pf_user_strcpy(char *dst, const char *src, size_t sz) { if (strnlen(src, sz) == sz) return (EINVAL); (void)strlcpy(dst, src, sz); return (0); } static void pfattach_vnet(void) { u_int32_t *my_timeout = V_pf_default_rule.timeout; bzero(&V_pf_status, sizeof(V_pf_status)); pf_initialize(); pfr_initialize(); pfi_initialize_vnet(); pf_normalize_init(); pf_syncookies_init(); V_pf_limits[PF_LIMIT_STATES].limit = PFSTATE_HIWAT; V_pf_limits[PF_LIMIT_SRC_NODES].limit = PFSNODE_HIWAT; RB_INIT(&V_pf_anchors); pf_init_kruleset(&pf_main_ruleset); pf_init_keth(V_pf_keth); /* default rule should never be garbage collected */ V_pf_default_rule.entries.tqe_prev = &V_pf_default_rule.entries.tqe_next; #ifdef PF_DEFAULT_TO_DROP V_pf_default_rule.action = PF_DROP; #else V_pf_default_rule.action = PF_PASS; #endif V_pf_default_rule.nr = -1; V_pf_default_rule.rtableid = -1; pf_counter_u64_init(&V_pf_default_rule.evaluations, M_WAITOK); for (int i = 0; i < 2; i++) { pf_counter_u64_init(&V_pf_default_rule.packets[i], M_WAITOK); 
pf_counter_u64_init(&V_pf_default_rule.bytes[i], M_WAITOK); } V_pf_default_rule.states_cur = counter_u64_alloc(M_WAITOK); V_pf_default_rule.states_tot = counter_u64_alloc(M_WAITOK); V_pf_default_rule.src_nodes = counter_u64_alloc(M_WAITOK); V_pf_default_rule.timestamp = uma_zalloc_pcpu(pf_timestamp_pcpu_zone, M_WAITOK | M_ZERO); #ifdef PF_WANT_32_TO_64_COUNTER V_pf_kifmarker = malloc(sizeof(*V_pf_kifmarker), PFI_MTYPE, M_WAITOK | M_ZERO); V_pf_rulemarker = malloc(sizeof(*V_pf_rulemarker), M_PFRULE, M_WAITOK | M_ZERO); PF_RULES_WLOCK(); LIST_INSERT_HEAD(&V_pf_allkiflist, V_pf_kifmarker, pfik_allkiflist); LIST_INSERT_HEAD(&V_pf_allrulelist, &V_pf_default_rule, allrulelist); V_pf_allrulecount++; LIST_INSERT_HEAD(&V_pf_allrulelist, V_pf_rulemarker, allrulelist); PF_RULES_WUNLOCK(); #endif /* initialize default timeouts */ my_timeout[PFTM_TCP_FIRST_PACKET] = PFTM_TCP_FIRST_PACKET_VAL; my_timeout[PFTM_TCP_OPENING] = PFTM_TCP_OPENING_VAL; my_timeout[PFTM_TCP_ESTABLISHED] = PFTM_TCP_ESTABLISHED_VAL; my_timeout[PFTM_TCP_CLOSING] = PFTM_TCP_CLOSING_VAL; my_timeout[PFTM_TCP_FIN_WAIT] = PFTM_TCP_FIN_WAIT_VAL; my_timeout[PFTM_TCP_CLOSED] = PFTM_TCP_CLOSED_VAL; my_timeout[PFTM_UDP_FIRST_PACKET] = PFTM_UDP_FIRST_PACKET_VAL; my_timeout[PFTM_UDP_SINGLE] = PFTM_UDP_SINGLE_VAL; my_timeout[PFTM_UDP_MULTIPLE] = PFTM_UDP_MULTIPLE_VAL; my_timeout[PFTM_ICMP_FIRST_PACKET] = PFTM_ICMP_FIRST_PACKET_VAL; my_timeout[PFTM_ICMP_ERROR_REPLY] = PFTM_ICMP_ERROR_REPLY_VAL; my_timeout[PFTM_OTHER_FIRST_PACKET] = PFTM_OTHER_FIRST_PACKET_VAL; my_timeout[PFTM_OTHER_SINGLE] = PFTM_OTHER_SINGLE_VAL; my_timeout[PFTM_OTHER_MULTIPLE] = PFTM_OTHER_MULTIPLE_VAL; my_timeout[PFTM_FRAG] = PFTM_FRAG_VAL; my_timeout[PFTM_INTERVAL] = PFTM_INTERVAL_VAL; my_timeout[PFTM_SRC_NODE] = PFTM_SRC_NODE_VAL; my_timeout[PFTM_TS_DIFF] = PFTM_TS_DIFF_VAL; my_timeout[PFTM_ADAPTIVE_START] = PFSTATE_ADAPT_START; my_timeout[PFTM_ADAPTIVE_END] = PFSTATE_ADAPT_END; V_pf_status.debug = PF_DEBUG_URGENT; /* * XXX This is different than in 
OpenBSD where reassembly is enabled by * defult. In FreeBSD we expect people to still use scrub rules and * switch to the new syntax later. Only when they switch they must * explicitly enable reassemle. We could change the default once the * scrub rule functionality is hopefully removed some day in future. */ V_pf_status.reass = 0; V_pf_pfil_hooked = false; V_pf_pfil_eth_hooked = false; /* XXX do our best to avoid a conflict */ V_pf_status.hostid = arc4random(); for (int i = 0; i < PFRES_MAX; i++) V_pf_status.counters[i] = counter_u64_alloc(M_WAITOK); for (int i = 0; i < KLCNT_MAX; i++) V_pf_status.lcounters[i] = counter_u64_alloc(M_WAITOK); for (int i = 0; i < FCNT_MAX; i++) pf_counter_u64_init(&V_pf_status.fcounters[i], M_WAITOK); for (int i = 0; i < SCNT_MAX; i++) V_pf_status.scounters[i] = counter_u64_alloc(M_WAITOK); if (swi_add(&V_pf_swi_ie, "pf send", pf_intr, curvnet, SWI_NET, INTR_MPSAFE, &V_pf_swi_cookie) != 0) /* XXXGL: leaked all above. */ return; } static struct pf_kpool * pf_get_kpool(const char *anchor, u_int32_t ticket, u_int8_t rule_action, u_int32_t rule_number, u_int8_t r_last, u_int8_t active, u_int8_t check_ticket) { struct pf_kruleset *ruleset; struct pf_krule *rule; int rs_num; ruleset = pf_find_kruleset(anchor); if (ruleset == NULL) return (NULL); rs_num = pf_get_ruleset_number(rule_action); if (rs_num >= PF_RULESET_MAX) return (NULL); if (active) { if (check_ticket && ticket != ruleset->rules[rs_num].active.ticket) return (NULL); if (r_last) rule = TAILQ_LAST(ruleset->rules[rs_num].active.ptr, pf_krulequeue); else rule = TAILQ_FIRST(ruleset->rules[rs_num].active.ptr); } else { if (check_ticket && ticket != ruleset->rules[rs_num].inactive.ticket) return (NULL); if (r_last) rule = TAILQ_LAST(ruleset->rules[rs_num].inactive.ptr, pf_krulequeue); else rule = TAILQ_FIRST(ruleset->rules[rs_num].inactive.ptr); } if (!r_last) { while ((rule != NULL) && (rule->nr != rule_number)) rule = TAILQ_NEXT(rule, entries); } if (rule == NULL) return (NULL); 
return (&rule->rpool); } static void pf_mv_kpool(struct pf_kpalist *poola, struct pf_kpalist *poolb) { struct pf_kpooladdr *mv_pool_pa; while ((mv_pool_pa = TAILQ_FIRST(poola)) != NULL) { TAILQ_REMOVE(poola, mv_pool_pa, entries); TAILQ_INSERT_TAIL(poolb, mv_pool_pa, entries); } } static void pf_empty_kpool(struct pf_kpalist *poola) { struct pf_kpooladdr *pa; while ((pa = TAILQ_FIRST(poola)) != NULL) { switch (pa->addr.type) { case PF_ADDR_DYNIFTL: pfi_dynaddr_remove(pa->addr.p.dyn); break; case PF_ADDR_TABLE: /* XXX: this could be unfinished pooladdr on pabuf */ if (pa->addr.p.tbl != NULL) pfr_detach_table(pa->addr.p.tbl); break; } if (pa->kif) pfi_kkif_unref(pa->kif); TAILQ_REMOVE(poola, pa, entries); free(pa, M_PFRULE); } } static void pf_unlink_rule_locked(struct pf_krulequeue *rulequeue, struct pf_krule *rule) { PF_RULES_WASSERT(); PF_UNLNKDRULES_ASSERT(); TAILQ_REMOVE(rulequeue, rule, entries); rule->rule_ref |= PFRULE_REFS; TAILQ_INSERT_TAIL(&V_pf_unlinked_rules, rule, entries); } static void pf_unlink_rule(struct pf_krulequeue *rulequeue, struct pf_krule *rule) { PF_RULES_WASSERT(); PF_UNLNKDRULES_LOCK(); pf_unlink_rule_locked(rulequeue, rule); PF_UNLNKDRULES_UNLOCK(); } static void pf_free_eth_rule(struct pf_keth_rule *rule) { PF_RULES_WASSERT(); if (rule == NULL) return; if (rule->tag) tag_unref(&V_pf_tags, rule->tag); if (rule->match_tag) tag_unref(&V_pf_tags, rule->match_tag); #ifdef ALTQ pf_qid_unref(rule->qid); #endif if (rule->bridge_to) pfi_kkif_unref(rule->bridge_to); if (rule->kif) pfi_kkif_unref(rule->kif); if (rule->ipsrc.addr.type == PF_ADDR_TABLE) pfr_detach_table(rule->ipsrc.addr.p.tbl); if (rule->ipdst.addr.type == PF_ADDR_TABLE) pfr_detach_table(rule->ipdst.addr.p.tbl); counter_u64_free(rule->evaluations); for (int i = 0; i < 2; i++) { counter_u64_free(rule->packets[i]); counter_u64_free(rule->bytes[i]); } uma_zfree_pcpu(pf_timestamp_pcpu_zone, rule->timestamp); pf_keth_anchor_remove(rule); free(rule, M_PFRULE); } void pf_free_rule(struct 
pf_krule *rule) { PF_RULES_WASSERT(); PF_CONFIG_ASSERT(); if (rule->tag) tag_unref(&V_pf_tags, rule->tag); if (rule->match_tag) tag_unref(&V_pf_tags, rule->match_tag); #ifdef ALTQ if (rule->pqid != rule->qid) pf_qid_unref(rule->pqid); pf_qid_unref(rule->qid); #endif switch (rule->src.addr.type) { case PF_ADDR_DYNIFTL: pfi_dynaddr_remove(rule->src.addr.p.dyn); break; case PF_ADDR_TABLE: pfr_detach_table(rule->src.addr.p.tbl); break; } switch (rule->dst.addr.type) { case PF_ADDR_DYNIFTL: pfi_dynaddr_remove(rule->dst.addr.p.dyn); break; case PF_ADDR_TABLE: pfr_detach_table(rule->dst.addr.p.tbl); break; } if (rule->overload_tbl) pfr_detach_table(rule->overload_tbl); if (rule->kif) pfi_kkif_unref(rule->kif); pf_kanchor_remove(rule); pf_empty_kpool(&rule->rpool.list); pf_krule_free(rule); } static void pf_init_tagset(struct pf_tagset *ts, unsigned int *tunable_size, unsigned int default_size) { unsigned int i; unsigned int hashsize; if (*tunable_size == 0 || !powerof2(*tunable_size)) *tunable_size = default_size; hashsize = *tunable_size; ts->namehash = mallocarray(hashsize, sizeof(*ts->namehash), M_PFHASH, M_WAITOK); ts->taghash = mallocarray(hashsize, sizeof(*ts->taghash), M_PFHASH, M_WAITOK); ts->mask = hashsize - 1; ts->seed = arc4random(); for (i = 0; i < hashsize; i++) { TAILQ_INIT(&ts->namehash[i]); TAILQ_INIT(&ts->taghash[i]); } BIT_FILL(TAGID_MAX, &ts->avail); } static void pf_cleanup_tagset(struct pf_tagset *ts) { unsigned int i; unsigned int hashsize; struct pf_tagname *t, *tmp; /* * Only need to clean up one of the hashes as each tag is hashed * into each table. 
*/ hashsize = ts->mask + 1; for (i = 0; i < hashsize; i++) TAILQ_FOREACH_SAFE(t, &ts->namehash[i], namehash_entries, tmp) uma_zfree(V_pf_tag_z, t); free(ts->namehash, M_PFHASH); free(ts->taghash, M_PFHASH); } static uint16_t tagname2hashindex(const struct pf_tagset *ts, const char *tagname) { size_t len; len = strnlen(tagname, PF_TAG_NAME_SIZE - 1); return (murmur3_32_hash(tagname, len, ts->seed) & ts->mask); } static uint16_t tag2hashindex(const struct pf_tagset *ts, uint16_t tag) { return (tag & ts->mask); } static u_int16_t tagname2tag(struct pf_tagset *ts, const char *tagname) { struct pf_tagname *tag; u_int32_t index; u_int16_t new_tagid; PF_RULES_WASSERT(); index = tagname2hashindex(ts, tagname); TAILQ_FOREACH(tag, &ts->namehash[index], namehash_entries) if (strcmp(tagname, tag->name) == 0) { tag->ref++; return (tag->tag); } /* * new entry * * to avoid fragmentation, we do a linear search from the beginning * and take the first free slot we find. */ new_tagid = BIT_FFS(TAGID_MAX, &ts->avail); /* * Tags are 1-based, with valid tags in the range [1..TAGID_MAX]. * BIT_FFS() returns a 1-based bit number, with 0 indicating no bits * set. It may also return a bit number greater than TAGID_MAX due * to rounding of the number of bits in the vector up to a multiple * of the vector word size at declaration/allocation time. */ if ((new_tagid == 0) || (new_tagid > TAGID_MAX)) return (0); /* Mark the tag as in use. 
Bits are 0-based for BIT_CLR() */ BIT_CLR(TAGID_MAX, new_tagid - 1, &ts->avail); /* allocate and fill new struct pf_tagname */ tag = uma_zalloc(V_pf_tag_z, M_NOWAIT); if (tag == NULL) return (0); strlcpy(tag->name, tagname, sizeof(tag->name)); tag->tag = new_tagid; tag->ref = 1; /* Insert into namehash */ TAILQ_INSERT_TAIL(&ts->namehash[index], tag, namehash_entries); /* Insert into taghash */ index = tag2hashindex(ts, new_tagid); TAILQ_INSERT_TAIL(&ts->taghash[index], tag, taghash_entries); return (tag->tag); } static void tag_unref(struct pf_tagset *ts, u_int16_t tag) { struct pf_tagname *t; uint16_t index; PF_RULES_WASSERT(); index = tag2hashindex(ts, tag); TAILQ_FOREACH(t, &ts->taghash[index], taghash_entries) if (tag == t->tag) { if (--t->ref == 0) { TAILQ_REMOVE(&ts->taghash[index], t, taghash_entries); index = tagname2hashindex(ts, t->name); TAILQ_REMOVE(&ts->namehash[index], t, namehash_entries); /* Bits are 0-based for BIT_SET() */ BIT_SET(TAGID_MAX, tag - 1, &ts->avail); uma_zfree(V_pf_tag_z, t); } break; } } static uint16_t pf_tagname2tag(const char *tagname) { return (tagname2tag(&V_pf_tags, tagname)); } static int pf_begin_eth(uint32_t *ticket, const char *anchor) { struct pf_keth_rule *rule, *tmp; struct pf_keth_ruleset *rs; PF_RULES_WASSERT(); rs = pf_find_or_create_keth_ruleset(anchor); if (rs == NULL) return (EINVAL); /* Purge old inactive rules. */ TAILQ_FOREACH_SAFE(rule, rs->inactive.rules, entries, tmp) { TAILQ_REMOVE(rs->inactive.rules, rule, entries); pf_free_eth_rule(rule); } *ticket = ++rs->inactive.ticket; rs->inactive.open = 1; return (0); } static void pf_rollback_eth_cb(struct epoch_context *ctx) { struct pf_keth_ruleset *rs; rs = __containerof(ctx, struct pf_keth_ruleset, epoch_ctx); CURVNET_SET(rs->vnet); PF_RULES_WLOCK(); pf_rollback_eth(rs->inactive.ticket, rs->anchor ? 
rs->anchor->path : ""); PF_RULES_WUNLOCK(); CURVNET_RESTORE(); } static int pf_rollback_eth(uint32_t ticket, const char *anchor) { struct pf_keth_rule *rule, *tmp; struct pf_keth_ruleset *rs; PF_RULES_WASSERT(); rs = pf_find_keth_ruleset(anchor); if (rs == NULL) return (EINVAL); if (!rs->inactive.open || ticket != rs->inactive.ticket) return (0); /* Purge old inactive rules. */ TAILQ_FOREACH_SAFE(rule, rs->inactive.rules, entries, tmp) { TAILQ_REMOVE(rs->inactive.rules, rule, entries); pf_free_eth_rule(rule); } rs->inactive.open = 0; pf_remove_if_empty_keth_ruleset(rs); return (0); } #define PF_SET_SKIP_STEPS(i) \ do { \ while (head[i] != cur) { \ head[i]->skip[i].ptr = cur; \ head[i] = TAILQ_NEXT(head[i], entries); \ } \ } while (0) static void pf_eth_calc_skip_steps(struct pf_keth_ruleq *rules) { struct pf_keth_rule *cur, *prev, *head[PFE_SKIP_COUNT]; int i; cur = TAILQ_FIRST(rules); prev = cur; for (i = 0; i < PFE_SKIP_COUNT; ++i) head[i] = cur; while (cur != NULL) { if (cur->kif != prev->kif || cur->ifnot != prev->ifnot) PF_SET_SKIP_STEPS(PFE_SKIP_IFP); if (cur->direction != prev->direction) PF_SET_SKIP_STEPS(PFE_SKIP_DIR); if (cur->proto != prev->proto) PF_SET_SKIP_STEPS(PFE_SKIP_PROTO); if (memcmp(&cur->src, &prev->src, sizeof(cur->src)) != 0) PF_SET_SKIP_STEPS(PFE_SKIP_SRC_ADDR); if (memcmp(&cur->dst, &prev->dst, sizeof(cur->dst)) != 0) PF_SET_SKIP_STEPS(PFE_SKIP_DST_ADDR); if (cur->ipsrc.neg != prev->ipsrc.neg || pf_addr_wrap_neq(&cur->ipsrc.addr, &prev->ipsrc.addr)) PF_SET_SKIP_STEPS(PFE_SKIP_SRC_IP_ADDR); if (cur->ipdst.neg != prev->ipdst.neg || pf_addr_wrap_neq(&cur->ipdst.addr, &prev->ipdst.addr)) PF_SET_SKIP_STEPS(PFE_SKIP_DST_IP_ADDR); prev = cur; cur = TAILQ_NEXT(cur, entries); } for (i = 0; i < PFE_SKIP_COUNT; ++i) PF_SET_SKIP_STEPS(i); } static int pf_commit_eth(uint32_t ticket, const char *anchor) { struct pf_keth_ruleq *rules; struct pf_keth_ruleset *rs; rs = pf_find_keth_ruleset(anchor); if (rs == NULL) { return (EINVAL); } if 
(!rs->inactive.open || ticket != rs->inactive.ticket) return (EBUSY); PF_RULES_WASSERT(); pf_eth_calc_skip_steps(rs->inactive.rules); rules = rs->active.rules; ck_pr_store_ptr(&rs->active.rules, rs->inactive.rules); rs->inactive.rules = rules; rs->inactive.ticket = rs->active.ticket; /* Clean up inactive rules (i.e. previously active rules), only when * we're sure they're no longer used. */ NET_EPOCH_CALL(pf_rollback_eth_cb, &rs->epoch_ctx); return (0); } #ifdef ALTQ static uint16_t pf_qname2qid(const char *qname) { return (tagname2tag(&V_pf_qids, qname)); } static void pf_qid_unref(uint16_t qid) { tag_unref(&V_pf_qids, qid); } static int pf_begin_altq(u_int32_t *ticket) { struct pf_altq *altq, *tmp; int error = 0; PF_RULES_WASSERT(); /* Purge the old altq lists */ TAILQ_FOREACH_SAFE(altq, V_pf_altq_ifs_inactive, entries, tmp) { if ((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) == 0) { /* detach and destroy the discipline */ error = altq_remove(altq); } free(altq, M_PFALTQ); } TAILQ_INIT(V_pf_altq_ifs_inactive); TAILQ_FOREACH_SAFE(altq, V_pf_altqs_inactive, entries, tmp) { pf_qid_unref(altq->qid); free(altq, M_PFALTQ); } TAILQ_INIT(V_pf_altqs_inactive); if (error) return (error); *ticket = ++V_ticket_altqs_inactive; V_altqs_inactive_open = 1; return (0); } static int pf_rollback_altq(u_int32_t ticket) { struct pf_altq *altq, *tmp; int error = 0; PF_RULES_WASSERT(); if (!V_altqs_inactive_open || ticket != V_ticket_altqs_inactive) return (0); /* Purge the old altq lists */ TAILQ_FOREACH_SAFE(altq, V_pf_altq_ifs_inactive, entries, tmp) { if ((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) == 0) { /* detach and destroy the discipline */ error = altq_remove(altq); } free(altq, M_PFALTQ); } TAILQ_INIT(V_pf_altq_ifs_inactive); TAILQ_FOREACH_SAFE(altq, V_pf_altqs_inactive, entries, tmp) { pf_qid_unref(altq->qid); free(altq, M_PFALTQ); } TAILQ_INIT(V_pf_altqs_inactive); V_altqs_inactive_open = 0; return (error); } static int pf_commit_altq(u_int32_t ticket) { struct 
pf_altqqueue *old_altqs, *old_altq_ifs; struct pf_altq *altq, *tmp; int err, error = 0; PF_RULES_WASSERT(); if (!V_altqs_inactive_open || ticket != V_ticket_altqs_inactive) return (EBUSY); /* swap altqs, keep the old. */ old_altqs = V_pf_altqs_active; old_altq_ifs = V_pf_altq_ifs_active; V_pf_altqs_active = V_pf_altqs_inactive; V_pf_altq_ifs_active = V_pf_altq_ifs_inactive; V_pf_altqs_inactive = old_altqs; V_pf_altq_ifs_inactive = old_altq_ifs; V_ticket_altqs_active = V_ticket_altqs_inactive; /* Attach new disciplines */ TAILQ_FOREACH(altq, V_pf_altq_ifs_active, entries) { if ((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) == 0) { /* attach the discipline */ error = altq_pfattach(altq); if (error == 0 && V_pf_altq_running) error = pf_enable_altq(altq); if (error != 0) return (error); } } /* Purge the old altq lists */ TAILQ_FOREACH_SAFE(altq, V_pf_altq_ifs_inactive, entries, tmp) { if ((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) == 0) { /* detach and destroy the discipline */ if (V_pf_altq_running) error = pf_disable_altq(altq); err = altq_pfdetach(altq); if (err != 0 && error == 0) error = err; err = altq_remove(altq); if (err != 0 && error == 0) error = err; } free(altq, M_PFALTQ); } TAILQ_INIT(V_pf_altq_ifs_inactive); TAILQ_FOREACH_SAFE(altq, V_pf_altqs_inactive, entries, tmp) { pf_qid_unref(altq->qid); free(altq, M_PFALTQ); } TAILQ_INIT(V_pf_altqs_inactive); V_altqs_inactive_open = 0; return (error); } static int pf_enable_altq(struct pf_altq *altq) { struct ifnet *ifp; struct tb_profile tb; int error = 0; if ((ifp = ifunit(altq->ifname)) == NULL) return (EINVAL); if (ifp->if_snd.altq_type != ALTQT_NONE) error = altq_enable(&ifp->if_snd); /* set tokenbucket regulator */ if (error == 0 && ifp != NULL && ALTQ_IS_ENABLED(&ifp->if_snd)) { tb.rate = altq->ifbandwidth; tb.depth = altq->tbrsize; error = tbr_set(&ifp->if_snd, &tb); } return (error); } static int pf_disable_altq(struct pf_altq *altq) { struct ifnet *ifp; struct tb_profile tb; int error; if ((ifp = 
ifunit(altq->ifname)) == NULL) return (EINVAL); /* * when the discipline is no longer referenced, it was overridden * by a new one. if so, just return. */ if (altq->altq_disc != ifp->if_snd.altq_disc) return (0); error = altq_disable(&ifp->if_snd); if (error == 0) { /* clear tokenbucket regulator */ tb.rate = 0; error = tbr_set(&ifp->if_snd, &tb); } return (error); } static int pf_altq_ifnet_event_add(struct ifnet *ifp, int remove, u_int32_t ticket, struct pf_altq *altq) { struct ifnet *ifp1; int error = 0; /* Deactivate the interface in question */ altq->local_flags &= ~PFALTQ_FLAG_IF_REMOVED; if ((ifp1 = ifunit(altq->ifname)) == NULL || (remove && ifp1 == ifp)) { altq->local_flags |= PFALTQ_FLAG_IF_REMOVED; } else { error = altq_add(ifp1, altq); if (ticket != V_ticket_altqs_inactive) error = EBUSY; if (error) free(altq, M_PFALTQ); } return (error); } void pf_altq_ifnet_event(struct ifnet *ifp, int remove) { struct pf_altq *a1, *a2, *a3; u_int32_t ticket; int error = 0; /* * No need to re-evaluate the configuration for events on interfaces * that do not support ALTQ, as it's not possible for such * interfaces to be part of the configuration. 
*/ if (!ALTQ_IS_READY(&ifp->if_snd)) return; /* Interrupt userland queue modifications */ if (V_altqs_inactive_open) pf_rollback_altq(V_ticket_altqs_inactive); /* Start new altq ruleset */ if (pf_begin_altq(&ticket)) return; /* Copy the current active set */ TAILQ_FOREACH(a1, V_pf_altq_ifs_active, entries) { a2 = malloc(sizeof(*a2), M_PFALTQ, M_NOWAIT); if (a2 == NULL) { error = ENOMEM; break; } bcopy(a1, a2, sizeof(struct pf_altq)); error = pf_altq_ifnet_event_add(ifp, remove, ticket, a2); if (error) break; TAILQ_INSERT_TAIL(V_pf_altq_ifs_inactive, a2, entries); } if (error) goto out; TAILQ_FOREACH(a1, V_pf_altqs_active, entries) { a2 = malloc(sizeof(*a2), M_PFALTQ, M_NOWAIT); if (a2 == NULL) { error = ENOMEM; break; } bcopy(a1, a2, sizeof(struct pf_altq)); if ((a2->qid = pf_qname2qid(a2->qname)) == 0) { error = EBUSY; free(a2, M_PFALTQ); break; } a2->altq_disc = NULL; TAILQ_FOREACH(a3, V_pf_altq_ifs_inactive, entries) { if (strncmp(a3->ifname, a2->ifname, IFNAMSIZ) == 0) { a2->altq_disc = a3->altq_disc; break; } } error = pf_altq_ifnet_event_add(ifp, remove, ticket, a2); if (error) break; TAILQ_INSERT_TAIL(V_pf_altqs_inactive, a2, entries); } out: if (error != 0) pf_rollback_altq(ticket); else pf_commit_altq(ticket); } #endif /* ALTQ */ static struct pf_krule_global * pf_rule_tree_alloc(int flags) { struct pf_krule_global *tree; tree = malloc(sizeof(struct pf_krule_global), M_TEMP, flags); if (tree == NULL) return (NULL); RB_INIT(tree); return (tree); } static void pf_rule_tree_free(struct pf_krule_global *tree) { free(tree, M_TEMP); } static int pf_begin_rules(u_int32_t *ticket, int rs_num, const char *anchor) { struct pf_krule_global *tree; struct pf_kruleset *rs; struct pf_krule *rule; PF_RULES_WASSERT(); if (rs_num < 0 || rs_num >= PF_RULESET_MAX) return (EINVAL); tree = pf_rule_tree_alloc(M_NOWAIT); if (tree == NULL) return (ENOMEM); rs = pf_find_or_create_kruleset(anchor); if (rs == NULL) { free(tree, M_TEMP); return (EINVAL); } 
pf_rule_tree_free(rs->rules[rs_num].inactive.tree); rs->rules[rs_num].inactive.tree = tree; while ((rule = TAILQ_FIRST(rs->rules[rs_num].inactive.ptr)) != NULL) { pf_unlink_rule(rs->rules[rs_num].inactive.ptr, rule); rs->rules[rs_num].inactive.rcount--; } *ticket = ++rs->rules[rs_num].inactive.ticket; rs->rules[rs_num].inactive.open = 1; return (0); } static int pf_rollback_rules(u_int32_t ticket, int rs_num, char *anchor) { struct pf_kruleset *rs; struct pf_krule *rule; PF_RULES_WASSERT(); if (rs_num < 0 || rs_num >= PF_RULESET_MAX) return (EINVAL); rs = pf_find_kruleset(anchor); if (rs == NULL || !rs->rules[rs_num].inactive.open || rs->rules[rs_num].inactive.ticket != ticket) return (0); while ((rule = TAILQ_FIRST(rs->rules[rs_num].inactive.ptr)) != NULL) { pf_unlink_rule(rs->rules[rs_num].inactive.ptr, rule); rs->rules[rs_num].inactive.rcount--; } rs->rules[rs_num].inactive.open = 0; return (0); } #define PF_MD5_UPD(st, elm) \ MD5Update(ctx, (u_int8_t *) &(st)->elm, sizeof((st)->elm)) #define PF_MD5_UPD_STR(st, elm) \ MD5Update(ctx, (u_int8_t *) (st)->elm, strlen((st)->elm)) #define PF_MD5_UPD_HTONL(st, elm, stor) do { \ (stor) = htonl((st)->elm); \ MD5Update(ctx, (u_int8_t *) &(stor), sizeof(u_int32_t));\ } while (0) #define PF_MD5_UPD_HTONS(st, elm, stor) do { \ (stor) = htons((st)->elm); \ MD5Update(ctx, (u_int8_t *) &(stor), sizeof(u_int16_t));\ } while (0) static void pf_hash_rule_addr(MD5_CTX *ctx, struct pf_rule_addr *pfr) { PF_MD5_UPD(pfr, addr.type); switch (pfr->addr.type) { case PF_ADDR_DYNIFTL: PF_MD5_UPD(pfr, addr.v.ifname); PF_MD5_UPD(pfr, addr.iflags); break; case PF_ADDR_TABLE: PF_MD5_UPD(pfr, addr.v.tblname); break; case PF_ADDR_ADDRMASK: /* XXX ignore af? 
*/ PF_MD5_UPD(pfr, addr.v.a.addr.addr32); PF_MD5_UPD(pfr, addr.v.a.mask.addr32); break; } PF_MD5_UPD(pfr, port[0]); PF_MD5_UPD(pfr, port[1]); PF_MD5_UPD(pfr, neg); PF_MD5_UPD(pfr, port_op); } static void pf_hash_rule_rolling(MD5_CTX *ctx, struct pf_krule *rule) { u_int16_t x; u_int32_t y; pf_hash_rule_addr(ctx, &rule->src); pf_hash_rule_addr(ctx, &rule->dst); for (int i = 0; i < PF_RULE_MAX_LABEL_COUNT; i++) PF_MD5_UPD_STR(rule, label[i]); PF_MD5_UPD_STR(rule, ifname); PF_MD5_UPD_STR(rule, match_tagname); PF_MD5_UPD_HTONS(rule, match_tag, x); /* dup? */ PF_MD5_UPD_HTONL(rule, os_fingerprint, y); PF_MD5_UPD_HTONL(rule, prob, y); PF_MD5_UPD_HTONL(rule, uid.uid[0], y); PF_MD5_UPD_HTONL(rule, uid.uid[1], y); PF_MD5_UPD(rule, uid.op); PF_MD5_UPD_HTONL(rule, gid.gid[0], y); PF_MD5_UPD_HTONL(rule, gid.gid[1], y); PF_MD5_UPD(rule, gid.op); PF_MD5_UPD_HTONL(rule, rule_flag, y); PF_MD5_UPD(rule, action); PF_MD5_UPD(rule, direction); PF_MD5_UPD(rule, af); PF_MD5_UPD(rule, quick); PF_MD5_UPD(rule, ifnot); PF_MD5_UPD(rule, match_tag_not); PF_MD5_UPD(rule, natpass); PF_MD5_UPD(rule, keep_state); PF_MD5_UPD(rule, proto); PF_MD5_UPD(rule, type); PF_MD5_UPD(rule, code); PF_MD5_UPD(rule, flags); PF_MD5_UPD(rule, flagset); PF_MD5_UPD(rule, allow_opts); PF_MD5_UPD(rule, rt); PF_MD5_UPD(rule, tos); PF_MD5_UPD(rule, scrub_flags); PF_MD5_UPD(rule, min_ttl); PF_MD5_UPD(rule, set_tos); if (rule->anchor != NULL) PF_MD5_UPD_STR(rule, anchor->path); } static void pf_hash_rule(struct pf_krule *rule) { MD5_CTX ctx; MD5Init(&ctx); pf_hash_rule_rolling(&ctx, rule); MD5Final(rule->md5sum, &ctx); } static int pf_krule_compare(struct pf_krule *a, struct pf_krule *b) { return (memcmp(a->md5sum, b->md5sum, PF_MD5_DIGEST_LENGTH)); } static int pf_commit_rules(u_int32_t ticket, int rs_num, char *anchor) { struct pf_kruleset *rs; struct pf_krule *rule, **old_array, *old_rule; struct pf_krulequeue *old_rules; struct pf_krule_global *old_tree; int error; u_int32_t old_rcount; PF_RULES_WASSERT(); if (rs_num 
< 0 || rs_num >= PF_RULESET_MAX) return (EINVAL); rs = pf_find_kruleset(anchor); if (rs == NULL || !rs->rules[rs_num].inactive.open || ticket != rs->rules[rs_num].inactive.ticket) return (EBUSY); /* Calculate checksum for the main ruleset */ if (rs == &pf_main_ruleset) { error = pf_setup_pfsync_matching(rs); if (error != 0) return (error); } /* Swap rules, keep the old. */ old_rules = rs->rules[rs_num].active.ptr; old_rcount = rs->rules[rs_num].active.rcount; old_array = rs->rules[rs_num].active.ptr_array; old_tree = rs->rules[rs_num].active.tree; rs->rules[rs_num].active.ptr = rs->rules[rs_num].inactive.ptr; rs->rules[rs_num].active.ptr_array = rs->rules[rs_num].inactive.ptr_array; rs->rules[rs_num].active.tree = rs->rules[rs_num].inactive.tree; rs->rules[rs_num].active.rcount = rs->rules[rs_num].inactive.rcount; /* Attempt to preserve counter information. */ if (V_pf_status.keep_counters && old_tree != NULL) { TAILQ_FOREACH(rule, rs->rules[rs_num].active.ptr, entries) { old_rule = RB_FIND(pf_krule_global, old_tree, rule); if (old_rule == NULL) { continue; } pf_counter_u64_critical_enter(); pf_counter_u64_add_protected(&rule->evaluations, pf_counter_u64_fetch(&old_rule->evaluations)); pf_counter_u64_add_protected(&rule->packets[0], pf_counter_u64_fetch(&old_rule->packets[0])); pf_counter_u64_add_protected(&rule->packets[1], pf_counter_u64_fetch(&old_rule->packets[1])); pf_counter_u64_add_protected(&rule->bytes[0], pf_counter_u64_fetch(&old_rule->bytes[0])); pf_counter_u64_add_protected(&rule->bytes[1], pf_counter_u64_fetch(&old_rule->bytes[1])); pf_counter_u64_critical_exit(); } } rs->rules[rs_num].inactive.ptr = old_rules; rs->rules[rs_num].inactive.ptr_array = old_array; rs->rules[rs_num].inactive.tree = NULL; /* important for pf_ioctl_addrule */ rs->rules[rs_num].inactive.rcount = old_rcount; rs->rules[rs_num].active.ticket = rs->rules[rs_num].inactive.ticket; pf_calc_skip_steps(rs->rules[rs_num].active.ptr); /* Purge the old rule list. 
*/ PF_UNLNKDRULES_LOCK(); while ((rule = TAILQ_FIRST(old_rules)) != NULL) pf_unlink_rule_locked(old_rules, rule); PF_UNLNKDRULES_UNLOCK(); if (rs->rules[rs_num].inactive.ptr_array) free(rs->rules[rs_num].inactive.ptr_array, M_TEMP); rs->rules[rs_num].inactive.ptr_array = NULL; rs->rules[rs_num].inactive.rcount = 0; rs->rules[rs_num].inactive.open = 0; pf_remove_if_empty_kruleset(rs); free(old_tree, M_TEMP); return (0); } static int pf_setup_pfsync_matching(struct pf_kruleset *rs) { MD5_CTX ctx; struct pf_krule *rule; int rs_cnt; u_int8_t digest[PF_MD5_DIGEST_LENGTH]; MD5Init(&ctx); for (rs_cnt = 0; rs_cnt < PF_RULESET_MAX; rs_cnt++) { /* XXX PF_RULESET_SCRUB as well? */ if (rs_cnt == PF_RULESET_SCRUB) continue; if (rs->rules[rs_cnt].inactive.ptr_array) free(rs->rules[rs_cnt].inactive.ptr_array, M_TEMP); rs->rules[rs_cnt].inactive.ptr_array = NULL; if (rs->rules[rs_cnt].inactive.rcount) { rs->rules[rs_cnt].inactive.ptr_array = mallocarray(rs->rules[rs_cnt].inactive.rcount, sizeof(struct pf_rule **), M_TEMP, M_NOWAIT); if (!rs->rules[rs_cnt].inactive.ptr_array) return (ENOMEM); } TAILQ_FOREACH(rule, rs->rules[rs_cnt].inactive.ptr, entries) { pf_hash_rule_rolling(&ctx, rule); (rs->rules[rs_cnt].inactive.ptr_array)[rule->nr] = rule; } } MD5Final(digest, &ctx); memcpy(V_pf_status.pf_chksum, digest, sizeof(V_pf_status.pf_chksum)); return (0); } static int pf_eth_addr_setup(struct pf_keth_ruleset *ruleset, struct pf_addr_wrap *addr) { int error = 0; switch (addr->type) { case PF_ADDR_TABLE: addr->p.tbl = pfr_eth_attach_table(ruleset, addr->v.tblname); if (addr->p.tbl == NULL) error = ENOMEM; break; default: error = EINVAL; } return (error); } static int pf_addr_setup(struct pf_kruleset *ruleset, struct pf_addr_wrap *addr, sa_family_t af) { int error = 0; switch (addr->type) { case PF_ADDR_TABLE: addr->p.tbl = pfr_attach_table(ruleset, addr->v.tblname); if (addr->p.tbl == NULL) error = ENOMEM; break; case PF_ADDR_DYNIFTL: error = pfi_dynaddr_setup(addr, af); break; } 
return (error); } static void pf_addr_copyout(struct pf_addr_wrap *addr) { switch (addr->type) { case PF_ADDR_DYNIFTL: pfi_dynaddr_copyout(addr); break; case PF_ADDR_TABLE: pf_tbladdr_copyout(addr); break; } } static void pf_src_node_copy(const struct pf_ksrc_node *in, struct pf_src_node *out) { int secs = time_uptime, diff; bzero(out, sizeof(struct pf_src_node)); bcopy(&in->addr, &out->addr, sizeof(struct pf_addr)); bcopy(&in->raddr, &out->raddr, sizeof(struct pf_addr)); if (in->rule.ptr != NULL) out->rule.nr = in->rule.ptr->nr; for (int i = 0; i < 2; i++) { out->bytes[i] = counter_u64_fetch(in->bytes[i]); out->packets[i] = counter_u64_fetch(in->packets[i]); } out->states = in->states; out->conn = in->conn; out->af = in->af; out->ruletype = in->ruletype; out->creation = secs - in->creation; /* 'out' was zeroed above, so read the expiry from 'in'. */ if (in->expire > secs) out->expire = in->expire - secs; else out->expire = 0; /* Adjust the connection rate estimate. */ diff = secs - in->conn_rate.last; if (diff >= in->conn_rate.seconds) out->conn_rate.count = 0; else out->conn_rate.count = in->conn_rate.count - in->conn_rate.count * diff / in->conn_rate.seconds; } #ifdef ALTQ /* * Handle export of struct pf_kaltq to user binaries that may be using any * version of struct pf_altq.
*/ static int pf_export_kaltq(struct pf_altq *q, struct pfioc_altq_v1 *pa, size_t ioc_size) { u_int32_t version; if (ioc_size == sizeof(struct pfioc_altq_v0)) version = 0; else version = pa->version; if (version > PFIOC_ALTQ_VERSION) return (EINVAL); #define ASSIGN(x) exported_q->x = q->x #define COPY(x) \ bcopy(&q->x, &exported_q->x, min(sizeof(q->x), sizeof(exported_q->x))) #define SATU16(x) (u_int32_t)uqmin((x), USHRT_MAX) #define SATU32(x) (u_int32_t)uqmin((x), UINT_MAX) switch (version) { case 0: { struct pf_altq_v0 *exported_q = &((struct pfioc_altq_v0 *)pa)->altq; COPY(ifname); ASSIGN(scheduler); ASSIGN(tbrsize); exported_q->tbrsize = SATU16(q->tbrsize); exported_q->ifbandwidth = SATU32(q->ifbandwidth); COPY(qname); COPY(parent); ASSIGN(parent_qid); exported_q->bandwidth = SATU32(q->bandwidth); ASSIGN(priority); ASSIGN(local_flags); ASSIGN(qlimit); ASSIGN(flags); if (q->scheduler == ALTQT_HFSC) { #define ASSIGN_OPT(x) exported_q->pq_u.hfsc_opts.x = q->pq_u.hfsc_opts.x #define ASSIGN_OPT_SATU32(x) exported_q->pq_u.hfsc_opts.x = \ SATU32(q->pq_u.hfsc_opts.x) ASSIGN_OPT_SATU32(rtsc_m1); ASSIGN_OPT(rtsc_d); ASSIGN_OPT_SATU32(rtsc_m2); ASSIGN_OPT_SATU32(lssc_m1); ASSIGN_OPT(lssc_d); ASSIGN_OPT_SATU32(lssc_m2); ASSIGN_OPT_SATU32(ulsc_m1); ASSIGN_OPT(ulsc_d); ASSIGN_OPT_SATU32(ulsc_m2); ASSIGN_OPT(flags); #undef ASSIGN_OPT #undef ASSIGN_OPT_SATU32 } else COPY(pq_u); ASSIGN(qid); break; } case 1: { struct pf_altq_v1 *exported_q = &((struct pfioc_altq_v1 *)pa)->altq; COPY(ifname); ASSIGN(scheduler); ASSIGN(tbrsize); ASSIGN(ifbandwidth); COPY(qname); COPY(parent); ASSIGN(parent_qid); ASSIGN(bandwidth); ASSIGN(priority); ASSIGN(local_flags); ASSIGN(qlimit); ASSIGN(flags); COPY(pq_u); ASSIGN(qid); break; } default: panic("%s: unhandled struct pfioc_altq version", __func__); break; } #undef ASSIGN #undef COPY #undef SATU16 #undef SATU32 return (0); } /* * Handle import to struct pf_kaltq of struct pf_altq from user binaries * that may be using any version of it. 
*/ static int pf_import_kaltq(struct pfioc_altq_v1 *pa, struct pf_altq *q, size_t ioc_size) { u_int32_t version; if (ioc_size == sizeof(struct pfioc_altq_v0)) version = 0; else version = pa->version; if (version > PFIOC_ALTQ_VERSION) return (EINVAL); #define ASSIGN(x) q->x = imported_q->x #define COPY(x) \ bcopy(&imported_q->x, &q->x, min(sizeof(imported_q->x), sizeof(q->x))) switch (version) { case 0: { struct pf_altq_v0 *imported_q = &((struct pfioc_altq_v0 *)pa)->altq; COPY(ifname); ASSIGN(scheduler); ASSIGN(tbrsize); /* 16-bit -> 32-bit */ ASSIGN(ifbandwidth); /* 32-bit -> 64-bit */ COPY(qname); COPY(parent); ASSIGN(parent_qid); ASSIGN(bandwidth); /* 32-bit -> 64-bit */ ASSIGN(priority); ASSIGN(local_flags); ASSIGN(qlimit); ASSIGN(flags); if (imported_q->scheduler == ALTQT_HFSC) { #define ASSIGN_OPT(x) q->pq_u.hfsc_opts.x = imported_q->pq_u.hfsc_opts.x /* * The m1 and m2 parameters are being copied from * 32-bit to 64-bit. */ ASSIGN_OPT(rtsc_m1); ASSIGN_OPT(rtsc_d); ASSIGN_OPT(rtsc_m2); ASSIGN_OPT(lssc_m1); ASSIGN_OPT(lssc_d); ASSIGN_OPT(lssc_m2); ASSIGN_OPT(ulsc_m1); ASSIGN_OPT(ulsc_d); ASSIGN_OPT(ulsc_m2); ASSIGN_OPT(flags); #undef ASSIGN_OPT } else COPY(pq_u); ASSIGN(qid); break; } case 1: { struct pf_altq_v1 *imported_q = &((struct pfioc_altq_v1 *)pa)->altq; COPY(ifname); ASSIGN(scheduler); ASSIGN(tbrsize); ASSIGN(ifbandwidth); COPY(qname); COPY(parent); ASSIGN(parent_qid); ASSIGN(bandwidth); ASSIGN(priority); ASSIGN(local_flags); ASSIGN(qlimit); ASSIGN(flags); COPY(pq_u); ASSIGN(qid); break; } default: panic("%s: unhandled struct pfioc_altq version", __func__); break; } #undef ASSIGN #undef COPY return (0); } static struct pf_altq * pf_altq_get_nth_active(u_int32_t n) { struct pf_altq *altq; u_int32_t nr; nr = 0; TAILQ_FOREACH(altq, V_pf_altq_ifs_active, entries) { if (nr == n) return (altq); nr++; } TAILQ_FOREACH(altq, V_pf_altqs_active, entries) { if (nr == n) return (altq); nr++; } return (NULL); } #endif /* ALTQ */ struct pf_krule * 
pf_krule_alloc(void) { struct pf_krule *rule; rule = malloc(sizeof(struct pf_krule), M_PFRULE, M_WAITOK | M_ZERO); mtx_init(&rule->rpool.mtx, "pf_krule_pool", NULL, MTX_DEF); rule->timestamp = uma_zalloc_pcpu(pf_timestamp_pcpu_zone, M_WAITOK | M_ZERO); return (rule); } void pf_krule_free(struct pf_krule *rule) { #ifdef PF_WANT_32_TO_64_COUNTER bool wowned; #endif if (rule == NULL) return; #ifdef PF_WANT_32_TO_64_COUNTER if (rule->allrulelinked) { wowned = PF_RULES_WOWNED(); if (!wowned) PF_RULES_WLOCK(); LIST_REMOVE(rule, allrulelist); V_pf_allrulecount--; if (!wowned) PF_RULES_WUNLOCK(); } #endif pf_counter_u64_deinit(&rule->evaluations); for (int i = 0; i < 2; i++) { pf_counter_u64_deinit(&rule->packets[i]); pf_counter_u64_deinit(&rule->bytes[i]); } counter_u64_free(rule->states_cur); counter_u64_free(rule->states_tot); counter_u64_free(rule->src_nodes); uma_zfree_pcpu(pf_timestamp_pcpu_zone, rule->timestamp); mtx_destroy(&rule->rpool.mtx); free(rule, M_PFRULE); } static void pf_kpooladdr_to_pooladdr(const struct pf_kpooladdr *kpool, struct pf_pooladdr *pool) { bzero(pool, sizeof(*pool)); bcopy(&kpool->addr, &pool->addr, sizeof(pool->addr)); strlcpy(pool->ifname, kpool->ifname, sizeof(pool->ifname)); } static int pf_pooladdr_to_kpooladdr(const struct pf_pooladdr *pool, struct pf_kpooladdr *kpool) { int ret; bzero(kpool, sizeof(*kpool)); bcopy(&pool->addr, &kpool->addr, sizeof(kpool->addr)); ret = pf_user_strcpy(kpool->ifname, pool->ifname, sizeof(kpool->ifname)); return (ret); } static void pf_kpool_to_pool(const struct pf_kpool *kpool, struct pf_pool *pool) { bzero(pool, sizeof(*pool)); bcopy(&kpool->key, &pool->key, sizeof(pool->key)); bcopy(&kpool->counter, &pool->counter, sizeof(pool->counter)); pool->tblidx = kpool->tblidx; pool->proxy_port[0] = kpool->proxy_port[0]; pool->proxy_port[1] = kpool->proxy_port[1]; pool->opts = kpool->opts; } static void pf_pool_to_kpool(const struct pf_pool *pool, struct pf_kpool *kpool) { _Static_assert(sizeof(pool->key) == 
sizeof(kpool->key), ""); _Static_assert(sizeof(pool->counter) == sizeof(kpool->counter), ""); bcopy(&pool->key, &kpool->key, sizeof(kpool->key)); bcopy(&pool->counter, &kpool->counter, sizeof(kpool->counter)); kpool->tblidx = pool->tblidx; kpool->proxy_port[0] = pool->proxy_port[0]; kpool->proxy_port[1] = pool->proxy_port[1]; kpool->opts = pool->opts; } static void pf_krule_to_rule(const struct pf_krule *krule, struct pf_rule *rule) { bzero(rule, sizeof(*rule)); bcopy(&krule->src, &rule->src, sizeof(rule->src)); bcopy(&krule->dst, &rule->dst, sizeof(rule->dst)); for (int i = 0; i < PF_SKIP_COUNT; ++i) { /* Test the kernel rule: 'rule' was just zeroed, so its ptr is always NULL. */ if (krule->skip[i].ptr == NULL) rule->skip[i].nr = -1; else rule->skip[i].nr = krule->skip[i].ptr->nr; } strlcpy(rule->label, krule->label[0], sizeof(rule->label)); strlcpy(rule->ifname, krule->ifname, sizeof(rule->ifname)); strlcpy(rule->qname, krule->qname, sizeof(rule->qname)); strlcpy(rule->pqname, krule->pqname, sizeof(rule->pqname)); strlcpy(rule->tagname, krule->tagname, sizeof(rule->tagname)); strlcpy(rule->match_tagname, krule->match_tagname, sizeof(rule->match_tagname)); strlcpy(rule->overload_tblname, krule->overload_tblname, sizeof(rule->overload_tblname)); pf_kpool_to_pool(&krule->rpool, &rule->rpool); rule->evaluations = pf_counter_u64_fetch(&krule->evaluations); for (int i = 0; i < 2; i++) { rule->packets[i] = pf_counter_u64_fetch(&krule->packets[i]); rule->bytes[i] = pf_counter_u64_fetch(&krule->bytes[i]); } /* kif, anchor, overload_tbl are not copied over.
*/ rule->os_fingerprint = krule->os_fingerprint; rule->rtableid = krule->rtableid; bcopy(krule->timeout, rule->timeout, sizeof(krule->timeout)); rule->max_states = krule->max_states; rule->max_src_nodes = krule->max_src_nodes; rule->max_src_states = krule->max_src_states; rule->max_src_conn = krule->max_src_conn; rule->max_src_conn_rate.limit = krule->max_src_conn_rate.limit; rule->max_src_conn_rate.seconds = krule->max_src_conn_rate.seconds; rule->qid = krule->qid; rule->pqid = krule->pqid; rule->nr = krule->nr; rule->prob = krule->prob; rule->cuid = krule->cuid; rule->cpid = krule->cpid; rule->return_icmp = krule->return_icmp; rule->return_icmp6 = krule->return_icmp6; rule->max_mss = krule->max_mss; rule->tag = krule->tag; rule->match_tag = krule->match_tag; rule->scrub_flags = krule->scrub_flags; bcopy(&krule->uid, &rule->uid, sizeof(krule->uid)); bcopy(&krule->gid, &rule->gid, sizeof(krule->gid)); rule->rule_flag = krule->rule_flag; rule->action = krule->action; rule->direction = krule->direction; rule->log = krule->log; rule->logif = krule->logif; rule->quick = krule->quick; rule->ifnot = krule->ifnot; rule->match_tag_not = krule->match_tag_not; rule->natpass = krule->natpass; rule->keep_state = krule->keep_state; rule->af = krule->af; rule->proto = krule->proto; rule->type = krule->type; rule->code = krule->code; rule->flags = krule->flags; rule->flagset = krule->flagset; rule->min_ttl = krule->min_ttl; rule->allow_opts = krule->allow_opts; rule->rt = krule->rt; rule->return_ttl = krule->return_ttl; rule->tos = krule->tos; rule->set_tos = krule->set_tos; rule->anchor_relative = krule->anchor_relative; rule->anchor_wildcard = krule->anchor_wildcard; rule->flush = krule->flush; rule->prio = krule->prio; rule->set_prio[0] = krule->set_prio[0]; rule->set_prio[1] = krule->set_prio[1]; bcopy(&krule->divert, &rule->divert, sizeof(krule->divert)); rule->u_states_cur = counter_u64_fetch(krule->states_cur); rule->u_states_tot = counter_u64_fetch(krule->states_tot); 
rule->u_src_nodes = counter_u64_fetch(krule->src_nodes); } static int pf_rule_to_krule(const struct pf_rule *rule, struct pf_krule *krule) { int ret; #ifndef INET if (rule->af == AF_INET) { return (EAFNOSUPPORT); } #endif /* INET */ #ifndef INET6 if (rule->af == AF_INET6) { return (EAFNOSUPPORT); } #endif /* INET6 */ ret = pf_check_rule_addr(&rule->src); if (ret != 0) return (ret); ret = pf_check_rule_addr(&rule->dst); if (ret != 0) return (ret); bcopy(&rule->src, &krule->src, sizeof(rule->src)); bcopy(&rule->dst, &krule->dst, sizeof(rule->dst)); ret = pf_user_strcpy(krule->label[0], rule->label, sizeof(rule->label)); if (ret != 0) return (ret); ret = pf_user_strcpy(krule->ifname, rule->ifname, sizeof(rule->ifname)); if (ret != 0) return (ret); ret = pf_user_strcpy(krule->qname, rule->qname, sizeof(rule->qname)); if (ret != 0) return (ret); ret = pf_user_strcpy(krule->pqname, rule->pqname, sizeof(rule->pqname)); if (ret != 0) return (ret); ret = pf_user_strcpy(krule->tagname, rule->tagname, sizeof(rule->tagname)); if (ret != 0) return (ret); ret = pf_user_strcpy(krule->match_tagname, rule->match_tagname, sizeof(rule->match_tagname)); if (ret != 0) return (ret); ret = pf_user_strcpy(krule->overload_tblname, rule->overload_tblname, sizeof(rule->overload_tblname)); if (ret != 0) return (ret); pf_pool_to_kpool(&rule->rpool, &krule->rpool); /* Don't allow userspace to set evaluations, packets or bytes. */ /* kif, anchor, overload_tbl are not copied over. 
*/ krule->os_fingerprint = rule->os_fingerprint; krule->rtableid = rule->rtableid; bcopy(rule->timeout, krule->timeout, sizeof(krule->timeout)); krule->max_states = rule->max_states; krule->max_src_nodes = rule->max_src_nodes; krule->max_src_states = rule->max_src_states; krule->max_src_conn = rule->max_src_conn; krule->max_src_conn_rate.limit = rule->max_src_conn_rate.limit; krule->max_src_conn_rate.seconds = rule->max_src_conn_rate.seconds; krule->qid = rule->qid; krule->pqid = rule->pqid; krule->nr = rule->nr; krule->prob = rule->prob; krule->cuid = rule->cuid; krule->cpid = rule->cpid; krule->return_icmp = rule->return_icmp; krule->return_icmp6 = rule->return_icmp6; krule->max_mss = rule->max_mss; krule->tag = rule->tag; krule->match_tag = rule->match_tag; krule->scrub_flags = rule->scrub_flags; bcopy(&rule->uid, &krule->uid, sizeof(krule->uid)); bcopy(&rule->gid, &krule->gid, sizeof(krule->gid)); krule->rule_flag = rule->rule_flag; krule->action = rule->action; krule->direction = rule->direction; krule->log = rule->log; krule->logif = rule->logif; krule->quick = rule->quick; krule->ifnot = rule->ifnot; krule->match_tag_not = rule->match_tag_not; krule->natpass = rule->natpass; krule->keep_state = rule->keep_state; krule->af = rule->af; krule->proto = rule->proto; krule->type = rule->type; krule->code = rule->code; krule->flags = rule->flags; krule->flagset = rule->flagset; krule->min_ttl = rule->min_ttl; krule->allow_opts = rule->allow_opts; krule->rt = rule->rt; krule->return_ttl = rule->return_ttl; krule->tos = rule->tos; krule->set_tos = rule->set_tos; krule->flush = rule->flush; krule->prio = rule->prio; krule->set_prio[0] = rule->set_prio[0]; krule->set_prio[1] = rule->set_prio[1]; bcopy(&rule->divert, &krule->divert, sizeof(krule->divert)); return (0); } static int pf_state_kill_to_kstate_kill(const struct pfioc_state_kill *psk, struct pf_kstate_kill *kill) { int ret; bzero(kill, sizeof(*kill)); bcopy(&psk->psk_pfcmp, &kill->psk_pfcmp, 
sizeof(kill->psk_pfcmp)); kill->psk_af = psk->psk_af; kill->psk_proto = psk->psk_proto; bcopy(&psk->psk_src, &kill->psk_src, sizeof(kill->psk_src)); bcopy(&psk->psk_dst, &kill->psk_dst, sizeof(kill->psk_dst)); ret = pf_user_strcpy(kill->psk_ifname, psk->psk_ifname, sizeof(kill->psk_ifname)); if (ret != 0) return (ret); ret = pf_user_strcpy(kill->psk_label, psk->psk_label, sizeof(kill->psk_label)); if (ret != 0) return (ret); return (0); } static int pf_ioctl_addrule(struct pf_krule *rule, uint32_t ticket, uint32_t pool_ticket, const char *anchor, const char *anchor_call, struct thread *td) { struct pf_kruleset *ruleset; struct pf_krule *tail; struct pf_kpooladdr *pa; struct pfi_kkif *kif = NULL; int rs_num; int error = 0; if ((rule->return_icmp >> 8) > ICMP_MAXTYPE) { error = EINVAL; goto errout_unlocked; } #define ERROUT(x) ERROUT_FUNCTION(errout, x) if (rule->ifname[0]) kif = pf_kkif_create(M_WAITOK); pf_counter_u64_init(&rule->evaluations, M_WAITOK); for (int i = 0; i < 2; i++) { pf_counter_u64_init(&rule->packets[i], M_WAITOK); pf_counter_u64_init(&rule->bytes[i], M_WAITOK); } rule->states_cur = counter_u64_alloc(M_WAITOK); rule->states_tot = counter_u64_alloc(M_WAITOK); rule->src_nodes = counter_u64_alloc(M_WAITOK); rule->cuid = td->td_ucred->cr_ruid; rule->cpid = td->td_proc ? 
td->td_proc->p_pid : 0; TAILQ_INIT(&rule->rpool.list); PF_CONFIG_LOCK(); PF_RULES_WLOCK(); #ifdef PF_WANT_32_TO_64_COUNTER LIST_INSERT_HEAD(&V_pf_allrulelist, rule, allrulelist); MPASS(!rule->allrulelinked); rule->allrulelinked = true; V_pf_allrulecount++; #endif ruleset = pf_find_kruleset(anchor); if (ruleset == NULL) ERROUT(EINVAL); rs_num = pf_get_ruleset_number(rule->action); if (rs_num >= PF_RULESET_MAX) ERROUT(EINVAL); if (ticket != ruleset->rules[rs_num].inactive.ticket) { DPFPRINTF(PF_DEBUG_MISC, ("ticket: %d != [%d]%d\n", ticket, rs_num, ruleset->rules[rs_num].inactive.ticket)); ERROUT(EBUSY); } if (pool_ticket != V_ticket_pabuf) { DPFPRINTF(PF_DEBUG_MISC, ("pool_ticket: %d != %d\n", pool_ticket, V_ticket_pabuf)); ERROUT(EBUSY); } /* * XXXMJG hack: there is no mechanism to ensure they started the * transaction. Ticket checked above may happen to match by accident, * even if nobody called DIOCXBEGIN, let alone this process. * Partially work around it by checking if the RB tree got allocated, * see pf_begin_rules. 
*/ if (ruleset->rules[rs_num].inactive.tree == NULL) { ERROUT(EINVAL); } tail = TAILQ_LAST(ruleset->rules[rs_num].inactive.ptr, pf_krulequeue); if (tail) rule->nr = tail->nr + 1; else rule->nr = 0; if (rule->ifname[0]) { rule->kif = pfi_kkif_attach(kif, rule->ifname); kif = NULL; pfi_kkif_ref(rule->kif); } else rule->kif = NULL; if (rule->rtableid > 0 && rule->rtableid >= rt_numfibs) error = EBUSY; #ifdef ALTQ /* set queue IDs */ if (rule->qname[0] != 0) { if ((rule->qid = pf_qname2qid(rule->qname)) == 0) error = EBUSY; else if (rule->pqname[0] != 0) { if ((rule->pqid = pf_qname2qid(rule->pqname)) == 0) error = EBUSY; } else rule->pqid = rule->qid; } #endif if (rule->tagname[0]) if ((rule->tag = pf_tagname2tag(rule->tagname)) == 0) error = EBUSY; if (rule->match_tagname[0]) if ((rule->match_tag = pf_tagname2tag(rule->match_tagname)) == 0) error = EBUSY; if (rule->rt && !rule->direction) error = EINVAL; if (!rule->log) rule->logif = 0; if (rule->logif >= PFLOGIFS_MAX) error = EINVAL; if (pf_addr_setup(ruleset, &rule->src.addr, rule->af)) error = ENOMEM; if (pf_addr_setup(ruleset, &rule->dst.addr, rule->af)) error = ENOMEM; if (pf_kanchor_setup(rule, ruleset, anchor_call)) error = EINVAL; if (rule->scrub_flags & PFSTATE_SETPRIO && (rule->set_prio[0] > PF_PRIO_MAX || rule->set_prio[1] > PF_PRIO_MAX)) error = EINVAL; TAILQ_FOREACH(pa, &V_pf_pabuf, entries) if (pa->addr.type == PF_ADDR_TABLE) { pa->addr.p.tbl = pfr_attach_table(ruleset, pa->addr.v.tblname); if (pa->addr.p.tbl == NULL) error = ENOMEM; } rule->overload_tbl = NULL; if (rule->overload_tblname[0]) { if ((rule->overload_tbl = pfr_attach_table(ruleset, rule->overload_tblname)) == NULL) error = EINVAL; else rule->overload_tbl->pfrkt_flags |= PFR_TFLAG_ACTIVE; } pf_mv_kpool(&V_pf_pabuf, &rule->rpool.list); if (((((rule->action == PF_NAT) || (rule->action == PF_RDR) || (rule->action == PF_BINAT)) && rule->anchor == NULL) || (rule->rt > PF_NOPFROUTE)) && (TAILQ_FIRST(&rule->rpool.list) == NULL)) error = EINVAL; if 
(error) { pf_free_rule(rule); rule = NULL; ERROUT(error); } rule->rpool.cur = TAILQ_FIRST(&rule->rpool.list); TAILQ_INSERT_TAIL(ruleset->rules[rs_num].inactive.ptr, rule, entries); ruleset->rules[rs_num].inactive.rcount++; PF_RULES_WUNLOCK(); pf_hash_rule(rule); if (RB_INSERT(pf_krule_global, ruleset->rules[rs_num].inactive.tree, rule) != NULL) { PF_RULES_WLOCK(); TAILQ_REMOVE(ruleset->rules[rs_num].inactive.ptr, rule, entries); ruleset->rules[rs_num].inactive.rcount--; pf_free_rule(rule); rule = NULL; ERROUT(EEXIST); } PF_CONFIG_UNLOCK(); return (0); #undef ERROUT errout: PF_RULES_WUNLOCK(); PF_CONFIG_UNLOCK(); errout_unlocked: pf_kkif_free(kif); pf_krule_free(rule); return (error); } static bool pf_label_match(const struct pf_krule *rule, const char *label) { int i = 0; while (*rule->label[i]) { if (strcmp(rule->label[i], label) == 0) return (true); i++; } return (false); } static unsigned int pf_kill_matching_state(struct pf_state_key_cmp *key, int dir) { struct pf_kstate *s; int more = 0; s = pf_find_state_all(key, dir, &more); if (s == NULL) return (0); if (more) { PF_STATE_UNLOCK(s); return (0); } pf_unlink_state(s); return (1); } static int pf_killstates_row(struct pf_kstate_kill *psk, struct pf_idhash *ih) { struct pf_kstate *s; struct pf_state_key *sk; struct pf_addr *srcaddr, *dstaddr; struct pf_state_key_cmp match_key; int idx, killed = 0; unsigned int dir; u_int16_t srcport, dstport; struct pfi_kkif *kif; relock_DIOCKILLSTATES: PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { /* For floating states look at the original kif. */ kif = s->kif == V_pfi_all ? s->orig_kif : s->kif; sk = s->key[PF_SK_WIRE]; if (s->direction == PF_OUT) { srcaddr = &sk->addr[1]; dstaddr = &sk->addr[0]; srcport = sk->port[1]; dstport = sk->port[0]; } else { srcaddr = &sk->addr[0]; dstaddr = &sk->addr[1]; srcport = sk->port[0]; dstport = sk->port[1]; } if (psk->psk_af && sk->af != psk->psk_af) continue; if (psk->psk_proto && psk->psk_proto != sk->proto) continue; if (! 
PF_MATCHA(psk->psk_src.neg, &psk->psk_src.addr.v.a.addr, &psk->psk_src.addr.v.a.mask, srcaddr, sk->af)) continue; if (! PF_MATCHA(psk->psk_dst.neg, &psk->psk_dst.addr.v.a.addr, &psk->psk_dst.addr.v.a.mask, dstaddr, sk->af)) continue; if (! PF_MATCHA(psk->psk_rt_addr.neg, &psk->psk_rt_addr.addr.v.a.addr, &psk->psk_rt_addr.addr.v.a.mask, &s->rt_addr, sk->af)) continue; if (psk->psk_src.port_op != 0 && ! pf_match_port(psk->psk_src.port_op, psk->psk_src.port[0], psk->psk_src.port[1], srcport)) continue; if (psk->psk_dst.port_op != 0 && ! pf_match_port(psk->psk_dst.port_op, psk->psk_dst.port[0], psk->psk_dst.port[1], dstport)) continue; if (psk->psk_label[0] && ! pf_label_match(s->rule.ptr, psk->psk_label)) continue; if (psk->psk_ifname[0] && strcmp(psk->psk_ifname, kif->pfik_name)) continue; if (psk->psk_kill_match) { /* Create the key to find matching states, with lock * held. */ bzero(&match_key, sizeof(match_key)); if (s->direction == PF_OUT) { dir = PF_IN; idx = PF_SK_STACK; } else { dir = PF_OUT; idx = PF_SK_WIRE; } match_key.af = s->key[idx]->af; match_key.proto = s->key[idx]->proto; PF_ACPY(&match_key.addr[0], &s->key[idx]->addr[1], match_key.af); match_key.port[0] = s->key[idx]->port[1]; PF_ACPY(&match_key.addr[1], &s->key[idx]->addr[0], match_key.af); match_key.port[1] = s->key[idx]->port[0]; } pf_unlink_state(s); killed++; if (psk->psk_kill_match) killed += pf_kill_matching_state(&match_key, dir); goto relock_DIOCKILLSTATES; } PF_HASHROW_UNLOCK(ih); return (killed); } static int pfioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flags, struct thread *td) { int error = 0; PF_RULES_RLOCK_TRACKER; #define ERROUT_IOCTL(target, x) \ do { \ error = (x); \ SDT_PROBE3(pf, ioctl, ioctl, error, cmd, error, __LINE__); \ goto target; \ } while (0) /* XXX keep in sync with switch() below */ if (securelevel_gt(td->td_ucred, 2)) switch (cmd) { case DIOCGETRULES: case DIOCGETRULE: case DIOCGETRULENV: case DIOCGETADDRS: case DIOCGETADDR: case DIOCGETSTATE: case 
DIOCGETSTATENV: case DIOCSETSTATUSIF: case DIOCGETSTATUS: case DIOCGETSTATUSNV: case DIOCCLRSTATUS: case DIOCNATLOOK: case DIOCSETDEBUG: case DIOCGETSTATES: case DIOCGETSTATESV2: case DIOCGETTIMEOUT: case DIOCCLRRULECTRS: case DIOCGETLIMIT: case DIOCGETALTQSV0: case DIOCGETALTQSV1: case DIOCGETALTQV0: case DIOCGETALTQV1: case DIOCGETQSTATSV0: case DIOCGETQSTATSV1: case DIOCGETRULESETS: case DIOCGETRULESET: case DIOCRGETTABLES: case DIOCRGETTSTATS: case DIOCRCLRTSTATS: case DIOCRCLRADDRS: case DIOCRADDADDRS: case DIOCRDELADDRS: case DIOCRSETADDRS: case DIOCRGETADDRS: case DIOCRGETASTATS: case DIOCRCLRASTATS: case DIOCRTSTADDRS: case DIOCOSFPGET: case DIOCGETSRCNODES: case DIOCCLRSRCNODES: case DIOCGETSYNCOOKIES: case DIOCIGETIFACES: case DIOCGIFSPEEDV0: case DIOCGIFSPEEDV1: case DIOCSETIFFLAG: case DIOCCLRIFFLAG: case DIOCGETETHRULES: case DIOCGETETHRULE: case DIOCGETETHRULESETS: case DIOCGETETHRULESET: break; case DIOCRCLRTABLES: case DIOCRADDTABLES: case DIOCRDELTABLES: case DIOCRSETTFLAGS: if (((struct pfioc_table *)addr)->pfrio_flags & PFR_FLAG_DUMMY) break; /* dummy operation ok */ return (EPERM); default: return (EPERM); } if (!(flags & FWRITE)) switch (cmd) { case DIOCGETRULES: case DIOCGETADDRS: case DIOCGETADDR: case DIOCGETSTATE: case DIOCGETSTATENV: case DIOCGETSTATUS: case DIOCGETSTATUSNV: case DIOCGETSTATES: case DIOCGETSTATESV2: case DIOCGETTIMEOUT: case DIOCGETLIMIT: case DIOCGETALTQSV0: case DIOCGETALTQSV1: case DIOCGETALTQV0: case DIOCGETALTQV1: case DIOCGETQSTATSV0: case DIOCGETQSTATSV1: case DIOCGETRULESETS: case DIOCGETRULESET: case DIOCNATLOOK: case DIOCRGETTABLES: case DIOCRGETTSTATS: case DIOCRGETADDRS: case DIOCRGETASTATS: case DIOCRTSTADDRS: case DIOCOSFPGET: case DIOCGETSRCNODES: case DIOCGETSYNCOOKIES: case DIOCIGETIFACES: case DIOCGIFSPEEDV1: case DIOCGIFSPEEDV0: case DIOCGETRULENV: case DIOCGETETHRULES: case DIOCGETETHRULE: case DIOCGETETHRULESETS: case DIOCGETETHRULESET: break; case DIOCRCLRTABLES: case DIOCRADDTABLES: case 
DIOCRDELTABLES: case DIOCRCLRTSTATS: case DIOCRCLRADDRS: case DIOCRADDADDRS: case DIOCRDELADDRS: case DIOCRSETADDRS: case DIOCRSETTFLAGS: if (((struct pfioc_table *)addr)->pfrio_flags & PFR_FLAG_DUMMY) { flags |= FWRITE; /* need write lock for dummy */ break; /* dummy operation ok */ } return (EACCES); case DIOCGETRULE: if (((struct pfioc_rule *)addr)->action == PF_GET_CLR_CNTR) return (EACCES); break; default: return (EACCES); } CURVNET_SET(TD_TO_VNET(td)); switch (cmd) { case DIOCSTART: sx_xlock(&V_pf_ioctl_lock); if (V_pf_status.running) error = EEXIST; else { hook_pf(); if (! TAILQ_EMPTY(V_pf_keth->active.rules)) hook_pf_eth(); V_pf_status.running = 1; V_pf_status.since = time_second; new_unrhdr64(&V_pf_stateid, time_second); DPFPRINTF(PF_DEBUG_MISC, ("pf: started\n")); } break; case DIOCSTOP: sx_xlock(&V_pf_ioctl_lock); if (!V_pf_status.running) error = ENOENT; else { V_pf_status.running = 0; dehook_pf(); dehook_pf_eth(); V_pf_status.since = time_second; DPFPRINTF(PF_DEBUG_MISC, ("pf: stopped\n")); } break; case DIOCGETETHRULES: { struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvl; void *packed; struct pf_keth_rule *tail; struct pf_keth_ruleset *rs; u_int32_t ticket, nr; const char *anchor = ""; nvl = NULL; packed = NULL; #define ERROUT(x) ERROUT_IOCTL(DIOCGETETHRULES_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); /* Copy the request in */ packed = malloc(nv->len, M_NVLIST, M_WAITOK); if (packed == NULL) ERROUT(ENOMEM); error = copyin(nv->data, packed, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(packed, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! 
nvlist_exists_string(nvl, "anchor")) ERROUT(EBADMSG); anchor = nvlist_get_string(nvl, "anchor"); rs = pf_find_keth_ruleset(anchor); nvlist_destroy(nvl); nvl = NULL; free(packed, M_NVLIST); packed = NULL; if (rs == NULL) ERROUT(ENOENT); /* Reply */ nvl = nvlist_create(0); if (nvl == NULL) ERROUT(ENOMEM); PF_RULES_RLOCK(); ticket = rs->active.ticket; tail = TAILQ_LAST(rs->active.rules, pf_keth_ruleq); if (tail) nr = tail->nr + 1; else nr = 0; PF_RULES_RUNLOCK(); nvlist_add_number(nvl, "ticket", ticket); nvlist_add_number(nvl, "nr", nr); packed = nvlist_pack(nvl, &nv->len); if (packed == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(packed, nv->data, nv->len); #undef ERROUT DIOCGETETHRULES_error: free(packed, M_NVLIST); nvlist_destroy(nvl); break; } case DIOCGETETHRULE: { struct epoch_tracker et; struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvl = NULL; void *nvlpacked = NULL; struct pf_keth_rule *rule = NULL; struct pf_keth_ruleset *rs; u_int32_t ticket, nr; bool clear = false; const char *anchor; #define ERROUT(x) ERROUT_IOCTL(DIOCGETETHRULE_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! nvlist_exists_number(nvl, "ticket")) ERROUT(EBADMSG); ticket = nvlist_get_number(nvl, "ticket"); if (! nvlist_exists_string(nvl, "anchor")) ERROUT(EBADMSG); anchor = nvlist_get_string(nvl, "anchor"); if (nvlist_exists_bool(nvl, "clear")) clear = nvlist_get_bool(nvl, "clear"); if (clear && !(flags & FWRITE)) ERROUT(EACCES); if (! 
nvlist_exists_number(nvl, "nr")) ERROUT(EBADMSG); nr = nvlist_get_number(nvl, "nr"); PF_RULES_RLOCK(); rs = pf_find_keth_ruleset(anchor); if (rs == NULL) { PF_RULES_RUNLOCK(); ERROUT(ENOENT); } if (ticket != rs->active.ticket) { PF_RULES_RUNLOCK(); ERROUT(EBUSY); } nvlist_destroy(nvl); nvl = NULL; free(nvlpacked, M_NVLIST); nvlpacked = NULL; rule = TAILQ_FIRST(rs->active.rules); while ((rule != NULL) && (rule->nr != nr)) rule = TAILQ_NEXT(rule, entries); if (rule == NULL) { PF_RULES_RUNLOCK(); ERROUT(ENOENT); } /* Make sure rule can't go away. */ NET_EPOCH_ENTER(et); PF_RULES_RUNLOCK(); nvl = pf_keth_rule_to_nveth_rule(rule); if (pf_keth_anchor_nvcopyout(rs, rule, nvl)) ERROUT(EBUSY); NET_EPOCH_EXIT(et); if (nvl == NULL) ERROUT(ENOMEM); nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(nvlpacked, nv->data, nv->len); if (error == 0 && clear) { counter_u64_zero(rule->evaluations); for (int i = 0; i < 2; i++) { counter_u64_zero(rule->packets[i]); counter_u64_zero(rule->bytes[i]); } } #undef ERROUT DIOCGETETHRULE_error: free(nvlpacked, M_NVLIST); nvlist_destroy(nvl); break; } case DIOCADDETHRULE: { struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvl = NULL; void *nvlpacked = NULL; struct pf_keth_rule *rule = NULL, *tail = NULL; struct pf_keth_ruleset *ruleset = NULL; struct pfi_kkif *kif = NULL, *bridge_to_kif = NULL; const char *anchor = "", *anchor_call = ""; #define ERROUT(x) ERROUT_IOCTL(DIOCADDETHRULE_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! 
nvlist_exists_number(nvl, "ticket")) ERROUT(EBADMSG); if (nvlist_exists_string(nvl, "anchor")) anchor = nvlist_get_string(nvl, "anchor"); if (nvlist_exists_string(nvl, "anchor_call")) anchor_call = nvlist_get_string(nvl, "anchor_call"); ruleset = pf_find_keth_ruleset(anchor); if (ruleset == NULL) ERROUT(EINVAL); if (nvlist_get_number(nvl, "ticket") != ruleset->inactive.ticket) { DPFPRINTF(PF_DEBUG_MISC, ("ticket: %d != %d\n", (u_int32_t)nvlist_get_number(nvl, "ticket"), ruleset->inactive.ticket)); ERROUT(EBUSY); } rule = malloc(sizeof(*rule), M_PFRULE, M_WAITOK); if (rule == NULL) ERROUT(ENOMEM); rule->timestamp = NULL; error = pf_nveth_rule_to_keth_rule(nvl, rule); if (error != 0) ERROUT(error); if (rule->ifname[0]) kif = pf_kkif_create(M_WAITOK); if (rule->bridge_to_name[0]) bridge_to_kif = pf_kkif_create(M_WAITOK); rule->evaluations = counter_u64_alloc(M_WAITOK); for (int i = 0; i < 2; i++) { rule->packets[i] = counter_u64_alloc(M_WAITOK); rule->bytes[i] = counter_u64_alloc(M_WAITOK); } rule->timestamp = uma_zalloc_pcpu(pf_timestamp_pcpu_zone, M_WAITOK | M_ZERO); PF_RULES_WLOCK(); if (rule->ifname[0]) { rule->kif = pfi_kkif_attach(kif, rule->ifname); pfi_kkif_ref(rule->kif); } else rule->kif = NULL; if (rule->bridge_to_name[0]) { rule->bridge_to = pfi_kkif_attach(bridge_to_kif, rule->bridge_to_name); pfi_kkif_ref(rule->bridge_to); } else rule->bridge_to = NULL; #ifdef ALTQ /* set queue IDs */ if (rule->qname[0] != 0) { if ((rule->qid = pf_qname2qid(rule->qname)) == 0) error = EBUSY; else rule->qid = rule->qid; } #endif if (rule->tagname[0]) if ((rule->tag = pf_tagname2tag(rule->tagname)) == 0) error = EBUSY; if (rule->match_tagname[0]) if ((rule->match_tag = pf_tagname2tag( rule->match_tagname)) == 0) error = EBUSY; if (error == 0 && rule->ipdst.addr.type == PF_ADDR_TABLE) error = pf_eth_addr_setup(ruleset, &rule->ipdst.addr); if (error == 0 && rule->ipsrc.addr.type == PF_ADDR_TABLE) error = pf_eth_addr_setup(ruleset, &rule->ipsrc.addr); if (error) { 
pf_free_eth_rule(rule); PF_RULES_WUNLOCK(); ERROUT(error); } if (pf_keth_anchor_setup(rule, ruleset, anchor_call)) { pf_free_eth_rule(rule); PF_RULES_WUNLOCK(); ERROUT(EINVAL); } tail = TAILQ_LAST(ruleset->inactive.rules, pf_keth_ruleq); if (tail) rule->nr = tail->nr + 1; else rule->nr = 0; TAILQ_INSERT_TAIL(ruleset->inactive.rules, rule, entries); PF_RULES_WUNLOCK(); #undef ERROUT DIOCADDETHRULE_error: nvlist_destroy(nvl); free(nvlpacked, M_NVLIST); break; } case DIOCGETETHRULESETS: { struct epoch_tracker et; struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvl = NULL; void *nvlpacked = NULL; struct pf_keth_ruleset *ruleset; struct pf_keth_anchor *anchor; int nr = 0; #define ERROUT(x) ERROUT_IOCTL(DIOCGETETHRULESETS_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! 
nvlist_exists_string(nvl, "path")) ERROUT(EBADMSG); NET_EPOCH_ENTER(et); if ((ruleset = pf_find_keth_ruleset( nvlist_get_string(nvl, "path"))) == NULL) { NET_EPOCH_EXIT(et); ERROUT(ENOENT); } if (ruleset->anchor == NULL) { RB_FOREACH(anchor, pf_keth_anchor_global, &V_pf_keth_anchors) if (anchor->parent == NULL) nr++; } else { RB_FOREACH(anchor, pf_keth_anchor_node, &ruleset->anchor->children) nr++; } NET_EPOCH_EXIT(et); nvlist_destroy(nvl); nvl = NULL; free(nvlpacked, M_NVLIST); nvlpacked = NULL; nvl = nvlist_create(0); if (nvl == NULL) ERROUT(ENOMEM); nvlist_add_number(nvl, "nr", nr); nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(nvlpacked, nv->data, nv->len); #undef ERROUT DIOCGETETHRULESETS_error: free(nvlpacked, M_NVLIST); nvlist_destroy(nvl); break; } case DIOCGETETHRULESET: { struct epoch_tracker et; struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvl = NULL; void *nvlpacked = NULL; struct pf_keth_ruleset *ruleset; struct pf_keth_anchor *anchor; int nr = 0, req_nr = 0; bool found = false; #define ERROUT(x) ERROUT_IOCTL(DIOCGETETHRULESET_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! nvlist_exists_string(nvl, "path")) ERROUT(EBADMSG); if (! 
nvlist_exists_number(nvl, "nr")) ERROUT(EBADMSG); req_nr = nvlist_get_number(nvl, "nr"); NET_EPOCH_ENTER(et); if ((ruleset = pf_find_keth_ruleset( nvlist_get_string(nvl, "path"))) == NULL) { NET_EPOCH_EXIT(et); ERROUT(ENOENT); } nvlist_destroy(nvl); nvl = NULL; free(nvlpacked, M_NVLIST); nvlpacked = NULL; nvl = nvlist_create(0); if (nvl == NULL) { NET_EPOCH_EXIT(et); ERROUT(ENOMEM); } if (ruleset->anchor == NULL) { RB_FOREACH(anchor, pf_keth_anchor_global, &V_pf_keth_anchors) { if (anchor->parent == NULL && nr++ == req_nr) { found = true; break; } } } else { RB_FOREACH(anchor, pf_keth_anchor_node, &ruleset->anchor->children) { if (nr++ == req_nr) { found = true; break; } } } NET_EPOCH_EXIT(et); if (found) { nvlist_add_number(nvl, "nr", nr); nvlist_add_string(nvl, "name", anchor->name); if (ruleset->anchor) nvlist_add_string(nvl, "path", ruleset->anchor->path); else nvlist_add_string(nvl, "path", ""); } else { ERROUT(EBUSY); } nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(nvlpacked, nv->data, nv->len); #undef ERROUT DIOCGETETHRULESET_error: free(nvlpacked, M_NVLIST); nvlist_destroy(nvl); break; } case DIOCADDRULENV: { struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvl = NULL; void *nvlpacked = NULL; struct pf_krule *rule = NULL; const char *anchor = "", *anchor_call = ""; uint32_t ticket = 0, pool_ticket = 0; #define ERROUT(x) ERROUT_IOCTL(DIOCADDRULENV_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! nvlist_exists_number(nvl, "ticket")) ERROUT(EINVAL); ticket = nvlist_get_number(nvl, "ticket"); if (! nvlist_exists_number(nvl, "pool_ticket")) ERROUT(EINVAL); pool_ticket = nvlist_get_number(nvl, "pool_ticket"); if (! 
nvlist_exists_nvlist(nvl, "rule")) ERROUT(EINVAL); rule = pf_krule_alloc(); error = pf_nvrule_to_krule(nvlist_get_nvlist(nvl, "rule"), rule); if (error) ERROUT(error); if (nvlist_exists_string(nvl, "anchor")) anchor = nvlist_get_string(nvl, "anchor"); if (nvlist_exists_string(nvl, "anchor_call")) anchor_call = nvlist_get_string(nvl, "anchor_call"); if ((error = nvlist_error(nvl))) ERROUT(error); /* Frees rule on error */ error = pf_ioctl_addrule(rule, ticket, pool_ticket, anchor, anchor_call, td); nvlist_destroy(nvl); free(nvlpacked, M_NVLIST); break; #undef ERROUT DIOCADDRULENV_error: pf_krule_free(rule); nvlist_destroy(nvl); free(nvlpacked, M_NVLIST); break; } case DIOCADDRULE: { struct pfioc_rule *pr = (struct pfioc_rule *)addr; struct pf_krule *rule; rule = pf_krule_alloc(); error = pf_rule_to_krule(&pr->rule, rule); if (error != 0) { pf_krule_free(rule); break; } pr->anchor[sizeof(pr->anchor) - 1] = 0; /* Frees rule on error */ error = pf_ioctl_addrule(rule, pr->ticket, pr->pool_ticket, pr->anchor, pr->anchor_call, td); break; } case DIOCGETRULES: { struct pfioc_rule *pr = (struct pfioc_rule *)addr; struct pf_kruleset *ruleset; struct pf_krule *tail; int rs_num; pr->anchor[sizeof(pr->anchor) - 1] = 0; PF_RULES_WLOCK(); ruleset = pf_find_kruleset(pr->anchor); if (ruleset == NULL) { PF_RULES_WUNLOCK(); error = EINVAL; break; } rs_num = pf_get_ruleset_number(pr->rule.action); if (rs_num >= PF_RULESET_MAX) { PF_RULES_WUNLOCK(); error = EINVAL; break; } tail = TAILQ_LAST(ruleset->rules[rs_num].active.ptr, pf_krulequeue); if (tail) pr->nr = tail->nr + 1; else pr->nr = 0; pr->ticket = ruleset->rules[rs_num].active.ticket; PF_RULES_WUNLOCK(); break; } case DIOCGETRULE: { struct pfioc_rule *pr = (struct pfioc_rule *)addr; struct pf_kruleset *ruleset; struct pf_krule *rule; int rs_num; pr->anchor[sizeof(pr->anchor) - 1] = 0; PF_RULES_WLOCK(); ruleset = pf_find_kruleset(pr->anchor); if (ruleset == NULL) { PF_RULES_WUNLOCK(); error = EINVAL; break; } rs_num = 
pf_get_ruleset_number(pr->rule.action); if (rs_num >= PF_RULESET_MAX) { PF_RULES_WUNLOCK(); error = EINVAL; break; } if (pr->ticket != ruleset->rules[rs_num].active.ticket) { PF_RULES_WUNLOCK(); error = EBUSY; break; } rule = TAILQ_FIRST(ruleset->rules[rs_num].active.ptr); while ((rule != NULL) && (rule->nr != pr->nr)) rule = TAILQ_NEXT(rule, entries); if (rule == NULL) { PF_RULES_WUNLOCK(); error = EBUSY; break; } pf_krule_to_rule(rule, &pr->rule); if (pf_kanchor_copyout(ruleset, rule, pr)) { PF_RULES_WUNLOCK(); error = EBUSY; break; } pf_addr_copyout(&pr->rule.src.addr); pf_addr_copyout(&pr->rule.dst.addr); if (pr->action == PF_GET_CLR_CNTR) { pf_counter_u64_zero(&rule->evaluations); for (int i = 0; i < 2; i++) { pf_counter_u64_zero(&rule->packets[i]); pf_counter_u64_zero(&rule->bytes[i]); } counter_u64_zero(rule->states_tot); } PF_RULES_WUNLOCK(); break; } case DIOCGETRULENV: { struct pfioc_nv *nv = (struct pfioc_nv *)addr; nvlist_t *nvrule = NULL; nvlist_t *nvl = NULL; struct pf_kruleset *ruleset; struct pf_krule *rule; void *nvlpacked = NULL; int rs_num, nr; bool clear_counter = false; #define ERROUT(x) ERROUT_IOCTL(DIOCGETRULENV_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); /* Copy the request in */ nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! nvlist_exists_string(nvl, "anchor")) ERROUT(EBADMSG); if (! nvlist_exists_number(nvl, "ruleset")) ERROUT(EBADMSG); if (! nvlist_exists_number(nvl, "ticket")) ERROUT(EBADMSG); if (! 
nvlist_exists_number(nvl, "nr")) ERROUT(EBADMSG); if (nvlist_exists_bool(nvl, "clear_counter")) clear_counter = nvlist_get_bool(nvl, "clear_counter"); if (clear_counter && !(flags & FWRITE)) ERROUT(EACCES); nr = nvlist_get_number(nvl, "nr"); PF_RULES_WLOCK(); ruleset = pf_find_kruleset(nvlist_get_string(nvl, "anchor")); if (ruleset == NULL) { PF_RULES_WUNLOCK(); ERROUT(ENOENT); } rs_num = pf_get_ruleset_number(nvlist_get_number(nvl, "ruleset")); if (rs_num >= PF_RULESET_MAX) { PF_RULES_WUNLOCK(); ERROUT(EINVAL); } if (nvlist_get_number(nvl, "ticket") != ruleset->rules[rs_num].active.ticket) { PF_RULES_WUNLOCK(); ERROUT(EBUSY); } if ((error = nvlist_error(nvl))) { PF_RULES_WUNLOCK(); ERROUT(error); } rule = TAILQ_FIRST(ruleset->rules[rs_num].active.ptr); while ((rule != NULL) && (rule->nr != nr)) rule = TAILQ_NEXT(rule, entries); if (rule == NULL) { PF_RULES_WUNLOCK(); ERROUT(EBUSY); } nvrule = pf_krule_to_nvrule(rule); nvlist_destroy(nvl); nvl = nvlist_create(0); if (nvl == NULL) { PF_RULES_WUNLOCK(); ERROUT(ENOMEM); } nvlist_add_number(nvl, "nr", nr); nvlist_add_nvlist(nvl, "rule", nvrule); nvlist_destroy(nvrule); nvrule = NULL; if (pf_kanchor_nvcopyout(ruleset, rule, nvl)) { PF_RULES_WUNLOCK(); ERROUT(EBUSY); } free(nvlpacked, M_NVLIST); nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) { PF_RULES_WUNLOCK(); ERROUT(ENOMEM); } if (nv->size == 0) { PF_RULES_WUNLOCK(); ERROUT(0); } else if (nv->size < nv->len) { PF_RULES_WUNLOCK(); ERROUT(ENOSPC); } if (clear_counter) { pf_counter_u64_zero(&rule->evaluations); for (int i = 0; i < 2; i++) { pf_counter_u64_zero(&rule->packets[i]); pf_counter_u64_zero(&rule->bytes[i]); } counter_u64_zero(rule->states_tot); } PF_RULES_WUNLOCK(); error = copyout(nvlpacked, nv->data, nv->len); #undef ERROUT DIOCGETRULENV_error: free(nvlpacked, M_NVLIST); nvlist_destroy(nvrule); nvlist_destroy(nvl); break; } case DIOCCHANGERULE: { struct pfioc_rule *pcr = (struct pfioc_rule *)addr; struct pf_kruleset *ruleset; struct pf_krule 
*oldrule = NULL, *newrule = NULL; struct pfi_kkif *kif = NULL; struct pf_kpooladdr *pa; u_int32_t nr = 0; int rs_num; pcr->anchor[sizeof(pcr->anchor) - 1] = 0; if (pcr->action < PF_CHANGE_ADD_HEAD || pcr->action > PF_CHANGE_GET_TICKET) { error = EINVAL; break; } if (pcr->rule.return_icmp >> 8 > ICMP_MAXTYPE) { error = EINVAL; break; } if (pcr->action != PF_CHANGE_REMOVE) { newrule = pf_krule_alloc(); error = pf_rule_to_krule(&pcr->rule, newrule); if (error != 0) { pf_krule_free(newrule); break; } if (newrule->ifname[0]) kif = pf_kkif_create(M_WAITOK); pf_counter_u64_init(&newrule->evaluations, M_WAITOK); for (int i = 0; i < 2; i++) { pf_counter_u64_init(&newrule->packets[i], M_WAITOK); pf_counter_u64_init(&newrule->bytes[i], M_WAITOK); } newrule->states_cur = counter_u64_alloc(M_WAITOK); newrule->states_tot = counter_u64_alloc(M_WAITOK); newrule->src_nodes = counter_u64_alloc(M_WAITOK); newrule->cuid = td->td_ucred->cr_ruid; newrule->cpid = td->td_proc ? td->td_proc->p_pid : 0; TAILQ_INIT(&newrule->rpool.list); } #define ERROUT(x) ERROUT_IOCTL(DIOCCHANGERULE_error, x) PF_CONFIG_LOCK(); PF_RULES_WLOCK(); #ifdef PF_WANT_32_TO_64_COUNTER if (newrule != NULL) { LIST_INSERT_HEAD(&V_pf_allrulelist, newrule, allrulelist); newrule->allrulelinked = true; V_pf_allrulecount++; } #endif if (!(pcr->action == PF_CHANGE_REMOVE || pcr->action == PF_CHANGE_GET_TICKET) && pcr->pool_ticket != V_ticket_pabuf) ERROUT(EBUSY); ruleset = pf_find_kruleset(pcr->anchor); if (ruleset == NULL) ERROUT(EINVAL); rs_num = pf_get_ruleset_number(pcr->rule.action); if (rs_num >= PF_RULESET_MAX) ERROUT(EINVAL); /* * XXXMJG: there is no guarantee that the ruleset was * created by the usual route of calling DIOCXBEGIN. * As a result it is possible the rule tree will not * be allocated yet. Hack around it by doing it here. * Note it is fine to let the tree persist in case of * error as it will be freed down the road on future * updates (if need be). 
*/ if (ruleset->rules[rs_num].active.tree == NULL) { ruleset->rules[rs_num].active.tree = pf_rule_tree_alloc(M_NOWAIT); if (ruleset->rules[rs_num].active.tree == NULL) { ERROUT(ENOMEM); } } if (pcr->action == PF_CHANGE_GET_TICKET) { pcr->ticket = ++ruleset->rules[rs_num].active.ticket; ERROUT(0); } else if (pcr->ticket != ruleset->rules[rs_num].active.ticket) ERROUT(EINVAL); if (pcr->action != PF_CHANGE_REMOVE) { if (newrule->ifname[0]) { newrule->kif = pfi_kkif_attach(kif, newrule->ifname); kif = NULL; pfi_kkif_ref(newrule->kif); } else newrule->kif = NULL; if (newrule->rtableid > 0 && newrule->rtableid >= rt_numfibs) error = EBUSY; #ifdef ALTQ /* set queue IDs */ if (newrule->qname[0] != 0) { if ((newrule->qid = pf_qname2qid(newrule->qname)) == 0) error = EBUSY; else if (newrule->pqname[0] != 0) { if ((newrule->pqid = pf_qname2qid(newrule->pqname)) == 0) error = EBUSY; } else newrule->pqid = newrule->qid; } #endif /* ALTQ */ if (newrule->tagname[0]) if ((newrule->tag = pf_tagname2tag(newrule->tagname)) == 0) error = EBUSY; if (newrule->match_tagname[0]) if ((newrule->match_tag = pf_tagname2tag( newrule->match_tagname)) == 0) error = EBUSY; if (newrule->rt && !newrule->direction) error = EINVAL; if (!newrule->log) newrule->logif = 0; if (newrule->logif >= PFLOGIFS_MAX) error = EINVAL; if (pf_addr_setup(ruleset, &newrule->src.addr, newrule->af)) error = ENOMEM; if (pf_addr_setup(ruleset, &newrule->dst.addr, newrule->af)) error = ENOMEM; if (pf_kanchor_setup(newrule, ruleset, pcr->anchor_call)) error = EINVAL; TAILQ_FOREACH(pa, &V_pf_pabuf, entries) if (pa->addr.type == PF_ADDR_TABLE) { pa->addr.p.tbl = pfr_attach_table(ruleset, pa->addr.v.tblname); if (pa->addr.p.tbl == NULL) error = ENOMEM; } newrule->overload_tbl = NULL; if (newrule->overload_tblname[0]) { if ((newrule->overload_tbl = pfr_attach_table( ruleset, newrule->overload_tblname)) == NULL) error = EINVAL; else newrule->overload_tbl->pfrkt_flags |= PFR_TFLAG_ACTIVE; } pf_mv_kpool(&V_pf_pabuf, 
&newrule->rpool.list); if (((((newrule->action == PF_NAT) || (newrule->action == PF_RDR) || (newrule->action == PF_BINAT) || (newrule->rt > PF_NOPFROUTE)) && !newrule->anchor)) && (TAILQ_FIRST(&newrule->rpool.list) == NULL)) error = EINVAL; if (error) { pf_free_rule(newrule); PF_RULES_WUNLOCK(); PF_CONFIG_UNLOCK(); break; } newrule->rpool.cur = TAILQ_FIRST(&newrule->rpool.list); } pf_empty_kpool(&V_pf_pabuf); if (pcr->action == PF_CHANGE_ADD_HEAD) oldrule = TAILQ_FIRST( ruleset->rules[rs_num].active.ptr); else if (pcr->action == PF_CHANGE_ADD_TAIL) oldrule = TAILQ_LAST( ruleset->rules[rs_num].active.ptr, pf_krulequeue); else { oldrule = TAILQ_FIRST( ruleset->rules[rs_num].active.ptr); while ((oldrule != NULL) && (oldrule->nr != pcr->nr)) oldrule = TAILQ_NEXT(oldrule, entries); if (oldrule == NULL) { if (newrule != NULL) pf_free_rule(newrule); PF_RULES_WUNLOCK(); PF_CONFIG_UNLOCK(); error = EINVAL; break; } } if (pcr->action == PF_CHANGE_REMOVE) { pf_unlink_rule(ruleset->rules[rs_num].active.ptr, oldrule); RB_REMOVE(pf_krule_global, ruleset->rules[rs_num].active.tree, oldrule); ruleset->rules[rs_num].active.rcount--; } else { pf_hash_rule(newrule); if (RB_INSERT(pf_krule_global, ruleset->rules[rs_num].active.tree, newrule) != NULL) { pf_free_rule(newrule); PF_RULES_WUNLOCK(); PF_CONFIG_UNLOCK(); error = EEXIST; break; } if (oldrule == NULL) TAILQ_INSERT_TAIL( ruleset->rules[rs_num].active.ptr, newrule, entries); else if (pcr->action == PF_CHANGE_ADD_HEAD || pcr->action == PF_CHANGE_ADD_BEFORE) TAILQ_INSERT_BEFORE(oldrule, newrule, entries); else TAILQ_INSERT_AFTER( ruleset->rules[rs_num].active.ptr, oldrule, newrule, entries); ruleset->rules[rs_num].active.rcount++; } nr = 0; TAILQ_FOREACH(oldrule, ruleset->rules[rs_num].active.ptr, entries) oldrule->nr = nr++; ruleset->rules[rs_num].active.ticket++; pf_calc_skip_steps(ruleset->rules[rs_num].active.ptr); pf_remove_if_empty_kruleset(ruleset); PF_RULES_WUNLOCK(); PF_CONFIG_UNLOCK(); break; #undef ERROUT 
DIOCCHANGERULE_error: PF_RULES_WUNLOCK(); PF_CONFIG_UNLOCK(); pf_krule_free(newrule); pf_kkif_free(kif); break; } case DIOCCLRSTATES: { struct pfioc_state_kill *psk = (struct pfioc_state_kill *)addr; struct pf_kstate_kill kill; error = pf_state_kill_to_kstate_kill(psk, &kill); if (error) break; psk->psk_killed = pf_clear_states(&kill); break; } case DIOCCLRSTATESNV: { error = pf_clearstates_nv((struct pfioc_nv *)addr); break; } case DIOCKILLSTATES: { struct pfioc_state_kill *psk = (struct pfioc_state_kill *)addr; struct pf_kstate_kill kill; error = pf_state_kill_to_kstate_kill(psk, &kill); if (error) break; psk->psk_killed = 0; pf_killstates(&kill, &psk->psk_killed); break; } case DIOCKILLSTATESNV: { error = pf_killstates_nv((struct pfioc_nv *)addr); break; } case DIOCADDSTATE: { struct pfioc_state *ps = (struct pfioc_state *)addr; struct pfsync_state_1301 *sp = &ps->state; if (sp->timeout >= PFTM_MAX) { error = EINVAL; break; } if (V_pfsync_state_import_ptr != NULL) { PF_RULES_RLOCK(); error = V_pfsync_state_import_ptr( (union pfsync_state_union *)sp, PFSYNC_SI_IOCTL, PFSYNC_MSG_VERSION_1301); PF_RULES_RUNLOCK(); } else error = EOPNOTSUPP; break; } case DIOCGETSTATE: { struct pfioc_state *ps = (struct pfioc_state *)addr; struct pf_kstate *s; s = pf_find_state_byid(ps->state.id, ps->state.creatorid); if (s == NULL) { error = ENOENT; break; } pfsync_state_export((union pfsync_state_union*)&ps->state, s, PFSYNC_MSG_VERSION_1301); PF_STATE_UNLOCK(s); break; } case DIOCGETSTATENV: { error = pf_getstate((struct pfioc_nv *)addr); break; } case DIOCGETSTATES: { struct pfioc_states *ps = (struct pfioc_states *)addr; struct pf_kstate *s; struct pfsync_state_1301 *pstore, *p; int i, nr; size_t slice_count = 16, count; void *out; if (ps->ps_len <= 0) { nr = uma_zone_get_cur(V_pf_state_z); ps->ps_len = sizeof(struct pfsync_state_1301) * nr; break; } out = ps->ps_states; pstore = mallocarray(slice_count, sizeof(struct pfsync_state_1301), M_TEMP, M_WAITOK | M_ZERO); nr = 0; for 
(i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; DIOCGETSTATES_retry: p = pstore; if (LIST_EMPTY(&ih->states)) continue; PF_HASHROW_LOCK(ih); count = 0; LIST_FOREACH(s, &ih->states, entry) { if (s->timeout == PFTM_UNLINKED) continue; count++; } if (count > slice_count) { PF_HASHROW_UNLOCK(ih); free(pstore, M_TEMP); slice_count = count * 2; pstore = mallocarray(slice_count, sizeof(struct pfsync_state_1301), M_TEMP, M_WAITOK | M_ZERO); goto DIOCGETSTATES_retry; } if ((nr+count) * sizeof(*p) > ps->ps_len) { PF_HASHROW_UNLOCK(ih); goto DIOCGETSTATES_full; } LIST_FOREACH(s, &ih->states, entry) { if (s->timeout == PFTM_UNLINKED) continue; pfsync_state_export((union pfsync_state_union*)p, s, PFSYNC_MSG_VERSION_1301); p++; nr++; } PF_HASHROW_UNLOCK(ih); error = copyout(pstore, out, sizeof(struct pfsync_state_1301) * count); if (error) break; out = ps->ps_states + nr; } DIOCGETSTATES_full: ps->ps_len = sizeof(struct pfsync_state_1301) * nr; free(pstore, M_TEMP); break; } case DIOCGETSTATESV2: { struct pfioc_states_v2 *ps = (struct pfioc_states_v2 *)addr; struct pf_kstate *s; struct pf_state_export *pstore, *p; int i, nr; size_t slice_count = 16, count; void *out; if (ps->ps_req_version > PF_STATE_VERSION) { error = ENOTSUP; break; } if (ps->ps_len <= 0) { nr = uma_zone_get_cur(V_pf_state_z); ps->ps_len = sizeof(struct pf_state_export) * nr; break; } out = ps->ps_states; pstore = mallocarray(slice_count, sizeof(struct pf_state_export), M_TEMP, M_WAITOK | M_ZERO); nr = 0; for (i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; DIOCGETSTATESV2_retry: p = pstore; if (LIST_EMPTY(&ih->states)) continue; PF_HASHROW_LOCK(ih); count = 0; LIST_FOREACH(s, &ih->states, entry) { if (s->timeout == PFTM_UNLINKED) continue; count++; } if (count > slice_count) { PF_HASHROW_UNLOCK(ih); free(pstore, M_TEMP); slice_count = count * 2; pstore = mallocarray(slice_count, sizeof(struct pf_state_export), M_TEMP, M_WAITOK | M_ZERO); goto 
DIOCGETSTATESV2_retry; } if ((nr+count) * sizeof(*p) > ps->ps_len) { PF_HASHROW_UNLOCK(ih); goto DIOCGETSTATESV2_full; } LIST_FOREACH(s, &ih->states, entry) { if (s->timeout == PFTM_UNLINKED) continue; pf_state_export(p, s); p++; nr++; } PF_HASHROW_UNLOCK(ih); error = copyout(pstore, out, sizeof(struct pf_state_export) * count); if (error) break; out = ps->ps_states + nr; } DIOCGETSTATESV2_full: ps->ps_len = nr * sizeof(struct pf_state_export); free(pstore, M_TEMP); break; } case DIOCGETSTATUS: { struct pf_status *s = (struct pf_status *)addr; PF_RULES_RLOCK(); s->running = V_pf_status.running; s->since = V_pf_status.since; s->debug = V_pf_status.debug; s->hostid = V_pf_status.hostid; s->states = V_pf_status.states; s->src_nodes = V_pf_status.src_nodes; for (int i = 0; i < PFRES_MAX; i++) s->counters[i] = counter_u64_fetch(V_pf_status.counters[i]); for (int i = 0; i < LCNT_MAX; i++) s->lcounters[i] = counter_u64_fetch(V_pf_status.lcounters[i]); for (int i = 0; i < FCNT_MAX; i++) s->fcounters[i] = pf_counter_u64_fetch(&V_pf_status.fcounters[i]); for (int i = 0; i < SCNT_MAX; i++) s->scounters[i] = counter_u64_fetch(V_pf_status.scounters[i]); bcopy(V_pf_status.ifname, s->ifname, IFNAMSIZ); bcopy(V_pf_status.pf_chksum, s->pf_chksum, PF_MD5_DIGEST_LENGTH); pfi_update_status(s->ifname, s); PF_RULES_RUNLOCK(); break; } case DIOCGETSTATUSNV: { error = pf_getstatus((struct pfioc_nv *)addr); break; } case DIOCSETSTATUSIF: { struct pfioc_if *pi = (struct pfioc_if *)addr; if (pi->ifname[0] == 0) { bzero(V_pf_status.ifname, IFNAMSIZ); break; } PF_RULES_WLOCK(); error = pf_user_strcpy(V_pf_status.ifname, pi->ifname, IFNAMSIZ); PF_RULES_WUNLOCK(); break; } case DIOCCLRSTATUS: { PF_RULES_WLOCK(); for (int i = 0; i < PFRES_MAX; i++) counter_u64_zero(V_pf_status.counters[i]); for (int i = 0; i < FCNT_MAX; i++) pf_counter_u64_zero(&V_pf_status.fcounters[i]); for (int i = 0; i < SCNT_MAX; i++) counter_u64_zero(V_pf_status.scounters[i]); for (int i = 0; i < KLCNT_MAX; i++) 
counter_u64_zero(V_pf_status.lcounters[i]); V_pf_status.since = time_second; if (*V_pf_status.ifname) pfi_update_status(V_pf_status.ifname, NULL); PF_RULES_WUNLOCK(); break; } case DIOCNATLOOK: { struct pfioc_natlook *pnl = (struct pfioc_natlook *)addr; struct pf_state_key *sk; struct pf_kstate *state; struct pf_state_key_cmp key; int m = 0, direction = pnl->direction; int sidx, didx; /* NATLOOK src and dst are reversed, so reverse sidx/didx */ sidx = (direction == PF_IN) ? 1 : 0; didx = (direction == PF_IN) ? 0 : 1; if (!pnl->proto || PF_AZERO(&pnl->saddr, pnl->af) || PF_AZERO(&pnl->daddr, pnl->af) || ((pnl->proto == IPPROTO_TCP || pnl->proto == IPPROTO_UDP) && (!pnl->dport || !pnl->sport))) error = EINVAL; else { bzero(&key, sizeof(key)); key.af = pnl->af; key.proto = pnl->proto; PF_ACPY(&key.addr[sidx], &pnl->saddr, pnl->af); key.port[sidx] = pnl->sport; PF_ACPY(&key.addr[didx], &pnl->daddr, pnl->af); key.port[didx] = pnl->dport; state = pf_find_state_all(&key, direction, &m); if (state == NULL) { error = ENOENT; } else { if (m > 1) { PF_STATE_UNLOCK(state); error = E2BIG; /* more than one state */ } else { sk = state->key[sidx]; PF_ACPY(&pnl->rsaddr, &sk->addr[sidx], sk->af); pnl->rsport = sk->port[sidx]; PF_ACPY(&pnl->rdaddr, &sk->addr[didx], sk->af); pnl->rdport = sk->port[didx]; PF_STATE_UNLOCK(state); } } } break; } case DIOCSETTIMEOUT: { struct pfioc_tm *pt = (struct pfioc_tm *)addr; int old; if (pt->timeout < 0 || pt->timeout >= PFTM_MAX || pt->seconds < 0) { error = EINVAL; break; } PF_RULES_WLOCK(); old = V_pf_default_rule.timeout[pt->timeout]; if (pt->timeout == PFTM_INTERVAL && pt->seconds == 0) pt->seconds = 1; V_pf_default_rule.timeout[pt->timeout] = pt->seconds; if (pt->timeout == PFTM_INTERVAL && pt->seconds < old) wakeup(pf_purge_thread); pt->seconds = old; PF_RULES_WUNLOCK(); break; } case DIOCGETTIMEOUT: { struct pfioc_tm *pt = (struct pfioc_tm *)addr; if (pt->timeout < 0 || pt->timeout >= PFTM_MAX) { error = EINVAL; break; } PF_RULES_RLOCK(); 
pt->seconds = V_pf_default_rule.timeout[pt->timeout]; PF_RULES_RUNLOCK(); break; } case DIOCGETLIMIT: { struct pfioc_limit *pl = (struct pfioc_limit *)addr; if (pl->index < 0 || pl->index >= PF_LIMIT_MAX) { error = EINVAL; break; } PF_RULES_RLOCK(); pl->limit = V_pf_limits[pl->index].limit; PF_RULES_RUNLOCK(); break; } case DIOCSETLIMIT: { struct pfioc_limit *pl = (struct pfioc_limit *)addr; int old_limit; PF_RULES_WLOCK(); if (pl->index < 0 || pl->index >= PF_LIMIT_MAX || V_pf_limits[pl->index].zone == NULL) { PF_RULES_WUNLOCK(); error = EINVAL; break; } uma_zone_set_max(V_pf_limits[pl->index].zone, pl->limit); old_limit = V_pf_limits[pl->index].limit; V_pf_limits[pl->index].limit = pl->limit; pl->limit = old_limit; PF_RULES_WUNLOCK(); break; } case DIOCSETDEBUG: { u_int32_t *level = (u_int32_t *)addr; PF_RULES_WLOCK(); V_pf_status.debug = *level; PF_RULES_WUNLOCK(); break; } case DIOCCLRRULECTRS: { /* obsoleted by DIOCGETRULE with action=PF_GET_CLR_CNTR */ struct pf_kruleset *ruleset = &pf_main_ruleset; struct pf_krule *rule; PF_RULES_WLOCK(); TAILQ_FOREACH(rule, ruleset->rules[PF_RULESET_FILTER].active.ptr, entries) { pf_counter_u64_zero(&rule->evaluations); for (int i = 0; i < 2; i++) { pf_counter_u64_zero(&rule->packets[i]); pf_counter_u64_zero(&rule->bytes[i]); } } PF_RULES_WUNLOCK(); break; } case DIOCGIFSPEEDV0: case DIOCGIFSPEEDV1: { struct pf_ifspeed_v1 *psp = (struct pf_ifspeed_v1 *)addr; struct pf_ifspeed_v1 ps; struct ifnet *ifp; if (psp->ifname[0] == '\0') { error = EINVAL; break; } error = pf_user_strcpy(ps.ifname, psp->ifname, IFNAMSIZ); if (error != 0) break; ifp = ifunit(ps.ifname); if (ifp != NULL) { psp->baudrate32 = (u_int32_t)uqmin(ifp->if_baudrate, UINT_MAX); if (cmd == DIOCGIFSPEEDV1) psp->baudrate = ifp->if_baudrate; } else { error = EINVAL; } break; } #ifdef ALTQ case DIOCSTARTALTQ: { struct pf_altq *altq; PF_RULES_WLOCK(); /* enable all altq interfaces on active list */ TAILQ_FOREACH(altq, V_pf_altq_ifs_active, entries) { if 
((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) == 0) { error = pf_enable_altq(altq); if (error != 0) break; } } if (error == 0) V_pf_altq_running = 1; PF_RULES_WUNLOCK(); DPFPRINTF(PF_DEBUG_MISC, ("altq: started\n")); break; } case DIOCSTOPALTQ: { struct pf_altq *altq; PF_RULES_WLOCK(); /* disable all altq interfaces on active list */ TAILQ_FOREACH(altq, V_pf_altq_ifs_active, entries) { if ((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) == 0) { error = pf_disable_altq(altq); if (error != 0) break; } } if (error == 0) V_pf_altq_running = 0; PF_RULES_WUNLOCK(); DPFPRINTF(PF_DEBUG_MISC, ("altq: stopped\n")); break; } case DIOCADDALTQV0: case DIOCADDALTQV1: { struct pfioc_altq_v1 *pa = (struct pfioc_altq_v1 *)addr; struct pf_altq *altq, *a; struct ifnet *ifp; altq = malloc(sizeof(*altq), M_PFALTQ, M_WAITOK | M_ZERO); error = pf_import_kaltq(pa, altq, IOCPARM_LEN(cmd)); if (error) break; altq->local_flags = 0; PF_RULES_WLOCK(); if (pa->ticket != V_ticket_altqs_inactive) { PF_RULES_WUNLOCK(); free(altq, M_PFALTQ); error = EBUSY; break; } /* * if this is for a queue, find the discipline and * copy the necessary fields */ if (altq->qname[0] != 0) { if ((altq->qid = pf_qname2qid(altq->qname)) == 0) { PF_RULES_WUNLOCK(); error = EBUSY; free(altq, M_PFALTQ); break; } altq->altq_disc = NULL; TAILQ_FOREACH(a, V_pf_altq_ifs_inactive, entries) { if (strncmp(a->ifname, altq->ifname, IFNAMSIZ) == 0) { altq->altq_disc = a->altq_disc; break; } } } if ((ifp = ifunit(altq->ifname)) == NULL) altq->local_flags |= PFALTQ_FLAG_IF_REMOVED; else error = altq_add(ifp, altq); if (error) { PF_RULES_WUNLOCK(); free(altq, M_PFALTQ); break; } if (altq->qname[0] != 0) TAILQ_INSERT_TAIL(V_pf_altqs_inactive, altq, entries); else TAILQ_INSERT_TAIL(V_pf_altq_ifs_inactive, altq, entries); /* version error check done on import above */ pf_export_kaltq(altq, pa, IOCPARM_LEN(cmd)); PF_RULES_WUNLOCK(); break; } case DIOCGETALTQSV0: case DIOCGETALTQSV1: { struct pfioc_altq_v1 *pa = (struct pfioc_altq_v1 *)addr; 
struct pf_altq *altq; PF_RULES_RLOCK(); pa->nr = 0; TAILQ_FOREACH(altq, V_pf_altq_ifs_active, entries) pa->nr++; TAILQ_FOREACH(altq, V_pf_altqs_active, entries) pa->nr++; pa->ticket = V_ticket_altqs_active; PF_RULES_RUNLOCK(); break; } case DIOCGETALTQV0: case DIOCGETALTQV1: { struct pfioc_altq_v1 *pa = (struct pfioc_altq_v1 *)addr; struct pf_altq *altq; PF_RULES_RLOCK(); if (pa->ticket != V_ticket_altqs_active) { PF_RULES_RUNLOCK(); error = EBUSY; break; } altq = pf_altq_get_nth_active(pa->nr); if (altq == NULL) { PF_RULES_RUNLOCK(); error = EBUSY; break; } pf_export_kaltq(altq, pa, IOCPARM_LEN(cmd)); PF_RULES_RUNLOCK(); break; } case DIOCCHANGEALTQV0: case DIOCCHANGEALTQV1: /* CHANGEALTQ not supported yet! */ error = ENODEV; break; case DIOCGETQSTATSV0: case DIOCGETQSTATSV1: { struct pfioc_qstats_v1 *pq = (struct pfioc_qstats_v1 *)addr; struct pf_altq *altq; int nbytes; u_int32_t version; PF_RULES_RLOCK(); if (pq->ticket != V_ticket_altqs_active) { PF_RULES_RUNLOCK(); error = EBUSY; break; } nbytes = pq->nbytes; altq = pf_altq_get_nth_active(pq->nr); if (altq == NULL) { PF_RULES_RUNLOCK(); error = EBUSY; break; } if ((altq->local_flags & PFALTQ_FLAG_IF_REMOVED) != 0) { PF_RULES_RUNLOCK(); error = ENXIO; break; } PF_RULES_RUNLOCK(); if (cmd == DIOCGETQSTATSV0) version = 0; /* DIOCGETQSTATSV0 means stats struct v0 */ else version = pq->version; error = altq_getqstats(altq, pq->buf, &nbytes, version); if (error == 0) { pq->scheduler = altq->scheduler; pq->nbytes = nbytes; } break; } #endif /* ALTQ */ case DIOCBEGINADDRS: { struct pfioc_pooladdr *pp = (struct pfioc_pooladdr *)addr; PF_RULES_WLOCK(); pf_empty_kpool(&V_pf_pabuf); pp->ticket = ++V_ticket_pabuf; PF_RULES_WUNLOCK(); break; } case DIOCADDADDR: { struct pfioc_pooladdr *pp = (struct pfioc_pooladdr *)addr; struct pf_kpooladdr *pa; struct pfi_kkif *kif = NULL; #ifndef INET if (pp->af == AF_INET) { error = EAFNOSUPPORT; break; } #endif /* INET */ #ifndef INET6 if (pp->af == AF_INET6) { error = EAFNOSUPPORT; 
break; } #endif /* INET6 */ if (pp->addr.addr.type != PF_ADDR_ADDRMASK && pp->addr.addr.type != PF_ADDR_DYNIFTL && pp->addr.addr.type != PF_ADDR_TABLE) { error = EINVAL; break; } if (pp->addr.addr.p.dyn != NULL) { error = EINVAL; break; } pa = malloc(sizeof(*pa), M_PFRULE, M_WAITOK); error = pf_pooladdr_to_kpooladdr(&pp->addr, pa); if (error != 0) break; if (pa->ifname[0]) kif = pf_kkif_create(M_WAITOK); PF_RULES_WLOCK(); if (pp->ticket != V_ticket_pabuf) { PF_RULES_WUNLOCK(); if (pa->ifname[0]) pf_kkif_free(kif); free(pa, M_PFRULE); error = EBUSY; break; } if (pa->ifname[0]) { pa->kif = pfi_kkif_attach(kif, pa->ifname); kif = NULL; pfi_kkif_ref(pa->kif); } else pa->kif = NULL; if (pa->addr.type == PF_ADDR_DYNIFTL && ((error = pfi_dynaddr_setup(&pa->addr, pp->af)) != 0)) { if (pa->ifname[0]) pfi_kkif_unref(pa->kif); PF_RULES_WUNLOCK(); free(pa, M_PFRULE); break; } TAILQ_INSERT_TAIL(&V_pf_pabuf, pa, entries); PF_RULES_WUNLOCK(); break; } case DIOCGETADDRS: { struct pfioc_pooladdr *pp = (struct pfioc_pooladdr *)addr; struct pf_kpool *pool; struct pf_kpooladdr *pa; pp->anchor[sizeof(pp->anchor) - 1] = 0; pp->nr = 0; PF_RULES_RLOCK(); pool = pf_get_kpool(pp->anchor, pp->ticket, pp->r_action, pp->r_num, 0, 1, 0); if (pool == NULL) { PF_RULES_RUNLOCK(); error = EBUSY; break; } TAILQ_FOREACH(pa, &pool->list, entries) pp->nr++; PF_RULES_RUNLOCK(); break; } case DIOCGETADDR: { struct pfioc_pooladdr *pp = (struct pfioc_pooladdr *)addr; struct pf_kpool *pool; struct pf_kpooladdr *pa; u_int32_t nr = 0; pp->anchor[sizeof(pp->anchor) - 1] = 0; PF_RULES_RLOCK(); pool = pf_get_kpool(pp->anchor, pp->ticket, pp->r_action, pp->r_num, 0, 1, 1); if (pool == NULL) { PF_RULES_RUNLOCK(); error = EBUSY; break; } pa = TAILQ_FIRST(&pool->list); while ((pa != NULL) && (nr < pp->nr)) { pa = TAILQ_NEXT(pa, entries); nr++; } if (pa == NULL) { PF_RULES_RUNLOCK(); error = EBUSY; break; } pf_kpooladdr_to_pooladdr(pa, &pp->addr); pf_addr_copyout(&pp->addr.addr); PF_RULES_RUNLOCK(); break; } case 
DIOCCHANGEADDR: { struct pfioc_pooladdr *pca = (struct pfioc_pooladdr *)addr; struct pf_kpool *pool; struct pf_kpooladdr *oldpa = NULL, *newpa = NULL; struct pf_kruleset *ruleset; struct pfi_kkif *kif = NULL; pca->anchor[sizeof(pca->anchor) - 1] = 0; if (pca->action < PF_CHANGE_ADD_HEAD || pca->action > PF_CHANGE_REMOVE) { error = EINVAL; break; } if (pca->addr.addr.type != PF_ADDR_ADDRMASK && pca->addr.addr.type != PF_ADDR_DYNIFTL && pca->addr.addr.type != PF_ADDR_TABLE) { error = EINVAL; break; } if (pca->addr.addr.p.dyn != NULL) { error = EINVAL; break; } if (pca->action != PF_CHANGE_REMOVE) { #ifndef INET if (pca->af == AF_INET) { error = EAFNOSUPPORT; break; } #endif /* INET */ #ifndef INET6 if (pca->af == AF_INET6) { error = EAFNOSUPPORT; break; } #endif /* INET6 */ newpa = malloc(sizeof(*newpa), M_PFRULE, M_WAITOK); bcopy(&pca->addr, newpa, sizeof(struct pf_pooladdr)); if (newpa->ifname[0]) kif = pf_kkif_create(M_WAITOK); newpa->kif = NULL; } #define ERROUT(x) ERROUT_IOCTL(DIOCCHANGEADDR_error, x) PF_RULES_WLOCK(); ruleset = pf_find_kruleset(pca->anchor); if (ruleset == NULL) ERROUT(EBUSY); pool = pf_get_kpool(pca->anchor, pca->ticket, pca->r_action, pca->r_num, pca->r_last, 1, 1); if (pool == NULL) ERROUT(EBUSY); if (pca->action != PF_CHANGE_REMOVE) { if (newpa->ifname[0]) { newpa->kif = pfi_kkif_attach(kif, newpa->ifname); pfi_kkif_ref(newpa->kif); kif = NULL; } switch (newpa->addr.type) { case PF_ADDR_DYNIFTL: error = pfi_dynaddr_setup(&newpa->addr, pca->af); break; case PF_ADDR_TABLE: newpa->addr.p.tbl = pfr_attach_table(ruleset, newpa->addr.v.tblname); if (newpa->addr.p.tbl == NULL) error = ENOMEM; break; } if (error) goto DIOCCHANGEADDR_error; } switch (pca->action) { case PF_CHANGE_ADD_HEAD: oldpa = TAILQ_FIRST(&pool->list); break; case PF_CHANGE_ADD_TAIL: oldpa = TAILQ_LAST(&pool->list, pf_kpalist); break; default: oldpa = TAILQ_FIRST(&pool->list); for (int i = 0; oldpa && i < pca->nr; i++) oldpa = TAILQ_NEXT(oldpa, entries); if (oldpa == NULL) 
ERROUT(EINVAL); } if (pca->action == PF_CHANGE_REMOVE) { TAILQ_REMOVE(&pool->list, oldpa, entries); switch (oldpa->addr.type) { case PF_ADDR_DYNIFTL: pfi_dynaddr_remove(oldpa->addr.p.dyn); break; case PF_ADDR_TABLE: pfr_detach_table(oldpa->addr.p.tbl); break; } if (oldpa->kif) pfi_kkif_unref(oldpa->kif); free(oldpa, M_PFRULE); } else { if (oldpa == NULL) TAILQ_INSERT_TAIL(&pool->list, newpa, entries); else if (pca->action == PF_CHANGE_ADD_HEAD || pca->action == PF_CHANGE_ADD_BEFORE) TAILQ_INSERT_BEFORE(oldpa, newpa, entries); else TAILQ_INSERT_AFTER(&pool->list, oldpa, newpa, entries); } pool->cur = TAILQ_FIRST(&pool->list); PF_ACPY(&pool->counter, &pool->cur->addr.v.a.addr, pca->af); PF_RULES_WUNLOCK(); break; #undef ERROUT DIOCCHANGEADDR_error: if (newpa != NULL) { if (newpa->kif) pfi_kkif_unref(newpa->kif); free(newpa, M_PFRULE); } PF_RULES_WUNLOCK(); pf_kkif_free(kif); break; } case DIOCGETRULESETS: { struct pfioc_ruleset *pr = (struct pfioc_ruleset *)addr; struct pf_kruleset *ruleset; struct pf_kanchor *anchor; pr->path[sizeof(pr->path) - 1] = 0; PF_RULES_RLOCK(); if ((ruleset = pf_find_kruleset(pr->path)) == NULL) { PF_RULES_RUNLOCK(); error = ENOENT; break; } pr->nr = 0; if (ruleset->anchor == NULL) { /* XXX kludge for pf_main_ruleset */ RB_FOREACH(anchor, pf_kanchor_global, &V_pf_anchors) if (anchor->parent == NULL) pr->nr++; } else { RB_FOREACH(anchor, pf_kanchor_node, &ruleset->anchor->children) pr->nr++; } PF_RULES_RUNLOCK(); break; } case DIOCGETRULESET: { struct pfioc_ruleset *pr = (struct pfioc_ruleset *)addr; struct pf_kruleset *ruleset; struct pf_kanchor *anchor; u_int32_t nr = 0; pr->path[sizeof(pr->path) - 1] = 0; PF_RULES_RLOCK(); if ((ruleset = pf_find_kruleset(pr->path)) == NULL) { PF_RULES_RUNLOCK(); error = ENOENT; break; } pr->name[0] = 0; if (ruleset->anchor == NULL) { /* XXX kludge for pf_main_ruleset */ RB_FOREACH(anchor, pf_kanchor_global, &V_pf_anchors) if (anchor->parent == NULL && nr++ == pr->nr) { strlcpy(pr->name, anchor->name, 
sizeof(pr->name)); break; } } else { RB_FOREACH(anchor, pf_kanchor_node, &ruleset->anchor->children) if (nr++ == pr->nr) { strlcpy(pr->name, anchor->name, sizeof(pr->name)); break; } } if (!pr->name[0]) error = EBUSY; PF_RULES_RUNLOCK(); break; } case DIOCRCLRTABLES: { struct pfioc_table *io = (struct pfioc_table *)addr; if (io->pfrio_esize != 0) { error = ENODEV; break; } PF_RULES_WLOCK(); error = pfr_clr_tables(&io->pfrio_table, &io->pfrio_ndel, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); break; } case DIOCRADDTABLES: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_table *pfrts; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_table)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_table))) { error = ENOMEM; break; } totlen = io->pfrio_size * sizeof(struct pfr_table); pfrts = mallocarray(io->pfrio_size, sizeof(struct pfr_table), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfrts, totlen); if (error) { free(pfrts, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_add_tables(pfrts, io->pfrio_size, &io->pfrio_nadd, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); free(pfrts, M_TEMP); break; } case DIOCRDELTABLES: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_table *pfrts; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_table)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_table))) { error = ENOMEM; break; } totlen = io->pfrio_size * sizeof(struct pfr_table); pfrts = mallocarray(io->pfrio_size, sizeof(struct pfr_table), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfrts, totlen); if (error) { free(pfrts, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_del_tables(pfrts, io->pfrio_size, &io->pfrio_ndel, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); free(pfrts, M_TEMP); break; } case 
DIOCRGETTABLES: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_table *pfrts; size_t totlen; int n; if (io->pfrio_esize != sizeof(struct pfr_table)) { error = ENODEV; break; } PF_RULES_RLOCK(); n = pfr_table_count(&io->pfrio_table, io->pfrio_flags); if (n < 0) { PF_RULES_RUNLOCK(); error = EINVAL; break; } io->pfrio_size = min(io->pfrio_size, n); totlen = io->pfrio_size * sizeof(struct pfr_table); pfrts = mallocarray(io->pfrio_size, sizeof(struct pfr_table), M_TEMP, M_NOWAIT | M_ZERO); if (pfrts == NULL) { error = ENOMEM; PF_RULES_RUNLOCK(); break; } error = pfr_get_tables(&io->pfrio_table, pfrts, &io->pfrio_size, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_RUNLOCK(); if (error == 0) error = copyout(pfrts, io->pfrio_buffer, totlen); free(pfrts, M_TEMP); break; } case DIOCRGETTSTATS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_tstats *pfrtstats; size_t totlen; int n; if (io->pfrio_esize != sizeof(struct pfr_tstats)) { error = ENODEV; break; } PF_TABLE_STATS_LOCK(); PF_RULES_RLOCK(); n = pfr_table_count(&io->pfrio_table, io->pfrio_flags); if (n < 0) { PF_RULES_RUNLOCK(); PF_TABLE_STATS_UNLOCK(); error = EINVAL; break; } io->pfrio_size = min(io->pfrio_size, n); totlen = io->pfrio_size * sizeof(struct pfr_tstats); pfrtstats = mallocarray(io->pfrio_size, sizeof(struct pfr_tstats), M_TEMP, M_NOWAIT | M_ZERO); if (pfrtstats == NULL) { error = ENOMEM; PF_RULES_RUNLOCK(); PF_TABLE_STATS_UNLOCK(); break; } error = pfr_get_tstats(&io->pfrio_table, pfrtstats, &io->pfrio_size, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_RUNLOCK(); PF_TABLE_STATS_UNLOCK(); if (error == 0) error = copyout(pfrtstats, io->pfrio_buffer, totlen); free(pfrtstats, M_TEMP); break; } case DIOCRCLRTSTATS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_table *pfrts; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_table)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || 
WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_table))) { /* We used to count tables and use the minimum required * size, so we didn't fail on overly large requests. * Keep doing so. */ io->pfrio_size = pf_ioctl_maxcount; break; } totlen = io->pfrio_size * sizeof(struct pfr_table); pfrts = mallocarray(io->pfrio_size, sizeof(struct pfr_table), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfrts, totlen); if (error) { free(pfrts, M_TEMP); break; } PF_TABLE_STATS_LOCK(); PF_RULES_RLOCK(); error = pfr_clr_tstats(pfrts, io->pfrio_size, &io->pfrio_nzero, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_RUNLOCK(); PF_TABLE_STATS_UNLOCK(); free(pfrts, M_TEMP); break; } case DIOCRSETTFLAGS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_table *pfrts; size_t totlen; int n; if (io->pfrio_esize != sizeof(struct pfr_table)) { error = ENODEV; break; } PF_RULES_RLOCK(); n = pfr_table_count(&io->pfrio_table, io->pfrio_flags); if (n < 0) { PF_RULES_RUNLOCK(); error = EINVAL; break; } io->pfrio_size = min(io->pfrio_size, n); PF_RULES_RUNLOCK(); totlen = io->pfrio_size * sizeof(struct pfr_table); pfrts = mallocarray(io->pfrio_size, sizeof(struct pfr_table), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfrts, totlen); if (error) { free(pfrts, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_set_tflags(pfrts, io->pfrio_size, io->pfrio_setflag, io->pfrio_clrflag, &io->pfrio_nchange, &io->pfrio_ndel, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); free(pfrts, M_TEMP); break; } case DIOCRCLRADDRS: { struct pfioc_table *io = (struct pfioc_table *)addr; if (io->pfrio_esize != 0) { error = ENODEV; break; } PF_RULES_WLOCK(); error = pfr_clr_addrs(&io->pfrio_table, &io->pfrio_ndel, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); break; } case DIOCRADDADDRS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if 
(io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = io->pfrio_size * sizeof(struct pfr_addr); pfras = mallocarray(io->pfrio_size, sizeof(struct pfr_addr), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfras, totlen); if (error) { free(pfras, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_add_addrs(&io->pfrio_table, pfras, io->pfrio_size, &io->pfrio_nadd, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); if (error == 0 && io->pfrio_flags & PFR_FLAG_FEEDBACK) error = copyout(pfras, io->pfrio_buffer, totlen); free(pfras, M_TEMP); break; } case DIOCRDELADDRS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = io->pfrio_size * sizeof(struct pfr_addr); pfras = mallocarray(io->pfrio_size, sizeof(struct pfr_addr), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfras, totlen); if (error) { free(pfras, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_del_addrs(&io->pfrio_table, pfras, io->pfrio_size, &io->pfrio_ndel, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); if (error == 0 && io->pfrio_flags & PFR_FLAG_FEEDBACK) error = copyout(pfras, io->pfrio_buffer, totlen); free(pfras, M_TEMP); break; } case DIOCRSETADDRS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen, count; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size2 < 0) { error = EINVAL; break; } count = max(io->pfrio_size, io->pfrio_size2); if (count > pf_ioctl_maxcount || WOULD_OVERFLOW(count, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = count * sizeof(struct pfr_addr); pfras = 
mallocarray(count, sizeof(struct pfr_addr), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfras, totlen); if (error) { free(pfras, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_set_addrs(&io->pfrio_table, pfras, io->pfrio_size, &io->pfrio_size2, &io->pfrio_nadd, &io->pfrio_ndel, &io->pfrio_nchange, io->pfrio_flags | PFR_FLAG_USERIOCTL, 0); PF_RULES_WUNLOCK(); if (error == 0 && io->pfrio_flags & PFR_FLAG_FEEDBACK) error = copyout(pfras, io->pfrio_buffer, totlen); free(pfras, M_TEMP); break; } case DIOCRGETADDRS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = io->pfrio_size * sizeof(struct pfr_addr); pfras = mallocarray(io->pfrio_size, sizeof(struct pfr_addr), M_TEMP, M_WAITOK | M_ZERO); PF_RULES_RLOCK(); error = pfr_get_addrs(&io->pfrio_table, pfras, &io->pfrio_size, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_RUNLOCK(); if (error == 0) error = copyout(pfras, io->pfrio_buffer, totlen); free(pfras, M_TEMP); break; } case DIOCRGETASTATS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_astats *pfrastats; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_astats)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_astats))) { error = EINVAL; break; } totlen = io->pfrio_size * sizeof(struct pfr_astats); pfrastats = mallocarray(io->pfrio_size, sizeof(struct pfr_astats), M_TEMP, M_WAITOK | M_ZERO); PF_RULES_RLOCK(); error = pfr_get_astats(&io->pfrio_table, pfrastats, &io->pfrio_size, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_RUNLOCK(); if (error == 0) error = copyout(pfrastats, io->pfrio_buffer, totlen); free(pfrastats, M_TEMP); break; } case DIOCRCLRASTATS: { struct 
pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = io->pfrio_size * sizeof(struct pfr_addr); pfras = mallocarray(io->pfrio_size, sizeof(struct pfr_addr), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfras, totlen); if (error) { free(pfras, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_clr_astats(&io->pfrio_table, pfras, io->pfrio_size, &io->pfrio_nzero, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); if (error == 0 && io->pfrio_flags & PFR_FLAG_FEEDBACK) error = copyout(pfras, io->pfrio_buffer, totlen); free(pfras, M_TEMP); break; } case DIOCRTSTADDRS: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = io->pfrio_size * sizeof(struct pfr_addr); pfras = mallocarray(io->pfrio_size, sizeof(struct pfr_addr), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfras, totlen); if (error) { free(pfras, M_TEMP); break; } PF_RULES_RLOCK(); error = pfr_tst_addrs(&io->pfrio_table, pfras, io->pfrio_size, &io->pfrio_nmatch, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_RUNLOCK(); if (error == 0) error = copyout(pfras, io->pfrio_buffer, totlen); free(pfras, M_TEMP); break; } case DIOCRINADEFINE: { struct pfioc_table *io = (struct pfioc_table *)addr; struct pfr_addr *pfras; size_t totlen; if (io->pfrio_esize != sizeof(struct pfr_addr)) { error = ENODEV; break; } if (io->pfrio_size < 0 || io->pfrio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfrio_size, sizeof(struct pfr_addr))) { error = EINVAL; break; } totlen = 
io->pfrio_size * sizeof(struct pfr_addr); pfras = mallocarray(io->pfrio_size, sizeof(struct pfr_addr), M_TEMP, M_WAITOK); error = copyin(io->pfrio_buffer, pfras, totlen); if (error) { free(pfras, M_TEMP); break; } PF_RULES_WLOCK(); error = pfr_ina_define(&io->pfrio_table, pfras, io->pfrio_size, &io->pfrio_nadd, &io->pfrio_naddr, io->pfrio_ticket, io->pfrio_flags | PFR_FLAG_USERIOCTL); PF_RULES_WUNLOCK(); free(pfras, M_TEMP); break; } case DIOCOSFPADD: { struct pf_osfp_ioctl *io = (struct pf_osfp_ioctl *)addr; PF_RULES_WLOCK(); error = pf_osfp_add(io); PF_RULES_WUNLOCK(); break; } case DIOCOSFPGET: { struct pf_osfp_ioctl *io = (struct pf_osfp_ioctl *)addr; PF_RULES_RLOCK(); error = pf_osfp_get(io); PF_RULES_RUNLOCK(); break; } case DIOCXBEGIN: { struct pfioc_trans *io = (struct pfioc_trans *)addr; struct pfioc_trans_e *ioes, *ioe; size_t totlen; int i; if (io->esize != sizeof(*ioe)) { error = ENODEV; break; } if (io->size < 0 || io->size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->size, sizeof(struct pfioc_trans_e))) { error = EINVAL; break; } totlen = sizeof(struct pfioc_trans_e) * io->size; ioes = mallocarray(io->size, sizeof(struct pfioc_trans_e), M_TEMP, M_WAITOK); error = copyin(io->array, ioes, totlen); if (error) { free(ioes, M_TEMP); break; } /* Ensure there's no more ethernet rules to clean up. 
*/ NET_EPOCH_DRAIN_CALLBACKS(); PF_RULES_WLOCK(); for (i = 0, ioe = ioes; i < io->size; i++, ioe++) { ioe->anchor[sizeof(ioe->anchor) - 1] = '\0'; switch (ioe->rs_num) { case PF_RULESET_ETH: if ((error = pf_begin_eth(&ioe->ticket, ioe->anchor))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; } break; #ifdef ALTQ case PF_RULESET_ALTQ: if (ioe->anchor[0]) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EINVAL; goto fail; } if ((error = pf_begin_altq(&ioe->ticket))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; } break; #endif /* ALTQ */ case PF_RULESET_TABLE: { struct pfr_table table; bzero(&table, sizeof(table)); strlcpy(table.pfrt_anchor, ioe->anchor, sizeof(table.pfrt_anchor)); if ((error = pfr_ina_begin(&table, &ioe->ticket, NULL, 0))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; } break; } default: if ((error = pf_begin_rules(&ioe->ticket, ioe->rs_num, ioe->anchor))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; } break; } } PF_RULES_WUNLOCK(); error = copyout(ioes, io->array, totlen); free(ioes, M_TEMP); break; } case DIOCXROLLBACK: { struct pfioc_trans *io = (struct pfioc_trans *)addr; struct pfioc_trans_e *ioe, *ioes; size_t totlen; int i; if (io->esize != sizeof(*ioe)) { error = ENODEV; break; } if (io->size < 0 || io->size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->size, sizeof(struct pfioc_trans_e))) { error = EINVAL; break; } totlen = sizeof(struct pfioc_trans_e) * io->size; ioes = mallocarray(io->size, sizeof(struct pfioc_trans_e), M_TEMP, M_WAITOK); error = copyin(io->array, ioes, totlen); if (error) { free(ioes, M_TEMP); break; } PF_RULES_WLOCK(); for (i = 0, ioe = ioes; i < io->size; i++, ioe++) { ioe->anchor[sizeof(ioe->anchor) - 1] = '\0'; switch (ioe->rs_num) { case PF_RULESET_ETH: if ((error = pf_rollback_eth(ioe->ticket, ioe->anchor))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; #ifdef ALTQ case PF_RULESET_ALTQ: if (ioe->anchor[0]) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = 
EINVAL; goto fail; } if ((error = pf_rollback_altq(ioe->ticket))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; #endif /* ALTQ */ case PF_RULESET_TABLE: { struct pfr_table table; bzero(&table, sizeof(table)); strlcpy(table.pfrt_anchor, ioe->anchor, sizeof(table.pfrt_anchor)); if ((error = pfr_ina_rollback(&table, ioe->ticket, NULL, 0))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; } default: if ((error = pf_rollback_rules(ioe->ticket, ioe->rs_num, ioe->anchor))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; } } PF_RULES_WUNLOCK(); free(ioes, M_TEMP); break; } case DIOCXCOMMIT: { struct pfioc_trans *io = (struct pfioc_trans *)addr; struct pfioc_trans_e *ioe, *ioes; struct pf_kruleset *rs; struct pf_keth_ruleset *ers; size_t totlen; int i; if (io->esize != sizeof(*ioe)) { error = ENODEV; break; } if (io->size < 0 || io->size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->size, sizeof(struct pfioc_trans_e))) { error = EINVAL; break; } totlen = sizeof(struct pfioc_trans_e) * io->size; ioes = mallocarray(io->size, sizeof(struct pfioc_trans_e), M_TEMP, M_WAITOK); error = copyin(io->array, ioes, totlen); if (error) { free(ioes, M_TEMP); break; } PF_RULES_WLOCK(); /* First makes sure everything will succeed. 
*/ for (i = 0, ioe = ioes; i < io->size; i++, ioe++) { ioe->anchor[sizeof(ioe->anchor) - 1] = 0; switch (ioe->rs_num) { case PF_RULESET_ETH: ers = pf_find_keth_ruleset(ioe->anchor); if (ers == NULL || ioe->ticket == 0 || ioe->ticket != ers->inactive.ticket) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EINVAL; goto fail; } break; #ifdef ALTQ case PF_RULESET_ALTQ: if (ioe->anchor[0]) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EINVAL; goto fail; } if (!V_altqs_inactive_open || ioe->ticket != V_ticket_altqs_inactive) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EBUSY; goto fail; } break; #endif /* ALTQ */ case PF_RULESET_TABLE: rs = pf_find_kruleset(ioe->anchor); if (rs == NULL || !rs->topen || ioe->ticket != rs->tticket) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EBUSY; goto fail; } break; default: if (ioe->rs_num < 0 || ioe->rs_num >= PF_RULESET_MAX) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EINVAL; goto fail; } rs = pf_find_kruleset(ioe->anchor); if (rs == NULL || !rs->rules[ioe->rs_num].inactive.open || rs->rules[ioe->rs_num].inactive.ticket != ioe->ticket) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); error = EBUSY; goto fail; } break; } } /* Now do the commit - no errors should happen here. 
*/ for (i = 0, ioe = ioes; i < io->size; i++, ioe++) { switch (ioe->rs_num) { case PF_RULESET_ETH: if ((error = pf_commit_eth(ioe->ticket, ioe->anchor))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; #ifdef ALTQ case PF_RULESET_ALTQ: if ((error = pf_commit_altq(ioe->ticket))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; #endif /* ALTQ */ case PF_RULESET_TABLE: { struct pfr_table table; bzero(&table, sizeof(table)); (void)strlcpy(table.pfrt_anchor, ioe->anchor, sizeof(table.pfrt_anchor)); if ((error = pfr_ina_commit(&table, ioe->ticket, NULL, NULL, 0))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; } default: if ((error = pf_commit_rules(ioe->ticket, ioe->rs_num, ioe->anchor))) { PF_RULES_WUNLOCK(); free(ioes, M_TEMP); goto fail; /* really bad */ } break; } } PF_RULES_WUNLOCK(); /* Only hook into Ethernet traffic if we've got rules for it. */ if (! TAILQ_EMPTY(V_pf_keth->active.rules)) hook_pf_eth(); else dehook_pf_eth(); free(ioes, M_TEMP); break; } case DIOCGETSRCNODES: { struct pfioc_src_nodes *psn = (struct pfioc_src_nodes *)addr; struct pf_srchash *sh; struct pf_ksrc_node *n; struct pf_src_node *p, *pstore; uint32_t i, nr = 0; for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { PF_HASHROW_LOCK(sh); LIST_FOREACH(n, &sh->nodes, entry) nr++; PF_HASHROW_UNLOCK(sh); } psn->psn_len = min(psn->psn_len, sizeof(struct pf_src_node) * nr); if (psn->psn_len == 0) { psn->psn_len = sizeof(struct pf_src_node) * nr; break; } nr = 0; p = pstore = malloc(psn->psn_len, M_TEMP, M_WAITOK | M_ZERO); for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { PF_HASHROW_LOCK(sh); LIST_FOREACH(n, &sh->nodes, entry) { if ((nr + 1) * sizeof(*p) > (unsigned)psn->psn_len) break; pf_src_node_copy(n, p); p++; nr++; } PF_HASHROW_UNLOCK(sh); } error = copyout(pstore, psn->psn_src_nodes, sizeof(struct pf_src_node) * nr); if (error) { free(pstore, M_TEMP); break; } psn->psn_len = 
sizeof(struct pf_src_node) * nr; free(pstore, M_TEMP); break; } case DIOCCLRSRCNODES: { pf_clear_srcnodes(NULL); pf_purge_expired_src_nodes(); break; } case DIOCKILLSRCNODES: pf_kill_srcnodes((struct pfioc_src_node_kill *)addr); break; #ifdef COMPAT_FREEBSD13 case DIOCKEEPCOUNTERS_FREEBSD13: #endif case DIOCKEEPCOUNTERS: error = pf_keepcounters((struct pfioc_nv *)addr); break; case DIOCGETSYNCOOKIES: error = pf_get_syncookies((struct pfioc_nv *)addr); break; case DIOCSETSYNCOOKIES: error = pf_set_syncookies((struct pfioc_nv *)addr); break; case DIOCSETHOSTID: { u_int32_t *hostid = (u_int32_t *)addr; PF_RULES_WLOCK(); if (*hostid == 0) V_pf_status.hostid = arc4random(); else V_pf_status.hostid = *hostid; PF_RULES_WUNLOCK(); break; } case DIOCOSFPFLUSH: PF_RULES_WLOCK(); pf_osfp_flush(); PF_RULES_WUNLOCK(); break; case DIOCIGETIFACES: { struct pfioc_iface *io = (struct pfioc_iface *)addr; struct pfi_kif *ifstore; size_t bufsiz; if (io->pfiio_esize != sizeof(struct pfi_kif)) { error = ENODEV; break; } if (io->pfiio_size < 0 || io->pfiio_size > pf_ioctl_maxcount || WOULD_OVERFLOW(io->pfiio_size, sizeof(struct pfi_kif))) { error = EINVAL; break; } io->pfiio_name[sizeof(io->pfiio_name) - 1] = '\0'; bufsiz = io->pfiio_size * sizeof(struct pfi_kif); ifstore = mallocarray(io->pfiio_size, sizeof(struct pfi_kif), M_TEMP, M_WAITOK | M_ZERO); PF_RULES_RLOCK(); pfi_get_ifaces(io->pfiio_name, ifstore, &io->pfiio_size); PF_RULES_RUNLOCK(); error = copyout(ifstore, io->pfiio_buffer, bufsiz); free(ifstore, M_TEMP); break; } case DIOCSETIFFLAG: { struct pfioc_iface *io = (struct pfioc_iface *)addr; io->pfiio_name[sizeof(io->pfiio_name) - 1] = '\0'; PF_RULES_WLOCK(); error = pfi_set_flags(io->pfiio_name, io->pfiio_flags); PF_RULES_WUNLOCK(); break; } case DIOCCLRIFFLAG: { struct pfioc_iface *io = (struct pfioc_iface *)addr; io->pfiio_name[sizeof(io->pfiio_name) - 1] = '\0'; PF_RULES_WLOCK(); error = pfi_clear_flags(io->pfiio_name, io->pfiio_flags); PF_RULES_WUNLOCK(); break; } case 
DIOCSETREASS: { u_int32_t *reass = (u_int32_t *)addr; V_pf_status.reass = *reass & (PF_REASS_ENABLED|PF_REASS_NODF); /* Removal of DF flag without reassembly enabled is not a * valid combination. Disable reassembly in such case. */ if (!(V_pf_status.reass & PF_REASS_ENABLED)) V_pf_status.reass = 0; break; } default: error = ENODEV; break; } fail: if (sx_xlocked(&V_pf_ioctl_lock)) sx_xunlock(&V_pf_ioctl_lock); CURVNET_RESTORE(); #undef ERROUT_IOCTL return (error); } void pfsync_state_export(union pfsync_state_union *sp, struct pf_kstate *st, int msg_version) { bzero(sp, sizeof(union pfsync_state_union)); /* copy from state key */ sp->pfs_1301.key[PF_SK_WIRE].addr[0] = st->key[PF_SK_WIRE]->addr[0]; sp->pfs_1301.key[PF_SK_WIRE].addr[1] = st->key[PF_SK_WIRE]->addr[1]; sp->pfs_1301.key[PF_SK_WIRE].port[0] = st->key[PF_SK_WIRE]->port[0]; sp->pfs_1301.key[PF_SK_WIRE].port[1] = st->key[PF_SK_WIRE]->port[1]; sp->pfs_1301.key[PF_SK_STACK].addr[0] = st->key[PF_SK_STACK]->addr[0]; sp->pfs_1301.key[PF_SK_STACK].addr[1] = st->key[PF_SK_STACK]->addr[1]; sp->pfs_1301.key[PF_SK_STACK].port[0] = st->key[PF_SK_STACK]->port[0]; sp->pfs_1301.key[PF_SK_STACK].port[1] = st->key[PF_SK_STACK]->port[1]; sp->pfs_1301.proto = st->key[PF_SK_WIRE]->proto; sp->pfs_1301.af = st->key[PF_SK_WIRE]->af; /* copy from state */ strlcpy(sp->pfs_1301.ifname, st->kif->pfik_name, sizeof(sp->pfs_1301.ifname)); bcopy(&st->rt_addr, &sp->pfs_1301.rt_addr, sizeof(sp->pfs_1301.rt_addr)); sp->pfs_1301.creation = htonl(time_uptime - st->creation); sp->pfs_1301.expire = pf_state_expires(st); if (sp->pfs_1301.expire <= time_uptime) sp->pfs_1301.expire = htonl(0); else sp->pfs_1301.expire = htonl(sp->pfs_1301.expire - time_uptime); sp->pfs_1301.direction = st->direction; sp->pfs_1301.log = st->log; sp->pfs_1301.timeout = st->timeout; switch (msg_version) { case PFSYNC_MSG_VERSION_1301: sp->pfs_1301.state_flags = st->state_flags; break; case PFSYNC_MSG_VERSION_1400: sp->pfs_1400.state_flags = htons(st->state_flags); 
sp->pfs_1400.qid = htons(st->qid); sp->pfs_1400.pqid = htons(st->pqid); sp->pfs_1400.dnpipe = htons(st->dnpipe); sp->pfs_1400.dnrpipe = htons(st->dnrpipe); sp->pfs_1400.rtableid = htonl(st->rtableid); sp->pfs_1400.min_ttl = st->min_ttl; sp->pfs_1400.set_tos = st->set_tos; sp->pfs_1400.max_mss = htons(st->max_mss); sp->pfs_1400.set_prio[0] = st->set_prio[0]; sp->pfs_1400.set_prio[1] = st->set_prio[1]; sp->pfs_1400.rt = st->rt; if (st->rt_kif) strlcpy(sp->pfs_1400.rt_ifname, st->rt_kif->pfik_name, sizeof(sp->pfs_1400.rt_ifname)); break; default: panic("%s: Unsupported pfsync_msg_version %d", __func__, msg_version); } if (st->src_node) sp->pfs_1301.sync_flags |= PFSYNC_FLAG_SRCNODE; if (st->nat_src_node) sp->pfs_1301.sync_flags |= PFSYNC_FLAG_NATSRCNODE; sp->pfs_1301.id = st->id; sp->pfs_1301.creatorid = st->creatorid; pf_state_peer_hton(&st->src, &sp->pfs_1301.src); pf_state_peer_hton(&st->dst, &sp->pfs_1301.dst); if (st->rule.ptr == NULL) sp->pfs_1301.rule = htonl(-1); else sp->pfs_1301.rule = htonl(st->rule.ptr->nr); if (st->anchor.ptr == NULL) sp->pfs_1301.anchor = htonl(-1); else sp->pfs_1301.anchor = htonl(st->anchor.ptr->nr); if (st->nat_rule.ptr == NULL) sp->pfs_1301.nat_rule = htonl(-1); else sp->pfs_1301.nat_rule = htonl(st->nat_rule.ptr->nr); pf_state_counter_hton(st->packets[0], sp->pfs_1301.packets[0]); pf_state_counter_hton(st->packets[1], sp->pfs_1301.packets[1]); pf_state_counter_hton(st->bytes[0], sp->pfs_1301.bytes[0]); pf_state_counter_hton(st->bytes[1], sp->pfs_1301.bytes[1]); } void pf_state_export(struct pf_state_export *sp, struct pf_kstate *st) { bzero(sp, sizeof(*sp)); sp->version = PF_STATE_VERSION; /* copy from state key */ sp->key[PF_SK_WIRE].addr[0] = st->key[PF_SK_WIRE]->addr[0]; sp->key[PF_SK_WIRE].addr[1] = st->key[PF_SK_WIRE]->addr[1]; sp->key[PF_SK_WIRE].port[0] = st->key[PF_SK_WIRE]->port[0]; sp->key[PF_SK_WIRE].port[1] = st->key[PF_SK_WIRE]->port[1]; sp->key[PF_SK_STACK].addr[0] = st->key[PF_SK_STACK]->addr[0]; 
sp->key[PF_SK_STACK].addr[1] = st->key[PF_SK_STACK]->addr[1]; sp->key[PF_SK_STACK].port[0] = st->key[PF_SK_STACK]->port[0]; sp->key[PF_SK_STACK].port[1] = st->key[PF_SK_STACK]->port[1]; sp->proto = st->key[PF_SK_WIRE]->proto; sp->af = st->key[PF_SK_WIRE]->af; /* copy from state */ strlcpy(sp->ifname, st->kif->pfik_name, sizeof(sp->ifname)); strlcpy(sp->orig_ifname, st->orig_kif->pfik_name, sizeof(sp->orig_ifname)); bcopy(&st->rt_addr, &sp->rt_addr, sizeof(sp->rt_addr)); sp->creation = htonl(time_uptime - st->creation); sp->expire = pf_state_expires(st); if (sp->expire <= time_uptime) sp->expire = htonl(0); else sp->expire = htonl(sp->expire - time_uptime); sp->direction = st->direction; sp->log = st->log; sp->timeout = st->timeout; /* 8 bits for the old libpfctl, 16 bits for the new libpfctl */ sp->state_flags_compat = st->state_flags; sp->state_flags = htons(st->state_flags); if (st->src_node) sp->sync_flags |= PFSYNC_FLAG_SRCNODE; if (st->nat_src_node) sp->sync_flags |= PFSYNC_FLAG_NATSRCNODE; sp->id = st->id; sp->creatorid = st->creatorid; pf_state_peer_hton(&st->src, &sp->src); pf_state_peer_hton(&st->dst, &sp->dst); if (st->rule.ptr == NULL) sp->rule = htonl(-1); else sp->rule = htonl(st->rule.ptr->nr); if (st->anchor.ptr == NULL) sp->anchor = htonl(-1); else sp->anchor = htonl(st->anchor.ptr->nr); if (st->nat_rule.ptr == NULL) sp->nat_rule = htonl(-1); else sp->nat_rule = htonl(st->nat_rule.ptr->nr); sp->packets[0] = st->packets[0]; sp->packets[1] = st->packets[1]; sp->bytes[0] = st->bytes[0]; sp->bytes[1] = st->bytes[1]; sp->qid = htons(st->qid); sp->pqid = htons(st->pqid); sp->dnpipe = htons(st->dnpipe); sp->dnrpipe = htons(st->dnrpipe); sp->rtableid = htonl(st->rtableid); sp->min_ttl = st->min_ttl; sp->set_tos = st->set_tos; sp->max_mss = htons(st->max_mss); sp->rt = st->rt; if (st->rt_kif) strlcpy(sp->rt_ifname, st->rt_kif->pfik_name, sizeof(sp->rt_ifname)); sp->set_prio[0] = st->set_prio[0]; sp->set_prio[1] = st->set_prio[1]; } static void 
pf_tbladdr_copyout(struct pf_addr_wrap *aw) { struct pfr_ktable *kt; KASSERT(aw->type == PF_ADDR_TABLE, ("%s: type %u", __func__, aw->type)); kt = aw->p.tbl; if (!(kt->pfrkt_flags & PFR_TFLAG_ACTIVE) && kt->pfrkt_root != NULL) kt = kt->pfrkt_root; aw->p.tbl = NULL; aw->p.tblcnt = (kt->pfrkt_flags & PFR_TFLAG_ACTIVE) ? kt->pfrkt_cnt : -1; } static int pf_add_status_counters(nvlist_t *nvl, const char *name, counter_u64_t *counters, size_t number, char **names) { nvlist_t *nvc; nvc = nvlist_create(0); if (nvc == NULL) return (ENOMEM); for (int i = 0; i < number; i++) { nvlist_append_number_array(nvc, "counters", counter_u64_fetch(counters[i])); nvlist_append_string_array(nvc, "names", names[i]); nvlist_append_number_array(nvc, "ids", i); } nvlist_add_nvlist(nvl, name, nvc); nvlist_destroy(nvc); return (0); } static int pf_getstatus(struct pfioc_nv *nv) { nvlist_t *nvl = NULL, *nvc = NULL; void *nvlpacked = NULL; int error; struct pf_status s; char *pf_reasons[PFRES_MAX+1] = PFRES_NAMES; char *pf_lcounter[KLCNT_MAX+1] = KLCNT_NAMES; char *pf_fcounter[FCNT_MAX+1] = FCNT_NAMES; PF_RULES_RLOCK_TRACKER; #define ERROUT(x) ERROUT_FUNCTION(errout, x) PF_RULES_RLOCK(); nvl = nvlist_create(0); if (nvl == NULL) ERROUT(ENOMEM); nvlist_add_bool(nvl, "running", V_pf_status.running); nvlist_add_number(nvl, "since", V_pf_status.since); nvlist_add_number(nvl, "debug", V_pf_status.debug); nvlist_add_number(nvl, "hostid", V_pf_status.hostid); nvlist_add_number(nvl, "states", V_pf_status.states); nvlist_add_number(nvl, "src_nodes", V_pf_status.src_nodes); nvlist_add_number(nvl, "reass", V_pf_status.reass); nvlist_add_bool(nvl, "syncookies_active", V_pf_status.syncookies_active); /* counters */ error = pf_add_status_counters(nvl, "counters", V_pf_status.counters, PFRES_MAX, pf_reasons); if (error != 0) ERROUT(error); /* lcounters */ error = pf_add_status_counters(nvl, "lcounters", V_pf_status.lcounters, KLCNT_MAX, pf_lcounter); if (error != 0) ERROUT(error); /* fcounters */ nvc = 
nvlist_create(0); if (nvc == NULL) ERROUT(ENOMEM); for (int i = 0; i < FCNT_MAX; i++) { nvlist_append_number_array(nvc, "counters", pf_counter_u64_fetch(&V_pf_status.fcounters[i])); nvlist_append_string_array(nvc, "names", pf_fcounter[i]); nvlist_append_number_array(nvc, "ids", i); } nvlist_add_nvlist(nvl, "fcounters", nvc); nvlist_destroy(nvc); nvc = NULL; /* scounters */ error = pf_add_status_counters(nvl, "scounters", V_pf_status.scounters, SCNT_MAX, pf_fcounter); if (error != 0) ERROUT(error); nvlist_add_string(nvl, "ifname", V_pf_status.ifname); nvlist_add_binary(nvl, "chksum", V_pf_status.pf_chksum, PF_MD5_DIGEST_LENGTH); pfi_update_status(V_pf_status.ifname, &s); /* pcounters / bcounters */ for (int i = 0; i < 2; i++) { for (int j = 0; j < 2; j++) { for (int k = 0; k < 2; k++) { nvlist_append_number_array(nvl, "pcounters", s.pcounters[i][j][k]); } nvlist_append_number_array(nvl, "bcounters", s.bcounters[i][j]); } } nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); PF_RULES_RUNLOCK(); error = copyout(nvlpacked, nv->data, nv->len); goto done; #undef ERROUT errout: PF_RULES_RUNLOCK(); done: free(nvlpacked, M_NVLIST); nvlist_destroy(nvc); nvlist_destroy(nvl); return (error); } /* * XXX - Check for version mismatch!!! */ static void pf_clear_all_states(void) { struct pf_kstate *s; u_int i; for (i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; relock: PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { s->timeout = PFTM_PURGE; /* Don't send out individual delete messages. 
*/ s->state_flags |= PFSTATE_NOSYNC; pf_unlink_state(s); goto relock; } PF_HASHROW_UNLOCK(ih); } } static int pf_clear_tables(void) { struct pfioc_table io; int error; bzero(&io, sizeof(io)); error = pfr_clr_tables(&io.pfrio_table, &io.pfrio_ndel, io.pfrio_flags); return (error); } static void pf_clear_srcnodes(struct pf_ksrc_node *n) { struct pf_kstate *s; int i; for (i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { if (n == NULL || n == s->src_node) s->src_node = NULL; if (n == NULL || n == s->nat_src_node) s->nat_src_node = NULL; } PF_HASHROW_UNLOCK(ih); } if (n == NULL) { struct pf_srchash *sh; for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) { PF_HASHROW_LOCK(sh); LIST_FOREACH(n, &sh->nodes, entry) { n->expire = 1; n->states = 0; } PF_HASHROW_UNLOCK(sh); } } else { /* XXX: hash slot should already be locked here. */ n->expire = 1; n->states = 0; } } static void pf_kill_srcnodes(struct pfioc_src_node_kill *psnk) { struct pf_ksrc_node_list kill; LIST_INIT(&kill); for (int i = 0; i <= pf_srchashmask; i++) { struct pf_srchash *sh = &V_pf_srchash[i]; struct pf_ksrc_node *sn, *tmp; PF_HASHROW_LOCK(sh); LIST_FOREACH_SAFE(sn, &sh->nodes, entry, tmp) if (PF_MATCHA(psnk->psnk_src.neg, &psnk->psnk_src.addr.v.a.addr, &psnk->psnk_src.addr.v.a.mask, &sn->addr, sn->af) && PF_MATCHA(psnk->psnk_dst.neg, &psnk->psnk_dst.addr.v.a.addr, &psnk->psnk_dst.addr.v.a.mask, &sn->raddr, sn->af)) { pf_unlink_src_node(sn); LIST_INSERT_HEAD(&kill, sn, entry); sn->expire = 1; } PF_HASHROW_UNLOCK(sh); } for (int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; struct pf_kstate *s; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { if (s->src_node && s->src_node->expire == 1) s->src_node = NULL; if (s->nat_src_node && s->nat_src_node->expire == 1) s->nat_src_node = NULL; } PF_HASHROW_UNLOCK(ih); } psnk->psnk_killed = pf_free_src_nodes(&kill); } static int 
pf_keepcounters(struct pfioc_nv *nv) { nvlist_t *nvl = NULL; void *nvlpacked = NULL; int error = 0; #define ERROUT(x) ERROUT_FUNCTION(on_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); if (! nvlist_exists_bool(nvl, "keep_counters")) ERROUT(EBADMSG); V_pf_status.keep_counters = nvlist_get_bool(nvl, "keep_counters"); on_error: nvlist_destroy(nvl); free(nvlpacked, M_NVLIST); return (error); } static unsigned int pf_clear_states(const struct pf_kstate_kill *kill) { struct pf_state_key_cmp match_key; struct pf_kstate *s; struct pfi_kkif *kif; int idx; unsigned int killed = 0, dir; for (unsigned int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; relock_DIOCCLRSTATES: PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { /* For floating states look at the original kif. */ kif = s->kif == V_pfi_all ? s->orig_kif : s->kif; if (kill->psk_ifname[0] && strcmp(kill->psk_ifname, kif->pfik_name)) continue; if (kill->psk_kill_match) { bzero(&match_key, sizeof(match_key)); if (s->direction == PF_OUT) { dir = PF_IN; idx = PF_SK_STACK; } else { dir = PF_OUT; idx = PF_SK_WIRE; } match_key.af = s->key[idx]->af; match_key.proto = s->key[idx]->proto; PF_ACPY(&match_key.addr[0], &s->key[idx]->addr[1], match_key.af); match_key.port[0] = s->key[idx]->port[1]; PF_ACPY(&match_key.addr[1], &s->key[idx]->addr[0], match_key.af); match_key.port[1] = s->key[idx]->port[0]; } /* * Don't send out individual * delete messages. 
*/ s->state_flags |= PFSTATE_NOSYNC; pf_unlink_state(s); killed++; if (kill->psk_kill_match) killed += pf_kill_matching_state(&match_key, dir); goto relock_DIOCCLRSTATES; } PF_HASHROW_UNLOCK(ih); } if (V_pfsync_clear_states_ptr != NULL) V_pfsync_clear_states_ptr(V_pf_status.hostid, kill->psk_ifname); return (killed); } static void pf_killstates(struct pf_kstate_kill *kill, unsigned int *killed) { struct pf_kstate *s; if (kill->psk_pfcmp.id) { if (kill->psk_pfcmp.creatorid == 0) kill->psk_pfcmp.creatorid = V_pf_status.hostid; if ((s = pf_find_state_byid(kill->psk_pfcmp.id, kill->psk_pfcmp.creatorid))) { pf_unlink_state(s); *killed = 1; } return; } for (unsigned int i = 0; i <= pf_hashmask; i++) *killed += pf_killstates_row(kill, &V_pf_idhash[i]); return; } static int pf_killstates_nv(struct pfioc_nv *nv) { struct pf_kstate_kill kill; nvlist_t *nvl = NULL; void *nvlpacked = NULL; int error = 0; unsigned int killed = 0; #define ERROUT(x) ERROUT_FUNCTION(on_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); error = pf_nvstate_kill_to_kstate_kill(nvl, &kill); if (error) ERROUT(error); pf_killstates(&kill, &killed); free(nvlpacked, M_NVLIST); nvlpacked = NULL; nvlist_destroy(nvl); nvl = nvlist_create(0); if (nvl == NULL) ERROUT(ENOMEM); nvlist_add_number(nvl, "killed", killed); nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(nvlpacked, nv->data, nv->len); on_error: nvlist_destroy(nvl); free(nvlpacked, M_NVLIST); return (error); } static int pf_clearstates_nv(struct pfioc_nv *nv) { struct pf_kstate_kill kill; nvlist_t *nvl = NULL; void *nvlpacked = NULL; int error = 0; unsigned int killed; #define ERROUT(x) 
ERROUT_FUNCTION(on_error, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); error = pf_nvstate_kill_to_kstate_kill(nvl, &kill); if (error) ERROUT(error); killed = pf_clear_states(&kill); free(nvlpacked, M_NVLIST); nvlpacked = NULL; nvlist_destroy(nvl); nvl = nvlist_create(0); if (nvl == NULL) ERROUT(ENOMEM); nvlist_add_number(nvl, "killed", killed); nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(nvlpacked, nv->data, nv->len); #undef ERROUT on_error: nvlist_destroy(nvl); free(nvlpacked, M_NVLIST); return (error); } static int pf_getstate(struct pfioc_nv *nv) { nvlist_t *nvl = NULL, *nvls; void *nvlpacked = NULL; struct pf_kstate *s = NULL; int error = 0; uint64_t id, creatorid; #define ERROUT(x) ERROUT_FUNCTION(errout, x) if (nv->len > pf_ioctl_maxcount) ERROUT(ENOMEM); nvlpacked = malloc(nv->len, M_NVLIST, M_WAITOK); if (nvlpacked == NULL) ERROUT(ENOMEM); error = copyin(nv->data, nvlpacked, nv->len); if (error) ERROUT(error); nvl = nvlist_unpack(nvlpacked, nv->len, 0); if (nvl == NULL) ERROUT(EBADMSG); PFNV_CHK(pf_nvuint64(nvl, "id", &id)); PFNV_CHK(pf_nvuint64(nvl, "creatorid", &creatorid)); s = pf_find_state_byid(id, creatorid); if (s == NULL) ERROUT(ENOENT); free(nvlpacked, M_NVLIST); nvlpacked = NULL; nvlist_destroy(nvl); nvl = nvlist_create(0); if (nvl == NULL) ERROUT(ENOMEM); nvls = pf_state_to_nvstate(s); if (nvls == NULL) ERROUT(ENOMEM); nvlist_add_nvlist(nvl, "state", nvls); nvlist_destroy(nvls); nvlpacked = nvlist_pack(nvl, &nv->len); if (nvlpacked == NULL) ERROUT(ENOMEM); if (nv->size == 0) ERROUT(0); else if (nv->size < nv->len) ERROUT(ENOSPC); error = copyout(nvlpacked, nv->data, nv->len); #undef 
ERROUT errout: if (s != NULL) PF_STATE_UNLOCK(s); free(nvlpacked, M_NVLIST); nvlist_destroy(nvl); return (error); } /* * XXX - Check for version mismatch!!! */ /* * Duplicate pfctl -Fa operation to get rid of as much as we can. */ static int shutdown_pf(void) { int error = 0; u_int32_t t[5]; char nn = '\0'; do { if ((error = pf_begin_rules(&t[0], PF_RULESET_SCRUB, &nn)) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: SCRUB\n")); break; } if ((error = pf_begin_rules(&t[1], PF_RULESET_FILTER, &nn)) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: FILTER\n")); break; /* XXX: rollback? */ } if ((error = pf_begin_rules(&t[2], PF_RULESET_NAT, &nn)) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: NAT\n")); break; /* XXX: rollback? */ } if ((error = pf_begin_rules(&t[3], PF_RULESET_BINAT, &nn)) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: BINAT\n")); break; /* XXX: rollback? */ } if ((error = pf_begin_rules(&t[4], PF_RULESET_RDR, &nn)) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: RDR\n")); break; /* XXX: rollback? 
*/ } /* XXX: these should always succeed here */ pf_commit_rules(t[0], PF_RULESET_SCRUB, &nn); pf_commit_rules(t[1], PF_RULESET_FILTER, &nn); pf_commit_rules(t[2], PF_RULESET_NAT, &nn); pf_commit_rules(t[3], PF_RULESET_BINAT, &nn); pf_commit_rules(t[4], PF_RULESET_RDR, &nn); if ((error = pf_clear_tables()) != 0) break; if ((error = pf_begin_eth(&t[0], &nn)) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: eth\n")); break; } pf_commit_eth(t[0], &nn); #ifdef ALTQ if ((error = pf_begin_altq(&t[0])) != 0) { DPFPRINTF(PF_DEBUG_MISC, ("shutdown_pf: ALTQ\n")); break; } pf_commit_altq(t[0]); #endif pf_clear_all_states(); pf_clear_srcnodes(NULL); /* status does not use malloced mem so no need to cleanup */ /* fingerprints and interfaces have their own cleanup code */ } while(0); return (error); } static pfil_return_t pf_check_return(int chk, struct mbuf **m) { switch (chk) { case PF_PASS: if (*m == NULL) return (PFIL_CONSUMED); else return (PFIL_PASS); break; default: if (*m != NULL) { m_freem(*m); *m = NULL; } return (PFIL_DROPPED); } } static pfil_return_t pf_eth_check_in(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp) { int chk; chk = pf_test_eth(PF_IN, flags, ifp, m, inp); return (pf_check_return(chk, m)); } static pfil_return_t pf_eth_check_out(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp) { int chk; chk = pf_test_eth(PF_OUT, flags, ifp, m, inp); return (pf_check_return(chk, m)); } #ifdef INET static pfil_return_t pf_check_in(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp) { int chk; chk = pf_test(PF_IN, flags, ifp, m, inp, NULL); return (pf_check_return(chk, m)); } static pfil_return_t pf_check_out(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp) { int chk; chk = pf_test(PF_OUT, flags, ifp, m, inp, NULL); return (pf_check_return(chk, m)); } #endif #ifdef INET6 static pfil_return_t pf_check6_in(struct 
mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp) { int chk; /* * In case of loopback traffic IPv6 uses the real interface in * order to support scoped addresses. In order to support stateful * filtering we have to change this to lo0, as is the case in IPv4. */ CURVNET_SET(ifp->if_vnet); chk = pf_test6(PF_IN, flags, (*m)->m_flags & M_LOOP ? V_loif : ifp, m, inp, NULL); CURVNET_RESTORE(); return (pf_check_return(chk, m)); } static pfil_return_t pf_check6_out(struct mbuf **m, struct ifnet *ifp, int flags, void *ruleset __unused, struct inpcb *inp) { int chk; CURVNET_SET(ifp->if_vnet); chk = pf_test6(PF_OUT, flags, ifp, m, inp, NULL); CURVNET_RESTORE(); return (pf_check_return(chk, m)); } #endif /* INET6 */ VNET_DEFINE_STATIC(pfil_hook_t, pf_eth_in_hook); VNET_DEFINE_STATIC(pfil_hook_t, pf_eth_out_hook); #define V_pf_eth_in_hook VNET(pf_eth_in_hook) #define V_pf_eth_out_hook VNET(pf_eth_out_hook) #ifdef INET VNET_DEFINE_STATIC(pfil_hook_t, pf_ip4_in_hook); VNET_DEFINE_STATIC(pfil_hook_t, pf_ip4_out_hook); #define V_pf_ip4_in_hook VNET(pf_ip4_in_hook) #define V_pf_ip4_out_hook VNET(pf_ip4_out_hook) #endif #ifdef INET6 VNET_DEFINE_STATIC(pfil_hook_t, pf_ip6_in_hook); VNET_DEFINE_STATIC(pfil_hook_t, pf_ip6_out_hook); #define V_pf_ip6_in_hook VNET(pf_ip6_in_hook) #define V_pf_ip6_out_hook VNET(pf_ip6_out_hook) #endif static void hook_pf_eth(void) { struct pfil_hook_args pha = { .pa_version = PFIL_VERSION, .pa_modname = "pf", .pa_type = PFIL_TYPE_ETHERNET, }; struct pfil_link_args pla = { .pa_version = PFIL_VERSION, }; int ret __diagused; if (atomic_load_bool(&V_pf_pfil_eth_hooked)) return; pha.pa_mbuf_chk = pf_eth_check_in; pha.pa_flags = PFIL_IN; pha.pa_rulname = "eth-in"; V_pf_eth_in_hook = pfil_add_hook(&pha); pla.pa_flags = PFIL_IN | PFIL_HEADPTR | PFIL_HOOKPTR; pla.pa_head = V_link_pfil_head; pla.pa_hook = V_pf_eth_in_hook; ret = pfil_link(&pla); MPASS(ret == 0); pha.pa_mbuf_chk = pf_eth_check_out; pha.pa_flags = PFIL_OUT; 
pha.pa_rulname = "eth-out"; V_pf_eth_out_hook = pfil_add_hook(&pha); pla.pa_flags = PFIL_OUT | PFIL_HEADPTR | PFIL_HOOKPTR; pla.pa_head = V_link_pfil_head; pla.pa_hook = V_pf_eth_out_hook; ret = pfil_link(&pla); MPASS(ret == 0); atomic_store_bool(&V_pf_pfil_eth_hooked, true); } static void hook_pf(void) { struct pfil_hook_args pha = { .pa_version = PFIL_VERSION, .pa_modname = "pf", }; struct pfil_link_args pla = { .pa_version = PFIL_VERSION, }; int ret __diagused; if (atomic_load_bool(&V_pf_pfil_hooked)) return; #ifdef INET pha.pa_type = PFIL_TYPE_IP4; pha.pa_mbuf_chk = pf_check_in; pha.pa_flags = PFIL_IN; pha.pa_rulname = "default-in"; V_pf_ip4_in_hook = pfil_add_hook(&pha); pla.pa_flags = PFIL_IN | PFIL_HEADPTR | PFIL_HOOKPTR; pla.pa_head = V_inet_pfil_head; pla.pa_hook = V_pf_ip4_in_hook; ret = pfil_link(&pla); MPASS(ret == 0); pha.pa_mbuf_chk = pf_check_out; pha.pa_flags = PFIL_OUT; pha.pa_rulname = "default-out"; V_pf_ip4_out_hook = pfil_add_hook(&pha); pla.pa_flags = PFIL_OUT | PFIL_HEADPTR | PFIL_HOOKPTR; pla.pa_head = V_inet_pfil_head; pla.pa_hook = V_pf_ip4_out_hook; ret = pfil_link(&pla); MPASS(ret == 0); + if (V_pf_filter_local) { + pla.pa_flags = PFIL_OUT | PFIL_HEADPTR | PFIL_HOOKPTR; + pla.pa_head = V_inet_local_pfil_head; + pla.pa_hook = V_pf_ip4_out_hook; + ret = pfil_link(&pla); + MPASS(ret == 0); + } #endif #ifdef INET6 pha.pa_type = PFIL_TYPE_IP6; pha.pa_mbuf_chk = pf_check6_in; pha.pa_flags = PFIL_IN; pha.pa_rulname = "default-in6"; V_pf_ip6_in_hook = pfil_add_hook(&pha); pla.pa_flags = PFIL_IN | PFIL_HEADPTR | PFIL_HOOKPTR; pla.pa_head = V_inet6_pfil_head; pla.pa_hook = V_pf_ip6_in_hook; ret = pfil_link(&pla); MPASS(ret == 0); pha.pa_mbuf_chk = pf_check6_out; pha.pa_rulname = "default-out6"; pha.pa_flags = PFIL_OUT; V_pf_ip6_out_hook = pfil_add_hook(&pha); pla.pa_flags = PFIL_OUT | PFIL_HEADPTR | PFIL_HOOKPTR; pla.pa_head = V_inet6_pfil_head; pla.pa_hook = V_pf_ip6_out_hook; ret = pfil_link(&pla); MPASS(ret == 0); + if (V_pf_filter_local) { + 
pla.pa_flags = PFIL_OUT | PFIL_HEADPTR | PFIL_HOOKPTR; + pla.pa_head = V_inet6_local_pfil_head; + pla.pa_hook = V_pf_ip6_out_hook; + ret = pfil_link(&pla); + MPASS(ret == 0); + } #endif atomic_store_bool(&V_pf_pfil_hooked, true); } static void dehook_pf_eth(void) { if (!atomic_load_bool(&V_pf_pfil_eth_hooked)) return; pfil_remove_hook(V_pf_eth_in_hook); pfil_remove_hook(V_pf_eth_out_hook); atomic_store_bool(&V_pf_pfil_eth_hooked, false); } static void dehook_pf(void) { if (!atomic_load_bool(&V_pf_pfil_hooked)) return; #ifdef INET pfil_remove_hook(V_pf_ip4_in_hook); pfil_remove_hook(V_pf_ip4_out_hook); #endif #ifdef INET6 pfil_remove_hook(V_pf_ip6_in_hook); pfil_remove_hook(V_pf_ip6_out_hook); #endif atomic_store_bool(&V_pf_pfil_hooked, false); } static void pf_load_vnet(void) { V_pf_tag_z = uma_zcreate("pf tags", sizeof(struct pf_tagname), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0); rm_init_flags(&V_pf_rules_lock, "pf rulesets", RM_RECURSE); sx_init(&V_pf_ioctl_lock, "pf ioctl"); pf_init_tagset(&V_pf_tags, &pf_rule_tag_hashsize, PF_RULE_TAG_HASH_SIZE_DEFAULT); #ifdef ALTQ pf_init_tagset(&V_pf_qids, &pf_queue_tag_hashsize, PF_QUEUE_TAG_HASH_SIZE_DEFAULT); #endif V_pf_keth = &V_pf_main_keth_anchor.ruleset; pfattach_vnet(); V_pf_vnet_active = 1; } static int pf_load(void) { int error; sx_init(&pf_end_lock, "pf end thread"); pf_mtag_initialize(); pf_dev = make_dev(&pf_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, PF_NAME); if (pf_dev == NULL) return (ENOMEM); pf_end_threads = 0; error = kproc_create(pf_purge_thread, NULL, &pf_purge_proc, 0, 0, "pf purge"); if (error != 0) return (error); pfi_initialize(); return (0); } static void pf_unload_vnet(void) { int ret __diagused; V_pf_vnet_active = 0; V_pf_status.running = 0; dehook_pf(); dehook_pf_eth(); PF_RULES_WLOCK(); pf_syncookies_cleanup(); shutdown_pf(); PF_RULES_WUNLOCK(); /* Make sure we've cleaned up ethernet rules before we continue. 
*/ NET_EPOCH_DRAIN_CALLBACKS(); ret = swi_remove(V_pf_swi_cookie); MPASS(ret == 0); ret = intr_event_destroy(V_pf_swi_ie); MPASS(ret == 0); pf_unload_vnet_purge(); pf_normalize_cleanup(); PF_RULES_WLOCK(); pfi_cleanup_vnet(); PF_RULES_WUNLOCK(); pfr_cleanup(); pf_osfp_flush(); pf_cleanup(); if (IS_DEFAULT_VNET(curvnet)) pf_mtag_cleanup(); pf_cleanup_tagset(&V_pf_tags); #ifdef ALTQ pf_cleanup_tagset(&V_pf_qids); #endif uma_zdestroy(V_pf_tag_z); #ifdef PF_WANT_32_TO_64_COUNTER PF_RULES_WLOCK(); LIST_REMOVE(V_pf_kifmarker, pfik_allkiflist); MPASS(LIST_EMPTY(&V_pf_allkiflist)); MPASS(V_pf_allkifcount == 0); LIST_REMOVE(&V_pf_default_rule, allrulelist); V_pf_allrulecount--; LIST_REMOVE(V_pf_rulemarker, allrulelist); /* * There are known pf rule leaks when running the test suite. */ #ifdef notyet MPASS(LIST_EMPTY(&V_pf_allrulelist)); MPASS(V_pf_allrulecount == 0); #endif PF_RULES_WUNLOCK(); free(V_pf_kifmarker, PFI_MTYPE); free(V_pf_rulemarker, M_PFRULE); #endif /* Free counters last as we updated them during shutdown. 
*/ pf_counter_u64_deinit(&V_pf_default_rule.evaluations); for (int i = 0; i < 2; i++) { pf_counter_u64_deinit(&V_pf_default_rule.packets[i]); pf_counter_u64_deinit(&V_pf_default_rule.bytes[i]); } counter_u64_free(V_pf_default_rule.states_cur); counter_u64_free(V_pf_default_rule.states_tot); counter_u64_free(V_pf_default_rule.src_nodes); uma_zfree_pcpu(pf_timestamp_pcpu_zone, V_pf_default_rule.timestamp); for (int i = 0; i < PFRES_MAX; i++) counter_u64_free(V_pf_status.counters[i]); for (int i = 0; i < KLCNT_MAX; i++) counter_u64_free(V_pf_status.lcounters[i]); for (int i = 0; i < FCNT_MAX; i++) pf_counter_u64_deinit(&V_pf_status.fcounters[i]); for (int i = 0; i < SCNT_MAX; i++) counter_u64_free(V_pf_status.scounters[i]); rm_destroy(&V_pf_rules_lock); sx_destroy(&V_pf_ioctl_lock); } static void pf_unload(void) { sx_xlock(&pf_end_lock); pf_end_threads = 1; while (pf_end_threads < 2) { wakeup_one(pf_purge_thread); sx_sleep(pf_purge_proc, &pf_end_lock, 0, "pftmo", 0); } sx_xunlock(&pf_end_lock); if (pf_dev != NULL) destroy_dev(pf_dev); pfi_cleanup(); sx_destroy(&pf_end_lock); } static void vnet_pf_init(void *unused __unused) { pf_load_vnet(); } VNET_SYSINIT(vnet_pf_init, SI_SUB_PROTO_FIREWALL, SI_ORDER_THIRD, vnet_pf_init, NULL); static void vnet_pf_uninit(const void *unused __unused) { pf_unload_vnet(); } SYSUNINIT(pf_unload, SI_SUB_PROTO_FIREWALL, SI_ORDER_SECOND, pf_unload, NULL); VNET_SYSUNINIT(vnet_pf_uninit, SI_SUB_PROTO_FIREWALL, SI_ORDER_THIRD, vnet_pf_uninit, NULL); static int pf_modevent(module_t mod, int type, void *data) { int error = 0; switch(type) { case MOD_LOAD: error = pf_load(); break; case MOD_UNLOAD: /* Handled in SYSUNINIT(pf_unload) to ensure it's done after * the vnet_pf_uninit()s */ break; default: error = EINVAL; break; } return (error); } static moduledata_t pf_mod = { "pf", pf_modevent, 0 }; DECLARE_MODULE(pf, pf_mod, SI_SUB_PROTO_FIREWALL, SI_ORDER_SECOND); MODULE_VERSION(pf, PF_MODVER); diff --git a/tests/sys/netpfil/common/utils.subr 
b/tests/sys/netpfil/common/utils.subr index f4eec24618a7..e354f6638b87 100644 --- a/tests/sys/netpfil/common/utils.subr +++ b/tests/sys/netpfil/common/utils.subr @@ -1,145 +1,144 @@ #- # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2019 Ahsan Barkati # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # $FreeBSD$ # . 
$(atf_get_srcdir)/../../common/vnet.subr firewall_config() { jname=$1 shift fw=$1 shift while [ $# -gt 0 ]; do if [ $(is_firewall "$1") -eq 1 ]; then current_fw="$1" shift filename=${current_fw}.rule cwd=$(pwd) if [ -f ${current_fw}.rule ]; then rm ${current_fw}.rule fi fi rule=$1 echo $rule >> $filename shift done if [ ${fw} == "ipfw" ]; then jexec ${jname} ipfw -q -f flush jexec ${jname} /bin/sh $cwd/ipfw.rule elif [ ${fw} == "pf" ]; then + jexec ${jname} sysctl net.pf.filter_local=1 jexec ${jname} pfctl -e jexec ${jname} pfctl -F all jexec ${jname} pfctl -f $cwd/pf.rule - jexec ${jname} pfilctl link -o pf:default-out inet-local - jexec ${jname} pfilctl link -o pf:default-out6 inet6-local elif [ ${fw} == "ipf" ]; then jexec ${jname} ipf -E jexec ${jname} ipf -Fa -f $cwd/ipf.rule elif [ ${fw} == "ipfnat" ]; then jexec ${jname} service ipfilter start jexec ${jname} ipnat -CF -f $cwd/ipfnat.rule jexec ${jname} pfilctl link -o ipfilter:default-ip4 inet-local jexec ${jname} pfilctl link -o ipfilter:default-ip6 inet6-local else atf_fail "$fw is not a valid firewall to configure" fi } firewall_cleanup() { firewall=$1 echo "Cleaning $firewall" vnet_cleanup } firewall_init() { firewall=$1 vnet_init if [ ${firewall} == "ipfw" ]; then if ! kldstat -q -m ipfw; then atf_skip "This test requires ipfw" fi elif [ ${firewall} == "pf" ]; then if [ ! -c /dev/pf ]; then atf_skip "This test requires pf" fi elif [ ${firewall} == "ipf" ]; then if ! kldstat -q -m ipfilter; then atf_skip "This test requires ipf" fi elif [ ${firewall} == "ipfnat" ]; then if ! kldstat -q -m ipfilter; then atf_skip "This test requires ipf" fi else atf_fail "$fw is not a valid firewall to initialize" fi } dummynet_init() { firewall=$1 if ! kldstat -q -m dummynet; then atf_skip "This test requires dummynet" fi case $firewall in ipfw|pf) # Nothing. This is okay. ;; *) atf_skip "${firewall} does not support dummynet" ;; esac } nat_init() { firewall=$1 if [ ${firewall} == "ipfw" ]; then if ! 
kldstat -q -m ipfw_nat; then atf_skip "This test requires ipfw_nat" fi fi } is_firewall() { if [ "$1" = "pf" -o "$1" = "ipfw" -o "$1" = "ipf" -o "$1" = "ipfnat" ]; then echo 1 else echo 0 fi } diff --git a/tests/sys/netpfil/pf/fragmentation_compat.sh b/tests/sys/netpfil/pf/fragmentation_compat.sh index a783755e4592..3e559a216b54 100644 --- a/tests/sys/netpfil/pf/fragmentation_compat.sh +++ b/tests/sys/netpfil/pf/fragmentation_compat.sh @@ -1,396 +1,397 @@ # $FreeBSD$ # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2017 Kristof Provost # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. . 
$(atf_get_srcdir)/utils.subr atf_test_case "too_many_fragments" "cleanup" too_many_fragments_head() { atf_set descr 'IPv4 fragment limitation test' atf_set require.user root } too_many_fragments_body() { pft_init epair=$(vnet_mkepair) vnet_mkjail alcatraz ${epair}a ifconfig ${epair}b inet 192.0.2.1/24 up jexec alcatraz ifconfig ${epair}a 192.0.2.2/24 up ifconfig ${epair}b mtu 200 jexec alcatraz ifconfig ${epair}a mtu 200 jexec alcatraz pfctl -e pft_set_rules alcatraz \ "scrub all fragment reassemble" # So we know pf is limiting things jexec alcatraz sysctl net.inet.ip.maxfragsperpacket=1024 # Sanity check atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 # We can ping with < 64 fragments atf_check -s exit:0 -o ignore ping -c 1 -s 800 192.0.2.2 # Too many fragments should fail atf_check -s exit:2 -o ignore ping -c 1 -s 20000 192.0.2.2 } too_many_fragments_cleanup() { pft_cleanup } atf_test_case "v6" "cleanup" v6_head() { atf_set descr 'IPv6 fragmentation test' atf_set require.user root atf_set require.progs scapy } v6_body() { pft_init epair_send=$(vnet_mkepair) epair_link=$(vnet_mkepair) vnet_mkjail alcatraz ${epair_send}b ${epair_link}a vnet_mkjail singsing ${epair_link}b ifconfig ${epair_send}a inet6 2001:db8:42::1/64 no_dad up jexec alcatraz ifconfig ${epair_send}b inet6 2001:db8:42::2/64 no_dad up jexec alcatraz ifconfig ${epair_link}a inet6 2001:db8:43::2/64 no_dad up jexec alcatraz sysctl net.inet6.ip6.forwarding=1 jexec singsing ifconfig ${epair_link}b inet6 2001:db8:43::3/64 no_dad up jexec singsing route add -6 2001:db8:42::/64 2001:db8:43::2 route add -6 2001:db8:43::/64 2001:db8:42::2 jexec alcatraz ifconfig ${epair_send}b inet6 -ifdisabled jexec alcatraz ifconfig ${epair_link}a inet6 -ifdisabled jexec singsing ifconfig ${epair_link}b inet6 -ifdisabled ifconfig ${epair_send}a inet6 -ifdisabled ifconfig ${epair_send}a jexec alcatraz ifconfig ${epair_send}b lladdr=$(jexec alcatraz ifconfig ${epair_send}b | awk '/ scopeid / { print($2); }' | cut -f 1 -d %) 
jexec alcatraz pfctl -e pft_set_rules alcatraz \ "scrub fragment reassemble" \ "block in" \ "pass in inet6 proto icmp6 icmp6-type { neighbrsol, neighbradv }" \ - "pass in inet6 proto icmp6 icmp6-type { echoreq, echorep }" + "pass in inet6 proto icmp6 icmp6-type { echoreq, echorep }" \ + "set skip on lo" # Host test atf_check -s exit:0 -o ignore \ ping -6 -c 1 2001:db8:42::2 atf_check -s exit:0 -o ignore \ ping -6 -c 1 -s 4500 2001:db8:42::2 atf_check -s exit:0 -o ignore\ ping -6 -c 1 -b 70000 -s 65000 2001:db8:42::2 # Force an NDP lookup ping -6 -c 1 ${lladdr}%${epair_send}a atf_check -s exit:0 -o ignore\ ping -6 -c 1 -b 70000 -s 65000 ${lladdr}%${epair_send}a # Forwarding test atf_check -s exit:0 -o ignore \ ping -6 -c 1 2001:db8:43::3 atf_check -s exit:0 -o ignore \ ping -6 -c 1 -s 4500 2001:db8:43::3 atf_check -s exit:0 -o ignore\ ping -6 -c 1 -b 70000 -s 65000 2001:db8:43::3 $(atf_get_srcdir)/CVE-2019-5597.py \ ${epair_send}a \ 2001:db8:42::1 \ 2001:db8:43::3 } v6_cleanup() { pft_cleanup } atf_test_case "mtu_diff" "cleanup" mtu_diff_head() { atf_set descr 'Test reassembly across different MTUs, PR #255432' atf_set require.user root } mtu_diff_body() { pft_init epair_small=$(vnet_mkepair) epair_large=$(vnet_mkepair) vnet_mkjail first ${epair_small}b ${epair_large}a vnet_mkjail second ${epair_large}b ifconfig ${epair_small}a 192.0.2.1/25 up jexec first ifconfig ${epair_small}b 192.0.2.2/25 up jexec first sysctl net.inet.ip.forwarding=1 jexec first ifconfig ${epair_large}a 192.0.2.130/25 up jexec first ifconfig ${epair_large}a mtu 9000 jexec second ifconfig ${epair_large}b 192.0.2.131/25 up jexec second ifconfig ${epair_large}b mtu 9000 jexec second route add default 192.0.2.130 route add 192.0.2.128/25 192.0.2.2 jexec first pfctl -e pft_set_rules first \ "scrub all fragment reassemble" # Sanity checks atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:0 -o ignore ping -c 1 192.0.2.130 atf_check -s exit:0 -o ignore ping -c 1 192.0.2.131 # Large 
packet that'll get reassembled and sent out in one on the large # epair atf_check -s exit:0 -o ignore ping -c 1 -s 8000 192.0.2.131 } mtu_diff_cleanup() { pft_cleanup } frag_common() { name=$1 pft_init epair=$(vnet_mkepair) vnet_mkjail alcatraz ${epair}a ifconfig ${epair}b inet 192.0.2.1/24 up jexec alcatraz ifconfig ${epair}a 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz \ "scrub all fragment reassemble" # Sanity check atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:0 -o ignore $(atf_get_srcdir)/frag-${1}.py \ --to 192.0.2.2 \ --fromaddr 192.0.2.1 \ --sendif ${epair}b \ --recvif ${epair}b } atf_test_case "overreplace" "cleanup" overreplace_head() { atf_set descr 'ping fragment that overlaps fragment at index boundary and replace it' atf_set require.user root atf_set require.progs scapy } overreplace_body() { frag_common overreplace } overreplace_cleanup() { pft_cleanup } atf_test_case "overindex" "cleanup" overindex_head() { atf_set descr 'ping fragment that overlaps the first fragment at index boundary' atf_set require.user root atf_set require.progs scapy } overindex_body() { frag_common overindex } overindex_cleanup() { pft_cleanup } atf_test_case "overlimit" "cleanup" overlimit_head() { atf_set descr 'ping fragment at index boundary that cannot be requeued' atf_set require.user root atf_set require.progs scapy } overlimit_body() { frag_common overlimit } overlimit_cleanup() { pft_cleanup } atf_test_case "reassemble" "cleanup" reassemble_head() { atf_set descr 'Test reassembly' atf_set require.user root } reassemble_body() { pft_init epair=$(vnet_mkepair) vnet_mkjail alcatraz ${epair}a ifconfig ${epair}b inet 192.0.2.1/24 up jexec alcatraz ifconfig ${epair}a 192.0.2.2/24 up # Sanity check atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 jexec alcatraz pfctl -e pft_set_rules alcatraz \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Single fragment passes atf_check -s exit:0 -o ignore ping -c 1 
192.0.2.2 # But a fragmented ping does not atf_check -s exit:2 -o ignore ping -c 1 -s 2000 192.0.2.2 pft_set_rules alcatraz \ "scrub in" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Both single packet & fragmented pass when we scrub atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:0 -o ignore ping -c 1 -s 2000 192.0.2.2 pft_set_rules alcatraz \ "scrub in fragment no reassemble" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # And the fragmented ping doesn't pass if we do not reassemble atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:2 -o ignore ping -c 1 -s 2000 192.0.2.2 } reassemble_cleanup() { pft_cleanup } atf_test_case "no_df" "cleanup" no_df_head() { atf_set descr 'Test removing of DF flag' atf_set require.user root } no_df_body() { setup_router_server_ipv4 ifconfig ${epair_tester}a mtu 9000 jexec router ifconfig ${epair_tester}b mtu 9000 jexec router ifconfig ${epair_server}a mtu 1500 jexec server ifconfig ${epair_server}b mtu 1500 # Sanity check. ping_server_check_reply exit:0 --ping-type=icmp pft_set_rules router \ "scrub fragment reassemble" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Ping with normal, fragmentable packets. ping_server_check_reply exit:0 --ping-type=icmp --send-length=2000 # Ping with non-fragmentable packets, this will fail. ping_server_check_reply exit:1 --ping-type=icmp --send-length=2000 --send-flags DF pft_set_rules router \ "scrub any reassemble" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Ping with non-fragmentable packets again. # This time pf will strip the DF flag. 
ping_server_check_reply exit:0 --ping-type=icmp --send-length=2000 --send-flags DF } no_df_cleanup() { pft_cleanup } atf_init_test_cases() { atf_add_test_case "too_many_fragments" atf_add_test_case "v6" atf_add_test_case "mtu_diff" atf_add_test_case "overreplace" atf_add_test_case "overindex" atf_add_test_case "overlimit" atf_add_test_case "reassemble" } diff --git a/tests/sys/netpfil/pf/fragmentation_pass.sh b/tests/sys/netpfil/pf/fragmentation_pass.sh index d257de730d2d..e2d28c307502 100644 --- a/tests/sys/netpfil/pf/fragmentation_pass.sh +++ b/tests/sys/netpfil/pf/fragmentation_pass.sh @@ -1,482 +1,483 @@ # $FreeBSD$ # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2017 Kristof Provost # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. . 
$(atf_get_srcdir)/utils.subr common_dir=$(atf_get_srcdir)/../common atf_test_case "too_many_fragments" "cleanup" too_many_fragments_head() { atf_set descr 'IPv4 fragment limitation test' atf_set require.user root } too_many_fragments_body() { pft_init epair=$(vnet_mkepair) vnet_mkjail alcatraz ${epair}a ifconfig ${epair}b inet 192.0.2.1/24 up jexec alcatraz ifconfig ${epair}a 192.0.2.2/24 up ifconfig ${epair}b mtu 200 jexec alcatraz ifconfig ${epair}a mtu 200 jexec alcatraz pfctl -e pft_set_rules alcatraz \ "set reassemble yes" \ "pass keep state" # So we know pf is limiting things jexec alcatraz sysctl net.inet.ip.maxfragsperpacket=1024 # Sanity check atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 # We can ping with < 64 fragments atf_check -s exit:0 -o ignore ping -c 1 -s 800 192.0.2.2 # Too many fragments should fail atf_check -s exit:2 -o ignore ping -c 1 -s 20000 192.0.2.2 } too_many_fragments_cleanup() { pft_cleanup } atf_test_case "v6" "cleanup" v6_head() { atf_set descr 'IPv6 fragmentation test' atf_set require.user root atf_set require.progs scapy } v6_body() { pft_init epair_send=$(vnet_mkepair) epair_link=$(vnet_mkepair) vnet_mkjail alcatraz ${epair_send}b ${epair_link}a vnet_mkjail singsing ${epair_link}b ifconfig ${epair_send}a inet6 2001:db8:42::1/64 no_dad up jexec alcatraz ifconfig ${epair_send}b inet6 2001:db8:42::2/64 no_dad up jexec alcatraz ifconfig ${epair_link}a inet6 2001:db8:43::2/64 no_dad up jexec alcatraz sysctl net.inet6.ip6.forwarding=1 jexec singsing ifconfig ${epair_link}b inet6 2001:db8:43::3/64 no_dad up jexec singsing route add -6 2001:db8:42::/64 2001:db8:43::2 route add -6 2001:db8:43::/64 2001:db8:42::2 jexec alcatraz ifconfig ${epair_send}b inet6 -ifdisabled jexec alcatraz ifconfig ${epair_link}a inet6 -ifdisabled jexec singsing ifconfig ${epair_link}b inet6 -ifdisabled ifconfig ${epair_send}a inet6 -ifdisabled ifconfig ${epair_send}a jexec alcatraz ifconfig ${epair_send}b lladdr=$(jexec alcatraz ifconfig ${epair_send}b | 
awk '/ scopeid / { print($2); }' | cut -f 1 -d %) jexec alcatraz pfctl -e pft_set_rules alcatraz \ "set reassemble yes" \ "pass keep state" \ "block in" \ "pass in inet6 proto icmp6 icmp6-type { neighbrsol, neighbradv }" \ - "pass in inet6 proto icmp6 icmp6-type { echoreq, echorep }" + "pass in inet6 proto icmp6 icmp6-type { echoreq, echorep }" \ + "set skip on lo" # Host test atf_check -s exit:0 -o ignore \ ping -6 -c 1 2001:db8:42::2 atf_check -s exit:0 -o ignore \ ping -6 -c 1 -s 4500 2001:db8:42::2 atf_check -s exit:0 -o ignore\ ping -6 -c 1 -b 70000 -s 65000 2001:db8:42::2 # Force an NDP lookup ping -6 -c 1 ${lladdr}%${epair_send}a atf_check -s exit:0 -o ignore\ ping -6 -c 1 -b 70000 -s 65000 ${lladdr}%${epair_send}a # Forwarding test atf_check -s exit:0 -o ignore \ ping -6 -c 1 2001:db8:43::3 atf_check -s exit:0 -o ignore \ ping -6 -c 1 -s 4500 2001:db8:43::3 atf_check -s exit:0 -o ignore\ ping -6 -c 1 -b 70000 -s 65000 2001:db8:43::3 $(atf_get_srcdir)/CVE-2019-5597.py \ ${epair_send}a \ 2001:db8:42::1 \ 2001:db8:43::3 } v6_cleanup() { pft_cleanup } atf_test_case "mtu_diff" "cleanup" mtu_diff_head() { atf_set descr 'Test reassembly across different MTUs, PR #255432' atf_set require.user root } mtu_diff_body() { pft_init epair_small=$(vnet_mkepair) epair_large=$(vnet_mkepair) vnet_mkjail first ${epair_small}b ${epair_large}a vnet_mkjail second ${epair_large}b ifconfig ${epair_small}a 192.0.2.1/25 up jexec first ifconfig ${epair_small}b 192.0.2.2/25 up jexec first sysctl net.inet.ip.forwarding=1 jexec first ifconfig ${epair_large}a 192.0.2.130/25 up jexec first ifconfig ${epair_large}a mtu 9000 jexec second ifconfig ${epair_large}b 192.0.2.131/25 up jexec second ifconfig ${epair_large}b mtu 9000 jexec second route add default 192.0.2.130 route add 192.0.2.128/25 192.0.2.2 jexec first pfctl -e pft_set_rules first \ "set reassemble yes" \ "pass keep state" # Sanity checks atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:0 -o ignore ping -c 1 
192.0.2.130 atf_check -s exit:0 -o ignore ping -c 1 192.0.2.131 # Large packet that'll get reassembled and sent out in one on the large # epair atf_check -s exit:0 -o ignore ping -c 1 -s 8000 192.0.2.131 } mtu_diff_cleanup() { pft_cleanup } frag_common() { name=$1 pft_init epair=$(vnet_mkepair) vnet_mkjail alcatraz ${epair}a ifconfig ${epair}b inet 192.0.2.1/24 up jexec alcatraz ifconfig ${epair}a 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz \ "set reassemble yes" \ "pass keep state" # Sanity check atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:0 -o ignore $(atf_get_srcdir)/frag-${1}.py \ --to 192.0.2.2 \ --fromaddr 192.0.2.1 \ --sendif ${epair}b \ --recvif ${epair}b } atf_test_case "overreplace" "cleanup" overreplace_head() { atf_set descr 'ping fragment that overlaps fragment at index boundary and replace it' atf_set require.user root atf_set require.progs scapy } overreplace_body() { frag_common overreplace } overreplace_cleanup() { pft_cleanup } atf_test_case "overindex" "cleanup" overindex_head() { atf_set descr 'ping fragment that overlaps the first fragment at index boundary' atf_set require.user root atf_set require.progs scapy } overindex_body() { frag_common overindex } overindex_cleanup() { pft_cleanup } atf_test_case "overlimit" "cleanup" overlimit_head() { atf_set descr 'ping fragment at index boundary that cannot be requeued' atf_set require.user root atf_set require.progs scapy } overlimit_body() { frag_common overlimit } overlimit_cleanup() { pft_cleanup } atf_test_case "reassemble" "cleanup" reassemble_head() { atf_set descr 'Test reassembly' atf_set require.user root } reassemble_body() { pft_init epair=$(vnet_mkepair) vnet_mkjail alcatraz ${epair}a ifconfig ${epair}b inet 192.0.2.1/24 up jexec alcatraz ifconfig ${epair}a 192.0.2.2/24 up # Sanity check atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 jexec alcatraz pfctl -e pft_set_rules alcatraz \ "pass out" \ "block in" \ "pass in inet proto icmp all 
icmp-type echoreq" # Single fragment passes atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 # But a fragmented ping does not atf_check -s exit:2 -o ignore ping -c 1 -s 2000 192.0.2.2 pft_set_rules alcatraz \ "set reassemble yes" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Both single packet & fragmented pass when we scrub atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2 atf_check -s exit:0 -o ignore ping -c 1 -s 2000 192.0.2.2 } reassemble_cleanup() { pft_cleanup } atf_test_case "no_df" "cleanup" no_df_head() { atf_set descr 'Test removing of DF flag' atf_set require.user root } no_df_body() { setup_router_server_ipv4 ifconfig ${epair_tester}a mtu 9000 jexec router ifconfig ${epair_tester}b mtu 9000 jexec router ifconfig ${epair_server}a mtu 1500 jexec server ifconfig ${epair_server}b mtu 1500 # Sanity check. ping_server_check_reply exit:0 --ping-type=icmp pft_set_rules router \ "set reassemble no" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Ping with normal, fragmentable packets. ping_server_check_reply exit:1 --ping-type=icmp --send-length=2000 pft_set_rules router \ "set reassemble yes" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Ping with normal, fragmentable packets. ping_server_check_reply exit:0 --ping-type=icmp --send-length=2000 # Ping with non-fragmentable packets. ping_server_check_reply exit:1 --ping-type=icmp --send-length=2000 --send-flags DF pft_set_rules router \ "set reassemble yes no-df" \ "pass out" \ "block in" \ "pass in inet proto icmp all icmp-type echoreq" # Ping with non-fragmentable packets again. # This time pf will strip the DF flag. 
ping_server_check_reply exit:0 --ping-type=icmp --send-length=2000 --send-flags DF } no_df_cleanup() { pft_cleanup } atf_test_case "no_df" "cleanup" no_df_head() { atf_set descr 'Test removing of DF flag' atf_set require.user root } no_df_body() { setup_router_server_ipv4 # Tester can send long packets which will get fragmented by the router. # Replies from server will come in fragments which might get # reassembled resulting in a long reply packet sent back to tester. ifconfig ${epair_tester}a mtu 9000 jexec router ifconfig ${epair_tester}b mtu 9000 jexec router ifconfig ${epair_server}a mtu 1500 jexec server ifconfig ${epair_server}b mtu 1500 # Sanity check. ping_server_check_reply exit:0 --ping-type=icmp # Enable packet reassembly with clearing of the no-df flag. pft_set_rules router \ "scrub all fragment reassemble no-df" \ "block" \ "pass inet proto icmp all icmp-type echoreq" # Ping with non-fragmentable packets. # pf will strip the DF flag resulting in fragmentation and packets # getting properly forwarded. ping_server_check_reply exit:0 --ping-type=icmp --send-length=2000 --send-flags DF } no_df_cleanup() { pft_cleanup } atf_test_case "reassemble_slowpath" "cleanup" reassemble_slowpath_head() { atf_set descr 'Test reassembly on the slow path' atf_set require.user root } reassemble_slowpath_body() { if ! sysctl -q kern.features.ipsec >/dev/null ; then atf_skip "This test requires ipsec" fi setup_router_server_ipv4 # Now define an ipsec policy so we end up taking the slow path. # We don't actually need the traffic to go through ipsec, we just don't # want to go through ip_tryforward(). echo "flush; spdflush; spdadd 203.0.113.1/32 203.0.113.2/32 any -P out ipsec esp/transport//require; add 203.0.113.1 203.0.113.2 esp 0x1001 -E aes-gcm-16 \"12345678901234567890\";" \ | jexec router setkey -c # Sanity check. ping_server_check_reply exit:0 --ping-type=icmp # Enable packet reassembly with clearing of the no-df flag. 
pft_set_rules router \ "scrub in on ${epair_tester}b fragment no reassemble" \ "scrub on ${epair_server}a fragment reassemble" \ "pass" # Ensure that the packet makes it through the slow path atf_check -s exit:0 -o ignore \ ping -c 1 -s 2000 198.51.100.2 } reassemble_slowpath_cleanup() { pft_cleanup } atf_init_test_cases() { atf_add_test_case "too_many_fragments" atf_add_test_case "v6" atf_add_test_case "mtu_diff" atf_add_test_case "overreplace" atf_add_test_case "overindex" atf_add_test_case "overlimit" atf_add_test_case "reassemble" atf_add_test_case "no_df" atf_add_test_case "reassemble_slowpath" } diff --git a/tests/sys/netpfil/pf/killstate.sh b/tests/sys/netpfil/pf/killstate.sh index 4263938e26be..cd4eeee05a10 100644 --- a/tests/sys/netpfil/pf/killstate.sh +++ b/tests/sys/netpfil/pf/killstate.sh @@ -1,577 +1,585 @@ # $FreeBSD$ # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2021 Rubicon Communications, LLC (Netgate) # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. . $(atf_get_srcdir)/utils.subr common_dir=$(atf_get_srcdir)/../common find_state() { jexec alcatraz pfctl -ss | grep icmp | grep 192.0.2.2 } find_state_v6() { jexec alcatraz pfctl -ss | grep icmp | grep 2001:db8::2 } atf_test_case "v4" "cleanup" v4_head() { atf_set descr 'Test killing states by IPv4 address' atf_set require.user root atf_set require.progs scapy } v4_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ - "pass in proto icmp" + "pass in proto icmp" \ + "set skip on lo" # Sanity check & establish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Killing with the wrong IP doesn't affect our state jexec alcatraz pfctl -k 192.0.2.3 if ! find_state; then atf_fail "Killing with the wrong IP removed our state." fi # Killing with one correct address and one incorrect doesn't kill the state jexec alcatraz pfctl -k 192.0.2.1 -k 192.0.2.3 if ! find_state; then atf_fail "Killing with one wrong IP removed our state." 
fi # Killing with correct address does remove the state jexec alcatraz pfctl -k 192.0.2.1 if find_state; then atf_fail "Killing with the correct IP did not remove our state." fi } v4_cleanup() { pft_cleanup } atf_test_case "v6" "cleanup" v6_head() { atf_set descr 'Test killing states by IPv6 address' atf_set require.user root atf_set require.progs scapy } v6_body() { pft_init if [ "$(atf_config_get ci false)" = "true" ]; then atf_skip "https://bugs.freebsd.org/260458" fi epair=$(vnet_mkepair) ifconfig ${epair}a inet6 2001:db8::1/64 up no_dad vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b inet6 2001:db8::2/64 up no_dad jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ - "pass in proto icmp6" + "pass in proto icmp6" \ + "set skip on lo" # Sanity check & establish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 2001:db8::2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state_v6; then atf_fail "Setting new rules removed the state." fi # Killing with the wrong IP doesn't affect our state jexec alcatraz pfctl -k 2001:db8::3 if ! find_state_v6; then atf_fail "Killing with the wrong IP removed our state." fi # Killing with one correct address and one incorrect doesn't kill the state jexec alcatraz pfctl -k 2001:db8::1 -k 2001:db8::3 if ! find_state_v6; then atf_fail "Killing with one wrong IP removed our state." fi # Killing with correct address does remove the state jexec alcatraz pfctl -k 2001:db8::1 if find_state_v6; then atf_fail "Killing with the correct IP did not remove our state." 
fi } v6_cleanup() { pft_cleanup } atf_test_case "label" "cleanup" label_head() { atf_set descr 'Test killing states by label' atf_set require.user root atf_set require.progs scapy } label_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ "pass in proto tcp label bar" \ - "pass in proto icmp label foo" + "pass in proto icmp label foo" \ + "set skip on lo" # Sanity check & establish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Killing a label on a different rule keeps the state jexec alcatraz pfctl -k label -k bar if ! find_state; then atf_fail "Killing a different label removed the state." fi # Killing a non-existing label keeps the state jexec alcatraz pfctl -k label -k baz if ! find_state; then atf_fail "Killing a non-existing label removed the state." fi # Killing the correct label kills the state jexec alcatraz pfctl -k label -k foo if find_state; then atf_fail "Killing the state did not remove it." 
fi } label_cleanup() { pft_cleanup } atf_test_case "multilabel" "cleanup" multilabel_head() { atf_set descr 'Test killing states with multiple labels by label' atf_set require.user root atf_set require.progs scapy } multilabel_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ - "pass in proto icmp label foo label bar" + "pass in proto icmp label foo label bar" \ + "set skip on lo" # Sanity check & establish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Killing a different label keeps the state jexec alcatraz pfctl -k label -k baz if ! find_state; then atf_fail "Killing a different label removed the state." fi # Killing the state with the last label works jexec alcatraz pfctl -k label -k bar if find_state; then atf_fail "Killing with the last label did not remove the state." fi pft_set_rules alcatraz "block all" \ - "pass in proto icmp label foo label bar" + "pass in proto icmp label foo label bar" \ + "set skip on lo" # Reestablish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Killing with the first label works too jexec alcatraz pfctl -k label -k foo if find_state; then atf_fail "Killing with the first label did not remove the state." 
fi } multilabel_cleanup() { pft_cleanup } atf_test_case "gateway" "cleanup" gateway_head() { atf_set descr 'Test killing states by route-to/reply-to address' atf_set require.user root atf_set require.progs scapy } gateway_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ - "pass in reply-to (${epair}b 192.0.2.1) proto icmp" + "pass in reply-to (${epair}b 192.0.2.1) proto icmp" \ + "set skip on lo" # Sanity check & establish state # Note: use pft_ping so we always use the same ID, so pf considers all # echo requests part of the same flow. atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Killing with a different gateway does not affect our state jexec alcatraz pfctl -k gateway -k 192.0.2.2 if ! find_state; then atf_fail "Killing with a different gateway removed the state." fi # Killing states with the relevant gateway does terminate our state jexec alcatraz pfctl -k gateway -k 192.0.2.1 if find_state; then atf_fail "Killing with the gateway did not remove the state." fi } gateway_cleanup() { pft_cleanup } atf_test_case "match" "cleanup" match_head() { atf_set descr 'Test killing matching states' atf_set require.user root } wait_for_state() { jail=$1 addr=$2 while ! 
jexec $jail pfctl -s s | grep $addr >/dev/null; do sleep .1 done } match_body() { pft_init epair_one=$(vnet_mkepair) ifconfig ${epair_one}a 192.0.2.1/24 up epair_two=$(vnet_mkepair) vnet_mkjail alcatraz ${epair_one}b ${epair_two}a jexec alcatraz ifconfig ${epair_one}b 192.0.2.2/24 up jexec alcatraz ifconfig ${epair_two}a 198.51.100.1/24 up jexec alcatraz sysctl net.inet.ip.forwarding=1 jexec alcatraz pfctl -e vnet_mkjail singsing ${epair_two}b jexec singsing ifconfig ${epair_two}b 198.51.100.2/24 up jexec singsing route add default 198.51.100.1 jexec singsing /usr/sbin/inetd -p inetd-echo.pid \ $(atf_get_srcdir)/echo_inetd.conf route add 198.51.100.0/24 192.0.2.2 pft_set_rules alcatraz \ "nat on ${epair_two}a from 192.0.2.0/24 -> (${epair_two}a)" \ "pass all" nc 198.51.100.2 7 & wait_for_state alcatraz 192.0.2.1 # Expect two states states=$(jexec alcatraz pfctl -s s | grep 192.0.2.1 | wc -l) if [ $states -ne 2 ] ; then atf_fail "Expected two states, found $states" fi # If we don't kill the matching NAT state one should be left jexec alcatraz pfctl -k 192.0.2.1 states=$(jexec alcatraz pfctl -s s | grep 192.0.2.1 | wc -l) if [ $states -ne 1 ] ; then atf_fail "Expected one state, found $states" fi # Flush jexec alcatraz pfctl -F states nc 198.51.100.2 7 & wait_for_state alcatraz 192.0.2.1 # Kill matching states, expect all of them to be gone jexec alcatraz pfctl -M -k 192.0.2.1 states=$(jexec alcatraz pfctl -s s | grep 192.0.2.1 | wc -l) if [ $states -ne 0 ] ; then atf_fail "Expected zero states, found $states" fi } match_cleanup() { pft_cleanup } atf_test_case "interface" "cleanup" interface_head() { atf_set descr 'Test killing states based on interface' atf_set require.user root atf_set require.progs scapy } interface_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ - "pass in proto icmp" + "pass in 
proto icmp" \ + "set skip on lo" # Sanity check & establish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Flushing states on a different interface doesn't affect our state jexec alcatraz pfctl -i ${epair}a -Fs if ! find_state; then atf_fail "Flushing on a different interface removed the state." fi # Flushing on the correct interface does (even with floating states) jexec alcatraz pfctl -i ${epair}b -Fs if find_state; then atf_fail "Flushing on the interface did not remove the state." fi } interface_cleanup() { pft_cleanup } atf_test_case "id" "cleanup" id_head() { atf_set descr 'Test killing states by id' atf_set require.user root atf_set require.progs scapy } id_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz "block all" \ "pass in proto tcp" \ - "pass in proto icmp" + "pass in proto icmp" \ + "set skip on lo" # Sanity check & establish state atf_check -s exit:0 -o ignore ${common_dir}/pft_ping.py \ --sendif ${epair}a \ --to 192.0.2.2 \ --replyif ${epair}a # Change rules to now deny the ICMP traffic pft_set_rules noflush alcatraz "block all" if ! find_state; then atf_fail "Setting new rules removed the state." fi # Get the state ID id=$(jexec alcatraz pfctl -ss -vvv | grep -A 3 icmp | grep -A 3 192.0.2.2 | awk '/id:/ { printf("%s/%s", $2, $4); }') # Kill the wrong ID jexec alcatraz pfctl -k id -k 1 if ! find_state; then atf_fail "Killing a different ID removed the state." fi # Kill the correct ID jexec alcatraz pfctl -k id -k ${id} if find_state; then atf_fail "Killing the state did not remove it." 
fi } id_cleanup() { pft_cleanup } atf_init_test_cases() { atf_add_test_case "v4" atf_add_test_case "v6" atf_add_test_case "label" atf_add_test_case "multilabel" atf_add_test_case "gateway" atf_add_test_case "match" atf_add_test_case "interface" atf_add_test_case "id" } diff --git a/tests/sys/netpfil/pf/map_e.sh b/tests/sys/netpfil/pf/map_e.sh index ce0e567ae3c8..ea8ce33bf323 100644 --- a/tests/sys/netpfil/pf/map_e.sh +++ b/tests/sys/netpfil/pf/map_e.sh @@ -1,91 +1,92 @@ # $FreeBSD$ # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2021 KUROSAWA Takahiro # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. . 
$(atf_get_srcdir)/utils.subr atf_test_case "map_e" "cleanup" map_e_head() { atf_set descr 'map-e-portset test' atf_set require.user root } map_e_body() { NC_TRY_COUNT=12 pft_init epair_map_e=$(vnet_mkepair) epair_echo=$(vnet_mkepair) vnet_mkjail map_e ${epair_map_e}b ${epair_echo}a vnet_mkjail echo ${epair_echo}b ifconfig ${epair_map_e}a 192.0.2.2/24 up route add -net 198.51.100.0/24 192.0.2.1 jexec map_e ifconfig ${epair_map_e}b 192.0.2.1/24 up jexec map_e ifconfig ${epair_echo}a 198.51.100.1/24 up jexec map_e sysctl net.inet.ip.forwarding=1 jexec echo ifconfig ${epair_echo}b 198.51.100.2/24 up jexec echo /usr/sbin/inetd -p inetd-echo.pid $(atf_get_srcdir)/echo_inetd.conf # Enable pf! jexec map_e pfctl -e pft_set_rules map_e \ "nat pass on ${epair_echo}a inet from 192.0.2.0/24 to any -> (${epair_echo}a) map-e-portset 2/12/0x342" # Only allow specified ports. jexec echo pfctl -e pft_set_rules echo "block return all" \ "pass in on ${epair_echo}b inet proto tcp from 198.51.100.1 port 19720:19723 to (${epair_echo}b) port 7" \ "pass in on ${epair_echo}b inet proto tcp from 198.51.100.1 port 36104:36107 to (${epair_echo}b) port 7" \ - "pass in on ${epair_echo}b inet proto tcp from 198.51.100.1 port 52488:52491 to (${epair_echo}b) port 7" + "pass in on ${epair_echo}b inet proto tcp from 198.51.100.1 port 52488:52491 to (${epair_echo}b) port 7" \ + "set skip on lo" i=0 while [ ${i} -lt ${NC_TRY_COUNT} ] do echo "foo ${i}" | timeout 2 nc -N 198.51.100.2 7 if [ $? 
-ne 0 ]; then atf_fail "nc failed (${i})" fi i=$((${i}+1)) done } map_e_cleanup() { rm -f inetd-echo.pid pft_cleanup } atf_init_test_cases() { atf_add_test_case "map_e" } diff --git a/tests/sys/netpfil/pf/pass_block.sh b/tests/sys/netpfil/pf/pass_block.sh index 0f034b23a730..2a226f5c9651 100644 --- a/tests/sys/netpfil/pf/pass_block.sh +++ b/tests/sys/netpfil/pf/pass_block.sh @@ -1,265 +1,266 @@ # $FreeBSD$ # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2018 Kristof Provost # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. . 
$(atf_get_srcdir)/utils.subr common_dir=$(atf_get_srcdir)/../common atf_test_case "v4" "cleanup" v4_head() { atf_set descr 'Basic pass/block test for IPv4' atf_set require.user root } v4_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a 192.0.2.1/24 up # Set up a simple jail with one interface vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up # Trivial ping to the jail, without pf atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.2 # pf without policy will let us ping jexec alcatraz pfctl -e atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.2 # Block everything pft_set_rules alcatraz "block in" atf_check -s exit:2 -o ignore ping -c 1 -t 1 192.0.2.2 # Block everything but ICMP pft_set_rules alcatraz "block in" "pass in proto icmp" atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.2 } v4_cleanup() { pft_cleanup } atf_test_case "v6" "cleanup" v6_head() { atf_set descr 'Basic pass/block test for IPv6' atf_set require.user root } v6_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a inet6 2001:db8:42::1/64 up no_dad # Set up a simple jail with one interface vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b inet6 2001:db8:42::2/64 up no_dad # Trivial ping to the jail, without pf atf_check -s exit:0 -o ignore ping -6 -c 1 -W 1 2001:db8:42::2 # pf without policy will let us ping jexec alcatraz pfctl -e atf_check -s exit:0 -o ignore ping -6 -c 1 -W 1 2001:db8:42::2 # Block everything pft_set_rules alcatraz "block in" atf_check -s exit:2 -o ignore ping -6 -c 1 -W 1 2001:db8:42::2 # Block everything but ICMP pft_set_rules alcatraz "block in" "pass in proto icmp6" atf_check -s exit:0 -o ignore ping -6 -c 1 -W 1 2001:db8:42::2 # Allowing ICMPv4 does not allow ICMPv6 pft_set_rules alcatraz "block in" "pass in proto icmp" atf_check -s exit:2 -o ignore ping -6 -c 1 -W 1 2001:db8:42::2 } v6_cleanup() { pft_cleanup } atf_test_case "noalias" "cleanup" noalias_head() { atf_set descr 'Test the :0 noalias option' 
atf_set require.user root } noalias_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a inet6 2001:db8:42::1/64 up no_dad vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b inet6 2001:db8:42::2/64 up no_dad linklocaladdr=$(jexec alcatraz ifconfig ${epair}b inet6 \ | grep %${epair}b \ | awk '{ print $2; }' \ | cut -d % -f 1) # Sanity check atf_check -s exit:0 -o ignore ping -6 -c 3 -W 1 2001:db8:42::2 atf_check -s exit:0 -o ignore ping -6 -c 3 -W 1 ${linklocaladdr}%${epair}a jexec alcatraz pfctl -e pft_set_rules alcatraz "block out inet6 from (${epair}b:0) to any" atf_check -s exit:2 -o ignore ping -6 -c 3 -W 1 2001:db8:42::2 # We should still be able to ping the link-local address atf_check -s exit:0 -o ignore ping -6 -c 3 -W 1 ${linklocaladdr}%${epair}a pft_set_rules alcatraz "block out inet6 from (${epair}b) to any" # We cannot ping to the link-local address atf_check -s exit:2 -o ignore ping -6 -c 3 -W 1 ${linklocaladdr}%${epair}a } noalias_cleanup() { pft_cleanup } atf_test_case "nested_inline" "cleanup" nested_inline_head() { atf_set descr "Test nested inline anchors, PR196314" atf_set require.user root } nested_inline_body() { pft_init epair=$(vnet_mkepair) ifconfig ${epair}a inet 192.0.2.1/24 up vnet_mkjail alcatraz ${epair}b jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up jexec alcatraz pfctl -e pft_set_rules alcatraz \ "block in" \ "anchor \"an1\" {" \ "pass in quick proto tcp to port time" \ "anchor \"an2\" {" \ "pass in quick proto icmp" \ "}" \ "}" atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.2 } nested_inline_cleanup() { pft_cleanup } atf_test_case "urpf" "cleanup" urpf_head() { atf_set descr "Test unicast reverse path forwarding check" atf_set require.user root atf_set require.progs scapy } urpf_body() { pft_init epair_one=$(vnet_mkepair) epair_two=$(vnet_mkepair) vnet_mkjail alcatraz ${epair_one}b ${epair_two}b ifconfig ${epair_one}a 192.0.2.2/24 up ifconfig ${epair_two}a 198.51.100.2/24 up jexec alcatraz ifconfig 
${epair_one}b 192.0.2.1/24 up jexec alcatraz ifconfig ${epair_two}b 198.51.100.1/24 up jexec alcatraz sysctl net.inet.ip.forwarding=1 # Sanity checks atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.1 atf_check -s exit:0 -o ignore ping -c 1 -t 1 198.51.100.1 atf_check -s exit:0 ${common_dir}/pft_ping.py \ --sendif ${epair_one}a \ --to 192.0.2.1 \ --fromaddr 198.51.100.2 \ --replyif ${epair_two}a atf_check -s exit:0 ${common_dir}/pft_ping.py \ --sendif ${epair_two}a \ --to 198.51.100.1 \ --fromaddr 192.0.2.2 \ --replyif ${epair_one}a pft_set_rules alcatraz \ - "block quick from urpf-failed" + "block quick from urpf-failed" \ + "set skip on lo" jexec alcatraz pfctl -e # Correct source still works atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.1 atf_check -s exit:0 -o ignore ping -c 1 -t 1 198.51.100.1 # Unexpected source interface is blocked atf_check -s exit:1 ${common_dir}/pft_ping.py \ --sendif ${epair_one}a \ --to 192.0.2.1 \ --fromaddr 198.51.100.2 \ --replyif ${epair_two}a atf_check -s exit:1 ${common_dir}/pft_ping.py \ --sendif ${epair_two}a \ --to 198.51.100.1 \ --fromaddr 192.0.2.2 \ --replyif ${epair_one}a } urpf_cleanup() { pft_cleanup } atf_init_test_cases() { atf_add_test_case "v4" atf_add_test_case "v6" atf_add_test_case "noalias" atf_add_test_case "nested_inline" atf_add_test_case "urpf" } diff --git a/tests/sys/netpfil/pf/pfsync.sh b/tests/sys/netpfil/pf/pfsync.sh index 75788eed4bbe..1b61ec4f03a0 100644 --- a/tests/sys/netpfil/pf/pfsync.sh +++ b/tests/sys/netpfil/pf/pfsync.sh @@ -1,714 +1,715 @@ # $FreeBSD$ # # SPDX-License-Identifier: BSD-2-Clause # # Copyright (c) 2018 Orange Business Services # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. 
Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. . $(atf_get_srcdir)/utils.subr common_dir=$(atf_get_srcdir)/../common atf_test_case "basic" "cleanup" basic_head() { atf_set descr 'Basic pfsync test' atf_set require.user root } basic_body() { common_body } common_body() { defer=$1 pfsynct_init epair_sync=$(vnet_mkepair) epair_one=$(vnet_mkepair) epair_two=$(vnet_mkepair) vnet_mkjail one ${epair_one}a ${epair_sync}a vnet_mkjail two ${epair_two}a ${epair_sync}b # pfsync interface jexec one ifconfig ${epair_sync}a 192.0.2.1/24 up jexec one ifconfig ${epair_one}a 198.51.100.1/24 up jexec one ifconfig pfsync0 \ syncdev ${epair_sync}a \ maxupd 1 \ $defer \ up jexec two ifconfig ${epair_two}a 198.51.100.2/24 up jexec two ifconfig ${epair_sync}b 192.0.2.2/24 up jexec two ifconfig pfsync0 \ syncdev ${epair_sync}b \ maxupd 1 \ $defer \ up # Enable pf! 
jexec one pfctl -e pft_set_rules one \ "set skip on ${epair_sync}a" \ "pass out keep state" jexec two pfctl -e pft_set_rules two \ "set skip on ${epair_sync}b" \ "pass out keep state" ifconfig ${epair_one}b 198.51.100.254/24 up ping -c 1 -S 198.51.100.254 198.51.100.1 # Give pfsync time to do its thing sleep 2 if ! jexec two pfctl -s states | grep icmp | grep 198.51.100.1 | \ grep 198.51.100.254 ; then atf_fail "state not found on synced host" fi } basic_cleanup() { pfsynct_cleanup } atf_test_case "basic_defer" "cleanup" basic_defer_head() { atf_set descr 'Basic defer mode pfsync test' atf_set require.user root } basic_defer_body() { common_body defer } basic_defer_cleanup() { pfsynct_cleanup } atf_test_case "defer" "cleanup" defer_head() { atf_set descr 'Defer mode pfsync test' atf_set require.user root } defer_body() { pfsynct_init epair_sync=$(vnet_mkepair) epair_in=$(vnet_mkepair) epair_out=$(vnet_mkepair) vnet_mkjail alcatraz ${epair_sync}a ${epair_in}a ${epair_out}a jexec alcatraz ifconfig ${epair_sync}a 192.0.2.1/24 up jexec alcatraz ifconfig ${epair_out}a 198.51.100.1/24 up jexec alcatraz ifconfig ${epair_in}a 203.0.113.1/24 up jexec alcatraz arp -s 203.0.113.2 00:01:02:03:04:05 jexec alcatraz sysctl net.inet.ip.forwarding=1 # Set a long defer delay jexec alcatraz sysctl net.pfsync.defer_delay=2500 jexec alcatraz ifconfig pfsync0 \ syncdev ${epair_sync}a \ maxupd 1 \ defer \ up ifconfig ${epair_sync}b 192.0.2.2/24 up ifconfig ${epair_out}b 198.51.100.2/24 up ifconfig ${epair_in}b up route add -net 203.0.113.0/24 198.51.100.1 # Enable pf + jexec alcatraz sysctl net.pf.filter_local=0 jexec alcatraz pfctl -e pft_set_rules alcatraz \ "set skip on ${epair_sync}a" \ "pass keep state" atf_check -s exit:0 env PYTHONPATH=${common_dir} \ $(atf_get_srcdir)/pfsync_defer.py \ --syncdev ${epair_sync}b \ --indev ${epair_in}b \ --outdev ${epair_out}b # Now disable defer mode and expect failure. 
jexec alcatraz ifconfig pfsync0 -defer # Flush state pft_set_rules alcatraz \ "set skip on ${epair_sync}a" \ "pass keep state" atf_check -s exit:3 env PYTHONPATH=${common_dir} \ $(atf_get_srcdir)/pfsync_defer.py \ --syncdev ${epair_sync}b \ --indev ${epair_in}b \ --outdev ${epair_out}b } defer_cleanup() { pfsynct_cleanup } atf_test_case "bulk" "cleanup" bulk_head() { atf_set descr 'Test bulk updates' atf_set require.user root } bulk_body() { pfsynct_init epair_sync=$(vnet_mkepair) epair_one=$(vnet_mkepair) epair_two=$(vnet_mkepair) vnet_mkjail one ${epair_one}a ${epair_sync}a vnet_mkjail two ${epair_two}a ${epair_sync}b # pfsync interface jexec one ifconfig ${epair_sync}a 192.0.2.1/24 up jexec one ifconfig ${epair_one}a 198.51.100.1/24 up jexec one ifconfig pfsync0 \ syncdev ${epair_sync}a \ maxupd 1\ up jexec two ifconfig ${epair_two}a 198.51.100.2/24 up jexec two ifconfig ${epair_sync}b 192.0.2.2/24 up # Enable pf jexec one pfctl -e pft_set_rules one \ "set skip on ${epair_sync}a" \ "pass keep state" jexec two pfctl -e pft_set_rules two \ "set skip on ${epair_sync}b" \ "pass keep state" ifconfig ${epair_one}b 198.51.100.254/24 up # Create state prior to setting up pfsync ping -c 1 -S 198.51.100.254 198.51.100.1 # Wait before setting up pfsync on two, so we don't accidentally catch # the update anyway. sleep 1 # Now set up pfsync in jail two jexec two ifconfig pfsync0 \ syncdev ${epair_sync}b \ up # Give pfsync time to do its thing sleep 2 jexec two pfctl -s states if ! 
jexec two pfctl -s states | grep icmp | grep 198.51.100.1 | \ grep 198.51.100.2 ; then atf_fail "state not found on synced host" fi } bulk_cleanup() { pfsynct_cleanup } atf_test_case "pbr" "cleanup" pbr_head() { atf_set descr 'route_to and reply_to directives test' atf_set require.user root atf_set timeout '600' } pbr_body() { pbr_common_body } pbr_cleanup() { pbr_common_cleanup } atf_test_case "pfsync_pbr" "cleanup" pfsync_pbr_head() { atf_set descr 'route_to and reply_to directives pfsync test' atf_set require.user root atf_set timeout '600' } pfsync_pbr_body() { pbr_common_body backup_promotion } pfsync_pbr_cleanup() { pbr_common_cleanup } pbr_common_body() { # + builds the topology below and initiates a single ping session # from client to server. # + gw* forward traffic through pbr, not fib lookups. # + if backup_promotion arg is given, a carp failover event occurs # during the ping session on both gateways. # ┌──────┐ # │client│ # └───┬──┘ # │ # ┌───┴───┐ # │bridge0│ # └┬─────┬┘ # │ │ # ┌────────────────┴─┐ ┌─┴────────────────┐ # │gw_route_to_master├─┤gw_route_to_backup│ # └────────────────┬─┘ └─┬────────────────┘ # │ │ # ┌┴─────┴┐ # │bridge1│ # └┬─────┬┘ # │ │ # ┌────────────────┴─┐ ┌─┴────────────────┐ # │gw_reply_to_master├─┤gw_reply_to_backup│ # └────────────────┬─┘ └─┬────────────────┘ # │ │ # ┌┴─────┴┐ # │bridge2│ # └───┬───┘ # │ # ┌───┴──┐ # │server│ # └──────┘ if ! 
kldstat -q -m carp then atf_skip "This test requires carp" fi pfsynct_init bridge0=$(vnet_mkbridge) bridge1=$(vnet_mkbridge) bridge2=$(vnet_mkbridge) epair_sync_gw_route_to=$(vnet_mkepair) epair_sync_gw_reply_to=$(vnet_mkepair) epair_client_bridge0=$(vnet_mkepair) epair_gw_route_to_master_bridge0=$(vnet_mkepair) epair_gw_route_to_backup_bridge0=$(vnet_mkepair) epair_gw_route_to_master_bridge1=$(vnet_mkepair) epair_gw_route_to_backup_bridge1=$(vnet_mkepair) epair_gw_reply_to_master_bridge1=$(vnet_mkepair) epair_gw_reply_to_backup_bridge1=$(vnet_mkepair) epair_gw_reply_to_master_bridge2=$(vnet_mkepair) epair_gw_reply_to_backup_bridge2=$(vnet_mkepair) epair_server_bridge2=$(vnet_mkepair) ifconfig ${bridge0} up ifconfig ${epair_client_bridge0}b up ifconfig ${epair_gw_route_to_master_bridge0}b up ifconfig ${epair_gw_route_to_backup_bridge0}b up ifconfig ${bridge0} \ addm ${epair_client_bridge0}b \ addm ${epair_gw_route_to_master_bridge0}b \ addm ${epair_gw_route_to_backup_bridge0}b ifconfig ${bridge1} up ifconfig ${epair_gw_route_to_master_bridge1}b up ifconfig ${epair_gw_route_to_backup_bridge1}b up ifconfig ${epair_gw_reply_to_master_bridge1}b up ifconfig ${epair_gw_reply_to_backup_bridge1}b up ifconfig ${bridge1} \ addm ${epair_gw_route_to_master_bridge1}b \ addm ${epair_gw_route_to_backup_bridge1}b \ addm ${epair_gw_reply_to_master_bridge1}b \ addm ${epair_gw_reply_to_backup_bridge1}b ifconfig ${bridge2} up ifconfig ${epair_gw_reply_to_master_bridge2}b up ifconfig ${epair_gw_reply_to_backup_bridge2}b up ifconfig ${epair_server_bridge2}b up ifconfig ${bridge2} \ addm ${epair_gw_reply_to_master_bridge2}b \ addm ${epair_gw_reply_to_backup_bridge2}b \ addm ${epair_server_bridge2}b vnet_mkjail client ${epair_client_bridge0}a jexec client hostname client vnet_mkjail gw_route_to_master \ ${epair_gw_route_to_master_bridge0}a \ ${epair_gw_route_to_master_bridge1}a \ ${epair_sync_gw_route_to}a jexec gw_route_to_master hostname gw_route_to_master vnet_mkjail gw_route_to_backup 
\ ${epair_gw_route_to_backup_bridge0}a \ ${epair_gw_route_to_backup_bridge1}a \ ${epair_sync_gw_route_to}b jexec gw_route_to_backup hostname gw_route_to_backup vnet_mkjail gw_reply_to_master \ ${epair_gw_reply_to_master_bridge1}a \ ${epair_gw_reply_to_master_bridge2}a \ ${epair_sync_gw_reply_to}a jexec gw_reply_to_master hostname gw_reply_to_master vnet_mkjail gw_reply_to_backup \ ${epair_gw_reply_to_backup_bridge1}a \ ${epair_gw_reply_to_backup_bridge2}a \ ${epair_sync_gw_reply_to}b jexec gw_reply_to_backup hostname gw_reply_to_backup vnet_mkjail server ${epair_server_bridge2}a jexec server hostname server jexec client ifconfig ${epair_client_bridge0}a inet 198.18.0.1/24 up jexec client route add 198.18.2.0/24 198.18.0.10 jexec gw_route_to_master ifconfig ${epair_sync_gw_route_to}a \ inet 198.19.10.1/24 up jexec gw_route_to_master ifconfig ${epair_gw_route_to_master_bridge0}a \ inet 198.18.0.8/24 up jexec gw_route_to_master ifconfig ${epair_gw_route_to_master_bridge0}a \ alias 198.18.0.10/32 vhid 10 pass 3WjvVVw7 advskew 50 jexec gw_route_to_master ifconfig ${epair_gw_route_to_master_bridge1}a \ inet 198.18.1.8/24 up jexec gw_route_to_master ifconfig ${epair_gw_route_to_master_bridge1}a \ alias 198.18.1.10/32 vhid 11 pass 3WjvVVw7 advskew 50 jexec gw_route_to_master sysctl net.inet.ip.forwarding=1 jexec gw_route_to_master sysctl net.inet.carp.preempt=1 vnet_ifrename_jail gw_route_to_master ${epair_sync_gw_route_to}a if_pfsync vnet_ifrename_jail gw_route_to_master ${epair_gw_route_to_master_bridge0}a if_br0 vnet_ifrename_jail gw_route_to_master ${epair_gw_route_to_master_bridge1}a if_br1 jexec gw_route_to_master ifconfig pfsync0 \ syncpeer 198.19.10.2 \ syncdev if_pfsync \ maxupd 1 \ up pft_set_rules gw_route_to_master \ "keep_state = 'tag auth_packet keep state'" \ "set timeout { icmp.first 120, icmp.error 60 }" \ "block log all" \ "pass quick on if_pfsync proto pfsync keep state (no-sync)" \ "pass quick on { if_br0 if_br1 } proto carp keep state (no-sync)" \ 
 		"block drop in quick to 224.0.0.18/32" \
 		"pass out quick tagged auth_packet keep state" \
 		"pass in quick log on if_br0 route-to (if_br1 198.18.1.20) proto { icmp udp tcp } from 198.18.0.0/24 to 198.18.2.0/24 \$keep_state"
 	jexec gw_route_to_master pfctl -e

 	jexec gw_route_to_backup ifconfig ${epair_sync_gw_route_to}b \
 		inet 198.19.10.2/24 up
 	jexec gw_route_to_backup ifconfig ${epair_gw_route_to_backup_bridge0}a \
 		inet 198.18.0.9/24 up
 	jexec gw_route_to_backup ifconfig ${epair_gw_route_to_backup_bridge0}a \
 		alias 198.18.0.10/32 vhid 10 pass 3WjvVVw7 advskew 100
 	jexec gw_route_to_backup ifconfig ${epair_gw_route_to_backup_bridge1}a \
 		inet 198.18.1.9/24 up
 	jexec gw_route_to_backup ifconfig ${epair_gw_route_to_backup_bridge1}a \
 		alias 198.18.1.10/32 vhid 11 pass 3WjvVVw7 advskew 100
 	jexec gw_route_to_backup sysctl net.inet.ip.forwarding=1
 	jexec gw_route_to_backup sysctl net.inet.carp.preempt=1
 	vnet_ifrename_jail gw_route_to_backup ${epair_sync_gw_route_to}b if_pfsync
 	vnet_ifrename_jail gw_route_to_backup ${epair_gw_route_to_backup_bridge0}a if_br0
 	vnet_ifrename_jail gw_route_to_backup ${epair_gw_route_to_backup_bridge1}a if_br1
 	jexec gw_route_to_backup ifconfig pfsync0 \
 		syncpeer 198.19.10.1 \
 		syncdev if_pfsync \
 		up
 	pft_set_rules gw_route_to_backup \
 		"keep_state = 'tag auth_packet keep state'" \
 		"set timeout { icmp.first 120, icmp.error 60 }" \
 		"block log all" \
 		"pass quick on if_pfsync proto pfsync keep state (no-sync)" \
 		"pass quick on { if_br0 if_br1 } proto carp keep state (no-sync)" \
 		"block drop in quick to 224.0.0.18/32" \
 		"pass out quick tagged auth_packet keep state" \
 		"pass in quick log on if_br0 route-to (if_br1 198.18.1.20) proto { icmp udp tcp } from 198.18.0.0/24 to 198.18.2.0/24 \$keep_state"
 	jexec gw_route_to_backup pfctl -e

 	jexec gw_reply_to_master ifconfig ${epair_sync_gw_reply_to}a \
 		inet 198.19.20.1/24 up
 	jexec gw_reply_to_master ifconfig ${epair_gw_reply_to_master_bridge1}a \
 		inet 198.18.1.18/24 up
 	jexec gw_reply_to_master ifconfig
${epair_gw_reply_to_master_bridge1}a \
 		alias 198.18.1.20/32 vhid 21 pass 3WjvVVw7 advskew 50
 	jexec gw_reply_to_master ifconfig ${epair_gw_reply_to_master_bridge2}a \
 		inet 198.18.2.18/24 up
 	jexec gw_reply_to_master ifconfig ${epair_gw_reply_to_master_bridge2}a \
 		alias 198.18.2.20/32 vhid 22 pass 3WjvVVw7 advskew 50
 	jexec gw_reply_to_master sysctl net.inet.ip.forwarding=1
 	jexec gw_reply_to_master sysctl net.inet.carp.preempt=1
 	vnet_ifrename_jail gw_reply_to_master ${epair_sync_gw_reply_to}a if_pfsync
 	vnet_ifrename_jail gw_reply_to_master ${epair_gw_reply_to_master_bridge1}a if_br1
 	vnet_ifrename_jail gw_reply_to_master ${epair_gw_reply_to_master_bridge2}a if_br2
 	jexec gw_reply_to_master ifconfig pfsync0 \
 		syncpeer 198.19.20.2 \
 		syncdev if_pfsync \
 		maxupd 1 \
 		up
 	pft_set_rules gw_reply_to_master \
 		"set timeout { icmp.first 120, icmp.error 60 }" \
 		"block log all" \
 		"pass quick on if_pfsync proto pfsync keep state (no-sync)" \
 		"pass quick on { if_br1 if_br2 } proto carp keep state (no-sync)" \
 		"block drop in quick to 224.0.0.18/32" \
 		"pass out quick on if_br2 reply-to (if_br1 198.18.1.10) tagged auth_packet_reply_to keep state" \
 		"pass in quick log on if_br1 proto { icmp udp tcp } from 198.18.0.0/24 to 198.18.2.0/24 tag auth_packet_reply_to keep state"
 	jexec gw_reply_to_master pfctl -e

 	jexec gw_reply_to_backup ifconfig ${epair_sync_gw_reply_to}b \
 		inet 198.19.20.2/24 up
 	jexec gw_reply_to_backup ifconfig ${epair_gw_reply_to_backup_bridge1}a \
 		inet 198.18.1.19/24 up
 	jexec gw_reply_to_backup ifconfig ${epair_gw_reply_to_backup_bridge1}a \
 		alias 198.18.1.20/32 vhid 21 pass 3WjvVVw7 advskew 100
 	jexec gw_reply_to_backup ifconfig ${epair_gw_reply_to_backup_bridge2}a \
 		inet 198.18.2.19/24 up
 	jexec gw_reply_to_backup ifconfig ${epair_gw_reply_to_backup_bridge2}a \
 		alias 198.18.2.20/32 vhid 22 pass 3WjvVVw7 advskew 100
 	jexec gw_reply_to_backup sysctl net.inet.ip.forwarding=1
 	jexec gw_reply_to_backup sysctl net.inet.carp.preempt=1
 	vnet_ifrename_jail gw_reply_to_backup
${epair_sync_gw_reply_to}b if_pfsync
 	vnet_ifrename_jail gw_reply_to_backup ${epair_gw_reply_to_backup_bridge1}a if_br1
 	vnet_ifrename_jail gw_reply_to_backup ${epair_gw_reply_to_backup_bridge2}a if_br2
 	jexec gw_reply_to_backup ifconfig pfsync0 \
 		syncpeer 198.19.20.1 \
 		syncdev if_pfsync \
 		up
 	pft_set_rules gw_reply_to_backup \
 		"set timeout { icmp.first 120, icmp.error 60 }" \
 		"block log all" \
 		"pass quick on if_pfsync proto pfsync keep state (no-sync)" \
 		"pass quick on { if_br1 if_br2 } proto carp keep state (no-sync)" \
 		"block drop in quick to 224.0.0.18/32" \
 		"pass out quick on if_br2 reply-to (if_br1 198.18.1.10) tagged auth_packet_reply_to keep state" \
 		"pass in quick log on if_br1 proto { icmp udp tcp } from 198.18.0.0/24 to 198.18.2.0/24 tag auth_packet_reply_to keep state"
 	jexec gw_reply_to_backup pfctl -e

 	jexec server ifconfig ${epair_server_bridge2}a inet 198.18.2.1/24 up
 	jexec server route add 198.18.0.0/24 198.18.2.20

 	# Waiting for platform to settle
 	while ! jexec gw_route_to_backup ifconfig | grep 'carp: BACKUP'
 	do
 		sleep 1
 	done
 	while ! jexec gw_reply_to_backup ifconfig | grep 'carp: BACKUP'
 	do
 		sleep 1
 	done
 	while !
jexec client ping -c 10 198.18.2.1 | grep ', 0.0% packet loss'
 	do
 		sleep 1
 	done

 	# Checking cluster members pf.conf checksums match
 	gw_route_to_master_checksum=$(jexec gw_route_to_master pfctl -si -v | grep 'Checksum:' | cut -d ' ' -f 2)
 	gw_route_to_backup_checksum=$(jexec gw_route_to_backup pfctl -si -v | grep 'Checksum:' | cut -d ' ' -f 2)
 	gw_reply_to_master_checksum=$(jexec gw_reply_to_master pfctl -si -v | grep 'Checksum:' | cut -d ' ' -f 2)
 	gw_reply_to_backup_checksum=$(jexec gw_reply_to_backup pfctl -si -v | grep 'Checksum:' | cut -d ' ' -f 2)
 	if [ "$gw_route_to_master_checksum" != "$gw_route_to_backup_checksum" ]
 	then
 		atf_fail "gw_route_to cluster members pf.conf do not match each others"
 	fi
 	if [ "$gw_reply_to_master_checksum" != "$gw_reply_to_backup_checksum" ]
 	then
 		atf_fail "gw_reply_to cluster members pf.conf do not match each others"
 	fi

 	# Creating state entries
 	(jexec client ping -c 10 198.18.2.1 >ping.stdout) &

 	if [ "$1" = "backup_promotion" ]
 	then
 		sleep 1
 		jexec gw_route_to_backup ifconfig if_br0 vhid 10 advskew 0
 		jexec gw_route_to_backup ifconfig if_br1 vhid 11 advskew 0
 		jexec gw_reply_to_backup ifconfig if_br1 vhid 21 advskew 0
 		jexec gw_reply_to_backup ifconfig if_br2 vhid 22 advskew 0
 	fi
 	while ! grep -q -e 'packet loss' ping.stdout
 	do
 		sleep 1
 	done

 	atf_check -s exit:0 -e ignore -o ignore grep ', 0.0% packet loss' ping.stdout
 }

 pbr_common_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "ipsec" "cleanup"
 ipsec_head()
 {
 	atf_set descr 'Transport pfsync over IPSec'
 	atf_set require.user root
 }

 ipsec_body()
 {
 	if !
sysctl -q kern.features.ipsec >/dev/null ; then
 		atf_skip "This test requires ipsec"
 	fi

 	# Run the common test, to set up pfsync
 	common_body

 	# But we want unicast pfsync
 	jexec one ifconfig pfsync0 syncpeer 192.0.2.2
 	jexec two ifconfig pfsync0 syncpeer 192.0.2.1

 	# Flush existing states
 	jexec one pfctl -Fs
 	jexec two pfctl -Fs

 	# Now define an ipsec policy to run over the epair_sync interfaces
 	echo "flush;
 	spdflush;
 	spdadd 192.0.2.1/32 192.0.2.2/32 any -P out ipsec esp/transport//require;
 	spdadd 192.0.2.2/32 192.0.2.1/32 any -P in ipsec esp/transport//require;
 	add 192.0.2.1 192.0.2.2 esp 0x1000 -E aes-gcm-16 \"12345678901234567890\";
 	add 192.0.2.2 192.0.2.1 esp 0x1001 -E aes-gcm-16 \"12345678901234567890\";" \
 	    | jexec one setkey -c

 	echo "flush;
 	spdflush;
 	spdadd 192.0.2.2/32 192.0.2.1/32 any -P out ipsec esp/transport//require;
 	spdadd 192.0.2.1/32 192.0.2.2/32 any -P in ipsec esp/transport//require;
 	add 192.0.2.1 192.0.2.2 esp 0x1000 -E aes-gcm-16 \"12345678901234567891\";
 	add 192.0.2.2 192.0.2.1 esp 0x1001 -E aes-gcm-16 \"12345678901234567891\";" \
 	    | jexec two setkey -c

 	# We've set incompatible keys, so pfsync will be broken.
 	ping -c 1 -S 198.51.100.254 198.51.100.1

 	# Give pfsync time to do its thing
 	sleep 2

 	if jexec two pfctl -s states | grep icmp | grep 198.51.100.1 | \
 	    grep 198.51.100.2 ; then
 		atf_fail "state synced although IPSec should have prevented it"
 	fi

 	# Flush existing states
 	jexec one pfctl -Fs
 	jexec two pfctl -Fs

 	# Fix the IPSec key to match
 	echo "flush;
 	spdflush;
 	spdadd 192.0.2.2/32 192.0.2.1/32 any -P out ipsec esp/transport//require;
 	spdadd 192.0.2.1/32 192.0.2.2/32 any -P in ipsec esp/transport//require;
 	add 192.0.2.1 192.0.2.2 esp 0x1000 -E aes-gcm-16 \"12345678901234567890\";
 	add 192.0.2.2 192.0.2.1 esp 0x1001 -E aes-gcm-16 \"12345678901234567890\";" \
 	    | jexec two setkey -c

 	ping -c 1 -S 198.51.100.254 198.51.100.1

 	# Give pfsync time to do its thing
 	sleep 2

 	if !
jexec two pfctl -s states | grep icmp | grep 198.51.100.1 | \
 	    grep 198.51.100.2 ; then
 		atf_fail "state not found on synced host"
 	fi
 }

 ipsec_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "timeout" "cleanup"
 timeout_head()
 {
 	atf_set descr 'Trigger pfsync_timeout()'
 	atf_set require.user root
 }

 timeout_body()
 {
 	pft_init

 	vnet_mkjail one

 	jexec one ifconfig lo0 127.0.0.1/8 up
 	jexec one ifconfig lo0 inet6 ::1/128 up

 	pft_set_rules one \
 		"pass all"
 	jexec one pfctl -e
 	jexec one ifconfig pfsync0 defer up

 	jexec one ping -c 1 ::1
 	jexec one ping -c 1 127.0.0.1

 	# Give pfsync_timeout() time to fire (a callout on a 1 second delay)
 	sleep 2
 }

 timeout_cleanup()
 {
 	pft_cleanup
 }

 atf_init_test_cases()
 {
 	atf_add_test_case "basic"
 	atf_add_test_case "basic_defer"
 	atf_add_test_case "defer"
 	atf_add_test_case "bulk"
 	atf_add_test_case "pbr"
 	atf_add_test_case "pfsync_pbr"
 	atf_add_test_case "ipsec"
 	atf_add_test_case "timeout"
 }
diff --git a/tests/sys/netpfil/pf/route_to.sh b/tests/sys/netpfil/pf/route_to.sh
index 203d0a944a5b..18e0e02db65e 100644
--- a/tests/sys/netpfil/pf/route_to.sh
+++ b/tests/sys/netpfil/pf/route_to.sh
@@ -1,376 +1,377 @@
 # $FreeBSD$
 #
 # SPDX-License-Identifier: BSD-2-Clause
 #
 # Copyright (c) 2018 Kristof Provost
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
 # are met:
 # 1. Redistributions of source code must retain the above copyright
 #    notice, this list of conditions and the following disclaimer.
 # 2. Redistributions in binary form must reproduce the above copyright
 #    notice, this list of conditions and the following disclaimer in the
 #    documentation and/or other materials provided with the distribution.
 #
 # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 # ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 # SUCH DAMAGE.

 . $(atf_get_srcdir)/utils.subr

 common_dir=$(atf_get_srcdir)/../common

 atf_test_case "v4" "cleanup"
 v4_head()
 {
 	atf_set descr 'Basic route-to test'
 	atf_set require.user root
 }

 v4_body()
 {
 	pft_init

 	epair_send=$(vnet_mkepair)
 	ifconfig ${epair_send}a 192.0.2.1/24 up
 	epair_route=$(vnet_mkepair)
 	ifconfig ${epair_route}a 203.0.113.1/24 up

 	vnet_mkjail alcatraz ${epair_send}b ${epair_route}b
 	jexec alcatraz ifconfig ${epair_send}b 192.0.2.2/24 up
 	jexec alcatraz ifconfig ${epair_route}b 203.0.113.2/24 up
 	jexec alcatraz route add -net 198.51.100.0/24 192.0.2.1
 	jexec alcatraz pfctl -e

 	# Attempt to provoke PR 228782
 	pft_set_rules alcatraz "block all" "pass user 2" \
 		"pass out route-to (${epair_route}b 203.0.113.1) from 192.0.2.2 to 198.51.100.1 no state"
 	jexec alcatraz nc -w 3 -s 192.0.2.2 198.51.100.1 22

 	# atf wants us to not return an error, but our netcat will fail
 	true
 }

 v4_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "v6" "cleanup"
 v6_head()
 {
 	atf_set descr 'Basic route-to test (IPv6)'
 	atf_set require.user root
 }

 v6_body()
 {
 	pft_init

 	epair_send=$(vnet_mkepair)
 	ifconfig ${epair_send}a inet6 2001:db8:42::1/64 up no_dad -ifdisabled
 	epair_route=$(vnet_mkepair)
 	ifconfig ${epair_route}a inet6 2001:db8:43::1/64 up no_dad -ifdisabled

 	vnet_mkjail alcatraz ${epair_send}b ${epair_route}b
 	jexec alcatraz ifconfig ${epair_send}b inet6 2001:db8:42::2/64 up no_dad
 	jexec alcatraz ifconfig ${epair_route}b inet6 2001:db8:43::2/64 up no_dad
 	jexec alcatraz route add -6 2001:db8:666::/64
2001:db8:42::2
 	jexec alcatraz pfctl -e

 	# Attempt to provoke PR 228782
 	pft_set_rules alcatraz "block all" "pass user 2" \
 		"pass out route-to (${epair_route}b 2001:db8:43::1) from 2001:db8:42::2 to 2001:db8:666::1 no state"
 	jexec alcatraz nc -6 -w 3 -s 2001:db8:42::2 2001:db8:666::1 22

 	# atf wants us to not return an error, but our netcat will fail
 	true
 }

 v6_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "multiwan" "cleanup"
 multiwan_head()
 {
 	atf_set descr 'Multi-WAN redirection / reply-to test'
 	atf_set require.user root
 }

 multiwan_body()
 {
 	pft_init

 	epair_one=$(vnet_mkepair)
 	epair_two=$(vnet_mkepair)
 	epair_cl_one=$(vnet_mkepair)
 	epair_cl_two=$(vnet_mkepair)

 	vnet_mkjail srv ${epair_one}b ${epair_two}b
 	vnet_mkjail wan_one ${epair_one}a ${epair_cl_one}b
 	vnet_mkjail wan_two ${epair_two}a ${epair_cl_two}b
 	vnet_mkjail client ${epair_cl_one}a ${epair_cl_two}a

 	jexec client ifconfig ${epair_cl_one}a 203.0.113.1/25
 	jexec wan_one ifconfig ${epair_cl_one}b 203.0.113.2/25
 	jexec wan_one ifconfig ${epair_one}a 192.0.2.1/24 up
 	jexec wan_one sysctl net.inet.ip.forwarding=1
 	jexec srv ifconfig ${epair_one}b 192.0.2.2/24 up
 	jexec client route add 192.0.2.0/24 203.0.113.2

 	jexec client ifconfig ${epair_cl_two}a 203.0.113.128/25
 	jexec wan_two ifconfig ${epair_cl_two}b 203.0.113.129/25
 	jexec wan_two ifconfig ${epair_two}a 198.51.100.1/24 up
 	jexec wan_two sysctl net.inet.ip.forwarding=1
 	jexec srv ifconfig ${epair_two}b 198.51.100.2/24 up
 	jexec client route add 198.51.100.0/24 203.0.113.129

 	jexec srv ifconfig lo0 127.0.0.1/8 up
 	jexec srv route add default 192.0.2.1
 	jexec srv sysctl net.inet.ip.forwarding=1

 	# Run echo server in srv jail
 	jexec srv /usr/sbin/inetd -p multiwan.pid $(atf_get_srcdir)/echo_inetd.conf

 	jexec srv pfctl -e
 	pft_set_rules srv \
 		"nat on ${epair_one}b inet from 127.0.0.0/8 to any -> (${epair_one}b)" \
 		"nat on ${epair_two}b inet from 127.0.0.0/8 to any -> (${epair_two}b)" \
 		"rdr on ${epair_one}b inet proto tcp from any to 192.0.2.2 port 7 -> 127.0.0.1 port 7" \
 		"rdr on
${epair_two}b inet proto tcp from any to 198.51.100.2 port 7 -> 127.0.0.1 port 7" \
 		"block in" \
 		"block out" \
 		"pass in quick on ${epair_one}b reply-to (${epair_one}b 192.0.2.1) inet proto tcp from any to 127.0.0.1 port 7" \
 		"pass in quick on ${epair_two}b reply-to (${epair_two}b 198.51.100.1) inet proto tcp from any to 127.0.0.1 port 7"

 	# These will always succeed, because we don't change interface to route
 	# correctly here.
 	result=$(echo "one" | jexec wan_one nc -N -w 3 192.0.2.2 7)
 	if [ "${result}" != "one" ]; then
 		atf_fail "Redirect on one failed"
 	fi
 	result=$(echo "two" | jexec wan_two nc -N -w 3 198.51.100.2 7)
 	if [ "${result}" != "two" ]; then
 		atf_fail "Redirect on two failed"
 	fi

 	result=$(echo "one" | jexec client nc -N -w 3 192.0.2.2 7)
 	if [ "${result}" != "one" ]; then
 		atf_fail "Redirect from client on one failed"
 	fi

 	# This should trigger the issue fixed in 829a69db855b48ff7e8242b95e193a0783c489d9
 	result=$(echo "two" | jexec client nc -N -w 3 198.51.100.2 7)
 	if [ "${result}" != "two" ]; then
 		atf_fail "Redirect from client on two failed"
 	fi
 }

 multiwan_cleanup()
 {
 	rm -f multiwan.pid
 	pft_cleanup
 }

 atf_test_case "multiwanlocal" "cleanup"
 multiwanlocal_head()
 {
 	atf_set descr 'Multi-WAN local origin source-based redirection / route-to test'
 	atf_set require.user root
 }

 multiwanlocal_body()
 {
 	pft_init

 	epair_one=$(vnet_mkepair)
 	epair_two=$(vnet_mkepair)
 	epair_cl_one=$(vnet_mkepair)
 	epair_cl_two=$(vnet_mkepair)

 	vnet_mkjail srv1 ${epair_one}b
 	vnet_mkjail srv2 ${epair_two}b
 	vnet_mkjail wan_one ${epair_one}a ${epair_cl_one}b
 	vnet_mkjail wan_two ${epair_two}a ${epair_cl_two}b
 	vnet_mkjail client ${epair_cl_one}a ${epair_cl_two}a

 	jexec client ifconfig ${epair_cl_one}a 203.0.113.1/25
 	jexec wan_one ifconfig ${epair_cl_one}b 203.0.113.2/25
 	jexec wan_one ifconfig ${epair_one}a 192.0.2.1/24 up
 	jexec wan_one sysctl net.inet.ip.forwarding=1
 	jexec srv1 ifconfig ${epair_one}b 192.0.2.2/24 up

 	jexec client ifconfig ${epair_cl_two}a 203.0.113.128/25
 	jexec wan_two ifconfig
${epair_cl_two}b 203.0.113.129/25
 	jexec wan_two ifconfig ${epair_two}a 198.51.100.1/24 up
 	jexec wan_two sysctl net.inet.ip.forwarding=1
 	jexec srv2 ifconfig ${epair_two}b 198.51.100.2/24 up

 	jexec client route add default 203.0.113.2
 	jexec srv1 route add default 192.0.2.1
 	jexec srv2 route add default 198.51.100.1

 	# Run data source in srv1 and srv2
 	jexec srv1 sh -c 'dd if=/dev/zero bs=1024 count=100 | nc -l 7 -w 2 -N &'
 	jexec srv2 sh -c 'dd if=/dev/zero bs=1024 count=100 | nc -l 7 -w 2 -N &'

 	jexec client pfctl -e
 	pft_set_rules client \
 		"block in" \
 		"block out" \
 		"pass out quick route-to (${epair_cl_two}a 203.0.113.129) inet proto tcp from 203.0.113.128 to any port 7" \
-		"pass out on ${epair_cl_one}a inet proto tcp from any to any port 7"
+		"pass out on ${epair_cl_one}a inet proto tcp from any to any port 7" \
+		"set skip on lo"

 	# This should work
 	result=$(jexec client nc -N -w 1 192.0.2.2 7 | wc -c)
 	if [ ${result} -ne 102400 ]; then
 		jexec client pfctl -ss
 		atf_fail "Redirect from client on one failed: ${result}"
 	fi

 	# This should trigger the issue
 	result=$(jexec client nc -N -w 1 -s 203.0.113.128 198.51.100.2 7 | wc -c)
 	jexec client pfctl -ss
 	if [ ${result} -ne 102400 ]; then
 		atf_fail "Redirect from client on two failed: ${result}"
 	fi
 }

 multiwanlocal_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "icmp_nat" "cleanup"
 icmp_nat_head()
 {
 	atf_set descr 'Test that ICMP packets are correct for route-to + NAT'
 	atf_set require.user root
 	atf_set require.progs scapy
 }

 icmp_nat_body()
 {
 	pft_init

 	epair_one=$(vnet_mkepair)
 	epair_two=$(vnet_mkepair)
 	epair_three=$(vnet_mkepair)

 	vnet_mkjail gw ${epair_one}b ${epair_two}a ${epair_three}a
 	vnet_mkjail srv ${epair_two}b
 	vnet_mkjail srv2 ${epair_three}b

 	ifconfig ${epair_one}a 192.0.2.2/24 up
 	route add -net 198.51.100.0/24 192.0.2.1
 	jexec gw sysctl net.inet.ip.forwarding=1
 	jexec gw ifconfig ${epair_one}b 192.0.2.1/24 up
 	jexec gw ifconfig ${epair_two}a 198.51.100.1/24 up
 	jexec gw ifconfig ${epair_three}a 203.0.113.1/24 up mtu 500
 	jexec srv ifconfig
${epair_two}b 198.51.100.2/24 up
 	jexec srv route add default 198.51.100.1
 	jexec srv2 ifconfig ${epair_three}b 203.0.113.2/24 up mtu 500
 	jexec srv2 route add default 203.0.113.1

 	# Sanity check
 	atf_check -s exit:0 -o ignore ping -c 1 198.51.100.2

 	jexec gw pfctl -e
 	pft_set_rules gw \
 		"nat on ${epair_two}a inet from 192.0.2.0/24 to any -> (${epair_two}a)" \
 		"nat on ${epair_three}a inet from 192.0.2.0/24 to any -> (${epair_three}a)" \
 		"pass out route-to (${epair_three}a 203.0.113.2) proto icmp icmp-type echoreq"

 	# Now ensure that we get an ICMP error with the correct IP addresses in it.
 	atf_check -s exit:0 ${common_dir}/pft_icmp_check.py \
 		--to 198.51.100.2 \
 		--fromaddr 192.0.2.2 \
 		--recvif ${epair_one}a \
 		--sendif ${epair_one}a

 	# ping reports the ICMP error, so check of that too.
 	atf_check -s exit:2 -o match:'frag needed and DF set' \
 		ping -D -c 1 -s 1000 198.51.100.2
 }

 icmp_nat_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "dummynet" "cleanup"
 dummynet_head()
 {
 	atf_set descr 'Test that dummynet applies to route-to packets'
 	atf_set require.user root
 }

 dummynet_body()
 {
 	dummynet_init

 	epair_srv=$(vnet_mkepair)
 	epair_gw=$(vnet_mkepair)

 	vnet_mkjail srv ${epair_srv}a
 	jexec srv ifconfig ${epair_srv}a 192.0.2.1/24 up
 	jexec srv route add default 192.0.2.2

 	vnet_mkjail gw ${epair_srv}b ${epair_gw}a
 	jexec gw ifconfig ${epair_srv}b 192.0.2.2/24 up
 	jexec gw ifconfig ${epair_gw}a 198.51.100.1/24 up
 	jexec gw sysctl net.inet.ip.forwarding=1

 	ifconfig ${epair_gw}b 198.51.100.2/24 up
 	route add -net 192.0.2.0/24 198.51.100.1

 	# Sanity check
 	atf_check -s exit:0 -o ignore ping -c 1 -t 1 192.0.2.1

 	jexec gw dnctl pipe 1 config delay 1200
 	pft_set_rules gw \
 		"pass out route-to (${epair_srv}b 192.0.2.1) to 192.0.2.1 dnpipe 1"
 	jexec gw pfctl -e

 	# The ping request will pass, but take 1.2 seconds
 	# So this works:
 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.1
 	# But this times out:
 	atf_check -s exit:2 -o ignore ping -c 1 -t 1 192.0.2.1

 	# return path dummynet
 	pft_set_rules gw \
 		"pass out route-to
(${epair_srv}b 192.0.2.1) to 192.0.2.1 dnpipe (0, 1)"

 	# The ping request will pass, but take 1.2 seconds
 	# So this works:
 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.1
 	# But this times out:
 	atf_check -s exit:2 -o ignore ping -c 1 -t 1 192.0.2.1
 }

 dummynet_cleanup()
 {
 	pft_cleanup
 }

 atf_init_test_cases()
 {
 	atf_add_test_case "v4"
 	atf_add_test_case "v6"
 	atf_add_test_case "multiwan"
 	atf_add_test_case "multiwanlocal"
 	atf_add_test_case "icmp_nat"
 	atf_add_test_case "dummynet"
 }
diff --git a/tests/sys/netpfil/pf/set_skip.sh b/tests/sys/netpfil/pf/set_skip.sh
index 9b3d655a6d1d..c666622e3d15 100644
--- a/tests/sys/netpfil/pf/set_skip.sh
+++ b/tests/sys/netpfil/pf/set_skip.sh
@@ -1,171 +1,171 @@
 # $FreeBSD$
 #
 # SPDX-License-Identifier: BSD-2-Clause
 #
 # Copyright (c) 2018 Kristof Provost
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
 # are met:
 # 1. Redistributions of source code must retain the above copyright
 #    notice, this list of conditions and the following disclaimer.
 # 2. Redistributions in binary form must reproduce the above copyright
 #    notice, this list of conditions and the following disclaimer in the
 #    documentation and/or other materials provided with the distribution.
 #
 # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 # ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 # SUCH DAMAGE.

 . $(atf_get_srcdir)/utils.subr

 atf_test_case "set_skip_group" "cleanup"
 set_skip_group_head()
 {
 	atf_set descr 'Basic set skip test'
 	atf_set require.user root
 }

 set_skip_group_body()
 {
 	# See PR 229241
 	pft_init

 	vnet_mkjail alcatraz
 	jexec alcatraz ifconfig lo0 127.0.0.1/8 up
 	jexec alcatraz ifconfig lo0 group foo
 	jexec alcatraz pfctl -e
 	pft_set_rules alcatraz "set skip on foo" \
 		"block in proto icmp"

 	jexec alcatraz ifconfig
 	atf_check -s exit:0 -o ignore jexec alcatraz ping -c 1 127.0.0.1
 }

 set_skip_group_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "set_skip_group_lo" "cleanup"
 set_skip_group_lo_head()
 {
 	atf_set descr 'Basic set skip test, lo'
 	atf_set require.user root
 }

 set_skip_group_lo_body()
 {
 	# See PR 229241
 	pft_init

 	vnet_mkjail alcatraz
 	jexec alcatraz ifconfig lo0 127.0.0.1/8 up
 	jexec alcatraz pfctl -e
 	pft_set_rules alcatraz "set skip on lo" \
 		"block on lo0"

 	atf_check -s exit:0 -o ignore jexec alcatraz ping -c 1 127.0.0.1
 	pft_set_rules noflush alcatraz "set skip on lo" \
 		"block on lo0"
 	atf_check -s exit:0 -o ignore jexec alcatraz ping -c 1 127.0.0.1
 	jexec alcatraz pfctl -s rules
 }

 set_skip_group_lo_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "set_skip_dynamic" "cleanup"
 set_skip_dynamic_head()
 {
 	atf_set descr "Cope with group changes"
 	atf_set require.user root
 }

 set_skip_dynamic_body()
 {
 	pft_init

 	set -x

 	vnet_mkjail alcatraz
 	jexec alcatraz pfctl -e
 	pft_set_rules alcatraz "set skip on epair" \
-		"block"
+		"block on !
lo"

 	epair=$(vnet_mkepair)
 	ifconfig ${epair}a 192.0.2.2/24 up
 	vnet_ifmove ${epair}b alcatraz

 	jexec alcatraz ifconfig ${epair}b 192.0.2.1/24 up

 	atf_check -s exit:0 -o ignore jexec alcatraz ping -c 1 192.0.2.2
 }

 set_skip_dynamic_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "pr255852" "cleanup"
 pr255852_head()
 {
 	atf_set descr "PR 255852"
 	atf_set require.user root
 }

 pr255852_body()
 {
 	pft_init

 	epair=$(vnet_mkepair)

 	ifconfig ${epair}a 192.0.2.1/24 up

 	vnet_mkjail alcatraz ${epair}b
 	jexec alcatraz ifconfig lo0 127.0.0.1/8 up
 	jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up

 	# Sanity check
 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2

 	jexec alcatraz pfctl -e
 	pft_set_rules alcatraz "set skip on { lo0, epair }" \
 		"block"
 	jexec alcatraz pfctl -vsI

 	# We're skipping on epair, so this should work
 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2

 	# Note: flushing avoid the issue
 	pft_set_rules noflush alcatraz "set skip on { lo0 }" \
 		"block"

 	jexec alcatraz pfctl -vsI

 	# No longer skipping, so this should fail
 	atf_check -s exit:2 -o ignore ping -c 1 -t 1 192.0.2.2
 }

 pr255852_cleanup()
 {
 	pft_cleanup
 }

 atf_init_test_cases()
 {
 	atf_add_test_case "set_skip_group"
 	atf_add_test_case "set_skip_group_lo"
 	atf_add_test_case "set_skip_dynamic"
 	atf_add_test_case "pr255852"
 }
diff --git a/tests/sys/netpfil/pf/table.sh b/tests/sys/netpfil/pf/table.sh
index 64dbd3a36201..b820d0c11e75 100644
--- a/tests/sys/netpfil/pf/table.sh
+++ b/tests/sys/netpfil/pf/table.sh
@@ -1,332 +1,334 @@
 # $FreeBSD$
 #
 # SPDX-License-Identifier: BSD-2-Clause
 #
 # Copyright (c) 2020 Mark Johnston
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
 # are met:
 # 1. Redistributions of source code must retain the above copyright
 #    notice, this list of conditions and the following disclaimer.
 # 2.
Redistributions in binary form must reproduce the above copyright
 #    notice, this list of conditions and the following disclaimer in the
 #    documentation and/or other materials provided with the distribution.
 #
 # THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 # ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 # SUCH DAMAGE.

 . $(atf_get_srcdir)/utils.subr

 TABLE_STATS_ZERO_REGEXP='Packets: 0[[:space:]]*Bytes: 0[[:space:]]'
 TABLE_STATS_NONZERO_REGEXP='Packets: [1-9][0-9]*[[:space:]]*Bytes: [1-9][0-9]*[[:space:]]'

 atf_test_case "v4_counters" "cleanup"
 v4_counters_head()
 {
 	atf_set descr 'Verify per-address counters for v4'
 	atf_set require.user root
 }

 v4_counters_body()
 {
 	pft_init

 	epair_send=$(vnet_mkepair)
 	ifconfig ${epair_send}a 192.0.2.1/24 up

 	vnet_mkjail alcatraz ${epair_send}b
 	jexec alcatraz ifconfig ${epair_send}b 192.0.2.2/24 up
 	jexec alcatraz pfctl -e

 	pft_set_rules alcatraz \
 		"table <foo> counters { 192.0.2.1 }" \
 		"block all" \
 		"pass in from <foo> to any" \
-		"pass out from any to <foo>"
+		"pass out from any to <foo>" \
+		"set skip on lo"

 	atf_check -s exit:0 -o ignore ping -c 3 192.0.2.2

 	atf_check -s exit:0 -e ignore \
 		-o match:'In/Block:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'In/Pass:.*'"$TABLE_STATS_NONZERO_REGEXP" \
 		-o match:'Out/Block:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'Out/Pass:.*'"$TABLE_STATS_NONZERO_REGEXP" \
 		jexec alcatraz pfctl -t foo -T
show -vv
 }

 v4_counters_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "v6_counters" "cleanup"
 v6_counters_head()
 {
 	atf_set descr 'Verify per-address counters for v6'
 	atf_set require.user root
 }

 v6_counters_body()
 {
 	pft_init

 	epair_send=$(vnet_mkepair)
 	ifconfig ${epair_send}a inet6 2001:db8:42::1/64 up no_dad -ifdisabled

 	vnet_mkjail alcatraz ${epair_send}b
 	jexec alcatraz ifconfig ${epair_send}b inet6 2001:db8:42::2/64 up no_dad
 	jexec alcatraz pfctl -e

 	pft_set_rules alcatraz \
 		"table <foo6> counters { 2001:db8:42::1 }" \
 		"block all" \
 		"pass in from <foo6> to any" \
-		"pass out from any to <foo6>"
+		"pass out from any to <foo6>" \
+		"set skip on lo"

 	atf_check -s exit:0 -o ignore ping -6 -c 3 2001:db8:42::2

 	atf_check -s exit:0 -e ignore \
 		-o match:'In/Block:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'In/Pass:.*'"$TABLE_STATS_NONZERO_REGEXP" \
 		-o match:'Out/Block:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'Out/Pass:.*'"$TABLE_STATS_NONZERO_REGEXP" \
 		jexec alcatraz pfctl -t foo6 -T show -vv
 }

 v6_counters_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "pr251414" "cleanup"
 pr251414_head()
 {
 	atf_set descr 'Test PR 251414'
 	atf_set require.user root
 }

 pr251414_body()
 {
 	pft_init

 	epair_send=$(vnet_mkepair)
 	ifconfig ${epair_send}a 192.0.2.1/24 up

 	vnet_mkjail alcatraz ${epair_send}b
 	jexec alcatraz ifconfig ${epair_send}b 192.0.2.2/24 up
 	jexec alcatraz pfctl -e

 	pft_set_rules alcatraz \
 		"pass all" \
 		"table <tab> { self }" \
 		"pass in log to <tab>"

 	pft_set_rules noflush alcatraz \
 		"pass all" \
 		"table <tab> counters { self }" \
 		"pass in log to <tab>"

 	atf_check -s exit:0 -o ignore ping -c 3 192.0.2.2

 	jexec alcatraz pfctl -t tab -T show -vv
 }

 pr251414_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "automatic" "cleanup"
 automatic_head()
 {
 	atf_set descr "Test automatic - optimizer generated - tables"
 	atf_set require.user root
 }

 automatic_body()
 {
 	pft_init

 	epair=$(vnet_mkepair)
 	ifconfig ${epair}a 192.0.2.1/24 up

 	vnet_mkjail alcatraz ${epair}b
 	jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up
 	jexec alcatraz pfctl -e

 	pft_set_rules alcatraz \
 		"block in" \
 		"pass in
proto icmp from 192.0.2.1" \
 		"pass in proto icmp from 192.0.2.3" \
 		"pass in proto icmp from 192.0.2.4" \
 		"pass in proto icmp from 192.0.2.5" \
 		"pass in proto icmp from 192.0.2.6" \
 		"pass in proto icmp from 192.0.2.7" \
 		"pass in proto icmp from 192.0.2.8" \
 		"pass in proto icmp from 192.0.2.9"

 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2
 }

 automatic_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "network" "cleanup"
 network_head()
 {
 	atf_set descr 'Test :network'
 	atf_set require.user root
 }

 network_body()
 {
 	pft_init

 	epair=$(vnet_mkepair)
 	ifconfig ${epair}a 192.0.2.1/24 up

 	vnet_mkjail alcatraz ${epair}b
 	jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up
 	jexec alcatraz pfctl -e

 	pft_set_rules alcatraz \
 		"table const { epair:network }"\
 		"block in" \
 		"pass in from "

 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2
 }

 network_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "pr259689" "cleanup"
 pr259689_head()
 {
 	atf_set descr 'Test PR 259689'
 	atf_set require.user root
 }

 pr259689_body()
 {
 	pft_init

 	vnet_mkjail alcatraz
 	jexec alcatraz pfctl -e

 	pft_set_rules alcatraz \
 		"pass in" \
 		"block in inet from { 1.1.1.1, 1.1.1.2, 2.2.2.2, 2.2.2.3, 4.4.4.4, 4.4.4.5 }"

 	atf_check -o match:'block drop in inet from <__automatic_.*:6> to any' \
 		-e ignore \
 		jexec alcatraz pfctl -sr -vv
 }

 pr259689_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "precreate" "cleanup"
 precreate_head()
 {
 	atf_set descr 'Test creating a table without counters, then loading rules that add counters'
 	atf_set require.user root
 }

 precreate_body()
 {
 	pft_init

 	vnet_mkjail alcatraz

 	jexec alcatraz pfctl -t foo -T add 192.0.2.1
 	jexec alcatraz pfctl -t foo -T show

 	pft_set_rules noflush alcatraz \
 		"table <foo> counters persist" \
 		"pass in from <foo>"

 	# Expect all counters to be zero
 	atf_check -s exit:0 -e ignore \
 		-o match:'In/Block:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'In/Pass:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'Out/Block:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		-o match:'Out/Pass:.*'"$TABLE_STATS_ZERO_REGEXP" \
 		jexec alcatraz pfctl -t foo -T show -vv
 }
 precreate_cleanup()
 {
 	pft_cleanup
 }

 atf_test_case "anchor" "cleanup"
 anchor_head()
 {
 	atf_set descr 'Test tables in anchors'
 	atf_set require.user root
 }

 anchor_body()
 {
 	pft_init

 	epair=$(vnet_mkepair)
 	ifconfig ${epair}a 192.0.2.1/24 up

 	vnet_mkjail alcatraz ${epair}b
 	jexec alcatraz ifconfig ${epair}b 192.0.2.2/24 up
 	jexec alcatraz pfctl -e

 	(echo "table <testtable> persist"
 	 echo "block in quick from <testtable> to any"
 	) | jexec alcatraz pfctl -a anchorage -f -

 	pft_set_rules noflush alcatraz \
 		"pass" \
 		"anchor anchorage"

 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2

 	# Tables belong to anchors, so this is a different table and won't affect anything
 	jexec alcatraz pfctl -t testtable -T add 192.0.2.1
 	atf_check -s exit:0 -o ignore ping -c 1 192.0.2.2

 	# But when we add the address to the table in the anchor it does block traffic
 	jexec alcatraz pfctl -a anchorage -t testtable -T add 192.0.2.1
 	atf_check -s exit:2 -o ignore ping -c 1 192.0.2.2
 }

 anchor_cleanup()
 {
 	pft_cleanup
 }

 atf_init_test_cases()
 {
 	atf_add_test_case "v4_counters"
 	atf_add_test_case "v6_counters"
 	atf_add_test_case "pr251414"
 	atf_add_test_case "automatic"
 	atf_add_test_case "network"
 	atf_add_test_case "pr259689"
 	atf_add_test_case "precreate"
 	atf_add_test_case "anchor"
 }